Sample records for structure-guided sequence profiles

  1. Sequencing Structural Variants in Cancer for Precision Therapeutics.

    PubMed

    Macintyre, Geoff; Ylstra, Bauke; Brenton, James D

    2016-09-01

    The identification of mutations that guide therapy selection for patients with cancer is now routine in many clinical centres. The majority of assays used for solid tumour profiling use DNA sequencing to interrogate somatic point mutations because they are relatively easy to identify and interpret. Many cancers, however, including high-grade serous ovarian, oesophageal, and small-cell lung cancer, are driven by somatic structural variants that are not measured by these assays. Therefore, there is currently an unmet need for clinical assays that can cheaply and rapidly profile structural variants in solid tumours. In this review we survey the landscape of 'actionable' structural variants in cancer and identify promising detection strategies based on massively-parallel sequencing. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method.

    PubMed

    Balech, Bachir; Monaco, Alfonso; Perniola, Michele; Santamaria, Monica; Donvito, Giacinto; Vicario, Saverio; Maggi, Giorgio; Pesole, Graziano

    2018-01-01

    Multiple sequence alignment (MSA) is a fundamental component in many DNA sequence analyses including metagenomics studies and phylogeny inference. When guided by protein profiles, DNA multiple alignments assume a higher precision and robustness. Here we present details of the use of the upgraded version of MSA-PAD (2.0), which is a DNA multiple sequence alignment framework able to align DNA sequences coding for single/multiple protein domains guided by PFAM or user-defined annotations. MSA-PAD has two alignment strategies, called "Gene" and "Genome," accounting for coding domains order and genomic rearrangements, respectively. Novel options were added to the present version, where the MSA can be guided by protein profiles provided by the user. This allows MSA-PAD 2.0 to run faster and to add custom protein profiles sometimes not present in PFAM database according to the user's interest. MSA-PAD 2.0 is currently freely available as a Web application at https://recasgateway.cloud.ba.infn.it/ .
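
    The idea of letting a protein-level alignment guide a DNA alignment can be sketched as follows. This is only an illustration under stated assumptions, not the MSA-PAD 2.0 implementation: the function name `thread_codons` and the toy sequences are hypothetical, and the sketch assumes each DNA sequence is an in-frame coding sequence whose length matches its protein.

```python
def thread_codons(aligned_protein: str, dna: str, gap: str = "-") -> str:
    """Project the gaps of an aligned protein onto its coding DNA.

    Each residue consumes one codon (three bases); each protein gap
    becomes three DNA gap characters.
    """
    n_residues = sum(1 for c in aligned_protein if c != gap)
    if len(dna) < 3 * n_residues:
        raise ValueError("DNA is shorter than the coding length implied by the protein")
    out = []
    pos = 0
    for c in aligned_protein:
        if c == gap:
            out.append(gap * 3)
        else:
            out.append(dna[pos:pos + 3])
            pos += 3
    return "".join(out)


# Hypothetical toy input: two aligned protein fragments and their coding DNA.
aligned = {"seq1": "MK-L", "seq2": "MKAL"}
coding = {"seq1": "ATGAAACTG", "seq2": "ATGAAGGCTCTG"}
dna_alignment = {name: thread_codons(aligned[name], coding[name]) for name in aligned}
```

    Because gaps are only ever inserted in multiples of three, the reading frame is preserved in the resulting DNA alignment.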

  3. "The devil's in the detail": Release of an expanded, enhanced and dynamically revised forensic STR Sequence Guide.

    PubMed

    Phillips, C; Gettings, K Butler; King, J L; Ballard, D; Bodner, M; Borsuk, L; Parson, W

    2018-05-01

    The STR sequence template file published in 2016 as part of the considerations from the DNA Commission of the International Society for Forensic Genetics on minimal STR sequence nomenclature requirements, has been comprehensively revised and audited using the latest GRCh38 genome assembly. The list of forensic STRs characterized was expanded by including supplementary autosomal, X- and Y-chromosome microsatellites in less common use for routine DNA profiling, but some likely to be adopted in future massively parallel sequencing (MPS) STR panels. We outline several aspects of sequence alignment and annotation that required care and attention to detail when comparing sequences to GRCh37 and GRCh38 assemblies, as well as the necessary matching of MPS-based allele descriptions to previously established repeat region structures described in initial sequencing studies of the less well known forensic STRs. The revised sequence guide is now available in a dynamically updated FTP format from the STRidER website with a date-stamped change log to allow users to explore their own MPS data with the most up-to-date forensic STR sequence information compiled in a simple guide. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. A chemogenomic analysis of the human proteome: application to enzyme families.

    PubMed

    Bernasconi, Paul; Chen, Min; Galasinski, Scott; Popa-Burke, Ioana; Bobasheva, Anna; Coudurier, Louis; Birkos, Steve; Hallam, Rhonda; Janzen, William P

    2007-10-01

    Sequence-based phylogenies (SBP) are well-established tools for describing relationships between proteins. They have been used extensively to predict the behavior and sensitivity toward inhibitors of enzymes within a family. The utility of this approach diminishes when comparing proteins with little sequence homology. Even within an enzyme family, SBPs must be complemented by an orthogonal method that is independent of sequence to better predict enzymatic behavior. A chemogenomic approach is demonstrated here that uses the inhibition profile of a 130,000 diverse molecule library to uncover relationships within a set of enzymes. The profile is used to construct a semimetric additive distance matrix. This matrix, in turn, defines a sequence-independent phylogeny (SIP). The method was applied to 97 enzymes (kinases, proteases, and phosphatases). SIP does not use structural information from the molecules used for establishing the profile, thus providing a more heuristic method than the current approaches, which require knowledge of the specific inhibitor's structure. Within enzyme families, SIP shows a good overall correlation with SBP. More interestingly, SIP uncovers distances within families that are not recognizable by sequence-based methods. In addition, SIP allows the determination of distance between enzymes with no sequence homology, thus uncovering novel relationships not predicted by SBP. This chemogenomic approach, used in conjunction with SBP, should prove to be a powerful tool for choosing target combinations for drug discovery programs as well as for guiding the selection of profiling and liability targets.
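
    The step from inhibition profiles to a distance matrix can be illustrated with a minimal sketch. The abstract does not specify the distance function, so the mean absolute difference used here is a stand-in, and the enzyme names and inhibition values are invented for illustration only.

```python
def profile_distance(a, b):
    """Mean absolute difference between two inhibition profiles (stand-in metric)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same compounds")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def distance_matrix(profiles):
    """Pairwise distances between all enzymes, keyed by enzyme name."""
    names = sorted(profiles)
    return {p: {q: profile_distance(profiles[p], profiles[q]) for q in names}
            for p in names}


# Hypothetical fractional inhibition of three enzymes by four compounds.
profiles = {
    "kinase_A": [0.9, 0.1, 0.8, 0.0],
    "kinase_B": [0.8, 0.2, 0.7, 0.1],
    "protease_C": [0.1, 0.9, 0.0, 0.7],
}
d = distance_matrix(profiles)
```

    A matrix of this kind could then be passed to any distance-based tree-building method; no sequence information enters the calculation.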

  5. Partial DNA-guided Cas9 enables genome editing with reduced off-target activity

    PubMed Central

    Yin, Hao; Song, Chun-Qing; Suresh, Sneha; Kwan, Suet-Yan; Wu, Qiongqiong; Walsh, Stephen; Ding, Junmei; Bogorad, Roman L; Zhu, Lihua Julie; Wolfe, Scot A; Koteliansky, Victor; Xue, Wen; Langer, Robert; Anderson, Daniel G

    2018-01-01

    CRISPR–Cas9 is a versatile RNA-guided genome editing tool. Here we demonstrate that partial replacement of RNA nucleotides with DNA nucleotides in CRISPR RNA (crRNA) enables efficient gene editing in human cells. This strategy of partial DNA replacement retains on-target activity when used with both crRNA and sgRNA, as well as with multiple guide sequences. Partial DNA replacement also works for crRNA of Cpf1, another CRISPR system. Through focused analysis of off-target cleavage, measurement of mismatch tolerance and genome-wide profiling of off-target sites, we find that partial DNA replacement in the guide sequence significantly reduces off-target genome editing. Using the structure of the Cas9–sgRNA complex as a guide, the majority of the 3′ end of crRNA can be replaced with DNA nucleotides, and the 5′- and 3′-DNA-replaced crRNA enables efficient genome editing. Cas9 guided by a DNA–RNA chimera may provide a generalized strategy to reduce both the cost and the off-target genome editing in human cells. PMID:29377001

  6. De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy

    PubMed Central

    Parmeggiani, Fabio; Velasco, D. Alejandro Fernandez; Höcker, Birte; Baker, David

    2015-01-01

    Despite efforts for over 25 years, de novo protein design has not succeeded in achieving the TIM-barrel fold. Here we describe the computational design of 4-fold symmetrical (β/α)8-barrels guided by geometrical and chemical principles. Experimental characterization of 33 designs revealed the importance of sidechain-backbone hydrogen bonding for defining the strand register between repeat units. The X-ray crystal structure of a designed thermostable 184-residue protein is nearly identical with the designed TIM-barrel model. PSI-BLAST searches do not identify sequence similarities to known TIM-barrel proteins, and sensitive profile-profile searches indicate that the design sequence is distant from other naturally occurring TIM-barrel superfamilies, suggesting that Nature has only sampled a subset of the sequence space available to the TIM-barrel fold. The ability to de novo design TIM-barrels opens new possibilities for custom-made enzymes. PMID:26595462

  7. MolIDE: a homology modeling framework you can click with.

    PubMed

    Canutescu, Adrian A; Dunbrack, Roland L

    2005-06-15

    Molecular Integrated Development Environment (MolIDE) is an integrated application designed to provide homology modeling tools and protocols under a uniform, user-friendly graphical interface. Its main purpose is to combine the most frequent modeling steps in a semi-automatic, interactive way, guiding the user from the target protein sequence to the final three-dimensional protein structure. The typical basic homology modeling process is composed of building sequence profiles of the target sequence family, secondary structure prediction, sequence alignment with PDB structures, assisted alignment editing, side-chain prediction and loop building. All of these steps are available through a graphical user interface. MolIDE's user-friendly and streamlined interactive modeling protocol allows the user to focus on the important modeling questions, hiding from the user the raw data generation and conversion steps. MolIDE was designed from the ground up as an open-source, cross-platform, extensible framework. This allows developers to integrate additional third-party programs into MolIDE. Availability: http://dunbrack.fccc.edu/molide/molide.php. Contact: rl_dunbrack@fccc.edu.

  8. CPHmodels-3.0--remote homology modeling using structure-guided sequence profiles.

    PubMed

    Nielsen, Morten; Lundegaard, Claus; Lund, Ole; Petersen, Thomas Nordahl

    2010-07-01

    CPHmodels-3.0 is a web server that predicts protein 3D structure by single-template homology modeling. The server employs a hybrid of the scoring functions of CPHmodels-2.0 and a novel remote homology-modeling algorithm. The server first attempts to model a query sequence using the fast CPHmodels-2.0 profile-profile scoring function, which is suitable for close homology modeling. The new, computationally costly remote homology-modeling algorithm is engaged only if no suitable PDB template is identified in the initial search. CPHmodels-3.0 was benchmarked in the CASP8 competition and produced models for 94% of the targets (117 out of 128); 74% of these were predicted as high-reliability models (87 out of 117). These achieved an average RMSD of 4.6 Å when superimposed on the 3D structure. The remaining 26% low-reliability models (30 out of 117) could be superimposed on the true 3D structure with an average RMSD of 9.3 Å. These performance values place the CPHmodels-3.0 method in the group of high-performing 3D prediction tools. Besides its accuracy, one of the important features of the method is its speed. For most queries, the response time of the server is <20 min. The web server is available at http://www.cbs.dtu.dk/services/CPHmodels/.
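
    The two-stage strategy described above (a fast search first, the costly remote-homology search only as a fallback) can be sketched as a small control-flow example. Everything here is hypothetical: the function `model_with_fallback`, the score threshold `min_score`, and the two search callables are illustrative placeholders and do not reflect the server's actual interface or scoring.

```python
def model_with_fallback(query, fast_search, remote_search, min_score):
    """Try the fast template search; fall back to the costly one only if needed.

    Each search callable returns (template_id, score) or None.
    Returns (template_id or None, name of the stage that produced it).
    """
    hit = fast_search(query)
    if hit is not None and hit[1] >= min_score:
        return hit[0], "fast"
    hit = remote_search(query)  # expensive path, used only when the fast path fails
    if hit is None:
        return None, "none"
    return hit[0], "remote"


# Hypothetical stand-ins for the two search stages.
def toy_fast(query):
    return ("tmplA", 0.9) if query.startswith("MK") else None


def toy_remote(query):
    return ("tmplB", 0.4) if len(query) > 3 else None


close_case = model_with_fallback("MKVL", toy_fast, toy_remote, min_score=0.5)
remote_case = model_with_fallback("GGSTA", toy_fast, toy_remote, min_score=0.5)
no_template = model_with_fallback("GG", toy_fast, toy_remote, min_score=0.5)
```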

  9. DNA recognition by an RNA-guided bacterial Argonaute

    PubMed Central

    Doudna, Jennifer A.

    2017-01-01

    Argonaute (Ago) proteins are widespread in prokaryotes and eukaryotes and share a four-domain architecture capable of RNA- or DNA-guided nucleic acid recognition. Previous studies identified a prokaryotic Argonaute protein from the eubacterium Marinitoga piezophila (MpAgo), which binds preferentially to 5′-hydroxylated guide RNAs and cleaves single-stranded RNA (ssRNA) and DNA (ssDNA) targets. Here we present a 3.2 Å resolution crystal structure of MpAgo bound to a 21-nucleotide RNA guide and a complementary 21-nucleotide ssDNA substrate. Comparison of this ternary complex to other target-bound Argonaute structures reveals a unique orientation of the N-terminal domain, resulting in a straight helical axis of the entire RNA-DNA heteroduplex through the central cleft of the protein. Additionally, mismatches introduced into the heteroduplex reduce MpAgo cleavage efficiency with a symmetric profile centered around the middle of the helix. This pattern differs from the canonical mismatch tolerance of other Argonautes, which display decreased cleavage efficiency for substrates bearing sequence mismatches to the 5′ region of the guide strand. This structural analysis of MpAgo bound to a hybrid helix advances our understanding of the diversity of target recognition mechanisms by Argonaute proteins. PMID:28520746

  10. Protein Folding and Structure Prediction from the Ground Up: The Atomistic Associative Memory, Water Mediated, Structure and Energy Model.

    PubMed

    Chen, Mingchen; Lin, Xingcheng; Zheng, Weihua; Onuchic, José N; Wolynes, Peter G

    2016-08-25

    The associative memory, water mediated, structure and energy model (AWSEM) is a coarse-grained force field with transferable tertiary interactions that incorporates local in sequence energetic biases using bioinformatically derived structural information about peptide fragments with locally similar sequences that we call memories. The memory information from the protein data bank (PDB) database guides proper protein folding. The structural information about available sequences in the database varies in quality and can sometimes lead to frustrated free energy landscapes locally. One way out of this difficulty is to construct the input fragment memory information from all-atom simulations of portions of the complete polypeptide chain. In this paper, we investigate this approach first put forward by Kwac and Wolynes in a more complete way by studying the structure prediction capabilities of this approach for six α-helical proteins. This scheme which we call the atomistic associative memory, water mediated, structure and energy model (AAWSEM) amounts to an ab initio protein structure prediction method that starts from the ground up without using bioinformatic input. The free energy profiles from AAWSEM show that atomistic fragment memories are sufficient to guide the correct folding when tertiary forces are included. AAWSEM combines the efficiency of coarse-grained simulations on the full protein level with the local structural accuracy achievable from all-atom simulations of only parts of a large protein. The results suggest that a hybrid use of atomistic fragment memory and database memory in structural predictions may well be optimal for many practical applications.

  11. A bacterial Argonaute with noncanonical guide RNA specificity

    PubMed Central

    Kaya, Emine; Doxzen, Kevin W.; Knoll, Kilian R.; Wilson, Ross C.; Strutt, Steven C.; Kranzusch, Philip J.; Doudna, Jennifer A.

    2016-01-01

    Eukaryotic Argonaute proteins induce gene silencing by small RNA-guided recognition and cleavage of mRNA targets. Although structural similarities between human and prokaryotic Argonautes are consistent with shared mechanistic properties, sequence and structure-based alignments suggested that Argonautes encoded within CRISPR-cas [clustered regularly interspaced short palindromic repeats (CRISPR)-associated] bacterial immunity operons have divergent activities. We show here that the CRISPR-associated Marinitoga piezophila Argonaute (MpAgo) protein cleaves single-stranded target sequences using 5′-hydroxylated guide RNAs rather than the 5′-phosphorylated guides used by all known Argonautes. The 2.0-Å resolution crystal structure of an MpAgo–RNA complex reveals a guide strand binding site comprising residues that block 5′ phosphate interactions. Using structure-based sequence alignment, we were able to identify other putative MpAgo-like proteins, all of which are encoded within CRISPR-cas loci. Taken together, our data suggest the evolution of an Argonaute subclass with noncanonical specificity for a 5′-hydroxylated guide. PMID:27035975

  12. Clinical Actionability of Comprehensive Genomic Profiling for Management of Rare or Refractory Cancers

    PubMed Central

    Hirshfield, Kim M.; Tolkunov, Denis; Zhong, Hua; Ali, Siraj M.; Stein, Mark N.; Murphy, Susan; Vig, Hetal; Vazquez, Alexei; Glod, John; Moss, Rebecca A.; Belyi, Vladimir; Chan, Chang S.; Chen, Suzie; Goodell, Lauri; Foran, David; Yelensky, Roman; Palma, Norma A.; Sun, James X.; Miller, Vincent A.; Stephens, Philip J.; Ross, Jeffrey S.; Kaufman, Howard; Poplin, Elizabeth; Mehnert, Janice; Tan, Antoinette R.; Bertino, Joseph R.; Aisner, Joseph; DiPaola, Robert S.

    2016-01-01

    Background. The frequency with which targeted tumor sequencing results will lead to implemented change in care is unclear. Prospective assessment of the feasibility and limitations of using genomic sequencing is critically important. Methods. A prospective clinical study was conducted on 100 patients with diverse-histology, rare, or poor-prognosis cancers to evaluate the clinical actionability of a Clinical Laboratory Improvement Amendments (CLIA)-certified, comprehensive genomic profiling assay (FoundationOne), using formalin-fixed, paraffin-embedded tumors. The primary objectives were to assess utility, feasibility, and limitations of genomic sequencing for genomically guided therapy or other clinical purpose in the setting of a multidisciplinary molecular tumor board. Results. Of the tumors from the 92 patients with sufficient tissue, 88 (96%) had at least one genomic alteration (average 3.6, range 0–10). Commonly altered pathways included p53 (46%), RAS/RAF/MAPK (rat sarcoma; rapidly accelerated fibrosarcoma; mitogen-activated protein kinase) (45%), receptor tyrosine kinases/ligand (44%), PI3K/AKT/mTOR (phosphatidylinositol-4,5-bisphosphate 3-kinase; protein kinase B; mammalian target of rapamycin) (35%), transcription factors/regulators (31%), and cell cycle regulators (30%). Many low frequency but potentially actionable alterations were identified in diverse histologies. Use of comprehensive profiling led to implementable clinical action in 35% of tumors with genomic alterations, including genomically guided therapy, diagnostic modification, and trigger for germline genetic testing. Conclusion. Use of targeted next-generation sequencing in the setting of an institutional molecular tumor board led to implementable clinical action in more than one third of patients with rare and poor-prognosis cancers. Major barriers to implementation of genomically guided therapy were clinical status of the patient and drug access. 
    Early and serial sequencing in the clinical course and expanded access to genomically guided early-phase clinical trials and targeted agents may increase actionability. Implications for Practice: Identification of key factors that facilitate use of genomic tumor testing results and implementation of genomically guided therapy may lead to enhanced benefit for patients with rare or difficult to treat cancers. Clinical use of a targeted next-generation sequencing assay in the setting of an institutional molecular tumor board led to implementable clinical action in over one third of patients with rare and poor prognosis cancers. The major barriers to implementation of genomically guided therapy were clinical status of the patient and drug access both on trial and off label. Approaches to increase actionability include early and serial sequencing in the clinical course and expanded access to genomically guided early phase clinical trials and targeted agents. PMID:27566247

  13. Simple chained guide trees give high-quality protein multiple sequence alignments

    PubMed Central

    Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.

    2014-01-01

    Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random. PMID:25002495
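
    A chained guide tree is a fully unbalanced ("caterpillar") tree in which each sequence is added to the growing alignment one at a time. A minimal sketch of constructing one in Newick format is shown below; the function name `chained_guide_tree` is hypothetical, and branch lengths are omitted for simplicity.

```python
def chained_guide_tree(names):
    """Build a chained (caterpillar) guide tree in Newick format.

    Sequences are joined in the order given, so the alignment order is
    simply the list order; no distance matrix is required.
    """
    if not names:
        raise ValueError("at least one sequence name is required")
    tree = names[0]
    for name in names[1:]:
        tree = f"({tree},{name})"
    return tree + ";"


newick = chained_guide_tree(["a", "b", "c", "d"])
# newick == "(((a,b),c),d);"
```

    Construction takes time linear in the number of sequences, which is consistent with the abstract's remark that these are the fastest and simplest guide trees to build.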

  14. To Guide or Not to Guide: Issues in the Sequencing of Pedagogical Structure in Computational Model-Based Learning

    ERIC Educational Resources Information Center

    Jacobson, Michael J.; Kim, Beaumie; Pathak, Suneeta; Zhang, BaoHui

    2015-01-01

    This research explores issues related to the sequencing of structure that is provided as pedagogical guidance. A study was conducted that involved grade 10 students in Singapore as they learned concepts about electricity using four NetLogo Investigations of Electricity agent-based models. It was found that the low-to-high structure learning…

  15. Evolutionary profiles from the QR factorization of multiple sequence alignments

    PubMed Central

    Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida

    2005-01-01

    We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270
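
    The ordering produced by a column-pivoted QR factorization can be illustrated in a few lines. This is a simplified sketch and not the authors' multidimensional algorithm: it uses pivoted Gram-Schmidt on toy numeric vectors, the function name `pivoted_order` is hypothetical, and the numerical encoding of real alignments is not addressed.

```python
import math


def pivoted_order(vectors):
    """Order vectors by column-pivoted Gram-Schmidt (QR with pivoting).

    At each step the vector with the largest remaining norm is chosen,
    and its direction is projected out of all remaining vectors, so
    vectors that are nearly linear combinations of earlier picks come last.
    """
    residual = [list(v) for v in vectors]
    remaining = list(range(len(vectors)))
    order = []
    while remaining:
        norms = {i: math.sqrt(sum(x * x for x in residual[i])) for i in remaining}
        pivot = max(remaining, key=lambda i: norms[i])
        order.append(pivot)
        remaining.remove(pivot)
        if norms[pivot] < 1e-12:
            # Everything left is numerically dependent on earlier picks.
            order.extend(remaining)
            break
        q = [x / norms[pivot] for x in residual[pivot]]
        for i in remaining:
            dot = sum(a * b for a, b in zip(residual[i], q))
            residual[i] = [a - dot * b for a, b in zip(residual[i], q)]
    return order


# Toy encoded "sequences": vector 0 is nearly redundant with vector 1.
toy = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
order = pivoted_order(toy)
```

    Truncating such an ordering after the first few entries yields a small, non-redundant subset, which is the intuition behind the minimal basis set described in the abstract.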

  16. The molecular genetic makeup of acute lymphoblastic leukemia.

    PubMed

    Mullighan, Charles G

    2012-01-01

    Genomic profiling has transformed our understanding of the genetic basis of acute lymphoblastic leukemia (ALL). Recent years have seen a shift from microarray analysis and candidate gene sequencing to next-generation sequencing. Together, these approaches have shown that many ALL subtypes are characterized by constellations of structural rearrangements, submicroscopic DNA copy number alterations, and sequence mutations, several of which have clear implications for risk stratification and targeted therapeutic intervention. Mutations in genes regulating lymphoid development are a hallmark of ALL, and alterations of the lymphoid transcription factor gene IKZF1 (IKAROS) are associated with a high risk of treatment failure in B-ALL. Approximately 20% of B-ALL cases harbor genetic alterations that activate kinase signaling that may be amenable to treatment with tyrosine kinase inhibitors, including rearrangements of the cytokine receptor gene CRLF2; rearrangements of ABL1, JAK2, and PDGFRB; and mutations of JAK1 and JAK2. Whole-genome sequencing has also identified novel targets of mutation in aggressive T-lineage ALL, including hematopoietic regulators (ETV6 and RUNX1), tyrosine kinases, and epigenetic regulators. Challenges for the future are to comprehensively identify and experimentally validate all genetic alterations driving leukemogenesis and treatment failure in childhood and adult ALL and to implement genomic profiling into the clinical setting to guide risk stratification and targeted therapy.

  17. Optimized guide RNA structure for genome editing via Cas9

    PubMed Central

    Xu, Jianyong; Lian, Wei; Jia, Yuning; Li, Lingyun; Huang, Zhong

    2017-01-01

    The genome editing tool Cas9-gRNA (guide RNA) has been successfully applied in different cell types and organisms with high efficiency. However, more effort is needed to enhance both efficiency and specificity. In the current study, we optimized the guide RNA structure of the Streptococcus pyogenes CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated) system to improve its genome editing efficiency. Compared with the original functional structure of guide RNA, which is composed of crRNA and tracrRNA, the widely used chimeric gRNA has shorter crRNA and tracrRNA sequences. The deleted RNA sequence could form an extra loop structure, which might enhance the stability of the guide RNA structure and subsequently the genome editing efficiency. Thus, the genome editing efficiency of different forms of guide RNA was tested. We found that the chimeric structure of gRNA with the original full length of crRNA and tracrRNA showed higher genome editing efficiency than the conventional chimeric structure or the other types of gRNA we tested. Our data therefore uncover a new type of gRNA structure with higher genome editing efficiency. PMID:29212218

  18. Biochemical identification of Argonaute 2 as the sole protein required for RNA-induced silencing complex activity

    PubMed Central

    Rand, Tim A.; Ginalski, Krzysztof; Grishin, Nick V.; Wang, Xiaodong

    2004-01-01

    RNA interference is carried out by the small double-stranded RNA-induced silencing complex (RISC). The RISC-bound small RNA guides the RISC complex to identify and cleave mRNAs with complementary sequences. The proteins that make up the RISC complex and cleave mRNA have not been unequivocally defined. Here, we report the biochemical purification of RISC activity to homogeneity from Drosophila Schneider 2 cell extracts. Argonaute 2 (Ago-2) is the sole protein component present in the purified, functional RISC. By using a bioinformatics method that combines sequence-profile analysis with predicted protein secondary structure, we found homology between the PIWI domain of Ago-2 and endonuclease V and identified potential active-site amino acid residues within the PIWI domain of Ago-2. PMID:15452342

  19. Biochemical identification of Argonaute 2 as the sole protein required for RNA-induced silencing complex activity.

    PubMed

    Rand, Tim A; Ginalski, Krzysztof; Grishin, Nick V; Wang, Xiaodong

    2004-10-05

    RNA interference is carried out by the small double-stranded RNA-induced silencing complex (RISC). The RISC-bound small RNA guides the RISC complex to identify and cleave mRNAs with complementary sequences. The proteins that make up the RISC complex and cleave mRNA have not been unequivocally defined. Here, we report the biochemical purification of RISC activity to homogeneity from Drosophila Schneider 2 cell extracts. Argonaute 2 (Ago-2) is the sole protein component present in the purified, functional RISC. By using a bioinformatics method that combines sequence-profile analysis with predicted protein secondary structure, we found homology between the PIWI domain of Ago-2 and endonuclease V and identified potential active-site amino acid residues within the PIWI domain of Ago-2.

  20. An Evolution-Based Approach to De Novo Protein Design and Case Study on Mycobacterium tuberculosis

    PubMed Central

    Brender, Jeffrey R.; Czajka, Jeff; Marsh, David; Gray, Felicia; Cierpicki, Tomasz; Zhang, Yang

    2013-01-01

    Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins having a similar fold is identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square-deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacterium Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy.
    Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality. PMID:24204234
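
    Building a profile from structurally aligned templates can be illustrated with a simple per-column frequency table. This sketch is an assumption-laden simplification rather than the authors' method: it assumes the template sequences have already been aligned (for example, by structural superposition), it uses raw frequencies without pseudocounts or sequence weighting, and the function name `column_profile` and toy alignment are hypothetical.

```python
from collections import Counter


def column_profile(alignment, gap="-"):
    """Per-column amino acid frequencies for a list of equal-length aligned sequences."""
    if not alignment:
        raise ValueError("alignment is empty")
    if len({len(s) for s in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != gap)
        total = sum(counts.values())
        # An all-gap column yields an empty distribution.
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


# Hypothetical three-template alignment.
profile = column_profile(["MKV", "MRV", "MK-"])
```

    A design procedure could then favor, at each position, residues that are frequent in the corresponding profile column.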

  1. Guiding principles for peptide nanotechnology through directed discovery.

    PubMed

    Lampel, A; Ulijn, R V; Tuttle, T

    2018-05-21

    Life's diverse molecular functions are largely based on only a small number of highly conserved building blocks - the twenty canonical amino acids. These building blocks are chemically simple, but when they are organized in three-dimensional structures of tremendous complexity, new properties emerge. This review explores recent efforts in the directed discovery of functional nanoscale systems and materials based on these same amino acids, but that are not guided by copying or editing biological systems. The review summarises insights obtained using three complementary approaches of searching the sequence space to explore sequence-structure relationships for assembly, reactivity and complexation, namely: (i) strategic editing of short peptide sequences; (ii) computational approaches to predicting and comparing assembly behaviours; (iii) dynamic peptide libraries that explore the free energy landscape. These approaches give rise to guiding principles on controlling order/disorder, complexation and reactivity by peptide sequence design.

  2. High throughput profile-profile based fold recognition for the entire human proteome.

    PubMed

    McGuffin, Liam J; Smith, Richard T; Bryson, Kevin; Sørensen, Søren-Aksel; Jones, David T

    2006-06-07

    In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. This study clearly demonstrates the feasibility of carrying out on demand high quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.

  3. GUIDE-Seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases

    PubMed Central

    Nguyen, Nhu T.; Liebers, Matthew; Topkar, Ved V.; Thapar, Vishal; Wyvekens, Nicolas; Khayter, Cyd; Iafrate, A. John; Le, Long P.; Aryee, Martin J.; Joung, J. Keith

    2014-01-01

    CRISPR RNA-guided nucleases (RGNs) are widely used genome-editing reagents, but methods to delineate their genome-wide off-target cleavage activities have been lacking. Here we describe an approach for global detection of DNA double-stranded breaks (DSBs) introduced by RGNs and potentially other nucleases. This method, called Genome-wide Unbiased Identification of DSBs Enabled by Sequencing (GUIDE-Seq), relies on capture of double-stranded oligodeoxynucleotides into breaks. Application of GUIDE-Seq to thirteen RGNs in two human cell lines revealed wide variability in RGN off-target activities and unappreciated characteristics of off-target sequences. The majority of identified sites were not detected by existing computational methods or ChIP-Seq. GUIDE-Seq also identified RGN-independent genomic breakpoint ‘hotspots’. Finally, GUIDE-Seq revealed that truncated guide RNAs exhibit substantially reduced RGN-induced off-target DSBs. Our experiments define the most rigorous framework for genome-wide identification of RGN off-target effects to date and provide a method for evaluating the safety of these nucleases prior to clinical use. PMID:25513782

  4. PSS-3D1D: an improved 3D1D profile method of protein fold recognition for the annotation of twilight zone sequences.

    PubMed

    Ganesan, K; Parthasarathy, S

    2011-12-01

    Annotation of any newly determined protein sequence depends on the pairwise sequence identity with known sequences. However, for the twilight zone sequences which have only 15-25% identity, the pair-wise comparison methods are inadequate and the annotation becomes a challenging task. Such sequences can be annotated by using methods that recognize their fold. Bowie et al. described a 3D1D profile method in which the amino acid sequences that fold into a known 3D structure are identified by their compatibility to that known 3D structure. We have improved the above method by using the predicted secondary structure information and employ it for fold recognition from the twilight zone sequences. In our Protein Secondary Structure 3D1D (PSS-3D1D) method, a score (w) for the predicted secondary structure of the query sequence is included in finding the compatibility of the query sequence to the known fold 3D structures. In the benchmarks, the PSS-3D1D method shows a maximum of 21% improvement in predicting correctly the α + β class of folds from the sequences with twilight zone level of identity, when compared with the 3D1D profile method. Hence, the PSS-3D1D method could offer more clues than the 3D1D method for the annotation of twilight zone sequences. The web based PSS-3D1D method is freely available in the PredictFold server at http://bioinfo.bdu.ac.in/servers/ .

  5. Genomic Heterogeneity as a Barrier to Precision Medicine in Gastroesophageal Adenocarcinoma.

    PubMed

    Pectasides, Eirini; Stachler, Matthew D; Derks, Sarah; Liu, Yang; Maron, Steven; Islam, Mirazul; Alpert, Lindsay; Kwak, Heewon; Kindler, Hedy; Polite, Blase; Sharma, Manish R; Allen, Kenisha; O'Day, Emily; Lomnicki, Samantha; Maranto, Melissa; Kanteti, Rajani; Fitzpatrick, Carrie; Weber, Christopher; Setia, Namrata; Xiao, Shu-Yuan; Hart, John; Nagy, Rebecca J; Kim, Kyoung-Mee; Choi, Min-Gew; Min, Byung-Hoon; Nason, Katie S; O'Keefe, Lea; Watanabe, Masayuki; Baba, Hideo; Lanman, Rick; Agoston, Agoston T; Oh, David J; Dunford, Andrew; Thorner, Aaron R; Ducar, Matthew D; Wollison, Bruce M; Coleman, Haley A; Ji, Yuan; Posner, Mitchell C; Roggin, Kevin; Turaga, Kiran; Chang, Paul; Hogarth, Kyle; Siddiqui, Uzma; Gelrud, Andres; Ha, Gavin; Freeman, Samuel S; Rhoades, Justin; Reed, Sarah; Gydush, Greg; Rotem, Denisse; Davison, Jon; Imamura, Yu; Adalsteinsson, Viktor; Lee, Jeeyun; Bass, Adam J; Catenacci, Daniel V

    2018-01-01

    Gastroesophageal adenocarcinoma (GEA) is a lethal disease where targeted therapies, even when guided by genomic biomarkers, have had limited efficacy. A potential reason for the failure of such therapies is that genomic profiling results could commonly differ between the primary and metastatic tumors. To evaluate genomic heterogeneity, we sequenced paired primary GEA and synchronous metastatic lesions across multiple cohorts, finding extensive differences in genomic alterations, including discrepancies in potentially clinically relevant alterations. Multiregion sequencing showed significant discrepancy within the primary tumor (PT) and between the PT and disseminated disease, with oncogene amplification profiles commonly discordant. In addition, a pilot analysis of cell-free DNA (cfDNA) sequencing demonstrated the feasibility of detecting genomic amplifications not detected in PT sampling. Lastly, we profiled paired primary tumors, metastatic tumors, and cfDNA from patients enrolled in the personalized antibodies for GEA (PANGEA) trial of targeted therapies in GEA and found that genomic biomarkers were recurrently discrepant between the PT and untreated metastases. Divergent primary and metastatic tissue profiling led to treatment reassignment in 32% (9/28) of patients. In discordant primary and metastatic lesions, we found 87.5% concordance for targetable alterations in metastatic tissue and cfDNA, suggesting the potential for cfDNA profiling to enhance selection of therapy. Significance: We demonstrate frequent baseline heterogeneity in targetable genomic alterations in GEA, indicating that current tissue sampling practices for biomarker testing do not effectively guide precision medicine in this disease and that routine profiling of metastatic lesions and/or cfDNA should be systematically evaluated. Cancer Discov; 8(1); 37-48. ©2017 AACR. See related commentary by Sundar and Tan, p. 14. See related article by Janjigian et al., p. 49. This article is highlighted in the In This Issue feature, p. 1. ©2017 American Association for Cancer Research.

  6. A population study of the minicircles in Trypanosoma cruzi: predicting guide RNAs in the absence of empirical RNA editing.

    PubMed

    Thomas, Sean; Martinez, L L Isadora Trejo; Westenberger, Scott J; Sturm, Nancy R

    2007-05-24

    The structurally complex network of minicircles and maxicircles comprising the mitochondrial DNA of kinetoplastids mirrors the complexity of the RNA editing process that is required for faithful expression of encrypted maxicircle genes. Although a few of the guide RNAs that direct this editing process have been discovered on maxicircles, guide RNAs are mostly found on the minicircles. The nuclear and maxicircle genomes have been sequenced and assembled for Trypanosoma cruzi, the causative agent of Chagas disease, however the complement of 1.4-kb minicircles, carrying four guide RNA genes per molecule in this parasite, has been less thoroughly characterised. Fifty-four CL Brener and 53 Esmeraldo strain minicircle sequence reads were extracted from T. cruzi whole genome shotgun sequencing data. With these sequences and all published T. cruzi minicircle sequences, 108 unique guide RNAs from all known T. cruzi minicircle sequences and two guide RNAs from the CL Brener maxicircle were predicted using a local alignment algorithm and mapped onto predicted or experimentally determined sequences of edited maxicircle open reading frames. For half of the sequences no statistically significant guide RNA could be assigned. Likely positions of these unidentified gRNAs in T. cruzi minicircle sequences are estimated using a simple Hidden Markov Model. With the local alignment predictions as a standard, the HMM had an ~85% chance of correctly identifying at least 20 nucleotides of guide RNA from a given minicircle sequence. Inter-minicircle recombination was documented. Variable regions contain species-specific areas of distinct nucleotide preference. Two maxicircle guide RNA genes were found. The identification of new minicircle sequences and the further characterization of all published minicircles are presented, including the first observation of recombination between minicircles. 
    Extrapolation suggests a level of 4% recombinants in the population, supporting a relatively high recombination rate that may serve to minimize the persistence of gRNA pseudogenes. Characteristic nucleotide preferences observed within variable regions provide potential clues regarding the transcription and maturation of T. cruzi guide RNAs. Based on these preferences, a method of predicting T. cruzi guide RNAs using only primary minicircle sequence data was created.
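
    The abstract says guide RNAs were predicted with a local alignment algorithm but gives no scoring details. The sketch below is a generic Smith-Waterman local alignment, shown only to illustrate the technique. The match, mismatch and gap values are arbitrary assumptions, and the authors' actual scoring (for example any allowance for G:U pairing between guide RNA and edited mRNA) is not reproduced here.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment.

    Scoring parameters are illustrative assumptions, not values from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the score matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

    A candidate guide RNA region would correspond to a high-scoring local match between a minicircle sequence and an edited maxicircle sequence; judging statistical significance, as the abstract mentions, would need a separate step.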

  7. Predicting residue-wise contact orders in proteins by support vector regression.

    PubMed

    Song, Jiangning; Burrage, Kevin

    2006-10-03

    The residue-wise contact order (RWCO) describes the sequence separations between the residues of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable important information to reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. 
    Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
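
    To make the quantities in this abstract concrete, the sketch below computes a residue-wise contact order from 3D coordinates and the two evaluation measures reported (Pearson CC and RMSE). The contact cutoff of 8 Å, the minimum sequence separation of 3 and the normalisation by chain length are assumptions chosen for illustration; the abstract does not state the exact definition the authors used.

```python
import math

def rwco(coords, cutoff=8.0, min_sep=3):
    """Residue-wise contact order: for each residue, the sum of sequence
    separations to its spatial contacts, divided by chain length.
    cutoff and min_sep are illustrative assumptions."""
    n = len(coords)
    values = []
    for i in range(n):
        total = 0
        for j in range(n):
            sep = abs(i - j)
            if sep >= min_sep and math.dist(coords[i], coords[j]) <= cutoff:
                total += sep
        values.append(total / n)
    return values

def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
```

    In a setting like the one described, `rwco` would supply the observed targets from known structures, and `pearson_cc` and `rmse` would compare them with the values a regression model predicts from sequence.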

  8. Transcriptome sequencing and whole genome expression profiling of chrysanthemum under dehydration stress

    PubMed Central

    2013-01-01

    Background: Chrysanthemum is one of the most important ornamental crops in the world and drought stress seriously limits its production and distribution. In order to generate a functional genomics resource and obtain a deeper understanding of the molecular mechanisms regarding chrysanthemum responses to dehydration stress, we performed large-scale transcriptome sequencing of chrysanthemum plants under dehydration stress using the Illumina sequencing technology. Results: Two cDNA libraries constructed from mRNAs of control and dehydration-treated seedlings were sequenced by Illumina technology. A total of more than 100 million reads were generated and de novo assembled into 98,180 unique transcripts which were further extensively annotated by comparing their sequences to different protein databases. Biochemical pathways were predicted from these transcript sequences. Furthermore, we performed gene expression profiling analysis upon dehydration treatment in chrysanthemum and identified 8,558 dehydration-responsive unique transcripts, including 307 transcription factors and 229 protein kinases and many well-known stress responsive genes. Gene ontology (GO) term enrichment and biochemical pathway analyses showed that dehydration stress caused changes in hormone response, secondary and amino acid metabolism, and light and photoperiod response. These findings suggest that drought tolerance of chrysanthemum plants may be related to the regulation of hormone biosynthesis and signaling, reduction of oxidative damage, stabilization of cell proteins and structures, and maintenance of energy and carbon supply. Conclusions: Our transcriptome sequences can provide a valuable resource for chrysanthemum breeding and research and novel insights into chrysanthemum responses to dehydration stress and offer candidate genes or markers that can be used to guide future studies attempting to breed drought tolerant chrysanthemum cultivars. PMID:24074255

  9. Structure and specificity of the RNA-guided endonuclease Cas9 during DNA interrogation, target binding and cleavage

    PubMed Central

    Josephs, Eric A.; Kocak, D. Dewran; Fitzgibbon, Christopher J.; McMenemy, Joshua; Gersbach, Charles A.; Marszalek, Piotr E.

    2015-01-01

    CRISPR-associated endonuclease Cas9 cuts DNA at variable target sites designated by a Cas9-bound RNA molecule. Cas9's ability to be directed by single ‘guide RNA’ molecules to target nearly any sequence has been recently exploited for a number of emerging biological and medical applications. Therefore, understanding the nature of Cas9's off-target activity is of paramount importance for its practical use. Using atomic force microscopy (AFM), we directly resolve individual Cas9 and nuclease-inactive dCas9 proteins as they bind along engineered DNA substrates. High-resolution imaging allows us to determine their relative propensities to bind with different guide RNA variants to targeted or off-target sequences. Mapping the structural properties of Cas9 and dCas9 to their respective binding sites reveals a progressive conformational transformation at DNA sites with increasing sequence similarity to its target. With kinetic Monte Carlo (KMC) simulations, these results provide evidence of a ‘conformational gating’ mechanism driven by the interactions between the guide RNA and the 14th–17th nucleotide region of the targeted DNA, the stabilities of which we find correlate significantly with reported off-target cleavage rates. KMC simulations also reveal potential methodologies to engineer guide RNA sequences with improved specificity by considering the invasion of guide RNAs into targeted DNA duplex. PMID:26384421

  10. Protein Interaction Profile Sequencing (PIP-seq).

    PubMed

    Foley, Shawn W; Gregory, Brian D

    2016-10-10

    Every eukaryotic RNA transcript undergoes extensive post-transcriptional processing from the moment of transcription up through degradation. This regulation is performed by a distinct cohort of RNA-binding proteins which recognize their target transcript by both its primary sequence and secondary structure. Here, we describe protein interaction profile sequencing (PIP-seq), a technique that uses ribonuclease-based footprinting followed by high-throughput sequencing to globally assess both protein-bound RNA sequences and RNA secondary structure. PIP-seq utilizes single- and double-stranded RNA-specific nucleases in the absence of proteins to infer RNA secondary structure. These libraries are also compared to samples that undergo nuclease digestion in the presence of proteins in order to find enriched protein-bound sequences. Combined, these four libraries provide a comprehensive, transcriptome-wide view of RNA secondary structure and RNA protein interaction sites from a single experimental technique. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.

  11. ORION: a web server for protein fold recognition and structure prediction using evolutionary hybrid profiles

    PubMed Central

    Ghouzam, Yassine; Postic, Guillaume; Guerin, Pierre-Edouard; de Brevern, Alexandre G.; Gelly, Jean-Christophe

    2016-01-01

    Protein structure prediction based on comparative modeling is the most efficient way to produce structural models when it can be performed. ORION is a dedicated webserver based on a new strategy that performs this task. The identification by ORION of suitable templates is performed using an original profile-profile approach that combines sequence and structure evolution information. Structure evolution information is encoded into profiles using structural features, such as solvent accessibility and local conformation —with Protein Blocks—, which give an accurate description of the local protein structure. ORION has recently been improved, increasing by 5% the quality of its results. The ORION web server accepts a single protein sequence as input and searches homologous protein structures within minutes. Various databases such as PDB, SCOP and HOMSTRAD can be mined to find an appropriate structural template. For the modeling step, a protein 3D structure can be directly obtained from the selected template by MODELLER and displayed with global and local quality model estimation measures. The sequence and the predicted structure of 4 examples from the CAMEO server and a recent CASP11 target from the ‘Hard’ category (T0818-D1) are shown as pertinent examples. Our web server is accessible at http://www.dsimb.inserm.fr/ORION/. PMID:27319297

  12. ORION: a web server for protein fold recognition and structure prediction using evolutionary hybrid profiles.

    PubMed

    Ghouzam, Yassine; Postic, Guillaume; Guerin, Pierre-Edouard; de Brevern, Alexandre G; Gelly, Jean-Christophe

    2016-06-20

    Protein structure prediction based on comparative modeling is the most efficient way to produce structural models when it can be performed. ORION is a dedicated webserver based on a new strategy that performs this task. The identification by ORION of suitable templates is performed using an original profile-profile approach that combines sequence and structure evolution information. Structure evolution information is encoded into profiles using structural features, such as solvent accessibility and local conformation -with Protein Blocks-, which give an accurate description of the local protein structure. ORION has recently been improved, increasing by 5% the quality of its results. The ORION web server accepts a single protein sequence as input and searches homologous protein structures within minutes. Various databases such as PDB, SCOP and HOMSTRAD can be mined to find an appropriate structural template. For the modeling step, a protein 3D structure can be directly obtained from the selected template by MODELLER and displayed with global and local quality model estimation measures. The sequence and the predicted structure of 4 examples from the CAMEO server and a recent CASP11 target from the 'Hard' category (T0818-D1) are shown as pertinent examples. Our web server is accessible at http://www.dsimb.inserm.fr/ORION/.

  13. Understanding Genetic Toxicity Through Data Mining: The ...

    EPA Pesticide Factsheets

    This paper demonstrates the usefulness of representing a chemical by its structural features and the use of these features to profile a battery of tests rather than relying on a single toxicity test of a given chemical. This paper presents data mining/profiling methods applied in a weight-of-evidence approach to assess potential for genetic toxicity, and to guide the development of intelligent testing strategies.

  14. Selective 2′-hydroxyl acylation analyzed by primer extension and mutational profiling (SHAPE-MaP) for direct, versatile, and accurate RNA structure analysis

    PubMed Central

    Smola, Matthew J.; Rice, Greggory M.; Busan, Steven; Siegfried, Nathan A.; Weeks, Kevin M.

    2016-01-01

    SHAPE chemistries exploit small electrophilic reagents that react with the 2′-hydroxyl group to interrogate RNA structure at single-nucleotide resolution. Mutational profiling (MaP) identifies modified residues based on the ability of reverse transcriptase to misread a SHAPE-modified nucleotide and then counting the resulting mutations by massively parallel sequencing. The SHAPE-MaP approach measures the structure of large and transcriptome-wide systems as accurately as for simple model RNAs. This protocol describes the experimental steps, implemented over three days, required to perform SHAPE probing and construct multiplexed SHAPE-MaP libraries suitable for deep sequencing. These steps include RNA folding and SHAPE structure probing, mutational profiling by reverse transcription, library construction, and sequencing. Automated processing of MaP sequencing data is accomplished using two software packages. ShapeMapper converts raw sequencing files into mutational profiles, creates SHAPE reactivity plots, and provides useful troubleshooting information, often within an hour. SuperFold uses these data to model RNA secondary structures, identify regions with well-defined structures, and visualize probable and alternative helices, often in under a day. We illustrate these algorithms with the E. coli thiamine pyrophosphate riboswitch, E. coli 16S rRNA, and HIV-1 genomic RNAs. SHAPE-MaP can be used to make nucleotide-resolution biophysical measurements of individual RNA motifs, rare components of complex RNA ensembles, and entire transcriptomes. The straightforward MaP strategy greatly expands the number, length, and complexity of analyzable RNA structures. PMID:26426499
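
    The counting step described above can be sketched in a few lines: a per-nucleotide mutation rate is the number of mutations observed at a position divided by the read depth there. The background subtraction against an untreated sample shown below is a simplifying assumption for illustration; it is not a description of ShapeMapper's actual processing, and both function names are hypothetical.

```python
def mutation_rates(mutation_counts, read_depths):
    """Per-nucleotide mutation rate = mutations / read depth.
    Positions with no coverage are reported as None."""
    return [m / d if d > 0 else None
            for m, d in zip(mutation_counts, read_depths)]

def background_subtracted(modified_rates, untreated_rates):
    """Subtract the untreated-sample rate from the reagent-treated rate.
    This simple subtraction is an illustrative assumption."""
    return [None if m is None or u is None else m - u
            for m, u in zip(modified_rates, untreated_rates)]
```

    For example, 30 mutations in 1,000 reads in the treated sample against 5 in 1,000 in the untreated sample gives a background-subtracted rate of about 0.025 at that nucleotide.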

  15. Guide and position of the International Society of Nutrigenetics/Nutrigenomics on personalised nutrition: Part 1 - fields of precision nutrition

    USDA-ARS's Scientific Manuscript database

    Diversity in the genetic profile between individuals and specific ethnic groups affects nutrient requirements, metabolism and response to nutritional and dietary interventions. Indeed, individuals respond differently to lifestyle interventions (diet, physical activity, smoking, etc.). The sequencing...

  16. ReprDB and panDB: minimalist databases with maximal microbial representation.

    PubMed

    Zhou, Wei; Gay, Nicole; Oh, Julia

    2018-01-18

    Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible to various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets. reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.

  17. Mining SNPs from EST sequences using filters and ensemble classifiers.

    PubMed

    Wang, J; Zou, Q; Guo, M Z

    2010-05-04

    Abundant single nucleotide polymorphisms (SNPs) provide the most complete information for genome-wide association studies. However, due to the bottleneck of manual discovery of putative SNPs and the inaccessibility of the original sequencing reads, it is essential to develop a more efficient and accurate computational method for automated SNP detection. We propose a novel computational method to rapidly find true SNPs in publicly available EST (expressed sequence tag) databases; this method is implemented as SNPDigger. EST sequences are clustered and aligned. SNP candidates are then obtained according to a measure of redundant frequency. Several new informative biological features, such as the structural neighbor profiles and the physical position of the SNP, were extracted from EST sequences, and the effectiveness of these features was demonstrated. An ensemble classifier, which employs a carefully selected feature set, was included for the imbalanced training data. The sensitivity and specificity of our method both exceeded 80% for human genetic data in the cross validation. Our method enables detection of SNPs from the user's own EST dataset and can be used on species for which there is no genome data. Our tests showed that this method can effectively guide SNP discovery in ESTs and will be useful in reducing the cost of biological analyses.
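
    The abstract does not define its "measure of redundant frequency". One plausible reading, sketched below purely as an assumption, is that a column of an aligned EST cluster becomes a SNP candidate when the second most common base is seen in at least a minimum number of reads. The function name and threshold are hypothetical and this is not SNPDigger's implementation.

```python
from collections import Counter

def snp_candidates(aligned_reads, min_minor_count=2):
    """Scan the columns of equal-length aligned reads and report positions
    where a minor allele occurs at least min_minor_count times.
    Returns a list of (position, major_base, minor_base, minor_count)."""
    candidates = []
    for pos, column in enumerate(zip(*aligned_reads)):
        # Ignore gaps and ambiguous characters when counting bases.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) < 2:
            continue
        (major, _), (minor, minor_n) = counts.most_common(2)
        if minor_n >= min_minor_count:
            candidates.append((pos, major, minor, minor_n))
    return candidates
```

    Requiring the minor allele in more than one read is a common way to separate likely polymorphisms from isolated sequencing errors; in the workflow described, such candidates would then be passed to the classifier.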

  18. The discovery and preclinical evaluation of BMS-707035, a potent HIV-1 integrase strand transfer inhibitor.

    PubMed

    Naidu, B Narasimhulu; Walker, Michael A; Sorenson, Margaret E; Ueda, Yasutsugu; Matiskella, John D; Connolly, Timothy P; Dicker, Ira B; Lin, Zeyu; Bollini, Sagarika; Terry, Brian J; Higley, Helen; Zheng, Ming; Parker, Dawn D; Wu, Dedong; Adams, Stephen; Krystal, Mark R; Meanwell, Nicholas A

    2018-07-01

    BMS-707035 is an HIV-1 integrase strand transfer inhibitor (INSTI) discovered by systematic optimization of N-methylpyrimidinone carboxamides guided by structure-activity relationships (SARs) and the single crystal X-ray structure of compound 10. It was rationalized that the unexpectedly advantageous profiles of N-methylpyrimidinone carboxamides with a saturated C2-substituent may be due, in part, to the geometric relationship between the C2-substituent and the pyrimidinone core. The single crystal X-ray structure of 10 provided support for this reasoning and guided the design of a spirocyclic series 12 which led to discovery of the morpholino-fused pyrimidinone series 13. Several carboxamides derived from this bicyclic scaffold displayed improved antiviral activity and pharmacokinetic profiles when compared with corresponding spirocyclic analogs. Based on the excellent antiviral activity, preclinical profiles and acceptable in vitro and in vivo toxicity profiles, 13a (BMS-707035) was selected for advancement into phase I clinical trials. Copyright © 2018 Elsevier Ltd. All rights reserved.

  19. Selective 2'-hydroxyl acylation analyzed by primer extension and mutational profiling (SHAPE-MaP) for direct, versatile and accurate RNA structure analysis.

    PubMed

    Smola, Matthew J; Rice, Greggory M; Busan, Steven; Siegfried, Nathan A; Weeks, Kevin M

    2015-11-01

    Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemistries exploit small electrophilic reagents that react with 2'-hydroxyl groups to interrogate RNA structure at single-nucleotide resolution. Mutational profiling (MaP) identifies modified residues by using reverse transcriptase to misread a SHAPE-modified nucleotide and then counting the resulting mutations by massively parallel sequencing. The SHAPE-MaP approach measures the structure of large and transcriptome-wide systems as accurately as can be done for simple model RNAs. This protocol describes the experimental steps, implemented over 3 d, that are required to perform SHAPE probing and to construct multiplexed SHAPE-MaP libraries suitable for deep sequencing. Automated processing of MaP sequencing data is accomplished using two software packages. ShapeMapper converts raw sequencing files into mutational profiles, creates SHAPE reactivity plots and provides useful troubleshooting information. SuperFold uses these data to model RNA secondary structures, identify regions with well-defined structures and visualize probable and alternative helices, often in under 1 d. SHAPE-MaP can be used to make nucleotide-resolution biophysical measurements of individual RNA motifs, rare components of complex RNA ensembles and entire transcriptomes.

  20. NMRDSP: an accurate prediction of protein shape strings from NMR chemical shifts and sequence data.

    PubMed

    Mao, Wusong; Cong, Peisheng; Wang, Zhiheng; Lu, Longjian; Zhu, Zhongliang; Li, Tonghua

    2013-01-01

    A shape string is a structural sequence and an extremely important representation of protein backbone conformations. Nuclear magnetic resonance chemical shifts give a strong correlation with the local protein structure, and are exploited to predict protein structures in conjunction with computational approaches. Here we demonstrate a novel approach, NMRDSP, which can accurately predict the protein shape string based on nuclear magnetic resonance chemical shifts and structural profiles obtained from sequence data. The NMRDSP uses six chemical shifts (HA, H, N, CA, CB and C) and eight elements of structure profiles as features, a non-redundant set (1,003 entries) as the training set, and a conditional random field as a classification algorithm. For an independent testing set (203 entries), we achieved an accuracy of 75.8% for S8 (the eight states accuracy) and 87.8% for S3 (the three states accuracy). This is higher than only using chemical shifts or sequence data, and confirms that the chemical shift and the structure profile are significant features for shape string prediction and their combination prominently improves the accuracy of the predictor. We have constructed the NMRDSP web server and believe it could be employed to provide a solid platform to predict other protein structures and functions. The NMRDSP web server is freely available at http://cal.tongji.edu.cn/NMRDSP/index.jsp.

  1. Energy--Structure--Life. A Learning System for Understanding Science, Book Two.

    ERIC Educational Resources Information Center

    Bixby, Louis W.; And Others

    This learning guide contains materials for the second year of Energy/Structure/Life, a two-year high school program in integrated science. The guide is programmed to permit the student to proceed on his own at a self-determined pace. The two-year course is a sequence of physics, chemistry, and biology with the chemical (continued from the first…

  2. Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies.

    PubMed

    Card, Daren C; Schield, Drew R; Reyes-Velasco, Jacobo; Fujita, Matthew K; Andrew, Audra L; Oyler-McCance, Sara J; Fike, Jennifer A; Tomback, Diana F; Ruggiero, Robert P; Castoe, Todd A

    2014-01-01

    As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5-5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.

  3. Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies

    USGS Publications Warehouse

    Card, Daren C.; Schield, Drew R.; Reyes-Velasco, Jacobo; Fujita, Matthew K.; Andrew, Audra L.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Tomback, Diana F.; Ruggiero, Robert P.; Castoe, Todd A.

    2014-01-01

    As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (~3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.

  4. Expression profiling of the mouse early embryo: Reflections and Perspectives

    PubMed Central

    Ko, Minoru S. H.

    2008-01-01

    The laboratory mouse plays an important role in our understanding of early mammalian development and provides an invaluable model for human early embryos, which are difficult to study for ethical and technical reasons. The comprehensive collection of cDNA clones, their sequences, and complete genome sequence information accumulated over the last two decades has provided even more advantages to mouse models. Here, progress in global gene expression profiling in early mouse embryos and, to some extent, stem cells is reviewed, and future directions and challenges are discussed. The discussion includes a restatement of global gene expression profiles as a snapshot of cellular status, and the subsequent distinction between the differentiation state and the physiological state of cells. The discussion then extends to the biological problems that can be addressed only through global expression profiling, which include: a bird’s-eye view of global gene expression changes, a molecular index for developmental potency, cell lineage trajectory, microarray-guided cell manipulation, and the possibility of delineating gene regulatory cascades and networks. PMID:16739220

  5. Guide-substrate base-pairing requirement for box H/ACA RNA-guided RNA pseudouridylation.

    PubMed

    De Zoysa, Meemanage D; Wu, Guowei; Katz, Raviv; Yu, Yi-Tao

    2018-06-05

    Box H/ACA RNAs are a group of small RNAs found in abundance in eukaryotes (as well as in archaea). Although their sequences differ, eukaryotic box H/ACA RNAs all share the same unique hairpin-hinge-hairpin-tail structure. Almost all of them function as guides that primarily direct pseudouridylation of rRNAs and spliceosomal snRNAs at specific sites. Although box H/ACA RNA-guided pseudouridylation has been extensively studied, the detailed rules governing this reaction, especially those concerning the guide RNA-substrate RNA base-pairing interactions that determine the specificity and efficiency of pseudouridylation, are still not exactly clear. This is particularly relevant given that the lengths of the guide sequences involved in base-pairing vary from one box H/ACA RNA to another. Here, we carry out a detailed investigation into guide-substrate base-pairing interactions, and identify the minimum number of base-pairs (8), required for RNA-guided pseudouridylation. In addition, we find that the pseudouridylation pocket, present in each hairpin of box H/ACA RNA, exhibits flexibility in fitting slightly different substrate sequences. Our results are consistent across three independent pseudouridylation pockets tested, suggesting that our findings are generally applicable to box H/ACA RNA-guided RNA pseudouridylation. Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  6. Structure of yeast Argonaute with guide RNA

    PubMed Central

    Nakanishi, Kotaro; Weinberg, David E.; Bartel, David P.; Patel, Dinshaw J.

    2012-01-01

    The RNA-induced silencing complex, comprising Argonaute and guide RNA, mediates RNA interference. Here we report the 3.2 Å crystal structure of Kluyveromyces Argonaute (KpAGO) fortuitously complexed with guide RNA originating from small-RNA duplexes autonomously loaded and processed by recombinant KpAGO. Despite their diverse sequences, guide-RNA nucleotides 1–8 are positioned similarly, with sequence-independent contacts to bases, phosphates and 2′-hydroxyl groups pre-organizing the backbone of nucleotides 2–8 in a near–A-form conformation. Compared with prokaryotic Argonautes, KpAGO has numerous surface-exposed insertion segments, with a cluster of conserved insertions repositioning the N domain to enable full propagation of guide–target pairing. Compared with Argonautes in inactive conformations, KpAGO has a hydrogen-bond network that stabilizes an expanded and repositioned loop, which inserts an invariant glutamate into the catalytic pocket. Mutation analyses and analogies to Ribonuclease H indicate that insertion of this glutamate finger completes a universally conserved catalytic tetrad, thereby activating Argonaute for RNA cleavage. PMID:22722195

  7. Functional evolution and structural conservation in chimeric cytochromes p450: calibrating a structure-guided approach.

    PubMed

    Otey, Christopher R; Silberg, Jonathan J; Voigt, Christopher A; Endelman, Jeffrey B; Bandara, Geethani; Arnold, Frances H

    2004-03-01

    Recombination generates chimeric proteins whose ability to fold depends on minimizing structural perturbations that result when portions of the sequence are inherited from different parents. These chimeric sequences can display functional properties characteristic of the parents or acquire entirely new functions. Seventeen chimeras were generated from two CYP102 members of the functionally diverse cytochrome p450 family. Chimeras predicted to have limited structural disruption, as defined by the SCHEMA algorithm, displayed CO binding spectra characteristic of folded p450s. Even this small population exhibited significant functional diversity: chimeras displayed altered substrate specificities, a wide range in thermostabilities, up to a 40-fold increase in peroxidase activity, and ability to hydroxylate a substrate toward which neither parent heme domain shows detectable activity. These results suggest that SCHEMA-guided recombination can be used to generate diverse p450s for exploring function evolution within the p450 structural framework.
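
    The idea of scoring a chimera by its predicted structural disruption can be sketched as follows. This assumes the commonly described SCHEMA formulation, in which a residue-residue contact counts as broken when the amino-acid pair at those two positions in the chimera does not occur together in any parent; the sequences, contact list, and function names below are hypothetical toy inputs, not data from the study.

```python
def build_chimera(parents, block_origin):
    """Assemble a chimera by taking each aligned position from the indicated parent."""
    return "".join(parents[p][i] for i, p in enumerate(block_origin))


def schema_disruption(chimera, parents, contacts):
    """Count contacts (i, j) whose residue pair in the chimera is absent from every parent."""
    broken = 0
    for i, j in contacts:
        parent_pairs = {(p[i], p[j]) for p in parents}
        if (chimera[i], chimera[j]) not in parent_pairs:
            broken += 1
    return broken


# Toy aligned parents and a toy contact map (hypothetical).
parents = ["ACDE", "AGHE"]
contacts = [(0, 3), (1, 2)]
chimera = build_chimera(parents, [0, 0, 1, 1])  # first half parent 0, second half parent 1
disruption = schema_disruption(chimera, parents, contacts)
print(chimera, disruption)
```

    Here the contact between positions 1 and 2 pairs C with H, a combination seen in neither parent, so it is counted as one disrupted contact; chimeras with fewer such contacts would be predicted to be more likely to fold.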

  8. SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

    PubMed

    Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

    2015-01-01

    Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.

  9. Information Security: A Scientometric Study of the Profile, Structure, and Dynamics of an Emerging Scholarly Specialty

    ERIC Educational Resources Information Center

    Olijnyk, Nicholas Victor

    2014-01-01

    The central aim of the current research is to explore and describe the profile, dynamics, and structure of the information security specialty. This study's objectives are guided by four research questions: 1. What are the salient features of information security as a specialty? 2. How has the information security specialty emerged and evolved from…

  10. Rapid detection, classification and accurate alignment of up to a million or more related protein sequences.

    PubMed

    Neuwald, Andrew F

    2009-08-01

    The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.

  11. CATH-Gene3D: Generation of the Resource and Its Use in Obtaining Structural and Functional Annotations for Protein Sequences.

    PubMed

    Dawson, Natalie L; Sillitoe, Ian; Lees, Jonathan G; Lam, Su Datt; Orengo, Christine A

    2017-01-01

    This chapter describes the generation of the data in the CATH-Gene3D online resource and how it can be used to study protein domains and their evolutionary relationships. Methods will be presented for: comparing protein structures, recognizing homologs, predicting domain structures within protein sequences, and subclassifying superfamilies into functionally pure families, together with a guide on using the webpages.

  12. Structural Basis for Guide RNA Processing and Seed-Dependent DNA Targeting by CRISPR-Cas12a.

    PubMed

    Swarts, Daan C; van der Oost, John; Jinek, Martin

    2017-04-20

    The CRISPR-associated protein Cas12a (Cpf1), which has been repurposed for genome editing, possesses two distinct nuclease activities: endoribonuclease activity for processing its own guide RNAs and RNA-guided DNase activity for target DNA cleavage. To elucidate the molecular basis of both activities, we determined crystal structures of Francisella novicida Cas12a bound to guide RNA and in complex with an R-loop formed by a non-cleavable guide RNA precursor and a full-length target DNA. Corroborated by biochemical experiments, these structures reveal the mechanisms of guide RNA processing and pre-ordering of the seed sequence in the guide RNA that primes Cas12a for target DNA binding. Furthermore, the R-loop complex structure reveals the strand displacement mechanism that facilitates guide-target hybridization and suggests a mechanism for double-stranded DNA cleavage involving a single active site. Together, these insights advance our mechanistic understanding of Cas12a enzymes and may contribute to further development of genome editing technologies. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Sequence-structure mapping errors in the PDB: OB-fold domains

    PubMed Central

    Venclovas, Česlovas; Ginalski, Krzysztof; Kang, Chulhee

    2004-01-01

    The Protein Data Bank (PDB) is the single most important repository of structural data for proteins and other biologically relevant molecules. Therefore, it is critically important to keep the PDB data, as much as possible, error-free. In this study, we have analyzed PDB crystal structures possessing oligonucleotide/oligosaccharide binding (OB)-fold, one of the highly populated folds, for the presence of sequence-structure mapping errors. Using energy-based structure quality assessment coupled with sequence analyses, we have found that there are at least five OB-structures in the PDB that have regions where sequences have been incorrectly mapped onto the structure. We have demonstrated that the combination of these computation techniques is effective not only in detecting sequence-structure mapping errors, but also in providing guidance to correct them. Namely, we have used results of computational analysis to direct a revision of X-ray data for one of the PDB entries containing a fairly inconspicuous sequence-structure mapping error. The revised structure has been deposited with the PDB. We suggest use of computational energy assessment and sequence analysis techniques to facilitate structure determination when homologs having known structure are available to use as a reference. Such computational analysis may be useful in either guiding the sequence-structure assignment process or verifying the sequence mapping within poorly defined regions. PMID:15133161

  14. A high-throughput approach to profile RNA structure.

    PubMed

    Delli Ponti, Riccardo; Marti, Stefanie; Armaos, Alexandros; Tartaglia, Gian Gaetano

    2017-03-17

    Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2'-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; Mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; Human and Yeast transcriptomes) as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as Xist (17 900 nucleotides; Area Under the ROC Curve, AUC, of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS in thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power by up to 30%. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Argonaute-based programmable RNase as a tool for cleavage of highly-structured RNA.

    PubMed

    Dayeh, Daniel M; Cantara, William A; Kitzrow, Jonathan P; Musier-Forsyth, Karin; Nakanishi, Kotaro

    2018-06-12

    The recent identification and development of RNA-guided enzymes for programmable cleavage of target nucleic acids offers exciting possibilities for both therapeutic and biotechnological applications. However, critical challenges such as expensive guide RNAs and inability to predict the efficiency of target recognition, especially for highly-structured RNAs, remain to be addressed. Here, we introduce a programmable RNA restriction enzyme, based on a budding yeast Argonaute (AGO), programmed with cost-effective 23-nucleotide (nt) single-stranded DNAs as guides. DNA guides offer the advantage that diverse sequences can be easily designed and purchased, enabling high-throughput screening to identify optimal recognition sites in the target RNA. Using this DNA-induced slicing complex (DISC) programmed with 11 different guide DNAs designed to span the sequence, sites of cleavage were identified in the 352-nt human immunodeficiency virus type 1 5'-untranslated region. This assay, coupled with primer extension and capillary electrophoresis, allows detection and relative quantification of all DISC-cleavage sites simultaneously in a single reaction. Comparison between DISC cleavage and RNase H cleavage reveals that DISC not only cleaves solvent-exposed sites, but also sites that become more accessible upon DISC binding. This study demonstrates the advantages of the DISC system for programmable cleavage of highly-structured, functional RNAs.

  16. Hidden Markov models incorporating fuzzy measures and integrals for protein sequence identification and alignment.

    PubMed

    Bidargaddi, Niranjan P; Chetty, Madhu; Kamruzzaman, Joarder

    2008-06-01

    Profile hidden Markov models (HMMs) based on classical HMMs have been widely applied for protein sequence identification. The formulation of the forward and backward variables in profile HMMs is made under the statistical independence assumption of probability theory. We propose a fuzzy profile HMM to overcome the limitations of that assumption and to achieve an improved alignment for protein sequences belonging to a given family. The proposed model fuzzifies the forward and backward variables by incorporating Sugeno fuzzy measures and Choquet integrals, thus further extending the generalized HMM. Based on the fuzzified forward and backward variables, we propose a fuzzy Baum-Welch parameter estimation algorithm for profiles. The strong correlations and the sequence preference involved in protein structures make this fuzzy-architecture-based model a suitable candidate for building profiles of a given family, since fuzzy sets can handle uncertainties better than classical methods.
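
    As background for the Choquet integral mentioned above, here is a minimal sketch of the generic discrete Choquet integral with respect to a fuzzy measure. It is not the authors' fuzzified forward or backward variable; the function names, the example values, and the weights are hypothetical. With an additive measure the integral reduces to an ordinary weighted sum, which is the classical case the fuzzy model generalizes.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of `values` (dict: element -> non-negative score)
    with respect to `measure` (function: frozenset of elements -> number in [0, 1])."""
    ordered = sorted(values, key=values.get)  # elements by ascending score
    total, previous = 0.0, 0.0
    for k, element in enumerate(ordered):
        upper_set = frozenset(ordered[k:])  # elements scoring at least values[element]
        total += (values[element] - previous) * measure(upper_set)
        previous = values[element]
    return total


# Additive measure built from weights that sum to 1 (hypothetical numbers).
weights = {"a": 0.5, "b": 0.3, "c": 0.2}


def additive(subset):
    return sum(weights[e] for e in subset)


scores = {"a": 0.2, "b": 0.5, "c": 0.9}
result = choquet_integral(scores, additive)
weighted_sum = sum(weights[e] * scores[e] for e in scores)
print(result, weighted_sum)
```

    Replacing `additive` with a non-additive measure (for example a Sugeno lambda-measure) lets the integral reward or penalize combinations of elements, which is the kind of dependence the independence assumption cannot express.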

  17. Ribosome profiling reveals the what, when, where and how of protein synthesis.

    PubMed

    Brar, Gloria A; Weissman, Jonathan S

    2015-11-01

    Ribosome profiling, which involves the deep sequencing of ribosome-protected mRNA fragments, is a powerful tool for globally monitoring translation in vivo. The method has facilitated discovery of the regulation of gene expression underlying diverse and complex biological processes, of important aspects of the mechanism of protein synthesis, and even of new proteins, by providing a systematic approach for experimental annotation of coding regions. Here, we introduce the methodology of ribosome profiling and discuss examples in which this approach has been a key factor in guiding biological discovery, including its prominent role in identifying thousands of novel translated short open reading frames and alternative translation products.

  18. RNomics and Modomics in the halophilic archaea Haloferax volcanii: identification of RNA modification genes

    PubMed Central

    Grosjean, Henri; Gaspin, Christine; Marck, Christian; Decatur, Wayne A; de Crécy-Lagard, Valérie

    2008-01-01

    Background: Naturally occurring RNAs contain numerous enzymatically altered nucleosides. Differences in RNA populations (RNomics) and patterns of RNA modifications (Modomics) depend on the organism analyzed and are two of the criteria that distinguish the three kingdoms of life. Although the genomic sequences of the RNA molecules can be derived from whole genome sequence information, the modification profile cannot, and requires either direct sequencing of the RNAs or predictive methods based on the presence or absence of the modification genes. Results: By employing a comparative genomics approach, we predicted almost all of the genes coding for the t+rRNA modification enzymes in the mesophilic moderate halophile Haloferax volcanii. These encode both guide RNAs and enzymes. Some are orthologous to previously identified genes in Archaea, Bacteria or in Saccharomyces cerevisiae, but several are original predictions. Conclusion: The number of modifications in t+rRNAs in the halophilic archaeon is surprisingly low when compared with other Archaea or Bacteria, particularly the hyperthermophilic organisms. This may result from the specific lifestyle of halophiles that require high intracellular salt concentration for survival. This salt content could allow RNA to maintain its functional structural integrity with fewer modifications. We predict that the few modifications present must be particularly important for decoding or accuracy of translation, or are modifications that cannot be functionally replaced by the electrostatic interactions provided by the surrounding salt ions. This analysis also guides future experimental validation work aiming to complete the understanding of the function of RNA modifications in archaeal translation. PMID:18844986

  19. Impact of genomic profiling on the treatment and outcomes of patients with advanced gastrointestinal malignancies.

    PubMed

    Dhir, Mashaal; Choudry, Haroon A; Holtzman, Matthew P; Pingpank, James F; Ahrendt, Steven A; Zureikat, Amer H; Hogg, Melissa E; Bartlett, David L; Zeh, Herbert J; Singhi, Aatur D; Bahary, Nathan

    2017-01-01

    The impact of genomic profiling on the outcomes of patients with advanced gastrointestinal (GI) malignancies remains unknown. The primary objectives of the study were to investigate the clinical benefit of genomic-guided therapy, defined as complete response (CR), partial response (PR), or stable disease (SD) at 3 months, and its impact on progression-free survival (PFS) in patients with advanced GI malignancies. Clinical and genomic data of all consecutive GI tumor samples from April, 2013 to April, 2016 sequenced by FoundationOne were obtained and analyzed. A total of 101 samples from 97 patients were analyzed. Ninety-eight samples from 95 patients could be amplified making this approach feasible in 97% of the samples. After removing duplicates, 95 samples from 95 patients were included in the further analysis. Median time from specimen collection to reporting was 11 days. Genomic alteration-guided treatment recommendations were considered new and clinically relevant in 38% (36/95) of the patients. Rapid decline in functional status was noted in 25% (9/36) of these patients who could therefore not receive genomic-guided therapy. Genomic-guided therapy was utilized in 13 patients (13.7%) and 7 patients (7.4%) experienced clinical benefit (6 PR and 1 SD). Among these seven patients, median PFS was 10 months with some ongoing durable responses. Genomic profiling-guided therapy can lead to clinical benefit in a subset of patients with advanced GI malignancies. Attempting genomic profiling earlier in the course of treatment prior to functional decline may allow more patients to benefit from these therapies. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
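
    The percentages reported above follow directly from the stated counts. The short check below recomputes them; the helper name `pct` is introduced here only for illustration.

```python
def pct(part, whole, digits=0):
    """Percentage of `part` in `whole`, rounded to `digits` decimal places."""
    return round(100 * part / whole, digits)


feasibility = pct(98, 101)            # samples that could be amplified
new_recommendations = pct(36, 95)     # patients with new, clinically relevant recommendations
functional_decline = pct(9, 36)       # of those, patients with rapid functional decline
received_therapy = pct(13, 95, 1)     # patients given genomic-guided therapy
clinical_benefit = pct(7, 95, 1)      # patients with clinical benefit (6 PR + 1 SD)

print(feasibility, new_recommendations, functional_decline,
      received_therapy, clinical_benefit)
```

    Note the changing denominators: feasibility is per sample (101), the recommendation, therapy and benefit rates are per patient (95), and the functional-decline rate is relative to the 36 patients with new recommendations.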

  20. Moving Stimuli Facilitate Synchronization But Not Temporal Perception

    PubMed Central

    Silva, Susana; Castro, São Luís

    2016-01-01

    Recent studies have shown that a moving visual stimulus (e.g., a bouncing ball) facilitates synchronization compared to a static stimulus (e.g., a flashing light), and that it can even be as effective as an auditory beep. We asked a group of participants to perform different tasks with four stimulus types: beeps, siren-like sounds, visual flashes (static) and bouncing balls. First, participants performed synchronization with isochronous sequences (stimulus-guided synchronization), followed by a continuation phase in which the stimulus was internally generated (imagery-guided synchronization). Then they performed a perception task, in which they judged whether the final part of a temporal sequence was compatible with the previous beat structure (stimulus-guided perception). Similar to synchronization, an imagery-guided variant was added, in which sequences contained a gap in between (imagery-guided perception). Balls outperformed flashes and matched beeps (powerful ball effect) in stimulus-guided synchronization but not in perception (stimulus- or imagery-guided). In imagery-guided synchronization, performance accuracy decreased for beeps and balls, but not for flashes and sirens. Our findings suggest that the advantages of moving visual stimuli over static ones are grounded in action rather than perception, and they support the hypothesis that the sensorimotor coupling mechanisms for auditory (beeps) and moving visual stimuli (bouncing balls) overlap. PMID:27909419

  1. Moving Stimuli Facilitate Synchronization But Not Temporal Perception.

    PubMed

    Silva, Susana; Castro, São Luís

    2016-01-01

    Recent studies have shown that a moving visual stimulus (e.g., a bouncing ball) facilitates synchronization compared to a static stimulus (e.g., a flashing light), and that it can even be as effective as an auditory beep. We asked a group of participants to perform different tasks with four stimulus types: beeps, siren-like sounds, visual flashes (static) and bouncing balls. First, participants performed synchronization with isochronous sequences (stimulus-guided synchronization), followed by a continuation phase in which the stimulus was internally generated (imagery-guided synchronization). Then they performed a perception task, in which they judged whether the final part of a temporal sequence was compatible with the previous beat structure (stimulus-guided perception). Similar to synchronization, an imagery-guided variant was added, in which sequences contained a gap in between (imagery-guided perception). Balls outperformed flashes and matched beeps (powerful ball effect) in stimulus-guided synchronization but not in perception (stimulus- or imagery-guided). In imagery-guided synchronization, performance accuracy decreased for beeps and balls, but not for flashes and sirens. Our findings suggest that the advantages of moving visual stimuli over static ones are grounded in action rather than perception, and they support the hypothesis that the sensorimotor coupling mechanisms for auditory (beeps) and moving visual stimuli (bouncing balls) overlap.

  2. Determinants for DNA target structure selectivity of the human LINE-1 retrotransposon endonuclease.

    PubMed

    Repanas, Kostas; Zingler, Nora; Layer, Liliana E; Schumann, Gerald G; Perrakis, Anastassis; Weichenrieder, Oliver

    2007-01-01

    The human LINE-1 endonuclease (L1-EN) is the targeting endonuclease encoded by the human LINE-1 (L1) retrotransposon. L1-EN guides the genomic integration of new L1 and Alu elements that presently account for approximately 28% of the human genome. L1-EN bears considerable technological interest, because its target selectivity may ultimately be engineered to allow the site-specific integration of DNA into defined genomic locations. Based on the crystal structure, we generated L1-EN mutants to analyze and manipulate DNA target site recognition. Crystal structures and their dynamic and functional analysis show entire loop grafts to be feasible, resulting in altered specificity, while individual point mutations do not change the nicking pattern of L1-EN. Structural parameters of the DNA target seem more important for recognition than the nucleotide sequence, and nicking profiles on DNA oligonucleotides in vitro are less well defined than the respective integration site consensus in vivo. This suggests that additional factors other than the DNA nicking specificity of L1-EN contribute to the targeted integration of non-LTR retrotransposons.

  3. Folding and Stabilization of Native-Sequence-Reversed Proteins

    PubMed Central

    Zhang, Yuanzhao; Weber, Jeffrey K; Zhou, Ruhong

    2016-01-01

    Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols. PMID:27113844

  4. Folding and Stabilization of Native-Sequence-Reversed Proteins

    NASA Astrophysics Data System (ADS)

    Zhang, Yuanzhao; Weber, Jeffrey K.; Zhou, Ruhong

    2016-04-01

    Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols.

  5. Sequence-structure relationships in RNA loops: establishing the basis for loop homology modeling.

    PubMed

    Schudoma, Christian; May, Patrick; Nikiforova, Viktoria; Walther, Dirk

    2010-01-01

    The specific function of RNA molecules frequently resides in their seemingly unstructured loop regions. We performed a systematic analysis of RNA loops extracted from experimentally determined three-dimensional structures of RNA molecules. A comprehensive loop-structure data set was created and organized into distinct clusters based on structural and sequence similarity. We detected clear evidence of the hallmark of homology present in the sequence-structure relationships in loops. Loops differing by <25% in sequence identity fold into very similar structures. Thus, our results support the application of homology modeling for RNA loop model building. We established a threshold that may guide the sequence divergence-based selection of template structures for RNA loop homology modeling. Of all possible sequences that are, under the assumption of isosteric relationships, theoretically compatible with actual sequences observed in RNA structures, only a small fraction is contained in the Rfam database of RNA sequences and classes implying that the actual RNA loop space may consist of a limited number of unique loop structures and conserved sequences. The loop-structure data sets are made available via an online database, RLooM. RLooM also offers functionalities for the modeling of RNA loop structures in support of RNA engineering and design efforts.
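
    As a rough illustration of the divergence-based template selection the abstract describes, the sketch below computes percent identity between equal-length loop sequences and keeps candidate templates under a divergence cutoff. The function names, the equal-length restriction, and the 25% default are assumptions made for this example; this is not the RLooM implementation.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def candidate_templates(query: str, library: list[str],
                        max_divergence: float = 25.0) -> list[str]:
    """Keep same-length library loops whose divergence (100 - identity)
    from the query is below the cutoff. The cutoff value is illustrative."""
    return [
        loop for loop in library
        if len(loop) == len(query)
        and 100.0 - percent_identity(query, loop) < max_divergence
    ]
```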

  6. A global view of structure–function relationships in the tautomerase superfamily

    PubMed Central

    Davidson, Rebecca; Baas, Bert-Jan; Akiva, Eyal; Holliday, Gemma L.; Polacco, Benjamin J.; LeVieux, Jake A.; Pullara, Collin R.; Zhang, Yan Jessie; Whitman, Christian P.

    2018-01-01

    The tautomerase superfamily (TSF) consists of more than 11,000 nonredundant sequences present throughout the biosphere. Characterized members have attracted much attention because of the unusual and key catalytic role of an N-terminal proline. These few characterized members catalyze a diverse range of chemical reactions, but the full scale of their chemical capabilities and biological functions remains unknown. To gain new insight into TSF structure–function relationships, we performed a global analysis of similarities across the entire superfamily and computed a sequence similarity network to guide classification into distinct subgroups. Our results indicate that TSF members are found in all domains of life, with most being present in bacteria. The eukaryotic members of the cis-3-chloroacrylic acid dehalogenase subgroup are limited to fungal species, whereas the macrophage migration inhibitory factor subgroup has wide eukaryotic representation (including mammals). Unexpectedly, we found that 346 TSF sequences lack Pro-1, of which 85% are present in the malonate semialdehyde decarboxylase subgroup. The computed network also enabled the identification of similarity paths, namely sequences that link functionally diverse subgroups and exhibit transitional structural features that may help explain reaction divergence. A structure-guided comparison of these linker proteins identified conserved transitions between them, and kinetic analysis paralleled these observations. Phylogenetic reconstruction of the linker set was consistent with these findings. Our results also suggest that contemporary TSF members may have evolved from a short 4-oxalocrotonate tautomerase–like ancestor followed by gene duplication and fusion. Our new linker-guided strategy can be used to enrich the discovery of sequence/structure/function transitions in other enzyme superfamilies. PMID:29184004
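
    The idea of using a sequence similarity network to guide subgrouping can be sketched as follows: keep only pairwise edges whose score meets a threshold, then read off the connected components. The score scale, the threshold, and the function name are hypothetical; the abstract does not state how the authors built or thresholded their network.

```python
def similarity_clusters(nodes, scored_pairs, threshold):
    """Group nodes into connected components of the thresholded network.

    nodes: list of sequence identifiers
    scored_pairs: iterable of (id_a, id_b, score) tuples (hypothetical scores)
    threshold: minimum score for an edge to be kept
    """
    # Build an adjacency map containing only edges at or above the threshold.
    adjacency = {n: set() for n in nodes}
    for a, b, score in scored_pairs:
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Depth-first traversal to collect each connected component.
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

    Raising the threshold splits the network into more, tighter clusters; lowering it merges them, which is how sequences linking otherwise separate subgroups become visible.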

  7. Genome-wide profiling of DNA-binding proteins using barcode-based multiplex Solexa sequencing.

    PubMed

    Raghav, Sunil Kumar; Deplancke, Bart

    2012-01-01

    Chromatin immunoprecipitation (ChIP) is a commonly used technique to detect the in vivo binding of proteins to DNA. ChIP is now routinely paired to microarray analysis (ChIP-chip) or next-generation sequencing (ChIP-Seq) to profile the DNA occupancy of proteins of interest on a genome-wide level. Because ChIP-chip introduces several biases, most notably due to the use of a fixed number of probes, ChIP-Seq has quickly become the method of choice as, depending on the sequencing depth, it is more sensitive, quantitative, and provides a greater binding site location resolution. With the ever increasing number of reads that can be generated per sequencing run, it has now become possible to analyze several samples simultaneously while maintaining sufficient sequence coverage, thus significantly reducing the cost per ChIP-Seq experiment. In this chapter, we provide a step-by-step guide on how to perform multiplexed ChIP-Seq analyses. As a proof-of-concept, we focus on the genome-wide profiling of RNA Polymerase II as measuring its DNA occupancy at different stages of any biological process can provide insights into the gene regulatory mechanisms involved. However, the protocol can also be used to perform multiplexed ChIP-Seq analyses of other DNA-binding proteins such as chromatin modifiers and transcription factors.
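
    Multiplexing several samples in one sequencing run implies a demultiplexing step afterwards. A minimal sketch, assuming each read begins with an exact-match sample barcode (the barcode sequences, sample names, and the "unassigned" bin are hypothetical and not taken from the protocol):

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign reads to samples by an exact barcode match at the read start.

    reads: iterable of read sequences (strings)
    barcodes: dict mapping barcode sequence -> sample name
    Returns a dict of sample name -> list of reads with the barcode trimmed.
    Reads matching no barcode are collected under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return dict(bins)
```

    Real pipelines usually tolerate a mismatch or two in the barcode and work on FASTQ records rather than bare strings; both are omitted here for brevity.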

  8. Two Low Coverage Bird Genomes and a Comparison of Reference-Guided versus De Novo Genome Assemblies

    PubMed Central

    Card, Daren C.; Schield, Drew R.; Reyes-Velasco, Jacobo; Fujita, Matthew K.; Andrew, Audra L.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Tomback, Diana F.; Ruggiero, Robert P.; Castoe, Todd A.

    2014-01-01

    As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies. PMID:25192061

  9. How Design Guides Learning from Matrix Diagrams

    ERIC Educational Resources Information Center

    van der Meij, Jan; van Amelsvoort, Marije; Anjewierden, Anjo

    2017-01-01

    Compared to text, diagrams are superior in their ability to structure and summarize information and to show relations between concepts and ideas. Perceptual cues, like arrows, are expected to improve the retention of diagrams by guiding the learner towards important elements or showing a preferred reading sequence. In our experiment, we analyzed…

  10. Distribution and Features of the Six Classes of Peroxiredoxins

    PubMed Central

    Poole, Leslie B.; Nelson, Kimberly J.

    2016-01-01

    Peroxiredoxins are cysteine-dependent peroxide reductases that group into 6 different, structurally discernable classes. In 2011, our research team reported the application of a bioinformatic approach called active site profiling to extract active site-proximal sequence segments from the 29 distinct, structurally-characterized peroxiredoxins available at the time. These extracted sequences were then used to create unique profiles for the six groups which were subsequently used to search GenBank(nr), allowing identification of ∼3500 peroxiredoxin sequences and their respective subgroups. Summarized in this minireview are the features and phylogenetic distributions of each of these peroxiredoxin subgroups; an example is also provided illustrating the use of the web accessible, searchable database known as PREX to identify subfamily-specific peroxiredoxin sequences for the organism Vitis vinifera (grape). PMID:26810075

  11. Sequence Stratigraphy of the Dakota Sandstone, Eastern San Juan Basin, New Mexico, and its Relationship to Reservoir Compartmentalization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Varney, Peter J.

    2002-04-23

    This research established the Dakota-outcrop sequence stratigraphy in part of the eastern San Juan Basin, New Mexico, and relates reservoir quality lithologies in depositional sequences to structure and reservoir compartmentalization in the South Lindrith Field area. The result was a predictive tool that will help guide further exploration and development.

  12. Protein Structure Determination using Metagenome sequence data

    PubMed Central

    Ovchinnikov, Sergey; Park, Hahnbeom; Varghese, Neha; Huang, Po-Ssu; Pavlopoulos, Georgios A.; Kim, David E.; Kamisetty, Hetunandan; Kyrpides, Nikos C.; Baker, David

    2017-01-01

    Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families, and that metagenome sequence data more than triples the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact based structure matching and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the PDB. This approach provides the representative models for large protein families originally envisioned as the goal of the protein structure initiative at a fraction of the cost. PMID:28104891

  13. Beyond small molecule SAR – using the dopamine D3 receptor crystal structure to guide drug design

    PubMed Central

    Keck, Thomas M.; Burzynski, Caitlin; Shi, Lei; Newman, Amy Hauck

    2016-01-01

    The dopamine D3 receptor is a target of pharmacotherapeutic interest in a variety of neurological disorders including schizophrenia, restless leg syndrome, and drug addiction. The high protein sequence homology between the D3 and D2 receptors has posed a challenge to developing D3 receptor-selective ligands whose behavioral actions can be attributed to D3 receptor engagement, in vivo. However, through primarily small molecule structure-activity relationship (SAR) studies, a variety of chemical scaffolds have been discovered over the past two decades that have resulted in several D3 receptor-selective ligands with high affinity and in vivo activity. Nevertheless, viable clinical candidates remain limited. The recent determination of the high-resolution crystal structure of the D3 receptor has invigorated structure-based drug design, providing refinements to the molecular dynamic models and testable predictions about receptor-ligand interactions. This review will highlight recent preclinical and clinical studies demonstrating potential utility of D3 receptor-selective ligands in the treatment of addiction. In addition, new structure-based rational drug design strategies for D3 receptor-selective ligands that complement traditional small molecule SAR to improve the selectivity and directed efficacy profiles are examined. PMID:24484980

  14. Residue contacts predicted by evolutionary covariance extend the application of ab initio molecular replacement to larger and more challenging protein folds.

    PubMed

    Simkovic, Felix; Thomas, Jens M H; Keegan, Ronan M; Winn, Martyn D; Mayans, Olga; Rigden, Daniel J

    2016-07-01

    For many protein families, the deluge of new sequence information together with new statistical protocols now allow the accurate prediction of contacting residues from sequence information alone. This offers the possibility of more accurate ab initio (non-homology-based) structure prediction. Such models can be used in structure solution by molecular replacement (MR) where the target fold is novel or is only distantly related to known structures. Here, AMPLE, an MR pipeline that assembles search-model ensembles from ab initio structure predictions ('decoys'), is employed to assess the value of contact-assisted ab initio models to the crystallographer. It is demonstrated that evolutionary covariance-derived residue-residue contact predictions improve the quality of ab initio models and, consequently, the success rate of MR using search models derived from them. For targets containing β-structure, decoy quality and MR performance were further improved by the use of a β-strand contact-filtering protocol. Such contact-guided decoys achieved 14 structure solutions from 21 attempted protein targets, compared with nine for simple Rosetta decoys. Previously encountered limitations were superseded in two key respects. Firstly, much larger targets of up to 221 residues in length were solved, which is far larger than the previously benchmarked threshold of 120 residues. Secondly, contact-guided decoys significantly improved success with β-sheet-rich proteins. Overall, the improved performance of contact-guided decoys suggests that MR is now applicable to a significantly wider range of protein targets than were previously tractable, and points to a direct benefit to structural biology from the recent remarkable advances in sequencing.

  15. Residue contacts predicted by evolutionary covariance extend the application of ab initio molecular replacement to larger and more challenging protein folds

    PubMed Central

    Simkovic, Felix; Thomas, Jens M. H.; Keegan, Ronan M.; Winn, Martyn D.; Mayans, Olga; Rigden, Daniel J.

    2016-01-01

    For many protein families, the deluge of new sequence information together with new statistical protocols now allow the accurate prediction of contacting residues from sequence information alone. This offers the possibility of more accurate ab initio (non-homology-based) structure prediction. Such models can be used in structure solution by molecular replacement (MR) where the target fold is novel or is only distantly related to known structures. Here, AMPLE, an MR pipeline that assembles search-model ensembles from ab initio structure predictions (‘decoys’), is employed to assess the value of contact-assisted ab initio models to the crystallographer. It is demonstrated that evolutionary covariance-derived residue–residue contact predictions improve the quality of ab initio models and, consequently, the success rate of MR using search models derived from them. For targets containing β-structure, decoy quality and MR performance were further improved by the use of a β-strand contact-filtering protocol. Such contact-guided decoys achieved 14 structure solutions from 21 attempted protein targets, compared with nine for simple Rosetta decoys. Previously encountered limitations were superseded in two key respects. Firstly, much larger targets of up to 221 residues in length were solved, which is far larger than the previously benchmarked threshold of 120 residues. Secondly, contact-guided decoys significantly improved success with β-sheet-rich proteins. Overall, the improved performance of contact-guided decoys suggests that MR is now applicable to a significantly wider range of protein targets than were previously tractable, and points to a direct benefit to structural biology from the recent remarkable advances in sequencing. PMID:27437113

  16. Predicting turns in proteins with a unified model.

    PubMed

    Song, Qi; Li, Tonghua; Cong, Peisheng; Sun, Jiangming; Li, Dapeng; Tang, Shengnan

    2012-01-01

    Turns are a critical element of the structure of a protein; turns play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for the prediction of individual turn types, including α-turn, β-turn, and γ-turn, etc. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both of which have greater accuracy, based on innovative technologies which were both developed by our group. Then, sequence and structural evolution features, which are profile of sequence, profile of secondary structures and profile of shape strings are generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded the most state-of-the-art predictors of certain type of turn. Newly determined sequences, the EVA and CASP9 datasets were used as independent tests and the results we achieved were outstanding for turn predictions and confirmed the good performance of TurnP for practical applications.
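
    For readers unfamiliar with the two reported metrics, the sketch below shows the standard definitions of accuracy and sensitivity in terms of confusion-matrix counts. The counts in any call are hypothetical; the abstract does not give the underlying numbers or say whether they were computed per residue.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return (tp + tn) / total


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives (e.g. true turn positions) that are recovered."""
    if tp + fn == 0:
        raise ValueError("no positive examples to evaluate")
    return tp / (tp + fn)
```

    Because non-turn positions can outnumber turn positions, accuracy can be high even when sensitivity is modest, which is consistent with the gap between the two figures reported above.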

  17. Predicting Turns in Proteins with a Unified Model

    PubMed Central

    Song, Qi; Li, Tonghua; Cong, Peisheng; Sun, Jiangming; Li, Dapeng; Tang, Shengnan

    2012-01-01

    Motivation: Turns are a critical element of the structure of a protein; turns play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for the prediction of individual turn types, including α-turn, β-turn, and γ-turn, etc. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. Results: In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both of which have greater accuracy, based on innovative technologies which were both developed by our group. Then, sequence and structural evolution features, which are profile of sequence, profile of secondary structures and profile of shape strings are generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded the most state-of-the-art predictors of certain type of turn. Newly determined sequences, the EVA and CASP9 datasets were used as independent tests and the results we achieved were outstanding for turn predictions and confirmed the good performance of TurnP for practical applications. PMID:23144872

  18. Structure-Templated Predictions of Novel Protein Interactions from Sequence Information

    PubMed Central

    Betel, Doron; Breitkreuz, Kevin E; Isserlin, Ruth; Dewar-Darch, Danielle; Tyers, Mike; Hogue, Christopher W. V

    2007-01-01

    The multitude of functions performed in the cell are largely controlled by a set of carefully orchestrated protein interactions often facilitated by specific binding of conserved domains in the interacting proteins. Interacting domains commonly exhibit distinct binding specificity to short and conserved recognition peptides called binding profiles. Although many conserved domains are known in nature, only a few have well-characterized binding profiles. Here, we describe a novel predictive method known as domain–motif interactions from structural topology (D-MIST) for elucidating the binding profiles of interacting domains. A set of domains and their corresponding binding profiles were derived from extant protein structures and protein interaction data and then used to predict novel protein interactions in yeast. A number of the predicted interactions were verified experimentally, including new interactions of the mitotic exit network, RNA polymerases, nucleotide metabolism enzymes, and the chaperone complex. These results demonstrate that new protein interactions can be predicted exclusively from sequence information. PMID:17892321

  19. Impact of target mRNA structure on siRNA silencing efficiency: A large-scale study.

    PubMed

    Gredell, Joseph A; Berger, Angela K; Walton, S Patrick

    2008-07-01

    The selection of active siRNAs is generally based on identifying siRNAs with certain sequence and structural properties. However, the efficiency of RNA interference has also been shown to depend on the structure of the target mRNA, primarily through studies using exogenous transcripts with well-defined secondary structures in the vicinity of the target sequence. While these studies provide a means for examining the impact of target sequence and structure independently, the predicted secondary structures for these transcripts are often not reflective of structures that form in full-length, native mRNAs where interactions can occur between relatively remote segments of the mRNAs. Here, using a combination of experimental results and analysis of a large dataset, we demonstrate that the accessibility of certain local target structures on the mRNA is an important determinant in the gene silencing ability of siRNAs. siRNAs targeting the enhanced green fluorescent protein were chosen using a minimal siRNA selection algorithm followed by classification based on the predicted minimum free energy structures of the target transcripts. Transfection into HeLa and HepG2 cells revealed that siRNAs targeting regions of the mRNA predicted to have unpaired 5'- and 3'-ends resulted in greater gene silencing than regions predicted to have other types of secondary structure. These results were confirmed by analysis of gene silencing data from previously published siRNAs, which showed that mRNA target regions unpaired at either the 5'-end or 3'-end were silenced, on average, approximately 10% more strongly than target regions unpaired in the center or primarily paired throughout. We found this effect to be independent of the structure of the siRNA guide strand. Taken together, these results suggest minimal requirements for nucleation of hybridization between the siRNA guide strand and mRNA and that both mRNA and guide strand structure should be considered when choosing candidate siRNAs. © 2008 Wiley Periodicals, Inc.

  20. Impact of target mRNA structure on siRNA silencing efficiency: a large-scale study

    PubMed Central

    Gredell, Joseph A.; Berger, Angela K.; Walton, S. Patrick

    2009-01-01

    The selection of active siRNAs is generally based on identifying siRNAs with certain sequence and structural properties. However, the efficiency of RNA interference has also been shown to depend on the structure of the target mRNA, primarily through studies using exogenous transcripts with well-defined secondary structures in the vicinity of the target sequence. While these studies provide a means for examining the impact of target sequence and structure independently, the predicted secondary structures for these transcripts are often not reflective of structures that form in full-length, native mRNAs where interactions can occur between relatively remote segments of the mRNAs. Here, using a combination of experimental results and analysis of a large dataset, we demonstrate that the accessibility of certain local target structures on the mRNA is an important determinant in the gene silencing ability of siRNAs. siRNAs targeting the enhanced green fluorescent protein were chosen using a minimal siRNA selection algorithm followed by classification based on the predicted minimum free energy structures of the target transcripts. Transfection into HeLa and HepG2 cells revealed that siRNAs targeting regions of the mRNA predicted to have unpaired 5’- and 3’-ends resulted in greater gene silencing than regions predicted to have other types of secondary structure. These results were confirmed by analysis of gene silencing data from previously published siRNAs, which showed that mRNA target regions unpaired at either the 5’-end or 3’-end were silenced, on average, ~10% more strongly than target regions unpaired in the center or primarily paired throughout. We found this effect to be independent of the structure of the siRNA guide strand. Taken together, these results suggest minimal requirements for nucleation of hybridization between the siRNA guide strand and mRNA and that both mRNA and guide strand structure should be considered when choosing candidate siRNAs. PMID:18306428

  1. 27nt-RNAs guide histone variant deposition via 'RNA-induced DNA replication interference' and thus transmit parental genome partitioning in Stylonychia.

    PubMed

    Postberg, Jan; Jönsson, Franziska; Weil, Patrick Philipp; Bulic, Aneta; Juranek, Stefan Andreas; Lipps, Hans-Joachim

    2018-06-12

    During sexual reproduction in the unicellular ciliate Stylonychia somatic macronuclei differentiate from germline micronuclei. Thereby, programmed sequence reduction takes place, leading to the elimination of > 95% of germline sequences, which priorly adopt heterochromatin structure via H3K27me3. Simultaneously, 27nt-ncRNAs become synthesized from parental transcripts and are bound by the Argonaute protein PIWI1. These 27nt-ncRNAs cover sequences destined to the developing macronucleus and are thought to protect them from degradation. We provide evidence and propose that RNA/DNA base-pairing guides PIWI1/27nt-RNA complexes to complementary macronucleus-destined DNA target sequences, hence transiently causing locally stalled replication during polytene chromosome formation. This spatiotemporal delay enables the selective deposition of temporarily available histone H3.4K27me3 nucleosomes at all other sequences being continuously replicated, thus dictating their prospective heterochromatin structure before becoming developmentally eliminated. Concomitantly, 27nt-RNA-covered sites remain protected. We introduce the concept of 'RNA-induced DNA replication interference' and explain how the parental functional genome partition could become transmitted to the progeny.

  2. Divergence of Structure and Function in the Haloacid Dehalogenase Enzyme Superfamily: Bacteroides thetaiotaomicron BT2127 is an Inorganic Pyrophosphatase

    PubMed Central

    Huang, Hua; Patskovsky, Yury; Toro, Rafael; Farelli, Jeremiah D.; Pandya, Chetanya; Almo, Steven C.; Allen, Karen N.; Dunaway-Mariano, Debra

    2012-01-01

    The explosion of protein sequence information requires that current strategies for function assignment must evolve to complement experimental approaches with computationally-based function prediction. This necessitates the development of strategies based on the identification of sequence markers in the form of specificity determinants and a more informed definition of orthologues. Herein, we have undertaken the function assignment of the unknown Haloalkanoate Dehalogenase superfamily member BT2127 (Uniprot accession # Q8A5V9) from Bacteroides thetaiotaomicron using an integrated bioinformatics/structure/mechanism approach. The substrate specificity profile and steady-state rate constants of BT2127 (with kcat/Km value for pyrophosphate of ∼1 × 10⁵ M⁻¹ s⁻¹), together with the gene context, support the assigned in vivo function as an inorganic pyrophosphatase. The X-ray structural analysis of the wild-type BT2127 and several variants generated by site-directed mutagenesis shows that substrate discrimination is based, in part, on active site space restrictions imposed by the cap domain (specifically by residues Tyr76 and Glu47). Structure guided site directed mutagenesis coupled with kinetic analysis of the mutant enzymes identified the residues required for catalysis, substrate binding, and domain-domain association. Based on this structure-function analysis, the catalytic residues Asp11, Asp13, Thr113, and Lys147 as well as the metal binding residues Asp171, Asn172 and Glu47 were used as markers to confirm BT2127 orthologues identified via sequence searches. This bioinformatic analysis demonstrated that the biological range of BT2127 orthologue is restricted to the phylum Bacteroidetes/Chlorobi. The key structural determinants in the divergence of BT2127 and its closest homologue β-phosphoglucomutase control the leaving group size (phosphate vs. glucose-phosphate) and the position of the Asp acid/base in the open vs. closed conformations. HADSF pyrophosphatases represent a third mechanistic and fold type for bacterial pyrophosphatases. PMID:21894910

  3. Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.

    PubMed

    Tan, Yen Hock; Huang, He; Kihara, Daisuke

    2006-08-15

    Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance is increasing recently in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself; thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.

  4. Biocuration in the structure-function linkage database: the anatomy of a superfamily.

    PubMed

    Holliday, Gemma L; Brown, Shoshana D; Akiva, Eyal; Mischel, David; Hicks, Michael A; Morris, John H; Huang, Conrad C; Meng, Elaine C; Pegg, Scott C-H; Ferrin, Thomas E; Babbitt, Patricia C

    2017-01-01

    With ever-increasing amounts of sequence data available in both the primary literature and sequence repositories, there is a bottleneck in annotating molecular function to a sequence. This article describes the biocuration process and methods used in the structure-function linkage database (SFLD) to help address some of the challenges. We discuss how the hierarchy within the SFLD allows us to infer detailed functional properties for functionally diverse enzyme superfamilies in which all members are homologous, conserve an aspect of their chemical function and have associated conserved structural features that enable the chemistry. Also presented is the Enzyme Structure-Function Ontology (ESFO), which has been designed to capture the relationships between enzyme sequence, structure and function that underlie the SFLD and is used to guide the biocuration processes within the SFLD. http://sfld.rbvi.ucsf.edu/. © The Author 2017. Published by Oxford University Press.

  5. Prediction of protein long-range contacts using an ensemble of genetic algorithm classifiers with sequence profile centers.

    PubMed

    Chen, Peng; Li, Jinyan

    2010-05-17

    Prediction of long-range inter-residue contacts is an important topic in bioinformatics research. It is helpful for determining protein structures, understanding protein foldings, and therefore advancing the annotation of protein functions. In this paper, we propose a novel ensemble of genetic algorithm classifiers (GaCs) to address the long-range contact prediction problem. Our method is based on the key idea called sequence profile centers (SPCs). Each SPC is the average sequence profiles of residue pairs belonging to the same contact class or non-contact class. GaCs train on multiple but different pairs of long-range contact data (positive data) and long-range non-contact data (negative data). The negative data sets, having roughly the same sizes as the positive ones, are constructed by random sampling over the original imbalanced negative data. As a result, about 21.5% long-range contacts are correctly predicted. We also found that the ensemble of GaCs indeed makes an accuracy improvement by around 5.6% over the single GaC. Classifiers with the use of sequence profile centers may advance the long-range contact prediction. In line with this approach, key structural features in proteins would be determined with high efficiency and accuracy.
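
    The two ideas named in this abstract, sequence profile centers and balanced undersampling of the negative data, can be illustrated with a minimal sketch. The toy 3-dimensional profiles and all function names are hypothetical, and a plain nearest-center rule stands in for the genetic algorithm classifiers, whose details the abstract does not give.

```python
import random

def profile_center(profiles):
    """Average a list of equal-length profile vectors (one per residue pair)."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]

def balanced_sample(positives, negatives, rng):
    """Randomly undersample the negatives to the size of the positive set."""
    k = min(len(positives), len(negatives))
    return rng.sample(negatives, k)

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_contact(profile, contact_center, noncontact_center):
    """Nearest-center rule (an assumption, not the paper's GA classifier)."""
    return (squared_distance(profile, contact_center)
            < squared_distance(profile, noncontact_center))

# Toy data; real profiles would be derived from sequence profiles of residue pairs.
rng = random.Random(0)
positives = [[0.9, 0.8, 0.1], [0.7, 0.9, 0.2]]
negatives = [[0.1, 0.2, 0.9], [0.2, 0.1, 0.8], [0.0, 0.3, 0.7], [0.3, 0.0, 0.9]]

sampled_negatives = balanced_sample(positives, negatives, rng)
contact_center = profile_center(positives)
noncontact_center = profile_center(sampled_negatives)
is_contact = predict_contact([0.8, 0.7, 0.2], contact_center, noncontact_center)
print(is_contact)
```

    Repeating the sampling step with different random seeds would give the multiple, different training sets on which an ensemble could be built.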

  6. HIPPI: highly accurate protein family classification with ensembles of HMMs.

    PubMed

    Nguyen, Nam-Phuong; Nute, Michael; Mirarab, Siavash; Warnow, Tandy

    2016-11-11

    Given a new biological sequence, detecting membership in a known family is a basic step in many bioinformatics analyses, with applications to protein structure and function prediction and metagenomic taxon identification and abundance profiling, among others. Yet family identification of sequences that are distantly related to sequences in public databases or that are fragmentary remains one of the more difficult analytical problems in bioinformatics. We present a new technique for family identification called HIPPI (Hierarchical Profile Hidden Markov Models for Protein family Identification). HIPPI uses a novel technique to represent a multiple sequence alignment for a given protein family or superfamily by an ensemble of profile hidden Markov models computed using HMMER. An evaluation of HIPPI on the Pfam database shows that HIPPI has better overall precision and recall than blastp, HMMER, and pipelines based on HHsearch, and maintains good accuracy even for fragmentary query sequences and for protein families with low average pairwise sequence identity, both conditions where other methods degrade in accuracy. HIPPI provides accurate protein family identification and is robust to difficult model conditions. Our results, combined with observations from previous studies, show that ensembles of profile Hidden Markov models can better represent multiple sequence alignments than a single profile Hidden Markov model, and thus can improve downstream analyses for various bioinformatic tasks. Further research is needed to determine the best practices for building the ensemble of profile Hidden Markov models. HIPPI is available on GitHub at https://github.com/smirarab/sepp .

  7. Sequence dependency of canonical base pair opening in the DNA double helix

    PubMed Central

    Villa, Alessandra

    2017-01-01

    The flipping-out of a DNA base from the double helical structure is a key step of many cellular processes, such as DNA replication, modification and repair. Base pair opening is the first step of base flipping and the exact mechanism is still not well understood. We investigate sequence effects on base pair opening using extensive classical molecular dynamics simulations targeting the opening of 11 different canonical base pairs in two DNA sequences. Two popular biomolecular force fields are applied. To enhance sampling and calculate free energies, we bias the simulation along a simple distance coordinate using a newly developed adaptive sampling algorithm. The simulation is guided back and forth along the coordinate, allowing for multiple opening pathways. We compare the calculated free energies with those from an NMR study and check assumptions of the model used for interpreting the NMR data. Our results further show that the neighboring sequence is an important factor for the opening free energy, but also indicates that other sequence effects may play a role. All base pairs are observed to have a propensity for opening toward the major groove. The preferred opening base is cytosine for GC base pairs, while for AT there is sequence dependent competition between the two bases. For AT opening, we identify two non-canonical base pair interactions contributing to a local minimum in the free energy profile. For both AT and CG we observe long-lived interactions with water and with sodium ions at specific sites on the open base pair. PMID:28369121

  8. Chromatin accessibility and guide sequence secondary structure affect CRISPR-Cas9 gene editing efficiency.

    PubMed

    Jensen, Kristopher Torp; Fløe, Lasse; Petersen, Trine Skov; Huang, Jinrong; Xu, Fengping; Bolund, Lars; Luo, Yonglun; Lin, Lin

    2017-07-01

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated protein 9 (CRISPR-Cas9) systems have emerged as the method of choice for genome editing, but large variations in on-target efficiencies continue to limit their applicability. Here, we investigate the effect of chromatin accessibility on Cas9-mediated gene editing efficiency for 20 gRNAs targeting 10 genomic loci in HEK293T cells using both SpCas9 and the eSpCas9(1.1) variant. Our study indicates that gene editing is more efficient in euchromatin than in heterochromatin, and we validate this finding in HeLa cells and in human fibroblasts. Furthermore, we investigate the gRNA sequence determinants of CRISPR-Cas9 activity using a surrogate reporter system and find that the efficiency of Cas9-mediated gene editing is dependent on guide sequence secondary structure formation. This knowledge can aid in the further improvement of tools for gRNA design. © 2017 Federation of European Biochemical Societies.

  9. SFESA: a web server for pairwise alignment refinement by secondary structure shifts.

    PubMed

    Tong, Jing; Pei, Jimin; Grishin, Nick V

    2015-09-03

    Protein sequence alignment is essential for a variety of tasks such as homology modeling and active site prediction. Alignment errors remain the main cause of low-quality structure models. A bioinformatics tool to refine alignments is needed to make protein alignments more accurate. We developed the SFESA web server to refine pairwise protein sequence alignments. Compared to the previous version of SFESA, which required a set of 3D coordinates for a protein, the new server will search a sequence database for the closest homolog with an available 3D structure to be used as a template. For each alignment block defined by secondary structure elements in the template, SFESA evaluates alignment variants generated by local shifts and selects the best-scoring alignment variant. A scoring function that combines the sequence score of profile-profile comparison and the structure score of template-derived contact energy is used for evaluation of alignments. PROMALS pairwise alignments refined by SFESA are more accurate than those produced by current advanced alignment methods such as HHpred and CNFpred. In addition, SFESA also improves alignments generated by other software. SFESA is a web-based tool for alignment refinement, designed for researchers to compute, refine, and evaluate pairwise alignments with a combined sequence and structure scoring of alignment blocks. To our knowledge, the SFESA web server is the only tool that refines alignments by evaluating local shifts of secondary structure elements. The SFESA web server is available at http://prodata.swmed.edu/sfesa.
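
    The refinement loop described above (generate local shifts of one alignment block, score each variant with a combined sequence and structure score, keep the best) might be sketched as follows. The two toy scoring functions, the equal weighting and every name here are assumptions for illustration; they stand in for the profile-profile score and the template-derived contact energy that SFESA uses.

```python
HYDROPHOBIC = set("AILMFVWY")

def identity_score(segment, template_block):
    """Toy sequence score: number of identical positions."""
    return sum(a == b for a, b in zip(segment, template_block))

def burial_score(segment, template_block):
    """Toy structure score: positions where both residues are hydrophobic."""
    return sum(a in HYDROPHOBIC and b in HYDROPHOBIC
               for a, b in zip(segment, template_block))

def refine_block(query, template_block, start, max_shift, weight=0.5):
    """Try shifts of -max_shift..+max_shift around start; return (offset, score)."""
    best = None
    for offset in range(start - max_shift, start + max_shift + 1):
        if offset < 0 or offset + len(template_block) > len(query):
            continue  # the shifted block would fall off the query sequence
        segment = query[offset:offset + len(template_block)]
        score = (weight * identity_score(segment, template_block)
                 + (1 - weight) * burial_score(segment, template_block))
        if best is None or score > best[1]:
            best = (offset, score)
    return best

# The initial alignment places the block one residue too early (start=2).
best_offset, best_score = refine_block("GGALVKELLGG", "LVKEL", start=2, max_shift=2)
print(best_offset, best_score)
```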

  10. Sample sequencing of vascular plants demonstrates widespread conservation and divergence of microRNAs.

    PubMed

    Chávez Montes, Ricardo A; de Fátima Rosas-Cárdenas, Flor; De Paoli, Emanuele; Accerbi, Monica; Rymarquis, Linda A; Mahalingam, Gayathri; Marsch-Martínez, Nayelli; Meyers, Blake C; Green, Pamela J; de Folter, Stefan

    2014-04-23

    Small RNAs are pivotal regulators of gene expression that guide transcriptional and post-transcriptional silencing mechanisms in eukaryotes, including plants. Here we report a comprehensive atlas of sRNA and miRNA from 3 species of algae and 31 representative species across vascular plants, including non-model plants. We sequence and quantify sRNAs from 99 different tissues or treatments across species, resulting in a data set of over 132 million distinct sequences. Using miRBase mature sequences as a reference, we identify the miRNA sequences present in these libraries. We apply diverse profiling methods to examine critical sRNA and miRNA features, such as size distribution, tissue-specific regulation and sequence conservation between species, as well as to predict putative new miRNA sequences. We also develop database resources, computational analysis tools and a dedicated website, http://smallrna.udel.edu/. This study provides new insights on plant sRNAs and miRNAs, and a foundation for future studies.

  11. AlignMe—a membrane protein sequence alignment web server

    PubMed Central

    Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.

    2014-01-01

    We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
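
    As a rough illustration of the HP mode input, the sketch below converts a small multiple sequence alignment into a family-averaged hydrophobicity profile. It assumes the Kyte-Doolittle scale and simple per-column averaging that skips gaps; the abstract does not state which scale or smoothing AlignMe applies, and the function name and toy alignment are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def family_hydropathy_profile(msa):
    """Average hydropathy per alignment column; gaps are ignored."""
    profile = []
    for column in zip(*msa):
        values = [KYTE_DOOLITTLE[res] for res in column if res in KYTE_DOOLITTLE]
        # A column made only of gaps has no defined average.
        profile.append(sum(values) / len(values) if values else None)
    return profile

# Toy alignment of three homologs, all rows the same length.
msa = ["LIVA-K", "LLVAGK", "IIV-GR"]
profile = family_hydropathy_profile(msa)
print([round(v, 2) for v in profile])
```

    Two such profiles, one per family, would then be the objects that are aligned to each other.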

  12. Active bacterial community structure along vertical redox gradients in Baltic Sea sediment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jansson, Janet; Edlund, Anna; Hardeman, Fredrik

    Community structures of active bacterial populations were investigated along a vertical redox profile in coastal Baltic Sea sediments by terminal-restriction fragment length polymorphism (T-RFLP) and clone library analysis. According to correspondence analysis of T-RFLP results and sequencing of cloned 16S rRNA genes, the microbial community structures at three redox depths (179 mV, -64 mV and -337 mV) differed significantly. The bacterial communities in the community DNA differed from those in bromodeoxyuridine (BrdU)-labeled DNA, indicating that the growing members of the community that incorporated BrdU were not necessarily the most dominant members. The structures of the actively growing bacterial communities were most strongly correlated to organic carbon followed by total nitrogen and redox potentials. Bacterial identification by sequencing of 16S rRNA genes from clones of BrdU-labeled DNA and DNA from reverse transcription PCR (rt-PCR) showed that bacterial taxa involved in nitrogen and sulfur cycling were metabolically active along the redox profiles. Several sequences had low similarities to previously detected sequences indicating that novel lineages of bacteria are present in Baltic Sea sediments. Also, a high number of different 16S rRNA gene sequences representing different phyla were detected at all sampling depths.

  13. Image Display And Manipulation System (IDAMS), user's guide

    NASA Technical Reports Server (NTRS)

    Cecil, R. W.

    1972-01-01

    A combination operator's guide and user's handbook for the Image Display and Manipulation System (IDAMS) is reported. Information is presented to define how to operate the computer equipment, how to structure a run deck, and how to select parameters necessary for executing a sequence of IDAMS task routines. If more detailed information is needed on any IDAMS program, see the IDAMS program documentation.

  14. (Pea)nuts and bolts of visual narrative: Structure and meaning in sequential image comprehension

    PubMed Central

    Cohn, Neil; Paczynski, Martin; Jackendoff, Ray; Holcomb, Phillip J.; Kuperberg, Gina R.

    2012-01-01

    Just as syntax differentiates coherent sentences from scrambled word strings, the comprehension of sequential images must also use a cognitive system to distinguish coherent narrative sequences from random strings of images. We conducted experiments analogous to two classic studies of language processing to examine the contributions of narrative structure and semantic relatedness to processing sequential images. We compared four types of comic strips: 1) Normal sequences with both structure and meaning, 2) Semantic Only sequences (in which the panels were related to a common semantic theme, but had no narrative structure), 3) Structural Only sequences (narrative structure but no semantic relatedness), and 4) Scrambled sequences of randomly-ordered panels. In Experiment 1, participants monitored for target panels in sequences presented panel-by-panel. Reaction times were slowest to panels in Scrambled sequences, intermediate in both Structural Only and Semantic Only sequences, and fastest in Normal sequences. This suggests that both semantic relatedness and narrative structure offer advantages to processing. Experiment 2 measured ERPs to all panels across the whole sequence. The N300/N400 was largest to panels in both the Scrambled and Structural Only sequences, intermediate in Semantic Only sequences and smallest in the Normal sequences. This implies that a combination of narrative structure and semantic relatedness can facilitate semantic processing of upcoming panels (as reflected by the N300/N400). Also, panels in the Scrambled sequences evoked a larger left-lateralized anterior negativity than panels in the Structural Only sequences. This localized effect was distinct from the N300/N400, and appeared despite the fact that these two sequence types were matched on local semantic relatedness between individual panels. These findings suggest that sequential image comprehension uses a narrative structure that may be independent of semantic relatedness. Altogether, we argue that the comprehension of visual narrative is guided by an interaction between structure and meaning. PMID:22387723

  15. Sequence Tolerance of a Highly Stable Single Domain Antibody: Comparison of Computational and Experimental Profiles

    DTIC Science & Technology

    2016-09-09

    evaluating 18 mutants using either the A or B conformer is only r = ~ 0.2. Given the poor performance of approximating the observed experimental ...1    Sequence Tolerance of a Highly Stable Single Domain Antibody: Comparison of Computational and Experimental Profiles Mark A. Olson,1 Patricia...unusually high thermal stability is explored by a combined computational and experimental study. Starting with the crystallographic structure

  16. Learning predictive statistics from temporal sequences: Dynamics and strategies

    PubMed Central

    Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E.; Kourtzi, Zoe

    2017-01-01

    Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics—that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments. PMID:28973111
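
    A small sketch may help make the context-based statistics and the two decision strategies concrete. It generates a symbol sequence from a first-order Markov process and contrasts maximizing (always choosing the most probable next symbol) with matching (choosing in proportion to the probabilities). The transition table, symbols and function names are invented for illustration and are not the study's stimuli.

```python
import random

def generate_sequence(transitions, start, length, rng):
    """Sample a sequence where each symbol depends on the preceding one."""
    sequence = [start]
    while len(sequence) < length:
        symbols, probs = zip(*transitions[sequence[-1]].items())
        sequence.append(rng.choices(symbols, weights=probs, k=1)[0])
    return sequence

def maximizing_choice(transitions, context):
    """Always predict the most probable symbol for this context."""
    options = transitions[context]
    return max(options, key=options.get)

def matching_choice(transitions, context, rng):
    """Predict each symbol in proportion to its probability."""
    symbols, probs = zip(*transitions[context].items())
    return rng.choices(symbols, weights=probs, k=1)[0]

# Hypothetical context-based statistics: the next symbol depends on the last.
transitions = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"C": 0.8, "A": 0.2},
    "C": {"A": 0.8, "B": 0.2},
}
rng = random.Random(1)
sequence = generate_sequence(transitions, "A", 20, rng)
print("".join(sequence))
print(maximizing_choice(transitions, "A"), matching_choice(transitions, "A", rng))
```

    With this table a maximizing participant would be correct on about 80% of trials, whereas a matching participant would be correct on about 0.8 × 0.8 + 0.2 × 0.2 = 68%.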

  17. Learning predictive statistics from temporal sequences: Dynamics and strategies.

    PubMed

    Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe

    2017-10-01

    Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics-that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.

  18. A prospective pilot study of genome-wide exome and transcriptome profiling in patients with small cell lung cancer progressing after first-line therapy.

    PubMed

    Weiss, Glen J; Byron, Sara A; Aldrich, Jessica; Sangal, Ashish; Barilla, Heather; Kiefer, Jeffrey A; Carpten, John D; Craig, David W; Whitsett, Timothy G

    2017-01-01

    Small cell lung cancer (SCLC) that has progressed after first-line therapy is an aggressive disease with few effective therapeutic strategies. In this prospective study, we employed next-generation sequencing (NGS) to identify therapeutically actionable alterations to guide treatment for advanced SCLC patients. Twelve patients with SCLC were enrolled after failing platinum-based chemotherapy. Following informed consent, genome-wide exome and RNA-sequencing was performed in a CLIA-certified, CAP-accredited environment. Actionable targets were identified and therapeutic recommendations made from a pharmacopeia of FDA-approved drugs. Clinical response to genomically-guided treatment was evaluated by Response Evaluation Criteria in Solid Tumors (RECIST) 1.1. The study completed its accrual goal of 12 evaluable patients. The minimum tumor content for successful NGS was 20%, with a median turnaround time from sample collection to genomics-based treatment recommendation of 27 days. At least two clinically actionable targets were identified in each patient, and six patients (50%) received treatment identified by NGS. Two had partial responses by RECIST 1.1 on a clinical trial involving a PD-1 inhibitor + irinotecan (indicated by MLH1 alteration). The remaining patients had clinical deterioration before NGS recommended therapy could be initiated. Comprehensive genomic profiling using NGS identified clinically-actionable alterations in SCLC patients who progressed on initial therapy. Recommended PD-1 therapy generated partial responses in two patients. Earlier access to NGS guided therapy, along with improved understanding of those SCLC patients likely to respond to immune-based therapies, should help to extend survival in these cases with poor outcomes.

  19. An approach to functionally relevant clustering of the protein universe: Active site profile-based clustering of protein structures and sequences.

    PubMed

    Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2017-04-01

    Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow mechanistic determinant identification-amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  20. Effects of surface roughness and absorption on light propagation in graded-profile waveguides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Danilenko, S S; Osovitskii, A N

    2011-06-30

    This paper examines the effects of surface roughness and absorption on laser light propagation in graded-profile waveguiding structures. We derive analytical expressions for the scattering and absorption coefficients of guided waves and analyse these coefficients in relation to parameters of the waveguiding structure and the roughness of its boundary. A new approach is proposed to measuring roughness parameters of precision dielectric surfaces. Experimental evidence is presented which supports the main conclusions of the theory. (integrated-optical waveguides)

  1. Depth profiling of marker layers using x-ray waveguide structures

    NASA Astrophysics Data System (ADS)

    Gupta, Ajay; Rajput, Parasmani; Saraiya, Amit; Reddy, V. R.; Gupta, Mukul; Bernstorff, Sigrid; Amenitsch, H.

    2005-08-01

    It is demonstrated that x-ray waveguide structures can be used for depth profiling of a marker layer inside the guiding layer with an accuracy of better than 0.2 nm. A combination of x-ray fluorescence and x-ray reflectivity measurements can provide detailed information about the structure of the guiding layer. The position and thickness of the marker layer affect different aspects of the angle-dependent x-ray fluorescence pattern, thus making it possible to determine the structure of the marker layer in an unambiguous manner. As an example, effects of swift heavy ion irradiation on a Si/M/Si trilayer (M = Fe, W), forming the cavity of the waveguide structure, have been studied. It is found that, in accordance with the prediction of the thermal spike model, Fe is much more sensitive to swift heavy ion induced modifications than W, even in thin film form. However, clear evidence of movement of the Fe marker layer towards the surface is observed after irradiation, which cannot be understood in terms of the thermal spike model alone.

  2. Rational Protein Engineering Guided by Deep Mutational Scanning

    PubMed Central

    Shin, HyeonSeok; Cho, Byung-Kwan

    2015-01-01

    Sequence–function relationship in a protein is commonly determined by the three-dimensional protein structure followed by various biochemical experiments. However, with the explosive increase in the number of genome sequences, facilitated by recent advances in sequencing technology, the gap between protein sequences available and three-dimensional structures is rapidly widening. A recently developed method termed deep mutational scanning explores the functional phenotype of thousands of mutants via massive sequencing. Coupled with a highly efficient screening system, this approach assesses the phenotypic changes made by the substitution of each amino acid sequence that constitutes a protein. Such an informational resource provides the functional role of each amino acid sequence, thereby providing sufficient rationale for selecting target residues for protein engineering. Here, we discuss the current applications of deep mutational scanning and consider experimental design. PMID:26404267
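
    One common way to turn such sequencing counts into a per-variant functional readout is a log2 enrichment ratio between the selected and input libraries, normalised to wild type. The sketch below assumes that scoring scheme, a pseudocount of 0.5 and invented variant names and counts; the abstract itself does not specify a scoring method.

```python
import math

def enrichment_scores(input_counts, selected_counts, wild_type, pseudocount=0.5):
    """Log2 enrichment of each variant after selection, relative to wild type."""
    total_in = sum(input_counts.values())
    total_sel = sum(selected_counts.values())

    def log_ratio(variant):
        # The pseudocount keeps variants that vanish after selection finite.
        freq_in = (input_counts.get(variant, 0) + pseudocount) / total_in
        freq_sel = (selected_counts.get(variant, 0) + pseudocount) / total_sel
        return math.log2(freq_sel / freq_in)

    wt_ratio = log_ratio(wild_type)
    return {variant: log_ratio(variant) - wt_ratio for variant in input_counts}

# Hypothetical read counts before and after a functional screen.
input_counts = {"WT": 1000, "A23V": 100, "G45D": 100}
selected_counts = {"WT": 1000, "A23V": 200, "G45D": 10}
scores = enrichment_scores(input_counts, selected_counts, "WT")
for variant, score in scores.items():
    print(variant, round(score, 2))
```

    Positive scores mark substitutions enriched by the screen and negative scores mark depleted ones, which is the kind of per-residue information that could inform the choice of target residues.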

  3. The Phyre2 web portal for protein modelling, prediction and analysis

    PubMed Central

    Kelley, Lawrence A; Mezulis, Stefans; Yates, Christopher M; Wass, Mark N; Sternberg, Michael JE

    2017-01-01

    Summary: Phyre2 is a suite of tools available on the web to predict and analyse protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a protocol. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites, and analyse the effect of amino-acid variants (e.g. nsSNPs) for a user’s protein sequence. Users are guided through results by a simple interface at a level of detail determined by them. This protocol will guide a user from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit large numbers of sequences at once and to automatically run weekly searches for proteins difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 hours after submission. PMID:25950237

  4. Effect of Structured Touch and Guided Imagery for Pain and Anxiety in Elective Joint Replacement Patients--A Randomized Controlled Trial: M-TIJRP.

    PubMed

    Forward, John Brent; Greuter, Nancy Elizabeth; Crisall, Santa J; Lester, Houston F

    2015-01-01

    Postoperative management of pain after total joint arthroplasty remains a challenge despite advancements in analgesics. Evidence shows that complementary modalities with mind-body and tactile-based approaches are valid and effective adjuncts to reduce pain and anxiety postoperatively. To investigate the effectiveness of the "M" Technique (M), a registered method of structured touch using a set sequence and number of strokes, and a consistent level of pressure on hands and feet, compared with guided imagery and usual care, for the reduction of pain and anxiety in patients undergoing elective total knee or hip replacement surgery. Randomized controlled trial: M-TIJRP (MiTechnique and guided Imagery in Joint Replacement Patients [Mighty Junior P]). At a community hospital, 225 male and female patients, aged 38 to 90 years, undergoing elective total hip or knee replacement were randomly assigned to 1 of 3 groups (75 patients in each): M, guided imagery, or usual care. They were blinded to their assignment until the intervention. Reduction of pain and anxiety postoperatively. Secondary outcomes measured use of pain medication and patient satisfaction. This study yielded positive findings for the management of pain and anxiety in patients undergoing elective joint replacement using M and guided imagery for 18 to 20 minutes compared with usual care. M showed the largest predicted decreases in both pain and anxiety between groups. There was no significant difference in narcotic pain medication use between groups. Patient satisfaction survey ratings were highest for M, followed by guided imagery. The benefit of M may be because of the specifically structured sequence of touch by competent caring, trained providers.

  5. Effect of Structured Touch and Guided Imagery for Pain and Anxiety in Elective Joint Replacement Patients—A Randomized Controlled Trial: M-TIJRP

    PubMed Central

    Forward, John Brent; Greuter, Nancy Elizabeth; Crisall, Santa J; Lester, Houston F

    2015-01-01

    Context: Postoperative management of pain after total joint arthroplasty remains a challenge despite advancements in analgesics. Evidence shows that complementary modalities with mind-body and tactile-based approaches are valid and effective adjuncts to reduce pain and anxiety postoperatively. Objective: To investigate the effectiveness of the “M” Technique (M), a registered method of structured touch using a set sequence and number of strokes, and a consistent level of pressure on hands and feet, compared with guided imagery and usual care, for the reduction of pain and anxiety in patients undergoing elective total knee or hip replacement surgery. Methods: Randomized controlled trial: M-TIJRP (MiTechnique and guided Imagery in Joint Replacement Patients [Mighty Junior P]). At a community hospital, 225 male and female patients, aged 38 to 90 years, undergoing elective total hip or knee replacement were randomly assigned to 1 of 3 groups (75 patients in each): M, guided imagery, or usual care. They were blinded to their assignment until the intervention. Main Outcome Measures: Reduction of pain and anxiety postoperatively. Secondary outcomes measured use of pain medication and patient satisfaction. Results: This study yielded positive findings for the management of pain and anxiety in patients undergoing elective joint replacement using M and guided imagery for 18 to 20 minutes compared with usual care. M showed the largest predicted decreases in both pain and anxiety between groups. There was no significant difference in narcotic pain medication use between groups. Patient satisfaction survey ratings were highest for M, followed by guided imagery. Conclusion: The benefit of M may be because of the specifically structured sequence of touch by competent caring, trained providers. PMID:26222093

  6. Transitive homology-guided structural studies lead to discovery of Cro proteins with 40% sequence identity but different folds

    PubMed Central

    Roessler, Christian G.; Hall, Branwen M.; Anderson, William J.; Ingram, Wendy M.; Roberts, Sue A.; Montfort, William R.; Cordes, Matthew H. J.

    2008-01-01

    Proteins that share common ancestry may differ in structure and function because of divergent evolution of their amino acid sequences. For a typical diverse protein superfamily, the properties of a few scattered members are known from experiment. A satisfying picture of functional and structural evolution in relation to sequence changes, however, may require characterization of a larger, well chosen subset. Here, we employ a “stepping-stone” method, based on transitive homology, to target sequences intermediate between two related proteins with known divergent properties. We apply the approach to the question of how new protein folds can evolve from preexisting folds and, in particular, to an evolutionary change in secondary structure and oligomeric state in the Cro family of bacteriophage transcription factors, initially identified by sequence-structure comparison of distant homologs from phages P22 and λ. We report crystal structures of two Cro proteins, Xfaso 1 and Pfl 6, with sequences intermediate between those of P22 and λ. The domains show 40% sequence identity but differ by switching of α-helix to β-sheet in a C-terminal region spanning ≈25 residues. Sedimentation analysis also suggests a correlation between helix-to-sheet conversion and strengthened dimerization. PMID:18227506

  7. Genome Sequence of Mycobacterium hassiacum DSM 44199, a Rare Source of Heat-Stable Mycobacterial Proteins

    PubMed Central

    Tiago, Igor; Maranha, Ana; Mendes, Vitor; Alarico, Susana; Moynihan, Patrick J.; Clarke, Anthony J.; Macedo-Ribeiro, Sandra; Pereira, Pedro J. B.

    2012-01-01

    Mycobacterium hassiacum is a rapidly growing mycobacterium isolated from human urine and so far the most thermophilic among mycobacterial species. Its thermotolerance and phylogenetic relationship to M. tuberculosis render its proteins attractive tools for crystallization and structure-guided drug design. We report the draft genome sequence of M. hassiacum DSM 44199. PMID:23209251

  8. Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches.

    PubMed

    Zhang, Yiming; Jin, Quan; Wang, Shuting; Ren, Ren

    2011-05-01

    The mobile behavior of 1481 peptides in ion mobility spectrometry (IMS), which are generated by protease digestion of the Drosophila melanogaster proteome, is modeled and predicted based on two different types of characterization methods, i.e. sequence-based approach and structure-based approach. In this procedure, the sequence-based approach considers both the amino acid composition of a peptide and the local environment profile of each amino acid in the peptide; the structure-based approach is performed with the CODESSA protocol, which regards a peptide as a common organic compound and generates more than 200 statistically significant variables to characterize the whole structure profile of a peptide molecule. Subsequently, the nonlinear support vector machine (SVM) and Gaussian process (GP) as well as linear partial least squares (PLS) regression is employed to correlate the structural parameters of the characterizations with the IMS drift times of these peptides. The obtained quantitative structure-spectrum relationship (QSSR) models are evaluated rigorously and investigated systematically via both one-deep and two-deep cross-validations as well as the rigorous Monte Carlo cross-validation (MCCV). We also give a comprehensive comparison on the resulting statistics arising from the different combinations of variable types with modeling methods and find that the sequence-based approach can give the QSSR models with better fitting ability and predictive power but worse interpretability than the structure-based approach. In addition, though the QSSR modeling using sequence-based approach is not needed for the preparation of the minimization structures of peptides before the modeling, it would be considerably efficient as compared to that using structure-based approach. Copyright © 2011 Elsevier Ltd. All rights reserved.
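
    As a rough illustration of the Monte Carlo cross-validation (MCCV) mentioned in the abstract above, the Python sketch below repeatedly splits a toy dataset at random, fits a model on one part, and averages the error on the held-out part. It is not the authors' QSSR pipeline: a one-descriptor least-squares line stands in for the SVM, GP, and PLS models, and the function names, split fraction, and toy data are all assumptions made for this example.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single descriptor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def mccv_rmse(xs, ys, n_splits=100, test_fraction=0.3, seed=0):
    """Monte Carlo cross-validation: mean test RMSE over random splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = max(1, int(round(test_fraction * len(idx))))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(idx)  # a fresh random split on every iteration
        test, train = idx[:n_test], idx[n_test:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq = [(ys[i] - (a * xs[i] + b)) ** 2 for i in test]
        scores.append(math.sqrt(sum(sq) / len(sq)))
    return sum(scores) / len(scores)


# Toy data (hypothetical): one descriptor and a noise-free linear response.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
print(mccv_rmse(xs, ys))
```

    Unlike k-fold cross-validation, the random splits here may overlap, so the number of repetitions can be chosen independently of the test-set size.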

  9. SHARAKU: an algorithm for aligning and clustering read mapping profiles of deep sequencing in non-coding RNA processing.

    PubMed

    Tsuchiya, Mariko; Amano, Kojiro; Abe, Masaya; Seki, Misato; Hase, Sumitaka; Sato, Kengo; Sakakibara, Yasubumi

    2016-06-15

    Deep sequencing of the transcripts of regulatory non-coding RNA generates footprints of post-transcriptional processes. After obtaining sequence reads, the short reads are mapped to a reference genome, and specific mapping patterns can be detected called read mapping profiles, which are distinct from random non-functional degradation patterns. These patterns reflect the maturation processes that lead to the production of shorter RNA sequences. Recent next-generation sequencing studies have revealed not only the typical maturation process of miRNAs but also the various processing mechanisms of small RNAs derived from tRNAs and snoRNAs. We developed an algorithm termed SHARAKU to align two read mapping profiles of next-generation sequencing outputs for non-coding RNAs. In contrast with previous work, SHARAKU incorporates the primary and secondary sequence structures into an alignment of read mapping profiles to allow for the detection of common processing patterns. Using a benchmark simulated dataset, SHARAKU exhibited superior performance to previous methods for correctly clustering the read mapping profiles with respect to 5'-end processing and 3'-end processing from degradation patterns and in detecting similar processing patterns in deriving the shorter RNAs. Further, using experimental data of small RNA sequencing for the common marmoset brain, SHARAKU succeeded in identifying the significant clusters of read mapping profiles for similar processing patterns of small derived RNA families expressed in the brain. The source code of our program SHARAKU is available at http://www.dna.bio.keio.ac.jp/sharaku/, and the simulated dataset used in this work is available at the same link. Accession code: The sequence data from the whole RNA transcripts in the hippocampus of the left brain used in this work is available from the DNA DataBank of Japan (DDBJ) Sequence Read Archive (DRA) under the accession number DRA004502. 
    Contact: yasu@bio.keio.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  10. Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carte, Jason; Wang, Ruiying; Li, Hong

    An RNA-based gene silencing pathway that protects bacteria and archaea from viruses and other genome invaders is hypothesized to arise from guide RNAs encoded by CRISPR loci and proteins encoded by the cas genes. CRISPR loci contain multiple short invader-derived sequences separated by short repeats. The presence of virus-specific sequences within CRISPR loci of prokaryotic genomes confers resistance against corresponding viruses. The CRISPR loci are transcribed as long RNAs that must be processed to smaller guide RNAs. Here we identified Pyrococcus furiosus Cas6 as a novel endoribonuclease that cleaves CRISPR RNAs within the repeat sequences to release individual invader targeting RNAs. Cas6 interacts with a specific sequence motif in the 5′ region of the CRISPR repeat element and cleaves at a defined site within the 3′ region of the repeat. The 1.8 angstrom crystal structure of the enzyme reveals two ferredoxin-like folds that are also found in other RNA-binding proteins. The predicted active site of the enzyme is similar to that of tRNA splicing endonucleases, and concordantly, Cas6 activity is metal-independent. cas6 is one of the most widely distributed CRISPR-associated genes. Our findings indicate that Cas6 functions in the generation of CRISPR-derived guide RNAs in numerous bacteria and archaea.

  11. Application of the MAFFT sequence alignment program to large data—reexamination of the usefulness of chained guide trees

    PubMed Central

    Yamada, Kazunori D.; Tomii, Kentaro; Katoh, Kazutaka

    2016-01-01

    Motivation: Large multiple sequence alignments (MSAs), consisting of thousands of sequences, are becoming more and more common, due to advances in sequencing technologies. The MAFFT MSA program has several options for building large MSAs, but their performances have not been sufficiently assessed yet, because realistic benchmarking of large MSAs has been difficult. Recently, such assessments have been made possible through the HomFam and ContTest benchmark protein datasets. Along with the development of these datasets, an interesting theory was proposed: chained guide trees increase the accuracy of MSAs of structurally conserved regions. This theory challenges the basis of progressive alignment methods and needs to be examined by being compared with other known methods including computationally intensive ones. Results: We used HomFam, ContTest and OXFam (an extended version of OXBench) to evaluate several methods enabled in MAFFT: (1) a progressive method with approximate guide trees, (2) a progressive method with chained guide trees, (3) a combination of an iterative refinement method and a progressive method and (4) a less approximate progressive method that uses a rigorous guide tree and consistency score. Other programs, Clustal Omega and UPP, available for large MSAs, were also included into the comparison. The effect of method 2 (chained guide trees) was positive in ContTest but negative in HomFam and OXFam. Methods 3 and 4 increased the benchmark scores more consistently than method 2 for the three datasets, suggesting that they are safer to use. Availability and Implementation: http://mafft.cbrc.jp/alignment/software/ Contact: katoh@ifrec.osaka-u.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27378296

  12. FastRNABindR: Fast and Accurate Prediction of Protein-RNA Interface Residues.

    PubMed

    El-Manzalawy, Yasser; Abbas, Mostafa; Malluhi, Qutaibah; Honavar, Vasant

    2016-01-01

    A wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses are mediated by RNA-protein interactions. However, experimental determination of the structures of protein-RNA complexes is expensive and technically challenging. Hence, a number of computational tools have been developed for predicting protein-RNA interfaces. Some of the state-of-the-art protein-RNA interface predictors rely on position-specific scoring matrix (PSSM)-based encoding of the protein sequences. The computational efforts needed for generating PSSMs severely limits the practical utility of protein-RNA interface prediction servers. In this work, we experiment with two approaches, random sampling and sequence similarity reduction, for extracting a representative reference database of protein sequences from more than 50 million protein sequences in UniRef100. Our results suggest that random sampled databases produce better PSSM profiles (in terms of the number of hits used to generate the profile and the distance of the generated profile to the corresponding profile generated using the entire UniRef100 data as well as the accuracy of the machine learning classifier trained using these profiles). Based on our results, we developed FastRNABindR, an improved version of RNABindR for predicting protein-RNA interface residues using PSSM profiles generated using 1% of the UniRef100 sequences sampled uniformly at random. To the best of our knowledge, FastRNABindR is the only protein-RNA interface residue prediction online server that requires generation of PSSM profiles for query sequences and accepts hundreds of protein sequences per submission. 
    Our approach for determining the optimal BLAST database for a protein-RNA interface residue classification task has the potential of substantially speeding up, and hence increasing the practical utility of, other amino acid sequence based predictors of protein-protein and protein-DNA interfaces.
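
    The uniform random sampling of a reference database described above can be sketched as follows. This is a minimal illustration, not the FastRNABindR code: the FASTA parser, the function names, and the choice of reservoir sampling (which draws a fixed-size uniform sample in one pass without loading the whole database) are assumptions made for this example.

```python
import random


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reservoir_sample(records, k, seed=0):
    """Return k records chosen uniformly at random from a stream."""
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)
        else:
            # Record i replaces a kept record with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = rec
    return reservoir


# Hypothetical toy database of 1000 sequences; keep 1% of them.
toy_fasta = []
for n in range(1000):
    toy_fasta.append(">seq%d" % n)
    toy_fasta.append("MKV" + "A" * (n % 7))
subset = reservoir_sample(read_fasta(toy_fasta), k=10)
print(len(subset))
```

    For a real database the sampled records would be written back to a FASTA file and indexed before the profile search; that step is omitted here.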

  13. Structural basis for microRNA targeting

    DOE PAGES

    Schirle, Nicole T.; Sheu-Gruttadauria, Jessica; MacRae, Ian J.

    2014-10-31

    MicroRNAs (miRNAs) control expression of thousands of genes in plants and animals. miRNAs function by guiding Argonaute proteins to complementary sites in messenger RNAs (mRNAs) targeted for repression. In this paper, we determined crystal structures of human Argonaute-2 (Ago2) bound to a defined guide RNA with and without target RNAs representing miRNA recognition sites. These structures suggest a stepwise mechanism, in which Ago2 primarily exposes guide nucleotides (nt) 2 to 5 for initial target pairing. Pairing to nt 2 to 5 promotes conformational changes that expose nt 2 to 8 and 13 to 16 for further target recognition. Interactions with the guide-target minor groove allow Ago2 to interrogate target RNAs in a sequence-independent manner, whereas an adenosine binding-pocket opposite guide nt 1 further facilitates target recognition. Spurious slicing of miRNA targets is avoided through an inhibitory coordination of one catalytic magnesium ion. Finally, these results explain the conserved nucleotide-pairing patterns in animal miRNA target sites first observed over two decades ago.

  14. Genetic structure of Eurasian and North American Leymus (Triticeae) wildryes assessed by chloroplast DNA sequences and AFLP profiles

    Treesearch

    C. Mae Culumber; Steve R. Larson; Kevin B. Jensen; Thomas A. Jones

    2011-01-01

    Leymus is a genomically defined allopolyploid genus of the tribe Triticeae with two distinct subgenomes. Chloroplast DNA sequences of Eurasian and North American species are distinct and polyphyletic. However, phylogenies derived from chloroplast and nuclear DNA sequences are confounded by polyploidy and lack of polymorphism among many taxa. The AFLP technique can resolve...

  15. Association of gut microbiota with post-operative clinical course in Crohn’s disease

    PubMed Central

    2013-01-01

    Background The gut microbiome is altered in Crohn’s disease. Although individual taxa have been correlated with post-operative clinical course, global trends in microbial diversity have not been described in this context. Methods We collected mucosal biopsies from the terminal ileum and ascending colon during surgery and post-operative colonoscopy in 6 Crohn’s patients undergoing ileocolic resection (and 40 additional Crohn’s and healthy control patients undergoing either surgery or colonoscopy). Using next-generation sequencing technology, we profiled the gut microbiota in order to identify changes associated with remission or recurrence of inflammation. Results We performed 16S ribosomal profiling using 101 base-pair single-end sequencing on the Illumina GAIIx platform with deep coverage, at an average depth of 1.3 million high quality reads per sample. At the time of surgery, Crohn’s patients who would remain in remission were more similar to controls and more species-rich than Crohn’s patients with subsequent recurrence. Patients remaining in remission also exhibited greater stability of the microbiota through time. Conclusions These observations permitted an association of gut microbial profiles with probability of recurrence in this limited single-center study. These results suggest that profiling the gut microbiota may be useful in guiding treatment of Crohn’s patients undergoing surgery. PMID:23964800

  16. The limits of protein sequence comparison?

    PubMed Central

    Pearson, William R; Sierk, Michael L

    2010-01-01

    Modern sequence alignment algorithms are used routinely to identify homologous proteins, proteins that share a common ancestor. Homologous proteins always share similar structures and often have similar functions. Over the past 20 years, sequence comparison has become both more sensitive, largely because of profile-based methods, and more reliable, because of more accurate statistical estimates. As sequence and structure databases become larger, and comparison methods become more powerful, reliable statistical estimates will become even more important for distinguishing similarities that are due to homology from those that are due to analogy (convergence). The newest sequence alignment methods are more sensitive than older methods, but more accurate statistical estimates are needed for their full power to be realized. PMID:15919194

  17. An Application of Cartesian Graphing to Seismic Exploration.

    ERIC Educational Resources Information Center

    Robertson, Douglas Frederick

    1992-01-01

    Describes how college students enrolled in a course in elementary algebra apply graphing and algebra to data collected from a seismic profile to uncover the structure of a subterranean rock formation. Includes steps guiding the activity. (MDH)

  18. A prospective pilot study of genome-wide exome and transcriptome profiling in patients with small cell lung cancer progressing after first-line therapy

    PubMed Central

    Byron, Sara A.; Aldrich, Jessica; Sangal, Ashish; Barilla, Heather; Kiefer, Jeffrey A.; Carpten, John D.; Craig, David W.; Whitsett, Timothy G.

    2017-01-01

    Background Small cell lung cancer (SCLC) that has progressed after first-line therapy is an aggressive disease with few effective therapeutic strategies. In this prospective study, we employed next-generation sequencing (NGS) to identify therapeutically actionable alterations to guide treatment for advanced SCLC patients. Methods Twelve patients with SCLC were enrolled after failing platinum-based chemotherapy. Following informed consent, genome-wide exome and RNA-sequencing was performed in a CLIA-certified, CAP-accredited environment. Actionable targets were identified and therapeutic recommendations made from a pharmacopeia of FDA-approved drugs. Clinical response to genomically-guided treatment was evaluated by Response Evaluation Criteria in Solid Tumors (RECIST) 1.1. Results The study completed its accrual goal of 12 evaluable patients. The minimum tumor content for successful NGS was 20%, with a median turnaround time from sample collection to genomics-based treatment recommendation of 27 days. At least two clinically actionable targets were identified in each patient, and six patients (50%) received treatment identified by NGS. Two had partial responses by RECIST 1.1 on a clinical trial involving a PD-1 inhibitor + irinotecan (indicated by MLH1 alteration). The remaining patients had clinical deterioration before NGS recommended therapy could be initiated. Conclusions Comprehensive genomic profiling using NGS identified clinically-actionable alterations in SCLC patients who progressed on initial therapy. Recommended PD-1 therapy generated partial responses in two patients. Earlier access to NGS guided therapy, along with improved understanding of those SCLC patients likely to respond to immune-based therapies, should help to extend survival in these cases with poor outcomes. PMID:28586388

  19. Advanced colorectal adenoma related gene expression signature may predict prognostic for colorectal cancer patients with adenoma-carcinoma sequence.

    PubMed

    Li, Bing; Shi, Xiao-Yu; Liao, Dai-Xiang; Cao, Bang-Rong; Luo, Cheng-Hua; Cheng, Shu-Jun

    2015-01-01

    There are still no absolute parameters predicting progression of adenoma into cancer. The present study aimed to characterize functional differences on the multistep carcinogenetic process from the adenoma-carcinoma sequence. All samples were collected and mRNA expression profiling was performed by using Agilent Microarray high-throughput gene-chip technology. Then, the characteristics of mRNA expression profiles of adenoma-carcinoma sequence were described with bioinformatics software, and we analyzed the relationship between gene expression profiles of adenoma-adenocarcinoma sequence and clinical prognosis of colorectal cancer. The mRNA expressions of adenoma-carcinoma sequence were significantly different between high-grade intraepithelial neoplasia group and adenocarcinoma group. The biological process of gene ontology function enrichment analysis on differentially expressed genes between high-grade intraepithelial neoplasia group and adenocarcinoma group showed that genes enriched in the extracellular structure organization, skeletal system development, biological adhesion and itself regulated growth regulation, with the P value after FDR correction of less than 0.05. In addition, IPR-related protein mainly focused on the insulin-like growth factor binding proteins. The variable trends of gene expression profiles for adenoma-carcinoma sequence were mainly concentrated in high-grade intraepithelial neoplasia and adenocarcinoma. The differentially expressed genes are significantly correlated between high-grade intraepithelial neoplasia group and adenocarcinoma group. Bioinformatics analysis is an effective way to study the gene expression profiles in the adenoma-carcinoma sequence, and may provide an effective tool to involve colorectal cancer research strategy into colorectal adenoma or advanced adenoma.

  20. Hierarchical kernel mixture models for the prediction of AIDS disease progression using HIV structural gp120 profiles

    PubMed Central

    2010-01-01

    Changes to the glycosylation profile on HIV gp120 can influence viral pathogenesis and alter AIDS disease progression. The characterization of glycosylation differences at the sequence level is inadequate as the placement of carbohydrates is structurally complex. However, no structural framework is available to date for the study of HIV disease progression. In this study, we propose a novel machine-learning based framework for the prediction of AIDS disease progression in three stages (RP, SP, and LTNP) using the HIV structural gp120 profile. This new intelligent framework proves to be accurate and provides an important benchmark for predicting AIDS disease progression computationally. The model is trained using a novel HIV gp120 glycosylation structural profile to detect possible stages of AIDS disease progression for the target sequences of HIV+ individuals. The performance of the proposed model was compared to seven existing different machine-learning models on newly proposed gp120-Benchmark_1 dataset in terms of error-rate (MSE), accuracy (CCI), stability (STD), and complexity (TBM). The novel framework showed better predictive performance with 67.82% CCI, 30.21 MSE, 0.8 STD, and 2.62 TBM on the three stages of AIDS disease progression of 50 HIV+ individuals. This framework is an invaluable bioinformatics tool that will be useful to the clinical assessment of viral pathogenesis. PMID:21143806

  1. CRISPR/Cas9 in Genome Editing and Beyond.

    PubMed

    Wang, Haifeng; La Russa, Marie; Qi, Lei S

    2016-06-02

    The Cas9 protein (CRISPR-associated protein 9), derived from type II CRISPR (clustered regularly interspaced short palindromic repeats) bacterial immune systems, is emerging as a powerful tool for engineering the genome in diverse organisms. As an RNA-guided DNA endonuclease, Cas9 can be easily programmed to target new sites by altering its guide RNA sequence, and its development as a tool has made sequence-specific gene editing several magnitudes easier. The nuclease-deactivated form of Cas9 further provides a versatile RNA-guided DNA-targeting platform for regulating and imaging the genome, as well as for rewriting the epigenetic status, all in a sequence-specific manner. With all of these advances, we have just begun to explore the possible applications of Cas9 in biomedical research and therapeutics. In this review, we describe the current models of Cas9 function and the structural and biochemical studies that support it. We focus on the applications of Cas9 for genome editing, regulation, and imaging, discuss other possible applications and some technical considerations, and highlight the many advantages that CRISPR/Cas9 technology offers.

  2. UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures.

    PubMed

    Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier

    2016-01-04

    The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. A computational proposal for designing structured RNA pools for in vitro selection of RNAs.

    PubMed

    Kim, Namhee; Gan, Hin Hark; Schlick, Tamar

    2007-04-01

    Although in vitro selection technology is a versatile experimental tool for discovering novel synthetic RNA molecules, finding complex RNA molecules is difficult because most RNAs identified from random sequence pools are simple motifs, consistent with recent computational analysis of such sequence pools. Thus, enriching in vitro selection pools with complex structures could increase the probability of discovering novel RNAs. Here we develop an approach for engineering sequence pools that links RNA sequence space regions with corresponding structural distributions via a "mixing matrix" approach combined with a graph theory analysis. We define five classes of mixing matrices motivated by covariance mutations in RNA; these constructs define nucleotide transition rates and are applied to chosen starting sequences to yield specific nonrandom pools. We examine the coverage of sequence space as a function of the mixing matrix and starting sequence via clustering analysis. We show that, in contrast to random sequences, which are associated only with a local region of sequence space, our designed pools, including a structured pool for GTP aptamers, can target specific motifs. It follows that experimental synthesis of designed pools can benefit from using optimized starting sequences, mixing matrices, and pool fractions associated with each of our constructed pools as a guide. Automation of our approach could provide practical tools for pool design applications for in vitro selection of RNAs and related problems.
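
    To make the idea of applying a mixing matrix to a starting sequence concrete, the sketch below treats the matrix as per-position substitution probabilities and draws a nonrandom pool around one starting sequence. The matrix values, the starting sequence, and the function name are hypothetical; the five matrix classes and the graph-theory analysis from the paper are not reproduced.

```python
import random

NUCLEOTIDES = "ACGU"

# Hypothetical row-stochastic mixing matrix: the row is the nucleotide in the
# starting sequence, the columns are the probabilities of emitting A, C, G, U.
MIXING_MATRIX = {
    "A": [0.70, 0.10, 0.10, 0.10],
    "C": [0.10, 0.70, 0.10, 0.10],
    "G": [0.10, 0.10, 0.70, 0.10],
    "U": [0.10, 0.10, 0.10, 0.70],
}


def generate_pool(start, matrix, size, seed=0):
    """Draw `size` sequences by mutating `start` position by position."""
    rng = random.Random(seed)
    pool = []
    for _ in range(size):
        seq = "".join(
            rng.choices(NUCLEOTIDES, weights=matrix[base], k=1)[0]
            for base in start
        )
        pool.append(seq)
    return pool


# Hypothetical starting sequence, chosen only for illustration.
pool = generate_pool("GGCACUUCGGUGCC", MIXING_MATRIX, size=5)
for seq in pool:
    print(seq)
```

    Raising the diagonal entries keeps the pool close to the starting sequence, while flatter rows push it toward a fully random pool.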

  4. Energy--Structure--Life, A Learning System for Understanding Science.

    ERIC Educational Resources Information Center

    Bixby, Louis W.; And Others

    Material for the first year of Energy/Structure/Life, a two-year high school program in integrated science, is contained in this learning guide. The program, a sequence of physics, chemistry, and biology, presents the physical science phase during the first year with these 13 chapters: (1) distance/time/velocity; (2) velocity/change/acceleration;…

  5. Pattern similarity study of functional sites in protein sequences: lysozymes and cystatins

    PubMed Central

    Nakai, Shuryo; Li-Chan, Eunice CY; Dou, Jinglie

    2005-01-01

    Background Although it is generally agreed that topography is more conserved than sequences, proteins sharing the same fold can have different functions, while there are protein families with low sequence similarity. An alternative method for profile analysis of characteristic conserved positions of the motifs within the 3D structures may be needed for functional annotation of protein sequences. Using the approach of quantitative structure-activity relationships (QSAR), we have proposed a new algorithm for postulating functional mechanisms on the basis of pattern similarity and average of property values of side-chains in segments within sequences. This approach was used to search for functional sites of proteins belonging to the lysozyme and cystatin families. Results Hydrophobicity and β-turn propensity of reference segments with 3–7 residues were used for the homology similarity search (HSS) for active sites. Hydrogen bonding was used as the side-chain property for searching the binding sites of lysozymes. The profiles of similarity constants and average values of these parameters as functions of their positions in the sequences could identify both active and substrate binding sites of the lysozyme of Streptomyces coelicolor, which has been reported as a new fold enzyme (Cellosyl). The same approach was successfully applied to cystatins, especially for postulating the mechanisms of amyloidosis of human cystatin C as well as human lysozyme. Conclusion Pattern similarity and average index values of structure-related properties of side chains in short segments of three residues or longer were, for the first time, successfully applied for predicting functional sites in sequences. This new approach may be applicable to studying functional sites in un-annotated proteins, for which complete 3D structures are not yet available. PMID:15904486
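
    The segment-based search described above (a pattern similarity value and an average property value computed along the sequence) can be illustrated with a sliding window. The sketch is a stand-in under stated assumptions, not the authors' HSS algorithm: it uses the Kyte-Doolittle hydropathy scale as the side-chain property and a Pearson correlation as the similarity measure, and the sequences and function names are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if one is flat)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def scan(sequence, reference, scale=KYTE_DOOLITTLE):
    """Return (start, similarity, mean property) for every window."""
    ref = [scale[aa] for aa in reference]
    width = len(ref)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = [scale[aa] for aa in sequence[start:start + width]]
        hits.append((start, pearson(window, ref), sum(window) / width))
    return hits


# Hypothetical query and five-residue reference segment.
hits = scan("GGSAILKEGG", "AILKE")
best = max(hits, key=lambda h: h[1])
print(best)
```

    Plotting the similarity and the mean property against the window start gives the kind of position-wise profile the abstract refers to; the reference segment length can be varied between 3 and 7 residues as in the paper.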

  6. mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

    PubMed

    Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

    2013-08-15

    Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.

  7. A low-complexity add-on score for protein remote homology search with COMER.

    PubMed

    Margelevicius, Mindaugas

    2018-06-15

    Protein sequence alignment forms the basis for comparative modeling, the most reliable approach to protein structure prediction, among many other applications. Alignment between sequence families, or profile-profile alignment, represents one of the most, if not the most, sensitive means for homology detection but still necessitates improvement. We aim at improving the quality of profile-profile alignments and the sensitivity induced by them by refining profile-profile substitution scores. We have developed a new score that represents an additional component of profile-profile substitution scores. A comprehensive evaluation shows that the new add-on score statistically significantly improves both the sensitivity and the alignment quality of the COMER method. We discuss why the score leads to the improvement and its almost optimal computational complexity that makes it easily implementable in any profile-profile alignment method. An implementation of the add-on score in the open-source COMER software and data are available at https://sourceforge.net/projects/comer. The COMER software is also available on Github at https://github.com/minmarg/comer and as a Docker image (minmar/comer). Supplementary data are available at Bioinformatics online.

  8. Single molecule sequencing-guided scaffolding and correction of draft assemblies.

    PubMed

    Zhu, Shenglong; Chen, Danny Z; Emrich, Scott J

    2017-12-06

    Although single molecule sequencing is still improving, the lengths of the generated sequences are inevitably an advantage in genome assembly. Prior work that utilizes long reads to conduct genome assembly has mostly focused on correcting sequencing errors and improving contiguity of de novo assemblies. We propose a disassembling-reassembling approach for both correcting structural errors in the draft assembly and scaffolding a target assembly based on error-corrected single molecule sequences. To achieve this goal, we formulate a maximum alternating path cover problem. We prove that this problem is NP-hard, and solve it by a 2-approximation algorithm. Our experimental results show that our approach can improve the structural correctness of target assemblies at the cost of some contiguity, even with smaller amounts of long reads. In addition, our reassembling process can also serve as a competitive scaffolder relative to well-established assembly benchmarks.

  9. Comprehensive analysis of RNA-protein interactions by high-throughput sequencing-RNA affinity profiling.

    PubMed

    Tome, Jacob M; Ozer, Abdullah; Pagano, John M; Gheba, Dan; Schroth, Gary P; Lis, John T

    2014-06-01

    RNA-protein interactions play critical roles in gene regulation, but methods to quantitatively analyze these interactions at a large scale are lacking. We have developed a high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay by adapting a high-throughput DNA sequencer to quantify the binding of fluorescently labeled protein to millions of RNAs anchored to sequenced cDNA templates. Using HiTS-RAP, we measured the affinity of mutagenized libraries of GFP-binding and NELF-E-binding aptamers to their respective targets and identified critical regions of interaction. Mutations additively affected the affinity of the NELF-E-binding aptamer, whose interaction depended mainly on a single-stranded RNA motif, but not that of the GFP aptamer, whose interaction depended primarily on secondary structure.

  10. Effect of lipophilicity modulation on inhibition of human rhinovirus capsid binders.

    PubMed

    Morley, Andrew; Tomkinson, Nicholas; Cook, Andrew; MacDonald, Catherine; Weaver, Richard; King, Sarah; Jenkinson, Lesley; Unitt, John; McCrae, Christopher; Phillips, Tim

    2011-10-15

    To try to generate broad spectrum human rhinovirus VP1 inhibitors with more attractive physicochemical, DMPK and safety profiles, we explored the current SAR of known VP1 compounds. This led to the identification of specific structural regions where reduction in polarity can be achieved, so guiding chemistry to analogues with significantly superior profiles to previously reported inhibitors. Copyright © 2011 Elsevier Ltd. All rights reserved.

  11. Evolutionary profiles derived from the QR factorization of multiple structural alignments gives an economy of information.

    PubMed

    O'Donoghue, Patrick; Luthey-Schulten, Zaida

    2005-02-25

    We present a new algorithm, based on the multidimensional QR factorization, to remove redundancy from a multiple structural alignment by choosing representative protein structures that best preserve the phylogenetic tree topology of the homologous group. The classical QR factorization with pivoting, developed as a fast numerical solution to eigenvalue and linear least-squares problems of the form Ax=b, was designed to re-order the columns of A by increasing linear dependence. Removing the most linearly dependent columns from A leads to the formation of a minimal basis set which well spans the phase space of the problem at hand. By recasting the problem of redundancy in multiple structural alignments into this framework, in which the matrix A now describes the multiple alignment, we adapted the QR factorization to produce a minimal basis set of protein structures which best spans the evolutionary (phase) space. The non-redundant and representative profiles obtained from this procedure, termed evolutionary profiles, are shown in initial results to outperform well-tested profiles in homology detection searches over a large sequence database. A measure of structural similarity between homologous proteins, Q(H), is presented. By properly accounting for the effect and presence of gaps, a phylogenetic tree computed using this metric is shown to be congruent with the maximum-likelihood sequence-based phylogeny. The results indicate that evolutionary information is indeed recoverable from the comparative analysis of protein structure alone. Applications of the QR ordering and this structural similarity metric to analyze the evolution of structure among key, universally distributed proteins involved in translation, and to the selection of representatives from an ensemble of NMR structures are also discussed.
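
    The column re-ordering that this abstract attributes to the classical QR factorization with pivoting can be illustrated with a small pure-Python sketch. It covers only the classical case on an ordinary matrix, not the authors' multidimensional extension to structural alignments; the function name `pivoted_column_order` and the toy matrix are hypothetical, and the greedy rule (pick the column with the largest remaining norm, then project it out of the others) is the standard column-pivoting strategy implemented here with modified Gram-Schmidt.

```python
def pivoted_column_order(columns, tol=1e-12):
    """Order column indices as QR with column pivoting would.

    `columns` is a list of equal-length lists, one per column of A.
    Early indices are the most linearly independent columns; late indices
    are (nearly) linear combinations of the earlier ones.
    """
    residual = [[float(v) for v in col] for col in columns]
    remaining = list(range(len(columns)))
    order = []
    while remaining:
        # Norm of what is left of each column after earlier projections.
        norms = {j: sum(v * v for v in residual[j]) ** 0.5 for j in remaining}
        pivot = max(remaining, key=lambda j: norms[j])
        order.append(pivot)
        remaining.remove(pivot)
        if norms[pivot] <= tol:
            continue  # remaining columns are numerically dependent
        q = [v / norms[pivot] for v in residual[pivot]]
        for j in remaining:
            dot = sum(a * b for a, b in zip(q, residual[j]))
            residual[j] = [a - dot * b for a, b in zip(residual[j], q)]
    return order


# Toy matrix: columns 0 and 1 are nearly identical, so one of them is redundant.
cols = [[1.0, 0.0, 0.0], [1.01, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.5]]
order = pivoted_column_order(cols)
representatives = order[:3]  # keep the three least redundant columns
```

    In this toy case the redundant near-duplicate column ends up last in the ordering, which mirrors the idea of dropping the most linearly dependent columns to obtain a minimal basis set. How the alignment is encoded as the matrix A is specific to the paper and is not reproduced here.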

  12. The crystal structure of an oligo(U):pre-mRNA duplex from a trypanosome RNA editing substrate

    PubMed Central

    Mooers, Blaine H.M.; Singh, Amritanshu

    2011-01-01

    Guide RNAs bind antiparallel to their target pre-mRNAs to form editing substrates in reaction cycles that insert or delete uridylates (Us) in most mitochondrial transcripts of trypanosomes. The 5′ end of each guide RNA has an anchor sequence that binds to the pre-mRNA by base-pair complementarity. The template sequence in the middle of the guide RNA directs the editing reactions. The 3′ ends of most guide RNAs have ∼15 contiguous Us that bind to the purine-rich unedited pre-mRNA upstream of the editing site. The resulting U-helix is rich in G·U wobble base pairs. To gain insights into the structure of the U-helix, we crystallized 8 bp of the U-helix in one editing substrate for the A6 mRNA of Trypanosoma brucei. The fragment provides three samples of the 5′-AGA-3′/5′-UUU-3′ base-pair triple. The fusion of two identical U-helices head-to-head promoted crystallization. We obtained X-ray diffraction data with a resolution limit of 1.37 Å. The U-helix had low and high twist angles before and after each G·U wobble base pair; this variation was partly due to shearing of the wobble base pairs as revealed in comparisons with a crystal structure of a 16-nt RNA with all Watson–Crick base pairs. Both crystal structures had wider major grooves at the junction between the poly(U) and polypurine tracts. This junction mimics the junction between the template helix and the U-helix in RNA-editing substrates and may be a site of major groove invasion by RNA editing proteins. PMID:21878548

  13. ATLAS, an integrated structural analysis and design system. Volume 1: ATLAS user's guide

    NASA Technical Reports Server (NTRS)

    Dreisbach, R. L. (Editor)

    1979-01-01

    Some of the many analytical capabilities provided by the ATLAS Version 4.0 System are described in the logical sequence in which model-definition data are prepared and the subsequent computer job is executed. The example data presented and the fundamental technical considerations that are highlighted can be used as guides during the problem solving process. This guide does not describe the details of the ATLAS capabilities, but provides an introduction to the new user of ATLAS to the level at which the complete array of capabilities described in the ATLAS User's Manual can be exploited fully.

  14. Crack depth profiling using guided wave angle dependent reflectivity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Volker, Arno; Pahlavan, Lotfollah; Blacquiere, Gerrit

    2015-03-31

    Tomographic corrosion monitoring techniques have been developed, using two rings of sensors around the circumference of a pipe. This technique is capable of providing a detailed wall thickness map; however, this might not be the only type of structural damage. Therefore this concept is expanded to detect and size cracks and small corrosion defects like root corrosion. The expanded concept uses two arrays of guided-wave transducers, collecting both reflection and transmission data. The data is processed such that the angle-dependent reflectivity is obtained without using a baseline signal of a defect-free situation. The angle-dependent reflectivity is the input of an inversion scheme that calculates a crack depth profile. From this profile, the depth and length of the crack can be determined. Preliminary experiments show encouraging results. The depth sizing accuracy is in the order of 0.5 mm.

  15. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding.

    PubMed

    Shahi, Payam; Kim, Samuel C; Haliburton, John R; Gartner, Zev J; Abate, Adam R

    2017-03-14

    Proteins are the primary effectors of cellular function, including cellular metabolism, structural dynamics, and information processing. However, quantitative characterization of proteins at the single-cell level is challenging due to the tiny amount of protein available. Here, we present Abseq, a method to detect and quantitate proteins in single cells at ultrahigh throughput. Like flow and mass cytometry, Abseq uses specific antibodies to detect epitopes of interest; however, unlike these methods, antibodies are labeled with sequence tags that can be read out with microfluidic barcoding and DNA sequencing. We demonstrate this novel approach by characterizing surface proteins of different cell types at the single-cell level and distinguishing between the cells by their protein expression profiles. DNA-tagged antibodies provide multiple advantages for profiling proteins in single cells, including the ability to amplify low-abundance tags to make them detectable with sequencing, to use molecular indices for quantitative results, and essentially limitless multiplexing.

  16. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding

    NASA Astrophysics Data System (ADS)

    Shahi, Payam; Kim, Samuel C.; Haliburton, John R.; Gartner, Zev J.; Abate, Adam R.

    2017-03-01

    Proteins are the primary effectors of cellular function, including cellular metabolism, structural dynamics, and information processing. However, quantitative characterization of proteins at the single-cell level is challenging due to the tiny amount of protein available. Here, we present Abseq, a method to detect and quantitate proteins in single cells at ultrahigh throughput. Like flow and mass cytometry, Abseq uses specific antibodies to detect epitopes of interest; however, unlike these methods, antibodies are labeled with sequence tags that can be read out with microfluidic barcoding and DNA sequencing. We demonstrate this novel approach by characterizing surface proteins of different cell types at the single-cell level and distinguishing between the cells by their protein expression profiles. DNA-tagged antibodies provide multiple advantages for profiling proteins in single cells, including the ability to amplify low-abundance tags to make them detectable with sequencing, to use molecular indices for quantitative results, and essentially limitless multiplexing.

  17. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding

    PubMed Central

    Shahi, Payam; Kim, Samuel C.; Haliburton, John R.; Gartner, Zev J.; Abate, Adam R.

    2017-01-01

    Proteins are the primary effectors of cellular function, including cellular metabolism, structural dynamics, and information processing. However, quantitative characterization of proteins at the single-cell level is challenging due to the tiny amount of protein available. Here, we present Abseq, a method to detect and quantitate proteins in single cells at ultrahigh throughput. Like flow and mass cytometry, Abseq uses specific antibodies to detect epitopes of interest; however, unlike these methods, antibodies are labeled with sequence tags that can be read out with microfluidic barcoding and DNA sequencing. We demonstrate this novel approach by characterizing surface proteins of different cell types at the single-cell level and distinguishing between the cells by their protein expression profiles. DNA-tagged antibodies provide multiple advantages for profiling proteins in single cells, including the ability to amplify low-abundance tags to make them detectable with sequencing, to use molecular indices for quantitative results, and essentially limitless multiplexing. PMID:28290550

  18. MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

    PubMed

    Fang, Chao; Shang, Yi; Xu, Dong

    2018-05-01

    Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physicochemical properties of amino acids, PSI-BLAST profile, and HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids in making accurate predictions. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html. © 2018 Wiley Periodicals, Inc.

  19. Aromatic claw: A new fold with high aromatic content that evades structural prediction: Aromatic Claw

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sachleben, Joseph R.; Adhikari, Aashish N.; Gawlak, Grzegorz

    2016-11-10

    We determined the NMR structure of a highly aromatic (13%) protein of unknown function, Aq1974 from Aquifex aeolicus (PDB ID: 5SYQ). The unusual sequence of this protein has a tryptophan content five times the normal (six tryptophan residues of 114 or 5.2% while the average tryptophan content is 1.0%) with the tryptophans occurring in a WXW motif. It has no detectable sequence homology with known protein structures. Although its NMR spectrum suggested that the protein was rich in β-sheet, upon resonance assignment and solution structure determination, the protein was found to be primarily α-helical with a small two-stranded β-sheet with a novel fold that we have termed an Aromatic Claw. As this fold was previously unknown and the sequence unique, we submitted the sequence to CASP10 as a target for blind structural prediction. At the end of the competition, the sequence was classified a hard template based model; the structural relationship between the template and the experimental structure was small and the predictions all failed to predict the structure. CSRosetta was found to predict the secondary structure and its packing; however, it was found that there was little correlation between CSRosetta score and the RMSD between the CSRosetta structure and the NMR determined one. This work demonstrates that even in relatively small proteins, we do not yet have the capacity to accurately predict the fold for all primary sequences. The experimental discovery of new folds helps guide the improvement of structural prediction methods.

  20. Peculiar Evolutionary History of miR390-Guided TAS3-Like Genes in Land Plants

    PubMed Central

    Krasnikova, Maria S.; Goryunov, Denis V.; Troitsky, Alexey V.; Solovyev, Andrey G.; Ozerova, Lydmila V.; Morozov, Sergey Y.

    2013-01-01

    A PCR-based approach was used as a phylogenetic profiling tool to probe genomic DNA samples from representatives of evolutionarily distant moss taxa, namely, classes Bryopsida, Tetraphidopsida, Polytrichopsida, Andreaeopsida, and Sphagnopsida. We found relatives of all Physcomitrella patens miR390 and TAS3-like loci in these plant taxa excluding Sphagnopsida. Importantly, cloning and sequencing of Marchantia polymorpha genomic DNA showed miR390 and TAS3-like sequences which were also found among genomic reads of M. polymorpha at NCBI database. Our data suggest that the ancient plant miR390-dependent TAS molecular machinery first evolved to target AP2-like mRNAs in Marchantiophyta and only then both ARF- and AP2-specific mRNAs in mosses. The presented analysis shows that moss TAS3 families may have undergone losses of tasiAP2 sites during evolution toward ferns and seed plants. These data confirm that miR390-guided genes coding for ARF- and AP2-specific ta-siRNAs have been gradually changed during land plant evolution. PMID:24302881

  1. DSISoft—a MATLAB VSP data processing package

    NASA Astrophysics Data System (ADS)

    Beaty, K. S.; Perron, G.; Kay, I.; Adam, E.

    2002-05-01

    DSISoft is a public domain vertical seismic profile processing software package developed at the Geological Survey of Canada. DSISoft runs under MATLAB version 5.0 and above and hence is portable between computer operating systems supported by MATLAB (i.e. Unix, Windows, Macintosh, Linux). The package includes processing modules for reading and writing various standard seismic data formats, performing data editing, sorting, filtering, and other basic processing modules. The processing sequence can be scripted allowing batch processing and easy documentation. A structured format has been developed to ensure future additions to the package are compatible with existing modules. Interactive modules have been created using MATLAB's graphical user interface builder for displaying seismic data, picking first break times, examining frequency spectra, doing f-k filtering, and plotting the trace header information. DSISoft's modular design facilitates the incorporation of new processing algorithms as they are developed. This paper gives an overview of the scope of the software and serves as a guide for the addition of new modules.

  2. Skin Microbiome Surveys Are Strongly Influenced by Experimental Design.

    PubMed

    Meisel, Jacquelyn S; Hannigan, Geoffrey D; Tyldsley, Amanda S; SanMiguel, Adam J; Hodkinson, Brendan P; Zheng, Qi; Grice, Elizabeth A

    2016-05-01

    Culture-independent studies to characterize skin microbiota are increasingly common, due in part to affordable and accessible sequencing and analysis platforms. Compared to culture-based techniques, DNA sequencing of the bacterial 16S ribosomal RNA (rRNA) gene or whole metagenome shotgun (WMS) sequencing provides more precise microbial community characterizations. Most widely used protocols were developed to characterize microbiota of other habitats (i.e., gastrointestinal) and have not been systematically compared for their utility in skin microbiome surveys. Here we establish a resource for the cutaneous research community to guide experimental design in characterizing skin microbiota. We compare two widely sequenced regions of the 16S rRNA gene to WMS sequencing for recapitulating skin microbiome community composition, diversity, and genetic functional enrichment. We show that WMS sequencing most accurately recapitulates microbial communities, but sequencing of hypervariable regions 1-3 of the 16S rRNA gene provides highly similar results. Sequencing of hypervariable region 4 poorly captures skin commensal microbiota, especially Propionibacterium. WMS sequencing, which is resource and cost intensive, provides evidence of a community's functional potential; however, metagenome predictions based on 16S rRNA sequence tags closely approximate WMS genetic functional profiles. This study highlights the importance of experimental design for downstream results in skin microbiome surveys. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  3. Skin microbiome surveys are strongly influenced by experimental design

    PubMed Central

    Meisel, Jacquelyn S.; Hannigan, Geoffrey D.; Tyldsley, Amanda S.; SanMiguel, Adam J.; Hodkinson, Brendan P.; Zheng, Qi; Grice, Elizabeth A.

    2016-01-01

    Culture-independent studies to characterize skin microbiota are increasingly common, due in part to affordable and accessible sequencing and analysis platforms. Compared to culture-based techniques, DNA sequencing of the bacterial 16S ribosomal RNA (rRNA) gene or whole metagenome shotgun (WMS) sequencing provide more precise microbial community characterizations. Most widely used protocols were developed to characterize microbiota of other habitats (i.e. gastrointestinal), and have not been systematically compared for their utility in skin microbiome surveys. Here we establish a resource for the cutaneous research community to guide experimental design in characterizing skin microbiota. We compare two widely sequenced regions of the 16S rRNA gene to WMS sequencing for recapitulating skin microbiome community composition, diversity, and genetic functional enrichment. We show that WMS sequencing most accurately recapitulates microbial communities, but sequencing of hypervariable regions 1-3 of the 16S rRNA gene provides highly similar results. Sequencing of hypervariable region 4 poorly captures skin commensal microbiota, especially Propionibacterium. WMS sequencing, which is resource- and cost-intensive, provides evidence of a community’s functional potential; however, metagenome predictions based on 16S rRNA sequence tags closely approximate WMS genetic functional profiles. This work highlights the importance of experimental design for downstream results in skin microbiome surveys. PMID:26829039

  4. High Dispersion Line Profile Studies of TW HYA and Other Pre-Main Sequence Stars

    NASA Astrophysics Data System (ADS)

    Linsky, Jeffrey L.

    1984-07-01

    We propose to extend our study of line profiles in T Tauri stars by obtaining a 16 hour SWP-HI spectrum of TW Hya and 6-8 hour LWP-HI spectra of TW Hya, AK Sco, CoD -35 10525 and CoD -33 10685. High dispersion spectra of pre-main sequence (PMS) stars provide unique information on line widths, shifts, and asymmetries, as well as evidence for mass outflow, circumstellar absorption, and diagnostics for the temperature structure of the outer atmosphere layers of these complex yet incredibly important objects. We have previously obtained and studied line profiles in RU Lupi and the prototype star T Tau. RU Lupi has line profiles that are dominated by the wind expansion, for example the MgII and FeII multiplet UV1 profiles are unique in that they have a classical P Cygni shape, whereas T Tau has more symmetric emission profiles indicative of a chromosphere and hotter layers not dominated by expansion. TW Hya is different from these two previously studied stars in that it may be the brightest known example of a post-T Tauri star, and hence less active and older than the other PMS stars. We intend to compare its line profiles with those of RU Lupi and T Tauri in order to understand the differences in the non-thermal mass motions, wind expansion, and thermal structures of these three very different T Tau stars. The requested LWP-HI spectra are to obtain MgII and FeII multiplet UV1 profiles of 4 different T Tauri objects so as to infer the expansion and thermal structure in their chromospheric layers.

  5. SCHEMA computational design of virus capsid chimeras: calibrating how genome packaging, protection, and transduction correlate with calculated structural disruption.

    PubMed

    Ho, Michelle L; Adler, Benjamin A; Torre, Michael L; Silberg, Jonathan J; Suh, Junghae

    2013-12-20

    Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effectively transduce cells. Such improvements to the virus design process may help advance not only gene therapy applications but also other bionanotechnologies dependent upon the development of viruses with new sequences and functions.

  6. SCHEMA computational design of virus capsid chimeras: calibrating how genome packaging, protection, and transduction correlate with calculated structural disruption

    PubMed Central

    Ho, Michelle L.; Adler, Benjamin A.; Torre, Michael L.; Silberg, Jonathan J.; Suh, Junghae

    2013-01-01

    Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effectively transduce cells. Such improvements to the virus design process may help advance not only gene therapy applications, but also other bionanotechnologies dependent upon the development of viruses with new sequences and functions. PMID:23899192

  7. Joint refinement model for the spin resolved one-electron reduced density matrix of YTiO3 using magnetic structure factors and magnetic Compton profiles data.

    PubMed

    Gueddida, Saber; Yan, Zeyin; Kibalin, Iurii; Voufack, Ariste Bolivard; Claiser, Nicolas; Souhassou, Mohamed; Lecomte, Claude; Gillon, Béatrice; Gillet, Jean-Michel

    2018-04-28

    In this paper, we propose a simple cluster model with limited basis sets to reproduce the unpaired electron distributions in a YTiO3 ferromagnetic crystal. The spin-resolved one-electron-reduced density matrix is reconstructed simultaneously from theoretical magnetic structure factors and directional magnetic Compton profiles using our joint refinement algorithm. This algorithm is guided by the rescaling of basis functions and the adjustment of the spin population matrix. The resulting spin electron density in both position and momentum spaces from the joint refinement model is in agreement with theoretical and experimental results. Benefits brought from magnetic Compton profiles to the entire spin density matrix are illustrated. We studied the magnetic properties of the YTiO3 crystal along the Ti-O1-Ti bonding. We found that the basis functions are mostly rescaled by means of magnetic Compton profiles, while the molecular occupation numbers are mainly modified by the magnetic structure factors.

  8. The molecular genetic makeup of acute lymphoblastic leukemia | Office of Cancer Genomics

    Cancer.gov

    Abstract: Genomic profiling has transformed our understanding of the genetic basis of acute lymphoblastic leukemia (ALL). Recent years have seen a shift from microarray analysis and candidate gene sequencing to next-generation sequencing. Together, these approaches have shown that many ALL subtypes are characterized by constellations of structural rearrangements, submicroscopic DNA copy number alterations, and sequence mutations, several of which have clear implications for risk stratification and targeted therapeutic intervention.

  9. Self-Study and Evaluation Guide.

    ERIC Educational Resources Information Center

    National Accreditation Council for Agencies Serving the Blind and Visually Handicapped, New York, NY.

    Standards developed for agencies over a 3-year period are presented. The following are provided or specified: a manual of procedures for agency self-study, an agency and community profile, agency function and structure, financial accounting and service reporting, personnel administration and volunteer service, physical facilities, public relations…

  10. Variation in Soil Microbial Community Structure Associated with Different Legume Species Is Greater than that Associated with Different Grass Species

    PubMed Central

    Zhou, Yang; Zhu, Honghui; Fu, Shenglei; Yao, Qing

    2017-01-01

    Plants are essential factors shaping soil microbial community (SMC) structure. While most studies focus on differences in the SMC structure associated with different plant species, the variation in the SMC structure associated with phylogenetically close species is less investigated. Legume (Fabaceae) and grass (Poaceae) are functionally important plant groups; however, their influences on the SMC structure are seldom compared, and the variation in the SMC structure among legume or grass species is largely unknown. In this study, we grew three legume species and three grass species in mesocosms, monitored soil chemical properties, and quantified the abundance of bacteria and fungi. The SMC structure was also characterized using PCR-DGGE and MiSeq sequencing. Results showed that legume and grass differentially affected soil pH, dissolved organic C, total N content, and available P content, and that legume enriched fungi more strongly than grass. Both DGGE profiling and MiSeq sequencing indicated that the bacterial diversity associated with legume was higher than that associated with grass. Whereas legume increased the abundance of Verrucomicrobia, grass decreased it; furthermore, linear discriminant analysis identified some group-specific microbial taxa as potential biomarkers of legume or grass. These data suggest that legume and grass differentially select for the SMC. More importantly, clustering analysis based on both DGGE profiling and MiSeq sequencing demonstrated that the variation in the SMC structure associated with three legume species was greater than that associated with three grass species. PMID:28620371

  11. Profile of small interfering RNAs from cotton plants infected with the polerovirus Cotton leafroll dwarf virus.

    PubMed

    Silva, Tatiane F; Romanel, Elisson A C; Andrade, Roberto R S; Farinelli, Laurent; Østerås, Magne; Deluen, Cécile; Corrêa, Régis L; Schrago, Carlos E G; Vaslin, Maite F S

    2011-08-24

    In response to infection, viral genomes are processed by Dicer-like (DCL) ribonuclease proteins into viral small RNAs (vsRNAs) of discrete sizes. vsRNAs are then used as guides for silencing the viral genome. The profile of vsRNAs produced during the infection process has been extensively studied for some groups of viruses. However, nothing is known about the vsRNAs produced during infections of members of the economically important family Luteoviridae, a group of phloem-restricted viruses. Here, we report the characterization of a population of vsRNAs from cotton plants infected with Cotton leafroll dwarf virus (CLRDV), a member of the genus Polerovirus, family Luteoviridae. Deep sequencing of small RNAs (sRNAs) from leaves of CLRDV-infected cotton plants revealed that the vsRNAs were 21 to 24 nucleotides (nt) long and that their sequences matched the viral genome, with higher frequencies of matches in the 3′ region. There were equivalent amounts of sense and antisense vsRNAs, and the 22-nt class of small RNAs was predominant. During infection, cotton Dcl transcripts appeared to be up-regulated, while Dcl2 appeared to be down-regulated. This is the first report on the profile of sRNAs in a plant infected with a virus from the family Luteoviridae. Our sequence data strongly suggest that virus-derived double-stranded RNA functions as one of the main precursors of vsRNAs. Judging by the profiled size classes, all cotton DCLs might be working to silence the virus. The possible causes for the unexpectedly high accumulation of 22-nt vsRNAs are discussed. CLRDV is the causal agent of Cotton blue disease, which occurs worldwide. Our results are an important contribution to understanding the molecular mechanisms involved in this and related diseases.

  12. Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function

    PubMed Central

    2010-01-01

    Background: Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets. Results: Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. Neither method for training set construction requires any prior experimental characterization. Conclusions: SIMBAL shows that, in functionally divergent protein families, selected short sequences often significantly outperform their full-length parent sequence for making functional predictions by sequence similarity, suggesting avenues for improved functional classifiers. When combined with structural data, SIMBAL affords the ability to localize and model functional sites. PMID:20102603
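
    The two ingredients named in this abstract, a sliding window over the query and a binomial score used to pick a similarity cutoff, can be sketched as follows. This is an illustration under stated assumptions, not the published SIMBAL code: the similarity search itself is not reproduced, `ranked_labels` is a hypothetical stand-in for a window's hit list (best hit first, True for members of the TRUE partition), and `background` is the assumed fraction of TRUE proteins in the whole set.

```python
from math import comb


def sliding_windows(seq, width, step=1):
    """Yield (start, subsequence) for every full-width window along seq."""
    for start in range(0, len(seq) - width + 1, step):
        yield start, seq[start:start + width]


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def best_cutoff(ranked_labels, background):
    """Scan hit-list depths and return (depth, tail probability) with the smallest tail."""
    best = (0, 1.0)
    true_so_far = 0
    for depth, is_true in enumerate(ranked_labels, start=1):
        true_so_far += is_true
        score = binomial_tail(true_so_far, depth, background)
        if score < best[1]:
            best = (depth, score)
    return best


# Hypothetical hit list for one window: TRUE-set members cluster at the top.
hits = [True, True, True, False, True, False, False, False]
depth, score = best_cutoff(hits, background=0.5)
windows = list(sliding_windows("ABCDEF", width=4))
print(depth, score, windows)
```

    Repeating the cutoff scan for each window and plotting the best score against window position would give a per-site profile of the kind the abstract calls "hot spots".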

  13. Comprehensive mutagenesis of HIV-1 protease: a computational geometry approach.

    PubMed

    Masso, Majid; Vaisman, Iosif I

    2003-05-30

    A computational geometry technique based on Delaunay tessellation of protein structure, represented by C(alpha) atoms, is used to study effects of single residue mutations on sequence-structure compatibility in HIV-1 protease. Profiles of residue scores derived from the four-body statistical potential are constructed for all 1881 mutants of the HIV-1 protease monomer and compared with the profile of the wild-type protein. The profiles for an isolated monomer of HIV-1 protease and the identical monomer in a dimeric state with an inhibitor are analyzed to elucidate changes to structural stability. Protease residues shown to undergo the greatest impact are those forming the dimer interface and flap region, as well as those known to be involved in inhibitor binding.
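
    The figure of 1881 mutants follows from simple counting: the HIV-1 protease monomer has 99 residues, and each can be replaced by any of the 19 other standard amino acids (99 × 19 = 1881). A short sketch of that enumeration is shown here; the placeholder sequence and the mutation-label format are assumptions for illustration, not the real protease sequence or the authors' notation.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_mutants(seq):
    """Yield labels such as 'A1C' for every single-residue substitution of seq."""
    for pos, wild_type in enumerate(seq, start=1):
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                yield f"{wild_type}{pos}{aa}"


# Stand-in sequence of the right length (99); NOT the actual protease sequence.
placeholder = "A" * 99
mutants = list(single_mutants(placeholder))
print(len(mutants))
```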

  14. Evol and ProDy for bridging protein sequence evolution and structural dynamics

    PubMed Central

    Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R.; Bahar, Ivet

    2014-01-01

    Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. Availability and implementation: ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. Contact: bahar@pitt.edu PMID:24849577
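
    As a rough illustration of what information-theoretic conservation and coevolution profiles are, the sketch below computes Shannon entropy per alignment column and mutual information between two columns in plain Python. It deliberately does not use or imitate the ProDy/Evol API; the toy alignment is invented, and gaps, sequence weighting and finite-sample corrections are ignored.

```python
from collections import Counter
from math import log2


def column(msa, i):
    """Return the residues found at alignment column i."""
    return [row[i] for row in msa]


def shannon_entropy(symbols):
    """Entropy in bits of the observed symbol frequencies (0 = fully conserved)."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(col_i, col_j):
    """I(i; j) = H(i) + H(j) - H(i, j); high values suggest covarying columns."""
    joint = list(zip(col_i, col_j))
    return shannon_entropy(col_i) + shannon_entropy(col_j) - shannon_entropy(joint)


# Toy alignment: column 0 is conserved, columns 1 and 2 change together.
msa = ["AKE", "AKE", "ARD", "ARD"]
conservation = [shannon_entropy(column(msa, i)) for i in range(3)]
coevolution = mutual_information(column(msa, 1), column(msa, 2))
print(conservation, coevolution)
```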

  15. Crystal Structure of Cocosin, A Potential Food Allergen from Coconut (Cocos nucifera).

    PubMed

    Jin, Tengchuan; Wang, Cheng; Zhang, Caiying; Wang, Yang; Chen, Yu-Wei; Guo, Feng; Howard, Andrew; Cao, Min-Jie; Fu, Tong-Jen; McHugh, Tara H; Zhang, Yuzhu

    2017-08-30

    Coconut (Cocos nucifera) is an important palm tree. Coconut fruit is widely consumed. The most abundant storage protein in coconut fruit is cocosin (a likely food allergen), which belongs to the 11S globulin family. Cocosin was crystallized nearly a century ago, but its structure remains unknown. By optimizing crystallization conditions and cryoprotectant solutions, we were able to obtain cocosin crystals that diffracted to 1.85 Å. The cocosin gene was cloned from genomic DNA isolated from dry coconut tissue. The protein sequence deduced from the predicted cocosin coding sequence was used to guide model building and structure refinement. The structure of cocosin was determined for the first time, and it revealed a typical 11S globulin feature of a double-layer doughnut-shaped hexamer.

  16. Effects of the Laramide Structures on the Regional Distribution of Tight-Gas Sandstone in the Upper Mesaverde Group, Uinta Basin, Utah

    NASA Astrophysics Data System (ADS)

    Sitaula, R. P.; Aschoff, J.

    2013-12-01

    Regional-scale sequence stratigraphic correlation, well log analysis, syntectonic unconformity mapping, isopach maps, and depositional environment maps of the upper Mesaverde Group (UMG) in the Uinta basin, Utah suggest higher accommodation in the northeastern part (Natural Buttes area) and local development of lacustrine facies due to increased subsidence caused by uplift of the San Rafael Swell (SRS) in the southern part and the Uinta Uplift in the northern part. Recently discovered lacustrine facies in the Natural Buttes area are completely different from the dominant fluvial facies in outcrops along the Book Cliffs and could have implications for a significant amount of tight-gas sand production from this area. Data used for sequence stratigraphic correlation, isopach maps and depositional environment maps include >100 well logs, 20 stratigraphic profiles, 35 sandstone thin sections and 10 outcrop-based gamma ray (GR) profiles. Seven 4th-order depositional sequences (~0.5 my duration) are identified and correlated within the UMG. Correlation was constructed using a combination of fluvial facies and stacking patterns in outcrops, chert-pebble conglomerates and tidally influenced strata. These surfaces were extrapolated into the subsurface by matching GR profiles. GR well logs and a core log from the Natural Buttes area show intervals of coarsening-upward patterns, suggesting possible lacustrine intervals that might contain high TOC. Locally, younger sequences are completely truncated across the SRS, whereas older sequences are truncated and thinned toward the SRS. The cycles of truncation and thinning represent phases of SRS uplift. Thinning possibly related to the Uinta Uplift is also observed in the northwestern part. Paleocurrents are consistent with the interpretation of periodic segmentation and deflection of sedimentation. Regional paleocurrents are generally E-NE-directed in Sequences 1-4 and N-directed in Sequences 5-7. From isopach maps and paleocurrent directions it can be interpreted that uplift of the SRS changed the route of sediment supply from west to southwest. Locally, paleocurrents are highly variable near the SRS, further suggesting that the UMG basin fill was partitioned by uplift of the SRS. Sandstone composition analysis also suggests that the uplift of the SRS caused source rocks to differ between the upper and lower sequences. In conclusion, we suggest that the Uinta basin was episodically partitioned during the deposition of the UMG due to uplift of Laramide structures in the basin, and that accommodation was localized in the northeastern part. Understanding of structural controls on accommodation, sedimentation patterns and depositional environments will aid prediction of the best-producing gas reservoirs.

  17. An approach to functionally relevant clustering of the protein universe: Active site profile‐based clustering of protein structures and sequences

    PubMed Central

    Knutson, Stacy T.; Westwood, Brian M.; Leuthaeuser, Janelle B.; Turner, Brandon E.; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D.; Harper, Angela F.; Brown, Shoshana D.; Morris, John H.; Ferrin, Thomas E.; Babbitt, Patricia C.

    2017-01-01

    Abstract: Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow mechanistic determinant identification—amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two‐Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure‐Function Linkage Database, SFLD) self‐identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self‐identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well‐curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP‐identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F‐measure and performance analysis on the enolase search results and comparison to GEMMA and SCI‐PHY demonstrate that TuLIP avoids the over‐division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. PMID:28054422
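
    The stopping rule described above (keep dividing until every structure is in a self-identifying group or is a singlet) can be sketched as a generic control loop. This is only an outline under assumptions: `split` and `self_identifies` are hypothetical callables standing in for the active-site-profile clustering step and the DASP2 self-identification check, neither of which is reproduced, and treating an unsplittable cluster as singlets is a choice made here to guarantee termination.

```python
def divisive_cluster(members, split, self_identifies):
    """Divide clusters until each one self-identifies or is a single member.

    Returns (groups, singlets).
    """
    groups, singlets = [], []
    stack = [list(members)]
    while stack:
        cluster = stack.pop()
        if len(cluster) == 1:
            singlets.append(cluster[0])
        elif self_identifies(cluster):
            groups.append(cluster)
        else:
            parts = [part for part in split(cluster) if part]
            if len(parts) < 2:
                # The split made no progress; stop here rather than loop forever.
                singlets.extend(cluster)
            else:
                stack.extend(parts)
    return groups, singlets


# Toy stand-ins: each member is (name, family label); real input would be structures.
toy = [("s1", "X"), ("s2", "X"), ("s3", "X"), ("s4", "X"),
       ("s5", "Y"), ("s6", "Y"), ("s7", "Z"), ("s8", "W")]


def halve(cluster):
    """Hypothetical split: cut the list in two."""
    mid = len(cluster) // 2
    return [cluster[:mid], cluster[mid:]]


def same_family(cluster):
    """Hypothetical self-identification test: all members share one label."""
    return len({family for _, family in cluster}) == 1


groups, singlets = divisive_cluster(toy, halve, same_family)
print(groups, singlets)
```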

  18. Feasibility of implementing molecular-guided therapy for the treatment of patients with relapsed or refractory neuroblastoma

    PubMed Central

    Saulnier Sholler, Giselle L; Bond, Jeffrey P; Bergendahl, Genevieve; Dutta, Akshita; Dragon, Julie; Neville, Kathleen; Ferguson, William; Roberts, William; Eslin, Don; Kraveka, Jacqueline; Kaplan, Joel; Mitchell, Deanna; Parikh, Nehal; Merchant, Melinda; Ashikaga, Takamaru; Hanna, Gina; Lescault, Pamela Jean; Siniard, Ashley; Corneveaux, Jason; Huentelman, Matthew; Trent, Jeffrey

    2015-01-01

    The primary objective of the study was to evaluate the feasibility and safety of a process which would utilize genome-wide expression data from tumor biopsies to support individualized treatment decisions. Current treatment options for recurrent neuroblastoma are limited and ineffective, with a survival rate of <10%. Molecular profiling may provide data which will enable the practitioner to select the most appropriate therapeutic option for individual patients, thus improving outcomes. Sixteen patients with neuroblastoma were enrolled of which fourteen were eligible for this study. Feasibility was defined as completion of tumor biopsy, pathological evaluation, RNA quality control, gene expression profiling, bioinformatics analysis, generation of a drug prediction report, molecular tumor board yielding a treatment plan, independent medical monitor review, and treatment initiation within a 21 day period. All eligible biopsies passed histopathology and RNA quality control. Expression profiling by microarray and RNA sequencing were mutually validated. The average time from biopsy to report generation was 5.9 days and from biopsy to initiation of treatment was 12.4 days. No serious adverse events were observed and all adverse events were expected. Clinical benefit was seen in 64% of patients as stabilization of disease for at least one cycle of therapy or partial response. The overall response rate was 7% and the progression free survival was 59 days. This study demonstrates the feasibility and safety of performing real-time genomic profiling to guide treatment decision making for pediatric neuroblastoma patients. PMID:25720842

  19. Mode selective generation of guided waves by systematic optimization of the interfacial shear stress profile

    NASA Astrophysics Data System (ADS)

    Yazdanpanah Moghadam, Peyman; Quaegebeur, Nicolas; Masson, Patrice

    2015-01-01

    Piezoelectric transducers are commonly used in structural health monitoring systems to generate and measure ultrasonic guided waves (GWs) by applying interfacial shear and normal stresses to the host structure. In most cases, in order to perform damage detection, advanced signal processing techniques are required, since a minimum of two dispersive modes are propagating in the host structure. In this paper, a systematic approach for mode selection is proposed by optimizing the interfacial shear stress profile applied to the host structure, representing the first step of a global optimization of selective mode actuator design. This approach has the potential of reducing the complexity of signal processing tools as the number of propagating modes could be reduced. Using the superposition principle, an analytical method is first developed for GWs excitation by a finite number of uniform segments, each contributing with a given elementary shear stress profile. Based on this, cost functions are defined in order to minimize the undesired modes and amplify the selected mode and the optimization problem is solved with a parallel genetic algorithm optimization framework. Advantages of this method over more conventional transducer tuning approaches are that (1) the shear stress can be explicitly optimized to both excite one mode and suppress other undesired modes, (2) the size of the excitation area is not constrained and mode-selective excitation is still possible even if excitation width is smaller than all excited wavelengths, and (3) the selectivity is increased and the bandwidth extended. The complexity of the optimal shear stress profile obtained is shown considering two cost functions with various optimal excitation widths and number of segments. Results illustrate that the desired mode (A0 or S0) can be excited dominantly over other modes up to a wave power ratio of 10^10 using an optimal shear stress profile.

  20. Short-Term Dynamic and Local Epidemiological Trends in the South American HIV-1B Epidemic.

    PubMed

    Junqueira, Dennis Maletich; de Medeiros, Rubia Marília; Gräf, Tiago; Almeida, Sabrina Esteves de Matos

    2016-01-01

    Human displacement and sexual behavior are the main factors driving the HIV-1 pandemic to its current profile. The intrinsic structure of HIV transmission among different individuals is valuable for understanding the epidemic and for the public health response. The aim of this study was to characterize the HIV-1 subtype B (HIV-1B) epidemic in South America through the identification of transmission links and to infer trends in geographical patterns and the median time of transmission between individuals. Sequences of the protease and reverse transcriptase coding regions from 4,810 individuals were selected from GenBank. Maximum likelihood phylogenies were inferred and submitted to ClusterPicker to identify transmission links. Bayesian analyses were applied only to clusters including ≥5 dated samples in order to estimate the median maximum inter-transmission interval. This study analyzed sequences sampled from 12 South American countries, from individuals of different exposure categories, under different antiretroviral profiles, and over a wide period of time (1989-2013). Continentally, Brazil, Argentina and Venezuela were revealed to be important sites for the spread of HIV-1B among countries inside South America. Of note, across all the clusters identified, about 70% of the HIV-1B infections occur primarily among individuals living in the same geographic region. In addition, these transmissions seem to occur early after the infection of an individual, taking on average 2.39 years (95% CI 1.48-3.30) to succeed. Homosexual/bisexual individuals transmit the virus in almost half the time estimated for the general population sampled here. Public health services can benefit broadly from this kind of information, whether to focus on specific programs of response to the epidemic or to guide prevention campaigns aimed at specific risk groups.

  1. Phylogenetic profiles reveal structural/functional determinants of TRPC3 signal-sensing antennae

    PubMed Central

    Ko, Kyung Dae; Bhardwaj, Gaurav; Hong, Yoojin; Chang, Gue Su; Kiselyov, Kirill

    2009-01-01

    Biochemical assessment of channel structure/function is incredibly challenging. Developing computational tools that provide these data would enable translational research, accelerating mechanistic experimentation for the bench scientist studying ion channels. Starting with the premise that protein sequence encodes information about structure, function and evolution (SF&E), we developed a unified framework for inferring SF&E from sequence information using a knowledge-based approach. The Gestalt Domain Detection Algorithm-Basic Local Alignment Tool (GDDA-BLAST) provides phylogenetic profiles that can model, ab initio, SF&E relationships of biological sequences at the whole protein, single domain and single-amino acid level. In our recent paper, we have applied GDDA-BLAST analysis to study canonical TRP (TRPC) channels and empirically validated predicted lipid-binding and trafficking activities contained within the TRPC3 TRP_2 domain of unknown function. Overall, our in silico, in vitro, and in vivo experiments support a model in which TRPC3 has signal-sensing antennae which are adorned with lipid-binding, trafficking and calmodulin regulatory domains. In this Addendum, we correlate our functional domain analysis with the cryo-EM structure of TRPC3. In addition, we synthesize recent studies with our new findings to provide a refined model on the mechanism(s) of TRPC3 activation/deactivation. PMID:19704910

  2. SVM-PB-Pred: SVM based protein block prediction method using sequence profiles and secondary structures.

    PubMed

    Suresh, V; Parthasarathy, S

    2014-01-01

    We developed a support vector machine based web server called SVM-PB-Pred, to predict the Protein Block for any given amino acid sequence. The input features of SVM-PB-Pred include i) sequence profiles (PSSM) and ii) actual secondary structures (SS) from DSSP method or predicted secondary structures from NPS@ and GOR4 methods. Three combined input features, PSSM+SS(DSSP), PSSM+SS(NPS@) and PSSM+SS(GOR4), were used to train and test the SVM models. Similarly, four datasets, RS90, DB433, LI1264 and SP1577, were used to develop the SVM models. The four SVM models developed were tested using three different benchmarking tests, namely (i) self consistency, (ii) seven fold cross validation test and (iii) independent case test. The maximum prediction accuracy of ~70% was observed in the self consistency test for the SVM models of both the LI1264 and SP1577 datasets, where PSSM+SS(DSSP) input features were used for testing. The prediction accuracies were reduced to ~53% for PSSM+SS(NPS@) and ~43% for PSSM+SS(GOR4) in the independent case test, for the SVM models of the same two datasets. Using our method, it is possible to predict the protein block letters for any query protein sequence with ~53% accuracy, when the SP1577 dataset and predicted secondary structure from NPS@ server were used. The SVM-PB-Pred server can be freely accessed through http://bioinfo.bdu.ac.in/~svmpbpred.
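
    For readers unfamiliar with the second benchmark, a seven-fold cross validation partitions the data into seven folds and holds each out once for testing while training on the other six. The sketch below shows only that index bookkeeping in plain Python; the abstract does not say how folds were drawn, so the contiguous, unshuffled split and the function name are assumptions, and no SVM training is included.

```python
def k_fold_indices(n_items, k=7):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    indices = list(range(n_items))
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Toy example with 20 items (a stand-in for the proteins of one dataset).
folds = list(k_fold_indices(20, k=7))
fold_sizes = [len(test) for _, test in folds]
print(fold_sizes)
```

    Each item appears in exactly one test fold, so averaging the seven test accuracies uses every item once as unseen data.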

  3. Comparison of the Microbial Community Structures of Untreated Wastewaters from Different Geographic Locales

    PubMed Central

    Shanks, Orin C.; Newton, Ryan J.; Kelty, Catherine A.; Huse, Susan M.; Sogin, Mitchell L.

    2013-01-01

    Microbial sewage communities consist of a combination of human fecal microorganisms and nonfecal microorganisms, which may be residents of urban sewer infrastructure or flowthrough originating from gray water or rainwater inputs. Together, these different microorganism sources form an identifiable community structure that may serve as a signature for sewage discharges and as candidates for alternative indicators specific for human fecal pollution. However, the structure and variability of this community across geographic space remains uncharacterized. We used massively parallel 454 pyrosequencing of the V6 region in 16S rRNA genes to profile microbial communities from 13 untreated sewage influent samples collected from a wide range of geographic locations in the United States. We obtained a total of 380,175 high-quality sequences for sequence-based clustering, taxonomic analyses, and profile comparisons. The sewage profile included a discernible core human fecal signature made up of several abundant taxonomic groups within Firmicutes, Bacteroidetes, Actinobacteria, and Proteobacteria. DNA sequences were also classified into fecal, sewage infrastructure (i.e., nonfecal), and transient groups based on data comparisons with fecal samples. Across all sewage samples, an estimated 12.1% of sequences were fecal in origin, while 81.4% were consistently associated with the sewage infrastructure. The composition of feces-derived operational taxonomic units remained congruent across all sewage samples regardless of geographic locale; however, the sewage infrastructure community composition varied among cities, with city latitude best explaining this variation. Together, these results suggest that untreated sewage microbial communities harbor a core group of fecal bacteria across geographically dispersed wastewater sewage lines and that ambient water quality indicators targeting these select core microorganisms may perform well across the United States. PMID:23435885

  4. Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues

    PubMed Central

    Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Terry, Richard; Turczyk, Brian M.; Yang, Joyce L.; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M.

    2014-01-01

    RNA sequencing measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. On the other hand, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq our method enriches for context-specific transcripts over house-keeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d. PMID:25675209

  5. Methylation guide RNA evolution in archaea: structure, function and genomic organization of 110 C/D box sRNA families across six Pyrobaculum species.

    PubMed

    Lui, Lauren M; Uzilov, Andrew V; Bernick, David L; Corredor, Andrea; Lowe, Todd M; Dennis, Patrick P

    2018-05-16

    Archaeal homologs of eukaryotic C/D box small nucleolar RNAs (C/D box sRNAs) guide precise 2'-O-methyl modification of ribosomal and transfer RNAs. Although C/D box sRNA genes constitute one of the largest RNA gene families in archaeal thermophiles, most genomes have incomplete sRNA gene annotation because reliable, fully automated detection methods are not available. We expanded and curated a comprehensive gene set across six species of the crenarchaeal genus Pyrobaculum, particularly rich in C/D box sRNA genes. Using high-throughput small RNA sequencing, specialized computational searches and comparative genomics, we analyzed 526 Pyrobaculum C/D box sRNAs, organizing them into 110 families based on synteny and conservation of guide sequences which determine methylation targets. We examined gene duplications and rearrangements, including one family that has expanded in a pattern similar to retrotransposed repetitive elements in eukaryotes. New training data and inclusion of kink-turn secondary structural features enabled creation of an improved search model. Our analyses provide the most comprehensive, dynamic view of C/D box sRNA evolutionary history within a genus, in terms of modification function, feature plasticity, and gene mobility.

  6. Curriculum Development Guide Based on a Technical Program.

    ERIC Educational Resources Information Center

    Belle-Isle, Louis Phillip

    This "Guide" is intended for educators who have been mandated to develop or modify an educational program's curriculum. The guide presupposes the formulation of an exit-profile and focuses exclusively on activities after the exit-profile has been developed. The development of a curriculum is based on an exit-profile that mirrors the…

  7. NARSTO SOS99NASH WIND PROFILER DATA

    Atmospheric Science Data Center

    2018-04-16

    NARSTO SOS99NASH WIND PROFILER DATA Project Title:  NARSTO ... Platform:  Ground Station Instrument:  Wind Profiler Location:  Nashville, Tennessee Spatial ... Data Guide Documents:  SOS99Nash Wind Profiler Guide Related Data:  Southern Oxidants ...

  8. Retrosynthetic Analysis-Guided Breaking Tile Symmetry for the Assembly of Complex DNA Nanostructures.

    PubMed

    Wang, Pengfei; Wu, Siyu; Tian, Cheng; Yu, Guimei; Jiang, Wen; Wang, Guansong; Mao, Chengde

    2016-10-11

    Current tile-based DNA self-assembly produces simple repetitive or highly symmetric structures. In the case of 2D lattices, the unit cell often contains only one basic tile because the tiles often are symmetric (in terms of either the backbone or the sequence). In this work, we have applied retrosynthetic analysis to determine the minimal asymmetric units for complex DNA nanostructures. Such analysis guides us to break the intrinsic structural symmetries of the tiles to achieve high structural complexities. This strategy has led to the construction of several DNA nanostructures that are not accessible from conventional symmetric tile designs. Along with previous studies, herein we have established a set of four fundamental rules regarding tile-based assembly. Such rules could serve as guidelines for the design of DNA nanostructures.

  9. Structure and Engineering of Francisella novicida Cas9

    PubMed Central

    Hirano, Hisato; Gootenberg, Jonathan S.; Horii, Takuro; Abudayyeh, Omar O.; Kimura, Mika; Hsu, Patrick D.; Nakane, Takanori; Ishitani, Ryuichiro; Hatada, Izuho; Zhang, Feng; Nishimasu, Hiroshi; Nureki, Osamu

    2016-01-01

    The RNA-guided endonuclease Cas9 cleaves double-stranded DNA targets complementary to the guide RNA, and has been applied to programmable genome editing. Cas9-mediated cleavage requires a protospacer adjacent motif (PAM) juxtaposed with the DNA target sequence, thus constricting the range of targetable sites. Here, we report the 1.7 Å resolution crystal structures of Cas9 from Francisella novicida (FnCas9), one of the largest Cas9 orthologs, in complex with a guide RNA and its PAM-containing DNA targets. A structural comparison of FnCas9 with other Cas9 orthologs revealed striking conserved and divergent features among distantly related CRISPR-Cas9 systems. We found that FnCas9 recognizes the 5′-NGG-3′ PAM, and used the structural information to create a variant that can recognize the more relaxed 5′-YG-3′ PAM. Furthermore, we demonstrated that pre-assembled FnCas9 ribonucleoprotein complexes can be microinjected into mouse zygotes to edit endogenous sites with the 5′-YG-3′ PAMs, thus expanding the target space of the CRISPR-Cas9 toolbox. PMID:26875867

  10. Structure and Engineering of Francisella novicida Cas9.

    PubMed

    Hirano, Hisato; Gootenberg, Jonathan S; Horii, Takuro; Abudayyeh, Omar O; Kimura, Mika; Hsu, Patrick D; Nakane, Takanori; Ishitani, Ryuichiro; Hatada, Izuho; Zhang, Feng; Nishimasu, Hiroshi; Nureki, Osamu

    2016-02-25

    The RNA-guided endonuclease Cas9 cleaves double-stranded DNA targets complementary to the guide RNA and has been applied to programmable genome editing. Cas9-mediated cleavage requires a protospacer adjacent motif (PAM) juxtaposed with the DNA target sequence, thus constricting the range of targetable sites. Here, we report the 1.7 Å resolution crystal structures of Cas9 from Francisella novicida (FnCas9), one of the largest Cas9 orthologs, in complex with a guide RNA and its PAM-containing DNA targets. A structural comparison of FnCas9 with other Cas9 orthologs revealed striking conserved and divergent features among distantly related CRISPR-Cas9 systems. We found that FnCas9 recognizes the 5'-NGG-3' PAM, and used the structural information to create a variant that can recognize the more relaxed 5'-YG-3' PAM. Furthermore, we demonstrated that the FnCas9-ribonucleoprotein complex can be microinjected into mouse zygotes to edit endogenous sites with the 5'-YG-3' PAM, thus expanding the target space of the CRISPR-Cas9 toolbox. Copyright © 2016 Elsevier Inc. All rights reserved.
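
    The difference between the 5'-NGG-3' and the more relaxed 5'-YG-3' PAM can be illustrated with a short motif scan. This is a minimal sketch, not the authors' method: the function names and the example sequence are hypothetical, only one strand is scanned, and the sketch reports where the motif occurs without modeling the adjacent protospacer. It relies on the standard IUPAC codes N (any base) and Y (C or T).

```python
import re

# Standard IUPAC nucleotide codes needed for the two PAMs discussed above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "Y": "[CT]"}


def pam_to_regex(pam):
    """Translate an IUPAC PAM string such as 'NGG' into a regular expression."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_pam_sites(seq, pam):
    """Return 0-based start positions of all (overlapping) PAM matches on one strand."""
    # A lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=(" + pam_to_regex(pam) + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Hypothetical example sequence, chosen only for illustration.
example = "ATGCGGTTCGAATGGCTG"
ngg_sites = find_pam_sites(example, "NGG")
yg_sites = find_pam_sites(example, "YG")
print("NGG sites:", ngg_sites)
print("YG sites: ", yg_sites)
```

    In this toy sequence the shorter YG motif matches at more positions than NGG, which is the sense in which a relaxed PAM widens the set of targetable sites.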

  11. Cytophotometric and biochemical analyses of DNA in pentaploid and diploid Agave species.

    PubMed

    Cavallini, A; Natali, L; Cionini, G; Castorena-Sanchez, I

    1996-04-01

    Nuclear DNA content, chromatin structure, and DNA composition were investigated in four Agave species: two diploid, Agave tequilana Weber and Agave angustifolia Haworth var. marginata Hort., and two pentaploid, Agave fourcroydes Lemaire and Agave sisalana Perrine. It was determined that the genome size of pentaploid species is nearly 2.5 times that of diploid ones. Cytophotometric analyses of chromatin structure were performed following Feulgen or DAPI staining to determine optical density profiles of interphase nuclei. Pentaploid species showed higher frequencies of condensed chromatin (heterochromatin) than diploid species. On the other hand, a lower frequency of A-T rich (DAPI stained) heterochromatin was found in pentaploid species than in diploid ones, indicating that heterochromatin in pentaploid species is made up of sequences with base compositions different from those of diploid species. Since thermal denaturation profiles of extracted DNA showed minor variations in the base composition of the genomes of the four species, it is supposed that, in pentaploid species, the large heterochromatin content is not due to an overrepresentation of G-C repetitive sequences but rather to the condensation of nonrepetitive sequences, such as, for example, redundant gene copies switched off in the polyploid complement. It is suggested that speciation in the genus Agave occurs through point mutations and minor DNA rearrangements, as is also indicated by the relative stability of the karyotype of this genus. Key words : Agave, DNA cytophotometry, DNA melting profiles, chromatin structure, genome size.

  12. PASS2: an automated database of protein alignments organised as structural superfamilies.

    PubMed

    Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan

    2004-04-02

    The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated, structural alignments for evolutionarily related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single-member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position-specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of an evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permits better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
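
    The "identity below 40% among themselves" criterion can be sketched as a greedy filter over pre-aligned sequences. This is an illustrative sketch only and not the PASS2 pipeline: the function names are hypothetical, the sequences are assumed to be already aligned to equal length with "-" as the gap character, and identity is assumed to be counted over columns where neither sequence has a gap (other definitions of percent identity exist).

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def filter_by_identity(aligned, cutoff=40.0):
    """Greedily keep sequences whose identity to every kept sequence is below cutoff."""
    kept = []
    for name, seq in aligned:
        if all(percent_identity(seq, other) < cutoff for _, other in kept):
            kept.append((name, seq))
    return kept


# Hypothetical toy alignment: b is 90% identical to a, c only 10%.
toy = [
    ("a", "ACDEFGHIKL"),
    ("b", "ACDEFGHIKV"),
    ("c", "AWWWWWWWWW"),
]
representatives = filter_by_identity(toy, cutoff=40.0)
print([name for name, _ in representatives])
```

    A greedy filter depends on input order; it is used here only because it is the simplest way to show the pairwise cutoff in action.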

  13. Sequence stratigraphy, seismic stratigraphy, and seismic structures of the lower intermediate confining unit and most of the Floridan aquifer system, Broward County, Florida

    USGS Publications Warehouse

    Cunningham, Kevin J.; Kluesner, Jared W.; Westcott, Richard L.; Robinson, Edward; Walker, Cameron; Khan, Shakira A.

    2017-12-08

    Deep well injection and disposal of treated wastewater into the highly transmissive saline Boulder Zone in the lower part of the Floridan aquifer system began in 1971. The zone of injection is a highly transmissive hydrogeologic unit, the Boulder Zone, in the lower part of the Floridan aquifer system. Since the 1990s, however, treated wastewater injection into the Boulder Zone in southeastern Florida has been detected at three treated wastewater injection utilities in the brackish upper part of the Floridan aquifer system designated for potential use as drinking water. At a time when usage of the Boulder Zone for treated wastewater disposal is increasing and the utilization of the upper part of the Floridan aquifer system for drinking water is intensifying, there is an urgency to understand the nature of cross-formational fluid flow and identify possible fluid pathways from the lower to upper zones of the Floridan aquifer system. To better understand the hydrogeologic controls on groundwater movement through the Floridan aquifer system in southeastern Florida, the U.S. Geological Survey and the Broward County Environmental Planning and Community Resilience Division conducted a 3.5-year cooperative study from July 2012 to December 2015. The study characterizes the sequence stratigraphy, seismic stratigraphy, and seismic structures of the lower part of the intermediate confining unit aquifer and most of the Floridan aquifer system. Data obtained to meet the study objective include 80 miles of high-resolution, two-dimensional (2D), seismic-reflection profiles acquired from canals in eastern Broward County. These profiles have been used to characterize the sequence stratigraphy, seismic stratigraphy, and seismic structures in a 425-square-mile study area.
Horizon mapping of the seismic-reflection profiles and additional data collection from well logs and cores or cuttings from 44 wells were focused on construction of three-dimensional (3D) visualizations of eight sequence stratigraphic cycles that compose the Eocene to Miocene Oldsmar, Avon Park, and Arcadia Formations. The mapping of these seismic-reflection and well data has produced a refined Cenozoic sequence stratigraphic, seismic stratigraphic, and hydrogeologic framework of southeastern Florida. The upward transition from the Oldsmar Formation to the Avon Park Formation and the Arcadia Formation embodies the evolution from (1) a tropical to subtropical, shallow-marine, carbonate platform, represented by the Oldsmar and Avon Park Formations, to (2) a broad, temperate, mixed carbonate-siliciclastic shallow marine shelf, represented by the lower part of the Arcadia Formation, and to (3) a temperate, distally steepened carbonate ramp represented by the upper part of the Arcadia Formation. In the study area, the depositional sequences and seismic sequences have a direct correlation with hydrogeologic units. The approximate upper boundary of four principal permeable units of the Floridan aquifer system (Upper Floridan aquifer, Avon Park permeable zone, uppermost major permeable zone of the Lower Floridan aquifer, and Boulder Zone) have sequence stratigraphic and seismic-reflection signatures that were identified on cross sections, mapped, or both, and therefore the sequence stratigraphy and seismic stratigraphy were used to guide the development of a refined spatial representation of these hydrogeologic units. In all cases, the permeability of the four permeable units is related to stratiform megaporosity generated by ancient dissolution of carbonate rock associated with subaerial exposure and unconformities at the upper surfaces of carbonate depositional cycles of several hierarchical scales ranging from high-frequency cycles to depositional sequences.
Additionally, interparticle porosity also contributes substantially to the stratiform permeability in much of the Upper Floridan aquifer. Information from seismic stratigraphy allowed 3D geomodeling of hydrogeologic units, an approach never before applied to this area. Notably, the 3D geomodeling provided 3D visualizations and geocellular models of the depositional sequences, hydrostratigraphy, and structural features. The geocellular data could be used to update the hydrogeologic structure inherent to groundwater flow simulations that are designed to address the sustainability of the water resources of the Floridan aquifer system. Two kinds of pathways that could enable upward cross-formational flow of injected treated wastewater from the Boulder Zone have been identified in the 80 miles of high-resolution seismic data collected for this study: a near-vertical reverse fault and karst collapse structures. The single reverse fault, inferred to be of tectonic origin, is in extreme northeastern Broward County and has an offset of about 19 feet at the level of the Arcadia Formation. Most of the 17 karst collapse structures identified manifest as columniform, vertically stacked sagging seismic reflections that span early Eocene to Miocene age rocks equivalent to much of the Floridan aquifer system and the lower part of the overlying intermediate confining unit. In some cases, the seismic-sag structures extend upward into strata of Pliocene age. The seismic-sag structures are interpreted to have a semicircular shape in plan view on the basis of comparison to (1) other seismic-sag structures in southeastern Florida mapped with two 2D seismic cross lines or 3D data, (2) comparison to these structures located in other carbonate provinces, and (3) plausible extensional ring faults detected with multi-attribute analysis. The seismic-sag structures in the study area have heights as great as 2,500 vertical feet, though importantly, one spans about 7,800 feet.
Both multi-attribute analysis and visual detection of offset of seismic reflections within the seismic-sag structures indicate faults and fractures are associated with many of the structures. Multi-attribute analysis highlighting chimney fluid pathways also indicates that the seismic-sag structures have a high probability for potential vertical cross-formational fluid flow along the faulted and fractured structures. A collapse of the seismic-sag structures within a deep burial setting evokes an origin related to hypogenic karst processes by ascending flow of subsurface fluids. In addition, paleo-epigenic karst related to major regional subaerial unconformities within the Florida Platform generated collapse structures (paleo-sinkholes) that are much smaller in scale than the cross-formational seismic-sag structures.

  14. A Precision Medicine Initiative for Alzheimer's disease: the road ahead to biomarker-guided integrative disease modeling.

    PubMed

    Hampel, H; O'Bryant, S E; Durrleman, S; Younesi, E; Rojkova, K; Escott-Price, V; Corvol, J-C; Broich, K; Dubois, B; Lista, S

    2017-04-01

    After intense scientific exploration and more than a decade of failed trials, Alzheimer's disease (AD) remains a fatal global epidemic. A traditional research and drug development paradigm continues to target heterogeneous late-stage clinically phenotyped patients with single 'magic bullet' drugs. Here, we propose that it is time for a paradigm shift towards the implementation of precision medicine (PM) for enhanced risk screening, detection, treatment, and prevention of AD. The overarching structure of how PM for AD can be achieved will be provided through the convergence of breakthrough technological advances, including big data science, systems biology, genomic sequencing, blood-based biomarkers, integrated disease modeling and P4 medicine. It is hypothesized that deconstructing AD into multiple genetic and biological subsets existing within this heterogeneous target population will provide an effective PM strategy for treating individual patients with the specific agent(s) that are likely to work best based on the specific individual biological make-up. The Alzheimer's Precision Medicine Initiative (APMI) is an international collaboration of leading interdisciplinary clinicians and scientists devoted towards the implementation of PM in Neurology, Psychiatry and Neuroscience. It is hypothesized that successful realization of PM in AD and other neurodegenerative diseases will result in breakthrough therapies, such as in oncology, with optimized safety profiles, better responder rates and treatment responses, particularly through biomarker-guided early preclinical disease-stage clinical trials.

  15. Template-based protein structure modeling using the RaptorX web server.

    PubMed

    Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo

    2012-07-19

    A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world.

  16. Template-based protein structure modeling using the RaptorX web server

    PubMed Central

    Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo

    2016-01-01

    A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world. PMID:22814390

  17. Molecular characterization of physis tissue by RNA sequencing.

    PubMed

    Paradise, Christopher R; Galeano-Garces, Catalina; Galeano-Garces, Daniela; Dudakovic, Amel; Milbrandt, Todd A; Saris, Daniel B F; Krych, Aaron J; Karperien, Marcel; Ferguson, Gabriel B; Evseenko, Denis; Riester, Scott M; van Wijnen, Andre J; Noelle Larson, A

    2018-08-20

    The physis is a well-established and anatomically distinct cartilaginous structure that is crucial for normal long-bone development and growth. Abnormalities in physis function are linked to growth plate disorders and other pediatric musculoskeletal diseases. Understanding the molecular pathways operative in the physis may permit development of regenerative therapies to complement surgically-based procedures that are the current standard of care for growth plate disorders. Here, we performed next generation RNA sequencing on mRNA isolated from human physis and other skeletal tissues (e.g., articular cartilage and bone; n = 7 for each tissue). We observed statistically significant enrichment of gene sets in the physis when compared to the other musculoskeletal tissues. Further analysis of these upregulated genes identified physis-specific networks of extracellular matrix proteins including collagens (COL2A1, COL6A1, COL9A1, COL14A1, COL16A1) and matrilins (MATN1, MATN2, MATN3), and signaling proteins in the WNT pathway (WNT10B, FZD1, FZD10, DKK2) or the FGF pathway (FGF10, FGFR4). Our results provide further insight into the gene expression networks that contribute to the physis' unique structural composition and regulatory signaling networks. Physis-specific expression profiles may guide ongoing initiatives in tissue engineering and cell-based therapies for treatment of growth plate disorders and growth modulation therapies. Furthermore, our findings provide new leads for therapeutic drug discovery that would permit future intervention through pharmacological rather than surgical strategies. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Prediction of cis/trans isomerization in proteins using PSI-BLAST profiles and secondary structure information.

    PubMed

    Song, Jiangning; Burrage, Kevin; Yuan, Zheng; Huber, Thomas

    2006-03-09

    The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. In this paper, we propose a new approach to predict the proline cis/trans isomerization in proteins using support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than that of polynomial and linear kernel functions. We used single sequence information of different local window sizes, amino acid compositions of different local sequences, multiple sequence alignment obtained from PSI-BLAST and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach was performed on a newly enlarged dataset of 2424 non-homologous proteins determined by X-Ray diffraction method using 5-fold cross-validation. Selecting the window size 11 provided the best performance for determining the proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance: the prediction accuracy increased from 62.8% with single sequence to 69.8%, and the Matthews Correlation Coefficient (MCC) improved from 0.26 with single local sequence to 0.40.
Furthermore, if coupled with the predicted secondary structure information by PSIPRED, our method yielded a prediction accuracy of 71.5% and MCC of 0.43, 9% and 0.17 higher than the accuracy achieved based on the single sequence information, respectively. A new method has been developed to predict the proline cis/trans isomerization in proteins based on support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequence flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST and the predicted secondary structures generated by PSIPRED. The successful application of SVM approach in this study reinforced that SVM is a powerful tool in predicting proline cis/trans isomerization in proteins and biological sequence analysis.
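
    Two ingredients named in the abstract above, the local window centered on each proline and the accuracy/MCC evaluation metrics, can be sketched in a few lines. This is a hedged illustration, not the authors' code: the function names, the "X" padding character for windows that run past the sequence ends, and the confusion-matrix counts are all assumptions made for the example. The MCC formula itself is the standard definition.

```python
import math


def proline_windows(seq, size=11, pad="X"):
    """Return (position, window) for each proline, with the proline at the window center.

    Windows that extend past either end of the sequence are padded with `pad`
    (an assumption of this sketch).
    """
    half = size // 2
    padded = pad * half + seq + pad * half
    return [(i, padded[i:i + size]) for i, aa in enumerate(seq) if aa == "P"]


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical confusion-matrix counts, for illustration only.
print(proline_windows("MKPLA", size=5))
print(round(accuracy(40, 45, 5, 10), 3), round(mcc(40, 45, 5, 10), 3))
```

    MCC is reported alongside accuracy because cis prolines are a minority class, so accuracy alone can look good for a predictor that rarely predicts cis.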

  19. Improved Selection of Internal Transcribed Spacer-Specific Primers Enables Quantitative, Ultra-High-Throughput Profiling of Fungal Communities

    PubMed Central

    Bokulich, Nicholas A.

    2013-01-01

    Ultra-high-throughput sequencing (HTS) of fungal communities has been restricted by short read lengths and primer amplification bias, slowing the adoption of newer sequencing technologies to fungal community profiling. To address these issues, we evaluated the performance of several common internal transcribed spacer (ITS) primers and designed a novel primer set and work flow for simultaneous quantification and species-level interrogation of fungal consortia. Primer comparison and validation were predicted in silico and by sequencing a “mock community” of mixed yeast species to explore the challenges of amplicon length and amplification bias for reconstructing defined yeast community structures. The amplicon size and distribution of this primer set are smaller than for all preexisting ITS primer sets, maximizing sequencing coverage of hypervariable ITS domains by very-short-amplicon, high-throughput sequencing platforms. This feature also enables the optional integration of quantitative PCR (qPCR) directly into the HTS preparatory work flow by substituting qPCR with these primers for standard PCR, yielding quantification of individual community members. The complete work flow described here, utilizing any of the qualified primer sets evaluated, can rapidly profile mixed fungal communities and capably reconstructed well-characterized beer and wine fermentation fungal communities. PMID:23377949

  20. High-resolution phylogenetic microbial community profiling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singer, Esther; Bushnell, Brian; Coleman-Derr, Devin

    Over the past decade, high-throughput short-read 16S rRNA gene amplicon sequencing has eclipsed clone-dependent long-read Sanger sequencing for microbial community profiling. The transition to new technologies has provided more quantitative information at the expense of taxonomic resolution with implications for inferring metabolic traits in various ecosystems. We applied single-molecule real-time sequencing for microbial community profiling, generating full-length 16S rRNA gene sequences at high throughput, which we propose to name PhyloTags. We benchmarked and validated this approach using a defined microbial community. When further applied to samples from the water column of meromictic Sakinaw Lake, we show that while community structures at the phylum level are comparable between PhyloTags and Illumina V4 16S rRNA gene sequences (iTags), variance increases with community complexity at greater water depths. PhyloTags moreover allowed less ambiguous classification. Last, a platform-independent comparison of PhyloTags and in silico generated partial 16S rRNA gene sequences demonstrated significant differences in community structure and phylogenetic resolution across multiple taxonomic levels, including a severe underestimation in the abundance of specific microbial genera involved in nitrogen and methane cycling across the Lake's water column. Thus, PhyloTags provide a reliable adjunct or alternative to cost-effective iTags, enabling more accurate phylogenetic resolution of microbial communities and predictions on their metabolic potential.

  1. High-resolution phylogenetic microbial community profiling

    DOE PAGES

    Singer, Esther; Bushnell, Brian; Coleman-Derr, Devin; ...

    2016-02-09

    Over the past decade, high-throughput short-read 16S rRNA gene amplicon sequencing has eclipsed clone-dependent long-read Sanger sequencing for microbial community profiling. The transition to new technologies has provided more quantitative information at the expense of taxonomic resolution with implications for inferring metabolic traits in various ecosystems. We applied single-molecule real-time sequencing for microbial community profiling, generating full-length 16S rRNA gene sequences at high throughput, which we propose to name PhyloTags. We benchmarked and validated this approach using a defined microbial community. When further applied to samples from the water column of meromictic Sakinaw Lake, we show that while community structures at the phylum level are comparable between PhyloTags and Illumina V4 16S rRNA gene sequences (iTags), variance increases with community complexity at greater water depths. PhyloTags moreover allowed less ambiguous classification. Last, a platform-independent comparison of PhyloTags and in silico generated partial 16S rRNA gene sequences demonstrated significant differences in community structure and phylogenetic resolution across multiple taxonomic levels, including a severe underestimation in the abundance of specific microbial genera involved in nitrogen and methane cycling across the Lake's water column. Thus, PhyloTags provide a reliable adjunct or alternative to cost-effective iTags, enabling more accurate phylogenetic resolution of microbial communities and predictions on their metabolic potential.

  2. Nature of the protein universe

    PubMed Central

    Levitt, Michael

    2009-01-01

    The protein universe is the set of all proteins of all organisms. Here, all currently known sequences are analyzed in terms of families that have single-domain or multidomain architectures and whether they have a known three-dimensional structure. Growth of new single-domain families is very slow: Almost all growth comes from new multidomain architectures that are combinations of domains characterized by ≈15,000 sequence profiles. Single-domain families are mostly shared by the major groups of organisms, whereas multidomain architectures are specific and account for species diversity. There are known structures for a quarter of the single-domain families, and >70% of all sequences can be partially modeled thanks to their membership in these families. PMID:19541617

  3. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    PubMed

    Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.

  4. RStrucFam: a web server to associate structure and cognate RNA for RNA-binding proteins from sequence information.

    PubMed

    Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan

    2016-10-07

    RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. In case of association of the protein with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database. 
Further, all other essential information pertaining to an RBP, such as overall function annotations, is provided. The web server can be accessed at the following link: http://caps.ncbs.res.in/rstrucfam .

  5. JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

    PubMed

    Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2012-02-15

    We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. Among other things, the program allows users to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.

  6. Evolutionarily conserved regions and hydrophobic contacts at the superfamily level: The case of the fold-type I, pyridoxal-5′-phosphate-dependent enzymes

    PubMed Central

    Paiardini, Alessandro; Bossa, Francesco; Pascarella, Stefano

    2004-01-01

    The wealth of biological information provided by structural and genomic projects opens new prospects of understanding life and evolution at the molecular level. In this work, it is shown how computational approaches can be exploited to pinpoint protein structural features that remain invariant over long evolutionary periods in the fold-type I, PLP-dependent enzymes. A nonredundant set of 23 superposed crystallographic structures belonging to this superfamily was built. Members of this family typically display high structural conservation despite low sequence identity. For each structure, a multiple-sequence alignment of orthologous sequences was obtained, and the 23 alignments were merged using the structural information to obtain a comprehensive multiple alignment of 921 sequences of fold-type I enzymes. The structurally conserved regions (SCRs), the evolutionarily conserved residues, and the conserved hydrophobic contacts (CHCs) were extracted from this data set, using both sequence and structural information. The results of this study identified a structural pattern of hydrophobic contacts shared by all of the superfamily members of fold-type I enzymes and involved in native interactions. This profile highlights the presence of a nucleus for this fold, in which residues participating in the most conserved native interactions exhibit preferential evolutionary conservation, which correlates significantly (r = 0.70) with the extent of mean hydrophobic contact value of their apolar fraction. PMID:15498941

  7. Reprint of: Early Behavioural Facilitation by Temporal Expectations in Complex Visual-motor Sequences.

    PubMed

    Heideman, Simone G; van Ede, Freek; Nobre, Anna C

    2018-05-24

    In daily life, temporal expectations may derive from incidental learning of recurring patterns of intervals. We investigated the incidental acquisition and utilisation of combined temporal-ordinal (spatial/effector) structure in complex visual-motor sequences using a modified version of a serial reaction time (SRT) task. In this task, not only the series of targets/responses, but also the series of intervals between subsequent targets was repeated across multiple presentations of the same sequence. Each participant completed three sessions. In the first session, only the repeating sequence was presented. During the second and third session, occasional probe blocks were presented, where a new (unlearned) spatial-temporal sequence was introduced. We first confirm that participants not only got faster over time, but that they were slower and less accurate during probe blocks, indicating that they incidentally learned the sequence structure. Having established a robust behavioural benefit induced by the repeating spatial-temporal sequence, we next addressed our central hypothesis that implicit temporal orienting (evoked by the learned temporal structure) would have the largest influence on performance for targets following short (as opposed to longer) intervals between temporally structured sequence elements, paralleling classical observations in tasks using explicit temporal cues. We found that indeed, reaction time differences between new and repeated sequences were largest for the short interval, compared to the medium and long intervals, and that this was the case, even when comparing late blocks (where the repeated sequence had been incidentally learned), to early blocks (where this sequence was still unfamiliar). 
We conclude that incidentally acquired temporal expectations that follow a sequential structure can have a robust facilitatory influence on visually-guided behavioural responses and that, like more explicit forms of temporal orienting, this effect is most pronounced for sequence elements that are expected at short inter-element intervals. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.

  8. Career Academy Course Sequences.

    ERIC Educational Resources Information Center

    Markham, Thom; Lenz, Robert

    This career academy course sequence guide is designed to give teachers a quick overview of the course sequences of well-known career academy and career pathway programs from across the country. The guide presents a variety of sample course sequences for the following academy themes: (1) arts and communication; (2) business and finance; (3)…

  9. The segmented non-uniform dielectric module design for uniformity control of plasma profile in a capacitively coupled plasma chamber

    NASA Astrophysics Data System (ADS)

    Xia, Huanxiong; Xiang, Dong; Yang, Wang; Mou, Peng

    2014-12-01

    Low-temperature plasma processing is one of the critical techniques in IC manufacturing processes such as etching and thin-film deposition. Plasma uniformity greatly affects process quality, so designing for plasma uniformity control is very important but difficult. It is hard to regulate the spatial distribution of the plasma in the chamber finely and flexibly by controlling the discharge parameters or by modifying the structure in zero-dimensional space, since these measures can only adjust the overall level of the process factors. To address this problem, a segmented non-uniform dielectric module design is proposed for regulating the plasma profile in a capacitively coupled plasma (CCP) chamber. The solution achieves refined and flexible regulation of the plasma profile in the radial direction by configuring the relative permittivity and the width of each segment. To solve this design problem, a novel simulation-based auto-design approach is proposed, which automatically designs the positional sequence with multiple independent variables so that the output target profile of the parameterized simulation model approximates the one preset by the user. This approach employs the idea of a quasi-closed-loop control system and works iteratively. It starts from initial values of the design variable sequences and predicts better sequences from the feedback of the profile error between the output target profile and the expected one. It iterates until the profile error falls within the preset tolerance.
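
    The iterative, error-driven loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: `auto_design`, `toy_simulate`, the proportional update rule, the `gain` value and the one-to-one mapping between design variables and profile samples are all hypothetical and are not taken from the paper, whose simulation model and prediction step are not given in the abstract.

```python
def auto_design(simulate, target, initial, gain=0.25, tolerance=1e-3, max_iter=1000):
    """Adjust design variables until the simulated profile matches the target.

    simulate: callable mapping a list of design variables to a profile
              (hypothetical stand-in for the parameterized simulation model).
    Returns the final design and the number of update steps taken.
    """
    design = list(initial)
    for iteration in range(max_iter):
        profile = simulate(design)
        # Profile error between the expected (target) and simulated profiles.
        errors = [t - p for t, p in zip(target, profile)]
        if max(abs(e) for e in errors) <= tolerance:
            return design, iteration
        # Assumed proportional feedback: nudge each variable along its error.
        design = [d + gain * e for d, e in zip(design, errors)]
    raise RuntimeError("profile error did not reach the tolerance")


def toy_simulate(design):
    """Toy model (assumption): each profile sample is twice its design variable."""
    return [2.0 * d for d in design]


if __name__ == "__main__":
    final_design, steps = auto_design(toy_simulate, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
    print(final_design, steps)
```

    The `max_iter` guard is a design choice added here so that the sketch always terminates; the abstract itself only states that iteration continues until the error is within tolerance.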

  10. Improve homology search sensitivity of PacBio data by correcting frameshifts.

    PubMed

    Du, Nan; Sun, Yanni

    2016-09-01

    Single-molecule, real-time sequencing (SMRT) developed by Pacific BioSciences produces longer reads than second-generation sequencing technologies such as Illumina. The long read length enables PacBio sequencing to close gaps in genome assembly, reveal structural variations, and identify gene isoforms with higher accuracy in transcriptomic sequencing. However, PacBio data has a high sequencing error rate, and most of the errors are insertion or deletion errors. During alignment-based homology search, insertion or deletion errors in genes will cause frameshifts and may only lead to marginal alignment scores and short alignments. As a result, it is hard to distinguish true alignments from random alignments, and the ambiguity will incur errors in structural and functional annotation. Existing frameshift correction tools are designed for data with a much lower error rate and are not optimized for PacBio data. As an increasing number of groups are using SMRT, there is an urgent need for dedicated homology search tools for PacBio data. In this work, we introduce Frame-Pro, a profile homology search tool for PacBio reads. Our tool corrects sequencing errors and also outputs the profile alignments of the corrected sequences against characterized protein families. We applied our tool to both simulated and real PacBio data. The results showed that our method enables more sensitive homology search, especially for PacBio data sets of low sequencing coverage. In addition, we can correct more errors compared with a popular error correction tool that does not rely on hybrid sequencing. The source code is freely available at https://sourceforge.net/projects/frame-pro/. Contact: yannisun@msu.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
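
    To illustrate why a single insertion error shortens alignments, the following minimal sketch compares the codons of a reference with those of a read carrying one inserted base. The sequences and the helper names `codons` and `matching_codon_prefix` are made up for this illustration; this is not how Frame-Pro itself detects or corrects frameshifts.

```python
def codons(seq):
    """Split a nucleotide sequence into complete codons, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def matching_codon_prefix(reference, read):
    """Count the leading codons on which the two sequences agree."""
    count = 0
    for ref_codon, read_codon in zip(codons(reference), codons(read)):
        if ref_codon != read_codon:
            break
        count += 1
    return count


# Hypothetical 6-codon reference: ATG GCT GAA AAG CTG TTC
reference = "ATGGCTGAAAAGCTGTTC"
# Simulate one insertion error (an extra T) inside the third codon.
read_with_insertion = reference[:7] + "T" + reference[7:]

if __name__ == "__main__":
    print(codons(reference))
    print(codons(read_with_insertion))
    print(matching_codon_prefix(reference, read_with_insertion))
```

    In this toy case only the codons before the insertion still match; every codon after it is read in a shifted frame, which is why an uncorrected read tends to yield a short, low-scoring protein-level alignment.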

  11. Relationships between the geometry of seismogenic faults and observed seismicty: a contribute from reflection seismic

    NASA Astrophysics Data System (ADS)

    Ciaccio, M. G.; Mirabella, F.; Stucchi, E.

    2003-04-01

    We analyze the seismogenic structures of the Colfiorito area (central Italy), struck by the significant 1997-98 seismic sequence. This area has been used as a test site to investigate the possible interactions between earthquake seismology, reflection seismology and structural geology. Here we show the results obtained from the interpretation of a re-processed seismic reflection profile, acquired in the 1980s for hydrocarbon exploration by ENI-Agip and crossing the epicentral area, and the relationships between hypocentral locations and geological features derived from surface and seismic data. The dense distribution of seismic stations in a temporary network installed after the first two large shocks (Mw=5.7 and Mw=6.0) provided high-quality data showing earthquakes located at depths varying from 3 to 9 km and characterised by normal faulting mechanisms, with a NE-SW tension axis oriented about N55°. The non-conventional reprocessing sequence adopted was aimed at the early removal of coherent and random noise and at the optimal definition of fault systems. The resulting profile shows an outstanding increase in the resolution of the geological structures, with clearer evidence of the faults, and allows a much better correlation of surface geology features with the reflectors and the exclusion of the parts of the profiles that run along the strike of the geological structures. The profile also shows a good image of the deep structure, which has been interpreted as the depth image of the major fault of the Colfiorito fault system. A first attempt at projecting the earthquakes of the 1997-98 sequence shows basic consistency with the inferred extensional structures at depth. The study also shows that at least the upper part of the basement is involved in the thrust sheets, with a stepping and deepening of the basement from west to east from 5.5 to 9 km depth. 
The average dip at depth of the active faults is about 40°, consistent with the slip plane inferred from the focal mechanisms of the main shocks and with the alignment of the aftershock distribution in cross section. At a depth of about 8 km, the trace of the active normal fault corresponds to the position of a basement step, suggesting that the position of the basement steps, generated by Miocene-Pliocene thrust tectonics, may have controlled the location of the subsequent normal faults.

  12. Enhancing the Skill-Building Phase of Introductory Organic Chemistry Lab through a Reflective Peer Review Structure

    ERIC Educational Resources Information Center

    Pontrello, Jason K.

    2016-01-01

    Introductory organic laboratory courses frequently begin with a set of activities built around developing basic experimental skills and techniques, often with guided-inquiry components. A sequence of skill-based activities is described to promote reflection, analysis of, and interpersonal communication around science. A multistage process was used…

  13. Developmental Pragmatics: Linguistic and Extralinguistic Bases of Early Conversations.

    ERIC Educational Resources Information Center

    Luszcz, M. A.; Bacharach, V. R.

    The inferential use of linguistic and extralinguistic information in structuring conversations was studied in 90 three- and five-year-old children. Pictures portraying an actor-action-object relation, e.g., a child picking a flower, were used to guide conversational sequences. Both active pictures (which emphasized an action relating actor and…

  14. A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics

    PubMed Central

    Tang, Haixu; Li, Sujun; Ye, Yuzhen

    2016-01-01

    Metaproteomic studies adopt the common bottom-up proteomics approach to investigate the protein composition and the dynamics of protein expression in microbial communities. When matched metagenomic and/or metatranscriptomic data of the microbial communities are available, metaproteomic data analyses often employ a metagenome-guided approach, in which complete or fragmental protein-coding genes are first directly predicted from metagenomic (and/or metatranscriptomic) sequences or from their assemblies, and the resulting protein sequences are then used as the reference database for peptide/protein identification from MS/MS spectra. This approach is often limited because protein coding genes predicted from metagenomes are incomplete and fragmental. In this paper, we present a graph-centric approach to improving metagenome-guided peptide and protein identification in metaproteomics. Our method exploits the de Bruijn graph structure reported by metagenome assembly algorithms to generate a comprehensive database of protein sequences encoded in the community. We tested our method using several public metaproteomic datasets with matched metagenomic and metatranscriptomic sequencing data acquired from complex microbial communities in a biological wastewater treatment plant. The results showed that many more peptides and proteins can be identified when assembly graphs were utilized, improving the characterization of the proteins expressed in the microbial communities. The additional proteins we identified contribute to the characterization of important pathways such as those involved in degradation of chemical hazards. Our tools are released as open-source software on github at https://github.com/COL-IU/Graph2Pro. PMID:27918579
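
    As a rough illustration of why an assembly graph can yield more candidate sequences than linear contigs, the sketch below builds a small de Bruijn graph from reads and spells out every branch from a start node. The function names, the toy reads and the choice of k are assumptions made for this example; this is not the Graph2Pro implementation, which works on graphs reported by metagenome assemblers and on protein-coding sequences.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def spell_paths(graph, start, max_nodes=50):
    """Return the sequences spelled by all simple paths from start to a dead end."""
    paths = []
    stack = [(start, start, {start})]
    while stack:
        node, seq, seen = stack.pop()
        successors = [n for n in sorted(graph.get(node, ())) if n not in seen]
        if not successors or len(seen) >= max_nodes:
            paths.append(seq)
            continue
        for nxt in successors:
            # Each step along an edge extends the sequence by one character.
            stack.append((nxt, seq + nxt[-1], seen | {nxt}))
    return sorted(paths)


if __name__ == "__main__":
    toy_reads = ["ATGGCA", "TGGCGT"]  # hypothetical reads sharing the k-mer TGGC
    g = build_de_bruijn(toy_reads, k=4)
    print(spell_paths(g, "ATG"))
```

    Here the two reads diverge after the shared node `GGC`, so traversing the graph recovers both variants, whereas a single linear contig would keep at most one of them.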

  15. Protein Structure and Function Prediction Using I-TASSER

    PubMed Central

    Yang, Jianyi; Zhang, Yang

    2016-01-01

    I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. PMID:26678386

  16. Biomagnification profiles of polycyclic aromatic hydrocarbons, alkylphenols and polychlorinated biphenyls in Tokyo Bay elucidated by delta13C and delta15N isotope ratios as guides to trophic web structure.

    PubMed

    Takeuchi, Ichiro; Miyoshi, Noriko; Mizukawa, Kaoruko; Takada, Hideshige; Ikemoto, Tokutaka; Omori, Koji; Tsuchiya, Kotaro

    2009-05-01

    Biomagnification profiles of polycyclic aromatic hydrocarbons (PAHs), alkylphenols, and polychlorinated biphenyls (PCBs) from the innermost part of Tokyo Bay, Japan were analyzed using stable carbon (delta(13)C) and nitrogen (delta(15)N) isotope ratios as guides to trophic web structure. delta(15)N analysis indicated that all species of mollusks tested were primary consumers, while decapods and fish were secondary consumers. Higher concentrations of PCBs occurred in decapods and fish than in mollusks. In contrast, concentrations of PAHs and alkylphenols were lower in decapods and fish than in mollusks. Unlike PCBs, whose concentrations largely increased with increasing delta(15)N (i.e. increasing trophic level), all PAHs and alkylphenols analyzed followed a reverse trend. Molecular weights of PAHs are lower than those of PCBs, therefore low membrane permeability caused by large molecular size is an unlikely factor in the "biodilution" of PAHs. Organisms at higher trophic levels may rapidly metabolize PAHs or they may assimilate less of them.

  17. Job Profiling Guide. Results of 1994 Job Profiling. Part of the Ohio Vocational Competency Assessment (OVCA) Package.

    ERIC Educational Resources Information Center

    Ohio State Univ., Columbus. Vocational Instructional Materials Lab.

    This guide explains the process of job profiling and details the results of a 1994 profiling of 34 occupations. Discussed in section 1 are the following: purpose and components of the Ohio Vocational Competency Assessment (OVCA) package; purpose, contents, and use of the Ohio Competency Analysis Profiles and Work Keys components of the OVCA…

  18. Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

    PubMed Central

    2011-01-01

    Background: Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results: In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions: The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510

  19. Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction

    PubMed Central

    Laehnemann, David; Borkhardt, Arndt

    2016-01-01

    Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance on which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here. PMID:26026159

  20. Development of a sub-cm high resolution ion Doppler tomography diagnostics for fine structure measurement of guide field reconnection in TS-U

    NASA Astrophysics Data System (ADS)

    Tanabe, Hiroshi; Koike, Hideya; Hatano, Hironori; Hayashi, Takumi; Cao, Qinghong; Himeno, Shunichi; Kaneda, Taishi; Akimitsu, Moe; Sawada, Asuka; Ono, Yasushi

    2017-10-01

    A new type of high-throughput/high-resolution 96CH ion Doppler tomography diagnostics has been developed using a "multi-slit" spectroscopy technique for detailed investigation of fine structure formation during high guide field magnetic reconnection. In the last three years, high field merging experiments in MAST have pioneered new frontiers of reconnection heating: formation of a highly peaked structure around the X-point in high guide field conditions (Bt > 0.3 T), outflow dissipation under the influence of better plasma confinement to form a high temperature ring structure which aligns with the closed flux surface of the toroidal plasma, and interaction between ion and electron temperature profiles during the transport/confinement phase to form a triple peak structure (τeiE 4 ms). To investigate the mechanism in more detail with in-situ magnetic measurement, the University of Tokyo has started to upgrade the plasma parameters and the spatial resolution of optical diagnostics, as in MAST. Now, construction of a new type of high-throughput/high-resolution 96CH ion Doppler tomography diagnostics system has been completed, and it has successfully resolved the fine structure of ion heating downstream, aligned with the closed flux surface formed by the reconnected field. This work was supported by JSPS KAKENHI Grant Numbers 15H05750, 15K14279 and 17H04863.

  1. Mutational profiling of non-small-cell lung cancer patients resistant to first-generation EGFR tyrosine kinase inhibitors using next generation sequencing

    PubMed Central

    Jin, Ying; Shao, Yang; Shi, Xun; Lou, Guangyuan; Zhang, Yiping; Wu, Xue; Tong, Xiaoling; Yu, Xinmin

    2016-01-01

    Patients with advanced non-small-cell lung cancer (NSCLC) harboring sensitive epidermal growth factor receptor (EGFR) mutations invariably develop acquired resistance to EGFR tyrosine kinase inhibitors (TKIs). Identification of actionable genetic alterations conferring drug resistance can be helpful for guiding the subsequent treatment decision. One of the major resistance mechanisms is the secondary EGFR-T790M mutation. Other mechanisms, such as HER2 and MET amplifications, and PIK3CA mutations, were also reported. However, the mechanisms in the remaining patients are still unknown. In this study, we performed mutational profiling in a cohort of 83 NSCLC patients with TKI-sensitizing EGFR mutations at diagnosis and acquired resistance to three different first-generation EGFR TKIs using targeted next generation sequencing (NGS) of 416 cancer-related genes. In total, we identified 322 genetic alterations with a median of 3 mutations per patient. 61% of patients still exhibit TKI-sensitizing EGFR mutations, and 36% of patients acquired EGFR-T790M. Besides other known resistance mechanisms, we identified TET2 mutations in 12% of patients. Interestingly, we also observed SOX2 amplification in EGFR-T790M negative patients, which was restricted to resistance to treatment with Icotinib, a drug widely used in Chinese NSCLC patients. Our study uncovered mutational profiles of NSCLC patients with first-generation EGFR TKI resistance with potential therapeutic implications. PMID:27528220

  2. The diploid genome sequence of an Asian individual

    PubMed Central

    Wang, Jun; Wang, Wei; Li, Ruiqiang; Li, Yingrui; Tian, Geng; Goodman, Laurie; Fan, Wei; Zhang, Junqing; Li, Jun; Zhang, Juanbin; Guo, Yiran; Feng, Binxiao; Li, Heng; Lu, Yao; Fang, Xiaodong; Liang, Huiqing; Du, Zhenglin; Li, Dong; Zhao, Yiqing; Hu, Yujie; Yang, Zhenzhen; Zheng, Hancheng; Hellmann, Ines; Inouye, Michael; Pool, John; Yi, Xin; Zhao, Jing; Duan, Jinjie; Zhou, Yan; Qin, Junjie; Ma, Lijia; Li, Guoqing; Yang, Zhentao; Zhang, Guojie; Yang, Bin; Yu, Chang; Liang, Fang; Li, Wenjie; Li, Shaochuan; Li, Dawei; Ni, Peixiang; Ruan, Jue; Li, Qibin; Zhu, Hongmei; Liu, Dongyuan; Lu, Zhike; Li, Ning; Guo, Guangwu; Zhang, Jianguo; Ye, Jia; Fang, Lin; Hao, Qin; Chen, Quan; Liang, Yu; Su, Yeyang; san, A.; Ping, Cuo; Yang, Shuang; Chen, Fang; Li, Li; Zhou, Ke; Zheng, Hongkun; Ren, Yuanyuan; Yang, Ling; Gao, Yang; Yang, Guohua; Li, Zhuo; Feng, Xiaoli; Kristiansen, Karsten; Wong, Gane Ka-Shu; Nielsen, Rasmus; Durbin, Richard; Bolund, Lars; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian

    2009-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics. PMID:18987735

  3. Cleavage Entropy as Quantitative Measure of Protease Specificity

    PubMed Central

    Fuchs, Julian E.; von Grafenstein, Susanne; Huber, Roland G.; Margreiter, Michael A.; Spitzer, Gudrun M.; Wallnoefer, Hannes G.; Liedl, Klaus R.

    2013-01-01

    A purely information theory-guided approach to quantitatively characterize protease specificity is established. We calculate an entropy value for each protease subpocket based on sequences of cleaved substrates extracted from the MEROPS database. We compare our results with known subpocket specificity profiles for individual proteases and protease groups (e.g. serine proteases, metallo proteases) and reflect them quantitatively. Summation of subpocket-wise cleavage entropy contributions yields a measure for overall protease substrate specificity. This total cleavage entropy allows ranking of different proteases with respect to their specificity, separating unspecific digestive enzymes showing high total cleavage entropy from specific proteases involved in signaling cascades. The development of a quantitative cleavage entropy score allows an unbiased comparison of subpocket-wise and overall protease specificity. Thus, it enables assessment of relative importance of physicochemical and structural descriptors in protease recognition. We present an exemplary application of cleavage entropy in tracing substrate specificity in protease evolution. This highlights the wide range of substrate promiscuity within homologue proteases and hence the heavy impact of a limited number of mutations on individual substrate specificity. PMID:23637583
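
    The per-subpocket entropy and its summation can be sketched as below. This is a minimal illustration that assumes a Shannon entropy over the 20 standard amino acids, normalised with a base-20 logarithm so that each position contributes between 0 (fully specific) and 1 (fully unspecific); the paper's exact normalisation and any correction for natural amino acid abundance may differ, and the function names and toy substrates are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def subpocket_entropy(residues):
    """Normalised Shannon entropy of the residues observed at one subpocket."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids observed at this position")
    entropy = 0.0
    for n in counts.values():
        p = n / total
        # Base-20 logarithm (assumption) keeps the value between 0 and 1.
        entropy -= p * math.log(p, len(AMINO_ACIDS))
    return entropy


def total_cleavage_entropy(substrates):
    """Sum the subpocket entropies over aligned, equal-length substrate windows."""
    length = len(substrates[0])
    if any(len(s) != length for s in substrates):
        raise ValueError("substrate windows must all have the same length")
    return sum(
        subpocket_entropy([s[i] for s in substrates]) for i in range(length)
    )


if __name__ == "__main__":
    # Toy windows: first position is invariant, second varies between K and R.
    print(total_cleavage_entropy(["AK", "AR"]))
```

    Under these assumptions, a protease whose cleaved substrates show the same residue at every position scores 0, while one that accepts all residues equally at each of n positions scores n.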

  4. Probing the structural dynamics of the CRISPR-Cas9 RNA-guided DNA-cleavage system by coarse-grained modeling.

    PubMed

    Zheng, Wenjun

    2017-02-01

    In the adaptive immune systems of many bacteria and archaea, the Cas9 endonuclease forms a complex with specific guide/scaffold RNA to identify and cleave complementary target sequences in foreign DNA. This DNA targeting machinery has been exploited in numerous applications of genome editing and transcription control. However, the molecular mechanism of the Cas9 system is still obscure. Recently, high-resolution structures have been solved for Cas9 in different structural forms (e.g., unbound forms, RNA-bound binary complexes, and RNA-DNA-bound tertiary complexes, corresponding to an inactive state, a pre-target-bound state, and a cleavage-competent or product state), which offered key structural insights to the Cas9 mechanism. To further probe the structural dynamics of Cas9 interacting with RNA and DNA at the amino-acid level of details, we have performed systematic coarse-grained modeling using an elastic network model and related analyses. Our normal mode analysis predicted a few key modes of collective motions that capture the observed conformational changes featuring large domain motions triggered by binding of RNA and DNA. Our flexibility analysis identified specific regions with high or low flexibility that coincide with key functional sites (such as DNA/RNA-binding sites, nuclease cleavage sites, and key hinges). We also identified a small set of hotspot residues that control the energetics of functional motions, which overlap with known functional sites and offer promising targets for future mutagenesis efforts to improve the specificity of Cas9. Finally, we modeled the conformational transitions of Cas9 from the unbound form to the binary complex and then the tertiary complex, and predicted a distinct sequence of domain motions. In sum, our findings have offered rich structural and dynamic details relevant to the Cas9 machinery, and will guide future investigation and engineering of the Cas9 systems. Proteins 2017; 85:342-353. 
© 2016 Wiley Periodicals, Inc.

  5. Crystal Structure of the Minimal Cas9 from Campylobacter jejuni Reveals the Molecular Diversity in the CRISPR-Cas9 Systems.

    PubMed

    Yamada, Mari; Watanabe, Yuto; Gootenberg, Jonathan S; Hirano, Hisato; Ran, F Ann; Nakane, Takanori; Ishitani, Ryuichiro; Zhang, Feng; Nishimasu, Hiroshi; Nureki, Osamu

    2017-03-16

    The RNA-guided endonuclease Cas9 generates a double-strand break at DNA target sites complementary to the guide RNA and has been harnessed for the development of a variety of new technologies, such as genome editing. Here, we report the crystal structures of Campylobacter jejuni Cas9 (CjCas9), one of the smallest Cas9 orthologs, in complex with an sgRNA and its target DNA. The structures provided insights into a minimal Cas9 scaffold and revealed the remarkable mechanistic diversity of the CRISPR-Cas9 systems. The CjCas9 guide RNA contains a triple-helix structure, which is distinct from known RNA triple helices, thereby expanding the natural repertoire of RNA triple helices. Furthermore, unlike the other Cas9 orthologs, CjCas9 contacts the nucleotide sequences in both the target and non-target DNA strands and recognizes the 5'-NNNVRYM-3' as the protospacer-adjacent motif. Collectively, these findings improve our mechanistic understanding of the CRISPR-Cas9 systems and may facilitate Cas9 engineering. Copyright © 2017 Elsevier Inc. All rights reserved.
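
    As an illustration of what the 5'-NNNVRYM-3' motif means in practice, the sketch below expands the IUPAC ambiguity codes and scans a DNA string for matching 7-mers. The function names and the test sequence are hypothetical; the code only scans the strand it is given and makes no assumption about where the protospacer lies relative to the match.

```python
# IUPAC nucleotide codes needed for the motif (standard definitions).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "V": "ACG",   # not T
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "M": "AC",    # amino
}


def matches_pam(site, pam="NNNVRYM"):
    """True if every base of `site` is allowed by the code at that position."""
    if len(site) != len(pam):
        return False
    return all(base in IUPAC[code] for base, code in zip(site.upper(), pam))


def find_pam_sites(seq, pam="NNNVRYM"):
    """Start indices (0-based) of all windows in `seq` that match `pam`."""
    k = len(pam)
    return [i for i in range(len(seq) - k + 1) if matches_pam(seq[i:i + k], pam)]


# Hypothetical example sequence.
hits = find_pam_sites("TTTACGAGCATT")
```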

  6. Evol and ProDy for bridging protein sequence evolution and structural dynamics.

    PubMed

    Bakan, Ahmet; Dutta, Anindita; Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R; Bahar, Ivet

    2014-09-15

    Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
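
    One information-theoretic quantity commonly used for coevolution profiles is the mutual information between two alignment columns. The sketch below is a plain-Python illustration under that assumption; it does not use or reproduce the ProDy or Evol API, and the function name and toy columns are hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two equally long alignment columns."""
    n = len(col_a)
    pa = Counter(col_a)
    pb = Counter(col_b)
    pab = Counter(zip(col_a, col_b))  # joint counts of residue pairs
    mi = 0.0
    for (a, b), count in pab.items():
        p_joint = count / n
        mi += p_joint * math.log(p_joint / ((pa[a] / n) * (pb[b] / n)))
    return mi


# Hypothetical columns: the first pair covaries perfectly, the second does not.
covarying = mutual_information("AAGG", "CCTT")
independent = mutual_information("AAGG", "CTCT")
```

    Fully conserved columns give zero mutual information in this sketch, which is one reason conservation and coevolution are usually examined side by side.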

  7. The proteome: structure, function and evolution

    PubMed Central

    Fleming, Keiran; Kelley, Lawrence A; Islam, Suhail A; MacCallum, Robert M; Muller, Arne; Pazos, Florencio; Sternberg, Michael J.E

    2006-01-01

    This paper reports two studies to model the inter-relationships between protein sequence, structure and function. First, an automated pipeline to provide a structural annotation of proteomes in the major genomes is described. The results are stored in a database at Imperial College, London (3D-GENOMICS) that can be accessed at www.sbg.bio.ic.ac.uk. Analysis of the assignments to structural superfamilies provides evolutionary insights. 3D-GENOMICS is being integrated with related proteome annotation data at University College London and the European Bioinformatics Institute in a project known as e-protein (http://www.e-protein.org/). The second topic is motivated by the developments in structural genomics projects in which the structure of a protein is determined prior to knowledge of its function. We have developed a new approach PHUNCTIONER that uses the gene ontology (GO) classification to supervise the extraction of the sequence signal responsible for protein function from a structure-based sequence alignment. Using GO we can obtain profiles for a range of specificities described in the ontology. In the region of low sequence similarity (around 15%), our method is more accurate than assignment from the closest structural homologue. The method is also able to identify the specific residues associated with the function of the protein family. PMID:16524832

  8. Thin film fabrication and system integration test run for a microactuator for a tuneable lens

    NASA Astrophysics Data System (ADS)

    Hoheisel, Dominik; Rissing, Lutz

    2014-03-01

    An electromagnetic microactuator with an integrated electrostatic element, intended for controlling a tuneable lens, is fabricated by thin film technology. The actuator consists of two parts: the first part with a microcoil and flux guide, and the second part with a ring-shaped back iron on a polyimide membrane. The back iron is additionally usable as an electrode for electrostatic measurement of the air gap and for electrostatic actuation. By attracting the back iron, an optical liquid is displaced and forms a liquid lens inside the back iron ring covered by the membrane. For testing the thin film fabrication sequence, up-scaled systems are generated in a test run. To fabricate the flux guide in an easy and quick way, a Ni-Fe foil with a thickness of 50 μm is laminated on the Si wafer. This foil is also utilized in the following fabrication sequence as a seed layer for electroplating. Compared to Ni-Fe structures deposited by electroplating, the foil features better soft magnetic properties. The foil is structured by wet chemical etching and the backside of the wafer is structured by deep reactive ion etching (DRIE). For post-fabrication thinning, the polyimide membrane is treated by oxygen plasma etching. To align the back iron to the microcoil and the flux guide, a flip-chip bonder is used during the test run of system integration. To adjust a constant air gap, a water-soluble polymer is tested. A two-component epoxy and a polyimide-based glue are compared for their bonding properties on the actuator parts.

  9. Effects of the amino acid sequence on thermal conduction through β-sheet crystals of natural silk protein.

    PubMed

    Zhang, Lin; Bai, Zhitong; Ban, Heng; Liu, Ling

    2015-11-21

    Recent experiments have discovered very different thermal conductivities between the spider silk and the silkworm silk. Decoding the molecular mechanisms underpinning the distinct thermal properties may guide the rational design of synthetic silk materials and other biomaterials for multifunctionality and tunable properties. However, such an understanding is lacking, mainly due to the complex structure and phonon physics associated with the silk materials. Here, using non-equilibrium molecular dynamics, we demonstrate that the amino acid sequence plays a key role in the thermal conduction process through β-sheets, essential building blocks of natural silks and a variety of other biomaterials. Three representative β-sheet types, i.e. poly-A, poly-(GA), and poly-G, are shown to have distinct structural features and phonon dynamics leading to different thermal conductivities. A fundamental understanding of the sequence effects may stimulate the design and engineering of polymers and biopolymers for desired thermal properties.

  10. SUGGESTIONS FOR DEVELOPING INDEPENDENT WORD ATTACK IN READING, FOR USE IN BASIC INSTITUTE MEETINGS, GRADES THREE AND FOUR.

    ERIC Educational Resources Information Center

    REECE, THOMAS E.; AND OTHERS

    A GUIDE FOR PLANNING SPECIFIC INSTRUCTION FOR DEVELOPING INDEPENDENT WORD ATTACK PRESENTS THE SKILLS NECESSARY FOR MASTERING SIGHT VOCABULARY, WORD RECOGNITION, AND THE USE OF THE DICTIONARY. SPECIFIC DEFINITIONS OF TERMS AND EXAMPLES OF TEACHING TECHNIQUES WITH THE SEQUENCE OF INSTRUCTION FOR THE DEVELOPMENT OF PHONETIC AND STRUCTURAL ANALYSIS…

  11. Using evaporation to control capillary instabilities in micro-systems.

    PubMed

    Ledesma-Aguilar, Rodrigo; Laghezza, Gianluca; Yeomans, Julia M; Vella, Dominic

    2017-12-06

    The instabilities of fluid interfaces represent both a limitation and an opportunity for the fabrication of small-scale devices. Just as non-uniform capillary pressures can destroy micro-electrical mechanical systems (MEMS), so they can guide the assembly of novel solid and fluid structures. In many such applications the interface appears during an evaporation process and is therefore only present temporarily. It is commonly assumed that this evaporation simply guides the interface through a sequence of equilibrium configurations, and that the rate of evaporation only sets the timescale of this sequence. Here, we use Lattice-Boltzmann simulations and a theoretical analysis to show that, in fact, the rate of evaporation can be a factor in determining the onset and form of dynamical capillary instabilities. Our results shed light on the role of evaporation in previous experiments, and open the possibility of exploiting diffusive mass transfer to directly control capillary flows in MEMS applications.

  12. Local Renyi entropic profiles of DNA sequences.

    PubMed

    Vinga, Susana; Almeida, Jonas S

    2007-10-16

    In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
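
    The two ingredients named above, the Chaos Game Representation and the Rényi entropy, can be sketched in a simplified discrete form. The corner assignment (A, C, G, T at the corners of the unit square) is one common convention and is an assumption here, and the entropy is computed for a discrete probability vector rather than with the Parzen-window density estimate the abstract describes; all function names are hypothetical.

```python
import math

# Assumed corner convention for the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr(sequence, start=(0.5, 0.5)):
    """Chaos Game Representation: move halfway towards the corner of each base."""
    x, y = start
    points = []
    for base in sequence.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def renyi_entropy(probabilities, alpha=2.0):
    """Discrete Renyi entropy of order alpha (alpha != 1), in nats."""
    if alpha == 1.0:
        raise ValueError("alpha = 1 is the Shannon limit and is not handled here")
    power_sum = sum(p ** alpha for p in probabilities if p > 0)
    return math.log(power_sum) / (1.0 - alpha)


points = cgr("AG")
uniform_h2 = renyi_entropy([0.25, 0.25, 0.25, 0.25], alpha=2.0)
```

    Binning the CGR points on a grid and feeding the normalised cell counts to `renyi_entropy` would give a coarse, grid-dependent stand-in for the continuous estimate.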

  13. Local Renyi entropic profiles of DNA sequences

    PubMed Central

    Vinga, Susana; Almeida, Jonas S

    2007-01-01

    Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures. PMID:17939871

  14. Sequence stratigraphy of the subaqueous Changjiang (Yangtze River) delta since the Last Glacial Maximum

    NASA Astrophysics Data System (ADS)

    Xu, Taoyu; Wang, Guoqing; Shi, Xuefa; Wang, Xin; Yao, Zhengquan; Yang, Gang; Fang, Xisheng; Qiao, Shuqing; Liu, Shengfa; Wang, Xuchen; Zhao, Quanhong

    2016-01-01

    This study focuses on sedimentary research at the subaqueous Changjiang (Yangtze River) delta, based on five high-resolution seismic profiles and seven borehole cores with accurate AMS 14C datings. Three distinct seismic units were identified from the seismic profiles according to seismic reflection characteristics, and five sedimentary facies were recognized from borehole cores. These facies constituted a fining upward sedimentary sequence in relation to postglacial sea-level transgression. Three sequence surfaces (sequence boundary (SB), transgressive surface (TS), and maximum flooding surface (MFS)) demarcate the boundaries between early transgressive system tract (E-TST), late transgressive system tract (L-TST), early highstand system tract (E-HST) and late highstand system tract (L-HST), which constitute the sixth order sequence. These system tracts were developed coevally with postglacial sea-level rise. E-TST (~ 19-12 ka BP) corresponds to an incised-valley infilling in the early stages of postglacial transgression whereas L-TST (~ 12-7.5 ka BP) was formed during the last stage of postglacial transgression. The progradational structure of L-TST reflected in seismic profiles is possibly related to the intensification of the East Asian summer monsoon. E-HST (~ 7.5-2 ka BP) was deposited in response to the highstand after maximum postglacial transgression was reached, while L-HST (~ 2 ka BP-present) was initiated by accelerated progradation of the Changjiang delta.

  15. Structural features based genome-wide characterization and prediction of nucleosome organization

    PubMed Central

    2012-01-01

    Background Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored nucleosome formation potential of genomic sequences and the effect on chromatin organization and gene expression in S. cerevisiae. Results We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features, such as DNA denaturation, DNA-bending stiffness, stacking energy, Z-DNA, propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model that predicts nucleosome occupancy in vivo more accurately than the existing methods that mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The result showed that the meta DLaNe and HMM-based method performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions. 
Conclusions Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene expression regulation. The results indicated that our proposed methods are effective in predicting nucleosome occupancy and positions and that these structural features are highly predictive of nucleosome organization. The implementation of our DLaNe method based on structural features is available online. PMID:22449207
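
    Locating features by detecting peaks in a per-base profile can be sketched generically as follows. This is not the DLaNe implementation: the moving-average smoothing, the greedy minimum-spacing rule and all names and numbers are assumptions chosen only to illustrate the idea of peak calling on a numeric profile.

```python
def smooth(profile, window):
    """Centered moving average; windows are truncated at the profile ends."""
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


def call_peaks(profile, min_spacing):
    """Local maxima, kept greedily from highest to lowest with a minimum spacing."""
    candidates = [
        i for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] >= profile[i + 1]
    ]
    candidates.sort(key=lambda i: profile[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    return sorted(kept)


# Hypothetical toy profile with two peaks at positions 2 and 6.
toy_profile = [0, 1, 3, 1, 0, 2, 5, 2, 0]
peaks = call_peaks(toy_profile, min_spacing=3)
```

    For nucleosome-scale data the spacing would presumably be set near the length of DNA wrapped around one nucleosome, but that choice is left to the user in this sketch.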

  16. HangOut: generating clean PSI-BLAST profiles for domains with long insertions.

    PubMed

    Kim, Bong-Hyun; Cong, Qian; Grishin, Nick V

    2010-06-15

    Profile-based similarity search is an essential step in structure-function studies of proteins. However, inclusion of non-homologous sequence segments into a profile causes its corruption and results in false positives. Profile corruption is common in multidomain proteins, and single domains with long insertions are a significant source of errors. We developed a procedure (HangOut) that, for a single domain with specified insertion position, cleans erroneously extended PSI-BLAST alignments to generate better profiles. HangOut is implemented in Python 2.3 and runs on all Unix-compatible platforms. The source code is available under the GNU GPL license at http://prodata.swmed.edu/HangOut/. Supplementary data are available at Bioinformatics online.

  17. Developing a Learning Object Metadata Application Profile Based on LOM Suitable for the Australian Higher Education Context

    ERIC Educational Resources Information Center

    Agostinho, Shirley; Bennett, Sue; Lockyer, Lori; Harper, Barry

    2004-01-01

    This paper reports recent work in developing structures and processes that support university teachers and instructional designers incorporating learning objects into higher education focused learning designs. The aim of the project is to develop a framework to guide the design and implementation of high quality learning experiences. This…

  18. Program for User-Friendly Management of Input and Output Data Sets

    NASA Technical Reports Server (NTRS)

    Klimeck, Gerhard

    2003-01-01

    A computer program manages large, hierarchical sets of input and output (I/O) parameters (typically, sequences of alphanumeric data) involved in computational simulations in a variety of technological disciplines. This program represents sets of parameters as structures coded in object-oriented but otherwise standard American National Standards Institute C language. Each structure contains a group of I/O parameters that make sense as a unit in the simulation program with which this program is used. The addition of options and/or elements to sets of parameters amounts to the addition of new elements to data structures. By association of child data generated in response to a particular user input, a hierarchical ordering of input parameters can be achieved. Associated with child data structures are the creation and description mechanisms within the parent data structures. Child data structures can spawn further child data structures. In this program, the creation and representation of a sequence of data structures is effected by one line of code that looks for children of a sequence of structures until there are no more children to be found. A linked list of structures is created dynamically and is completely represented in the data structures themselves. Such hierarchical data presentation can guide users through otherwise complex setup procedures and it can be integrated within a variety of graphical representations.

  19. Optimization of nonbinary slanted surface-relief gratings as high-efficiency broadband couplers for light guides.

    PubMed

    Bai, Benfeng; Laukkanen, Janne; Kuittinen, Markku; Siitonen, Samuli

    2010-10-01

    We propose and investigate the use of slanted surface-relief gratings with nonbinary profiles as high-efficiency broadband couplers for light guides. First, a Chandezon-method-based rigorous numerical formulation is presented for modeling the slanted gratings with overhanging profiles. Then, two typical types of slanted grating couplers, a sinusoidal one and a trapezoidal one, are studied and optimized numerically, both exhibiting a high coupling efficiency of over 50% over the full band of a white LED under normal illumination with unpolarized light. Reasonable structural parameters with good tolerance have been obtained for the optimized designs. It is found that the performance of the couplers depends little on the grating profile shape, but primarily on the grating period and the slant angle of the ridge. The underlying mechanism is analyzed by the equivalence rules of gratings, which provide useful guidelines for the design and fabrication of the couplers. Preliminary investigation has been performed on the fabrication and replication of the slanted overhanging grating couplers, which shows the feasibility of fabrication with mature microfabrication techniques and the prospect of mass production.

  20. Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) Version 3.0 User Guide

    EPA Science Inventory

    User Guide to describe the complete functionality of the Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) Version 3.0 online tool. The US Environmental Protection Agency Sequence Alignment to Predict Across Species Susceptibility tool (SeqAPASS; https://seqa...

  1. Profile of small interfering RNAs from cotton plants infected with the polerovirus Cotton leafroll dwarf virus

    PubMed Central

    2011-01-01

    Background In response to infection, viral genomes are processed by Dicer-like (DCL) ribonuclease proteins into viral small RNAs (vsRNAs) of discrete sizes. vsRNAs are then used as guides for silencing the viral genome. The profile of vsRNAs produced during the infection process has been extensively studied for some groups of viruses. However, nothing is known about the vsRNAs produced during infections of members of the economically important family Luteoviridae, a group of phloem-restricted viruses. Here, we report the characterization of a population of vsRNAs from cotton plants infected with Cotton leafroll dwarf virus (CLRDV), a member of the genus Polerovirus, family Luteoviridae. Results Deep sequencing of small RNAs (sRNAs) from leaves of CLRDV-infected cotton plants revealed that the vsRNAs were 21- to 24-nucleotides (nt) long and that their sequences matched the viral genome, with higher frequencies of matches in the 3' region. There were equivalent amounts of sense and antisense vsRNAs, and the 22-nt class of small RNAs was predominant. During infection, cotton Dcl transcripts appeared to be up-regulated, while Dcl2 appeared to be down-regulated. Conclusions This is the first report on the profile of sRNAs in a plant infected with a virus from the family Luteoviridae. Our sequence data strongly suggest that virus-derived double-stranded RNA functions as one of the main precursors of vsRNAs. Judging by the profiled size classes, all cotton DCLs might be working to silence the virus. The possible causes for the unexpectedly high accumulation of 22-nt vsRNAs are discussed. CLRDV is the causal agent of Cotton blue disease, which occurs worldwide. Our results are an important contribution for understanding the molecular mechanisms involved in this and related diseases. PMID:21864377

  2. Signatures of ecological processes in microbial community time series.

    PubMed

    Faust, Karoline; Bauchinger, Franziska; Laroche, Béatrice; de Buyl, Sophie; Lahti, Leo; Washburne, Alex D; Gonze, Didier; Widder, Stefanie

    2018-06-28

    Growth rates, interactions between community members, stochasticity, and immigration are important drivers of microbial community dynamics. In sequencing data analysis, such as network construction and community model parameterization, we make implicit assumptions about the nature of these drivers and thereby restrict model outcome. Despite apparent risk of methodological bias, the validity of the assumptions is rarely tested, as comprehensive procedures are lacking. Here, we propose a classification scheme to determine the processes that gave rise to the observed time series and to enable better model selection. We implemented a three-step classification scheme in R that first determines whether dependence between successive time steps (temporal structure) is present in the time series and then assesses with a recently developed neutrality test whether interactions between species are required for the dynamics. If the first and second tests confirm the presence of temporal structure and interactions, then parameters for interaction models are estimated. To quantify the importance of temporal structure, we compute the noise-type profile of the community, which ranges from black in case of strong dependency to white in the absence of any dependency. We applied this scheme to simulated time series generated with the Dirichlet-multinomial (DM) distribution, Hubbell's neutral model, the generalized Lotka-Volterra model and its discrete variant (the Ricker model), and a self-organized instability model, as well as to human stool microbiota time series. The noise-type profiles for all but DM data clearly indicated distinctive structures. The neutrality test correctly classified all but DM and neutral time series as non-neutral. The procedure reliably identified time series for which interaction inference was suitable. 
Both tests were required, as we demonstrated that all structured time series, including those generated with the neutral model, achieved a moderate to high goodness of fit to the Ricker model. We present a fast and robust scheme to classify community structure and to assess the prevalence of interactions directly from microbial time series data. The procedure not only serves to determine ecological drivers of microbial dynamics, but also to guide selection of appropriate community models for prediction and follow-up analysis.
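
    For orientation, the discrete (Ricker) variant of the generalized Lotka-Volterra model mentioned above is often written as x_i(t+1) = x_i(t) · exp(Σ_j A_ij (x_j(t) − K_j)). The sketch below simulates that form deterministically; the exact formulation, the absence of a noise term, and all names and parameter values are assumptions and are not taken from the authors' R implementation.

```python
import math


def ricker_step(x, A, K):
    """One update of the assumed multi-species Ricker map.

    x: current abundances, A: interaction matrix, K: carrying capacities.
    """
    n = len(x)
    return [
        x[i] * math.exp(sum(A[i][j] * (x[j] - K[j]) for j in range(n)))
        for i in range(n)
    ]


def simulate(x0, A, K, steps):
    """Return the trajectory [x(0), x(1), ..., x(steps)] without noise."""
    trajectory = [list(x0)]
    for _ in range(steps):
        trajectory.append(ricker_step(trajectory[-1], A, K))
    return trajectory


# Hypothetical single-species example with self-limitation only.
trajectory = simulate([1.0], A=[[-0.1]], K=[10.0], steps=50)
```

    With these hypothetical values the abundance settles at the carrying capacity; stronger self-interaction can instead produce oscillations or chaos, which is part of why such series look temporally structured.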

  3. An Accurate Scalable Template-based Alignment Algorithm

    PubMed Central

    Gardner, David P.; Xu, Weijia; Miranker, Daniel P.; Ozer, Stuart; Cannone, Jamie J.; Gutell, Robin R.

    2013-01-01

    The rapid determination of nucleic acid sequences is increasing the number of sequences that are available. Inherent in a template or seed alignment is the culmination of structural and functional constraints that are selecting those mutations that are viable during the evolution of the RNA. While we might not understand these structural and functional constraints, template-based alignment programs utilize the patterns of sequence conservation to encapsulate the characteristics of viable RNA sequences that are aligned properly. We have developed a program, CRWAlign, that utilizes the different dimensions of information in rCAD, a large RNA informatics resource, to establish a profile for each position in an alignment. The most significant of these dimensions include sequence identity and column composition in different phylogenetic taxa. We have compared our methods with a maximum of eight alternative alignment methods on different sets of 16S and 23S rRNA sequences with sequence percent identities ranging from 50% to 100%. The results showed that CRWAlign outperformed the other alignment methods in both speed and accuracy. A web-based alignment server is available at http://www.rna.ccbb.utexas.edu/SAE/2F/CRWAlign. PMID:24772376

  4. New insights into transcription fidelity: thermal stability of non-canonical structures in template DNA regulates transcriptional arrest, pause, and slippage.

    PubMed

    Tateishi-Karimata, Hisae; Isono, Noburu; Sugimoto, Naoki

    2014-01-01

    The thermal stability and topology of non-canonical structures of G-quadruplexes and hairpins in template DNA were investigated, and the effect of non-canonical structures on transcription fidelity was evaluated quantitatively. We designed ten template DNAs: a linear sequence that does not have significant higher-order structure, three sequences that form hairpin structures, and six sequences that form G-quadruplex structures with different stabilities. Templates with non-canonical structures induced the production of an arrested, a slipped, and a full-length transcript, whereas the linear sequence produced only a full-length transcript. The efficiency of production for run-off transcripts (full-length and slipped transcripts) from templates that formed the non-canonical structures was lower than that from the linear sequence. G-quadruplex structures were more effective inhibitors of full-length product formation than were hairpin structures, even when the stability of the G-quadruplex in an aqueous solution was the same as that of the hairpin. We considered that intra-polymerase conditions may differentially affect the stability of non-canonical structures. The values of transcription efficiencies of run-off or arrest transcripts were correlated with stabilities of non-canonical structures in the intra-polymerase condition mimicked by 20 wt% polyethylene glycol (PEG). Transcriptional arrest was induced when the stability of the G-quadruplex structure (−ΔG°37) in the presence of 20 wt% PEG was more than 8.2 kcal mol⁻¹. Thus, values of stability in the presence of 20 wt% PEG are an important indicator of transcription perturbation. Our results further our understanding of the impact of template structure on the transcription process and may guide logical design of transcription-regulating drugs.

  5. New Insights into Transcription Fidelity: Thermal Stability of Non-Canonical Structures in Template DNA Regulates Transcriptional Arrest, Pause, and Slippage

    PubMed Central

    Tateishi-Karimata, Hisae; Isono, Noburu; Sugimoto, Naoki

    2014-01-01

    The thermal stability and topology of non-canonical structures of G-quadruplexes and hairpins in template DNA were investigated, and the effect of non-canonical structures on transcription fidelity was evaluated quantitatively. We designed ten template DNAs: a linear sequence that does not have significant higher-order structure, three sequences that form hairpin structures, and six sequences that form G-quadruplex structures with different stabilities. Templates with non-canonical structures induced the production of an arrested, a slipped, and a full-length transcript, whereas the linear sequence produced only a full-length transcript. The efficiency of production for run-off transcripts (full-length and slipped transcripts) from templates that formed the non-canonical structures was lower than that from the linear sequence. G-quadruplex structures were more effective inhibitors of full-length product formation than were hairpin structures, even when the stability of the G-quadruplex in an aqueous solution was the same as that of the hairpin. We considered that intra-polymerase conditions may differentially affect the stability of non-canonical structures. The values of transcription efficiencies of run-off or arrest transcripts were correlated with stabilities of non-canonical structures in the intra-polymerase condition mimicked by 20 wt% polyethylene glycol (PEG). Transcriptional arrest was induced when the stability of the G-quadruplex structure (−ΔG°37) in the presence of 20 wt% PEG was more than 8.2 kcal mol⁻¹. Thus, values of stability in the presence of 20 wt% PEG are an important indicator of transcription perturbation. Our results further our understanding of the impact of template structure on the transcription process and may guide logical design of transcription-regulating drugs. PMID:24594642

  6. Health Occupations: Scope and Sequence.

    ERIC Educational Resources Information Center

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This guide, which was written as an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System, outlines the suggested scope and sequence of a 3-year program in health occupations. The guide consists of a course description; general course…

  7. VOE Computer Programming: Scope and Sequence.

    ERIC Educational Resources Information Center

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This guide, which was written as an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System, outlines the suggested scope and sequence of a 3-year program in computer programming. The guide consists of a course description; general course…

  8. Urban Horticulture: Scope and Sequence.

    ERIC Educational Resources Information Center

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This guide, which was written as an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System, outlines the suggested scope and sequence of a 4-year program in urban horticulture. The guide consists of a course description; general course…

  9. VOE Accounting: Scope and Sequence.

    ERIC Educational Resources Information Center

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This guide, which was written as an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System, outlines the suggested scope and sequence of a 2-year program in accounting. The guide consists of a course description; general course objectives;…

  10. Agriculture: Scope and Sequence.

    ERIC Educational Resources Information Center

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This guide, which was written as an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System, outlines the suggested scope and sequence of a 3-year program in agriculture. The guide consists of a course description; general course objectives;…

  11. Population genetic analysis of Enterocytozoon bieneusi in humans.

    PubMed

    Li, Wei; Cama, Vitaliano; Feng, Yaoyu; Gilman, Robert H; Bern, Caryn; Zhang, Xichen; Xiao, Lihua

    2012-01-01

    Genotyping based on sequence analysis of the ribosomal internal transcribed spacer has revealed significant genetic diversity in Enterocytozoon bieneusi. Thus far, the population genetics of E. bieneusi and its significance in the epidemiology of microsporidiosis have not been examined. In this study, a multilocus sequence typing of E. bieneusi in AIDS patients in Lima, Peru was conducted, using 72 specimens previously genotyped as A, D, IV, EbpC, WL11, Peru7, Peru8, Peru10 and Peru11 at the internal transcribed spacer locus. Altogether, 39 multilocus genotypes were identified among the 72 specimens. The observation of strong intragenic linkage disequilibria and limited genetic recombination among markers were indicative of an overall clonal population structure of E. bieneusi. Measures of pair-wise intergenic linkage disequilibria and a standardised index of association (IAS) based on allelic profile data further supported this conclusion. Both sequence-based and allelic profile-based phylogenetic analyses showed the presence of two genetically isolated groups in the study population, one (group 1) containing isolates of the anthroponotic internal transcribed spacer genotype A, and the other (group 2) containing isolates of multiple internal transcribed spacer genotypes (mainly genotypes D and IV) with zoonotic potential. The measurement of linkage disequilibria and recombination indicated group 2 had a clonal population structure, whereas group 1 had an epidemic population structure. The formation of the two sub-populations was confirmed by STRUCTURE and Wright's fixation index (FST) analyses. The data highlight the power of MLST in understanding the epidemiology of E. bieneusi. Published by Elsevier Ltd.

  12. Computational Modeling of Bloch Surface Waves in One-Dimensional Periodic and Aperiodic Multilayer Structures

    NASA Astrophysics Data System (ADS)

    Koju, Vijay

    Photonic crystals and their use in exciting Bloch surface waves have received immense attention over the past few decades. This interest is mainly due to their applications in bio-sensing, wave-guiding, and other optical phenomena such as surface field enhanced Raman spectroscopy. Improvement in numerical modeling techniques, state-of-the-art computing resources, and advances in fabrication techniques have also assisted in growing interest in this field. The ability to model photonic crystals computationally has benefited both the theoretical and experimental communities. It helps the theoretical physicists in solving complex problems which cannot be solved analytically and helps to acquire useful insights that cannot be obtained otherwise. Experimentalists, on the other hand, can test different variants of their devices by changing device parameters to optimize performance before fabrication. In this dissertation, we develop two commonly used numerical techniques, namely the transfer matrix method and rigorous coupled wave analysis, in C++ and MATLAB, and use two additional software packages, one open-source and another commercial, to model one-dimensional photonic crystals. Different variants of one-dimensional multilayered structures such as perfectly periodic dielectric multilayers, quasicrystals, and aperiodic multilayers are modeled, along with one-dimensional photonic crystals with gratings on the top layer. Applications of Bloch surface waves, along with novel aperiodic dielectric multilayer structures that support Bloch surface waves, are explored in this dissertation. We demonstrate a slow light configuration that makes use of Bloch surface waves as an intermediate excitation in a double-prism tunneling configuration. This method is simple compared to the more usual techniques for slowing light using the phenomenon of electromagnetically induced transparency in atomic gases or doped ionic crystals operated at temperatures below 4 K.
Using a semi-numerical approach, we show that a 1D photonic crystal, a multilayer structure composed of alternating layers of TiO2 and SiO2, can be used to slow down light by a factor of up to 400. The results also show that better control of the speed of light can be achieved by changing the number of bilayers and the air-gap thickness appropriately. The existence of Bloch surface waves in periodic dielectric multilayer structures with a surface defect is well-known. Not yet recognized is that quasi-crystals and aperiodic dielectric multilayers can also support Bloch-like surface waves. We numerically show the excitation of Bloch-like surface waves in Fibonacci quasi-crystals and Thue-Morse aperiodic dielectric multilayers using the prism coupling method. We report improved surface electric field intensity and penetration depth of Bloch-like surface waves in the air side in such structures compared to their periodic counterparts. Bloch surface waves have also demonstrated significant potential in the field of bio-sensing technology. We further extend our study into a new type of multilayer structure based on the Maximal-length sequence, which is a pseudo random sequence. We study the characteristics of Bloch surface waves in a 32-layer Maximal-length sequence multilayer and perform angular, as well as spectral, sensitivity analysis for refractive index change detection. We demonstrate numerically that Maximal-length sequence multilayers significantly enhance the sensitivity of Bloch surface waves. Another type of structure that supports Bloch surface waves is a dielectric multilayer structure with a grating profile on the top-most layer. The grating profile adds an additional degree of freedom to the phase matching conditions for Bloch surface wave excitation. In such structures, the conditions for Bloch surface wave coupling can also be achieved by rotating both polar and azimuthal angles.
The generation of Bloch surface waves as a function of azimuthal angle has similar characteristics to conventional grating coupled Bloch surface waves. However, azimuthally generated Bloch surface waves have enhanced angular sensitivity compared to conventional polar angle coupled modes, which makes them appropriate for detecting tiny variations in surface refractive index due to the addition of nano-particles such as protein molecules.

  13. External Guide Sequences Targeting the aac(6′)-Ib mRNA Induce Inhibition of Amikacin Resistance

    PubMed Central

    Bistué, Alfonso J. C. Soler; Ha, Hongphuc; Sarno, Renee; Don, Michelle; Zorreguieta, Angeles; Tolmasky, Marcelo E.

    2007-01-01

    The dissemination of AAC(6′)-I-type acetyltransferases has rendered amikacin and other aminoglycosides all but useless in some parts of the world. Antisense technologies could be an alternative to extend the life of these antibiotics. External guide sequences are short antisense oligoribonucleotides that induce RNase P-mediated cleavage of a target RNA by forming a precursor tRNA-like complex. Thirteen-nucleotide external guide sequences complementary to locations within five regions accessible for interaction with antisense oligonucleotides in the mRNA that encodes AAC(6′)-Ib were analyzed. While small variations in the location targeted by different external guide sequences resulted in big changes in efficiency of binding to native aac(6′)-Ib mRNA, most of them induced high levels of RNase P-mediated cleavage in vitro. Recombinant plasmids coding for selected external guide sequences were introduced into Escherichia coli harboring aac(6′)-Ib, and the transformant strains were tested to determine their resistance to amikacin. The two external guide sequences that showed the strongest binding efficiency to the mRNA in vitro, EGSC3 and EGSA2, interfered with expression of the resistance phenotype to different degrees. Growth curve experiments showed that E. coli cells harboring a plasmid coding for EGSC3, the external guide sequence with the highest mRNA binding affinity in vitro, did not grow for at least 300 min in the presence of 15 μg of amikacin/ml. EGSA2, which had a lower mRNA-binding affinity in vitro than EGSC3, inhibited the expression of amikacin resistance to a lesser extent; growth of E. coli harboring a plasmid coding for EGSA2 in the presence of 15 μg of amikacin/ml was undetectable for 200 min but reached an optical density at 600 nm of 0.5 after 5 h of incubation. Our results indicate that the use of external guide sequences could be a viable strategy to preserve the efficacy of amikacin. PMID:17387154

  14. Investigation of Sequence Clipping and Structural Heterogeneity of an HIV Broadly Neutralizing Antibody by a Comprehensive LC-MS Analysis

    NASA Astrophysics Data System (ADS)

    Ivleva, Vera B.; Schneck, Nicole A.; Gollapudi, Deepika; Arnold, Frank; Cooper, Jonathan W.; Lei, Q. Paula

    2018-05-01

    CAP256 is one of the highly potent, broadly neutralizing monoclonal antibodies (bNAb) designed for HIV-1 therapy. During the process development of one of the constructs, an unexpected product-related impurity was observed via microfluidics gel electrophoresis. A panel of complementary LC-MS analyses was applied for the comprehensive characterization of CAP256, which included the analysis of the intact and reduced protein, the middle-up approach, and a set of complementary peptide mapping techniques and verification of the disulfide bonds. The designed workflow made it possible to identify a clip within a protruding acidic loop in the CDR-H3 region of the heavy chain, which can lead to a decrease in bNAb potency. This characterization explained the origin of the additional species reflected by the reducing gel profile. An intra-loop disulfide bond linking the two fragments was identified, which explained why the non-reducing capillary electrophoresis (CE) profile was not affected. The extensive characterization of CAP256 post-translational modifications was performed to investigate a possible cause of CE profile complexity and to illustrate other structural details related to this molecule's biological function. Two sites of the engineered Tyr sulfation were verified in the antigen-binding loop, and pyroglutamate formation was used as a tool for monitoring the extent of antibody clipping. Overall, the comprehensive LC-MS study was crucial to (1) identify the impurity as sequence clipping, (2) pinpoint the clipping location and justify its susceptibility relative to the molecular structure, (3) lead to an upstream process optimization to mitigate product quality risk, and (4) ultimately re-engineer the sequence to be clip-resistant.

  15. Diversity and population structure of sewage derived microorganisms in wastewater treatment plant influent

    PubMed Central

    McLellan, S.L.; Huse, S.M.; Mueller-Spitz, S.R.; Andreishcheva, E.N.; Sogin, M.L.

    2009-01-01

    The release of untreated sewage introduces non-indigenous microbial populations of uncertain composition into surface waters. We used massively parallel 454 sequencing of hypervariable regions in rRNA genes to profile microbial communities from eight untreated sewage influent samples of two wastewater treatment plants (WWTP) in metropolitan Milwaukee. The sewage profiles included a discernable human fecal signature made up of several taxonomic groups including multiple Bifidobacteriaceae, Coriobacteriaceae, Bacteroidaceae, Lachnospiraceae, and Ruminococcaceae genera. The fecal signature made up a small fraction of the taxa present in sewage but the relative abundance of these sequence tags mirrored the population structures of human fecal samples. These genera were much more prevalent in the sewage influent than standard indicator species. High-abundance sequences from taxonomic groups within the Beta- and Gammaproteobacteria dominated the sewage samples but occurred at very low levels in fecal and surface water samples, suggesting that these organisms proliferate within the sewer system. Samples from Jones Island (JI – servicing residential plus a combined sewer system) and South Shore (SS – servicing a residential area) WWTPs had very consistent community profiles, with greater similarity between WWTPs on a given collection day than between samples from the same plant collected on different days. Rainfall increased influent flows at SS and JI WWTPs, and this corresponded to greater diversity in the community at both plants. Overall, the sewer system appears to be a defined environment with both infiltration of rainwater and stormwater inputs modulating community composition. Microbial sewage communities represent a combination of inputs from human fecal microbes and enrichment of specific microbes from the environment to form a unique population structure. PMID:19840106

  16. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

    NASA Astrophysics Data System (ADS)

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-03-01

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.

  17. New Era of Studying RNA Secondary Structure and Its Influence on Gene Regulation in Plants.

    PubMed

    Yang, Xiaofei; Yang, Minglei; Deng, Hongjing; Ding, Yiliang

    2018-01-01

    The dynamic structure of RNA plays a central role in post-transcriptional regulation of gene expression such as RNA maturation, degradation, and translation. With the rise of next-generation sequencing, the study of RNA structure has been transformed from in vitro low-throughput RNA structure probing methods to in vivo high-throughput RNA structure profiling. The development of these methods enables incremental studies on the function of RNA structure to be performed, revealing new insights into novel regulatory mechanisms of RNA structure in plants. Genome-wide scale RNA structure profiling allows us to investigate general RNA structural features over tens of thousands of mRNAs and to compare RNA structuromes between plant species. Here, we provide a comprehensive and up-to-date overview of: (i) RNA structure probing methods; (ii) the biological functions of RNA structure; (iii) genome-wide RNA structural features corresponding to their regulatory mechanisms; and (iv) RNA structurome evolution in plants.

  18. Diatom centromeres suggest a mechanism for nuclear DNA acquisition

    DOE PAGES

    Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.; ...

    2017-07-18

    Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.

  19. Diatom centromeres suggest a mechanism for nuclear DNA acquisition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.

    Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.

  20. COACH: profile-profile alignment of protein families using hidden Markov models.

    PubMed

    Edgar, Robert C; Sjölander, Kimmen

    2004-05-22

    Alignments of two multiple-sequence alignments, or statistical models of such alignments (profiles), have important applications in computational biology. The increased amount of information in a profile versus a single sequence can lead to more accurate alignments and more sensitive homolog detection in database searches. Several profile-profile alignment methods have been proposed and have been shown to improve sensitivity and alignment quality compared with sequence-sequence methods (such as BLAST) and profile-sequence methods (e.g. PSI-BLAST). Here we present a new approach to profile-profile alignment we call Comparison of Alignments by Constructing Hidden Markov Models (HMMs) (COACH). COACH aligns two multiple sequence alignments by constructing a profile HMM from one alignment and aligning the other to that HMM. We compare the alignment accuracy of COACH with two recently published methods: Yona and Levitt's prof_sim and Sadreyev and Grishin's COMPASS. On two sets of reference alignments selected from the FSSP database, we find that COACH is able, on average, to produce alignments giving the best coverage or the fewest errors, depending on the chosen parameter settings. COACH is freely available from www.drive5.com/lobster

  1. Acute multi-sgRNA knockdown of KEOPS complex genes reproduces the microcephaly phenotype of the stable knockout zebrafish model.

    PubMed

    Jobst-Schwan, Tilman; Schmidt, Johanna Magdalena; Schneider, Ronen; Hoogstraten, Charlotte A; Ullmann, Jeremy F P; Schapiro, David; Majmundar, Amar J; Kolb, Amy; Eddy, Kaitlyn; Shril, Shirlee; Braun, Daniela A; Poduri, Annapurna; Hildebrandt, Friedhelm

    2018-01-01

    Until recently, morpholino oligonucleotides have been widely employed in zebrafish as an acute and efficient loss-of-function assay. However, off-target effects and reproducibility issues when compared to stable knockout lines have compromised their further use. Here we employed an acute CRISPR/Cas approach using multiple single guide RNAs targeting simultaneously different positions in two exemplar genes (osgep or tprkb) to increase the likelihood of generating mutations on both alleles in the injected F0 generation and to achieve a similar effect as morpholinos but with the reproducibility of stable lines. This multi single guide RNA approach resulted in median likelihoods for at least one mutation on each allele of >99% and sgRNA specific insertion/deletion profiles as revealed by deep-sequencing. Immunoblot showed a significant reduction for Osgep and Tprkb proteins. For both genes, the acute multi-sgRNA knockout recapitulated the microcephaly phenotype and reduction in survival that we observed previously in stable knockout lines, though milder in the acute multi-sgRNA knockout. Finally, we quantify the degree of mutagenesis by deep sequencing, and provide a mathematical model to quantitate the chance for a biallelic loss-of-function mutation. Our findings can be generalized to acute and stable CRISPR/Cas targeting for any zebrafish gene of interest.

  2. CMOS-Compatible Fabrication for Photonic Crystal-Based Nanofluidic Structure.

    PubMed

    Peng, Wang; Chen, Youping; Ai, Wu; Zhang, Dailin; Song, Han; Xiong, Hui; Huang, Pengcheng

    2017-12-01

    Photonic crystal (PC)-based devices have been widely used since the 1990s, while PC has only recently entered the research area of nanofluidics. In this paper, a photonic crystal was used as a complementary metal oxide semiconductor (CMOS)-compatible part to create a nanofluidic structure. A nanofluidic structure prototype was fabricated with CMOS-compatible techniques. The nanofluidic channels were sealed by direct bonding of polydimethylsiloxane (PDMS) and the periodic gratings on the photonic crystal structure. The PC was fabricated on a 4-in. Si wafer with Si3N4 as the guided mode layer and SiO2 film as the substrate layer. The higher order mode resonance wavelength of the PC-based nanofluidic structure was selected, which can confine the enhanced electrical field located inside the nanochannel area. A design flow chart was used to guide the fabrication process. By optimizing the fabrication device parameters, the periodic grating of the PC-based nanofluidic structure had a high-fidelity profile with a fill factor of 0.5. The enhanced electric field was optimized and located within the channel area, and it can be used for PC-based nanofluidic applications with high performance.

  3. Exploring the limits of sequence and structure in a variant βγ-crystallin domain of the protein absent in melanoma-1 (AIM1)

    PubMed Central

    Aravind, Penmatsa; Wistow, Graeme; Sharma, Yogendra; Sankaranarayanan, Rajan

    2008-01-01

    βγ-Crystallins belong to a superfamily of proteins in prokaryotes and eukaryotes that are based on duplications of a characteristic, highly conserved Greek Key motif. Most members of the superfamily in vertebrates are structural proteins of the eye lens that contain four motifs arranged as two structural domains. Absent in melanoma-1 (AIM1), an unusual member of the superfamily whose expression is associated with suppression of malignancy in melanoma, contains 12 βγ-crystallin motifs in six domains. Some of these motifs diverge considerably from the canonical motif sequence. AIM1g1, the first βγ-crystallin domain of AIM1, is the most variant of βγ-crystallin domains currently known. In order to understand the limits of sequence variation on the structure, we report the crystal structure of AIM1g1 at 1.9 Å resolution. In spite of having changes in key residues, the domain retains the overall βγ-crystallin fold. The domain also contains an unusual extended surface loop that significantly alters the shape of the domain and its charge profile. This structure illustrates the resilience of the βγ fold to considerable sequence changes and its remarkable ability to adapt for novel functions. PMID:18582473

  4. Deep sequencing in library selection projects: what insight does it bring?

    PubMed

    Glanville, J; D'Angelo, S; Khan, T A; Reddy, S T; Naranjo, L; Ferrara, F; Bradbury, A R M

    2015-08-01

    High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Deep sequencing in library selection projects: what insight does it bring?

    PubMed Central

    Glanville, J; D’Angelo, S; Khan, T.A.; Reddy, S. T.; Naranjo, L.; Ferrara, F.; Bradbury, A.R.M.

    2015-01-01

    High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. PMID:26451649

  6. JNSViewer—A JavaScript-based Nucleotide Sequence Viewer for DNA/RNA secondary structures

    PubMed Central

    Dong, Min; Graham, Mitchell; Yadav, Nehul

    2017-01-01

    Many tools are available for visualizing RNA or DNA secondary structures, but there is scarce implementation in JavaScript that provides seamless integration with the increasingly popular web computational platforms. We have developed JNSViewer, a highly interactive web service, which is bundled with several popular tools for DNA/RNA secondary structure prediction and can provide precise and interactive correspondence among nucleotides, dot-bracket data, secondary structure graphs, and genic annotations. In JNSViewer, users can perform RNA secondary structure predictions with different programs and settings, add customized genic annotations in GFF format to structure graphs, search for specific linear motifs, and extract relevant structure graphs of sub-sequences. JNSViewer also allows users to choose a transcript or specific segment of Arabidopsis thaliana genome sequences and predict the corresponding secondary structure. Popular genome browsers (i.e., JBrowse and BrowserGenome) were integrated into JNSViewer to provide powerful visualizations of chromosomal locations, genic annotations, and secondary structures. In addition, we used StructureFold with default settings to predict some RNA structures for Arabidopsis by incorporating in vivo high-throughput RNA structure profiling data and stored the results in our web server, which might be a useful resource for RNA secondary structure studies in plants. JNSViewer is available at http://bioinfolab.miamioh.edu/jnsviewer/index.html. PMID:28582416

  7. Marketing and Distributive Education: Scope and Sequence.

    ERIC Educational Resources Information Center

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This guide, which was written as an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System, outlines the suggested scope and sequence of a 2-year program in marketing and distributive education. The guide consists of a course description;…

  8. Industrial Cooperative Education Co-op: Scope and Sequence.

    ERIC Educational Resources Information Center

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This guide, which was written as an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System, outlines the suggested scope and sequence of a 2-year cooperative program in industrial education. The guide consists of a course description;…

  9. Precision Medicine and PET/Computed Tomography in Melanoma.

    PubMed

    Mena, Esther; Sanli, Yasemin; Marcus, Charles; Subramaniam, Rathan M

    2017-10-01

    Recent advances in genomic profiling and sequencing of melanoma have provided new insights into the development of the basis for molecular biology to more accurately subgroup patients with melanoma. The development of novel mutation-targeted and immunomodulation therapy as a major component of precision oncology has revolutionized the management and outcome of patients with metastatic melanoma. PET imaging plays an important role in noninvasively assessing the tumor biological behavior, to guide individualized treatment and assess response to therapy. This review summarizes the recent genomic discoveries in melanoma in the era of targeted therapy and their implications for functional PET imaging. Published by Elsevier Inc.

  10. Comparative modeling without implicit sequence alignments.

    PubMed

    Kolinski, Andrzej; Gront, Dominik

    2007-10-01

    The number of known protein sequences is about thousand times larger than the number of experimentally solved 3D structures. For more than half of the protein sequences a close or distant structural analog could be identified. The key starting point in a classical comparative modeling is to generate the best possible sequence alignment with a template or templates. With decreasing sequence similarity, the number of errors in the alignments increases and these errors are the main causes of the decreasing accuracy of the molecular models generated. Here we propose a new approach to comparative modeling, which does not require the implicit alignment - the model building phase explores geometric, evolutionary and physical properties of a template (or templates). The proposed method requires prior identification of a template, although the initial sequence alignment is ignored. The model is built using a very efficient reduced representation search engine CABS to find the best possible superposition of the query protein onto the template represented as a 3D multi-featured scaffold. The criteria used include: sequence similarity, predicted secondary structure consistency, local geometric features and hydrophobicity profile. For more difficult cases, the new method qualitatively outperforms existing schemes of comparative modeling. The algorithm unifies de novo modeling, 3D threading and sequence-based methods. The main idea is general and could be easily combined with other efficient modeling tools as Rosetta, UNRES and others.

  11. Omokage search: shape similarity search service for biomolecular structures in both the PDB and EMDB.

    PubMed

    Suzuki, Hirofumi; Kawabata, Takeshi; Nakamura, Haruki

    2016-02-15

    Omokage search is a service to search the global shape similarity of biological macromolecules and their assemblies, in both the Protein Data Bank (PDB) and Electron Microscopy Data Bank (EMDB). The server compares global shapes of assemblies independent of sequence order and number of subunits. As a search query, the user inputs a structure ID (PDB ID or EMDB ID) or uploads an atomic model or 3D density map to the server. The search is performed usually within 1 min, using one-dimensional profiles (incremental distance rank profiles) to characterize the shapes. Using the gmfit (Gaussian mixture model fitting) program, the found structures are fitted onto the query structure and their superimposed structures are displayed on the Web browser. Our service provides new structural perspectives to life science researchers. Omokage search is freely accessible at http://pdbj.org/omokage/. © The Author 2015. Published by Oxford University Press.

  12. Genome and transcriptome sequencing in prospective metastatic triple-negative breast cancer uncovers therapeutic vulnerabilities.

    PubMed

    Craig, David W; O'Shaughnessy, Joyce A; Kiefer, Jeffrey A; Aldrich, Jessica; Sinari, Shripad; Moses, Tracy M; Wong, Shukmei; Dinh, Jennifer; Christoforides, Alexis; Blum, Joanne L; Aitelli, Cristi L; Osborne, Cynthia R; Izatt, Tyler; Kurdoglu, Ahmet; Baker, Angela; Koeman, Julie; Barbacioru, Catalin; Sakarya, Onur; De La Vega, Francisco M; Siddiqui, Asim; Hoang, Linh; Billings, Paul R; Salhia, Bodour; Tolcher, Anthony W; Trent, Jeffrey M; Mousses, Spyro; Von Hoff, Daniel; Carpten, John D

    2013-01-01

    Triple-negative breast cancer (TNBC) is characterized by the absence of expression of estrogen receptor, progesterone receptor, and HER-2. Thirty percent of patients recur after first-line treatment, and metastatic TNBC (mTNBC) has a poor prognosis with median survival of one year. Here, we present initial analyses of whole genome and transcriptome sequencing data from 14 prospective mTNBC. We have cataloged the collection of somatic genomic alterations in these advanced tumors, particularly those that may inform targeted therapies. Genes mutated in multiple tumors included TP53, LRP1B, HERC1, CDH5, RB1, and NF1. Notable genes involved in focal structural events were CTNNA1, PTEN, FBXW7, BRCA2, WT1, FGFR1, KRAS, HRAS, ARAF, BRAF, and PGCP. Homozygous deletion of CTNNA1 was detected in 2 of 6 African Americans. RNA sequencing revealed consistent overexpression of the FOXM1 gene when tumor gene expression was compared with nonmalignant breast samples. Using an outlier analysis of gene expression comparing one cancer with all the others, we detected expression patterns unique to each patient's tumor. Integrative DNA/RNA analysis provided evidence for deregulation of mutated genes, including the monoallelic expression of TP53 mutations. Finally, molecular alterations in several cancers supported targeted therapeutic intervention on clinical trials with known inhibitors, particularly for alterations in the RAS/RAF/MEK/ERK and PI3K/AKT/mTOR pathways. In conclusion, whole genome and transcriptome profiling of mTNBC have provided insights into somatic events occurring in this difficult to treat cancer. These genomic data have guided patients to investigational treatment trials and provide hypotheses for future trials in this irremediable cancer.

  13. Bacterial community structure in the hyperarid core of the Atacama Desert, Chile

    USGS Publications Warehouse

    Drees, Kevin P.; Neilson, Julia W.; Betancourt, Julio L.; Quade, Jay; Henderson, David A.; Pryor, Barry M.; Maier, Raina M.

    2006-01-01

    Soils from the hyperarid Atacama Desert of northern Chile were sampled along an east-west elevational transect (23.75 to 24.70 degrees S) through the driest sector to compare the relative structure of bacterial communities. Analysis of denaturing gradient gel electrophoresis (DGGE) profiles from each of the samples revealed that microbial communities from the extreme hyperarid core of the desert clustered separately from all of the remaining communities. Bands sequenced from DGGE profiles of two samples taken at a 22-month interval from this core region revealed the presence of similar populations dominated by bacteria from the Gemmatimonadetes and Planctomycetes phyla.

  14. A Predictive Algorithm to Detect Opioid Use Disorder

    PubMed Central

    Lee, Chee; Sharma, Maneesh; Kantorovich, Svetlana

    2018-01-01

    Purpose: The purpose of this study was to determine the clinical utility of an algorithm-based decision tool designed to assess risk associated with opioid use in the primary care setting. Methods: A prospective, longitudinal study was conducted to assess the utility of precision medicine testing in 1822 patients across 18 family medicine/primary care clinics in the United States. Using the profile, patients were categorized into low, moderate, and high risk for opioid use. Physicians who ordered testing were asked to complete patient evaluations and document their actions, decisions, and perceptions regarding the utility of the precision medicine tests. Results: Approximately 47% of primary care physicians surveyed used the profile to guide clinical decision-making. These physicians rated the benefit of the profile on patient care an average of 3.6 on a 5-point scale (1 indicating no benefit and 5 indicating significant benefit). Eighty-eight percent of all clinicians surveyed felt the test exhibited some benefit to their patient care. The most frequent utilization for the profile was to guide a change in opioid prescribed. Physicians reported greater benefit of profile utilization for minority patients. Patients whose treatment was guided by the profile had pain levels that were reduced, on average, 2.7 levels on the numeric rating scale. Conclusions: The profile provided primary care physicians with a useful tool to stratify the risk of opioid use disorder and was rated as beneficial for decision-making and patient improvement by the majority of physicians surveyed. Physicians reported the profile resulted in greater clinical improvement for minorities, highlighting the objective use of this profile to guide judicial use of opioids in high-risk patients. Significantly, when physicians used the profile to guide treatment decisions, patient-reported pain was greatly reduced. PMID:29383324

  15. A Predictive Algorithm to Detect Opioid Use Disorder: What Is the Utility in a Primary Care Setting?

    PubMed

    Lee, Chee; Sharma, Maneesh; Kantorovich, Svetlana; Brenton, Ashley

    2018-01-01

    The purpose of this study was to determine the clinical utility of an algorithm-based decision tool designed to assess risk associated with opioid use in the primary care setting. A prospective, longitudinal study was conducted to assess the utility of precision medicine testing in 1822 patients across 18 family medicine/primary care clinics in the United States. Using the profile, patients were categorized into low, moderate, and high risk for opioid use. Physicians who ordered testing were asked to complete patient evaluations and document their actions, decisions, and perceptions regarding the utility of the precision medicine tests. Approximately 47% of primary care physicians surveyed used the profile to guide clinical decision-making. These physicians rated the benefit of the profile on patient care an average of 3.6 on a 5-point scale (1 indicating no benefit and 5 indicating significant benefit). Eighty-eight percent of all clinicians surveyed felt the test exhibited some benefit to their patient care. The most frequent utilization for the profile was to guide a change in opioid prescribed. Physicians reported greater benefit of profile utilization for minority patients. Patients whose treatment was guided by the profile had pain levels that were reduced, on average, 2.7 levels on the numeric rating scale. The profile provided primary care physicians with a useful tool to stratify the risk of opioid use disorder and was rated as beneficial for decision-making and patient improvement by the majority of physicians surveyed. Physicians reported the profile resulted in greater clinical improvement for minorities, highlighting the objective use of this profile to guide judicial use of opioids in high-risk patients. Significantly, when physicians used the profile to guide treatment decisions, patient-reported pain was greatly reduced.

  16. Numerical investigation on performance and sediment erosion of Francis runner with different guide vane profiles

    NASA Astrophysics Data System (ADS)

    Lama, R.; Dahal, D. R.; Gautam, S.; Acharya, N.; Neopane, H.; Thapa, B. S.

    2018-06-01

    Francis turbines are ideal turbines for the Himalayan and Andes regions, where both low- and high-altitude mountains are located. Turbines operating in such regions face operational and maintenance problems due to sediment erosion. In order to reduce the erosion effects on these components, designing components for higher sediment handling is essential. This paper presents a performance analysis of a Francis runner and a prediction of sediment erosion on the runner blades for different operating conditions with different guide vane profiles. The simulations were carried out for 11 guide vane opening angles using the Tabakoff erosion model. At full load and best efficiency point, the erosion was localized at the pressure side of the runner blade outlet due to higher relative velocity. On the other hand, at part load condition, erosion was observed at the suction side of the blades. Application of the asymmetric guide vane profile NACA 4412 showed higher efficiency for all operating conditions with minimum erosion on the runner blades in comparison to the symmetric guide vane profile NACA 0012.

  17. Spontaneous formation of structurally diverse membrane channel architectures from a single antimicrobial peptide

    NASA Astrophysics Data System (ADS)

    Wang, Yukun; Chen, Charles H.; Hu, Dan; Ulmschneider, Martin B.; Ulmschneider, Jakob P.

    2016-11-01

    Many antimicrobial peptides (AMPs) selectively target and form pores in microbial membranes. However, the mechanisms of membrane targeting, pore formation and function remain elusive. Here we report an experimentally guided unbiased simulation methodology that yields the mechanism of spontaneous pore assembly for the AMP maculatin at atomic resolution. Rather than a single pore, maculatin forms an ensemble of structurally diverse temporarily functional low-oligomeric pores, which mimic integral membrane protein channels in structure. These pores continuously form and dissociate in the membrane. Membrane permeabilization is dominated by hexa-, hepta- and octamers, which conduct water, ions and small dyes. Pores form by consecutive addition of individual helices to a transmembrane helix or helix bundle, in contrast to current poration models. The diversity of the pore architectures--formed by a single sequence--may be a key feature in preventing bacterial resistance and could explain why sequence-function relationships in AMPs remain elusive.

  18. The Profiles in Practice School Reporting Software.

    ERIC Educational Resources Information Center

    Griffin, Patrick

    "The Profiles in Practice: School Reporting Software" provides a framework for reports on different aspects of performance in an assessment program. This booklet is the installation guide and user manual for the Profiles in Practice software, which is included as a CD-ROM. The chapters of the guide are: (1) "Installation"; (2) "Starting the…

  19. Principles for Predicting RNA Secondary Structure Design Difficulty.

    PubMed

    Anderson-Lee, Jeff; Fisker, Eli; Kosaraju, Vineet; Wu, Michelle; Kong, Justin; Lee, Jeehyung; Lee, Minjae; Zada, Mathew; Treuille, Adrien; Das, Rhiju

    2016-02-27

    Designing RNAs that form specific secondary structures is enabling better understanding and control of living systems through RNA-guided silencing, genome editing and protein organization. Little is known, however, about which RNA secondary structures might be tractable for downstream sequence design, increasing the time and expense of design efforts due to inefficient secondary structure choices. Here, we present insights into specific structural features that increase the difficulty of finding sequences that fold into a target RNA secondary structure, summarizing the design efforts of tens of thousands of human participants and three automated algorithms (RNAInverse, INFO-RNA and RNA-SSD) in the Eterna massive open laboratory. Subsequent tests through three independent RNA design algorithms (NUPACK, DSS-Opt and MODENA) confirmed the hypothesized importance of several features in determining design difficulty, including sequence length, mean stem length, symmetry and specific difficult-to-design motifs such as zigzags. Based on these results, we have compiled an Eterna100 benchmark of 100 secondary structure design challenges that span a large range in design difficulty to help test future efforts. Our in silico results suggest new routes for improving computational RNA design methods and for extending these insights to assess "designability" of single RNA structures, as well as of switches for in vitro and in vivo applications. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  20. Structural features of northern Tarim basin: Implications for regional tectonics and petroleum traps

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dong Jia; Juafu Lu; Dongsheng Cai

    1998-01-01

    The rhombus-shaped Tarim basin in northwestern China is controlled mainly by two left-lateral strike-slip systems: the northeast-trending Altun fault zone along its southeastern side and the northeast-trending Aheqi fault zone along its northwestern side. In this paper, we discuss the northern Tarim basin's structural features, which include three main tectonic units: the Kalpin uplift, the Kuqa depression, and the North Tarim uplift along the northern margin of the Tarim basin. Structural mapping in the Kalpin uplift shows that a series of imbricated thrust sheets have been overprinted by strike-slip faulting. The amount of strike-slip displacement is estimated to be 148 km by restoration of strike-slip structures in the uplift. The Kuqa depression is a Mesozoic-Cenozoic foredeep depression with well-developed flat-ramp structures and fault-related folds. The Baicheng basin, a Quaternary pull-apart basin, developed at the center of the Kuqa depression. Subsurface structures in the North Tarim uplift can be divided into the Mesozoic-Cenozoic and the Paleozoic litho-tectonic sequences in seismic profiles. The Paleozoic litho-tectonic sequence exhibits the interference of earlier left-lateral and later right-lateral strike-slip structures. Many normal faults in the Mesozoic-Cenozoic litho-tectonic sequence form the negative flower structures in the North Tarim uplift; these structures commonly directly overlie the positive flower structures in the Paleozoic litho-tectonic sequence. The interference regions of the northwest-trending and northeast-trending folds in the Paleozoic tectonic sequence have been identified to have the best trap structures. Our structural analysis indicates that the Tarim basin is a transpressional foreland basin rejuvenated during the Cenozoic.

  1. Identifying functionally informative evolutionary sequence profiles.

    PubMed

    Gil, Nelson; Fiser, Andras

    2018-04-15

    Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually. We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases. Availability and implementation: A freely accessible program, including source code, implementing SAMMI is available at https://github.com/nelsongil92/SAMMI.git. Contact: andras.fiser@einstein.yu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
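
    As a rough illustration of the quantity this abstract builds on, the sketch below computes the mutual information (in bits) between two MSA columns. It is not the SAMMI implementation: the function name `column_mutual_information` is hypothetical, gaps are treated as ordinary symbols, and how SAMMI aggregates or normalizes the score across columns and candidate MSAs is not stated here and is not modeled.

```python
import math
from collections import Counter


def column_mutual_information(col_a: str, col_b: str) -> float:
    """Mutual information (bits) between two equally long alignment columns."""
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    count_a = Counter(col_a)             # residue counts in column A
    count_b = Counter(col_b)             # residue counts in column B
    joint = Counter(zip(col_a, col_b))   # counts of co-occurring residue pairs
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Each string holds one column, read down the sequences of the alignment.
print(column_mutual_information("AACC", "GGTT"))  # columns that covary perfectly
print(column_mutual_information("AACC", "GTGT"))  # columns that vary independently
```

    Columns that covary perfectly carry high mutual information, while fully conserved or independently varying columns carry none.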

  2. Self-Study and Evaluation Guide/1979 Edition. Section B-1: Agency Profile.

    ERIC Educational Resources Information Center

    National Accreditation Council for Agencies Serving the Blind and Visually Handicapped, New York, NY.

    This guide on developing an agency profile is one of 28 guides designed for organizations serving the blind and the visually handicapped who are undertaking a self-study as part of the process for accreditation by the National Accreditation Council (NAC). Instructions for preparing a packet of informative data and material for advance study by…

  3. RaptorX-Property: a web server for protein structure property prediction.

    PubMed

    Wang, Sheng; Li, Wei; Liu, Shiwang; Xu, Jinbo

    2016-07-08

    RaptorX Property (http://raptorx2.uchicago.edu/StructurePropertyPred/predict/) is a web server predicting structure property of a protein sequence without using any templates. It outperforms other servers, especially for proteins without close homologs in PDB or with very sparse sequence profile (i.e. carries little evolutionary information). This server employs a powerful in-house deep learning model DeepCNF (Deep Convolutional Neural Fields) to predict secondary structure (SS), solvent accessibility (ACC) and disorder regions (DISO). DeepCNF not only models complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent property labels. Our experimental results show that, tested on CASP10, CASP11 and the other benchmarks, this server can obtain ∼84% Q3 accuracy for 3-state SS, ∼72% Q8 accuracy for 8-state SS, ∼66% Q3 accuracy for 3-state solvent accessibility, and ∼0.89 area under the ROC curve (AUC) for disorder prediction. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
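
    For readers unfamiliar with the Q3 and Q8 figures quoted above, they are per-residue accuracies: the fraction of residues whose predicted state matches the reference state. A minimal sketch follows; the function name `q_accuracy` and the example label strings are illustrative assumptions, not part of RaptorX-Property.

```python
def q_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state equals the observed state.

    With 3-state labels (e.g. H, E, C) this is Q3; with 8-state labels it is Q8.
    """
    if not observed or len(predicted) != len(observed):
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 8-residue example: 6 of 8 states agree.
print(q_accuracy("HHHHEECC", "HHHCEECE"))
```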

  4. Enniatin and Beauvericin Biosynthesis in Fusarium Species: Production Profiles and Structural Determinant Prediction.

    PubMed

    Liuzzi, Vania C; Mirabelli, Valentina; Cimmarusti, Maria Teresa; Haidukowski, Miriam; Leslie, John F; Logrieco, Antonio F; Caliandro, Rocco; Fanelli, Francesca; Mulè, Giuseppina

    2017-01-25

    Members of the fungal genus Fusarium can produce numerous secondary metabolites, including the nonribosomal mycotoxins beauvericin (BEA) and enniatins (ENNs). Both mycotoxins are synthesized by the multifunctional enzyme enniatin synthetase (ESYN1) that contains both peptide synthetase and S-adenosyl-l-methionine-dependent N-methyltransferase activities. Several Fusarium species can produce ENNs, BEA or both, but the mechanism(s) enabling these differential metabolic profiles is unknown. In this study, we analyzed the primary structure of ESYN1 by sequencing esyn1 transcripts from different Fusarium species. We measured ENNs and BEA production by ultra-performance liquid chromatography coupled with photodiode array and Acquity QDa mass detector (UPLC-PDA-QDa) analyses. We predicted protein structures, compared the predictions by multivariate analysis methods and found a striking correlation between BEA/ENN-producing profiles and ESYN1 three-dimensional structures. Structural differences in the β strand's Asn789-Ala793 and His797-Asp802 portions of the amino acid adenylation domain can be used to distinguish BEA/ENN-producing Fusarium isolates from those that produce only ENN.

  5. Glycogenomics as a mass spectrometry-guided genome-mining method for microbial glycosylated molecules.

    PubMed

    Kersten, Roland D; Ziemert, Nadine; Gonzalez, David J; Duggan, Brendan M; Nizet, Victor; Dorrestein, Pieter C; Moore, Bradley S

    2013-11-19

    Glycosyl groups are an essential mediator of molecular interactions in cells and on cellular surfaces. There are very few methods that directly relate sugar-containing molecules to their biosynthetic machineries. Here, we introduce glycogenomics as an experiment-guided genome-mining approach for fast characterization of glycosylated natural products (GNPs) and their biosynthetic pathways from genome-sequenced microbes by targeting glycosyl groups in microbial metabolomes. Microbial GNPs consist of aglycone and glycosyl structure groups in which the sugar unit(s) are often critical for the GNP's bioactivity, e.g., by promoting binding to a target biomolecule. GNPs are a structurally diverse class of molecules with important pharmaceutical and agrochemical applications. Herein, O- and N-glycosyl groups are characterized in their sugar monomers by tandem mass spectrometry (MS) and matched to corresponding glycosylation genes in secondary metabolic pathways by a MS-glycogenetic code. The associated aglycone biosynthetic genes of the GNP genotype then classify the natural product to further guide structure elucidation. We highlight the glycogenomic strategy by the characterization of several bioactive glycosylated molecules and their gene clusters, including the anticancer agent cinerubin B from Streptomyces sp. SPB74 and an antibiotic, arenimycin B, from Salinispora arenicola CNB-527.

  6. Assembling in Sequence: A Saleable Work Skill. Occupation Simulation Packet. Grades 3rd-4th.

    ERIC Educational Resources Information Center

    Hueston, Jean

    This teacher's guide for grades 3 and 4 contains simulated work experiences for students using the isolated skill concept - assembling in sequence. Teacher instructions include objectives, evaluation, and sequence of activities. The guide contains pre-tests and post-tests with instructions and answer keys. Three pre-skill activities are suggested,…

  7. Distinct profiling of antimicrobial peptide families

    PubMed Central

    Khamis, Abdullah M.; Essack, Magbubah; Gao, Xin; Bajic, Vladimir B.

    2015-01-01

    Motivation: The increased prevalence of multi-drug resistant (MDR) pathogens heightens the need to design new antimicrobial agents. Antimicrobial peptides (AMPs) exhibit broad-spectrum potent activity against MDR pathogens and kill rapidly, and are therefore recognized as a potential substitute for conventional antibiotics. Designing new AMPs using current in-silico approaches is, however, challenging due to the absence of suitable models, large number of design parameters, testing cycles, production time and cost. To date, AMPs have merely been categorized into families according to their primary sequences, structures and functions. The ability to computationally determine the properties that discriminate AMP families from each other could help in exploring the key characteristics of these families and facilitate the in-silico design of synthetic AMPs. Results: Here we studied 14 AMP families and sub-families. We selected a specific description of AMP amino acid sequence and identified compositional and physicochemical properties of amino acids that accurately distinguish each AMP family from all other AMPs with an average sensitivity, specificity and precision of 92.88%, 99.86% and 95.96%, respectively. Many of our identified discriminative properties have been shown to be compositional or functional characteristics of the corresponding AMP family in literature. We suggest that these properties could serve as guides for in-silico methods in design of novel synthetic AMPs. The methodology we developed is generic and has a potential to be applied for characterization of any protein family. Contact: vladimir.bajic@kaust.edu.sa Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25388148

  8. MS/MS fragmentation-guided search of TMG-chitooligomycins and their structure-activity relationship in specific β-N-acetylglucosaminidase inhibition.

    PubMed

    Usuki, Hirokazu; Yamamoto, Yukihiro; Kumagai, Yuya; Nitoda, Teruhiko; Kanzaki, Hiroshi; Hatanaka, Tadashi

    2011-04-21

    The reducing tetrasaccharide TMG-chitotriomycin (1) is an inhibitor of β-N-acetylglucosaminidase (GlcNAcase), produced by the actinomycete Streptomyces anulatus NBRC13369. The inhibitor shows a unique inhibitory spectrum, that is, selectivity toward enzymes from chitin-containing organisms such as insects and fungi. Nevertheless, its structure-selectivity relationship remains to be clarified. In this study, we conducted a structure-guided search of analogues of 1 in order to obtain diverse N,N,N-trimethylglucosaminium (TMG)-containing chitooligosaccharides. In this approach, the specific fragmentation profile of 1 on ESI-MS/MS analysis was used for the selective detection of desired compounds. As a result, two new analogues, named TMG-chitomonomycin (3) and TMG-chitobiomycin (2), were obtained from a culture filtrate of 1-producing Streptomyces. Their enzyme-inhibiting activity revealed that the potency and selectivity depended on the degree of polymerization of the reducing end GlcNAc units. Furthermore, a computational modeling study inspired the inhibitory mechanism of TMG-related compounds as a mimic of the substrate in the Michaelis complex of the GH20 enzyme. This study is an example of the successful application of a MS/MS experiment for structure-guided isolation of natural products.

  9. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo.

    PubMed

    Zubradt, Meghan; Gupta, Paromita; Persad, Sitara; Lambowitz, Alan M; Weissman, Jonathan S; Rouskin, Silvi

    2017-01-01

    Coupling of structure-specific in vivo chemical modification to next-generation sequencing is transforming RNA secondary structure studies in living cells. The dominant strategy for detecting in vivo chemical modifications uses reverse transcriptase truncation products, which introduce biases and necessitate population-average assessments of RNA structure. Here we present dimethyl sulfate (DMS) mutational profiling with sequencing (DMS-MaPseq), which encodes DMS modifications as mismatches using a thermostable group II intron reverse transcriptase. DMS-MaPseq yields a high signal-to-noise ratio, can report multiple structural features per molecule, and allows both genome-wide studies and focused in vivo investigations of even low-abundance RNAs. We apply DMS-MaPseq for the first analysis of RNA structure within an animal tissue and to identify a functional structure involved in noncanonical translation initiation. Additionally, we use DMS-MaPseq to compare the in vivo structure of pre-mRNAs with their mature isoforms. These applications illustrate DMS-MaPseq's capacity to dramatically expand in vivo analysis of RNA structure.

  10. Prediction of pi-turns in proteins using PSI-BLAST profiles and secondary structure information.

    PubMed

    Wang, Yan; Xue, Zhi-Dong; Shi, Xiao-Hong; Xu, Jin

    2006-09-01

    Due to the structural and functional importance of tight turns, some methods have been proposed to predict gamma-turns, beta-turns, and alpha-turns in proteins. In the past, studies of pi-turns were made, but not a single prediction approach has been developed so far. It will be useful to develop a method for identifying pi-turns in a protein sequence. In this paper, the support vector machine (SVM) method has been introduced to predict pi-turns from the amino acid sequence. The training and testing of this approach is performed with a newly collected data set of 640 non-homologous protein chains containing 1931 pi-turns. Different sequence encoding schemes have been explored in order to investigate their effects on the prediction performance. With multiple sequence alignment and predicted secondary structure, the final SVM model yields a Matthews correlation coefficient (MCC) of 0.556 by a 7-fold cross-validation. A web server implementing the prediction method is available at the following URL: http://210.42.106.80/piturn/.
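
    The Matthews correlation coefficient reported above summarizes a binary classifier (here, pi-turn versus non-pi-turn residues) from its confusion-matrix counts. The sketch below shows the standard formula; the function name `mcc_from_counts` and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level) to +1 (perfect).
    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for illustration only.
print(mcc_from_counts(tp=40, tn=900, fp=20, fn=40))
```

    MCC is often preferred over raw accuracy for rare classes such as tight turns, because it accounts for all four cells of the confusion matrix.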

  11. A sequence-based survey of the complex structural organization of tumor genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  12. Mice and Men Environmental Balance, Parts Three and Four of an Integrated Science Sequence, Teacher's Guide, 1970 Edition.

    ERIC Educational Resources Information Center

    Portland Project Committee, OR.

    This teacher's guide contains parts three and four of the four-part first year Portland Project, a three-year secondary integrated science curriculum sequence. Part three of the guide deals with topics such as the cell, reproduction, embryology, genetics, genetic diseases, genetics and change, populations, effects of density on populations,…

  13. CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites

    PubMed Central

    Naito, Yuki; Hino, Kimihiro; Bono, Hidemasa; Ui-Tei, Kumiko

    2015-01-01

    Summary: CRISPRdirect is a simple and functional web server for selecting rational CRISPR/Cas targets from an input sequence. The CRISPR/Cas system is a promising technique for genome engineering which allows target-specific cleavage of genomic DNA guided by Cas9 nuclease in complex with a guide RNA (gRNA), that complementarily binds to a ∼20 nt targeted sequence. The target sequence requirements are twofold. First, the 5′-NGG protospacer adjacent motif (PAM) sequence must be located adjacent to the target sequence. Second, the target sequence should be specific within the entire genome in order to avoid off-target editing. CRISPRdirect enables users to easily select rational target sequences with minimized off-target sites by performing exhaustive searches against genomic sequences. The server currently incorporates the genomic sequences of human, mouse, rat, marmoset, pig, chicken, frog, zebrafish, Ciona, fruit fly, silkworm, Caenorhabditis elegans, Arabidopsis, rice, Sorghum and budding yeast. Availability: Freely available at http://crispr.dbcls.jp/. Contact: y-naito@dbcls.rois.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25414360
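    The first target requirement described above (a ∼20 nt target immediately followed by an NGG PAM) can be illustrated with a short scan. This is a minimal sketch under stated assumptions: it searches the forward strand only, the function name and return format are hypothetical, and it does not attempt the genome-wide off-target search that CRISPRdirect performs.

```python
import re


def find_ngg_targets(seq, target_len=20):
    """Return (start, target, pam) for each target followed by an NGG PAM.

    Only the given (forward) strand is scanned; a real design tool would
    also scan the reverse complement and check genome-wide specificity.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % target_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]
```

    Each returned tuple gives the 0-based start of the candidate target, the target sequence, and its adjacent PAM.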

  14. Propagation of Electromagnetic Waves in Slab Waveguide Structure Consisting of Chiral Nihility Claddings and Negative-Index Material Core Layer

    NASA Astrophysics Data System (ADS)

    Helal, Alaa N. Abu; Taya, Sofyan A.; Elwasife, Khitam Y.

    2018-06-01

    The dispersion equation of an asymmetric three-layer slab waveguide, in which all layers are chiral materials, is presented. Then, the dispersion equation of a symmetric slab waveguide, in which the claddings are chiral materials and the core layer is negative index material, is derived. Normalized cut-off frequencies, field profile, and energy flow of right-handed and left-handed circularly polarized modes are derived and plotted. We consider both odd and even guided modes. Numerical results of guided low-order modes are provided. Some novel features, such as abnormal dispersion curves, are found.

  15. Focusing guided waves using surface bonded elastic metamaterials

    NASA Astrophysics Data System (ADS)

    Yan, Xiang; Zhu, Rui; Huang, Guoliang; Yuan, Fuh-Gwo

    2013-09-01

    Bonding a two-dimensional planar array of small lead discs on an aluminum plate with silicone rubber is shown numerically to focus low-frequency flexural guided waves. The "effective mass density profile" of this type of elastic metamaterials (EMMs), perpendicular to wave propagation direction, is carefully tailored and designed, which allows rays of flexural A0 mode Lamb waves to bend in succession and then focus through a 7 × 9 planar array. Numerical simulations show that Lamb waves can be focused beyond EMMs region with amplified displacement and yet largely retained narrow banded waveform, which may have potential application in structural health monitoring.

  16. Fisher: a program for the detection of H/ACA snoRNAs using MFE secondary structure prediction and comparative genomics - assessment and update.

    PubMed

    Freyhult, Eva; Edvardsson, Sverker; Tamas, Ivica; Moulton, Vincent; Poole, Anthony M

    2008-07-21

    The H/ACA family of small nucleolar RNAs (snoRNAs) plays a central role in guiding the pseudouridylation of ribosomal RNA (rRNA). In an effort to systematically identify the complete set of rRNA-modifying H/ACA snoRNAs from the genome sequence of the budding yeast, Saccharomyces cerevisiae, we developed a program - Fisher - and previously presented several candidate snoRNAs based on our analysis [1]. In this report, we provide a brief update of this work, which was aborted after the publication of experimentally-identified snoRNAs [2] identical to candidates we had identified bioinformatically using Fisher. Our motivation for revisiting this work is to report on the status of the candidate snoRNAs described in [1], and secondly, to report that a modified version of Fisher together with the available multiple yeast genome sequences was able to correctly identify several H/ACA snoRNAs for modification sites not identified by the snoGPS program [3]. While we are no longer developing Fisher, we briefly consider the merits of the Fisher algorithm relative to snoGPS, which may be of use for workers considering pursuing a similar search strategy for the identification of small RNAs. The modified source code for Fisher is made available as supplementary material. Our results confirm the validity of using minimum free energy (MFE) secondary structure prediction to guide comparative genomic screening for RNA families with few sequence constraints.

  17. Directing folding pathways for multi-component DNA origami nanostructures with complex topology

    NASA Astrophysics Data System (ADS)

    Marras, A. E.; Zhou, L.; Kolliopoulos, V.; Su, H.-J.; Castro, C. E.

    2016-05-01

    Molecular self-assembly has become a well-established technique to design complex nanostructures and hierarchical mesoscale assemblies. The typical approach is to design binding complementarity into nucleotide or amino acid sequences to achieve the desired final geometry. However, with an increasing interest in dynamic nanodevices, the need to design structures with motion has necessitated the development of multi-component structures. While this has been achieved through hierarchical assembly of similar structural units, here we focus on the assembly of topologically complex structures, specifically with concentric components, where post-folding assembly is not feasible. We exploit the ability to direct folding pathways to program the sequence of assembly and present a novel approach of designing the strand topology of intermediate folding states to program the topology of the final structure, in this case a DNA origami slider structure that functions much like a piston-cylinder assembly in an engine. The ability to program the sequence and control orientation and topology of multi-component DNA origami nanostructures provides a foundation for a new class of structures with internal and external moving parts and complex scaffold topology. Furthermore, this work provides critical insight to guide the design of intermediate states along a DNA origami folding pathway and to further understand the details of DNA origami self-assembly to more broadly control folding states and landscapes.

  18. Optimizing sgRNA structure to improve CRISPR-Cas9 knockout efficiency.

    PubMed

    Dang, Ying; Jia, Gengxiang; Choi, Jennie; Ma, Hongming; Anaya, Edgar; Ye, Chunting; Shankar, Premlata; Wu, Haoquan

    2015-12-15

    Single-guide RNA (sgRNA) is one of the two key components of the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 genome-editing system. The current commonly used sgRNA structure has a shortened duplex compared with the native bacterial CRISPR RNA (crRNA)-transactivating crRNA (tracrRNA) duplex and contains a continuous sequence of thymines, which is the pause signal for RNA polymerase III and thus could potentially reduce transcription efficiency. Here, we systematically investigate the effect of these two elements on knockout efficiency and show that modifying the sgRNA structure by extending the duplex length and mutating the fourth thymine of the continuous sequence of thymines to cytosine or guanine significantly, and sometimes dramatically, improves knockout efficiency in cells. In addition, the optimized sgRNA structure also significantly increases the efficiency of more challenging genome-editing procedures, such as gene deletion, which is important for inducing a loss of function in non-coding genes. By a systematic investigation of sgRNA structure we find that extending the duplex by approximately 5 bp combined with mutating the continuous sequence of thymines at position 4 to cytosine or guanine significantly increases gene knockout efficiency in CRISPR-Cas9-based genome editing experiments.
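    The thymine substitution described above can be sketched as a simple string edit. This is an illustrative sketch only: the function name is hypothetical, the input in the usage note is a toy sequence rather than a real sgRNA scaffold, and the sketch edits just the first run of four or more thymines in a DNA-alphabet template.

```python
import re


def mutate_fourth_thymine(template, replacement="C"):
    """Replace the 4th T of the first run of >= 4 consecutive Ts.

    `replacement` must be "C" or "G", the two substitutions named in the
    abstract. The template is returned upper-cased and otherwise unchanged
    if no such run exists.
    """
    if replacement not in ("C", "G"):
        raise ValueError("replacement must be 'C' or 'G'")
    template = template.upper()
    run = re.search(r"T{4,}", template)
    if run is None:
        return template
    i = run.start() + 3  # index of the fourth thymine in the run
    return template[:i] + replacement + template[i + 1:]
```

    For the toy input "GTTTTAGA" the sketch returns "GTTTCAGA", breaking the four-thymine run that the abstract identifies as a pause signal for RNA polymerase III.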

  19. Standard Transistor Array (Star): SIMLOG/TESTGN programmer's guide, volume 2, addendum 2

    NASA Technical Reports Server (NTRS)

    Carroll, B. D.

    1979-01-01

    A brief introduction to the SIMLOG/TESTGN system of programs is given. SIMLOG is a logic simulation program, whereas TESTGN is a program for generating test sequences from output produced by SIMLOG. The structures of the two programs are described. Data base, main program, and subprogram details are also given. Guidelines for program modifications are discussed. Commented program listings are included.

  20. Identifying mRNA sequence elements for target recognition by human Argonaute proteins

    PubMed Central

    Li, Jingjing; Kim, TaeHyung; Nutiu, Razvan; Ray, Debashish; Hughes, Timothy R.; Zhang, Zhaolei

    2014-01-01

    It is commonly known that mammalian microRNAs (miRNAs) guide the RNA-induced silencing complex (RISC) to target mRNAs through the seed-pairing rule. However, recent experiments that coimmunoprecipitate the Argonaute proteins (AGOs), the central catalytic component of RISC, have consistently revealed extensive AGO-associated mRNAs that lack seed complementarity with miRNAs. We herein test the hypothesis that AGO has its own binding preference within target mRNAs, independent of guide miRNAs. By systematically analyzing the data from in vivo cross-linking experiments with human AGOs, we have identified a structurally accessible and evolutionarily conserved region (∼10 nucleotides in length) that alone can accurately predict AGO–mRNA associations, independent of the presence of miRNA binding sites. Within this region, we further identified an enriched motif that was replicable on independent AGO-immunoprecipitation data sets. We used RNAcompete to enumerate the RNA-binding preference of human AGO2 to all possible 7-mer RNA sequences and validated the AGO motif in vitro. These findings reveal a novel function of AGOs as sequence-specific RNA-binding proteins, which may aid miRNAs in recognizing their targets with high specificity. PMID:24663241

  1. Novel complex MAD phasing and RNase H structural insights using selenium oligonucleotides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Abdur, Rob; Gerlits, Oksana O.; Gan, Jianhua

    2014-02-01

    Selenium-derivatized oligonucleotides may facilitate phase determination and high-resolution structure determination for protein–nucleic acid crystallography. The Se atom-specific mutagenesis (SAM) strategy may also enhance the study of nuclease catalysis. The crystal structures of protein–nucleic acid complexes are commonly determined using selenium-derivatized proteins via MAD or SAD phasing. Here, the first protein–nucleic acid complex structure determined using selenium-derivatized nucleic acids is reported. The RNase H–RNA/DNA complex is used as an example to demonstrate the proof of principle. The high-resolution crystal structure indicates that this selenium replacement results in a local subtle unwinding of the RNA/DNA substrate duplex, thereby shifting the RNA scissile phosphate closer to the transition state of the enzyme-catalyzed reaction. It was also observed that the scissile phosphate forms a hydrogen bond to the water nucleophile and helps to position the water molecule in the structure. Consistently, it was discovered that the substitution of a single O atom by a Se atom in a guide DNA sequence can largely accelerate RNase H catalysis. These structural and catalytic studies shed new light on the guide-dependent RNA cleavage.

  2. YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs

    PubMed Central

    Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G.; Rigoutsos, Isidore

    2017-01-01

    Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has been presenting a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amounts of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulations in various transcriptomes, which could play important regulatory roles in translation and other biological processes. PMID:28108659

  3. A comparative molecular analysis of water-filled limestone sinkholes in north-eastern Mexico.

    PubMed

    Sahl, Jason W; Gary, Marcus O; Harris, J Kirk; Spear, John R

    2011-01-01

    Sistema Zacatón in north-eastern Mexico is host to several deep, water-filled, anoxic, karstic sinkholes (cenotes). These cenotes were explored, mapped, and geochemically and microbiologically sampled by the autonomous underwater vehicle deep phreatic thermal explorer (DEPTHX). The community structure of the filterable fraction of the water column and extensive microbial mats that coat the cenote walls was investigated by comparative analysis of small-subunit (SSU) 16S rRNA gene sequences. Full-length Sanger gene sequence analysis revealed novel microbial diversity that included three putative bacterial candidate phyla and three additional groups that showed high intra-clade distance with poorly characterized bacterial candidate phyla. Limited functional gene sequence analysis in these anoxic environments identified genes associated with methanogenesis, sulfate reduction and anaerobic ammonium oxidation. A directed, barcoded amplicon, multiplex pyrosequencing approach was employed to compare ∼100,000 bacterial SSU gene sequences from water column and wall microbial mat samples from five cenotes in Sistema Zacatón. A new, high-resolution sequence distribution profile (SDP) method identified changes in specific phylogenetic types (phylotypes) in microbial mats at varied depths; Mantel tests showed a correlation of the genetic distances between mat communities in two cenotes and the geographic location of each cenote. Community structure profiles from the water column of three neighbouring cenotes showed distinct variation; statistically significant differences in the concentration of geochemical constituents suggest that the variation observed in microbial communities between neighbouring cenotes are due to geochemical variation. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.

  4. Correct primary structure assessment and extensive glyco-profiling of cetuximab by a combination of intact, middle-up, middle-down and bottom-up ESI and MALDI mass spectrometry techniques.

    PubMed

    Ayoub, Daniel; Jabs, Wolfgang; Resemann, Anja; Evers, Waltraud; Evans, Catherine; Main, Laura; Baessmann, Carsten; Wagner-Rousset, Elsa; Suckau, Detlev; Beck, Alain

    2013-01-01

    The European Medicines Agency recently received the first marketing authorization application for a biosimilar monoclonal antibody (mAb) and adopted the final guidelines on biosimilar mAbs and Fc-fusion proteins. The agency requires high similarity between biosimilar and reference products for approval. Specifically, the amino acid sequences must be identical. The glycosylation pattern of the antibody is also often considered to be a very important quality attribute due to its strong effect on quality, safety, immunogenicity, pharmacokinetics and potency. Here, we describe a case study of cetuximab, which has been marketed since 2004. Biosimilar versions of the product are now in the pipelines of numerous therapeutic antibody biosimilar developers. We applied a combination of intact, middle-down, middle-up and bottom-up electrospray ionization and matrix assisted laser desorption ionization mass spectrometry techniques to characterize the amino acid sequence and major post-translational modifications of the marketed cetuximab product, with special emphasis on glycosylation. Our results revealed a sequence error in the reported sequence of the light chain in databases and in publications, thus highlighting the potency of mass spectrometry to establish correct antibody sequences. We were also able to achieve a comprehensive identification of cetuximab's glycoforms and glycosylation profile assessment on both Fab and Fc domains. Taken together, the reported approaches and data form a solid framework for the comparability of antibodies and their biosimilar candidates that could be further applied to routine structural assessments of these and other antibody-based products.

  5. Visualizing nD Point Clouds as Topological Landscape Profiles to Guide Local Data Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oesterling, Patrick; Heine, Christian; Weber, Gunther H.

    2012-05-04

    Analyzing high-dimensional point clouds is a classical challenge in visual analytics. Traditional techniques, such as projections or axis-based techniques, suffer from projection artifacts, occlusion, and visual complexity. We propose to split data analysis into two parts to address these shortcomings. First, a structural overview phase abstracts data by its density distribution. This phase performs topological analysis to support accurate and non-overlapping presentation of the high-dimensional cluster structure as a topological landscape profile. Utilizing a landscape metaphor, it presents clusters and their nesting as hills whose height, width, and shape reflect cluster coherence, size, and stability, respectively. A second local analysis phase utilizes this global structural knowledge to select individual clusters or point sets for further, localized data analysis. Focusing on structural entities significantly reduces visual clutter in established geometric visualizations and permits a clearer, more thorough data analysis. In conclusion, this analysis complements the global topological perspective and enables the user to study subspaces or geometric properties, such as shape.

  6. Profiling the nucleobase and structure selectivity of anticancer drugs and other DNA alkylating agents by RNA sequencing.

    PubMed

    Gillingham, Dennis; Sauter, Basilius

    2018-05-06

    Drugs that covalently modify DNA are components of most chemotherapy regimens, often serving as first-line treatments. Classically the chemical reactivity of DNA alkylators has been determined in vitro with short oligonucleotides. Here we use next generation RNA sequencing to report on the chemoselectivity of alkylating agents. We develop the method with the well-known clinically used DNA modifying drugs streptozotocin and temozolomide, and then apply the technique to profile RNA modification with uncharacterized alkylation reactions such as with powerful electrophiles like trimethylsilyldiazomethane. The multiplexed and massively parallel format of NGS allows analyses of chemical reactivity in nucleic acids to be accomplished in less time with greater statistical power. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Functional structure of the bromeliad tank microbiome is strongly shaped by local geochemical conditions.

    PubMed

    Louca, Stilianos; Jacques, Saulo M S; Pires, Aliny P F; Leal, Juliana S; González, Angélica L; Doebeli, Michael; Farjalla, Vinicius F

    2017-08-01

    Phytotelmata in tank-forming Bromeliaceae plants are regarded as potential miniature models for aquatic ecology, but detailed investigations of their microbial communities are rare. Hence, the biogeochemistry in bromeliad tanks remains poorly understood. Here we investigate the structure of bacterial and archaeal communities inhabiting the detritus within the tanks of two bromeliad species, Aechmea nudicaulis and Neoregelia cruenta, from a Brazilian sand dune forest. We used metagenomic sequencing for functional community profiling and 16S sequencing for taxonomic profiling. We estimated the correlation between functional groups and various environmental variables, and compared communities between bromeliad species. In all bromeliads, microbial communities spanned a metabolic network adapted to oxygen-limited conditions, including all denitrification steps, ammonification, sulfate respiration, methanogenesis, reductive acetogenesis and anoxygenic phototrophy. Overall, CO2 reducers dominated in abundance over sulfate reducers, and anoxygenic phototrophs largely outnumbered oxygenic photoautotrophs. Functional community structure correlated strongly with environmental variables, between and within a single bromeliad species. Methanogens and reductive acetogens correlated with detrital volume and canopy coverage, and exhibited higher relative abundances in N. cruenta. A comparison of bromeliads to freshwater lake sediments and soil from around the world revealed stark differences in terms of taxonomic as well as functional microbial community structure. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.

  8. Profiling of engineering hotspots identifies an allosteric CRISPR-Cas9 switch.

    PubMed

    Oakes, Benjamin L; Nadler, Dana C; Flamholz, Avi; Fellmann, Christof; Staahl, Brett T; Doudna, Jennifer A; Savage, David F

    2016-06-01

    The clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated protein Cas9 from Streptococcus pyogenes is an RNA-guided DNA endonuclease with widespread utility for genome modification. However, the structural constraints limiting the engineering of Cas9 have not been determined. Here we experimentally profile Cas9 using randomized insertional mutagenesis and delineate hotspots in the structure capable of tolerating insertions of a PDZ domain without disruption of the enzyme's binding and cleavage functions. Orthogonal domains or combinations of domains can be inserted into the identified sites with minimal functional consequence. To illustrate the utility of the identified sites, we construct an allosterically regulated Cas9 by insertion of the estrogen receptor-α ligand-binding domain. This protein showed robust, ligand-dependent activation in prokaryotic and eukaryotic cells, establishing a versatile one-component system for inducible and reversible Cas9 activation. Thus, domain insertion profiling facilitates the rapid generation of new Cas9 functionalities and provides useful data for future engineering of Cas9.

  9. Digital Imagery Compression Best Practices Guide - A Motion Imagery Standards Profile (MISP) Compliant Architecture

    DTIC Science & Technology

    2012-06-01

    MISP) COMPLIANT ARCHITECTURE WHITE SANDS MISSILE RANGE REAGAN TEST SITE YUMA PROVING GROUND DUGWAY PROVING GROUND ABERDEEN TEST CENTER...DIGITAL MOTION IMAGERY COMPRESSION BEST PRACTICES GUIDE – A MOTION IMAGERY STANDARDS PROFILE (MISP) COMPLIANT ARCHITECTURE...delivery, and archival purposes. These practices are based on a Motion Imagery Standards Profile (MISP) compliant architecture, which has been defined

  10. Know Your Rights on Campus: A Guide on Racial Profiling, and Hate Crime for International Students in the United States.

    ERIC Educational Resources Information Center

    Harvard Civil Rights Project, Cambridge, MA.

    This guide to the rights of international students explains racial profiling and hate crimes. Since the terrorist attacks of September 11, 2001, many immigrants and international students have experienced heightened scrutiny and outright discrimination. Racial profiling refers to the reliance by law enforcement officers on a person's ethnicity,…

  11. Molecular biomarkers to guide precision medicine in localized prostate cancer.

    PubMed

    Smits, Minke; Mehra, Niven; Sedelaar, Michiel; Gerritsen, Winald; Schalken, Jack A

    2017-08-01

    Major advances through tumor profiling technologies, including next-generation sequencing and epigenetic, proteomic and transcriptomic methods, have been made in primary prostate cancer, providing novel biomarkers that may guide precision medicine in the near future. Areas covered: The authors provided an overview of novel molecular biomarkers in tissue, blood and urine that may be used as clinical tools to assess prognosis, improve selection criteria for active surveillance programs, and detect disease relapse early in localized prostate cancer. Expert commentary: Active surveillance (AS) in localized prostate cancer is an accepted strategy in patients with very low-risk prostate cancer. Many more patients may benefit from watchful waiting, including patients of higher clinical stage and grade; however, selection criteria have to be optimized, and early recognition of transformation from localized to lethal disease has to be improved by the addition of molecular biomarkers. The role of non-invasive biomarkers is challenging the need for repeat biopsies, commonly performed at 1 and 4 years in men under AS programs.

  12. Associations between soil bacterial community structure and nutrient cycling functions in long-term organic farm soils following cover crop and organic fertilizer amendment.

    PubMed

    Fernandez, Adria L; Sheaffer, Craig C; Wyse, Donald L; Staley, Christopher; Gould, Trevor J; Sadowsky, Michael J

    2016-10-01

    Agricultural management practices can produce changes in soil microbial populations whose functions are crucial to crop production and may be detectable using high-throughput sequencing of bacterial 16S rRNA. To apply sequencing-derived bacterial community structure data to on-farm decision-making will require a better understanding of the complex associations between soil microbial community structure and soil function. Here 16S rRNA sequencing was used to profile soil bacterial communities following application of cover crops and organic fertilizer treatments in certified organic field cropping systems. Amendment treatments were hairy vetch (Vicia villosa), winter rye (Secale cereale), oilseed radish (Raphanus sativus), buckwheat (Fagopyrum esculentum), beef manure, pelleted poultry manure, Sustane® 8-2-4, and a no-amendment control. Enzyme activities, net N mineralization, soil respiration, and soil physicochemical properties including nutrient levels, organic matter (OM) and pH were measured. Relationships between these functional and physicochemical parameters and soil bacterial community structure were assessed using multivariate methods including redundancy analysis, discriminant analysis, and Bayesian inference. Several cover crops and fertilizers affected soil functions including N-acetyl-β-d-glucosaminidase and β-glucosidase activity. Effects, however, were not consistent across locations and sampling timepoints. Correlations were observed among functional parameters and relative abundances of individual bacterial families and phyla. Bayesian analysis inferred no directional relationships between functional activities, bacterial families, and physicochemical parameters. Soil functional profiles were more strongly predicted by location than by treatment, and differences were largely explained by soil physicochemical parameters. Composition of soil bacterial communities was predictive of soil functional profiles. Differences in soil function were better explained using both soil physicochemical test values and bacterial community structure data than using soil tests alone. Pursuing a better understanding of bacterial community composition and how it is affected by farming practices is a promising avenue for increasing our ability to predict the impact of management practices on important soil functions. Copyright © 2016. Published by Elsevier B.V.

  13. Protein 8-class secondary structure prediction using conditional neural fields.

    PubMed

    Wang, Zhiyong; Zhao, Feng; Peng, Jian; Xu, Jinbo

    2011-10-01

    Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
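    The Q8 figures quoted above are per-residue accuracies over eight secondary-structure classes. The following is a minimal sketch of that metric only, not of the CNF model; the function name is hypothetical, and the one-letter state strings in the usage note follow the common DSSP-style encoding, which is an assumption here.

```python
def q8_accuracy(predicted, observed):
    """Percentage of residues whose predicted 8-class state matches the observed one.

    Both arguments are equal-length strings with one state letter per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)
```

    For example, a prediction of "HHEE" against an observed "HHEC" matches three of four residues and scores 75.0.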

  14. Guide-bound structures of an RNA-targeting A-cleaving CRISPR–Cas13a enzyme

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Knott, Gavin J.; East-Seletsky, Alexandra; Cofsky, Joshua C.

    CRISPR adaptive immune systems protect bacteria from infections by deploying CRISPR RNA (crRNA)-guided enzymes to recognize and cut foreign nucleic acids. Type VI-A CRISPR–Cas systems include the Cas13a enzyme, an RNA-activated RNase capable of crRNA processing and single-stranded RNA degradation upon target-transcript binding. Here we present the 2.0-Å resolution crystal structure of a crRNA-bound Lachnospiraceae bacterium Cas13a (LbaCas13a), representing a recently discovered Cas13a enzyme subtype. This structure and accompanying biochemical experiments define the Cas13a catalytic residues that are directly responsible for crRNA maturation. In addition, the orientation of the foreign-derived target-RNA-specifying sequence in the protein interior explains the conformational gating of Cas13a nuclease activation. These results describe how Cas13a enzymes generate functional crRNAs and how catalytic activity is blocked before target-RNA recognition, with implications for both bacterial immunity and diagnostic applications.

  15. Guide-bound structures of an RNA-targeting A-cleaving CRISPR–Cas13a enzyme

    DOE PAGES

    Knott, Gavin J.; East-Seletsky, Alexandra; Cofsky, Joshua C.; ...

    2017-09-11

    CRISPR adaptive immune systems protect bacteria from infections by deploying CRISPR RNA (crRNA)-guided enzymes to recognize and cut foreign nucleic acids. Type VI-A CRISPR–Cas systems include the Cas13a enzyme, an RNA-activated RNase capable of crRNA processing and single-stranded RNA degradation upon target-transcript binding. Here we present the 2.0-Å resolution crystal structure of a crRNA-bound Lachnospiraceae bacterium Cas13a (LbaCas13a), representing a recently discovered Cas13a enzyme subtype. This structure and accompanying biochemical experiments define the Cas13a catalytic residues that are directly responsible for crRNA maturation. In addition, the orientation of the foreign-derived target-RNA-specifying sequence in the protein interior explains the conformational gating of Cas13a nuclease activation. These results describe how Cas13a enzymes generate functional crRNAs and how catalytic activity is blocked before target-RNA recognition, with implications for both bacterial immunity and diagnostic applications.

  16. Radio-guided sentinel lymph node identification by lymphoscintigraphy fused with an anatomical vector profile: clinical applications.

    PubMed

    Niccoli Asabella, A; Antonica, F; Renna, M A; Rubini, D; Notaristefano, A; Nicoletti, A; Rubini, G

    2013-12-01

    To develop a method to fuse lymphoscintigraphic images with an adaptable anatomical vector profile and to evaluate its role in clinical practice. We used Adobe Illustrator CS6 to create different vector profiles and Adobe Photoshop CS6 to fuse those profiles with lymphoscintigraphic images of the patient. We processed 197 lymphoscintigraphies performed in patients with cutaneous melanomas, breast cancer or delayed lymph drainage. Our models can be adapted to every patient attitude or position and contain different levels of anatomical details ranging from external body profiles to the internal anatomical structures like bones, muscles, vessels, and lymph nodes. If needed, new anatomical details can be added and embedded in the profile without redrawing them, saving a lot of time. Details can also be easily hidden, allowing the physician to view only relevant information and structures. Fusion times are about 85 s. The diagnostic confidence of the observers increased significantly. The validation process showed a slight shift (mean 4.9 mm). We have created a new, practical, inexpensive digital technique based on commercial software for fusing lymphoscintigraphic images with built-in anatomical reference profiles. It is easily reproducible and does not alter the original scintigraphic image. Our method allows a more meaningful interpretation of lymphoscintigraphies, an easier recognition of the anatomical site and better lymph node dissection planning.

  17. Structural and Functional Aspects of Class A Carbapenemases

    PubMed Central

    Naas, Thierry; Dortet, Laurent; Iorga, Bogdan I.

    2016-01-01

    The fight against infectious diseases is probably one of the greatest public health challenges faced by our society, especially with the emergence of carbapenem-resistant gram-negatives that are in some cases pan-drug resistant. Currently, β-lactamase-mediated resistance does not spare even the newest and most powerful β-lactams (carbapenems), whose activity is challenged by carbapenemases. The worldwide dissemination of carbapenemases in gram-negative organisms threatens to take medicine back into the pre-antibiotic era since the mortality associated with infections caused by these “superbugs” is very high, due to limited treatment options. Clinically-relevant carbapenemases belong either to metallo-β-lactamases (MBLs) of Ambler class B or to serine-β-lactamases (SBLs) of Ambler class A and D enzymes. Class A carbapenemases may be chromosomally-encoded (SME, NmcA, SFC-1, BIC-1, PenA, FPH-1, SHV-38), plasmid-encoded (KPC, GES, FRI-1) or both (IMI). The plasmid-encoded enzymes are often associated with mobile elements responsible for their mobilization. These enzymes, even though weakly related in terms of sequence identities, share structural features and a common mechanism of action. They variably hydrolyse penicillins, cephalosporins, monobactams, carbapenems, and are inhibited by clavulanate and tazobactam. Three-dimensional structures of class A carbapenemases, in the apo form or in complex with substrates/inhibitors, together with site-directed mutagenesis studies, provide essential input for identifying the structural factors and subtle conformational changes that influence the hydrolytic profile and inhibition of these enzymes. Overall, these data represent the building blocks for understanding the structure-function relationships that define the phenotypes of class A carbapenemases and can guide the design of new molecules of therapeutic interest. PMID:26960341

  18. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    PubMed

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein-RNA binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNAcompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/ (contact: bab@mit.edu). Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  19. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis

    PubMed Central

    Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting

    2016-01-01

    ABSTRACT Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181

  20. Same-day genomic and epigenomic diagnosis of brain tumors using real-time nanopore sequencing.

    PubMed

    Euskirchen, Philipp; Bielle, Franck; Labreche, Karim; Kloosterman, Wigard P; Rosenberg, Shai; Daniau, Mailys; Schmitt, Charlotte; Masliah-Planchon, Julien; Bourdeaut, Franck; Dehais, Caroline; Marie, Yannick; Delattre, Jean-Yves; Idbaih, Ahmed

    2017-11-01

    Molecular classification of cancer has entered clinical routine to inform diagnosis, prognosis, and treatment decisions. At the same time, new tumor entities have been identified that cannot be defined histologically. For central nervous system tumors, the current World Health Organization classification explicitly demands molecular testing, e.g., for 1p/19q-codeletion or IDH mutations, to make an integrated histomolecular diagnosis. However, a plethora of sophisticated technologies is currently needed to assess different genomic and epigenomic alterations and turnaround times are in the range of weeks, which makes standardized and widespread implementation difficult and hinders timely decision making. Here, we explored the potential of a pocket-size nanopore sequencing device for multimodal and rapid molecular diagnostics of cancer. Low-pass whole genome sequencing was used to simultaneously generate copy number (CN) and methylation profiles from native tumor DNA in the same sequencing run. Single nucleotide variants in IDH1, IDH2, TP53, H3F3A, and the TERT promoter region were identified using deep amplicon sequencing. Nanopore sequencing yielded ~0.1X genome coverage within 6 h and resulting CN and epigenetic profiles correlated well with matched microarray data. Diagnostically relevant alterations, such as 1p/19q codeletion, and focal amplifications could be recapitulated. Using ad hoc random forests, we could perform supervised pan-cancer classification to distinguish gliomas, medulloblastomas, and brain metastases of different primary sites. Single nucleotide variants in IDH1, IDH2, and H3F3A were identified using deep amplicon sequencing within minutes of sequencing. Detection of TP53 and TERT promoter mutations shows that sequencing of entire genes and GC-rich regions is feasible. Nanopore sequencing allows same-day detection of structural variants, point mutations, and methylation profiling using a single device with negligible capital cost. 
It outperforms hybridization-based and current sequencing technologies with respect to time to diagnosis and required laboratory equipment and expertise, aiming to make precision medicine possible for every cancer patient, even in resource-restricted settings.

  1. In Situ Guided Wave Structural Health Monitoring System

    NASA Technical Reports Server (NTRS)

    Zhao, George; Tittmann, Bernhard R.

    2011-01-01

    Aircraft engine rotating equipment operates at high temperatures and stresses. Noninvasive inspection of microcracks in those components poses a challenge for nondestructive evaluation. A low-cost, low-profile, high-temperature ultrasonic guided wave sensor was developed that detects cracks in situ. The transducer design provides nondestructive evaluation of structures and materials. A key feature of the sensor is that it withstands high temperatures and excites strong surface wave energy to inspect surface and subsurface cracks. The sol-gel bismuth titanate-based surface acoustic wave (SAW) sensor can generate efficient SAWs for crack inspection. The sensor is very thin (submillimeter) and can generate surface waves up to 540 °C. Finite element analysis of the SAW transducer design was performed to predict the sensor behavior, and experimental studies confirmed the results. The sensor can be implemented on structures of various shapes. With a spray-coating process, the sensor can be applied to the surface of large curvatures. It has minimal effect on airflow or rotating equipment imbalance, and provides good sensitivity.

  2. Compound surface-plasmon-polariton waves guided by a thin metal layer sandwiched between a homogeneous isotropic dielectric material and a structurally chiral material

    NASA Astrophysics Data System (ADS)

    Chiadini, Francesco; Fiumara, Vincenzo; Scaglione, Antonio; Lakhtakia, Akhlesh

    2016-03-01

    Multiple compound surface plasmon-polariton (SPP) waves can be guided by a structure consisting of a sufficiently thick layer of metal sandwiched between a homogeneous isotropic dielectric (HID) material and a dielectric structurally chiral material (SCM). The compound SPP waves are strongly bound to both metal/dielectric interfaces when the thickness of the metal layer is comparable to the skin depth but just to one of the two interfaces when the thickness is much larger. The compound SPP waves differ in phase speed, attenuation rate, and field profile, even though all are excitable at the same frequency. Some compound SPP waves are not greatly affected by the choice of the direction of propagation in the transverse plane but others are, depending on metal thickness. For fixed metal thickness, the number of compound SPP waves depends on the relative permittivity of the HID material, which can be useful for sensing applications.

  3. Coevolutionary modeling of protein sequences: Predicting structure, function, and mutational landscapes

    NASA Astrophysics Data System (ADS)

    Weigt, Martin

    Over recent years, biological research has been revolutionized by experimental high-throughput techniques, in particular by next-generation sequencing technology. Unprecedented amounts of data are accumulating, and there is a growing demand for computational methods that unveil the information hidden in raw data, thereby increasing our understanding of complex biological systems. Statistical-physics models based on the maximum-entropy principle have, in the last few years, played an important role in this context. To give a specific example, proteins and many non-coding RNA show a remarkable degree of structural and functional conservation in the course of evolution, despite a large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach, called Direct-Coupling Analysis, to link this sequence variability (easy to observe in sequence alignments, which are available in public sequence databases) to bio-molecular structure and function. In my presentation I will show how this methodology can be used (i) to infer contacts between residues and thus to guide tertiary and quaternary protein structure prediction and RNA structure prediction, (ii) to discriminate interacting from non-interacting protein families, and thus to infer conserved protein-protein interaction networks, and (iii) to reconstruct mutational landscapes and thus to predict the phenotypic effect of mutations. References [1] M. Figliuzzi, H. Jacquier, A. Schug, O. Tenaillon and M. Weigt, "Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1", Mol. Biol. Evol. (2015), doi: 10.1093/molbev/msv211 [2] E. De Leonardis, B. Lutz, S. Ratz, S. Cocco, R. Monasson, A. Schug, M. Weigt, "Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction", Nucleic Acids Research (2015), doi: 10.1093/nar/gkv932 [3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. Marks, C. Sander, R. Zecchina, J.N. Onuchic, T. Hwa, M. Weigt, "Direct-coupling analysis of residue co-evolution captures native contacts across many protein families", Proc. Natl. Acad. Sci. 108, E1293-E1301 (2011).
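
    The abstract describes linking covariation between alignment columns to residue contacts. Direct-Coupling Analysis itself fits a global maximum-entropy model and is not reproduced here. As a hedged illustration of the raw signal it starts from, the sketch below computes plain mutual information between two columns of a toy alignment, a simpler local measure that cannot separate direct from indirect correlations. The function name and the toy alignments are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    `alignment` is a list of equal-length sequences. This is a local
    covariation score, not Direct-Coupling Analysis.
    """
    n = len(alignment)
    if n == 0:
        raise ValueError("alignment must contain at least one sequence")
    freq_i = Counter(seq[i] for seq in alignment)
    freq_j = Counter(seq[j] for seq in alignment)
    freq_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Toy alignments (illustrative only).
covarying = ["AC", "AC", "GT", "GT"]    # column 1 is fully determined by column 0
independent = ["AC", "AT", "GC", "GT"]  # every pair of symbols occurs once
mi_covarying = column_mutual_information(covarying, 0, 1)
mi_independent = column_mutual_information(independent, 0, 1)
```

    In the toy data the perfectly covarying pair scores ln 2 and the independent pair scores zero. Real alignments would also need sequence reweighting and pseudocounts, which this sketch omits.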

  4. Probing the hammerhead ribozyme structure with ribonucleases.

    PubMed Central

    Hodgson, R A; Shirley, N J; Symons, R H

    1994-01-01

    Susceptibility to RNase digestion has been used to probe the conformation of the hammerhead ribozyme structure prepared from chemically synthesised RNAs. Less than about 1.5% of the total sample was digested to obtain a profile of RNase digestion sites. The observed digestion profiles confirmed the predicted base-paired secondary structure for the hammerhead. Digestion profiles of both cis and trans hammerhead structures were nearly identical which indicated that the structural interactions leading to self-cleavage were similar for both systems. Furthermore, the presence or absence of Mg2+ did not affect the RNase digestion profiles, thus indicating that Mg2+ did not modify the hammerhead structure significantly to induce self-cleavage. The base-paired stems I and II in the hammerhead structure were stable whereas stem III, which was susceptible to digestion, appeared to be an unstable region. The single strand domains separating the stems were susceptible to digestion with the exception of sites adjacent to guanosines; GL2.1 in the stem II loop and G12 in the conserved GAAAC sequence, which separates stems II and III. The absence of digestion at GL2.1 in the stem II hairpin loop of the hammerhead complex was maintained in uncomplexed ribozyme and in short oligonucleotides containing only the stem II hairpin region. In contrast, the G12 site became susceptible when the ribozyme was not complexed with its substrate. Overall the results are consistent with the role of Mg2+ in the hammerhead self-cleavage reaction being catalytic and not structural. PMID:8202361

  5. iFeature: a python package and web server for features extraction and selection from protein and peptide sequences.

    PubMed

    Chen, Zhen; Zhao, Pei; Li, Fuyi; Leier, André; Marquez-Lago, Tatiana T; Wang, Yanan; Webb, Geoffrey I; Smith, A Ian; Daly, Roger J; Chou, Kuo-Chen; Song, Jiangning

    2018-03-08

    Structural and physiochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding schemes that encompass 53 different types of feature descriptors. It also allows users to extract specific amino acid properties from the AAindex database. Furthermore, iFeature integrates 12 different types of commonly used feature clustering, selection, and dimensionality reduction algorithms, greatly facilitating training, analysis, and benchmarking of machine-learning models. The functionality of iFeature is made freely available via an online web server and a stand-alone toolkit: http://iFeature.erc.monash.edu/; https://github.com/Superzchen/iFeature/. Contact: jiangning.song@monash.edu; kcchou@gordonlifescience.org; roger.daly@monash.edu. Supplementary data are available at Bioinformatics online.
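
    To make the idea of a numerical sequence descriptor concrete, the following stand-alone sketch computes amino acid composition, one of the simplest encodings of this kind: a 20-dimensional vector of residue frequencies. It does not call iFeature, and the function name and example peptide are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict[str, float]:
    """Return the frequency of each standard amino acid in `sequence`.

    Non-standard symbols (for example X or gap characters) are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical four-residue peptide (illustrative only).
composition = amino_acid_composition("AACD")
```

    Because the vector has a fixed length regardless of sequence length, it can be fed directly to a classifier. The cost is that all positional information is discarded, which is why richer encodings exist.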

  6. Construction of a large collection of small genome variations in French dairy and beef breeds using whole-genome sequences.

    PubMed

    Boussaha, Mekki; Michot, Pauline; Letaief, Rabia; Hozé, Chris; Fritz, Sébastien; Grohs, Cécile; Esquerré, Diane; Duchesne, Amandine; Philippe, Romain; Blanquet, Véronique; Phocas, Florence; Floriot, Sandrine; Rocha, Dominique; Klopp, Christophe; Capitan, Aurélien; Boichard, Didier

    2016-11-15

    In recent years, several bovine genome sequencing projects were carried out with the aim of developing genomic tools to improve dairy and beef production efficiency and sustainability. In this study, we describe the first French cattle genome variation dataset obtained by sequencing 274 whole genomes representing several major dairy and beef breeds. This dataset contains over 28 million single nucleotide polymorphisms (SNPs) and small insertions and deletions. Comparisons between sequencing results and SNP array genotypes revealed a very high genotype concordance rate, which indicates the good quality of our data. To our knowledge, this is the first large-scale catalog of small genomic variations in French dairy and beef cattle. This resource will contribute to the study of gene functions and population structure and also help to improve traits through genotype-guided selection.

  7. Inhibition of herpes simplex virus 1 gene expression and replication by RNase P-associated external guide sequences.

    PubMed

    Liu, Jin; Shao, Luyao; Trang, Phong; Yang, Zhu; Reeves, Michael; Sun, Xu; Vu, Gia-Phong; Wang, Yu; Li, Hongjian; Zheng, Congyi; Lu, Sangwei; Liu, Fenyong

    2016-06-09

    An external guide sequence (EGS) is an RNA sequence that can interact with a target mRNA to form a tertiary structure like a pre-tRNA and recruit intracellular ribonuclease P (RNase P), a tRNA processing enzyme, to degrade the target mRNA. Previously, we used an in vitro selection procedure to engineer new EGSs that are more robust in inducing human RNase P to cleave their targeted mRNAs. In this study, we constructed EGSs from a variant to target the mRNA encoding herpes simplex virus 1 (HSV-1) major transcription regulator ICP4, which is essential for the expression of viral early and late genes and viral growth. The EGS variant induced human RNase P cleavage of ICP4 mRNA sequence 60 times better than the EGS generated from a natural pre-tRNA. A decrease of about 97% and 75% in the level of ICP4 gene expression and an inhibition of about 7,000- and 500-fold in viral growth were observed in HSV infected cells expressing the variant and the pre-tRNA-derived EGS, respectively. This study shows that engineered EGSs can inhibit HSV-1 gene expression and viral growth. Furthermore, these results demonstrate the potential for engineered EGS RNAs to be developed and used as anti-HSV therapeutics.

  8. Inhibition of herpes simplex virus 1 gene expression and replication by RNase P-associated external guide sequences

    PubMed Central

    Liu, Jin; Shao, Luyao; Trang, Phong; Yang, Zhu; Reeves, Michael; Sun, Xu; Vu, Gia-Phong; Wang, Yu; Li, Hongjian; Zheng, Congyi; Lu, Sangwei; Liu, Fenyong

    2016-01-01

    An external guide sequence (EGS) is an RNA sequence that can interact with a target mRNA to form a tertiary structure like a pre-tRNA and recruit intracellular ribonuclease P (RNase P), a tRNA processing enzyme, to degrade the target mRNA. Previously, we used an in vitro selection procedure to engineer new EGSs that are more robust in inducing human RNase P to cleave their targeted mRNAs. In this study, we constructed EGSs from a variant to target the mRNA encoding herpes simplex virus 1 (HSV-1) major transcription regulator ICP4, which is essential for the expression of viral early and late genes and viral growth. The EGS variant induced human RNase P cleavage of ICP4 mRNA sequence 60 times better than the EGS generated from a natural pre-tRNA. A decrease of about 97% and 75% in the level of ICP4 gene expression and an inhibition of about 7,000- and 500-fold in viral growth were observed in HSV infected cells expressing the variant and the pre-tRNA-derived EGS, respectively. This study shows that engineered EGSs can inhibit HSV-1 gene expression and viral growth. Furthermore, these results demonstrate the potential for engineered EGS RNAs to be developed and used as anti-HSV therapeutics. PMID:27279482

  9. Health Grades K-6. Skills-Based Scope and Sequence Guide. Target Skills and Sample Assessment Methods.

    ERIC Educational Resources Information Center

    Williamson, Anne; Beegle, Jenny; Gilbert, Lisa; Safaii, SeAnne; Eck, Paul; Remaley, Renea; Hasselquist, Claudia; Hatch, Kathy C.; Thompson, Kay

    This guide is organized around a suggested list of health skills that all students should know and be able to do at each grade level from kindergarten through grade 6. The guide will help provide parents, teachers, and students with knowledge of what is being taught in a logical scope and sequence by grade level. It is designed to help build a…

  10. Comprehensive Profiling of the Androgen Receptor in Liquid Biopsies from Castration-resistant Prostate Cancer Reveals Novel Intra-AR Structural Variation and Splice Variant Expression Patterns.

    PubMed

    De Laere, Bram; van Dam, Pieter-Jan; Whitington, Tom; Mayrhofer, Markus; Diaz, Emanuela Henao; Van den Eynden, Gert; Vandebroek, Jean; Del-Favero, Jurgen; Van Laere, Steven; Dirix, Luc; Grönberg, Henrik; Lindberg, Johan

    2017-08-01

    Expression of the androgen receptor splice variant 7 (AR-V7) is associated with poor response to second-line endocrine therapy in castration-resistant prostate cancer (CRPC). However, a large fraction of nonresponding patients are AR-V7-negative. To investigate if a comprehensive liquid biopsy-based AR profile may improve patient stratification in the context of second-line endocrine therapy. Peripheral blood was collected from patients with CRPC (n=30) before initiation of a new line of systemic therapy. We performed profiling of circulating tumour DNA via low-pass whole-genome sequencing and targeted sequencing of the entire AR gene, including introns. Targeted RNA sequencing was performed on enriched circulating tumour cell fractions to assess the expression levels of seven AR splice variants (ARVs). Somatic AR variations, including copy-number alterations, structural variations, and point mutations, were combined with ARV expression patterns and correlated to clinicopathologic parameters. Collectively, any AR perturbation, including ARV, was detected in 25/30 patients. Surprisingly, intra-AR structural variation was present in 15/30 patients, of whom 14 expressed ARVs. The majority of ARV-positive patients expressed multiple ARVs, with AR-V3 the most abundantly expressed. The presence of any ARV was associated with progression-free survival after second-line endocrine treatment (hazard ratio 4.53, 95% confidence interval 1.424-14.41; p=0.0105). Six out of 17 poor responders were AR-V7-negative, but four carried other AR perturbations. Comprehensive AR profiling, which is feasible using liquid biopsies, is necessary to increase our understanding of the mechanisms underpinning resistance to endocrine treatment. Alterations in the androgen receptor are associated with endocrine treatment outcomes. This study demonstrates that it is possible to identify different types of alterations via simple blood draws. 
Follow-up studies are needed to determine the effect of such alterations on hormonal therapy. Copyright © 2017 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  11. Entropic Profiler – detection of conservation in genomes using information theory

    PubMed Central

    Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana

    2009-01-01

    Background: In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighed relative abundance of motifs for each position in genomes. Their study is very relevant because under- or over-represented segments are often associated with significant biological meaning. Findings: The Entropic Profiler application presented here is a new tool designed to detect and extract under- and over-represented DNA segments in genomes by using EP. It computes EP very efficiently by using improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows users to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples, or a previously saved session. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion: EP are directly related to the statistical significance of motifs and can be considered as a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
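
    The abstract mentions the Chaos Game Representation (CGR) of a sequence. The sketch below shows the basic CGR iteration only: each nucleotide moves the current point halfway toward that nucleotide's corner of the unit square. It is not the Entropic Profiler's code. The corner assignment is one common convention and is an assumption here, as are the function name and the example sequence.

```python
# Assumed corner convention: A bottom-left, C top-left, G top-right, T bottom-right.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game_representation(sequence: str) -> list[tuple[float, float]]:
    """Return one (x, y) point per unambiguous nucleotide in `sequence`."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # simplification: ambiguous symbols such as N are skipped
        corner_x, corner_y = CORNERS[base]
        # Move halfway from the current point toward the nucleotide's corner.
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        points.append((x, y))
    return points


# Hypothetical two-base example (illustrative only).
points = chaos_game_representation("AG")
```

    Each point encodes the whole preceding suffix of the sequence, so sequences sharing a long suffix land in the same small sub-square. This property is what ties point density in the CGR plane to motif abundance.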

  12. Effets non lineaires transversaux dans les guides d'ondes plans (Transverse nonlinear effects in planar waveguides)

    NASA Astrophysics Data System (ADS)

    Dumais, Patrick

    Transverse nonlinear effects due to the nonresonant optical Kerr effect are studied in two types of waveguides with planar geometry. First (Chapter 2), the emission of spatial solitons from a channel waveguide is studied historically, analytically, and numerically with the aim of designing and fabricating such a device in AlGaAs, in the spectral region below half the band gap of this material, that is, around 1.5 microns. The device, as designed, incorporates a multiple-quantum-well structure. Local disordering of this structure allows a local variation of the Kerr coefficient in the waveguide, which leads to the emission of a spatial soliton above a threshold optical power. The experimental observation of an intensity-dependent change in the field profile at the output of the fabricated waveguide is presented. Second (Chapter 3), a technique for measuring the Kerr coefficient in a planar waveguide is presented. This technique consists of measuring the change in transmission through a mask placed at the output of the waveguide as a function of the peak intensity at the input of the planar waveguide. A method for determining the optimal conditions for measurement sensitivity is presented and illustrated with several examples. Finally, the realization of an optical parametric oscillator based on a periodically poled lithium niobate crystal is presented. The theory of optical parametric oscillators is laid out with an emphasis on the generation of intense pulses at wavelengths around 1.5 microns from a Ti:sapphire laser, with the aim of obtaining a source for the soliton-emission experiments.

  13. Transcriptome Profiling of Bovine Milk Oligosaccharide Metabolism Genes Using RNA-Sequencing

    PubMed Central

    Wickramasinghe, Saumya; Hua, Serenus; Rincon, Gonzalo; Islas-Trejo, Alma; German, J. Bruce; Lebrilla, Carlito B.; Medrano, Juan F.

    2011-01-01

    This study examines the genes coding for enzymes involved in bovine milk oligosaccharide metabolism by comparing the oligosaccharide profiles with the expressions of glycosylation-related genes. Fresh milk samples (n = 32) were collected from four Holstein and Jersey cows at days 1, 15, 90 and 250 of lactation and free milk oligosaccharide profiles were analyzed. RNA was extracted from milk somatic cells at days 15 and 250 of lactation (n = 12) and gene expression analysis was conducted by RNA-Sequencing. A list was created of 121 glycosylation-related genes involved in oligosaccharide metabolism pathways in bovine by analyzing the oligosaccharide profiles and performing an extensive literature search. No significant differences were observed in either oligosaccharide profiles or expressions of glycosylation-related genes between Holstein and Jersey cows. The highest concentrations of free oligosaccharides were observed in the colostrum samples and a sharp decrease was observed in the concentration of free oligosaccharides on day 15, followed by progressive decrease on days 90 and 250. Ninety-two glycosylation-related genes were expressed in milk somatic cells. Most of these genes exhibited higher expression in day 250 samples indicating increases in net glycosylation-related metabolism in spite of decreases in free milk oligosaccharides in late lactation milk. Even though fucosylated free oligosaccharides were not identified, gene expression indicated the likely presence of fucosylated oligosaccharides in bovine milk. Fucosidase genes were expressed in milk and a possible explanation for not detecting fucosylated free oligosaccharides is the degradation of large fucosylated free oligosaccharides by the fucosidases. Detailed characterization of enzymes encoded by the 92 glycosylation-related genes identified in this study will provide the basic knowledge for metabolic network analysis of oligosaccharides in mammalian milk. 
    These candidate genes will guide the design of a targeted breeding strategy to optimize the content of beneficial oligosaccharides in bovine milk. PMID:21541029

  14. ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos.

    PubMed

    Roca, Alberto I

    2014-01-01

    The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.
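
    The matrix that a ProfileGrid color-codes can be illustrated with a short sketch. The Python below is not the JProfileGrid implementation; it only tabulates, for a hypothetical toy alignment, the frequency of each residue at each aligned position. The function name and the example sequences are assumptions made for illustration.

```python
from collections import Counter


def residue_frequency_grid(alignment):
    """Return {residue: [frequency at column 0, column 1, ...]} for an alignment."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_sequences = len(alignment)
    # One Counter per alignment column; a missing residue counts as 0.
    columns = [Counter(seq[i] for seq in alignment) for i in range(width)]
    residues = sorted({residue for column in columns for residue in column})
    return {r: [column[r] / n_sequences for column in columns] for r in residues}


# Hypothetical toy alignment ('-' marks a gap); not data from the BioVis contest.
toy_alignment = ["MKV-A", "MKI-A", "MRVGA", "MKVGS"]
grid = residue_frequency_grid(toy_alignment)
for residue, freqs in grid.items():
    print(residue, " ".join(f"{f:.2f}" for f in freqs))
```

    Each row of the printed table corresponds to one residue and each column to one alignment position, which is the layout a color scale would then be applied to.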

  15. Simultaneous excitation system for efficient guided wave structural health monitoring

    NASA Astrophysics Data System (ADS)

    Hua, Jiadong; Michaels, Jennifer E.; Chen, Xin; Lin, Jing

    2017-10-01

    Many structural health monitoring systems utilize guided wave transducer arrays for defect detection and localization. Signals are usually acquired using the "pitch-catch" method whereby each transducer is excited in turn and the response is received by the remaining transducers. When extensive signal averaging is performed, the data acquisition process can be quite time-consuming, especially for metallic components that require a low repetition rate to allow signals to die out. Such a long data acquisition time is particularly problematic if environmental and operational conditions are changing while data are being acquired. To reduce the total data acquisition time, proposed here is a methodology whereby multiple transmitters are simultaneously triggered, and each transmitter is driven with a unique excitation. The simultaneously transmitted waves are captured by one or more receivers, and their responses are processed by dispersion-compensated filtering to extract the response from each individual transmitter. The excitation sequences are constructed by concatenating a series of chirps whose start and stop frequencies are randomly selected from a specified range. The process is optimized using a Monte-Carlo approach to select sequences with impulse-like autocorrelations and relatively flat cross-correlations. The efficacy of the proposed methodology is evaluated by several metrics and is experimentally demonstrated with sparse array imaging of simulated damage.
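
    The sequence construction described above can be sketched roughly as follows. This is a simplified illustration, not the authors' procedure: it scores candidates only by their autocorrelation sidelobes (the abstract also mentions cross-correlations between transmitters), and every parameter value, function name, and the scoring metric itself are assumptions.

```python
import math
import random


def linear_chirp(f_start, f_stop, n_samples, sample_rate):
    """Sample a linear chirp sweeping from f_start to f_stop (Hz)."""
    duration = n_samples / sample_rate
    sweep_rate = (f_stop - f_start) / duration  # Hz per second
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        phase = 2.0 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
    return samples


def random_chirp_sequence(rng, n_chirps, f_min, f_max, n_samples, sample_rate):
    """Concatenate chirps whose start/stop frequencies are drawn uniformly."""
    sequence = []
    for _ in range(n_chirps):
        f_start = rng.uniform(f_min, f_max)
        f_stop = rng.uniform(f_min, f_max)
        sequence.extend(linear_chirp(f_start, f_stop, n_samples, sample_rate))
    return sequence


def peak_sidelobe_ratio(x):
    """Largest off-peak autocorrelation magnitude divided by the zero-lag value."""
    energy = sum(v * v for v in x)
    worst = 0.0
    for lag in range(1, len(x)):
        r = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        worst = max(worst, abs(r))
    return worst / energy


# Hypothetical parameters chosen only for illustration.
rng = random.Random(0)
N_CHIRPS, N_SAMPLES, SAMPLE_RATE = 4, 50, 5.0e6
F_MIN, F_MAX = 100e3, 500e3

# Monte Carlo step: draw candidates and keep the most impulse-like one.
candidates = [
    random_chirp_sequence(rng, N_CHIRPS, F_MIN, F_MAX, N_SAMPLES, SAMPLE_RATE)
    for _ in range(20)
]
scores = [peak_sidelobe_ratio(c) for c in candidates]
best_score = min(scores)
best_sequence = candidates[scores.index(best_score)]
print(f"best peak sidelobe ratio: {best_score:.3f}")
```

    A lower ratio means the autocorrelation is closer to a single spike; a fuller selection criterion would also penalize cross-correlation between the sequences assigned to different transmitters.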

  16. Analysis of Bacterial Community Structure in Sulfurous-Oil-Containing Soils and Detection of Species Carrying Dibenzothiophene Desulfurization (dsz) Genes

    PubMed Central

    Duarte, Gabriela Frois; Rosado, Alexandre Soares; Seldin, Lucy; de Araujo, Welington; van Elsas, Jan Dirk

    2001-01-01

    The selective effects of sulfur-containing hydrocarbons, with respect to changes in bacterial community structure and selection of desulfurizing organisms and genes, were studied in soil. Samples taken from a polluted field soil (A) along a concentration gradient of sulfurous oil and from soil microcosms treated with dibenzothiophene (DBT)-containing petroleum (FSL soil) were analyzed. Analyses included plate counts of total bacteria and of DBT utilizers, molecular community profiling via soil DNA-based PCR-denaturing gradient gel electrophoresis (PCR-DGGE), and detection of genes that encode enzymes involved in the desulfurization of hydrocarbons, i.e., dszA, dszB, and dszC. Data obtained from the A soil showed no discriminating effects of oil levels on the culturable bacterial numbers on either medium used. Generally, counts of DBT degraders were 10- to 100-fold lower than the total culturable counts. However, PCR-DGGE showed that the numbers of bands detected in the molecular community profiles decreased with increasing oil content of the soil. Analysis of the sequences of three prominent bands of the profiles generated with the highly polluted soil samples suggested that the underlying organisms were related to Actinomyces sp., Arthrobacter sp., and a bacterium of uncertain affiliation. dszA, dszB, and dszC genes were present in all A soil samples, whereas a range of unpolluted soils gave negative results in this analysis. Results from the study of FSL soil revealed minor effects of the petroleum-DBT treatment on culturable bacterial numbers and clear effects on the DBT-utilizing communities. The molecular community profiles were largely stable over time in the untreated soil, whereas they showed a progressive change over time following treatment with DBT-containing petroleum.
    Direct PCR assessment revealed the presence of dszB-related signals in the untreated FSL soil and the apparent selection of dszA- and dszC-related sequences by the petroleum-DBT treatment. PCR-DGGE applied to sequential enrichment cultures in DBT-containing sulfur-free basal salts medium prepared from the A and treated FSL soils revealed the selection of up to 10 distinct bands. Sequencing a subset of these bands provided evidence for the presence of organisms related to Pseudomonas putida, a Pseudomonas sp., Stenotrophomonas maltophilia, and Rhodococcus erythropolis. Several of 52 colonies obtained from the A and FSL soils on agar plates with DBT as the sole sulfur source produced bands that matched the migration of bands selected in the enrichment cultures. Evidence for the presence of dszB in 12 strains was obtained, whereas dszA and dszC genes were found in only 7 and 6 strains, respectively. Most of the strains carrying dszA or dszC were classified as R. erythropolis related, and all revealed the capacity to desulfurize DBT. A comparison of 37 dszA sequences, obtained via PCR from the A and FSL soils, from enrichments of these soils, and from isolates, revealed the great similarity of all sequences to the canonical (R. erythropolis strain IGTS8) dszA sequence and a large degree of internal conservation. The 37 sequences recovered were grouped in three clusters. One group, consisting of 30 sequences, was minimally 98% related to the IGTS8 sequence, a second group of 2 sequences was slightly different, and a third group of 5 sequences was 95% similar. The first two groups contained sequences obtained from both soil types and enrichment cultures (including isolates), but the last consisted of sequences obtained directly from the polluted A soil. PMID:11229891

  17. Relationships between residue Voronoi volume and sequence conservation in proteins.

    PubMed

    Liu, Jen-Wei; Cheng, Chih-Wen; Lin, Yu-Feng; Chen, Shao-Yu; Hwang, Jenn-Kang; Yen, Shih-Chung

    2018-02-01

    Functional and biophysical constraints can cause different levels of sequence conservation in proteins. Previously, structural properties, e.g., relative solvent accessibility (RSA) and packing density of the weighted contact number (WCN), have been found to be related to protein sequence conservation (CS). The Voronoi volume has recently been recognized as a new structural property of the local protein structural environment reflecting CS. However, for surface residues, it is sensitive to water molecules surrounding the protein structure. Herein, we present a simple structural determinant termed the relative space of Voronoi volume (RSV); it uses the Voronoi volume and the van der Waals volume of particular residues to quantify the local structural environment. RSV (range, 0-1) is defined as (Voronoi volume - van der Waals volume) / Voronoi volume of the target residue. The concept of RSV describes the extent of available space for every protein residue. RSV and Voronoi profiles with and without water molecules (RSVw, RSV, VOw, and VO) were compared for 554 non-homologous proteins. RSV (without water) showed better Pearson's correlations with CS than did RSVw, VO, or VOw values. The mean correlation coefficient between RSV and CS was 0.51, which is comparable to the correlation between RSA and CS (0.49) and that between WCN and CS (0.56). RSV is a robust structural descriptor with and without water molecules and can quantitatively reflect evolutionary information in a single protein structure. Therefore, it may represent a practical structural determinant to study protein sequence, structure, and function relationships. Copyright © 2017 Elsevier B.V. All rights reserved.
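
    The RSV definition given in the abstract translates directly into a one-line calculation. The sketch below assumes both volumes are already available in the same units; the function name and the numeric values are hypothetical and are not taken from the study.

```python
def relative_space_of_voronoi_volume(voronoi_volume, vdw_volume):
    """RSV = (Voronoi volume - van der Waals volume) / Voronoi volume."""
    if voronoi_volume <= 0:
        raise ValueError("Voronoi volume must be positive")
    if not 0 <= vdw_volume <= voronoi_volume:
        # Enforces the 0-1 range stated in the abstract.
        raise ValueError("van der Waals volume must lie between 0 and the Voronoi volume")
    return (voronoi_volume - vdw_volume) / voronoi_volume


# Hypothetical residue: Voronoi cell of 160 cubic angstroms, van der Waals volume of 120.
rsv = relative_space_of_voronoi_volume(160.0, 120.0)
print(f"RSV = {rsv:.2f}")
```

    An RSV near 0 corresponds to a tightly packed residue and a value near 1 to a residue with a large share of free space in its Voronoi cell.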

  18. Sequencing Stories in Spanish and English.

    ERIC Educational Resources Information Center

    Steckbeck, Pamela Meza

    The guide was designed for speech pathologists, bilingual teachers, and specialists in English as a second language who work with Spanish-speaking children. The guide contains twenty illustrated stories that facilitate the learning of auditory sequencing, auditory and visual memory, receptive and expressive vocabulary, and expressive language…

  19. Acute multi-sgRNA knockdown of KEOPS complex genes reproduces the microcephaly phenotype of the stable knockout zebrafish model

    PubMed Central

    Schneider, Ronen; Hoogstraten, Charlotte A.; Schapiro, David; Majmundar, Amar J.; Kolb, Amy; Eddy, Kaitlyn; Shril, Shirlee; Braun, Daniela A.; Poduri, Annapurna

    2018-01-01

    Until recently, morpholino oligonucleotides have been widely employed in zebrafish as an acute and efficient loss-of-function assay. However, off-target effects and reproducibility issues when compared to stable knockout lines have compromised their further use. Here we employed an acute CRISPR/Cas approach using multiple single guide RNAs targeting simultaneously different positions in two exemplar genes (osgep or tprkb) to increase the likelihood of generating mutations on both alleles in the injected F0 generation and to achieve a similar effect as morpholinos but with the reproducibility of stable lines. This multi single guide RNA approach resulted in median likelihoods for at least one mutation on each allele of >99% and sgRNA specific insertion/deletion profiles as revealed by deep-sequencing. Immunoblot showed a significant reduction for Osgep and Tprkb proteins. For both genes, the acute multi-sgRNA knockout recapitulated the microcephaly phenotype and reduction in survival that we observed previously in stable knockout lines, though milder in the acute multi-sgRNA knockout. Finally, we quantify the degree of mutagenesis by deep sequencing, and provide a mathematical model to quantitate the chance for a biallelic loss-of-function mutation. Our findings can be generalized to acute and stable CRISPR/Cas targeting for any zebrafish gene of interest. PMID:29346415
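
    The intuition behind using several guides per gene can be shown with a small probability sketch. This is not the mathematical model from the paper: it assumes each sgRNA mutates a given allele independently with a fixed per-guide efficiency, and that the two alleles are independent of each other. The efficiencies and function names are hypothetical.

```python
from math import prod


def p_allele_mutated(efficiencies):
    """Probability that one allele carries at least one mutation."""
    return 1.0 - prod(1.0 - p for p in efficiencies)


def p_both_alleles_mutated(efficiencies):
    """Probability that both alleles carry at least one mutation each."""
    return p_allele_mutated(efficiencies) ** 2


# Hypothetical per-guide mutation efficiencies for one to four sgRNAs.
per_guide = [0.5, 0.5, 0.5, 0.5]
for n_guides in range(1, len(per_guide) + 1):
    p = p_both_alleles_mutated(per_guide[:n_guides])
    print(f"{n_guides} sgRNA(s): P(both alleles mutated) = {p:.4f}")
```

    Note that a mutation on each allele is not the same as a biallelic loss of function; a fuller model would also weight each mutation by its chance of disrupting the protein.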

  20. Predicting the Effect of Mutations on Protein-Protein Binding Interactions through Structure-Based Interface Profiles

    PubMed Central

    Brender, Jeffrey R.; Zhang, Yang

    2015-01-01

    The formation of protein-protein complexes is essential for proteins to perform their physiological functions in the cell. Mutations that prevent the proper formation of the correct complexes can have serious consequences for the associated cellular processes. Since experimental determination of protein-protein binding affinity remains difficult when performed on a large scale, computational methods for predicting the consequences of mutations on binding affinity are highly desirable. We show that a scoring function based on interface structure profiles collected from analogous protein-protein interactions in the PDB is a powerful predictor of protein binding affinity changes upon mutation. As a standalone feature, the difference between the interface profile scores of the mutant and wild-type proteins has an accuracy equivalent to the best all-atom potentials, despite being two orders of magnitude faster once the profile has been constructed. Due to its unique sensitivity in collecting the evolutionary profiles of analogous binding interactions and the high speed of calculation, the interface profile score has additional advantages as a complementary feature to combine with physics-based potentials for improving the accuracy of composite scoring approaches. By incorporating the sequence-derived and residue-level coarse-grained potentials with the interface structure profile score, a composite model was constructed through random forest training, which generates a Pearson correlation coefficient >0.8 between the predicted and observed binding free-energy changes upon mutation. This accuracy is comparable to, or outperforms in most cases, the current best methods, but does not require high-resolution full-atomic models of the mutant structures. The binding interface profiling approach should find useful application in human-disease mutation recognition and protein interface design studies. PMID:26506533

  1. Application of the Ramanujan Fourier Transform for the analysis of secondary structure content in amino acid sequences.

    PubMed

    Mainardi, L T; Pattini, L; Cerutti, S

    2007-01-01

    A novel method is presented for the investigation of protein properties of sequences using Ramanujan Fourier Transform (RFT). The new methodology involves the preprocessing of protein sequence data by numerically encoding it and then applying the RFT. The RFT is based on projecting the obtained numerical series on a set of basis functions constituted by Ramanujan sums (RS). In RS components, periodicities of finite integer length, rather than frequency, (as in classical harmonic analysis) are considered. The potential of the new approach is documented by a few examples in the analysis of hydrophobic profiles of proteins in two classes including abundance of alpha-helices (group A) or beta-strands (group B). Different patterns are provided as evidence. RFT can be used to characterize the structural properties of proteins and integrate complementary information provided by other signal processing transforms.
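
    The basis functions mentioned above, Ramanujan sums, can be computed in a few lines. The sketch below evaluates c_q(n) from its cosine form and then forms a finite-length estimate of a Ramanujan-Fourier coefficient as the mean of x(n) c_q(n) divided by phi(q). The normalization, the zero-based indexing, and the toy signal are assumptions for illustration and may differ from the conventions used in the paper.

```python
from math import cos, gcd, pi


def ramanujan_sum(q, n):
    """c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1."""
    total = sum(cos(2.0 * pi * k * n / q) for k in range(1, q + 1) if gcd(k, q) == 1)
    return round(total)  # Ramanujan sums are integers; round removes float error


def rft_coefficient(signal, q):
    """Finite-length estimate: mean of x(n) * c_q(n), divided by phi(q) = c_q(0)."""
    phi_q = ramanujan_sum(q, 0)
    mean = sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal)) / len(signal)
    return mean / phi_q


# Toy numeric series with an exact period of 3 (stands in for an encoded sequence).
signal = [ramanujan_sum(3, n) for n in range(12)]
for q in range(1, 7):
    print(q, round(rft_coefficient(signal, q), 3))
```

    In this toy case the coefficient is large only at q = 3, which illustrates how the transform picks out integer periodicities rather than frequencies.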

  2. Consistent global structures of complex RNA states through multidimensional chemical mapping

    PubMed Central

    Cheng, Clarence Yu; Chou, Fang-Chieh; Kladwang, Wipapat; Tian, Siqi; Cordero, Pablo; Das, Rhiju

    2015-01-01

    Accelerating discoveries of non-coding RNA (ncRNA) in myriad biological processes pose major challenges to structural and functional analysis. Despite progress in secondary structure modeling, high-throughput methods have generally failed to determine ncRNA tertiary structures, even at the 1-nm resolution that enables visualization of how helices and functional motifs are positioned in three dimensions. We report that integrating a new method called MOHCA-seq (Multiplexed •OH Cleavage Analysis with paired-end sequencing) with mutate-and-map secondary structure inference guides Rosetta 3D modeling to consistent 1-nm accuracy for intricately folded ncRNAs with lengths up to 188 nucleotides, including a blind RNA-puzzle challenge, the lariat-capping ribozyme. This multidimensional chemical mapping (MCM) pipeline resolves unexpected tertiary proximities for cyclic-di-GMP, glycine, and adenosylcobalamin riboswitch aptamers without their ligands and a loose structure for the recently discovered human HoxA9D internal ribosome entry site regulon. MCM offers a sequencing-based route to uncovering ncRNA 3D structure, applicable to functionally important but potentially heterogeneous states. DOI: http://dx.doi.org/10.7554/eLife.07600.001 PMID:26035425

  3. Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines

    PubMed Central

    2010-01-01

    Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. 
    Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. Implemented in Matlab and supported on Linux and MS Windows. PMID:21034480

  4. The +vbar breakout during approach to Space Station Freedom

    NASA Technical Reports Server (NTRS)

    Dunham, Scott D.

    1993-01-01

    A set of burn profiles was developed to provide bounding jet firing histories for a +vbar breakout during approaches to Space Station Freedom. The delta-v sequences were designed to place the Orbiter on a safe trajectory under worst case conditions and to try to minimize plume impingement on Space Station Freedom structure.

  5. High-resolution phylogenetic microbial community profiling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singer, Esther; Coleman-Derr, Devin; Bowman, Brett

    2014-03-17

    The representation of bacterial and archaeal genome sequences is strongly biased towards cultivated organisms, which belong to merely four phylogenetic groups. Functional information and inter-phylum level relationships are still largely underexplored for candidate phyla, which are often referred to as microbial dark matter. Furthermore, a large portion of the 16S rRNA gene records in the GenBank database are labeled as environmental samples and unclassified, which is in part due to low read accuracy, potential chimeric sequences produced during PCR amplifications and the low resolution of short amplicons. In order to improve the phylogenetic classification of novel species and advance our knowledge of the ecosystem function of uncultivated microorganisms, high-throughput full length 16S rRNA gene sequencing methodologies with reduced biases are needed. We evaluated the performance of PacBio single-molecule real-time (SMRT) sequencing in high-resolution phylogenetic microbial community profiling. For this purpose, we compared PacBio and Illumina metagenomic shotgun and 16S rRNA gene sequencing of a mock community as well as of an environmental sample from Sakinaw Lake, British Columbia. Sakinaw Lake is known to contain a large percentage of microbial species from candidate phyla. Sequencing results show that community structure based on PacBio shotgun and 16S rRNA gene sequences is highly similar in both the mock and the environmental communities. Resolution power and community representation accuracy from SMRT sequencing data appeared to be independent of GC content of microbial genomes and was higher when compared to Illumina-based metagenome shotgun and 16S rRNA gene (iTag) sequences, e.g. full-length sequencing resolved all 23 OTUs in the mock community, while iTags did not resolve closely related species. SMRT sequencing hence offers various potential benefits when characterizing uncharted microbial communities.

  6. Early experience with formalin-fixed paraffin-embedded (FFPE) based commercial clinical genomic profiling of gliomas-robust and informative with caveats.

    PubMed

    Movassaghi, Masoud; Shabihkhani, Maryam; Hojat, Seyed A; Williams, Ryan R; Chung, Lawrance K; Im, Kyuseok; Lucey, Gregory M; Wei, Bowen; Mareninov, Sergey; Wang, Michael W; Ng, Denise W; Tashjian, Randy S; Magaki, Shino; Perez-Rosendahl, Mari; Yang, Isaac; Khanlou, Negar; Vinters, Harry V; Liau, Linda M; Nghiemphu, Phioanh L; Lai, Albert; Cloughesy, Timothy F; Yong, William H

    2017-08-01

    Commercial targeted genomic profiling with next generation sequencing using formalin-fixed paraffin embedded (FFPE) tissue has recently entered into clinical use for diagnosis and for the guiding of therapy. However, there is limited independent data regarding the accuracy or robustness of commercial genomic profiling in gliomas. As part of patient care, FFPE samples of gliomas from 71 patients were submitted for targeted genomic profiling to one commonly used commercial vendor, Foundation Medicine. Genomic alterations were determined for the following grades or groups of gliomas: Grade I/II, Grade III, primary glioblastomas (GBMs), recurrent primary GBMs, and secondary GBMs. In addition, FFPE samples from the same patients were independently assessed with conventional methods such as immunohistochemistry (IHC), quantitative real-time PCR (qRT-PCR), or fluorescence in situ hybridization (FISH) for three genetic alterations: IDH1 mutations, EGFR amplification, and EGFRvIII expression. A total of 100 altered genes were detected by the aforementioned targeted genomic profiling assay. The number of different genomic alterations was significantly different between the five groups of gliomas and consistent with the literature. CDKN2A/B, TP53, and TERT were the most common genomic alterations seen in primary GBMs, whereas IDH1, TP53, and PIK3CA were the most common in secondary GBMs. Targeted genomic profiling demonstrated 92.3%-100% concordance with conventional methods. The targeted genomic profiling report provided an average of 5.5 drugs, and listed an average of 8.4 clinical trials for the 71 glioma patients studied but only a third of the trials were appropriate for glioma patients. In this limited comparison study, this commercial next generation sequencing-based targeted genomic profiling showed a high concordance rate with conventional methods for the 3 genetic alterations and identified mutations expected for the type of glioma.
    While it may not be feasible to exhaustively independently validate a commercial genomic profiling assay, examination of a few markers provides some reassurance of its robustness. While potential targeted drugs are recommended based on genetic alterations, to date most targeted therapies have failed in glioblastomas so the usefulness of such recommendations will increase with development of novel and efficacious drugs. Copyright © 2017. Published by Elsevier Inc.

  7. The gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis contains a group I intron.

    PubMed Central

    De Wachter, R; Neefs, J M; Goris, A; Van de Peer, Y

    1992-01-01

    The nucleotide sequence of the gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis was determined. It revealed the presence of a group I intron with a length of 411 nucleotides. This is the third occurrence of such an intron discovered in a small subunit rRNA gene encoded by a eukaryotic nuclear genome. The other two occurrences are in Pneumocystis carinii, a fungus of uncertain taxonomic status, and Ankistrodesmus stipitatus, a green alga. The nucleotides of the conserved core structure of 101 group I intron sequences present in different genes and genome types were aligned and their evolutionary relatedness was examined. This revealed a cluster including all group I introns hitherto found in eukaryotic nuclear genes coding for small and large subunit rRNAs. A secondary structure model was designed for the area of the Ustilago maydis small ribosomal subunit RNA precursor where the intron is situated. It shows that the internal guide sequence pairing with the intron boundaries fits between two helices of the small subunit rRNA, and that minimal rearrangement of base pairs suffices to achieve the definitive secondary structure of the 18S rRNA upon splicing. PMID:1561081

  8. Sex-Specific Effects of Organophosphate Diazinon on the Gut Microbiome and Its Metabolic Functions.

    PubMed

    Gao, Bei; Bian, Xiaoming; Mahbub, Ridwan; Lu, Kun

    2017-02-01

    There is growing recognition of the significance of the gut microbiome to human health, and the association between a perturbed gut microbiome with human diseases has been established. Previous studies also show the role of environmental toxicants in perturbing the gut microbiome and its metabolic functions. The wide agricultural use of diazinon, an organophosphate insecticide, has raised serious environmental health concerns since it is a potent neurotoxicant. With studies demonstrating the presence of a microbiome-gut-brain axis, it is possible that gut microbiome perturbation may also contribute to diazinon toxicity. We investigated the impact of diazinon exposure on the gut microbiome composition and its metabolic functions in C57BL/6 mice. We used a combination of 16S rRNA gene sequencing, metagenomics sequencing, and mass spectrometry-based metabolomics profiling in a mouse model to examine the functional impact of diazinon on the gut microbiome. 16S rRNA gene sequencing revealed that diazinon exposure significantly perturbed the gut microbiome, and metagenomic sequencing found that diazinon exposure altered the functional metagenome. Moreover, metabolomics profiling revealed an altered metabolic profile arising from exposure. Of particular significance, these changes were more pronounced for male mice than for female mice. Diazinon exposure perturbed the gut microbiome community structure, functional metagenome, and associated metabolic profiles in a sex-specific manner. These findings may provide novel insights regarding perturbations of the gut microbiome and its functions as a potential new mechanism contributing to diazinon neurotoxicity and, in particular, its sex-selective effects. Citation: Gao B, Bian X, Mahbub R, Lu K. 2017. Sex-specific effects of organophosphate diazinon on the gut microbiome and its metabolic functions. Environ Health Perspect 125:198-206; http://dx.doi.org/10.1289/EHP202.

  9. Optical ridge waveguides in Er3+/Yb3+ co-doped phosphate glass produced by ion irradiation combined with femtosecond laser ablation for guided-wave green and red upconversion emissions

    NASA Astrophysics Data System (ADS)

    Chen, Chen; He, Ruiyun; Tan, Yang; Wang, Biao; Akhmadaliev, Shavkat; Zhou, Shengqiang; de Aldana, Javier R. Vázquez; Hu, Lili; Chen, Feng

    2016-01-01

    This work reports on the fabrication of ridge waveguides in Er3+/Yb3+ co-doped phosphate glass by the combination of femtosecond laser ablation and following swift carbon ion irradiation. The guiding properties of waveguides have been investigated at 633 and 1064 nm through end face coupling arrangement. The refractive index profile on the cross section of the waveguide has been constructed. The propagation losses can be reduced considerably after annealing treatment. Under the optical pump laser at 980 nm, the upconversion emission of both green and red fluorescence has been realized through the ridge waveguide structures.

  10. CORAL: aligning conserved core regions across domain families.

    PubMed

    Fong, Jessica H; Marchler-Bauer, Aron

    2009-08-01

    Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.

  11. RosettaAntibodyDesign (RAbD): A general framework for computational antibody design

    PubMed Central

    Adolf-Bryfogle, Jared; Kalyuzhniy, Oleks; Kubitz, Michael; Hu, Xiaozhen; Adachi, Yumiko; Schief, William R.

    2018-01-01

    A structural-bioinformatics-based computational methodology and framework have been developed for the design of antibodies to targets of interest. RosettaAntibodyDesign (RAbD) samples the diverse sequence, structure, and binding space of an antibody to an antigen in highly customizable protocols for the design of antibodies in a broad range of applications. The program samples antibody sequences and structures by grafting structures from a widely accepted set of the canonical clusters of CDRs (North et al., J. Mol. Biol., 406:228–256, 2011). It then performs sequence design according to amino acid sequence profiles of each cluster, and samples CDR backbones using a flexible-backbone design protocol incorporating cluster-based CDR constraints. Starting from an existing experimental or computationally modeled antigen-antibody structure, RAbD can be used to redesign a single CDR or multiple CDRs with loops of different length, conformation, and sequence. We rigorously benchmarked RAbD on a set of 60 diverse antibody–antigen complexes, using two design strategies—optimizing total Rosetta energy and optimizing interface energy alone. We utilized two novel metrics for measuring success in computational protein design. The design risk ratio (DRR) is equal to the frequency of recovery of native CDR lengths and clusters divided by the frequency of sampling of those features during the Monte Carlo design procedure. Ratios greater than 1.0 indicate that the design process is picking out the native more frequently than expected from their sampled rate. We achieved DRRs for the non-H3 CDRs of between 2.4 and 4.0. The antigen risk ratio (ARR) is the ratio of frequencies of the native amino acid types, CDR lengths, and clusters in the output decoys for simulations performed in the presence and absence of the antigen. For CDRs, we achieved cluster ARRs as high as 2.5 for L1 and 1.5 for H2. 
For sequence design simulations without CDR grafting, the overall recovery for the native amino acid types for residues that contact the antigen in the native structures was 72% in simulations performed in the presence of the antigen and 48% in simulations performed without the antigen, for an ARR of 1.5. For the non-contacting residues, the ARR was 1.08. This shows that the sequence profiles are able to maintain the amino acid types of these conserved, buried sites, while recovery of the exposed, contacting residues requires the presence of the antigen-antibody interface. We tested RAbD experimentally on both a lambda and kappa antibody–antigen complex, successfully improving their affinities 10 to 50 fold by replacing individual CDRs of the native antibody with new CDR lengths and clusters. PMID:29702641
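
    The two risk ratios defined in this abstract are plain frequency ratios. The following is a minimal Python sketch of that arithmetic, not the RAbD implementation: the function names are hypothetical, and the sampling and recovery frequencies in the DRR example are invented for illustration. Only the 72% and 48% recovery figures come from the abstract.

```python
def risk_ratio(numerator_freq: float, denominator_freq: float) -> float:
    """Ratio of two frequencies; values above 1.0 indicate enrichment."""
    if denominator_freq <= 0:
        raise ValueError("denominator frequency must be positive")
    return numerator_freq / denominator_freq


def design_risk_ratio(recovery_freq: float, sampling_freq: float) -> float:
    # DRR: how often the native feature appears in the final designs,
    # relative to how often it was sampled during Monte Carlo design.
    return risk_ratio(recovery_freq, sampling_freq)


def antigen_risk_ratio(freq_with_antigen: float, freq_without_antigen: float) -> float:
    # ARR: frequency of the native feature with the antigen present,
    # relative to its frequency with the antigen absent.
    return risk_ratio(freq_with_antigen, freq_without_antigen)


# Figures quoted in the abstract: 72% recovery with antigen, 48% without.
arr_contacts = antigen_risk_ratio(0.72, 0.48)
print(f"ARR for antigen-contacting residues: {arr_contacts:.2f}")

# Hypothetical numbers only: a native cluster sampled in 10% of Monte Carlo
# steps but present in 30% of the final designs.
drr_example = design_risk_ratio(0.30, 0.10)
print(f"Illustrative DRR: {drr_example:.2f}")
```

    A ratio near 1.0, such as the ARR of 1.08 reported for non-contacting residues, means the antigen makes little difference to recovery at those sites.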

  12. RosettaAntibodyDesign (RAbD): A general framework for computational antibody design.

    PubMed

    Adolf-Bryfogle, Jared; Kalyuzhniy, Oleks; Kubitz, Michael; Weitzner, Brian D; Hu, Xiaozhen; Adachi, Yumiko; Schief, William R; Dunbrack, Roland L

    2018-04-01

    A structural-bioinformatics-based computational methodology and framework have been developed for the design of antibodies to targets of interest. RosettaAntibodyDesign (RAbD) samples the diverse sequence, structure, and binding space of an antibody to an antigen in highly customizable protocols for the design of antibodies in a broad range of applications. The program samples antibody sequences and structures by grafting structures from a widely accepted set of the canonical clusters of CDRs (North et al., J. Mol. Biol., 406:228-256, 2011). It then performs sequence design according to amino acid sequence profiles of each cluster, and samples CDR backbones using a flexible-backbone design protocol incorporating cluster-based CDR constraints. Starting from an existing experimental or computationally modeled antigen-antibody structure, RAbD can be used to redesign a single CDR or multiple CDRs with loops of different length, conformation, and sequence. We rigorously benchmarked RAbD on a set of 60 diverse antibody-antigen complexes, using two design strategies: optimizing total Rosetta energy and optimizing interface energy alone. We utilized two novel metrics for measuring success in computational protein design. The design risk ratio (DRR) is equal to the frequency of recovery of native CDR lengths and clusters divided by the frequency of sampling of those features during the Monte Carlo design procedure. Ratios greater than 1.0 indicate that the design process is picking out the native more frequently than expected from their sampled rate. We achieved DRRs for the non-H3 CDRs of between 2.4 and 4.0. The antigen risk ratio (ARR) is the ratio of frequencies of the native amino acid types, CDR lengths, and clusters in the output decoys for simulations performed in the presence and absence of the antigen. For CDRs, we achieved cluster ARRs as high as 2.5 for L1 and 1.5 for H2. 
For sequence design simulations without CDR grafting, the overall recovery for the native amino acid types for residues that contact the antigen in the native structures was 72% in simulations performed in the presence of the antigen and 48% in simulations performed without the antigen, for an ARR of 1.5. For the non-contacting residues, the ARR was 1.08. This shows that the sequence profiles are able to maintain the amino acid types of these conserved, buried sites, while recovery of the exposed, contacting residues requires the presence of the antigen-antibody interface. We tested RAbD experimentally on both a lambda and kappa antibody-antigen complex, successfully improving their affinities 10 to 50 fold by replacing individual CDRs of the native antibody with new CDR lengths and clusters.

  13. Do satellite galaxies trace matter in galaxy clusters?

    NASA Astrophysics Data System (ADS)

    Wang, Chunxiang; Li, Ran; Gao, Liang; Shan, Huanyuan; Kneib, Jean-Paul; Wang, Wenting; Chen, Gang; Makler, Martin; Pereira, Maria E. S.; Wang, Lin; Maia, Marcio A. G.; Erben, Thomas

    2018-04-01

    The spatial distribution of satellite galaxies encodes rich information about the structure and assembly history of galaxy clusters. In this paper, we select a red-sequence Matched-filter Probabilistic Percolation cluster sample in the SDSS Stripe 82 region with 0.1 ≤ z ≤ 0.33, 20 < λ < 100, and Pcen > 0.7. Using the high-quality weak lensing data from the CS82 Survey, we constrain the mass profile of this sample. We then directly compare the mass density profile with the satellite number density profile. We find that the total mass and number density profiles have the same shape, both well fitted by an NFW profile. The scale radii agree with each other within a 1σ error (r_{s,gal} = 0.34^{+0.04}_{-0.03} Mpc versus r_s = 0.37^{+0.15}_{-0.10} Mpc).
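
    For reference, the NFW profile used in both fits has the form rho(r) = rho_s / [(r / r_s) (1 + r / r_s)^2]. The Python sketch below evaluates that shape at the two best-fit scale radii quoted in the abstract. The function name, the unit normalization rho_s = 1, and the sample radii are assumptions made for illustration; this is not the authors' fitting code.

```python
def nfw_density(r: float, rho_s: float, r_s: float) -> float:
    """NFW density: rho_s / ((r / r_s) * (1 + r / r_s) ** 2)."""
    if r <= 0 or r_s <= 0:
        raise ValueError("r and r_s must be positive")
    x = r / r_s
    return rho_s / (x * (1.0 + x) ** 2)


# Best-fit scale radii quoted in the abstract (Mpc). rho_s = 1.0 is an
# arbitrary normalization, since only the profile shapes are compared.
R_S_GALAXIES = 0.34
R_S_MASS = 0.37

for r in (0.1, 0.37, 1.0):  # illustrative radii in Mpc
    satellites = nfw_density(r, 1.0, R_S_GALAXIES)
    mass = nfw_density(r, 1.0, R_S_MASS)
    print(f"r = {r:.2f} Mpc: satellites {satellites:.3f}, mass {mass:.3f}")
```

    At r = r_s the density equals rho_s / 4, which is a quick sanity check on any implementation of this formula.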

  14. Structure-related statistical singularities along protein sequences: a correlation study.

    PubMed

    Colafranceschi, Mauro; Colosimo, Alfredo; Zbilut, Joseph P; Uversky, Vladimir N; Giuliani, Alessandro

    2005-01-01

    A data set composed of 1141 proteins representative of all eukaryotic protein sequences in the Swiss-Prot Protein Knowledge base was coded by seven physicochemical properties of amino acid residues. The resulting numerical profiles were submitted to correlation analysis after the application of a linear (simple mean) and a nonlinear (Recurrence Quantification Analysis, RQA) filter. The main RQA variables, Recurrence and Determinism, were subsequently analyzed by Principal Component Analysis. The RQA descriptors showed that (i) within protein sequences is embedded specific information neither present in the codes nor in the amino acid composition and (ii) the most sensitive code for detecting ordered recurrent (deterministic) patterns of residues in protein sequences is the Miyazawa-Jernigan hydrophobicity scale. The most deterministic proteins in terms of autocorrelation properties of primary structures were found (i) to be involved in protein-protein and protein-DNA interactions and (ii) to display a significantly higher proportion of structural disorder with respect to the average data set. A study of the scaling behavior of the average determinism with the setting parameters of RQA (embedding dimension and radius) allows for the identification of patterns of minimal length (six residues) as possible markers of zones specifically prone to inter- and intramolecular interactions.
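
    The two RQA setting parameters named in this abstract, embedding dimension and radius, can be illustrated with a short Python sketch of the Recurrence variable alone. It assumes a unit delay, Euclidean distance, and a toy numeric profile rather than a real hydrophobicity coding; the function names are hypothetical, and Determinism (the share of recurrent points lying on diagonal lines) is not computed here.

```python
from itertools import combinations
from math import dist


def embed(series, dim):
    """Overlapping windows of length `dim` (unit delay) along the series."""
    return [tuple(series[i:i + dim]) for i in range(len(series) - dim + 1)]


def recurrence_rate(series, dim, radius):
    """Percentage of distinct window pairs no farther apart than `radius`."""
    points = embed(series, dim)
    pairs = list(combinations(range(len(points)), 2))
    if not pairs:
        raise ValueError("series too short for this embedding dimension")
    recurrent = sum(1 for i, j in pairs if dist(points[i], points[j]) <= radius)
    return 100.0 * recurrent / len(pairs)


# Toy profile standing in for a residue-by-residue property coding.
toy_profile = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
print(recurrence_rate(toy_profile, dim=2, radius=0.0))
```

    Raising the radius or lowering the embedding dimension admits more pairs as recurrent, which is why the abstract studies how the descriptors scale with both settings.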

  15. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo

    PubMed Central

    Zubradt, Meghan; Gupta, Paromita; Persad, Sitara; Lambowitz, Alan M.; Weissman, Jonathan S.; Rouskin, Silvi

    2017-01-01

    Coupling structure-specific in vivo chemical modification to next-generation sequencing is transforming RNA secondary structural studies in living cells. The dominant strategy for detecting in vivo chemical modifications uses reverse transcriptase truncation products, which introduces biases and necessitates population-average assessments of RNA structure. Here we present dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq), which encodes DMS modifications as mismatches using a thermostable group II intron reverse transcriptase (TGIRT). DMS-MaPseq yields a high signal-to-noise ratio, can report multiple structural features per molecule, and allows both genome-wide studies and focused in vivo investigations of even low abundance RNAs. We apply DMS-MaPseq for the first analysis of RNA structure within an animal tissue and to identify a functional structure involved in non-canonical translation initiation. Additionally, we use DMS-MaPseq to compare the in vivo structure of pre-mRNAs to their mature isoforms. These applications illustrate DMS-MaPseq’s capacity to dramatically expand in vivo analysis of RNA structure. PMID:27819661
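
    Because this approach records modifications as mismatches, the basic readout is a per-position mismatch rate. The Python sketch below shows that calculation under strong simplifying assumptions: reads are already aligned, gap-free, and as long as the reference, with '-' marking uncovered positions. The function name and the toy sequences are hypothetical; this is not the authors' pipeline.

```python
def mismatch_rates(reference: str, reads: list[str]) -> list[float]:
    """Fraction of covering reads that mismatch the reference at each position."""
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be pre-aligned to the reference length")
    rates = []
    for pos, ref_base in enumerate(reference):
        # Ignore reads that do not cover this position.
        covered = [read[pos] for read in reads if read[pos] != "-"]
        mismatches = sum(1 for base in covered if base != ref_base)
        rates.append(mismatches / len(covered) if covered else 0.0)
    return rates


# Toy data: mismatches appear at the A and the C of the reference.
reference = "GACU"
reads = ["GACU", "GGCU", "GACU", "GAUU"]
for base, rate in zip(reference, mismatch_rates(reference, reads)):
    print(base, f"{rate:.2f}")
```

    Since each read keeps all of its mismatches, the same data can also be examined read by read, which is what allows several structural features to be reported per molecule.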

  16. The NnCenH3 protein and centromeric DNA sequence profiles of Nelumbo nucifera Gaertn. (sacred lotus) reveal the DNA structures and dynamics of centromeres in basal eudicots.

    PubMed

    Zhu, Zhixuan; Gui, Songtao; Jin, Jing; Yi, Rong; Wu, Zhihua; Qian, Qian; Ding, Yi

    2016-09-01

    Centromeres on eukaryotic chromosomes consist of large arrays of DNA repeats that undergo very rapid evolution. Nelumbo nucifera Gaertn. (sacred lotus) is a phylogenetic relict and an aquatic perennial basal eudicot. Studies concerning the centromeres of this basal eudicot species could provide ancient evolutionary perspectives. In this study, we characterized the centromeric marker protein NnCenH3 (sacred lotus centromere-specific histone H3 variant), and used a chromatin immunoprecipitation (ChIP)-based technique to recover the NnCenH3 nucleosome-associated sequences of sacred lotus. The properties of the centromere-binding protein and DNA sequences revealed notable divergence between sacred lotus and other flowering plants, including the following factors: (i) an NnCenH3 alternative splicing variant comprising only a partial centromere-targeting domain, (ii) active genes with low transcription levels in the NnCenH3 nucleosomal regions, and (iii) the prevalence of the Ty1/copia class of long terminal repeat (LTR) retrotransposons in the centromeres of sacred lotus chromosomes. In addition, the dynamic natures of the centromeric region showed that some of the centromeric repeat DNA sequences originated from telomeric repeats, and a pair of centromeres on the dicentric chromosome 1 was inactive in the metaphase cells of sacred lotus. Our characterization of the properties of centromeric DNA structure within the sacred lotus genome describes a centromeric profile in ancient basal eudicots and might provide evidence of the origins and evolution of centromeres. Furthermore, the identification of centromeric DNA sequences is of great significance for the assembly of the sacred lotus genome. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  17. Probing the electrostatics and pharmacologic modulation of sequence-specific binding by the DNA-binding domain of the ETS-family transcription factor PU.1: a binding affinity and kinetics investigation

    PubMed Central

    Munde, Manoj; Poon, Gregory M. K.; Wilson, W. David

    2013-01-01

    Members of the ETS family of transcription factors regulate a functionally diverse array of genes. All ETS proteins share a structurally-conserved but sequence-divergent DNA-binding domain, known as the ETS domain. Although the structure and thermodynamics of the ETS-DNA complexes are well known, little is known about the kinetics of sequence recognition, a facet that offers potential insight into its molecular mechanism. We have characterized DNA binding by the ETS domain of PU.1 by biosensor-surface plasmon resonance (SPR). SPR analysis revealed a striking kinetic profile for DNA binding by the PU.1 ETS domain. At low salt concentrations, it binds high-affinity cognate DNA with a very slow association rate constant (≤10^5 M^−1 s^−1), compensated by a correspondingly small dissociation rate constant. The kinetics are strongly salt-dependent but mutually balance to produce a relatively weak dependence in the equilibrium constant. This profile contrasts sharply with reported data for other ETS domains (e.g., Ets-1, TEL) for which high-affinity binding is driven by rapid association (>10^7 M^−1 s^−1). We interpret this difference in terms of the hydration properties of ETS-DNA binding and propose that at least two mechanisms of sequence recognition are employed by this family of DNA-binding domains. Additionally, we use SPR to demonstrate the potential for pharmacological inhibition of sequence-specific ETS-DNA binding, using the minor groove-binding distamycin as a model compound. Our work establishes SPR as a valuable technique for extending our understanding of the molecular mechanisms of ETS-DNA interactions as well as developing potential small-molecule agents for biotechnological and therapeutic purposes. PMID:23416556
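
    The compensation described in this abstract follows from the standard relation K_D = k_off / k_on: a slow association can yield the same affinity as a fast one if dissociation is proportionally slower. The Python sketch below illustrates this with invented rate constants chosen only to match the orders of magnitude mentioned in the abstract; they are not measured values from the study.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant K_D = k_off / k_on, in M.

    k_on is in M^-1 s^-1 and k_off is in s^-1.
    """
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


# Hypothetical rate constants, for illustration only.
slow_binder = dissociation_constant(k_on=1e5, k_off=1e-4)  # slow on, slow off
fast_binder = dissociation_constant(k_on=1e7, k_off=1e-2)  # fast on, fast off

print(f"slow association: K_D = {slow_binder:.1e} M")
print(f"fast association: K_D = {fast_binder:.1e} M")
```

    Both hypothetical binders have the same equilibrium affinity, so only a kinetic measurement such as SPR can tell the two mechanisms apart.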

  18. RNA editing with CRISPR-Cas13.

    PubMed

    Cox, David B T; Gootenberg, Jonathan S; Abudayyeh, Omar O; Franklin, Brian; Kellner, Max J; Joung, Julia; Zhang, Feng

    2017-11-24

    Nucleic acid editing holds promise for treating genetic disease, particularly at the RNA level, where disease-relevant sequences can be rescued to yield functional protein products. Type VI CRISPR-Cas systems contain the programmable single-effector RNA-guided ribonuclease Cas13. We profiled type VI systems in order to engineer a Cas13 ortholog capable of robust knockdown and demonstrated RNA editing by using catalytically inactive Cas13 (dCas13) to direct adenosine-to-inosine deaminase activity by ADAR2 (adenosine deaminase acting on RNA type 2) to transcripts in mammalian cells. This system, referred to as RNA Editing for Programmable A to I Replacement (REPAIR), which has no strict sequence constraints, can be used to edit full-length transcripts containing pathogenic mutations. We further engineered this system to create a high-specificity variant and minimized the system to facilitate viral delivery. REPAIR presents a promising RNA-editing platform with broad applicability for research, therapeutics, and biotechnology. Copyright © 2017, American Association for the Advancement of Science.

  19. RNA Editing with CRISPR-Cas13

    PubMed Central

    Cox, David B.T.; Gootenberg, Jonathan S.; Abudayyeh, Omar O.; Franklin, Brian; Kellner, Max J.; Joung, Julia; Zhang, Feng

    2017-01-01

    Nucleic acid editing holds promise for treating genetic disease, particularly at the RNA level, where disease-relevant sequences can be rescued to yield functional protein products. Type VI CRISPR-Cas systems contain the programmable single-effector RNA-guided RNase Cas13. Here, we profile Type VI systems to engineer a Cas13 ortholog capable of robust knockdown and demonstrate RNA editing by using catalytically-inactive Cas13 (dCas13) to direct adenosine-to-inosine deaminase activity by ADAR2 to transcripts in mammalian cells. This system, referred to as RNA Editing for Programmable A to I Replacement (REPAIR), has no strict sequence constraints and can be used to edit full-length transcripts containing pathogenic mutations. We further engineer this system to create a high-specificity variant, REPAIRv2, that is 919 times more specific than REPAIRv1, and minimize the system to ease viral delivery. REPAIR presents a promising RNA editing platform with broad applicability for research, therapeutics, and biotechnology. PMID:29070703

  20. YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs.

    PubMed

    Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G; Rigoutsos, Isidore; Kirino, Yohei

    2017-05-19

    Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and the molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has presented a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amounts of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulation in various transcriptomes, which could play important regulatory roles in translation and other biological processes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences.

    PubMed

    Mizianty, Marcin J; Kurgan, Lukasz

    2009-12-13

    Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences. The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozen modern competing predictors show that MODAS provides the best overall accuracy, which ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reductions when compared against the best performing competing method on the two largest datasets. The proposed predictor provides accurate predictions at 58% accuracy for the membrane protein class, which is not considered by the majority of existing methods, even though this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes. 
The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/.

  2. Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences

    PubMed Central

    2009-01-01

    Background: Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences. Results: The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozen modern competing predictors show that MODAS provides the best overall accuracy, which ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reductions when compared against the best performing competing method on the two largest datasets. The proposed predictor provides accurate predictions at 58% accuracy for the membrane protein class, which is not considered by the majority of existing methods, even though this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes. 
Conclusions: The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/. PMID:20003388

  3. The Structure Difference in the Southern Margin of the Dangerous Grounds: Implications for the Final Evolution of the South China Sea

    NASA Astrophysics Data System (ADS)

    Xi, P.; Shen, C.; Zhao, Z.; Xie, X.; Mei, L.; Gong, J.; Huang, X.

    2015-12-01

    We interpret two multi-channel seismic reflection profiles, extending more than 900 km across the entire Dangerous Grounds, located in the east and west of the southern margin of the South China Sea, respectively. Eight Cenozoic sequence boundaries and three tectono-stratigraphic units are identified. Detailed analysis of extensional features and unconformities reveals the tectonics of the east and west. Early extension (syn-rifting sequence) occurred in both profiles during continental rifting, which lasted from the Palaeocene to the Early Oligocene and resulted in the formation of half-grabens and rotated fault blocks. Late extension (drift-rifting sequence) differs significantly between the two profiles. The eastern Dangerous Grounds entered a rifting-depression stage, and some compressional deformation occurred in the Reed Bank basin at about the beginning of the Early Miocene, probably resulting from the collision of the Dangerous Grounds and the Sabah-Cagayan Arc. The western Dangerous Grounds was still rifting until the end of the Early Miocene, forming the MMU or DRU, which is strongly erosional and represents a major break in sedimentation and/or erosion in part of the area. Denudation folds and inverted faults can be distinguished below the MMU, indicating that the cessation of the South China Sea was accompanied by NW compression, while the boundary corresponding to the MMU is nearly a plano-conformity in the east. The thermal sag (post-rifting sequence) is characterized by non-faulted draping strata in the whole area. The structural differences between east and west may be related to the final evolution of the SCS. As the proto-SCS closed in a scissor fashion, together with the clockwise rotation of Borneo, the initial collision (c. 20 Ma) occurred in the eastern part, building the NW foreland basin system from the Palawan Trough to Reed Bank in a short-lived process, while the western part drifted southwards until c. 15 Ma, forming the even more remarkable foreland system from the Borneo Trough to deep-water Sarawak.

  4. Improvement in Protein Domain Identification Is Reached by Breaking Consensus, with the Agreement of Many Profiles and Domain Co-occurrence

    PubMed Central

    Bernardes, Juliana; Zaverucha, Gerson; Vaquero, Catherine; Carbone, Alessandra

    2016-01-01

    Traditional protein annotation methods describe known domains with probabilistic models representing consensus among homologous domain sequences. However, when relevant signals become too weak to be identified by a global consensus, attempts for annotation fail. Here we address the fundamental question of domain identification for highly divergent proteins. By using high performance computing, we demonstrate that the limits of state-of-the-art annotation methods can be bypassed. We design a new strategy based on the observation that many structural and functional protein constraints are not globally conserved through all species but might be locally conserved in separate clades. We propose a novel exploitation of the large amount of data available: 1. for each known protein domain, several probabilistic clade-centered models are constructed from a large and differentiated panel of homologous sequences, 2. a decision-making protocol combines outcomes obtained from multiple models, 3. a multi-criteria optimization algorithm finds the most likely protein architecture. The method is evaluated for domain and architecture prediction over several datasets and statistical testing hypotheses. Its performance is compared against HMMScan and HHblits, two widely used search methods based on sequence-profile and profile-profile comparison. Due to their closeness to actual protein sequences, clade-centered models are shown to be more specific and functionally predictive than the broadly used consensus models. Based on them, we improved annotation of Plasmodium falciparum protein sequences on a scale not previously possible. We successfully predict at least one domain for 72% of P. falciparum proteins against 63% achieved previously, corresponding to 30% of improvement over the total number of Pfam domain predictions on the whole genome. 
The method is applicable to any genome and opens new avenues to tackle evolutionary questions such as the reconstruction of ancient domain duplications, the reconstruction of the history of protein architectures, and the estimation of protein domain age. Website and software: http://www.lcqb.upmc.fr/CLADE. PMID:27472895

  5. BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks.

    PubMed

    Yan, Winston X; Mirzazadeh, Reza; Garnerone, Silvano; Scott, David; Schneider, Martin W; Kallas, Tomasz; Custodio, Joaquin; Wernersson, Erik; Li, Yinqing; Gao, Linyi; Federova, Yana; Zetsche, Bernd; Zhang, Feng; Bienko, Magda; Crosetto, Nicola

    2017-05-12

    Precisely measuring the location and frequency of DNA double-strand breaks (DSBs) along the genome is instrumental to understanding genomic fragility, but current methods are limited in versatility, sensitivity or practicality. Here we present Breaks Labeling In Situ and Sequencing (BLISS), featuring the following: (1) direct labelling of DSBs in fixed cells or tissue sections on a solid surface; (2) low-input requirement by linear amplification of tagged DSBs by in vitro transcription; (3) quantification of DSBs through unique molecular identifiers; and (4) easy scalability and multiplexing. We apply BLISS to profile endogenous and exogenous DSBs in low-input samples of cancer cells, embryonic stem cells and liver tissue. We demonstrate the sensitivity of BLISS by assessing the genome-wide off-target activity of two CRISPR-associated RNA-guided endonucleases, Cas9 and Cpf1, observing that Cpf1 has higher specificity than Cas9. Our results establish BLISS as a versatile, sensitive and efficient method for genome-wide DSB mapping in many applications.
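
    Feature (3) above, quantification through unique molecular identifiers (UMIs), amounts to counting distinct UMIs per break site so that amplification copies of one tagged break are counted once. The Python sketch below shows that idea on toy data. The tuple layout and function name are hypothetical, and it treats every distinct UMI string as a separate event, whereas a real pipeline would typically also merge UMIs that differ only by sequencing errors.

```python
from collections import defaultdict


def count_breaks(reads):
    """Count distinct UMIs per (chromosome, position) break site.

    `reads` is an iterable of (chromosome, position, umi) tuples, a
    simplified stand-in for aligned, UMI-tagged reads.
    """
    umis_per_site = defaultdict(set)
    for chrom, pos, umi in reads:
        umis_per_site[(chrom, pos)].add(umi)
    return {site: len(umis) for site, umis in umis_per_site.items()}


# Toy data: the first two reads are amplification copies of one event.
toy_reads = [
    ("chr1", 100, "AAAC"),
    ("chr1", 100, "AAAC"),
    ("chr1", 100, "GGTT"),
    ("chr2", 5, "AAAC"),
]
for site, n_events in sorted(count_breaks(toy_reads).items()):
    print(site, n_events)
```

    Counting sets of UMIs rather than raw reads keeps the estimate independent of how strongly each tagged break was amplified.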

  6. The SUPERFAMILY database in 2004: additions and improvements.

    PubMed

    Madera, Martin; Vogel, Christine; Kummerfeld, Sarah K; Chothia, Cyrus; Gough, Julian

    2004-01-01

    The SUPERFAMILY database provides structural assignments to protein sequences and a framework for analysis of the results. At the core of the database is a library of profile Hidden Markov Models that represent all proteins of known structure. The library is based on the SCOP classification of proteins: each model corresponds to a SCOP domain and aims to represent an entire superfamily. We have applied the library to predicted proteins from all completely sequenced genomes (currently 154), the Swiss-Prot and TrEMBL databases and other sequence collections. Close to 60% of all proteins have at least one match, and one half of all residues are covered by assignments. All models and full results are available for download and online browsing at http://supfam.org. Users can study the distribution of their superfamily of interest across all completely sequenced genomes, investigate with which other superfamilies it combines and retrieve proteins in which it occurs. Alternatively, concentrating on a particular genome as a whole, it is possible first, to find out its superfamily composition, and secondly, to compare it with that of other genomes to detect superfamilies that are over- or under-represented. In addition, the webserver provides the following standard services: sequence search; keyword search for genomes, superfamilies and sequence identifiers; and multiple alignment of genomic, PDB and custom sequences.

  7. Functional artificial luciferases as an optical readout for bioassays.

    PubMed

    Kim, Sung Bae; Izumi, Hiroshi

    2014-06-13

    This study elucidates functional artificial luciferases (ALucs) wholly synthesized for bioassays and molecular imaging. The ALucs bearing epitopes were newly created by amending the sequences of our previously reported ALucs in light of a multi-sequence alignment and hydrophobicity search. The synthesized ALucs survive in live cells and are stable in culture media for 25 days after secretion. The epitopes in ALucs are exposed during the secretion process and are indeed valid for column purification and immunological assays. The ALucs exerted a 9400-times stronger optical intensity with a coelenterazine derivative (CTZ i), when compared with Renilla reniformis luciferase 8.6-535. A supersecondary structure of ALuc30 was predicted with respect to the X-ray crystallographic information of the coelenterazine-binding protein (CBP). The structure revealed that ALuc30 has room to accommodate the iodide of CTZ i. This study provides guidance on how to create functional artificial luciferases and predicts the structural details with current bioinformatics technologies. Copyright © 2014 Elsevier Inc. All rights reserved.

  8. Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System.

    PubMed

    Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

    2017-01-01

    Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc.) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species.

  9. Best Practices for Environmental Site Management: A Practical Guide for Applying Environmental Sequence Stratigraphy to Improve Conceptual Site Models

    EPA Science Inventory

    Presented here is a practical guide on the application of the geologic principles of sequence stratigraphy and facies models to the characterization of stratigraphic heterogeneity at hazardous waste sites. This technology is applicable to sites underlain by clastic aquifers (int...

  10. Getting to Know You...All about You: Preschool Orientation Manual.

    ERIC Educational Resources Information Center

    San Ysidro School District, CA.

    Designed to guide teachers through a 20-day sequence of preschool orientation activities, the manual presents a numbered sequence of topics with related objectives and explanations, preparation and planning needs, and specific activities for children. Section I is entitled "All Around Us" and focuses on guiding preschool children and…

  11. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles

    PubMed Central

    Mathelier, Anthony; Fornes, Oriol; Arenillas, David J.; Chen, Chih-yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W.; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W.

    2016-01-01

    JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. PMID:26531826
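    The position frequency matrices mentioned in this record are commonly turned into log-odds weights before scanning sequences. The following is only a minimal sketch of that general idea and does not use the JASPAR API; the function names, the toy counts, the pseudocount of 0.8 and the uniform background are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def pfm_to_pwm(pfm, pseudocount=0.8, background=None):
    """Convert a position frequency matrix (raw counts) to log2-odds weights.

    `pfm` maps each base to a list of counts, one per motif position.
    The pseudocount and uniform background are illustrative assumptions.
    """
    background = background or {b: 0.25 for b in BASES}
    length = len(pfm["A"])
    pwm = {b: [] for b in BASES}
    for i in range(length):
        total = sum(pfm[b][i] for b in BASES) + pseudocount
        for b in BASES:
            freq = (pfm[b][i] + pseudocount * background[b]) / total
            pwm[b].append(math.log2(freq / background[b]))
    return pwm

def score(pwm, site):
    """Sum the per-position weights for a candidate site of motif length."""
    return sum(pwm[base][i] for i, base in enumerate(site))

# Hypothetical 3-position motif, not taken from any JASPAR profile.
toy_pfm = {"A": [8, 0, 1], "C": [0, 9, 1], "G": [1, 0, 7], "T": [1, 1, 1]}
toy_pwm = pfm_to_pwm(toy_pfm)
print(score(toy_pwm, "ACG"), score(toy_pwm, "TTT"))
```

    A site matching the most frequent base at each position scores higher than a site built from rare bases, which is the property a motif scanner relies on.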

  12. Computational protein design: validation and possible relevance as a tool for homology searching and fold recognition.

    PubMed

    Schmidt Am Busch, Marcel; Sedano, Audrey; Simonson, Thomas

    2010-05-05

    Protein fold recognition usually relies on a statistical model of each fold; each model is constructed from an ensemble of natural sequences belonging to that fold. A complementary strategy may be to employ sequence ensembles produced by computational protein design. Designed sequences can be more diverse than natural sequences, possibly avoiding some limitations of experimental databases. We explore this strategy for four SCOP families: Small Kunitz-type inhibitors (SKIs), Interleukin-8 chemokines, PDZ domains, and large Caspase catalytic subunits, represented by 43 structures. An automated procedure is used to redesign the 43 proteins. We use the experimental backbones as fixed templates in the folded state and a molecular mechanics model to compute the interaction energies between sidechain and backbone groups. Calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is used to scan the sequence and conformational space, yielding 200,000-300,000 sequences per backbone template. The results confirm and generalize our earlier study of SH2 and SH3 domains. The designed sequences resemble moderately distant natural homologues of the initial templates; e.g., the SUPERFAMILY profile Hidden Markov Model library recognizes 85% of the low-energy sequences as native-like. Conversely, Position Specific Scoring Matrices derived from the sequences can be used to detect natural homologues within the SwissProt database: 60% of known PDZ domains are detected and around 90% of known SKIs and chemokines. Energy components and inter-residue correlations are analyzed and ways to improve the method are discussed. For some families, designed sequences can be a useful complement to experimental ones for homologue searching. However, improved tools are needed to extract more information from the designed profiles before the method can be of general use.

  13. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

    USDA-ARS's Scientific Manuscript database

    Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylat...

  14. Mandibular reconstruction using fibula free flap harvested using a customised cutting guide: how we do it.

    PubMed

    Tarsitano, A; Ciocca, L; Cipriani, R; Scotti, R; Marchetti, C

    2015-06-01

    Free fibula flap is routinely used for mandibular reconstructions. For contouring the flap, multiple osteotomies should be shaped to reproduce the native mandibular contour. The bone segments should be fixed using a reconstructive plate. This plate is usually manually bent by the surgeon during surgery. This method is efficient, but during reconstruction it is complicated to reproduce the complex 3D conformation of the mandible and recreate a normal morphology with a mandibular profile as similar as possible to the original; any aberration in its structural alignment may lead to aesthetic and functional alterations due to malocclusion or temporomandibular disorders. In order to achieve better morphological and functional outcomes, we have performed a customised flap harvest using cutting guides. This study demonstrates how we have performed customised mandibular reconstruction using CAD-CAM fibular cutting guides in 20 patients undergoing oncological segmental resection.

  15. Reconstruction Of The Permittivity Profile Of A Stratified Dielectric Layer

    NASA Astrophysics Data System (ADS)

    Vogelzang, E.; Ferwerda, H. A.; Yevick, D.

    1985-03-01

    A numerical procedure is given for the reconstruction of the permittivity profile of a dielectric slab on a perfect conductor. Profiles not supporting guided modes are reconstructed from the complex reflection amplitude for TE-polarized, monochromatic plane waves incident from different directions using the Marchenko theory. The contribution of guided modes is incorporated in the reconstruction procedure through the Gelfand-Levitan equations. An advantage of our approach is that a unique solution for the permittivity profile is obtained without the use of complicated regularization techniques. Some illustrative numerical examples are presented.

  16. COMOC 2: Two-dimensional aerodynamics sequence, computer program user's guide

    NASA Technical Reports Server (NTRS)

    Manhardt, P. D.; Orzechowski, J. A.; Baker, A. J.

    1977-01-01

    The COMOC finite element fluid mechanics computer program system is applicable to diverse problem classes. The two dimensional aerodynamics sequence was established for solution of the potential and/or viscous and turbulent flowfields associated with subsonic flight of elementary two dimensional isolated airfoils. The sequence is constituted of three specific flowfield options in COMOC for two dimensional flows. These include the potential flow option, the boundary layer option, and the parabolic Navier-Stokes option. By sequencing through these options, it is possible to computationally construct a weak-interaction model of the aerodynamic flowfield. This report is the user's guide to operation of COMOC for the aerodynamics sequence.

  17. Functional Diversity of Haloacid Dehalogenase Superfamily Phosphatases from Saccharomyces cerevisiae: BIOCHEMICAL, STRUCTURAL, AND EVOLUTIONARY INSIGHTS.

    PubMed

    Kuznetsova, Ekaterina; Nocek, Boguslaw; Brown, Greg; Makarova, Kira S; Flick, Robert; Wolf, Yuri I; Khusnutdinova, Anna; Evdokimova, Elena; Jin, Ke; Tan, Kemin; Hanson, Andrew D; Hasnain, Ghulam; Zallot, Rémi; de Crécy-Lagard, Valérie; Babu, Mohan; Savchenko, Alexei; Joachimiak, Andrzej; Edwards, Aled M; Koonin, Eugene V; Yakunin, Alexander F

    2015-07-24

    The haloacid dehalogenase (HAD)-like enzymes comprise a large superfamily of phosphohydrolases present in all organisms. The Saccharomyces cerevisiae genome encodes at least 19 soluble HADs, including 10 uncharacterized proteins. Here, we biochemically characterized 13 yeast phosphatases from the HAD superfamily, which includes both specific and promiscuous enzymes active against various phosphorylated metabolites and peptides with several HADs implicated in detoxification of phosphorylated compounds and pseudouridine. The crystal structures of four yeast HADs provided insight into their active sites, whereas the structure of the YKR070W dimer in complex with substrate revealed a composite substrate-binding site. Although the S. cerevisiae and Escherichia coli HADs share low sequence similarities, the comparison of their substrate profiles revealed seven phosphatases with common preferred substrates. The cluster of secondary substrates supporting significant activity of both S. cerevisiae and E. coli HADs includes 28 common metabolites that appear to represent the pool of potential activities for the evolution of novel HAD phosphatases. Evolution of novel substrate specificities of HAD phosphatases shows no strict correlation with sequence divergence. Thus, evolution of the HAD superfamily combines the conservation of the overall substrate pool and the substrate profiles of some enzymes with remarkable biochemical and structural flexibility of other superfamily members. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  18. Facilitated sequence counting and assembly by template mutagenesis

    PubMed Central

    Levy, Dan; Wigler, Michael

    2014-01-01

    Presently, inferring the long-range structure of the DNA templates is limited by short read lengths. Accurate template counts suffer from distortions occurring during PCR amplification. We explore the utility of introducing random mutations in identical or nearly identical templates to create distinguishable patterns that are inherited during subsequent copying. We simulate the applications of this process under assumptions of error-free sequencing and perfect mapping, using cytosine deamination as a model for mutation. The simulations demonstrate that within readily achievable conditions of nucleotide conversion and sequence coverage, we can accurately count the number of otherwise identical molecules as well as connect variants separated by long spans of identical sequence. We discuss many potential applications, such as transcript profiling, isoform assembly, haplotype phasing, and de novo genome assembly. PMID:25313059
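    The marking step described in this record can be illustrated with a toy simulation. This is a simplified sketch, not the authors' simulation: it only applies independent C-to-T conversions to identical template copies and counts the distinct patterns that result. The function names, the conversion rate, the toy template and the seed are illustrative assumptions.

```python
import random

def mutate(template, rate, rng):
    """Convert each C to T independently with probability `rate`
    (a crude stand-in for cytosine deamination)."""
    return "".join(
        "T" if base == "C" and rng.random() < rate else base
        for base in template
    )

def count_distinct_patterns(template, copies, rate, seed=0):
    """Mutate `copies` identical templates and count distinct outcomes."""
    rng = random.Random(seed)
    return len({mutate(template, rate, rng) for _ in range(copies)})

# Hypothetical template containing 40 cytosines.
toy_template = "ACGTCCGATC" * 10
print(count_distinct_patterns(toy_template, copies=50, rate=0.5))
```

    With no conversion, or with complete conversion, every copy stays identical and only one pattern is seen; at intermediate rates the copies become distinguishable, which is what makes counting them possible.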

  19. ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos

    PubMed Central

    2014-01-01

    Background: The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results: The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions: The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393
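    The matrix underlying a ProfileGrid, the residue frequency at every alignment position, can be computed in a few lines. The following is a minimal sketch of that calculation only; it is not the JProfileGrid implementation, and the function name and toy alignment are assumptions made for illustration.

```python
from collections import Counter

def residue_frequencies(alignment):
    """Return one {residue: fraction} dict per alignment column.

    `alignment` is a list of equal-length aligned sequences; gap
    characters are counted like any other symbol in this sketch.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n = len(alignment)
    grid = []
    for col in range(width):
        counts = Counter(seq[col] for seq in alignment)
        grid.append({res: c / n for res, c in counts.items()})
    return grid

# Hypothetical four-sequence alignment.
toy_alignment = ["MKV-", "MRV-", "MKIA", "MKVA"]
grid = residue_frequencies(toy_alignment)
for position, column in enumerate(grid, start=1):
    print(position, column)
```

    Each column dictionary could then be mapped to a color scale to produce the kind of color-coded grid the abstract describes.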

  20. Exploiting three kinds of interface propensities to identify protein binding sites.

    PubMed

    Liu, Bin; Wang, Xiaolong; Lin, Lei; Dong, Qiwen; Wang, Xuan

    2009-08-01

    Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. In this study, we present a building block of proteins called order profiles to use the evolutionary information of the protein sequence frequency profiles and apply this building block to produce a class of propensities called order profile interface propensities. For comparison, we revisit the usage of residue interface propensities and binary profile interface propensities for protein binding site prediction. Each kind of propensity, combined with sequence profiles and accessible surface areas, is used as input to an SVM. When tested on four types of complexes (hetero-permanent complexes, hetero-transient complexes, homo-permanent complexes and homo-transient complexes), experimental results show that the order profile interface propensities are better than residue interface propensities and binary profile interface propensities. Therefore, the order profile is a suitable profile-level building block of protein sequences and can be widely used in many tasks of computational biology, such as sequence alignment, the prediction of domain boundaries, the designation of knowledge-based potentials and protein remote homology detection.

  1. Designing deep sequencing experiments: detecting structural variation and estimating transcript abundance.

    PubMed

    Bashir, Ali; Bansal, Vikas; Bafna, Vineet

    2010-06-18

    Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with its unique characteristics, poses a number of design challenges regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance. For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200 bp and 2 Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect low expressed genes with greater than 95% probability. Together, our results form a generic framework for many design considerations related to high-throughput sequencing. We provide software tools (http://bix.ucsd.edu/projects/NGS-DesignTools) to derive platform-independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next generation sequencing.
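    The depth question raised in this record can be illustrated with a back-of-the-envelope calculation. This is not the authors' corrected estimator; it is a sketch under the simplifying assumption that each read independently comes from the transcript of interest with a known probability p, so that the chance of seeing at least one read in N reads is 1 - (1 - p)^N. The function name and example value of p are hypothetical.

```python
import math

def reads_needed(p, target=0.95):
    """Smallest N such that 1 - (1 - p)**N >= target.

    Assumes independent reads, each hitting the transcript with
    probability `p` (an illustrative simplification).
    """
    if not 0 < p < 1 or not 0 < target < 1:
        raise ValueError("p and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - target) / math.log(1 - p))

# Hypothetical transcript accounting for one read in a million.
print(reads_needed(1e-6))
```

    Under this simple model a transcript at one part per million needs roughly three million reads for a 95% chance of being sampled at least once, which shows why pilot data on the expression distribution matter for planning.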

  2. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.

    PubMed

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-11

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.
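    The Q3 accuracy quoted in this record is the fraction of residues whose predicted three-state label (helix, strand, coil) matches the observed one. A minimal sketch of that metric follows; it is unrelated to the DeepCNF code itself, and the function name and toy label strings are assumptions made for illustration.

```python
def q3_accuracy(predicted, observed):
    """Fraction of positions where the three-state labels agree.

    Both arguments are equal-length strings over H (helix),
    E (strand) and C (coil).
    """
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# Hypothetical labels for an 8-residue fragment.
print(q3_accuracy("HHHEECCC", "HHHEEECC"))  # 7 of 8 positions agree: 0.875
```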

  3. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields

    NASA Astrophysics Data System (ADS)

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-01

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.

  4. Advances in the molecular genetics of gliomas - implications for classification and therapy.

    PubMed

    Reifenberger, Guido; Wirsching, Hans-Georg; Knobbe-Thomsen, Christiane B; Weller, Michael

    2017-07-01

    Genome-wide molecular-profiling studies have revealed the characteristic genetic alterations and epigenetic profiles associated with different types of gliomas. These molecular characteristics can be used to refine glioma classification, to improve prediction of patient outcomes, and to guide individualized treatment. Thus, the WHO Classification of Tumours of the Central Nervous System was revised in 2016 to incorporate molecular biomarkers - together with classic histological features - in an integrated diagnosis, in order to define distinct glioma entities as precisely as possible. This paradigm shift is markedly changing how glioma is diagnosed, and has important implications for future clinical trials and patient management in daily practice. Herein, we highlight the developments in our understanding of the molecular genetics of gliomas, and review the current landscape of clinically relevant molecular biomarkers for use in classification of the disease subtypes. Novel approaches to the genetic characterization of gliomas based on large-scale DNA-methylation profiling and next-generation sequencing are also discussed. In addition, we illustrate how advances in the molecular genetics of gliomas can promote the development and clinical translation of novel pathogenesis-based therapeutic approaches, thereby paving the way towards precision medicine in neuro-oncology.

  5. An ultrasonically levitated noncontact stage using traveling vibrations on precision ceramic guide rails.

    PubMed

    Koyama, Daisuke; Ide, Takeshi; Friend, James R; Nakamura, Kentaro; Ueha, Sadayuki

    2007-03-01

    This paper presents a noncontact sliding table design and measurements of its performance via ultrasonic levitation. A slider placed atop two vibrating guide rails is levitated by an acoustic radiation force emitted from the rails. A flexural traveling wave propagating along the guide rails allows noncontact transportation of the slider, permitting a transport mechanism that reduces abrasion and dust generation with an inexpensive and simple structure. The profile of the sliding table was designed using finite-element analysis (FEA) for high levitation and transportation efficiency. The prototype sliding table was made of alumina ceramic (Al2O3) to increase machining accuracy and rigidity using a structure composed of a pair of guide rails with a triangular cross section and piezoelectric transducers. Two types of transducers were used: bolt-clamped Langevin transducers and bimorph transducers. A 40-mm long slider was designed to fit atop the two rail guides. Flexural standing waves and torsional standing waves were observed along the guide rails at resonance, and the levitation of the slider was obtained using the flexural mode even while the levitation distance was less than 10 μm. The levitation distance of the slider was measured while increasing the slider's weight. The levitation pressure, rigidity, and vertical displacement amplitude of the levitating slider thus were measured to be 6.7 kN/m², 3.0 kN/μm/m², and less than 1 μm, respectively. Noncontact transport of the slider was achieved using phased drive of the two transducers at either end of the vibrating guide rail. By controlling the phase difference, the slider transportation direction could be switched, and a maximum thrust of 13 mN was obtained.

  6. GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases.

    PubMed

    Zhu, Lihua Julie; Lawrence, Michael; Gupta, Ankit; Pagès, Hervé; Kucukural, Alper; Garber, Manuel; Wolfe, Scot A

    2017-05-15

    Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency of these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased genome-wide detection of nuclease cleavage sites within the genome. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed. Here, we describe an open source, open development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to different nuclease platforms with different length and complexity in their guide and PAM recognition sequences or their DNA cleavage position. They also enable users to customize sequence aggregation criteria, and vary peak calling thresholds that can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. For each identified off-target, the GUIDEseq package outputs the mapped GUIDE-seq read count as well as the cleavage score from a user-specified off-target cleavage score prediction algorithm, permitting the identification of genomic sequences with unexpected cleavage activity. The GUIDEseq package enables analysis of GUIDE-seq data from various nuclease platforms for any species with a defined genomic sequence. This software package has been used successfully to analyze several GUIDE-seq datasets. The software, source code and documentation are freely available at http://www.bioconductor.org/packages/release/bioc/html/GUIDEseq.html .

  7. STUMP un"stumped": anti-tumor response to anaplastic lymphoma kinase (ALK) inhibitor based targeted therapy in uterine inflammatory myofibroblastic tumor with myxoid features harboring DCTN1-ALK fusion.

    PubMed

    Subbiah, Vivek; McMahon, Caitlin; Patel, Shreyaskumar; Zinner, Ralph; Silva, Elvio G; Elvin, Julia A; Subbiah, Ishwaria M; Ohaji, Chimela; Ganeshan, Dhakshina Moorthy; Anand, Deepa; Levenback, Charles F; Berry, Jenny; Brennan, Tim; Chmielecki, Juliann; Chalmers, Zachary R; Mayfield, John; Miller, Vincent A; Stephens, Philip J; Ross, Jeffrey S; Ali, Siraj M

    2015-06-11

    Recurrent, metastatic mesenchymal myxoid tumors of the gynecologic tract present a management challenge as there is minimal evidence to guide systemic therapy. Such tumors also present a diagnostic dilemma, as myxoid features are observed in leiomyosarcomas, inflammatory myofibroblastic tumors (IMT), and mesenchymal myxoid tumors. Comprehensive genomic profiling was performed in the course of clinical care on a case of a recurrent, metastatic myxoid uterine malignancy (initially diagnosed as smooth muscle tumor of uncertain malignant potential (STUMP)), to identify targeted therapeutic options. To our knowledge, this case represents the first report of clinical response to targeted therapy in a tumor harboring a DCTN1-ALK fusion protein. Hybridization capture of 315 cancer-related genes plus introns from 28 genes often rearranged or altered in cancer was applied to >50 ng of DNA extracted from this sample and sequenced to high, uniform coverage. Therapy was given in the context of a phase I clinical trial (ClinicalTrials.gov Identifier: NCT01548144). Immunostains showed diffuse positivity for ALK1 expression and comprehensive genomic profiling identified an in-frame DCTN1-ALK gene fusion. The diagnosis of STUMP was revised to that of an IMT with myxoid features. The patient was enrolled in a clinical trial and treated with an anaplastic lymphoma kinase (ALK) inhibitor (crizotinib/Xalkori®) and a multikinase VEGF inhibitor (pazopanib/Votrient®). The patient experienced an ongoing partial response (6+ months) by response evaluation criteria in solid tumors (RECIST) 1.1 criteria. For myxoid tumors of the gynecologic tract, comprehensive genomic profiling can identify clinically relevant genomic alterations that both direct targeted therapy and help discriminate between similar diagnostic entities.

  8. Transcriptional Profiling of Synovial Macrophages Using Minimally Invasive Ultrasound-Guided Synovial Biopsies in Rheumatoid Arthritis.

    PubMed

    Mandelin, Arthur M; Homan, Philip J; Shaffer, Alexander M; Cuda, Carla M; Dominguez, Salina T; Bacalao, Emily; Carns, Mary; Hinchcliff, Monique; Lee, Jungwha; Aren, Kathleen; Thakrar, Anjali; Montgomery, Anna B; Bridges, S Louis; Bathon, Joan M; Atkinson, John P; Fox, David A; Matteson, Eric L; Buckley, Christopher D; Pitzalis, Costantino; Parks, Deborah; Hughes, Laura B; Geraldino-Pardilla, Laura; Ike, Robert; Phillips, Kristine; Wright, Kerry; Filer, Andrew; Kelly, Stephen; Ruderman, Eric M; Morgan, Vince; Abdala-Valencia, Hiam; Misharin, Alexander V; Budinger, G Scott; Bartom, Elizabeth T; Pope, Richard M; Perlman, Harris; Winter, Deborah R

    2018-06-01

    Currently, there are no reliable biomarkers for predicting therapeutic response in patients with rheumatoid arthritis (RA). The synovium may unlock critical information for determining efficacy, since a reduction in the numbers of sublining synovial macrophages remains the most reproducible biomarker. Thus, a clinically actionable method for the collection of synovial tissue, which can be analyzed using high-throughput strategies, must become a reality. This study was undertaken to assess the feasibility of utilizing synovial biopsies as a precision medicine-based approach for patients with RA. Rheumatologists at 6 US academic sites were trained in minimally invasive ultrasound-guided synovial tissue biopsy. Biopsy specimens obtained from patients with RA and synovial tissue from patients with osteoarthritis (OA) were subjected to histologic analysis, fluorescence-activated cell sorting, and RNA sequencing (RNA-seq). An optimized protocol for digesting synovial tissue was developed to generate high-quality RNA-seq libraries from isolated macrophage populations. Associations were determined between macrophage transcriptional profiles and clinical parameters in RA patients. Patients with RA reported minimal adverse effects in response to synovial biopsy. Comparable RNA quality was observed from synovial tissue and isolated macrophages between patients with RA and patients with OA. Whole tissue samples from patients with RA demonstrated a high degree of transcriptional heterogeneity. In contrast, the transcriptional profile of isolated RA synovial macrophages highlighted different subpopulations of patients and identified 6 novel transcriptional modules that were associated with disease activity and therapy. Performance of synovial tissue biopsies by rheumatologists in the US is feasible and generates high-quality samples for research. Through the use of cutting-edge technologies to analyze synovial biopsy specimens in conjunction with corresponding clinical information, a precision medicine-based approach for patients with RA is attainable. © 2018, American College of Rheumatology.

  9. How disordered is my protein and what is its disorder for? A guide through the “dark side” of the protein universe

    PubMed Central

    Lieutaud, Philippe; Uversky, Alexey V.; Uversky, Vladimir N.; Longhi, Sonia

    2016-01-01

    In the last 2 decades it has become increasingly evident that a large number of proteins are either fully or partially disordered. Intrinsically disordered proteins lack a stable 3D structure, are ubiquitous and fulfill essential biological functions. Their conformational heterogeneity is encoded in their amino acid sequences, thereby allowing intrinsically disordered proteins or regions to be recognized based on properties of these sequences. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to structural determination with X-ray crystallization. This article discusses a comprehensive selection of databases and methods currently employed to disseminate experimental and putative annotations of disorder, predict disorder and identify regions involved in induced folding. It also provides a set of detailed instructions that should be followed to perform computational analysis of disorder. PMID:28232901

  10. Main-Sequence CMEs as Magnetic Explosions: Compatibility with Observed Kinematics

    NASA Technical Reports Server (NTRS)

    Moore, Ron; Falconer, David; Sterling, Alphonse

    2004-01-01

    We examine the kinematics of 26 CMEs of the morphological main sequence of CMEs, those having the classic three-part bubble structure of (1) a bright front enveloping (2) a dark cavity within which rides (3) a bright blob/filamentary feature. Each CME is observed in Yohkoh/SXT images to originate from near the limb (≥ 0.7 R_Sun from disk center). The basic data (from the SOHO LASCO CME Catalog) for the kinematics of each CME are the sequence of LASCO images of the CME, the time of each image, the measured radial distance of the front edge of the CME in each image, and the measured angular extent of the CME. About half of our CMEs (12) occur with a flare, and the rest (14) occur without a flare. While the average linear-fit speed of the flare CMEs (1000 km/s) is twice that of the non-flare CMEs (510 km/s), the flare CMEs and the non-flare CMEs are similar in that some have nearly flat velocity-height (radial extent) profiles (little acceleration), some have noticeably falling velocity profiles (noticeable deceleration), and the rest have velocity profiles that rise considerably through the outer corona (blatant acceleration). This suggests that in addition to sharing similar morphology, main-sequence CMEs all have basically the same driving mechanism. The observed radial progression of each of our 26 CMEs is fit by a simple model magnetic plasmoid that is in pressure balance with the radial magnetic field in the outer corona and that propels itself outward by magnetic expansion, doing no net work on its surroundings. On average over the 26 CMEs, this model fits the observations as well as the assumption of constant acceleration. This is compatible with main-sequence CMEs being magnetically driven, basically magnetic explosions, with the velocity profile in the outer corona being largely dictated by the initial Alfvén speed in the CME (when the front is at approx. 3 R_Sun), analogous to the mass of a main-sequence star dictating the luminosity.
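
The "linear-fit speed" mentioned in this abstract is, in the usual sense, the slope of a straight line fitted to the height-time measurements of the CME front. The following is a minimal sketch of that calculation, assuming an ordinary least-squares fit; the function name and the height-time values are made up for illustration and are not taken from the LASCO catalog.

```python
def linear_fit_speed(times_s, heights_km):
    """Return the least-squares slope of height vs. time, in km/s."""
    n = len(times_s)
    if n < 2 or n != len(heights_km):
        raise ValueError("need at least two matching (time, height) samples")
    t_mean = sum(times_s) / n
    h_mean = sum(heights_km) / n
    # Sum of squared time deviations and of time-height cross deviations.
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    sxy = sum((t - t_mean) * (h - h_mean)
              for t, h in zip(times_s, heights_km))
    if sxx == 0:
        raise ValueError("all time samples are identical")
    return sxy / sxx


# Synthetic track (illustrative only): a front moving at a constant 500 km/s.
times = [0.0, 600.0, 1200.0, 1800.0]            # seconds
heights = [2.0e6 + 500.0 * t for t in times]    # kilometres
speed = linear_fit_speed(times, heights)
```

A rising or falling velocity profile would show up as systematic residuals around this straight line, which is why the abstract also compares against a constant-acceleration fit.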

  11. Gene Structures, Evolution and Transcriptional Profiling of the WRKY Gene Family in Castor Bean (Ricinus communis L.).

    PubMed

    Zou, Zhi; Yang, Lifu; Wang, Danhua; Huang, Qixing; Mo, Yeyong; Xie, Guishui

    2016-01-01

    WRKY proteins comprise one of the largest transcription factor families in plants and form key regulators of many plant processes. This study presents the characterization of 58 WRKY genes from the castor bean (Ricinus communis L., Euphorbiaceae) genome. Compared with the automatic genome annotation, one more WRKY-encoding locus was identified and 20 out of the 57 predicted gene models were manually corrected. All RcWRKY genes were shown to contain at least one intron in their coding sequences. According to the structural features of the present WRKY domains, the identified RcWRKY genes were assigned to three previously defined groups (I-III). Although castor bean underwent no recent whole-genome duplication event like physic nut (Jatropha curcas L., Euphorbiaceae), comparative genomics analysis indicated that one gene loss, one intron loss and one recent proximal duplication occurred in the RcWRKY gene family. The expression of all 58 RcWRKY genes was supported by ESTs and/or RNA sequencing reads derived from roots, leaves, flowers, seeds and endosperms. Further global expression profiles with RNA sequencing data revealed diverse expression patterns among various tissues. Results obtained from this study not only provide valuable information for future functional analysis and utilization of the castor bean WRKY genes, but also provide a useful reference to investigate the gene family expansion and evolution in Euphorbiaceous plants.

  12. Inferring coarse-grain histone-DNA interaction potentials from high-resolution structures of the nucleosome

    NASA Astrophysics Data System (ADS)

    Meyer, Sam; Everaers, Ralf

    2015-02-01

    The histone-DNA interaction in the nucleosome is a fundamental mechanism of genomic compaction and regulation, which remains largely unknown despite increasing structural knowledge of the complex. In this paper, we propose a framework for the extraction of a nanoscale histone-DNA force-field from a collection of high-resolution structures, which may be adapted to a larger class of protein-DNA complexes. We applied the procedure to a large crystallographic database extended by snapshots from molecular dynamics simulations. The comparison of the structural models first shows that, at histone-DNA contact sites, the DNA base-pairs are shifted outwards locally, consistent with locally repulsive forces exerted by the histones. The second step shows that the various force profiles of the structures under analysis derive locally from a unique, sequence-independent, quadratic repulsive force-field, while the sequence preferences are entirely due to internal DNA mechanics. We have thus obtained the first knowledge-derived nanoscale interaction potential for histone-DNA in the nucleosome. The conformations obtained by relaxation of nucleosomal DNA with high-affinity sequences in this potential accurately reproduce the experimental values of binding preferences. Finally we address the more generic binding mechanisms relevant to the 80% genomic sequences incorporated in nucleosomes, by computing the conformation of nucleosomal DNA with sequence-averaged properties. This conformation differs from those found in crystals, and the analysis suggests that repulsive histone forces are related to local stretch tension in nucleosomal DNA, mostly between adjacent contact points. This tension could play a role in the stability of the complex.

  13. Prediction of phenotypes of missense mutations in human proteins from biological assemblies.

    PubMed

    Wei, Qiong; Xu, Qifang; Dunbrack, Roland L

    2013-02-01

    Single nucleotide polymorphisms (SNPs) are the most frequent variation in the human genome. Nonsynonymous SNPs that lead to missense mutations can be neutral or deleterious, and several computational methods have been presented that predict the phenotype of human missense mutations. These methods use sequence-based and structure-based features in various combinations, relying on different statistical distributions of these features for deleterious and neutral mutations. One structure-based feature that has not been studied significantly is the accessible surface area within biologically relevant oligomeric assemblies. These assemblies are different from the crystallographic asymmetric unit for more than half of X-ray crystal structures. We find that mutations in the core of proteins or in the interfaces in biological assemblies are significantly more likely to be disease-associated than those on the surface of the biological assemblies. For structures with more than one protein in the biological assembly (whether the same sequence or different), we find the accessible surface area from biological assemblies provides a statistically significant improvement in prediction over the accessible surface area of monomers from protein crystal structures (P = 6e-5). When adding this information to sequence-based features such as the difference between wildtype and mutant position-specific profile scores, the improvement from biological assemblies is statistically significant but much smaller (P = 0.018). Combining this information with sequence-based features in a support vector machine leads to 82% accuracy on a balanced dataset of 50% disease-associated mutations from SwissVar and 50% neutral mutations from human/primate sequence differences in orthologous proteins. Copyright © 2012 Wiley Periodicals, Inc.

  14. DNA Replication Profiling Using Deep Sequencing.

    PubMed

    Saayman, Xanita; Ramos-Pérez, Cristina; Brown, Grant W

    2018-01-01

    Profiling of DNA replication during progression through S phase allows a quantitative snap-shot of replication origin usage and DNA replication fork progression. We present a method for using deep sequencing data to profile DNA replication in S. cerevisiae.

  15. Teaching/Learning Guide: Level I.

    ERIC Educational Resources Information Center

    Carbon - Lehigh Intermediate Unit, Schnecksville, PA.

    The manual presents sequences of skills designed for use as guides to teaching/learning objectives and as a basis for evaluating and recording special education students' progress. It is explained that the goal of the first level of objectives (sequenced in this document) is to enable the student to function at a motor/psychomotor state of…

  16. WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation

    PubMed Central

    2013-01-01

    Background SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM) and for a given protein, its input comprises: the sequence and/or its three-dimensional structure (when available), a set of target variations and its functional Gene Ontology (GO) terms. The output of the server provides, for each protein variation, the probabilities to be associated to human diseases. Results The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively. Conclusions WS-SNPs&GO is a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go. PMID:23819482
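
The performance figures quoted in this abstract (overall accuracy and a correlation coefficient) can be derived from a binary confusion matrix. The sketch below assumes that the "correlation coefficient" is the Matthews correlation coefficient, which the abstract does not state explicitly; the counts are invented for illustration and are not the server's results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Made-up confusion matrix: 50 disease-associated and 50 neutral variants.
acc = accuracy(tp=40, tn=45, fp=5, fn=10)
mcc = matthews_cc(tp=40, tn=45, fp=5, fn=10)
```

On a balanced dataset such as the one described, accuracy is not inflated by class imbalance, so the two measures tend to tell a consistent story.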

  17. Self-guided management of exome and whole-genome sequencing results: changing the results return model.

    PubMed

    Yu, Joon-Ho; Jamal, Seema M; Tabor, Holly K; Bamshad, Michael J

    2013-09-01

    Researchers and clinicians face the practical and ethical challenge of if and how to offer for return the wide and varied scope of results available from individual exome sequencing and whole-genome sequencing. We argue that rather than viewing individual exome sequencing and whole-genome sequencing as a test for which results need to be "returned," that the technology should instead be framed as a dynamic resource of information from which results should be "managed" over the lifetime of an individual. We further suggest that individual exome sequencing and whole-genome sequencing results management is optimized using a self-guided approach that enables individuals to self-select among results offered for return in a convenient, confidential, personalized context that is responsive to their value system. This approach respects autonomy, allows individuals to maximize potential benefits of genomic information (beneficence) and minimize potential harms (nonmaleficence), and also preserves their right to an open future to the extent they desire or think is appropriate. We describe key challenges and advantages of such a self-guided management system and offer guidance on implementation using an information systems approach.

  18. Molecular characterization of southern bluefin tuna myoglobin (Thunnus maccoyii).

    PubMed

    Nurilmala, Mala; Ochiai, Yoshihiro

    2016-10-01

    The primary structure of southern bluefin tuna Thunnus maccoyii myoglobin (Mb) has been elucidated by molecular cloning techniques. The cDNA encoding Mb from this tuna contained 776 nucleotides, with an open reading frame of 444 nucleotides encoding 147 amino acids. The nucleotide sequence of the coding region was identical to those of other bluefin tunas (T. thynnus and T. orientalis), thus giving the same amino acid sequence. Based on the deduced amino acid sequence, bioinformatic analyses were performed, including a phylogenetic tree, hydropathy plot and homology modeling. To investigate the autoxidation profiles, Mb was isolated from the dark muscle. The water-soluble fraction was subjected to ammonium sulfate fractionation (60-90% saturation) followed by preparative gel electrophoresis. Autoxidation profiles of Mb were delineated at pH 5.6, 6.5 and 7.4 at 37 °C. The autoxidation rate of tuna Mb was slightly higher than that of horse Mb at all pH values examined. These results revealed that tuna Mb was less stable than horse Mb, mainly at acidic pH.

  19. Correction of respiratory motion for IMRT using aperture adaptive technique and visual guidance: A feasibility study

    NASA Astrophysics Data System (ADS)

    Chen, Ho-Hsing; Wu, Jay; Chuang, Keh-Shih; Kuo, Hsiang-Chi

    2007-07-01

    Intensity-modulated radiation therapy (IMRT) utilizes nonuniform beam profiles to deliver precise radiation doses to a tumor while minimizing radiation exposure to surrounding normal tissues. However, intrafraction organ motion distorts the dose distribution and leads to significant dosimetric errors. In this research, we applied an aperture adaptive technique with a visual guiding system to tackle the problem of respiratory motion. A homemade computer program showing a cyclic moving pattern was projected onto the ceiling to visually help patients adjust their respiratory patterns. Once the respiratory motion becomes regular, the leaf sequence can be synchronized with the target motion. An oscillator was employed to simulate the patient's breathing pattern. Two simple fields and one IMRT field were measured to verify the accuracy. Preliminary results showed that after appropriate training, the amplitude and duration of volunteers' breathing can be well controlled by the visual guiding system. The sharp dose gradient at the edge of the radiation fields was successfully restored. The maximum dosimetric error in the IMRT field was significantly decreased from 63% to 3%. We conclude that the aperture adaptive technique with the visual guiding system can be an inexpensive and feasible alternative without compromising delivery efficiency in clinical practice.

  20. Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity.

    PubMed

    Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2015-09-01

    The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. 
© 2015 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  1. Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity

    PubMed Central

    Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2015-01-01

    The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. PMID:26073648
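
The idea of reading clusters off a similarity network "at a single edge threshold" can be illustrated with a small sketch: keep only edges whose score passes the threshold, then take connected components. This is a minimal illustration, assuming that a higher score means greater similarity; the node names and scores are hypothetical, and it does not reproduce the BLAST, TM-Align or DASP scoring used in the study.

```python
def clusters_at_threshold(nodes, scored_edges, threshold):
    """Group nodes into connected components using edges with score >= threshold."""
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_edges:
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    # Sorted output makes the result deterministic.
    return sorted(sorted(g) for g in groups.values())


# Hypothetical proteins and pairwise similarity scores.
proteins = ["A", "B", "C", "D"]
edges = [("A", "B", 0.9), ("B", "C", 0.4), ("C", "D", 0.8)]
strict = clusters_at_threshold(proteins, edges, threshold=0.5)
loose = clusters_at_threshold(proteins, edges, threshold=0.3)
```

Raising the threshold splits the network into more, smaller clusters, which is why the choice of edge metric and threshold determines how well the clusters line up with a curated functional hierarchy.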

  2. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences.

    PubMed

    Meinicke, Peter

    2009-09-02

    Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.

  3. Quantitative profiling of immune repertoires for minor lymphocyte counts using unique molecular identifiers.

    PubMed

    Egorov, Evgeny S; Merzlyak, Ekaterina M; Shelenkov, Andrew A; Britanova, Olga V; Sharonov, George V; Staroverov, Dmitriy B; Bolotin, Dmitriy A; Davydov, Alexey N; Barsova, Ekaterina; Lebedev, Yuriy B; Shugay, Mikhail; Chudakov, Dmitriy M

    2015-06-15

    Emerging high-throughput sequencing methods for the analyses of complex structure of TCR and BCR repertoires give a powerful impulse to adaptive immunity studies. However, there are still essential technical obstacles for performing a truly quantitative analysis. Specifically, it remains challenging to obtain comprehensive information on the clonal composition of small lymphocyte populations, such as Ag-specific, functional, or tissue-resident cell subsets isolated by sorting, microdissection, or fine needle aspirates. In this study, we report a robust approach based on unique molecular identifiers that allows profiling Ag receptors for several hundred to thousand lymphocytes while preserving qualitative and quantitative information on clonal composition of the sample. We also describe several general features regarding the data analysis with unique molecular identifiers that are critical for accurate counting of starting molecules in high-throughput sequencing applications. Copyright © 2015 by The American Association of Immunologists, Inc.
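
The core counting step behind unique molecular identifiers (UMIs) can be sketched as follows: reads that share both a clonotype and a UMI are treated as copies of one starting molecule, so each clonotype is quantified by its number of distinct UMIs. This is a simplified illustration only; the read tuples are invented, and the sketch omits the UMI error correction and filtering that a real pipeline needs.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per clonotype from (clonotype, umi) read pairs."""
    umis_by_clonotype = defaultdict(set)
    for clonotype, umi in reads:
        umis_by_clonotype[clonotype].add(umi)
    return {c: len(u) for c, u in umis_by_clonotype.items()}


# Invented reads: the first two are PCR duplicates of the same molecule.
reads = [
    ("CASSLGQ", "AAGT"),
    ("CASSLGQ", "AAGT"),
    ("CASSLGQ", "CCTA"),
    ("CASRDTQ", "GGAC"),
]
molecule_counts = count_molecules(reads)
```

Counting distinct UMIs rather than raw reads removes amplification bias, which matters most for the small lymphocyte populations the abstract describes.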

  4. The Victor C++ library for protein representation and advanced manipulation.

    PubMed

    Hirsh, Layla; Piovesan, Damiano; Giollo, Manuel; Ferrari, Carlo; Tosatto, Silvio C E

    2015-04-01

    Protein sequence and structure representation and manipulation require dedicated software libraries to support methods of increasing complexity. Here, we describe the VIrtual Construction TOol for pRoteins (Victor) C++ library, an open source platform dedicated to enabling inexperienced users to develop advanced tools and gathering contributions from the community. The provided application examples cover statistical energy potentials, profile-profile sequence alignments and ab initio loop modeling. Victor was used over the last 15 years in several publications and optimized for efficiency. It is provided as a GitHub repository with source files and unit tests, plus extensive online documentation, including a Wiki with help files and tutorials, examples and Doxygen documentation. The C++ library and online documentation, distributed under a GPL license are available from URL: http://protein.bio.unipd.it/victor/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  5. FIST: a sensory domain for diverse signal transduction pathways in prokaryotes and ubiquitin signaling in eukaryotes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Borziak, Kirill; Jouline, Igor B

    2007-01-01

    Motivation: Sensory domains that are conserved among Bacteria, Archaea and Eucarya are important detectors of common signals detected by living cells. Due to their high sequence divergence, sensory domains are difficult to identify. We systematically look for novel sensory domains using sensitive profile-based searches initiated with regions of signal transduction proteins where no known domains can be identified by current domain models. Results: Using profile searches followed by multiple sequence alignment, structure prediction, and domain architecture analysis, we have identified a novel sensory domain termed FIST, which is present in signal transduction proteins from Bacteria, Archaea and Eucarya. Remote similarity to a known ligand-binding fold and chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids.

  6. Identification of (R)-selective ω-aminotransferases by exploring evolutionary sequence space.

    PubMed

    Kim, Eun-Mi; Park, Joon Ho; Kim, Byung-Gee; Seo, Joo-Hyun

    2018-03-01

    Several (R)-selective ω-aminotransferases (R-ωATs) have been reported. The existence of additional R-ωATs having different sequence characteristics from previous ones is highly expected. In addition, it is generally accepted that R-ωATs are variants of aminotransferase group III. Based on these backgrounds, sequences in RefSeq database were scored using family profiles of branched-chain amino acid aminotransferase (BCAT) and d-alanine aminotransferase (DAT) to predict and identify putative R-ωATs. Sequences with two profile analysis scores were plotted on two-dimensional score space. Candidates with relatively similar scores in both BCAT and DAT profiles (i.e., profile analysis score using BCAT profile was similar to profile analysis score using DAT profile) were selected. Experimental results for selected candidates showed that putative R-ωATs from Saccharopolyspora erythraea (R-ωAT_Sery), Bacillus cellulosilyticus (R-ωAT_Bcel), and Bacillus thuringiensis (R-ωAT_Bthu) had R-ωAT activity. Additional experiments revealed that R-ωAT_Sery also possessed DAT activity while R-ωAT_Bcel and R-ωAT_Bthu had BCAT activity. Selecting putative R-ωATs from regions with similar profile analysis scores identified potential R-ωATs. Therefore, R-ωATs could be efficiently identified by using simple family profile analysis and exploring evolutionary sequence space. Copyright © 2017 Elsevier Inc. All rights reserved.
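
The selection step described in this abstract, picking sequences whose BCAT and DAT profile scores are similar, can be sketched as a filter on a two-dimensional score space. The function below is a hypothetical illustration: the candidate names, scores, the `max_diff` tolerance and the `min_score` floor are all assumptions, since the abstract does not give the actual cutoffs.

```python
def select_candidates(scores, max_diff, min_score=0.0):
    """Return names whose (BCAT, DAT) profile scores lie close to the diagonal.

    scores: mapping of sequence name -> (bcat_score, dat_score)
    max_diff: largest allowed absolute difference between the two scores
    min_score: both scores must reach this value (guards against weak hits)
    """
    return [
        name
        for name, (bcat, dat) in scores.items()
        if abs(bcat - dat) <= max_diff and min(bcat, dat) >= min_score
    ]


# Made-up profile scores for three hypothetical sequences.
profile_scores = {
    "cand_A": (120.0, 115.0),  # similar and strong: kept
    "cand_B": (200.0, 60.0),   # clearly BCAT-like: dropped
    "cand_C": (30.0, 28.0),    # similar but weak: dropped by min_score
}
selected = select_candidates(profile_scores, max_diff=10.0, min_score=50.0)
```

The floor on both scores is a design choice of this sketch rather than something the abstract reports; without it, sequences unrelated to either family would also sit near the diagonal.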

  7. Short guide to SDI profiling at ORNL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pomerance, H.S.

    1976-06-01

    ORNL has machine-searchable data bases that correspond to printed indexes and abstracts. This guide describes the peculiarities of those several data bases and the conventions of the ORNL search system so that users can write their own queries or search profiles and can interpret the part of the output that is encoded.

  8. Understanding the core of RNA interference: The dynamic aspects of Argonaute-mediated processes.

    PubMed

    Zhu, Lizhe; Jiang, Hanlun; Sheong, Fu Kit; Cui, Xuefeng; Wang, Yanli; Gao, Xin; Huang, Xuhui

    2017-09-01

    At the core of RNA interference, the Argonaute proteins (Ago) load and utilize small guide nucleic acids to silence mRNAs or cleave foreign nucleic acids in a sequence specific manner. In recent years, based on extensive structural studies of Ago and its interaction with the nucleic acids, considerable progress has been made to reveal the dynamic aspects of various Ago-mediated processes. Here we review these novel insights into the guide-strand loading, duplex unwinding, and effects of seed mismatch, with a focus on two representative Agos, the human Ago 2 (hAgo2) and the bacterial Thermus thermophilus Ago (TtAgo). In particular, comprehensive molecular simulation studies revealed that although sharing similar overall structures, the two Agos have vastly different conformational landscapes and guide-strand loading mechanisms because of the distinct rigidity of their L1-PAZ hinge. Given the central role of the PAZ motions in regulating the exposure of the nucleic acid binding channel, these findings exemplify the importance of protein motions in distinguishing the overlapping, yet distinct, mechanisms of Ago-mediated processes in different organisms. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Development of an ultrasmall C-band linear accelerator guide for a four-dimensional image-guided radiotherapy system with a gimbaled x-ray head.

    PubMed

    Kamino, Yuichiro; Miura, Sadao; Kokubo, Masaki; Yamashita, Ichiro; Hirai, Etsuro; Hiraoka, Masahiro; Ishikawa, Junzo

    2007-05-01

    We are developing a four-dimensional image-guided radiotherapy system with a gimbaled x-ray head. It is capable of pursuing irradiation and delivering irradiation precisely with the help of an agile moving x-ray head on the gimbals. Requirements for the accelerator guide were established, system design was developed, and detailed design was conducted. An accelerator guide was manufactured and basic beam performance and leakage radiation from the accelerator guide were evaluated at a low pulse repetition rate. The accelerator guide including the electron gun is 38 cm long and weighs about 10 kg. The length of the accelerating structure is 24.4 cm. The accelerating structure is a standing wave type and is composed of the axial-coupled injector section and the side-coupled acceleration cavity section. The injector section is composed of one prebuncher cavity, one buncher cavity, one side-coupled half cavity, and two axial coupling cavities. The acceleration cavity section is composed of eight side-coupled nose reentrant cavities and eight coupling cavities. The electron gun is a diode-type gun with a cerium hexaboride (CeB6) direct heating cathode. The accelerator guide can be operated without any magnetic focusing device. Output beam current was 75 mA with a transmission efficiency of 58%, and the average energy was 5.24 MeV. Beam energy was distributed from 4.95 to 5.6 MeV. The beam profile, measured 88 mm from the beam output hole on the axis of the accelerator guide, was 0.7 mm × 0.9 mm full width at half maximum (FWHM). The beam loading line was E (MeV) = 5.925 − 0.00808 × Ib (mA), where E is the beam energy and Ib is the output beam current. The maximum radiation leakage of the accelerator guide at 100 cm from the axis of the accelerator guide was calculated as 0.33 cGy/min at the rated x-ray output of 500 cGy/min from the measured value. This leakage requires no radiation shielding for the accelerator guide itself per IEC 60601-2-1.
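
The beam loading line quoted in this abstract is a linear relation between beam energy and output beam current. Reading it as an intercept of 5.925 MeV and a slope of 0.00808 MeV/mA (an interpretation of the abstract's notation, not a statement from the authors), it can be evaluated as in the short sketch below; the function name is illustrative.

```python
def beam_energy_mev(beam_current_ma):
    """Evaluate the quoted beam loading line: energy falls linearly with current."""
    intercept_mev = 5.925        # energy extrapolated to zero beam current
    slope_mev_per_ma = 0.00808   # energy lost per mA of beam loading
    return intercept_mev - slope_mev_per_ma * beam_current_ma


# Energy on the loading line at the reported output current of 75 mA.
energy_at_75_ma = beam_energy_mev(75.0)
```

At 75 mA this gives roughly 5.32 MeV, which lies inside the reported 4.95 to 5.6 MeV energy spread but is not identical to the reported 5.24 MeV average; the abstract does not say how the two quantities were measured, so the difference should not be over-interpreted.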

  10. The Staphylococcus aureus Two-Component System AgrAC Displays Four Distinct Genomic Arrangements That Delineate Genomic Virulence Factor Signatures

    PubMed Central

    Choudhary, Kumari S.; Mih, Nathan; Monk, Jonathan; Kavvas, Erol; Yurkovich, James T.; Sakoulas, George; Palsson, Bernhard O.

    2018-01-01

    Two-component systems (TCSs) consist of a histidine kinase and a response regulator. Here, we evaluated the conservation of the AgrAC TCS among 149 completely sequenced Staphylococcus aureus strains. It is composed of four genes: agrBDCA. We found that: (i) AgrAC system (agr) was found in all but one of the 149 strains, (ii) the agr positive strains were further classified into four agr types based on AgrD protein sequences, (iii) the four agr types not only specified the chromosomal arrangement of the agr genes but also the sequence divergence of AgrC histidine kinase protein, which confers signal specificity, (iv) the sequence divergence was reflected in distinct structural properties especially in the transmembrane region and second extracellular binding domain, and (v) there was a strong correlation between the agr type and the virulence genomic profile of the organism. Taken together, these results demonstrate that bioinformatic analysis of the agr locus leads to a classification system that correlates with the presence of virulence factors and protein structural properties. PMID:29887846

  11. Profiling defect depth in composite materials using thermal imaging NDE

    NASA Astrophysics Data System (ADS)

    Obeidat, Omar; Yu, Qiuye; Han, Xiaoyan

    2018-04-01

    Sonic Infrared (SIR) NDE is a relatively new NDE technology that has been demonstrated as a reliable and sensitive method to detect defects. SIR uses ultrasonic excitation with infrared (IR) imaging to detect defects and flaws in the structures being inspected. An IR camera captures infrared radiation from the target for a period of time covering the ultrasound pulse. This period of time may be much longer than the pulse depending on the defect depth and the thermal properties of the materials. With the increasing deployment of composites in modern aerospace and automobile structures, fast, wide-area and reliable NDE methods are necessary. Impact damage is one of the major concerns in modern composites. Damage can occur at a certain depth without any visual indication on the surface. Defect depth information can influence maintenance decisions. Depth profiling relies on the time delays in the captured image sequence. We present our work on defect depth profiling using the temporal information of IR images. An analytical model is introduced to describe heat diffusion from subsurface defects in composite materials. Depth profiling using peak time is introduced as well.

  12. The Ottawa Model of Research Use: a guide to clinical innovation in the NICU.

    PubMed

    Hogan, Debora L; Logan, Jo

    2004-01-01

    To improve performance of a neonatal transport team by implementing a research-based family assessment instrument. Objectives included providing a structure for evaluating families and fostering the healthcare relationship. Neonatal transports are associated with family crises. Transport teams require a comprehensive framework to accurately assess family responses to adversity and tools to guide their practice toward parental mastery of the event. Currently, there are no assessment tools that merge family nursing expertise with neonatal transport. A family assessment tool grounded in contemporary family nursing theory and research was developed by a clinical nurse specialist. The Ottawa Model of Research Use guided the process of piloting the innovation with members of a transport team. Focus groups, interviews, and surveys were conducted to create profiles of barriers and facilitators to research use by team members. Tailored research transfer strategies were enacted based on the profile results. Formative evaluations demonstrated improvements in team members' perceptions of their knowledge, family centeredness, and ability to assess and intervene with families. The family assessment tool is currently being incorporated into Clinical Practice Guidelines for Transport and thus will be considered standard care. Use of a family assessment tool is an effective way of appraising families and addressing suffering. The Ottawa Model of Research Use provided a framework for implementing the clinical innovation. A key role of the clinical nurse specialist is to influence nursing practice by fostering research use by practitioners. When developing and implementing a clinical innovation, input from end users and consumers is pivotal. Incorporating the innovation into a practice guideline provides a structure to imbed research evidence into practice.

  13. fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

    DTIC Science & Technology

    2007-04-04

    machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel

  14. Multilocus Sequence Typing Analysis of Staphylococcus lugdunensis Implies a Clonal Population Structure

    PubMed Central

    Chassain, Benoît; Lemée, Ludovic; Didi, Jennifer; Thiberge, Jean-Michel; Brisse, Sylvain; Pons, Jean-Louis

    2012-01-01

    Staphylococcus lugdunensis is recognized as one of the major pathogenic species within the genus Staphylococcus, even though it belongs to the coagulase-negative group. A multilocus sequence typing (MLST) scheme was developed to study the genetic relationships and population structure of 87 S. lugdunensis isolates from various clinical and geographic sources by DNA sequence analysis of seven housekeeping genes (aroE, dat, ddl, gmk, ldh, recA, and yqiL). The number of alleles ranged from four (gmk and ldh) to nine (yqiL). Allelic profiles allowed the definition of 20 different sequence types (STs) and five clonal complexes. The 20 STs lacked correlation with geographic source. Isolates recovered from hematogenic infections (blood or osteoarticular isolates) or from skin and soft tissue infections did not cluster in separate lineages. Penicillin-resistant isolates clustered mainly in one clonal complex, unlike glycopeptide-tolerant isolates, which did not constitute a distinct subpopulation within S. lugdunensis. Phylogenies from the sequences of the seven individual housekeeping genes were congruent, indicating a predominantly mutational evolution of these genes. Quantitative analysis of the linkages between alleles from the seven loci revealed a significant linkage disequilibrium, thus confirming a clonal population structure for S. lugdunensis. This first MLST scheme for S. lugdunensis provides a new tool for investigating the macroepidemiology and phylogeny of this unusually virulent coagulase-negative Staphylococcus. PMID:22785196
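
    The step from allelic profiles to sequence types described in this abstract can be sketched as follows. The seven locus names are taken from the abstract; the isolate names and allele numbers are hypothetical, and ST numbers are assigned here simply in order of first appearance, whereas a real MLST scheme takes them from a curated database.

```python
# Seven housekeeping loci named in the abstract.
LOCI = ("aroE", "dat", "ddl", "gmk", "ldh", "recA", "yqiL")


def assign_sequence_types(profiles):
    """Map each isolate to an ST number; identical allelic profiles share one ST."""
    st_of_profile = {}   # allelic profile (tuple of allele numbers) -> ST number
    st_of_isolate = {}
    for isolate, alleles in profiles.items():
        key = tuple(alleles[locus] for locus in LOCI)
        if key not in st_of_profile:
            # Illustrative numbering only: next unused integer.
            st_of_profile[key] = len(st_of_profile) + 1
        st_of_isolate[isolate] = st_of_profile[key]
    return st_of_isolate


# Hypothetical isolates: iso1 and iso3 share a profile, iso2 differs at yqiL.
isolates = {
    "iso1": dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1))),
    "iso2": dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 2))),
    "iso3": dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1))),
}
sequence_types = assign_sequence_types(isolates)
print(sequence_types)
```

    Grouping STs into clonal complexes, as reported in the abstract, would need an additional rule (for example, a threshold on shared alleles) that the abstract does not specify.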

  15. Guide-bound structures of an RNA-targeting A-cleaving CRISPR-Cas13a enzyme

    PubMed Central

    Knott, Gavin J.; East-Seletsky, Alexandra; Cofsky, Joshua C.; Holton, James M.; Charles, Emeric; O’Connell, Mitchell R.; Doudna, Jennifer A.

    2018-01-01

    CRISPR adaptive immune systems protect bacteria from infections by deploying CRISPR RNA (crRNA)-guided enzymes to recognize and cut foreign nucleic acids. Type VI-A CRISPR-Cas systems include the Cas13a enzyme, an RNA-activated ribonuclease (RNase) capable of crRNA processing and single-stranded RNA degradation upon target transcript binding. Here we present the 2.0 Å resolution crystal structure of a crRNA-bound Lachnospiraceae bacterium Cas13a (LbaCas13a), representing a recently discovered Cas13a enzyme subtype. This structure and accompanying biochemical experiments define for the first time the Cas13a catalytic residues that are directly responsible for crRNA maturation. In addition, the orientation of the foreign-derived target RNA-specifying sequence in the protein interior explains the conformational gating of Cas13a nuclease activation. These results describe how Cas13a enzymes generate functional crRNAs and how catalytic activity is blocked prior to target RNA recognition, with implications for both bacterial immunity and diagnostic applications. PMID:28892041

  16. Expanding the Described Metabolome of the Marine Cyanobacterium Moorea producens JHB through Orthogonal Natural Products Workflows

    PubMed Central

    Boudreau, Paul D.; Monroe, Emily A.; Mehrotra, Suneet; Desfor, Shane; Korobeynikov, Anton; Sherman, David H.; Murray, Thomas F.; Gerwick, Lena; Dorrestein, Pieter C.; Gerwick, William H.

    2015-01-01

    Moorea producens JHB, a Jamaican strain of tropical filamentous marine cyanobacteria, has been extensively studied by traditional natural products techniques. These previous bioassay and structure guided isolations led to the discovery of two exciting classes of natural products, hectochlorin (1) and jamaicamides A (2) and B (3). In the current study, mass spectrometry-based ‘molecular networking’ was used to visualize the metabolome of Moorea producens JHB, and both guided and enhanced the isolation workflow, revealing additional metabolites in these compound classes. Further, we developed additional insight into the metabolic capabilities of this strain by genome sequencing analysis, which subsequently led to the isolation of a compound unrelated to the jamaicamide and hectochlorin families. Another approach involved stimulation of the biosynthesis of a minor jamaicamide metabolite by cultivation in modified media, and provided insights about the underlying biosynthetic machinery as well as preliminary structure-activity information within this structure class. This study demonstrated that these orthogonal approaches are complementary and enrich secondary metabolomic coverage even in an extensively studied bacterial strain. PMID:26222584

  17. EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A

    PubMed Central

    Ndhlovu, Andrew; Durand, Pierre M.; Hazelhurst, Scott

    2015-01-01

    The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. Database URL: http://www.bioinf.wits.ac.za/software/fire/evodb PMID:26140928

  18. EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A.

    PubMed

    Ndhlovu, Andrew; Durand, Pierre M; Hazelhurst, Scott

    2015-01-01

    The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. © The Author(s) 2015. Published by Oxford University Press.

  19. Reference-guided de novo assembly approach improves genome reconstruction for related species.

    PubMed

    Lischer, Heidi E L; Shimizu, Kentaro K

    2017-11-10

    The development of next-generation sequencing has made it possible to sequence whole genomes at a relatively low cost. However, de novo genome assemblies remain challenging due to short read length, missing data, repetitive regions, polymorphisms and sequencing errors. As more and more genomes are sequenced, reference-guided assembly approaches can be used to assist the assembly process. However, previous methods mostly focused on the assembly of other genotypes within the same species. We adapted and extended a reference-guided de novo assembly approach, which enables the usage of a related reference sequence to guide the genome assembly. In order to compare and evaluate de novo and our reference-guided de novo assembly approaches, we used a simulated data set of a repetitive and heterozygotic plant genome. The extended reference-guided de novo assembly approach almost always outperforms the corresponding de novo assembly program even when a reference of a different species is used. Similar improvements can be observed in high and low coverage situations. In addition, we show that a single evaluation metric, like the widely used N50 length, is not enough to properly rate assemblies as it not always points to the best assembly evaluated with other criteria. Therefore, we used the summed z-scores of 36 different statistics to evaluate the assemblies. The combination of reference mapping and de novo assembly provides a powerful tool to improve genome reconstruction by integrating information of a related genome. Our extension of the reference-guided de novo assembly approach enables the application of this strategy not only within but also between related species. Finally, the evaluation of genome assemblies is often not straight forward, as the truth is not known. Thus one should always use a combination of evaluation metrics, which not only try to assess the continuity but also the accuracy of an assembly.
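
    For readers unfamiliar with the N50 length mentioned in this abstract, here is a minimal sketch of the usual definition: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The contig lengths in the example are made up.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 is undefined for an empty assembly")


# Hypothetical assembly: total length 290, half is 145.
# 80 + 70 = 150 >= 145, so the N50 is 70.
example_n50 = n50([80, 70, 50, 40, 30, 20])
print(example_n50)
```

    N50 only measures contiguity, which is consistent with the abstract's point that it should be combined with accuracy-oriented metrics rather than used alone.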

  20. Structural basis of DNA sequence recognition by the response regulator PhoP in Mycobacterium tuberculosis.

    PubMed

    He, Xiaoyuan; Wang, Liqin; Wang, Shuishu

    2016-04-15

    The transcriptional regulator PhoP is an essential virulence factor in Mycobacterium tuberculosis, and it presents a target for the development of new anti-tuberculosis drugs and attenuated tuberculosis vaccine strains. PhoP binds to DNA as a highly cooperative dimer by recognizing direct repeats of 7-bp motifs with a 4-bp spacer. To elucidate the PhoP-DNA binding mechanism, we determined the crystal structure of the PhoP-DNA complex. The structure revealed a tandem PhoP dimer that bound to the direct repeat. The surprising tandem arrangement of the receiver domains allowed the four domains of the PhoP dimer to form a compact structure, accounting for the strict requirement of a 4-bp spacer and the highly cooperative binding of the dimer. The PhoP-DNA interactions exclusively involved the effector domain. The sequence-recognition helix made contact with the bases of the 7-bp motif in the major groove, and the wing interacted with the adjacent minor groove. The structure provides a starting point for the elucidation of the mechanism by which PhoP regulates the virulence of M. tuberculosis and guides the design of screening platforms for PhoP inhibitors.

  1. Mechanism of duplex DNA destabilization by RNA-guided Cas9 nuclease during target interrogation

    PubMed Central

    Mekler, Vladimir; Minakhin, Leonid; Severinov, Konstantin

    2017-01-01

    The prokaryotic clustered regularly interspaced short palindromic repeats (CRISPR)-associated 9 (Cas9) endonuclease cleaves double-stranded DNA sequences specified by guide RNA molecules and flanked by a protospacer adjacent motif (PAM) and is widely used for genome editing in various organisms. The RNA-programmed Cas9 locates the target site by scanning genomic DNA. We sought to elucidate the mechanism of initial DNA interrogation steps that precede the pairing of target DNA with guide RNA. Using fluorometric and biochemical assays, we studied Cas9/guide RNA complexes with model DNA substrates that mimicked early intermediates on the pathway to the final Cas9/guide RNA–DNA complex. The results show that Cas9/guide RNA binding to PAM favors separation of a few PAM-proximal protospacer base pairs allowing initial target interrogation by guide RNA. The duplex destabilization is mediated, in part, by Cas9/guide RNA affinity for unpaired segments of nontarget strand DNA close to PAM. Furthermore, our data indicate that the entry of double-stranded DNA beyond a short threshold distance from PAM into the Cas9/single-guide RNA (sgRNA) interior is hindered. We suggest that the interactions unfavorable for duplex DNA binding promote DNA bending in the PAM-proximal region during early steps of Cas9/guide RNA–DNA complex formation, thus additionally destabilizing the protospacer duplex. The mechanism that emerges from our analysis explains how the Cas9/sgRNA complex is able to locate the correct target sequence efficiently while interrogating numerous nontarget sequences associated with correct PAMs. PMID:28484024
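
    The notion of candidate target sites "flanked by a PAM" can be illustrated with a short scan. This is a sketch under stated assumptions: the abstract does not name the Cas9 ortholog, so the example assumes the 5'-NGG-3' PAM of Streptococcus pyogenes Cas9 and a 20-nt protospacer, and it scans only the given strand. The function name and test sequence are hypothetical.

```python
import re


def find_candidate_sites(dna, protospacer_len=20):
    """Return (protospacer, pam, pam_start) for every NGG PAM with a full-length
    protospacer immediately 5' of it on the given strand."""
    dna = dna.upper()
    sites = []
    # Zero-width lookahead so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start >= protospacer_len:
            protospacer = dna[pam_start - protospacer_len:pam_start]
            sites.append((protospacer, match.group(1), pam_start))
    return sites


# Hypothetical sequence: a 20-nt protospacer followed by the PAM "AGG".
sites = find_candidate_sites("ACGTACGTACGTACGTACGTAGGTT")
print(sites)
```

    A scan like this only lists PAM-adjacent candidates. Deciding which candidate matches a given guide RNA is the interrogation step that the abstract studies experimentally.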

  2. Mechanism of duplex DNA destabilization by RNA-guided Cas9 nuclease during target interrogation.

    PubMed

    Mekler, Vladimir; Minakhin, Leonid; Severinov, Konstantin

    2017-05-23

    The prokaryotic clustered regularly interspaced short palindromic repeats (CRISPR)-associated 9 (Cas9) endonuclease cleaves double-stranded DNA sequences specified by guide RNA molecules and flanked by a protospacer adjacent motif (PAM) and is widely used for genome editing in various organisms. The RNA-programmed Cas9 locates the target site by scanning genomic DNA. We sought to elucidate the mechanism of initial DNA interrogation steps that precede the pairing of target DNA with guide RNA. Using fluorometric and biochemical assays, we studied Cas9/guide RNA complexes with model DNA substrates that mimicked early intermediates on the pathway to the final Cas9/guide RNA-DNA complex. The results show that Cas9/guide RNA binding to PAM favors separation of a few PAM-proximal protospacer base pairs allowing initial target interrogation by guide RNA. The duplex destabilization is mediated, in part, by Cas9/guide RNA affinity for unpaired segments of nontarget strand DNA close to PAM. Furthermore, our data indicate that the entry of double-stranded DNA beyond a short threshold distance from PAM into the Cas9/single-guide RNA (sgRNA) interior is hindered. We suggest that the interactions unfavorable for duplex DNA binding promote DNA bending in the PAM-proximal region during early steps of Cas9/guide RNA-DNA complex formation, thus additionally destabilizing the protospacer duplex. The mechanism that emerges from our analysis explains how the Cas9/sgRNA complex is able to locate the correct target sequence efficiently while interrogating numerous nontarget sequences associated with correct PAMs.

  3. Utilizing Structures of CYP2D6 and BACE1 Complexes To Reduce Risk of Drug–Drug Interactions with a Novel Series of Centrally Efficacious BACE1 Inhibitors

    PubMed Central

    2016-01-01

    In recent years, the first generation of β-secretase (BACE1) inhibitors advanced into clinical development for the treatment of Alzheimer’s disease (AD). However, the alignment of drug-like properties and selectivity remains a major challenge. Herein, we describe the discovery of a novel class of potent, low clearance, CNS penetrant BACE1 inhibitors represented by thioamidine 5. Further profiling suggested that a high fraction of the metabolism (>95%) was due to CYP2D6, increasing the potential risk for victim-based drug–drug interactions (DDI) and variable exposure in the clinic due to the polymorphic nature of this enzyme. To guide future design, we solved crystal structures of CYP2D6 complexes with substrate 5 and its corresponding metabolic product pyrazole 6, which provided insight into the binding mode and movements between substrate/inhibitor complexes. Guided by the BACE1 and CYP2D6 crystal structures, we designed and synthesized analogues with reduced risk for DDI, central efficacy, and improved hERG therapeutic margins. PMID:25781223

  4. State of Iowa Scope and Sequence for Vocational Home Economics.

    ERIC Educational Resources Information Center

    Iowa State Dept. of Public Instruction, Des Moines. Div. of Career Education.

    This scope and sequence guide for vocational home economics programs in Iowa discusses the most important dimensions of the program and describes the general purposes of programs along with factors that affect them, both in general and for Iowa specifically. The introductory chapter of the guide sets the context and the mission and also includes a…

  5. Computational Redesign of Acyl-ACP Thioesterase with Improved Selectivity toward Medium-Chain-Length Fatty Acids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grisewood, Matthew J.; Hernández-Lozada, Néstor J.; Thoden, James B.

    Enzyme and metabolic engineering offer the potential to develop biocatalysts for converting natural resources to a wide range of chemicals. To broaden the scope of potential products beyond natural metabolites, methods of engineering enzymes to accept alternative substrates and/or perform novel chemistries must be developed. DNA synthesis can create large libraries of enzyme-coding sequences, but most biochemistries lack a simple assay to screen for promising enzyme variants. Our solution to this challenge is structure-guided mutagenesis, in which optimization algorithms select the best sequences from libraries based on specified criteria (i.e., binding selectivity). We demonstrate this approach by identifying medium-chain (C8–C12) acyl-ACP thioesterases through structure-guided mutagenesis. Medium-chain fatty acids, which are products of thioesterase-catalyzed hydrolysis, are limited in natural abundance, compared to long-chain fatty acids; the limited supply leads to high costs of C6–C10 oleochemicals such as fatty alcohols, amines, and esters. Here, we applied computational tools to tune substrate binding of the highly active ‘TesA thioesterase in Escherichia coli. We used the IPRO algorithm to design thioesterase variants with enhanced C12 or C8 specificity, while maintaining high activity. After four rounds of structure-guided mutagenesis, we identified 3 variants with enhanced production of dodecanoic acid (C12) and 27 variants with enhanced production of octanoic acid (C8). The top variants reached up to 49% C12 and 50% C8 while exceeding native levels of total free fatty acids. A comparably sized library created by random mutagenesis failed to identify promising mutants. The chain length-preference of ‘TesA and the best mutant were confirmed in vitro using acyl-CoA substrates. Molecular dynamics simulations, confirmed by resolved crystal structures, of ‘TesA variants suggest that hydrophobic forces govern ‘TesA substrate specificity. Finally, we expect the design rules that we uncovered and the thioesterase variants that we identified will be useful to metabolic engineering projects aimed at sustainable production of medium-chain-length oleochemicals.

  6. Computational Redesign of Acyl-ACP Thioesterase with Improved Selectivity toward Medium-Chain-Length Fatty Acids

    DOE PAGES

    Grisewood, Matthew J.; Hernández-Lozada, Néstor J.; Thoden, James B.; ...

    2017-04-20

    Enzyme and metabolic engineering offer the potential to develop biocatalysts for converting natural resources to a wide range of chemicals. To broaden the scope of potential products beyond natural metabolites, methods of engineering enzymes to accept alternative substrates and/or perform novel chemistries must be developed. DNA synthesis can create large libraries of enzyme-coding sequences, but most biochemistries lack a simple assay to screen for promising enzyme variants. Our solution to this challenge is structure-guided mutagenesis, in which optimization algorithms select the best sequences from libraries based on specified criteria (i.e., binding selectivity). We demonstrate this approach by identifying medium-chain (C8–C12) acyl-ACP thioesterases through structure-guided mutagenesis. Medium-chain fatty acids, which are products of thioesterase-catalyzed hydrolysis, are limited in natural abundance, compared to long-chain fatty acids; the limited supply leads to high costs of C6–C10 oleochemicals such as fatty alcohols, amines, and esters. Here, we applied computational tools to tune substrate binding of the highly active ‘TesA thioesterase in Escherichia coli. We used the IPRO algorithm to design thioesterase variants with enhanced C12 or C8 specificity, while maintaining high activity. After four rounds of structure-guided mutagenesis, we identified 3 variants with enhanced production of dodecanoic acid (C12) and 27 variants with enhanced production of octanoic acid (C8). The top variants reached up to 49% C12 and 50% C8 while exceeding native levels of total free fatty acids. A comparably sized library created by random mutagenesis failed to identify promising mutants. The chain length-preference of ‘TesA and the best mutant were confirmed in vitro using acyl-CoA substrates. Molecular dynamics simulations, confirmed by resolved crystal structures, of ‘TesA variants suggest that hydrophobic forces govern ‘TesA substrate specificity. Finally, we expect the design rules that we uncovered and the thioesterase variants that we identified will be useful to metabolic engineering projects aimed at sustainable production of medium-chain-length oleochemicals.

  7. Terminal Duplex Stability and Nucleotide Identity Differentially Control siRNA Loading and Activity in RNA Interference

    PubMed Central

    Angart, Phillip A.; Carlson, Rebecca J.; Adu-Berchie, Kwasi

    2016-01-01

    Efficient short interfering RNA (siRNA)-mediated gene silencing requires selection of a sequence that is complementary to the intended target and possesses sequence and structural features that encourage favorable functional interactions with the RNA interference (RNAi) pathway proteins. In this study, we investigated how terminal sequence and structural characteristics of siRNAs contribute to siRNA strand loading and silencing activity and how these characteristics ultimately result in a functionally asymmetric duplex in cultured HeLa cells. Our results reiterate that the most important characteristic in determining siRNA activity is the 5′ terminal nucleotide identity. Our findings further suggest that siRNA loading is controlled principally by the hybridization stability of the 5′ terminus (Nucleotides: 1–2) of each siRNA strand, independent of the opposing terminus. Postloading, RNA-induced silencing complex (RISC)–specific activity was found to be improved by lower hybridization stability in the 5′ terminus (Nucleotides: 3–4) of the loaded siRNA strand and greater hybridization stability toward the 3′ terminus (Nucleotides: 17–18). Concomitantly, specific recognition of the 5′ terminal nucleotide sequence by human Argonaute 2 (Ago2) improves RISC half-life. These findings indicate that careful selection of siRNA sequences can maximize both the loading and the specific activity of the intended guide strand. PMID:27399870

  8. Understanding the mechanisms of protein-DNA interactions

    NASA Astrophysics Data System (ADS)

    Lavery, Richard

    2004-03-01

    Structural, biochemical and thermodynamic data on protein-DNA interactions show that specific recognition cannot be reduced to a simple set of binary interactions between the partners (such as hydrogen bonds, ion pairs or steric contacts). The mechanical properties of the partners also play a role and, in the case of DNA, variations in both conformation and flexibility as a function of base sequence can be a significant factor in guiding a protein to the correct binding site. All-atom molecular modeling offers a means of analyzing the role of different binding mechanisms within protein-DNA complexes of known structure. This however requires estimating the binding strengths for the full range of sequences with which a given protein can interact. Since this number grows exponentially with the length of the binding site it is necessary to find a method to accelerate the calculations. We have achieved this by using a multi-copy approach (ADAPT) which allows us to build a DNA fragment with a variable base sequence. The results obtained with this method correlate well with experimental consensus binding sequences. They enable us to show that indirect recognition mechanisms involving the sequence dependent properties of DNA play a significant role in many complexes. This approach also offers a means of predicting protein binding sites on the basis of binding energies, which is complementary to conventional lexical techniques.

  9. MIPE: A metagenome-based community structure explorer and SSU primer evaluation tool

    PubMed Central

    Zhou, Quan

    2017-01-01

    An understanding of microbial community structure is an important issue in the field of molecular ecology. The traditional molecular method involves amplification of small subunit ribosomal RNA (SSU rRNA) genes by polymerase chain reaction (PCR). However, PCR-based amplicon approaches are affected by primer bias and chimeras. With the development of high-throughput sequencing technology, unbiased SSU rRNA gene sequences can be mined from shotgun sequencing-based metagenomic or metatranscriptomic datasets to obtain a reflection of the microbial community structure in specific types of environment and to evaluate SSU primers. However, the use of short reads obtained through next-generation sequencing for primer evaluation has not been well resolved. The software MIPE (MIcrobiota metagenome Primer Explorer) was developed to adapt numerous short reads from metagenomes and metatranscriptomes. Using metagenomic or metatranscriptomic datasets as input, MIPE extracts and aligns rRNA to reveal detailed information on microbial composition and evaluate SSU rRNA primers. A mock dataset, a real Metagenomics Rapid Annotation using Subsystem Technology (MG-RAST) test dataset, two PrimerProspector test datasets and a real metatranscriptomic dataset were used to validate MIPE. The software calls Mothur (v1.33.3) and the SILVA database (v119) for the alignment and classification of rRNA genes from a metagenome or metatranscriptome. MIPE can effectively extract shotgun rRNA reads from a metagenome or metatranscriptome and is capable of classifying these sequences and exhibiting sensitivity to different SSU rRNA PCR primers. Therefore, MIPE can be used to guide primer design for specific environmental samples. PMID:28350876

  10. SeqRate: sequence-based protein folding type classification and rates prediction

    PubMed Central

    2010-01-01

    Background: Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding the protein folding process and guiding protein design. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input, and most do not distinguish the different kinetic nature (two-state folding or multi-state folding) of the proteins. Here we developed a method, SeqRate, to predict both protein folding kinetic type (two-state versus multi-state) and real-value folding rate using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from only protein sequence with support vector machines. Results: We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (sec^-1) in the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification and its accuracy of fold rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional information, such as structure-based geometric contacts, as inputs. Conclusions: Both the web server and software for predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html. PMID:20438647
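
    The two evaluation metrics quoted in this abstract (Pearson correlation and mean absolute difference on a base-10 logarithmic scale) can be sketched as follows. This is a minimal illustration, not SeqRate's own code; the function names and the rate values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean_abs_diff(xs, ys):
    """Mean absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical folding rates in s^-1 (illustrative values, not from the paper).
experimental = [1.0e3, 2.5e1, 4.0e4, 6.0e-1]
predicted = [8.0e2, 5.0e1, 1.0e4, 1.2e0]

# Both metrics are computed on log10(rate), as described in the abstract.
log_exp = [math.log10(k) for k in experimental]
log_pred = [math.log10(k) for k in predicted]

r = pearson_r(log_exp, log_pred)
mad = mean_abs_diff(log_exp, log_pred)
print(f"Pearson r = {r:.2f}, mean absolute difference = {mad:.2f} log10 units")
```

    Working in log10 units means a mean absolute difference of 1.0 corresponds to predictions that are off by a factor of ten on average.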

  11. Atmospheric turbulence profiling with SLODAR using multiple adaptive optics wavefront sensors.

    PubMed

    Wang, Lianqi; Schöck, Matthias; Chanan, Gary

    2008-04-10

    The slope detection and ranging (SLODAR) method recovers atmospheric turbulence profiles from time averaged spatial cross correlations of wavefront slopes measured by Shack-Hartmann wavefront sensors. The Palomar multiple guide star unit (MGSU) was set up to test tomographic multiple guide star adaptive optics and provided an ideal test bed for SLODAR turbulence altitude profiling. We present the data reduction methods and SLODAR results from MGSU observations made in 2006. Wind profiling is also performed using delayed wavefront cross correlations along with SLODAR analysis. The wind profiling analysis is shown to improve the height resolution of the SLODAR method and in addition gives the wind velocities of the turbulent layers.

  12. Writing DNA with GenoCAD.

    PubMed

    Czar, Michael J; Cai, Yizhi; Peccoud, Jean

    2009-07-01

    Chemical synthesis of custom DNA made to order calls for software streamlining the design of synthetic DNA sequences. GenoCAD (www.genocad.org) is a free web-based application to design protein expression vectors, artificial gene networks and other genetic constructs composed of multiple functional blocks called genetic parts. By capturing design strategies in grammatical models of DNA sequences, GenoCAD guides the user through the design process. By successively clicking on icons representing structural features or actual genetic parts, complex constructs composed of dozens of functional blocks can be designed in a matter of minutes. GenoCAD automatically derives the construct sequence from its comprehensive libraries of genetic parts. Upon completion of the design process, users can download the sequence for synthesis or further analysis. Users who elect to create a personal account on the system can customize their workspace by creating their own parts libraries, adding new parts to the libraries, or reusing designs to quickly generate sets of related constructs.

  13. The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

    PubMed Central

    Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

    2016-01-01

    Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. We here sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly. Combined with the previously reported cp genome sequence of E. koreanum, this was also the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite and circular structure that was rather conserved in genomic structure and the synteny of gene order. However, these cp genomes presented obvious variations at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes, which was not found in the other basal eudicotyledons. Rapidly evolving cp genome regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes were on the whole in accordance with the updated system of the genus, but indicated that the evolutionary relationships and the divisions of the genus need further investigation with more evidence. The availability of these cp genomes provides valuable genetic information for accurate species identification, taxonomy, phylogenetic resolution and evolution of Epimedium, and assists in the exploration and utilization of Epimedium plants. PMID:27014326

  14. Molecular Cytogenetics Guides Massively Parallel Sequencing of a Radiation-Induced Chromosome Translocation in Human Cells.

    PubMed

    Cornforth, Michael N; Anur, Pavana; Wang, Nicholas; Robinson, Erin; Ray, F Andrew; Bedford, Joel S; Loucas, Bradford D; Williams, Eli S; Peto, Myron; Spellman, Paul; Kollipara, Rahul; Kittler, Ralf; Gray, Joe W; Bailey, Susan M

    2018-05-11

    Chromosome rearrangements are large-scale structural variants that are recognized drivers of oncogenic events in cancers of all types. Cytogenetics allows for their rapid, genome-wide detection, but does not provide gene-level resolution. Massively parallel sequencing (MPS) promises DNA sequence-level characterization of the specific breakpoints involved, but is strongly influenced by bioinformatics filters that affect detection efficiency. We sought to characterize the breakpoint junctions of chromosomal translocations and inversions in the clonal derivatives of human cells exposed to ionizing radiation. Here, we describe the first successful use of DNA paired-end analysis to locate and sequence across the breakpoint junctions of a radiation-induced reciprocal translocation. The analyses employed, with varying degrees of success, several well-known bioinformatics algorithms, a task made difficult by the involvement of repetitive DNA sequences. As for underlying mechanisms, the results of Sanger sequencing suggested that the translocation in question was likely formed via microhomology-mediated non-homologous end joining (mmNHEJ). To our knowledge, this represents the first use of MPS to characterize the breakpoint junctions of a radiation-induced chromosomal translocation in human cells. Curiously, these same approaches were unsuccessful when applied to the analysis of inversions previously identified by directional genomic hybridization (dGH). We conclude that molecular cytogenetics continues to provide critical guidance for structural variant discovery, validation and in "tuning" analysis filters to enable robust breakpoint identification at the base pair level.

  15. Populations, Teacher's Guide.

    ERIC Educational Resources Information Center

    Conard, David; Lawson, Chester A.

    This Teacher's Guide is designed for use with the Science Curriculum Improvement Study's (SCIS) unit Population. Populations is the third of a six-unit sequence of SCIS's Life Science Program for grades K-6. The Populations guide consists of activity outlines along with suggestions for guiding children's observation and manipulations of living…

  16. Experimental Design-Based Functional Mining and Characterization of High-Throughput Sequencing Data in the Sequence Read Archive

    PubMed Central

    Nakazato, Takeru; Ohta, Tazro; Bono, Hidemasa

    2013-01-01

    High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA). As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from the SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interest with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs) from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH) extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called “Gendoo”. We generated hyperlinks between diseases extracted from SRA and their feature profiles. The developed project, publication and disease lists resulting from this study are available at our web service, called “DBCLS SRA” (http://sra.dbcls.jp/). This service will improve accessibility to high-quality data from SRA. PMID:24167589
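
    The step of pairing SRA IDs with PMIDs from article full text can be sketched with a regular expression. This is only an illustration under stated assumptions: the abstract does not describe the authors' extraction code, the accession pattern below is a common approximation of SRA/ERA/DRA accession shapes (three-letter prefix followed by at least six digits), and the function name and sample texts are hypothetical.

```python
import re

# Assumed accession shape: S/E/D + R + type letter (A, P, R, S, X, Z) + 6 or more digits,
# e.g. SRP000001 (study) or SRR123456 (run). This is an approximation, not an official grammar.
SRA_ID = re.compile(r"\b[SED]R[APRSXZ]\d{6,}\b")

def extract_sra_pmid_pairs(articles):
    """Return sorted (SRA ID, PMID) pairs found in a {pmid: full_text} mapping."""
    pairs = set()
    for pmid, text in articles.items():
        for accession in SRA_ID.findall(text):
            pairs.add((accession, pmid))
    return sorted(pairs)

# Hypothetical article texts keyed by hypothetical PMIDs.
sample_articles = {
    "100": "Reads were deposited under SRP000001 and SRR123456.",
    "200": "No accession is cited here; SRX12 is too short to match.",
}

for accession, pmid in extract_sra_pmid_pairs(sample_articles):
    print(accession, pmid)
```

    Collecting pairs in a set removes duplicates when an article cites the same accession more than once.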

  17. Application of denaturing gradient gel electrophoresis (DGGE) to the analysis of microbial communities of subgingival plaque.

    PubMed

    Fujimoto, C; Maeda, H; Kokeguchi, S; Takashiba, S; Nishimura, F; Arai, H; Fukui, K; Murayama, Y

    2003-08-01

    Denaturing gradient gel electrophoresis (DGGE) was applied to the microbiologic examination of subgingival plaque. The PCR primers were designed from conserved nucleotide sequences on 16S ribosomal RNA gene (16SrDNA) with GC rich clamp at the 5'-end. Polymerase chain reaction (PCR) was performed using the primers and genomic DNAs of typical periodontal bacteria. The generated 16SrDNA fragments were separated by denaturing gel. Although the sizes of the amplified DNA fragments were almost the same among the species, 16SrDNAs of the periodontal bacteria were distinguished according to their specific sequences. The microflora of clinical plaque samples were profiled by the PCR-DGGE method, and the dominant 16SrDNA bands were cloned and sequenced. Simultaneously, Actinobacillus actinomycetemcomitans, Porphyromonas gingivalis and Prevotella intermedia were detected by an ordinary PCR method. In the deep periodontal pockets, the bacterial community structures were complicated and P. gingivalis was the most dominant species, whereas the DGGE profiles were simple and Streptococcus or Neisseria species were dominant in the shallow pockets. The species-specific PCR method revealed the presence of A. actinomycetemcomitans, P. gingivalis and P. intermedia in the clinical samples. However, corresponding bands were not always observed in the DGGE profiles, indicating a lower sensitivity of the DGGE method. Although the DGGE method may have a lower sensitivity than the ordinary PCR methods, it could visualize the bacterial qualitative compositions and reveal the major species of the plaque. The DGGE analysis and following sequencing may have the potential to be a promising bacterial examination procedure in periodontal diseases.

  18. Mesozoic basin development beneath the southeastern US coastal plain: evidence from new COCORP profiling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McBride, J.H.; Nelson, K.D.; Arnow, J.A.

    1985-01-01

    New COCORP profiling on the Georgia coastal plain indicates that the Triassic/Early Jurassic South Georgia basin is a composite feature, which includes several large half-grabens separated by intervening regions where the Triassic/Early Jurassic section is much thinner. Two half-grabens imaged on the profiles have apparent widths of 125 and 40 km, and at their deepest points contain about 5 km of basin fill. Both basins are bounded on their south flanks by major normal faults that dip moderately steeply toward the north, and are disrupted internally by subsidiary normal faults within the basin fill sequences. The orientation of the main basin-bounding faults suggests that they might have reactivated Paleozoic south-vergent structures formed on the south side of the Alleghenian suture. Evolution of the South Georgia basin appears to follow a model of initial, rapid rifting followed by flexural subsidence. The major episode of normal faulting, and hence extension, within the South Georgia basin occurred prior to extrusion of an areally extensive sequence of Early Jurassic basalt flows. This sequence is traceable across most of the width of the South Georgia basin in western Georgia, and may extend as far east as offshore South Carolina. Jurassic strata above the basalt horizon are notably less faulted and accumulated within a broadly subsiding basin that thins both to the north and south. The occurrence of the basalt relatively late in the rift sequence supports the hypothesis that the southeastern US may have been a major area of incipient spreading after Pangea had begun to separate.

  19. In depth chemical investigation of Glycyrrhiza triphylla Fisch roots guided by a preliminary HPLC-ESIMSn profiling.

    PubMed

    Shakeri, Abolfazl; Masullo, Milena; D'Urso, Gilda; Iranshahi, Mehrdad; Montoro, Paola; Pizza, Cosimo; Piacente, Sonia

    2018-05-15

    Chemical investigations on Glycyrrhiza spp. have mostly been focused on G. glabra (typically cultivated in Europe, henceforth called European licorice), G. uralensis and G. inflata (known as Chinese licorice) with little information on the constituents of other Glycyrrhiza species. According to the growing interest in further Glycyrrhiza spp. to be used as sweeteners, the roots of G. triphylla have been investigated. The LC-ESI/LTQOrbitrap/MS profile of the methanolic extract of G. triphylla roots guided the isolation of 21 compounds, of which the structures were elucidated by 1D- and 2D-NMR experiments. Based on this approach, 6 previously unreported compounds including two isoflavones 7,5'-dihydroxy-6,3'-dimethoxy-isoflavone-7-O-β-d-glucopyranoside (4) and 7,5'-dihydroxy-6,3'-dimethoxy-isoflavone-7-O-(7,8-dihydro-p-hydroxycinnamoyl)-β-d-glucopyranoside (7) and four saponins, named licoricesaponins M3 (13), N2 (14), O2 (16) and P2 (18), have been characterized. It is to be noted that the accurate masses of some compounds here reported for the first time corresponded to those of compounds previously described in Glycyrrhiza spp. Thus an approach based only on MS analysis could be misleading; only isolation followed by NMR analysis allowed us to unambiguously assign the structures of these previously unreported compounds. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Subgrouping Automata: automatic sequence subgrouping using phylogenetic tree-based optimum subgrouping algorithm.

    PubMed

    Seo, Joo-Hyun; Park, Jihyang; Kim, Eun-Mi; Kim, Juhan; Joo, Keehyoung; Lee, Jooyoung; Kim, Byung-Gee

    2014-02-01

    Sequence subgrouping for a given sequence set can enable various informative tasks such as the functional discrimination of sequence subsets and the functional inference of unknown sequences. Because an identity threshold for sequence subgrouping may vary according to the given sequence set, it is highly desirable to construct a robust subgrouping algorithm which automatically identifies an optimal identity threshold and generates subgroups for a given sequence set. To this end, an automatic sequence subgrouping method, named 'Subgrouping Automata', was constructed. First, the tree analysis module analyzes the structure of the tree and calculates all possible subgroups at each node. The sequence similarity analysis module calculates the average sequence similarity for all subgroups at each node. The representative sequence generation module finds a representative sequence for each subgroup using profile analysis and self-scoring. For all nodes, average sequence similarities are calculated and 'Subgrouping Automata' searches for the node showing the statistically maximum increase in sequence similarity using Student's t-value. The node showing the maximum t-value, which gives the most significant difference in average sequence similarity between two adjacent nodes, is determined as the optimum subgrouping node in the phylogenetic tree. Further analysis showed that the optimum subgrouping node from SA prevents under-subgrouping and over-subgrouping. Copyright © 2013. Published by Elsevier Ltd.

  1. Guiding exploration in conformational feature space with Lipschitz underestimation for ab-initio protein structure prediction.

    PubMed

    Hao, Xiaohu; Zhang, Guijun; Zhou, Xiaogen

    2018-04-01

    Computing conformations, which are essential to associate structural and functional information with gene sequences, is challenging due to the high dimensionality and rugged energy surface of the protein conformational space. Consequently, the dimension of the protein conformational space should be reduced to a proper level, and an effective exploring algorithm should be proposed. In this paper, a plug-in method for guiding exploration in conformational feature space with Lipschitz underestimation (LUE) for ab-initio protein structure prediction is proposed. First, the conformational space is converted into ultrafast shape recognition (USR) feature space. Based on the USR feature space, the conformational space can be further converted into underestimation space according to Lipschitz estimation theory for guiding exploration. As a consequence of the use of the underestimation model, the tight lower bound estimate information can be used for exploration guidance, the invalid sampling areas can be eliminated in advance, and the number of energy function evaluations can be reduced. The proposed method provides a novel technique to solve the exploring problem of protein conformational space. LUE is applied to the differential evolution (DE) algorithm and to the Metropolis Monte Carlo (MMC) algorithm available in Rosetta; when LUE is applied to DE and MMC, candidate conformations are screened by the underestimation method prior to energy calculation and selection. Further, LUE is compared with DE and MMC by testing on 15 small-to-medium structurally diverse proteins. Test results show that near-native protein structures with higher accuracy can be obtained more rapidly and efficiently with the use of LUE. Copyright © 2018 Elsevier Ltd. All rights reserved.

  2. Colorstratigraphy; A New Stratigraphic Correlation Technique

    NASA Astrophysics Data System (ADS)

    Nanayakkara, N. U.; Ranasinghage, P. N.; Priyantha, C.; Abillapitiya, T.

    2016-12-01

    Here we introduce a novel stratigraphic technique, colorstratigraphy, for correlating sedimentary sequences. Minihagalkanda is an amphitheater-like sedimentary terrain about 1 km long, situated on the southeastern coast of Sri Lanka. It has Miocene sedimentary sequences, separated into 10-12 m high small hillocks by erosion, and bounded by an escarpment about 30 m high. Sandstone, yellowish sandy clay, and greenish silty clay sequences are capped by a 4-5 m limestone bed in these hillocks but not at the boundary escarpment. Stratigraphic profiles at two hillocks and the boundary escarpment, separated from each other by 200-300 m, were selected to test the new colorstratigraphic correlation technique. Color reflectance (DSR) was measured on four samples in each sequence at every profile; altogether 36 reflectance measurements were taken using a Minolta 2500D hand-held color spectrophotometer. The first derivative of the reflectance spectrum (dR/dλ) defines the "spectral shape" of the sample. Therefore, DSR data (360-740 nm) measured at 10 nm resolution were used to calculate a center-weighted first-derivative spectrum for each reflectance sample, consisting of 39 channels. The particle size of each sequence was measured at all three profiles using a laser particle size analyzer to verify the stratigraphic correlation. The mean reflectance spectrum for each sequence at all three profiles was plotted on the same graph for comparison, and the same was done for the grain size spectra. Discriminant function analysis was performed separately for the DSR data and the grain size data, using a number assigned to each sedimentary sequence as the grouping variable. The color spectra of the sandstone, yellowish sandy clay, and greenish silty clay sequences at all three profiles match perfectly, showing clear stratigraphic correlation among the three profiles. Matching grain size distribution curves of the three sequences at the three profiles verify the stratigraphic correlation. Perfect (100%) discrimination of the three sequences with color reflectance data proves the accuracy of the correlation, and a similar 100% discrimination with grain size data further verifies the results. Therefore, colorstratigraphy based on DSR can be introduced as a quick and easy technique for stratigraphic correlation of sedimentary sequences.
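
    The first-derivative step described in this abstract can be sketched as below. The abstract does not define "center-weighted", so this sketch assumes plain central differences with one-sided differences at the two ends, which keeps one derivative value per wavelength (39 channels for 360-740 nm at 10 nm steps). The function name and the reflectance values are hypothetical.

```python
def first_derivative_spectrum(wavelengths, reflectance):
    """Approximate dR/dlambda at every wavelength.

    Interior points use a central difference; the first and last points
    fall back to a one-sided difference, so the output has the same
    length as the input.
    """
    n = len(wavelengths)
    derivative = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slope = (reflectance[hi] - reflectance[lo]) / (wavelengths[hi] - wavelengths[lo])
        derivative.append(slope)
    return derivative

# 360-740 nm sampled every 10 nm gives 39 wavelengths.
wavelengths = list(range(360, 750, 10))

# Hypothetical reflectance (%) rising linearly with wavelength, for illustration only.
reflectance = [10.0 + 0.05 * w for w in wavelengths]

spectrum = first_derivative_spectrum(wavelengths, reflectance)
print(len(spectrum), "channels; first value:", round(spectrum[0], 4))
```

    Comparing derivative spectra rather than raw reflectance emphasizes the shape of the curve over its overall brightness.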

  3. MetaCRAST: reference-guided extraction of CRISPR spacers from unassembled metagenomes.

    PubMed

    Moller, Abraham G; Liang, Chun

    2017-01-01

    Clustered regularly interspaced short palindromic repeat (CRISPR) systems are the adaptive immune systems of bacteria and archaea against viral infection. While CRISPRs have been exploited as a tool for genetic engineering, their spacer sequences can also provide valuable insights into microbial ecology by linking environmental viruses to their microbial hosts. Despite this importance, metagenomic CRISPR detection remains a major challenge. Here we present a reference-guided CRISPR spacer detection tool (Metagenomic CRISPR Reference-Aided Search Tool, MetaCRAST) that constrains searches based on user-specified direct repeats (DRs). These DRs could be expected from assembly or taxonomic profiles of metagenomes. We compared the performance of MetaCRAST to those of two existing metagenomic CRISPR detection tools, Crass and MinCED, using both real and simulated acid mine drainage (AMD) and enhanced biological phosphorus removal (EBPR) metagenomes. Our evaluation shows MetaCRAST improves CRISPR spacer detection in real metagenomes compared to the de novo CRISPR detection methods Crass and MinCED. Evaluation on simulated metagenomes shows it performs better than de novo tools for Illumina metagenomes and comparably for 454 metagenomes. It also has comparable performance dependence on read length and community composition, run time, and accuracy to these tools. MetaCRAST is implemented in Perl, parallelizable through the Many Core Engine (MCE), and takes metagenomic sequence reads and direct repeat queries (FASTA or FASTQ) as input. It is freely available for download at https://github.com/molleraj/MetaCRAST.
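
    The idea of constraining a spacer search with a known direct repeat can be illustrated with a toy sketch. This is not MetaCRAST's algorithm (the tool is written in Perl and its matching strategy is not described in this abstract); the sketch assumes exact repeat matches on one strand only, and the function names, length limits, and sequences are all hypothetical.

```python
def find_all(sequence, pattern):
    """Return every start position of pattern in sequence (overlaps allowed)."""
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start)
        start = sequence.find(pattern, start + 1)
    return positions

def extract_spacers(reads, direct_repeat, min_len=20, max_len=60):
    """Collect sequences flanked by two consecutive exact copies of the repeat.

    min_len and max_len are assumed plausibility bounds on spacer length.
    """
    spacers = set()
    for read in reads:
        hits = find_all(read, direct_repeat)
        for left, right in zip(hits, hits[1:]):
            spacer = read[left + len(direct_repeat):right]
            if min_len <= len(spacer) <= max_len:
                spacers.add(spacer)
    return spacers

# Hypothetical direct repeat and spacers, for illustration only.
repeat = "GTCGCACTCGG"
spacer_a = "ACGTTGCAACGTTGCAACGTTGCA"
spacer_b = "TTGACCATTGACCATTGACCAGG"
reads = [repeat + spacer_a + repeat + spacer_b + repeat]

for spacer in sorted(extract_spacers(reads, repeat)):
    print(spacer)
```

    A practical tool would also need to tolerate sequencing errors in the repeat and search the reverse complement, both of which this sketch omits.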

  4. Automated prediction of protein function and detection of functional sites from structure.

    PubMed

    Pazos, Florencio; Sternberg, Michael J E

    2004-10-12

    Current structural genomics projects are yielding structures for proteins whose functions are unknown. Accordingly, there is a pressing requirement for computational methods for function prediction. Here we present PHUNCTIONER, an automatic method for structure-based function prediction using automatically extracted functional sites (residues associated to functions). The method relates proteins with the same function through structural alignments and extracts 3D profiles of conserved residues. Functional features to train the method are extracted from the Gene Ontology (GO) database. The method extracts these features from the entire GO hierarchy and hence is applicable across the whole range of function specificity. 3D profiles associated with 121 GO annotations were extracted. We tested the power of the method both for the prediction of function and for the extraction of functional sites. The success of function prediction by our method was compared with the standard homology-based method. In the zone of low sequence similarity (approximately 15%), our method assigns the correct GO annotation in 90% of the protein structures considered, approximately 20% higher than inheritance of function from the closest homologue.

  5. novPTMenzy: a database for enzymes involved in novel post-translational modifications

    PubMed Central

    Khater, Shradha; Mohanty, Debasisa

    2015-01-01

    With the recent discoveries of novel post-translational modifications (PTMs) which play important roles in signaling and biosynthetic pathways, identification of such PTM catalyzing enzymes by genome mining has been an area of major interest. Unlike well-known PTMs like phosphorylation, glycosylation, SUMOylation, no bioinformatics resources are available for enzymes associated with novel and unusual PTMs. Therefore, we have developed the novPTMenzy database which catalogs information on the sequence, structure, active site and genomic neighborhood of experimentally characterized enzymes involved in five novel PTMs, namely AMPylation, Eliminylation, Sulfation, Hydroxylation and Deamidation. Based on a comprehensive analysis of the sequence and structural features of these known PTM catalyzing enzymes, we have created Hidden Markov Model profiles for the identification of similar PTM catalyzing enzymatic domains in genomic sequences. We have also created predictive rules for grouping them into functional subfamilies and deciphering their mechanistic details by structure-based analysis of their active site pockets. These analytical modules have been made available as user friendly search interfaces of novPTMenzy database. It also has a specialized analysis interface for some PTMs like AMPylation and Eliminylation. The novPTMenzy database is a unique resource that can aid in discovery of unusual PTM catalyzing enzymes in newly sequenced genomes. Database URL: http://www.nii.ac.in/novptmenzy.html PMID:25931459

  6. CRISPR/Cas9 cleavages in budding yeast reveal templated insertions and strand-specific insertion/deletion profiles.

    PubMed

    Lemos, Brenda R; Kaplan, Adam C; Bae, Ji Eun; Ferrazzoli, Alexander E; Kuo, James; Anand, Ranjith P; Waterman, David P; Haber, James E

    2018-02-27

    Harnessing CRISPR-Cas9 technology provides an unprecedented ability to modify genomic loci via DNA double-strand break (DSB) induction and repair. We analyzed nonhomologous end-joining (NHEJ) repair induced by Cas9 in budding yeast and found that the orientation of binding of Cas9 and its guide RNA (gRNA) profoundly influences the pattern of insertion/deletions (indels) at the site of cleavage. A common indel created by Cas9 is a 1-bp (+1) insertion that appears to result from Cas9 creating a 1-nt 5' overhang that is filled in by a DNA polymerase and ligated. The origin of +1 insertions was investigated by using two gRNAs with PAM sequences located on opposite DNA strands but designed to cleave the same sequence. These templated +1 insertions are dependent on the X-family DNA polymerase, Pol4. Deleting Pol4 also eliminated +2 and +3 insertions, which are biased toward homonucleotide insertions. Using inverted PAM sequences, we also found significant differences in overall NHEJ efficiency and repair profiles, suggesting that the binding of the Cas9:gRNA complex influences subsequent NHEJ processing. As with events induced by the site-specific HO endonuclease, CRISPR-Cas9-mediated NHEJ repair depends on the Ku heterodimer and DNA ligase 4. Cas9 events are highly dependent on the Mre11-Rad50-Xrs2 complex, independent of Mre11's nuclease activity. Inspection of the outcomes of a large number of Cas9 cleavage events in mammalian cells reveals a similar templated origin of +1 insertions in human cells, but also a significant frequency of similarly templated +2 insertions.

  7. Adherent and Invasive Escherichia coli Is Associated with Granulomatous Colitis in Boxer Dogs

    PubMed Central

    Simpson, Kenneth W.; Dogan, Belgin; Rishniw, Mark; Goldstein, Richard E.; Klaessig, Suzanne; McDonough, Patrick L.; German, Alex J.; Yates, Robin M.; Russell, David G.; Johnson, Susan E.; Berg, Douglas E.; Harel, Josee; Bruant, Guillaume; McDonough, Sean P.; Schukken, Ynte H.

    2006-01-01

    The mucosa-associated microflora is increasingly considered to play a pivotal role in the pathogenesis of inflammatory bowel disease. This study explored the possibility that an abnormal mucosal flora is involved in the etiopathogenesis of granulomatous colitis of Boxer dogs (GCB). Colonic biopsy samples from affected dogs (n = 13) and controls (n = 38) were examined by fluorescent in situ hybridization (FISH) with a eubacterial 16S rRNA probe. Culture, 16S ribosomal DNA sequencing, and histochemistry were used to guide subsequent FISH. GCB-associated Escherichia coli isolates were evaluated for their ability to invade and persist in cultured epithelial cells and macrophages as well as for serotype, phylogenetic group, genome size, overall genotype, and presence of virulence genes. Intramucosal gram-negative coccobacilli were present in 100% of GCB samples but not controls. Invasive bacteria hybridized with FISH probes to E. coli. Three of four GCB-associated E. coli isolates adhered to, invaded, and replicated within cultured epithelial cells. Invasion triggered a “splash”-type response, was decreased by cytochalasin D, genistein, colchicine, and wortmannin, and paralleled the behavior of the Crohn's disease-associated strain E. coli LF 82. GCB E. coli and LF 82 were diverse in serotype and overall genotype but similar in phylogeny (B2 and D), in virulence gene profiles (fyuA, irp1, irp2, chuA, fepC, ibeA, kpsMII, iss), in having a larger genome size than commensal E. coli, and in the presence of novel multilocus sequence types. We conclude that GCB is associated with selective intramucosal colonization by E. coli. E. coli strains associated with GCB and Crohn's disease have an adherent and invasive phenotype and novel multilocus sequence types and resemble E. coli associated with extraintestinal disease in phylogeny and virulence gene profile. PMID:16861666

  8. De novo assembly and characterization of Muscovy duck liver transcriptome and analysis of differentially regulated genes in response to heat stress.

    PubMed

    Zeng, Tao; Zhang, Liping; Li, Jinjun; Wang, Deqian; Tian, Yong; Lu, Lizhi

    2015-05-01

    High temperature is a major abiotic stress limiting animal growth and productivity worldwide. The Muscovy duck (Cairina moschata), sometimes called the Barbary drake, is a type of duck with a fairly unusual domestication history. In Southeast Asia, duck meat is one of the top meats consumed, and as such, the production of the meat is an important topic of research. The transcriptomic and genomic data presently available are insufficient for understanding the molecular mechanism underlying the heat tolerance of Muscovy ducks. Thus, transcriptome and expression profiling data for this species are required as an important resource for identifying genes and developing molecular markers. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. More than 225 million clean reads were generated and assembled into 36,903 unique transcripts with an average length of 1,135 bp. A total of 21,221 (57.50%) unigenes were annotated. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with transcription, signal transduction, and apoptosis. We also performed gene expression profiling analysis upon heat treatment in Muscovy ducks and identified 470 heat-response unique transcripts. GO term enrichment showed that protein folding and chaperone binding were significantly enriched, whereas KEGG pathway analyses showed that Ras and MAPKs were activated after heat stress in Muscovy ducks. Our research enriched the sequence information of the Muscovy duck, provided novel insights into responses to heat stress in these ducks, and identified candidate genes or markers that can be used to guide future efforts to breed heat-tolerant duck strains.

  9. Next-generation sequencing for identification of candidate genes for Fusarium wilt and sterility mosaic disease in pigeonpea (Cajanus cajan).

    PubMed

    Singh, Vikas K; Khan, Aamir W; Saxena, Rachit K; Kumar, Vinay; Kale, Sandip M; Sinha, Pallavi; Chitikineni, Annapurna; Pazhamala, Lekha T; Garg, Vanika; Sharma, Mamta; Sameer Kumar, Chanda Venkata; Parupalli, Swathi; Vechalapu, Suryanarayana; Patil, Suyash; Muniswamy, Sonnappa; Ghanta, Anuradha; Yamini, Kalinati Narasimhan; Dharmaraj, Pallavi Subbanna; Varshney, Rajeev K

    2016-05-01

    To map resistance genes for Fusarium wilt (FW) and sterility mosaic disease (SMD) in pigeonpea, sequencing-based bulked segregant analysis (Seq-BSA) was used. Resistant (R) and susceptible (S) bulks from the extreme recombinant inbred lines of ICPL 20096 × ICPL 332 were sequenced. Subsequently, the SNP index was calculated between R- and S-bulks with the help of the draft genome sequence and a reference-guided assembly of ICPL 20096 (resistant parent). Seq-BSA provided seven candidate SNPs for FW and SMD resistance in pigeonpea. In parallel, four additional genotypes were re-sequenced and their combined analysis with R- and S-bulks provided a total of 8362 nonsynonymous (ns) SNPs. Of 8362 nsSNPs, 60 were found within the 2-Mb flanking regions of seven candidate SNPs identified through Seq-BSA. Haplotype analysis narrowed these down to eight nsSNPs in seven genes. These eight nsSNPs were further validated by re-sequencing 11 genotypes that are resistant and susceptible to FW and SMD. This analysis revealed association of four candidate nsSNPs in four genes with FW resistance and four candidate nsSNPs in three genes with SMD resistance. Further, in silico protein analysis and expression profiling identified the two most promising candidate genes, namely C.cajan_01839 for SMD resistance and C.cajan_03203 for FW resistance. Identified candidate genomic regions/SNPs will be useful for genomics-assisted breeding in pigeonpea. © 2015 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
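
    The SNP-index comparison between bulks described in this record can be illustrated with a minimal sketch. It assumes the commonly used definition (fraction of reads at a site that carry the non-reference allele) and takes the difference between the R- and S-bulk values. The read counts, site names, and the 0.8 cutoff are hypothetical and are not taken from the study.

```python
def snp_index(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def delta_snp_index(r_bulk: tuple, s_bulk: tuple) -> float:
    """SNP index of the resistant bulk minus that of the susceptible bulk.

    Each bulk is given as (alt_reads, total_reads).
    """
    return snp_index(*r_bulk) - snp_index(*s_bulk)


# Hypothetical read counts per site: (R-bulk, S-bulk)
sites = {
    "site_A": ((2, 40), (38, 40)),   # bulks carry opposite alleles
    "site_B": ((19, 40), (21, 40)),  # both bulks near 0.5, so unlinked
}

# Assumed cutoff: keep sites where the two bulks differ strongly
candidates = [
    name for name, (r, s) in sites.items()
    if abs(delta_snp_index(r, s)) >= 0.8
]
print(candidates)
```

    Sites where both bulks sit near 0.5 are unlikely to be linked to the trait, while sites with a large absolute difference are candidates for follow-up.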

  10. Classifying the bacterial gut microbiota of termites and cockroaches: A curated phylogenetic reference database (DictDb).

    PubMed

    Mikaelyan, Aram; Köhler, Tim; Lampert, Niclas; Rohland, Jeffrey; Boga, Hamadi; Meuser, Katja; Brune, Andreas

    2015-10-01

    Recent developments in sequencing technology have given rise to a large number of studies that assess bacterial diversity and community structure in termite and cockroach guts based on large amplicon libraries of 16S rRNA genes. Although these studies have revealed important ecological and evolutionary patterns in the gut microbiota, classification of the short sequence reads is limited by the taxonomic depth and resolution of the reference databases used in the respective studies. Here, we present a curated reference database for accurate taxonomic analysis of the bacterial gut microbiota of dictyopteran insects. The Dictyopteran gut microbiota reference Database (DictDb) is based on the Silva database but was significantly expanded by the addition of clones from 11 mostly unexplored termite and cockroach groups, which increased the inventory of bacterial sequences from dictyopteran guts by 26%. The taxonomic depth and resolution of DictDb was significantly improved by a general revision of the taxonomic guide tree for all important lineages, including a detailed phylogenetic analysis of the Treponema and Alistipes complexes, the Fibrobacteres, and the TG3 phylum. The performance of this first documented version of DictDb (v. 3.0) using the revised taxonomic guide tree in the classification of short-read libraries obtained from termites and cockroaches was highly superior to that of the current Silva and RDP databases. DictDb uses an informative nomenclature that is consistent with the literature also for clades of uncultured bacteria and provides an invaluable tool for anyone exploring the gut community structure of termites and cockroaches. Copyright © 2015 Elsevier GmbH. All rights reserved.

  11. Microsatellite DNA capture from enriched libraries.

    PubMed

    Gonzalez, Elena G; Zardoya, Rafael

    2013-01-01

    Microsatellites are DNA sequences of tandem repeats of one to six nucleotides, which are highly polymorphic, and thus the molecular markers of choice in many kinship, population genetic, and conservation studies. There have been significant technical improvements since the early methods for microsatellite isolation were developed, and today the most common procedures take advantage of the hybrid capture methods of enriched-targeted microsatellite DNA. Furthermore, recent advances in sequencing technologies (i.e., next-generation sequencing, NGS) have fostered the mining of microsatellite markers in non-model organisms, affording a cost-effective way of obtaining a large amount of sequence data potentially useful for loci characterization. The rapid improvements of NGS platforms together with the increase in available microsatellite information open new avenues to the understanding of the evolutionary forces that shape genetic structuring in wild populations. Here, we provide detailed methodological procedures for microsatellite isolation based on the screening of GT microsatellite-enriched libraries, either by cloning and Sanger sequencing of positive clones or by direct NGS. Guides for designing new species-specific primers and basic genotyping are also given.
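
    As an illustration of the definition given in this record (tandem repeats of a one- to six-nucleotide motif), the following sketch scans a sequence with a regular expression. It is not the authors' protocol. The function name, the minimum of five repeats, and the example sequence are assumptions made for the example.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 5):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    The motif is 1-6 nucleotides long and is matched lazily, so the
    shortest repeating unit is reported (e.g. "GT" rather than "GTGT").
    """
    # ([ACGT]{1,6}?) captures the motif; \1{n,} requires n further copies
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical read containing a (GT)6 repeat
print(find_microsatellites("AAGTGTGTGTGTGTCC"))
```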

  12. Unraveling a molecular determinant for clathrin-independent internalization of the M2 muscarinic acetylcholine receptor

    PubMed Central

    Wan, Min; Zhang, Wenhua; Tian, Yangli; Xu, Chanjuan; Xu, Tao; Liu, Jianfeng; Zhang, Rongying

    2015-01-01

    Endocytosis and postendocytic sorting of G-protein-coupled receptors (GPCRs) is important for the regulation of both their cell surface density and signaling profile. Unlike the mechanisms of clathrin-dependent endocytosis (CDE), the mechanisms underlying the control of GPCR signaling by clathrin-independent endocytosis (CIE) remain largely unknown. Among the muscarinic acetylcholine receptors (mAChRs), the M4 mAChR undergoes CDE and recycling, whereas the M2 mAChR is internalized through CIE and targeted to lysosomes. Here we investigated the endocytosis and postendocytic trafficking of M2 mAChR based on a comparative analysis of the third cytoplasmic domain in M2 and M4 mAChRs. For the first time, we identified that the sequence 374KKKPPPS380 serves as a sorting signal for the clathrin-independent internalization of M2 mAChR. Switching 374KKKPPPS380 to the i3 loop of the M4 mAChR shifted the receptor into lysosomes through the CIE pathway, and therefore away from CDE and recycling. We also found another previously unidentified sequence that guides CDE of the M2 mAChR, 361VARKIVKMTKQPA373, which is normally masked in the presence of the downstream sequence 374KKKPPPS380. Taken together, our data indicate that endocytosis and postendocytic sorting of GPCRs that undergo CIE could be sequence-dependent. PMID:26094760

  13. The implication of DNA bending energy for nucleosome positioning and sliding.

    PubMed

    Liu, Guoqing; Xing, Yongqiang; Zhao, Hongyu; Cai, Lu; Wang, Jianying

    2018-06-11

    The nucleosome not only directly affects cellular processes, such as DNA replication, recombination, and transcription, but also serves as a fundamentally important target of epigenetic modifications. Our previous study indicated that the bending property of DNA is important in nucleosome formation, particularly in predicting the dyad positions of nucleosomes on a DNA segment. Here, we investigated the role of bending energy in nucleosome positioning and sliding in depth to decipher the sequence-directed mechanism. The results show that bending energy is a good physical index to predict the free energy in the process of nucleosome reconstitution in vitro. Our data also imply that at least 20% of the nucleosomes in budding yeast do not adopt canonical positioning, in which underlying sequences wrapped around histones are structurally symmetric. We also revealed distinct patterns of bending energy profile for distinctly organized chromatin structures, such as well-positioned nucleosomes, fuzzy nucleosomes, and linker regions, and discussed nucleosome sliding in terms of bending energy. We proposed that the stability of a nucleosome is positively correlated with the strength of the bending anisotropy of the DNA segment, and that both accessibility and directionality of nucleosome sliding are likely to be modulated by diverse patterns of DNA bending energy profile.

  14. Iridium profile for 10 million years across the Cretaceous-Tertiary boundary at Gubbio (Italy)

    NASA Technical Reports Server (NTRS)

    Alvarez, Walter; Asaro, Frank; Montanari, Alessandro

    1990-01-01

    The iridium anomaly at the Cretaceous-Tertiary (KT) boundary was discovered in the pelagic limestone sequence at Gubbio on the basis of 12 samples analyzed by neutron activation analysis (NAA) and was interpreted as indicating impact of a large extraterrestrial object at exactly the time of the KT mass extinction. Continuing controversy over the shape of the Ir profile at the Gubbio KT boundary and its interpretation called for a more detailed follow-up study. Analysis of a 57-meter-thick, 10-million-year-old part of the Gubbio sequence using improved NAA techniques revealed that there is only one Ir anomaly at the KT boundary, but this anomaly shows an intricate fine structure, the origin of which cannot yet be entirely explained. The KT Ir anomaly peaks in a 1-centimeter-thick clay layer, where the average Ir concentration is 3000 parts per trillion (ppt); this peak is flanked by tails with Ir concentrations of 20 to 80 ppt that rise above a background of 12 to 13 ppt. The fine structure of the tails is probably due in part to lateral reworking, diffusion, burrowing, and perhaps Milankovitch cyclicity.

  15. CasA mediates Cas3-catalyzed target degradation during CRISPR RNA-guided interference.

    PubMed

    Hochstrasser, Megan L; Taylor, David W; Bhat, Prashant; Guegler, Chantal K; Sternberg, Samuel H; Nogales, Eva; Doudna, Jennifer A

    2014-05-06

    In bacteria, the clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) DNA-targeting complex Cascade (CRISPR-associated complex for antiviral defense) uses CRISPR RNA (crRNA) guides to bind complementary DNA targets at sites adjacent to a trinucleotide signature sequence called the protospacer adjacent motif (PAM). The Cascade complex then recruits Cas3, a nuclease-helicase that catalyzes unwinding and cleavage of foreign double-stranded DNA (dsDNA) bearing a sequence matching that of the crRNA. Cascade comprises the CasA-E proteins and one crRNA, forming a structure that binds and unwinds dsDNA to form an R loop in which the target strand of the DNA base pairs with the 32-nt RNA guide sequence. Single-particle electron microscopy reconstructions of dsDNA-bound Cascade with and without Cas3 reveal that Cascade positions the PAM-proximal end of the DNA duplex at the CasA subunit and near the site of Cas3 association. The finding that the DNA target and Cas3 colocalize with CasA implicates this subunit in a key target-validation step during DNA interference. We show biochemically that base pairing of the PAM region is unnecessary for target binding but critical for Cas3-mediated degradation. In addition, the L1 loop of CasA, previously implicated in PAM recognition, is essential for Cas3 activation following target binding by Cascade. Together, these data show that the CasA subunit of Cascade functions as an essential partner of Cas3 by recognizing DNA target sites and positioning Cas3 adjacent to the PAM to ensure cleavage.

  16. NEP: web server for epitope prediction based on antibody neutralization of viral strains with diverse sequences

    PubMed Central

    Chuang, Gwo-Yu; Liou, David; Kwong, Peter D.; Georgiev, Ivelin S.

    2014-01-01

    Delineation of the antigenic site, or epitope, recognized by an antibody can provide clues about functional vulnerabilities and resistance mechanisms, and can therefore guide antibody optimization and epitope-based vaccine design. Previously, we developed an algorithm for antibody-epitope prediction based on antibody neutralization of viral strains with diverse sequences and validated the algorithm on a set of broadly neutralizing HIV-1 antibodies. Here we describe the implementation of this algorithm, NEP (Neutralization-based Epitope Prediction), as a web-based server. The users must supply as input: (i) an alignment of antigen sequences of diverse viral strains; (ii) neutralization data for the antibody of interest against the same set of antigen sequences; and (iii) (optional) a structure of the unbound antigen, for enhanced prediction accuracy. The prediction results can be downloaded or viewed interactively on the antigen structure (if supplied) from the web browser using a JSmol applet. Since neutralization experiments are typically performed as one of the first steps in the characterization of an antibody to determine its breadth and potency, the NEP server can be used to predict antibody-epitope information at no additional experimental costs. NEP can be accessed on the internet at http://exon.niaid.nih.gov/nep. PMID:24782517

  17. A guide to enterotypes across the human body: meta-analysis of microbial community structures in human microbiome datasets.

    PubMed

    Koren, Omry; Knights, Dan; Gonzalez, Antonio; Waldron, Levi; Segata, Nicola; Knight, Rob; Huttenhower, Curtis; Ley, Ruth E

    2013-01-01

    Recent analyses of human-associated bacterial diversity have categorized individuals into 'enterotypes' or clusters based on the abundances of key bacterial genera in the gut microbiota. There is a lack of consensus, however, on the analytical basis for enterotypes and on the interpretation of these results. We tested how the following factors influenced the detection of enterotypes: clustering methodology, distance metrics, OTU-picking approaches, sequencing depth, data type (whole genome shotgun (WGS) vs.16S rRNA gene sequence data), and 16S rRNA region. We included 16S rRNA gene sequences from the Human Microbiome Project (HMP) and from 16 additional studies and WGS sequences from the HMP and MetaHIT. In most body sites, we observed smooth abundance gradients of key genera without discrete clustering of samples. Some body habitats displayed bimodal (e.g., gut) or multimodal (e.g., vagina) distributions of sample abundances, but not all clustering methods and workflows accurately highlight such clusters. Because identifying enterotypes in datasets depends not only on the structure of the data but is also sensitive to the methods applied to identifying clustering strength, we recommend that multiple approaches be used and compared when testing for enterotypes.

  18. A Guide to Enterotypes across the Human Body: Meta-Analysis of Microbial Community Structures in Human Microbiome Datasets

    PubMed Central

    Waldron, Levi; Segata, Nicola; Knight, Rob; Huttenhower, Curtis; Ley, Ruth E.

    2013-01-01

    Recent analyses of human-associated bacterial diversity have categorized individuals into ‘enterotypes’ or clusters based on the abundances of key bacterial genera in the gut microbiota. There is a lack of consensus, however, on the analytical basis for enterotypes and on the interpretation of these results. We tested how the following factors influenced the detection of enterotypes: clustering methodology, distance metrics, OTU-picking approaches, sequencing depth, data type (whole genome shotgun (WGS) vs.16S rRNA gene sequence data), and 16S rRNA region. We included 16S rRNA gene sequences from the Human Microbiome Project (HMP) and from 16 additional studies and WGS sequences from the HMP and MetaHIT. In most body sites, we observed smooth abundance gradients of key genera without discrete clustering of samples. Some body habitats displayed bimodal (e.g., gut) or multimodal (e.g., vagina) distributions of sample abundances, but not all clustering methods and workflows accurately highlight such clusters. Because identifying enterotypes in datasets depends not only on the structure of the data but is also sensitive to the methods applied to identifying clustering strength, we recommend that multiple approaches be used and compared when testing for enterotypes. PMID:23326225

  19. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles.

    PubMed

    Mathelier, Anthony; Fornes, Oriol; Arenillas, David J; Chen, Chih-Yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W

    2016-01-04

    JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
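
    The position frequency matrix representation mentioned in this record can be sketched as follows: count each base at each column of a set of aligned binding sites, then score a candidate sequence by summed log-odds against a background. This is an illustrative sketch and does not use the JASPAR API. The binding sites, the pseudocount of 0.5, and the uniform background of 0.25 are assumptions.

```python
import math


def position_frequency_matrix(sites):
    """Count occurrences of each base at each position of aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pfm = {base: [0] * length for base in "ACGT"}
    for site in sites:
        for i, base in enumerate(site.upper()):
            pfm[base][i] += 1
    return pfm


def log_odds_score(pfm, seq, pseudocount=0.5, background=0.25):
    """Sum of log2(observed probability / background) over positions."""
    n_sites = sum(pfm[base][0] for base in "ACGT")
    score = 0.0
    for i, base in enumerate(seq.upper()):
        # Pseudocount avoids log(0) for bases never seen at a position
        prob = (pfm[base][i] + pseudocount) / (n_sites + 4 * pseudocount)
        score += math.log2(prob / background)
    return score


# Hypothetical aligned binding sites (not taken from JASPAR)
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pfm = position_frequency_matrix(sites)
print(pfm["G"])
print(log_odds_score(pfm, "TGACTCA") > log_odds_score(pfm, "ACGTACG"))
```

    A sequence matching the consensus scores higher than an unrelated one, which is the basis for scanning genomic sequence with such profiles.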

  20. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition

    PubMed Central

    Melvin, Iain; Ie, Eugene; Kuang, Rui; Weston, Jason; Stafford, William Noble; Leslie, Christina

    2007-01-01

    Background: Predicting a protein's structural class from its amino acid sequence is a fundamental problem in computational biology. Much recent work has focused on developing new representations for protein sequences, called string kernels, for use with support vector machine (SVM) classifiers. However, while some of these approaches exhibit state-of-the-art performance at the binary protein classification problem, i.e. discriminating between a particular protein class and all other classes, few of these studies have addressed the real problem of multi-class superfamily or fold recognition. Moreover, there are only limited software tools and systems for SVM-based protein classification available to the bioinformatics community. Results: We present a new multi-class SVM-based protein fold and superfamily recognition system and web server called SVM-Fold, which can be found at . Our system uses an efficient implementation of a state-of-the-art string kernel for sequence profiles, called the profile kernel, where the underlying feature representation is a histogram of inexact matching k-mer frequencies. We also employ a novel machine learning approach to solve the difficult multi-class problem of classifying a sequence of amino acids into one of many known protein structural classes. Binary one-vs-the-rest SVM classifiers that are trained to recognize individual structural classes yield prediction scores that are not comparable, so that standard "one-vs-all" classification fails to perform well. Moreover, SVMs for classes at different levels of the protein structural hierarchy may make useful predictions, but one-vs-all does not try to combine these multiple predictions. To deal with these problems, our method learns relative weights between one-vs-the-rest classifiers and encodes information about the protein structural hierarchy for multi-class prediction. In large-scale benchmark results based on the SCOP database, our code weighting approach significantly improves on the standard one-vs-all method for both the superfamily and fold prediction in the remote homology setting and on the fold recognition problem. Moreover, our code weight learning algorithm strongly outperforms nearest-neighbor methods based on PSI-BLAST in terms of prediction accuracy on every structure classification problem we consider. Conclusion: By combining state-of-the-art SVM kernel methods with a novel multi-class algorithm, the SVM-Fold system delivers efficient and accurate protein fold and superfamily recognition. PMID:17570145
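
    The idea of a feature representation built from a histogram of inexact matching k-mer frequencies can be illustrated with a much simpler relative of the profile kernel. The sketch below is a plain mismatch kernel that allows at most one substitution per k-mer and uses no sequence profiles, so it is not the SVM-Fold implementation. The function names and the choice k = 3 are assumptions.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def neighbours(kmer: str) -> set:
    """All k-mers within Hamming distance 1 of kmer (including itself)."""
    out = {kmer}
    for i in range(len(kmer)):
        for aa in ALPHABET:
            out.add(kmer[:i] + aa + kmer[i + 1:])
    return out


def mismatch_features(seq: str, k: int = 3) -> Counter:
    """Histogram of k-mers, where each observed k-mer also counts
    towards every k-mer that differs from it at one position."""
    feats = Counter()
    for i in range(len(seq) - k + 1):
        for nb in neighbours(seq[i:i + k]):
            feats[nb] += 1
    return feats


def mismatch_kernel(a: str, b: str, k: int = 3) -> int:
    """Inner product of the two inexact-match k-mer histograms."""
    fa, fb = mismatch_features(a, k), mismatch_features(b, k)
    return sum(count * fb[feat] for feat, count in fa.items())


print(mismatch_kernel("ACD", "ACE"))
```

    The resulting kernel values could be passed to any SVM implementation that accepts a precomputed kernel matrix. The profile kernel described in the abstract differs in that the set of matching k-mers is defined by a sequence profile rather than by a fixed mismatch count.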

  1. Effects of a Physical Exercise Program (PEP-Aut) on Autistic Children's Stereotyped Behavior, Metabolic and Physical Activity Profiles, Physical Fitness, and Health-Related Quality of Life: A Study Protocol.

    PubMed

    Ferreira, José Pedro; Andrade Toscano, Chrystiane Vasconcelos; Rodrigues, Aristides Machado; Furtado, Guilherme Eustaquio; Barros, Mauro Gomes; Wanderley, Rildo Souza; Carvalho, Humberto Moreira

    2018-01-01

    Physical exercise has shown positive effects on symptomatology and on the reduction of comorbidities in populations with autism spectrum disorder (ASD). However, there is still no consensus about the most appropriate exercise intervention model for children with ASD. The physical exercise program for children with autism (PEP-Aut) protocol is designed to allow us to (i) examine the multivariate associations between ASD symptoms, metabolic profile, physical activity level, physical fitness, and health-related quality of life of children with ASD; and (ii) assess the effects of a 40-week exercise program on all these aspects of children with ASD. The impact of the exercise program will be assessed based on the sequence of the two phases. Phase 1 is a 12-week cross-sectional study assessing the symptomatology, metabolic profile, physical fitness and physical activity levels, socioeconomic status profile, and health-related quality of life of participants. This phase is the baseline of the following phase. Phase 2 is a 48-week intervention study with a 40-week intervention with exercise that will take place in a specialized center for children with ASD in the city of Maceió-Alagoas, Brazil. The primary outcomes will be change in the symptomatic profile and the level of physical activity of children. Secondary outcomes will be anthropometric and metabolic profiles, aerobic function, grip strength, socioeconomic status, and health-related quality of life. The study will provide critical information on the efficacy of exercise for children with ASD and help guide design and delivery of future programs.

  2. Advanced MR Imaging of the Human Nucleus Accumbens--Additional Guiding Tool for Deep Brain Stimulation.

    PubMed

    Lucas-Neto, Lia; Reimão, Sofia; Oliveira, Edson; Rainha-Campos, Alexandre; Sousa, João; Nunes, Rita G; Gonçalves-Ferreira, António; Campos, Jorge G

    2015-07-01

    The human nucleus accumbens (Acc) has become a target for deep brain stimulation (DBS) in some neuropsychiatric disorders. Nonetheless, even with the most recent advances in neuroimaging, it remains difficult to accurately delineate the Acc and closely related subcortical structures with conventional MRI sequences. It is our purpose to perform an MRI study of the human Acc and to determine whether there are reliable anatomical landmarks that enable the precise location and identification of the nucleus and its core/shell division. For the Acc identification and delineation, based on anatomical landmarks, T1WI, T1IR and STIR 3T-MR images were acquired in 10 healthy volunteers. Additionally, 32-direction DTI was obtained for Acc segmentation. Seed masks for the Acc were generated with FreeSurfer and probabilistic tractography was performed using FSL. The probability of connectivity between the seed voxels and distinct brain areas was determined and subjected to k-means clustering analysis, defining 2 different regions. With conventional T1WI, the Acc borders are better defined through its surrounding anatomical structures. The DTI color-coded vector maps and IR sequences add further detail in the Acc identification and delineation. Additionally, using probabilistic tractography it is possible to segment the Acc into a core and shell division and establish its structural connectivity with different brain areas. Advanced MRI techniques allow in vivo delineation and segmentation of the human Acc and represent an additional guiding tool in the precise and safe target definition for DBS. © 2015 International Neuromodulation Society.

  3. RNA-Rocket: an RNA-Seq analysis resource for infectious disease research

    PubMed Central

    Warren, Andrew S.; Aurrecoechea, Cristina; Brunk, Brian; Desai, Prerak; Emrich, Scott; Giraldo-Calderón, Gloria I.; Harb, Omar; Hix, Deborah; Lawson, Daniel; Machi, Dustin; Mao, Chunhong; McClelland, Michael; Nordberg, Eric; Shukla, Maulik; Vosshall, Leslie B.; Wattam, Alice R.; Will, Rebecca; Yoo, Hyun Seung; Sobral, Bruno

    2015-01-01

    Motivation: RNA-Seq is a method for profiling transcription using high-throughput sequencing and is an important component of many research projects that wish to study transcript isoforms, condition specific expression and transcriptional structure. The methods, tools and technologies used to perform RNA-Seq analysis continue to change, creating a bioinformatics challenge for researchers who wish to exploit these data. Resources that bring together genomic data, analysis tools, educational material and computational infrastructure can minimize the overhead required of life science researchers. Results: RNA-Rocket is a free service that provides access to RNA-Seq and ChIP-Seq analysis tools for studying infectious diseases. The site makes available thousands of pre-indexed genomes, their annotations and the ability to stream results to the bioinformatics resources VectorBase, EuPathDB and PATRIC. The site also provides a combination of experimental data and metadata, examples of pre-computed analysis, step-by-step guides and a user interface designed to enable both novice and experienced users of RNA-Seq data. Availability and implementation: RNA-Rocket is available at rnaseq.pathogenportal.org. Source code for this project can be found at github.com/cidvbi/PathogenPortal. Contact: anwarren@vt.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:25573919

  4. RNA-Rocket: an RNA-Seq analysis resource for infectious disease research.

    PubMed

    Warren, Andrew S; Aurrecoechea, Cristina; Brunk, Brian; Desai, Prerak; Emrich, Scott; Giraldo-Calderón, Gloria I; Harb, Omar; Hix, Deborah; Lawson, Daniel; Machi, Dustin; Mao, Chunhong; McClelland, Michael; Nordberg, Eric; Shukla, Maulik; Vosshall, Leslie B; Wattam, Alice R; Will, Rebecca; Yoo, Hyun Seung; Sobral, Bruno

    2015-05-01

    RNA-Seq is a method for profiling transcription using high-throughput sequencing and is an important component of many research projects that wish to study transcript isoforms, condition specific expression and transcriptional structure. The methods, tools and technologies used to perform RNA-Seq analysis continue to change, creating a bioinformatics challenge for researchers who wish to exploit these data. Resources that bring together genomic data, analysis tools, educational material and computational infrastructure can minimize the overhead required of life science researchers. RNA-Rocket is a free service that provides access to RNA-Seq and ChIP-Seq analysis tools for studying infectious diseases. The site makes available thousands of pre-indexed genomes, their annotations and the ability to stream results to the bioinformatics resources VectorBase, EuPathDB and PATRIC. The site also provides a combination of experimental data and metadata, examples of pre-computed analysis, step-by-step guides and a user interface designed to enable both novice and experienced users of RNA-Seq data. RNA-Rocket is available at rnaseq.pathogenportal.org. Source code for this project can be found at github.com/cidvbi/PathogenPortal. Contact: anwarren@vt.edu. Supplementary materials are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  5. Classifying Cognitive Profiles Using Machine Learning with Privileged Information in Mild Cognitive Impairment.

    PubMed

    Alahmadi, Hanin H; Shen, Yuan; Fouad, Shereen; Luft, Caroline Di B; Bentham, Peter; Kourtzi, Zoe; Tino, Peter

    2016-01-01

    Early diagnosis of dementia is critical for assessing disease progression and potential treatment. State-of-the-art machine learning techniques have been increasingly employed to take on this diagnostic task. In this study, we employed Generalized Matrix Learning Vector Quantization (GMLVQ) classifiers to discriminate patients with Mild Cognitive Impairment (MCI) from healthy controls based on their cognitive skills. Further, we adopted a "Learning with privileged information" approach to combine cognitive and fMRI data for the classification task. The resulting classifier operates solely on the cognitive data while it incorporates the fMRI data as privileged information (PI) during training. This novel classifier is of practical use as the collection of brain imaging data is not always possible with patients and older participants. MCI patients and healthy age-matched controls were trained to extract structure from temporal sequences. We ask whether machine learning classifiers can be used to discriminate patients from controls and whether differences between these groups relate to individual cognitive profiles. To this end, we tested participants in four cognitive tasks: working memory, cognitive inhibition, divided attention, and selective attention. We also collected fMRI data before and after training on a probabilistic sequence learning task and extracted fMRI responses and connectivity as features for machine learning classifiers. Our results show that the PI-guided GMLVQ classifiers outperform the baseline classifier that only used the cognitive data. In addition, we found that for the baseline classifier, divided attention is the only relevant cognitive feature. When PI was incorporated, divided attention remained the most relevant feature while cognitive inhibition also became relevant for the task. Interestingly, this analysis for the fMRI GMLVQ classifier suggests that (1) when the overall fMRI signal is used as input to the classifier, the post-training session is most relevant; and (2) when the graph feature reflecting the underlying spatiotemporal fMRI pattern is used, the pre-training session is most relevant. Taken together, these results suggest that brain connectivity before training and overall fMRI signal after training are both diagnostic of cognitive skills in MCI.

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Larson, Natalie M.; Zok, Frank W.

    One route for producing fiber-reinforced ceramic-matrix composites entails repeated impregnation and pyrolysis of a preceramic polymer in a fiber preform. The process relies crucially on the development of networks of contiguous cracks during pyrolysis, thereby allowing further impregnation to attain nearly-full densification. The present study employs in-situ x-ray computed tomography (XCT) to reveal in three dimensions the evolution of matrix structure during pyrolysis of a SiC-based preceramic polymer to 1200 °C. Observations are used to guide the development of a taxonomy of crack geometries and crack structures and to identify the temporal sequence of their formation. A quantitative analysis is employed to characterize effects of local microstructural dimensions on the conditions required to form cracks of various types. Complementary measurements of gas evolution and mass loss of the preceramic polymer during pyrolysis as well as changes in mass density and Young's modulus provide context for the physical changes revealed by XCT. Furthermore, the findings provide a foundation for future development of physics-based models to guide composite fabrication processes.

  7. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Torella, JP; Lienert, F; Boehm, CR

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  8. Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

    PubMed Central

    Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

    2016-01-01

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822

  9. The role of replay and theta sequences in mediating hippocampal-prefrontal interactions for memory and cognition.

    PubMed

    Zielinski, Mark C; Tang, Wenbo; Jadhav, Shantanu P

    2017-12-18

    Sequential activity is seen in the hippocampus during multiple network patterns, prominently as replay activity during both awake and sleep sharp-wave ripples (SWRs), and as theta sequences during active exploration. Although various mnemonic and cognitive functions have been ascribed to these hippocampal sequences, evidence for these proposed functions remains primarily phenomenological. Here, we briefly review current knowledge about replay events and theta sequences in spatial memory tasks. We reason that in order to gain a mechanistic and causal understanding of how these patterns influence memory and cognitive processing, it is important to consider how these sequences influence activity in other regions, and in particular, the prefrontal cortex, which is crucial for memory-guided behavior. For spatial memory tasks, we posit that hippocampal-prefrontal interactions mediated by replay and theta sequences play complementary and overlapping roles at different stages in learning, supporting memory encoding and retrieval, deliberative decision making, planning, and guiding future actions. This framework offers testable predictions for future physiology and closed-loop feedback inactivation experiments for specifically targeting hippocampal sequences as well as coordinated prefrontal activity in different network states, with the potential to reveal their causal roles in memory-guided behavior. © 2017 Wiley Periodicals, Inc.

  10. AfterQC: automatic filtering, trimming, error removing and quality control for fastq data.

    PubMed

    Chen, Shifu; Huang, Tanxiao; Zhou, Yanqing; Han, Yue; Xu, Mingyan; Gu, Jia

    2017-03-14

    Some applications, especially those clinical applications requiring high accuracy of sequencing data, usually have to face the troubles caused by unavoidable sequencing errors. Several tools have been proposed to profile the sequencing quality, but few of them can quantify or correct the sequencing errors. This unmet requirement motivated us to develop AfterQC, a tool with functions to profile sequencing errors and correct most of them, plus highly automated quality control and data filtering features. Different from most tools, AfterQC analyses the overlapping of paired sequences for pair-end sequencing data. Based on overlapping analysis, AfterQC can detect and cut adapters, and furthermore it gives a novel function to correct wrong bases in the overlapping regions. Another new feature is to detect and visualise sequencing bubbles, which can be commonly found on the flowcell lanes and may raise sequencing errors. Besides normal per cycle quality and base content plotting, AfterQC also provides features like polyX (a long sub-sequence of a same base X) filtering, automatic trimming and K-MER based strand bias profiling. For each single or pair of FastQ files, AfterQC filters out bad reads, detects and eliminates sequencer's bubble effects, trims reads at front and tail, detects the sequencing errors and corrects part of them, and finally outputs clean data and generates HTML reports with interactive figures. AfterQC can run in batch mode with multiprocess support, it can run with a single FastQ file, a single pair of FastQ files (for pair-end sequencing), or a folder for all included FastQ files to be processed automatically. Based on overlapping analysis, AfterQC can estimate the sequencing error rate and profile the error transform distribution. The results of our error profiling tests show that the error distribution is highly platform dependent. 
Much more than just another new quality control (QC) tool, AfterQC is able to perform quality control, data filtering, error profiling and base correction automatically. Experimental results show that AfterQC can help to eliminate the sequencing errors for pair-end sequencing data to provide much cleaner outputs, and consequently help to reduce the false-positive variants, especially for the low-frequency somatic mutations. While providing rich configurable options, AfterQC can detect and set all the options automatically and require no argument in most cases.
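
    The overlap analysis described above can be illustrated with a minimal sketch. It is not AfterQC's implementation: the function names, the mismatch threshold, the minimum overlap length and the "trust the higher-quality base" rule are all assumptions made for illustration. The idea is to reverse-complement read 2, slide it along read 1 until the two agree, and then resolve disagreements inside the overlap using the per-base quality scores.

```python
# Illustrative sketch only; names and thresholds are hypothetical,
# and this is not AfterQC's actual algorithm.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_overlap(r1, r2_rc, min_overlap=10, max_mismatch=2):
    """Return the smallest offset at which r1[offset:] aligns with the
    start of r2_rc with at most max_mismatch mismatches, or None."""
    for offset in range(len(r1) - min_overlap + 1):
        n = min(len(r1) - offset, len(r2_rc))
        if n < min_overlap:
            continue
        mismatches = sum(a != b for a, b in zip(r1[offset:offset + n], r2_rc[:n]))
        if mismatches <= max_mismatch:
            return offset
    return None


def correct_pair(r1, q1, r2, q2, min_overlap=10, max_mismatch=2):
    """Correct mismatching bases in the overlap of a read pair.

    q1 and q2 are lists of integer quality scores in read orientation.
    Returns (corrected_r1, corrected_r2, number_of_corrected_bases).
    """
    r2_rc = reverse_complement(r2)
    q2_rev = q2[::-1]  # qualities in the same orientation as r2_rc
    offset = find_overlap(r1, r2_rc, min_overlap, max_mismatch)
    if offset is None:
        return r1, r2, 0  # no confident overlap: leave the pair untouched

    b1, b2 = list(r1), list(r2_rc)
    n = min(len(r1) - offset, len(r2_rc))
    fixed = 0
    for i in range(n):
        p = offset + i
        if b1[p] == b2[i]:
            continue
        if q1[p] > q2_rev[i]:      # read 1 is more trustworthy here
            b2[i] = b1[p]
            fixed += 1
        elif q2_rev[i] > q1[p]:    # read 2 is more trustworthy here
            b1[p] = b2[i]
            fixed += 1
        # equal qualities: ambiguous, so neither base is changed
    return "".join(b1), reverse_complement("".join(b2)), fixed
```

    Taking the first acceptable offset keeps the sketch short; a more careful version would score every offset and pick the best one. Counting the corrected bases per pair also gives a crude estimate of the sequencing error rate within overlapping regions.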

  11. Heat, Energy, and Order, Part Two of an Integrated Science Sequence, Teacher's Guide, 1970 Edition.

    ERIC Educational Resources Information Center

    Portland Project Committee, OR.

    This teacher's guide contains part two of the four-part first year Portland Project, a three-year secondary integrated science curriculum sequence. This part involves the student with unifying principles essential for deeper understanding of the concept of energy. Confidence in the atomic nature of matter is built by relating heat in terms of…

  12. Students' Guided Reinvention of Definition of Limit of a Sequence with Interactive Technology

    ERIC Educational Resources Information Center

    Flores, Alfinio; Park, Jungeun

    2016-01-01

    In a course emphasizing interactive technology, 19 students, including 18 mathematics education majors, mostly in their first year, reinvented the definition of limit of a sequence while working in small cooperative groups. The class spent four sessions of 75 minutes each on a cyclical process of guided reinvention of the definition of limit of a…

  13. Waves and Particles, The Orbital Atom, Parts One and Two of an Integrated Science Sequence, Teacher's Guide, 1973 Edition.

    ERIC Educational Resources Information Center

    Portland Project Committee, OR.

    This teacher's guide includes parts one and two of the four-part third year Portland Project, a three-year integrated secondary science curriculum sequence. The Harvard Project Physics textbook is used for reading assignments for part one. Assignments relate to waves, light, electricity, magnetic fields, Faraday and the electrical age,…

  14. Depth information in natural environments derived from optic flow by insect motion detection system: a model analysis

    PubMed Central

    Schwegmann, Alexander; Lindemann, Jens P.; Egelhaaf, Martin

    2014-01-01

    Knowing the depth structure of the environment is crucial for moving animals in many behavioral contexts, such as collision avoidance, targeting objects, or spatial navigation. An important source of depth information is motion parallax. This powerful cue is generated on the eyes during translatory self-motion with the retinal images of nearby objects moving faster than those of distant ones. To investigate how the visual motion pathway represents motion-based depth information we analyzed its responses to image sequences recorded in natural cluttered environments with a wide range of depth structures. The analysis was done on the basis of an experimentally validated model of the visual motion pathway of insects, with its core elements being correlation-type elementary motion detectors (EMDs). It is the key result of our analysis that the absolute EMD responses, i.e., the motion energy profile, represent the contrast-weighted nearness of environmental structures during translatory self-motion at a roughly constant velocity. In other words, the output of the EMD array highlights contours of nearby objects. This conclusion is largely independent of the scale over which EMDs are spatially pooled and was corroborated by scrutinizing the motion energy profile after eliminating the depth structure from the natural image sequences. Hence, the well-established dependence of correlation-type EMDs on both velocity and textural properties of motion stimuli appears to be advantageous for representing behaviorally relevant information about the environment in a computationally parsimonious way. PMID:25136314
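
    A correlation-type elementary motion detector can be sketched in a few lines. The version below is a generic textbook-style detector, not the experimentally validated model used in the study: the first-order low-pass filter standing in for the delay, the value of `alpha`, and the sinusoidal test stimulus are all assumptions. Each half-detector multiplies the delayed signal from one photoreceptor with the undelayed signal from its neighbour, and the two mirror-symmetric products are subtracted.

```python
import math

# Illustrative sketch of a correlation-type motion detector.
# Filter type and parameter values are assumptions, not the study's model.


def low_pass(signal, alpha=0.2):
    """First-order low-pass filter, used here as the 'delay' element."""
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


def emd_response(left, right, alpha=0.2):
    """Opponent detector output: delayed(left)*right - left*delayed(right).

    Positive values indicate motion from the left input to the right input.
    """
    d_left = low_pass(left, alpha)
    d_right = low_pass(right, alpha)
    return [dl * r - l * dr for l, r, dl, dr in zip(left, right, d_left, d_right)]


def moving_grating(n_samples, period, phase_lag):
    """Sinusoid seen by two neighbouring inputs; the right input lags the
    left one by phase_lag radians (a negative lag reverses the direction)."""
    omega = 2.0 * math.pi / period
    left = [math.sin(omega * t) for t in range(n_samples)]
    right = [math.sin(omega * t - phase_lag) for t in range(n_samples)]
    return left, right


def mean(values):
    return sum(values) / len(values)


rightward = mean(emd_response(*moving_grating(1000, 50, math.pi / 4)))
leftward = mean(emd_response(*moving_grating(1000, 50, -math.pi / 4)))
```

    For a sinusoidal stimulus the time-averaged output is proportional to the filter gain times the sines of the inter-receptor phase lag and the filter phase lag, so it depends on pattern contrast and spatial frequency as well as on velocity. Taking the absolute value of the response across an array of such detectors gives a motion energy profile of the kind analyzed in the abstract.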

  15. Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization.

    PubMed

    Bedbrook, Claire N; Yang, Kevin K; Rice, Austin J; Gradinaru, Viviana; Arnold, Frances H

    2017-10-01

    There is growing interest in studying and engineering integral membrane proteins (MPs) that play key roles in sensing and regulating cellular response to diverse external signals. A MP must be expressed, correctly inserted and folded in a lipid bilayer, and trafficked to the proper cellular location in order to function. The sequence and structural determinants of these processes are complex and highly constrained. Here we describe a predictive, machine-learning approach that captures this complexity to facilitate successful MP engineering and design. Machine learning on carefully-chosen training sequences made by structure-guided SCHEMA recombination has enabled us to accurately predict the rare sequences in a diverse library of channelrhodopsins (ChRs) that express and localize to the plasma membrane of mammalian cells. These light-gated channel proteins of microbial origin are of interest for neuroscience applications, where expression and localization to the plasma membrane is a prerequisite for function. We trained Gaussian process (GP) classification and regression models with expression and localization data from 218 ChR chimeras chosen from a 118,098-variant library designed by SCHEMA recombination of three parent ChRs. We use these GP models to identify ChRs that express and localize well and show that our models can elucidate sequence and structure elements important for these processes. We also used the predictive models to convert a naturally occurring ChR incapable of mammalian localization into one that localizes well.

  16. Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization

    PubMed Central

    Rice, Austin J.; Gradinaru, Viviana; Arnold, Frances H.

    2017-01-01

    There is growing interest in studying and engineering integral membrane proteins (MPs) that play key roles in sensing and regulating cellular response to diverse external signals. A MP must be expressed, correctly inserted and folded in a lipid bilayer, and trafficked to the proper cellular location in order to function. The sequence and structural determinants of these processes are complex and highly constrained. Here we describe a predictive, machine-learning approach that captures this complexity to facilitate successful MP engineering and design. Machine learning on carefully-chosen training sequences made by structure-guided SCHEMA recombination has enabled us to accurately predict the rare sequences in a diverse library of channelrhodopsins (ChRs) that express and localize to the plasma membrane of mammalian cells. These light-gated channel proteins of microbial origin are of interest for neuroscience applications, where expression and localization to the plasma membrane is a prerequisite for function. We trained Gaussian process (GP) classification and regression models with expression and localization data from 218 ChR chimeras chosen from a 118,098-variant library designed by SCHEMA recombination of three parent ChRs. We use these GP models to identify ChRs that express and localize well and show that our models can elucidate sequence and structure elements important for these processes. We also used the predictive models to convert a naturally occurring ChR incapable of mammalian localization into one that localizes well. PMID:29059183

  17. Initial experience with 3D isotropic high-resolution 3 T MR arthrography of the wrist.

    PubMed

    Sutherland, John K; Nozaki, Taiki; Kaneko, Yasuhito; J Yu, Hon; Rafijah, Gregory; Hitt, David; Yoshioka, Hiroshi

    2016-01-16

    Our study was performed to evaluate the image quality of 3 T MR wrist arthrograms with attention to ulnar wrist structures, comparing image quality of isotropic 3D proton density fat suppressed turbo spin echo (PDFS TSE) sequence versus standard 2D 3 T sequences as well as comparison with 1.5 T MR arthrograms. Eleven consecutive 3 T MR wrist arthrograms were performed and the following sequences evaluated: 3D isotropic PDFS, repetition time/echo time (TR/TE) 1400/28.3 ms, voxel size 0.35x0.35x0.35 mm, acquisition time 5 min; 2D coronal sequences with slice thickness 2 mm: T1 fat suppressed turbo spin echo (T1FS TSE) (TR/TE 600/20 ms); proton density (PD) TSE (TR/TE 3499/27 ms). A 1.5 T group of 18 studies with standard sequences were evaluated for comparison. All MR imaging followed fluoroscopically guided intra-articular injection of dilute gadolinium contrast. Qualitative assessment related to delineation of anatomic structures between 1.5 T and 3 T MR arthrograms was carried out using Mann-Whitney test and the differences in delineation of anatomic structures among each sequence in 3 T group were analyzed with Wilcoxon signed-rank test. Quantitative assessment of mean relative signal intensity (SI) and relative contrast measurements was performed using Wilcoxon signed-rank test. Mean qualitative scores for 3 T sequences were significantly higher than 1.5 T (p < 0.01), with isotropic 3D PDFS sequence having highest mean qualitative scores (p < 0.05). Quantitative analysis demonstrated no significant difference in relative signal intensity among the 3 T sequences. Significant differences were found in relative contrast between fluid-bone and fluid-fat comparing 3D and 2D PDFS (p < 0.01). 3D isotropic PDFS sequence showed promise in both qualitative and quantitative assessment, suggesting this may be useful for MR wrist arthrograms at 3 T. 
Primary reasons for diagnostic potential include the ability to make reformations in any obliquity to follow the components of ulnar side wrist structures including triangular fibrocartilage complex. Additionally, isotropic imaging provides thinner slice thickness with less partial volume averaging allowing for identification of subtle injuries.

  18. DNA-guided nanoparticle assemblies

    DOEpatents

    Gang, Oleg; Nykypanchuk, Dmytro; Maye, Mathew; van der Lelie, Daniel

    2013-07-16

    In some embodiments, DNA-capped nanoparticles are used to define a degree of crystalline order in assemblies thereof. In some embodiments, thermodynamically reversible and stable body-centered cubic (bcc) structures, with particles occupying less than about 10% of the unit cell, are formed. Designs and pathways amenable to the crystallization of particle assemblies are identified. In some embodiments, a plasmonic crystal is provided. In some aspects, a method for controlling the properties of particle assemblages is provided. In some embodiments a catalyst is formed from nanoparticles linked by nucleic acid sequences and forming an open crystal structure with catalytically active agents attached to the crystal on its surface or in interstices.

  19. Improving CRISPR-Cas specificity with chemical modifications in single-guide RNAs.

    PubMed

    Ryan, Daniel E; Taussig, David; Steinfeld, Israel; Phadnis, Smruti M; Lunstad, Benjamin D; Singh, Madhurima; Vuong, Xuan; Okochi, Kenji D; McCaffrey, Ryan; Olesiak, Magdalena; Roy, Subhadeep; Yung, Chong Wing; Curry, Bo; Sampson, Jeffrey R; Bruhn, Laurakay; Dellinger, Douglas J

    2018-01-25

    CRISPR systems have emerged as transformative tools for altering genomes in living cells with unprecedented ease, inspiring keen interest in increasing their specificity for perfectly matched targets. We have developed a novel approach for improving specificity by incorporating chemical modifications in guide RNAs (gRNAs) at specific sites in their DNA recognition sequence ('guide sequence') and systematically evaluating their on-target and off-target activities in biochemical DNA cleavage assays and cell-based assays. Our results show that a chemical modification (2'-O-methyl-3'-phosphonoacetate, or 'MP') incorporated at select sites in the ribose-phosphate backbone of gRNAs can dramatically reduce off-target cleavage activities while maintaining high on-target performance, as demonstrated in clinically relevant genes. These findings reveal a unique method for enhancing specificity by chemically modifying the guide sequence in gRNAs. Our approach introduces a versatile tool for augmenting the performance of CRISPR systems for research, industrial and therapeutic applications. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. MoRFPred-plus: Computational Identification of MoRFs in Protein Sequences using Physicochemical Properties and HMM profiles.

    PubMed

    Sharma, Ronesh; Bayarjargal, Maitsetseg; Tsunoda, Tatsuhiko; Patil, Ashwini; Sharma, Alok

    2018-01-21

    Intrinsically Disordered Proteins (IDPs) lack stable tertiary structure and they actively participate in performing various biological functions. These IDPs expose short binding regions called Molecular Recognition Features (MoRFs) that permit interaction with structured protein regions. Upon interaction they undergo a disorder-to-order transition as a result of which their functionality arises. Predicting these MoRFs in disordered protein sequences is a challenging task. In this study, we present MoRFpred-plus, an improved predictor over our previous proposed predictor to identify MoRFs in disordered protein sequences. Two separate independent propensity scores are computed via incorporating physicochemical properties and HMM profiles, these scores are combined to predict final MoRF propensity score for a given residue. The first score reflects the characteristics of a query residue to be part of MoRF region based on the composition and similarity of assumed MoRF and flank regions. The second score reflects the characteristics of a query residue to be part of MoRF region based on the properties of flanks associated around the given residue in the query protein sequence. The propensity scores are processed and common averaging is applied to generate the final prediction score of MoRFpred-plus. Performance of the proposed predictor is compared with available MoRF predictors, MoRFchibi, MoRFpred, and ANCHOR. Using previously collected training and test sets used to evaluate the mentioned predictors, the proposed predictor outperforms these predictors and generates lower false positive rate. In addition, MoRFpred-plus is a downloadable predictor, which makes it useful as it can be used as input to other computational tools. https://github.com/roneshsharma/MoRFpred-plus/wiki/MoRFpred-plus:-Download. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. microMS: A Python Platform for Image-Guided Mass Spectrometry Profiling

    NASA Astrophysics Data System (ADS)

    Comi, Troy J.; Neumann, Elizabeth K.; Do, Thanh D.; Sweedler, Jonathan V.

    2017-09-01

    Image-guided mass spectrometry (MS) profiling provides a facile framework for analyzing samples ranging from single cells to tissue sections. The fundamental workflow utilizes a whole-slide microscopy image to select targets of interest, determine their spatial locations, and subsequently perform MS analysis at those locations. Improving upon prior reported methodology, a software package was developed for working with microscopy images. microMS, for microscopy-guided mass spectrometry, allows the user to select and profile diverse samples using a variety of target patterns and mass analyzers. Written in Python, the program provides an intuitive graphical user interface to simplify image-guided MS for novice users. The class hierarchy of instrument interactions permits integration of new MS systems while retaining the feature-rich image analysis framework. microMS is a versatile platform for performing targeted profiling experiments using a series of mass spectrometers. The flexibility in mass analyzers greatly simplifies serial analyses of the same targets by different instruments. The current capabilities of microMS are presented, and its application for off-line analysis of single cells on three distinct instruments is demonstrated. The software has been made freely available for research purposes.

  2. microMS: A Python Platform for Image-Guided Mass Spectrometry Profiling.

    PubMed

    Comi, Troy J; Neumann, Elizabeth K; Do, Thanh D; Sweedler, Jonathan V

    2017-09-01

    Image-guided mass spectrometry (MS) profiling provides a facile framework for analyzing samples ranging from single cells to tissue sections. The fundamental workflow utilizes a whole-slide microscopy image to select targets of interest, determine their spatial locations, and subsequently perform MS analysis at those locations. Improving upon prior reported methodology, a software package was developed for working with microscopy images. microMS, for microscopy-guided mass spectrometry, allows the user to select and profile diverse samples using a variety of target patterns and mass analyzers. Written in Python, the program provides an intuitive graphical user interface to simplify image-guided MS for novice users. The class hierarchy of instrument interactions permits integration of new MS systems while retaining the feature-rich image analysis framework. microMS is a versatile platform for performing targeted profiling experiments using a series of mass spectrometers. The flexibility in mass analyzers greatly simplifies serial analyses of the same targets by different instruments. The current capabilities of microMS are presented, and its application for off-line analysis of single cells on three distinct instruments is demonstrated. The software has been made freely available for research purposes.

  3. High-resolution seismic sequence stratigraphy and history of relative sea level changes since the Late Miocene, northern continental margin, South China Sea

    NASA Astrophysics Data System (ADS)

    Zhong, G.; Wang, L.

    2013-12-01

    The northern South China Sea (SCS) margin is suggested as one of the ideal sites for documenting the late Cenozoic sea level changes for its characteristics of rapid sedimentation and relatively stable structural subsidence since the Late Miocene. In this study, high-resolution seismic profiles acquired by the Guangzhou Marine Geological Survey, calibrated by well control from the ODP sites 1146 and 1148, were utilized to construct a time-significant sequence stratigraphic framework, from which the history of relative sea level changes since the Late Miocene on the northern SCS margin was derived. Our study area is situated in the middle segment of the margin, between the Hainan Island to the west and the Dongsha Islands to the east. This region is to a certain degree far away from the active structural zones and is suggested as the most stable region in the margin. In total, 4000 km of seismic profiles were used, covering an area of about 6×10⁴ km². The seismic data have a vertical resolution of 5 to 15 m for the Upper Miocene to Quaternary interval. Three regional seismic sequence boundaries were identified. They subdivide the Late Miocene to Quaternary into three mega-sequences, which correspond to the Quaternary, Pliocene and Late Miocene, respectively, by tying to well control. The Late Miocene mega-sequence, including 13 component sequences, is characterized by a basal incised canyon-developed interval overlain by three sets of progradational sequences formed in deep-water slope environments. The Pliocene mega-sequence consists of four sets of progradational sequences. Each sequence set contains one to three component sequences. At least 7 component sequences can be identified. The Quaternary mega-sequence consists of five sets of progradational sequences, in which the lower two constitute a retrogressive sequence set and the upper three a progradational sequence set. At least 9 component sequences can be recognized.
Most of the component sequences within the Pliocene and Quaternary mega-sequences occur adjacent to the modern shelf margin, and therefore were interpreted as shelf-marginal progradational deltaic sequences. A relative sea level curve since the Late Miocene was compiled by integrating the shift trajectory of onlap points, the stacking pattern of component sequences, and the chronostratigraphic diagrams. The curve contains about 29 cycles of relative sea level changes, showing a much higher resolution than the previous results in the region. These cycles constitute three large relative sea level rise and fall cycles. The general trend of relative sea level since the Late Miocene is a rise, which is opposite to the global sea level changes and is in accordance with previous regional research. This deviation is ascribed to the combined effects of very rapid regional subsidence and relative deficiency of sediment supply. This research was funded by the National Natural Science Foundation of China (Grant Nos. 91028003 and 41076020).

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ritchie, L.T.; Johnson, J.D.; Blond, R.M.

    The CRAC2 computer code is a revision of the Calculation of Reactor Accident Consequences computer code, CRAC, developed for the Reactor Safety Study. The CRAC2 computer code incorporates significant modeling improvements in the areas of weather sequence sampling and emergency response, and refinements to the plume rise, atmospheric dispersion, and wet deposition models. New output capabilities have also been added. This guide is to facilitate the informed and intelligent use of CRAC2. It includes descriptions of the input data, the output results, the file structures, control information, and five sample problems.

  5. Prediction of RNA secondary structures: from theory to models and real molecules

    NASA Astrophysics Data System (ADS)

    Schuster, Peter

    2006-05-01

    RNA secondary structures are derived from RNA sequences, which are strings built from the natural four letter nucleotide alphabet, {AUGC}. These coarse-grained structures, in turn, are tantamount to constrained strings over a three letter alphabet. Hence, the secondary structures are discrete objects and the number of sequences always exceeds the number of structures. The sequences built from two letter alphabets form perfect structures when the nucleotides can form a base pair, as is the case with {GC} or {AU}, but the relation between the sequences and structures differs strongly from the four letter alphabet. A comprehensive theory of RNA structure is presented, which is based on the concepts of sequence space and shape space, being a space of structures. It sets the stage for modelling processes in ensembles of RNA molecules like evolutionary optimization or kinetic folding as dynamical phenomena guided by mappings between the two spaces. The number of minimum free energy (mfe) structures is always smaller than the number of sequences, even for two letter alphabets. Folding of RNA molecules into mfe energy structures constitutes a non-invertible mapping from sequence space onto shape space. The preimage of a structure in sequence space is defined as its neutral network. Similarly the set of suboptimal structures is the preimage of a sequence in shape space. This set represents the conformation space of a given sequence. The evolutionary optimization of structures in populations is a process taking place in sequence space, whereas kinetic folding occurs in molecular ensembles that optimize free energy in conformation space. Efficient folding algorithms based on dynamic programming are available for the prediction of secondary structures for given sequences. The inverse problem, the computation of sequences for predefined structures, is an important tool for the design of RNA molecules with tailored properties.
Simultaneous folding or cofolding of two or more RNA molecules can be modelled readily at the secondary structure level and allows prediction of the most stable (mfe) conformations of complexes together with suboptimal states. Cofolding algorithms are important tools for efficient and highly specific primer design in the polymerase chain reaction (PCR) and help to explain the mechanisms of small interference RNA (si-RNA) molecules in gene regulation. The evolutionary optimization of RNA structures is illustrated by the search for a target structure and mimics aptamer selection in evolutionary biotechnology. It occurs typically in steps consisting of short adaptive phases interrupted by long epochs of little or no obvious progress in optimization. During these quasi-stationary epochs the populations are essentially confined to neutral networks where they search for sequences that allow a continuation of the adaptive process. Modelling RNA evolution as a simultaneous process in sequence and shape space provides answers to questions of the optimal population size and mutation rates. Kinetic folding is a stochastic process in conformation space. Exact solutions are derived by direct simulation in the form of trajectory sampling or by solving the master equation. The exact solutions can be approximated straightforwardly by Arrhenius kinetics on barrier trees, which represent simplified versions of conformational energy landscapes. The existence of at least one sequence forming any arbitrarily chosen pair of structures is granted by the intersection theorem. Folding kinetics is the key to understanding and designing multistable RNA molecules or RNA switches. These RNAs form two or more long lived conformations, and conformational changes occur either spontaneously or are induced through binding of small molecules or other biopolymers. RNA switches are found in nature where they act as elements in genetic and metabolic regulation. 
The reliability of RNA secondary structure prediction is limited by the accuracy with which the empirical parameters can be determined and by principal deficiencies, for example by the lack of energy contributions resulting from tertiary interactions. In addition, native structures may be determined by folding kinetics rather than by thermodynamics. We address the first problem by considering base pair probabilities or base pairing entropies, which are derived from the partition function of conformations. A high base pair probability corresponding to a low pairing entropy is taken as an indicator of a high reliability of prediction. Pseudoknots are discussed as an example of a tertiary interaction that is highly important for RNA function. Moreover, pseudoknot formation is readily incorporated into structure prediction algorithms. Some examples of experimental data on RNA secondary structures that are readily explained using the landscape concept are presented. They deal with (i) properties of RNA molecules with random sequences, (ii) RNA molecules from restricted alphabets, (iii) existence of neutral networks, (iv) shape space covering, (v) riboswitches and (vi) evolution of non-coding RNAs as an example of evolution restricted to neutral networks.
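
    The abstract above refers to dynamic programming folding algorithms and to structures written as constrained strings over a three letter alphabet. As a rough illustration only, the sketch below uses Nussinov-style base-pair maximization, which is a simplification and not the minimum free energy model the abstract describes. The function names, the allowed pair set (Watson-Crick plus GU wobble) and the minimum hairpin loop of three unpaired bases are assumptions made for this example.

```python
# Assumed pairing rules: Watson-Crick pairs plus the GU wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for an RNA sequence.

    Maximizes the number of nested base pairs; it does NOT compute
    free energies, so it only hints at how mfe folding works.
    """
    n = len(seq)
    # best[i][j] = maximum number of pairs within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case: position i stays unpaired
            for k in range(i + min_loop + 1, j + 1):
                if can_pair(seq[i], seq[k]):  # case: i pairs with k
                    right = best[k + 1][j] if k + 1 <= j else 0
                    score = max(score, 1 + best[i + 1][k - 1] + right)
            best[i][j] = score

    # Traceback to recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if can_pair(seq[i], seq[k]):
                right = best[k + 1][j] if k + 1 <= j else 0
                if best[i][j] == 1 + best[i + 1][k - 1] + right:
                    structure[i], structure[k] = "(", ")"
                    stack.append((i + 1, k - 1))
                    stack.append((k + 1, j))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)
```

    For the toy sequence GGGAAAUCC this returns three pairs and the string (((...))), a hairpin over the three letter alphabet of "(", ")" and ".". Because many sequences map to the same string, the mapping from sequences to structures is many-to-one, as the abstract notes.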

  6. Receptors and aging: structural selectivity of the rhamnose-receptor on fibroblasts as shown by Ca(2+)-mobilization and gene-expression profiles.

    PubMed

    Faury, G; Molinari, J; Rusova, E; Mariko, B; Raveaud, S; Huber, P; Velebny, V; Robert, A M; Robert, L

    2011-01-01

    Qualitative and quantitative modifications of receptors were shown to play a key role in cell and tissue aging. We recently described the properties of a rhamnose-recognizing receptor on fibroblasts involved in the mediation of age-dependent functions of these cells. Using Ca(2+)-mobilization and DNA-microarrays, we showed Ca(2+)-mobilization and changes in gene regulation in the presence of rhamnose-rich oligo- and polysaccharides (RROPs). Here, we compared the effects of several RROPs, differing in their carbohydrate sequence and molecular weights, in normal human dermal fibroblasts (NHDFs). It appeared that different structural features were required for maximal effects on Ca(2+)-mobilization and gene-expression profiles. Maximal effect on Ca(2+) influx and intracellular free calcium regulation was exhibited by RROP-1, a 50 kDa average molecular weight polysaccharide, and RROP-3, a 5 kDa average molecular weight oligosaccharide with a different carbohydrate sequence. Maximal effect on gene-expression profiles was obtained with RROP-3. These results suggest the possibility of several different transmission pathways from the rhamnose-receptor to intracellular targets, differentially affecting these two intracellular functions, with potential consequences on aging. Although of only relative specificity, this receptor site exhibits a high affinity for rhamnose, absent from vertebrate glycoconjugates. The rhamnose-receptor might well represent an evolutionarily conserved conformation of a prokaryote lectin. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  7. An atomic scale study of surface termination and digital alloy growth in InGaAs/AlAsSb multi-quantum wells.

    PubMed

    Mauger, S J C; Bozkurt, M; Koenraad, P M; Zhao, Y; Folliot, H; Bertru, N

    2016-07-20

    An atomic scale study has been performed to understand the influence of the (As,Sb) shutter sequences during interface formation on the optical properties of InGaAs/AlAsSb quantum wells. Our cross-sectional scanning tunneling microscopy results show that the onset of the Sb profile is steep in the Sb-containing layers whereas an appreciable segregation of Sb in the subsequently grown Sb free layers is observed. The steep rise of the Sb profile is due to extra Sb that is supplied to the surface prior to the growth of the Sb-containing layers. No relation is found between the (As,Sb) termination conditions of the Sb-containing layers and the resulting Sb profiles in the capping layers. Correspondingly we see that the optical properties of these quantum wells are also nearly independent of the (As,Sb) shutter sequences at the interface. Digital alloy growth in comparison to conventional molecular beam epitaxy growth was also explored. X-ray results suggest that the structural properties of the quantum well structures grown by conventional molecular beam epitaxy techniques are slightly better than those formed by digital alloy growth. However, photoluminescence studies indicate that the digital alloy samples give rise to a more intense and broader photoluminescence emission. Cross-sectional scanning tunneling microscopy measurements reveal that lateral composition modulations present in the digital alloys are responsible for the enhancement of the photoluminescence intensity and inhomogeneous broadening.

  8. Application of a cDNA microarray for profiling the gene expression of Echinococcus granulosus protoscoleces treated with albendazole and artemisinin.

    PubMed

    Lü, Guodong; Zhang, Wenbao; Wang, Jianhua; Xiao, Yunfeng; Zhao, Jun; Zhao, Jianqin; Sun, Yimin; Zhang, Chuanshan; Wang, Junhua; Lin, Renyong; Liu, Hui; Zhang, Fuchun; Wen, Hao

    2014-12-01

    Cystic echinococcosis (CE) is a neglected zoonosis that is caused by the dog-tapeworm Echinococcus granulosus. The disease is endemic worldwide. There is an urgent need to find effective drugs for the treatment of the disease. In this study, we sequenced a cDNA library constructed using RNA isolated from oncospheres, protoscoleces, cyst membrane and adult worms of E. granulosus. A total of 9065 non-redundant or unique sequences were obtained and spotted on chips as uniEST probes to profile the gene expression in protoscoleces of E. granulosus treated with the anthelmintic drugs albendazole and artemisinin, respectively. The results showed that 7 genes were up-regulated and 38 genes were down-regulated in the protoscoleces treated with albendazole. Gene analysis showed that these genes are responsible for energy metabolism, cell cycle and assembly of cell structure. We also identified 100 genes up-regulated and 6 genes down-regulated in the protoscoleces treated with artemisinin. These genes play roles in the transduction of environmental signals and in metabolism. Albendazole appeared to exert its efficacy by damaging cell structure, while artemisinin was observed to increase the formation of heterochromatin in protoscolex cells. Our results highlight the utility of using cDNA microarray methods to detect gene expression profiles of E. granulosus and, in particular, to understand the pharmacologic mechanism of anti-echinococcosis drugs. Copyright © 2014 Elsevier B.V. All rights reserved.

  9. G-Quadruplex Folds of the Human Telomere Sequence Alter the Site Reactivity and Reaction Pathway of Guanine Oxidation Compared to Duplex DNA

    PubMed Central

    Fleming, Aaron M.; Burrows, Cynthia J.

    2013-01-01

    Telomere shortening occurs during oxidative and inflammatory stress with guanine (G) as the major site of damage. In this work, a comprehensive profile of the sites of oxidation and structures of products observed from G-quadruplex and duplex structures of the human telomere sequence was studied in the G-quadruplex folds (hybrid (K+), basket (Na+), and propeller (K+ + 50% CH3CN)) resulting from the sequence 5’-(TAGGGT)4T-3’ and in an appropriate duplex containing one telomere repeat. Oxidations with four oxidant systems consisting of riboflavin photosensitization, carbonate radical generation, singlet oxygen, and the copper Fenton-like reaction were analyzed under conditions of low product conversion to determine relative reactivity. The one-electron oxidants damaged the 5’-G in G-quadruplexes leading to spiroiminodihydantoin (Sp) and 2,2,4-triamino-2H-oxazol-5-one (Z) as major products as well as 8-oxo-7,8-dihydroguanine (OG) and 5-guanidinohydantoin (Gh) in low relative yields, while oxidation in the duplex context produced damage at the 5’- and middle-Gs of GGG sequences and resulted in Gh being the major product. Addition of the reductant N-acetylcysteine (NAC) to the reaction did not alter the riboflavin-mediated damage sites, but decreased Z by 2-fold and increased OG by 5-fold, while not altering the hydantoin ratio. However, NAC completely quenched the CO3•− reactions. Singlet oxygen oxidations of the G-quadruplex showed reactivity at all Gs on the exterior faces of G-quartets and furnished the product Sp, while no oxidation was observed in the duplex context under these conditions, and addition of NAC had no effect. Because a long telomere sequence would have higher-order structures of G-quadruplexes, studies were also conducted with 5’-(TAGGGT)8-T-3’, and it provided similar oxidation profiles to the single G-quadruplex. 
Lastly, CuII/H2O2-mediated oxidations were found to be indiscriminate in the damage patterns, and 5-carboxamido-5-formamido-2-iminohydantoin (2Ih) was found to be a major duplex product, while nearly equal yields of 2Ih and Sp were observed in G-quadruplex contexts. These findings indicate that the nature of the secondary structure of folded DNA greatly alters both the reactivity of G toward oxidative stress as well as the product outcome and suggest that recognition of damage in telomeric sequences by repair enzymes may be profoundly different from that of B-form duplex DNA. PMID:23438298

  10. Protein sectors: evolutionary units of three-dimensional structure

    PubMed Central

    Halabi, Najeeb; Rivoire, Olivier; Leibler, Stanislas; Ranganathan, Rama

    2011-01-01

    Proteins display a hierarchy of structural features at primary, secondary, tertiary, and higher-order levels, an organization that guides our current understanding of their biological properties and evolutionary origins. Here, we reveal a structural organization distinct from this traditional hierarchy by statistical analysis of correlated evolution between amino acids. Applied to the S1A serine proteases, the analysis indicates a decomposition of the protein into three quasi-independent groups of correlated amino acids that we term “protein sectors”. Each sector is physically connected in the tertiary structure, has a distinct functional role, and constitutes an independent mode of sequence divergence in the protein family. Functionally relevant sectors are evident in other protein families as well, suggesting that they may be general features of proteins. We propose that sectors represent a structural organization of proteins that reflects their evolutionary histories. PMID:19703402
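
    The abstract above rests on a statistical analysis of correlated evolution between amino acid positions. The sketch below is not the authors' method; it uses plain mutual information between two alignment columns as a simpler stand-in, to show what "correlated positions" means in practice. The function names and the toy alignments are hypothetical, and the alignment is assumed to be a list of equal-length strings.

```python
from collections import Counter
from math import log2


def column(alignment, idx):
    """Return the residues found at position idx of every aligned sequence."""
    return [seq[idx] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    High values mean that knowing the residue at i helps predict the
    residue at j, which is one simple signature of co-variation.
    """
    n = len(alignment)
    col_i, col_j = column(alignment, i), column(alignment, j)
    count_i, count_j = Counter(col_i), Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a, p_b = count_i[a] / n, count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi
```

    In a toy alignment such as ["AC", "AC", "GT", "GT"] the two columns vary together and the score is 1 bit, while in ["AC", "AT", "GC", "GT"] they vary independently and the score is 0. Groups of mutually correlated positions would then be candidates for the kind of clustering the abstract describes, although real analyses also need corrections for phylogeny and sampling that this sketch leaves out.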

  11. Industrial Arts Curriculum Guide for Industrial Ceramics.

    ERIC Educational Resources Information Center

    Connecticut State Dept. of Education, Hartford. Div. of Vocational and Adult Education.

    This curriculum guide for industrial ceramics courses is part of a series of curriculum guides for use in the industrial arts curriculum in Connecticut. The guide provides information on the scope and sequence of the industrial arts curriculum, specific guidelines for industrial arts, and program goals and objectives. The content of the industrial…

  12. Personalized comprehensive molecular profiling of high risk osteosarcoma: Implications and limitations for precision medicine.

    PubMed

    Subbiah, Vivek; Wagner, Michael J; McGuire, Mary F; Sarwari, Nawid M; Devarajan, Eswaran; Lewis, Valerae O; Westin, Shanon; Kato, Shumei; Brown, Robert E; Anderson, Pete

    2015-12-01

    Despite advances in molecular medicine over recent decades, there has been little advancement in the treatment of osteosarcoma. We performed comprehensive molecular profiling in two cases of metastatic and chemotherapy-refractory osteosarcoma to guide molecularly targeted therapy. Hybridization capture of >300 cancer-related genes plus introns from 28 genes often rearranged or altered in cancer was applied to >50 ng of DNA extracted from tumor samples from two patients with recurrent, metastatic osteosarcoma. The DNA from each sample was sequenced to high, uniform coverage. Immunohistochemical probes and morphoproteomics analysis were performed, in addition to fluorescence in situ hybridization. All analyses were performed in CLIA-certified laboratories. Molecularly targeted therapy based on the resulting profiles was offered to the patients. Biomedical analytics were performed using QIAGEN's Ingenuity® Pathway Analysis. In Patient #1, comprehensive next-generation exome sequencing showed MET amplification, PIK3CA mutation, CCNE1 amplification, and PTPRD mutation. Immunohistochemistry-based morphoproteomic analysis revealed c-Met expression [(p)-c-Met (Tyr1234/1235)] and activation of mTOR/AKT pathway [IGF-1R (Tyr1165/1166), p-mTOR [Ser2448], p-Akt (Ser473)] and expression of SPARC and COX2. Targeted therapy was administered to match the PIK3CA, c-MET, and SPARC and COX2 aberrations with sirolimus+ crizotinib and abraxane+ celecoxib. In Patient #2, aberrations included NF2 loss in exons 2-16, PDGFRα amplification, and TP53 mutation. This patient was enrolled on a clinical trial combining targeted agents temsirolimus, sorafenib and bevacizumab, to match NF2, PDGFRα and TP53 aberrations. Neither patient benefited from matched therapy. Relapsed osteosarcoma is characterized by complex signaling and drug resistance pathways. Comprehensive molecular profiling holds great promise for tailoring personalized therapies for cancer.
Methods for such profiling are evolving and need to be refined to better assist clinicians in making treatment decisions based on the large amount of data that results from this type of testing. Further research in this area is warranted.

  13. MultiSeq: unifying sequence and structure data for evolutionary analysis

    PubMed Central

    Roberts, Elijah; Eargle, John; Wright, Dan; Luthey-Schulten, Zaida

    2006-01-01

    Background Since the publication of the first draft of the human genome in 2000, bioinformatic data have been accumulating at an overwhelming pace. Currently, more than 3 million sequences and 35 thousand structures of proteins and nucleic acids are available in public databases. Finding correlations in and between these data to answer critical research questions is extremely challenging. This problem needs to be approached from several directions: information science to organize and search the data; information visualization to assist in recognizing correlations; mathematics to formulate statistical inferences; and biology to analyze chemical and physical properties in terms of sequence and structure changes. Results Here we present MultiSeq, a unified bioinformatics analysis environment that allows one to organize, display, align and analyze both sequence and structure data for proteins and nucleic acids. While special emphasis is placed on analyzing the data within the framework of evolutionary biology, the environment is also flexible enough to accommodate other usage patterns. The evolutionary approach is supported by the use of predefined metadata, adherence to standard ontological mappings, and the ability for the user to adjust these classifications using an electronic notebook. MultiSeq contains a new algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of a homologous group of distantly related proteins. The method, based on the multidimensional QR factorization of multiple sequence and structure alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. 
Conclusion MultiSeq is a major extension of the Multiple Alignment tool that is provided as part of VMD, a structural visualization program for analyzing molecular dynamics simulations. Both are freely distributed by the NIH Resource for Macromolecular Modeling and Bioinformatics and MultiSeq is included with VMD starting with version 1.8.5. The MultiSeq website has details on how to download and use the software: PMID:16914055

  14. Discovery of a Highly Selective JAK2 Inhibitor, BMS-911543, for the Treatment of Myeloproliferative Neoplasms

    PubMed Central

    2015-01-01

    JAK2 kinase inhibitors are a promising new class of agents for the treatment of myeloproliferative neoplasms and have potential for the treatment of other diseases possessing a deregulated JAK2-STAT pathway. X-ray structure and ADME guided refinement of C-4 heterocycles to address metabolic liability present in dialkylthiazole 1 led to the discovery of a clinical candidate, BMS-911543 (11), with excellent kinome selectivity, in vivo PD activity, and safety profile. PMID:26288683

  15. Guide for Inspection of Coatings Applied to Hydraulic Structures.

    DTIC Science & Technology

    1986-04-01

    observed, many specifiers provide that the surfaces must be sweep blasted to provide a "tooth" before topcoating, or wiped with some strong solvent. CW...termed anchor pattern, profile, or "tooth," and is essentially a pattern of peaks and valleys on the steel surface. This pattern is obtained by abrasive...situation, a sealer must be applied. Blushing Blushing is the hazing or whitening of the finish as a result of the absorption and retention of

  16. Excursion Guide-Book: International Symposium: Time Frequency and Dating in Geomorphology Held in Czechoslovakia on 16-21 June 1992,

    DTIC Science & Technology

    1992-06-21

    fauna as well as a small-sized Middle Palaeolithic industry indicating that this travertine body formed itself in a warm phase of the younger half of the... Palaeolithic industry occurs which is at present investigated by systematic excavations of the Archaeologic Institute of SAS showing in detail the structure of...4 (5) cultural layers with Mid-Palaeolithic instruments in one profile. Based on its fauna analysis the travertine was backdated to the culmination

  17. Diverse Molecular Targets for Chalcones with Varied Bioactivities

    PubMed Central

    Zhou, Bo; Xing, Chengguo

    2015-01-01

    Natural or synthetic chalcones with different substituents have revealed a variety of biological activities that may benefit human health. The underlying mechanisms of action, particularly with respect to the direct cellular targets and the modes of interaction with the targets, have not been rigorously characterized, which imposes challenges to structure-guided rational development of therapeutic agents or chemical probes with acceptable target-selectivity profile. This review summarizes literature evidence on chalcones’ direct molecular targets in the context of their biological activities. PMID:26798565

  18. Foot-Matics. Teacher's Guide.

    ERIC Educational Resources Information Center

    Frame, Laurence

    This teacher's guide contains the following sections: Teacher Objectives; Student Objectives; Teacher Aide Suggestions; Objectives Overview; Scope and Sequence (K-8); Teacher's Guide; NFL Public Relations Director; NFL Team Addresses and Art (Helmet) Pages; Football Field Dimensions; Age Problems; Statistics from a Newspaper; Standings; Weight…

  19. Monitoring corrosion of rebar embedded in mortar using guided ultrasonic waves

    NASA Astrophysics Data System (ADS)

    Ervin, Benjamin Lee

    This thesis investigates the use of guided mechanical waves for monitoring uniform and localized corrosion in steel reinforcing bars embedded in concrete. The main forms of structural deterioration from uniform corrosion in reinforced concrete are the destruction of the bond between steel and concrete, the loss of steel cross-sectional area, and the loss of concrete cross-sectional area from cracking and spalling. Localized corrosion, or pitting, leads to severe loss of steel cross-sectional area, creating a high risk of bar tensile failure and unintended transfer of loads to the surrounding concrete. Reinforcing bars were used to guide the waves, rather than bulk concrete, allowing for longer inspection distances due to lower material absorption, scattering, and divergence. Guided mechanical waves in low frequency ranges (50-200 kHz) and higher frequency ranges (2-8 MHz) were monitored in reinforced mortar specimens undergoing accelerated uniform corrosion. The frequency ranges chosen contain wave modes with varying amounts of interaction, i.e. displacement profile, at the material interface. Lower frequency modes were shown to be sensitive to the accumulation of corrosion product and the level of bond between the surrounding mortar and rebar. This allows for the onset of corrosion and bond deterioration to be monitored. Higher frequency modes were shown to be sensitive to changes in the bar profile surface, allowing for the loss of cross-sectional area to be monitored. Guided mechanical waves in the higher frequency range were also used to monitor reinforced mortar specimens undergoing accelerated localized corrosion. The high frequency modes were sensitive to the localized attack. Also promising was the unique frequency spectrum response for both uniform and localized corrosion, allowing the two corrosion types to be differentiated from through-transmission evaluation. 
The isolated effects of the reinforcing ribs, simulated debonding, simulated pitting, water surrounding, and mortar surrounding were also investigated using guided mechanical waves. Results are presented and discussed within the framework of a corrosion process degradation model and service life. A thorough review and discussion of the corrosion process, modeling the propagation of corrosion, nondestructive methods for monitoring corrosion in reinforced concrete, and guided mechanical waves have also been presented.

  20. in silico Whole Genome Sequencer & Analyzer (iWGS): A Computational Pipeline to Guide the Design and Analysis of de novo Genome Sequencing Studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Xiaofan; Peris, David; Kominek, Jacek

    The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding its potential for understanding the biology and evolution of the full spectrum of biodiversity. The increasing diversity of sequencing technologies, assays, and de novo assembly algorithms has augmented the complexity of de novo genome sequencing projects in nonmodel organisms. To reduce the costs and challenges in de novo genome sequencing projects and streamline their experimental design and analysis, we developed iWGS (in silico Whole Genome Sequencer and Analyzer), an automated pipeline for guiding the choice of appropriate sequencing strategy and assembly protocols. iWGS seamlessly integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control, de novo assembly, and assembly evaluation and validation. The last three steps can also be applied to the analysis of real data. iWGS is designed to enable the user to have great flexibility in testing the range of experimental designs available for genome sequencing projects, and supports all major sequencing technologies and popular assembly tools. Three case studies illustrate how iWGS can guide the design of de novo genome sequencing projects, and evaluate the performance of a wide variety of user-specified sequencing strategies and assembly protocols on genomes of differing architectures. iWGS, along with detailed documentation, is freely available at https://github.com/zhouxiaofan1983/iWGS.

  1. in silico Whole Genome Sequencer & Analyzer (iWGS): A Computational Pipeline to Guide the Design and Analysis of de novo Genome Sequencing Studies

    DOE PAGES

    Zhou, Xiaofan; Peris, David; Kominek, Jacek; ...

    2016-09-16

    The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding its potential for understanding the biology and evolution of the full spectrum of biodiversity. The increasing diversity of sequencing technologies, assays, and de novo assembly algorithms has augmented the complexity of de novo genome sequencing projects in nonmodel organisms. To reduce the costs and challenges in de novo genome sequencing projects and streamline their experimental design and analysis, we developed iWGS (in silico Whole Genome Sequencer and Analyzer), an automated pipeline for guiding the choice of appropriate sequencing strategy and assembly protocols. iWGS seamlessly integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control, de novo assembly, and assembly evaluation and validation. The last three steps can also be applied to the analysis of real data. iWGS is designed to enable the user to have great flexibility in testing the range of experimental designs available for genome sequencing projects, and supports all major sequencing technologies and popular assembly tools. Three case studies illustrate how iWGS can guide the design of de novo genome sequencing projects, and evaluate the performance of a wide variety of user-specified sequencing strategies and assembly protocols on genomes of differing architectures. iWGS, along with detailed documentation, is freely available at https://github.com/zhouxiaofan1983/iWGS.

  2. Anisotropy-driven transition from the Moore-Read state to quantum Hall stripes

    NASA Astrophysics Data System (ADS)

    Zhu, Zheng; Sodemann, Inti; Sheng, D. N.; Fu, Liang

    2017-05-01

    We investigate the nature of the quantum Hall liquid in a half-filled second Landau level (n =1 ) as a function of band mass anisotropy using numerical exact diagonalization and density matrix renormalization group methods. We find increasing the mass anisotropy induces a quantum phase transition from the Moore-Read state to a charge density wave state. By analyzing the energy spectrum, guiding center structure factors, and by adding weak pinning potentials, we show that this charge density wave is a unidirectional quantum Hall stripe, which has a periodicity of a few magnetic lengths and survives in the thermodynamic limit. We find smooth profiles for the guiding center occupation function that reveal the strong coupling nature of the array of chiral Luttinger liquids residing at the stripe edges.

  3. A flexible motif search technique based on generalized profiles.

    PubMed

    Bucher, P; Karplus, K; Moeri, N; Hofmann, K

    1996-03-01

    A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.
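
    Weight matrices are one of the motif descriptors that the abstract above says generalized profiles subsume, and the search problem it poses is finding multiple instances of a motif in one sequence. The sketch below shows only that simplest case: a log-odds weight matrix built from aligned motif instances and slid along a sequence. It does not implement the generalized profile syntax or the paper's definition of disjoint alignments. The function names, the pseudocount of 1 and the uniform background of 0.25 are assumptions made for this example.

```python
from math import log2


def build_pwm(instances, alphabet="ACGT", pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length motif instances.

    Returns one dict per motif position, mapping each symbol to
    log2(observed frequency / background frequency).
    """
    length = len(instances[0])
    pwm = []
    for pos in range(length):
        counts = {symbol: pseudocount for symbol in alphabet}
        for instance in instances:
            counts[instance[pos]] += 1
        total = sum(counts.values())
        pwm.append({s: log2((counts[s] / total) / background) for s in alphabet})
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold.

    Symbols outside the PWM alphabet raise KeyError in this sketch.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[symbol] for col, symbol in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits
```

    With a matrix built from three copies of TATA and a threshold of 4.0, scanning GGTATACCTATAGG reports hits at positions 2 and 8, one for each copy of the motif. Gap penalties and position-specific insert states, which the generalized profiles in the abstract provide, are what this fixed-width matrix lacks.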

  4. Motion and Energy Chemical Reactions, Parts One and Two of an Integrated Science Sequence, Teacher's Guide, 1973 Edition.

    ERIC Educational Resources Information Center

    Portland Project Committee, OR.

    This teacher's guide is for the second year of the Portland Project, a three-year integrated secondary science curriculum sequence. The first of two parts in this volume, "Motion and Energy," begins with the study of motion, going from the quantitative description to a consideration of what causes motion and a discussion of Newton's…

  5. Discovering [superscript 13]C NMR, [superscript 1]H NMR, and IR Spectroscopy in the General Chemistry Laboratory through a Sequence of Guided-Inquiry Exercises

    ERIC Educational Resources Information Center

    Iler, H. Darrell; Justice, David; Brauer, Shari; Landis, Amanda

    2012-01-01

    This sequence of three guided-inquiry labs is designed for a second-semester general chemistry course and challenges students to discover basic theoretical principles associated with [superscript 13]C NMR, [superscript 1]H NMR, and IR spectroscopy. Students learn to identify and explain basic concepts of magnetic resonance and vibrational…

  6. Chemistry of Living Matter, Energy Capture & Growth, Parts Three & Four of an Integrated Science Sequence, Teacher's Guide, 1973 Edition.

    ERIC Educational Resources Information Center

    Portland Project Committee, OR.

    This teacher's guide includes parts three and four of the four-part third year Portland Project, a three-year integrated secondary science curriculum sequence. The underlying intention of the third year is to study energy and its importance to life. Energy-related concepts considered in year one and two, and the concepts related to atomic…

  7. Propagation characteristics of ultrasonic guided waves in continuously welded rail

    NASA Astrophysics Data System (ADS)

    Yao, Wenqing; Sheng, Fuwei; Wei, Xiaoyuan; Zhang, Lei; Yang, Yuan

    2017-07-01

    Rail defects cause numerous railway accidents. Trains are derailed and serious consequences often occur. Compared to traditional bulk wave testing, ultrasonic guided waves (UGWs) can provide larger monitoring ranges and complete coverage of the waveguide cross-section. These advantages are of significant importance for the non-destructive testing (NDT) of the continuously welded rail, and the technique is therefore widely used in high-speed railways. UGWs in continuous welded rail (CWR) and their propagation characteristics have been discussed in this paper. Finite element methods (FEMs) were used to accomplish a vibration modal analysis, which is extended by a subsequent dispersion analysis. Wave structure features were illustrated by displacement profiles. It was concluded that guided waves have the ability to detect defects in the rail via choice of proper mode and frequency. Additionally, thermal conduction that is caused by temperature variation in the rail is added into modeling and simulation. The results indicated that unbalanced thermal distribution may lead to the attenuation of UGWs in the rail.

  8. An overview on genome organization of marine organisms.

    PubMed

    Costantini, Maria

    2015-12-01

    In this review we concentrate on some general genome features of marine organisms and their evolution, ranging from vertebrates to invertebrates to unicellular organisms. Before genome sequencing, ultracentrifugation in CsCl gave a high-resolution view of mammalian DNA (without looking at the sequence). The analytical profile of human DNA showed that the vertebrate genome is a mosaic of isochores, typically megabase-size DNA segments that belong to a small number of families characterized by different GC levels. The recent availability of a number of fully sequenced genomes allowed the isochores to be mapped very precisely, based on DNA sequences. Since isochores are tightly linked to biological properties such as gene density, replication timing and recombination, the new level of detail provided by the isochore map helped the understanding of genome structure, function and evolution. This led to the current level of knowledge and to further insights. Copyright © 2015. Published by Elsevier B.V.
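
    Isochores are defined in the abstract above by their GC levels, and mapping them from sequence data amounts to measuring GC content along a genome. The sketch below is a minimal illustration of that measurement using fixed, non-overlapping windows. The function names are hypothetical, the window size is left as a free parameter, and the toy input is far shorter than the megabase-scale segments the abstract refers to.

```python
def gc_fraction(segment):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a segment."""
    segment = segment.upper()
    gc = sum(1 for base in segment if base in "GC")
    acgt = sum(1 for base in segment if base in "ACGT")
    return gc / acgt if acgt else 0.0


def gc_windows(sequence, window, step=None):
    """Return (start, gc_fraction) for each window along the sequence.

    With step omitted, windows do not overlap; a trailing partial
    window is dropped.
    """
    step = step or window
    return [
        (start, gc_fraction(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]
```

    Calling gc_windows("GGCCAATT", 4) yields a GC fraction of 1.0 for the first window and 0.0 for the second. Grouping adjacent windows with similar GC levels into families would be the next step toward an isochore map, and is not attempted here.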

  9. Droplet barcoding for single cell transcriptomics applied to embryonic stem cells

    PubMed Central

    Klein, Allon M; Mazutis, Linas; Akartuna, Ilke; Tallapragada, Naren; Veres, Adrian; Li, Victor; Peshkin, Leonid; Weitz, David A; Kirschner, Marc W

    2015-01-01

    Summary It has long been the dream of biologists to map gene expression at the single cell level. With such data one might track heterogeneous cell sub-populations, and infer regulatory relationships between genes and pathways. Recently, RNA sequencing has achieved single cell resolution. What is limiting is an effective way to routinely isolate and process large numbers of individual cells for quantitative in-depth sequencing. We have developed a high-throughput droplet-microfluidic approach for barcoding the RNA from thousands of individual cells for subsequent analysis by next-generation sequencing. The method shows a surprisingly low noise profile and is readily adaptable to other sequencing-based assays. We analyzed mouse embryonic stem cells, revealing in detail the population structure and the heterogeneous onset of differentiation after LIF withdrawal. The reproducibility of these high-throughput single cell data allowed us to deconstruct cell populations and infer gene expression relationships. PMID:26000487

  10. Structural phylogeny by profile extraction and multiple superimposition using electrostatic congruence as a discriminator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chakraborty, Sandeep; Rao, Basuthkar J.; Baker, Nathan A.

    2013-04-01

    Phylogenetic analysis of proteins using multiple sequence alignment (MSA) assumes an underlying evolutionary relationship in these proteins which occasionally remains undetected due to considerable sequence divergence. Structural alignment programs have been developed to unravel such fuzzy relationships. However, none of these structure based methods have used electrostatic properties to discriminate between spatially equivalent residues. We present a methodology for MSA of a set of related proteins with known structures using electrostatic properties as an additional discriminator (STEEP). STEEP first extracts a profile, then generates a multiple structural superimposition providing a consolidated spatial framework for comparing residues and finally emits the MSA. Residues that are aligned differently by including or excluding electrostatic properties can be targeted by directed evolution experiments to transform the enzymatic properties of one protein into another. We have compared STEEP results to those obtained from a MSA program (ClustalW) and a structural alignment method (MUSTANG) for chymotrypsin serine proteases. Subsequently, we used PhyML to generate phylogenetic trees for the serine and metallo-β-lactamase superfamilies from the STEEP generated MSA, and corroborated the accepted relationships in these superfamilies. We have observed that STEEP acts as a functional classifier when electrostatic congruence is used as a discriminator, and thus identifies potential targets for directed evolution experiments. In summary, STEEP is unique among phylogenetic methods for its ability to use electrostatic congruence to specify mutations that might be the source of the functional divergence in a protein family. Based on our results, we also hypothesize that the active site and its close vicinity contains enough information to infer the correct phylogeny for related proteins.

  11. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
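
    The phylogenetic-profile idea described above can be sketched in a few lines: each protein gets a presence/absence vector across genomes, and proteins with matching vectors are candidate functional partners. The abstract does not say how profiles are compared, so the Hamming-distance threshold, the function names, and the toy data below are assumptions made for illustration only.

```python
from itertools import combinations


def build_profiles(presence, genomes):
    """Map each protein to a tuple of 0/1 flags, one flag per genome."""
    return {
        protein: tuple(1 if genome in found_in else 0 for genome in genomes)
        for protein, found_in in presence.items()
    }


def hamming(a, b):
    """Number of genomes in which two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


def linked_pairs(profiles, max_distance=0):
    """Protein pairs whose profiles differ in at most max_distance genomes."""
    return [
        (p, q)
        for p, q in combinations(sorted(profiles), 2)
        if hamming(profiles[p], profiles[q]) <= max_distance
    ]


# Hypothetical data: which genomes contain a homolog of each protein.
genomes = ["g1", "g2", "g3", "g4"]
presence = {
    "P1": {"g1", "g3"},
    "P2": {"g1", "g3"},
    "P3": {"g2", "g4"},
    "P4": {"g1", "g2", "g3"},
}
profiles = build_profiles(presence, genomes)
candidates = linked_pairs(profiles)
```

    With these toy inputs only P1 and P2 share an identical profile; loosening `max_distance` to 1 also pairs each of them with P4.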

  12. The Regulatory and Kinase Domains but Not the Interdomain Linker Determine Human Double-stranded RNA-activated Kinase (PKR) Sensitivity to Inhibition by Viral Non-coding RNAs.

    PubMed

    Sunita, S; Schwartz, Samantha L; Conn, Graeme L

    2015-11-20

    Double-stranded RNA (dsRNA)-activated protein kinase (PKR) is an important component of the innate immune system that presents a crucial first line of defense against viral infection. PKR has a modular architecture comprising a regulatory N-terminal dsRNA binding domain and a C-terminal kinase domain interposed by an unstructured ∼80-residue interdomain linker (IDL). Guided by sequence alignment, we created IDL deletions in human PKR (hPKR) and regulatory/kinase domain swap human-rat chimeric PKRs to assess the contributions of each domain and the IDL to regulation of the kinase activity by RNA. Using circular dichroism spectroscopy, limited proteolysis, kinase assays, and isothermal titration calorimetry, we show that each PKR protein is properly folded with similar domain boundaries and that each exhibits comparable polyinosinic-cytidylic (poly(rI:rC)) dsRNA activation profiles and binding affinities for adenoviral virus-associated RNA I (VA RNAI) and HIV-1 trans-activation response (TAR) RNA. From these results we conclude that the IDL of PKR is not required for RNA binding or mediating changes in protein conformation or domain interactions necessary for PKR regulation by RNA. In contrast, inhibition of rat PKR by VA RNAI and TAR RNA was found to be weaker than for hPKR by 7- and >300-fold, respectively, and each human-rat chimeric domain-swapped protein showed intermediate levels of inhibition. These findings indicate that PKR sequence or structural elements in the kinase domain, present in hPKR but absent in rat PKR, are exploited by viral non-coding RNAs to accomplish efficient inhibition of PKR. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  13. DNA minor groove electrostatic potential: influence of sequence-specific transitions of the torsion angle gamma and deoxyribose conformations.

    PubMed

    Zhitnikova, M Y; Shestopalova, A V

    2017-11-01

    The structural adjustments of the sugar-phosphate DNA backbone (switching of the γ angle (O5'-C5'-C4'-C3') from canonical to alternative conformations and/or C2'-endo → C3'-endo transition of deoxyribose) lead to the sequence-specific changes in accessible surface area of both polar and non-polar atoms of the grooves and the polar/hydrophobic profile of the latter ones. The distribution of the minor groove electrostatic potential is likely to be changing as a result of such conformational rearrangements in sugar-phosphate DNA backbone. Our analysis of the crystal structures of the short free DNA fragments and calculation of their electrostatic potentials allowed us to determine: (1) the number of classical and alternative γ angle conformations in the free B-DNA; (2) changes in the minor groove electrostatic potential, depending on the conformation of the sugar-phosphate DNA backbone; (3) the effect of the DNA sequence on the minor groove electrostatic potential. We have demonstrated that the structural adjustments of the DNA double helix (the conformations of the sugar-phosphate backbone and the minor groove dimensions) induce changes in the distribution of the minor groove electrostatic potential and are sequence-specific. Therefore, these features of the minor groove sizes and distribution of minor groove electrostatic potential can be used as a signal for recognition of the target DNA sequence by protein in the implementation of the indirect readout mechanism.

  14. Identification of cancer-specific motifs in mimotope profiles of serum antibody repertoire.

    PubMed

    Gerasimov, Ekaterina; Zelikovsky, Alex; Măndoiu, Ion; Ionov, Yurij

    2017-06-07

    For fighting cancer, earlier detection is crucial. Circulating auto-antibodies produced by the patient's own immune system after exposure to cancer proteins are promising bio-markers for the early detection of cancer. Since an antibody recognizes not the whole antigen but 4-7 critical amino acids within the antigenic determinant (epitope), the whole proteome can be represented by a random peptide phage display library. This opens the possibility to develop an early cancer detection test based on a set of peptide sequences identified by comparing cancer patients' and healthy donors' global peptide profiles of antibody specificities. Due to the enormously large number of peptide sequences contained in global peptide profiles generated by next generation sequencing, a large number of cancer and control sera is required to identify cancer-specific peptides with a high degree of statistical significance. To decrease the number of peptides in profiles generated by nextgen sequencing without losing cancer-specific sequences, we generated the profiles using the phage library enriched by panning on the pool of cancer sera. To further decrease the complexity of profiles we used computational methods for transforming a list of peptides constituting the mimotope profiles into a list of motifs formed by similar peptide sequences. We have shown that the amino-acid order is meaningful in mimotope motifs since they contain significantly more peptides than motifs among peptides where amino-acids are randomly permuted. Also, the single-sample motifs significantly differ from motifs in peptides drawn from multiple samples. Finally, multiple cancer-specific motifs have been identified.

  15. Biomolecular and clinical practice in malignant pleural mesothelioma and lung cancer: what thoracic surgeons should know†

    PubMed Central

    Opitz, Isabelle; Bueno, Raphael; Lim, Eric; Pass, Harvey; Pastorino, Ugo; Boeri, Mattia; Rocco, Gaetano

    2014-01-01

    Today, molecular-profile-directed therapy is a guiding principle of modern thoracic oncology. The knowledge of new biomolecular technology applied to the diagnosis, prognosis, and treatment of lung cancer and mesothelioma should be part of the 21st century thoracic surgeons' professional competence. The European Society of Thoracic Surgeons (ESTS) Biology Club aims at providing a comprehensive insight into the basic biology of the diseases we are treating. During the 2013 ESTS Annual Meeting, different experts of the field presented the current knowledge about diagnostic and prognostic biomarkers in malignant pleural mesothelioma including new perspectives as well as the role and potential application of microRNA and genomic sequencing for lung cancer, which are summarized in the present article. PMID:24623168

  16. Measuring and Reducing Off-Target Activities of Programmable Nucleases Including CRISPR-Cas9

    PubMed Central

    Koo, Taeyoung; Lee, Jungjoon; Kim, Jin-Soo

    2015-01-01

    Programmable nucleases, which include zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and RNA-guided engineered nucleases (RGENs) repurposed from the type II clustered, regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated protein 9 (Cas9) system are now widely used for genome editing in higher eukaryotic cells and whole organisms, revolutionising almost every discipline in biological research, medicine, and biotechnology. All of these nucleases, however, induce off-target mutations at sites homologous in sequence with on-target sites, limiting their utility in many applications including gene or cell therapy. In this review, we compare methods for detecting nuclease off-target mutations. We also review methods for profiling genome-wide off-target effects and discuss how to reduce or avoid off-target mutations. PMID:25985872

  17. Characterization of Omega-WINGS galaxy clusters. I. Stellar light and mass profiles

    NASA Astrophysics Data System (ADS)

    Cariddi, S.; D'Onofrio, M.; Fasano, G.; Poggianti, B. M.; Moretti, A.; Gullieuszik, M.; Bettoni, D.; Sciarratta, M.

    2018-02-01

    Context. Galaxy clusters are the largest virialized structures in the observable Universe. Knowledge of their properties provides much useful astrophysical and cosmological information. Aims: Our aim is to derive the luminosity and stellar mass profiles of the nearby galaxy clusters of the Omega-WINGS survey and to study the main scaling relations valid for such systems. Methods: We merged data from the WINGS and Omega-WINGS databases, sorted the sources according to the distance from the brightest cluster galaxy (BCG), and calculated the integrated luminosity profiles in the B and V bands, taking into account extinction, photometric and spatial completeness, K correction, and background contribution. Then, by exploiting the spectroscopic sample we derived the stellar mass profiles of the clusters. Results: We obtained the luminosity profiles of 46 galaxy clusters, reaching r200 in 30 cases, and the stellar mass profiles of 42 of our objects. We successfully fitted all the integrated luminosity growth profiles with one or two embedded Sérsic components, deriving the main cluster parameters. Finally, we checked the main scaling relation among the cluster parameters in comparison with those obtained for a selected sample of early-type galaxies (ETGs) of the same clusters. Conclusions: We found that the nearby galaxy clusters are non-homologous structures such as ETGs and exhibit a color-magnitude (CM) red-sequence relation very similar to that observed for galaxies in clusters. These properties are not expected in the current cluster formation scenarios. In particular the existence of a CM relation for clusters, shown here for the first time, suggests that the baryonic structures grow and evolve in a similar way at all scales.

  18. Deppdb--DNA electrostatic potential properties database: electrostatic properties of genome DNA.

    PubMed

    Osypov, Alexander A; Krutinin, Gleb G; Kamzolova, Svetlana G

    2010-06-01

    The electrostatic properties of genome DNA influence its interactions with different proteins, in particular, the regulation of transcription by RNA-polymerases. DEPPDB--DNA Electrostatic Potential Properties Database--was developed to hold and provide all available information on the electrostatic properties of genome DNA combined with its sequence and annotation of biological and structural properties of genome elements and whole genomes. Genomes in DEPPDB are organized on a taxonomical basis. Currently, the database contains all the completely sequenced bacterial and viral genomes according to NCBI RefSeq. General properties of the genome DNA electrostatic potential profile and principles of its formation are revealed. This potential correlates with the GC content but does not correspond to it exactly and strongly depends on both the sequence arrangement and its context (flanking regions). Analysis of the promoter regions for bacterial and viral RNA polymerases revealed a correspondence between the scale of these proteins' physical properties and electrostatic profile patterns. We also discovered a direct correlation between the potential value and the binding frequency of RNA polymerase to DNA, supporting the idea of the role of electrostatics in these interactions. This matches a pronounced tendency of the promoter regions to possess higher values of the electrostatic potential.

  19. Reference-guided assembly of four diverse Arabidopsis thaliana genomes

    PubMed Central

    Schneeberger, Korbinian; Ossowski, Stephan; Ott, Felix; Klein, Juliane D.; Wang, Xi; Lanz, Christa; Smith, Lisa M.; Cao, Jun; Fitz, Joffrey; Warthmann, Norman; Henz, Stefan R.; Huson, Daniel H.; Weigel, Detlef

    2011-01-01

    We present whole-genome assemblies of four divergent Arabidopsis thaliana strains that complement the 125-Mb reference genome sequence released a decade ago. Using a newly developed reference-guided approach, we assembled large contigs from 9 to 42 Gb of Illumina short-read data from the Landsberg erecta (Ler-1), C24, Bur-0, and Kro-0 strains, which have been sequenced as part of the 1,001 Genomes Project for this species. Using alignments against the reference sequence, we first reduced the complexity of the de novo assembly and later integrated reads without similarity to the reference sequence. As an example, half of the noncentromeric C24 genome was covered by scaffolds that are longer than 260 kb, with a maximum of 2.2 Mb. Moreover, over 96% of the reference genome was covered by the reference-guided assembly, compared with only 87% with a complete de novo assembly. Comparisons with 2 Mb of dideoxy sequence reveal that the per-base error rate of the reference-guided assemblies was below 1 in 10,000. Our assemblies provide a detailed, genomewide picture of large-scale differences between A. thaliana individuals, most of which are difficult to access with alignment-consensus methods only. We demonstrate their practical relevance in studying the expression differences of polymorphic genes and show how the analysis of sRNA sequencing data can lead to erroneous conclusions if aligned against the reference genome alone. Genome assemblies, raw reads, and further information are accessible through http://1001genomes.org/projects/assemblies.html. PMID:21646520
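
    The statement that half of the noncentromeric C24 genome was covered by scaffolds longer than 260 kb is an N50-style summary. As a minimal sketch of how such a figure is computed, the function below returns the scaffold length at which the cumulative length of the longest scaffolds first reaches half of a reference total. The function name and the optional `genome_size` parameter are illustrative assumptions; the paper's exact accounting (for example, how centromeric sequence was excluded) is not given in the abstract.

```python
def n50(lengths, genome_size=None):
    """Length L such that scaffolds of length >= L cover half the total.

    If genome_size is given, half of that value is the target
    (an NG50-style figure); otherwise the sum of lengths is used.
    Returns 0 when the target is never reached.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Toy scaffold lengths in arbitrary units.
scaffolds = [2, 3, 4, 5, 6]
assembly_n50 = n50(scaffolds)
genome_n50 = n50(scaffolds, genome_size=30)
```

    Measuring against an external genome size rather than the assembly total gives a more conservative figure when the assembly is incomplete.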

  20. Drafting Lab Management Guide.

    ERIC Educational Resources Information Center

    Ohio State Univ., Columbus. Instructional Materials Lab.

    This manual was developed to guide drafting instructors and vocational supervisors in sequencing laboratory instruction and controlling the flow of work for a 2-year machine trades training program. The first part of the guide provides information on program management (program description, safety concerns, academic issues, implementation…
