Penetration with Long Rods: A Theoretical Framework and Comparison with Instrumented Impacts
1981-05-01
program to begin probing the details of the interaction process. The theoretical framework underlying such a program is explained in detail. The theory of...of the time sequence of events during penetration. Data from one series of experiments, reported in detail elsewhere, is presented and discussed within the theoretical framework.
USDA-ARS's Scientific Manuscript database
The availability of complete or nearly complete genome sequences from several plant species permits detailed discovery and cross-species comparison of transposable elements (TEs) at the whole genome level. We initially investigated 510 LTR-retrotransposon (LTR-RT) families that are comprised of 32,...
Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis
Steele, Joe; Bastola, Dhundy
2014-01-01
Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base–base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel–Ziv techniques from data compression. PMID:23904502
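The word-frequency idea behind the D2 statistic mentioned in this abstract can be illustrated with a short sketch. It assumes the basic, unnormalized form of D2 (the sum over all words of length k of the product of their counts in the two sequences); the function names and the choice of k are illustrative and are not taken from the review.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(a, b, k):
    """Unnormalized D2: sum over shared words of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(a, k)
    counts_b = kmer_counts(b, k)
    # Counter returns 0 for words absent from b, so unshared words add nothing.
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "ACGTTTTT", 3))
```

No alignment is computed at any point, which is why this family of methods avoids the cost of dynamic programming; normalized variants of D2 exist but are not shown here.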
Modern Computational Techniques for the HMMER Sequence Analysis
2013-01-01
This paper focuses on the latest research and critical reviews on modern computing architectures and on software- and hardware-accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications: hidden Markov models (HMM). We show a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. The characteristics of sequence analysis, such as its data- and compute-intensive nature, make it very attractive to optimize and parallelize using both traditional software approaches and innovative hardware acceleration technologies. PMID:25937944
Bidirectional Retroviral Integration Site PCR Methodology and Quantitative Data Analysis Workflow.
Suryawanshi, Gajendra W; Xu, Song; Xie, Yiming; Chou, Tom; Kim, Namshin; Chen, Irvin S Y; Kim, Sanggu
2017-06-14
Integration Site (IS) assays are a critical component of the study of retroviral integration sites and their biological significance. In recent retroviral gene therapy studies, IS assays, in combination with next-generation sequencing, have been used as a cell-tracking tool to characterize clonal stem cell populations sharing the same IS. For the accurate comparison of repopulating stem cell clones within and across different samples, the detection sensitivity, data reproducibility, and high-throughput capacity of the assay are among the most important assay qualities. This work provides a detailed protocol and data analysis workflow for bidirectional IS analysis. The bidirectional assay can simultaneously sequence both upstream and downstream vector-host junctions. Compared to conventional unidirectional IS sequencing approaches, the bidirectional approach significantly improves IS detection rates and the characterization of integration events at both ends of the target DNA. The data analysis pipeline described here accurately identifies and enumerates identical IS sequences through multiple steps of comparison that map IS sequences onto the reference genome and determine sequencing errors. Using an optimized assay procedure, we have recently published the detailed repopulation patterns of thousands of Hematopoietic Stem Cell (HSC) clones following transplant in rhesus macaques, demonstrating for the first time the precise time point of HSC repopulation and the functional heterogeneity of HSCs in the primate system. The following protocol describes the step-by-step experimental procedure and data analysis workflow that accurately identifies and quantifies identical IS sequences.
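The enumeration step described in this abstract (counting reads that share the same integration site) might be sketched as follows. This is a simplified illustration, not the published pipeline: it assumes reads have already been mapped and reduced to hypothetical (chromosome, position, strand) tuples, and it ignores the sequencing-error handling the authors describe.

```python
from collections import Counter


def enumerate_sites(mapped_reads):
    """Return {site: (read_count, relative_frequency)} for mapped reads.

    Each read is a hypothetical (chromosome, position, strand) tuple;
    reads with identical tuples are treated as the same integration site.
    """
    counts = Counter(mapped_reads)
    total = sum(counts.values())
    return {site: (n, n / total) for site, n in counts.items()}


if __name__ == "__main__":
    reads = [("chr1", 100, "+")] * 3 + [("chr2", 50, "-")]
    for site, (n, freq) in enumerate_sites(reads).items():
        print(site, n, freq)
```

The relative frequency is one simple proxy for clone abundance within a sample; comparisons across samples would need additional normalization.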
Sequence of the tomato chloroplast DNA and evolutionary comparison of solanaceous plastid genomes.
Kahlau, Sabine; Aspinall, Sue; Gray, John C; Bock, Ralph
2006-08-01
Tomato, Solanum lycopersicum (formerly Lycopersicon esculentum), has long been one of the classical model species of plant genetics. More recently, solanaceous species have become a model of evolutionary genomics, with several EST projects and a tomato genome project having been initiated. As a first contribution toward deciphering the genetic information of tomato, we present here the complete sequence of the tomato chloroplast genome (plastome). The size of this circular genome is 155,461 base pairs (bp), with an average AT content of 62.14%. It contains 114 genes and conserved open reading frames (ycfs). Comparison with the previously sequenced plastid DNAs of Nicotiana tabacum and Atropa belladonna reveals patterns of plastid genome evolution in the Solanaceae family and identifies varying degrees of conservation of individual plastid genes. In addition, we discovered several new sites of RNA editing by cytidine-to-uridine conversion. A detailed comparison of editing patterns in the three solanaceous species highlights the dynamics of RNA editing site evolution in chloroplasts. To assess the level of intraspecific plastome variation in tomato, the plastome of a second tomato cultivar was sequenced. Comparison of the two genotypes (IPA-6, bred in South America, and Ailsa Craig, bred in Europe) revealed no nucleotide differences, suggesting that the plastomes of modern tomato cultivars display very little, if any, sequence variation.
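The AT content quoted in this abstract is a simple base-composition statistic. A minimal sketch of how such a figure is computed from a sequence string is given below; the function name is illustrative, and the choice to exclude ambiguous bases from the denominator is an assumption.

```python
def at_content(seq):
    """Percentage of A and T among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * at / (at + gc)


if __name__ == "__main__":
    print(at_content("ATGCATTA"))
```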
Alignment-free genetic sequence comparisons: a review of recent approaches by word analysis.
Bonham-Carter, Oliver; Steele, Joe; Bastola, Dhundy
2014-11-01
Modern sequencing and genome assembly technologies have provided a wealth of data, which will soon require an analysis by comparison for discovery. Sequence alignment, a fundamental task in bioinformatics research, may be used but with some caveats. Seminal techniques and methods from dynamic programming are proving ineffective for this work owing to their inherent computational expense when processing large amounts of sequence data. These methods are prone to giving misleading information because of genetic recombination, genetic shuffling and other inherent biological events. New approaches from information theory, frequency analysis and data compression are available and provide powerful alternatives to dynamic programming. These new methods are often preferred, as their algorithms are simpler and are not affected by synteny-related problems. In this review, we provide a detailed discussion of computational tools, which stem from alignment-free methods based on statistical analysis from word frequencies. We provide several clear examples to demonstrate applications and the interpretations over several different areas of alignment-free analysis such as base-base correlations, feature frequency profiles, compositional vectors, an improved string composition and the D2 statistic metric. Additionally, we provide detailed discussion and an example of analysis by Lempel-Ziv techniques from data compression. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
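The Lempel-Ziv techniques mentioned at the end of this abstract measure how compressible a sequence is. The sketch below uses an LZ78-style parse as a stand-in (the review may use a different Lempel-Ziv variant): the sequence is cut into the shortest phrases not seen before, and the number of phrases serves as a complexity score.

```python
def lz78_phrase_count(seq):
    """Number of phrases in an LZ78-style parse of seq.

    The sequence is scanned left to right; each phrase is the shortest
    prefix of the remaining text that has not yet been seen as a phrase.
    A trailing, already-seen fragment counts as one final phrase.
    """
    seen = set()
    current = ""
    count = 0
    for symbol in seq:
        current += symbol
        if current not in seen:
            seen.add(current)
            count += 1
            current = ""
    if current:
        count += 1
    return count


if __name__ == "__main__":
    print(lz78_phrase_count("AAAAAAAAAAAA"), lz78_phrase_count("ACGTTGCAGATC"))
```

Repetitive sequences parse into fewer phrases than varied sequences of the same length, and comparing the phrase count of a concatenation with those of its parts gives one alignment-free notion of similarity.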
[The dilemma of data flood - reducing costs and increasing quality control].
Gassmann, B
2012-09-05
Digitization is found everywhere in sonography. Ultrasound images are now printed on a video printer with special paper only in isolated cases. Sonography procedures are increasingly documented by saving image sequences instead of still frames. Echocardiography is now routinely recorded as so-called R-R loops. In contrast-enhanced ultrasound, recording sequences is necessary to get a thorough impression of the vascular structure of interest. Handling this data flood in daily practice requires specialized software. Comparing stored and recent images or sequences during follow-up is very helpful. Nevertheless, quality control of the ultrasound system and the transducers is simple and safe: when a phantom is used for detail resolution and general image quality, the stored images and sequences are comparable over the life cycle of the system. Follow-up comparison reveals decreased image quality and transducer defects immediately.
NASA Astrophysics Data System (ADS)
Liang, G. Y.; Badnell, N. R.
2011-04-01
We present results for the electron-impact excitation of all Li-like ions from Be+ to Kr33+ which we obtained using the radiation- and Auger-damped intermediate-coupling frame transformation R-matrix approach. We have included both valence- and core-electron excitations up to the 1s^2 5l and 1s2l4l' levels, respectively. A detailed comparison of the target structure and collision data has been made for four specific ions (O5+, Ar15+, Fe23+ and Kr33+) spanning the sequence so as to assess the accuracy for the entire sequence. Effective collision strengths (Υs) are presented at temperatures ranging from 2 × 10^2 (z + 1)^2 K to 2 × 10^6 (z + 1)^2 K (where z is the residual charge of the ions, i.e. Z - 3). Detailed comparisons for the Υs are made with the results of previous calculations for several ions which span the sequence. The radiation and Auger damping effects were explored for core-excitations along the iso-electronic sequence. Furthermore, we examined the iso-electronic trends of effective collision strengths as a function of temperature. These data are made available in the archives of APAP via http://www.apap-network.org, OPEN-ADAS via http://open.adas.ac.uk, as well as anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsweb.u-strasbg.fr/cgi-bin/qcat?J/A+A/528/A69
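The scaled temperature range in this abstract depends on the residual charge z = Z - 3 of each Li-like ion. A small sketch of the arithmetic is given below, reading the range as 2 × 10^2 (z + 1)^2 K to 2 × 10^6 (z + 1)^2 K; the function name is illustrative and only the two endpoints are computed, not the tabulated grid.

```python
def temperature_range(nuclear_charge):
    """Lower and upper temperature bounds in kelvin for a Li-like ion.

    nuclear_charge is Z; the residual charge is z = Z - 3, and both
    bounds scale with (z + 1) squared.
    """
    if nuclear_charge < 4:
        raise ValueError("Li-like ions in the sequence start at Be+ (Z = 4)")
    z = nuclear_charge - 3
    scale = (z + 1) ** 2
    return 2e2 * scale, 2e6 * scale


if __name__ == "__main__":
    # Z values for the four ions singled out in the abstract.
    for name, Z in [("O5+", 8), ("Ar15+", 18), ("Fe23+", 26), ("Kr33+", 36)]:
        low, high = temperature_range(Z)
        print(f"{name}: {low:.3e} K to {high:.3e} K")
```

For Fe23+ (Z = 26, z = 23) the scale factor is 24^2 = 576, so the range runs from about 1.15 × 10^5 K to about 1.15 × 10^9 K.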
Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies
Lasitschka, Bärbel; Jones, David; Northcott, Paul; Hutter, Barbara; Jäger, Natalie; Kool, Marcel; Taylor, Michael; Lichter, Peter; Pfister, Stefan; Wolf, Stephan; Brors, Benedikt; Eils, Roland
2013-01-01
The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina’s HiSeq2000, Life Technologies’ SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics’ technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms with Sanger sequencing, which is prohibitively expensive for whole genome studies. Here we present a detailed comparison of the performance of all currently available whole genome sequencing platforms, especially regarding their ability to call SNVs and to evenly cover the genome and specific genomic regions. Unlike earlier studies, we base our comparison on four different samples, allowing us to assess the between-sample variation of the platforms. We find a pronounced GC bias in GC-rich regions for Life Technologies’ platforms, with Complete Genomics performing best here, while we see the least bias in GC-poor regions for HiSeq2000 and 5500xl. HiSeq2000 gives the most uniform coverage and displays the least sample-to-sample variation. In contrast, Complete Genomics exhibits by far the smallest fraction of bases not covered, while the SOLiD platforms reveal remarkable shortcomings, especially in covering CpG islands. When comparing the performance of the four platforms for calling SNPs, HiSeq2000 and Complete Genomics achieve the highest sensitivity, while the SOLiD platforms show the lowest false positive rate. Finally, we find that integrating sequencing data from different platforms offers the potential to combine the strengths of different technologies. In summary, our results detail the strengths and weaknesses of all four whole-genome sequencing platforms. It indicates application areas that call for a specific sequencing platform and disallow other platforms. 
This helps to identify the proper sequencing platform for whole genome studies with different application scopes. PMID:23776689
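The platform comparison in this abstract rests on two kinds of quantities: how many true variants a platform recovers and how many of its calls are wrong. The sketch below shows one common way to compute them from a call set and a reference set. The variant tuples are hypothetical, and the second value is the fraction of calls absent from the reference set, which may differ from the exact definition of "false positive rate" used in the study.

```python
def call_metrics(called, truth):
    """Return (sensitivity, false_call_fraction) for a set of variant calls.

    called and truth are iterables of hashable variant keys, for example
    hypothetical (chromosome, position, alternate_allele) tuples.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    sensitivity = true_positives / len(truth) if truth else float("nan")
    false_call_fraction = (
        (len(called) - true_positives) / len(called) if called else float("nan")
    )
    return sensitivity, false_call_fraction


if __name__ == "__main__":
    truth = {("chr1", 100, "A"), ("chr1", 200, "T"), ("chr2", 50, "G"), ("chr2", 75, "C")}
    called = {("chr1", 100, "A"), ("chr1", 200, "T"), ("chr3", 5, "G")}
    print(call_metrics(called, truth))
```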
A DS-UWB Cognitive Radio System Based on Bridge Function Smart Codes
NASA Astrophysics Data System (ADS)
Xu, Yafei; Hong, Sheng; Zhao, Guodong; Zhang, Fengyuan; di, Jinshan; Zhang, Qishan
This paper proposes a direct-sequence UWB cognitive radio system based on a bridge function smart sequence matrix and the Gaussian pulse. Because the system uses the bridge function smart code sequence as its spreading code, the zero correlation zones (ZCZs) of the bridge function sequences' autocorrelation functions can reduce the pulse interference caused by multipath fading. The modulated signal was sent through the IEEE 802.15.3a UWB channel. We analyze how the ZCZs suppress multipath interference (MPI), one of the main sources of interference in the system. The simulation in SIMULINK/MATLAB is described in detail. The results show that the system performs better than one employing a Walsh sequence square matrix, which was also verified in principle by formula.
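The Walsh baseline named in this abstract can be illustrated with a short sketch. It builds a Walsh-Hadamard matrix by the Sylvester construction and computes the periodic autocorrelation of one row. The bridge function codes themselves are not reproduced here; the point is only that Walsh rows are mutually orthogonal yet can have large autocorrelation at nonzero shifts, which is the property a zero correlation zone is meant to avoid.

```python
def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    rows = [[1]]
    while len(rows) < n:
        # Double the order: [H H] stacked over [H -H].
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def periodic_autocorr(seq, shift):
    """Periodic (cyclic) autocorrelation of a +1/-1 sequence at a given shift."""
    n = len(seq)
    return sum(seq[i] * seq[(i + shift) % n] for i in range(n))


if __name__ == "__main__":
    H = hadamard(4)
    print(H)
    print([periodic_autocorr(H[1], s) for s in range(4)])
```

With order 4, the row [1, -1, 1, -1] has autocorrelation -4 at shift 1, as large in magnitude as its peak at shift 0, so a delayed multipath copy of that code is not suppressed.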
A COMPARISON OF RESPONSE CONFIRMATION TECHNIQUES FOR AN ADJUNCTIVE SELF-STUDY PROGRAM.
ERIC Educational Resources Information Center
MEYER, DONALD E.
AN EXPERIMENT COMPARED THE EFFECTIVENESS OF FOUR METHODS OF CONFIRMING RESPONSES TO AN ADJUNCTIVE SELF-STUDY PROGRAM. THE PROGRAM WAS DESIGNED FOR AIR FORCE AIRCREWS UNDERTAKING A REFRESHER COURSE IN ENGINEERING. A SERIES OF SEQUENCED MULTIPLE CHOICE QUESTIONS EACH REFERRED TO A PAGE AND PARAGRAPH OF A PUBLICATION CONTAINING DETAILED INFORMATION…
Gruenstaeudl, Michael; Gerschler, Nico; Borsch, Thomas
2018-06-21
The sequencing and comparison of plastid genomes are becoming a standard method in plant genomics, and many researchers are using this approach to infer plant phylogenetic relationships. Due to the widespread availability of next-generation sequencing, plastid genome sequences are being generated at breakneck pace. This trend towards massive sequencing of plastid genomes highlights the need for standardized bioinformatic workflows. In particular, documentation and dissemination of the details of genome assembly, annotation, alignment and phylogenetic tree inference are needed, as these processes are highly sensitive to the choice of software and the precise settings used. Here, we present the procedure and results of sequencing, assembling, annotating and quality-checking of three complete plastid genomes of the aquatic plant genus Cabomba as well as subsequent gene alignment and phylogenetic tree inference. We accompany our findings by a detailed description of the bioinformatic workflow employed. Importantly, we share a total of eleven software scripts for each of these bioinformatic processes, enabling other researchers to evaluate and replicate our analyses step by step. The results of our analyses illustrate that the plastid genomes of Cabomba are highly conserved in both structure and gene content.
Roca, Alberto I
2014-01-01
The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.
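The matrix underlying a ProfileGrid, as described in this abstract, holds the frequency of each residue at every alignment position. A minimal sketch of that computation is shown below; it is not the JProfileGrid implementation, and the input format (a list of equal-length aligned strings) is an assumption.

```python
from collections import Counter


def residue_frequencies(alignment):
    """Per-column residue frequencies for a list of equal-length aligned sequences.

    Returns one dict per alignment column, mapping residue -> fraction of
    sequences that carry that residue at the column.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n = len(alignment)
    return [
        {res: count / n for res, count in Counter(column).items()}
        for column in zip(*alignment)
    ]


if __name__ == "__main__":
    for position, freqs in enumerate(residue_frequencies(["ACD", "ACE", "GCD", "ACD"]), 1):
        print(position, freqs)
```

Color-coding each cell of this matrix by its frequency gives the grid view; fully conserved positions show a single residue with frequency 1.0.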
Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J; Burnett, John C; Zhou, Jiehua
2016-09-22
The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of the initial library and PCR method on RNA aptamer identification. Our analysis revealed that distinct "biased sequences" and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the "biased sequences" was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that the PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy.
Mashiyama, Susan T.; Koupparis, Kyriacos; Caffrey, Conor R.; McKerrow, James H.; Babbitt, Patricia C.
2012-01-01
We performed a genome-level computational study of sequence and structure similarity, the latter using crystal structures and models, of the proteases of Homo sapiens and the human parasite Trypanosoma brucei. Using sequence and structure similarity networks to summarize the results, we constructed global views that show visually the relative abundance and variety of proteases in the degradome landscapes of these two species, and provide insights into evolutionary relationships between proteases. The results also indicate how broadly these sequence sets are covered by three-dimensional structures. These views facilitate cross-species comparisons and offer clues for drug design from knowledge about the sequences and structures of potential drug targets and their homologs. Two protease groups (“M32” and “C51”) that are very different in sequence from human proteases are examined in structural detail, illustrating the application of this global approach in mining new pathogen genomes for potential drug targets. Based on our analyses, a human ACE2 inhibitor was selected for experimental testing on one of these parasite proteases, TbM32, and was shown to inhibit it. These sequence and structure data, along with interactive versions of the protein similarity networks generated in this study, are available at http://babbittlab.ucsf.edu/resources.html. PMID:23236535
Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin
2017-01-21
RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. 
Comparison with two popular RNA comparison tools, RNApdist and RNAdistance, showed that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results are shown in an intuitive graphical manner and can be freely downloaded from the server. RNA-TVcurve, along with test examples and detailed documents, is available at http://ml.jlu.edu.cn/tvcurve/.
Yohda, Masafumi; Yagi, Osami; Takechi, Ayane; Kitajima, Mizuki; Matsuda, Hisashi; Miyamura, Naoaki; Aizawa, Tomoko; Nakajima, Mutsuyasu; Sunairi, Michio; Daiba, Akito; Miyajima, Takashi; Teruya, Morimi; Teruya, Kuniko; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Juan, Ayaka; Nakano, Kazuma; Aoyama, Misako; Terabayashi, Yasunobu; Satou, Kazuhito; Hirano, Takashi
2015-07-01
A Dehalococcoides-containing bacterial consortium that performed dechlorination of 0.20 mM cis-1,2-dichloroethene to ethene in 14 days was obtained from the sediment mud of a lotus field. To obtain detailed information about the consortium, the metagenome was analyzed using the short-read next-generation sequencer SOLiD 3. Matching the obtained sequence tags with the reference genome sequences indicated that the Dehalococcoides sp. in the consortium was highly homologous to Dehalococcoides mccartyi CBDB1 and BAV1. Sequence comparison with the reference sequence constructed from 16S rRNA gene sequences in a public database showed the presence of Sedimentibacter, Sulfurospirillum, Clostridium, Desulfovibrio, Parabacteroides, Alistipes, Eubacterium, Peptostreptococcus and Proteocatella in addition to Dehalococcoides sp. After further enrichment, the members of the consortium were narrowed down to almost three species. Finally, the full-length circular genome sequence of the Dehalococcoides sp. in the consortium, D. mccartyi IBARAKI, was determined by analyzing the metagenome with the single-molecule DNA sequencer PacBio RS. The accuracy of the sequence was confirmed by matching it to the tag sequences obtained by SOLiD 3. The genome is 1,451,062 nt and the number of CDS is 1566, which includes 3 rRNA genes and 47 tRNA genes. There are twenty-eight RDase genes that are accompanied by the genes for anchor proteins. The genome exhibits significant sequence identity with other Dehalococcoides spp. throughout the genome, but there is a significant difference in the distribution of RDase genes. The combination of a short-read next-generation DNA sequencer and a long-read single-molecule DNA sequencer gives detailed information on a bacterial consortium. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Zhang, Wenwei; Cheng, Zhuomin; Xu, Lei; Wu, Maosen; Waterhouse, Peter; Zhou, Guanghe; Li, Shifang
2009-01-01
The complete nucleotide sequence of the ssRNA genome of a Chinese GPV isolate of barley yellow dwarf virus (BYDV) was determined. It comprised 5673 nucleotides, and the deduced genome organization resembled that of members of the genus Polerovirus. It was most closely related to cereal yellow dwarf virus-RPV (77% nt identity over the entire genome; coat protein amino acid identity 79%). The GPV isolate also differs in vector specificity from other BYDV strains. Biological properties, phylogenetic analyses and detailed sequence comparisons suggest that GPV should be considered a member of a new species within the genus, and the name Wheat yellow dwarf virus-GPV is proposed.
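Identity figures such as the 77% nt identity quoted in this abstract come from comparing two aligned sequences column by column. A minimal sketch is given below. It assumes the sequences are already aligned with "-" as the gap character and counts gap-containing columns as mismatches, which is one of several conventions and not necessarily the one used in the study.

```python
def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y) for x, y in zip(a.upper(), b.upper()) if not (x == "-" and y == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACGTAC-T", "ACGAACGT"))
```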
Wada, H; Satoh, N
1994-01-01
Almost the entire sequences of 18S rDNA were determined for two chaetognaths, five echinoderms, a hemichordate, and two urochordates (a larvacean and a salp). Phylogenetic comparisons of the sequences, together with those of other deuterostomes (an ascidian, a cephalochordate, and vertebrates) and protostomes (an arthropod and a mollusc), suggest the monophyly of the deuterostomes, with the exception of the chaetognaths. Chaetognaths may not be a group of deuterostomes. The deuterostome group closest to vertebrates was the group of cephalochordates. Ascidians, larvaceans, and salps seem to form a discrete group (urochordates), in which the early divergence of larvaceans is evident. These results support the hypothesis that chordates evolved from free-living ancestors. PMID:8127885
Montoya, Leticia; Bandala, Victor M; Garay-Serrano, Edith
2015-08-01
Two pure Alnus acuminata stands established in a montane forest in central Mexico (Puebla State) were monitored between 2010 and 2013 to confirm and recognize the ectomycorrhizal (EcM) systems of A. acuminata with Lactarius cuspidoaurantiacus and Lactarius herrerae, two recently described species. Through comparison of internal transcribed spacer (ITS) of nuclear ribosomal DNA sequences from basidiomes and ectomycorrhizas sampled in the forest stands, we confirmed their ectomycorrhizal association. The phytobiont was corroborated by comparing ITS sequences obtained from EcM root tips and leaves collected in the study site and from other sequences of A. acuminata available in Genbank. Detailed morphological and anatomical descriptions of the ectomycorrhizal systems are presented and complemented with photographs.
Goossens, Dirk; Moens, Lotte N; Nelis, Eva; Lenaerts, An-Sofie; Glassee, Wim; Kalbe, Andreas; Frey, Bruno; Kopal, Guido; De Jonghe, Peter; De Rijk, Peter; Del-Favero, Jurgen
2009-03-01
We evaluated multiplex PCR amplification as a front-end for high-throughput sequencing, to widen the applicability of massive parallel sequencers for the detailed analysis of complex genomes. Using multiplex PCR reactions, we sequenced the complete coding regions of seven genes implicated in peripheral neuropathies in 40 individuals on a GS-FLX genome sequencer (Roche). The resulting dataset showed highly specific and uniform amplification. Comparison of the GS-FLX sequencing data with the dataset generated by Sanger sequencing confirmed the detection of all variants present and proved the sensitivity of the method for mutation detection. In addition, we showed that we could exploit the multiplexed PCR amplicons to determine individual copy number variation (CNV), increasing the spectrum of detected variations to both genetic and genomic variants. We conclude that our straightforward procedure substantially expands the applicability of the massive parallel sequencers for sequencing projects of a moderate number of amplicons (50-500) with typical applications in resequencing exons in positional or functional candidate regions and molecular genetic diagnostics. 2008 Wiley-Liss, Inc.
2014-01-01
Background The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393
Memory for sequences of events impaired in typical aging.
Allen, Timothy A; Morris, Andrea M; Stark, Shauna M; Fortin, Norbert J; Stark, Craig E L
2015-03-01
Typical aging is associated with diminished episodic memory performance. To improve our understanding of the fundamental mechanisms underlying this age-related memory deficit, we previously developed an integrated, cross-species approach to link converging evidence from human and animal research. This novel approach focuses on the ability to remember sequences of events, an important feature of episodic memory. Unlike existing paradigms, this task is nonspatial, nonverbal, and can be used to isolate different cognitive processes that may be differentially affected in aging. Here, we used this task to make a comprehensive comparison of sequence memory performance between younger (18-22 yr) and older adults (62-86 yr). Specifically, participants viewed repeated sequences of six colored, fractal images and indicated whether each item was presented "in sequence" or "out of sequence." Several out of sequence probe trials were used to provide a detailed assessment of sequence memory, including: (i) repeating an item from earlier in the sequence ("Repeats"; e.g., ABADEF), (ii) skipping ahead in the sequence ("Skips"; e.g., ABDDEF), and (iii) inserting an item from a different sequence into the same ordinal position ("Ordinal Transfers"; e.g., AB3DEF). We found that older adults performed as well as younger controls when tested on well-known and predictable sequences, but were severely impaired when tested using novel sequences. Importantly, overall sequence memory performance in older adults steadily declined with age, a decline not detected with other measures (RAVLT or BPS-O). We further characterized this deficit by showing that performance of older adults was severely impaired on specific probe trials that required detailed knowledge of the sequence (Skips and Ordinal Transfers), and was associated with a shift in their underlying mnemonic representation of the sequences. 
Collectively, these findings provide unambiguous evidence that the capacity to remember sequences of events is fundamentally affected by typical aging. © 2015 Allen et al.; Published by Cold Spring Harbor Laboratory Press.
Memory for sequences of events impaired in typical aging
Allen, Timothy A.; Morris, Andrea M.; Stark, Shauna M.; Fortin, Norbert J.
2015-01-01
Typical aging is associated with diminished episodic memory performance. To improve our understanding of the fundamental mechanisms underlying this age-related memory deficit, we previously developed an integrated, cross-species approach to link converging evidence from human and animal research. This novel approach focuses on the ability to remember sequences of events, an important feature of episodic memory. Unlike existing paradigms, this task is nonspatial, nonverbal, and can be used to isolate different cognitive processes that may be differentially affected in aging. Here, we used this task to make a comprehensive comparison of sequence memory performance between younger (18–22 yr) and older adults (62–86 yr). Specifically, participants viewed repeated sequences of six colored, fractal images and indicated whether each item was presented “in sequence” or “out of sequence.” Several out of sequence probe trials were used to provide a detailed assessment of sequence memory, including: (i) repeating an item from earlier in the sequence (“Repeats”; e.g., ABADEF), (ii) skipping ahead in the sequence (“Skips”; e.g., ABDDEF), and (iii) inserting an item from a different sequence into the same ordinal position (“Ordinal Transfers”; e.g., AB3DEF). We found that older adults performed as well as younger controls when tested on well-known and predictable sequences, but were severely impaired when tested using novel sequences. Importantly, overall sequence memory performance in older adults steadily declined with age, a decline not detected with other measures (RAVLT or BPS-O). We further characterized this deficit by showing that performance of older adults was severely impaired on specific probe trials that required detailed knowledge of the sequence (Skips and Ordinal Transfers), and was associated with a shift in their underlying mnemonic representation of the sequences. 
Collectively, these findings provide unambiguous evidence that the capacity to remember sequences of events is fundamentally affected by typical aging. PMID:25691514
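The three out-of-sequence probe types named in the abstract can be illustrated with a minimal sketch. The function names, the zero-based `position` argument, and the use of letters and digits to stand for the fractal images are assumptions made for illustration; this is not the authors' task code.

```python
def _replace_item(sequence, position, item):
    """Return `sequence` with the item at `position` swapped for `item`."""
    return sequence[:position] + item + sequence[position + 1:]


def repeat_probe(sequence, position, earlier):
    """'Repeat': show an item from earlier in the sequence again."""
    if not 0 <= earlier < position < len(sequence):
        raise ValueError("the repeated item must come from before the probe position")
    return _replace_item(sequence, position, sequence[earlier])


def skip_probe(sequence, position, later):
    """'Skip': show an item from later in the sequence too early."""
    if not 0 <= position < later < len(sequence):
        raise ValueError("the skipped-to item must come from after the probe position")
    return _replace_item(sequence, position, sequence[later])


def ordinal_transfer_probe(sequence, position, other_sequence):
    """'Ordinal Transfer': show the item that holds the same ordinal
    position in a different sequence."""
    if len(other_sequence) != len(sequence):
        raise ValueError("both sequences must have the same length")
    return _replace_item(sequence, position, other_sequence[position])


# Reproduce the three examples given in the abstract (probe in the third slot).
print(repeat_probe("ABCDEF", 2, 0))                   # ABADEF
print(skip_probe("ABCDEF", 2, 3))                     # ABDDEF
print(ordinal_transfer_probe("ABCDEF", 2, "123456"))  # AB3DEF
```

In each case only the probed slot changes, which is why a participant needs detailed knowledge of item order to detect Skips and Ordinal Transfers.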
Jeong, Man-Ki; Soh, Ho Young; Wi, Jin Hee; Suh, Hae-Lip
2018-01-01
Notomastus koreanus sp. n., collected from the sublittoral muddy bottom of Korean waters, is described as a new species. The Korean new species closely resembles N. torquatus Hutchings & Rainer, 1979 in the chaetal arrangement and the details of abdominal segments, but differs in the position of genital pores and the absence of eyes. DNA sequences (mtCOI, 16S rRNA, and histone H3) of the new species were compared with all the available sequences of Notomastus species in the GenBank database. Three genes showed significant genetic differences between the new species and its congeners (COI: 51.2%, 16S: 38.1-47.3%, H3: 3.7-9.3%). This study also includes a comprehensive comparison of the new Korean Notomastus species with its most closely similar species, based on the morphological and genetic results.
Haygood, M G
1990-01-01
Flashlight fishes (family Anomalopidae) have light organs that contain luminous bacterial symbionts. Although the symbionts have not yet been successfully cultured, the luciferase genes have been cloned directly from the light organ of the Caribbean species, Kryptophanaron alfredi. The goal of this project was to evaluate the relationship of the symbiont to free-living luminous bacteria by comparison of genes coding for bacterial luciferase (lux genes). Hybridization of a lux AB probe from the Kryptophanaron alfredi symbiont to DNAs from 9 strains (8 species) of luminous bacteria showed that none of the strains tested had lux genes highly similar to the symbiont. The most similar were a group consisting of Vibrio harveyi, Vibrio splendidus and Vibrio orientalis. The nucleotide sequence of the luciferase alpha subunit gene (luxA) of the Kryptophanaron alfredi symbiont was determined in order to do a more detailed comparison with published luxA sequences from Vibrio harveyi, Vibrio fischeri and Photobacterium leiognathi. The hybridization results, sequence comparisons and the mol% G + C of the Kryptophanaron alfredi symbiont luxA gene suggest that the symbiont may be considered as a new species of luminous Vibrio related to Vibrio harveyi.
Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes
Cannon, Steven B.; Sterck, Lieven; Rombauts, Stephane; Sato, Shusei; Cheung, Foo; Gouzy, Jérôme; Wang, Xiaohong; Mudge, Joann; Vasdewani, Jayprakash; Schiex, Thomas; Spannagl, Manuel; Monaghan, Erin; Nicholson, Christine; Humphray, Sean J.; Schoof, Heiko; Mayer, Klaus F. X.; Rogers, Jane; Quétier, Francis; Oldroyd, Giles E.; Debellé, Frédéric; Cook, Douglas R.; Retzel, Ernest F.; Roe, Bruce A.; Town, Christopher D.; Tabata, Satoshi; Van de Peer, Yves; Young, Nevin D.
2006-01-01
Genome sequencing of the model legumes, Medicago truncatula and Lotus japonicus, provides an opportunity for large-scale sequence-based comparison of two genomes in the same plant family. Here we report synteny comparisons between these species, including details about chromosome relationships, large-scale synteny blocks, microsynteny within blocks, and genome regions lacking clear correspondence. The Lotus and Medicago genomes share a minimum of 10 large-scale synteny blocks, each with substantial collinearity and frequently extending the length of whole chromosome arms. The proportion of genes syntenic and collinear within each synteny block is relatively homogeneous. Medicago–Lotus comparisons also indicate similar and largely homogeneous gene densities, although gene-containing regions in Mt occupy 20–30% more space than Lj counterparts, primarily because of larger numbers of Mt retrotransposons. Because the interpretation of genome comparisons is complicated by large-scale genome duplications, we describe synteny, synonymous substitutions and phylogenetic analyses to identify and date a probable whole-genome duplication event. There is no direct evidence for any recent large-scale genome duplication in either Medicago or Lotus but instead a duplication predating speciation. Phylogenetic comparisons place this duplication within the Rosid I clade, clearly after the split between legumes and Salicaceae (poplar). PMID:17003129
The zebrafish reference genome sequence and its relationship to the human genome.
Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; 
Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L
2013-04-25
Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
The zebrafish reference genome sequence and its relationship to the human genome
Howe, Kerstin; Clark, Matthew D.; Torroja, Carlos F.; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E.; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C.; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T.; Guerra-Assunção, José A.; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F.; Laird, Gavin K.; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James; Tromans, 
Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M.; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Carter, Nigel P.; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M. J.; Enright, Anton; Geisler, Robert; Plasterk, Ronald H. A.; Lee, Charles; Westerfield, Monte; de Jong, Pieter J.; Zon, Leonard I.; Postlethwait, John H.; Nüsslein-Volhard, Christiane; Hubbard, Tim J. P.; Crollius, Hugues Roest; Rogers, Jane; Stemple, Derek L.
2013-01-01
Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743
Use of mutation spectra analysis software.
Rogozin, I; Kondrashov, F; Glazko, G
2001-02-01
The study and comparison of mutation(al) spectra is an important problem in molecular biology, because these spectra often reflect important features of mutations and their fixation. Such features include the interaction of DNA with various mutagens, the function of repair/replication enzymes, and properties of target proteins. It is known that mutability varies significantly along nucleotide sequences, such that mutations often concentrate at certain positions, called "hotspots," in a sequence. In this paper, we discuss in detail two approaches for mutation spectra analysis: the comparison of mutation spectra with the HG-PUBL program (FTP: sunsite.unc.edu/pub/academic/biology/dna-mutations/hyperg) and hotspot prediction with the CLUSTERM program (www.itba.mi.cnr.it/webmutation; ftp.bionet.nsc.ru/pub/biology/dbms/clusterm.zip). Several other approaches for mutational spectra analysis, such as the analysis of a target protein structure, hotspot context revealing, multiple spectra comparisons, as well as a number of mutation databases are briefly described. Mutation spectra in the lacI gene of E. coli and the human p53 gene are used for illustration of various difficulties of such analysis. Copyright 2001 Wiley-Liss, Inc.
On the normalization of the minimum free energy of RNAs by sequence length.
Trotta, Edoardo
2014-01-01
The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size.
On the Normalization of the Minimum Free Energy of RNAs by Sequence Length
Trotta, Edoardo
2014-01-01
The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size. PMID:25405875
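The length bias of the simple index (MFE divided by the number of nucleotides) can be made concrete with a minimal sketch. The affine relation between MFE and length used here, and the slope and intercept values, are hypothetical choices for illustration only; they are not taken from the paper, and the corrected normalization formula the author proposes is not reproduced because the abstract does not state it.

```python
def naive_mfe_index(mfe, length):
    """Simple index described in the abstract: MFE divided by sequence length."""
    if length <= 0:
        raise ValueError("length must be a positive number of nucleotides")
    return mfe / length


def toy_mfe(length, slope=-0.3, intercept=3.0):
    """Hypothetical MFE model (kcal/mol): a straight line that does not pass
    through the origin. Both parameter values are illustrative assumptions."""
    return slope * length + intercept


# Under this toy model the index equals slope + intercept / length, so it
# follows a hyperbola in length rather than staying constant.
lengths = [20, 50, 100, 400]
indices = [naive_mfe_index(toy_mfe(n), n) for n in lengths]
for n, index in zip(lengths, indices):
    print(f"{n:4d} nt  MFE/L = {index:.4f} kcal/mol per nt")
```

Two sequences with identical per-nucleotide folding behaviour under the toy model therefore receive different index values purely because of their lengths, which is the kind of artefact the abstract warns about.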
Steiner, S; Vogl, T J; Fischer, P; Steger, W; Neuhaus, P; Keck, H
1995-08-01
The aim of our study was to evaluate a T2-weighted turbo spin-echo (TSE) sequence in comparison with a T2-weighted spin-echo (SE) sequence in imaging focal liver lesions. In our study, 35 patients with suspected focal liver lesions were examined. The standardised imaging protocol included a conventional T2-weighted SE sequence (TR/TE = 2000/90/45, acquisition time = 10.20) as well as a T2-weighted TSE sequence (TR/TE = 4700/90, acquisition time = 6.33). S/N and C/N ratios were calculated using standard methods as the basis for quantitative evaluation. A diagnostic score was implemented to enable qualitative assessment. In 7% (n = 2) the TSE sequence enabled detection of further liver lesions of less than 1 cm in diameter. The TSE sequence was superior in depicting anatomical details. S/N and C/N ratios of anatomic and pathologic structures were higher with the TSE sequence than with the SE sequence. Our results indicate that the T2-weighted turbo spin-echo sequence is well suited to imaging focal liver lesions and reduces imaging time.
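The abstract does not say which definitions of S/N and C/N were used. As a rough sketch, one widely used convention takes the mean signal in a region of interest (ROI) divided by the standard deviation of background noise, and the difference between two ROI means divided by the same noise term. The function names and the pixel values below are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(roi, background):
    """S/N: mean ROI signal over the standard deviation of background noise."""
    return mean(roi) / stdev(background)


def contrast_to_noise(roi_a, roi_b, background):
    """C/N: absolute difference of two ROI means over the noise standard deviation."""
    return abs(mean(roi_a) - mean(roi_b)) / stdev(background)


# Hypothetical pixel intensities, chosen only to keep the arithmetic simple.
lesion = [100, 104]
liver = [60, 64]
air = [0, 2, 4]  # background region outside the body; sample SD is 2

print(signal_to_noise(lesion, air))           # 102 / 2
print(contrast_to_noise(lesion, liver, air))  # (102 - 62) / 2
```

Other conventions exist (for example, correcting the noise term for the Rayleigh distribution of magnitude images), so values are only comparable between sequences when the same definition is applied to both.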
High-resolution MRI of cranial nerves in posterior fossa at 3.0 T.
Guo, Zi-Yi; Chen, Jing; Liang, Qi-Zhou; Liao, Hai-Yan; Cheng, Qiong-Yue; Fu, Shui-Xi; Chen, Cai-Xiang; Yu, Dan
2013-02-01
To evaluate the influence of high-resolution imaging obtainable with the higher field strength of 3.0 T on the visualization of the brain nerves in the posterior fossa. In total, 20 nerves were investigated on MRI of 12 volunteers each and selected for comparison, respectively, with the FSE sequences with 5 mm and 2 mm section thicknesses and gradient recalled echo (GRE) sequences acquired with a 3.0-T scanner. The MR images were evaluated by three independent readers who rated image quality according to depiction of anatomic detail and contrast with use of a rating scale. In general, decrease of the slice thickness showed a significant increase in the detection of nerves as well as in the image quality characteristics. Comparing FSE and GRE imaging, the course of brain nerves and brainstem vessels was visualized best with use of the three-dimensional (3D) pulse sequence. The comparison revealed the clear advantage of a thin section. The increased resolution enabled immediate identification of all brainstem nerves. GRE sequence most distinctly and confidently depicted pertinent structures and enables 3D reconstruction to illustrate complex relations of the brainstem. Copyright © 2013 Hainan Medical College. Published by Elsevier B.V. All rights reserved.
Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J.; Burnett, John C.; Zhou, Jiehua
2016-01-01
The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of initial library and PCR methods in the RNA aptamer identification. Our analysis revealed that distinct “biased sequences” and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the “biased sequences” was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy. PMID:27652575
NASA Astrophysics Data System (ADS)
Doronin, Alexander; Meglinski, Igor
2017-02-01
The current report considers the development of a unified Monte Carlo (MC)-based computational model for simulating the propagation of Laguerre-Gaussian (LG) beams in a turbid, tissue-like scattering medium. With the primary goal of proving the concept of using complex light for tissue diagnosis, we explore the propagation of LG beams in comparison with Gaussian beams for both linear and circular polarization. MC simulations of radially and azimuthally polarized LG beams in turbid media have been performed; classic phenomena such as preservation of the orbital angular momentum, optical memory and helicity flip are observed, and a detailed comparison is presented and discussed.
2014-01-01
Background Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported from analysis in this paper, that such maps preserve functional co-localization of the protein structure space. Methods Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership.
Conclusions This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993
2012-01-01
Background Genetic mapping and QTL detection are powerful methodologies in plant improvement and breeding. Construction of a high-density and high-quality genetic map would be of great benefit in the production of superior grapes to meet human demand. High throughput and low cost of the recently developed next generation sequencing (NGS) technology have resulted in its wide application in genome research. Sequencing restriction-site associated DNA (RAD) might be an efficient strategy to simplify genotyping. Combining NGS with RAD has proven to be powerful for single nucleotide polymorphism (SNP) marker development. Results An F1 population of 100 individual plants was developed. In-silico digestion-site prediction was used to select an appropriate restriction enzyme for construction of a RAD sequencing library. Next generation RAD sequencing was applied to genotype the F1 population and its parents. Applying a cluster strategy for SNP modulation, a total of 1,814 high-quality SNP markers were developed: 1,121 of these were mapped to the female genetic map, 759 to the male map, and 1,646 to the integrated map. A comparison of the genetic maps to the published Vitis vinifera genome revealed both conservation and variations. Conclusions The applicability of next generation RAD sequencing for genotyping a grape F1 population was demonstrated, leading to the successful development of a genetic map with high density and quality using our designed SNP markers. Detailed analysis revealed that this newly developed genetic map can be used for a variety of genome investigations, such as QTL detection, sequence assembly and genome comparison. PMID:22908993
Amino acid sequence of the smaller basic protein from rat brain myelin
Dunkley, Peter R.; Carnegie, Patrick R.
1974-01-01
1. The complete amino acid sequence of the smaller basic protein from rat brain myelin was determined. This protein differs from myelin basic proteins of other species in having a deletion of a polypeptide of 40 amino acid residues from the centre of the molecule. 2. A detailed comparison is made of the constant and variable regions in a group of myelin basic proteins from six species. 3. An arginine residue in the rat protein was found to be partially methylated. The ratio of methylated to unmethylated arginine at this position differed from that found for the human basic protein. 4. Three tryptic peptides were isolated in more than one form. The differences between the two forms of each peptide are discussed in relation to the electrophoretic heterogeneity of myelin basic proteins, which is known to occur at alkaline pH values. 5. Detailed evidence for the amino acid sequence of the protein has been deposited as Supplementary Publication SUP 50029 at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1973) 131, 5. PMID:4141893
Whole-genome comparative analysis of three phytopathogenic Xylella fastidiosa strains.
Bhattacharyya, Anamitra; Stilwagen, Stephanie; Ivanova, Natalia; D'Souza, Mark; Bernal, Axel; Lykidis, Athanasios; Kapatral, Vinayak; Anderson, Iain; Larsen, Niels; Los, Tamara; Reznik, Gary; Selkov, Eugene; Walunas, Theresa L; Feil, Helene; Feil, William S; Purcell, Alexander; Lassez, Jean-Louis; Hawkins, Trevor L; Haselkorn, Robert; Overbeek, Ross; Predki, Paul F; Kyrpides, Nikos C
2002-09-17
Xylella fastidiosa (Xf) causes wilt disease in plants and is responsible for major economic and crop losses globally. Owing to the public importance of this phytopathogen we embarked on a comparative analysis of the complete genome of Xf pv citrus and the partial genomes of two recently sequenced strains of this species: Xf pv almond and Xf pv oleander, which cause leaf scorch in almond and oleander plants, respectively. We report a reanalysis of the previously sequenced Xf 9a5c (CVC, citrus) strain and the two "gapped" Xf genomes revealing ORFs encoding critical functions in pathogenicity and conjugative transfer. Second, a detailed whole-genome functional comparison was based on the three sequenced Xf strains, identifying the unique genes present in each strain, in addition to those shared between strains. Third, an "in silico" cellular reconstruction of these organisms was made, based on a comparison of their core functional subsystems that led to a characterization of their conjugative transfer machinery, identification of potential differences in their adhesion mechanisms, and highlighting of the absence of a classical quorum-sensing mechanism. This study demonstrates the effectiveness of comparative analysis strategies in the interpretation of genomes that are closely related.
Metal-poor stars. IV - The evolution of red giants.
NASA Technical Reports Server (NTRS)
Rood, R. T.
1972-01-01
Detailed evolutionary calculations for six Population-II red giants are presented. The first five of these models are followed from the zero age main sequence to the onset of the helium flash. The sixth model allows the effect of direct electron-neutrino interactions to be estimated. The updated input physics and evolutionary code are described briefly. The results of the calculations are presented in a manner pertinent to later stages of evolution and suitable for comparison with observations.
Infrared space observatory photometry of circumstellar dust in Vega-type systems
NASA Technical Reports Server (NTRS)
Fajardo-Acosta, S. B.; Stencel, R. E.; Backman, D. E.; Thakur, N.
1998-01-01
The ISOPHOT (Infrared Space Observatory Photometry) instrument onboard the Infrared Space Observatory (ISO) was used to obtain 3.6-90 micron photometry of Vega-type systems. Photometric data were calibrated with the ISOPHOT fine calibration source 1 (FCS1). Linear regression was used to derive transformations to make comparisons to ground-based and IRAS photometry systems possible. These transformations were applied to the photometry of 14 main-sequence stars. Details of these results are reported on.
Gold, Nicola D; Jackson, Richard M
2006-02-03
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition, including structure-based ligand design and understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.
Aditiawati, Pingkan; Yohandini, Heni; Madayanti, Fida; Akhmaloka
2009-01-01
Microbial communities in an acidic hot spring, Kawah Hujan B, at the Kamojang geothermal field, West Java, Indonesia, were examined using culture-dependent and culture-independent strategies. Chemical analysis of the hot spring water showed characteristics of acidic-sulfate geothermal activity, with high sulfate concentrations and low pH values (pH 1.8 to 1.9). The microbial community present in the spring was characterized by 16S rRNA gene analysis combined with denaturing gradient gel electrophoresis (DGGE). The majority of the sequences recovered by the culture-independent method were closely related to the phyla Crenarchaeota and Proteobacteria. However, detailed comparison among the members of Crenarchaeota showed some sequence variation relative to the published data, especially in the hypervariable and variable regions. In addition, the sequences did not belong to a particular genus. Meanwhile, the 16S rDNA sequences from culture-dependent samples were mostly close to Firmicutes and Gammaproteobacteria. PMID:19440252
Cerenius, Lage; Liu, Haipeng; Zhang, Yanjiao; Rimphanitchayakit, Vichien; Tassanakajon, Anchalee; Gunnar Andersson, M; Söderhäll, Kenneth; Söderhäll, Irene
2010-01-01
Crustacean hemocytes were found to produce a large number of transcripts coding for Kazal-type proteinase inhibitors (KPIs). A detailed study performed with the crayfish Pacifastacus leniusculus and the shrimp Penaeus monodon revealed the presence of at least 26 and 20 different Kazal domains from the hemocyte KPIs, respectively. Comparisons with KPIs from other taxa indicate that the sequences of these domains evolve rapidly. A few conserved positions, e.g. six invariant cysteines, were present in all domain sequences, whereas the position of the P1 amino acid, a determinant for substrate specificity, varied highly. A study with a single crayfish animal suggested that even at the individual level considerable sequence variability exists among the hemocyte KPIs produced. Expression analysis of four crayfish KPI transcripts in hematopoietic tissue cells and different hemocyte types suggests that some of these KPIs are likely to be involved in hematopoiesis or hemocyte release, as they were produced in particular hemocyte types or maturation stages only.
Content, Structure, and Sequence of the Detailing Discipline at Kendall College of Art and Design.
ERIC Educational Resources Information Center
Mulder, Bruce E.
A study identified the appropriate general content, structure, and sequence for a detailing discipline that promoted student achievement to professional levels. Its focus was the detailing discipline, a sequence of studio courses within the furniture design program at Kendall College of Art and Design, Grand Rapids, Michigan. (Detailing, an…
2009-01-01
Background: Parthenium argentatum (guayule) is an industrial crop that produces latex, which was recently commercialized as a source of latex rubber safe for people with Type I latex allergy. The complete plastid genome of P. argentatum was sequenced. The sequence provides important information useful for genetic engineering strategies. Comparison to the sequences of plastid genomes from three other members of the Asteraceae, Lactuca sativa, Guizotia abyssinica and Helianthus annuus, revealed details of the evolution of the four genomes. Chloroplast-specific DNA barcodes were developed for identification of Parthenium species and lines. Results: The complete plastid genome of P. argentatum is 152,803 bp. Based on the overall comparison of individual protein coding genes with those in L. sativa, G. abyssinica and H. annuus, we demonstrate that the P. argentatum chloroplast genome sequence is most closely related to that of H. annuus. Similar to chloroplast genomes in G. abyssinica, L. sativa and H. annuus, the plastid genome of P. argentatum has a large 23 kb inversion with a smaller 3.4 kb inversion within the large inversion. Using the matK and psbA-trnH spacer chloroplast DNA barcodes, three of the four Parthenium species tested, P. tomentosum, P. hysterophorus and P. schottii, can be differentiated from P. argentatum. In addition, we identified lines within P. argentatum. Conclusion: The genome sequence of the P. argentatum chloroplast will enrich the sequence resources of plastid genomes in commercial crops. The availability of the complete plastid genome sequence may facilitate transformation efficiency by using the precise sequence of endogenous flanking sequences and regulatory elements in chloroplast transformation vectors. The DNA barcoding study forms the foundation for genetic identification of commercially significant lines of P. argentatum that are important for producing latex. PMID:19917140
Yu, Danna; Fang, Xindong; Storey, Kenneth B; Zhang, Yongpu; Zhang, Jiayong
2016-05-01
The complete mitochondrial genomes of the yellow-bellied slider (Trachemys scripta scripta) and anoxia tolerant red-eared slider (Trachemys scripta elegans) turtles were sequenced to analyze gene arrangement. The complete mt genomes of T. s. scripta and elegans were circular molecules of 16,791 bp and 16,810 bp in length, respectively, and included an A + 1 frameshift insertion in ND3 and ND4L genes. The AT content of the overall base composition of scripta and elegans was 61.2%. Nucleotide sequence divergence of the mt-genome (p distance) between scripta and elegans was 0.4%. A detailed comparison between the mitochondrial genomes of the two subspecies is shown.
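The two summary statistics quoted in this abstract, AT content and p distance, are simple proportions. The following is a minimal sketch of how they can be computed, assuming the two sequences are already aligned; the function names and the toy fragments are invented for illustration and are not the authors' data or pipeline.

```python
def at_content(seq: str) -> float:
    """Fraction of A and T among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "AT" for base in counted) / len(counted)


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or an ambiguous base
    are skipped rather than counted as differences.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments (hypothetical, not real mitochondrial sequence).
fragment_a = "ATGCTAATTACC-GATTA"
fragment_b = "ATGCTAGTTACCAGATTA"
print(f"AT content: {at_content(fragment_a):.1%}")
print(f"p distance: {p_distance(fragment_a, fragment_b):.1%}")
```

Skipping gapped columns is one common convention (pairwise deletion); other treatments of gaps give slightly different values.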
Hand, Melanie L.; Spangenberg, German C.; Forster, John W.; Cogan, Noel O. I.
2013-01-01
Chloroplast genome sequences are of broad significance in plant biology, due to frequent use in molecular phylogenetics, comparative genomics, population genetics, and genetic modification studies. The present study used a second-generation sequencing approach to determine and assemble the plastid genomes (plastomes) of four representatives from the agriculturally important Lolium-Festuca species complex of pasture grasses (Lolium multiflorum, Festuca pratensis, Festuca altissima, and Festuca ovina). Total cellular DNA was extracted from either roots or leaves, was sequenced, and the output was filtered for plastome-related reads. A comparison between sources revealed fewer plastome-related reads from root-derived template but an increase in incidental bacterium-derived sequences. Plastome assembly and annotation indicated high levels of sequence identity and a conserved organization and gene content between species. However, frequent deletions within the F. ovina plastome appeared to contribute to a smaller plastid genome size. Comparative analysis with complete plastome sequences from other members of the Poaceae confirmed conservation of most grass-specific features. Detailed analysis of the rbcL–psaI intergenic region, however, revealed a “hot-spot” of variation characterized by independent deletion events. The evolutionary implications of this observation are discussed. The complete plastome sequences are anticipated to provide the basis for potential organelle-specific genetic modification of pasture grasses. PMID:23550121
Koide, Tie; Zaini, Paulo A.; Moreira, Leandro M.; Vêncio, Ricardo Z. N.; Matsukuma, Adriana Y.; Durham, Alan M.; Teixeira, Diva C.; El-Dorry, Hamza; Monteiro, Patrícia B.; da Silva, Ana Claudia R.; Verjovski-Almeida, Sergio; da Silva, Aline M.; Gomes, Suely L.
2004-01-01
Xylella fastidiosa is a phytopathogenic bacterium that causes serious diseases in a wide range of economically important crops. Despite extensive comparative analyses of genome sequences of Xylella pathogenic strains from different plant hosts, nonpathogenic strains have not been studied. In this report, we show that X. fastidiosa strain J1a12, associated with citrus variegated chlorosis (CVC), is nonpathogenic when injected into citrus and tobacco plants. Furthermore, a DNA microarray-based comparison of J1a12 with 9a5c, a CVC strain that is highly pathogenic and had its genome completely sequenced, revealed that 14 coding sequences of strain 9a5c are absent or highly divergent in strain J1a12. Among them, we found an arginase and a fimbrial adhesin precursor of type III pilus, which were confirmed to be absent in the nonpathogenic strain by PCR and DNA sequencing. The absence of arginase can be correlated to the inability of J1a12 to multiply in host plants. This enzyme has been recently shown to act as a bacterial survival mechanism by down-regulating host nitric oxide production. The lack of the adhesin precursor gene is in accordance with the less aggregated phenotype observed for J1a12 cells growing in vitro. Thus, the absence of both genes can be associated with the failure of the J1a12 strain to establish and spread in citrus and tobacco plants. These results provide the first detailed comparison between a nonpathogenic strain and a pathogenic strain of X. fastidiosa, constituting an important step towards understanding the molecular basis of the disease. PMID:15292146
Blake, Damer P; Oakes, Richard; Smith, Adrian L
2011-02-01
Eimeria maxima is one of the seven Eimeria spp. that infect the chicken and cause the disease coccidiosis. The well characterised immunogenicity and genetic diversity associated with E. maxima promote its use in genetics-led studies on avian coccidiosis. The development of a genetic map for E. maxima, presented here based upon 647 amplified fragment length polymorphism markers typed from 22 clonal hybrid lines and assembled into 13 major linkage groups, is a major new resource for work with this parasite. Comparison with genetic maps produced for other coccidial parasites indicates relatively high levels of genetic recombination. Conversion of ∼14% of the markers representing the major linkage groups to sequence characterised amplified region markers can provide a scaffold for the assembly of future genomic sequences as well as providing a foundation for more detailed genetic maps. Comparison with the Eimeria tenella genetic map produced 10 years ago has revealed a less biased marker distribution, with no more than nine markers mapped within any unresolved heritable unit. Nonetheless, preliminary bioinformatic characterisation of the three largest publicly available genomic E. maxima sequences suggests that the feature-poor/feature-rich structure which has previously been found to define the first sequenced E. tenella chromosome also defines the E. maxima genome. The significance of such a segmented genome and the apparent potential for variation in genetic recombination will be relevant to haplotype stability and the longevity of future anticoccidial strategies based upon multiple loci targeted by novel chemotherapeutic drugs or recombinant subunit vaccines.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Novikov, V.
1991-05-01
The U.S. Army's detailed equipment decontamination process is a stochastic flow shop which has N independent non-identical jobs (vehicles) which have overlapping processing times. This flow shop consists of up to six non-identical machines (stations). With the exception of one station, the processing times of the jobs are random variables. Based on an analysis of the processing times, the jobs for the 56 Army heavy division companies were scheduled according to the best shortest expected processing time - longest expected processing time (SEPT-LEPT) sequence. To assist in this scheduling the Gap Comparison Heuristic was developed to select the best SEPT-LEPT schedule. This schedule was then used in balancing the detailed equipment decon line in order to find the best possible site configuration subject to several constraints. The detailed troop decon line, in which all jobs are independent and identically distributed, was then balanced. Lastly, an NBC decon optimization computer program was developed using the scheduling and line balancing results. This program serves as a prototype module for the ANBACIS automated NBC decision support system. Keywords: Decontamination, Stochastic flow shop, Scheduling, Stochastic scheduling, Minimization of the makespan, SEPT-LEPT Sequences, Flow shop line balancing, ANBACIS.
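The abstract does not spell out what a SEPT-LEPT sequence is. In the stochastic scheduling literature the term usually means an ordering in which expected processing times first rise (shortest expected processing time first) and then fall (longest expected processing time first). The sketch below builds and checks such an ordering under that assumed definition. It is not the Gap Comparison Heuristic, whose details are not given here; the job names, times and choice of rising subset are hypothetical.

```python
def sept_lept_sequence(expected_times, rising_jobs):
    """Order jobs so expected times rise, then fall.

    expected_times: dict mapping job name -> expected processing time
    rising_jobs:    jobs placed in the rising (SEPT) part; all other
                    jobs go in the falling (LEPT) part.
    """
    rising = sorted(
        (job for job in expected_times if job in rising_jobs),
        key=lambda job: expected_times[job],
    )
    falling = sorted(
        (job for job in expected_times if job not in rising_jobs),
        key=lambda job: expected_times[job],
        reverse=True,
    )
    return rising + falling


def is_sept_lept(sequence, expected_times):
    """True if expected times are non-decreasing up to a peak, then non-increasing."""
    times = [expected_times[job] for job in sequence]
    if not times:
        return True
    peak = times.index(max(times))
    rises = all(times[i] <= times[i + 1] for i in range(peak))
    falls = all(times[i] >= times[i + 1] for i in range(peak, len(times) - 1))
    return rises and falls


# Hypothetical vehicles with expected decontamination times in minutes.
times = {"truck": 12, "tank": 20, "jeep": 6, "apc": 15}
order = sept_lept_sequence(times, rising_jobs={"jeep", "apc"})
print(order, is_sept_lept(order, times))
```

Each choice of rising subset yields a different SEPT-LEPT sequence, which is why a selection rule such as the heuristic named in the abstract is needed to pick among them.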
Phylum- and Class-Specific PCR Primers for General Microbial Community Analysis
Blackwood, Christopher B.; Oaks, Adam; Buyer, Jeffrey S.
2005-01-01
Amplification of a particular DNA fragment from a mixture of organisms by PCR is a common first step in methods of examining microbial community structure. The use of group-specific primers in community DNA profiling applications can provide enhanced sensitivity and phylogenetic detail compared to domain-specific primers. Other uses for group-specific primers include quantitative PCR and library screening. The purpose of the present study was to develop several primer sets targeting commonly occurring and important groups. Primers specific for the 16S ribosomal sequences of Alphaproteobacteria, Betaproteobacteria, Bacilli, Actinobacteria, and Planctomycetes and for parts of both the 18S ribosomal sequence and the internal transcribed spacer region of Basidiomycota were examined. Primers were tested by comparison to sequences in the ARB 2003 database, and chosen primers were further tested by cloning and sequencing from soil community DNA. Eighty-five to 100% of the sequences obtained from clone libraries were found to be placed with the groups intended as targets, demonstrating the specificity of the primers under field conditions. It will be important to reevaluate primers over time because of the continual growth of sequence databases and revision of microbial taxonomy. PMID:16204538
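Testing a primer against a sequence database, as described above, amounts to asking which sequences contain a site the primer can pair with and what share of those belong to the intended group. The sketch below illustrates this with the standard IUPAC degenerate-base codes and simple same-strand matching. The primer, group labels and sequences are invented, and real primer evaluation (for example with ARB) also considers reverse complements, mismatch position and annealing conditions.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_matches(primer, target, max_mismatches=0):
    """True if the primer fits somewhere in target with at most max_mismatches."""
    primer, target = primer.upper(), target.upper()
    width = len(primer)
    for start in range(len(target) - width + 1):
        mismatches = 0
        for p, t in zip(primer, target[start:start + width]):
            if t not in IUPAC[p]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # Inner loop finished without exceeding the mismatch budget.
            return True
    return False


def percent_on_target(primer, records, target_group):
    """Percentage of primer-matching sequences that belong to target_group.

    records: iterable of (group, sequence) pairs.
    Returns None when the primer matches nothing.
    """
    hits = [group for group, seq in records if primer_matches(primer, seq)]
    if not hits:
        return None
    return 100.0 * sum(group == target_group for group in hits) / len(hits)


# Hypothetical primer and database fragments.
records = [
    ("Bacilli", "TTGGATTAGCAA"),
    ("Bacilli", "ACGGATCAGCTT"),
    ("Actinobacteria", "TTGGATAAGCAA"),
    ("Actinobacteria", "CCGGATTAGCGG"),
]
print(percent_on_target("GGATYAGC", records, "Bacilli"))
```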
Gönner, Lorenz; Vitay, Julien; Hamker, Fred H.
2017-01-01
Hippocampal place-cell sequences observed during awake immobility often represent previous experience, suggesting a role in memory processes. However, recent reports of goals being overrepresented in sequential activity suggest a role in short-term planning, although a detailed understanding of the origins of hippocampal sequential activity and of its functional role is still lacking. In particular, it is unknown which mechanism could support efficient planning by generating place-cell sequences biased toward known goal locations, in an adaptive and constructive fashion. To address these questions, we propose a model of spatial learning and sequence generation as interdependent processes, integrating cortical contextual coding, synaptic plasticity and neuromodulatory mechanisms into a map-based approach. Following goal learning, sequential activity emerges from continuous attractor network dynamics biased by goal memory inputs. We apply Bayesian decoding on the resulting spike trains, allowing a direct comparison with experimental data. Simulations show that this model (1) explains the generation of never-experienced sequence trajectories in familiar environments, without requiring virtual self-motion signals, (2) accounts for the bias in place-cell sequences toward goal locations, (3) highlights their utility in flexible route planning, and (4) provides specific testable predictions. PMID:29075187
2011-01-01
Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick S. G.; Carniel, E.; Larimer, Frank W
2004-09-01
Yersinia pestis, the causative agent of plague, is a highly uniform clone that diverged recently from the enteric pathogen Yersinia pseudotuberculosis. Despite their close genetic relationship, they differ radically in their pathogenicity and transmission. Here, we report the complete genomic sequence of Y. pseudotuberculosis IP32953 and its use for detailed genome comparisons with available Y. pestis sequences. Analyses of identified differences across a panel of Yersinia isolates from around the world reveal 32 Y. pestis chromosomal genes that, together with the two Y. pestis-specific plasmids, to our knowledge, represent the only new genetic material in Y. pestis acquired since the divergence from Y. pseudotuberculosis. In contrast, 149 other pseudogenes (doubling the previous estimate) and 317 genes absent from Y. pestis were detected, indicating that as many as 13% of Y. pseudotuberculosis genes no longer function in Y. pestis. Extensive insertion sequence-mediated genome rearrangements and reductive evolution through massive gene loss, resulting in elimination and modification of preexisting gene expression pathways, appear to be more important than acquisition of genes in the evolution of Y. pestis. These results provide a sobering example of how a highly virulent epidemic clone can suddenly emerge from a less virulent, closely related progenitor.
Evolutionary genomics of miniature inverted-repeat transposable elements (MITEs) in Brassica.
Nouroz, Faisal; Noreen, Shumaila; Heslop-Harrison, J S
2015-12-01
Miniature inverted-repeat transposable elements (MITEs) are truncated derivatives of autonomous DNA transposons, and are dispersed abundantly in most eukaryotic genomes. We aimed to characterize various MITE families in Brassica in terms of their presence, sequence characteristics and evolutionary activity. Dot plot analyses involving comparison of homoeologous bacterial artificial chromosome (BAC) sequences allowed identification of 15 novel families of mobile MITEs. Of these, 5 were Stowaway-like with TA target site duplications (TSDs), 4 were Tourist-like with TAA/TTA TSDs, 5 were Mutator-like with 9-10 bp TSDs and 1 was a novel MITE (BoXMITE1) flanked by 3 bp TSDs. Our data suggested that there are about 30,000 MITE-related sequences in the Brassica rapa and B. oleracea genomes. In situ hybridization showed that one abundant family was dispersed in the A-genome, while another was located near 45S rDNA sites. PCR analysis using primers flanking sequences of MITE elements detected MITE insertion polymorphisms between and within the three Brassica (AA, BB, CC) genomes, with many insertions being specific to single genomes and others showing evidence of more recent evolutionary insertions. Our BAC sequence comparison strategy enables identification of evolutionarily active MITEs with no prior knowledge of MITE sequences. The details of MITE families reported in Brassica enable their identification, characterization and annotation. Insertion polymorphisms of MITEs and their transposition activity indicate an important mechanism of genome evolution and diversification. MITE families derived from known Mariner, Harbinger and Mutator DNA transposons were discovered, as well as some novel structures. The identification of Brassica MITEs will have broad applications in Brassica genomics, breeding, hybridization and phylogeny through their use as DNA markers.
Mulé, Sébastien; Soize, Sébastien; Benaissa, Azzedine; Portefaix, Christophe; Pierot, Laurent
2016-08-01
To investigate the ability of T2* and fluid-attenuated inversion recovery (FLAIR) MR sequences to detect hemosiderin deposition 3 months after aneurysmal subarachnoid hemorrhage (SAH) in comparison with early non-enhanced CT (NECT) as a gold standard. From September 2008 through May 2013, patients with aneurysmal SAH were included if a NECT less than 24 h after the onset of symptoms showed a SAH, and MRI, including T2* and FLAIR sequences, was performed 3 months later. All aneurysms were treated endovascularly. NECT and MR sequences were blindly analyzed for the presence of SAH (NECT) or hemosiderin deposition (MRI). When positive, details of the spatial distribution of SAH or hemosiderin deposits were noted. Sensitivities were calculated for each patient. Sensitivities, specificities, and positive predictive values (PPVs) were calculated for each location. Forty-nine patients (mean age 52.9 years) were included. Bleeding-related patterns were identified in 43 patients (87.8%) on T2* and 10 patients (20.4%) on FLAIR. T2* was highly predictive of the location of the initial hemorrhage, especially in the Sylvian cisterns (PPVs 95% and 100%) and the anterior interhemispheric fissure (PPV 90%). The T2* sequence can detect and localize a previous SAH a few months after aneurysmal bleeding.
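Sensitivity, specificity and PPV all derive from the same four counts in a 2x2 table against the gold standard. The sketch below shows the arithmetic. The per-patient counts follow from the abstract, since all 49 patients had SAH on NECT and therefore every MRI-negative patient is a false negative. The per-location counts are hypothetical, chosen only to illustrate a 95% PPV, and the function name is an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and PPV from 2x2 counts.

    A metric is None when its denominator is zero (for example,
    specificity cannot be computed if no gold-standard negatives exist).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),   # detected among true positives
        "specificity": ratio(tn, tn + fp),   # cleared among true negatives
        "ppv": ratio(tp, tp + fp),           # correct among positive calls
    }


# Per-patient analysis: 49 gold-standard positives, no negatives.
t2_star = diagnostic_metrics(tp=43, fp=0, fn=6, tn=0)
flair = diagnostic_metrics(tp=10, fp=0, fn=39, tn=0)
print(round(100 * t2_star["sensitivity"], 1))  # 87.8
print(round(100 * flair["sensitivity"], 1))    # 20.4

# Per-location analysis with hypothetical counts.
location = diagnostic_metrics(tp=19, fp=1, fn=4, tn=25)
print(round(100 * location["ppv"], 1))         # 95.0
```

Specificity is undefined at the patient level in this design, which is consistent with the abstract reporting only sensitivities per patient.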
Comparison of large-insert, small-insert and pyrosequencing libraries for metagenomic analysis.
Danhorn, Thomas; Young, Curtis R; DeLong, Edward F
2012-11-01
The development of DNA sequencing methods for characterizing microbial communities has evolved rapidly over the past decades. To evaluate more traditional, as well as newer methodologies for DNA library preparation and sequencing, we compared fosmid, short-insert shotgun and 454 pyrosequencing libraries prepared from the same metagenomic DNA samples. GC content was elevated in all fosmid libraries, compared with shotgun and 454 libraries. Taxonomic composition of the different libraries suggested that this was caused by a relative underrepresentation of dominant taxonomic groups with low GC content, notably Prochlorales and the SAR11 cluster, in fosmid libraries. While these abundant taxa had a large impact on library representation, we also observed a positive correlation between taxon GC content and fosmid library representation in other low-GC taxa, suggesting a general trend. Analysis of gene category representation in different libraries indicated that the functional composition of a library was largely a reflection of its taxonomic composition, and no additional systematic biases against particular functional categories were detected at the level of sequencing depth in our samples. Another important but less predictable factor influencing the apparent taxonomic and functional library composition was the read length afforded by the different sequencing technologies. Our comparisons and analyses provide a detailed perspective on the influence of library type on the recovery of microbial taxa in metagenomic libraries and underscore the different uses and utilities of more traditional, as well as contemporary 'next-generation' DNA library construction and sequencing technologies for exploring the genomics of the natural microbial world.
Budak, Hikmet; Kantar, Melda
2015-07-01
MicroRNAs (miRNAs) are small, endogenous, non-coding RNA molecules that regulate gene expression at the post-transcriptional level. As high-throughput next generation sequencing (NGS) and Big Data rapidly accumulate for various species, efforts for in silico identification of miRNAs intensify. Surprisingly, the effect of the input genomics sequence on the robustness of miRNA prediction was not evaluated in detail to date. In the present study, we performed a homology-based miRNA and isomiRNA prediction of the 5D chromosome of bread wheat progenitor, Aegilops tauschii, using two distinct sequence data sets as input: (1) raw sequence reads obtained from 454-GS FLX Titanium sequencing platform and (2) an assembly constructed from these reads. We also compared this method with a number of available plant sequence datasets. We report here the identification of 62 and 22 miRNAs from raw reads and the assembly, respectively, of which 16 were predicted with high confidence from both datasets. While raw reads promoted sensitivity with the high number of miRNAs predicted, 55% (12 out of 22) of the assembly-based predictions were supported by previous observations, bringing specificity forward compared to the read-based predictions, of which only 37% were supported. Importantly, raw reads could identify several repeat-related miRNAs that could not be detected with the assembly. However, raw reads could not capture 6 miRNAs, for which the stem-loops could only be covered by the relatively longer sequences from the assembly. In summary, the comparison of miRNA datasets obtained by these two strategies revealed that utilization of raw reads, as well as assemblies for in silico prediction, have distinct advantages and disadvantages. Consideration of these important nuances can benefit future miRNA identification efforts in the current age of NGS and Big Data driven life sciences innovation.
2005-09-11
Taking advantage of extra solar energy collected during the day, NASA's Mars Exploration Rover Spirit settled in for an evening of stargazing, photographing the two moons of Mars as they crossed the night sky. The first two images in this sequence show gradual enhancements in the surface detail of Mars' largest moon, Phobos, made possible through a combination technique known as "stacking." In "stacking," scientists use a mathematical process known as Laplacian sharpening to reinforce features that appear consistently in repetitive images and minimize features that show up only intermittently. In this view of Phobos, the large crater named Stickney is just out of sight on the moon's upper right limb. Spirit acquired the first two images with the panoramic camera on the night of sol 585 (Aug. 26, 2005). The far right image of Phobos, for comparison, was taken by the High Resolution Stereo Camera on Mars Express, a European Space Agency orbiter. The third image in this sequence was derived from the far right image by making it blurrier for comparison with the panoramic camera images to the left. http://photojournal.jpl.nasa.gov/catalog/PIA06335
Sierra-Garcia, Isabel Natalia; Dellagnezze, Bruna M; Santos, Viviane P; Chaves B, Michel R; Capilla, Ramsés; Santos Neto, Eugenio V; Gray, Neil; Oliveira, Valeria M
2017-01-01
Microorganisms have shown their ability to colonize extreme environments including deep subsurface petroleum reservoirs. Physicochemical parameters may vary greatly among petroleum reservoirs worldwide and so do the microbial communities inhabiting these different environments. The present work aimed at the characterization of the microbiota in biodegraded and non-degraded petroleum samples from three Brazilian reservoirs and the comparison of microbial community diversity across oil reservoirs at local and global scales using 16S rRNA clone libraries. The analysis of 620 16S rRNA bacterial and archaeal sequences obtained from Brazilian oil samples revealed 42 bacterial OTUs and 21 archaeal OTUs. The bacterial community from the degraded oil was more diverse than that from the non-degraded samples. Non-degraded oil samples were overwhelmingly dominated by gammaproteobacterial sequences with a predominance of the genera Marinobacter and Marinobacterium. Comparisons of microbial diversity among oil reservoirs worldwide suggested an apparent correlation of prokaryotic communities with reservoir temperature and depth and no influence of geographic distance among reservoirs. The detailed analysis of the phylogenetic diversity across reservoirs allowed us to define a core microbiome encompassing three bacterial classes (Gammaproteobacteria, Clostridia, and Bacteroidia) and one archaeal class (Methanomicrobia) ubiquitous in petroleum reservoirs and presumably possessing the abilities to sustain life in these environments.
Weaver, Keith E.; Kwong, Stephen M.; Firth, Neville; Francia, Maria Victoria
2009-01-01
The pheromone-responsive conjugative plasmids of Enterococcus faecalis and the multi-resistance plasmids pSK1 and pSK41 of Staphylococcus aureus are among the best studied plasmids native to Gram-positive bacteria. Although these plasmids seem largely restricted to their native hosts, protein sequence comparison of their replication initiator proteins indicates that they are clearly related. Homology searches indicate that these replicons are representatives of a large family of plasmids and a few phage that are widespread among the low G+C Gram-positive bacteria. We propose to name this family the RepA_N family of replicons after the annotated conserved domain that the initiator protein contains. Detailed sequence comparisons indicate that the initiator protein phylogeny is largely congruent with that of the host, suggesting that the replicons have evolved along with their current hosts and that intergeneric transfer has been rare. However, related proteins were identified on chromosomal regions bearing characteristics indicative of ICE elements, and the phylogeny of these proteins displayed evidence of more frequent intergeneric transfer. Comparison of stability determinants associated with the RepA_N replicons suggests that they have a modular evolution as has been observed in other plasmid families. PMID:19100285
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Biaoyang; Nasir, J.; Kalchman, M.A.
1995-02-10
We have previously cloned and characterized the murine homologue of the Huntington disease (HD) gene and shown that it maps to mouse chromosome 5 within a region of conserved synteny with human chromosome 4p16.3. Here we present a detailed comparison of the sequence of the putative promoter and the organization of the 5′ genomic region of the murine (Hdh) and human HD genes encompassing the first five exons. We show that in this region these two genes share identical exon boundaries, but have different-size introns. Two dinucleotide (CT) and one trinucleotide intronic polymorphism in Hdh and an intronic CA polymorphism in the HD gene were identified. Comparison of 940-bp sequence 5′ to the putative translation start site reveals a highly conserved region (78.8% nucleotide identity) between Hdh and the HD gene from nucleotide -56 to -206 (of Hdh). Neither Hdh nor the HD gene has typical TATA or CCAAT elements, but both show one putative AP2 binding site and numerous potential Sp1 binding sites. The high sequence identity between Hdh and the HD gene for approximately 200 bp 5′ to the putative translation start site indicates that these sequences may play a role in regulating expression of the Huntington disease gene. 30 refs., 4 figs., 2 tabs.
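A figure such as "78.8% nucleotide identity" is a ratio of matching positions to compared positions in an alignment. The sketch below shows one common way to compute it, under a stated assumption: columns containing a gap in either sequence are excluded from the denominator. Other conventions exist, and the abstract does not say which one was used. The function name and the toy sequences are hypothetical and are not Hdh or HD sequence.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two aligned sequences.

    Assumption: columns with '-' in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy alignments for illustration only.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
print(percent_identity("AC-GT", "ACAGA"))
```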
Vibration-Rotation Bands of HF and DF
1977-09-23
12a. Comparison of Observed and Calculated Line Positions of HF, Δv = 1 Sequence
12b. Comparison of Observed and...Calculated Line Positions of HF, Δv = 2 Sequence
12c. Comparison of Observed and Calculated Line Positions of HF, Δv = 3...Sequence
12d. Comparison of Observed and Calculated Line Positions of HF, Δv = 4 Sequence
Luedin, Samuel M; Pothier, Joël F; Danza, Francesco; Storelli, Nicola; Frigaard, Niels-Ulrik; Wittwer, Matthias; Tonolla, Mauro
2018-01-01
"Thiodictyon syntrophicum" sp. nov. strain Cad16T is a photoautotrophic purple sulfur bacterium belonging to the family Chromatiaceae in the class Gammaproteobacteria. The type strain Cad16T was isolated from the chemocline of the alpine meromictic Lake Cadagno in Switzerland. Strain Cad16T represents a key species within this sulfur-driven bacterial ecosystem with respect to carbon fixation. The 7.74-Mbp genome of strain Cad16T has been sequenced and annotated. It encodes 6237 predicted protein sequences and 59 RNA sequences. Phylogenetic comparison based on 16S rRNA revealed that Thiodictyon elegans strain DSM 232T is the most closely related species. Genes involved in sulfur oxidation, central carbon metabolism and transmembrane transport were found. Notably, clusters of genes encoding the photosynthetic machinery and pigment biosynthesis are found on the 0.48 Mb plasmid pTs485. We provide a detailed insight into the Cad16T genome and analyze it in the context of the microbial ecosystem of Lake Cadagno.
Nadin-Davis, S A; Huang, W; Wandeler, A I
1996-03-01
Since its recognition as a discrete epizootic in Florida in the early 1950s, the raccoon strain of rabies virus (RV) has spread over almost the entire eastern seaboard of the US and now threatens to enter the southernmost regions of Canada. To characterise this RV strain in more detail, nucleotide sequencing of the N and G genes, encoding the nucleoprotein and glycoprotein, respectively, of representative isolates has been undertaken. This sequence information generated a conserved restriction map of the N gene, thereby permitting unequivocal identification of this strain by molecular techniques. Comparisons of the predicted nucleoprotein and glycoprotein products with those of other RV strains identified a number of amino acid sequence variations conserved only in the raccoon strain. This information was used to design strain-specific primers targeted to the N gene sequences encoding these residues. The incorporation of these primers into a multiplex polymerase chain reaction (PCR) protocol permitted easy and rapid discrimination between the raccoon RV strain and indigenous Ontario RVs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mays, S.E.; Poloski, J.P.; Sullivan, W.H.
1982-07-01
A probabilistic risk assessment (PRA) was made of the Browns Ferry, Unit 1, nuclear plant as part of the Nuclear Regulatory Commission's Interim Reliability Evaluation Program (IREP). Specific goals of the study were to identify the dominant contributors to core melt, develop a foundation for more extensive use of PRA methods, expand the cadre of experienced PRA practitioners, and apply procedures for extension of IREP analyses to other domestic light water reactors. Event tree and fault tree analyses were used to estimate the frequency of accident sequences initiated by transients and loss of coolant accidents. External events such as floods, fires, earthquakes, and sabotage were beyond the scope of this study and were, therefore, excluded. From these sequences, the dominant contributors to probable core melt frequency were chosen. Uncertainty and sensitivity analyses were performed on these sequences to better understand the limitations associated with the estimated sequence frequencies. Dominant sequences were grouped according to common containment failure modes and corresponding release categories on the basis of comparison with analyses of similar designs rather than on the basis of detailed plant-specific calculations.
Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S
2015-09-01
The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. 
© 2015 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
PMID:26073648
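Grouping proteins into clusters "at a single edge threshold" can be illustrated with a small sketch: keep only edges whose score passes the threshold, then read off the connected components. This is an assumption-laden toy, not the DASP, BLAST or TM-Align workflow from the abstract. It assumes higher scores mean greater similarity (the opposite of BLAST E-values), and the protein names and scores are hypothetical.

```python
def clusters_at_threshold(nodes, scored_edges, threshold):
    """Connected components of the network after dropping weak edges.

    scored_edges: iterable of (node_a, node_b, score); an edge is kept
    when score >= threshold (assumed: higher score = more similar).
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_edges:
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical proteins and similarity scores.
proteins = ["p1", "p2", "p3", "p4", "p5"]
edges = [("p1", "p2", 0.9), ("p2", "p3", 0.6),
         ("p3", "p4", 0.3), ("p4", "p5", 0.8)]

print(clusters_at_threshold(proteins, edges, 0.5))
print(clusters_at_threshold(proteins, edges, 0.7))
```

Raising the threshold from 0.5 to 0.7 splits `p3` out of the first cluster, which is the kind of topology change the comparison of edge metrics relies on.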
PlanMine--a mineable resource of planarian biology and biodiversity.
Brandl, Holger; Moon, HongKee; Vila-Farré, Miquel; Liu, Shang-Yun; Henry, Ian; Rink, Jochen C
2016-01-04
Planarian flatworms are in the midst of a renaissance as a model system for regeneration and stem cells. Besides two well-studied model species, hundreds of species exist worldwide that present a fascinating diversity of regenerative abilities, tissue turnover rates, reproductive strategies and other life history traits. PlanMine (http://planmine.mpi-cbg.de/) aims to accomplish two primary missions: First, to provide an easily accessible platform for sharing, comparing and value-added mining of planarian sequence data. Second, to catalyze the comparative analysis of the phenotypic diversity amongst planarian species. Currently, PlanMine houses transcriptomes independently assembled by our lab and community contributors. Detailed assembly/annotation statistics, a custom-developed BLAST viewer and easy export options enable comparisons at the contig and assembly level. Consistent annotation of all transcriptomes by an automated pipeline, the integration of published gene expression information and inter-relational query tools provide opportunities for mining planarian gene sequences and functions. For inter-species comparisons, we include transcriptomes of, so far, six planarian species, along with images, expert-curated information on their biology and pre-calculated cross-species sequence homologies. PlanMine is based on the popular InterMine system in order to make the rich biology of planarians accessible to the general life sciences research community. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Man, Viet Hoang; Pan, Feng; Sagui, Celeste, E-mail: sagui@ncsu.edu
We explore the use of a fast laser melting simulation approach combined with atomistic molecular dynamics simulations in order to determine the melting and healing responses of B-DNA and Z-DNA dodecamers with the same d(5′-CGCGCGCGCGCG-3′)₂ sequence. The frequency of the laser pulse is specifically tuned to disrupt Watson-Crick hydrogen bonds, thus inducing melting of the DNA duplexes. Subsequently, the structures relax and partially refold, depending on the field strength. In addition to the inherent interest of the nonequilibrium melting process, we propose that fast melting by an infrared laser pulse could be used as a technique for a fast comparison of relative stabilities of same-sequence oligonucleotides with different secondary structures with full atomistic detail of the structures and solvent. This could be particularly useful for nonstandard secondary structures involving non-canonical base pairs, mismatches, etc.
NASA Astrophysics Data System (ADS)
Dieterich, Sergio; Henry, Todd; Jao, W.-C.; Washington, Robert; Silverstein, Michele; Winters, J.; RECONS
2018-01-01
We present a detailed comparison of atmospheric model predictions and photometric observations for late M and L dwarfs. We discuss which wavelength regions are best for determining the fundamental properties of these cool stellar and substellar atmospheres and use this analysis to refine the HR diagram for the hydrogen burning limit first presented in 2014. We also add several new objects to the HR diagram and find little qualitative difference in the HR diagram's overall morphology when compared to our 2014 results. The L2 dwarf 2MASS 0523-1403 remains the smallest hydrogen burning star for which we calculated a radius, thus likely indicating the end of the stellar main sequence. This work is supported by the NSF Astronomy and Astrophysics Postdoctoral Fellowship program through grant AST-1400680.
Deakin, Janine E; Edwards, Melanie J; Patel, Hardip; O'Meally, Denis; Lian, Jinmin; Stenhouse, Rachael; Ryan, Sam; Livernois, Alexandra M; Azad, Bhumika; Holleley, Clare E; Li, Qiye; Georges, Arthur
2016-06-10
Squamates (lizards and snakes) are a speciose lineage of reptiles displaying considerable karyotypic diversity, particularly among lizards. Understanding the evolution of this diversity requires comparison of genome organisation between species. Although the genomes of several squamate species have now been sequenced, only the green anole lizard has any sequence anchored to chromosomes. There is only limited gene mapping data available for five other squamates. This makes it difficult to reconstruct the events that have led to extant squamate karyotypic diversity. The purpose of this study was to anchor the recently sequenced central bearded dragon (Pogona vitticeps) genome to chromosomes to trace the evolution of squamate chromosomes. Assigning sequence to sex chromosomes was of particular interest for identifying candidate sex determining genes. By using two different approaches to map conserved blocks of genes, we were able to anchor approximately 42 % of the dragon genome sequence to chromosomes. We constructed detailed comparative maps between dragon, anole and chicken genomes, and where possible, made broader comparisons across Squamata using cytogenetic mapping information for five other species. We show that squamate macrochromosomes are relatively well conserved between species, supporting findings from previous molecular cytogenetic studies. Macrochromosome diversity between members of the Toxicofera clade has been generated by intrachromosomal, and a small number of interchromosomal, rearrangements. We reconstructed the ancestral squamate macrochromosomes by drawing upon comparative cytogenetic mapping data from seven squamate species and propose the events leading to the arrangements observed in representative species. In addition, we assigned over 8 Mbp of sequence containing 219 genes to the Z chromosome, providing a list of genes to begin testing as candidate sex determining genes. 
Anchoring of the dragon genome has provided substantial insight into the evolution of squamate genomes, enabling us to reconstruct ancestral macrochromosome arrangements at key positions in the squamate phylogeny, demonstrating that fusions between macrochromosomes or fusions of macrochromosomes and microchromosomes, have played an important role during the evolution of squamate genomes. Assigning sequence to the sex chromosomes has identified NR5A1 as a promising candidate sex determining gene in the dragon.
Comparison of the Exomes of Common Carp (Cyprinus carpio) and Zebrafish (Danio rerio)
Henkel, Christiaan V.; Dirks, Ron P.; Jansen, Hans J.; Forlenza, Maria; Wiegertjes, Geert F.; Howe, Kerstin; van den Thillart, Guido E.E.J.M.
2012-01-01
Research on common carp, Cyprinus carpio, is beneficial for zebrafish research because of resources available owing to its large body size, such as the availability of sufficient organ material for transcriptomics, proteomics, and metabolomics. Here we describe the shotgun sequencing of a clonal double-haploid common carp line. The assembly consists of 511891 scaffolds with an N50 of 17 kb, predicting a total genome size of 1.4–1.5 Gb. A detailed analysis of the ten largest scaffolds indicates that the carp genome has a considerably lower repeat coverage than zebrafish, whilst the average intron size is significantly smaller, making it comparable to the fugu genome. The quality of the scaffolding was confirmed by comparisons with RNA deep sequencing data sets and a manual analysis for synteny with the zebrafish, especially the Hox gene clusters. In the ten largest scaffolds analyzed, the synteny of genes is almost complete. Comparisons of predicted exons of common carp with those of the zebrafish revealed only a few genes specific for either zebrafish or carp, most of these being of unknown function. This supports the hypothesis of an additional genome duplication event in the carp evolutionary history, which—due to a higher degree of compactness—did not result in a genome larger than that of zebrafish. PMID:22715948
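The N50 quoted for the assembly is a standard contiguity statistic: the scaffold length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of the calculation follows. The function name and the toy lengths are hypothetical and do not represent the carp scaffolds.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly, lengths in kb (illustrative values only).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

Here the two 8 kb scaffolds already cover 16 of the 32 kb total, so the N50 of the toy assembly is 8 kb even though most scaffolds are much shorter.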
Qiu, Lingling; Jiang, Bo; Fang, Jia; Shen, Yike; Fang, Zhongxiang; Rm, Saravana Kumar; Yi, Keke; Shen, Chenjia; Yan, Daoliang; Zheng, Bingsong
2016-11-17
Hickory (Carya cathayensis), a woody plant with high nutritional and economic value, is widely planted in China. Due to its long juvenile phase, grafting is a useful technique for large-scale cultivation of hickory. To reveal the molecular mechanism during the graft process, we sequenced the transcriptomes of the graft union in hickory. In our study, six RNA-seq libraries yielded a total of 83,676,860 clean short reads comprising 4.19 Gb of sequence data. A large number of differentially expressed genes (DEGs) at three time points during the graft process were identified. In detail, 777 DEGs in the 7 d vs 0 d (day after grafting) comparison were classified into 11 enriched Gene Ontology (GO) categories, and 262 DEGs in the 14 d vs 0 d comparison were classified into 15 enriched GO categories. Furthermore, an overview of the PPI network was constructed from these DEGs. In addition, 20 genes related to the auxin- and cytokinin-signaling pathways were identified, and some were validated by qRT-PCR analysis. Our comprehensive analysis provides basic information on the candidate genes and hormone signaling pathways involved in the graft process in hickory and other woody plants.
Retter, Ida; Chevillard, Christophe; Scharfe, Maren; Conrad, Ansgar; Hafner, Martin; Im, Tschong-Hun; Ludewig, Monika; Nordsiek, Gabriele; Severitt, Simone; Thies, Stephanie; Mauhar, America; Blöcker, Helmut; Müller, Werner; Riblet, Roy
2009-01-01
Although the entire mouse genome has been sequenced, there remain challenges concerning the elucidation of particular complex and polymorphic genomic loci. In the murine Igh locus, different haplotypes exist in different inbred mouse strains. For example, the Ighb haplotype sequence of the Mouse Genome Project strain C57BL/6 differs considerably from the Igha haplotype of BALB/c, which has been widely used in the analyses of Ab responses. We have sequenced and annotated the 3′ half of the Igha locus of 129S1/SvImJ, covering the CH region and approximately half of the VH region. This sequence comprises 128 VH genes, of which 49 are judged to be functional. The comparison of the Igha sequence with the homologous Ighb region from C57BL/6 revealed two major expansions in the germline repertoire of Igha. In addition, we found smaller haplotype-specific differences like the duplication of five VH genes in the Igha locus. We generated a VH allele table by comparing the individual VH genes of both haplotypes. Surprisingly, the number and position of DH genes in the 129S1 strain differs not only from the sequence of C57BL/6 but also from the map published for BALB/c. Taken together, the contiguous genomic sequence of the 3′ part of the Igha locus allows a detailed view of the recent evolution of this highly dynamic locus in the mouse. PMID:17675503
Montoya, Leticia; Bandala, Victor Manuel; Haug, Ingeborg; Stubbe, Dirk
2012-01-01
A new milkcap species, Lactarius fuscomarginatus, was found in the subtropical region of central Veracruz (eastern Mexico) associated with two relict populations of Fagus grandifolia var. mexicana. The species is characterized macroscopically by its dark pileus and stipe and by its distant and whitish lamellae with blackish to blackish brown edges. A molecular phylogenetic analysis based on ITS and LSU nucDNA sequences confirms the delimitation of this new taxon and places L. fuscomarginatus in subgenus Gerardii. A detailed morphological comparison is given with similar species.
Host Cell Virus Entry Mediated by Australian Bat Lyssavirus Envelope G glycoprotein
2013-10-24
Figure 7. Comparison of the amino acid sequences of Saccolaimus and Pteropus ABLV G mature protein... sequence analysis revealed that the PCR products were identical. Sequence comparisons of the ABLV N and other lyssavirus N proteins showed that ABLV...Saccolaimus flaviventris) (129). Nucleoprotein sequence comparisons revealed that the Saccolaimus N protein shared 96% amino acid homology with the Pteropus
Wide distribution of O157-antigen biosynthesis gene clusters in Escherichia coli.
Iguchi, Atsushi; Shirai, Hiroki; Seto, Kazuko; Ooka, Tadasuke; Ogura, Yoshitoshi; Hayashi, Tetsuya; Osawa, Kayo; Osawa, Ro
2011-01-01
Most Escherichia coli O157-serogroup strains are classified as enterohemorrhagic E. coli (EHEC), which is known as an important food-borne pathogen for humans. They usually produce Shiga toxin (Stx) 1 and/or Stx2, and express H7-flagella antigen (or are nonmotile). However, O157 strains that do not produce Stxs and express H antigens different from H7 are sometimes isolated from clinical and other sources. Multilocus sequence analysis revealed that the 21 O157:non-H7 strains tested in this study belong to multiple evolutionary lineages different from that of EHEC O157:H7 strains, suggesting a wide distribution of the gene set encoding the O157-antigen biosynthesis in multiple lineages. To gain insight into the gene organization and the sequence similarity of the O157-antigen biosynthesis gene clusters, we conducted genomic comparisons of the chromosomal regions (about 59 kb in each strain) covering the O-antigen gene cluster and its flanking regions between six O157:H7/non-H7 strains. Gene organization of the O157-antigen gene cluster was identical among O157:H7/non-H7 strains, but was divided into two distinct types at the nucleotide sequence level. Interestingly, distribution of the two types did not clearly follow the evolutionary lineages of the strains, suggesting that horizontal gene transfer of both types of O157-antigen gene clusters has occurred independently among E. coli strains. Additionally, detailed sequence comparison revealed that some positions of the repetitive extragenic palindromic (REP) sequences in the regions flanking the O-antigen gene clusters were coincident with possible recombination points. From these results, we conclude that the horizontal transfer of the O157-antigen gene clusters induced the emergence of multiple O157 lineages within E. coli and speculate that REP sequences may be one of the driving forces for exchange and evolution of O-antigen loci.
2011-01-01
Background Malate synthase, one of the two enzymes unique to the glyoxylate cycle, is found in all three domains of life, and is crucial to the utilization of two-carbon compounds for net biosynthetic pathways such as gluconeogenesis. In addition to the main isoforms A and G, so named because of their differential expression in E. coli grown on either acetate or glycolate respectively, a third distinct isoform has been identified. These three isoforms differ considerably in size and sequence conservation. The A isoform (MSA) comprises ~530 residues, the G isoform (MSG) is ~730 residues, and this third isoform (MSH-halophilic) is ~430 residues in length. Both isoforms A and G have been structurally characterized in detail, but no structures have been reported for the H isoform which has been found thus far only in members of the halophilic Archaea. Results We have solved the structure of a malate synthase H (MSH) isoform member from Haloferax volcanii in complex with glyoxylate at 2.51 Å resolution, and also as a ternary complex with acetyl-coenzyme A and pyruvate at 1.95 Å. Like the A and G isoforms, MSH is based on a β8/α8 (TIM) barrel. Unlike previously solved malate synthase structures which are all monomeric, this enzyme is found in the native state as a trimer/hexamer equilibrium. Compared to isoforms A and G, MSH displays deletion of an N-terminal domain and a smaller deletion at the C-terminus. The MSH active site is closely superimposable with those of MSA and MSG, with the ternary complex indicating a nucleophilic attack on pyruvate by the enolate intermediate of acetyl-coenzyme A. Conclusions The reported structures of MSH from Haloferax volcanii allow a detailed analysis and comparison with previously solved structures of isoforms A and G. 
These structural comparisons provide insight into evolutionary relationships among these isoforms, and also indicate that despite the size and sequence variation, and the truncated C-terminal domain of the H isoform, the catalytic mechanism is conserved. Sequence analysis in light of the structure indicates that additional members of isoform H likely exist in the databases but have been misannotated. PMID:21569248
Isakov, Ofer; Bordería, Antonio V; Golan, David; Hamenahem, Amir; Celniker, Gershon; Yoffe, Liron; Blanc, Hervé; Vignuzzi, Marco; Shomron, Noam
2015-07-01
The study of RNA virus populations is a challenging task. Each population of RNA virus is composed of a collection of different, yet related genomes often referred to as mutant spectra or quasispecies. Virologists using deep sequencing technologies face major obstacles when studying virus population dynamics, both experimentally and in natural settings, due to the relatively high error rates of these technologies and the lack of high performance pipelines. In order to overcome these hurdles we developed a computational pipeline, termed ViVan (Viral Variance Analysis). ViVan is a complete pipeline facilitating the identification, characterization and comparison of sequence variance in deep sequenced virus populations. Applying ViVan on deep sequenced data obtained from samples that were previously characterized by more classical approaches, we uncovered novel and potentially crucial aspects of virus populations. With our experimental work, we illustrate how ViVan can be used for studies ranging from the more practical, detection of resistant mutations and effects of antiviral treatments, to the more theoretical temporal characterization of the population in evolutionary studies. Freely available on the web at http://www.vivanbioinfo.org. Contact: nshomron@post.tau.ac.il. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Ozer, Abdullah; Tome, Jacob M.; Friedman, Robin C.; Gheba, Dan; Schroth, Gary P.; Lis, John T.
2016-01-01
Because RNA-protein interactions play a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the High Throughput Sequencing-RNA Affinity Profiling (HiTS-RAP) assay, which couples sequencing on an Illumina GAIIx with the quantitative assessment of one or several proteins’ interactions with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of EGFP and NELF-E proteins with their corresponding canonical and mutant RNA aptamers. Here, we provide a detailed protocol for HiTS-RAP, which can be completed in about a month (8 days hands-on time) including the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, high-throughput sequencing and protein binding with GAIIx, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, RNA-MaP and RBNS. A successful HiTS-RAP experiment provides the sequence and binding curves for approximately 200 million RNAs in a single experiment. PMID:26182240
Comparison of chemical and heating methods to enhance latent fingerprint deposits on thermal paper.
Bond, John W
2014-03-01
A comparison is made of proprietary methods to develop latent fingerprint deposits on the inked side of thermal paper using either chemical treatment (Thermanin) or the application of heat to the paper (Hot Print System). Results with a trial of five donors show that the application of heat produces statistically significantly more fingerprint ridge detail than the chemical treatment for both fingerprint deposits aged up to 4 weeks and for a nine sequence depletion series. Subjecting the thermal paper to heat treatment with the Hot Print System did not inhibit subsequent ninhydrin chemical development of fingerprint deposits on the noninked side of the paper. A further benefit of the application of heat is the rapid development of fingerprint deposits (less than a minute) compared with up to 12 h for the Thermanin chemical treatment.
The genomes and comparative genomics of Lactobacillus delbrueckii phages.
Riipinen, Katja-Anneli; Forsman, Päivi; Alatossava, Tapani
2011-07-01
Lactobacillus delbrueckii phages are a great source of genetic diversity. Here, the genome sequences of Lb. delbrueckii phages LL-Ku, c5 and JCL1032 were analyzed in detail, and the genetic diversity of Lb. delbrueckii phages belonging to different taxonomic groups was explored. The lytic isometric group b phages LL-Ku (31,080 bp) and c5 (31,841 bp) showed a minimum nucleotide sequence identity of 90% over about three-fourths of their genomes. The genomic locations of their lysis modules were unique, and the genomes featured several putative overlapping transcription units of genes. LL-Ku and c5 virions displayed peptidoglycan hydrolytic activity associated with a ~36-kDa protein similar in size to the endolysin. Unexpectedly, the 49,433-bp genome of the prolate phage JCL1032 (temperate, group c) revealed a conserved gene order within its structural genes. Lb. delbrueckii phages representing groups a (a phage LL-H), b and c possessed only limited protein sequence homology. Genomic comparison of LL-Ku and c5 suggested that diversification of Lb. delbrueckii phages is mainly due to insertions, deletions and recombination. For the first time, the complete genome sequences of group b and c Lb. delbrueckii phages are reported.
Liu, Guo-Hua; Gasser, Robin B.; Nejsum, Peter; Wang, Yan; Chen, Qiang; Song, Hui-Qun; Zhu, Xing-Quan
2013-01-01
The whipworm of humans, Trichuris trichiura, is responsible for a neglected tropical disease (NTD) of major importance in tropical and subtropical countries of the world. Whipworms also infect animal hosts, including pigs, dogs and non-human primates, and cause clinical disease (trichuriasis) similar to that of humans. Although Trichuris species are usually considered to be host specific, it is not clear whether non-human primates are infected with T. trichiura or other species. In the present study, we sequenced the complete mitochondrial (mt) genome as well as the first and second internal transcribed spacers (ITS-1 and ITS-2) of Trichuris from the François’ leaf-monkey (langur), and compared them with homologous sequences from human- and pig-derived Trichuris. In addition, sequence comparison of a conserved mt ribosomal gene among multiple individual whipworms revealed substantial nucleotide differences among these three host species but limited sequence variation within each of them. The molecular data indicate that the monkey-derived whipworm is a separate species from that of humans. Future work should focus on detailed population genetic and morphological studies (by electron microscopy) of whipworms from various non-human primates and humans. PMID:23840431
Kosushkin, S A; Borodulina, O R; Solov'eva, E N; Grechko, V V
2008-01-01
We have isolated and characterised sequences of a SINE family specific to squamate reptiles, which we call Squam1, from the genome of a lacertid lizard. Copies are 360-390 bp in length and share significant similarity with a tRNA gene sequence at the 5'-end. We also detected this family in the DNA of representatives of varanids, iguanids (anolis), gekkonids, and snakes. No signs of it were found in the DNA of mammals, birds, amphibians, and crocodiles. A detailed analysis was carried out of the primary structure of the retroposons that we obtained from genomic libraries or GenBank sequences. Most taxa possess 2-3 subfamilies of the SINE in their genomes, with specific diagnostic features in their primary structure. Individual variability of copies in different families is about 85% and is just slightly lower at the genus level. Comparison of consensus sequences at the family level reveals a high degree of structural similarity together with a number of specific apomorphic features, which makes this SINE a useful phylogenetic marker for this group of reptiles. Snakes do not show a specific affinity to varanids when compared to other lizards, as was suggested earlier.
Neuwald, Andrew F
2009-08-01
The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.
Culturing of female bladder bacteria reveals an interconnected urogenital microbiota.
Thomas-White, Krystal; Forster, Samuel C; Kumar, Nitin; Van Kuiken, Michelle; Putonti, Catherine; Stares, Mark D; Hilt, Evann E; Price, Travis K; Wolfe, Alan J; Lawley, Trevor D
2018-04-19
Metagenomic analyses have indicated that the female bladder harbors an indigenous microbiota. However, there are few cultured reference strains with sequenced genomes available for functional and experimental analyses. Here we isolate and genome-sequence 149 bacterial strains from catheterized urine of 77 women. This culture collection spans 78 species, representing approximately two thirds of the bacterial diversity within the sampled bladders, including Proteobacteria, Actinobacteria, and Firmicutes. Detailed genomic and functional comparison of the bladder microbiota to the gastrointestinal and vaginal microbiotas demonstrates similar vaginal and bladder microbiota, with functional capacities that are distinct from those observed in the gastrointestinal microbiota. Whole-genome phylogenetic analysis of bacterial strains isolated from the vagina and bladder in the same women identifies highly similar Escherichia coli, Streptococcus anginosus, Lactobacillus iners, and Lactobacillus crispatus, suggesting an interlinked female urogenital microbiota that is not only limited to pathogens but is also characteristic of health-associated commensals.
Iterative Code-Aided ML Phase Estimation and Phase Ambiguity Resolution
NASA Astrophysics Data System (ADS)
Wymeersch, Henk; Moeneclaey, Marc
2005-12-01
As many coded systems operate at very low signal-to-noise ratios, synchronization becomes a very difficult task. In many cases, conventional algorithms will either require long training sequences or result in large BER degradations. By exploiting code properties, these problems can be avoided. In this contribution, we present several iterative maximum-likelihood (ML) algorithms for joint carrier phase estimation and ambiguity resolution. These algorithms operate on coded signals by accepting soft information from the MAP decoder. Issues of convergence and initialization are addressed in detail. Simulation results are presented for turbo codes, and are compared to performance results of conventional algorithms. Performance comparisons are carried out in terms of BER performance and mean square estimation error (MSEE). We show that the proposed algorithm reduces the MSEE and, more importantly, the BER degradation. Additionally, phase ambiguity resolution can be performed without resorting to a pilot sequence, thus improving the spectral efficiency.
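The core update in such iterative, code-aided schemes can be sketched as follows. This is a minimal illustration of one commonly used form of the phase update, in which the received samples are correlated with the decoder's soft symbol estimates and the angle of the result is taken; it is not claimed to be the exact algorithm of the paper. The function name `estimate_phase`, the toy BPSK symbols and the fixed reliability factor 0.9 are all assumptions made for the example, and no decoder is modelled.

```python
import cmath

def estimate_phase(received, soft_symbols):
    """One phase update: angle of the correlation between the received
    samples and the (assumed) posterior-mean symbols from the decoder."""
    correlation = sum(r * s.conjugate() for r, s in zip(received, soft_symbols))
    return cmath.phase(correlation)

# Toy, noise-free BPSK example (all values are illustrative assumptions).
true_phase = 0.3
symbols = [1, -1, 1, 1, -1, -1, 1, -1]
received = [a * cmath.exp(1j * true_phase) for a in symbols]

# Stand-in for decoder output: correct signs with reliability 0.9.
soft_symbols = [complex(0.9 * a) for a in symbols]

estimate = estimate_phase(received, soft_symbols)
print(f"estimated phase: {estimate:.3f} rad")
```

In a full receiver this update would alternate with a decoding pass that refreshes the soft symbols, which is where the convergence and initialization issues mentioned in the abstract arise.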
Specific mineral associations of hydrothermal shale (South Kamchatka)
NASA Astrophysics Data System (ADS)
Rychagov, S. N.; Sergeeva, A. V.; Chernov, M. S.
2017-11-01
The sequence of hydrothermal shale from the East Pauzhet thermal field within the Pauzhet hydrothermal system (South Kamchatka) was studied in detail. It was established that the formation of shale resulted from argillization of an andesitic lava flow under the influence of an acidic sulfate vapor condensate. The horizons with radically different compositions and physical properties compared to those of the overlying homogeneous plastic shale were distinguished at the base of the sequence. These horizons are characterized by high (up to two orders of magnitude in comparison with average values in hydrothermal shale) concentrations of F, P, Na, Mg, K, Ca, Sc, Ti, V, Cr, Cu, and Zn. We suggested a geological-geochemical model, according to which a deep metal-bearing chloride-hydrocarbonate solution infiltrated into the permeable zone formed at the root of the andesitic lava flow beneath plastic shale at a certain stage of evolution of the hydrothermal system.
Genome Evolution of Plant-Parasitic Nematodes.
Kikuchi, Taisei; Eves-van den Akker, Sebastian; Jones, John T
2017-08-04
Plant parasitism has evolved independently on at least four separate occasions in the phylum Nematoda. The application of next-generation sequencing (NGS) to plant-parasitic nematodes has allowed a wide range of genome- or transcriptome-level comparisons, and these have identified genome adaptations that enable parasitism of plants. Current genome data suggest that horizontal gene transfer, gene family expansions, evolution of new genes that mediate interactions with the host, and parasitism-specific gene regulation are important adaptations that allow nematodes to parasitize plants. Sequencing of a larger number of nematode genomes, including plant parasites that show different modes of parasitism or that have evolved in currently unsampled clades, and using free-living taxa as comparators would allow more detailed analysis and a better understanding of the organization of key genes within the genomes. This would facilitate a more complete understanding of the way in which parasitism has shaped the genomes of plant-parasitic nematodes.
Zeng, Jiaolong; Yuan, Jianmin
2007-08-01
Calculation details of the radiative opacity of lowly ionized gold plasmas, obtained with our fully relativistic detailed level-accounting approach, are presented to show the importance of accurate atomic data for a quantitative reproduction of the experimental observations. A huge number of transition lines are involved in the radiative absorption of high-Z plasmas, so it is commonly believed that statistical models can give a reasonable description of their opacities. Nevertheless, we first show in detail that an adequate treatment of physical effects, in particular the configuration interaction (including the core-valence electron correlation), is essential to produce atomic data of bound-bound and bound-free processes for gold plasmas that are accurate enough to correctly explain the relative intensity of two strong, experimentally observed absorption peaks located near photon energies of 70 and 80 eV. A detailed study is also carried out for gold plasmas of an average ionization degree sequence of 10, for both spectrally resolved opacities and Rosseland and Planck means. For comparison, results obtained by using an average atom model are also given to show that, even for a relatively higher density of matter, correlation effects are also important to predict the correct positions of absorption peaks of transition arrays.
Ramas, Viviana; Mirazo, Santiago; Bonilla, Sylvia; Ruchansky, Dora; Arbiza, Juan
2018-05-15
This study aims to investigate the HPV16 variant distribution by sequence analyses of the E6 and E7 oncogenes and the Long Control Region (LCR) from cervical cells collected from Uruguayan women, and to reconstruct the phylogenetic relationships among variants. Forty-seven HPV16 variants, obtained from women with HSIL, LSIL, ASCUS and NILM cytological classes, were analyzed for LCR and 12 were further studied for E6 and E7. Detailed sequence comparison, genetic heterogeneity analyses and phylogenetic reconstruction were performed. A high variability was observed among LCR sequences, which were distributed in 18 different variants. E6 and E7 sequences exhibited novel non-synonymous substitutions. Uruguayan sequences mainly belonged to the European lineage, and only 5 sequences clustered in non-European branches: 3 of them in the Asian-American and North-American lineage and 2 in an African branch. Additionally, 6 new variants from European and African clusters were identified. HPV16 isolates mainly belonged to the European lineage, though strains from African and Asian-American lineages were also identified. Herein is reported for the first time the distribution and molecular characterization of HPV16 variants from Uruguay, providing novel insights into the molecular epidemiology of this infectious disease in South America. The high variability among HPV16 isolates, which mainly belonged to the European lineage, provides an extensive sequence dataset from a country with a high burden of cervical cancer. Copyright © 2018 Elsevier B.V. All rights reserved.
Wu, Y.; Zheng, J.; Robbins, R. T.
2007-01-01
A population of Xiphinema hunaniense Wang and Wu, 1992 with all four juvenile stages was found in the rhizosphere of Pinus sp. in Hangzhou, Zhejiang, China. Morphometrics of 18 females and 35 juveniles of this population are given herein. Detailed morphology and morphometrics of the four juvenile stages are provided. Further morphometric comparisons of this population with previous studies of the females and first-stage juveniles of X. hunaniense, and with X. radicicola, are given, and morphological variation among X. hunaniense populations is discussed. A revised polytomous key code of Loof and Luc (1990) for X. hunaniense identification is provided, i.e., A1- B4- C4- D4/5- E1- F2(3)- G2- H2-I3- J4- K2- L1. In addition, the sequence of the D2 and D3 expansion region of the 28S rRNA gene was analyzed and compared with sequences of closely related species downloaded from the NCBI database. Cluster analysis of sequences confirmed and supported the species identifications. PMID:19259473
Assessing Species Diversity Using Metavirome Data: Methods and Challenges.
Herath, Damayanthi; Jayasundara, Duleepa; Ackland, David; Saeed, Isaam; Tang, Sen-Lin; Halgamuge, Saman
2017-01-01
Assessing biodiversity is an important step in the study of microbial ecology associated with a given environment. Multiple indices have been used to quantify species diversity, which is a key biodiversity measure. Measuring species diversity of viruses in different environments remains a challenge relative to measuring the diversity of other microbial communities. Metagenomics has played an important role in elucidating viral diversity by conducting metavirome studies; however, metavirome data are of high complexity requiring robust data preprocessing and analysis methods. In this review, existing bioinformatics methods for measuring species diversity using metavirome data are categorised broadly as either sequence similarity-dependent methods or sequence similarity-independent methods. The former includes a comparison of DNA fragments or assemblies generated in the experiment against reference databases for quantifying species diversity, whereas estimates from the latter are independent of the knowledge of existing sequence data. Current methods and tools are discussed in detail, including their applications and limitations. Drawbacks of the state-of-the-art method are demonstrated through results from a simulation. In addition, alternative approaches are proposed to overcome the challenges in estimating species diversity measures using metavirome data.
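To make the idea of a species diversity index concrete, the sketch below computes two widely used indices from per-taxon read counts. The abstract does not say which indices the review covers, so the choice of the Shannon and Gini-Simpson indices is an assumption, and the taxon labels and counts are invented toy data rather than results from any metavirome.

```python
import math
from collections import Counter

def shannon_index(counts):
    """Shannon entropy H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

def simpson_index(counts):
    """Gini-Simpson index 1 - sum(p_i^2): chance that two random reads differ in taxon."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Hypothetical taxon assignments for six reads (toy data).
assignments = ["phageA", "phageA", "phageB", "phageC", "phageC", "phageC"]
counts = list(Counter(assignments).values())

print("Shannon:", round(shannon_index(counts), 3))
print("Gini-Simpson:", round(simpson_index(counts), 3))
```

In a similarity-dependent workflow the counts would come from matching reads or contigs against a reference database; similarity-independent methods estimate the abundance distribution without that step, but the index formulas applied afterwards are the same.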
T-Reg Comparator: an analysis tool for the comparison of position weight matrices
Roepcke, Stefan; Grossmann, Steffen; Rahmann, Sven; Vingron, Martin
2005-01-01
T-Reg Comparator is a novel software tool designed to support research into transcriptional regulation. Sequence motifs representing transcription factor binding sites are usually encoded as position weight matrices. The user inputs a set of such weight matrices or binding site sequences and our program matches them against the T-Reg database, which is presently built on data from the Transfac [E. Wingender (2004) In Silico Biol., 4, 55–61] and Jaspar [A. Sandelin, W. Alkema, P. Engstrom, W. W. Wasserman and B. Lenhard (2004) Nucleic Acids Res., 32, D91–D94] databases. Our tool delivers a detailed report on similarities between user-supplied motifs and motifs in the database. Apart from simple one-to-one relationships, T-Reg Comparator is also able to detect similarities between submatrices. In addition, we provide a user interface to a program for sequence scanning with weight matrices. Typical areas of application for T-Reg Comparator are motif and regulatory module finding and annotation of regulatory genomic regions. T-Reg Comparator is available at http://treg.molgen.mpg.de. PMID:15980506
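Sequence scanning with a position weight matrix, as mentioned in the abstract, can be sketched in a few lines. The example below is a generic illustration and not the T-Reg Comparator implementation: the count matrix is a toy motif, and the pseudocount of 1, the uniform background of 0.25 and the function names `to_log_odds` and `scan` are assumptions chosen for the sketch.

```python
import math

BASES = "ACGT"

def to_log_odds(count_columns, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores against a uniform background."""
    pwm = []
    for column in count_columns:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm):
    """Score every window of the motif width; returns (start, score) pairs.
    Assumes the sequence contains only A, C, G and T."""
    width = len(pwm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        scores.append((start, sum(pwm[i][base] for i, base in enumerate(window))))
    return scores

# Toy motif "AGC" built from eight hypothetical binding sites.
toy_counts = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 0, "C": 8, "G": 0, "T": 0},
]
pwm = to_log_odds(toy_counts)
hits = scan("TTAGCTT", pwm)
best_start, best_score = max(hits, key=lambda hit: hit[1])
print(best_start, round(best_score, 3))
```

A practical scanner would also score the reverse strand and apply a score threshold; both are omitted here to keep the sketch short.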
High-throughput sequencing: a failure mode analysis.
Yang, George S; Stott, Jeffery M; Smailus, Duane; Barber, Sarah A; Balasundaram, Miruna; Marra, Marco A; Holt, Robert A
2005-01-04
Basic manufacturing principles are becoming increasingly important in high-throughput sequencing facilities where there is a constant drive to increase quality, increase efficiency, and decrease operating costs. While high-throughput centres report failure rates typically on the order of 10%, the causes of sporadic sequencing failures are seldom analyzed in detail and have not, in the past, been formally reported. Here we report the results of a failure mode analysis of our production sequencing facility based on detailed evaluation of 9,216 ESTs generated from two cDNA libraries. Two categories of failures are described; process-related failures (failures due to equipment or sample handling) and template-related failures (failures that are revealed by close inspection of electropherograms and are likely due to properties of the template DNA sequence itself). Preventative action based on a detailed understanding of failure modes is likely to improve the performance of other production sequencing pipelines.
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.
Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai
2015-12-01
The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low-frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present the Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method for protein comparison is more efficient than the commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant features for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
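The building block of the RFT is the Ramanujan sum c_q(n), the sum of cos(2πan/q) over all a between 1 and q that are coprime to q. The sketch below computes these sums directly and applies one commonly used finite-length form of the transform, X(q) = (1 / (φ(q) N)) Σ x(n) c_q(n). It is a naive illustration under those assumptions and does not reproduce the fast algorithm of the paper; the toy signal is synthetic rather than a numerically encoded protein, and the function names are invented for the example.

```python
import math

def ramanujan_sum(q, n):
    """c_q(n): sum of cos(2*pi*a*n/q) over 1 <= a <= q with gcd(a, q) == 1 (integer-valued)."""
    total = sum(
        math.cos(2 * math.pi * a * n / q)
        for a in range(1, q + 1)
        if math.gcd(a, q) == 1
    )
    return round(total)

def euler_phi(q):
    """Euler's totient: how many 1 <= a <= q are coprime to q."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)

def rft(signal, max_q):
    """Naive finite-length RFT coefficients for q = 1..max_q (samples indexed from n = 1)."""
    n_samples = len(signal)
    coefficients = {}
    for q in range(1, max_q + 1):
        acc = sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1))
        coefficients[q] = acc / (euler_phi(q) * n_samples)
    return coefficients

# Synthetic period-3 signal: the RFT should concentrate its energy at q = 3.
toy_signal = [ramanujan_sum(3, n) for n in range(1, 13)]
spectrum = rft(toy_signal, max_q=4)
print(spectrum)
```

For protein comparison each residue would first be mapped to a number (the RRM uses a physicochemical value per amino acid), after which spectra of different sequences can be compared coefficient by coefficient.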
Molecular diversity and distribution of marine fungi across 130 European environmental samples.
Richards, Thomas A; Leonard, Guy; Mahé, Frédéric; Del Campo, Javier; Romac, Sarah; Jones, Meredith D M; Maguire, Finlay; Dunthorn, Micah; De Vargas, Colomban; Massana, Ramon; Chambouvet, Aurélie
2015-11-22
Environmental DNA and culture-based analyses have suggested that fungi are present in low diversity and in low abundance in many marine environments, especially in the upper water column. Here, we use a dual approach involving high-throughput diversity tag sequencing from both DNA and RNA templates and fluorescent cell counts to evaluate the diversity and relative abundance of fungi across marine samples taken from six European near-shore sites. We removed very rare fungal operational taxonomic units (OTUs) selecting only OTUs recovered from multiple samples for a detailed analysis. This approach identified a set of 71 fungal 'OTU clusters' that account for 66% of all the sequences assigned to the Fungi. Phylogenetic analyses demonstrated that this diversity includes a significant number of chytrid-like lineages that had not been previously described, indicating that the marine environment encompasses a number of zoosporic fungi that are new to taxonomic inventories. Using the sequence datasets, we identified cases where fungal OTUs were sampled across multiple geographical sites and between different sampling depths. This was especially clear in one relatively abundant and diverse phylogroup tentatively named Novel Chytrid-Like-Clade 1 (NCLC1). For comparison, a subset of the water column samples was also investigated using fluorescent microscopy to examine the abundance of eukaryotes with chitin cell walls. Comparisons of relative abundance of RNA-derived fungal tag sequences and chitin cell-wall counts demonstrate that fungi constitute a low fraction of the eukaryotic community in these water column samples. Taken together, these results demonstrate the phylogenetic position and environmental distribution of 71 lineages, improving our understanding of the diversity and abundance of fungi in marine environments. © 2015 The Authors. PMID:26582030
NASA Astrophysics Data System (ADS)
Leach, Franklin E.; Riley, Nicholas M.; Westphall, Michael S.; Coon, Joshua J.; Amster, I. Jonathan
2017-09-01
The structural characterization of sulfated glycosaminoglycan (GAG) carbohydrates remains an important target for analytical chemists owing to challenges introduced by the natural complexity of these mixtures and the defined need for molecular-level details to elucidate biological structure-function relationships. Tandem mass spectrometry has proven to be the most powerful technique for this purpose. Previously, electron detachment dissociation (EDD), in comparison to other methods of ion activation, has been shown to provide the largest number of useful cleavages for de novo sequencing of GAG oligosaccharides, but such experiments are restricted to Fourier transform ion cyclotron resonance mass spectrometers (FTICR-MS). Negative electron transfer dissociation (NETD) provides similar fragmentation results, and can be achieved on any mass spectrometry platform that is designed to accommodate ion-ion reactions. Here, we examine for the first time the effectiveness of NETD-Orbitrap mass spectrometry for the structural analysis of GAG oligosaccharides. Compounds ranging in size from tetrasaccharides to decasaccharides were dissociated by NETD, producing both glycosidic and cross-ring cleavages that enabled the location of sulfate modifications. The highly sulfated, heparin-like synthetic GAG, Arixtra™, was also successfully sequenced by NETD. In comparison to other efforts to sequence GAG chains without fully ionized sulfate constituents, the occurrence of sulfate loss peaks is minimized by judicious precursor ion selection. The results compare quite favorably to prior results with electron detachment dissociation (EDD). Significantly, the duty cycle of the NETD experiment is sufficiently short to make it an effective tool for on-line separations, presenting a straightforward path for selective, high-throughput analysis of GAG mixtures.
Bruce, A. Gregory; Thouless, Margaret E.; Haines, Anthony S.; Pallen, Mark J.; Grundhoff, Adam
2015-01-01
ABSTRACT Two rhadinovirus lineages have been identified in Old World primates. The rhadinovirus 1 (RV1) lineage consists of human herpesvirus 8, Kaposi's sarcoma-associated herpesvirus (KSHV), and closely related rhadinoviruses of chimpanzees, gorillas, macaques and other Old World primates. The RV2 rhadinovirus lineage is distinct and consists of closely related viruses from the same Old World primate species. Rhesus macaque rhadinovirus (RRV) is the RV2 prototype, and two RRV isolates, 26-95 and 17577, were sequenced. We determined that the pig-tailed macaque RV2 rhadinovirus, MneRV2, is highly associated with lymphomas in macaques with simian AIDS. To further study the role of rhadinoviruses in the development of lymphoma, we sequenced the complete genome of MneRV2 and identified 87 protein coding genes and 17 candidate microRNAs (miRNAs). A strong genome colinearity and sequence homology were observed between MneRV2 and RRV26-95, although the open reading frame (ORF) encoding the KSHV ORFK15 homolog was disrupted in RRV26-95. Comparison with MneRV2 revealed several genomic anomalies in RRV17577 that were not present in other rhadinovirus genomes, including an N-terminal duplication in ORF4 and a recombinative exchange of more distantly related homologs of the ORF22/ORF47 interacting glycoprotein genes. The comparison with MneRV2 has revealed novel genes and important conservation of protein coding domains and transcription initiation, termination, and splicing signals, which have added to our knowledge of RV2 rhadinovirus genetics. Further comparisons with KSHV and other RV1 rhadinoviruses will provide important avenues for dissecting the biology, evolution, and pathology of these closely related tumor-inducing viruses in humans and other Old World primates. IMPORTANCE This work provides the sequence characterization of MneRV2, the pig-tailed macaque homolog of rhesus rhadinovirus (RRV). 
MneRV2 and RRV belong to the rhadinovirus 2 (RV2) rhadinovirus lineage of Old World primates and are distinct but related to Kaposi's sarcoma-associated herpesvirus (KSHV), the etiologic agent of Kaposi's sarcoma. Pig-tailed macaques provide important models of human disease, and our previous studies have indicated that MneRV2 plays a causal role in AIDS-related lymphomas in macaques. Delineation of the MneRV2 sequence has allowed a detailed characterization of the genome structure, and evolutionary comparisons with RRV and KSHV have identified conserved promoters, splice junctions, and novel genes. This comparison provides insight into RV2 rhadinovirus biology and sets the groundwork for more intensive next-generation (Next-Gen) transcript and genetic analysis of this class of tumor-inducing herpesvirus. This study supports the use of MneRV2 in pig-tailed macaques as an important model for studying rhadinovirus biology, transmission and pathology. PMID:25609822
Szamalek, Justyna M; Goidts, Violaine; Cooper, David N; Hameister, Horst; Kehrer-Sawatzki, Hildegard
2006-08-01
The human and chimpanzee genomes are distinguishable in terms of ten gross karyotypic differences including nine pericentric inversions and a chromosomal fusion. Seven of these large pericentric inversions are chimpanzee-specific whereas two of them, involving human chromosomes 1 and 18, were fixed in the human lineage after the divergence of humans and chimpanzees. We have performed detailed molecular and computational characterization of the breakpoint regions of the human-specific inversion of chromosome 1. FISH analysis and sequence comparisons together revealed that the pericentromeric region of HSA 1 contains numerous segmental duplications that display a high degree of sequence similarity between both chromosomal arms. Detailed analysis of these regions has allowed us to refine the p-arm breakpoint region to a 154.2 kb interval at 1p11.2 and the q-arm breakpoint region to a 562.6 kb interval at 1q21.1. Both breakpoint regions contain human-specific segmental duplications arranged in inverted orientation. We therefore propose that the pericentric inversion of HSA 1 was mediated by intra-chromosomal non-homologous recombination between these highly homologous segmental duplications that had themselves arisen only recently in the human lineage by duplicative transposition.
NASA Astrophysics Data System (ADS)
Echtler, Helmut; Segl, Karl; Dickerhof, Corinna; Chabrillat, Sabine; Kaufmann, Hermann J.
2003-03-01
The ESF-LSF 1997 flight campaign conducted by the German Aerospace Center (DLR) recorded several transects across the island of Naxos using the airborne hyperspectral scanner DAIS. The geological targets cover all major litho-tectonic units of a metamorphic dome, with the metamorphic zonation passing from the outer meta-sedimentary greenschist envelope to the gneissic amphibolite facies and migmatitic core. Mineral identification of alternating marble-dolomite sequences and interlayered schists bearing muscovite and biotite has been accomplished using the airborne hyperspectral DAIS 7915 sensor. Data were noise filtered using maximum noise fraction (MNF) and fast Fourier transform (FFT) methods and converted from radiance to reflectance. For mineral identification, constrained linear spectral unmixing and spectral angle mapper (SAM) algorithms were tested. Because their results were unsatisfactory, a new approach was developed that combines linear mixture modeling with spectral feature fitting. This approach provides more detailed and accurate information. Results are discussed in comparison with detailed geological mapping and additional information. Calcites are clearly separated from dolomites, and the mica-schist sequences are distinguished by good resolution of the mineral muscovite. An outstanding result is the very good resolution of the chlorite/mica (muscovite, biotite) transition, which defines a metamorphic isograde.
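For readers unfamiliar with the spectral angle mapper (SAM) mentioned in this abstract, the following is a minimal sketch of the general technique, not the authors' implementation. It treats each spectrum as a vector and assigns a pixel to the reference spectrum with the smallest angle. The function names, the `max_angle` threshold, and the three-band reflectance values are hypothetical and chosen only for illustration.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectrum has zero magnitude")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_theta)

def classify(pixel, library, max_angle=0.1):
    """Return the name of the closest reference spectrum, or None
    if no reference lies within max_angle radians (assumed threshold)."""
    best_name, best_angle = None, max_angle
    for name, ref in library.items():
        angle = spectral_angle(pixel, ref)
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name

# Hypothetical three-band reference spectra, for illustration only.
library = {
    "calcite": [0.5, 0.6, 0.3],
    "dolomite": [0.5, 0.3, 0.6],
}
```

Because the angle depends only on the direction of the vector, a pixel that is a brightness-scaled copy of a reference spectrum matches it with an angle near zero, which is why SAM is relatively insensitive to illumination differences.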
The microbiome of a striped dolphin (Stenella coeruleoalba) stranded in Portugal.
Godoy-Vitorino, Filipa; Rodriguez-Hilario, Arnold; Alves, Ana Luísa; Gonçalves, Filipa; Cabrera-Colon, Beatriz; Mesquita, Cristina Sousa; Soares-Castro, Pedro; Ferreira, Marisa; Marçalo, Ana; Vingada, José; Eira, Catarina; Santos, Pedro Miguel
2017-01-01
Infectious diseases with epizootic consequences have not been fully studied in marine mammals. Presently, the unprecedented depth of sequencing, made available by high-throughput approaches, allows detailed comparisons of the microbiome in health and disease. This is the first report of the striped dolphin microbiome in different body sites. Samples from one striped female edematous dolphin were acquired from a variety of body niches, including the blowhole, oral cavity, oral mucosa, tongue, stomach, intestines and genital mucosa. Detailed 16S rRNA analysis of over half a million sequences identified 235 OTUs. Beta diversity analyses indicated that microbial communities vary in structure and cluster by sample origin. Pathogenic, Gram-negative, facultative and obligate anaerobic taxa were significantly detected, including Cetobacterium, Fusobacterium and Ureaplasma. Phocoenobacter and Arcobacter dominated the oral-type samples, while Cardiobacteriaceae and Vibrio were associated with the blowhole and Photobacterium were abundant in the gut. We report for the first time the association of Epulopiscium with a marine mammal gut. The striped dolphin microbiota shows variation in structure and diversity according to the organ type. The high dominance of Gram-negative anaerobic pathogens evidences a cetacean microbiome affected by human-related bacteria. Copyright © 2016 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Comparative Sequence Analysis of Multidrug-Resistant IncA/C Plasmids from Salmonella enterica.
Hoffmann, Maria; Pettengill, James B; Gonzalez-Escalona, Narjol; Miller, John; Ayers, Sherry L; Zhao, Shaohua; Allard, Marc W; McDermott, Patrick F; Brown, Eric W; Monday, Steven R
2017-01-01
Determinants of multidrug resistance (MDR) are often encoded on mobile elements, such as plasmids, transposons, and integrons, which have the potential to transfer among foodborne pathogens, as well as to other virulent pathogens, increasing the threats these traits pose to human and veterinary health. Our understanding of MDR among Salmonella has been limited by the lack of closed plasmid genomes for comparisons across resistance phenotypes, due to difficulties in effectively separating the DNA of these high-molecular weight, low-copy-number plasmids from chromosomal DNA. To resolve this problem, we demonstrate an efficient protocol for isolating, sequencing and closing IncA/C plasmids from Salmonella sp. using single molecule real-time sequencing on a Pacific Biosciences (PacBio) RS II Sequencer. We obtained six Salmonella enterica isolates from poultry, representing six different serovars, each exhibiting the MDR-AmpC resistance profile. Salmonella plasmids were obtained using a modified mini preparation and transformed with Escherichia coli DH10Br. A Qiagen Large-Construct kit™ was used to recover highly concentrated and purified plasmid DNA that was sequenced using PacBio technology. These six closed IncA/C plasmids ranged in size from 104 to 191 kb and shared a stable, conserved backbone containing 98 core genes, with only six differences among those core genes. The plasmids encoded a number of antimicrobial resistance genes, including those for quaternary ammonium compounds and mercury. We then compared our six IncA/C plasmid sequences: first with 14 IncA/C plasmids derived from S. enterica available at the National Center for Biotechnology Information (NCBI), and then with an additional 38 IncA/C plasmids derived from different taxa. These comparisons allowed us to build an evolutionary picture of how antimicrobial resistance may be mediated by this common plasmid backbone. 
Our project provides detailed genetic information about resistance genes in plasmids, advances in plasmid sequencing, and phylogenetic analyses, and important insights about how MDR evolution occurs across diverse serotypes from different animal sources, particularly in agricultural settings where antimicrobial drug use practices vary.
Nolte-Ernsting, C C; Tacke, J; Adam, G B; Haage, P; Jung, P; Jakse, G; Günther, R W
2001-01-01
The aim of this study was to investigate the utility of different gadolinium-enhanced T1-weighted gradient-echo techniques in excretory MR urography. In 74 urologic patients, excretory MR urography was performed using various T1-weighted gradient-echo (GRE) sequences after injection of gadolinium-DTPA and low-dose furosemide. The examinations included conventional GRE sequences and echo-planar imaging (GRE EPI), both obtained with 3D data sets and 2D projection images. Breath-hold acquisition was used primarily. In 20 of 74 examinations, we compared breath-hold imaging with respiratory gating. Breath-hold imaging was significantly superior to respiratory gating for the visualization of pelvicaliceal systems, but not for the ureters. Complete MR urograms were obtained within 14-20 s using 3D GRE EPI sequences and in 20-30 s with conventional 3D GRE sequences. Ghost artefacts caused by ureteral peristalsis often occurred with conventional 3D GRE imaging and were almost completely suppressed in EPI sequences (p < 0.0001). Susceptibility effects were more pronounced on GRE EPI MR urograms and calculi measured 0.8-21.7% greater in diameter compared with conventional GRE sequences. Increased spatial resolution degraded the image quality only in GRE-EPI urograms. In projection MR urography, the entire pelvicaliceal system was imaged by acquisition of a fast single-slice sequence and the conventional 2D GRE technique provided superior morphological accuracy than 2D GRE EPI projection images (p < 0.0003). Fast 3D GRE EPI sequences improve the clinical practicability of excretory MR urography especially in old or critically ill patients unable to suspend breathing for more than 20 s. Conventional GRE sequences are superior to EPI in high-resolution detail MR urograms and in projection imaging.
Almond, N; Jenkins, A; Heath, A B; Kitchin, P
1993-05-01
Three cynomolgus macaques were immunized with recombinant envelope protein preparations derived from simian immunodeficiency virus (SIV). Although humoral and cellular responses were elicited by the immunization regime, all macaques became infected upon challenge with 10 MID50 of the 11/88 virus challenge stock of SIVmac251-32H. The polymerase chain reaction was used to amplify proviral SIV gp120 sequences present in the blood of both immunized and control macaques at 2 months post-infection. A comparison of the predominant sequences found in the region from V2 to V5 of gp120 failed to differentiate provirus recovered from either immunized or control animals. A detailed investigation of sequences obtained from the hypervariable V1 region identified a mixture of sequences in both immunized and control macaques. Some sequences were identical to those previously detected in the virus challenge stock, whereas others had not been detected previously. Phenogram analysis of the new V1 sequences found in immunized animals revealed that they were quite distinct from those from the virus challenge stock and that they included alterations to potential N-linked glycosylation sites. In contrast, new sequence variants recovered from the control animals were closely related to sequences from the virus challenge stock. The difference in diversity of new V1 sequences recovered from immunized and control macaques was highly significant (P < 0.001). Thus, the presence of pre-existing immune responses to SIV envelope protein is associated with greater genetic change in the V1 region of gp120. These data are discussed in relation to the epitopes of SIV gp120 that may confer protection from in vivo challenge.
Iso-seco-tanapartholides: Isolation, Synthesis and Biological Evaluation
Makiyi, Edward F; Frade, Raquel F M; Lebl, Tomas; Jaffray, Ellis G; Cobb, Susan E; Harvey, Alan L; Slawin, Alexandra M Z; Hay, Ronald T; Westwood, Nicholas J
2009-01-01
The isolation, identification and total synthesis of two plant-derived inhibitors of the NF-κB signaling pathway from the iso-seco-tanapartholide family of natural products is described. A key step in the efficient reaction sequence is a late-stage oxidative cleavage reaction that was carried out in the absence of protecting groups to give the natural products directly. A detailed comparison of the synthetic material with samples of the natural products proved informative. Biological studies on synthetic material confirmed that these compounds act late in the NF-κB signaling pathway. (© Wiley-VCH Verlag GmbH & Co. KGaA, 69451 Weinheim, Germany, 2009) PMID:23606807
The sequence measurement system of the IR camera
NASA Astrophysics Data System (ADS)
Geng, Ai-hui; Han, Hong-xia; Zhang, Hai-bo
2011-08-01
IR cameras are now widely used in optoelectronic tracking, optoelectronic measurement, fire control, and optoelectronic countermeasures, but the output timing sequence of most IR cameras used in current projects is complex, and the timing documentation supplied by the manufacturer is not detailed. Because downstream image transmission and image processing systems need the detailed timing sequence of the IR camera, a sequence measurement system for the IR camera was designed and a detailed timing measurement of the camera in use was carried out. FPGA programming combined with SignalTap online observation is used in the measurement system; the precise timing of the IR camera's output signal has been obtained, and a detailed timing document has been supplied to the downstream image transmission system, image processing system, and so on. The sequence measurement system consists of a CameraLink input interface, an LVDS input interface, an FPGA, a CameraLink output interface, and other parts, of which the FPGA is the key component. The system accepts video signals in both CameraLink and LVDS formats, and because image processing cards and image memory cards generally use CameraLink as their input interface, the output of the measurement system is also a CameraLink interface. The system thus measures the IR camera's timing sequence and, for some cameras, also performs interface conversion. Inside the FPGA, the sequence measurement program, pixel clock modification, SignalTap file configuration, and SignalTap online observation are integrated to achieve precise measurement of the IR camera. 
The sequence measurement program, written in Verilog and combined with online observation in the SignalTap tool, counts the number of lines in one frame and the number of pixels in one line, and also measures the line offset and row offset of the image. To handle the complex timing of the IR camera's output signal, the system accurately measures the timing sequence of the camera used in the project, supplies the detailed timing document to downstream systems such as the image processing and image transmission systems, and reports the concrete parameters fval, lval, pixclk, line offset, and row offset. Experiments show that the sequence measurement system obtains precise timing measurements and works stably, laying a foundation for the downstream systems.
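The counting step described in this abstract can be illustrated in software. The sketch below is a Python simulation and not the authors' Verilog: it assumes the frame-valid (fval) and line-valid (lval) signals have been sampled once per pixel clock into two equal-length lists of 0/1 values, and it counts lines per frame and pixels per line. The function name and the sample data are hypothetical.

```python
def measure_frame(fval, lval):
    """Count lines and pixels per line from per-clock samples.

    fval, lval: equal-length sequences of 0/1 values, one sample per
    pixel clock (an assumed input format for this illustration).
    A pixel is counted whenever both signals are high; a line ends
    when that condition drops.
    """
    pixels_per_line = []
    count = 0
    prev_active = False
    for f, l in zip(fval, lval):
        active = bool(f) and bool(l)
        if active:
            count += 1
        elif prev_active:
            # Falling edge of the active region closes the current line.
            pixels_per_line.append(count)
            count = 0
        prev_active = active
    if prev_active:
        # The capture ended while a line was still active.
        pixels_per_line.append(count)
    return len(pixels_per_line), pixels_per_line

# Hypothetical capture: one frame containing two lines of three pixels.
fval = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
lval = [0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0]
lines, widths = measure_frame(fval, lval)
```

In hardware the same logic would typically be a pair of counters clocked by pixclk and reset on the edges of fval and lval; the list-based version here only makes the counting rule explicit.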
NASA Astrophysics Data System (ADS)
Et-Touhami, M.; Et-Touhami, M.; Olsen, P. E.; Puffer, J.
2001-05-01
Previously, very sparse biostratigraphic data suggested that the Early Mesozoic tholeiitic effusive and intrusive magmatism in the various basins of the Maghreb occurred over a long time (Ladinian-Hettangian). However, a detailed comparison of the stratigraphy underlying, interbedded with, and overlying the basalts in these basins shows remarkable similarities not only among the basins themselves, but also with sequences in the latest Triassic and earliest Jurassic of eastern North America. There, the sequences have been shown to be cyclical, controlled by Milankovitch-type climate cycles; the same seems to be true in at least part of the Maghreb. Thus, the Moroccan basins have cyclical sequences surrounding and interbedded with one or two basaltic units. In the Argana and Khemisset basins the Tr-J boundary is identified by palynology to be below the lowest basalt, and the remarkably close lithological similarity of the pre-basalt sequence in the other Moroccan basins to that in the North American basins, especially the Fundy basin, suggests a tight correlation in time. Likewise, the strata above the lowest basalt in Morocco show a pattern similar to that seen above the lowest basalt formation in eastern North America, as do the overlying sequences. Furthermore, geochemistry of basalts in the Argana, Bou Fekrane, Khemisset, and Iouawen basins indicates they are high-Ti quartz-normative tholeiites, as are the Orange Mountain Basalt (Newark basin) and the North Mountain Basalt (Fundy basin). The remarkable lithostratigraphic similarity of these strata across the Maghreb suggests contemporaneous and synchronous eruption over a time span of less than 200 ky, based on Milankovitch calibration, and within a ~20 ky interval after the Triassic-Jurassic boundary. 
Differences with previous interpretations of the biostratigraphy can be rationalized as a result of: 1, an over-reliance on comparisons with northern European palynology; 2, over-interpretation of poorly preserved fossils; and 3, rarity of early Jurassic non-marine ostracode assemblages.
Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.
Yap, Jia-Yee S; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y H; Wilton, Alan; Wilkins, Marc R; Rossetto, Maurizio; Delaney, Sven K
2015-01-01
The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine.
Brylinski, Michal; Konieczny, Leszek; Kononowicz, Andrzej; Roterman, Irena
2008-03-21
The well-known sequence-comparison procedure implemented in ClustalW was applied to structure comparison. A consensus sequence as well as a consensus structure has been defined for proteins belonging to the serpin family. The structure of the early-stage intermediate was the object of the similarity search. High values of W(sequence) were found to accord with high values of W(structure), making structure comparison possible using common criteria for sequence and structure comparison. Since the early-stage structural form is generated within a limited conformational sub-space that does not include the beta-structure (this structure is mediated by the C7eq structural form), it is particularly important to see that the C7eq structural form may be treated as the seed for the beta-structure present in the final native structure of the protein. The applicability of the ClustalW procedure to structure comparison unifies these two comparisons.
First genome sequences of Achromobacter phages reveal new members of the N4 family.
Wittmann, Johannes; Dreiseikelmann, Brigitte; Rohde, Manfred; Meier-Kolthoff, Jan P; Bunk, Boyke; Rohde, Christine
2014-01-27
Multi-resistant Achromobacter xylosoxidans has been recognized as an emerging pathogen causing nosocomially acquired infections during the last years. Phages as natural opponents could be an alternative to fight such infections. Bacteriophages against this opportunistic pathogen were isolated in a recent study. This study shows a molecular analysis of two podoviruses and reveals first insights into the genomic structure of Achromobacter phages so far. Growth curve experiments and adsorption kinetics were performed for both phages. Adsorption and propagation in cells were visualized by electron microscopy. Both phage genomes were sequenced with the PacBio RS II system based on single molecule, real-time (SMRT) technology and annotated with several bioinformatic tools. To further elucidate the evolutionary relationships between the phage genomes, a phylogenomic analysis was conducted using the genome Blast Distance Phylogeny approach (GBDP). In this study, we present the first detailed analysis of genome sequences of two Achromobacter phages so far. Phages JWAlpha and JWDelta were isolated from two different waste water treatment plants in Germany. Both phages belong to the Podoviridae and contain linear, double-stranded DNA with a length of 72329 bp and 73659 bp, respectively. 92 and 89 putative open reading frames were identified for JWAlpha and JWDelta, respectively, by bioinformatic analysis with several tools. The genomes have nearly the same organization and could be divided into different clusters for transcription, replication, host interaction, head and tail structure and lysis. Detailed annotation via protein comparisons with BLASTP revealed strong similarities to N4-like phages. Analysis of the genomes of Achromobacter phages JWAlpha and JWDelta and comparisons of different gene clusters with other phages revealed that they might be strongly related to other N4-like phages, especially of the Escherichia group. 
Although all these phages show a highly conserved genomic structure and partially strong similarities at the amino acid level, some differences could be identified. Those differences, e.g. the existence of specific genes for replication or host interaction in some N4-like phages, seem to be interesting targets for further examination of function and specific mechanisms, which might enlighten the mechanism of phage establishment in the host cell after infection.
Boyd, Bret M; Allen, Julie M; de Crécy-Lagard, Valérie; Reed, David L
2014-09-11
The obligate-heritable endosymbionts of insects possess some of the smallest known bacterial genomes. This is likely due to loss of genomic material during symbiosis. The mode and rate of this erosion may change over evolutionary time: faster in newly formed associations and slower in long-established ones. The endosymbionts of human and anthropoid primate lice present a unique opportunity to study genome erosion in newly established (or young) symbionts. This is because we have a detailed phylogenetic history of these endosymbionts with divergence dates for closely related species. This allows for genome evolution to be studied in detail and rates of change to be estimated in a phylogenetic framework. Here, we sequenced the genome of the chimpanzee louse endosymbiont (Candidatus Riesia pediculischaeffi) and compared it with the closely related genome of the human body louse endosymbiont. From this comparison, we found evidence for recent genome erosion leading to gene loss in these endosymbionts. Although gene loss was detected, it was not significantly greater than in older endosymbionts from aphids and ants. Additionally, we searched for genes associated with B-vitamin synthesis in the two louse endosymbiont genomes because these endosymbionts are believed to synthesize essential B vitamins absent in the louse's diet. All of the expected genes were present, except those involved in thiamin synthesis. We failed to find genes encoding for proteins involved in the biosynthesis of thiamin or any complete exogenous means of salvaging thiamin, suggesting there is an undescribed mechanism for the salvage of thiamin. Finally, genes encoding for the pantothenate de novo biosynthesis pathway were located on a plasmid in both taxa along with a heat shock protein. Movement of these genes onto a plasmid may be functionally and evolutionarily significant, potentially increasing production and guarding against the deleterious effects of mutation. 
These data add to a growing resource of obligate endosymbiont genomes and to our understanding of the rate and mode of genome erosion in obligate animal-associated bacteria. Ultimately sequencing additional louse p-endosymbiont genomes will provide a model system for studying genome evolution in obligate host associated bacteria. Copyright © 2014 Boyd et al.
Nishio, Shin-Ya; Usami, Shin-Ichi
2017-03-01
Recent advances in next-generation sequencing (NGS) have given rise to new challenges due to the difficulties in variant pathogenicity interpretation and large dataset management, including many kinds of public population databases as well as public or commercial disease-specific databases. Here, we report a new database development tool, named the "Clinical NGS Database," for improving clinical NGS workflow through the unified management of variant information and clinical information. This database software offers a two-feature approach to variant pathogenicity classification. The first of these approaches is a phenotype similarity-based approach. This database allows the easy comparison of the detailed phenotype of each patient with the average phenotype of the same gene mutation at the variant or gene level. It is also possible to browse patients with the same gene mutation quickly. The other approach is a statistical approach to variant pathogenicity classification based on the use of the odds ratio for comparisons between the case and the control for each inheritance mode (families with apparently autosomal dominant inheritance vs. control, and families with apparently autosomal recessive inheritance vs. control). A number of case studies are also presented to illustrate the utility of this database. © 2016 The Authors. Human Mutation published by Wiley Periodicals, Inc.
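The abstract does not say how the database computes its odds ratio, so the following is only a generic sketch of a case-versus-control odds ratio from a 2x2 table, with a Wald 95% confidence interval. The function name, the 0.5 continuity correction for zero cells (Haldane-Anscombe), and the example counts are assumptions made for illustration.

```python
import math

def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Return (odds ratio, lower 95% bound, upper 95% bound)."""
    a, b = float(case_carriers), float(case_noncarriers)
    c, d = float(control_carriers), float(control_noncarriers)
    if 0 in (a, b, c, d):
        # Assumed handling of empty cells: add 0.5 to every cell.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    ratio = (a * d) / (b * c)
    # Standard error of log(OR) for the Wald interval.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper

# Hypothetical counts: a variant seen in 10 of 100 case families
# and in 2 of 200 controls.
ratio, lower, upper = odds_ratio(10, 90, 2, 198)
```

A lower bound above 1 indicates that the variant is enriched in cases at roughly the 5% level under this approximation; with small counts an exact test would usually be preferred.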
DOE Office of Scientific and Technical Information (OSTI.GOV)
Souza, B; Stoutland, P; Derbise, A
2004-01-24
Yersinia pestis, the causative agent of plague, is a highly uniform clone that diverged recently from the enteric pathogen Yersinia pseudotuberculosis. Despite their close genetic relationship, they differ radically in their pathogenicity and transmission. Here we report the complete genomic sequence of Y. pseudotuberculosis IP32953 and its use for detailed genome comparisons to available Y. pestis sequences. Analyses of identified differences across a panel of Yersinia isolates from around the world reveal 32 Y. pestis chromosomal genes that, together with the two Y. pestis-specific plasmids, represent the only new genetic material in Y. pestis acquired since the divergence from Y. pseudotuberculosis. In contrast, 149 new pseudogenes (doubling the previous estimate) and 317 genes absent from Y. pestis were detected, indicating that as many as 13% of Y. pseudotuberculosis genes no longer function in Y. pestis. Extensive IS-mediated genome rearrangements and reductive evolution through massive gene loss, resulting in elimination and modification of pre-existing gene expression pathways, appear to be more important than acquisition of new genes in the evolution of Y. pestis. These results provide a sobering example of how a highly virulent epidemic clone can suddenly emerge from a less virulent, closely related progenitor.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Geller, Aaron M.; Hurley, Jarrod R.; Mathieu, Robert D.
2013-01-01
Following on from a recently completed radial-velocity survey of the old (7 Gyr) open cluster NGC 188 in which we studied in detail the solar-type hard binaries and blue stragglers of the cluster, here we investigate the dynamical evolution of NGC 188 through a sophisticated N-body model. Importantly, we employ the observed binary properties of the young (180 Myr) open cluster M35, where possible, to guide our choices for parameters of the initial binary population. We apply pre-main-sequence tidal circularization and a substantial increase to the main-sequence tidal circularization rate, both of which are necessary to match the observed tidal circularization periods in the literature, including that of NGC 188. At 7 Gyr the main-sequence solar-type hard-binary population in the model matches that of NGC 188 in both binary frequency and distributions of orbital parameters. This agreement between the model and observations is in a large part due to the similarities between the NGC 188 and M35 solar-type binaries. Indeed, among the 7 Gyr main-sequence binaries in the model, only those with P ≳ 1000 days begin to show potentially observable evidence for modifications by dynamical encounters, even after 7 Gyr of evolution within the star cluster. This emphasizes the importance of defining accurate initial conditions for star cluster models, which we propose is best accomplished through comparisons with observations of young open clusters like M35. Furthermore, this finding suggests that observations of the present-day binaries in even old open clusters can provide valuable information on their primordial binary populations. However, despite the model's success at matching the observed solar-type main-sequence population, the model underproduces blue stragglers and produces an overabundance of long-period circular main-sequence-white-dwarf binaries as compared with the true cluster. 
We explore several potential solutions to the paucity of blue stragglers and conclude that the model dramatically underproduces blue stragglers through mass-transfer processes. We suggest that common-envelope evolution may have been incorrectly imposed on the progenitors of the spurious long-period circular main-sequence-white-dwarf binaries, which perhaps instead should have gone through stable mass transfer to create blue stragglers, thereby bringing both the number and binary frequency of the blue straggler population in the model into agreement with the true blue stragglers in NGC 188. Thus, improvements in the physics of mass transfer and common-envelope evolution employed in the model may in fact solve both discrepancies with the observations. This project highlights the unique accessibility of open clusters to both comprehensive observational surveys and full-scale N-body simulations, both of which have only recently matured sufficiently to enable such a project, and underscores the importance of open clusters to the study of star cluster dynamics.
Grandi, Nicole; Cadeddu, Marta; Blomberg, Jonas; Tramontano, Enzo
2016-09-09
Human endogenous retroviruses (HERVs) are ancient sequences integrated in the germ line cells and vertically transmitted through the offspring, constituting about 8% of our genome. In time, HERVs accumulated mutations that compromised their coding capacity. A prominent exception is HERV-W locus 7q21.2, producing a functional Env protein (Syncytin-1) coopted for placental syncytiotrophoblast formation. While expression of HERV-W sequences has been investigated for their correlation to disease, an exhaustive description of the group composition and characteristics is still not available, and current HERV-W group information derives from studies published a few years ago that, of course, used the rough assemblies of the human genome available at that time. This hampers the comparison and correlation with current human genome assemblies. In the present work we identified and described in detail the distribution and genetic composition of 213 HERV-W elements. The bioinformatics analysis led to the characterization of several previously unreported features and provided a phylogenetic classification of two main subgroups with different age and structural characteristics. New facts on HERV-W genomic context of insertion and co-localization with sequences putatively involved in disease development are also reported. The present work is a detailed overview of the HERV-W contribution to the human genome and provides a robust genetic background useful to clarify HERV-W role in pathologies with poorly understood etiology, representing, to our knowledge, the most complete and exhaustive HERV-W dataset to date.
Bhatia, S; Singh Negi, M; Lakshmikumaran, M
1996-11-01
EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B'. The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea. 
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Asaf, Sajjad; Khan, Abdul Latif; Khan, Muhammad Aaqil; Waqas, Muhammad; Kang, Sang-Mo; Yun, Byung-Wook; Lee, In-Jung
2017-08-08
We investigated the complete chloroplast (cp) genomes of non-model Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea using Illumina paired-end sequencing to understand their genetic organization and structure. Detailed bioinformatics analysis revealed genome sizes of both subspecies between 154.4 and 154.5 kbp, with a large single-copy region (84,197~84,158 bp), a small single-copy region (17,738~17,813 bp) and a pair of inverted repeats (IRa/IRb; 26,264~26,259 bp). Both cp genomes encode 130 genes, including 85 protein-coding genes, eight ribosomal RNA genes and 37 transfer RNA genes. Whole cp genome comparison of A. halleri ssp. gemmifera and A. lyrata ssp. petraea, along with ten other Arabidopsis species, showed an overall high degree of sequence similarity, with divergence among some intergenic spacers. The location and distribution of repeat sequences were determined, and sequence divergences of shared genes were calculated among related species. Comparative phylogenetic analysis of the entire genomic data set and 70 shared genes between both cp genomes confirmed the previous phylogeny and generated phylogenetic trees with the same topologies. The sister species of A. halleri ssp. gemmifera is A. umezawana, whereas the closest relative of A. lyrata ssp. petraea is A. arenicola.
ERIC Educational Resources Information Center
Limongelli, Carla; Sciarrone, Filippo; Temperini, Marco; Vaste, Giulia
2011-01-01
LS-Lab provides automatic support to comparison/evaluation of the Learning Object Sequences produced by different Curriculum Sequencing Algorithms. Through this framework a teacher can verify the correspondence between the behaviour of different sequencing algorithms and her pedagogical preferences. In fact the teacher can compare algorithms…
Cylinder expansion test and gas gun experiment comparison
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harrier, Danielle
This is a summer internship presentation by the Hydro Working Group at Los Alamos National Laboratory (LANL) comparing the cylinder expansion test with the gas gun experiment. It describes the gas gun experiment and its applications, describes the cylinder expansion test and its applications, compares the two methods with their pros, cons, and limitations, and outlines the summer project and future work.
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.
Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark
2003-07-04
The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
Sequence Composition and Gene Content of the Short Arm of Rye (Secale cereale) Chromosome 1
Fluch, Silvia; Kopecky, Dieter; Burg, Kornel; Šimková, Hana; Taudien, Stefan; Petzold, Andreas; Kubaláková, Marie; Platzer, Matthias; Berenyi, Maria; Krainer, Siegfried; Doležel, Jaroslav; Lelley, Tamas
2012-01-01
Background The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundreds of bread wheat varieties worldwide. Methodology/Principal Findings Multiple Displacement Amplification of 1RS DNA, obtained from flow sorted 1RS chromosomes, using 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites mostly located in gene related sequence reads were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice. Conclusions The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye. PMID:22328922
Wallis, Michael
2008-01-15
Mammalian growth hormone (GH) sequences have been shown previously to display episodic evolution: the sequence is generally strongly conserved but on at least two occasions during mammalian evolution (on lineages leading to higher primates and ruminants) bursts of rapid evolution occurred. However, the number of mammalian orders studied previously has been relatively limited, and the availability of sequence data via mammalian genome projects provides the potential for extending the range of GH gene sequences examined. Complete or nearly complete GH gene sequences for six mammalian species for which no data were previously available have been extracted from the genome databases: Dasypus novemcinctus (nine-banded armadillo), Erinaceus europaeus (western European hedgehog), Myotis lucifugus (little brown bat), Procavia capensis (cape rock hyrax), Sorex araneus (European shrew), and Spermophilus tridecemlineatus (13-lined ground squirrel). In addition incomplete data for several other species have been extended. Examination of the data in detail and comparison with previously available sequences has allowed assessment of the reliability of deduced sequences. Several of the new sequences differ substantially from the consensus sequence previously determined for eutherian GHs, indicating greater variability than previously recognised, and confirming the episodic pattern of evolution. The episodic pattern is not seen for signal sequences, 5' upstream sequence or synonymous substitutions; it is specific to the mature protein sequence, suggesting that it relates to the hormonal function. The substitutions accumulated during the course of GH evolution have occurred mainly on the side of the hormone facing away from the receptor, in a non-random fashion, and it is suggested that this may reflect interaction of the receptor-bound hormone with other proteins or small ligands.
Oracle Applications Patch Administration Tool (PAT) Beta Version
DOE Office of Scientific and Technical Information (OSTI.GOV)
2002-01-04
PAT is a Patch Administration Tool that provides analysis, tracking, and management of Oracle Application patches. This includes capabilities as outlined below: Patch Analysis & Management Tool Outline of capabilities: Administration Patch Data Maintenance -- track Oracle Application patches applied to what database instance & machine Patch Analysis capture text files (readme.txt and driver files) form comparison detail report comparison detail PL/SQL package comparison detail SQL scripts detail JSP module comparison detail Parse and load the current applptch.txt (10.7) or load patch data from Oracle Application database patch tables (11i) Display Analysis -- Compare patch to be applied with current Oracle Application installed Appl_top code versions Patch Detail Module comparison detail Analyze and display one Oracle Application module patch. Patch Management -- automatic queue and execution of patches Administration Parameter maintenance -- setting for directory structure of Oracle Application appl_top Validation data maintenance -- machine names and instances to patch Operation Patch Data Maintenance Schedule a patch (queue for later execution) Run a patch (queue for immediate execution) Review the patch logs Patch Management Reports
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andersen, Mikael R.; Salazar, Margarita; Schaap, Peter
2011-06-01
The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases and protein transporters.
Rudi, Knut; Zimonja, Monika; Kvenshagen, Bente; Rugtveit, Jarle; Midtvedt, Tore; Eggesbø, Merete
2007-01-01
We present a novel approach for comparing 16S rRNA gene clone libraries that is independent of both DNA sequence alignment and definition of bacterial phylogroups. These steps are the major bottlenecks in current microbial comparative analyses. We used direct comparisons of taxon density distributions in an absolute evolutionary coordinate space. The coordinate space was generated by using alignment-independent bilinear multivariate modeling. Statistical analyses for clone library comparisons were based on multivariate analysis of variance, partial least-squares regression, and permutations. Clone libraries from both adult and infant gastrointestinal tract microbial communities were used as biological models. We reanalyzed a library consisting of 11,831 clones covering complete colons from three healthy adults in addition to a smaller 390-clone library from infant feces. We show that it is possible to extract detailed information about microbial community structures using our alignment-independent method. Our density distribution analysis is also very efficient with respect to computer operation time, meeting the future requirements of large-scale screenings to understand the diversity and dynamics of microbial communities. PMID:17337554
Recognition of Yeast Species from Gene Sequence Comparisons
USDA-ARS's Scientific Manuscript database
This review discusses recognition of yeast species from gene sequence comparisons, which have been responsible for doubling the number of known yeasts over the past decade. The resolution provided by various single gene sequences is examined for both ascomycetous and basidiomycetous species, and th...
Rideout, Jai Ram; He, Yan; Navas-Molina, Jose A; Walters, William A; Ursell, Luke K; Gibbons, Sean M; Chase, John; McDonald, Daniel; Gonzalez, Antonio; Robbins-Pianka, Adam; Clemente, Jose C; Gilbert, Jack A; Huse, Susan M; Zhou, Hong-Wei; Knight, Rob; Caporaso, J Gregory
2014-01-01
We present a performance-optimized algorithm, subsampled open-reference OTU picking, for assigning marker gene (e.g., 16S rRNA) sequences generated on next-generation sequencing platforms to operational taxonomic units (OTUs) for microbial community analysis. This algorithm provides benefits over de novo OTU picking (clustering can be performed largely in parallel, reducing runtime) and closed-reference OTU picking (all reads are clustered, not only those that match a reference database sequence with high similarity). Because more of our algorithm can be run in parallel relative to "classic" open-reference OTU picking, it makes open-reference OTU picking tractable on massive amplicon sequence data sets (though on smaller data sets, "classic" open-reference OTU clustering is often faster). We illustrate that here by applying it to the first 15,000 samples sequenced for the Earth Microbiome Project (1.3 billion V4 16S rRNA amplicons). To the best of our knowledge, this is the largest OTU picking run ever performed, and we estimate that our new algorithm runs in less than 1/5 the time than would be required of "classic" open reference OTU picking. We show that subsampled open-reference OTU picking yields results that are highly correlated with those generated by "classic" open-reference OTU picking through comparisons on three well-studied datasets. An implementation of this algorithm is provided in the popular QIIME software package, which uses uclust for read clustering. All analyses were performed using QIIME's uclust wrappers, though we provide details (aided by the open-source code in our GitHub repository) that will allow implementation of subsampled open-reference OTU picking independently of QIIME (e.g., in a compiled programming language, where runtimes should be further reduced). Our analyses should generalize to other implementations of these OTU picking algorithms. 
Finally, we present a comparison of parameter settings in QIIME's OTU picking workflows and make recommendations on settings for these free parameters to optimize runtime without reducing the quality of the results. These optimized parameters can vastly decrease the runtime of uclust-based OTU picking in QIIME.
The limits of protein sequence comparison?
Pearson, William R; Sierk, Michael L
2010-01-01
Modern sequence alignment algorithms are used routinely to identify homologous proteins, proteins that share a common ancestor. Homologous proteins always share similar structures and often have similar functions. Over the past 20 years, sequence comparison has become both more sensitive, largely because of profile-based methods, and more reliable, because of more accurate statistical estimates. As sequence and structure databases become larger, and comparison methods become more powerful, reliable statistical estimates will become even more important for distinguishing similarities that are due to homology from those that are due to analogy (convergence). The newest sequence alignment methods are more sensitive than older methods, but more accurate statistical estimates are needed for their full power to be realized. PMID:15919194
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.
Powell, Bradford C; Hutchison, Clyde A
2006-01-19
Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
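The first step described above, collecting every ORF of at least 30 codons, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes an ORF runs from an ATG to the next in-frame stop codon, scans all six reading frames, and counts the start codon but not the stop codon toward the length. The function names and the returned tuple layout are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """List ORFs as (strand, frame, start, end) tuples.

    Assumption: an ORF starts at the first ATG seen in a frame and ends at
    the next in-frame stop codon. Coordinates are 0-based, end-exclusive,
    and refer to the strand that was scanned ("-" means the reverse
    complement). The stop codon is included in the span but not counted.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (pos - start) // 3  # excludes the stop codon
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, start, pos + 3))
                    start = None
    return orfs
```

The peptide translations of such ORFs would then be the input to the COG clustering step, which is not reproduced here.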
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs
Powell, Bradford C; Hutchison, Clyde A
2006-01-01
Background Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes. PMID:16423288
Viswanathan, R; Balamuralikrishnan, M; Karuppaiah, R
2008-12-01
Sugarcane yellow leaf virus (SCYLV) that causes yellow leaf disease (YLD) in sugarcane (recently reported in India) belongs to Polerovirus. Detailed studies were conducted to characterize the virus based on partial open reading frames (ORFs) 1 and 2 and complete ORFs 3 and 4 sequences in their genome. Reverse-transcriptase polymerase chain reaction (RT-PCR) was performed on 48 sugarcane leaf samples to detect the virus using a specific set of primers. Of the 48 samples, 36 samples (field samples with and without foliar symptoms) including 10 meristem culture derived plants were found to be positive to SCYLV infection. Additionally, an aphid colony collected from symptomatic sugarcane in the field was also found to be SCYLV positive. The amplicons from 22 samples were cloned, sequenced and acronymed as SCYLV-CB isolates. The nucleotide (nt) and amino acid (aa) sequence comparison showed a significant variation between SCYLV-CB and the database sequences at nt (3.7-5.1%) and aa (3.2-5.3%) sequence level in the CP coding region. However, the database sequences comprising isolates of three reported genotypes, viz., BRA, PER and REU, were observed with least nt and aa sequence dissimilarities (0.0-1.6%). The phylogenetic analyses of the overlapping ORFs (ORF 3 and ORF 4) of SCYLV encoding CP and MP determined in this study and additional sequences of 26 other isolates including an Indian isolate (SCYLV-IND) available from GenBank were distributed in four phylogenetic clusters. The SCYLV-CB isolates from this study lineated in two clusters (C1 and C2) and all the other isolates from the worldwide locations into another two clusters (C3 and C4). The sequence variation of the isolates in this study with the database isolates, even in the least variable region of the SCYLV genome, showed that the population existing in India is significantly different from rest of the world. 
Further, comparison of partial sequences encoding for ORFs 1 and 2 revealed that YLD in sugarcane in India is caused by at least three genotypes, viz., CUB, IND and BRA-PER, of which a majority of the samples were found infected with Cuban genotype (CUB) and lesser by IND and BRA-PER genotypes. The genotype IND was identified as a new genotype from this study, and this was found to have significant variation with the reported genotypes.
ERIC Educational Resources Information Center
Noell, George H.; Gresham, Frank M.
2001-01-01
Describes design logic and potential uses of a variant of the multiple-baseline design. The multiple-baseline multiple-sequence (MBL-MS) consists of multiple-baseline designs that are interlaced with one another and include all possible sequences of treatments. The MBL-MS design appears to be primarily useful for comparison of treatments taking…
Bhatia, Shipra; Gordon, Christopher T.; Foster, Robert G.; Melin, Lucie; Abadie, Véronique; Baujat, Geneviève; Vazquez, Marie-Paule; Amiel, Jeanne; Lyonnet, Stanislas; van Heyningen, Veronica; Kleinjan, Dirk A.
2015-01-01
Disruption of gene regulation by sequence variation in non-coding regions of the genome is now recognised as a significant cause of human disease and disease susceptibility. Sequence variants in cis-regulatory elements (CREs), the primary determinants of spatio-temporal gene regulation, can alter transcription factor binding sites. While technological advances have led to easy identification of disease-associated CRE variants, robust methods for discerning functional CRE variants from background variation are lacking. Here we describe an efficient dual-colour reporter transgenesis approach in zebrafish, simultaneously allowing detailed in vivo comparison of spatio-temporal differences in regulatory activity between putative CRE variants and assessment of altered transcription factor binding potential of the variant. We validate the method on known disease-associated elements regulating SHH, PAX6 and IRF6 and subsequently characterise novel, ultra-long-range SOX9 enhancers implicated in the craniofacial abnormality Pierre Robin Sequence. The method provides a highly cost-effective, fast and robust approach for simultaneously unravelling in a single assay whether, where and when in embryonic development a disease-associated CRE-variant is affecting its regulatory function. PMID:26030420
Smith, M. Alex; Fisher, Brian L; Hebert, Paul D.N
2005-01-01
The role of DNA barcoding as a tool to accelerate the inventory and analysis of diversity for hyperdiverse arthropods is tested using ants in Madagascar. We demonstrate how DNA barcoding helps address the failure of current inventory methods to rapidly respond to pressing biodiversity needs, specifically in the assessment of richness and turnover across landscapes with hyperdiverse taxa. In a comparison of inventories at four localities in northern Madagascar, patterns of richness were not significantly different when richness was determined using morphological taxonomy (morphospecies) or sequence divergence thresholds (Molecular Operational Taxonomic Unit(s); MOTU). However, sequence-based methods tended to yield greater richness and significantly lower indices of similarity than morphological taxonomy. MOTU determined using our molecular technique were a remarkably local phenomenon—indicative of highly restricted dispersal and/or long-term isolation. In cases where molecular and morphological methods differed in their assignment of individuals to categories, the morphological estimate was always more conservative than the molecular estimate. In those cases where morphospecies descriptions collapsed distinct molecular groups, sequence divergences of 16% (on average) were contained within the same morphospecies. Such high divergences highlight taxa for further detailed genetic, morphological, life history, and behavioral studies. PMID:16214741
A Comprehensive Curation Shows the Dynamic Evolutionary Patterns of Prokaryotic CRISPRs.
Mai, Guoqin; Ge, Ruiquan; Sun, Guoquan; Meng, Qinghan; Zhou, Fengfeng
2016-01-01
Motivation. Clustered regularly interspaced short palindromic repeat (CRISPR) is a genetic element with active regulation roles for foreign invasive genes in the prokaryotic genomes and has been engineered to work with the CRISPR-associated sequence (Cas) gene Cas9 as one of the modern genome editing technologies. Due to inconsistent definitions, the existing CRISPR detection programs seem to have missed some weak CRISPR signals. Results. This study manually curates all the currently annotated CRISPR elements in the prokaryotic genomes and proposes 95 updates to the annotations. A new definition is proposed to cover all the CRISPRs. The comprehensive comparison of CRISPR numbers on the taxonomic levels of both domains and genus shows high variations for closely related species even in the same genus. The detailed investigation of how CRISPRs are evolutionarily manipulated in the 8 completely sequenced species in the genus Thermoanaerobacter demonstrates that transposons act as a frequent tool for splitting long CRISPRs into shorter ones along a long evolutionary history.
UFO: a web server for ultra-fast functional profiling of whole genome protein sequences.
Meinicke, Peter
2009-09-02
Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.
A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment
Freschi, Valerio; Bogliolo, Alessandro
2012-01-01
In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
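The iterative collapsing of tandem repeats of increasing length described in this abstract can be illustrated with a small sketch. This is not the authors' compression scheme: the function name `collapse_tandem_repeats` and the `max_unit` parameter are hypothetical, and the greedy left-to-right scan is an assumption made for brevity.

```python
def collapse_tandem_repeats(seq: str, max_unit: int = 4) -> str:
    """Collapse runs of adjacent identical units, shortest unit length first.

    Illustrative only: a greedy scan, not the published algorithm.
    """
    for unit in range(1, max_unit + 1):
        out = []
        i = 0
        while i < len(seq):
            block = seq[i:i + unit]
            j = i + unit
            # Extend j over every immediately following copy of the unit.
            while len(block) == unit and seq[j:j + unit] == block:
                j += unit
            if j > i + unit:
                # A tandem repeat was found: keep one copy, skip the rest.
                out.append(block)
                i = j
            else:
                # No repeat starts here: copy one character and move on.
                out.append(seq[i])
                i += 1
        seq = "".join(out)
    return seq
```

The collapsed strings contain no tandem duplications up to `max_unit`, so a conventional alignment or edit-distance routine could then be run on them directly, which is the use the abstract describes.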
New powerful statistics for alignment-free sequence comparison under a pattern transfer model.
Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S; Sun, Fengzhu
2011-09-07
Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D*2 and D(s)2 showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D*2 and D(s)2 by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. Copyright © 2011 Elsevier Ltd. All rights reserved.
New Powerful Statistics for Alignment-free Sequence Comparison Under a Pattern Transfer Model
Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S.; Sun, Fengzhu
2011-01-01
Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2∗ and D2s showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2∗ and D2s by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. PMID:21723298
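For readers unfamiliar with the D2 statistic mentioned in these abstracts, a minimal sketch follows. It computes only the basic D2 (the inner product of k-tuple count vectors); the variants and the new local statistics from the paper are not reproduced here. The helper names `kmer_counts` and `d2` are hypothetical.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-tuple (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """Basic D2: sum over all words w of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words that never occur in seq_b.
    return sum(n * counts_b[word] for word, n in counts_a.items())
```

A local version in the spirit of the abstract would apply a statistic of this kind to pairs of short windows and sum the results, but the exact form used by the authors should be taken from the paper rather than from this sketch.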
Multiple alignment-free sequence comparison
Ren, Jie; Song, Kai; Sun, Fengzhu; Deng, Minghua; Reinert, Gesine
2013-01-01
Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, two extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences and, second, three averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data and cis-regulatory module data, where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics. Availability: Our implementation of the five statistics is available as an R package named ‘multiAlignFree’ at http://www-rcf.usc.edu/∼fsun/Programs/multiAlignFree/multiAlignFreemain.html. Contact: reinert@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23990418
EVALLER: a web server for in silico assessment of potential protein allergenicity
Barrio, Alvaro Martinez; Soeria-Atmadja, Daniel; Nistér, Anders; Gustafsson, Mats G.; Hammerling, Ulf; Bongcam-Rudloff, Erik
2007-01-01
Bioinformatics testing approaches for protein allergenicity, involving amino acid sequence comparisons, have evolved appreciably over the last several years to increased sophistication and performance. EVALLER, the web server presented in this article, is based on our recently published ‘Detection based on Filtered Length-adjusted Allergen Peptides’ (DFLAP) algorithm, which affords in silico determination of potential protein allergenicity with high sensitivity and excellent specificity. To strengthen bioinformatics risk assessment in allergology, EVALLER provides a comprehensive outline of its judgment on a query protein's potential allergenicity. Each such textual output incorporates a scoring figure, a confidence numeral of the assignment and information on high- or low-scoring matches to identified allergen-related motifs, including their respective location in accordingly derived allergens. The interface, built on a modified Perl Open Source package, enables dynamic and color-coded graphic representation of key parts of the output. Moreover, pertinent details can be examined closely through zoomed views. The server can be accessed at http://bioinformatics.bmc.uu.se/evaller.html. PMID:17537818
CANDELS Visual Classifications: Scheme, Data Release, and First Results
NASA Technical Reports Server (NTRS)
Kartaltepe, Jeyhan S.; Mozena, Mark; Kocevski, Dale; McIntosh, Daniel H.; Lotz, Jennifer; Bell, Eric F.; Faber, Sandy; Ferguson, Henry; Koo, David; Bassett, Robert;
2014-01-01
We have undertaken an ambitious program to visually classify all galaxies in the five CANDELS fields down to H < 24.5 involving the dedicated efforts of 65 individual classifiers. Once completed, we expect to have detailed morphological classifications for over 50,000 galaxies spanning 0 < z < 4 over all the fields. Here, we present our detailed visual classification scheme, which was designed to cover a wide range of CANDELS science goals. This scheme includes the basic Hubble sequence types, but also includes a detailed look at mergers and interactions, the clumpiness of galaxies, k-corrections, and a variety of other structural properties. In this paper, we focus on the first field to be completed - GOODS-S, which has been classified at various depths. The wide area coverage spanning the full field (wide+deep+ERS) includes 7634 galaxies that have been classified by at least three different people. In the deep area of the field, 2534 galaxies have been classified by at least five different people at three different depths. With this paper, we release to the public all of the visual classifications in GOODS-S along with the Perl/Tk GUI that we developed to classify galaxies. We present our initial results here, including an analysis of our internal consistency and comparisons among multiple classifiers as well as a comparison to the Sersic index. We find that the level of agreement among classifiers is quite good and depends on both the galaxy magnitude and the galaxy type, with disks showing the highest level of agreement and irregulars the lowest. A comparison of our classifications with the Sersic index and restframe colors shows a clear separation between disk and spheroid populations. Finally, we explore morphological k-corrections between the V-band and H-band observations and find that a small fraction (84 galaxies in total) are classified as being very different between these two bands. These galaxies typically have very clumpy and extended morphology or are very faint in the V-band.
Gardner, Shea N.; Hall, Barry G.
2013-01-01
Effective use of rapid and inexpensive whole genome sequencing for microbes requires fast, memory-efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole genome data. kSNP v2 has numerous improvements over kSNP v1 including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and user-friendly use; and a detailed User Guide. SNP discovery is based on k-mer analysis, and requires no multiple sequence alignment or the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The SNP-based trees that result are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein coding and other information (UTRs, etc.) from Genbank file(s). We demonstrate application of kSNP v2 on sets of viral and bacterial genomes, and discuss in detail analysis of a set of 68 finished E. coli and Shigella genomes and a set of the same genomes to which have been added 47 assemblies and four “raw read” genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS), and caused at least 50 deaths. PMID:24349125
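The idea of reference-free SNP discovery from k-mers can be sketched as follows. This is an illustration of the general principle only, not kSNP's implementation: it assumes an odd k, treats a SNP as a central base that varies between genomes while both flanks match exactly, and ignores reverse complements and repeated contexts. The function name `candidate_snps` is hypothetical.

```python
from collections import defaultdict


def candidate_snps(genomes: dict, k: int = 5) -> dict:
    """Return {flank_context: {genome_name: central_base}} for variable loci.

    The context is the k-mer with its middle base masked as '.'.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a central base exists")
    half = k // 2
    loci = defaultdict(dict)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            context = kmer[:half] + "." + kmer[half + 1:]
            # Simplification: a context seen twice in one genome keeps
            # only the last central base observed.
            loci[context][name] = kmer[half]
    # Keep contexts where at least two different central bases occur.
    return {
        context: alleles
        for context, alleles in loci.items()
        if len(set(alleles.values())) > 1
    }
```

For example, `candidate_snps({"g1": "AACGTTA", "g2": "AACCTTA"}, k=5)` reports the single context `AC.TT`, with `G` in `g1` and `C` in `g2`. Because loci are keyed by flanking sequence, no multiple alignment or reference genome is needed, which is the property the abstract emphasizes.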
Reference-guided assembly of four diverse Arabidopsis thaliana genomes
Schneeberger, Korbinian; Ossowski, Stephan; Ott, Felix; Klein, Juliane D.; Wang, Xi; Lanz, Christa; Smith, Lisa M.; Cao, Jun; Fitz, Joffrey; Warthmann, Norman; Henz, Stefan R.; Huson, Daniel H.; Weigel, Detlef
2011-01-01
We present whole-genome assemblies of four divergent Arabidopsis thaliana strains that complement the 125-Mb reference genome sequence released a decade ago. Using a newly developed reference-guided approach, we assembled large contigs from 9 to 42 Gb of Illumina short-read data from the Landsberg erecta (Ler-1), C24, Bur-0, and Kro-0 strains, which have been sequenced as part of the 1,001 Genomes Project for this species. Using alignments against the reference sequence, we first reduced the complexity of the de novo assembly and later integrated reads without similarity to the reference sequence. As an example, half of the noncentromeric C24 genome was covered by scaffolds that are longer than 260 kb, with a maximum of 2.2 Mb. Moreover, over 96% of the reference genome was covered by the reference-guided assembly, compared with only 87% with a complete de novo assembly. Comparisons with 2 Mb of dideoxy sequence reveal that the per-base error rate of the reference-guided assemblies was below 1 in 10,000. Our assemblies provide a detailed, genomewide picture of large-scale differences between A. thaliana individuals, most of which are difficult to access with alignment-consensus methods only. We demonstrate their practical relevance in studying the expression differences of polymorphic genes and show how the analysis of sRNA sequencing data can lead to erroneous conclusions if aligned against the reference genome alone. Genome assemblies, raw reads, and further information are accessible through http://1001genomes.org/projects/assemblies.html. PMID:21646520
The need for high-quality whole-genome sequence databases in microbial forensics.
Sjödin, Andreas; Broman, Tina; Melefors, Öjar; Andersson, Gunnar; Rasmusson, Birgitta; Knutsson, Rickard; Forsman, Mats
2013-09-01
Microbial forensics is an important part of a strengthened capability to respond to biocrime and bioterrorism incidents to aid in the complex task of distinguishing between natural outbreaks and deliberate acts. The goal of a microbial forensic investigation is to identify and criminally prosecute those responsible for a biological attack, and it involves a detailed analysis of the weapon--that is, the pathogen. The recent development of next-generation sequencing (NGS) technologies has greatly increased the resolution that can be achieved in microbial forensic analyses. It is now possible to identify, quickly and in an unbiased manner, previously undetectable genome differences between closely related isolates. This development is particularly relevant for the most deadly bacterial diseases that are caused by bacterial lineages with extremely low levels of genetic diversity. Whole-genome analysis of pathogens is envisaged to be increasingly essential for this purpose. In a microbial forensic context, whole-genome sequence analysis is the ultimate method for strain comparisons as it is informative during identification, characterization, and attribution--all 3 major stages of the investigation--and at all levels of microbial strain identity resolution (ie, it resolves the full spectrum from family to isolate). Given these capabilities, one bottleneck in microbial forensics investigations is the availability of high-quality reference databases of bacterial whole-genome sequences. To be of high quality, databases need to be curated and accurate in terms of sequences, metadata, and genetic diversity coverage. The development of whole-genome sequence databases will be instrumental in successfully tracing pathogens in the future.
Insights into the sequence parameters for halophilic adaptation.
Nath, Abhigyan
2016-03-01
The sequence parameters for halophilic adaptation are still not fully understood. To understand the molecular basis of protein hypersaline adaptation, a detailed analysis is carried out to investigate the likely association of protein sequence attributes with halophilic adaptation. A two-stage strategy is implemented. In the first stage, a supervised machine learning classifier is built, giving an overall accuracy of 86% on stratified tenfold cross-validation and 90% on a blind test set, both better than previously reported results. The second stage consists of statistical analysis of sequence features and possible extraction of halophilic molecular signatures. The results of this study show that halophilic proteins are characterized by lower average charge, lower K content, and lower S content. A statistically significant preference/avoidance list of sequence parameters is also reported, giving insights into the molecular basis of halophilic adaptation: D, Q, E, H, P, T, and V are significantly preferred, while N, C, I, K, M, F, and S are significantly avoided. Among amino acid physicochemical groups, small, polar, charged, acidic and hydrophilic groups are preferred over other groups. The halophilic proteins also show a preference for higher average flexibility and higher average polarity, and an avoidance of higher average positive charge, average bulkiness and average hydrophobicity. Some interesting trends observed in dipeptide counts are also reported. Further, a systematic statistical comparison is undertaken to gain insights into the sequence feature distribution in different residue structural states. The current analysis may help clarify the mechanism of halophilic adaptation, which can in turn be used for the rational design of halophilic proteins.
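Sequence attributes of the kind analysed in this abstract, such as residue content and average charge, are simple to compute. The sketch below is an illustration under stated assumptions, not the study's feature set: the function names are hypothetical, and the charge proxy counts K and R as +1 and D and E as -1, ignoring histidine and pH.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(protein: str) -> dict:
    """Fraction of each of the 20 standard residues in the sequence."""
    counts = Counter(protein.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def net_charge_per_residue(protein: str) -> float:
    """Crude average charge: (K + R - D - E) as fractions of the sequence."""
    comp = composition(protein)
    return comp["K"] + comp["R"] - comp["D"] - comp["E"]
```

Feature vectors built this way (composition, charge, and similar averages) are the typical input to a supervised classifier and to per-feature statistical tests of the sort the abstract describes.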
Mechanism for DNA transposons to generate introns on genomic scales
Huff, Jason T.; Zilberman, Daniel; Roy, Scott W.
2017-01-01
Discovered four decades ago, the existence of introns was one of the most unexpected findings in molecular biology [1]. Introns are sequences interrupting genes that must be removed as part of mRNA production. Genome sequencing projects have documented that most eukaryotic genes contain at least one and frequently many introns [2,3]. Comparison of these genomes reveals a history of long evolutionary periods with little intron gain punctuated by episodes of rapid, extensive gain [2,3]. However, no detailed mechanism for such episodic intron generation has been empirically supported on a sufficient scale, despite several proposals [4–8]. Here we show how short non-autonomous DNA transposons independently generated hundreds to thousands of introns in the prasinophyte Micromonas pusilla and the pelagophyte Aureococcus anophagefferens. Each transposon carries one splice site. The other splice site is co-opted from gene sequence duplicated upon transposon insertion, allowing perfect splicing out of RNA. The distributions of sequences that can be co-opted are biased with respect to codons, and phasing of transposon-generated introns is similarly biased. These transposons insert between preexisting nucleosomes, so that multiple nearby insertions generate nucleosome-sized intervening segments. Thus, transposon insertion and sequence co-option may explain the intron phase biases [2] and prevalence of nucleosome-sized exons [9] observed in eukaryotes. Overall, the two independent examples of proliferating elements illustrate a general DNA transposon mechanism plausibly accounting for episodes of rapid, extensive intron gain during eukaryotic evolution [2,3]. PMID:27760113
Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution
Yap, Jia-Yee S.; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y. H.; Wilkins, Marc R.; Rossetto, Maurizio; Delaney, Sven K.
2015-01-01
The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine. PMID:26061691
Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; van de Vondervoort, Peter J.I.; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristian F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; van Dijck, Piet W.M.; Hofmann, Gerald; Lasure, Linda L.; Magnuson, Jon K.; Menke, Hildegard; Meijer, Martin; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; van Ooyen, Albert J.J.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hein; Tsang, Adrian; van den Brink, Johannes M.; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Grigoriev, Igor V.; Kubicek, Christian P.; Martinez, Diego; van Peij, Noël N.M.E.; Roubos, Johannes A.; Nielsen, Jens; Baker, Scott E.
2011-01-01
The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. PMID:21543515
2012-01-01
Background: RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available; these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. Results: Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. Conclusions: This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced as low as 15% without substantial impacts on false positive or true positive rates. PMID:22985019
Robles, José A; Qureshi, Sumaira E; Stephen, Stuart J; Wilson, Susan R; Burden, Conrad J; Taylor, Jennifer M
2012-09-17
RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available, these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced as low as 15% without substantial impacts on false positive or true positive rates.
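A simulation of the general kind described in this abstract can be sketched with the standard library alone. Everything specific here is an assumption for illustration: per-gene mean expression is drawn from an exponential distribution, counts are negative binomial (sampled as a gamma-Poisson mixture), and reduced sequencing depth is modelled by scaling the mean. The names `simulate_counts`, `mean_scale`, `dispersion` and `depth_fraction` are hypothetical and do not come from the study.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate for the modest means used here."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_counts(n_genes, n_replicates, mean_scale=20.0,
                    dispersion=0.2, depth_fraction=1.0, seed=0):
    """Return a genes x replicates table of simulated read counts.

    Variance of each count is mu + dispersion * mu**2 (negative binomial).
    """
    rng = random.Random(seed)
    shape = 1.0 / dispersion
    table = []
    for _ in range(n_genes):
        # Assumed: exponential gene means, scaled by sequencing depth.
        mu = rng.expovariate(1.0 / mean_scale) * depth_fraction
        row = []
        for _ in range(n_replicates):
            # Gamma-distributed rate with mean mu, then Poisson sampling.
            lam = rng.gammavariate(shape, mu / shape) if mu > 0 else 0.0
            row.append(poisson(lam, rng))
        table.append(row)
    return table
```

Calling `simulate_counts(1000, 3, depth_fraction=0.15)` next to a full-depth run with more replicates gives a toy version of the depth-versus-replication comparison; the resulting tables would then be passed to a differential expression method to estimate true and false positive rates.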
NASA Astrophysics Data System (ADS)
Campbell, T. L.; Geller, J. B.; Heller, P.; Ruiz, G.; Chang, A.; McCann, L.; Ceballos, L.; Marraffini, M.; Ashton, G.; Larson, K.; Havard, S.; Meagher, K.; Wheelock, M.; Drake, C.; Rhett, G.
2016-02-01
The Ballast Water Management Act, the Marine Invasive Species Act, and the Coastal Ecosystem Protection Act require the California Department of Fish and Wildlife to monitor and evaluate the extent of biological invasions in the state's marine and estuarine waters. This has been performed statewide, using a variety of methodologies. Conventional sample collection and processing is laborious, slow and costly, and may require considerable taxonomic expertise requiring detailed time-consuming microscopic study of multiple specimens. These factors limit the volume of biomass that can be searched for introduced species. New technologies continue to reduce the cost and increase the throughput of genetic analyses, which become efficient alternatives to traditional morphological analysis for identification, monitoring and surveillance of marine invasive species. Using next-generation sequencing of mitochondrial Cytochrome c oxidase subunit I (COI) and nuclear large subunit ribosomal RNA (LSU), we analyzed over 15,000 individual marine invertebrates collected in Californian waters. We have created sequence databases of California native and non-native species to assist in molecular identification and surveillance in North American waters. Metagenetics, the next-generation sequencing of environmental samples with comparison to DNA sequence databases, is a faster and cost-effective alternative to individual sample analysis. We have sequenced from biomass collected from whole settlement plates and plankton in California harbors, and used our introduced species database to create species lists. We can combine these species lists for individual marinas with collected environmental data, such as temperature, salinity, and dissolved oxygen to understand the ecology of marine invasions. Here we discuss high throughput sampling, sequencing, and COASTLINE, our data analysis answer to challenges working with hundreds of millions of sequencing reads from tens of thousands of specimens.
DeGraaff-Surpless, K.; Mahoney, J.B.; Wooden, J.L.; McWilliams, M.O.
2003-01-01
High-frequency sampling for detrital zircon analysis can provide a detailed record of fine-scale basin evolution by revealing the temporal and spatial variability of detrital zircon ages within clastic sedimentary successions. This investigation employed detailed sampling of two sedimentary successions in the Methow/Methow-Tyaughton basin of the southern Canadian Cordillera to characterize the heterogeneity of detrital zircon signatures within single lithofacies and assess the applicability of detrital zircon analysis in distinguishing fine-scale provenance changes not apparent in lithologic analysis of the strata. The Methow/Methow-Tyaughton basin contains two distinct stratigraphic sequences of middle Albian to Santonian clastic sedimentary rocks: submarine-fan deposits of the Harts Pass Formation/Jackass Mountain Group and fluvial deposits of the Winthrop Formation. Although both stratigraphic sequences displayed consistent ranges in detrital zircon ages on a broad scale, detailed sampling within each succession revealed heterogeneity in the detrital zircon age distributions that was systematic and predictable in the turbidite succession but unpredictable in the fluvial succession. These results suggest that a high-density sampling approach permits interpretation of fine-scale changes within a lithologically uniform turbiditic sedimentary succession, but heterogeneity within fluvial systems may be too large and unpredictable to permit accurate fine-scale characterization of the evolution of source regions. The robust composite detrital zircon age signature developed for these two successions permits comparison of the Methow/Methow-Tyaughton basin age signature with known plutonic source-rock ages from major plutonic belts throughout the Cretaceous North American margin. The Methow/Methow-Tyaughton basin detrital zircon age signature matches best with source regions in the southern Canadian Cordillera, requiring that the basin developed in close proximity to the southern Canadian Cordillera and providing evidence against large-scale dextral translation of the Methow terrane.
Phylogenetic shadowing of primate sequences to find functional regions of the human genome.
Boffelli, Dario; McAuliffe, Jon; Ovcharenko, Dmitriy; Lewis, Keith D; Ovcharenko, Ivan; Pachter, Lior; Rubin, Edward M
2003-02-28
Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies. We used sequence comparisons of an extensive set of Old World and New World monkeys and hominoids to identify functional regions in the human genome. Analysis of these data enabled the discovery of primate-specific gene regulatory elements and the demarcation of the exons of multiple genes. Much of the information content of the comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates. These results demonstrate the utility of intraprimate sequence comparisons to discover common mammalian as well as primate-specific functional elements in the human genome, which are unattainable through the evaluation of more evolutionarily distant species.
Omics Metadata Management Software (OMMS).
Perez-Arriaga, Martha O; Wilson, Susan; Williams, Kelly P; Schoeniger, Joseph; Waymire, Russel L; Powell, Amy Jo
2015-01-01
Next-generation sequencing projects have underappreciated information management tasks requiring detailed attention to specimen curation, nucleic acid sample preparation and sequence production methods required for downstream data processing, comparison, interpretation, sharing and reuse. The few existing metadata management tools for genome-based studies provide weak curatorial frameworks for experimentalists to store and manage idiosyncratic, project-specific information, typically offering no automation supporting unified naming and numbering conventions for sequencing production environments that routinely deal with hundreds, if not thousands, of samples at a time. Moreover, existing tools are not readily interfaced with bioinformatics executables (e.g., BLAST, Bowtie2, custom pipelines). Our application, the Omics Metadata Management Software (OMMS), answers both needs, empowering experimentalists to generate intuitive, consistent metadata, and perform analyses and information management tasks via an intuitive web-based interface. Several use cases with short-read sequence datasets are provided to validate installation and integrated function, and suggest possible methodological road maps for prospective users. Provided examples highlight possible OMMS workflows for metadata curation, multistep analyses, and results management and downloading. The OMMS can be implemented as a stand-alone package for individual laboratories, or can be configured for web-based deployment supporting geographically dispersed projects. The OMMS was developed using an open-source software base, is flexible, extensible and easily installed and executed. The OMMS can be obtained at http://omms.sandia.gov.
Sharma, Amit K; Gohel, Sangeeta; Singh, Satya P
2012-01-01
Actinobase is a relational database of molecular diversity, phylogeny and biocatalytic potential of haloalkaliphilic actinomycetes. The main objective of this database is to provide easy access to a range of information, data storage, comparison and analysis, while reducing data redundancy and the costs of data entry, storage and retrieval, and improving data security. Information related to habitat, cell morphology, Gram reaction, biochemical characterization and molecular features would help researchers understand the identification and stress adaptation of existing and new candidates among the salt-tolerant alkaliphilic actinomycetes. The PHP front end supports the addition of nucleotide and protein sequences of reported entries, which directly helps researchers obtain the required details. Genus-wise analysis of the salt-tolerant alkaliphilic actinomycetes indicated 6 different genera among the 40 classified entries. The results indicate the widespread occurrence of salt-tolerant alkaliphilic actinomycetes belonging to diverse taxonomic positions. Entries and information related to actinomycetes in the database are publicly accessible at http://www.actinobase.in. ClustalW/X multiple sequence alignment of the alkaline protease gene sequences revealed different clusters among the groups. The narrow search and limit options of the constructed database provided comparable information. The user-friendly PHP front end facilitates the addition of sequences of reported entries. The database is available for free at http://www.actinobase.in.
Omics Metadata Management Software (OMMS)
Perez-Arriaga, Martha O; Wilson, Susan; Williams, Kelly P; Schoeniger, Joseph; Waymire, Russel L; Powell, Amy Jo
2015-01-01
Next-generation sequencing projects have underappreciated information management tasks requiring detailed attention to specimen curation, nucleic acid sample preparation and sequence production methods required for downstream data processing, comparison, interpretation, sharing and reuse. The few existing metadata management tools for genome-based studies provide weak curatorial frameworks for experimentalists to store and manage idiosyncratic, project-specific information, typically offering no automation supporting unified naming and numbering conventions for sequencing production environments that routinely deal with hundreds, if not thousands, of samples at a time. Moreover, existing tools are not readily interfaced with bioinformatics executables (e.g., BLAST, Bowtie2, custom pipelines). Our application, the Omics Metadata Management Software (OMMS), answers both needs, empowering experimentalists to generate intuitive, consistent metadata, and perform analyses and information management tasks via an intuitive web-based interface. Several use cases with short-read sequence datasets are provided to validate installation and integrated function, and suggest possible methodological road maps for prospective users. Provided examples highlight possible OMMS workflows for metadata curation, multistep analyses, and results management and downloading. The OMMS can be implemented as a stand-alone package for individual laboratories, or can be configured for web-based deployment supporting geographically dispersed projects. The OMMS was developed using an open-source software base, is flexible, extensible and easily installed and executed. The OMMS can be obtained at http://omms.sandia.gov. PMID:26124554
Marzocchi, W.; Vilardo, G.; Hill, D.P.; Ricciardi, G.P.; Ricco, C.
2001-01-01
We analyzed and compared the seismic activity that has occurred in the last two to three decades in three distinct volcanic areas: Phlegraean Fields, Italy; Vesuvius, Italy; and Long Valley, California. Our main goal is to identify and discuss common features and peculiarities in the temporal evolution of earthquake sequences that may reflect similarities and differences in the generating processes between these volcanic systems. In particular, we tried to characterize the time series of the number of events and of the seismic energy release in terms of stochastic, deterministic, and chaotic components. The time sequences from each area consist of thousands of earthquakes that allow a detailed quantitative analysis and comparison. The results obtained showed no evidence for either deterministic or chaotic components in the earthquake sequences in Long Valley caldera, which appears to be dominated by stochastic behavior. In contrast, earthquake sequences at Phlegraean Fields and Mount Vesuvius show a deterministic signal mainly consisting of a 24-hour periodicity. Our analysis suggests that the modulation in seismicity is in some way related to thermal diurnal processes, rather than luni-solar tidal effects. Independently of the process that generates these periodicities in the seismicity, it is suggested that the lack (or presence) of diurnal cycles in seismic swarms of volcanic areas could be closely linked to the presence (or lack) of magma motion.
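A common way to test an event catalogue for a fixed periodicity, such as the 24-hour signal described above, is the Schuster test. The abstract does not say which test the authors applied, so the sketch below is an assumed stand-in; the function name and the toy event times are hypothetical.

```python
import math

def schuster_p_value(event_times_h, period_h=24.0):
    """Schuster test: probability that the phase clustering seen at
    `period_h` would arise from events occurring at random times.
    The approximation p = exp(-R^2 / N) assumes a reasonably large N."""
    n = len(event_times_h)
    if n == 0:
        raise ValueError("need at least one event time")
    phases = [2.0 * math.pi * (t % period_h) / period_h for t in event_times_h]
    c = sum(math.cos(p) for p in phases)
    s = sum(math.sin(p) for p in phases)
    return math.exp(-(c * c + s * s) / n)

# Toy catalogues (hypothetical): one event per day at 03:00, and one event per hour.
clustered = [24.0 * day + 3.0 for day in range(50)]
uniform = [float(hour) for hour in range(24)]
print(schuster_p_value(clustered), schuster_p_value(uniform))
```

A small p-value indicates that event phases cluster at the tested period; it says nothing about whether the cause is thermal, tidal, or an artifact such as day/night changes in detection threshold.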
Kaplan, J B; Merkel, W K; Nichols, B P
1985-06-05
The amide group of glutamine is a source of nitrogen in the biosynthesis of a variety of compounds. These reactions are catalyzed by a group of enzymes known as glutamine amidotransferases; two of these, the glutamine amidotransferase subunits of p-aminobenzoate synthase and anthranilate synthase have been studied in detail and have been shown to be structurally and functionally related. In some micro-organisms, p-aminobenzoate synthase and anthranilate synthase share a common glutamine amidotransferase subunit. We report here the primary DNA and deduced amino acid sequences of the p-aminobenzoate synthase glutamine amidotransferase subunits from Salmonella typhimurium, Klebsiella aerogenes and Serratia marcescens. A comparison of these glutamine amidotransferase sequences to the sequences of ten others, including some that function specifically in either the p-aminobenzoate synthase or anthranilate synthase complexes and some that are shared by both synthase complexes, has revealed several interesting features of the structure and organization of these genes, and has allowed us to speculate as to the evolutionary history of this family of enzymes. We propose a model for the evolution of the p-aminobenzoate synthase and anthranilate synthase glutamine amidotransferase subunits in which the duplication and subsequent divergence of the genetic information encoding a shared glutamine amidotransferase subunit led to the evolution of two new pathway-specific enzymes.
BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons
2011-01-01
Background: Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. Results: BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically.
Conclusions: There is a clear need for a user-friendly program that can produce genome comparisons for a large number of prokaryote genomes with an emphasis on rapidly utilising unfinished or unassembled genome data. Here we present BRIG, a cross-platform application that enables the interactive generation of comparative genomic images via a simple graphical-user interface. BRIG is freely available for all operating systems at http://sourceforge.net/projects/brig/. PMID:21824423
BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons.
Alikhan, Nabil-Fareed; Petty, Nicola K; Ben Zakour, Nouri L; Beatson, Scott A
2011-08-08
Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically. 
There is a clear need for a user-friendly program that can produce genome comparisons for a large number of prokaryote genomes with an emphasis on rapidly utilising unfinished or unassembled genome data. Here we present BRIG, a cross-platform application that enables the interactive generation of comparative genomic images via a simple graphical-user interface. BRIG is freely available for all operating systems at http://sourceforge.net/projects/brig/.
Color-magnitude diagram of Palomar 4 - CCD photometry
NASA Astrophysics Data System (ADS)
Christian, C. A.; Heasley, J. N.
1986-04-01
Photometry of the globular cluster Pal 4 was obtained with the RCA CCD camera on the 3.6 m Canada-France-Hawaii Telescope on Mauna Kea. The color-magnitude diagram of the cluster shows a well-defined red horizontal branch, typical of outer halo systems, and an asymptotic giant branch well separated from the giant branch. The population of Pal 4 has been sampled to the main-sequence turnoff region (V = 25), allowing a detailed comparison of this distant object with theoretical models. The cluster parameters consistent with the CCD data are (m - M)_0 = 20.1 ± 0.1 mag, E(B - V) = 0.02 ± 0.02, and [Fe/H] = -1.7 ± 0.1 with Y = 0.2. The age of the cluster, determined by comparison with the isochrones of VandenBerg and Bell (1985), is 15 ± 1 Gyr, similar to inner halo globular clusters with ages determined in the same way.
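The true distance modulus quoted above converts to a distance through the standard relation d = 10^((m - M)_0 / 5 + 1) pc. The short sketch below applies that relation to the quoted value and its uncertainty; the function name is illustrative and not taken from the paper.

```python
def distance_pc(true_distance_modulus: float) -> float:
    """Distance in parsecs from an extinction-corrected distance modulus (m - M)_0."""
    return 10.0 ** (true_distance_modulus / 5.0 + 1.0)

# Value and uncertainty quoted in the abstract: (m - M)_0 = 20.1 +/- 0.1 mag
central = distance_pc(20.1)
low, high = distance_pc(20.0), distance_pc(20.2)
print(f"{central / 1e3:.0f} kpc (range {low / 1e3:.0f}-{high / 1e3:.0f} kpc)")
# prints "105 kpc (range 100-110 kpc)"
```

At roughly 100 kpc the cluster lies well into the outer halo, which is in line with the abstract's description of Pal 4 as a distant object.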
DNA Polymorphism: A Comparison of Force Fields for Nucleic Acids
Reddy, Swarnalatha Y.; Leclerc, Fabrice; Karplus, Martin
2003-01-01
The improvements of the force fields and the more accurate treatment of long-range interactions are providing more reliable molecular dynamics simulations of nucleic acids. The abilities of certain nucleic acid force fields to represent the structural and conformational properties of nucleic acids in solution are compared. The force fields are AMBER 4.1, BMS, CHARMM22, and CHARMM27; the comparison of the latter two is the primary focus of this paper. The performance of each force field is evaluated first on its ability to reproduce the B-DNA decamer d(CGATTAATCG)2 in solution with simulations in which the long-range electrostatics were treated by the particle mesh Ewald method; the crystal structure determined by Quintana et al. (1992) is used as the starting point for all simulations. A detailed analysis of the structural and solvation properties shows how well the different force fields can reproduce sequence-specific features. The results are compared with data from experimental and previous theoretical studies. PMID:12609851
Comparative genomics of Lactobacillus
Kant, Ravi; Blom, Jochen; Palva, Airi; Siezen, Roland J.; de Vos, Willem M.
2011-01-01
Summary: The genus Lactobacillus includes a diverse group of bacteria consisting of many species that are associated with fermentations of plants, meat or milk. In addition, various lactobacilli are natural inhabitants of the intestinal tract of humans and other animals. Finally, several Lactobacillus strains are marketed as probiotics as their consumption can confer a health benefit to the host. Presently, 154 Lactobacillus species are known and a growing fraction of these are subject to draft genome sequencing. However, complete genome sequences are needed to provide a platform for detailed genomic comparisons. Therefore, we selected a total of 20 genomes of various Lactobacillus strains for which complete genomic sequences have been reported. These genomes had sizes varying from 1.8 to 3.3 Mb and other characteristic features, such as G+C content that ranged from 33% to 51%. The Lactobacillus pan genome was found to consist of approximately 14 000 protein‐encoding genes while all 20 genomes shared a total of 383 sets of orthologous genes that defined the Lactobacillus core genome (LCG). Based on advanced phylogeny of the proteins encoded by this LCG, we grouped the 20 strains into three main groups and defined core group genes present in all genomes of a single group, signature group genes shared in all genomes of one group but absent in all other Lactobacillus genomes, and Group‐specific ORFans present in core group genes of one group and absent in all other complete genomes. The latter are of specific value in defining the different groups of genomes. The study provides a platform for present individual comparisons as well as future analysis of new Lactobacillus genomes. PMID:21375712
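The pan-genome and core-genome notions used above reduce to a set union and a set intersection once genes have been assigned to orthologous groups. The sketch below assumes that assignment has already been done (in practice it is the difficult step) and uses hypothetical strain names and group IDs.

```python
def pan_and_core(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (pan genome, core genome) from per-genome sets of ortholog-group IDs."""
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)  # groups found in at least one genome
    core = set.intersection(*gene_sets) if gene_sets else set()  # groups found in every genome
    return pan, core

# Toy input (hypothetical): three strains, five orthologous groups in total.
toy = {
    "strain_A": {"og1", "og2", "og3"},
    "strain_B": {"og1", "og2", "og4"},
    "strain_C": {"og1", "og2", "og5"},
}
pan, core = pan_and_core(toy)
print(len(pan), sorted(core))  # prints: 5 ['og1', 'og2']
```

The same intersection restricted to the genomes of a single phylogenetic group would give that group's core genes in the sense described in the abstract.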
NASA Astrophysics Data System (ADS)
Gao, Jie; Jiang, Li-Li; Xu, Zhen-Yuan
2009-10-01
A new chaos game representation of protein sequences based on the detailed hydrophobic-hydrophilic (HP) model has been proposed by Yu et al (Physica A 337 (2004) 171). A CGR-walk model is proposed based on the new CGR coordinates for the protein sequences from complete genomes in the present paper. The new CGR coordinates based on the detailed HP model are converted into a time series, and a long-memory ARFIMA(p, d, q) model is introduced into the protein sequence analysis. This model is applied to simulating real CGR-walk sequence data of twelve protein sequences. Remarkably long-range correlations are uncovered in the data and the results obtained from these models are reasonably consistent with those available from the ARFIMA(p, d, q) model.
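A chaos game representation places each residue by moving halfway from the current point toward the corner of the unit square assigned to that residue's class. The sketch below illustrates the idea only: the four-class grouping and the corner assignment are assumptions made for this example, and the definitions in Yu et al. (2004) should be consulted for the ones the paper actually uses. The conversion to a CGR-walk time series and the ARFIMA fit are not shown.

```python
# Assumed four-class grouping of the 20 standard amino acids (illustrative only).
CLASSES = {
    "nonpolar": "AILMFPWV",
    "negative": "DE",
    "uncharged": "NCQGSTY",
    "positive": "RHK",
}
# Assumed assignment of classes to corners of the unit square.
CORNERS = {
    "nonpolar": (0.0, 0.0),
    "negative": (0.0, 1.0),
    "uncharged": (1.0, 1.0),
    "positive": (1.0, 0.0),
}
RESIDUE_CORNER = {
    aa: CORNERS[name] for name, members in CLASSES.items() for aa in members
}

def cgr_points(protein: str) -> list[tuple[float, float]]:
    """Chaos game: start at the centre and move halfway to each residue's corner."""
    x, y = 0.5, 0.5
    points = []
    for aa in protein.upper():
        cx, cy = RESIDUE_CORNER[aa]  # raises KeyError for non-standard residues
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points

print(cgr_points("MKD"))
# prints [(0.25, 0.25), (0.625, 0.125), (0.3125, 0.5625)]
```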
K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.
Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue
2018-05-15
Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared with other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length, or increasing dataset sizes. The K2 and K2* approaches are implemented in the R language as a package that is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). Contact: yueljiang@163.com. Supplementary data are available at Bioinformatics online.
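To make the general idea concrete, the sketch below counts k-mers in two DNA sequences and compares the two count vectors with a plain Kendall tau-a rank correlation. This is not the K2 or K2* statistic from the paper, whose exact definition is not given in the abstract; the function names, the choice of tau-a, and the toy sequences are all assumptions for illustration.

```python
from collections import Counter
from itertools import combinations, product

def kmer_counts(seq: str, k: int) -> list[int]:
    """Counts of every DNA k-mer, in a fixed lexicographic order over ACGT."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(word)] for word in product("ACGT", repeat=k)]

def kendall_tau_a(x: list[int], y: list[int]) -> float:
    """Kendall tau-a: (concordant - discordant) / total pairs; tied pairs count as neither."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least two entries")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) // 2)

# Toy sequences (hypothetical); a higher tau means more similar k-mer rank profiles.
a = "ACGTACGTTGCAACGT"
b = "ACGTACGATGCAACGA"
print(kendall_tau_a(kmer_counts(a, 2), kmer_counts(b, 2)))
```

This pairwise loop is quadratic in the number of k-mers, so it is only suitable for small k; it is meant to show the rank-correlation idea, not to reproduce the speed reported for K2.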
Dynamics of actin evolution in dinoflagellates.
Kim, Sunju; Bachvaroff, Tsvetan R; Handy, Sara M; Delwiche, Charles F
2011-04-01
Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provide broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' untranslated region (UTR), although there was still limited amino acid variation between most sequences.
Several potential pseudogenes were found (approximately 10% of all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that actin may be too conserved to be useful for phylogenetic estimation of closely related species.
Heye, Tobias; Sommer, Gregor; Miedinger, David; Bremerich, Jens; Bieri, Oliver
2015-09-01
To evaluate the anatomical details offered by a new single breath-hold ultrafast 3D balanced steady-state free precession (uf-bSSFP) sequence in comparison to low-dose chest computed tomography (CT). This was an Institutional Review Board (IRB)-approved, Health Insurance Portability and Accountability Act (HIPAA)-compliant prospective study. A total of 20 consecutive patients enrolled in a lung cancer screening trial underwent same-day low-dose chest CT and 1.5T MRI. The presence of pulmonary nodules and anatomical details on 1.9 mm isotropic uf-bSSFP images was compared to 2 mm lung window reconstructions by two readers. The number of branching points on six predefined pulmonary arteries and the distance between the most peripheral visible vessel segment to the pleural surface on thin slices and 50 mm maximum intensity projections (MIP) were assessed. Image quality and sharpness of the pulmonary vasculature were rated on a 5-point scale. The uf-bSSFP detection rate of pulmonary nodules (32 nodules visible on CT and MRI, median diameter 3.9 mm) was 45.5% with 21 false-positive findings (pooled data of both readers). Uf-bSSFP detected 71.2% of branching points visible on CT data. The mean distance between peripheral vasculature and pleural surface was 13.0 ± 4.2 mm (MRI) versus 8.5 ± 3.3 mm (CT) on thin slices and 8.6 ± 3.9 mm (MRI) versus 4.6 ± 2.5 mm (CT) on MIPs. Median image quality and sharpness were rated 4 each. Although CT is superior to MRI, uf-bSSFP imaging provides good anatomical details with sufficient image quality and sharpness obtainable in a single breath-hold covering the entire chest. © 2014 Wiley Periodicals, Inc.
Atomic Decay Data for Modeling K Lines of Iron Peak and Light Odd-Z Elements*
NASA Technical Reports Server (NTRS)
Palmeri, P.; Quinet, P.; Mendoza, C.; Bautista, M. A.; Garcia, J.; Witthoeft, M. C.; Kallman, T. R.
2012-01-01
Complete data sets of level energies, transition wavelengths, A-values, radiative and Auger widths and fluorescence yields for K-vacancy levels of the F, Na, P, Cl, K, Sc, Ti, V, Cr, Mn, Co, Cu and Zn isonuclear sequences have been computed by a Hartree-Fock method that includes relativistic corrections as implemented in Cowan's atomic structure computer suite. The atomic parameters for more than 3 million fine-structure K lines have been determined. Ions with electron number N greater than 9 are treated for the first time, and detailed comparisons with available measurements and theoretical data for ions with N less than or equal to 9 are carried out in order to estimate reliable accuracy ratings.
Deep HST Photometry of NGC 6388: Age and Horizontal Branch Luminosity
NASA Technical Reports Server (NTRS)
Stetson, Peter B.; Catelan, M.; Pritzl, Barton J.; Smith, Horace A.; Kinemuchi, Karen; Layden, Andrew C.; Sweigart, Allen V.; Rich, R. M.
2006-01-01
We present the first deep color-magnitude diagram (CMD) of the Galactic globular cluster NGC 6388, obtained with the Hubble Space Telescope, that is able to reach the main-sequence turnoff point of the cluster. From a detailed comparison between the cluster CMD and that of 47 Tucanae (NGC 104), we find that the bulk of the stars in these two clusters have nearly the same age and chemical composition. On the other hand, our results indicate that the blue horizontal branch and RR Lyrae components in NGC 6388 are intrinsically over-luminous, which must be due to one or more, still undetermined, non-canonical second parameter(s) affecting a relatively minor fraction of the stars in NGC 6388.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy
Phylogenetic analyses were done for the Shewanella strains isolated from the Baltic Sea (38 strains), the US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of the 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of the Shewanella genome selected from a shared gene list of the Shewanella strains with whole genomes sequenced, based on their average nucleotide identity (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at the species and strain level within the Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WCGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and that it is superior to phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.
Barakat, Mohamed; Ortet, Philippe; Whitworth, David E
2013-04-20
Regulatory proteins (RPs) such as transcription factors (TFs) and two-component system (TCS) proteins control how prokaryotic cells respond to changes in their external and/or internal state. Identification and annotation of TFs and TCSs is non-trivial, and between-genome comparisons are often confounded by different standards in annotation. There is a need for user-friendly, fast and convenient tools to allow researchers to overcome the inherent variability in annotation between genome sequences. We have developed the web-server P2RP (Predicted Prokaryotic Regulatory Proteins), which enables users to identify and annotate TFs and TCS proteins within their sequences of interest. Users can input amino acid or genomic DNA sequences, and predicted proteins therein are scanned for the possession of DNA-binding domains and/or TCS domains. RPs identified in this manner are categorised into families, unambiguously annotated, and a detailed description of their features generated, using an integrated software pipeline. P2RP results can then be outputted in user-specified formats. Biologists have an increasing need for fast and intuitively usable tools, which is why P2RP has been developed as an interactive system. As well as assisting experimental biologists to interrogate novel sequence data, it is hoped that P2RP will be built into genome annotation pipelines and re-annotation processes, to increase the consistency of RP annotation in public genomic sequences. P2RP is the first publicly available tool for predicting and analysing RP proteins in users' sequences. The server is freely available and can be accessed along with documentation at http://www.p2rp.org.
Ravi, Anuradha; Avershina, Ekaterina; Angell, Inga Leena; Ludvigsen, Jane; Manohar, Prasanth; Padmanaban, Sumathi; Nachimuthu, Ramesh; Snipen, Lars; Rudi, Knut
2018-06-01
Use of the 16S rRNA gene in microbiota studies is limited by the lack of taxonomic and functional resolution. High resolution analyses are particularly important for understanding transmission and persistence of bacteria. The aim of our work was therefore to compare a novel reduced metagenome sequencing (RMS) approach with 16S rRNA gene sequencing to determine both the metagenome genetic diversity and the mother-to-child sharing of the microbiota in a cohort of 17 mother-child pairs. We found that although both approaches gave comparable results with respect to sample separation and taxonomy, RMS gave higher resolution and the potential for genomic-/functional assignment. Using RMS we estimated that the metagenome size increased from about 60 Mbp for 4-day-old children to about 225 Mbp for mothers. The 4-day-old children shared 7% of the metagenome sequences with the mothers, while the metagenome sequence sharing was >30% among the mothers. We found 15 genomes shared across >50% of the mothers, of which 10 belonged to Clostridia. Only Bacteroides showed a direct mother-child association, with B. vulgatus being abundant in both 4-day-old children and mothers. For the functional assignments, we identified a significant association between antibiotic usage during labor, and quantity of Fosfomycin resistance genes. In conclusion, our results show a higher functional and taxonomic resolution for RMS compared to 16S rRNA gene sequencing, where RMS enabled a detailed description of mother to child gut microbiota transmission - supporting a late recruitment of most gut bacteria and an effect of antibiotic treatment during labor on infant antibiotic resistance gene patterns. Copyright © 2018. Published by Elsevier B.V.
Comparison of the theoretical and real-world evolutionary potential of a genetic circuit
NASA Astrophysics Data System (ADS)
Razo-Mejia, M.; Boedicker, J. Q.; Jones, D.; DeLuna, A.; Kinney, J. B.; Phillips, R.
2014-04-01
With the development of next-generation sequencing technologies, many large scale experimental efforts aim to map genotypic variability among individuals. This natural variability in populations fuels many fundamental biological processes, ranging from evolutionary adaptation and speciation to the spread of genetic diseases and drug resistance. An interesting and important component of this variability is present within the regulatory regions of genes. As these regions evolve, accumulated mutations lead to modulation of gene expression, which may have consequences for the phenotype. A simple model system where the link between genetic variability, gene regulation and function can be studied in detail is missing. In this article we develop a model to explore how the sequence of the wild-type lac promoter dictates the fold-change in gene expression. The model combines single-base pair resolution maps of transcription factor and RNA polymerase binding energies with a comprehensive thermodynamic model of gene regulation. The model was validated by predicting and then measuring the variability of lac operon regulation in a collection of natural isolates. We then implement the model to analyze the sensitivity of the promoter sequence to the regulatory output, and predict the potential for regulation to evolve due to point mutations in the promoter region.
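As a rough illustration of the kind of thermodynamic model this abstract refers to, the sketch below evaluates the textbook simple-repression expression, fold-change = 1 / (1 + (R / N_NS) exp(-Δε / kT)). This is an assumption made for illustration only: the abstract does not state the model's equations, and the full lac promoter model also accounts for activation and RNA polymerase binding energies. The function name and all parameter values are hypothetical.

```python
import math


def fold_change_simple_repression(repressors, binding_energy_kt,
                                  nonspecific_sites=4.6e6):
    """Fold-change in expression under simple repression (weak-promoter limit).

    repressors: repressor copy number per cell (hypothetical value).
    binding_energy_kt: repressor-operator binding energy in units of kT;
        more negative means tighter binding.
    nonspecific_sites: number of nonspecific binding sites, taken here as
        roughly the E. coli genome length in bp (an assumption).
    """
    weight = (repressors / nonspecific_sites) * math.exp(-binding_energy_kt)
    return 1.0 / (1.0 + weight)


# Tighter operator binding (more negative energy) lowers the fold-change.
for energy in (-10.0, -14.0, -17.0):
    fc = fold_change_simple_repression(repressors=10, binding_energy_kt=energy)
    print(f"binding energy {energy:6.1f} kT -> fold-change {fc:.4f}")
```

In a sequence-to-expression setting, a point mutation in the operator would change `binding_energy_kt`, and the function then maps that change to a predicted regulatory output.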
Eoff, Jennifer D.
2014-01-01
New data from detailed measured sections permit comprehensive analysis of the sequence framework of the Furongian (Upper Cambrian; Jiangshanian and Sunwaptan stages) Tunnel City Group (Lone Rock Formation and Mazomanie Formation) of Wisconsin and Minnesota. The sequence-stratigraphic architecture of the lower part of the Sunwaptan Stage at the base of the Tunnel City Group, at the contact between the Wonewoc Formation and Lone Rock Formation, records the first part of complex polyphase flooding (Sauk III) of the Laurentian craton, at a scale smaller than most events recorded by global sea-level curves. Flat-pebble conglomerate and glauconite document transgressive ravinement and development of a condensed section when creation of accommodation exceeded its consumption by sedimentation. Thinly-bedded, fossiliferous sandstone represents the most distal setting during earliest highstand. Subsequent deposition of sandstone characterized by hummocky or trough cross-stratification records progradational pulses of shallower, storm- and wave-dominated environments across the craton before final flooding of Sauk III commenced with carbonate deposition during the middle part of the Sunwaptan Stage. Comparison of early Sunwaptan flooding of the inner Laurentian craton to published interpretations from other parts of North America suggests that Sauk III was not a single, long-term accommodation event as previously proposed.
Rodriguez Parkitna, Jan M; Ozyhar, Andrzej; Wiśniewski, Jacek R; Kochman, Marian
2002-09-01
Juvenile hormone binding proteins (JHBPs) serve as specific carriers of juvenile hormone (JH) in insect hemolymph. As shown in this report, Galleria mellonella JHBP is encoded by a cDNA of 1063 nucleotides. The pre-protein consists of 245 amino acids with a 20 amino acid leader sequence. The concentration of the JHBP mRNA reaches a maximum on the third day of the last larval instar, and decreases five-fold towards pupation. Comparison of amino acid sequences of JHBPs from Bombyx mori, Heliothis virescens, Manduca sexta and G. mellonella shows that 57 positions out of 226 are occupied by identical amino acids. A phylogenetic tree was constructed from 32 proteins whose function could be associated with JH. It has three major branches: (i) ligand binding domains of nuclear receptors, (ii) JHBPs and JH esterases (JHEs), and (iii) hypothetical proteins found in the Drosophila melanogaster genome. Despite the close positioning of JHEs and JHBPs on the tree, which probably arises from the presence of a common JH binding motif, these proteins are unlikely to belong to the same family. Detailed analysis of the secondary structure modeling shows that JHBPs may contain a beta-barrel motif flanked by alpha-helices and thus be evolutionarily related to the same superfamily as calycins.
Solis, Armando D
2014-01-01
The most informative probability distribution functions (PDFs) describing the Ramachandran phi-psi dihedral angle pair, a fundamental descriptor of backbone conformation of protein molecules, are derived from high-resolution X-ray crystal structures using an information-theoretic approach. The Information Maximization Device (IMD) is established, based on fundamental information-theoretic concepts, and then applied specifically to derive highly resolved phi-psi maps for all 20 single amino acid and all 8000 triplet sequences at an optimal resolution determined by the volume of current data. The paper shows that utilizing the latent information contained in all viable high-resolution crystal structures found in the Protein Data Bank (PDB), totaling more than 77,000 chains, permits the derivation of a large number of optimized sequence-dependent PDFs. This work demonstrates the effectiveness of the IMD and the superiority of the resulting PDFs by extensive fold recognition experiments and rigorous comparisons with previously published triplet PDFs. Because it automatically optimizes PDFs, IMD results in improved performance of knowledge-based potentials, which rely on such PDFs. Furthermore, it provides an easy computational recipe for empirically deriving other kinds of sequence-dependent structural PDFs with greater detail and precision. The high-resolution phi-psi maps derived in this work are available for download.
Timsit, Youri; Bombard, Sophie
2007-12-01
Metal ions play a key role in RNA folding and activity. Elucidating the rules that govern the binding of metal ions is therefore an essential step towards a better understanding of RNA function. High-resolution data are a prerequisite for a detailed structural analysis of ion binding on RNA and, in particular, the observation of monovalent cations. Here, the high-resolution crystal structures of the tridecamer duplex r(GCGUUUGAAACGC) crystallized under different conditions provide new structural insights into ion binding on GAAA/UUU sequences, which exhibit both unusual structural and functional properties in RNA. The present study extends the repertory of RNA ion binding sites by showing that the first two bases of UUU triplets constitute a specific site for sodium ions. A striking asymmetric pattern of metal ion binding in the two equivalent halves of the palindromic sequence demonstrates that sequence and its environment act together to bind metal ions. A highly ionophilic half that binds six metal ions allows, for the first time, the observation of a disodium cluster in RNA. The comparison of the equivalent halves of the duplex provides experimental evidence that ion binding correlates with structural alterations and groove contraction.
Korber, B T; Osmanov, S; Esparza, J; Myers, G
1994-11-01
The World Health Organization Global Programme on AIDS (WHO/GPA) is conducting a large-scale collaborative study of human immunodeficiency virus type 1 (HIV-1) variation, based in four potential vaccine-trial site countries: Brazil, Rwanda, Thailand, and Uganda. Through the course of this study, it was crucial to keep track of certain attributes of the samples from which the viral nucleotide sequences were derived (e.g., country of origin and viral culture characterization), so that meaningful sequence comparisons could be made. Here we describe a system developed in the context of the WHO/GPA study that summarizes such critical attributes by representing them as standardized characters directly incorporated into sequence names. This nomenclature allows linkage of clinical, phenotypic, and geographic information with molecular data. We propose that other investigators involved in human immunodeficiency virus (HIV) nucleotide sequencing efforts adopt a similar standardized sequence nomenclature to facilitate cross-study sequence comparison. HIV sequence data are being generated at an ever-increasing rate; directly coupled to this increase is our deepening understanding of biological parameters that influence or result from sequence variability. A standardized sequence nomenclature that includes relevant biological information would enable researchers to better utilize the growing body of sequence data, and enhance their ability to interpret the biological implications of their own data through facilitating comparisons with previously published work.
Dai, Qi; Yang, Yanchun; Wang, Tianming
2008-10-15
Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all of these approaches to sequence comparison use information on k-word distributions, a Markov model, or both. Motivated by adding k-word distributions to the Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers a systematic and quantitative experimental assessment of our measures. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating characteristic) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences from a database and discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measures perform in phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into the Markov model, are more efficient.
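To make the notion of comparing sequences through k-word distributions concrete, here is a minimal sketch of the classic D2 statistic, the inner product of two k-word count vectors. It is not an implementation of wre.k.r or S2.k.r, whose definitions are not given in the abstract; the function names are hypothetical.

```python
from collections import Counter


def kword_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """D2 statistic: sum over k-words of count_a(word) * count_b(word)."""
    counts_a = kword_counts(seq_a, k)
    counts_b = kword_counts(seq_b, k)
    # Counter returns 0 for words absent from counts_b.
    return sum(n * counts_b[word] for word, n in counts_a.items())


print(d2("ACGTACGT", "ACGTTTTT", 2))
print(d2("ACGTACGT", "GGGGGGGG", 2))
```

A larger D2 value indicates more shared k-words; measures that add a Markov model typically correct these raw counts for the word frequencies expected by chance.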
Lelliottia aquatilis sp. nov., isolated from drinking water.
Kämpfer, Peter; Glaeser, Stefanie P; Packroff, Gabriele; Behringer, Katja; Exner, Martin; Chakraborty, Trinad; Schmithausen, Ricarda M; Doijad, Swapnil
2018-06-22
Five beige-pigmented, oxidase-negative bacterial isolates, 6331-17 T , 6332-17, 6333-17, 6334-17 and 9827-07, isolated either from a drinking water storage reservoir or drinking water in 2006 and 2017 in Germany, were examined in detail by applying a polyphasic taxonomic approach. Cells of the isolates were rod-shaped and Gram-stain-negative. Comparison of the 16S rRNA gene sequences of these five isolates showed highest sequence similarities to Lelliottia amnigena (99.98 %) and Lelliottia nimipressuralis (99.99 %). Multilocus sequence analyses based on concatenated partial rpoB, gyrB, infB and atpD sequences confirmed the clustering of these isolates with Lelliottia species, but also revealed a clear distinction from the most closely related type strains. Analysis of the genome sequences of these isolates indicated >70 % in silico DNA-DNA hybridization and high average nucleotide identities between strains. Nevertheless, they showed only <70 and <95 % similarity to the type strains of these two Lelliottia species. The fatty acid profiles of these isolates were very similar and consisted of the major fatty acids C16:0, C17 : 0cyclo, C15 : 0iso 2-OH/C16 : 1ω7c and C18 : 1ω7c. In addition, physiological/biochemical tests revealed high phenotypic similarity among the isolates. These cumulative data indicate that these isolates represent a novel Lelliottia species, for which the name Lelliottia aquatilis sp. nov. is proposed, with strain 6331-17 T (=CCM 8846 T =CIP 111609 T =LMG 30560 T ) as the type strain.
Oono, Ryoko
2017-01-01
High-throughput sequencing technology has helped microbial community ecologists explore ecological and evolutionary patterns at unprecedented scales. The benefits of a large sample size still typically outweigh those of greater sequencing depths per sample for accurate estimations of ecological inferences. However, excluding or not sequencing rare taxa may lead to misleading answers to the questions ‘how and why are communities different?’ This study evaluates the confidence intervals of ecological inferences from high-throughput sequencing data of foliar fungal endophytes as case studies through a range of sampling efforts, sequencing depths, and taxonomic resolutions to understand how technical and analytical practices may affect our interpretations. Increasing sampling size reliably decreased confidence intervals across multiple community comparisons. However, the effects of sequencing depths on confidence intervals depended on how rare taxa influenced the dissimilarity estimates among communities and did not significantly decrease confidence intervals for all community comparisons. A comparison of simulated communities under random drift suggests that sequencing depths are important in estimating dissimilarities between microbial communities under neutral selective processes. Confidence interval analyses reveal important biases as well as biological trends in microbial community studies that otherwise may be ignored when communities are only compared for statistically significant differences. PMID:29253889
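One way to see how sampling effort affects the confidence interval of a dissimilarity estimate is a percentile bootstrap over samples. The sketch below is an illustrative assumption, not the procedure used in the study (the abstract does not say how its intervals were computed). It uses Bray-Curtis dissimilarity between the mean taxon profiles of two groups; all function names and the toy count data are hypothetical.

```python
import random


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def mean_profile(samples):
    """Average abundance of each taxon across samples."""
    return [sum(col) / len(samples) for col in zip(*samples)]


def bootstrap_ci(group_a, group_b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the dissimilarity between group means."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample whole samples with replacement within each group.
        resampled_a = [rng.choice(group_a) for _ in group_a]
        resampled_b = [rng.choice(group_b) for _ in group_b]
        stats.append(bray_curtis(mean_profile(resampled_a),
                                 mean_profile(resampled_b)))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: rows are samples, columns are taxon read counts.
leaves_site_1 = [[30, 5, 0], [25, 8, 1], [28, 4, 0], [35, 6, 2]]
leaves_site_2 = [[10, 20, 3], [12, 18, 5], [8, 25, 2], [11, 22, 4]]
print(bootstrap_ci(leaves_site_1, leaves_site_2))
```

Rerunning the function with more rows per group is a simple way to observe the interval narrowing with sample size.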
Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data.
Favero, F; Joshi, T; Marquard, A M; Birkbak, N J; Krzystanek, M; Li, Q; Szallasi, Z; Eklund, A C
2015-01-01
Exome or whole-genome deep sequencing of tumor DNA along with paired normal DNA can potentially provide a detailed picture of the somatic mutations that characterize the tumor. However, analysis of such sequence data can be complicated by the presence of normal cells in the tumor specimen, by intratumor heterogeneity, and by the sheer size of the raw data. In particular, determination of copy number variations from exome sequencing data alone has proven difficult; thus, single nucleotide polymorphism (SNP) arrays have often been used for this task. Recently, algorithms to estimate absolute, but not allele-specific, copy number profiles from tumor sequencing data have been described. We developed Sequenza, a software package that uses paired tumor-normal DNA sequencing data to estimate tumor cellularity and ploidy, and to calculate allele-specific copy number profiles and mutation profiles. We applied Sequenza, as well as two previously published algorithms, to exome sequence data from 30 tumors from The Cancer Genome Atlas. We assessed the performance of these algorithms by comparing their results with those generated using matched SNP arrays and processed by the allele-specific copy number analysis of tumors (ASCAT) algorithm. Comparison between Sequenza/exome and SNP/ASCAT revealed strong correlation in cellularity (Pearson's r = 0.90) and ploidy estimates (r = 0.42, or r = 0.94 after manually inspecting alternative solutions). This performance was noticeably superior to previously published algorithms. In addition, in artificial data simulating normal-tumor admixtures, Sequenza detected the correct ploidy in samples with tumor content as low as 30%. The agreement between Sequenza and SNP array-based copy number profiles suggests that exome sequencing alone is sufficient not only for identifying small scale mutations but also for estimating cellularity and inferring DNA copy number aberrations. © The Author 2014. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
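The dependence of observable signals on cellularity and ploidy can be illustrated with a commonly used tumor-normal mixture model. The sketch below is a simplified assumption for illustration and is not taken from the Sequenza source code; the function names are hypothetical, and the normal genome is assumed diploid and heterozygous at the SNP of interest.

```python
def expected_depth_ratio(cellularity, ploidy, tumor_cn, normal_cn=2):
    """Expected tumor/normal depth ratio for a segment.

    cellularity: fraction of tumor cells in the specimen (0 to 1).
    ploidy: average tumor copy number across the genome.
    tumor_cn: total copy number of the segment in tumor cells.
    """
    segment_signal = cellularity * tumor_cn + (1 - cellularity) * normal_cn
    genome_average = cellularity * ploidy + (1 - cellularity) * normal_cn
    return segment_signal / genome_average


def expected_baf(cellularity, tumor_cn, minor_cn, normal_cn=2):
    """Expected B-allele frequency at a germline heterozygous SNP.

    minor_cn: copies of the B allele in tumor cells; normal cells are
    assumed to carry exactly one copy of each allele.
    """
    b_copies = cellularity * minor_cn + (1 - cellularity) * 1
    all_copies = cellularity * tumor_cn + (1 - cellularity) * normal_cn
    return b_copies / all_copies


# Copy-neutral loss of heterozygosity in a specimen with 30% tumor content.
print(expected_depth_ratio(cellularity=0.3, ploidy=2, tumor_cn=2))
print(expected_baf(cellularity=0.3, tumor_cn=2, minor_cn=0))
```

Fitting cellularity and ploidy then amounts to searching for the pair of values under which these expected signals best match the observed depth ratios and allele frequencies across all segments.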
NASA Astrophysics Data System (ADS)
Stow, Dorrik A. V.; Shanmugam, Ganapathy
1980-01-01
A comparative study of the sequence of sedimentary structures in ancient and modern fine-grained turbidites is made in three contrasting areas. They are (1) Holocene and Pleistocene deep-sea muds of the Nova Scotian Slope and Rise, (2) Middle Ordovician Sevier Shale of the Valley and Ridge Province of the Southern Appalachians, and (3) Cambro-Ordovician Halifax Slate of the Meguma Group in Nova Scotia. A standard sequence of structures is proposed for fine-grained turbidites. The complete sequence has nine sub-divisions that are here termed T0 to T8. The lower subdivision (T0) comprises a silt lamina which has a sharp, scoured and load-cast base, internal parallel-lamination and cross-lamination, and a sharp current-lineated or wavy surface with 'fading-ripples' (= Type C ripple-drift cross-lamination, Jopling and Walker, 1968). The overlying sequence shows textural and compositional grading through alternating silt and mud laminae. A convolute-laminated sub-division (T1) is overlain by low-amplitude climbing ripples (T2), thin regular laminae (T3), thin indistinct laminae (T4), and thin wispy or convolute laminae (T5). The topmost three divisions, graded mud (T6), ungraded mud (T7) and bioturbated mud (T8), do not have silt laminae but rare patchy silt lenses and silt pseudonodules and a thin zone of micro-burrowing near the upper surface. The proposed sequence is analogous to the Bouma (1962) structural scheme for sandy turbidites and is approximately equivalent to Bouma's (C)DE divisions. The repetition of partial sequences characterizes different parts of the slope/base-of-slope/basin plain environment, and represents deposition from different stages of evolution of a large, muddy, turbidity flow. Microstructural detail and sequence are well preserved in ancient and even slightly metamorphosed sediments. Their recognition is important for determining depositional processes and for palaeoenvironmental interpretation.
Aigrain, Louise; Gu, Yong; Quail, Michael A
2016-06-13
The emergence of next-generation sequencing (NGS) technologies in the past decade has allowed the democratization of DNA sequencing, both in terms of price per sequenced base and ease of producing DNA libraries. When it comes to preparing DNA sequencing libraries for Illumina, the current market leader, a plethora of kits are available and it can be difficult for users to determine which kit is the most appropriate and efficient for their applications; the main concerns being not only cost but also minimal bias, yield and time efficiency. We compared 9 commercially available library preparation kits in a systematic manner using the same DNA sample by probing the amount of DNA remaining after each protocol step using a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. The overall library preparation efficiencies show important variations between the different kits, with those combining several steps into a single one exhibiting final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, certain ligation efficiencies being so low that they could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs lead to greater amplification yields, hiding the latent disparity between kits. We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in the library preparation, ligation, and to draw conclusions on which kits are more likely to preserve the sample heterogeneity and reduce the need for amplification.
Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L
2011-06-02
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.
The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu
NASA Astrophysics Data System (ADS)
Sokolović, I.; Mali, P.; Odavić, J.; Radošević, S.; Medvedeva, S. Yu.; Botha, A. E.; Shukrinov, Yu. M.; Tekić, J.
2017-08-01
The devil's staircase structure arising from the complete mode locking of an entirely nonchaotic system, the overdamped dc+ac driven Frenkel-Kontorova model with deformable substrate potential, was observed. Even though no chaos was found, a hierarchical ordering of the Shapiro steps was made possible through the use of a previously introduced continued fraction formula. The absence of chaos, deduced here from Lyapunov exponent analyses, can be attributed to the overdamped character and the Middleton no-passing rule. A comparative analysis of a one-dimensional stack of Josephson junctions confirmed the disappearance of chaos with increasing dissipation. Other common dynamic features were also identified through this comparison. A detailed analysis of the amplitude dependence of the Shapiro steps revealed that only for the case of a purely sinusoidal substrate potential did the relative sizes of the steps follow a Farey sequence. For nonsinusoidal (deformed) potentials, the symmetry of the Stern-Brocot tree, depicting all members of a particular Farey sequence, was seen to be increasingly broken, with certain steps being more prominent and their relative sizes not following the Farey rule.
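For readers unfamiliar with the Farey ordering mentioned in this abstract, the following sketch generates the Farey sequence of order n (all reduced fractions in [0, 1] with denominator at most n, in increasing order) and the mediant that sits between two neighbours in the Stern-Brocot construction. It illustrates only the number-theoretic ordering, not the Frenkel-Kontorova dynamics; the function names are hypothetical.

```python
from fractions import Fraction


def farey(n):
    """Farey sequence of order n, built with the standard next-term recurrence."""
    a, b, c, d = 0, 1, 1, n
    terms = [Fraction(a, b)]
    while c <= n:
        k = (n + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        terms.append(Fraction(a, b))
    return terms


def mediant(left, right):
    """Mediant (a + c) / (b + d) of two fractions a/b and c/d."""
    return Fraction(left.numerator + right.numerator,
                    left.denominator + right.denominator)


print([str(f) for f in farey(5)])
print(mediant(Fraction(1, 2), Fraction(2, 3)))
```

In the mode-locking picture, the step lying between two neighbouring steps is labelled by their mediant, so in the Farey ordering a larger denominator corresponds to a smaller step.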
Protein classification using modified n-grams and skip-grams.
Islam, S M Ashiqul; Heil, Benjamin J; Kearney, Christopher Michel; Baker, Erich J
2018-05-01
Classification by supervised machine learning greatly facilitates the annotation of protein characteristics from their primary sequence. However, the feature generation step in this process requires detailed knowledge of attributes used to classify the proteins. Lack of this knowledge risks the selection of irrelevant features, resulting in a faulty model. In this study, we introduce a supervised protein classification method with a novel means of automating the work-intensive feature generation step via a Natural Language Processing (NLP)-dependent model, using a modified combination of n-grams and skip-grams (m-NGSG). A meta-comparison of cross-validation accuracy with twelve training datasets from nine different published studies demonstrates a consistent increase in accuracy of m-NGSG when compared to contemporary classification and feature generation models. We expect this model to accelerate the classification of proteins from primary sequence data and increase the accessibility of protein characteristic prediction to a broader range of scientists. m-NGSG is freely available at Bitbucket: https://bitbucket.org/sm_islam/mngsg/src. A web server is available at watson.ecs.baylor.edu/ngsg. Contact: erich_baker@baylor.edu. Supplementary data are available at Bioinformatics online.
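The basic idea of turning a protein sequence into n-gram and skip-gram features can be sketched as follows. This shows only the plain, unmodified features; the specific modifications that define m-NGSG are not described in the abstract and are not reproduced here. The function names and the example peptide are hypothetical.

```python
from collections import Counter


def ngrams(seq, n):
    """All overlapping substrings of length n."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def skipgrams(seq, skip):
    """Residue pairs separated by exactly `skip` positions, written with
    '_' placeholders for the skipped residues (e.g. 'M_V' for skip=1)."""
    return [seq[i] + "_" * skip + seq[i + skip + 1]
            for i in range(len(seq) - skip - 1)]


def sequence_features(seq, max_n=2, max_skip=2):
    """Count n-grams up to max_n and skip-grams up to max_skip."""
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(seq, n))
    for skip in range(1, max_skip + 1):
        features.update(skipgrams(seq, skip))
    return features


print(sequence_features("MKVLAAGIV").most_common(5))
```

The resulting counts can be arranged into a fixed-length vector over a shared vocabulary and passed to any supervised classifier.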
Sequence information signal processor
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1999-01-01
An electronic circuit is used to compare two sequences, such as genetic sequences, to determine which alignment of the sequences produces the greatest similarity. The circuit includes a linear array of series-connected processors, each of which stores a single element from one of the sequences and compares that element with each successive element in the other sequence. For each comparison, the processor generates a scoring parameter that indicates which segment ending at those two elements produces the greatest degree of similarity between the sequences. The processor uses the scoring parameter to generate a similar scoring parameter for a comparison between the stored element and the next successive element from the other sequence. The processor also delivers the scoring parameter to the next processor in the array for use in generating a similar scoring parameter for another pair of elements. The electronic circuit determines which processor and alignment of the sequences produce the scoring parameter with the highest value.
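The per-cell computation described in this abstract resembles a local alignment recurrence of the Smith-Waterman type, in which each cell holds the best score of any segment ending at a given pair of elements. The software sketch below mirrors the data flow: one sequence is held fixed, one element per "processor", while the other is streamed past it. The scoring values, the linear gap penalty and the function name are assumptions; the abstract does not specify the scoring scheme.

```python
def best_local_alignment_score(stored, streamed, match=2, mismatch=-1, gap=-2):
    """Highest local alignment score between two sequences.

    stored: sequence whose elements are held one per processor.
    streamed: sequence whose elements pass each processor in turn.
    """
    previous = [0] * (len(stored) + 1)  # scores for the prior streamed element
    best = 0
    for y in streamed:
        current = [0]
        for j, x in enumerate(stored, start=1):
            diagonal = previous[j - 1] + (match if x == y else mismatch)
            score = max(0,                    # start a new segment here
                        diagonal,             # extend by aligning x with y
                        previous[j] + gap,    # gap in the stored sequence
                        current[j - 1] + gap) # gap in the streamed sequence
            current.append(score)
            best = max(best, score)
        previous = current
    return best


print(best_local_alignment_score("ACGT", "TTACGTTT"))
```

In a hardware array, the inner loop runs in parallel across processors, so the whole comparison takes a number of steps proportional to the sum of the sequence lengths rather than their product.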
Breaking the computational barriers of pairwise genome comparison.
Torreno, Oscar; Trelles, Oswaldo
2015-08-11
Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community. We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, and is comparable to or faster than the current state-of-the-art methods. We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility. Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.
Baghbaderani, Behnam Ahmadian; Syama, Adhikarla; Sivapatham, Renuka; Pei, Ying; Mukherjee, Odity; Fellner, Thomas; Zeng, Xianmin; Rao, Mahendra S
2016-08-01
We have recently described manufacturing of human induced pluripotent stem cells (iPSC) master cell banks (MCB) generated by a clinically compliant process using cord blood as a starting material (Baghbaderani et al. in Stem Cell Reports, 5(4), 647-659, 2015). In this manuscript, we describe the detailed characterization of the two iPSC clones generated using this process, including whole genome sequencing (WGS), microarray, and comparative genomic hybridization (aCGH) single nucleotide polymorphism (SNP) analysis. We compare their profiles with a proposed calibration material and with a reporter subclone and lines made by a similar process from different donors. We believe that iPSCs are likely to be used to make multiple clinical products. We further believe that the lines used as input material will be used at different sites and, given their immortal status, will be used for many years or even decades. Therefore, it will be important to develop assays to monitor the state of the cells and their drift in culture. We suggest that a detailed characterization of the initial status of the cells, a comparison with some calibration material and the development of reporter subclones will help determine which set of tests will be most useful in monitoring the cells and establishing criteria for discarding a line.
Hahnemann, Maria L; Kraff, Oliver; Maderwald, Stefan; Johst, Soeren; Orzada, Stephan; Umutlu, Lale; Ladd, Mark E; Quick, Harald H; Lauenstein, Thomas C
2016-06-01
To perform non-enhanced (NE) magnetic resonance imaging (MRI) of the small bowel at 7 Tesla (7T) and to compare it with 1.5 Tesla (1.5T). Twelve healthy subjects were prospectively examined using a 1.5T and 7T MRI system. Coronal and axial true fast imaging with steady-state precession (TrueFISP) imaging and a coronal T2-weighted (T2w) half-Fourier acquisition single-shot turbo spin-echo (HASTE) sequence were acquired. Image analysis was performed by 1) visual evaluation of tissue contrast and detail detectability, 2) measurement and calculation of contrast ratios and 3) assessment of artifacts. NE MRI of the small bowel at 7T was technically feasible. In the vast majority of the cases, tissue contrast and image details were equivalent at both field strengths. At 7T, two cases revealed better detail detectability in the TrueFISP, and better contrast in the HASTE. Susceptibility artifacts and B1 inhomogeneities were significantly increased at 7T. This study provides first insights into NE ultra-high field MRI of the small bowel and may be considered an important step towards high quality T2w abdominal imaging at 7T MRI. Copyright © 2016 Elsevier Inc. All rights reserved.
Reaction schemes visualized in network form: the syntheses of strychnine as an example.
Proudfoot, John R
2013-05-24
Representation of synthesis sequences in a network form provides an effective method for the comparison of multiple reaction schemes and an opportunity to emphasize features such as reaction scale that are often relegated to experimental sections. An example of data formatting that allows construction of network maps in Cytoscape is presented, along with maps that illustrate the comparison of multiple reaction sequences, comparison of scaffold changes within sequences, and consolidation to highlight common key intermediates used across sequences. The 17 different synthetic routes reported for strychnine are used as an example basis set. The reaction maps presented required a significant data extraction and curation, and a standardized tabular format for reporting reaction information, if applied in a consistent way, could allow the automated combination of reaction information across different sources.
Liao, Weinan; Ren, Jie; Wang, Kun; Wang, Shun; Zeng, Feng; Wang, Ying; Sun, Fengzhu
2016-11-23
The comparison between microbial sequencing data is critical to understand the dynamics of microbial communities. The alignment-based tools analyzing metagenomic datasets require reference sequences and read alignments. The available alignment-free dissimilarity approaches model the background sequences with Fixed Order Markov Chain (FOMC) yielding promising results for the comparison of microbial communities. However, in FOMC, the number of parameters grows exponentially with the increase of the order of Markov Chain (MC). Under a fixed high order of MC, the parameters might not be accurately estimated owing to the limitation of sequencing depth. In our study, we investigate an alternative to FOMC to model background sequences with the data-driven Variable Length Markov Chain (VLMC) in metatranscriptomic data. The VLMC originally designed for long sequences was extended to apply to high-throughput sequencing reads and the strategies to estimate the corresponding parameters were developed. The flexible number of parameters in VLMC avoids estimating the vast number of parameters of high-order MC under limited sequencing depth. Different from the manual selection in FOMC, VLMC determines the MC order adaptively. Several beta diversity measures based on VLMC were applied to compare the bacterial RNA-Seq and metatranscriptomic datasets. Experiments show that VLMC outperforms FOMC to model the background sequences in transcriptomic and metatranscriptomic samples. A software pipeline is available at https://d2vlmc.codeplex.com.
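The exponential growth in parameters noted in the abstract above can be made concrete with a short sketch. It assumes a four-letter nucleotide alphabet and counts free transition probabilities only; the function name is illustrative and is not part of the authors' software pipeline.

```python
def fomc_free_parameters(order: int, alphabet_size: int = 4) -> int:
    """Free transition parameters of a fixed-order Markov chain.

    Each of the alphabet_size**order contexts carries a distribution over
    the next symbol, which has alphabet_size - 1 free parameters because
    the probabilities in a context must sum to one.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return alphabet_size ** order * (alphabet_size - 1)


if __name__ == "__main__":
    # Print the parameter count for a few even orders (assumed DNA alphabet).
    for k in range(0, 11, 2):
        print(f"order {k:2d}: {fomc_free_parameters(k):>10,d} free parameters")
```

Under these assumptions an order-10 chain already needs more than three million transition probabilities, which illustrates why limited sequencing depth makes a fixed high order hard to estimate.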
Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing
USDA-ARS's Scientific Manuscript database
Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...
Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang
2018-03-10
Homology-based transferal remains the major approach to computational protein function annotation, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotation methods. Detailed data analysis shows that the major advantage of MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.
Guidelines for reporting and using prediction tools for genetic variation analysis.
Vihinen, Mauno
2013-02-01
Computational prediction methods are widely used for the analysis of human genome sequence variants and their effects on gene/protein function, splice site aberration, pathogenicity, and disease risk. New methods are frequently developed. We believe that guidelines are essential for those writing articles about new prediction methods, as well as for those applying these tools in their research, so that the necessary details are reported. This will enable readers to gain the full picture of technical information, performance, and interpretation of results, and to facilitate comparisons of related methods. Here, we provide instructions on how to describe new methods, report datasets, and assess the performance of predictive tools. We also discuss what details of predictor implementation are essential for authors to understand. Similarly, these guidelines for the use of predictors provide instructions on what needs to be delineated in the text, as well as how researchers can avoid unwarranted conclusions. They are applicable to most prediction methods currently utilized. By applying these guidelines, authors will help reviewers, editors, and readers to more fully comprehend prediction methods and their use. © 2012 Wiley Periodicals, Inc.
Request to monitor the CV SDSS161033 (1605-00) for HST observations AND TU Cas comparison stars
NASA Astrophysics Data System (ADS)
Price, Aaron
2005-06-01
AAVSO Alert Notice 319 contains two topics. First: Dr. Paula Szkody (University of Washington) has requested AAVSO assistance in monitoring the suspected UGWZ dwarf nova SDSS J161033 [V386 Ser] for upcoming HST observations. This campaign is similar to the one recently run on SDSS J2205 and SDSS J013132 (AAVSO Alert Notice 318). HST mission planners need to be absolutely sure that SDSS J161033 is not in outburst immediately prior to the scheduled observation; AAVSO observations will be crucial to carrying out the HST program. Nightly V observations are requested June 24-July 1 UT. We are making an unusual request in that we are asking for the FITS images themselves to be uploaded to the AAVSO's FTP site. Second: AAVSO Alert Notice 318 did not specify which stars on the TU Cas PEP chart should be used as comparison and check stars. Also, there was an error on the chart regarding the location of the "83" comparison star [the chart that is available online reflects a corrected location]. Please use the "89" and the "74" stars as your comparison and check stars, respectively. Finder charts with sequence may be created using the AAVSO Variable Star Plotter (https://www.aavso.org/vsp). Observations should be submitted to the AAVSO International Database. See full Alert Notice for more details.
Ghouila, Amel; Florent, Isabelle; Guerfali, Fatma Zahra; Terrapon, Nicolas; Laouini, Dhafer; Yahia, Sadok Ben; Gascuel, Olivier; Bréhélin, Laurent
2014-01-01
Identification of protein domains is a key step for understanding protein function. Hidden Markov Models (HMMs) have proved to be a powerful tool for this task. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in sequenced organisms. This is done via sequence/HMM comparisons. However, this approach may lack sensitivity when searching for domains in divergent species. Recently, methods for HMM/HMM comparisons have been proposed and proved to be more sensitive than sequence/HMM approaches in certain cases. However, these approaches are usually not used for protein domain discovery at a genome scale, and the benefit that could be expected from their utilization for this problem has not been investigated. Using proteins of P. falciparum and L. major as examples, we investigate the extent to which HMM/HMM comparisons can identify new domain occurrences not already identified by sequence/HMM approaches. We show that although HMM/HMM comparisons are much more sensitive than sequence/HMM comparisons, they are not sufficiently accurate to be used as a standalone complement of sequence/HMM approaches at the genome scale. Hence, we propose to use domain co-occurrence — the general domain tendency to preferentially appear along with some favorite domains in the proteins — to improve the accuracy of the approach. We show that the combination of HMM/HMM comparisons and co-occurrence domain detection boosts protein annotations. At an estimated False Discovery Rate of 5%, it revealed 901 and 1098 new domains in Plasmodium and Leishmania proteins, respectively. Manual inspection of part of these predictions shows that it contains several domain families that were missing in the two organisms. All new domain occurrences have been integrated in the EuPathDomains database, along with the GO annotations that can be deduced. PMID:24901648
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Weiwen; Culley, David E.; Gritsenko, Marina A.
2006-11-03
In a previous study, the whole-genome gene expression profiles of D. vulgaris in response to oxidative stress and heat shock were determined. The results showed that 24-28% of the responsive genes encoded hypothetical proteins that have not been experimentally characterized or whose function cannot be deduced by simple sequence comparison. To further explore the protective mechanisms employed by D. vulgaris against oxidative stress and heat shock, an attempt was made in this study to infer functions of these hypothetical proteins by phylogenomic profiling along with detailed sequence comparison against various publicly available databases. By this approach we were able to assign possible functions to 25 responsive hypothetical proteins. The findings included that DVU0725, induced by oxidative stress, may be involved in lipopolysaccharide biosynthesis, implying that the alteration of lipopolysaccharide on the cell surface might serve as a mechanism against oxidative stress in D. vulgaris. In addition, two responsive proteins, DVU0024 encoding a putative transcriptional regulator and DVU1670 encoding a predicted redox protein, shared co-evolution patterns with rubrerythrin in Archaeoglobus fulgidus and Clostridium perfringens, respectively, implying that they might be part of the stress response and protective systems in D. vulgaris. The study demonstrated that phylogenomic profiling is a useful tool in the interpretation of experimental genomics data, and also provided further insight into the cellular response to oxidative stress and heat shock in D. vulgaris.
Comparative and functional genomics provide insights into the pathogenicity of dermatophytic fungi
2011-01-01
Background Millions of humans and animals suffer from superficial infections caused by a group of highly specialized filamentous fungi, the dermatophytes, which exclusively infect keratinized host structures. To provide broad insights into the molecular basis of the pathogenicity-associated traits, we report the first genome sequences of two closely phylogenetically related dermatophytes, Arthroderma benhamiae and Trichophyton verrucosum, both of which induce highly inflammatory infections in humans. Results 97% of the 22.5 megabase genome sequences of A. benhamiae and T. verrucosum are unambiguously alignable and collinear. To unravel dermatophyte-specific virulence-associated traits, we compared sets of potentially pathogenicity-associated proteins, such as secreted proteases and enzymes involved in secondary metabolite production, with those of closely related onygenales (Coccidioides species) and the mould Aspergillus fumigatus. The comparisons revealed expansion of several gene families in dermatophytes and disclosed the peculiarities of the dermatophyte secondary metabolite gene sets. Secretion of proteases and other hydrolytic enzymes by A. benhamiae was proven experimentally by a global secretome analysis during keratin degradation. Molecular insights into the interaction of A. benhamiae with human keratinocytes were obtained for the first time by global transcriptome profiling. Given that A. benhamiae is able to undergo mating, a detailed comparison of the genomes further unraveled the genetic basis of sexual reproduction in this species. Conclusions Our results enlighten the genetic basis of fundamental and putatively virulence-related traits of dermatophytes, advancing future research on these medically important pathogens. PMID:21247460
Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G
2012-09-01
Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB based alignments are used to obtain a three dimensional fit of the structures followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD, in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson, William R
2014-01-01
BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted at distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
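The link between expectation values and database size mentioned above can be illustrated with a small sketch. It uses the standard relation between a normalized bit score S' and the expected number of chance hits, E = m * n * 2^(-S'), where m is the query length and n the database length. The function name is hypothetical, and the sketch ignores the effective-length corrections that real BLAST and FASTA implementations apply.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value for a bit score in a search space of m * n.

    Simplifying assumption: raw lengths are used in place of the
    edge-corrected effective lengths computed by real search programs.
    """
    if query_len < 0 or db_len < 0:
        raise ValueError("lengths must be non-negative")
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    score, query = 40.0, 300
    # Illustrative database sizes (total residues), not real database figures.
    for db in (5_000_000, 50_000_000_000):
        print(f"db={db:>14,d} residues  E={expect_value(score, query, db):.3g}")
```

Because E scales linearly with database length, the same alignment score is more significant in a smaller database, which is the reasoning behind searching a complete protein set from a neighboring organism.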
Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A.; Larsen, Martin Jakob
2016-01-01
Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths. PMID:27002637
A 3D sequence-independent representation of the protein data bank.
Fischer, D; Tsai, C J; Nussinov, R; Wolfson, H
1995-10-01
Here we address the following questions. How many structurally different entries are there in the Protein Data Bank (PDB)? How do the proteins populate the structural universe? To investigate these questions a structurally non-redundant set of representative entries was selected from the PDB. Construction of such a dataset is not trivial: (i) the considerable size of the PDB requires a large number of comparisons (there were more than 3250 structures of protein chains available in May 1994); (ii) the PDB is highly redundant, containing many structurally similar entries, not necessarily with significant sequence homology, and (iii) there is no clear-cut definition of structural similarity. The latter depends on the criteria and methods used. Here, we analyze structural similarity ignoring protein topology. To date, representative sets have been selected either by hand, by sequence comparison techniques which ignore the three-dimensional (3D) structures of the proteins or by using sequence comparisons followed by linear structural comparison (i.e. the topology, or the sequential order of the chains, is enforced in the structural comparison). Here we describe a 3D sequence-independent automated and efficient method to obtain a representative set of protein molecules from the PDB which contains all unique structures and which is structurally non-redundant. The method has two novel features. The first is the use of strictly structural criteria in the selection process without taking into account the sequence information. To this end we employ a fast structural comparison algorithm which requires on average approximately 2 s per pairwise comparison on a workstation. The second novel feature is the iterative application of a heuristic clustering algorithm that greatly reduces the number of comparisons required. We obtain a representative set of 220 chains with resolution better than 3.0 Å, or 268 chains including lower resolution entries, NMR entries and models.
The resulting set can serve as a basis for extensive structural classification and studies of 3D recurring motifs and of sequence-structure relationships. The clustering algorithm succeeds in classifying into the same structural family chains with no significant sequence homology, e.g. all the globins in one single group, all the trypsin-like serine proteases in another or all the immunoglobulin-like folds into a third. In addition, unexpected structural similarities of interest have been automatically detected between pairs of chains. A cluster analysis of the representative structures demonstrates the way the "structural universe" is populated.
Sequence comparison alignment-free approach based on suffix tree and L-words frequency.
Soares, Inês; Goios, Ana; Amorim, António
2012-01-01
The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts with the computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-word frequency profile of each sequence, a pairwise standard Euclidean distance is then computed, producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures with the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in the Python language and is freely available on the web.
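The word-frequency and distance steps described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it replaces the generalized suffix tree with a plain sliding-window count, assumes relative frequencies as the profile, and uses hypothetical function names.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def lword_profile(seq: str, L: int) -> dict:
    """Relative frequency of every overlapping word of length L in seq."""
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def euclidean(p: dict, q: dict) -> float:
    """Euclidean distance between two sparse frequency profiles."""
    words = set(p) | set(q)
    return sqrt(sum((p.get(w, 0.0) - q.get(w, 0.0)) ** 2 for w in words))


def distance_matrix(seqs: dict, L: int) -> dict:
    """Symmetric pairwise distance matrix keyed by sequence name."""
    names = list(seqs)
    profiles = {name: lword_profile(seqs[name], L) for name in names}
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        dist[a][b] = dist[b][a] = euclidean(profiles[a], profiles[b])
    return dist


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    toy = {"s1": "ACGTACGTAC", "s2": "ACGTTCGTAC", "s3": "TTTTGGGGCC"}
    for name, row in distance_matrix(toy, L=3).items():
        print(name, {other: round(d, 3) for other, d in row.items()})
```

The resulting matrix could then be passed to any neighbor-joining or multidimensional scaling routine. A suffix tree becomes worthwhile when many word lengths must be evaluated over long genomes, which this sketch does not attempt.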
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.
Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim
2010-03-01
Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
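For readers unfamiliar with the dynamic programming approach named above, the following sketch computes a Smith-Waterman local alignment score. It is not the project's implementation: it assumes a simple match/mismatch scheme with a linear gap penalty, whereas protein comparisons normally use a substitution matrix and affine gap costs, and all parameter values are illustrative.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Only two rows of the dynamic programming matrix are kept, so memory
    use is proportional to len(b); the alignment itself is not recovered.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                    # start a new local alignment here
                prev[j - 1] + step,   # extend with a match or mismatch
                prev[j] + gap,        # gap in b
                curr[j - 1] + gap,    # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Every cell of the matrix is filled, so the cost grows with the product of the two sequence lengths. That cost is why an all-against-all comparison of millions of proteins required a distributed computing grid.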
CANDELS Visual Classifications: Scheme, Data Release, and First Results
NASA Astrophysics Data System (ADS)
Kartaltepe, Jeyhan S.; Mozena, Mark; Kocevski, Dale; McIntosh, Daniel H.; Lotz, Jennifer; Bell, Eric F.; Faber, Sandy; Ferguson, Harry; Koo, David; Bassett, Robert; Bernyk, Maksym; Blancato, Kirsten; Bournaud, Frederic; Cassata, Paolo; Castellano, Marco; Cheung, Edmond; Conselice, Christopher J.; Croton, Darren; Dahlen, Tomas; de Mello, Duilia F.; DeGroot, Laura; Donley, Jennifer; Guedes, Javiera; Grogin, Norman; Hathi, Nimish; Hilton, Matt; Hollon, Brett; Koekemoer, Anton; Liu, Nick; Lucas, Ray A.; Martig, Marie; McGrath, Elizabeth; McPartland, Conor; Mobasher, Bahram; Morlock, Alice; O'Leary, Erin; Peth, Mike; Pforr, Janine; Pillepich, Annalisa; Rosario, David; Soto, Emmaris; Straughn, Amber; Telford, Olivia; Sunnquist, Ben; Trump, Jonathan; Weiner, Benjamin; Wuyts, Stijn; Inami, Hanae; Kassin, Susan; Lani, Caterina; Poole, Gregory B.; Rizer, Zachary
2015-11-01
We have undertaken an ambitious program to visually classify all galaxies in the five CANDELS fields down to H < 24.5 involving the dedicated efforts of over 65 individual classifiers. Once completed, we expect to have detailed morphological classifications for over 50,000 galaxies spanning 0 < z < 4 over all the fields, with classifications from 3 to 5 independent classifiers for each galaxy. Here, we present our detailed visual classification scheme, which was designed to cover a wide range of CANDELS science goals. This scheme includes the basic Hubble sequence types, but also includes a detailed look at mergers and interactions, the clumpiness of galaxies, k-corrections, and a variety of other structural properties. In this paper, we focus on the first field to be completed—GOODS-S, which has been classified at various depths. The wide area coverage spanning the full field (wide+deep+ERS) includes 7634 galaxies that have been classified by at least three different people. In the deep area of the field, 2534 galaxies have been classified by at least five different people at three different depths. With this paper, we release to the public all of the visual classifications in GOODS-S along with the Perl/Tk GUI that we developed to classify galaxies. We present our initial results here, including an analysis of our internal consistency and comparisons among multiple classifiers as well as a comparison to the Sérsic index. We find that the level of agreement among classifiers is quite good (>70% across the full magnitude range) and depends on both the galaxy magnitude and the galaxy type, with disks showing the highest level of agreement (>50%) and irregulars the lowest (<10%). A comparison of our classifications with the Sérsic index and rest-frame colors shows a clear separation between disk and spheroid populations. 
Finally, we explore morphological k-corrections between the V-band and H-band observations and find that a small fraction (84 galaxies in total) are classified as being very different between these two bands. These galaxies typically have very clumpy and extended morphology or are very faint in the V-band.
Phillips, C; Gettings, K Butler; King, J L; Ballard, D; Bodner, M; Borsuk, L; Parson, W
2018-05-01
The STR sequence template file published in 2016 as part of the considerations from the DNA Commission of the International Society for Forensic Genetics on minimal STR sequence nomenclature requirements, has been comprehensively revised and audited using the latest GRCh38 genome assembly. The list of forensic STRs characterized was expanded by including supplementary autosomal, X- and Y-chromosome microsatellites in less common use for routine DNA profiling, but some likely to be adopted in future massively parallel sequencing (MPS) STR panels. We outline several aspects of sequence alignment and annotation that required care and attention to detail when comparing sequences to GRCh37 and GRCh38 assemblies, as well as the necessary matching of MPS-based allele descriptions to previously established repeat region structures described in initial sequencing studies of the less well known forensic STRs. The revised sequence guide is now available in a dynamically updated FTP format from the STRidER website with a date-stamped change log to allow users to explore their own MPS data with the most up-to-date forensic STR sequence information compiled in a simple guide. Copyright © 2018 Elsevier B.V. All rights reserved.
Base-By-Base: single nucleotide-level analysis of whole viral genome alignments.
Brodie, Ryan; Smith, Alex J; Roper, Rachel L; Tcherepanov, Vasily; Upton, Chris
2004-07-14
With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.
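The nucleotide-level difference listing described in this abstract can be illustrated with a minimal sketch. It is not Base-By-Base code: the function name `alignment_differences`, the tuple layout, and the use of `-` as the gap character are assumptions made for illustration only.

```python
def alignment_differences(reference, other, gap="-"):
    """List the columns where two pre-aligned sequences differ.

    Both strings must come from the same alignment (equal length).
    Returns (column, reference_base, other_base, kind) tuples with
    1-based column numbers. Names and layout are hypothetical.
    """
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    differences = []
    for column, (ref_base, other_base) in enumerate(zip(reference, other), start=1):
        if ref_base == other_base:
            continue
        if ref_base == gap:
            kind = "insertion"      # base present only in the second sequence
        elif other_base == gap:
            kind = "deletion"       # base missing from the second sequence
        else:
            kind = "substitution"
        differences.append((column, ref_base, other_base, kind))
    return differences


if __name__ == "__main__":
    for row in alignment_differences("ACG-TACGT", "ACGATAC-A"):
        print(row)
```

A real tool would go on to map each column to gene and promoter annotation; the sketch stops at the tabular difference list.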
IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS
Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...
NASA Astrophysics Data System (ADS)
Kumar, Ashok; Kumari, Shalini; Borkar, Hitesh; Katiyar, Ram S.; Scott, James Floyd
2017-01-01
We present detailed Raman studies of SrZrO3 (SZO) that show three anomalies in Raman modes: One has a small jump in frequency ω, one has its intensity vanish, and a third has a sharp change in temperature derivative dω(T)/dT from flat below T = 600 K to a Curie-Weiss dependence above 600 K with extrapolation to zero frequency at the known transition temperature T = 970 K, thereby proving the latter to be displacive. In addition, the P4mm ferroelectric phase predicted at high stresses has preliminary support from polarization-voltage experiments. The inference of a new transition in the temperature region 600-650 K is in disagreement with neutron studies. Comparisons are given for family member SrSnO3 and SrHfO3, and we discuss the different conclusions of Kennedy and Knight. We show that a known transition in SrHfO3 is also displacive with a well-behaved soft mode.
NASA Technical Reports Server (NTRS)
Rosenzweig, P.; Morrison, N. D.
1986-01-01
Five early B-type stars near the main-sequence turnoff in NGC 457 have been observed at low dispersion with the short-wavelength prime and the long-wavelength redundant cameras of the IUE satellite. The equivalent widths of spectral features that are particularly strong and sensitive to temperature and luminosity were computed in the cluster stars and in 20 lightly reddened stars of types O9-B3 and luminosity classes III-V. The comparison of the equivalent widths provides a reliable method for finding matching pairs. Having identified the best comparison star for each program star, binned fluxes were used to determine the mean extinction curve. In order to cover the visible region, monochromatic fluxes of Phi Cas were derived from observations with the intensified Reticon scanner mounted on the No. 2 0.9 m telescope of KPNO, and they were dereddened with the mean extinction curve of Savage and Mathis. Thus, the intrinsic energy distribution of Phi Cas was determined from 1500 to 5800 A for use in a detailed model-atmosphere analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zemla, A; Lang, D; Kostova, T
2010-11-29
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory - still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
In silico evidence for sequence-dependent nucleosome sliding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lequieu, Joshua; Schwartz, David C.; de Pablo, Juan J.
Nucleosomes represent the basic building block of chromatin and provide an important mechanism by which cellular processes are controlled. The locations of nucleosomes across the genome are not random but instead depend on both the underlying DNA sequence and the dynamic action of other proteins within the nucleus. These processes are central to cellular function, and the molecular details of the interplay between DNA sequence and nucleosome dynamics remain poorly understood. In this work, we investigate this interplay in detail by relying on a molecular model, which permits development of a comprehensive picture of the underlying free energy surfaces and the corresponding dynamics of nucleosome repositioning. The mechanism of nucleosome repositioning is shown to be strongly linked to DNA sequence and directly related to the binding energy of a given DNA sequence to the histone core. It is also demonstrated that chromatin remodelers can override DNA-sequence preferences by exerting torque, and the histone H4 tail is then identified as a key component by which DNA-sequence, histone modifications, and chromatin remodelers could in fact be coupled.
... auris infection spread globally? CDC conducted whole genome sequencing of C. auris specimens from countries in the ... Asia, southern Africa, and South America. Whole genome sequencing produces detailed DNA fingerprints of organisms. CDC found ...
SRD: a Staphylococcus regulatory RNA database.
Sassi, Mohamed; Augagneur, Yoann; Mauro, Tony; Ivain, Lorraine; Chabelskaya, Svetlana; Hallier, Marc; Sallou, Olivier; Felden, Brice
2015-05-01
An overflow of regulatory RNAs (sRNAs) was identified in a wide range of bacteria. We designed and implemented a new resource for the hundreds of sRNAs identified in Staphylococci, with primary focus on the human pathogen Staphylococcus aureus. The "Staphylococcal Regulatory RNA Database" (SRD, http://srd.genouest.org/) compiles all published data in a single interface including genetic locations, sequences and other features. SRD proposes novel and simplified identifiers for Staphylococcal regulatory RNAs (srn) based on the sRNA's genetic location in S. aureus strain N315, which served as a reference. From a set of 894 sequences and after an in-depth cleaning, SRD provides a list of 575 srn free of redundant sequences. For each sRNA, the experimental support is provided, allowing the user to individually assess its validity and significance. RNA-seq analysis performed on strains N315, NCTC8325, and Newman allowed us to provide further details, upgrade the initial annotation, and identify 159 RNA-seq independent transcribed sRNAs. The lists of 575 and 159 sRNA sequences were used to predict the number and location of srns in 18 S. aureus strains and 10 other Staphylococci. A comparison of the srn contents within 32 Staphylococcal genomes revealed a poor conservation between species. In addition, sRNA structure predictions obtained with MFold are accessible. A BLAST server and the intaRNA program, which is dedicated to target prediction, were implemented. SRD is the first sRNA database centered on a genus; it is a user-friendly and scalable device with the possibility to submit new sequences that should spread in the literature. © 2015 Sassi et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Serratia aquatilis sp. nov., isolated from drinking water systems.
Kämpfer, Peter; Glaeser, Stefanie P
2016-01-01
A cream-white-pigmented, oxidase-negative bacterium (strain 2015-2462-01T), isolated from a drinking water system, was investigated in detail to determine its taxonomic position. Cells of the isolate were rod-shaped and stained Gram-negative. A comparison of the 16S rRNA gene sequence of strain 2015-2462-01T with sequences of the type strains of closely related species of the genus Serratia revealed highest similarity to Serratia fonticola (98.4 %), Serratia proteamaculans (97.8 %), Serratia liquefaciens and Serratia grimesii (both 97.7 %). 16S rRNA gene sequence similarities to all other Serratia species were below 97.4 %. Multilocus sequence analysis (MLSA) on the basis of concatenated partial gyrB, rpoB, infB and atpD gene sequences showed a clear distinction of strain 2015-2462-01T from the type strains of the closest related Serratia species. The fatty acid profile of the strain consisted of C16 : 1 ω7c, C16 : 0; C14 : 0 and C14 : 0 3-OH/iso-C16 : 1 I as major components. DNA-DNA hybridizations between 2015-2462-01T and S. fonticola ATCC 29844T resulted in a relatedness value of 27 % (reciprocal 20 %). This DNA-DNA hybridization result in combination with the MLSA results and the differential biochemical properties indicated that strain 2015-2462-01T represents a novel species of the genus Serratia, for which the name Serratia aquatilis sp. nov. is proposed. The type strain is 2015-2462-01T ( = LMG 29119T = CCM 8626T).
NASA Technical Reports Server (NTRS)
Koeberl, Christian; Reimold, Wolf Uwe; Boer, Rudolf H.
1992-01-01
The Barberton Greenstone belt is a 3.5- to 3.2-Ga-old formation situated in the Swaziland Supergroup near Barberton, northeast Transvaal, South Africa. The belt includes a lower, predominantly volcanic sequence, and an upper sedimentary sequence (e.g., the Fig Tree Group). Within this upper sedimentary sequence, Lowe and Byerly identified a series of different beds of spherules with diameters of around 0.5-2 mm. Lowe and Byerly and Lowe et al. have interpreted these spherules to be condensates of rock vapor produced by large meteorite impacts in the early Archean. We have collected a series of samples from drill cores from the Mt. Morgan and Princeton sections near Barberton, as well as samples taken from underground exposures in the Sheba and Agnes mines. These samples seem much better preserved than the surface samples described by Lowe and Byerly and Lowe et al. Over a scale of just under 30 cm, several well-defined spherule beds are visible, interspaced with shales and/or layers of banded iron formation. Some spherules have clearly been deposited on top of a sedimentary unit because the shale layer shows indentions from the overlying spherules. Although fresher than the surface samples (e.g., spherule bed S-2), there is abundant evidence for extensive alteration, presumably by hydrothermal processes. In some sections of the cores sulfide mineralization is common. For our mineralogical and petrographical studies we have prepared detailed thin sections of all core and underground samples (as well as some surface samples from the S-2 layer for comparison). For geochemical work, layers with thicknesses in the order of 1-5 mm were separated from selected core and underground samples. The chemical analyses are being performed using neutron activation analysis in order to obtain data for about 35 trace elements in each sample. Major elements are being determined by XRF and plasma spectrometry. 
To clarify the history of the sulfide mineralization, sulfur isotopic compositions are being determined.
QCScreen: a software tool for data quality control in LC-HRMS based metabolomics.
Simader, Alexandra Maria; Kluger, Bernhard; Neumann, Nora Katharina Nicole; Bueschl, Christoph; Lemmens, Marc; Lirk, Gerald; Krska, Rudolf; Schuhmacher, Rainer
2015-10-24
Metabolomics experiments often comprise large numbers of biological samples resulting in huge amounts of data. This data needs to be inspected for plausibility before data evaluation to detect putative sources of error e.g. retention time or mass accuracy shifts. Especially in liquid chromatography-high resolution mass spectrometry (LC-HRMS) based metabolomics research, proper quality control checks (e.g. for precision, signal drifts or offsets) are crucial prerequisites to achieve reliable and comparable results within and across experimental measurement sequences. Software tools can support this process. The software tool QCScreen was developed to offer a quick and easy data quality check of LC-HRMS derived data. It allows a flexible investigation and comparison of basic quality-related parameters within user-defined target features and the possibility to automatically evaluate multiple sample types within or across different measurement sequences in a short time. It offers a user-friendly interface that allows an easy selection of processing steps and parameter settings. The generated results include a coloured overview plot of data quality across all analysed samples and targets and, in addition, detailed illustrations of the stability and precision of the chromatographic separation, the mass accuracy and the detector sensitivity. The use of QCScreen is demonstrated with experimental data from metabolomics experiments using selected standard compounds in pure solvent. The application of the software identified problematic features, samples and analytical parameters and suggested which data files or compounds required closer manual inspection. QCScreen is an open source software tool which provides a useful basis for assessing the suitability of LC-HRMS data prior to time consuming, detailed data processing and subsequent statistical analysis. 
It accepts the generic mzXML format and thus can be used with many different LC-HRMS platforms to process both multiple quality control sample types as well as experimental samples in one or more measurement sequences.
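One of the checks named in this abstract, mass accuracy, is commonly expressed as a relative error in parts per million. The sketch below shows that arithmetic only. It is not QCScreen code: the function names, the `(sample_name, measured_mz)` input layout, and the 5 ppm default tolerance are assumptions chosen for illustration.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def flag_mass_shifts(measurements, theoretical_mz, tolerance_ppm=5.0):
    """Return the names of samples whose mass error exceeds the tolerance.

    `measurements` is an iterable of (sample_name, measured_mz) pairs;
    this layout and the default tolerance are hypothetical.
    """
    return [
        name
        for name, measured_mz in measurements
        if abs(ppm_error(measured_mz, theoretical_mz)) > tolerance_ppm
    ]


if __name__ == "__main__":
    runs = [("qc_01", 500.0005), ("qc_02", 500.0050)]
    print(flag_mass_shifts(runs, theoretical_mz=500.0))
```

The same pattern (compare each sample against a target value, flag those outside a tolerance) applies to retention time shifts, with seconds or minutes in place of ppm.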
Using video-oriented instructions to speed up sequence comparison.
Wozniak, A
1997-04-01
This document presents an implementation of the well-known Smith-Waterman algorithm for comparison of protein and nucleic acid sequences, using specialized video instructions. These instructions, SIMD-like in their design, make possible parallelization of the algorithm at the instruction level. Benchmarks on an ULTRA SPARC running at 167 MHz show a speed-up factor of two compared to the same algorithm implemented with integer instructions on the same machine. Performance reaches over 18 million matrix cells per second on a single processor, giving to our knowledge the fastest implementation of the Smith-Waterman algorithm on a workstation. The accelerated procedure was introduced in LASSAP--a LArge Scale Sequence compArison Package software developed at INRIA--which handles parallelism at higher level. On a SUN Enterprise 6000 server with 12 processors, a speed of nearly 200 million matrix cells per second has been obtained. A sequence of length 300 amino acids is scanned against SWISSPROT R33 (1,8531,385 residues) in 29 s. This procedure is not restricted to databank scanning. It applies to all cases handled by LASSAP (intra- and inter-bank comparisons, Z-score computation, etc.).
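For reference, the matrix-cell recurrence that such implementations accelerate can be sketched in plain scalar Python. This is not the SIMD code from the paper: it uses a linear gap penalty, and the scoring values (match +2, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between strings a and b.

    Scalar sketch with a linear gap penalty; only two matrix rows
    are kept in memory. Scoring parameters are illustrative.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        current = [0]  # first column is always zero
        for j, ch_b in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if ch_a == ch_b else mismatch)
            up = previous[j] + gap        # gap in b
            left = current[j - 1] + gap   # gap in a
            cell = max(0, diagonal, up, left)  # a local alignment never goes below 0
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACACACTA", "AGCACACA"))
```

Each inner-loop iteration fills one matrix cell, which is the unit behind the "matrix cells per second" figures quoted in the abstract. Instruction-level parallel versions compute several such cells per instruction.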
Digital signal processing methods for biosequence comparison.
Benson, D C
1990-01-01
A method is discussed for DNA or protein sequence comparison using a finite field fast Fourier transform, a digital signal processing technique; and statistical methods are discussed for analyzing the output of this algorithm. This method compares two sequences of length N in computing time proportional to N log N compared to N^2 for methods currently used. This method makes it feasible to compare very long sequences. An example is given to show that the method correctly identifies sites of known homology. PMID:2349096
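The idea of counting matches at every relative shift with a Fourier transform can be sketched as follows. Note the substitution: the abstract describes a finite field FFT, whereas this sketch uses an ordinary complex floating-point FFT with rounding, which is simpler to write but is not the author's method. The function names and the default `ACGT` alphabet are assumptions; both sequences are assumed non-empty.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.
    The inverse transform is left unnormalised."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def match_counts(a, b, alphabet="ACGT"):
    """Map each shift s to the number of positions i with a[i + s] == b[i]."""
    n, m = len(a), len(b)
    size = 1
    while size < n + m - 1:
        size *= 2
    totals = [0.0] * size
    for letter in alphabet:
        # Indicator vectors; b is reversed so convolution gives correlation.
        ind_a = [1.0 if ch == letter else 0.0 for ch in a] + [0.0] * (size - n)
        ind_b = [1.0 if ch == letter else 0.0 for ch in reversed(b)] + [0.0] * (size - m)
        product = [x * y for x, y in zip(fft(ind_a), fft(ind_b))]
        conv = fft(product, inverse=True)
        for k in range(size):
            totals[k] += conv[k].real / size
    # Convolution index k corresponds to shift s = k - (m - 1).
    return {k - (m - 1): round(totals[k]) for k in range(n + m - 1)}


if __name__ == "__main__":
    counts = match_counts("ACGTACGT", "CGTA")
    print(max(counts, key=counts.get), max(counts.values()))
```

One transform pair per alphabet letter gives all shifts at once in O(N log N) time, in place of the O(N^2) cost of testing every shift directly.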
Bartels, Daniela; Kespohl, Sebastian; Albaum, Stefan; Drüke, Tanja; Goesmann, Alexander; Herold, Julia; Kaiser, Olaf; Pühler, Alfred; Pfeiffer, Friedhelm; Raddatz, Günter; Stoye, Jens; Meyer, Folker; Schuster, Stephan C
2005-04-01
We provide the graphical tool BACCardI for the construction of virtual clone maps from standard assembler output files or BLAST based sequence comparisons. This new tool has been applied to numerous genome projects to solve various problems including (a) validation of whole genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages. Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read pair information of large insert libraries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boore, Jeffrey L.
2004-11-27
Although the phylogenetic relationships of many organisms have been convincingly resolved by the comparisons of nucleotide or amino acid sequences, others have remained equivocal despite great effort. Now that large-scale genome sequencing projects are sampling many lineages, it is becoming feasible to compare large data sets of genome-level features and to develop this as a tool for phylogenetic reconstruction that has advantages over conventional sequence comparisons. Although it is unlikely that these will address a large number of evolutionary branch points across the broad tree of life due to the infeasibility of such sampling, they have great potential for convincingly resolving many critical, contested relationships for which no other data seems promising. However, it is important that we recognize potential pitfalls, establish reasonable standards for acceptance, and employ rigorous methodology to guard against a return to earlier days of scenario-driven evolutionary reconstructions.
Discrete sequence prediction and its applications
NASA Technical Reports Server (NTRS)
Laird, Philip
1992-01-01
Learning from experience to predict sequences of discrete symbols is a fundamental problem in machine learning with many applications. We apply sequence prediction using a simple and practical sequence-prediction algorithm, called TDAG. The TDAG algorithm is first tested by comparing its performance with some common data compression algorithms. Then it is adapted to the detailed requirements of dynamic program optimization, with excellent results.
Nabavi, Reza; Conneely, Brendan; McCarthy, Elaine; Good, Barbara; Shayan, Parviz; DE Waal, Theo
2014-09-01
Accurate identification of sheep nematodes is a critical point in epidemiological studies and monitoring of drug resistance in flocks. However, due to a close morphological similarity between the eggs and larval stages of many of these nematodes, such identification is not a trivial task. There are a number of studies showing that molecular targets in ribosomal DNA (internal transcribed spacers 1 and 2 and the intergenic spacer) are suitable for accurate identification of sheep bursate nematodes. The objective of the present study was to compare the ITS1, ITS2 and IGS regions of common Iranian bursate nematodes in order to choose the best target for specific identification methods. The first and second internal transcribed spacers (ITS1 and ITS2) and intergenic spacer (IGS) of the ribosomal DNA (rDNA) of 5 common Iranian bursate nematodes of sheep were sequenced. The sequences of some non-Iranian isolates were used for comparison in order to evaluate the variation in sequence homology between geographically different nematode populations. Comparison of the ITS1 and ITS2 sequences of Iranian nematodes showed greatest similarity among Teladorsagia circumcincta and Marshallagia marshalli of 94% and 88%, respectively, while Trichostrongylus colubriformis and M. marshalli showed the highest homology (99%) in the IGS sequences. Comparison of the spacer sequences of Iranian with non-Iranian isolates showed significantly higher variation in Haemonchus contortus compared to the other species. Both the ITS1 and ITS2 sequences are convenient targets for species-specific identification of Iranian bursate nematodes. On the other hand, the IGS region may be a less suitable molecular target.
Dumas, Laura; Dickens, C Michael; Anderson, Nathan; Davis, Jonathan; Bennett, Beth; Radcliffe, Richard A; Sikela, James M
2014-06-01
It has been well documented that genetic factors can influence predisposition to develop alcoholism. While the underlying genomic changes may be of several types, two of the most common and disease associated are copy number variations (CNVs) and sequence alterations of protein coding regions. The goal of this study was to identify CNVs and single-nucleotide polymorphisms that occur in gene coding regions that may play a role in influencing the risk of an individual developing alcoholism. Toward this end, two mouse strains were used that have been selectively bred based on their differential sensitivity to alcohol: the Inbred long sleep (ILS) and Inbred short sleep (ISS) mouse strains. Differences in initial response to alcohol have been linked to risk for alcoholism, and the ILS/ISS strains are used to investigate the genetics of initial sensitivity to alcohol. Array comparative genomic hybridization (arrayCGH) and exome sequencing were conducted to identify CNVs and gene coding sequence differences, respectively, between ILS and ISS mice. Mouse arrayCGH was performed using catalog Agilent 1 × 244 k mouse arrays. Subsequently, exome sequencing was carried out using an Illumina HiSeq 2000 instrument. ArrayCGH detected 74 CNVs that were strain-specific (38 ILS/36 ISS), including several ISS-specific deletions that contained genes implicated in brain function and neurotransmitter release. Among several interesting coding variations detected by exome sequencing was the gain of a premature stop codon in the alpha-amylase 2B (AMY2B) gene specifically in the ILS strain. In total, exome sequencing detected 2,597 and 1,768 strain-specific exonic gene variants in the ILS and ISS mice, respectively. This study represents the most comprehensive and detailed genomic comparison of ILS and ISS mouse strains to date. 
The two complementary genome-wide approaches identified strain-specific CNVs and gene coding sequence variations that should provide strong candidates to contribute to the alcohol-related phenotypic differences associated with these strains.
Thiyagarajan, P; Ponnuswamy, P K
1981-09-01
Following the procedure described in the preceding article, the low energy conformations located for the four dimeric subunits of RNA, ApG, ApU, CpG, and CpU are presented. The A-RNA type and Watson-Crick type helical conformations and a number of different kinds of loop promoting ones were identified as low energy in all the units. The 3E-3E and 3E-2E pucker sequences are found to be more or less equally preferred; the 2E-2E sequence is occasionally preferred, while the 2E-3E is highly prohibited in all the units. A conformation similar to the one observed in the drug-dinucleoside monophosphate complex crystals becomes a low energy case only for the CpG unit. The low energy conformations obtained for the four model units were used to assess the stability of the conformational states of the dinucleotide segments in the four crystal models of the tRNAPhe molecule. Information on the occurrence of the less preferred sugar-pucker sequences in the various loop regions in the tRNAPhe molecule has been obtained. A detailed comparison of the conformational characteristics of DNA and RNA subunits at the dimeric level is presented on the basis of the results. PMID:6168312
Hernández-Orts, Jesús S; Smales, Lesley R; Pinacho-Pinacho, Carlos D; García-Varela, Martín; Presswell, Bronwen
2017-02-01
The polymorphid acanthocephalan, Corynosoma hannae Zdzitowiecki, 1984 is characterised on the basis of newly collected material from a New Zealand sea lion, Phocarctos hookeri (Gray), and long-nosed fur seal, Arctophoca forsteri (Lesson) (definitive hosts), and from Stewart Island shags, Leucocarbo chalconotus (Gray), spotted shags, Phalacrocorax punctatus (Sparrman) and yellow-eyed penguins, Megadyptes antipodes (Hombron & Jacquinot) (non-definitive hosts) from New Zealand. Specimens are described in detail and scanning electron micrographs for C. hannae are provided. Additionally, cystacanths of C. hannae are reported and described for the first time from the body cavity and mesenteries of New Zealand brill, Colistium guntheri (Hutton) and from New Zealand sole, Peltorhamphus novaezeelandiae Günther from Kaka Point, Otago in New Zealand. Partial sequence data for the mitochondrial cytochrome c oxidase 1 gene (cox1) for adults, immature specimens and cystacanths of C. hannae were obtained. Phylogenetic analyses of the newly-generated sequences and for available cox1 sequences of Corynosoma spp. revealed a close relationship between C. hannae and C. australe Johnston, 1937, both species infecting pinnipeds in the Southern Hemisphere. However, a morphological comparison of the species suggests that C. hannae most closely resembles C. evae Zdzitowiecki, 1984 and C. semerme (Forssell, 1904), the latter of which occurs in pinnipeds in the Northern Hemisphere. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Ozer, Abdullah; Tome, Jacob M; Friedman, Robin C; Gheba, Dan; Schroth, Gary P; Lis, John T
2015-08-01
Because RNA-protein interactions have a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay that couples sequencing on an Illumina GAIIx genome analyzer with the quantitative assessment of protein-RNA interactions. This assay is able to analyze interactions between one or possibly several proteins with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of the EGFP and negative elongation factor subunit E (NELF-E) proteins with their corresponding canonical and mutant RNA aptamers. Here we provide a detailed protocol for HiTS-RAP that can be completed in about a month (8 d hands-on time). This includes the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, HiTS and protein binding with a GAIIx instrument, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, quantitative analysis of RNA on a massively parallel array (RNA-MaP) and RNA Bind-n-Seq (RBNS), for quantitative analysis of RNA-protein interactions.
Foxl2 function in ovarian development.
Uhlenhaut, Nina Henriette; Treier, Mathias
2006-07-01
Foxl2 is a forkhead transcription factor essential for proper reproductive function in females. Human patients carrying mutations in the FOXL2 gene display blepharophimosis/ptosis/epicanthus inversus syndrome (BPES), an autosomal dominant disease associated with eyelid defects and premature ovarian failure in females. Recently, animal models for BPES have been developed that in combination with a catalogue of human FOXL2 mutations provide further insight into its molecular function. Mice homozygous mutant for Foxl2 display craniofacial malformations and female infertility. The analysis of the murine phenotype has revealed that Foxl2 is required for granulosa cell function. These ovarian somatic cells surround and nourish the oocyte and play an important role in follicle formation and activation. Mutations upstream of FOXL2 in humans, not affecting the coding sequence itself, have also been shown to cause BPES, which points to the existence of a distant regulatory element necessary for proper gene expression. The same regulatory sequences may be deleted in the goat polled intersex syndrome (PIS), in which FoxL2 expression is severely reduced. Sequence comparison of FoxL2 from several vertebrate species has shown that it is a highly conserved gene involved in ovary development. Thus, the detailed understanding of Foxl2 function and regulation and the identification of its transcriptional targets may open new avenues for the treatment of female infertility in the future.
MotionFlow: Visual Abstraction and Aggregation of Sequential Patterns in Human Motion Tracking Data.
Jang, Sujin; Elmqvist, Niklas; Ramani, Karthik
2016-01-01
Pattern analysis of human motions, which is useful in many research areas, requires understanding and comparison of different styles of motion patterns. However, working with human motion tracking data to support such analysis poses great challenges. In this paper, we propose MotionFlow, a visual analytics system that provides an effective overview of various motion patterns based on an interactive flow visualization. This visualization formulates a motion sequence as transitions between static poses, and aggregates these sequences into a tree diagram to construct a set of motion patterns. The system also allows the users to directly reflect the context of data and their perception of pose similarities in generating representative pose states. We provide local and global controls over the partition-based clustering process. To support the users in organizing unstructured motion data into pattern groups, we designed a set of interactions that enables searching for similar motion sequences from the data, detailed exploration of data subsets, and creating and modifying the group of motion patterns. To evaluate the usability of MotionFlow, we conducted a user study with six researchers with expertise in gesture-based interaction design. They used MotionFlow to explore and organize unstructured motion tracking data. Results show that the researchers were able to easily learn how to use MotionFlow, and the system effectively supported their pattern analysis activities, including leveraging their perception and domain knowledge.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun, M.; Lee, C.S.
1997-12-31
The objective of this study is to develop a rapid and sensitive method for oligosaccharide sequencing. The oligosaccharides are subjected to enzyme array digestion with exoglycosidases of known and well-defined specificities. The enzyme array method involves the division of the oligosaccharide sample into aliquots, and the incubation of each aliquot with a precisely defined mixture of exoglycosidases. In the enzyme array method, the presence of a specific linkage anywhere in the oligosaccharide is determined by the inability of an enzyme mixture lacking a given enzyme to cleave that linkage (a stop point) and the ability of the other enzymes to cleave the linkage up to that point. The direct quantification of released monosaccharides from the enzyme array can be achieved by using pulsed amperometric detection (PAD) or by fluorescent derivatization with a fluorophoric agent. The measured monosaccharide concentrations in combination with the enzyme array analysis provide detailed characterization of oligosaccharides with their sugar composition, configuration, and linkage information. The released monosaccharides are further quantified by anion exchange chromatography and capillary electrophoresis for comparison with the results obtained from PAD and fluorescence measurements. Our enzyme array-electrochemical (or fluorescent) detection method does not require any separation procedure or any prior labeling of the oligosaccharide and has several practical advantages over current carbohydrate sequencing techniques, including simplicity, speed, and the ability to use small amounts of starting material.
Fast alignment-free sequence comparison using spaced-word frequencies.
Leimeister, Chris-Andre; Boden, Marcus; Horwege, Sebastian; Lindner, Sebastian; Morgenstern, Burkhard
2014-07-15
Alignment-free methods for sequence comparison are increasingly used for genome analysis and phylogeny reconstruction; they circumvent various difficulties of traditional alignment-based approaches. In particular, alignment-free methods are much faster than pairwise or multiple alignments. They are, however, less accurate than methods based on sequence alignment. Most alignment-free approaches work by comparing the word composition of sequences. A well-known problem with these methods is that neighbouring word matches are far from independent. To reduce the statistical dependency between adjacent word matches, we propose to use 'spaced words', defined by patterns of 'match' and 'don't care' positions, for alignment-free sequence comparison. We describe a fast implementation of this approach using recursive hashing and bit operations, and we show that further improvements can be achieved by using multiple patterns instead of single patterns. To evaluate our approach, we use spaced-word frequencies as a basis for fast phylogeny reconstruction. Using real-world and simulated sequence data, we demonstrate that our multiple-pattern approach produces better phylogenies than approaches relying on contiguous words. Our program is freely available at http://spaced.gobics.de/. © The Author 2014. Published by Oxford University Press.
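The spaced-word idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it scans each window naively rather than using recursive hashing and bit operations, it handles a single pattern only, and the pattern notation (`1` for a match position, `0` for a don't-care position), the function names, and the choice of Euclidean distance between count profiles are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def spaced_word_counts(seq, pattern):
    """Count spaced words in seq.

    pattern is a string of '1' (match position, kept) and
    '0' (don't-care position, skipped). Hypothetical notation.
    """
    keep = [i for i, c in enumerate(pattern) if c == "1"]
    span = len(pattern)
    counts = Counter()
    # Slide a window of the pattern's length over the sequence and
    # keep only the characters at match positions.
    for start in range(len(seq) - span + 1):
        window = seq[start:start + span]
        counts["".join(window[i] for i in keep)] += 1
    return counts


def euclidean_distance(a, b):
    """Euclidean distance between two count profiles (an assumed choice)."""
    keys = set(a) | set(b)
    return sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


if __name__ == "__main__":
    p = "101"
    c1 = spaced_word_counts("ACGTACGT", p)
    c2 = spaced_word_counts("ACGTTCGT", p)
    print(dict(c1))
    print(euclidean_distance(c1, c2))
```

With the pattern `101`, the window `ACG` contributes the spaced word `AG`, so a substitution at the middle position does not change the count profile. A pairwise distance matrix built this way could then be passed to any distance-based tree-building method.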
Bào, Yīmíng; Kuhn, Jens H
2018-01-01
During the last decade, genome sequence-based classification of viruses has become increasingly prominent. Viruses can be even classified based on coding-complete genome sequence data alone. Nevertheless, classification remains arduous as experts are required to establish phylogenetic trees to depict the evolutionary relationships of such sequences for preliminary taxonomic placement. Pairwise sequence comparison (PASC) of genomes is one of several novel methods for establishing relationships among viruses. This method, provided by the US National Center for Biotechnology Information as an open-access tool, circumvents phylogenetics, and yet PASC results are often in agreement with those of phylogenetic analyses. Computationally inexpensive, PASC can be easily performed by non-taxonomists. Here we describe how to use the PASC tool for the preliminary classification of novel viral hemorrhagic fever-causing viruses.
Insights from Human/Mouse genome comparisons
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pennacchio, Len A.
2003-03-30
Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestry of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.
NASA Astrophysics Data System (ADS)
Sun, S. M.; Slightom, J. L.; Hall, T. C.
1981-01-01
A plant gene coding for the major storage protein (phaseolin, G1-globulin) of the French bean was isolated from a genomic library constructed in the phage vector Charon 24A. Comparison of the nucleotide sequence of part of the gene with that of the cloned messenger RNA (cDNA) revealed the presence of three intervening sequences, all beginning with GT and ending with AG. The 5' and 3' boundaries of intervening sequences IVS-A (88 base pairs) and IVS-B (124 base pairs) are similar to those described for animal and viral genes, but the 3' boundary of IVS-C (129 base pairs) shows some differences. A sequence of 185 amino acids deduced from the cloned DNAs represents about 40% of a phaseolin polypeptide.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes
Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim
2010-01-01
Motivation: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith–Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid™, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. Availability: The database can be accessed through http://proteinworlddb.org Contact: otto@fiocruz.br PMID:20089515
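For readers unfamiliar with the Smith–Waterman algorithm mentioned in the abstract above, the following Python sketch computes a local alignment score by dynamic programming. It is a simplified illustration under stated assumptions: a fixed match/mismatch score and a linear gap penalty are used, whereas protein comparisons normally rely on a substitution matrix and often on affine gap penalties; the parameter values and function name are hypothetical and are not taken from the ProteinWorldDB pipeline.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring parameters are illustrative assumptions, not the values
    used by any particular tool.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTXX", "ACGT"))
```

The running time is proportional to the product of the two sequence lengths, which is why an all-against-all comparison of millions of sequences required a distributed computing grid.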
NASA Technical Reports Server (NTRS)
Schneider, F.
1999-01-01
UML use cases conceptually identify function points or major requirements that a software system must satisfy. Sequence diagrams expand each use case to show in temporal sequence a more detailed notion of intended system behavior.
USDA-ARS?s Scientific Manuscript database
Coat protein sequences of 33 Potyvirus isolates from legume and Passiflora spp. were sequenced to determine the identity of infecting viruses. Phylogenetic analysis of the sequences revealed the presence of seven distinct virus species....
USDA-ARS?s Scientific Manuscript database
Single nucleotide polymorphism was employed in the construction of a high-resolution, expressed sequence tag (EST) map of Aegilops tauschii, the diploid source of the wheat D genome. Comparison of the map with the rice and sorghum genome sequences revealed 50 inversions and translocations; 2, 8, and...
Current challenges in genome annotation through structural biology and bioinformatics.
Furnham, Nicholas; de Beer, Tjaart A P; Thornton, Janet M
2012-10-01
With the huge volume of genomic sequences being generated from high-throughput sequencing projects, the requirement for providing accurate and detailed annotations of gene products has never been greater. It is proving to be a huge challenge for computational biologists to use as much information as possible from experimental data to provide annotations for genome data of unknown function. A central component to this process is to use experimentally determined structures, which provide a means to detect homology that is not discernible from just the sequence and permit the consequences of genomic variation to be realized at the molecular level. In particular, structures also form the basis of many bioinformatics methods for improving the detailed functional annotations of enzymes in combination with similarities in sequence and chemistry. Copyright © 2012. Published by Elsevier Ltd.
Hanke, Dennis; Pohlmann, Anne; Sauter-Louis, Carola; Höper, Dirk; Stadler, Julia; Ritzmann, Mathias; Steinrigl, Adi; Schwarz, Bernd-Andreas; Akimkin, Valerij; Fux, Robert; Blome, Sandra; Beer, Martin
2017-07-06
Porcine epidemic diarrhea (PED) is an acute and highly contagious enteric disease of swine caused by the eponymous virus (PEDV), which belongs to the genus Alphacoronavirus within the Coronaviridae virus family. Following the disastrous outbreaks in Asia and the United States, PEDV has also been detected in Europe. In order to better understand the overall situation, the molecular epidemiology, and factors that might influence the highly variable disease impact, 40 samples from swine feces were collected from different PED outbreaks in Germany and other European countries and sequenced by shot-gun next-generation sequencing. A total of 38 new PEDV complete coding sequences were generated. When compared on a global scale, all investigated sequences from Central and South-Eastern Europe formed a rather homogeneous PEDV S INDEL cluster, suggesting a recent re-introduction. However, in-detail analyses revealed two new clusters and putative ancestor strains. Based on the available background data, correlations between clusters and location, farm type or clinical presentation could not be established. Additionally, the impact of secondary infections was explored using the metagenomic data sets. While several coinfections were observed, no correlation was found with disease courses. However, in addition to the PEDV genomes, ten complete viral coding sequences from nine different data sets were reconstructed, each representing new virus strains. In detail, three pasivirus A strains, two astroviruses, a porcine sapelovirus, a kobuvirus, a porcine torovirus, a posavirus, and an enterobacteria phage were almost fully sequenced.
Middleton, Christopher P.; Senerchia, Natacha; Stein, Nils; Akhunov, Eduard D.; Keller, Beat
2014-01-01
Using Roche/454 technology, we sequenced the chloroplast genomes of 12 Triticeae species, including bread wheat, barley and rye, as well as the diploid progenitors and relatives of bread wheat Triticum urartu, Aegilops speltoides and Ae. tauschii. Two wild tetraploid taxa, Ae. cylindrica and Ae. geniculata, were also included. Additionally, we incorporated wild Einkorn wheat Triticum boeoticum and its domesticated form T. monococcum and two Hordeum spontaneum (wild barley) genotypes. Chloroplast genomes were used for overall sequence comparison, phylogenetic analysis and dating of divergence times. We estimate that barley diverged from rye and wheat approximately 8–9 million years ago (MYA). The genome donors of hexaploid wheat diverged between 2.1–2.9 MYA, while rye diverged from Triticum aestivum approximately 3–4 MYA, more recently than previously estimated. Interestingly, the A genome taxa T. boeoticum and T. urartu were estimated to have diverged approximately 570,000 years ago. As these two have a reproductive barrier, the divergence time estimate also provides an upper limit for the time required for the formation of a species boundary between the two. Furthermore, we conclusively show that the chloroplast genome of hexaploid wheat was contributed by the B genome donor and that this unknown species diverged from Ae. speltoides about 980,000 years ago. Additionally, sequence alignments identified a translocation of a chloroplast segment to the nuclear genome which is specific to the rye/wheat lineage. We propose the presented phylogeny and divergence time estimates as a reference framework for future studies on Triticeae. PMID:24614886
Sahl, Jason W; Johnson, J Kristie; Harris, Anthony D; Phillippy, Adam M; Hsiao, William W; Thom, Kerri A; Rasko, David A
2011-06-04
Acinetobacter baumannii has recently emerged as a significant global pathogen, with a surprisingly rapid acquisition of antibiotic resistance and spread within hospitals and health care institutions. This study examines the genomic content of three A. baumannii strains isolated from distinct body sites. Isolates from blood, peri-anal, and wound sources were examined in an attempt to identify genetic features that could be correlated to each isolation source. Pulsed-field gel electrophoresis, multi-locus sequence typing and antibiotic resistance profiles demonstrated genotypic and phenotypic variation. Each isolate was sequenced to high-quality draft status, which allowed for comparative genomic analyses with existing A. baumannii genomes. A high resolution, whole genome alignment method detailed the phylogenetic relationships of sequenced A. baumannii and found no correlation between phylogeny and body site of isolation. This method identified genomic regions unique to both those isolates found on the surface of the skin or in wounds, termed colonization isolates, and those identified from body fluids, termed invasive isolates; these regions may play a role in the pathogenesis and spread of this important pathogen. A PCR-based screen of 74 A. baumannii isolates demonstrated that these unique genes are not exclusive to either phenotype or isolation source; however, a conserved genomic region exclusive to all sequenced A. baumannii was identified and verified. The results of the comparative genome analysis and PCR assay show that A. baumannii is a diverse and genomically variable pathogen that appears to have the potential to cause a range of human disease regardless of the isolation source.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ling, Jiqiang; Peterson, Kaitlyn M.; Simonovic, Ivana
2014-03-12
Aminoacyl-tRNA synthetases (aaRSs) ensure faithful translation of mRNA into protein by coupling an amino acid to a set of tRNAs with conserved anticodon sequences. Here, we show that in mitochondria of Saccharomyces cerevisiae, a single aaRS (MST1) recognizes and aminoacylates two natural tRNAs that contain anticodon loops of different size and sequence. Besides a regular ?? with a threonine (Thr) anticodon, MST1 also recognizes an unusual ??, which contains an enlarged anticodon loop and an anticodon triplet that reassigns the CUN codons from leucine to threonine. Our data show that MST1 recognizes the anticodon loop in both tRNAs, but employs distinct recognition mechanisms. The size but not the sequence of the anticodon loop is critical for ?? recognition, whereas the anticodon sequence is essential for aminoacylation of ??. The crystal structure of MST1 reveals that, while lacking the N-terminal editing domain, the enzyme closely resembles the bacterial threonyl-tRNA synthetase (ThrRS). A detailed structural comparison with Escherichia coli ThrRS, which is unable to aminoacylate ??, reveals differences in the anticodon-binding domain that probably allow recognition of the distinct anticodon loops. Finally, our mutational and modeling analyses identify the structural elements in MST1 (e.g., helix α11) that define tRNA selectivity. Thus, MST1 exemplifies that a single aaRS can recognize completely divergent anticodon loops of natural isoacceptor tRNAs and that in doing so it facilitates the reassignment of the genetic code in yeast mitochondria.
Image based performance analysis of thermal imagers
NASA Astrophysics Data System (ADS)
Wegner, D.; Repasi, E.
2016-05-01
Due to advances in technology, modern thermal imagers resemble sophisticated image processing systems in functionality. Advanced signal and image processing tools built into the camera body extend the basic image capturing capability of thermal cameras. This is done to enhance the display presentation of the captured scene or of specific scene details. Usually, the implemented methods are proprietary company expertise, distributed without extensive documentation. This makes the comparison of thermal imagers, especially from different companies, a difficult task (or at least a very time-consuming and expensive one, e.g. requiring the execution of a field trial and/or an observer trial). A thermal camera equipped with turbulence mitigation capability is an example of such a closed system. The Fraunhofer IOSB has started to build up a system for testing thermal imagers by image-based methods in the lab environment. This will extend our capability of measuring the classical IR-system parameters (e.g. MTF, MTDP, etc.) in the lab. The system is set up around the IR scene projector, which is necessary for the thermal display (projection) of an image sequence for the IR camera under test. The same set of thermal test sequences can be presented to every unit under test. For turbulence mitigation tests, this could be, e.g., the same turbulence sequence. During system tests, gradual variation of input parameters (e.g. thermal contrast) can be applied. First ideas on test scene selection and on how to assemble an imaging suite (a set of image sequences) for the analysis of thermal imaging systems containing such black boxes in the image-forming path are discussed.
Sotelo, Elena; Fernández-Pinero, Jovita; Llorente, Francisco; Vázquez, Ana; Moreno, Ana; Agüero, Montserrat; Cordioli, Paolo; Tenorio, Antonio; Jiménez-Clavero, Miguel Ángel
2011-11-01
In recent years, West Nile virus (WNV) has re-emerged in the Western Mediterranean region. As a result, the number of complete WNV genome sequences available from this region has increased, allowing more detailed phylogenetic analyses, which may help to understand the evolutionary history of WNV circulating in the Western Mediterranean. To this aim, the present work describes six new complete WNV sequences from recent outbreaks and surveillance in Italy in 2008-2009 and in Spain in 2008 and 2010. Comparison with other sequences from different WNV clusters within lineage 1 (clade 1a) confirmed that all Western Mediterranean WNV isolates obtained since 1996 (except one from Tunisia, collected in 1997) cluster in a single monophyletic group (here called 'WMed' subtype). The analysis differentiated two subgroups within this subtype, which appear to have evolved from earlier WMed strains, suggesting a single introduction in the area, and further dissemination and evolution. Close similarities between WNV variants circulating in consecutive years, one in Spain, between 2007 and 2008, and another in Italy between 2008 and 2009, suggest that the virus possibly overwinters in Western Mediterranean sites. The NS3(249)-proline genotype, recently proposed as a virulence determinant for WNV, has arisen independently at least twice in the area. Overall, these results indicate that the frequent recurrence of outbreaks caused by phylogenetically homogeneous WNV in the Western Mediterranean since 1996 is consistent with a single introduction followed by viral persistence in endemic foci in the area, rather than resulting from independent introductions from exogenous endemic foci.
Characterization of the microflora of the human axilla.
Taylor, D; Daulby, A; Grimshaw, S; James, G; Mercer, J; Vaziri, S
2003-06-01
It is widely accepted that axillary malodour is attributable to the microbial biotransformation of odourless, natural secretions into volatile odorous products. Consequently, there is a need to understand the microbial ecology of the axilla in order that deodorant products, which control microbial action in this region, can be developed in the appropriate manner. A detailed characterization of the axillary microflora of a group of human volunteers has been performed. The axillary microflora is composed of four principal groups of bacteria (staphylococci, aerobic coryneforms, micrococci and propionibacteria), and the yeast genus Malassezia. Results indicated that the axillary microflora was dominated by either staphylococcal or aerobic coryneform species. Comparisons between axillary bacterial numbers and levels of axillary odour demonstrated the greatest association between odour levels and the presence of aerobic coryneforms in the under-arm. As the taxonomy of cutaneous aerobic coryneforms is poorly understood, a further study was conducted to characterize selected axillary aerobic coryneform isolates. Using the molecular technique of 16S rDNA sequencing, selected genomic sequences of a number of axillary aerobic coryneform isolates were obtained. Comparisons with sequence databases indicated the likely presence of a range of Corynebacterium species on axillary skin, although the majority of isolates were most similar to either Corynebacterium G-2 CDC G5840 or C. mucifaciens DMMZ 2278. Although for a panel of individuals differences in the carriage of Corynebacterium species were noted, similar species were carried by a number of panellists. All isolates examined in this limited evaluation failed to demonstrate the capability to metabolize long-chain fatty acids (LCFAs) to shorter chain, more volatile products. 
The application of this modern molecular phylogenetic technique has increased understanding of the diversity of aerobic coryneform carriage in the axilla, and on human skin. The application of this technique in other studies to assess the ethnic differences in cutaneous bacterial ecology, or the effects on the microflora of specific product use, will assist in the future development of novel deodorant systems.
Ghosh, Pritha; Sowdhamini, Ramanathan
2017-08-24
Pathogenic bacteria have evolved various strategies to counteract host defences. They are also exposed to environments that are undergoing constant changes. Hence, in order to survive, bacteria must adapt themselves to the changing environmental conditions by performing regulations at the transcriptional and/or post-transcriptional levels. Roles of RNA-binding proteins (RBPs) as virulence factors have been very well studied. Here, we have used a sequence search-based method to compare and contrast the proteomes of 16 pathogenic and three non-pathogenic E. coli strains as well as to obtain a global picture of the RBP landscape (RBPome) in E. coli. Our results show that there are no significant differences in the percentage of RBPs encoded by the pathogenic and the non-pathogenic E. coli strains. The differences in the types of Pfam domains as well as Pfam RNA-binding domains, encoded by these two classes of E. coli strains, are also insignificant. The complete and distinct RBPome of E. coli has been established by studying all E. coli strains known to date. We have also identified RBPs that are exclusive to pathogenic strains, and most of them can be exploited as drug targets since they appear to be non-homologous to their human host proteins. Many of these pathogen-specific proteins were uncharacterised and their identities could be resolved on the basis of sequence homology searches with known proteins. Detailed structural modelling, molecular dynamics simulations and sequence comparisons have been pursued for selected examples to understand differences in stability and RNA-binding. The approach used in this paper to cross-compare proteomes of pathogenic and non-pathogenic strains may also be extended to other bacterial or even eukaryotic proteomes to understand interesting differences in their RBPomes. The pathogen-specific RBPs reported in this study may also be taken up further for clinical trials and/or experimental validations.
An improved model for whole genome phylogenetic analysis by Fourier transform.
Yin, Changchuan; Yau, Stephen S-T
2015-10-07
DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor and the 2D mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra have the same dimensionality in the Fourier frequency space, and the Euclidean distances of full Fourier power spectra of the DNA sequences are then used as the dissimilarity metrics. The improved DFT method, with increased computational performance from the 2D numerical representation, is applicable to DNA sequences of any length range. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. 
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
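The general pipeline described in the abstract above (numerical mapping, DFT power spectrum, scaling to a common length, Euclidean distance) can be sketched in Python as follows. This is only an approximation of the idea, under several labeled assumptions: the abstract does not specify the 2D mapping, so the sketch substitutes four binary indicator sequences (one per nucleotide); the scaling step is plain linear interpolation rather than the authors' even scaling algorithm; the DFT is a direct O(n²) computation suitable for short sequences only; and all function names are hypothetical.

```python
import cmath
from math import floor, sqrt


def power_spectrum(signal):
    """Power spectrum |X(k)|^2 of a real signal via a direct O(n^2) DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def dna_power_spectrum(seq):
    """Sum of the power spectra of the four binary indicator sequences.

    Assumption: indicator mapping stands in for the 2D mapping,
    which the abstract does not define.
    """
    spectra = [power_spectrum([1.0 if c == base else 0.0 for c in seq])
               for base in "ACGT"]
    return [sum(vals) for vals in zip(*spectra)]


def stretch(spectrum, m):
    """Linearly interpolate a non-empty spectrum of length n to length m >= n."""
    n = len(spectrum)
    if m == n:
        return list(spectrum)
    out = []
    for k in range(m):
        q = k * (n - 1) / (m - 1)  # fractional index into the original
        r = floor(q)
        if r >= n - 1:
            out.append(spectrum[n - 1])
        else:
            out.append(spectrum[r] + (q - r) * (spectrum[r + 1] - spectrum[r]))
    return out


def dft_distance(seq1, seq2):
    """Euclidean distance between length-matched power spectra."""
    p1, p2 = dna_power_spectrum(seq1), dna_power_spectrum(seq2)
    m = max(len(p1), len(p2))
    s1, s2 = stretch(p1, m), stretch(p2, m)
    return sqrt(sum((x - y) ** 2 for x, y in zip(s1, s2)))


if __name__ == "__main__":
    print(dft_distance("ACGTACGTAC", "ACGTTCGT"))
```

For genome-scale input, the direct DFT would have to be replaced by a fast Fourier transform from a numerical library; the resulting distance matrix could then be fed to hierarchical clustering as the abstract describes.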
USDA-ARS?s Scientific Manuscript database
Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylat...
2015-11-16
detailed discussion of barcode designs in Supplementary Note 1, Supplementary Fig. 1 and sequences in Supplementary Note 2). Whereas the nicking and...eight subpools, each as a one- or as a two-barcode version (design details in Supplementary Note 1). All subpools amplified strands with the expected...for the c2ca designs. We used the same restriction enzymes (Nb.BsrDI and Nt.BspQI) that were encoded between the primers and the target sequences to
ERIC Educational Resources Information Center
Du, Wenchong; Kelly, Steve W.
2013-01-01
The present study examines implicit sequence learning in adult dyslexics with a focus on comparing sequence transitions with different statistical complexities. Learning of a 12-item deterministic sequence was assessed in 12 dyslexic and 12 non-dyslexic university students. Both groups showed equivalent standard reaction time increments when the…
Performance comparison of leading image codecs: H.264/AVC Intra, JPEG2000, and Microsoft HD Photo
NASA Astrophysics Data System (ADS)
Tran, Trac D.; Liu, Lijie; Topiwala, Pankaj
2007-09-01
This paper provides a detailed rate-distortion performance comparison between JPEG2000, Microsoft HD Photo, and H.264/AVC High Profile 4:4:4 I-frame coding for high-resolution still images and high-definition (HD) 1080p video sequences. This work is an extension of our previous comparative study published in previous SPIE conferences [1, 2]. Here we further optimize all three codecs for compression performance. Coding simulations are performed on a set of large-format color images captured from mainstream digital cameras and 1080p HD video sequences commonly used for H.264/AVC standardization work. Overall, our experimental results show that all three codecs offer very similar coding performance at the high-quality, high-resolution setting. Differences tend to be data-dependent: JPEG2000 with the wavelet technology tends to be the best performer with smooth spatial data; H.264/AVC High-Profile with advanced spatial prediction modes tends to cope best with more complex visual content; Microsoft HD Photo tends to be the most consistent across the board. For the still-image data sets, JPEG2000 offers the best R-D performance gains (around 0.2 to 1 dB in peak signal-to-noise ratio) over H.264/AVC High-Profile intra coding and Microsoft HD Photo. For the 1080p video data set, all three codecs offer very similar coding performance. As in [1, 2], we consider neither scalability nor complexity in this study (JPEG2000 is operating in non-scalable, but optimal performance mode).
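The gains quoted above are expressed in peak signal-to-noise ratio (PSNR). As a reminder of how that figure is usually computed, here is a minimal sketch using the standard definition PSNR = 10 log10(MAX^2 / MSE). The function name and the flat-list pixel representation are assumptions made for the example; the abstract does not describe the authors' measurement tooling.

```python
import math


def psnr(reference, decoded, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    ASSUMPTION: samples are given as flat sequences of numbers (e.g. 8-bit
    luma values); max_value is the largest representable sample value.
    """
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)
```

On this scale a difference of 0.2 to 1 dB corresponds to a reduction in mean squared error of roughly 5 to 20 percent at the same bit rate.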
Whistle sequences in wild killer whales (Orcinus orca).
Riesch, Rüdiger; Ford, John K B; Thomsen, Frank
2008-09-01
Combining different stereotyped vocal signals into specific sequences increases the range of information that can be transferred between individuals. The temporal emission pattern and the behavioral context of vocal sequences have been described in detail for a variety of birds and mammals. Yet, in cetaceans, the study of vocal sequences is just in its infancy. Here, we provide a detailed analysis of sequences of stereotyped whistles in killer whales off Vancouver Island, British Columbia. A total of 1140 whistle transitions in 192 whistle sequences recorded from resident killer whales were analyzed using common spectrographic analysis techniques. In addition to the stereotyped whistles described by Riesch et al. [(2006). "Stability and group specificity of stereotyped whistles in resident killer whales, Orcinus orca, off British Columbia," Anim. Behav. 71, 79-91], we found a new and rare stereotyped whistle (W7) as well as two whistle elements, which are closely linked to whistle sequences: (1) stammers and (2) bridge elements. Furthermore, the frequency of occurrence of 12 different stereotyped whistle types within the sequences was not randomly distributed and the transition patterns between whistles were also nonrandom. Finally, whistle sequences were closely tied to close-range behavioral interactions (in particular among males). Hence, we conclude that whistle sequences in wild killer whales are complex signal series and propose that they are most likely emitted by single individuals.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Feild, M.J.; Armstrong, F.B.
1987-05-01
E. coli JA199 pDU11 harbors a multicopy plasmid containing the ilv GEDAY gene cluster of S. typhimurium. TmB, gene product of ilv E, was purified, crystallized, and subjected to Edman degradation using a gas phase sequencer. The intact protein yielded an amino terminal 31 residue sequence. Both carboxymethylated apoenzyme and (³H)-NaBH-reduced holoenzyme were then subjected to digestion by trypsin. The digests were fractionated using reversed phase HPLC, and the peptides isolated were sequenced. The borohydride-treated holoenzyme was used to isolate the cofactor-binding peptide. The peptide is 27 residues long and a comparison with known sequences of other aminotransferases revealed limited homology. Peptides accounting for 211 of 288 predicted residues have been sequenced, including 9 residues of the carboxyl terminus. Comparison of peptides with the inferred amino acid sequence of the E. coli K-12 enzyme has helped determine the sequence of the amino terminal 59 residues; only two differences between the sequences are noted in this region.
Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L
1988-01-01
Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
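The third-position G/C bias reported here is a simple statistic over the codons of an open reading frame. A minimal sketch of how such a figure can be computed follows. The function name is hypothetical and the toy sequence in the usage note is not from the paper.

```python
def gc3_fraction(orf):
    """Fraction of codons whose third base is G or C.

    ASSUMPTION: `orf` is an in-frame coding sequence whose length is a
    multiple of three; ambiguous bases simply count as non-G/C.
    """
    orf = orf.upper()
    if len(orf) == 0 or len(orf) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    third_bases = orf[2::3]  # every third character, starting at index 2
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

For the toy input `"ATGGCCTAA"` the third bases are G, C and A, so the function returns 2/3. A value of 0.931 would correspond to the 93.1% quoted in the abstract.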
Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine
2009-01-01
Clustered regularly interspaced short palindromic repeats (CRISPRs) are DNA sequences composed of a succession of repeats (23- to 47-bp long) separated by unique sequences called spacers. Polymorphism can be observed in different strains of a species and may be used for genotyping. We describe protocols and bioinformatics tools that allow the identification of CRISPRs from sequenced genomes, their comparison, and their component determination (the direct repeats and the spacers). A schematic representation of the spacer organization can be produced, allowing an easy comparison between strains.
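The repeat/spacer decomposition and strain comparison mentioned above can be illustrated with a toy sketch. It assumes the direct repeat is already known and occurs as an exact string, which is a simplification: real repeats can carry point differences, and the tools described in this record are not reproduced here. All names and sequences are hypothetical.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacers lying between consecutive exact copies of `repeat`.

    ASSUMPTION: the repeat matches exactly; sequence before the first repeat
    (leader) and after the last repeat is discarded.
    """
    parts = array_seq.split(repeat)
    return parts[1:-1]


def shared_spacers(spacers_a, spacers_b):
    """Spacers of strain A (in array order) that also occur in strain B."""
    present_in_b = set(spacers_b)
    return [s for s in spacers_a if s in present_in_b]
```

Listing each strain's spacers in array order and marking the shared ones is one simple way to obtain the kind of schematic comparison between strains that the abstract refers to.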
Liu, Hanmei; Wang, Xuewen; Wei, Bin; Wang, Yongbin; Liu, Yinghong; Zhang, Junjie; Hu, Yufeng; Yu, Guowu; Li, Jian; Xu, Zhanbin; Huang, Yubi
2016-01-01
In southwest China, some maize landraces have long been isolated geographically, and have phenotypes that differ from those of widely grown cultivars. These landraces may harbor rich genetic variation responsible for those phenotypes. Four-row Wax is one such landrace, with four rows of kernels on the cob. We resequenced the genome of Four-row Wax, obtaining 50.46 Gb sequence at 21.87× coverage, then identified and characterized 3,252,194 SNPs, 213,181 short InDels (1–5 bp) and 39,631 structural variations (greater than 5 bp). Of those, 312,511 (9.6%) SNPs were novel compared to the most detailed haplotype map (HapMap) SNP database of maize. Characterization of variations in reported kernel row number (KRN) related genes and KRN QTL regions revealed potential causal mutations in fea2, td1, kn1, and te1. Genome-wide comparisons revealed abundant genetic variations in Four-row Wax, which may be associated with environmental adaptation. The sequence and SNP variations described here enrich genetic resources of maize, and provide guidance into study of seed numbers for crop yield improvement. PMID:27242868
NASA Technical Reports Server (NTRS)
Roddy, D. J.; Ullrich, G. W.; Sauer, F. M.; Jones, G. H. S.
1977-01-01
Cratering motions and structural deformation are described for the rim of the Prairie Flat multiring crater, 85.5 m across and 5.3 m deep, which was formed by the detonation of a 500-ton TNT surface-tangent sphere. The terminal displacement and motion data are derived from marker cans and velocity gages emplaced in drill holes in a three-dimensional matrix radial to the crater. The integration of this data with a detailed geologic cross section, mapped from deep trench excavations through the rim, provides a composite view of the general sequence of motions that formed a transiently uplifted rim, overturned flap, inverted stratigraphy, downfolded rim, and deformed strata in the crater walls. Preliminary comparisons with laboratory experimental cratering and with numerical simulations indicate that explosion craters of the Prairie Flat-type generated by surface and near-surface energy sources tend to follow predictable motion sequences and produce comparable structural deformation. More specifically, central uplift and multiring impact craters with morphologies and structures comparable to Prairie Flat are inferred to have experienced similar deformational histories of the rim, such as uplift, overturning, terracing, and downfolding.
Spatial distribution of marine airborne bacterial communities
Seifried, Jasmin S; Wichels, Antje; Gerdts, Gunnar
2015-01-01
The spatial distribution of bacterial populations in marine bioaerosol samples was investigated during a cruise from the North Sea to the Baltic Sea via Skagerrak and Kattegat. The analysis of the sampled bacterial communities with a pyrosequencing approach revealed that the most abundant phyla were represented by the Proteobacteria (49.3%), Bacteroidetes (22.9%), Actinobacteria (16.3%), and Firmicutes (8.3%). Cyanobacteria were assigned to 1.5% of all bacterial reads. A core of 37 bacterial OTUs made up more than 75% of all bacterial sequences. The most abundant OTU was Sphingomonas sp. which comprised 17% of all bacterial sequences. The most abundant bacterial genera were attributed to distinctly different areas of origin, suggesting highly heterogeneous sources for bioaerosols of marine and coastal environments. Furthermore, the bacterial community was clearly affected by two environmental parameters – temperature as a function of wind direction and the sampling location itself. However, a comparison of the wind directions during the sampling and calculated backward trajectories underlined the need for more detailed information on environmental parameters for bioaerosol investigations. The current findings support the assumption of a bacterial core community in the atmosphere. They may be emitted from strong aerosolizing sources, probably being mixed and dispersed over long distances. PMID:25800495
Uronic polysaccharide degrading enzymes.
Garron, Marie-Line; Cygler, Miroslaw
2014-10-01
In the past several years progress has been made in the field of structure and function of polysaccharide lyases (PLs). The number of classified polysaccharide lyase families has increased to 23 and more detailed analysis has allowed the identification of more closely related subfamilies, leading to stronger correlation between each subfamily and a unique substrate. The number of as yet unclassified polysaccharide lyases has also increased and we expect that sequencing projects will allow many of these unclassified sequences to emerge as new families. The progress in structural analysis of PLs has led to having at least one representative structure for each of the families and for two unclassified enzymes. The newly determined structures have folds observed previously in other PL families and their catalytic mechanisms follow either metal-assisted or Tyr/His mechanisms characteristic for other PL enzymes. Comparison of PLs with glycoside hydrolases (GHs) shows several folds common to both classes but only for the β-helix fold is there strong indication of divergent evolution from a common ancestor. Analysis of bacterial genomes identified gene clusters containing multiple polysaccharide cleaving enzymes, the Polysaccharides Utilization Loci (PULs), and their gene complement suggests that they are organized to process completely a specific polysaccharide. Copyright © 2014 Elsevier Ltd. All rights reserved.
Schwefel, David; Boucherit, Virginie C; Christodoulou, Evangelos; Walker, Philip A; Stoye, Jonathan P; Bishop, Kate N; Taylor, Ian A
2015-04-08
The SAMHD1 triphosphohydrolase inhibits HIV-1 infection of myeloid and resting T cells by depleting dNTPs. To overcome SAMHD1, HIV-2 and some SIVs encode either of two lineages of the accessory protein Vpx that bind the SAMHD1 N or C terminus and redirect the host cullin-4 ubiquitin ligase to target SAMHD1 for proteasomal degradation. We present the ternary complex of Vpx from SIV that infects mandrills (SIVmnd-2) with the cullin-4 substrate receptor, DCAF1, and N-terminal and SAM domains from mandrill SAMHD1. The structure reveals details of Vpx lineage-specific targeting of SAMHD1 N-terminal "degron" sequences. Comparison with Vpx from SIV that infects sooty mangabeys (SIVsmm) complexed with SAMHD1-DCAF1 identifies molecular determinants directing Vpx lineages to N- or C-terminal SAMHD1 sequences. Inspection of the Vpx-DCAF1 interface also reveals conservation of Vpx with the evolutionally related HIV-1/SIV accessory protein Vpr. These data suggest a unified model for how Vpx and Vpr exploit DCAF1 to promote viral replication. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Natronorubrum sediminis sp. nov., an archaeon isolated from a saline lake.
Gutiérrez, M C; Castillo, A M; Corral, P; Minegishi, H; Ventosa, A
2010-08-01
Two novel haloalkaliphilic archaea, strains CG-6T and CG-4, were isolated from sediment of the hypersaline Lake Chagannor in Inner Mongolia, China. Cells of the two strains were pleomorphic, non-motile and strictly aerobic. They required at least 2.5 M NaCl for growth, with optimum growth at 3.4 M NaCl. They grew at pH 8.0-11.0, with optimum growth at pH 9.0. Hypotonic treatment with less than 1.5 M NaCl caused cell lysis. The two strains had similar polar lipid compositions, possessing C20C20 and C20C25 derivatives of phosphatidylglycerol and phosphatidylglycerol phosphate methyl ester. No glycolipids were detected. Comparison of 16S rRNA gene sequences and morphological features placed them in the genus Natronorubrum. 16S rRNA gene sequence similarities to strains of recognized species of the genus Natronorubrum were 96.2-93.8%. Detailed phenotypic characterization and DNA-DNA hybridization studies revealed that the two strains belong to a novel species in the genus Natronorubrum, for which the name Natronorubrum sediminis sp. nov. is proposed; the type strain is CG-6T (=CECT 7487T =CGMCC 1.8981T =JCM 15982T).
Simulation of Electronic Circular Dichroism of Nucleic Acids: From the Structure to the Spectrum.
Padula, Daniele; Jurinovich, Sandro; Di Bari, Lorenzo; Mennucci, Benedetta
2016-11-14
We present a quantum mechanical (QM) simulation of the electronic circular dichroism (ECD) of nucleic acids (NAs). The simulation combines classical molecular dynamics, to obtain the structure and its temperature-dependent fluctuations, with a QM excitonic model to determine the ECD. The excitonic model takes into account environmental effects through a polarizable embedding and uses a refined approach to calculate the electronic couplings in terms of full transition densities. Three NAs with either similar conformations but different base sequences or similar base sequences but different conformations have been investigated and the results were compared with experimental observations; a good agreement was seen in all cases. A detailed analysis of the nature of the ECD bands in terms of their excitonic composition was also carried out. Finally, a comparison between the QM and the DeVoe models clearly revealed the importance of including fluctuations of the excitonic parameters and of accurately determining the electronic couplings. This study demonstrates the feasibility of the ab initio simulation of the ECD spectra of NAs, that is, without the need of experimental structural or electronic data. © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
DroSpeGe: rapid access database for new Drosophila species genomes.
Gilbert, Donald G
2007-01-01
The Drosophila species comparative genome database DroSpeGe (http://insects.eugenes.org/DroSpeGe/) has provided genome researchers with rapid, usable access to 12 new and old Drosophila genomes since its inception in 2004. Scientists can use, with minimal computing expertise, the wealth of new genome information for developing new insights into insect evolution. New genome assemblies provided by several sequencing centers have been annotated with known model organism gene homologies and gene predictions to provide basic comparative data. TeraGrid supplies the shared cyberinfrastructure for the primary computations. This genome database includes homologies to Drosophila melanogaster and eight other eukaryote model genomes, and gene predictions from several groups. BLAST searches of the newest assemblies are integrated with genome maps. GBrowse maps provide detailed views of cross-species aligned genomes. BioMart provides for data mining of annotations and sequences. Common chromosome maps identify major synteny among species. Potential gain and loss of genes is suggested by Gene Ontology groupings for genes of the new species. Summaries of essential genome statistics include sizes, genes found and predicted, homology among genomes, phylogenetic trees of species and comparisons of several gene predictions for sensitivity and specificity in finding new and known genes.
Structural test of the parameterized-backbone method for protein design.
Plecs, Joseph J; Harbury, Pehr B; Kim, Peter S; Alber, Tom
2004-09-03
Designing new protein folds requires a method for simultaneously optimizing the conformation of the backbone and the side-chains. One approach to this problem is the use of a parameterized backbone, which allows the systematic exploration of families of structures. We report the crystal structure of RH3, a right-handed, three-helix coiled coil that was designed using a parameterized backbone and detailed modeling of core packing. This crystal structure was determined using another rationally designed feature, a metal-binding site that permitted experimental phasing of the X-ray data. RH3 adopted the intended fold, which has not been observed previously in biological proteins. Unanticipated structural asymmetry in the trimer was a principal source of variation within the RH3 structure. The sequence of RH3 differs from that of a previously characterized right-handed tetramer, RH4, at only one position in each 11 amino acid sequence repeat. This close similarity indicates that the design method is sensitive to the core packing interactions that specify the protein structure. Comparison of the structures of RH3 and RH4 indicates that both steric overlap and cavity formation provide strong driving forces for oligomer specificity.
Gupta, Rani; Kumari, Arti; Syal, Poonam; Singh, Yogesh
2015-01-01
Lipases catalyze the hydrolysis of fats at the lipid-water interface and perform a variety of biotransformation reactions under micro-aqueous conditions. The major sources include microbial lipases; among these, yeast and fungal lipases are of special interest because they can carry out various stereoselective reactions. These lipases are highly diverse and are categorized into three classes on the basis of the oxyanion hole: GX, GGGX and Y. The detailed phylogenetic analysis showed that the GX family is more diverse than the GGGX and Y families. Sequence and structural comparisons revealed that lipases are conserved only in the signature sequence region. Their characteristic structural determinants, viz. lid, binding pocket and oxyanion hole, are hotspots for mutagenesis. A few examples are cited in this review to highlight the multidisciplinary approaches for designing novel enzyme variants with improved thermostability and substrate specificity. In addition, we present a brief account of biotechnological applications of lipases. Lipases have also gained attention as virulence factors; therefore, we surveyed the role of lipases in yeast physiology related to colonization, adhesion, biofilm formation and pathogenesis. The new genomic era has opened numerous possibilities to genetically manipulate lipases for food, fuel and pharmaceuticals. Copyright © 2014 Elsevier Ltd. All rights reserved.
DNA barcodes for 1/1000 of the animal kingdom.
Hebert, Paul D N; Dewaard, Jeremy R; Landry, Jean-François
2010-06-23
This study reports DNA barcodes for more than 1300 Lepidoptera species from the eastern half of North America, establishing that 99.3 per cent of these species possess diagnostic barcode sequences. Intraspecific divergences averaged just 0.43 per cent among this assemblage, but most values were lower. The mean was elevated by deep barcode divergences (greater than 2%) in 5.1 per cent of the species, often involving the sympatric occurrence of two barcode clusters. A few of these cases have been analysed in detail, revealing species overlooked by the current taxonomic system. This study also provided a large-scale test of the extent of regional divergence in barcode sequences, indicating that geographical differentiation in the Lepidoptera of eastern North America is small, even when comparisons involve populations as much as 2800 km apart. The present results affirm that a highly effective system for the identification of Lepidoptera in this region can be built with few records per species because of the limited intra-specific variation. As most terrestrial and marine taxa are likely to possess a similar pattern of population structure, an effective DNA-based identification system can be developed with modest effort.
Hahn, Lars; Leimeister, Chris-André; Ounit, Rachid; Lonardi, Stefano; Morgenstern, Burkhard
2016-10-01
Many algorithms for sequence analysis rely on word matching or word statistics. Often, these approaches can be improved if binary patterns representing match and don't-care positions are used as a filter, such that only those positions of words are considered that correspond to the match positions of the patterns. The performance of these approaches, however, depends on the underlying patterns. Herein, we show that the overlap complexity of a pattern set that was introduced by Ilie and Ilie is closely related to the variance of the number of matches between two evolutionarily related sequences with respect to this pattern set. We propose a modified hill-climbing algorithm to optimize pattern sets for database searching, read mapping and alignment-free sequence comparison of nucleic-acid sequences; our implementation of this algorithm is called rasbhari. Depending on the application at hand, rasbhari can either minimize the overlap complexity of pattern sets, maximize their sensitivity in database searching or minimize the variance of the number of pattern-based matches in alignment-free sequence comparison. We show that, for database searching, rasbhari generates pattern sets with slightly higher sensitivity than existing approaches. In our Spaced Words approach to alignment-free sequence comparison, pattern sets calculated with rasbhari led to more accurate estimates of phylogenetic distances than the randomly generated pattern sets that we previously used. Finally, we used rasbhari to generate patterns for short read classification with CLARK-S. Here too, the sensitivity of the results could be improved, compared to the default patterns of the program. We integrated rasbhari into Spaced Words; the source code of rasbhari is freely available at http://rasbhari.gobics.de/.
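The idea of filtering words through a binary pattern of match and don't-care positions can be shown with a short sketch. This is not the rasbhari or Spaced Words implementation. It only illustrates what a single pattern does, with hypothetical function names and a pattern written as a string of `1` (match) and `0` (don't care).

```python
from collections import Counter


def spaced_words(seq, pattern):
    """Count the spaced words of `seq` under a binary pattern such as '1101'.

    Only characters at the pattern's '1' positions are kept from each window.
    """
    match_positions = [i for i, c in enumerate(pattern) if c == "1"]
    return Counter(
        "".join(seq[start + i] for i in match_positions)
        for start in range(len(seq) - len(pattern) + 1)
    )


def shared_spaced_word_pairs(a, b, pattern):
    """Number of window pairs (one from each sequence) with equal spaced words."""
    count_a, count_b = spaced_words(a, pattern), spaced_words(b, pattern)
    return sum(count_a[w] * count_b[w] for w in count_a)
```

With `a = "ACGT"` and `b = "ATGT"`, the contiguous pattern `"111"` finds no shared word, whereas `"101"` finds one, because the substitution falls on the don't-care position. How many such matches occur, and how much that number varies, depends on the chosen pattern set, which is the quantity the abstract says the optimisation targets.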
Complete genome sequence of the plant pathogen Erwinia amylovora strain ATCC 49946
USDA-ARS's Scientific Manuscript database
Erwinia amylovora causes the economically important disease fire blight that affects rosaceous plants, especially pear and apple. Here we report the complete genome sequence and annotation of strain ATCC 49946. The analysis of the sequence and its comparison with sequenced genomes of closely related...
NASA Astrophysics Data System (ADS)
Benkler, Erik; Telle, Harald R.
2007-06-01
An improved phase-locked loop (PLL) for versatile synchronization of a sampling pulse train to an optical data stream is presented. It enables optical sampling of the true waveform of repetitive high bit-rate optical time division multiplexed (OTDM) data words such as pseudorandom bit sequences. Visualization of the true waveform can reveal details, which cause systematic bit errors. Such errors cannot be inferred from eye diagrams and require word-synchronous sampling. The programmable direct-digital-synthesis circuit used in our novel PLL approach allows flexible adaption of virtually any problem-specific synchronization scenario, including those required for waveform sampling, for jitter measurements by slope detection, and for classical eye-diagrams. Phase comparison of the PLL is performed at 10-GHz OTDM base clock rate, leading to a residual synchronization jitter of less than 70 fs.
Centaur propellant acquisition system study
NASA Technical Reports Server (NTRS)
Blatt, M. H.; Walter, M. D.
1975-01-01
A study was performed to determine the desirability of replacing the hydrogen peroxide settling system on the Centaur D-1S with a capillary acquisition system. A comprehensive screening was performed to select the most promising capillary device fluid acquisition, thermal conditioning, and fabrication techniques. Refillable start baskets and bypass feed start tanks were selected for detailed design. Critical analysis areas were settling and refilling, start sequence development with an initially dry boost pump, and cooling the fluid delivered to the boost pump in order to provide the necessary net positive suction head (NPSH). Design drawings were prepared for the start basket and start tank concepts for both LO2 and LH2 tanks. System comparisons indicated that the start baskets using wicking for thermal conditioning, and thermal subcooling for boost pump NPSH, are the most desirable systems for future development.
Centaur propellant acquisition system
NASA Technical Reports Server (NTRS)
Blatt, M. H.; Aydelott, J. C.
1975-01-01
The desirability of replacing the hydrogen peroxide settling system of the Centaur D-1S with a capillary acquisition system was evaluated. A comprehensive screening was performed to select the most promising capillary device fluid acquisition, thermal conditioning, and fabrication techniques. Refillable start baskets and bypass feed start tanks were selected for detailed design. Critical analysis areas were settling and refilling, start sequence development with an initially dry boost pump, and cooling the fluid delivered to the boost pump to provide the necessary net positive suction head (NPSH). Design drawings were prepared for start basket and start tank concepts for both the liquid oxygen and liquid hydrogen tanks. System comparisons indicated that the start baskets using wicking flow for thermal conditioning, and thermal subcooling for providing boost pump NPSH, are the most desirable systems for future Centaur acquisition system development.
NASA Astrophysics Data System (ADS)
Parvizpour, Sepideh; Razmara, Jafar; Ramli, Aizi Nor Mazila; Md Illias, Rosli; Shamsir, Mohd Shahir
2014-06-01
The structure of a novel psychrophilic β-mannanase enzyme from Glaciozyma antarctica PI12 yeast has been modelled and analysed in detail. To our knowledge, this is the first attempt to model a psychrophilic β-mannanase from yeast. To this end, a 3D structure of the enzyme was first predicted using a threading method because of the low sequence identity (<30 %) using MODELLER9v12 and simulated using GROMACS at varying low temperatures for structure refinement. Comparisons with mesophilic and thermophilic mannanases revealed that the psychrophilic mannanase contains longer loops and shorter helices, increases in the number of aromatic and hydrophobic residues, reductions in the number of hydrogen bonds and salt bridges and numerous amino acid substitutions on the surface that increased the flexibility and its efficiency for catalytic reactions at low temperatures.
[Description of the ISO 9001/2000 certification process in the parenteral nutrition area].
Miana Mena, M T; Fontanals Martínez, S; López Púa, Y; López Suñé, E; Codina Jané, C; Ribas Sala, J
2007-01-01
In order to guarantee quality and safety and to increase user satisfaction, healthcare organisations have integrated quality management systems into their structures. This study describes the process for introducing the UNE-EN-ISO-9001/2000 standard in the parenteral nutrition area. A multidisciplinary group established the scope of the standard, focusing on transcription, preparation, dispensation and microbiological control. A detailed procedure describing the sequences of circuits and associated activities, the responsible staff and the action guidelines to be followed was established. Quality and activity markers were also established. This process has enabled a standard system to be implemented, with its operation perfectly described and documented, allowing its stages to be traceable and supervised. As there is no record of the data obtained beforehand, no direct comparison can be made; its evolution must therefore be analysed in the future.
AlignMe—a membrane protein sequence alignment web server
Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.
2014-01-01
We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
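The first step of the HP mode, converting a multiple sequence alignment into a family-averaged hydrophobicity profile, can be sketched as below. This is an illustration only and not AlignMe's code: the Kyte-Doolittle scale is chosen here as an assumption (the abstract does not say which scale the server uses), gaps are simply skipped, no window smoothing is applied, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (ASSUMPTION: scale chosen for illustration).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def averaged_hydropathy(alignment):
    """Per-column mean hydropathy over the rows of an alignment.

    `alignment` is a list of equal-length aligned sequences with '-' for gaps.
    Columns containing no scorable residue yield None.
    """
    if not alignment or len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment rows must be non-empty and of equal length")
    profile = []
    for column in zip(*alignment):
        values = [KYTE_DOOLITTLE[aa] for aa in column if aa in KYTE_DOOLITTLE]
        profile.append(sum(values) / len(values) if values else None)
    return profile
```

Two such profiles, one per input alignment, would then be aligned against each other. That second step is the server's own procedure and is not reproduced here.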
Niu, Zhitao; Pan, Jiajia; Zhu, Shuying; Li, Ludan; Xue, Qingyun; Liu, Wei; Ding, Xiaoyu
2017-01-01
Apostasioideae consists of only two genera, Apostasia and Neuwiedia, which are mainly distributed in Southeast Asia and northern Australia. The floral structure, taxonomy, biogeography, and genome variation of Apostasioideae have been intensively studied. However, detailed analyses of plastome composition and structure and comparisons with those of other orchid subfamilies have not yet been conducted. Here, the complete plastome sequences of Apostasia wallichii and Neuwiedia singapureana were sequenced and compared with 43 previously published photosynthetic orchid plastomes to characterize the plastome structure and evolution in the orchids. Unlike many orchid plastomes (e.g., Paphiopedilum and Vanilla), the plastomes of Apostasioideae contain a full set of 11 functional NADH dehydrogenase (ndh) genes. The distribution of repeat sequences and simple sequence repeat elements enhanced the view that the mutation rate of non-coding regions was higher than that of coding regions. The 10 loci (ndhA intron, matK-5′trnK, clpP-psbB, rps8-rpl14, trnT-trnL, 3′trnK-matK, clpP intron, psbK-trnK, trnS-psbC, and ndhF-rpl32) that had the highest degrees of sequence variability were identified as mutational hotspots for the Apostasia plastome. Furthermore, our results revealed that plastid genes exhibited a variable evolution rate within and among different orchid genera. Considering the diversified evolution of both coding and non-coding regions, we suggested that the plastome-wide evolution of orchid species was disproportional. Additionally, the sequences flanking the inverted repeat/small single copy (IR/SSC) junctions of photosynthetic orchid plastomes were categorized into three types according to the presence/absence of ndh genes. Different evolutionary dynamics for each of the three IR/SSC types of photosynthetic orchid plastomes were also proposed. PMID:29046685
NASA Astrophysics Data System (ADS)
Kereszturi, Gábor; Németh, Károly; Cronin, Shane J.; Procter, Jonathan; Agustín-Flores, Javier
2014-10-01
Monogenetic basaltic volcanism is characterised by a complex array of eruptive behaviours, reflecting spatial and temporal variability of the magmatic properties (e.g. composition, eruptive volume, magma flux) as well as environmental factors at the vent site (e.g. availability of water, country rock geology, faulting). These combine to produce changes in eruption style over brief periods (minutes to days) in many eruption episodes. Monogenetic eruptions in some volcanic fields often start with a phreatomagmatic vent-opening phase that later transforms into "dry" magmatic explosive or effusive activity, with a strong variation in the duration and importance of this first phase. Such an eruption sequence pattern occurred in 83% of the known eruptions in the 0.25 My-old Auckland Volcanic Field (AVF), New Zealand. In this investigation, the eruptive volumes were compared with the sequences of eruption styles preserved in the pyroclastic record at each volcano of the AVF, as well as environmental influencing factors, such as distribution and thickness of water-saturated semi- to unconsolidated sediments, topographic position, and distances from known fault lines. The AVF showed that there is no correlation between ejecta ring volumes and environmental influencing factors that is valid for the entire AVF. In contrast, using a set of comparisons of single volcanoes with well-known and documented sequences, resultant eruption sequences could be explained by predominant patterns of the environment in which these volcanoes were erupted. Based on the spatial variability of these environmental factors, a first-order susceptibility hazard map was constructed for the AVF that forecasts areas of largest likelihood for phreatomagmatic eruptions by overlaying topographical and shallow geological information.
Combining detailed phase-by-phase breakdowns of eruptive volumes and the event sequences of the AVF, along with the new susceptibility map, more realistic eruption scenarios can be developed for different parts of the volcanic field. This approach can be applied to tailoring field and sub-field specific hazard forecasting at similar volcanic fields worldwide.
Kinematics and spectra of planetary nebulae with O VI-sequence nuclei
NASA Technical Reports Server (NTRS)
Johnson, H. M.
1976-01-01
Spectral features of NGC 5189 and NGC 6905 are tabulated. Fabry-Perot profiles around H alpha and O III lambda 5007 of NGC 5189, NGC 6905, NGC 246, and NGC 1535, are illustrated. The latter planetary nebula is a non-O VI-sequence, comparison object of high excitation. The kinematics of the four planetary nebulae are simply analyzed. Discussion of these data is motivated by the possibility of collisional excitation by high-speed ejecta from broad-lined O VI-sequence nuclei, and by the opportunity to make a comparison with conditions in the supernova remnant or ring nebula, G2.4 + 1.4, which contains an O VI-sequence nucleus of Population I.
VizieR Online Data Catalog: KIC 8462852 GTC spectra (Deeg+, 2018)
NASA Astrophysics Data System (ADS)
Deeg, H. J.; Alonso, R.; Nespral, D.; Boyajian, T.
2018-01-01
Spectra obtained in the follow-up of KIC 8462852 (Boyajian's star) with OSIRIS at the GTC telescope. These spectra have been reduced as described in the paper and are contained in two directories, for target and comparison spectra: sp_target contains spectra of the target star (KIC 8462852); sp_compar contains spectra of the comparison star (KIC 8462763). At each pointing of the GTC, a sequence of 10-45 spectra was generated. The individual spectra are named tpXXYY.dat for the target spectra and cpXXYY.dat for the comparison spectra, where XX is the pointing number and YY is a sequence number. The format of each spectrum file is a two-column ASCII file: Wavelength (Angstrom) | Flux (arbitrary units). The files times_pXX.dat correspond to each of the pointings and contain the times of mid-exposure of each spectrum, in the HJD_UTC-2400000 framework. These times apply to both target and comparison spectra and are ordered by increasing sequence number. There are a total of 516 spectra of the target and 516 spectra of the comparison. (19 data files).
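The file layout described in this catalog entry can be read with a few lines of Python. The following is a minimal sketch only: it assumes that XX and YY are each exactly two digits and that columns are whitespace-separated, neither of which the description states explicitly. The function names are illustrative and are not part of the catalog.

```python
import re
from pathlib import Path

# Assumption: XX (pointing) and YY (sequence number) are two digits each.
NAME_PATTERN = re.compile(r"^(tp|cp)(\d{2})(\d{2})\.dat$")


def parse_spectrum_name(filename):
    """Split a name such as 'tp0312.dat' into (kind, pointing, sequence)."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected spectrum file name: {filename}")
    kind = "target" if match.group(1) == "tp" else "comparison"
    return kind, int(match.group(2)), int(match.group(3))


def read_spectrum(path):
    """Read a two-column ASCII spectrum: wavelength (Angstrom), flux (arbitrary units)."""
    wavelengths, fluxes = [], []
    for line in Path(path).read_text().splitlines():
        parts = line.split()
        # Skip blank lines and (assumed) '#' comment lines.
        if len(parts) < 2 or parts[0].startswith("#"):
            continue
        wavelengths.append(float(parts[0]))
        fluxes.append(float(parts[1]))
    return wavelengths, fluxes
```

Because the times in times_pXX.dat are ordered by increasing sequence number, the sequence number returned by parse_spectrum_name can be used to pair each spectrum with its mid-exposure time.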
Alignment-free sequence comparison (II): theoretical power of comparison statistics.
Wan, Lin; Reinert, Gesine; Sun, Fengzhu; Waterman, Michael S
2010-11-01
Rapid methods for alignment-free sequence comparison make large-scale comparisons between sequences increasingly feasible. Here we study the power of the statistic D2, which counts the number of matching k-tuples between two sequences, as well as D2*, which uses centralized counts, and D2S, which is a self-standardized version, both from a theoretical viewpoint and numerically, providing an easy to use program. The power is assessed under two alternative hidden Markov models; the first one assumes that the two sequences share a common motif, whereas the second model is a pattern transfer model; the null model is that the two sequences are composed of independent and identically distributed letters and they are independent. Under the first alternative model, the means of the tuple counts in the individual sequences change, whereas under the second alternative model, the marginal means are the same as under the null model. Using the limit distributions of the count statistics under the null and the alternative models, we find that generally, asymptotically D2S has the largest power, followed by D2*, whereas the power of D2 can even be zero in some cases. In contrast, even for sequences of length 140,000 bp, in simulations D2* generally has the largest power. Under the first alternative model of a shared motif, the power of D2* approaches 100% when sufficiently many motifs are shared, and we recommend the use of D2* for such practical applications. Under the second alternative model of pattern transfer, the power for all three count statistics does not increase with sequence length when the sequence is sufficiently long, and hence none of the three statistics under consideration can be recommended in such a situation. We illustrate the approach on 323 transcription factor binding motifs with length at most 10 from JASPAR CORE (October 12, 2009 version), verifying that D2* is generally more powerful than D2.
The program to calculate the power of D2, D2* and D2S can be downloaded from http://meta.cmb.usc.edu/d2. Supplementary Material is available at www.liebertonline.com/cmb.
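To make the three statistics concrete, here is a small Python sketch. It is not the authors' downloadable program. It follows the forms in which D2, D2* and D2S are commonly written in the alignment-free literature, assumes an i.i.d. background with letter probabilities supplied by the caller, and requires both sequences to be at least k letters long; the function names are illustrative, and the paper should be consulted for the exact definitions.

```python
from collections import Counter
from itertools import product
from math import prod, sqrt


def kmer_counts(seq, k):
    """Count overlapping k-tuples in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(x, y, k):
    """D2: sum over words w of X_w * Y_w (number of matching k-tuple pairs)."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    return sum(cx[w] * cy[w] for w in cx if w in cy)


def centred_counts(seq, k, letter_probs):
    """Return {word: (count minus expected count, word probability)} and the number of k-tuples."""
    counts = kmer_counts(seq, k)
    n_words = len(seq) - k + 1
    centred = {}
    for letters in product(sorted(letter_probs), repeat=k):
        word = "".join(letters)
        p_w = prod(letter_probs[c] for c in word)  # i.i.d. background (assumption)
        centred[word] = (counts[word] - n_words * p_w, p_w)
    return centred, n_words


def d2_star(x, y, k, letter_probs):
    """D2*: centred counts, each term scaled by the word probability."""
    cx, n = centred_counts(x, k, letter_probs)
    cy, m = centred_counts(y, k, letter_probs)
    return sum(cx[w][0] * cy[w][0] / cx[w][1] for w in cx) / sqrt(n * m)


def d2_s(x, y, k, letter_probs):
    """D2S: centred counts, each term self-standardized."""
    cx, _ = centred_counts(x, k, letter_probs)
    cy, _ = centred_counts(y, k, letter_probs)
    total = 0.0
    for w in cx:
        xw, yw = cx[w][0], cy[w][0]
        denom = sqrt(xw * xw + yw * yw)
        if denom > 0:
            total += xw * yw / denom
    return total
```

For DNA, a uniform background would be passed as {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}. Enumerating all words costs 4^k operations, which is acceptable for the short k used in practice but is a design choice of this sketch rather than of the paper.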
Method and apparatus for biological sequence comparison
Marr, T.G.; Chang, W.I.
1997-12-23
A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.
Method and apparatus for biological sequence comparison
Marr, Thomas G.; Chang, William I-Wei
1997-01-01
A method and apparatus for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence.
Comparison of Methods of Detection of Exceptional Sequences in Prokaryotic Genomes.
Rusinov, I S; Ershova, A S; Karyagina, A S; Spirin, S A; Alexeevski, A V
2018-02-01
Many proteins need recognition of specific DNA sequences for functioning. The number of recognition sites and their distribution along the DNA might be of biological importance. For example, the number of restriction sites is often reduced in prokaryotic and phage genomes to decrease the probability of DNA cleavage by restriction endonucleases. We call a sequence an exceptional one if its frequency in a genome significantly differs from one predicted by some mathematical model. An exceptional sequence could be either under- or over-represented, depending on its frequency in comparison with the predicted one. Exceptional sequences could be considered biologically meaningful, for example, as targets of DNA-binding proteins or as parts of abundant repetitive elements. Several methods are used to predict the frequency of a short sequence in a genome from the actual frequencies of certain of its subsequences. The most popular are methods based on Markov chain models. However, no rigorous comparison of the methods has previously been performed. We compared three methods for the prediction of short sequence frequencies: the maximum-order Markov chain model-based method, the method that uses geometric mean of extended Markovian estimates, and the method that utilizes frequencies of all subsequences including discontiguous ones. We applied them to restriction sites in complete genomes of 2500 prokaryotic species and demonstrated that the results depend greatly on the method used: lists of 5% of the most under-represented sites differed by up to 50%. The method designed by Burge and coauthors in 1992, which utilizes all subsequences of the sequence, showed a higher precision than the other two methods both on prokaryotic genomes and randomly generated sequences after computational imitation of selective pressure. We propose this method as the first choice for detection of exceptional sequences in prokaryotic genomes.
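As an illustration of the first of the three methods, the sketch below uses the usual maximum-order Markov estimate, in which the expected count of a word w1...wk is N(w1...wk-1) x N(w2...wk) / N(w2...wk-1). It is a simplified sketch that counts on a single strand only and ignores significance testing; the function names are hypothetical and it does not reproduce the authors' implementation or the Burge method they recommend.

```python
def count_occurrences(genome, word):
    """Count overlapping occurrences of word in genome (single strand)."""
    count = 0
    start = genome.find(word)
    while start != -1:
        count += 1
        start = genome.find(word, start + 1)
    return count


def expected_count_max_order(genome, word):
    """Maximum-order Markov estimate: N(prefix) * N(suffix) / N(core)."""
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    prefix = count_occurrences(genome, word[:-1])
    suffix = count_occurrences(genome, word[1:])
    core = count_occurrences(genome, word[1:-1])
    if core == 0:
        return 0.0
    return prefix * suffix / core


def representation_ratio(genome, word):
    """Observed / expected; values well below 1 suggest under-representation."""
    expected = expected_count_max_order(genome, word)
    if expected == 0:
        return float("nan")
    return count_occurrences(genome, word) / expected
```

For a restriction site such as GAATTC, a ratio well below 1 would flag the site as a candidate under-represented (exceptional) sequence under this model.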
Holm, Liisa; Laakso, Laura M
2016-07-08
The Dali server (http://ekhidna2.biocenter.helsinki.fi/dali) is a network service for comparing protein structures in 3D. In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all against all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against Uniprot. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Pasi, Marco; Maddocks, John H.; Lavery, Richard
2015-01-01
Microsecond molecular dynamics simulations of B-DNA oligomers carried out in an aqueous environment with a physiological salt concentration enable us to perform a detailed analysis of how potassium ions interact with the double helix. The oligomers studied contain all 136 distinct tetranucleotides and we are thus able to make a comprehensive analysis of base sequence effects. Using a recently developed curvilinear helicoidal coordinate method we are able to analyze the details of ion populations and densities within the major and minor grooves and in the space surrounding DNA. The results show higher ion populations than have typically been observed in earlier studies and sequence effects that go beyond the nature of individual base pairs or base pair steps. We also show that, in some special cases, ion distributions converge very slowly and, on a microsecond timescale, do not reflect the symmetry of the corresponding base sequence. PMID:25662221
Castejon, Maria; Menéndez, Maria Carmen; Comas, Iñaki; Vicente, Ana; Garcia, Maria J
2018-06-01
Bacterial whole-genome sequences contain informative features of their evolutionary pathways. Comparison of whole-genome sequences has become the method of choice for classification of prokaryotes, thus allowing the identification of bacteria from an evolutionary perspective, and providing data to resolve some current controversies. Currently, controversy exists about the assignment of members of the Mycobacterium avium complex, as is the case for Mycobacterium yongonense and 'Mycobacterium indicus pranii'. These two mycobacteria, closely related to Mycobacterium intracellulare on the basis of standard phenotypic and single gene-sequence comparisons, were not considered members of that species on the basis of some particular differences displayed by a single strain. Whole-genome sequence comparison procedures, namely the average nucleotide identity and the genome distance, showed that those two mycobacteria should be considered members of the species M. intracellulare. The results were confirmed with other supplementary whole-genome comparison methods. According to the data provided, Mycobacterium yongonense and 'Mycobacterium indicus pranii' should be renamed and included as members of M. intracellulare. This study highlights the problems caused when a novel species is accepted on the basis of a single strain, as was the case for M. yongonense. Based mainly on whole-genome sequence analysis, we conclude that M. yongonense should be reclassified as a subspecies of Mycobacterium intracellulare, as Mycobacterium intracellulare subsp. yongonense, and that 'Mycobacterium indicus pranii' should be classified in the same subspecies as the type strain of Mycobacterium intracellulare, as Mycobacterium intracellulare subsp. intracellulare.
Guidelines for Grades 9-12 Mathematics Curriculum. Toward Meeting Present and Future Needs.
ERIC Educational Resources Information Center
Peterson, Wayne, Ed.
Three sequences of coursework are detailed in the curriculum development guidelines provided in this document. The 4-year sequence, structured around problem-solving, applications, and the acquisition of theory, is designed for the college-bound student who plans to enter a mathematics-based field of study. The 3-year sequence is designed for…
King, Julie; Thomas, Ann; James, Caron; King, Ian; Armstead, Ian
2013-07-03
Ryegrasses and fescues (genera, Lolium and Festuca) are species of forage and turf grasses which are used widely in agricultural and amenity situations. They are classified within the sub-family Pooideae and so are closely related to Brachypodium distachyon, wheat, barley, rye and oats. Recently, a DArT array has been developed which can be used in generating marker and mapping information for ryegrasses and fescues. This represents a potential common marker set for ryegrass and fescue researchers which can be linked through to comparative genomic information for the grasses. An F2 perennial ryegrass genetic map was developed consisting of 7 linkage groups defined by 1316 markers and deriving a total map length of 683 cM. The marker set included 866 DArT and 315 gene sequence-based markers. Comparison with previous DArT mapping studies in perennial and Italian ryegrass (L. multiflorum) identified 87 and 105 DArT markers in common, respectively, of which 94% and 87% mapped to homoeologous linkage groups. A similar comparison with meadow fescue (F. pratensis) identified only 28 DArT markers in common, of which c. 50% mapped to non-homoeologous linkage groups. In L. perenne, the genetic distance spanned by the DArT markers encompassed the majority of the regions that could be described in terms of comparative genomic relationships with rice, Brachypodium distachyon, and Sorghum bicolor. DArT markers are likely to be a useful common marker resource for ryegrasses and fescues, though the success in aligning different populations through the mapping of common markers will be influenced by degrees of population interrelatedness. The detailed mapping of DArT and gene-based markers in this study potentially allows comparative relationships to be derived in future mapping populations characterised using solely DArT markers.
PLANET ENGULFMENT BY ~1.5-3 M☉ RED GIANTS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kunitomo, M.; Ikoma, M.; Sato, B.
2011-08-20
Recent radial-velocity surveys for GK clump giants have revealed that planets also exist around ~1.5-3 M☉ stars. However, no planets have been found inside 0.6 AU around clump giants, in contrast to solar-type main-sequence stars, many of which harbor short-period planets such as hot Jupiters. In this study, we examine the possibility that planets were engulfed by host stars evolving on the red-giant branch (RGB). We integrate the orbital evolution of planets in the RGB and helium-burning phases of host stars, including the effects of stellar tide and stellar mass loss. Then we derive the critical semimajor axis (or the survival limit) inside which planets are eventually engulfed by their host stars after tidal decay of their orbits. Specifically, we investigate the impact of stellar mass and other stellar parameters on the survival limit in more detail than previous studies. In addition, we make detailed comparisons with measured semimajor axes of planets detected so far, which no previous study has done. We find that the critical semimajor axis is quite sensitive to stellar mass in the range between 1.7 and 2.1 M☉, which suggests a need for careful comparison between theoretical and observational limits of the existence of planets. Our comparison demonstrates that all planets orbiting GK clump giants that have been detected are beyond the survival limit, which is consistent with the planet-engulfment hypothesis. However, on the high-mass side (>2.1 M☉), the detected planets are orbiting significantly far from the survival limit, which suggests that engulfment by host stars may not be the main reason for the observed lack of short-period giant planets. To confirm our conclusion, the detection of more planets around clump giants, especially with masses ≳2.5 M☉, is required.
Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.
1987-01-01
The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with (α-³²P)dCTP (3000 Ci/mmol) or (α-³⁵S)dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.
Kimura, M; Kimura, J; Hatakeyama, T
1988-11-21
The complete amino acid sequences of ribosomal proteins S11 from the Gram-positive eubacterium Bacillus stearothermophilus and of S19 from the archaebacterium Halobacterium marismortui have been determined. A search for homologous sequences of these proteins revealed that they belong to the ribosomal protein S11 family. Homologous proteins have previously been sequenced from Escherichia coli as well as from chloroplast, yeast and mammalian ribosomes. A pairwise comparison of the amino acid sequences showed that Bacillus protein S11 shares 68% identical residues with S11 from Escherichia coli and a slightly lower homology (52%) with the homologous chloroplast protein. The halophilic protein S19 is more related to the eukaryotic (45-49%) than to the eubacterial counterparts (35%).
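Percent-identity figures such as those quoted above come from counting identical residues in a pairwise alignment. The sketch below shows one common convention, in which columns containing a gap in either sequence are excluded from the denominator. The abstract does not state which convention was used, so this is an assumption, and the function name is illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are not counted (assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which can change the reported percentage noticeably for gapped alignments.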
A detailed gravimetric geoid from North America to Eurasia
NASA Technical Reports Server (NTRS)
Vincent, S. F.; Strange, W. E.; Marsh, J. G.
1972-01-01
A detailed gravimetric geoid of the United States, North Atlantic, and Eurasia, which was computed from a combination of satellite derived and surface gravity data, is presented. The precision of this detailed geoid is ±2 to ±3 m in the continents but may be in the range of 5 to 7 m in those areas where data is sparse. Comparisons of the detailed gravimetric geoid with results of Rapp, Fischer, and Rice for the United States, Bomford in Europe, and Heiskanen and Fischer in India are presented. Comparisons are also presented with geoid heights from satellite solutions for geocentric station coordinates in North America, the Caribbean, and Europe.
GPU-based cloud service for Smith-Waterman algorithm using frequency distance filtration scheme.
Lee, Sheng-Ta; Lin, Chun-Yuan; Hung, Che Lun
2013-01-01
As the conventional means of analyzing the similarity between a query sequence and database sequences, the Smith-Waterman algorithm is feasible for a database search owing to its high sensitivity. However, this algorithm is still quite time consuming. CUDA programming can improve computations efficiently by using the computational power of massive computing hardware such as graphics processing units (GPUs). This work presents a novel Smith-Waterman algorithm on GPUs with a frequency-based filtration method that discards unnecessary comparisons, rather than merely accelerating all comparisons and expending computational resources on the unnecessary ones. A user friendly interface is also designed for potential cloud server applications with GPUs. Additionally, two data sets, H1N1 protein sequences (query sequence set) and human protein database (database set), are selected, followed by a comparison of CUDA-SW and CUDA-SW with the filtration method, referred to herein as CUDA-SWf. Experimental results indicate that reducing unnecessary sequence alignments can improve the computational time by up to 41%. Importantly, by using CUDA-SWf as a cloud service, this application can be accessed from any computing environment of a device with an Internet connection without time constraints.
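The filter-then-align idea can be sketched on the CPU in plain Python. This is a sketch under stated assumptions, not the CUDA-SWf implementation. The frequency distance used here is the usual lower bound on edit distance computed from letter counts, and the threshold max_fd, the scoring parameters and the function names are all hypothetical choices made for illustration.

```python
from collections import Counter


def frequency_distance(a, b):
    """Lower bound on the edit distance between a and b, from letter counts."""
    ca, cb = Counter(a), Counter(b)
    surplus = sum(max(ca[c] - cb[c], 0) for c in ca)
    deficit = sum(max(cb[c] - ca[c], 0) for c in cb)
    return max(surplus, deficit)


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def filtered_search(query, database, max_fd):
    """Align only database entries whose frequency distance to the query is <= max_fd."""
    hits = []
    for index, subject in enumerate(database):
        if frequency_distance(query, subject) > max_fd:
            continue  # cheap O(length) filter skips the O(n*m) alignment
        hits.append((index, smith_waterman_score(query, subject)))
    return hits
```

The filter costs time linear in the sequence lengths, whereas each alignment it avoids costs time proportional to their product, which is where a saving of the kind reported in the abstract would come from.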
Domain similarity based orthology detection.
Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich
2015-05-13
Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda .
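The cosine measure on domain content can be illustrated with a short Python sketch. It assumes each protein is given as a list of domain identifiers (the identifiers in the usage note are made up) and compares domain counts only, so domain order is ignored. It is an illustration of the general technique, not porthoDom's actual code.

```python
from collections import Counter
from math import sqrt


def domain_cosine(domains_a, domains_b):
    """Cosine similarity between the domain-count vectors of two proteins."""
    counts_a, counts_b = Counter(domains_a), Counter(domains_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[d] * counts_b[d] for d in shared)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a protein without annotated domains matches nothing
    return dot / (norm_a * norm_b)
```

For example, domain_cosine(["D1", "D2"], ["D1", "D3"]) is 0.5 up to floating-point rounding. Grouping proteins by such a score before any residue-level comparison is what shrinks the n(n-1)/2 pairwise search space mentioned in the abstract.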
Zopf, Agnes; Raim, Roman; Danzer, Martin; Niklas, Norbert; Spilka, Rita; Pröll, Johannes; Gabriel, Christian; Nechansky, Andreas; Roucka, Markus
2015-03-01
The detection of KRAS mutations in codons 12 and 13 is critical for anti-EGFR therapy strategies; however, only those methodologies with high sensitivity, specificity, and accuracy as well as the best cost and turnaround balance are suitable for routine daily testing. Here we compared the performance of compact sequencing using the novel hybcell technology with 454 next-generation sequencing (454-NGS), Sanger sequencing, and pyrosequencing, using an evaluation panel of 35 specimens. A total of 32 mutations and 10 wild-type cases were reported using 454-NGS as the reference method. Specificity ranged from 100% for Sanger sequencing to 80% for pyrosequencing. Sanger sequencing and hybcell-based compact sequencing achieved a sensitivity of 96%, whereas pyrosequencing had a sensitivity of 88%. Accuracy was 97% for Sanger sequencing, 85% for pyrosequencing, and 94% for hybcell-based compact sequencing. Quantitative results were obtained for 454-NGS and hybcell-based compact sequencing data, resulting in a significant correlation (r = 0.914). Whereas pyrosequencing and Sanger sequencing were not able to detect multiple mutated cell clones within one tumor specimen, 454-NGS and the hybcell-based compact sequencing detected multiple mutations in two specimens. Our comparison shows that the hybcell-based compact sequencing is a valuable alternative to state-of-the-art methodologies used for detection of clinically relevant point mutations.
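Sensitivity, specificity and accuracy as reported above are standard ratios computed from a confusion matrix against the reference method. The sketch below shows the arithmetic. The counts in the usage note are hypothetical and are not taken from the study, and the function name is illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp / fn: reference-positive specimens called mutated / wild-type by the test.
    tn / fp: reference-negative specimens called wild-type / mutated by the test.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy
```

With hypothetical counts tp=45, fp=10, tn=40, fn=5, the function returns a sensitivity of 0.9, a specificity of 0.8 and an accuracy of 0.85.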
On the shape of martian dust and water ice aerosols
NASA Astrophysics Data System (ADS)
Pitman, K. M.; Wolff, M. J.; Clancy, R. T.; Clayton, G. C.
2000-10-01
Researchers have often calculated radiative properties of Martian aerosols using either Mie theory for homogeneous spheres or semi-empirical theories. Given that these atmospheric particles are randomly oriented, this approach seems fairly reasonable. However, the idea that randomly oriented nonspherical particles have scattering properties equivalent to even a select subset of spheres is demonstrably false (Bohren and Huffman 1983; Bohren and Koh 1985, Appl. Optics, 24, 1023). Fortunately, recent computational developments now enable us to directly compute scattering properties for nonspherical particles. We have combined a numerical approach for axisymmetric particle shapes, i.e., cylinders, disks, spheroids (Waterman's T-Matrix approach as improved by Mishchenko and collaborators; cf., Mishchenko et al. 1997, JGR, 102, D14, 16,831), with a multiple-scattering radiative transfer algorithm to constrain the shape of water ice and dust aerosols. We utilize a two-stage iterative process. First, we empirically derive a scattering phase function for each aerosol component (starting with some "guess") from radiative transfer models of MGS Thermal Emission Spectrometer Emission Phase Function (EPF) sequences (for details on this step, see Clancy et al., DPS 2000). Next, we perform a series of scattering calculations, adjusting our parameters to arrive at a "best-fit" theoretical phase function. In this presentation, we provide details on the second step in our analysis, including the derived phase functions (for several characteristic EPF sequences) as well as the particle properties of the best-fit theoretical models. We provide a sensitivity analysis for the EPF model-data comparisons in terms of perturbations in the particle properties (i.e., range of axial ratios, sizes, refractive indices, etc). This work is supported through NASA grant NAGS-9820 (MJW) and JPL contract no. 961471 (RTC).
Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms
Ortegon, Patricia; Poot-Hernández, Augusto C.; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya
2015-01-01
In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143
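The breadth-first step described in the abstract above, turning a pathway into an ordered enzymatic step sequence, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bfs_step_sequence`, the adjacency-dictionary representation, and the toy pathway are all assumptions made for illustration, and the edges between EC numbers are a simplified, hypothetical fragment rather than data taken from KEGG.

```python
from collections import deque


def bfs_step_sequence(graph, start):
    """Return enzymes in breadth-first visiting order, starting at `start`.

    `graph` maps an EC number to the EC numbers that directly follow it
    (a hypothetical adjacency list, not a KEGG data structure).
    """
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph.get(node, []):
            if nxt not in seen:  # visit each enzyme only once
                seen.add(nxt)
                queue.append(nxt)
    return order


# Toy pathway fragment (illustrative edges only).
toy_pathway = {
    "2.7.1.1": ["5.3.1.9"],
    "5.3.1.9": ["2.7.1.11"],
    "2.7.1.11": ["4.1.2.13"],
    "4.1.2.13": ["5.3.1.1", "1.2.1.12"],
}

sequence = bfs_step_sequence(toy_pathway, "2.7.1.1")
```

Sequences produced this way could then be compared pairwise; the genetic algorithm and k-medoids clustering used for that comparison in the study are not reproduced here.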
Sul, Woo Jun; Cole, James R.; Jesus, Ederson da C.; Wang, Qiong; Farris, Ryan J.; Fish, Jordan A.; Tiedje, James M.
2011-01-01
High-throughput sequencing of 16S rRNA genes has increased our understanding of microbial community structure, but now even higher-throughput methods to the Illumina scale allow the creation of much larger datasets with more samples and orders-of-magnitude more sequences that swamp current analytic methods. We developed a method capable of handling these larger datasets on the basis of assignment of sequences into an existing taxonomy using a supervised learning approach (taxonomy-supervised analysis). We compared this method with a commonly used clustering approach based on sequence similarity (taxonomy-unsupervised analysis). We sampled 211 different bacterial communities from various habitats and obtained ∼1.3 million 16S rRNA sequences spanning the V4 hypervariable region by pyrosequencing. Both methodologies gave similar ecological conclusions in that β-diversity measures calculated by using these two types of matrices were significantly correlated to each other, as were the ordination configurations and hierarchical clustering dendrograms. In addition, our taxonomy-supervised analyses were also highly correlated with phylogenetic methods, such as UniFrac. The taxonomy-supervised analysis has the advantages that it is not limited by the exhaustive computation required for the alignment and clustering necessary for the taxonomy-unsupervised analysis, is more tolerant of sequencing errors, and allows comparisons when sequences are from different regions of the 16S rRNA gene. With the tremendous expansion in 16S rRNA data acquisition underway, the taxonomy-supervised approach offers the potential to provide more rapid and extensive community comparisons across habitats and samples. PMID:21873204
Shams, S; Martola, J; Cavallin, L; Granberg, T; Shams, M; Aspelin, P; Wahlund, L O; Kristoffersen-Wiberg, M
2015-06-01
Cerebral microbleeds are thought to have potentially important clinical implications in dementia and stroke. However, the use of both T2* and SWI MR imaging sequences for microbleed detection has complicated the cross-comparison of study results. We aimed to determine the impact of microbleed sequences on microbleed detection and associated clinical parameters. Patients from our memory clinic (n = 246; 53% female; mean age, 62) prospectively underwent 3T MR imaging, with conventional thick-section T2*, thick-section SWI, and conventional thin-section SWI. Microbleeds were assessed separately on thick-section SWI, thin-section SWI, and T2* by 3 raters, with varying neuroradiologic experience. Clinical and radiologic parameters from the dementia investigation were analyzed in association with the number of microbleeds in negative binomial regression analyses. Prevalence and number of microbleeds were higher on thick-/thin-section SWI (20/21%) compared with T2*(17%). There was no difference in microbleed prevalence/number between thick- and thin-section SWI. Interrater agreement was excellent for all raters and sequences. Univariate comparisons of clinical parameters between patients with and without microbleeds yielded no difference across sequences. In the regression analysis, only minor differences in clinical associations with the number of microbleeds were noted across sequences. Due to the increased detection of microbleeds, we recommend SWI as the sequence of choice in microbleed detection. Microbleeds and their association with clinical parameters are robust to the effects of varying MR imaging sequences, suggesting that comparison of results across studies is possible, despite differing microbleed sequences. © 2015 by American Journal of Neuroradiology.
Comparison of Dixon Sequences for Estimation of Percent Breast Fibroglandular Tissue
Ledger, Araminta E. W.; Scurr, Erica D.; Hughes, Julie; Macdonald, Alison; Wallace, Toni; Thomas, Karen; Wilson, Robin; Leach, Martin O.; Schmidt, Maria A.
2016-01-01
Objectives To evaluate sources of error in the Magnetic Resonance Imaging (MRI) measurement of percent fibroglandular tissue (%FGT) using two-point Dixon sequences for fat-water separation. Methods Ten female volunteers (median age: 31 yrs, range: 23–50 yrs) gave informed consent following Research Ethics Committee approval. Each volunteer was scanned twice following repositioning to enable an estimation of measurement repeatability from high-resolution gradient-echo (GRE) proton-density (PD)-weighted Dixon sequences. Differences in measures of %FGT attributable to resolution, T1 weighting and sequence type were assessed by comparison of this Dixon sequence with low-resolution GRE PD-weighted Dixon data, and against gradient-echo (GRE) or spin-echo (SE) based T1-weighted Dixon datasets, respectively. Results %FGT measurement from high-resolution PD-weighted Dixon sequences had a coefficient of repeatability of ±4.3%. There was no significant difference in %FGT between high-resolution and low-resolution PD-weighted data. Values of %FGT from GRE and SE T1-weighted data were strongly correlated with that derived from PD-weighted data (r = 0.995 and 0.96, respectively). However, both sequences exhibited higher mean %FGT by 2.9% (p < 0.0001) and 12.6% (p < 0.0001), respectively, in comparison with PD-weighted data; the increase in %FGT from the SE T1-weighted sequence was significantly larger at lower breast densities. Conclusion Although measurement of %FGT at low resolution is feasible, T1 weighting and sequence type impact on the accuracy of Dixon-based %FGT measurements; Dixon MRI protocols for %FGT measurement should be carefully considered, particularly for longitudinal or multi-centre studies. PMID:27011312
Gog, Julia R; Lever, Andrew M L; Skittrall, Jordan P
2018-01-01
We present a fast, robust and parsimonious approach to detecting signals in an ordered sequence of numbers. Our motivation is in seeking a suitable method to take a sequence of scores corresponding to properties of positions in virus genomes, and find outlying regions of low scores. Suitable statistical methods without using complex models or making many assumptions are surprisingly lacking. We resolve this by developing a method that detects regions of low score within sequences of real numbers. The method makes no assumptions a priori about the length of such a region; it gives the explicit location of the region and scores it statistically. It does not use detailed mechanistic models so the method is fast and will be useful in a wide range of applications. We present our approach in detail, and test it on simulated sequences. We show that it is robust to a wide range of signal morphologies, and that it is able to capture multiple signals in the same sequence. Finally we apply it to viral genomic data to identify regions of evolutionary conservation within influenza and rotavirus.
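To make the idea of locating a low-scoring region of unknown length concrete, the sketch below uses a deliberately simpler stand-in technique: a minimum-sum contiguous run (a Kadane-style scan) over mean-centred scores. It is not the method of the paper above, it assigns no statistical significance to the region it finds, and the function name `lowest_scoring_region` is hypothetical.

```python
def lowest_scoring_region(scores):
    """Return (start, end, total) for the contiguous run whose mean-centred
    scores have the smallest sum. `end` is exclusive.

    Stand-in illustration only: a Kadane-style scan, with no statistical test.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    mean = sum(scores) / len(scores)
    best_sum = float("inf")
    best = (0, 1)
    run_sum = 0.0
    run_start = 0
    for i, s in enumerate(scores):
        centred = s - mean
        if run_sum > 0:
            # A positive running sum can only hurt; start a new run here.
            run_sum = centred
            run_start = i
        else:
            run_sum += centred
        if run_sum < best_sum:
            best_sum = run_sum
            best = (run_start, i + 1)
    return best[0], best[1], best_sum


start, end, total = lowest_scoring_region([5, 5, 5, 1, 1, 1, 5, 5, 5, 5])
```

Like the approach in the abstract, the scan makes no prior assumption about region length, but it reports only the single lowest-sum run.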
Salehi, Mojtaba; Bahreininejad, Ardeshir
2011-08-01
Optimization of process planning is considered as the key technology for computer-aided process planning which is a rather complex and difficult procedure. A good process plan of a part is built up based on two elements: (1) the optimized sequence of the operations of the part; and (2) the optimized selection of the machine, cutting tool and Tool Access Direction (TAD) for each operation. In the present work, the process planning is divided into preliminary planning, and secondary/detailed planning. In the preliminary stage, based on the analysis of order and clustering constraints as a compulsive constraint aggregation in operation sequencing and using an intelligent searching strategy, the feasible sequences are generated. Then, in the detailed planning stage, using the genetic algorithm which prunes the initial feasible sequences, the optimized operation sequence and the optimized selection of the machine, cutting tool and TAD for each operation based on optimization constraints as an additive constraint aggregation are obtained. The main contribution of this work is the optimization of sequence of the operations of the part, and optimization of machine selection, cutting tool and TAD for each operation using the intelligent search and genetic algorithm simultaneously.
Salehi, Mojtaba
2010-01-01
Optimization of process planning is considered as the key technology for computer-aided process planning which is a rather complex and difficult procedure. A good process plan of a part is built up based on two elements: (1) the optimized sequence of the operations of the part; and (2) the optimized selection of the machine, cutting tool and Tool Access Direction (TAD) for each operation. In the present work, the process planning is divided into preliminary planning, and secondary/detailed planning. In the preliminary stage, based on the analysis of order and clustering constraints as a compulsive constraint aggregation in operation sequencing and using an intelligent searching strategy, the feasible sequences are generated. Then, in the detailed planning stage, using the genetic algorithm which prunes the initial feasible sequences, the optimized operation sequence and the optimized selection of the machine, cutting tool and TAD for each operation based on optimization constraints as an additive constraint aggregation are obtained. The main contribution of this work is the optimization of sequence of the operations of the part, and optimization of machine selection, cutting tool and TAD for each operation using the intelligent search and genetic algorithm simultaneously. PMID:21845020
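The preliminary planning stage described above generates operation sequences that respect ordering constraints. A minimal sketch of that idea follows, assuming precedence is given as (before, after) pairs and using plain exhaustive backtracking in place of the authors' intelligent search strategy; the function name `feasible_sequences` and the operation names are hypothetical, and clustering constraints and the genetic-algorithm stage are not modelled.

```python
def feasible_sequences(operations, precedence):
    """Yield every ordering of `operations` that satisfies all
    (before, after) pairs in `precedence`.

    Exhaustive backtracking; suitable only for small illustrative inputs.
    """
    preds = {op: set() for op in operations}
    for before, after in precedence:
        preds[after].add(before)

    def extend(prefix, done):
        if len(prefix) == len(operations):
            yield list(prefix)
            return
        for op in operations:
            # An operation is eligible once all its predecessors are placed.
            if op not in done and preds[op] <= done:
                prefix.append(op)
                done.add(op)
                yield from extend(prefix, done)
                prefix.pop()
                done.remove(op)

    yield from extend([], set())


ops = ["drill", "ream", "mill", "tap"]
rules = [("drill", "ream"), ("drill", "tap")]
plans = list(feasible_sequences(ops, rules))
```

In a two-stage scheme like the one in the abstract, a set of feasible sequences such as `plans` would serve as the starting population that a later optimization step refines.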
ERIC Educational Resources Information Center
Le-Thi, Duyen; Rodgers, Michael P. H.; Pellicer-Sánchez, Ana
2017-01-01
This study investigates the relative effectiveness of different teaching approaches on the learning of formulaic sequences. Three comparisons were made in this study: the effects of explicit teaching of formulaic sequences versus teaching embedded in traditional coursebook instruction, the effects of the degree of salience of the sequences in the…
Image domain propeller fast spin echo
Skare, Stefan; Holdsworth, Samantha J.; Lilja, Anders; Bammer, Roland
2013-01-01
A new pulse sequence for high-resolution T2-weighted (T2-w) imaging is proposed: image domain propeller fast spin echo (iProp-FSE). Similar to the T2-w PROPELLER sequence, iProp-FSE acquires data in a segmented fashion, as blades that are acquired in multiple TRs. However, the iProp-FSE blades are formed in the image domain instead of in the k-space domain. Each iProp-FSE blade resembles a single-shot fast spin echo (SSFSE) sequence with a very narrow phase-encoding field of view (FOV), after which N rotated blade replicas yield the final full circular FOV. Our method of combining the image domain blade data to a full FOV image is detailed, and optimal choices of phase-encoding FOVs and receiver bandwidths were evaluated on phantom and volunteers. The results suggest that a phase FOV of 15–20%, a receiver bandwidth of ±32–63 kHz and a subsequent readout time of about 300 ms provide a good tradeoff between signal-to-noise ratio (SNR) efficiency and T2 blurring. Comparisons between iProp-FSE, Cartesian FSE and PROPELLER were made on single-slice axial brain data, showing similar T2-w tissue contrast and SNR with great anatomical conspicuity at similar scan times, without colored noise or streaks from motion. A new slice interleaving order is also proposed to improve the multislice capabilities of iProp-FSE. PMID:23200683
Image domain propeller fast spin echo.
Skare, Stefan; Holdsworth, Samantha J; Lilja, Anders; Bammer, Roland
2013-04-01
A new pulse sequence for high-resolution T2-weighted (T2-w) imaging is proposed - image domain propeller fast spin echo (iProp-FSE). Similar to the T2-w PROPELLER sequence, iProp-FSE acquires data in a segmented fashion, as blades that are acquired in multiple TRs. However, the iProp-FSE blades are formed in the image domain instead of in the k-space domain. Each iProp-FSE blade resembles a single-shot fast spin echo (SSFSE) sequence with a very narrow phase-encoding field of view (FOV), after which N rotated blade replicas yield the final full circular FOV. Our method of combining the image domain blade data to a full FOV image is detailed, and optimal choices of phase-encoding FOVs and receiver bandwidths were evaluated on phantom and volunteers. The results suggest that a phase FOV of 15-20%, a receiver bandwidth of ±32-63 kHz and a subsequent readout time of about 300 ms provide a good tradeoff between signal-to-noise ratio (SNR) efficiency and T2 blurring. Comparisons between iProp-FSE, Cartesian FSE and PROPELLER were made on single-slice axial brain data, showing similar T2-w tissue contrast and SNR with great anatomical conspicuity at similar scan times - without colored noise or streaks from motion. A new slice interleaving order is also proposed to improve the multislice capabilities of iProp-FSE. Copyright © 2013 Elsevier Inc. All rights reserved.
Bilgic, Hatice; Hakki, Erdogan E.; Akkaya, Mahinur S.
2016-01-01
Human history was transformed with the advent of agriculture in the Fertile Crescent with wheat as one of the founding crops. Although the Fertile Crescent is renowned as the center of wheat domestication, archaeological studies have shown the crucial involvement of Çatalhöyük in this process. This site first gained attention during the 1961–65 excavations due to the recovery of primitive hexaploid wheat. However, despite the seeds being well preserved, a detailed archaeobotanical description of the samples is missing. In this article, we report on the DNA isolation, amplification and sequencing of ancient DNA of charred wheat grains from Çatalhöyük and other Turkish archaeological sites and the comparison of these wheat grains with contemporary wheat species including T. monococcum, T. dicoccum, T. dicoccoides, T. durum and T. aestivum at HMW glutenin protein loci. These ancient samples represent the oldest wheat sample sequenced to date and the first ancient wheat sample from the Middle East. Remarkably, the sequence analysis of the short DNA fragments preserved in seeds that are approximately 8400 years old showed that the Çatalhöyük wheat stock contained hexaploid wheat, which is similar to contemporary hexaploid wheat species including both naked (T. aestivum) and hulled (T. spelta) wheat. This suggests an early transitory state of hexaploid wheat agriculture from the Fertile Crescent towards Europe spanning present-day Turkey. PMID:26998604
W-curve alignments for HIV-1 genomic comparisons.
Cork, Douglas J; Lembark, Steven; Tovanabutra, Sodsai; Robb, Merlin L; Kim, Jerome H
2010-06-01
The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap-associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string-based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string-based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time, short enough to affect clinical choices for acute treatment.
A description of the W-curve generation process, including a comparison technique of aligning extremes of the curves to effectively phase-shift them past the HIV-1 gap problem, is presented. Besides yielding similar neighbor-joining phenogram topologies, most Mother and Infant C2-V5 sequences in the cohort pairs geometrically map closest to each other, indicating that W-curve heuristics overcame any gap problem.
Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).
Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E
2017-01-01
Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.
USDA-ARS's Scientific Manuscript database
Sequence comparison between the full-length 2412 bp DNA gyrase subunit B (gyrB) gene of a novobiocin resistant Aeromonas hydrophila AH11NOVO vaccine strain and that of its virulent parent strain AH11P revealed 10 missense mutations. Similarly, sequence comparison between the full-length 4092 bp RNA ...
Church, Sheri A; Livingstone, Kevin; Lai, Zhao; Kozik, Alexander; Knapp, Steven J; Michelmore, Richard W; Rieseberg, Loren H
2007-02-01
Using likelihood-based variable selection models, we determined if positive selection was acting on 523 EST sequence pairs from two lineages of sunflower and lettuce. Variable rate models are generally not used for comparisons of sequence pairs due to the limited information and the inaccuracy of estimates of specific substitution rates. However, previous studies have shown that the likelihood ratio test (LRT) is reliable for detecting positive selection, even with low numbers of sequences. These analyses identified 56 genes that show a signature of selection, of which 75% were not identified by simpler models that average selection across codons. Subsequent mapping studies in sunflower show four of five of the positively selected genes identified by these methods mapped to domestication QTLs. We discuss the validity and limitations of using variable rate models for comparisons of sequence pairs, as well as the limitations of using ESTs for identification of positively selected genes.
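The likelihood ratio test mentioned in the abstract above compares the maximized log-likelihoods of a null model and a more general alternative. The sketch below shows only the arithmetic, under stated assumptions: the two models are nested and differ by two free parameters, so the statistic is referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability has the closed form exp(-x/2). The abstract does not say which models or degrees of freedom were used, and the function name `lrt_pvalue_df2` and the log-likelihood values are hypothetical.

```python
import math


def lrt_pvalue_df2(lnL_null, lnL_alt):
    """Return (statistic, p_value) for a likelihood ratio test with 2 df.

    Assumes nested models differing by two free parameters (an assumption,
    not a detail taken from the abstract).
    """
    stat = 2.0 * (lnL_alt - lnL_null)
    stat = max(stat, 0.0)  # guard against tiny negative values from optimisation noise
    p_value = math.exp(-stat / 2.0)  # chi-square survival function for df = 2
    return stat, p_value


# Hypothetical log-likelihoods for one sequence pair.
stat, p_value = lrt_pvalue_df2(lnL_null=-1000.0, lnL_alt=-996.0)
```

A small p-value would indicate that the model allowing positively selected sites fits the pair significantly better than the null model.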
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick S; Hu, Ping; Malfatti, Stephanie
2006-01-01
Yersinia pestis, the causative agent of bubonic and pneumonic plagues, has undergone detailed study at the molecular level. To further investigate the genomic diversity among this group and to help characterize lineages of the plague organism that have no sequenced members, we present here the genomes of two isolates of the "classical" antiqua biovar, strains Antiqua and Nepal516. The genomes of Antiqua and Nepal516 are 4.7 Mb and 4.5 Mb and encode 4,138 and 3,956 open reading frames, respectively. Though both strains belong to one of the three classical biovars, they represent separate lineages defined by recent phylogenetic studies. We compare all five currently sequenced Y. pestis genomes and the corresponding features in Yersinia pseudotuberculosis. There are strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. We found 453 single nucleotide polymorphisms in protein-coding regions, which were used to assess the evolutionary relationships of these Y. pestis strains. Gene reduction analysis revealed that the gene deletion processes are under selective pressure, and many of the inactivations are probably related to the organism's interaction with its host environment. The results presented here clearly demonstrate the differences between the two biovar antiqua lineages and support the notion that grouping Y. pestis strains based strictly on the classical definition of biovars (predicated upon two biochemical assays) does not accurately reflect the phylogenetic relationships within this species. A comparison of four virulent Y. pestis strains with the human-avirulent strain 91001 provides further insight into the genetic basis of virulence to humans.
Double-strand break repair processes drive evolution of the mitochondrial genome in Arabidopsis.
Davila, Jaime I; Arrieta-Montiel, Maria P; Wamboldt, Yashitola; Cao, Jun; Hagmann, Joerg; Shedge, Vikas; Xu, Ying-Zhi; Weigel, Detlef; Mackenzie, Sally A
2011-09-27
The mitochondrial genome of higher plants is unusually dynamic, with recombination and nonhomologous end-joining (NHEJ) activities producing variability in size and organization. Plant mitochondrial DNA also generally displays much lower nucleotide substitution rates than mammalian or yeast systems. Arabidopsis displays these features and expedites characterization of the mitochondrial recombination surveillance gene MSH1 (MutS 1 homolog), lending itself to detailed study of de novo mitochondrial genome activity. In the present study, we investigated the underlying basis for unusual plant features as they contribute to rapid mitochondrial genome evolution. We obtained evidence of double-strand break (DSB) repair, including NHEJ, sequence deletions and mitochondrial asymmetric recombination activity in Arabidopsis wild-type and msh1 mutants on the basis of data generated by Illumina deep sequencing and confirmed by DNA gel blot analysis. On a larger scale, with mitochondrial comparisons across 72 Arabidopsis ecotypes, similar evidence of DSB repair activity differentiated ecotypes. Forty-seven repeat pairs were active in DNA exchange in the msh1 mutant. Recombination sites showed asymmetrical DNA exchange within lengths of 50- to 556-bp sharing sequence identity as low as 85%. De novo asymmetrical recombination involved heteroduplex formation, gene conversion and mismatch repair activities. Substoichiometric shifting by asymmetrical exchange created the appearance of rapid sequence gain and loss in association with particular repeat classes. Extensive mitochondrial genomic variation within a single plant species derives largely from DSB activity and its repair. Observed gene conversion and mismatch repair activity contribute to the low nucleotide substitution rates seen in these genomes. On a phenotypic level, these patterns of rearrangement likely contribute to the reproductive versatility of higher plants.
van den Broek, M; Bolat, I; Nijkamp, J F; Ramos, E; Luttik, M A H; Koopman, F; Geertman, J M; de Ridder, D; Pronk, J T; Daran, J-M
2015-09-01
Lager brewing strains of Saccharomyces pastorianus are natural interspecific hybrids originating from the spontaneous hybridization of Saccharomyces cerevisiae and Saccharomyces eubayanus. Over the past 500 years, S. pastorianus has been domesticated to become one of the most important industrial microorganisms. Production of lager-type beers requires a set of essential phenotypes, including the ability to ferment maltose and maltotriose at low temperature, the production of flavors and aromas, and the ability to flocculate. Understanding of the molecular basis of complex brewing-related phenotypic traits is a prerequisite for rational strain improvement. While genome sequences have been reported, the variability and dynamics of S. pastorianus genomes have not been investigated in detail. Here, using deep sequencing and chromosome copy number analysis, we showed that S. pastorianus strain CBS1483 exhibited extensive aneuploidy. This was confirmed by quantitative PCR and by flow cytometry. As a direct consequence of this aneuploidy, a massive number of sequence variants was identified, leading to at least 1,800 additional protein variants in S. pastorianus CBS1483. Analysis of eight additional S. pastorianus strains revealed that the previously defined group I strains showed comparable karyotypes, while group II strains showed large interstrain karyotypic variability. Comparison of three strains with nearly identical genome sequences revealed substantial chromosome copy number variation, which may contribute to strain-specific phenotypic traits. The observed variability of lager yeast genomes demonstrates that systematic linking of genotype to phenotype requires a three-dimensional genome analysis encompassing physical chromosomal structures, the copy number of individual chromosomes or chromosomal regions, and the allelic variation of copies of individual genes. Copyright © 2015, van den Broek et al.
Beckers, Matthew; Mohorianu, Irina; Stocks, Matthew; Applegate, Christopher; Dalmay, Tamas; Moulton, Vincent
2017-01-01
Recently, high-throughput sequencing (HTS) has revealed compelling details about the small RNA (sRNA) population in eukaryotes. These 20 to 25 nt noncoding RNAs can influence gene expression by acting as guides for the sequence-specific regulatory mechanism known as RNA silencing. The increase in sequencing depth and number of samples per project enables a better understanding of the role sRNAs play by facilitating the study of expression patterns. However, the intricacy of the biological hypotheses coupled with a lack of appropriate tools often leads to inadequate mining of the available data and thus, an incomplete description of the biological mechanisms involved. To enable a comprehensive study of differential expression in sRNA data sets, we present a new interactive pipeline that guides researchers through the various stages of data preprocessing and analysis. This includes various tools, some of which we specifically developed for sRNA analysis, for quality checking and normalization of sRNA samples as well as tools for the detection of differentially expressed sRNAs and identification of the resulting expression patterns. The pipeline is available within the UEA sRNA Workbench, a user-friendly software package for the processing of sRNA data sets. We demonstrate the use of the pipeline on a H. sapiens data set; additional examples on a B. terrestris data set and on an A. thaliana data set are described in the Supplemental Information. A comparison with existing approaches is also included, which exemplifies some of the issues that need to be addressed for sRNA analysis and how the new pipeline may be used to do this. PMID:28289155
van den Broek, M.; Bolat, I.; Nijkamp, J. F.; Ramos, E.; Luttik, M. A. H.; Koopman, F.; Geertman, J. M.; de Ridder, D.; Pronk, J. T.
2015-01-01
Lager brewing strains of Saccharomyces pastorianus are natural interspecific hybrids originating from the spontaneous hybridization of Saccharomyces cerevisiae and Saccharomyces eubayanus. Over the past 500 years, S. pastorianus has been domesticated to become one of the most important industrial microorganisms. Production of lager-type beers requires a set of essential phenotypes, including the ability to ferment maltose and maltotriose at low temperature, the production of flavors and aromas, and the ability to flocculate. Understanding of the molecular basis of complex brewing-related phenotypic traits is a prerequisite for rational strain improvement. While genome sequences have been reported, the variability and dynamics of S. pastorianus genomes have not been investigated in detail. Here, using deep sequencing and chromosome copy number analysis, we showed that S. pastorianus strain CBS1483 exhibited extensive aneuploidy. This was confirmed by quantitative PCR and by flow cytometry. As a direct consequence of this aneuploidy, a massive number of sequence variants was identified, leading to at least 1,800 additional protein variants in S. pastorianus CBS1483. Analysis of eight additional S. pastorianus strains revealed that the previously defined group I strains showed comparable karyotypes, while group II strains showed large interstrain karyotypic variability. Comparison of three strains with nearly identical genome sequences revealed substantial chromosome copy number variation, which may contribute to strain-specific phenotypic traits. The observed variability of lager yeast genomes demonstrates that systematic linking of genotype to phenotype requires a three-dimensional genome analysis encompassing physical chromosomal structures, the copy number of individual chromosomes or chromosomal regions, and the allelic variation of copies of individual genes. PMID:26150454
Rohs, Remo; Sklenar, Heinz
2004-04-01
The results presented in this paper on methylene blue (MB) binding to DNA with AT alternating base sequence complement the data obtained in two former modeling studies of MB binding to GC alternating DNA. In the light of the large amount of experimental data for both systems, this theoretical study is focused on a detailed energetic analysis and comparison in order to understand their different behavior. Since experimental high-resolution structures of the complexes are not available, the analysis is based on energy minimized structural models of the complexes in different binding modes. For both sequences, four different intercalation structures and two models for MB binding in the minor and major groove have been proposed. Solvent electrostatic effects were included in the energetic analysis by using electrostatic continuum theory, and the dependence of MB binding on salt concentration was investigated by solving the non-linear Poisson-Boltzmann equation. We find that the relative stability of the different complexes is similar for the two sequences, in agreement with the interpretation of spectroscopic data. Subtle differences, however, are seen in energy decompositions and can be attributed to the change from symmetric 5'-YpR-3' intercalation to minor groove binding with increasing salt concentration, which is experimentally observed for the AT sequence at lower salt concentration than for the GC sequence. According to our results, this difference is due to the significantly lower non-electrostatic energy for the minor groove complex with AT alternating DNA, whereas the slightly lower binding energy to this sequence is caused by a higher deformation energy of DNA. The energetic data are in agreement with the conclusions derived from different spectroscopic studies and can also be structurally interpreted on the basis of the modeled complexes. 
The simple static modeling technique and the neglect of entropy terms and of non-electrostatic solute-solvent interactions, which are assumed to be nearly constant for the compared complexes of MB with DNA, seem to be justified by the results.
Nishito, Yukari; Osana, Yasunori; Hachiya, Tsuyoshi; Popendorf, Kris; Toyoda, Atsushi; Fujiyama, Asao; Itaya, Mitsuhiro; Sakakibara, Yasubumi
2010-04-16
Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length. We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for gamma-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacked a polyketide synthesis operon, disrupted the plipastatin production operon, and possesses previously unidentified transposases. The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B. subtilis natto harbors but B. subtilis 168 lacks. 
Multiple genome-level comparisons among five closely related Bacillus species were also carried out. The determined genome sequence of B. subtilis natto and gene annotations are available from the Natto genome browser http://natto-genome.org/.
Zseq: An Approach for Preprocessing Next-Generation Sequencing Data.
Alkhateeb, Abedalrhman; Rueda, Luis
2017-08-01
Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score and also takes into account other factors such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters out those with a z-score less than the user-defined threshold. The Zseq algorithm is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, de novo assembled transcripts from the reads filtered by Zseq have longer genomic sequences than other tested methods. Estimating the threshold of the cutoff point is introduced using labeling rules with optimistic results.
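The scoring and filtering idea described in this abstract can be illustrated with a minimal sketch. This is not the Zseq implementation: the function names, the default k, and the handling of ambiguous bases (k-mers containing N are simply not counted) are assumptions made for illustration, and the GC-content factor mentioned in the abstract is left out.

```python
from statistics import mean, pstdev


def unique_kmer_count(seq, k):
    """Count the distinct k-mers in seq, skipping k-mers that contain N.

    Skipping N-containing k-mers is an assumed stand-in for the way
    ambiguous nucleotides are penalized in the real tool.
    """
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return sum(1 for kmer in kmers if "N" not in kmer)


def zscore_filter(reads, k=3, threshold=-1.0):
    """Keep reads whose complexity z-score is at least `threshold`."""
    scores = [unique_kmer_count(read, k) for read in reads]
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All reads are equally complex, so nothing can be filtered out.
        return list(reads)
    return [read for read, score in zip(reads, scores)
            if (score - mu) / sigma >= threshold]


reads = ["ACGTACGTGA", "AAAAAAAAAA", "ACGGTCATGC", "TTGACCGTAA"]
kept = zscore_filter(reads, k=3, threshold=-1.0)
```

With these toy reads the homopolymer read has only one distinct 3-mer, so its z-score falls below the threshold and it is dropped, while the three more complex reads are kept.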
Optimal choice of word length when comparing two Markov sequences using a χ²-statistic.
Bai, Xin; Tang, Kujin; Ren, Jie; Waterman, Michael; Sun, Fengzhu
2017-10-03
Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains and the likelihood ratio test or the corresponding approximate χ²-statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies. We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r₁ and r₂, respectively. We show through both simulations and theoretical studies that the optimal k = max(r₁, r₂) + 1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains. Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
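The word-length rule stated in this abstract is simple to apply once the Markov orders are known or estimated. The sketch below only illustrates that rule together with plain overlapping word counting; the function names are hypothetical, and the χ²-statistic itself is not reproduced here because the abstract does not give its exact form.

```python
from collections import Counter


def optimal_word_length(r1, r2):
    """Word length suggested by the abstract: k = max(r1, r2) + 1."""
    if r1 < 0 or r2 < 0:
        raise ValueError("Markov orders must be non-negative")
    return max(r1, r2) + 1


def word_counts(seq, k):
    """Count overlapping words (k-tuples) of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Two toy sequences assumed to follow Markov chains of order 1 and 2.
k = optimal_word_length(1, 2)
counts_a = word_counts("ACGTACGGA", k)
counts_b = word_counts("TTGACGTTG", k)
```

Here k evaluates to 3, so both sequences are summarized by their 3-tuple counts, which would then be passed to whichever test statistic is being used.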
Suckau, Detlev; Resemann, Anja
2009-12-01
The ability to match Top-Down protein sequencing (TDS) results by MALDI-TOF to protein sequences by classical protein database searching was evaluated in this work. Resulting from these analyses were the protein identity, the simultaneous assignment of the N- and C-termini and protein sequences of up to 70 residues from either terminus. In combination with de novo sequencing using the MALDI-TDS data, even fusion proteins were assigned and the detailed sequence around the fusion site was elucidated. MALDI-TDS made it possible to match protein sequences quickly and efficiently and to validate recombinant protein structures, in particular protein termini, at the level of undigested proteins.
Task Analysis for the Jobs of Freight Train Conductor and Brakeman
DOT National Transportation Integrated Search
1975-05-31
This document describes the results of a research effort undertaken to detail the tasks of freight train conductors and brakemen. Included with text are detailed operational sequence diagrams for both conductor and brakeman. This task analysis is s...
Sakai, Hiroaki; Kanamori, Hiroyuki; Arai-Kichise, Yuko; Shibata-Hatta, Mari; Ebana, Kaworu; Oono, Youko; Kurita, Kanako; Fujisawa, Hiroko; Katagiri, Satoshi; Mukai, Yoshiyuki; Hamada, Masao; Itoh, Takeshi; Matsumoto, Takashi; Katayose, Yuichi; Wakasa, Kyo; Yano, Masahiro; Wu, Jianzhong
2014-01-01
Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone. PMID:24578372
Zhang, Gaihua; Su, Zhen
2012-01-01
Work on protein structure prediction is very useful in biological research. To evaluate their accuracy, experimental protein structures or their derived data are used as the 'gold standard'. However, as proteins are dynamic molecular machines with structural flexibility such a standard may be unreliable. To investigate the influence of the structure flexibility, we analysed 3,652 protein structures of 137 unique sequences from 24 protein families. The results showed that (1) the three-dimensional (3D) protein structures were not rigid: the root-mean-square deviation (RMSD) of the backbone Cα of structures with identical sequences was relatively large, with the average of the maximum RMSD from each of the 137 sequences being 1.06 Å; (2) the derived data of the 3D structure was not constant, e.g. the highest ratio of the secondary structure wobble site was 60.69%, with the sequence alignments from structural comparisons of two proteins in the same family sometimes being completely different. Proteins may have several stable conformations and the data derived from resolved structures as a 'gold standard' should be optimized before being utilized as criteria to evaluate the prediction methods, e.g. sequence alignment from structural comparison. Helix/β-sheet transition exists in normal free proteins. The coil ratio of the 3D structure could affect its resolution as determined by X-ray crystallography.
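The RMSD figures quoted in this abstract rest on a simple formula: the square root of the mean squared distance between corresponding atoms. The sketch below is a minimal illustration that assumes the two coordinate sets are already superimposed and list the same Cα atoms in the same order; the superposition step and the function names are not taken from the study.

```python
import math
from itertools import combinations


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def max_pairwise_rmsd(structures):
    """Largest RMSD over all pairs of structures sharing one sequence."""
    return max(rmsd(a, b) for a, b in combinations(structures, 2))


# Three toy "structures" of two atoms each, shifted along z by 0, 1 and 2 units.
s0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
s1 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
s2 = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
largest = max_pairwise_rmsd([s0, s1, s2])
```

Taking the maximum over all pairs for one sequence, and then averaging those maxima over sequences, corresponds to the kind of summary value reported in the abstract.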
Lin, Xiuping; Zhou, Xuefeng; Wang, Fazuo; Liu, Kaisheng; Yang, Bin; Yang, Xianwen; Peng, Yan; Liu, Juan; Ren, Zhe; Liu, Yonghong
2012-01-01
A new fungal strain, displaying strong toxic activity against brine shrimp larvae, was isolated from a deep sea sediment sample collected at a depth of 1300 m. The strain, designated as F00120, was identified as a member of the genus Penicillium on the basis of morphology and ITS sequence analysis. One new sesquiterpene quinone, named penicilliumin A (1), along with two known compounds, ergosterol (2) and ergosterol peroxide (3), was isolated and purified from the cultures of F00120 by silica gel column, Sephadex LH-20 column, and preparative thin layer chromatography. Their structures were elucidated by detailed nuclear magnetic resonance (NMR) and mass spectroscopic (MS) analysis as well as comparison with literature data. The new compound penicilliumin A moderately inhibited in vitro proliferation of mouse melanoma (B16), human melanoma (A375), and human cervical carcinoma (HeLa) cell lines.
NASA Astrophysics Data System (ADS)
Sharbazheri, Khalid Mahmood; Ghafor, Imad Mahmood; Muhammed, Qahtan Ahmad
2009-10-01
The Cretaceous/Tertiary (K/T) boundary sequence that crops out in the studied area is located within the High Folded Zone, in the Sirwan Valley, northeastern Iraq. These units mainly consist of flysch and flysch-type successions of thick clastic beds of the Tanjero and Kolosh Formations. A detailed lithostratigraphic study was carried out on the outcropping uppermost part of the Upper Cretaceous succession (upper part of the Tanjero Formation) and the lowermost part of the Kolosh Formation. On the basis of the identified planktonic foraminiferal assemblages, five biozones are recorded from the uppermost part of the Tanjero Formation and four biozones from the lower part of the Kolosh Formation (Lower Paleocene) in the Sirwan section. The biostratigraphic zones established in this study are compared and correlated with equivalent, commonly used planktonic zonal schemes around the Cretaceous/Tertiary boundary in and outside Iraq.
Comparison of two underwater acoustic communications techniques for multi-user access
NASA Astrophysics Data System (ADS)
Hursky, Paul; Siderius, T. Martin; Kauaiex Group
2004-05-01
Frequency hopped frequency shift keying (FHFSK) and code division multiple access (CDMA) are two different modulation techniques for multiple users to communicate with a single receiver simultaneously. In July 2003, these two techniques were tested alongside each other in a shallow water coastal environment off the coast of Kauai. A variety of instruments were used to measure the prevailing oceanography, enabling detailed modeling of the channel. The channel was acoustically probed using LFM waveforms and m-sequences as well. We will present the results of demodulating the FHFSK and CDMA waveforms and discuss modeling the channel for the purpose of predicting multi-user communications performance. a)Michael B. Porter, Paul Hursky, Martin Siderius (SAIC), Mohsen Badiey (UD), Jerald Caruthers (USM), William S. Hodgkiss, Kaustubha Raghukumar (SIO), Dan Rouseff, Warren Fox (APL-UW), Christian de Moustier, Brian Calder, Barbara J. Kraft (UNH), Keyko McDonald (SPAWARSSC), Peter Stein, James K. Lewis, and Subramaniam Rajan (SSI).
Performance of the MIR Cooperative Solar Array After 2.5 Years in Orbit
NASA Technical Reports Server (NTRS)
Kerslake, Thomas W.; Hoffman, David J.
1999-01-01
The Mir Cooperative Solar Array (MCSA) was developed jointly by the United States and Russia to produce 6 kW of power for the Russian space station Mir. Four multi-orbit test sequences were executed between June 1996 and December 1998 to measure MCSA electrical performance. A dedicated Fortran computer code was developed to analyze the detailed thermal-electrical performance of the MCSA. The computational performance results compared very favorably with the measured flight data in most cases. Minor performance degradation was detected in one current generating section of the MCSA. Yet overall, the flight data indicated the MCSA was meeting or exceeding performance expectations. There was no precipitous performance loss due to contamination or other causes after 2.5 years of operation. In this paper, we review the MCSA flight electrical performance tests, data and computational modeling and discuss findings from data comparisons with the computational results.
Comparison of genetic characteristics of canine papillomaviruses in Turkey.
Oğuzoğlu, Tuba Çiğdem; Timurkan, Mehmet Özkan; Koç, Bahattin Taylan; Alkan, Feray
2017-11-01
Papillomavirus (PV) infections often cause benign and malignant skin neoplasia in dogs. To date, twenty types of canine papillomaviruses (CPVs) have been described worldwide. A detailed molecular characterization of CPVs in Turkey is lacking. In the present study, oral and mucosal lesions from 13 dogs with suspected CPV infection from the Mediterranean and central Anatolian regions of Turkey were analyzed. The partial gene sequences of the L1, E6, and E7 regions were compared with those of CPV types in the GenBank database. The results showed that CPV-1 infection was the dominant type of canine papillomatosis in Turkey. In addition, there was no statistically significant association between the frequency of the disease and the age or gender of the dog (p>0.05). However, all the dogs were pedigree breeds, suggesting that the disease may be more prevalent among pure-bred dogs than mixed breeds. Copyright © 2017 Elsevier B.V. All rights reserved.
Bonnin, Rémy A; Girlich, Delphine; Imanci, Dilek; Dortet, Laurent; Naas, Thierry
2015-11-19
We provide here the first genome sequence of a Serratia rubidaea isolate, a human-opportunistic pathogen. This reference sequence will permit a comparison of this species with others of the Serratia genus. Copyright © 2015 Bonnin et al.
A clone-free, single molecule map of the domestic cow (Bos taurus) genome.
Zhou, Shiguo; Goldstein, Steve; Place, Michael; Bechner, Michael; Patino, Diego; Potamousis, Konstantinos; Ravindran, Prabu; Pape, Louise; Rincon, Gonzalo; Hernandez-Ortiz, Juan; Medrano, Juan F; Schwartz, David C
2015-08-28
The cattle (Bos taurus) genome was originally selected for sequencing due to its economic importance and unique biology as a model organism for understanding other ruminants, or mammals. Currently, there are two cattle genome sequence assemblies (UMD3.1 and Btau4.6) from groups using dissimilar assembly algorithms, which were complemented by genetic and physical map resources. However, past comparisons between these assemblies revealed substantial differences. Consequently, such discordances have engendered ambiguities when using reference sequence data, impacting genomic studies in cattle and motivating construction of a new optical map resource--BtOM1.0--to guide comparisons and improvements to the current sequence builds. Accordingly, our comprehensive comparisons of BtOM1.0 against the UMD3.1 and Btau4.6 sequence builds tabulate large-to-immediate scale discordances requiring mediation. The optical map, BtOM1.0, spanning the B. taurus genome (Hereford breed, L1 Dominette 01449) was assembled from an optical map dataset consisting of 2,973,315 (439 X; raw dataset size before assembly) single molecule optical maps (Rmaps; 1 Rmap = 1 restriction mapped DNA molecule) generated by the Optical Mapping System. The BamHI map spans 2,575.30 Mb and comprises 78 optical contigs assembled by a combination of iterative (using the reference sequence: UMD3.1) and de novo assembly techniques. BtOM1.0 is a high-resolution physical map featuring an average restriction fragment size of 8.91 Kb. Comparisons of BtOM1.0 vs. UMD3.1, or Btau4.6, revealed that Btau4.6 presented far more discordances (7,463) vs. UMD3.1 (4,754). Overall, we found that Btau4.6 presented almost double the number of discordances than UMD3.1 across most of the 6 categories of sequence vs. map discrepancies, which are: COMPLEX (misassembly), DELs (extraneous sequences), INSs (missing sequences), ITs (Inverted/Translocated sequences), ECs (extra restriction cuts) and MCs (missing restriction cuts). 
Alignments of UMD3.1 and Btau4.6 to BtOM1.0 reveal discordances commensurate with previous reports, and affirm the NCBI's current designation of UMD3.1 sequence assembly as the "reference assembly" and the Btau4.6 as the "alternate assembly." The cattle genome optical map, BtOM1.0, when used as a comprehensive and largely independent guide, will greatly assist improvements to existing sequence builds, and later serve as an accurate physical scaffold for studies concerning the comparative genomics of cattle breeds.
Mórocz, István Akos; Janoos, Firdaus; van Gelderen, Peter; Manor, David; Karni, Avi; Breznitz, Zvia; von Aster, Michael; Kushnir, Tammar; Shalev, Ruth
2012-01-01
The aim of this article is to report on the importance and challenges of a time-resolved and spatio-temporal analysis of fMRI data from complex cognitive processes and associated disorders using a study on developmental dyscalculia (DD). Participants underwent fMRI while judging the incorrectness of multiplication results, and the data were analyzed using a sequence of methods, each of which progressively provided a more detailed picture of the spatio-temporal aspect of this disease. Healthy subjects and subjects with DD performed alike behaviorally though they exhibited parietal disparities using traditional voxel-based group analyses. Further and more detailed differences, however, surfaced with a time-resolved examination of the neural responses during the experiment. While performing inter-group comparisons, a third group of subjects with dyslexia (DL) but with no arithmetic difficulties was included to test the specificity of the analysis and strengthen the statistical base with overall fifty-eight subjects. Surprisingly, the analysis showed a functional dissimilarity during an initial reading phase for the group of dyslexic but otherwise normal subjects, with respect to controls, even though only numerical digits and no alphabetic characters were presented. Thus our results suggest that time-resolved multi-variate analysis of complex experimental paradigms has the ability to yield powerful new clinical insights about abnormal brain function. Similarly, a detailed compilation of aberrations in the functional cascade may have much greater potential to delineate the core processing problems in mental disorders. PMID:22368322
Intestinal microbiota composition in fishes is influenced by host ecology and environment.
Wong, Sandi; Rawls, John F
2012-07-01
The digestive tracts of vertebrates are colonized by complex assemblages of micro-organisms, collectively called the gut microbiota. Recent studies have revealed important contributions of gut microbiota to vertebrate health and disease, stimulating intense interest in understanding how gut microbial communities are assembled and how they impact host fitness (Sekirov et al. 2010). Although all vertebrates harbour a gut microbiota, current information on microbiota composition and function has been derived primarily from mammals. Comparisons of different mammalian species have revealed intriguing associations between gut microbiota composition and host diet, anatomy and phylogeny (Ley et al. 2008b). However, mammals constitute <10% of all vertebrate species, and it remains unclear whether similar associations exist in more diverse and ancient vertebrate lineages such as fish. In this issue, Sullam et al. (2012) make an important contribution toward identifying factors determining gut microbiota composition in fishes. The authors conducted a detailed meta-analysis of 25 bacterial 16S rRNA gene sequence libraries derived from the intestines of different fish species. To provide a broader context for their analysis, they compared these data sets to a large collection of 16S rRNA gene sequence data sets from diverse free-living and host-associated bacterial communities. Their results suggest that variation in gut microbiota composition in fishes is strongly correlated with species habitat salinity, trophic level and possibly taxonomy. Comparison of data sets from fish intestines and other environments revealed that fish gut microbiota compositions are often similar to those of other animals and contain relatively few free-living environmental bacteria. 
These results suggest that the gut microbiota composition of fishes is not a simple reflection of the micro-organisms in their local habitat but may result from host-specific selective pressures within the gut (Bevins & Salzman 2011).
eShadow: A tool for comparing closely related sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.
2004-01-15
Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualization of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/
RECOVIR Software for Identifying Viruses
NASA Technical Reports Server (NTRS)
Chakravarty, Sugoto; Fox, George E.; Zhu, Dianhui
2013-01-01
Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains with highly divergent capsid sequences. Determining the capsid residues or nucleotides that uniquely characterize these strains is critical in understanding the strain diversity of these viruses. RECOVIR (an acronym for "recognize viruses") software predicts the strains of some ssRNA viruses from their limited sequence data. Novel phylogenetic-tree-based databases of protein or nucleic acid residues that uniquely characterize these virus strains are created. Strains of input virus sequences (partial or complete) are predicted through residue-wise comparisons with the databases. RECOVIR uses unique characterizing residues to identify automatically strains of partial or complete capsid sequences of picorna and caliciviruses, two of the most highly diverse ssRNA virus families. Partition-wise comparisons of the database residues with the corresponding residues of more than 300 complete and partial sequences of these viruses resulted in correct strain identification for all of these sequences. This study shows the feasibility of creating databases of hitherto unknown residues uniquely characterizing the capsid sequences of two of the most highly divergent ssRNA virus families. These databases enable automated strain identification from partial or complete capsid sequences of these human and animal pathogens.
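The residue-wise comparison described in this abstract can be pictured as a lookup of characteristic residues at fixed positions. The sketch below is a toy illustration only: the signature table, the strain names, and the rule for partial sequences (positions outside the input are skipped, but at least one position must be checked) are hypothetical and are not taken from RECOVIR.

```python
def identify_strain(sequence, signatures):
    """Return the strains whose characteristic residues all match `sequence`.

    `signatures` maps a strain name to {0-based position: expected residue}.
    Positions beyond the end of a partial sequence are ignored, but a strain
    is reported only if at least one of its positions could be checked.
    """
    matches = []
    for strain, residues in signatures.items():
        checked = 0
        consistent = True
        for position, expected in residues.items():
            if position >= len(sequence):
                continue  # not covered by this partial sequence
            checked += 1
            if sequence[position] != expected:
                consistent = False
                break
        if consistent and checked > 0:
            matches.append(strain)
    return matches


# Hypothetical signature database for two made-up strains.
signatures = {
    "strainA": {1: "K", 4: "T"},
    "strainB": {1: "R", 4: "S"},
}
full_hit = identify_strain("MKLVTG", signatures)
partial_hit = identify_strain("MR", signatures)
```

In this toy case the complete sequence matches only strainA, and the two-residue fragment still identifies strainB because the one covered position is enough to exclude strainA.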
1988-06-01
partial fulfillment of the requirements for the degree of MASTER OF SCIENCE IN MANAGEMENT from the NAVAL POSTGRADUATE SCHOOL June 1988 Author: Denise M...of work), management study reviews and detailed cost comparisons. A Cost Comparison Handbook (CCH), also published in 1979, provided detailed...1, dated 12 August 1985. The cost comparison methodology was changed from the complex full cost method outlined in the CCH, to a simpler incremental
Cloud, Joann L; Harmsen, Dag; Iwen, Peter C; Dunn, James J; Hall, Gerri; Lasala, Paul Rocco; Hoggan, Karen; Wilson, Deborah; Woods, Gail L; Mellmann, Alexander
2010-04-01
Correct identification of nonfermenting Gram-negative bacilli (NFB) is crucial for patient management. We compared phenotypic identifications of 96 clinical NFB isolates with identifications obtained by 5' 16S rRNA gene sequencing. Sequencing identified 88 isolates (91.7%) with >99% similarity to a sequence from the assigned species; 61.5% of sequencing results were concordant with phenotypic results, indicating the usability of sequencing to identify NFB.
2016-09-09
evaluating 18 mutants using either the A or B conformer is only r = ~ 0.2. Given the poor performance of approximating the observed experimental ...1 Sequence Tolerance of a Highly Stable Single Domain Antibody: Comparison of Computational and Experimental Profiles Mark A. Olson,1 Patricia...unusually high thermal stability is explored by a combined computational and experimental study. Starting with the crystallographic structure
2018-01-01
It is widely assumed that distributed neuronal networks are fundamental to the functioning of the brain. Consistent spike timing between neurons is thought to be one of the key principles for the formation of these networks. This can involve synchronous spiking or spiking with time delays, forming spike sequences when the order of spiking is consistent. Finding networks defined by their sequence of time-shifted spikes, denoted here as spike timing networks, is a tremendous challenge. As neurons can participate in multiple spike sequences at multiple between-spike time delays, the possible complexity of networks is prohibitively large. We present a novel approach that is capable of (1) extracting spike timing networks regardless of their sequence complexity, and (2) describing their spiking sequences with high temporal precision. We achieve this by decomposing frequency-transformed neuronal spiking into separate networks, characterizing each network’s spike sequence by a time delay per neuron, forming a spike sequence timeline. These networks provide a detailed template for an investigation of the experimental relevance of their spike sequences. Using simulated spike timing networks, we show network extraction is robust to spiking noise, spike timing jitter, and partial occurrences of the involved spike sequences. Using rat multineuron recordings, we demonstrate the approach is capable of revealing real spike timing networks with sub-millisecond temporal precision. By uncovering spike timing networks, the prevalence, structure, and function of complex spike sequences can be investigated in greater detail, allowing us to gain a better understanding of their role in neuronal functioning. PMID:29789811
The Pizza Problem: A Solution with Sequences
ERIC Educational Resources Information Center
Shafer, Kathryn G.; Mast, Caleb J.
2008-01-01
This article addresses the issues of coaching and assessing. A preservice middle school teacher's unique solution to the Pizza problem was not what the professor expected. The student's solution strategy, based on sequences and a reinvention of Pascal's triangle, is explained in detail. (Contains 8 figures.)
Cardiovascular magnetic resonance physics for clinicians: part II
2012-01-01
This is the second of two reviews that are intended to cover the essential aspects of cardiovascular magnetic resonance (CMR) physics in a way that is understandable and relevant to clinicians using CMR in their daily practice. Starting with the basic pulse sequences and contrast mechanisms described in part I, it briefly discusses further approaches to accelerate image acquisition. It then continues by showing in detail how the contrast behaviour of black blood fast spin echo and bright blood cine gradient echo techniques can be modified by adding rf preparation pulses to derive a number of more specialised pulse sequences. The simplest examples described include T2-weighted oedema imaging, fat suppression and myocardial tagging cine pulse sequences. Two further important derivatives of the gradient echo pulse sequence, obtained by adding preparation pulses, are used in combination with the administration of a gadolinium-based contrast agent for myocardial perfusion imaging and the assessment of myocardial tissue viability using a late gadolinium enhancement (LGE) technique. These two imaging techniques are discussed in more detail, outlining the basic principles of each pulse sequence, the practical steps required to achieve the best results in a clinical setting and, in the case of perfusion, explaining some of the factors that influence current approaches to perfusion image analysis. The key principles of contrast-enhanced magnetic resonance angiography (CE-MRA) are also explained in detail, especially focusing on timing of the acquisition following contrast agent bolus administration, and current approaches to achieving time resolved MRA. Alternative MRA techniques that do not require the use of an exogenous contrast agent are summarised, and the specialised pulse sequence used to image the coronary arteries, using respiratory navigator gating, is described in detail. 
The article concludes by explaining the principle behind phase contrast imaging techniques which create images that represent the phase of the MR signal rather than the magnitude. It is shown how this principle can be used to generate velocity maps by designing gradient waveforms that give rise to a relative phase change that is proportional to velocity. Choice of velocity encoding range and key pitfalls in the use of this technique are discussed. PMID:22995744
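As a rough illustration of the proportionality between phase and velocity described above, the following minimal sketch assumes the common convention that a phase of π corresponds to the velocity encoding range (venc). The function names and the numerical values are hypothetical and are not taken from the review.

```python
import math


def wrap_phase(phase_rad):
    """Wrap any phase into [-pi, pi), mimicking what a phase image can represent."""
    return (phase_rad + math.pi) % (2 * math.pi) - math.pi


def phase_to_velocity(phase_rad, venc_cm_s):
    """Convert a measured phase to velocity, assuming phase = pi at v = venc."""
    return venc_cm_s * phase_rad / math.pi


def measured_velocity(true_velocity_cm_s, venc_cm_s):
    """Velocity that would be reported for a given true velocity and venc."""
    true_phase = math.pi * true_velocity_cm_s / venc_cm_s
    return phase_to_velocity(wrap_phase(true_phase), venc_cm_s)


# A velocity inside the encoding range is recovered as-is; one above it aliases.
in_range = measured_velocity(75.0, 150.0)
aliased = measured_velocity(200.0, 150.0)
```

Under these assumptions, a true velocity above venc wraps around and is reported with the wrong sign and magnitude, which is one reason the choice of velocity encoding range matters.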
Genome sequence of Lactobacillus rhamnosus ATCC 8530.
Pittet, Vanessa; Ewen, Emily; Bushell, Barry R; Ziola, Barry
2012-02-01
Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.
Scalable Kernel Methods and Algorithms for General Sequence Analysis
ERIC Educational Resources Information Center
Kuksa, Pavel
2011-01-01
Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…
Identification of food and beverage spoilage yeasts from DNA sequence analyses
USDA-ARS?s Scientific Manuscript database
Detection, identification, and classification of yeasts has undergone a major transformation in the last decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined DNA sequences from domains 1 and 2 (D1/D2) of th...
Sequenced sorghum mutant library- an efficient platform for discovery of causal gene mutations
USDA-ARS?s Scientific Manuscript database
Ethyl methanesulfonate (EMS) efficiently generates high-density mutations in genomes. We applied whole-genome sequencing to 256 phenotyped mutant lines of sorghum (Sorghum bicolor L. Moench) to 16x coverage. Comparisons with the reference sequence revealed >1.8 million canonical EMS-induced G/C to A...
De Novo Protein Structure Prediction
NASA Astrophysics Data System (ADS)
Hung, Ling-Hong; Ngan, Shing-Chung; Samudrala, Ram
An unparalleled amount of sequence data is being made available from large-scale genome sequencing efforts. The data provide a shortcut to the determination of the function of a gene of interest, as long as there is an existing sequenced gene with similar sequence and of known function. This has spurred structural genomic initiatives with the goal of determining as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose of this is twofold: First, the structure of a gene product can often lead to direct inference of its function. Second, since the function of a protein is dependent on its structure, direct comparison of the structures of gene products can be more sensitive than the comparison of sequences of genes for detecting homology. Presently, structural determination by crystallography and NMR techniques is still slow and expensive in terms of manpower and resources, despite attempts to automate the processes. Computer structure prediction algorithms, while not providing the accuracy of the traditional techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures which the structural genomic projects are attempting to solve, there would be a considerable gain even if the computer structure prediction approach were applicable to a subset of proteins.
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments
Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic
2001-01-01
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Cortijo, Sandra; Charoensawan, Varodom; Roudier, François; Wigge, Philip A
2018-01-01
Chromatin immunoprecipitation combined with next-generation sequencing (ChIP-seq) is a powerful technique to investigate in vivo transcription factor (TF) binding to DNA, as well as chromatin marks. Here we provide a detailed protocol for all the key steps to perform ChIP-seq in Arabidopsis thaliana roots, also working on other A. thaliana tissues and in most non-ligneous plants. We detail all steps from material collection, fixation, chromatin preparation, immunoprecipitation, library preparation, and finally computational analysis based on a combination of publicly available tools.
AMS 4.0: consensus prediction of post-translational modifications in protein sequences.
Plewczynski, Dariusz; Basu, Subhadip; Saha, Indrajit
2012-08-01
We present here the 2011 update of the AutoMotif Service (AMS 4.0) that predicts the wide selection of 88 different types of the single amino acid post-translational modifications (PTM) in protein sequences. The selection of experimentally confirmed modifications is acquired from the latest UniProt and Phospho.ELM databases for training. The sequence vicinity of each modified residue is represented using amino acid physico-chemical features encoded using high quality indices (HQI) obtained by automatic clustering of known indices extracted from the AAindex database. For each type of the numerical representation, the method builds the ensemble of Multi-Layer Perceptron (MLP) pattern classifiers, each optimising different objectives during the training (for example the recall, precision or area under the ROC curve (AUC)). The consensus is built using brainstorming technology, which combines multi-objective instances of machine learning algorithm, and the data fusion of different training objects representations, in order to boost the overall prediction accuracy of conserved short sequence motifs. The performance of AMS 4.0 is compared with the accuracy of previous versions, which were constructed using single machine learning methods (artificial neural networks, support vector machine). Our software improves the average AUC score of the earlier version by close to 7 % as calculated on the test datasets of all 88 PTM types. Moreover, for the selected most-difficult sequence motifs types it is able to improve the prediction performance by almost 32 %, when compared with previously used single machine learning methods. Summarising, the brainstorming consensus meta-learning methodology on the average boosts the AUC score up to around 89 %, averaged over all 88 PTM types. Detailed results for single machine learning methods and the consensus methodology are also provided, together with the comparison to previously published methods and state-of-the-art software tools.
The source code and precompiled binaries of brainstorming tool are available at http://code.google.com/p/automotifserver/ under Apache 2.0 licensing.
Gilbert, Guillaume; Savard, Geneviève; Bard, Céline; Beaudoin, Gilles
2012-06-01
The aim of this study was to investigate the benefits arising from the use of a multiecho sequence for susceptibility-weighted phase imaging using a quantitative comparison with a standard single-echo acquisition. Four healthy adult volunteers were imaged on a clinical 3-T system using a protocol comprising two different three-dimensional susceptibility-weighted gradient-echo sequences: a standard single-echo sequence and a multiecho sequence. Both sequences were repeated twice in order to evaluate the local noise contribution by a subtraction of the two acquisitions. For the multiecho sequence, the phase information from each echo was independently unwrapped, and the background field contribution was removed using either homodyne filtering or the projection onto dipole fields method. The phase information from all echoes was then combined using a weighted linear regression. R2 maps were also calculated from the multiecho acquisitions. The noise standard deviation in the reconstructed phase images was evaluated for six manually segmented regions of interest (frontal white matter, posterior white matter, globus pallidus, putamen, caudate nucleus and lateral ventricle). The use of the multiecho sequence for susceptibility-weighted phase imaging led to a reduction of the noise standard deviation for all subjects and all regions of interest investigated in comparison to the reference single-echo acquisition. On average, the noise reduction ranged from 18.4% for the globus pallidus to 47.9% for the lateral ventricle. In addition, the amount of noise reduction was found to be strongly inversely correlated to the estimated R2 value (R=-0.92). In conclusion, the use of a multiecho sequence is an effective way to decrease the noise contribution in susceptibility-weighted phase images, while preserving both contrast and acquisition time. The proposed approach additionally permits the calculation of R2 maps. Copyright © 2012 Elsevier Inc. All rights reserved.
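The echo combination step can be pictured as fitting a straight line to unwrapped phase versus echo time for each voxel. The sketch below is a generic weighted least-squares fit, not the authors' implementation: the abstract does not state how the weights were chosen, so the weights here are simply an input, and all names and numbers are illustrative.

```python
def weighted_linear_fit(echo_times, phases, weights):
    """Weighted least-squares fit of phase = intercept + slope * echo_time.

    Returns (slope, intercept). The weighting scheme is left to the caller;
    it is an assumption of this sketch, not something given in the abstract.
    """
    sw = sum(weights)
    sx = sum(w * t for w, t in zip(weights, echo_times))
    sy = sum(w * p for w, p in zip(weights, phases))
    sxx = sum(w * t * t for w, t in zip(weights, echo_times))
    sxy = sum(w * t * p for w, t, p in zip(weights, echo_times, phases))
    denom = sw * sxx - sx * sx
    if denom == 0:
        raise ValueError("need at least two distinct echo times with non-zero weight")
    slope = (sw * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / sw
    return slope, intercept


# Hypothetical voxel: four echoes, phase growing linearly with echo time.
tes = [0.005, 0.010, 0.015, 0.020]          # seconds
phi = [0.1 + 20.0 * te for te in tes]       # radians, noise-free for clarity
w = [4.0, 3.0, 2.0, 1.0]                    # e.g. larger weight for earlier echoes
slope, intercept = weighted_linear_fit(tes, phi, w)
```

With noisy data, later echoes typically carry less signal, so down-weighting them is one plausible choice; other weightings would fit the same function.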
Genome analysis of medicinal Ganoderma spp. with plant-pathogenic and saprotrophic life-styles.
Kües, Ursula; Nelson, David R; Liu, Chang; Yu, Guo-Jun; Zhang, Jianhui; Li, Jianqin; Wang, Xin-Cun; Sun, Hui
2015-06-01
Ganoderma is a fungal genus belonging to the Ganodermataceae family and Polyporales order. Plant-pathogenic species in this genus can cause severe diseases (stem, butt, and root rot) in economically important trees and perennial crops, especially in tropical countries. Ganoderma species are white rot fungi and have ecological importance in the breakdown of woody plants for nutrient mobilization. They possess effective machineries of lignocellulose-decomposing enzymes useful for bioenergy production and bioremediation. In addition, the genus contains many important species that produce pharmacologically active compounds used in health food and medicine. With the rapid adoption of next-generation DNA sequencing technologies, whole genome sequencing and systematic transcriptome analyses become affordable approaches to identify an organism's genes. In the last few years, numerous projects have been initiated to identify the genetic contents of several Ganoderma species, particularly in different strains of Ganoderma lucidum. In November 2013, eleven whole genome sequencing projects for Ganoderma species were registered in international databases, three of which were already completed with genomes being assembled to high quality. In addition to the nuclear genome, two mitochondrial genomes for Ganoderma species have also been reported. Complementing genome analysis, four transcriptome studies on various developmental stages of Ganoderma species have been performed. Information obtained from these studies has laid the foundation for the identification of genes involved in biological pathways that are critical for understanding the biology of Ganoderma, such as the mechanism of pathogenesis, the biosynthesis of active components, life cycle and cellular development, etc. 
With abundant genetic information becoming available, a few centralized resources have been established to disseminate the knowledge and integrate relevant data to support comparative genomic analyses of Ganoderma species. The current review carries out a detailed comparison of the nuclear genomes, mitochondrial genomes and transcriptomes from several Ganoderma species. Genes involved in biosynthetic pathways such as CYP450 genes and in cellular development such as matA and matB genes are characterized and compared in detail, as examples to demonstrate the usefulness of comparative genomic analyses for the identification of critical genes. Resources needed for future data integration and exploitation are also discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.
Automatic analysis of the 2015 Gorkha earthquake aftershock sequence.
NASA Astrophysics Data System (ADS)
Baillard, C.; Lyon-Caen, H.; Bollinger, L.; Rietbrock, A.; Letort, J.; Adhikari, L. B.
2016-12-01
The Mw 7.8 Gorkha earthquake, which partially ruptured the Main Himalayan Thrust north of Kathmandu on 25 April 2015, was the largest and most catastrophic earthquake to strike Nepal since the great M8.4 1934 earthquake. This mainshock was followed by multiple aftershocks, among them two notable events that occurred on 12 May with magnitudes of Mw 7.3 and Mw 6.3. Due to these recent events it became essential for the authorities and for the scientific community to better evaluate the seismic risk in the region through a detailed analysis of the earthquake catalog, including the spatio-temporal distribution of the Gorkha aftershock sequence. Here we complement this first study with a microseismic study using seismic data from the eastern part of the Nepalese Seismological Center network together with one broadband station in Everest. Our primary goal is to deliver an accurate catalog of the aftershock sequence. Due to the exceptional number of events detected, we performed an automatic picking/locating procedure which can be split into 4 steps: 1) coarse picking of the onsets using a classical STA/LTA picker, 2) phase association of picked onsets to detect and declare seismic events, 3) kurtosis pick refinement around theoretical arrival times to increase picking and location accuracy, and 4) local magnitude calculation based on the amplitude of waveforms. This procedure is time efficient (1 sec/event), considerably reduces the location uncertainties (2 to 5 km errors) and increases the number of events detected compared to manual processing. Indeed, the automatic detection rate is 10 times higher than the manual detection rate. By comparing to the USGS catalog we were able to give a new attenuation law to compute local magnitudes in the region.
A detailed analysis of the seismicity shows a clear migration toward the east of the region and a sudden decrease of seismicity 100 km east of Kathmandu which may reveal the presence of a tectonic feature acting as a seismic barrier. Comparison of the aftershock distribution with respect to the coseismic slip distribution will be discussed.
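The first step of the procedure, coarse onset picking with a classical STA/LTA picker, can be sketched as the ratio of a short-term to a long-term average of signal energy. The window lengths, threshold and synthetic trace below are assumptions chosen only for illustration; the abstract does not give the parameters used in the study.

```python
def sta_lta(signal, n_sta, n_lta):
    """Classic STA/LTA ratio on squared amplitude, using trailing windows.

    Samples before the first full long-term window are left at 0.0.
    """
    if not 0 < n_sta <= n_lta:
        raise ValueError("require 0 < n_sta <= n_lta")
    energy = [x * x for x in signal]
    csum = [0.0]                      # prefix sums of energy
    for e in energy:
        csum.append(csum[-1] + e)
    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (csum[i + 1] - csum[i + 1 - n_sta]) / n_sta
        lta = (csum[i + 1] - csum[i + 1 - n_lta]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


def first_trigger(ratios, threshold):
    """Index of the first sample whose ratio exceeds the threshold, or None."""
    for i, r in enumerate(ratios):
        if r > threshold:
            return i
    return None


# Synthetic trace: low-amplitude noise followed by a stronger arrival at sample 200.
trace = [0.1 * (-1) ** k for k in range(200)] + [1.0 * (-1) ** k for k in range(50)]
onset = first_trigger(sta_lta(trace, n_sta=5, n_lta=50), threshold=3.0)
```

A pick of this kind is deliberately coarse; in the workflow described above it would only seed the later association and refinement steps.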
Núñez-Vivanco, Gabriel; Valdés-Jiménez, Alejandro; Besoaín, Felipe; Reyes-Parada, Miguel
2016-01-01
Since the structure of proteins is more conserved than the sequence, the identification of conserved three-dimensional (3D) patterns among a set of proteins, can be important for protein function prediction, protein clustering, drug discovery and the establishment of evolutionary relationships. Thus, several computational applications to identify, describe and compare 3D patterns (or motifs) have been developed. Often, these tools consider a 3D pattern as that described by the residues surrounding co-crystallized/docked ligands available from X-ray crystal structures or homology models. Nevertheless, many of the protein structures stored in public databases do not provide information about the location and characteristics of ligand binding sites and/or other important 3D patterns such as allosteric sites, enzyme-cofactor interaction motifs, etc. This makes necessary the development of new ligand-independent methods to search and compare 3D patterns in all available protein structures. Here we introduce Geomfinder, an intuitive, flexible, alignment-free and ligand-independent web server for detailed estimation of similarities between all pairs of 3D patterns detected in any two given protein structures. We used around 1100 protein structures to form pairs of proteins which were assessed with Geomfinder. In these analyses each protein was considered in only one pair (e.g. in a subset of 100 different proteins, 50 pairs of proteins can be defined). 
Thus: (a) Geomfinder detected identical pairs of 3D patterns in a series of monoamine oxidase-B structures, which corresponded to the effectively similar ligand binding sites at these proteins; (b) we identified structural similarities among pairs of protein structures which are targets of compounds such as acarbose, benzamidine, adenosine triphosphate and pyridoxal phosphate; these similar 3D patterns are not detected using sequence-based methods; (c) the detailed evaluation of three specific cases showed the versatility of Geomfinder, which was able to discriminate between similar and different 3D patterns related to binding sites of common substrates in a range of diverse proteins. Geomfinder allows detecting similar 3D patterns between any pair of protein structures, regardless of the divergence among their amino acid sequences. Although the software is not intended for simultaneous multiple comparisons in a large number of proteins, it can be particularly useful in cases such as the structure-based design of multitarget drugs, where a detailed analysis of 3D pattern similarities between a few selected protein targets is essential.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ovacik, Meric A.; Androulakis, Ioannis P., E-mail: yannis@rci.rutgers.edu; Biomedical Engineering Department, Rutgers University, Piscataway, NJ 08854
2013-09-15
Pathway-based information has become an important source of information for both establishing evolutionary relationships and understanding the mode of action of a chemical or pharmaceutical among species. Cross-species comparison of pathways can address two broad questions: comparison in order to inform evolutionary relationships and to extrapolate species differences used in a number of different applications including drug and toxicity testing. Cross-species comparison of metabolic pathways is complex as there are multiple features of a pathway that can be modeled and compared. Among the various methods that have been proposed, reaction alignment has emerged as the most successful at predicting phylogenetic relationships based on NCBI taxonomy. We propose an improvement of the reaction alignment method by accounting for sequence similarity in addition to reaction alignment method. Using nine species, including human and some model organisms and test species, we evaluate the standard and improved comparison methods by analyzing glycolysis and citrate cycle pathways conservation. In addition, we demonstrate how organism comparison can be conducted by accounting for the cumulative information retrieved from nine pathways in central metabolism as well as a more complete study involving 36 pathways common in all nine species. Our results indicate that reaction alignment with enzyme sequence similarity results in a more accurate representation of pathway specific cross-species similarities and differences based on NCBI taxonomy.
Sharma, Sanjeev Kumar; Bolser, Daniel; de Boer, Jan; Sønderkær, Mads; Amoros, Walter; Carboni, Martin Federico; D’Ambrosio, Juan Martín; de la Cruz, German; Di Genova, Alex; Douches, David S.; Eguiluz, Maria; Guo, Xiao; Guzman, Frank; Hackett, Christine A.; Hamilton, John P.; Li, Guangcun; Li, Ying; Lozano, Roberto; Maass, Alejandro; Marshall, David; Martinez, Diana; McLean, Karen; Mejía, Nilo; Milne, Linda; Munive, Susan; Nagy, Istvan; Ponce, Olga; Ramirez, Manuel; Simon, Reinhard; Thomson, Susan J.; Torres, Yerisf; Waugh, Robbie; Zhang, Zhonghua; Huang, Sanwen; Visser, Richard G. F.; Bachem, Christian W. B.; Sagredo, Boris; Feingold, Sergio E.; Orjeda, Gisella; Veilleux, Richard E.; Bonierbale, Merideth; Jacobs, Jeanne M. E.; Milbourne, Dan; Martin, David Michael Alan; Bryan, Glenn J.
2013-01-01
The genome of potato, a major global food crop, was recently sequenced. The work presented here details the integration of the potato reference genome (DM) with a new sequence-tagged site marker−based linkage map and other physical and genetic maps of potato and the closely related species tomato. Primary anchoring of the DM genome assembly was accomplished by the use of a diploid segregating population, which was genotyped with several types of molecular genetic markers to construct a new ~936 cM linkage map comprising 2469 marker loci. In silico anchoring approaches used genetic and physical maps from the diploid potato genotype RH89-039-16 (RH) and tomato. This combined approach has allowed 951 superscaffolds to be ordered into pseudomolecules corresponding to the 12 potato chromosomes. These pseudomolecules represent 674 Mb (~93%) of the 723 Mb genome assembly and 37,482 (~96%) of the 39,031 predicted genes. The superscaffold order and orientation within the pseudomolecules are closely collinear with independently constructed high density linkage maps. Comparisons between marker distribution and physical location reveal regions of greater and lesser recombination, as well as regions exhibiting significant segregation distortion. The work presented here has led to a greatly improved ordering of the potato reference genome superscaffolds into chromosomal “pseudomolecules”. PMID:24062527
Katharios, Pantelis; Seth-Smith, Helena M. B.; Fehr, Alexander; Mateos, José M.; Qi, Weihong; Richter, Denis; Nufer, Lisbeth; Ruetten, Maja; Guevara Soto, Maricruz; Ziegler, Urs; Thomson, Nicholas R; Schlapbach, Ralph; Vaughan, Lloyd
2015-01-01
Aquaculture is a burgeoning industry, requiring diversification into new farmed species, which are often at risk from infectious disease. We used a mesocosm technique to investigate the susceptibility of sharpsnout seabream (Diplodus puntazzo) larvae to potential environmental pathogens in seawater compared to control borehole water. Fish exposed to seawater succumbed to epitheliocystis from 21 days post hatching, causing mortality in a quarter of the hosts. The pathogen responsible was not chlamydial, as is often found in epitheliocystis, but a novel species of the γ-proteobacterial genus Endozoicomonas. Detailed characterisation of this pathogen within the infectious lesions using high resolution fluorescent and electron microscopy showed densely packed rod shaped bacteria. A draft genome sequence of this uncultured bacterium was obtained from preserved material. Comparison with the genome of the Endozoicomonas elysicola type strain shows that the genome of Ca. Endozoicomonas cretensis is undergoing decay through loss of functional genes and insertion sequence expansion, often indicative of adaptation to a new niche or restriction to an alternative lifestyle. These results demonstrate the advantage of mesocosm studies for investigating the effect of environmental bacteria on susceptible hosts and provide an important insight into the genome dynamics of a novel fish pathogen. PMID:26639610
FPGA Sequencer for Radar Altimeter Applications
NASA Technical Reports Server (NTRS)
Berkun, Andrew C.; Pollard, Brian D.; Chen, Curtis W.
2011-01-01
A sequencer for a radar altimeter provides accurate attitude information for a reliable soft landing of the Mars Science Laboratory (MSL). This is a field-programmable gate array (FPGA)-only implementation. A table loaded externally into the FPGA controls timing, processing, and decision structures. Radar is memory-less and does not use previous acquisitions to assist in the current acquisition. All cycles complete in exactly 50 milliseconds, regardless of range or whether a target was found. A RAM (random access memory) within the FPGA holds instructions for up to 15 sets. For each set, timing is run, echoes are processed, and a comparison is made. If a target is seen, more detailed processing is run on that set. If no target is seen, the next set is tried. When all sets have been run, the FPGA terminates and waits for the next 50-millisecond event. This setup simplifies testing and improves reliability. A single vertex chip does the work of an entire assembly. Output products require minor processing to become range and velocity. This technology is the heart of the Terminal Descent Sensor, which is an integral part of the Entry, Descent, and Landing system for MSL. In addition, it is a strong candidate for manned landings on Mars or the Moon.
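The table-driven decision flow described above can be sketched in software terms as a loop over at most 15 instruction sets. This is only an illustrative model of the control flow, not the flight logic: the callbacks, the parameter dictionary and the choice to stop at the first set that sees a target are all assumptions of this sketch, since the abstract does not say what happens to the remaining sets after a detection.

```python
MAX_SETS = 15  # the abstract states the RAM holds instructions for up to 15 sets


def run_cycle(instruction_sets, target_seen, detailed_processing):
    """Try each instruction set in order within one cycle.

    target_seen(params) models "timing is run, echoes are processed, and a
    comparison is made"; detailed_processing(params) models the extra
    processing run on a set in which a target is seen. Both are hypothetical
    callbacks. Returns (index_of_set, result), or (None, None) if no set
    detects a target. Assumption: the loop stops at the first detection.
    """
    if len(instruction_sets) > MAX_SETS:
        raise ValueError("at most 15 instruction sets are supported")
    for index, params in enumerate(instruction_sets):
        if target_seen(params):
            return index, detailed_processing(params)
    return None, None  # all sets tried; wait for the next 50 ms event


# Hypothetical table: each set probes a different range gate.
table = [{"range_gate": r} for r in (1, 2, 3)]
hit = run_cycle(table, lambda p: p["range_gate"] == 2, lambda p: p["range_gate"] * 10)
miss = run_cycle(table, lambda p: False, lambda p: None)
```

Because the cycle does not depend on earlier acquisitions, each call is independent, which mirrors the memory-less behaviour the abstract credits with simplifying testing.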
van der Meer, Jitse M
2018-01-01
The genetic regulation of anterior-posterior segment pattern development has been elucidated in detail for Drosophila, but it is not canonical for insects. A surprising diversity of regulatory mechanisms is being uncovered not only between insect orders, but also within the order of the Diptera. The question is whether the same diversity of regulatory mechanisms exists within other insect orders. I show that anterior puncture of the egg of the pea beetle Callosobruchus maculatus submerged in RNase can induce double abdomen development suggesting a role for maternal mRNA. In a double abdomen, anterior segments are replaced by posterior segments oriented in mirror image symmetry to the original posterior segments. This effect is specific for RNase activity, for treatment of the anterior egg pole and for cytoplasmic RNA. Yield depends on developmental stage, enzyme concentration, and temperature. A maximum of 30% of treated eggs reversed segment sequence after submersion and puncture in 10 μg/mL RNase S reconstituted from S-protein and S-peptide at 30°C. This result sets the stage for an analysis of the genetic regulation of segment pattern formation in the long germ embryo of the coleopteran Callosobruchus and for comparison with the short germ embryo of the coleopteran Tribolium. © 2018 Wiley Periodicals, Inc.
Vissers, Lisenka E L M; van Nimwegen, Kirsten J M; Schieving, Jolanda H; Kamsteeg, Erik-Jan; Kleefstra, Tjitske; Yntema, Helger G; Pfundt, Rolph; van der Wilt, Gert Jan; Krabbenborg, Lotte; Brunner, Han G; van der Burg, Simone; Grutters, Janneke; Veltman, Joris A; Willemsen, Michèl A A P
2017-09-01
Implementation of novel genetic diagnostic tests is generally driven by technological advances because they promise shorter turnaround times and/or higher diagnostic yields. Other aspects, including impact on clinical management or cost-effectiveness, are often not assessed in detail prior to implementation. We studied the clinical utility of whole-exome sequencing (WES) in complex pediatric neurology in terms of diagnostic yield and costs. We analyzed 150 patients (and their parents) presenting with complex neurological disorders of suspected genetic origin. In a parallel study, all patients received both the standard diagnostic workup (e.g., cerebral imaging, muscle biopsies or lumbar punctures, and sequential gene-by-gene-based testing) and WES simultaneously. Our unique study design allowed direct comparison of diagnostic yield of both trajectories and provided insight into the economic implications of implementing WES in this diagnostic trajectory. We showed that WES identified significantly more conclusive diagnoses (29.3%) than the standard care pathway (7.3%) without incurring higher costs. Exploratory analysis of WES as a first-tier diagnostic test indicates that WES may even be cost-saving, depending on the extent of other tests being omitted. Our data support such a use of WES in pediatric neurology for disorders of presumed genetic origin. Genet Med advance online publication 23 March 2017.
Signatures of DNA Methylation across Insects Suggest Reduced DNA Methylation Levels in Holometabola
Provataris, Panagiotis; Meusemann, Karen; Niehuis, Oliver; Grath, Sonja; Misof, Bernhard
2018-01-01
It has been experimentally shown that DNA methylation is involved in the regulation of gene expression and the silencing of transposable element activity in eukaryotes. The variable levels of DNA methylation among different insect species indicate an evolutionarily flexible role of DNA methylation in insects, which due to a lack of comparative data is not yet well-substantiated. Here, we use computational methods to trace signatures of DNA methylation across insects by analyzing transcriptomic and genomic sequence data from all currently recognized insect orders. We conclude that: 1) a functional methylation system relying exclusively on DNA methyltransferase 1 is widespread across insects. 2) DNA methylation has potentially been lost or extremely reduced in species belonging to springtails (Collembola), flies and relatives (Diptera), and twisted-winged parasites (Strepsiptera). 3) Holometabolous insects display signs of reduced DNA methylation levels in protein-coding sequences compared with hemimetabolous insects. 4) Evolutionarily conserved insect genes associated with housekeeping functions tend to display signs of heavier DNA methylation in comparison to the genomic/transcriptomic background. With this comparative study, we provide the much needed basis for experimental and detailed comparative analyses required to gain a deeper understanding on the evolution and function of DNA methylation in insects. PMID:29697817
Steuten, Benedikt; Wagner, Rolf
2012-12-01
6S RNA is a bacterial transcriptional regulator, which accumulates during stationary phase and inhibits transcription from many promoters due to stable association with σ70-containing RNA polymerase. This inhibitory RNA polymerase∼6S RNA complex dissociates during nutritional upshift, when cells undergo outgrowth from stationary phase, releasing active RNA polymerase ready for transcription. The release reaction depends on a characteristic property of 6S RNAs, namely to act as template for the de novo synthesis of small RNAs, termed pRNAs. Here, we used limited hydrolysis with structure-specific RNases and in-line probing of isolated 6S RNA and 6S RNA∼pRNA complexes to investigate the molecular details leading to the release reaction. Our results indicate that pRNA transcription induces the refolding of the 6S RNA secondary structure by disrupting part of the closing stem (conserved sequence regions CRI and CRIV) and formation of a new hairpin (conserved sequence regions CRIII and CRIV). Comparison of the dimethylsulfate modification pattern of 6S RNA in living cells at stationary growth and during outgrowth confirmed the conformational change observed in vitro. Based on our results, a model describing the individual steps of the release reaction is presented.
Vissers, Lisenka E.L.M.; van Nimwegen, Kirsten J.M.; Schieving, Jolanda H.; Kamsteeg, Erik-Jan; Kleefstra, Tjitske; Yntema, Helger G.; Pfundt, Rolph; van der Wilt, Gert Jan; Krabbenborg, Lotte; Brunner, Han G.; van der Burg, Simone; Grutters, Janneke; Veltman, Joris A.; Willemsen, Michèl A.A.P.
2017-01-01
Purpose: Implementation of novel genetic diagnostic tests is generally driven by technological advances because they promise shorter turnaround times and/or higher diagnostic yields. Other aspects, including impact on clinical management or cost-effectiveness, are often not assessed in detail prior to implementation. Methods: We studied the clinical utility of whole-exome sequencing (WES) in complex pediatric neurology in terms of diagnostic yield and costs. We analyzed 150 patients (and their parents) presenting with complex neurological disorders of suspected genetic origin. In a parallel study, all patients received both the standard diagnostic workup (e.g., cerebral imaging, muscle biopsies or lumbar punctures, and sequential gene-by-gene–based testing) and WES simultaneously. Results: Our unique study design allowed direct comparison of diagnostic yield of both trajectories and provided insight into the economic implications of implementing WES in this diagnostic trajectory. We showed that WES identified significantly more conclusive diagnoses (29.3%) than the standard care pathway (7.3%) without incurring higher costs. Exploratory analysis of WES as a first-tier diagnostic test indicates that WES may even be cost-saving, depending on the extent of other tests being omitted. Conclusion: Our data support such a use of WES in pediatric neurology for disorders of presumed genetic origin. Genet Med advance online publication 23 March 2017 PMID:28333917
Louca, Stilianos; Jacques, Saulo M S; Pires, Aliny P F; Leal, Juliana S; González, Angélica L; Doebeli, Michael; Farjalla, Vinicius F
2017-08-01
Phytotelmata in tank-forming Bromeliaceae plants are regarded as potential miniature models for aquatic ecology, but detailed investigations of their microbial communities are rare. Hence, the biogeochemistry in bromeliad tanks remains poorly understood. Here we investigate the structure of bacterial and archaeal communities inhabiting the detritus within the tanks of two bromeliad species, Aechmea nudicaulis and Neoregelia cruenta, from a Brazilian sand dune forest. We used metagenomic sequencing for functional community profiling and 16S sequencing for taxonomic profiling. We estimated the correlation between functional groups and various environmental variables, and compared communities between bromeliad species. In all bromeliads, microbial communities spanned a metabolic network adapted to oxygen-limited conditions, including all denitrification steps, ammonification, sulfate respiration, methanogenesis, reductive acetogenesis and anoxygenic phototrophy. Overall, CO2 reducers dominated in abundance over sulfate reducers, and anoxygenic phototrophs largely outnumbered oxygenic photoautotrophs. Functional community structure correlated strongly with environmental variables, between and within a single bromeliad species. Methanogens and reductive acetogens correlated with detrital volume and canopy coverage, and exhibited higher relative abundances in N. cruenta. A comparison of bromeliads to freshwater lake sediments and soil from around the world revealed stark differences in terms of taxonomic as well as functional microbial community structure. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.
Using Morpholinos to Probe Gene Networks in Sea Urchin.
Materna, Stefan C
2017-01-01
The control processes that underlie the progression of development can be summarized in maps of gene regulatory networks (GRNs). A critical step in their assembly is the systematic perturbation of network candidates. In sea urchins the most important method for interfering with expression in a gene-specific way is application of morpholino antisense oligonucleotides (MOs). MOs act by binding to their sequence complement in transcripts resulting in a block in translation or a change in splicing and thus result in a loss of function. Despite the tremendous success of this technology, recent comparisons to mutants generated by genome editing have led to renewed criticism and challenged its reliability. As with all methods based on sequence recognition, MOs are prone to off-target binding that may result in phenotypes that are erroneously ascribed to the loss of the intended target. However, the slow progression of development in sea urchins has enabled extremely detailed studies of gene activity in the embryo. This wealth of knowledge paired with the simplicity of the sea urchin embryo enables careful analysis of MO phenotypes through a variety of methods that do not rely on terminal phenotypes. This article summarizes the use of MOs in probing GRNs and the steps that should be taken to assure their specificity.
DNA sequence similarity recognition by hybridization to short oligomers
Milosavljevic, Aleksandar
1999-01-01
Methods are disclosed for the comparison of nucleic acid sequences. Data is generated by hybridizing sets of oligomers with target nucleic acids. The data thus generated is manipulated simultaneously with respect to both (i) matching between oligomers and (ii) matching between oligomers and putative reference sequences available in databases. Using data compression methods to manipulate this mutual information, sequences for the target can be constructed.
Dostálová, Anna; Votýpka, Jan; Favreau, Amanda J; Barbian, Kent D; Volf, Petr; Valenzuela, Jesus G; Jochim, Ryan C
2011-05-10
Parasite-vector interactions are fundamental in the transmission of vector-borne diseases such as leishmaniasis. Leishmania development in the vector sand fly is confined to the digestive tract, where sand fly midgut molecules interact with the parasites. In this work we sequenced and analyzed two midgut-specific cDNA libraries from sugar fed and blood fed female Phlebotomus perniciosus and compared the transcript expression profiles. A total of 4111 high quality sequences were obtained from the two libraries and assembled into 370 contigs and 1085 singletons. Molecules with putative roles in blood meal digestion, peritrophic matrix formation, immunity and response to oxidative stress were identified, including proteins that were not previously reported in sand flies. These molecules were evaluated relative to other published sand fly transcripts. Comparative analysis of the two libraries revealed transcripts differentially expressed in response to blood feeding. Molecules up regulated by blood feeding include a putative peritrophin (PperPer1), two chymotrypsin-like proteins (PperChym1 and PperChym2), a putative trypsin (PperTryp3) and four putative microvillar proteins (PperMVP1, 2, 4 and 5). Additionally, several transcripts were more abundant in the sugar fed midgut, such as two putative trypsins (PperTryp1 and PperTryp2), a chymotrypsin (PperChym3) and a microvillar protein (PperMVP3). We performed a detailed temporal expression profile analysis of the putative trypsin transcripts using qPCR and confirmed the expression of blood-induced and blood-repressed trypsins. Trypsin expression was measured in Leishmania infantum-infected and uninfected sand flies, which identified the L. infantum-induced down regulation of PperTryp3 at 24 hours post-blood meal. This midgut tissue-specific transcriptome provides insight into the molecules expressed in the midgut of P. perniciosus, an important vector of visceral leishmaniasis in the Old World. 
Through the comparative analysis of the libraries we identified molecules differentially expressed during blood meal digestion. Additionally, this study provides a detailed comparison to transcripts of other sand flies. Moreover, our analysis of putative trypsins demonstrated that L. infantum infection can reduce the transcript abundance of trypsin PperTryp3 in the midgut of P. perniciosus.
A detailed gravimetric geoid of North America, Eurasia, and Australia
NASA Technical Reports Server (NTRS)
Vincent, S.; Strange, W. E.
1972-01-01
A detailed gravimetric geoid of North America, the North Atlantic, Eurasia, and Australia, computed from a combination of satellite-derived and surface 1 x 1 gravity data, is presented. Using a consistent set of parameters, this geoid is referenced to an absolute datum. The precision of this detailed geoid is ±2 meters in the continents but may be in the range of 5 to 7 meters in those areas where data was sparse. Comparisons of the detailed gravimetric geoid with results of Rice for the United States, Bomford and Fischer in Eurasia, and Mather in Australia are presented. Comparisons are also presented with geoid heights from satellite solutions for geocentric station coordinates in North America, the Caribbean, Europe, and Australia.
Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M
2014-01-01
A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.
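The discrimination capacity used as a metric above is commonly defined in forensic genetics as the number of distinct haplotypes divided by the number of samples. The following is a minimal sketch under that assumption; the function name and the toy haplotype labels are illustrative and are not taken from the study.

```python
from typing import Hashable, Sequence


def discrimination_capacity(haplotypes: Sequence[Hashable]) -> float:
    """Return distinct haplotypes / total samples (assumed definition)."""
    if not haplotypes:
        raise ValueError("at least one sample is required")
    # A set keeps one copy of each distinct haplotype label.
    return len(set(haplotypes)) / len(haplotypes)


# Toy data: four samples, three distinct haplotypes.
toy_samples = ["H1", "H1", "H2", "H3"]
print(discrimination_capacity(toy_samples))
```

An assay that covers more nucleotides can split samples that a shorter assay lumps together, which raises the numerator and hence the capacity.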
Comparison of ribosomal RNA removal methods for transcriptome sequencing workflows in teleost fish
USDA-ARS's Scientific Manuscript database
RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. rRNAs are commonly removed from total RNA via either mRNA selection or rRNA depletion. These methods have...
Genome Sequence of Lactobacillus rhamnosus ATCC 8530
Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.
2012-01-01
Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences. PMID:22247527
Khan, A S
1984-01-01
The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS-Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247-specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247, NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney- and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower
DOE Office of Scientific and Technical Information (OSTI.GOV)
Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.
2006-01-20
Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnL-trnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Eusterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at a fine level and of RNA editing by comparison to available EST sequences. 
In addition, since both of these genomes are crop plants, their complete genome sequence will facilitate development of chloroplast genetic engineering technology, as in recent studies from Daniell's lab. Knowing the exact sequence from spacer regions is crucial for introducing transgenes into the chloroplast genome.
Benabdelkrim Filali, Oumama; Kabine, Mostafa; El Hamouchi, Adil; Lemrani, Meryem; Debboun, Mustapha; Sarih, M'hammed
2018-06-05
Anopheles sergentii, known as the "oasis vector" or the "desert malaria vector", is considered the main vector of malaria in the southern parts of Morocco. Its presence in Morocco is confirmed for the first time through sequencing of mitochondrial DNA (mtDNA) cytochrome c oxidase subunit I (COI) barcodes and nuclear ribosomal DNA (rDNA) second internal transcribed spacer (ITS2) sequences and direct comparison with specimens of A. sergentii of other countries. The DNA barcodes (n = 39) obtained from A. sergentii collected in 2015 and 2016 showed more diversity with 10 haplotypes, compared with 3 haplotypes obtained from ITS2 sequences (n = 59). Moreover, the comparison using the ITS2 sequences showed a closer evolutionary relationship between the Moroccan and Egyptian strains than with the Iranian strain. Nevertheless, genetic differences due to geographical segregation were also observed. This study provides the first report on the sequence of rDNA-ITS2 and mtDNA COI, which could be used to better understand the biodiversity of A. sergentii.
Bào, Yīmíng; Amarasinghe, Gaya K; Basler, Christopher F; Bavari, Sina; Bukreyev, Alexander; Chandran, Kartik; Dolnik, Olga; Dye, John M; Ebihara, Hideki; Formenty, Pierre; Hewson, Roger; Kobinger, Gary P; Leroy, Eric M; Mühlberger, Elke; Netesov, Sergey V; Patterson, Jean L; Paweska, Janusz T; Smither, Sophie J; Takada, Ayato; Towner, Jonathan S; Volchkov, Viktor E; Wahl-Jensen, Victoria; Kuhn, Jens H
2017-05-11
The mononegaviral family Filoviridae has eight members assigned to three genera and seven species. Until now, genus and species demarcation were based on arbitrarily chosen filovirus genome sequence divergence values (≈50% for genera, ≈30% for species) and arbitrarily chosen phenotypic virus or virion characteristics. Here we report filovirus genome sequence-based taxon demarcation criteria using the publicly accessible PAirwise Sequence Comparison (PASC) tool of the US National Center for Biotechnology Information (Bethesda, MD, USA). Comparison of all available filovirus genomes in GenBank using PASC revealed optimal demarcation at the 55-58% sequence diversity threshold range for genera and at the 23-36% sequence diversity threshold range for species. Because these thresholds do not change the current official filovirus classification, these values are now implemented as filovirus taxon demarcation criteria that may solely be used for filovirus classification in case additional data are absent. A near-complete, coding-complete, or complete filovirus genome sequence will now be required to allow official classification of any novel "filovirus." Classification of filoviruses into existing taxa or determining the need for novel taxa is now straightforward and could even become automated using a presented algorithm/flowchart rooted in RefSeq (type) sequences.
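As an illustration of how threshold-based demarcation of this kind could be automated, the sketch below sorts a pairwise genome divergence value into one of three relationships. It is not the algorithm/flowchart presented by the authors. The cutoffs of 23% and 55% are assumptions, taken as the lower bounds of the reported 23-36% and 55-58% ranges, and the function name is hypothetical.

```python
def classify_pair(divergence_pct: float,
                  species_cutoff: float = 23.0,
                  genus_cutoff: float = 55.0) -> str:
    """Classify two genomes by pairwise sequence divergence (percent).

    The default cutoffs are assumed values picked from the reported
    threshold ranges; they are not prescribed by the source.
    """
    if not 0.0 <= divergence_pct <= 100.0:
        raise ValueError("divergence must be a percentage in [0, 100]")
    if divergence_pct < species_cutoff:
        return "same species"
    if divergence_pct < genus_cutoff:
        return "same genus, different species"
    return "different genera"


for value in (10.0, 45.0, 60.0):
    print(value, classify_pair(value))
```

Keeping the cutoffs as parameters makes it easy to test other values inside each reported range.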
Chimukangara, Benjamin; Varyani, Bhavini; Shamu, Tinei; Mutsvangwa, Junior; Manasa, Justen; White, Elizabeth; Chimbetete, Cleophas; Luethy, Ruedi; Katzenstein, David
2017-05-01
HIV genotyping is often unavailable in low and middle-income countries due to infrastructure requirements and cost. We compared genotype resistance testing in patients with virologic failure, by amplification of HIV pol gene, followed by "in-house" sequencing and commercial sequencing. Remnant plasma samples from adults and children failing second-line ART were amplified and sequenced using in-house and commercial di-deoxysequencing, and analyzed in Harare, Zimbabwe and at Stanford, U.S.A., respectively. HIV drug resistance mutations were determined using the Stanford HIV drug resistance database. Twenty-six of 28 samples were amplified and 25 were successfully genotyped. Average percent nucleotide and amino acid identities between the 23 pairs sequenced in both laboratories were 99.51 (±0.56) and 99.11 (±0.95), respectively. All pairs clustered together in phylogenetic analysis. Sequencing analysis identified 6/23 pairs with mutation discordances resulting in differences in phenotype, but these did not impact future regimens. The results demonstrate our ability to produce good quality drug resistance data in-house. Despite discordant mutations in some sequence pairs, the phenotypic predictions were not clinically significant. Copyright © 2016 Elsevier B.V. All rights reserved.
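Percent identity between a pair of sequences, as reported above, can be computed in several ways. The sketch below shows one common convention and assumes the two sequences are already aligned to equal length, with "-" marking gaps and gap columns excluded from the count. The function name and this gap handling are assumptions, not details from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped; this is
    one convention among several and is assumed here.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


print(percent_identity("ACGT", "ACGA"))
```

The same function works for amino acid sequences, since it only compares characters column by column.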
Nagahama, Hiroshi; Suzuki, Kengo; Shonai, Takaharu; Aratani, Kazuki; Sakurai, Yuuki; Nakamura, Manami; Sakata, Motomichi
2015-01-01
Electrodes are surgically implanted into the subthalamic nucleus (STN) of Parkinson's disease patients to provide deep brain stimulation. For ensuring correct positioning, the anatomic location of the STN must be determined preoperatively. Magnetic resonance imaging has been used for pinpointing the location of the STN. To identify the optimal imaging sequence for identifying the STN, we compared images produced with T2 star-weighted angiography (SWAN), gradient echo T2*-weighted imaging, and fast spin echo T2-weighted imaging in 6 healthy volunteers. Our comparison involved measurement of the contrast-to-noise ratio (CNR) for the STN and substantia nigra and a radiologist's interpretations of the images. Of the sequences examined, the CNR and qualitative scores were significantly higher on SWAN images than on other images (p < 0.01) for STN visualization. The kappa value (0.74) on SWAN images was the highest among the three sequences for visualizing the STN. SWAN is the sequence best suited for identifying the STN at the present time.
2013-01-01
Background Snake venoms generally show sequence and quantitative variation within and between species, but some rattlesnakes have undergone exceptionally rapid, dramatic shifts in the composition, lethality, and pharmacological effects of their venoms. Such shifts have occurred within species, most notably in Mojave (Crotalus scutulatus), South American (C. durissus), and timber (C. horridus) rattlesnakes, resulting in some populations with extremely potent, neurotoxic venoms without the hemorrhagic effects typical of rattlesnake bites. Results To better understand the evolutionary changes that resulted in the potent venom of a population of C. horridus from northern Florida, we sequenced the venom-gland transcriptome of an animal from this population for comparison with the previously described transcriptome of the eastern diamondback rattlesnake (C. adamanteus), a congener with a more typical rattlesnake venom. Relative to the toxin transcription of C. adamanteus, which consisted primarily of snake-venom metalloproteinases, C-type lectins, snake-venom serine proteinases, and myotoxin-A, the toxin transcription of C. horridus was far simpler in composition and consisted almost entirely of snake-venom serine proteinases, phospholipases A2, and bradykinin-potentiating and C-type natriuretic peptides. Crotalus horridus lacked significant expression of the hemorrhagic snake-venom metalloproteinases and C-type lectins. Evolution of shared toxin families involved differential expansion and loss of toxin clades within each species and pronounced differences in the highly expressed toxin paralogs. Toxin genes showed significantly higher rates of nonsynonymous substitution than nontoxin genes. The expression patterns of nontoxin genes were conserved between species, despite the vast differences in toxin expression. 
Conclusions Our results represent the first complete, sequence-based comparison between the venoms of closely related snake species and reveal in unprecedented detail the rapid evolution of snake venoms. We found that the difference in venom properties resulted from major changes in expression levels of toxin gene families, differential gene-family expansion and loss, changes in which paralogs within gene families were expressed at high levels, and higher nonsynonymous substitution rates in the toxin genes relative to nontoxins. These massive alterations in the genetics of the venom phenotype emphasize the evolutionary lability and flexibility of this ecologically critical trait. PMID:23758969
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalchman, M.; Lin, B.; Nasir, J.
1994-09-01
The mouse homologue of the Huntington disease gene (Hdh) has recently been cloned and mapped to a region of synteny with the human, on mouse chromosome 5. The two genes share a high degree of both coding (90% amino acid) and nucleotide (86.2%) identity. We have subsequently performed a detailed comparison of the genomic organization of the 5' region of the two genes encompassing the promoter region and first five exons of both the human and mouse genes. The comparative sequence analysis of the promoter region between HD and Hdh reveals two highly conserved regions. One region (-56 to -118, where +1 is the ATG start codon) shared 84% nucleotide identity and another region (-130 to -206) had 81% nucleotide identity. Nine putative Sp1 sites appear in the human promoter region contrasted with only 3 in a similar region in the mouse. Furthermore, 17 and 20 base pair direct repeats present in the HD 5' region are absent in the similar Hdh region. Although both the mouse and human intron/exon boundaries conform to the GT/AG rule, the intron sizes between HD and Hdh are markedly different. The first four introns in Hdh are 15, 7, 5 and 0.5 kb compared to sizes of 10, 15, 7 and 0.5 kb, respectively. Comparison between the mouse and human intronic sequences immediately adjacent to the first five exons (excluding exon 1) reveals only about 46 to 50% identity within the first 60 bp of intronic sequence. Furthermore, we have identified novel polymorphic di-, tri- and tetra-nucleotide repeats in Hdh introns of various mouse strains that are not present in the human. For example, polymorphic CT repeats are present in introns 2 and 4 of Hdh and a novel mouse 56 AAG trinucleotide repeat (interrupted by an AAGG) is also located within intron 2. This information concerning the promoter and genomic organization of both HD and Hdh is critical for designing appropriate gene targeting vectors for studying the normal function of the HD and Hdh genes in model systems.
Isolation and Functional Characterization of the Novel Clostridium botulinum Neurotoxin A8 Subtype
Kull, Skadi; Schulz, K. Melanie; Strotmeier, Jasmin Weisemann née; Kirchner, Sebastian; Schreiber, Tanja; Bollenbach, Alexander; Dabrowski, P. Wojtek; Nitsche, Andreas; Kalb, Suzanne R.; Dorner, Martin B.; Barr, John R.; Rummel, Andreas; Dorner, Brigitte G.
2015-01-01
Botulism is a severe neurological disease caused by the complex family of botulinum neurotoxins (BoNT). Based on the different serotypes known today, a classification of serotype variants termed subtypes has been proposed according to sequence diversity and immunological properties. However, the relevance of BoNT subtypes is currently not well understood. Here we describe the isolation of a novel Clostridium botulinum strain from a food-borne botulism outbreak near Chemnitz, Germany. Comparison of its botulinum neurotoxin gene sequence with published sequences identified it to be a novel subtype within the BoNT/A serotype designated BoNT/A8. The neurotoxin gene is located within an ha-orfX+ cluster and showed highest homology to BoNT/A1, A2, A5, and A6. Unexpectedly, we found an arginine insertion located in the HC domain of the heavy chain, which is unique compared to all other BoNT/A subtypes known so far. Functional characterization revealed that the binding characteristics to its main neuronal protein receptor SV2C seemed unaffected, whereas binding to membrane-incorporated gangliosides was reduced in comparison to BoNT/A1. Moreover, we found significantly lower enzymatic activity of the natural, full-length neurotoxin and the recombinant light chain of BoNT/A8 compared to BoNT/A1 in different endopeptidase assays. Both reduced ganglioside binding and enzymatic activity may contribute to the considerably lower biological activity of BoNT/A8 as measured in a mouse phrenic nerve hemidiaphragm assay. Despite its reduced activity the novel BoNT/A8 subtype caused severe botulism in a 63-year-old male. To our knowledge, this is the first description and a comprehensive characterization of a novel BoNT/A subtype which combines genetic information on the neurotoxin gene cluster with an in-depth functional analysis using different technical approaches. 
Our results show that subtyping of BoNT is highly relevant and that understanding of the detailed toxin function might pave the way for the development of novel therapeutics and tailor-made antitoxins. PMID:25658638
Multilocus sequence typing of total-genome-sequenced bacteria.
Larsen, Mette V; Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W; Aarestrup, Frank M; Lund, Ole
2012-04-01
Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST.
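The final step described above, determining the sequence type from the combination of alleles identified, amounts to a table lookup. The sketch below illustrates only that step and assumes the best-matching allele number for each locus has already been found (for example by a BLAST-based ranking, which is not implemented here). The locus names, profile table, and function name are hypothetical toy values, not data from a real MLST scheme.

```python
from typing import Dict, Optional, Tuple

# Hypothetical three-locus scheme; real schemes typically use more loci.
LOCI: Tuple[str, ...] = ("locusA", "locusB", "locusC")

# Toy profile table: ordered allele numbers -> sequence type (ST).
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
}


def sequence_type(best_alleles: Dict[str, int],
                  loci: Tuple[str, ...] = LOCI,
                  profiles: Dict[Tuple[int, ...], int] = PROFILES
                  ) -> Optional[int]:
    """Return the ST for an allele combination, or None if unknown."""
    try:
        # Order the alleles by the scheme's locus order.
        key = tuple(best_alleles[locus] for locus in loci)
    except KeyError:
        return None  # an allele call is missing for some locus
    return profiles.get(key)


print(sequence_type({"locusA": 1, "locusB": 2, "locusC": 1}))
```

Returning None for an unknown combination leaves it to the caller to decide whether that indicates a novel sequence type or an incomplete allele call.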
ERIC Educational Resources Information Center
Christie, Michael A.; Hersch, Steven M.
2004-01-01
In this paper, we demonstrate nondeclarative sequence learning in mice using an animal analog of the human serial reaction time task (SRT) that uses a within-group comparison of behavior in response to a repeating sequence versus a random sequence. Ten female B6CBA mice performed eleven 96-trial sessions containing 24 repetitions of a 4-trial…
Anton, Brian P.; Mongodin, Emmanuel F.; Agrawal, Sonia; Fomenkov, Alexey; Byrd, Devon R.; Roberts, Richard J.; Raleigh, Elisabeth A.
2015-01-01
We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems. PMID:26010885
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grigoriev, Igor V.; Baker, Scott E.; Andersen, Mikael R.
2011-04-28
The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes relevant to glucoamylase A production, such as tRNA-synthases and protein transporters.
Our results and datasets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. [Supplemental materials (10 figures, three text documents and 16 tables) have been made available. The whole genome sequence for A. niger ATCC 1015 is available from NCBI under acc. no ACJE00000000. The up-dated sequence for A. niger CBS 513.88 is available from EMBL under acc. no AM269948-AM270415. The sequence data from the phylogeny study has been submitted to NCBI (GU296686-296739). Microarray data from this study is submitted to GEO as series GSE10983. Accession for reviewers is possible through: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi token GSE10983] The dsmM_ANIGERa_coll511030F library and platform information is deposited at GEO under number GPL6758.
Mahajan, Gaurang; Mande, Shekhar C
2017-04-04
A comprehensive map of the human-M. tuberculosis (MTB) protein interactome would help fill the gaps in our understanding of the disease, and computational prediction can aid and complement experimental studies towards this end. Several sequence-based in silico approaches tap the existing data on experimentally validated protein-protein interactions (PPIs); these PPIs serve as templates from which novel interactions between pathogen and host are inferred. Such comparative approaches typically make use of local sequence alignment, which, in the absence of structural details about the interfaces mediating the template interactions, could lead to incorrect inferences, particularly when multi-domain proteins are involved. We propose leveraging the domain-domain interaction (DDI) information in PDB complexes to score and prioritize candidate PPIs between host and pathogen proteomes based on targeted sequence-level comparisons. Our method picks out a small set of human-MTB protein pairs as candidates for physical interactions, and the use of functional meta-data suggests that some of them could contribute to the in vivo molecular cross-talk between pathogen and host that regulates the course of the infection. Further, we present numerical data for Pfam domain families that highlights interaction specificity on the domain level. Not every instance of a pair of domains, for which interaction evidence has been found in a few instances (i.e. structures), is likely to functionally interact. Our sorting approach scores candidates according to how "distant" they are in sequence space from known examples of DDIs (templates). Thus, it provides a natural way to deal with the heterogeneity in domain-level interactions. Our method represents a more informed application of local alignment to the sequence-based search for potential human-microbial interactions that uses available PPI data as a prior. 
Our approach is somewhat limited in its sensitivity by the restricted size and diversity of the template dataset, but, given the rapid accumulation of solved protein complex structures, its scope and utility are expected to keep steadily improving.
NASA Astrophysics Data System (ADS)
Gold, Ryan; Williams, Robert; Jibson, Randall
2014-05-01
Previous research indicates that deep translational and rotational landslides along the bluffs east of the Mississippi River in western Tennessee were triggered by the M7-8 1811-1812 New Madrid earthquake sequence. Analysis of recently acquired airborne LiDAR data suggests the possibility of multiple generations of landslides, possibly triggered by older, similar magnitude earthquake sequences, which paleoliquefaction studies show occurred circa 1450 and about 900 A.D. Using these LiDAR data, we have remapped recent landslides along two sections of the bluffs: a northern section near Reelfoot Lake and a southern section near Meeman-Shelby State Park (20 km north of Memphis, Tennessee). The bare-earth, digital-elevation models derived from these LiDAR data have a resolution of 0.5 m and reveal valuable details of topography given the region's dense forest canopy. Our mapping confirms much of the previous landslide mapping, refutes a few previously mapped landslides, and reveals new, undetected landslides. Importantly, we observe that the landslide deposits in the Reelfoot region are characterized by rotated blocks with sharp uphill-facing scarps and steep headwall scarps, indicating youthful, relatively recent movement. In comparison, landslide deposits near Meeman-Shelby are muted in appearance, with headwall scarps and rotated blocks that are extensively dissected by gullies, indicating they might be an older generation of landslides. Because of these differences in morphology, we hypothesize that the landslides near Reelfoot Lake were triggered by the 1811-1812 earthquake sequence and that landslides near Meeman-Shelby resulted from shaking associated with earlier earthquake sequences. To test this hypothesis, we will evaluate differences in bluff height, local geology, vegetation, and proximity to known seismic sources.
Furthermore, planned fieldwork will help evaluate whether the observed landslide displacements occurred in single earthquakes or if they might result from episodic movements associated with a sequence of multiple prehistoric earthquakes. This study highlights the value of high-resolution, bare-earth topographic data to investigate the secondary effects of ground shaking in stable continental regions, where primary tectonic deformation associated with large earthquakes is commonly obscure or subtle.
Pictorial detail and recall in adults and children.
Ritchey, G H
1982-03-01
Relatively little research has been done on the role of pictorial detail in memory, and the data that do exist are ambiguous. The issue is important because it touches on our understanding of basic issues such as encoding elaboration and trace distinctiveness. The present study attempts to extend our data base by testing recall of words, outlines, and detailed drawings in third graders, sixth graders, and adults. For a categorized set of items, specific comparisons showed that recall of both detailed drawings and outlines was superior to that of words but that these did not differ from one another. For an uncategorized set of items, specific comparisons showed that outlines were recalled significantly better than pictures and that both of these were recalled better than words. The finding of an advantage in recall for outlines over detailed drawings was quite surprising. A variety of explanations may be offered, but true understanding of this effect will depend on future research.
Berthier, Y; Thierry, D; Lemattre, M; Guesdon, J L
1994-01-01
A new insertion sequence was isolated from Xanthomonas campestris pv. dieffenbachiae. Sequence analysis showed that this element is 1,158 bp long and has 15-bp inverted repeat ends containing two mismatches. Comparison of this sequence with sequences in data bases revealed significant homology with Escherichia coli IS5. IS1051, which detected multiple restriction fragment length polymorphisms, was used as a probe to characterize strains from the pathovar dieffenbachiae. PMID:7906933
Mariner 9 mapping science sequence design.
NASA Technical Reports Server (NTRS)
Goldman, A. M., Jr.
1973-01-01
The primary mission of Mariner 9 was to map the Martian surface. This paper discusses in detail the design of the mapping science sequences which were executed by the spacecraft in sixty days and during which over eighty percent of the surface was photographed. The sequence design was influenced by many factors: experimenter scientific objectives, instrument capabilities, spacecraft capabilities, orbit characteristics, and data return rates, which are illustrated graphically. Typical orbits are depicted for each of the three different mapping phases lasting twenty days. Examples of typical orbital sequence plans prepared daily during mission operations are given.
An Optimal Seed Based Compression Algorithm for DNA Sequences
Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan
2016-01-01
This paper proposes a seed-based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than that of existing lossless DNA sequence compression algorithms. PMID:27555868
CoCoNUT: an efficient system for the comparison and analysis of genomes
2008-01-01
Background Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system CoCoNUT (Computational Comparative geNomics Utility Toolkit) that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state of the art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large scale studies in comparative genomics. PMID:19014477
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren
2011-01-01
Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
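To make the recommendation to equalize read numbers concrete, the sketch below computes the bias-corrected Chao1 richness estimate and subsamples a library to a fixed depth before comparison. It is an illustration under stated assumptions, not the tools the authors point to: the function names, the dictionary input format (OTU label to read count), and the use of a seeded random subsample are all choices made here.

```python
import random
from collections import Counter
from typing import Dict


def chao1(otu_counts: Dict[str, int]) -> float:
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs."""
    counts = [c for c in otu_counts.values() if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefy(otu_counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Subsample a library without replacement to exactly `depth` reads."""
    reads = [otu for otu, c in otu_counts.items() for _ in range(c)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the library")
    rng = random.Random(seed)  # seeded so the subsample is reproducible
    return dict(Counter(rng.sample(reads, depth)))
```

In use, each library would be passed through `rarefy` at the depth of the smallest library and only then through `chao1`, so that the estimates being compared are based on the same number of reads.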
Maggini, Valentina; Presta, Luana; Miceli, Elisangela; Fondi, Marco; Bosi, Emanuele; Chiellini, Carolina; Fagorzi, Camilla; Bogani, Patrizia; Di Pilato, Vincenzo; Rossolini, Gian Maria; Mengoni, Alessio; Firenzuoli, Fabio; Perrin, Elena; Fani, Renato
2017-05-18
In this announcement, we detail the draft genome sequence of the Pseudomonas sp. strain Ep R1, isolated from the roots of the medicinal plant Echinacea purpurea. The elucidation of this genome sequence may allow the identification of genes associated with the production of antimicrobial compounds. Copyright © 2017 Maggini et al.
Maggini, Valentina; Presta, Luana; Miceli, Elisangela; Fondi, Marco; Bosi, Emanuele; Chiellini, Carolina; Fagorzi, Camilla; Bogani, Patrizia; Di Pilato, Vincenzo; Rossolini, Gian Maria; Mengoni, Alessio; Firenzuoli, Fabio; Perrin, Elena
2017-01-01
In this announcement, we detail the draft genome sequence of the Pseudomonas sp. strain Ep R1, isolated from the roots of the medicinal plant Echinacea purpurea. The elucidation of this genome sequence may allow the identification of genes associated with the production of antimicrobial compounds. PMID:28522712
APMP.T-K3.4: key comparison of realizations of the ITS-90 over the range -38.8344 °C to 419.527 °C
NASA Astrophysics Data System (ADS)
Joung, W.; Gam, K. S.; Achmadi, A.; Trisna, B. A.
2016-01-01
The APMP bilateral key comparison APMP.T-K3.4 was initiated at the request of RCM-LIPI (Indonesia) to link their national standards to the average reference values (ARVs) of the CCT-K3. Korea Research Institute of Standards and Science (KRISS, Republic of Korea) provided the linkage to the CCT-K3 for temperatures ranging from -38.8344 °C to 419.527 °C. In the APMP.T-K3.4, two standard platinum resistance thermometers (SPRTs) were chosen as the transfer instruments and were calibrated at the ITS-90 fixed-points in the comparison range. The fixed-points in this comparison included Zn freezing point (419.527 °C), Sn freezing point (231.928 °C), In freezing point (156.5985 °C), Ga melting point (29.7646 °C), and Hg triple point (-38.8344 °C). The comparison was carried out in a participant-pilot-participant sequence where KRISS served as the pilot. The linkage was based on the fixed-point resistance ratios of RCM-LIPI relative to the ARVs of the CCT-K3 via the difference between the fixed-point resistance ratios of KRISS and the ARVs of the CCT-K3. The temperature differences between the national standards of RCM-LIPI and the ARVs of the CCT-K3 were within the evaluated comparison uncertainties of the APMP.T-K3.4. This report provides detailed information on the comparison results, linkage mechanism, and the Degree of Equivalence of the RCM-LIPI relative to the institutes having participated in the CCT-K3. Note that the final report is the text that appears in Appendix B of the BIPM key comparison database, kcdb.bipm.org/. The final report has been peer-reviewed and approved for publication by the CCT, according to the provisions of the CIPM Mutual Recognition Arrangement (CIPM MRA).
An improved filtering algorithm for big read datasets and its application to single-cell assembly.
Wedemeyer, Axel; Kliemann, Lasse; Srivastav, Anand; Schielke, Christian; Reusch, Thorsten B; Rosenstiel, Philip
2017-07-03
For single-cell or metagenomic sequencing projects, it is necessary to sequence with a very high mean coverage in order to make sure that all parts of the sample DNA get covered by the reads produced. This leads to huge datasets with lots of redundant data. A filtering of this data prior to assembly is advisable. Brown et al. (2012) presented the algorithm Diginorm for this purpose, which filters reads based on the abundance of their k-mers. We present Bignorm, a faster and quality-conscious read filtering algorithm. An important new algorithmic feature is the use of phred quality scores together with a detailed analysis of the k-mer counts to decide which reads to keep. We qualify and recommend parameters for our new read filtering algorithm. Guided by these parameters, we remove a median of 97.15% of the reads while keeping the mean phred score of the filtered dataset high. Using the SPAdes assembler, we produce assemblies of high quality from these filtered datasets in a fraction of the time needed for an assembly from the datasets filtered with Diginorm. We conclude that read filtering is a practical and efficient method for reducing read data and for speeding up the assembly process. This applies not only to single-cell assembly, as shown in this paper, but also to other projects with high mean coverage datasets like metagenomic sequencing projects. Our Bignorm algorithm allows assemblies of competitive quality in comparison to Diginorm, while being much faster. Bignorm is available for download at https://git.informatik.uni-kiel.de/axw/Bignorm .
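The general idea of abundance-based read filtering with a quality criterion can be sketched in a few lines. This is a deliberately simplified illustration and neither the Diginorm nor the Bignorm implementation: the decision rule (keep a read while the median count of its trusted k-mers is below a cutoff), the exact in-memory counter, and the parameter names `k`, `cutoff`, and `min_phred` are assumptions made here for clarity.

```python
from collections import Counter
from statistics import median
from typing import Iterable, List, Tuple

Read = Tuple[str, List[int]]  # (bases, per-base phred scores)


def filter_reads(reads: Iterable[Read], k: int = 4, cutoff: int = 2,
                 min_phred: int = 20) -> List[Read]:
    """Keep a read only while its trusted k-mers are still rare.

    A k-mer is trusted when every base in it has phred >= min_phred.
    Reads without any trusted k-mer are dropped.
    """
    counts: Counter = Counter()
    kept: List[Read] = []
    for seq, quals in reads:
        kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)
                 if min(quals[i:i + k]) >= min_phred]
        if not kmers:
            continue
        # Counter returns 0 for k-mers not seen yet.
        if median(counts[km] for km in kmers) < cutoff:
            kept.append((seq, quals))
            counts.update(kmers)  # only kept reads contribute to abundance
    return kept
```

Production tools typically replace the exact counter with a probabilistic counting structure to bound memory; the exact `Counter` is used here only to keep the sketch short.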
A detailed phylogeny for the Methanomicrobiales
NASA Technical Reports Server (NTRS)
Rouviere, P.; Mandelco, L.; Winker, S.; Woese, C. R.
1992-01-01
The small subunit rRNA sequence of twenty archaea, members of the Methanomicrobiales, permits a detailed phylogenetic tree to be inferred for the group. The tree confirms earlier studies, based on far fewer sequences, in showing the group to be divided into two major clusters, temporarily designated the "methanosarcina" group and the "methanogenium" group. The tree also defines phylogenetic relationships within these two groups, which in some cases do not agree with the phylogenetic relationships implied by current taxonomic names--a problem most acute for the genus Methanogenium and its relatives. The present phylogenetic characterization provides the basis for a consistent taxonomic restructuring of this major methanogenic taxon.
Montoya-Ruiz, Carolina; Cajimat, Maria N B; Milazzo, Mary Louise; Diaz, Francisco J; Rodas, Juan David; Valbuena, Gustavo; Fulhorst, Charles F
2015-07-01
The results of a previous study suggested that Cherrie's cane rat (Zygodontomys cherriei) is the principal host of Necoclí virus (family Bunyaviridae, genus Hantavirus) in Colombia. Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences in this study confirmed that Necoclí virus is phylogenetically closely related to Maporal virus, which is principally associated with the delicate pygmy rice rat (Oligoryzomys delicatus) in western Venezuela. In pairwise comparisons, nonidentities between the complete amino acid sequence of the nucleocapsid protein of Necoclí virus and the complete amino acid sequences of the nucleocapsid proteins of other hantaviruses were ≥8.7%. Likewise, nonidentities between the complete amino acid sequence of the glycoprotein precursor of Necoclí virus and the complete amino acid sequences of the glycoprotein precursors of other hantaviruses were ≥11.7%. Collectively, the unique association of Necoclí virus with Z. cherriei in Colombia, results of the Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences, and results of the pairwise comparisons of amino acid sequences strongly support the notion that Necoclí virus represents a novel species in the genus Hantavirus. Further work is needed to determine whether Calabazo virus (a hantavirus associated with Z. brevicauda cherriei in Panama) and Necoclí virus are conspecific.
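The pairwise comparisons reported above reduce to a percent nonidentity between two aligned amino acid sequences. A minimal sketch follows, assuming the sequences are already aligned to equal length, that `-` marks a gap, and that a gap opposite a residue counts as a difference; the abstract does not state how gaps were treated, so that convention is an assumption.

```python
def percent_nonidentity(a: str, b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns at which two sequences differ.

    Columns that are gaps in both sequences are skipped; a gap opposite
    a residue is counted as a difference (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * mismatches / compared
```

For example, `percent_nonidentity("MKTAY", "MKSAY")` compares five columns with one difference and returns 20.0.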
Henssge, Uta; Do, Thuy; Gilbert, Steven C.; Cox, Steven; Clark, Douglas; Wickström, Claes; Ligtenberg, A. J. M.; Radford, David R.; Beighton, David
2011-01-01
Actinomyces naeslundii and Actinomyces oris are members of the oral biofilm. Their identification using 16S rRNA sequencing is problematic and better achieved by comparison of metG partial sequences. A. oris is more abundant and more frequently isolated than A. naeslundii. We used a multi-locus sequence typing approach to investigate the genotypic diversity of these species and assigned A. naeslundii (n = 37) and A. oris (n = 68) isolates to 32 and 68 sequence types (ST), respectively. Neighbor-joining and ClonalFrame dendrograms derived from the concatenated partial sequences of 7 house-keeping genes identified at least 4 significant subclusters within A. oris and 3 within A. naeslundii. The strain collection we had investigated was an under-representation of the total population since at least 3 STs composed of single strains may represent discrete clusters of strains not well represented in the collection. The integrity of these sub-clusters was supported by the sequence analysis of fimP and fimA, genes coding for the type 1 and 2 fimbriae, respectively. An A. naeslundii subcluster was identified with both fimA and fimP genes and these strains were able to bind to MUC7 and statherin while all other A. naeslundii strains possessed only fimA and did not bind to statherin. An A. oris subcluster harboured a fimA gene similar to that of Actinomyces odontolyticus but no detectable fimP, and failed to bind significantly to either MUC7 or statherin. These data are evidence of extensive genotypic and phenotypic diversity within the species A. oris and A. naeslundii but the status of the subclusters identified here will require genome comparisons before their phylogenic position can be unequivocally established. PMID:21738661
Chen, Tsute; Siddiqui, Huma; Olsen, Ingar
2017-01-01
Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563
Reinprecht, Yarmilla; Yadegari, Zeinab; Perry, Gregory E.; Siddiqua, Mahbuba; Wright, Lori C.; McClean, Phillip E.; Pauls, K. Peter
2013-01-01
Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. 
Indels identified in the comparison of Andean and Mesoamerican common bean 4CL gene sequences might be used to develop inter-pool phenylpropanoid pathway gene-based markers. We anticipate that the information obtained by this study will simplify and accelerate selections of common bean with specific phenylpropanoid pathway alleles to increase the contents of beneficial phenylpropanoids in common bean and other legumes. PMID:24046770
Title Sequences, Dress, Settings, and Such.
ERIC Educational Resources Information Center
Bell, John
Comparisons of television shows along genre lines suggest significant elements of aural/visual richness as well as valuable categories of comparison for future use in other comparisons. An examination of two sitcoms and two police shows produced roughly 25 years apart--"Make Room for Daddy" with "The Cosby Show" and "Naked…
USDA-ARS's Scientific Manuscript database
Background: In many bacteria including E. coli, genes encoding O-antigens are clustered in the chromosome, with a 39-bp JUMPstart sequence and gnd gene located upstream and downstream of the cluster, respectively. For determining the DNA sequence of the E. coli O-antigen gene cluster, one set of P...
ERIC Educational Resources Information Center
Wiles, Clyde A.
The study's purpose was to investigate the differential effects on the achievement of second-grade students that could be attributed to three instructional sequences for the learning of the addition and subtraction algorithms. One sequence presented the addition algorithm first (AS), the second presented the subtraction algorithm first (SA), and…
Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936.
Durrens, Pascal; Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J; Noël, Thierry
2017-08-03
Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. Copyright © 2017 Durrens et al.
Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936
Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J.; Noël, Thierry
2017-01-01
Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. PMID:28774979
Resolution of the African hominoid trichotomy by use of a mitochondrial gene sequence
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruvolo, M.; Disotell, T.R.; Allard, M.W.
1991-02-15
Mitochondrial DNA sequences encoding the cytochrome oxidase subunit II gene have been determined for five primate species, siamang (Hylobates syndactylus), lowland gorilla (Gorilla gorilla), pygmy chimpanzee (Pan paniscus), crab-eating macaque (Macaca fascicularis), and green monkey (Cercopithecus aethiops), and compared with published sequences of other primate and nonprimate species. Comparisons of cytochrome oxidase subunit II gene sequences provide clear-cut evidence from the mitochondrial genome for the separation of the African ape trichotomy into two evolutionary lineages, one leading to gorillas and the other to humans and chimpanzees. Several different tree-building methods support this same phylogenetic tree topology. The comparisons also yield trees in which a substantial length separates the divergence point of gorillas from that of humans and chimpanzees, suggesting that the lineage most immediately ancestral to humans and chimpanzees may have been in existence for a relatively long time.
First results of the CINDI-2 semi-blind MAX-DOAS intercomparison
NASA Astrophysics Data System (ADS)
Kreher, Karin; van Roozendael, Michel; Hendrick, Francois; Apituley, Arnoud; Friess, Udo; Lampel, Johannes; Piters, Ankie; Richter, Andreas; Wagner, Thomas; Cindi-2 Participants, All
2017-04-01
The second Cabauw Intercomparison campaign for Nitrogen Dioxide measuring Instruments (CINDI-2) took place at the Cabauw Experimental Site for Atmospheric Research (CESAR; Utrecht area, The Netherlands) from 25 August until 7 October 2016. The goals of this inter-comparison campaign are to support the creation of high-quality ground-based data sets (e.g. to provide reliable long-term time series for trend analysis and satellite data validation), to characterise and better understand the differences between a large number of MAX-DOAS and DOAS instruments and analysis methods, and to contribute to a harmonisation of the measurement settings and retrieval methods. During a time period of 17 days, from 12 to 28 September 2016, a formal semi-blind intercomparison was held following a detailed measurement protocol. The development of this protocol was based on the experience gained during the first CINDI campaign held in 2009 as well as more recent projects and campaigns such as the MADCAT campaign in Mainz, Germany, in 2013. Strong emphasis was put on the careful synchronisation of the measurement sequence and on exact alignment of the elevation angles using horizon scans and lamp measurements. In this presentation, we provide an overview and some highlights of the MAX-DOAS semi-blind intercomparison campaign. We will introduce the participating groups, their instruments and the measurement protocol details, and then summarize the campaign outcomes to date. The CINDI-2 data sets have been investigated using a range of diagnostics including comparisons of daily time series and relative differences between the data sets, regression analysis and correlation plots. The data products so far investigated are NO2 (nitrogen dioxide) in the UV and visible wavelength region, O4 (oxygen dimer) in the same two wavelength intervals, O3 (ozone) in the UV and visible wavelength region, HCHO (formaldehyde) and NO2 in an additional (smaller) wavelength range in the visible. 
The results based on the regression analysis are presented in summary plots and tables, addressing MAX-DOAS and twilight zenith sky measurements separately. Further information on instrumental details such as the alignment of the viewing direction and elevation and the field of view are also summarized and included in the overall interpretation.
Collins, Brian D.; Jibson, Randall W.
2015-07-28
This report provides a detailed account of assessments performed in May and June 2015 and focuses on valley-blocking landslides because they have the potential to pose considerable hazard to many villages in Nepal. First, we provide a seismological background of Nepal and then detail the methods used for both external and in-country data collection and interpretation. Our results consist of an overview of landsliding extent, a characterization of all valley-blocking landslides identified during our work, and a description of video resources that provide high resolution coverage of approximately 1,000 kilometers (km) of river valleys and surrounding terrain affected by the Gorkha earthquake sequence. This is followed by a description of site-specific landslide-hazard assessments conducted while in Nepal and includes detailed descriptions of five noteworthy case studies. Finally, we assess the expectation for additional landslide hazards during the 2015 summer monsoon season.
Hunt, C; Morimoto, R I
1985-01-01
We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
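The identity percentages quoted in this abstract are computed from aligned sequences. The following is a minimal sketch of one such calculation, assuming the two sequences are already aligned to equal length with `-` marking gaps, and assuming that gapped columns are excluded from the denominator. Other denominator conventions exist, and the abstract does not say which one the authors used. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: both inputs come from the same alignment, so they have
    equal length; columns where either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped column: not counted under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical data, not hsp70): 4 of 5 ungapped columns match.
print(percent_identity("ACGT-A", "ACCTTA"))
```

The same function applies to amino acid or nucleotide alignments, since it only compares characters column by column.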
Fibonacci and Nature. Mathematics Investigations for Schools.
ERIC Educational Resources Information Center
Newton, Lynn D.
1987-01-01
Sets forth the history of the Fibonacci Sequence and details its occurrence in nature and its potential for project work in schools. Ideas and activities include the rabbit problem, investigations of the sequence itself, its relationship to plants, music, snail shells, and the golden section. Computer generation of spirals is also discussed. (PK)
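As an illustration of the kind of classroom investigation described, the short sketch below generates the first terms of the sequence, starting from 1, 1 as in the rabbit problem, and shows the ratio of consecutive terms approaching the golden section. The function name is hypothetical and is not taken from the article.

```python
def fibonacci(n: int) -> list[int]:
    """Return the first n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    terms = []
    a, b = 1, 1
    for _ in range(n):
        terms.append(a)
        a, b = b, a + b  # each term is the sum of the two before it
    return terms


terms = fibonacci(10)
golden = (1 + 5 ** 0.5) / 2  # the golden section, about 1.618
for previous, current in zip(terms, terms[1:]):
    # The ratio of consecutive terms moves closer to the golden section.
    print(current, previous, round(current / previous, 4), round(golden, 4))
```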
Shuttle OFT Level C navigation requirements
NASA Technical Reports Server (NTRS)
1980-01-01
Detailed requirements for the orbital operations computer loads, OPS 2, and OPS 8 are given. These requirements represent the total on-orbit/rendezvous navigation baseline requirements for the following principal functions: on-orbital/rendezvous navigation sequencer; on-orbit/rendezvous UPP sequencer; on-orbit rendezvous navigation; on-orbit prediction; on-orbit user parameter processing; and landing site update.
Accurate 3d Scanning of Damaged Ancient Greek Inscriptions for Revealing Weathered Letters
NASA Astrophysics Data System (ADS)
Papadaki, A. I.; Agrafiotis, P.; Georgopoulos, A.; Prignitz, S.
2015-02-01
In this paper, two non-invasive, non-destructive alternatives to the traditional and invasive technique of squeezes are presented, along with specially developed processing methods, with the aim of helping epigraphists reveal and analyse weathered letters in ancient Greek inscriptions carved in masonry or marble. The resulting 3D model would serve as a detailed basis for the epigraphists to try to decipher the inscription. The data were collected using a structured light scanner. The creation of the final accurate three-dimensional model is a complicated procedure requiring high computational cost and human effort. It includes the collection of geometric data in limited space and time, the creation of the surface, noise filtering and the merging of individual surfaces. The use of structured light scanners is time consuming and requires costly hardware and software. Therefore, an alternative methodology for collecting 3D data of the inscriptions was also implemented for comparison. Image sequences from varying distances were collected using a calibrated DSLR camera in order to reconstruct the 3D scene through SfM techniques and to evaluate the efficiency and the level of precision and detail of the reconstructed inscriptions. Problems in the acquisition processes, as well as difficulties in the alignment step and mesh optimization, were also encountered. A meta-processing framework is proposed and analysed. Finally, the results of processing and analysis and the different 3D models are critically inspected and then evaluated by a specialist in terms of accuracy, quality and detail of the model and the capability of revealing damaged and "hidden" letters.
Visual management of large scale data mining projects.
Shah, I; Hunter, L
2000-01-01
This paper describes a unified framework for visualizing the preparations for, and results of, hundreds of machine learning experiments. These experiments were designed to improve the accuracy of enzyme functional predictions from sequence, and in many cases were successful. Our system provides graphical user interfaces for defining and exploring training datasets and various representational alternatives, for inspecting the hypotheses induced by various types of learning algorithms, for visualizing the global results, and for inspecting in detail results for specific training sets (functions) and examples (proteins). The visualization tools serve as a navigational aid through a large amount of sequence data and induced knowledge. They provided significant help in understanding both the significance and the underlying biological explanations of our successes and failures. Using these visualizations it was possible to efficiently identify weaknesses of the modular sequence representations and induction algorithms which suggest better learning strategies. The context in which our data mining visualization toolkit was developed was the problem of accurately predicting enzyme function from protein sequence data. Previous work demonstrated that approximately 6% of enzyme protein sequences are likely to be assigned incorrect functions on the basis of sequence similarity alone. In order to test the hypothesis that more detailed sequence analysis using machine learning techniques and modular domain representations could address many of these failures, we designed a series of more than 250 experiments using information-theoretic decision tree induction and naive Bayesian learning on local sequence domain representations of problematic enzyme function classes. In more than half of these cases, our methods were able to perfectly discriminate among various possible functions of similar sequences. We developed and tested our visualization techniques on this application.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krishnakumar, Raga; Sinha, Anupama; Bird, Sara W.; ...
2018-02-16
Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content, however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.
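The nucleotide bias discussed here is usually summarized as GC content, the fraction of bases that are G or C. The following is a minimal sketch, assuming plain A/C/G/T sequence strings and assuming that any other symbol (for example N) is left out of the denominator. The function name and the example strings are hypothetical and do not come from the study.

```python
def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc = seq.count("G") + seq.count("C")
    return gc / counted


# Hypothetical reads with low, balanced and high GC content.
for read in ("ATATTAAT", "GGCCAATT", "GCGCGGCC"):
    print(read, gc_content(read))
```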
SeqAPASS (Sequence Alignment to Predict Across Species Susceptibility) software and documentation
SeqAPASS is a software application that facilitates rapid and streamlined, yet transparent, comparisons of the similarity of toxicologically-significant molecular targets across species. The present application facilitates analysis of primary amino acid sequence similarity (including ...
Osmundson, Todd W.; Robert, Vincent A.; Schoch, Conrad L.; Baker, Lydia J.; Smith, Amy; Robich, Giovanni; Mizzan, Luca; Garbelotto, Matteo M.
2013-01-01
Despite recent advances spearheaded by molecular approaches and novel technologies, species description and DNA sequence information are significantly lagging for fungi compared to many other groups of organisms. Large scale sequencing of vouchered herbarium material can aid in closing this gap. Here, we describe an effort to obtain broad ITS sequence coverage of the approximately 6000 macrofungal-species-rich herbarium of the Museum of Natural History in Venice, Italy. Our goals were to investigate issues related to large sequencing projects, develop heuristic methods for assessing the overall performance of such a project, and evaluate the prospects of such efforts to reduce the current gap in fungal biodiversity knowledge. The effort generated 1107 sequences submitted to GenBank, including 416 previously unrepresented taxa and 398 sequences exhibiting a best BLAST match to an unidentified environmental sequence. Specimen age and taxon affected sequencing success, and subsequent work on failed specimens showed that an ITS1 mini-barcode greatly increased sequencing success without greatly reducing the discriminating power of the barcode. Similarity comparisons and nonmetric multidimensional scaling ordinations based on pairwise distance matrices proved to be useful heuristic tools for validating the overall accuracy of specimen identifications, flagging potential misidentifications, and identifying taxa in need of additional species-level revision. Comparison of within- and among-species nucleotide variation showed a strong increase in species discriminating power at 1–2% dissimilarity, and identified potential barcoding issues (same sequence for different species and vice-versa). All sequences are linked to a vouchered specimen, and results from this study have already prompted revisions of species-sequence assignments in several taxa. PMID:23638077
Thuan, Nguyen Huy; Dhakal, Dipesh; Pokhrel, Anaya Raj; Chu, Luan Luong; Van Pham, Thi Thuy; Shrestha, Anil; Sohng, Jae Kyung
2018-05-01
Streptomyces peucetius ATCC 27952 produces two major anthracyclines, doxorubicin (DXR) and daunorubicin (DNR), which are potent chemotherapeutic agents for the treatment of several cancers. In order to gain detailed insight into the genetics and biochemistry of the strain, the complete genome was determined and analyzed. The result showed that its complete sequence contains 7187 protein coding genes in a total of 8,023,114 bp, with 87% of the genome contributing to protein coding regions. The genomic sequence included 18 rRNA, 66 tRNAs, and 3 non-coding RNAs. In silico studies predicted ~ 68 biosynthetic gene clusters (BCGs) encoding diverse classes of secondary metabolites, including non-ribosomal peptide synthetase (NRPS), polyketide synthase (PKS I, II, and III), terpenes, and others. Detailed analysis of the genome sequence revealed versatile biocatalytic enzymes such as cytochrome P450 (CYP), electron transfer systems (ETS) genes, methyltransferase (MT), and glycosyltransferase (GT). In addition, numerous functional genes (transporter gene, SOD, etc.) and regulatory genes (afsR-sp, metK-sp, etc.) involved in the regulation of secondary metabolites were found. This minireview summarizes the genome-based genome mining (GM) of diverse BCGs and genome exploration (GE) of versatile biocatalytic enzymes, and other enzymes involved in maintenance and regulation of metabolism of S. peucetius. The detailed analysis of genome sequence provides critically important knowledge useful in the bioengineering of the strain or harboring catalytically efficient enzymes for biotechnological applications.
19. DETAIL OF INTERIOR WALL CONSTRUCTION, VIEW TOWARD SOUTH, THIRD ...
19. DETAIL OF INTERIOR WALL CONSTRUCTION, VIEW TOWARD SOUTH, THIRD BAY Showing asphalt felt applied to both sides of interior wall studs beneath wood cladding. Back-nailing of felt indicates sequence of felt and cladding installation. - U.S. Military Academy, Ice House, Mills Road at Howze Place, West Point, Orange County, NY
Comparisons between a high resolution discrete element model and analogue model
NASA Astrophysics Data System (ADS)
LI, C. S.; Yin, H.; WU, C.; Zhang, J.
2017-12-01
A two-dimensional discrete element model (DEM) with high resolution is constructed to simulate the evolution of a thrust wedge, and an analogue model (AM) experiment is constructed for comparison with the DEM results. This efficient parallel DEM program is written in the C language and is useful for solving complex geological problems. More details of the fold-and-thrust belts in the DEM can be identified with the help of the strain field. Under non-rotating and non-tensile assumptions, the dynamic evolution of the DEM is highly consistent with that of the AM. Simulations at different scales can be compared with each other using conversion formulas in the DEM. Our results show that: (1) The overall evolution of the DEM and AM is broadly similar. (2) Shortening is accommodated by in-sequence forward propagation of thrusts. The surface slope of the thrust wedge is within the stable field predicted by critical taper theory. (3) Details of thrust spacing, dip angle and number of thrusts vary between the DEM and AM for the shortening experiment, but the characteristics of the thrusts are similar on the whole. (4) Dip angles of the forward thrusts increase from the foreland (ca. 30°) to the mobile wall (ca. 80°). (5) With shortening, neither model shows obvious volume loss; instead, the volume remains basically unchanged throughout the extrusion process. (6) Almost all high strain values are within fold-and-thrust belts in the DEM, which allows a direct comparison between the fault zone identified in the DEM deformation field and that in the strain field. (7) The first fault initiates at depth and propagates toward the surface. The maximal volumetric strain is focused on the décollement near the mobile wall, strengthening the material and making it more brittle. (8) With non-tensile particles in the DEM, contraction is broadly distributed throughout the model and there is hardly any dilation, which also leads to more efficient computation. (9) The high-resolution DEM can, to first order, successfully reproduce structures observed in the AM. The comparisons serve to highlight robust features in tectonic modelling of thrust wedges. This approach is very useful for modelling large-displacement, complex deformation of analogue and geological materials.
Estimation of pairwise sequence similarity of mammalian enhancers with word neighbourhood counts.
Göke, Jonathan; Schulz, Marcel H; Lasserre, Julia; Vingron, Martin
2012-03-01
The identity of cells and tissues is to a large degree governed by transcriptional regulation. A major part is accomplished by the combinatorial binding of transcription factors at regulatory sequences, such as enhancers. Even though binding of transcription factors is sequence-specific, estimating the sequence similarity of two functionally similar enhancers is very difficult. However, a similarity measure for regulatory sequences is crucial to detect and understand functional similarities between two enhancers and will facilitate large-scale analyses like clustering, prediction and classification of genome-wide datasets. We present the standardized alignment-free sequence similarity measure N2, a flexible framework that is defined for word neighbourhoods. We explore the usefulness of adding reverse complement words as well as words including mismatches into the neighbourhood. On simulated enhancer sequences as well as functional enhancers in mouse development, N2 is shown to outperform previous alignment-free measures. N2 is flexible, faster than competing methods and less susceptible to single sequence noise and the occurrence of repetitive sequences. Experiments on the mouse enhancers reveal that enhancers active in different tissues can be separated by pairwise comparison using N2. N2 represents an improvement over previous alignment-free similarity measures without compromising speed, which makes it a good candidate for large-scale sequence comparison of regulatory sequences. The software is part of the open-source C++ library SeqAn (www.seqan.de) and a compiled version can be downloaded at http://www.seqan.de/projects/alf.html. Supplementary data are available at Bioinformatics online.
Complete Genome Sequence of a Street Rabies Virus Isolated from a Dog in Nigeria
Zhou, Ming; Zhou, Zutao; Kia, Grace S. N.; Gnanadurai, Clement W.; Leyson, Christina M.; Umoh, Jarlath U.; Kwaga, Jacob P.; Kazeem, Haruna M.
2013-01-01
A canine rabies virus (RABV) was isolated from a trade dog in Nigeria. Its entire genome was sequenced and found to be closely related to canine RABVs circulating in Africa. Sequence comparison indicates that the virus is closely related to the Africa 2 RABV lineage. The virus is now termed DRV-NG11. PMID:23469344
Deep Sequencing Reveals a Divergent Ugandan cassava brown streak virus Isolate from Malawi
Winter, Stephan; Mukasa, Settumba; Tairo, Fred; Sseruwagi, Peter; Ndunguru, Joseph; Duffy, Siobain
2017-01-01
Illumina sequencing of RNA from a cassava cutting from northern Malawi produced a genome of Ugandan cassava brown streak virus (UCBSV-MW-NB7_2013). Sequence comparisons revealed stronger similarity to an isolate from nearby Tanzania (93.4% pairwise nucleotide identity) than to those previously reported from Malawi (86.9 to 87.0%). PMID:28818908
USDA-ARS's Scientific Manuscript database
Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping sequence reads onto a reference genome is a fundamental step in the analysis of NGS data. Eff...
Characterizing the D2 statistic: word matches in biological sequences.
Forêt, Sylvain; Wilson, Susan R; Burden, Conrad J
2009-01-01
Word matches are often used in sequence comparison methods, either as a measure of sequence similarity or in the first search steps of algorithms such as BLAST or BLAT. The D2 statistic is the number of matches of words of k letters between two sequences. Recent advances have been made in the characterization of this statistic and in the approximation of its distribution. Here, these results are extended to the case of approximate word matches. We compute the exact value of the variance of the D2 statistic for the case of a uniform letter distribution, and introduce a method to provide accurate approximations of the variance in the remaining cases. This enables the distribution of D2 to be approximated for typical situations arising in biological research. We apply these results to the identification of cis-regulatory modules, and show that this method detects such sequences with a high accuracy. The ability to approximate the distribution of D2 for both exact and approximate word matches will enable the use of this statistic in a more precise manner for sequence comparison, database searches, and identification of transcription factor binding sites.
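The definition of D2 given above, the number of matches of words of k letters between two sequences, can be written as the sum over all k-letter words of the product of that word's counts in each sequence. The following is a minimal sketch of the exact-match case only. It does not cover the approximate word matches or the variance results the paper addresses, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """Exact-match D2: number of (i, j) pairs whose k-words are identical."""
    if k <= 0:
        raise ValueError("k must be positive")
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A word seen n times in A and m times in B contributes n * m matches.
    return sum(n * counts_b[word] for word, n in counts_a.items())


print(d2("ACGT", "ACGT", 2))  # words AC, CG, GT each match once
print(d2("AAAA", "AAA", 2))   # AA occurs 3 times and 2 times
```

Counting words first and then multiplying the counts avoids comparing every position of one sequence against every position of the other.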
Information capacity of nucleotide sequences and its applications.
Sadovsky, M G
2006-05-01
The information capacity of nucleotide sequences is defined through the specific entropy of the frequency dictionary of a sequence, determined with respect to another dictionary containing the most probable continuations of shorter strings. This measure distinguishes a sequence both from a random one and from an ordered entity. A comparison of sequences based on their information capacity is studied. An order within the genetic entities is found at length scales ranging from 3 to 8. Some other applications of the developed methodology to genetics, bioinformatics, and molecular biology are discussed.
Cheng, Hui; Li, Jinfeng; Zhang, Hong; Cai, Binhua; Gao, Zhihong
2017-01-01
Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa ‘Benihoppe’, F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. 
Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than 1% and no less than five parsimony-informative sites were identified and may be useful for phylogenetic analysis of the genus Fragaria. PMID:29038765
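The figures quoted in this abstract can be cross-checked with simple arithmetic: a quadripartite chloroplast genome consists of one LSC region, one SSC region and two IR copies, and the gene classes should sum to the reported number of unique genes. A minimal Python sketch using only the numbers given above (the function and variable names are illustrative, not from the paper):

```python
# Region lengths in bp, as quoted in the abstract.
LSC = 85_531   # large single-copy region
SSC = 18_146   # small single-copy region
IR = 25_936    # length of EACH of the two inverted repeats


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length for LSC + SSC + two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


genome_length = quadripartite_length(LSC, SSC, IR)

# Gene classes quoted in the abstract: protein-coding, tRNA, rRNA.
unique_genes = 78 + 30 + 4

print(genome_length)  # 155549, the reported genome length
print(unique_genes)   # 112, the reported number of unique genes
```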
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wynne, E K
Throughout this project I have been involved in every step of the protocol. After proper training, I was introduced to the necessary lab techniques for the project. From then on it has been my responsibility to perform the necessary tasks to identify and isolate the mutants. This includes carrying out a detailed protocol of mixing reagents, streaking and incubating plates, inoculating cultures and evaluating any results in order to guide my actions for the next antibiotic concentration level. Simultaneously, I have been running PCR and sequencing reactions on all mutants in order to obtain the genetic sequence of the genes of interest for comparison. Once I have the gene sequences of interest I am able, with the aid of a sequencing program (Sequencher 4.2.2), to analyze the sequences of the mutants against that of a wild type strain. This entails aligning the DNA sequences of a given gene for each of the mutants and locating any base changes from the wild-type bacterium's genes. These polymorphisms allow me to identify the QRDR for that particular gene. Depending on whether the polymorphism occurred at a low antibiotic concentration level or high concentration level, we can evaluate whether that change is necessary for low or high-level quinolone resistance. Finally, I will compare the polymorphisms of each mutant at a given antibiotic selection level and evaluate whether B. anthracis consistently acquires resistance through the same polymorphisms or whether the resistance mechanism varies with each new mutant strain. Currently, I am analyzing the sequence data for stage one mutants, while simultaneously continuing the lab work necessary to select for stage two mutants. After I have left, the personnel at the lab that I've been working with at LLNL will continue this project. By the end of this experiment, we hope to corroborate the suggested mechanisms of resistance typically employed by B. anthracis Sterne at different resistance levels. 
Furthermore, if the mechanism is determined by one of the following genes (gyrA, gyrB, parC, parE), we will be able to pinpoint which base pair changes are necessary for acquiring a given resistance level. Hopefully from these data researchers will be better able to determine an appropriate action should quinolone resistant strains of B. anthracis arise either by natural evolution or by selection in a laboratory.
Terrat, Yves; Biass, Daniel; Dutertre, Sébastien; Favreau, Philippe; Remm, Maido; Stöcklin, Reto; Piquemal, David; Ducancel, Frédéric
2012-01-01
Although cone snail venoms have been intensively investigated in the past few decades, little is known about the whole conopeptide and protein content in venom ducts, especially at the transcriptomic level. Although most previous studies, which focused on a limited number of sequences, have contributed to a better understanding of conopeptide superfamilies, they did not give access to a complete panorama of a whole venom duct. Additionally, rare transcripts were usually not identified due to sampling effect. This work presents the data and analysis of a large number of sequences obtained from high throughput 454 sequencing technology using venom ducts of Conus consors, an Indo-Pacific living piscivorous cone snail. A total of 213,561 Expressed Sequence Tags (ESTs) with an average read length of 218 base pairs (bp) have been obtained. These reads were assembled into 65,536 contiguous DNA sequences (contigs) then into 5039 clusters. The data revealed 11 conopeptide superfamilies representing a total of 53 new isoforms (full length or nearly full-length sequences). Considerable isoform diversity and major differences in transcription level could be noted between superfamilies. A, O and M superfamilies are the most diverse. The A family isoforms account for more than 70% of the conopeptide cocktail (considering all ESTs before clustering step). In addition to traditional superfamilies and families, minor transcripts including both cysteine free and cysteine-rich peptides could be detected, some of them representing new clades of conopeptides. Finally, several sets of transcripts corresponding to proteins commonly recruited in venom function could be identified for the first time in cone snail venom duct. This work provides one of the first large-scale EST projects for a cone snail venom duct using next-generation sequencing, allowing a detailed overview of the venom duct transcripts. 
This leads to an expanded definition of the overall cone snail venom duct transcriptomic activity, which goes beyond the cysteine-rich conopeptides. For instance, this study enabled the detection of proteins involved in common post-translational maturation and folding, and revealed compounds classically involved in hemolysis and mechanical penetration of the venom into the prey. Further comparison with proteomic and genomic data will lead to a better understanding of conopeptide diversity and the underlying mechanisms involved in conopeptide evolution. Copyright © 2011 Elsevier Ltd. All rights reserved.
Ishii, Y; Ohno, A; Taguchi, H; Imajo, S; Ishiguro, M; Matsuzawa, H
1995-01-01
Escherichia coli TUH12191, which is resistant to piperacillin, cefazolin, cefotiam, ceftizoxime, cefuzonam, and aztreonam but is susceptible to cefoxitin, latamoxef, flomoxef, and imipenem, was isolated from the urine of a patient treated with beta-lactam antibiotics. The beta-lactamase (Toho-1) purified from the bacteria had a pI of 7.8, had a molecular weight of about 29,000, and hydrolyzed beta-lactam antibiotics such as penicillin G, ampicillin, oxacillin, carbenicillin, piperacillin, cephalothin, cefoxitin, cefotaxime, ceftazidime, and aztreonam. Toho-1 was markedly inhibited by beta-lactamase inhibitors such as clavulanic acid and tazobactam. Resistance to beta-lactams, streptomycin, spectinomycin, sulfamethoxazole, and trimethoprim was transferred by conjugational transfer from E. coli TUH12191 to E. coli ML4903, and the transferred plasmid was about 58 kbp, belonging to incompatibility group M. The cefotaxime resistance gene for Toho-1 was subcloned from the 58-kbp plasmid by transformation of E. coli MV1184. The sequence of the gene for Toho-1 was determined, and the open reading frame of the gene consisted of 873 or 876 bases (initial sequence, ATGATG). The nucleotide sequence of the gene (DDBJ accession number D37830) was found to be about 73% homologous to the sequence of the gene encoding a class A beta-lactamase produced by Klebsiella oxytoca E23004. According to the amino acid sequence deduced from the DNA sequence, the precursor consisted of 290 or 291 amino acid residues, which contained amino acid motifs common to class A beta-lactamases (70SXXK, 130SDN, and 234KTG). Toho-1 was about 83% homologous to the beta-lactamase mediated by the chromosome of K. oxytoca D488 and the beta-lactamase mediated by the plasmid of E. coli MEN-1. Therefore, the newly isolated beta-lactamase Toho-1 produced by E. coli TUH12191 is similar to beta-lactamases produced by K. oxytoca D488, K. oxytoca E23004, and E. coli MEN-1 rather than to mutants of TEM or SHV enzymes. 
Toho-1 has shown the highest degree of similarity to K. oxytoca class A beta-lactamase. Detailed comparison of Toho-1 with other beta-lactamases implied that replacement of Asn-276 by Arg with the concomitant substitution of Thr for Arg-244 is an important mutation in the extension of the substrate specificity. PMID:8619581
Sequence information signal processor for local and global string comparisons
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1997-01-01
A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's complement operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
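The dynamic programming calculation that the chip accelerates can be illustrated in software. The sketch below computes a Smith-Waterman-style local alignment score with a linear gap penalty. The scoring values (`match`, `mismatch`, `gap`) and the function name are assumptions chosen for illustration, and the clamp to 32,767 only mirrors the 16-bit score range quoted above; none of this is taken from the chip's actual design.

```python
INT16_MAX = 32_767  # upper end of the 16-bit score range quoted in the abstract


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Only two rows of the score matrix are kept, so memory use grows with
    len(b) rather than len(a) * len(b).
    """
    prev = [0] * (len(b) + 1)   # scores for the previous row
    best = 0
    for ch_a in a:
        curr = [0]              # column 0 is always 0 for local alignment
        for j, ch_b in enumerate(b, start=1):
            sub = match if ch_a == ch_b else mismatch
            cell = max(
                0,                    # start a new local alignment here
                prev[j - 1] + sub,    # align ch_a with ch_b
                prev[j] + gap,        # gap in b
                curr[j - 1] + gap,    # gap in a
            )
            cell = min(cell, INT16_MAX)  # saturate at the 16-bit maximum
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


print(local_alignment_score("TTACGTTT", "GGACGGG"))  # 6: the shared "ACG"
```

Each cell depends only on its left, upper and upper-left neighbours, which is why the computation maps naturally onto a linear array of processing elements working along anti-diagonals.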
Pereira, J O P; Freitas, B M; Jorge, D M M; Torres, D C; Soares, C E A; Grangeiro, T B
2009-01-01
Melipona quinquefasciata is a ground-nesting South American stingless bee whose geographic distribution was believed to comprise only the central and southern states of Brazil. We obtained partial sequences (about 500-570 bp) of first internal transcribed spacer (ITS1) nuclear ribosomal DNA from Melipona specimens putatively identified as M. quinquefasciata collected from different localities in northeastern Brazil. To confirm the taxonomic identity of the northeastern samples, specimens from the state of Goiás (Central region of Brazil) were included for comparison. All sequences were deposited in GenBank (accession numbers EU073751-EU073759). The mean nucleotide divergence (excluding sites with insertions/deletions) in the ITS1 sequences was only 1.4%, ranging from 0 to 4.1%. When the sites with insertions/deletions were also taken into account, sequence divergences varied from 0 to 5.3%. In all pairwise comparisons, the ITS1 sequence from the specimens collected in Goiás was most divergent compared to the ITS1 sequences of the bees from the other locations. However, neighbor-joining phylogenetic analysis showed that all ITS1 sequences from northeastern specimens along with the sample of Goiás were resolved in a single clade with a bootstrap support of 100%. The ITS1 sequencing data thus support the occurrence of M. quinquefasciata in northeast Brazil.
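The two divergence figures in this abstract differ only in how alignment columns containing insertions/deletions are treated. A minimal sketch of that distinction for a pair of aligned sequences follows. The abstract does not say how indel sites were scored, so counting each gapped column as one difference is an assumption, and the toy sequences are invented for illustration.

```python
def percent_divergence(seq1: str, seq2: str, count_gaps: bool = False) -> float:
    """Percent of differing columns between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped unless
    count_gaps is True, in which case each is counted as one difference.
    Columns that are gaps in both sequences are always skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq1, seq2):
        if x == "-" and y == "-":
            continue
        if (x == "-" or y == "-") and not count_gaps:
            continue
        compared += 1
        if x != y:
            differences += 1
    return 100.0 * differences / compared if compared else 0.0


# Invented toy alignment: one indel column and one substitution.
a = "ACGT-ACGTA"
b = "ACGTTACGAA"
print(round(percent_divergence(a, b), 2))                   # 11.11
print(round(percent_divergence(a, b, count_gaps=True), 2))  # 20.0
```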
NASA Astrophysics Data System (ADS)
Cheng, Ryan; Morcos, Faruck; Levine, Herbert; Onuchic, Jose
2014-03-01
An important challenge in biology is to distinguish the subset of residues that allow bacterial two-component signaling (TCS) proteins to preferentially interact with their correct TCS partner such that they can bind and transfer signal. Detailed knowledge of this information would allow one to search sequence-space for mutations that can systematically tune the signal transmission between TCS partners as well as re-encode a TCS protein to preferentially transfer signals to a non-partner. Motivated by the notion that this detailed information is found in sequence data, we explore the mutual sequence co-evolution between signaling partners to infer how mutations can positively or negatively alter their interaction. Using Direct Coupling Analysis (DCA) for determining evolutionarily conserved interprotein interactions, we apply a DCA-based metric to quantify mutational changes in the interaction between TCS proteins and demonstrate that it accurately correlates with experimental mutagenesis studies probing the mutational change in the in vitro phosphotransfer. Our methodology serves as a potential framework for the rational design of TCS systems as well as a framework for the system-level study of protein-protein interactions in sequence-rich systems. This research has been supported by the NSF INSPIRE award MCB-1241332 and by the CTBP sponsored by the NSF (Grant PHY-1308264).
Davidsson, Marcus; Diaz-Fernandez, Paula; Schwich, Oliver D.; Torroba, Marcos; Wang, Gang; Björklund, Tomas
2016-01-01
Detailed characterization and mapping of oligonucleotide function in vivo is generally a very time consuming effort that only allows for hypothesis driven subsampling of the full sequence to be analysed. Recent advances in deep sequencing together with highly efficient parallel oligonucleotide synthesis and cloning techniques have, however, opened up entirely new ways to map genetic function in vivo. Here we present a novel, optimized protocol for the generation of universally applicable, barcode labelled, plasmid libraries. The libraries are designed to enable the production of viral vector preparations assessing coding or non-coding RNA function in vivo. When generating high diversity libraries, it is a challenge to achieve efficient cloning, unambiguous barcoding and detailed characterization using low-cost sequencing technologies. With the presented protocol, diversity of above 3 million uniquely barcoded adeno-associated viral (AAV) plasmids can be achieved in a single reaction through a process achievable in any molecular biology laboratory. This approach opens up a multitude of in vivo assessments from the evaluation of enhancer and promoter regions to the optimization of genome editing. The generated plasmid libraries are also useful for validation of sequencing clustering algorithms and we here validate the newly presented message passing clustering process named Starcode. PMID:27874090
Lee, Sung Hak; Chung, Arthur Minwoo; Lee, Ahwon; Oh, Woo Jin; Choi, Yeong Jin; Lee, Youn-Soo; Jung, Eun Sun
2017-01-01
Mutations in the KRAS gene have been identified in approximately 50% of colorectal cancers (CRCs). KRAS mutations are well established biomarkers in anti-epidermal growth factor receptor therapy. Therefore, assessment of KRAS mutations is needed in CRC patients to ensure appropriate treatment. We compared the analytical performance of the cobas test to Sanger sequencing in 264 CRC cases. In addition, discordant specimens were evaluated by 454 pyrosequencing. KRAS mutations for codons 12/13 were detected in 43.2% of cases (114/264) by Sanger sequencing. Of 257 evaluable specimens for comparison, KRAS mutations were detected in 112 cases (43.6%) by Sanger sequencing and 118 cases (45.9%) by the cobas test. Concordance between the cobas test and Sanger sequencing for each lot was 93.8% positive percent agreement (PPA) and 91.0% negative percent agreement (NPA) for codons 12/13. Results from the cobas test and Sanger sequencing were discordant for 20 cases (7.8%). Twenty discrepant cases were subsequently subjected to 454 pyrosequencing. After comprehensive analysis of the results from combined Sanger sequencing-454 pyrosequencing and the cobas test, PPA was 97.5% and NPA was 100%. The cobas test is an accurate and sensitive test for detecting KRAS-activating mutations and has analytical power equivalent to Sanger sequencing. Prescreening using the cobas test with subsequent application of Sanger sequencing is the best strategy for routine detection of KRAS mutations in CRC.
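The agreement statistics above follow from a 2x2 comparison of the cobas test against Sanger sequencing as the reference. The abstract does not print the individual cell counts; the counts in the sketch below are inferred here as the only split consistent with the reported totals (257 evaluable, 112 Sanger-positive, 118 cobas-positive, 20 discordant) and should be read as a reconstruction, not as values reported by the authors.

```python
def percent_agreement(both_pos: int, ref_only: int,
                      test_only: int, both_neg: int) -> tuple[float, float]:
    """Return (PPA, NPA) in percent, relative to the reference method.

    PPA = both_pos / all reference-positive cases
    NPA = both_neg / all reference-negative cases
    """
    ppa = 100.0 * both_pos / (both_pos + ref_only)
    npa = 100.0 * both_neg / (both_neg + test_only)
    return ppa, npa


# Cell counts inferred from the reported totals (not stated in the abstract).
both_pos = 105     # mutation called by both Sanger and cobas
sanger_only = 7    # called by Sanger only
cobas_only = 13    # called by cobas only
both_neg = 132     # called by neither

# Consistency checks against the figures quoted in the abstract.
assert both_pos + sanger_only == 112                          # Sanger-positive
assert both_pos + cobas_only == 118                           # cobas-positive
assert sanger_only + cobas_only == 20                         # discordant
assert both_pos + sanger_only + cobas_only + both_neg == 257  # evaluable

ppa, npa = percent_agreement(both_pos, sanger_only, cobas_only, both_neg)
print(f"PPA = {ppa:.2f}%, NPA = {npa:.2f}%")  # PPA = 93.75%, NPA = 91.03%
```

To one decimal place these are the 93.8% and 91.0% quoted in the abstract.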
ESTuber db: an online database for Tuber borchii EST sequences.
Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo
2007-03-08
The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir
2018-01-01
Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (function) similarity is gaining importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods, and in protein function prediction. In this research, we investigate the relationship between the two similarity methods. The results suggest the absence of a strong correlation between sequence and semantic similarities. There is a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, the term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity behave distinctly. These findings have significant implications for future work on protein comparison, and will help in better understanding the semantic similarity between proteins.
ChIP-seq and RNA-seq methods to study circadian control of transcription in mammals
Takahashi, Joseph S.; Kumar, Vivek; Nakashe, Prachi; Koike, Nobuya; Huang, Hung-Chung; Green, Carla B.; Kim, Tae-Kyung
2015-01-01
Genome-wide analyses have revolutionized our ability to study the transcriptional regulation of circadian rhythms. The advent of next-generation sequencing methods has facilitated the use of two such technologies, ChIP-seq and RNA-seq. In this chapter, we describe detailed methods and protocols for these two techniques, with emphasis on their usage in circadian rhythm experiments in the mouse liver, a major target organ of the circadian clock system. Critical factors for these methods are highlighted and issues arising with time series samples for ChIP-seq and RNA-seq are discussed. Finally detailed protocols for library preparation suitable for Illumina sequencing platforms are presented. PMID:25662462
How to Help Students Conceptualize the Rigorous Definition of the Limit of a Sequence
ERIC Educational Resources Information Center
Roh, Kyeong Hah
2010-01-01
This article suggests an activity, called the epsilon-strip activity, as an instructional method for conceptualization of the rigorous definition of the limit of a sequence via visualization. The article also describes the learning objectives of each instructional step of the activity, and then provides detailed instructional methods to guide…
ERIC Educational Resources Information Center
Majlesi, Ali Reza
2018-01-01
This study aims to show how multimodality, that is, the mobilization of various communicative resources in social actions (Mondada, 2016), can be used to teach grammar. Drawing on ethnomethodological conversation analysis (Sacks, 1992), the article provides a detailed analysis of 2 corrective feedback sequences in a Swedish-as-a-second-language…
Dover, James H.; Tailleur, Irvin L.; Dumoulin, Julie A.
2004-01-01
The map depicts the field distribution and contact relations between stratigraphic units, the tectonic relations between major stratigraphic sequences, and the detailed internal structure of these sequences. The stratigraphic sequences formed in a variety of continental margin depositional environments, and subsequently underwent a complex deformational history of imbricate thrust faulting and folding. A compilation of micro and macro fossil identifications is included in this data set.
1987-01-01
identified in the difference spectra, implying that there are five to seven tryptophans within 17 Å of the spin-label hapten. Amino acid sequences...of the heavy and light chains were obtained by a combination of amino acid and DNA sequencing. A molecular model was constructed from the sequence...Clore & acids yields detailed information about the amino acid com- Gronenborn, 1982, 1983). This technique should also identify position of the combining
Devailly, Guillaume; Mantsoki, Anna; Joshi, Anagha
2016-11-01
Better protocols and decreasing costs have made high-throughput sequencing experiments now accessible even to small experimental laboratories. However, comparing one or few experiments generated by an individual lab to the vast amount of relevant data freely available in the public domain might be limited due to lack of bioinformatics expertise. Though several tools, including genome browsers, allow such comparison at a single gene level, they do not provide a genome-wide view. We developed Heat*seq, a web-tool that allows genome scale comparison of high throughput experiments (chromatin immuno-precipitation followed by sequencing, RNA-sequencing and Cap Analysis of Gene Expression) provided by a user, to the data in the public domain. Heat*seq currently contains over 12 000 experiments across diverse tissues and cell types in human, mouse and drosophila. Heat*seq displays interactive correlation heatmaps, with an ability to dynamically subset datasets to contextualize user experiments. High quality figures and tables are produced and can be downloaded in multiple formats. Web application: http://www.heatstarseq.roslin.ed.ac.uk/ Source code: https://github.com/gdevailly Contact: Guillaume.Devailly@roslin.ed.ac.uk or Anagha.Joshi@roslin.ed.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Kimura, J; Kimura, M
1987-09-05
The amino acid sequences of two ribosomal proteins, S14 and S16, from the archaebacterium Halobacterium marismortui have been determined. Sequence data were obtained by the manual and solid-phase sequencing of peptides derived from enzymatic digestions with trypsin, chymotrypsin, pepsin, and Staphylococcus aureus protease as well as by chemical cleavage with cyanogen bromide. Proteins S14 and S16 contain 109 and 126 amino acid residues and have Mr values of 11,964 and 13,515, respectively. Comparison of the sequences with those of ribosomal proteins from other organisms demonstrates that S14 has significant homology with the rat liver ribosomal protein S11 (36% identity) as well as with the Escherichia coli ribosomal protein S17 (37%), and that S16 is related to the yeast ribosomal protein YS22 (40%) and proteins S8 from E. coli (28%) and Bacillus stearothermophilus (30%). A comparison of the amino acid residues in the homologous regions of halophilic and nonhalophilic ribosomal proteins reveals that halophilic proteins have more glutamic acids, aspartic acids, prolines, and alanines, and fewer lysines, arginines, and isoleucines than their nonhalophilic counterparts. These amino acid substitutions probably contribute to the structural stability of halophilic ribosomal proteins.
Metavir 2: new tools for viral metagenome comparison and assembled virome analysis
2014-01-01
Background Metagenomics, based on culture-independent sequencing, is a well-fitted approach to provide insights into the composition, structure and dynamics of environmental viral communities. Following recent advances in sequencing technologies, new challenges arise for existing bioinformatic tools dedicated to viral metagenome (i.e. virome) analysis as (i) the number of viromes is rapidly growing and (ii) large genomic fragments can now be obtained by assembling the huge amount of sequence data generated for each metagenome. Results To face these challenges, a new version of Metavir was developed. First, all Metavir tools have been adapted to support comparative analysis of viromes in order to improve the analysis of multiple datasets. In addition to the sequence comparison previously provided, viromes can now be compared through their k-mer frequencies, their taxonomic compositions, recruitment plots and phylogenetic trees containing sequences from different datasets. Second, a new section has been specifically designed to handle assembled viromes made of thousands of large genomic fragments (i.e. contigs). This section includes an annotation pipeline for uploaded viral contigs (gene prediction, similarity search against reference viral genomes and protein domains) and an extensive comparison between contigs and reference genomes. Contigs and their annotations can be explored on the website through specifically developed dynamic genomic maps and interactive networks. Conclusions The new features of Metavir 2 allow users to explore and analyze viromes composed of raw reads or assembled fragments through a set of adapted tools and a user-friendly interface. PMID:24646187
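Comparing datasets through their k-mer frequencies, as mentioned in the Results above, can be sketched in a few lines. The example below builds a normalized k-mer frequency vector per sequence set and compares two vectors with Euclidean distance. The choice of distance, the default `k`, and all function names are assumptions for illustration; the abstract does not state which k or which dissimilarity measure Metavir 2 uses.

```python
import math
from collections import Counter
from itertools import product


def kmer_frequencies(sequences: list[str], k: int = 2) -> list[float]:
    """Normalized frequencies of all 4**k DNA k-mers across the sequences.

    K-mers containing characters other than A, C, G, T (e.g. N) are ignored.
    The result is ordered alphabetically by k-mer (AA, AC, AG, ...).
    """
    counts: Counter[str] = Counter()
    for seq in sequences:
        seq = seq.upper()
        counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


def euclidean_distance(u: list[float], v: list[float]) -> float:
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Invented toy "viromes", each a small list of reads.
virome_a = ["ACGTACGTAC", "GTACGTAC"]
virome_b = ["AAAAATTTTT", "TTTTAAAA"]

freq_a = kmer_frequencies(virome_a)
freq_b = kmer_frequencies(virome_b)
print(euclidean_distance(freq_a, freq_a))       # 0.0 for identical profiles
print(euclidean_distance(freq_a, freq_b) > 0)   # True: compositions differ
```

Because the profiles are normalized, datasets of very different sizes can be compared directly, which matters when the number of reads per virome varies widely.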
Adderson, Elisabeth E.; Boudreaux, Jan W.; Cummings, Jessica R.; Pounds, Stanley; Wilson, Deborah A.; Procop, Gary W.; Hayden, Randall T.
2008-01-01
We compared the relative levels of effectiveness of three commercial identification kits and three nucleic acid amplification tests for the identification of coryneform bacteria by testing 50 diverse isolates, including 12 well-characterized control strains and 38 organisms obtained from pediatric oncology patients at our institution. Between 33.3 and 75.0% of control strains were correctly identified to the species level by phenotypic systems or nucleic acid amplification assays. The most sensitive tests were the API Coryne system and amplification and sequencing of the 16S rRNA gene using primers optimized for coryneform bacteria, which correctly identified 9 of 12 control isolates to the species level, and all strains with a high-confidence call were correctly identified. Organisms not correctly identified were species not included in the test kit databases or not producing a pattern of reactions included in kit databases or which could not be differentiated among several genospecies based on reaction patterns. Nucleic acid amplification assays had limited abilities to identify some bacteria to the species level, and comparison of sequence homologies was complicated by the inclusion of allele sequences obtained from uncultivated and uncharacterized strains in databases. The utility of rpoB genotyping was limited by the small number of representative gene sequences that are currently available for comparison. The correlation between identifications produced by different classification systems was poor, particularly for clinical isolates. PMID:18160450
[Hepatitis C virus: sequence homology of a European isolate and divergence from the prototype].
Seelig, R; Seelig, H P; Renz, M
1991-08-01
The polymerase chain reaction (PCR) detected specific hepatitis C viral (HCV) RNA sequences in liver biopsies from two patients with chronic hepatitis, in the tissue of a liver implant, in plasma from four chronic non-A, non-B hepatitis (NANBH) patients and, for the first time, in an infectious anti-D-immunoglobulin preparation. A comparison of the viral sequences coding for a region of the nonstructural NS3 protein from the liver tissues revealed only a very small degree of sequence divergence at the cDNA as well as at the amino acid level (between 0 and 5%). The sequence similarities of the RNA isolated from plasma of the four chronic NANBH patients and the anti-D-immunoglobulin preparation were in part somewhat lower but overall also high (between 90 and 100%). In contrast, all eight cDNA and amino acid sequences exhibited a significantly higher degree of divergence in comparison with the HCV prototype sequence (between 29 and 32%) than among themselves (between 0 and 10%). This unexpectedly high sequence similarity of the eight European isolates and their low homology to the North American prototype sequence is indicative of the existence of different types of HCV. This will be important not only for epidemiological studies but also for the development of effective diagnostic procedures and vaccines. Concerning the pathogenesis of NANBH, a double infection or a helper mechanism has to be considered: in addition to the C virus, sequences of another virus particle were found in the infectious IgG preparation as well as in the liver biopsies.
Wilk, Tímea; Szabó, Móni; Szmolka, Ama; Kiss, János; Barta, Endre; Nagy, Tibor
2016-01-01
Three strains of Salmonella enterica serovar Infantis isolated from healthy broiler chickens from 2012 to 2013 have been sequenced. Comparison of these and previously published S. Infantis genome sequences of broiler origin in 1996 and 2004 will provide new insight into the genome evolution and recent spread of S. Infantis in poultry. PMID:27979950
Tahir, Muhammad N; Lockhart, Ben; Grinstead, Samuel; Mollov, Dimitre
2017-04-01
Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high-throughput sequencing (HTS). The nearly full genome sequence of a panicovirus was identified from one HTS scaffold. Sanger sequencing was used to confirm the HTS results and complete the genome sequence of 4404 nt. This virus was provisionally named Bermuda grass latent virus (BGLV). Its predicted open reading frames follow the typical arrangement of the genus Panicovirus. Based on sequence comparisons and phylogenetic analyses BGLV differs from other viruses and therefore taxonomically it is a new member of the genus Panicovirus, family Tombusviridae.
Huang, Tonghui; Sun, Jie; Zhou, Shanshan; Gao, Jian; Liu, Yi
2017-06-30
Adenosine monophosphate-activated protein kinase (AMPK) plays a critical role in the regulation of energy metabolism and has been targeted for drug development of therapeutic intervention in Type II diabetes and related diseases. Recently, there has been renewed interest in the development of direct β1-selective AMPK activators to treat patients with diabetic nephropathy. To investigate the details of AMPK domain structure, sequence alignment and structural comparison were used to identify the key amino acids involved in the interaction with activators and the structure difference between β1 and β2 subunits. Additionally, a series of potential β1-selective AMPK activators were identified by virtual screening using molecular docking. The retrieved hits were filtered on the basis of Lipinski's rule of five and drug-likeness. Finally, 12 novel compounds with diverse scaffolds were obtained as potential starting points for the design of direct β1-selective AMPK activators.
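The filtering step mentioned above, Lipinski's rule of five, can be sketched without any cheminformatics library if the molecular descriptors are already computed. The thresholds below are the standard rule-of-five limits. The `Compound` class, the example hits and their descriptor values are hypothetical, and allowing at most one violation is a common convention assumed here, not a detail given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Precomputed descriptors for one screening hit (hypothetical data)."""
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def passes_rule_of_five(c: Compound, max_violations: int = 1) -> bool:
    """True if the compound exceeds at most max_violations limits."""
    return lipinski_violations(c) <= max_violations


# Hypothetical docking hits, invented for illustration only.
hits = [
    Compound("hit_01", mol_weight=342.4, logp=2.8, h_donors=2, h_acceptors=5),
    Compound("hit_02", mol_weight=612.7, logp=6.1, h_donors=4, h_acceptors=9),
    Compound("hit_03", mol_weight=498.0, logp=5.3, h_donors=1, h_acceptors=7),
]

kept = [c.name for c in hits if passes_rule_of_five(c)]
print(kept)  # ['hit_01', 'hit_03']
```

In a real workflow the descriptors would come from a cheminformatics toolkit, and further drug-likeness filters would follow this one.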
Puniamoorthy, N; Ismail, M R B; Tan, D S H; Meier, R
2009-11-01
Our understanding of how fast mating behaviour evolves in insects is rather poor due to a lack of comparative studies among insect groups for which phylogenetic relationships are known. Here, we present a detailed study of the mating behaviour of 27 species of Sepsidae (Diptera) for which a well-resolved and supported phylogeny is available. We demonstrate that mating behaviour is extremely diverse in sepsids with each species having its own mating profile. We define 32 behavioural characters and document them with video clips. Based on sister species comparisons, we provide several examples where mating behaviour evolves faster than all sexually dimorphic morphological traits. Mapping the behaviours onto the molecular tree reveals much homoplasy, comparable to that observed for third positions of mitochondrial protein-encoding genes. A partitioned Bremer support (PBS) analysis reveals conflict between the molecular and behavioural data, but behavioural characters have higher PBS values per parsimony-informative character than DNA sequence characters.
Variation in Lithic Technological Strategies among the Neanderthals of Gibraltar
Shipton, Ceri; Clarkson, Christopher; Bernal, Marco Antonio; Boivin, Nicole; Finlayson, Clive; Finlayson, Geraldine; Fa, Darren; Pacheco, Francisco Giles; Petraglia, Michael
2013-01-01
The evidence for Neanderthal lithic technology is reviewed and summarized for four caves on The Rock of Gibraltar: Vanguard, Beefsteak, Ibex and Gorham’s. Some of the observed patterns in technology are statistically tested including raw material selection, platform preparation, and the use of formal and expedient technological schemas. The main parameters of technological variation are examined through detailed analysis of the Gibraltar cores and comparison with samples from the classic Mousterian sites of Le Moustier and Tabun C. The Gibraltar Mousterian, including the youngest assemblage from Layer IV of Gorham’s Cave, spans the typical Middle Palaeolithic range of variation from radial Levallois to unidirectional and multi-platform flaking schemas, with characteristic emphasis on the former. A diachronic pattern of change in the Gorham’s Cave sequence is documented, with the younger assemblages utilising more localized raw material and less formal flaking procedures. We attribute this change to a reduction in residential mobility as the climate deteriorated during Marine Isotope Stage 3 and the Neanderthal population contracted into a refugium. PMID:23762312
Autoinhibition of Bruton's tyrosine kinase (Btk) and activation by soluble inositol hexakisphosphate
Wang, Qi; Vogan, Erik M; Nocka, Laura M; Rosen, Connor E; Zorn, Julie A; Harrison, Stephen C; Kuriyan, John
2015-01-01
Bruton's tyrosine kinase (Btk), a Tec-family tyrosine kinase, is essential for B-cell function. We present crystallographic and biochemical analyses of Btk, which together reveal molecular details of its autoinhibition and activation. Autoinhibited Btk adopts a compact conformation like that of inactive c-Src and c-Abl. A lipid-binding PH-TH module, unique to Tec kinases, acts in conjunction with the SH2 and SH3 domains to stabilize the inactive conformation. In addition to the expected activation of Btk by membranes containing phosphatidylinositol triphosphate (PIP3), we found that inositol hexakisphosphate (IP6), a soluble signaling molecule found in both animal and plant cells, also activates Btk. This activation is a consequence of a transient PH-TH dimerization induced by IP6, which promotes transphosphorylation of the kinase domains. Sequence comparisons with other Tec-family kinases suggest that activation by IP6 is unique to Btk. DOI: http://dx.doi.org/10.7554/eLife.06074.001 PMID:25699547
Approaches to the structural modelling of insect wings.
Wootton, R J; Herbert, R C; Young, P G; Evans, K E
2003-01-01
Insect wings lack internal muscles, and the orderly, necessary deformations which they undergo in flight and folding are in part remotely controlled, in part encoded in their structure. This factor is crucial in understanding their complex, extremely varied morphology. Models have proved particularly useful in clarifying the facilitation and control of wing deformation. Their development has followed a logical sequence from conceptual models through physical and simple analytical to numerical models. All have value provided their limitations are realized and constant comparisons made with the properties and mechanical behaviour of real wings. Numerical modelling by the finite element method is by far the most time-consuming approach, but has real potential in analysing the adaptive significance of structural details and interpreting evolutionary trends. Published examples are used to review the strengths and weaknesses of each category of model, and a summary is given of new work using finite element modelling to investigate the vibration properties and response to impact of hawkmoth wings. PMID:14561349
Theoretical energies for the n = 1 and 2 states of the helium isoelectronic sequence up to Z = 100
NASA Technical Reports Server (NTRS)
Drake, G. W.
1988-01-01
The unified method described previously for combining high-precision nonrelativistic variational calculations with relativistic and quantum electrodynamic corrections is applied to the 1s2 1S0, 1s2s 1S0, 1s2s 3S1, 1s2p 1P1, and 1s2p 3P(0,1,2) states of helium-like ions. Detailed tabulations are presented for all ions in the range Z = 2-100 and are compared with a wide range of experimental data up to (Kr-34)+. The results for (U-90)+ significantly alter the recent Lamb shift measurement of Munger and Gould (1986) from 70.4 ± 8.3 to 71.0 ± 8.3 eV, in comparison with a revised theoretical value of 74.3 ± 0.4 eV. The improved agreement is due to the inclusion of higher order two-electron corrections in the present work.
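The size of the improvement in agreement can be checked with a short calculation. The sketch below expresses the difference between theory and measurement in units of the combined uncertainty; it assumes the two quoted uncertainties are independent and can be added in quadrature, which the abstract does not state.

```python
from math import sqrt


def deviation_in_sigma(measured, measured_err, theory, theory_err):
    """Theory-minus-measurement difference divided by the combined uncertainty.

    Assumes independent uncertainties combined in quadrature (an assumption).
    """
    return (theory - measured) / sqrt(measured_err**2 + theory_err**2)


# Values in eV, taken from the abstract above.
before = deviation_in_sigma(70.4, 8.3, 74.3, 0.4)  # original measurement value
after = deviation_in_sigma(71.0, 8.3, 74.3, 0.4)   # revised measurement value
print(f"before: {before:.2f} sigma, after: {after:.2f} sigma")
```

Under that assumption both values lie well within one combined standard deviation of the theoretical result, and the revision moves the measurement slightly closer to it.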
Equatorial X-rays and their effect on the lower mesosphere
NASA Technical Reports Server (NTRS)
Goldberg, R. A.; Jones, W. H.; Williamson, P. R.; Barcus, J. R.; Hale, L. C.
1976-01-01
On the night of May 23/24, 1975, a sequence of rocket and balloon experiments was launched from Chilca Base, Peru (12.5 deg S, 76.8 deg W, magnetic dip = - 0.7 deg). Detailed analysis and comparisons of the data yielded the first direct measurement of lower mesospheric response to a galactic X-ray source. This result could only have been determined at the equator, where cosmic ray background effects are minimal. The objective of the experiments was to seek out the equatorial energetic electron belt, sporadically reported to contain fluxes near auroral levels, measure the bremsstrahlung radiation produced by this particle belt, and determine the influence of this radiation on the middle atmosphere. High altitude rocket payloads (Nike Tomahawk 18.170 and 18.171) were launched to probe the thermosphere during and following the anticipated downward drift period. Each carried an on-axis X-ray scintillation detector and Geiger Mueller energetic electron detectors. Magnetometers and lunar sensors were used to determine payload aspect.
Norrie disease gene: characterization of deletions and possible function.
Chen, Z Y; Battinelli, E M; Hendriks, R W; Powell, J F; Middleton-Price, H; Sims, K B; Breakefield, X O; Craig, I W
1993-05-01
Positional cloning experiments have resulted recently in the isolation of a candidate gene for Norrie disease (pseudoglioma; NDP), a severe X-linked neurodevelopmental disorder. Here we report the isolation and analysis of human genomic DNA clones encompassing the NDP gene. The gene spans 28 kb and consists of 3 exons, the first of which is entirely contained within the 5' untranslated region. Detailed analysis of genomic deletions in Norrie patients shows that they are heterogeneous, both in size and in position. By PCR analysis, we found that expression of the NDP gene was not confined to the eye or to the brain. An extensive DNA and protein sequence comparison between the human NDP gene and related genes from the database revealed homology with cysteine-rich protein-binding domains of immediate-early genes implicated in the regulation of cell proliferation. We propose that NDP is a molecule related in function to these genes and may be involved in a pathway that regulates neural cell differentiation and proliferation.
Gottlieb, Michael M; Arenillas, David J; Maithripala, Savanie; Maurer, Zachary D; Tarailo Graovac, Maja; Armstrong, Linlea; Patel, Millan; van Karnebeek, Clara; Wasserman, Wyeth W
2015-04-01
Advances in next-generation sequencing (NGS) technologies have helped reveal causal variants for genetic diseases. In order to establish causality, it is often necessary to compare genomes of unrelated individuals with similar disease phenotypes to identify common disrupted genes. When working with cases of rare genetic disorders, finding similar individuals can be extremely difficult. We introduce a web tool, GeneYenta, which facilitates the matchmaking process, allowing clinicians to coordinate detailed comparisons for phenotypically similar cases. Importantly, the system is focused on phenotype annotation, with explicit limitations on highly confidential data that create barriers to participation. The procedure for matching of patient phenotypes, inspired by online dating services, uses an ontology-based semantic case matching algorithm with attribute weighting. We evaluate the capacity of the system using a curated reference data set and 19 clinician entered cases comparing four matching algorithms. We find that the inclusion of clinician weights can augment phenotype matching. © 2015 WILEY PERIODICALS, INC.
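To make the idea of ontology-based case matching with attribute weighting concrete, here is a minimal sketch. It is not GeneYenta's algorithm: the toy ontology, the ancestor-overlap (Jaccard) similarity, and the weighted best-match score are all illustrative assumptions chosen for brevity.

```python
def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology.

    `parents` maps each term to an iterable of its direct parent terms.
    """
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b, parents):
    """Jaccard overlap of the two ancestor sets (1.0 for identical terms)."""
    anc_a, anc_b = ancestors(a, parents), ancestors(b, parents)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def case_score(query_weights, candidate_terms, parents):
    """Weighted average, over the query terms, of the best match in the candidate.

    `query_weights` maps each query phenotype term to a clinician-assigned
    weight (hypothetical scale); `candidate_terms` lists the other case's terms.
    """
    candidate_terms = list(candidate_terms)
    total_weight = sum(query_weights.values())
    if total_weight == 0 or not candidate_terms:
        return 0.0
    weighted = sum(
        weight * max(term_similarity(term, c, parents) for c in candidate_terms)
        for term, weight in query_weights.items()
    )
    return weighted / total_weight


# Toy ontology with made-up term names (not real ontology identifiers).
TOY_PARENTS = {
    "cataract": ["eye_abnormality"],
    "glaucoma": ["eye_abnormality"],
    "eye_abnormality": ["root"],
    "heart_defect": ["root"],
}
```

In this sketch a heavily weighted query term dominates the score, which is one simple way clinician-supplied weights could influence the ranking of candidate matches.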
NASA Astrophysics Data System (ADS)
Adam, Ammar; Kaminski, Michael; Abdullatif, Osman
2017-04-01
This work reports the first discovery of Earlandia foraminifera in the Triassic succession of the Middle East, within the Upper Khartam Member of the Khuff Formation. The study area is located in central Saudi Arabia, where four outcrop localities were logged in detail for sedimentology and micropaleontology. More than 300 samples were collected for detailed sedimentological and micropaleontological analysis. Of these, only six samples recovered fossil Earlandia; these are dominantly observed in the interlaminated quartz-bearing recrystallized limestone lithofacies type. The Earlandia occur in association with quartz grains, peloids, ooids, ostracods, bivalves, bryozoans, cephalopods, and stromatolites. The defined fossils of Earlandia are restricted to the lower fourth-order sequence of the Upper Khartam Member, where non-skeletal grains (mostly oolitic grainstones) prevail. The skeletal grains along with the Earlandia occur as a thin (20 cm) transgressive lag. Furthermore, the regional occurrences of the Earlandia are consistent with the previously established high-frequency sequence stratigraphic scheme; therefore, the Earlandia could be used as a biomarker for regional biostratigraphic correlation and could enhance the high-resolution sequence stratigraphic correlations of the Upper Khartam Member. Essentially, the detailed sedimentological and micropaleontological analysis (Earlandia foraminifera) indicates a plate-wide extensive shallow epeiric sea. The latter is gently dipping and sporadically connected to the open marine system.
A statistical physics perspective on alignment-independent protein sequence comparison.
Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R
2015-08-01
Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.
Desjardin, Dennis E; Hemmes, Don E; Perry, Brian A
2014-01-01
Pseudobaeospora wipapatiae is described as new based on material collected in alien wet habitats on the island of Hawaii. Unique features of this beautiful species include deep ruby-colored basidiomes with two-spored basidia, amyloid cheilocystidia and a hymeniderm pileipellis with abundant pileocystidia that is initially deep ruby in KOH then changes to lilac gray. Phylogenetic analysis of nuclear large ribosomal subunit sequence data suggests a close relationship between Pseudobaeospora and Tricholoma. BLAST comparisons of internal transcribed spacer and 5.8S nuclear ribosomal subunit region sequence data reveal greatest similarity with existing sequences of Pseudobaeospora species. A comprehensive description, color photograph, illustrations of salient micromorphological features and comparisons with phenetically similar taxa are provided. © 2014 by The Mycological Society of America.
A directional nucleation-zipping mechanism for triple helix formation
Alberti, Patrizia; Arimondo, Paola B.; Mergny, Jean-Louis; Garestier, Thérèse; Hélène, Claude; Sun, Jian-Sheng
2002-01-01
A detailed kinetic study of triple helix formation was performed by surface plasmon resonance. Three systems were investigated involving 15mer pyrimidine oligonucleotides as third strands. Rate constants and activation energies were validated by comparison with thermodynamic values calculated from UV-melting analysis. Replacement of a T·A base pair by a C·G pair at either the 5′ or the 3′ end of the target sequence allowed us to assess mismatch effects and to delineate the mechanism of triple helix formation. Our data show that the association rate constant is governed by the sequence of base triplets on the 5′ side of the triplex (referred to as the 5′ side of the target oligopurine strand) and provide evidence that the reaction pathway for triple helix formation in the pyrimidine motif proceeds from the 5′ end to the 3′ end of the triplex according to the nucleation-zipping model. It seems that this is a general feature of all triple helix formation, probably due to the right-handedness of the DNA double helix, which provides stronger base stacking at the 5′ than at the 3′ duplex–triplex junction. Understanding the mechanism of triple helix formation is not only of fundamental interest, but may also help in designing better triple helix-forming oligonucleotides for gene targeting and control of gene expression. PMID:12490709
NASA Astrophysics Data System (ADS)
Bultreys, Tom; Van Hoorebeke, Luc; Cnudde, Veerle
2016-09-01
The two-phase flow properties of natural rocks depend strongly on their pore structure and wettability, both of which are often heterogeneous throughout the rock. To better understand and predict these properties, image-based models are being developed. Resulting simulations are however problematic in several important classes of rocks with broad pore-size distributions. We present a new multiscale pore network model to simulate secondary waterflooding in these rocks, which may undergo wettability alteration after primary drainage. This novel approach makes it possible to include the effect of microporosity on the imbibition sequence without the need to describe each individual micropore. Instead, we show that fluid transport through unresolved pores can be taken into account in an upscaled fashion, by the inclusion of symbolic links between macropores, resulting in strongly decreased computational demands. Rules to describe the behavior of these links in the quasistatic invasion sequence are derived from percolation theory. The model is validated by comparison to a fully detailed network representation, which takes each separate micropore into account. Strongly and weakly water- and oil-wet simulations show good results, as do mixed-wettability scenarios with different pore-scale wettability distributions. We also show simulations on a network extracted from a micro-CT scan of Estaillades limestone, which yields good agreement with water-wet and mixed-wet experimental results.
Equilibrium, stability, and orbital evolution of close binary systems
NASA Technical Reports Server (NTRS)
Lai, Dong; Rasio, Frederic A.; Shapiro, Stuart L.
1994-01-01
We present a new analytic study of the equilibrium and stability properties of close binary systems containing polytropic components. Our method is based on the use of ellipsoidal trial functions in an energy variational principle. We consider both synchronized and nonsynchronized systems, constructing the compressible generalizations of the classical Darwin and Darwin-Riemann configurations. Our method can be applied to a wide variety of binary models where the stellar masses, radii, spins, entropies, and polytropic indices are all allowed to vary over wide ranges and independently for each component. We find that both secular and dynamical instabilities can develop before a Roche limit or contact is reached along a sequence of models with decreasing binary separation. High incompressibility always makes a given binary system more susceptible to these instabilities, but the dependence on the mass ratio is more complicated. As simple applications, we construct models of double degenerate systems and of low-mass main-sequence star binaries. We also discuss the orbital evolution of close binary systems under the combined influence of fluid viscosity and secular angular momentum losses from processes like gravitational radiation. We show that the existence of global fluid instabilities can have a profound effect on the terminal evolution of coalescing binaries. The validity of our analytic solutions is examined by means of detailed comparisons with the results of recent numerical fluid calculations in three dimensions.
CVTree3 Web Server for Whole-genome-based and Alignment-free Prokaryotic Phylogeny and Taxonomy.
Zuo, Guanghong; Hao, Bailin
2015-10-01
A faithful phylogeny and an objective taxonomy for prokaryotes should agree with each other and ultimately follow the genome data. With the number of sequenced genomes reaching tens of thousands, both tree inference and detailed comparison with taxonomy are great challenges. We now provide one solution in the latest Release 3.0 of the alignment-free and whole-genome-based web server CVTree3. The server resides in a cluster of 64 cores and is equipped with an interactive, collapsible, and expandable tree display. It is capable of comparing the tree branching order with prokaryotic classification at all taxonomic ranks from domains down to species and strains. CVTree3 allows for inquiry by taxon names and trial on lineage modifications. In addition, it reports a summary of monophyletic and non-monophyletic taxa at all ranks as well as produces print-quality subtree figures. After giving an overview of retrospective verification of the CVTree approach, the power of the new server is described for the mega-classification of prokaryotes and determination of taxonomic placement of some newly-sequenced genomes. A few discrepancies between CVTree and 16S rRNA analyses are also summarized with regard to possible taxonomic revisions. CVTree3 is freely accessible to all users at http://tlife.fudan.edu.cn/cvtree3/ without login requirements. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
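The abstract does not spell out how an alignment-free, composition-based distance is computed, so the following is only a sketch of the general composition-vector idea as commonly described for this family of methods: k-mer frequencies are compared with a prediction from shorter words, and two sequences are compared through the cosine of the resulting vectors. Function names, the toy sequences, and the choice k = 3 are illustrative assumptions; this is not the CVTree3 server code.

```python
from collections import Counter, defaultdict
from math import sqrt


def kmer_freqs(seq, k):
    """Relative frequencies of all overlapping k-mers in `seq`."""
    n = len(seq) - k + 1
    if n <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(n))
    return {word: c / n for word, c in counts.items()}


def composition_vector(seq, k):
    """Map each candidate k-mer to (f - f0) / f0.

    f is the observed k-mer frequency and f0 the frequency predicted from
    the (k-1)-mers and (k-2)-mers: f0 = f(prefix) * f(suffix) / f(middle).
    Only k-mers with a nonzero prediction are stored. Requires k >= 3.
    """
    if k < 3:
        raise ValueError("k must be at least 3 in this sketch")
    f_k = kmer_freqs(seq, k)
    f_k1 = kmer_freqs(seq, k - 1)
    f_k2 = kmer_freqs(seq, k - 2)
    # Index (k-1)-mers by their first k-2 letters so overlapping pairs are easy to find.
    by_overlap = defaultdict(list)
    for word in f_k1:
        by_overlap[word[:-1]].append(word)
    vector = {}
    for prefix in f_k1:
        middle = prefix[1:]
        for suffix in by_overlap.get(middle, ()):
            word = prefix + suffix[-1]
            predicted = f_k1[prefix] * f_k1[suffix] / f_k2[middle]
            vector[word] = (f_k.get(word, 0.0) - predicted) / predicted
    return vector


def cv_distance(a, b):
    """Distance (1 - cosine similarity) / 2 between two composition vectors."""
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("composition vector is empty or all zeros")
    dot = sum(v * b.get(word, 0.0) for word, v in a.items())
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


# Toy protein-like strings, made up for illustration only.
vec_a = composition_vector("MKVLAAGIVGLLLAQMKVLA", 3)
vec_b = composition_vector("MKTLAAGIVALLLSQMKTLA", 3)
print(cv_distance(vec_a, vec_b))
```

A matrix of such pairwise distances could then be passed to any distance-based tree builder; real analyses would use whole proteomes and a larger k than this toy example.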
Enkhmandakh, Badam; Makeyev, Alexandr V.; Bayarsaihan, Dashzeveg
2006-01-01
Lim1, Ssdp1, and Ldb1 proteins are components of the Ldb1-associated transcriptional complex, which is important in the head-organizing activity during early mouse development. Depletion of each individual protein alone causes a headless phenotype. To explore in more detail the modular architecture of the complex, we have generated two different gene-trapped mouse lines that express truncated forms of Ssdp1. Embryos derived from the gene-trapped line that encodes a truncated Ssdp1 lacking the proline-rich sequence exhibit a lethal abnormal head-development phenotype, resembling mouse embryos deficient for Lim1, Ssdp1, or Otx2 genes. Embryos derived from the second gene-trapped line, in which most of the proline-rich domain of Ssdp1 is retained, did not show abnormalities in head development. Our data demonstrate that components of the Ldb1-dependent module can be subdivided further into discrete functional domains and that the proline-rich stretch of Ssdp1 is critical for embryonic head development. Furthermore, phylogenetic comparisons revealed that in Caenorhabditis elegans, a similar proline-rich sequence is absent in Ssdp but present in Ldb1. We conclude that although the overall architecture of the Ldb1-dependent module has been preserved, the genetic specification of its individual components has diversified during evolution, without compromising the function of the module. PMID:16864769
Zhang, Jin; Wang, Bing; Dong, Shuanglin; Cao, Depan; Dong, Junfeng; Walker, William B.; Liu, Yang; Wang, Guirong
2015-01-01
To better understand the olfactory mechanisms in two lepidopteran pest model species, Helicoverpa armigera and H. assulta, we conducted transcriptome analysis of the adult antennae using Illumina sequencing technology and compared the chemosensory genes between these two related species. Combined with the chemosensory genes we had identified previously in H. armigera by 454 sequencing, we identified 133 putative chemosensory unigenes in H. armigera including 60 odorant receptors (ORs), 19 ionotropic receptors (IRs), 34 odorant binding proteins (OBPs), 18 chemosensory proteins (CSPs), and 2 sensory neuron membrane proteins (SNMPs). Consistent with these results, 131 putative chemosensory genes including 64 ORs, 19 IRs, 29 OBPs, 17 CSPs, and 2 SNMPs were identified through male and female antennal transcriptome analysis in H. assulta. Reverse Transcription-PCR (RT-PCR) was conducted in H. assulta to examine the accuracy of the assembly and annotation of the transcriptome and the expression profile of these unigenes in different tissues. Most of the ORs, IRs and OBPs were enriched in adult antennae, while almost all the CSPs were expressed in antennae as well as legs. We compared the differences in the chemosensory genes between these two species in detail. Our work provides valuable information for further functional studies of pheromone and host volatile recognition genes in these two related species. PMID:25659090
Bazhan, S I; Karpenko, L I; Ilyicheva, T N; Belavin, P A; Seregin, S V; Danilyuk, N K; Antonets, D V; Ilyichev, A A
2010-04-01
Advances in defining HIV-1 CD8+ T cell epitopes and understanding endogenous MHC class I antigen processing enable the rational design of polyepitope vaccines for eliciting broadly targeted CD8+ T cell responses to HIV-1. Here we describe the construction and comparison of experimental DNA vaccines consisting of ten selected HLA-A2 epitopes from the major HIV-1 antigens Env, Gag, Pol, Nef, and Vpr. The immunogenicity of designed gene constructs was assessed after double DNA prime, single vaccinia virus boost immunization of HLA-A2 transgenic mice. We compared a number of parameters including different strategies for fusing ubiquitin to the polyepitope and including spacer sequences between epitopes to optimize proteasome liberation and TAP transport. It was demonstrated that the vaccine construct that induced in vitro the largest number of [peptide-MHC class I] complexes was also the most immunogenic in the animal experiments. This most immunogenic vaccine construct contained the N-terminal ubiquitin for targeting the polyepitope to the proteasome and included both proteasome liberation and TAP transport optimized spacer sequences that flanked the epitopes within the polyepitope construct. The immunogenicity of determinants was strictly related to their affinities for HLA-A2. Our finding supports the concept of rational vaccine design based on detailed knowledge of antigen processing. Copyright 2010 Elsevier Ltd. All rights reserved.
Ancient genomic architecture for mammalian olfactory receptor clusters
Aloni, Ronny; Olender, Tsviya; Lancet, Doron
2006-01-01
Background: Mammalian olfactory receptor (OR) genes reside in numerous genomic clusters of up to several dozen genes. Whole-genome sequence alignment nets of five mammals allow their comprehensive comparison, aimed at reconstructing the ancestral olfactory subgenome. Results: We developed a new and general tool for genome-wide definition of genomic gene clusters conserved in multiple species. Syntenic orthologs, defined as gene pairs showing conservation of both genomic location and coding sequence, were subjected to a graph theory algorithm for discovering CLICs (clusters in conservation). When applied to ORs in five mammals, including the marsupial opossum, more than 90% of the OR genes were found within a framework of 48 multi-species CLICs, invoking a general conservation of gene order and composition. A detailed analysis of individual CLICs revealed multiple differences among species, interpretable through species-specific genomic rearrangements and reflecting complex mammalian evolutionary dynamics. One significant instance involves CLIC #1, which lacks a human member, implying the human-specific deletion of an OR cluster, whose mouse counterpart has been tentatively associated with isovaleric acid odorant detection. Conclusion: The identified multi-species CLICs demonstrate that most of the mammalian OR clusters have a common ancestry, preceding the split between marsupials and placental mammals. However, only two of these CLICs were capable of incorporating chicken OR genes, parsimoniously implying that all other CLICs emerged subsequent to the avian-mammalian divergence. PMID:17010214
CVTree3 Web Server for Whole-genome-based and Alignment-free Prokaryotic Phylogeny and Taxonomy
Zuo, Guanghong; Hao, Bailin
2015-01-01
A faithful phylogeny and an objective taxonomy for prokaryotes should agree with each other and ultimately follow the genome data. With the number of sequenced genomes reaching tens of thousands, both tree inference and detailed comparison with taxonomy are great challenges. We now provide one solution in the latest Release 3.0 of the alignment-free and whole-genome-based web server CVTree3. The server resides in a cluster of 64 cores and is equipped with an interactive, collapsible, and expandable tree display. It is capable of comparing the tree branching order with prokaryotic classification at all taxonomic ranks from domains down to species and strains. CVTree3 allows for inquiry by taxon names and trial on lineage modifications. In addition, it reports a summary of monophyletic and non-monophyletic taxa at all ranks as well as produces print-quality subtree figures. After giving an overview of retrospective verification of the CVTree approach, the power of the new server is described for the mega-classification of prokaryotes and determination of taxonomic placement of some newly-sequenced genomes. A few discrepancies between CVTree and 16S rRNA analyses are also summarized with regard to possible taxonomic revisions. CVTree3 is freely accessible to all users at http://tlife.fudan.edu.cn/cvtree3/ without login requirements. PMID:26563468
Petrology and Geochemistry of D'Orbigny, Geochemistry of Sahara 99555, and the Origin of Angrites
NASA Technical Reports Server (NTRS)
Mittlefehldt, David W.; Killgore, Marvin; Lee, Michael T.
2001-01-01
We have done a detailed petrologic study of the angrite D'Orbigny, and a geochemical study of it and Sahara 99555. D'Orbigny is an igneous-textured rock composed of Ca-rich olivine, Al-Ti-diopside-hedenbergite, subcalcic kirschsteinite, two generations of hercynitic spinel and anorthite, with the mesostasis phases ulvöspinel, Ca-phosphate, a silicophosphate phase and Fe-sulfide. We report an unknown Fe-Ca-Al-Ti-silicate phase in the mesostasis not previously found in angrites. One hercynitic spinel is a large, rounded homogeneous grain of a different composition than the euhedral and zoned grains. We believe the former is a xenocryst, the first such described from angrites. The mafic phases are highly zoned; mg# of cores for olivine are approx. 64, and for clinopyroxene approx. 58, and both are zoned to Mg-free rims. The Ca content of olivine increases with decreasing mg#, until olivine with approx. 20 mole% Ca is overgrown by subcalcic kirschsteinite with Ca approx. 30-35 mole%. Detailed zoning sequences in olivine-subcalcic kirschsteinite and clinopyroxene show slight compositional reversals. There is no mineralogic control that can explain these reversals, and we believe they were likely caused by local additions of more primitive melt during crystallization of D'Orbigny. D'Orbigny is the most ferroan angrite with a bulk rock mg# of 32. Compositionally, it is virtually identical to Sahara 99555; these are the first set of compositionally identical angrites. Comparison with the other angrites shows that there is no simple petrogenetic sequence, partial melting with or without fractional crystallization, that can explain the angrite suite. Angra dos Reis remains a very anomalous angrite. Angrites show no evidence for the brecciation, shock, or impact or thermal metamorphism that affected the HED suite and ordinary chondrites. This suggests the angrite parent body may have followed a fundamentally different evolutionary path than did these other parent bodies.
Genomics and privacy: implications of the new reality of closed data for the field.
Greenbaum, Dov; Sboner, Andrea; Mu, Xinmeng Jasmine; Gerstein, Mark
2011-12-01
Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can "slice" and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches, for example modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security. However, teaching personal genomics with identifiable subjects in the university setting will, in turn, create additional privacy issues and social conundrums. © 2011 Greenbaum et al.