Sample records for genomic binary characters

  1. Coevolution study of mitochondria respiratory chain proteins: toward the understanding of protein--protein interaction.

    PubMed

    Yang, Ming; Ge, Yan; Wu, Jiayan; Xiao, Jingfa; Yu, Jun

    2011-05-20

    Coevolution can be seen as the interdependency between evolutionary histories. In the context of protein evolution, functional correlation proteins are ever-present coordinated evolutionary characters without disruption of organismal integrity. In complex systems, there are two forms of protein--protein interaction in vivo: inter-complex interaction and intra-complex interaction. In this paper, we studied the difference in coevolution characters between inter-complex and intra-complex interaction using the "Mirror tree" method on the respiratory chain (RC) proteins. We divided the correlation coefficients of all pairwise RC proteins into two groups, corresponding to binary protein--protein interactions within complexes and binary protein--protein interactions between complexes, respectively. A dramatic discrepancy is detected between the coevolution characters of the two sets of protein interactions (Wilcoxon test, p-value = 4.4 × 10^-6). Our finding reveals some critical information for coevolutionary studies and assists the mechanistic investigation of protein--protein interaction. Furthermore, the results also provide unique clues to the supramolecular organization of protein complexes in the mitochondrial inner membrane. More detailed binding-site maps and genome information for nuclear-encoded RC proteins will be extraordinarily valuable for further studies of mitochondrial dynamics. Copyright © 2011. Published by Elsevier Ltd.

  2. Explaining evolution via constrained persistent perfect phylogeny

    PubMed Central

    2014-01-01

    Background The perfect phylogeny is an often used model in phylogenetics since it provides an efficient basic procedure for representing the evolution of genomic binary characters in several frameworks, for example in haplotype inference. The model, which is conceptually the simplest, is based on the infinite sites assumption, that is, no character can mutate more than once in the whole tree. A main open problem regarding the model is finding generalizations that retain the computational tractability of the original model but are more flexible in modeling biological data when the infinite sites assumption is violated because of, e.g., back mutations. A special case of back mutations that has been considered in the study of the evolution of protein domains (where a domain is acquired and then lost) is persistency, that is, the fact that a character is allowed to return to the ancestral state. In this model characters can be gained and lost at most once. In this paper we consider the computational problem of explaining binary data by the Persistent Perfect Phylogeny model (referred to as PPP), and for this purpose we investigate the problem of reconstructing an evolution where some constraints are imposed on the paths of the tree. Results We define a natural generalization of the PPP problem obtained by requiring that for some pairs (character, species), neither the species nor any of its ancestors can have the character. In other words, some characters cannot be persistent for some species. This new problem is called Constrained PPP (CPPP). Based on a graph formulation of the CPPP problem, we are able to provide a polynomial time solution for the CPPP problem for matrices whose conflict graph has no edges. Using this result, we develop a parameterized algorithm for solving the CPPP problem where the parameter is the number of characters. 
Conclusions A preliminary experimental analysis shows that the constrained persistent perfect phylogeny model can efficiently explain data that do not conform to the classical perfect phylogeny model. PMID:25572381
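
    The conflict graph mentioned in this abstract can be illustrated with a small sketch. It assumes the usual convention for rooted binary matrices (rows are species, columns are characters, 0 is the ancestral state), under which two characters conflict when the rows show all three of the state pairs (0,1), (1,0) and (1,1). The function names are hypothetical, and the code only covers the classical perfect phylogeny test, not the PPP or CPPP algorithms of the paper.

```python
from itertools import combinations


def conflict_graph(matrix):
    """Return the set of conflicting character pairs as (i, j) column indices.

    Assumption: rows are species, columns are binary characters, and 0 is
    the ancestral state. Two characters conflict when some rows exhibit all
    of the pairs (0, 1), (1, 0) and (1, 1).
    """
    n_chars = len(matrix[0]) if matrix else 0
    edges = set()
    for i, j in combinations(range(n_chars), 2):
        seen = {(row[i], row[j]) for row in matrix}
        if {(0, 1), (1, 0), (1, 1)} <= seen:
            edges.add((i, j))
    return edges


def admits_perfect_phylogeny(matrix):
    """A rooted binary matrix fits a perfect phylogeny iff no pair conflicts."""
    return not conflict_graph(matrix)


# Characters 0 and 1 are nested, character 2 is disjoint from both: no conflict.
compatible = [[1, 1, 0],
              [1, 0, 0],
              [0, 0, 1]]

# The two characters overlap without nesting: one conflict edge.
incompatible = [[1, 1],
                [1, 0],
                [0, 1]]

print(conflict_graph(compatible))
print(conflict_graph(incompatible))
```

    A matrix with an empty conflict graph is the case for which the abstract reports a polynomial time solution of CPPP; matrices with conflict edges are the ones that need the persistent (gain then loss) extension.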

  3. Phylogenetic Trees and Networks Reduce to Phylogenies on Binary States: Does It Furnish an Explanation to the Robustness of Phylogenetic Trees against Lateral Transfers.

    PubMed

    Thuillard, Marc; Fraix-Burnet, Didier

    2015-01-01

    This article presents an innovative approach to phylogenies based on the reduction of multistate characters to binary-state characters. We show that the reduction to binary characters' approach can be applied to both character- and distance-based phylogenies and provides a unifying framework to explain simply and intuitively the similarities and differences between distance- and character-based phylogenies. Building on these results, this article gives a possible explanation on why phylogenetic trees obtained from a distance matrix or a set of characters are often quite reasonable despite lateral transfers of genetic material between taxa. In the presence of lateral transfers, outer planar networks furnish a better description of evolution than phylogenetic trees. We present a polynomial-time reconstruction algorithm for perfect outer planar networks with a fixed number of states, characters, and lateral transfers.

  4. Documentation for the machine-readable character coded version of the SKYMAP catalogue

    NASA Technical Reports Server (NTRS)

    Warren, W. H., Jr.

    1981-01-01

    The SKYMAP catalogue is a compilation of astronomical data prepared primarily for purposes of attitude guidance for satellites. In addition to the SKYMAP Master Catalogue data base, a software package of data base management and utility programs is available. The tape version of the SKYMAP Catalogue, as received by the Astronomical Data Center (ADC), contains logical records consisting of a combination of binary and EBCDIC data. Certain character coded data in each record are redundant in that the same data are present in binary form. In order to facilitate wider use of all SKYMAP data by the astronomical community, a formatted (character) version was prepared by eliminating all redundant character data and converting all binary data to character form. The character version of the catalogue is described. The document is intended to fully describe the formatted tape so that users can process the data without problems and guesswork; it should be distributed with any character version of the catalogue.

  5. Tse computers. [Chinese pictograph character binary image processor design for high speed applications]

    NASA Technical Reports Server (NTRS)

    Strong, J. P., III

    1973-01-01

    Tse computers have the potential of operating four or five orders of magnitude faster than present digital computers. The computers of the new design use binary images as their basic computational entity. The word 'tse' is the transliteration of the Chinese word for 'pictograph character.' Tse computers are large collections of devices that perform logical operations on binary images. The operations on binary images are to be performed over the entire image simultaneously.

  6. Robust recognition of degraded machine-printed characters using complementary similarity measure and error-correction learning

    NASA Astrophysics Data System (ADS)

    Hagita, Norihiro; Sawaki, Minako

    1995-03-01

    Most conventional methods in character recognition extract geometrical features such as stroke direction, connectivity of strokes, etc., and compare them with reference patterns in a stored dictionary. Unfortunately, geometrical features are easily degraded by blurs, stains and the graphical background designs used in Japanese newspaper headlines. This noise must be removed before recognition commences, but no preprocessing method is completely accurate. This paper proposes a method for recognizing degraded characters and characters printed on graphical background designs. This method is based on the binary image feature method and uses binary images as features. A new similarity measure, called the complementary similarity measure, is used as a discriminant function. It compares the similarity and dissimilarity of binary patterns with reference dictionary patterns. Experiments are conducted using the standard character database ETL-2, which consists of machine-printed Kanji, Hiragana, Katakana, alphanumeric, and special characters. The results show that this method is much more robust against noise than the conventional geometrical feature method. It also achieves high recognition rates of over 92% for characters with textured foregrounds, over 98% for characters with textured backgrounds, over 98% for outline fonts, and over 99% for reverse contrast characters.

  7. Identifying hidden rate changes in the evolution of a binary morphological character: the evolution of plant habit in campanulid angiosperms.

    PubMed

    Beaulieu, Jeremy M; O'Meara, Brian C; Donoghue, Michael J

    2013-09-01

    The growth of phylogenetic trees in scope and in size is promising from the standpoint of understanding a wide variety of evolutionary patterns and processes. With trees comprised of larger, older, and globally distributed clades, it is likely that the lability of a binary character will differ significantly among lineages, which could lead to errors in estimating transition rates and the associated inference of ancestral states. Here we develop and implement a new method for identifying different rates of evolution in a binary character along different branches of a phylogeny. We illustrate this approach by exploring the evolution of growth habit in Campanulidae, a flowering plant clade containing some 35,000 species. The distribution of woody versus herbaceous species calls into question the use of traditional models of binary character evolution. The recognition and accommodation of changes in the rate of growth form evolution in different lineages demonstrates, for the first time, a robust picture of growth form evolution across a very large, very old, and very widespread flowering plant clade.

  8. Identifying uniformly mutated segments within repeats.

    PubMed

    Sahinalp, S Cenk; Eichler, Evan; Goldberg, Paul; Berenbrink, Petra; Friedetzky, Tom; Ergun, Funda

    2004-12-01

    Given a long string of characters from a constant size alphabet we present an algorithm to determine whether its characters have been generated by a single i.i.d. random source. More specifically, consider all possible n-coin models for generating a binary string S, where each bit of S is generated via an independent toss of one of the n coins in the model. The choice of which coin to toss is decided by a random walk on the set of coins where the probability of a coin change is much lower than the probability of using the same coin repeatedly. We present a procedure to evaluate the likelihood of an n-coin model for given S, subject to a uniform prior distribution over the parameters of the model (that represent mutation rates and probabilities of copying events). In the absence of detailed prior knowledge of these parameters, the algorithm can be used to determine whether the a posteriori probability for n=1 is higher than for any other n>1. Our algorithm runs in time O(l^4 log l), where l is the length of S, through a dynamic programming approach which exploits the assumed convexity of the a posteriori probability for n. Our test can be used in the analysis of long alignments between pairs of genomic sequences in a number of ways. For example, functional regions in genome sequences exhibit much lower mutation rates than non-functional regions. Because our test provides means for determining variations in the mutation rate, it may be used to distinguish functional regions from non-functional ones. Another application is in determining whether two highly similar, thus evolutionarily related, genome segments are the result of a single copy event or of a complex series of copy events. This is particularly an issue in evolutionary studies of genome regions rich with repeat segments (especially tandemly repeated segments).

  9. Independent evolution of genomic characters during major metazoan transitions.

    PubMed

    Simakov, Oleg; Kawashima, Takeshi

    2017-07-15

    Metazoan evolution encompasses a vast evolutionary time scale spanning over 600 million years. Our ability to infer ancestral metazoan characters, both morphological and functional, is limited by our understanding of the nature and evolutionary dynamics of the underlying regulatory networks. Increasing coverage of metazoan genomes enables us to identify the evolutionary changes of the relevant genomic characters such as the loss or gain of coding sequences, gene duplications, micro- and macro-synteny, and non-coding element evolution in different lineages. In this review we describe recent advances in our understanding of ancestral metazoan coding and non-coding features, as deduced from genomic comparisons. Some genomic changes such as innovations in gene and linkage content occur at different rates across metazoan clades, suggesting some level of independence among genomic characters. While their contribution to biological innovation remains largely unclear, we review recent literature about certain genomic changes that do correlate with changes to specific developmental pathways and metazoan innovations. In particular, we discuss the origins of the recently described pharyngeal cluster which is conserved across deuterostome genomes, and highlight different genomic features that have contributed to the evolution of this group. We also assess our current capacity to infer ancestral metazoan states from gene models and comparative genomics tools and elaborate on the future directions of metazoan comparative genomics relevant to evo-devo studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  10. Optical character recognition based on nonredundant correlation measurements.

    PubMed

    Braunecker, B; Hauck, R; Lohmann, A W

    1979-08-15

    The essence of character recognition is a comparison between the unknown character and a set of reference patterns. Usually, these reference patterns are all possible characters themselves, the whole alphabet in the case of letter characters. Obviously, N analog measurements are highly redundant, since only K = log2(N) binary decisions are enough to identify one out of N characters. Therefore, we devised K reference patterns accordingly. These patterns, called principal components, are found by digital image processing, but used in an optical analog computer. We will explain the concept of principal components, and we will describe experiments with several optical character recognition systems, based on this concept.
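
    The counting argument in this abstract (K binary decisions identify one out of N characters) can be checked with a short calculation. The sketch below assumes that K is rounded up to the next integer when N is not a power of two, which the abstract does not state explicitly; the function names are hypothetical and the code says nothing about how the optical reference patterns themselves are built.

```python
def binary_decisions_needed(n_characters):
    """Smallest K such that 2**K >= n_characters (assumed rounding up)."""
    if n_characters < 1:
        raise ValueError("need at least one character")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (n_characters - 1).bit_length()


def decision_code(index, n_characters):
    """K-bit binary code identifying character number `index` (0-based)."""
    k = binary_decisions_needed(n_characters)
    return format(index, "0{}b".format(k)) if k else ""


alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
k = binary_decisions_needed(len(alphabet))
print(k)                                   # 5 decisions for 26 letters
print(decision_code(alphabet.index("Z"), len(alphabet)))
```

    For a 26-letter alphabet this gives 5 yes/no measurements instead of 26 separate correlations, which is the redundancy the abstract refers to.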

  11. A Method of Character Detection and Segmentation for Highway Guide Signs

    NASA Astrophysics Data System (ADS)

    Xu, Jiawei; Zhang, Chongyang

    2018-01-01

    In this paper, a method of character detection and segmentation for highway signs in China is proposed. It consists of four steps. Firstly, the highway sign area is detected by colour and geometric features, and the possible character region is obtained by a multi-level projection strategy. Secondly, pseudo-target character regions are removed using the local binary patterns (LBP) feature. Thirdly, a convolutional neural network (CNN) is used to classify target regions. Finally, adaptive projection strategies are used to segment character strings. Experimental results indicate that the proposed method achieves new state-of-the-art results.

  12. A laid-back trip through the Hennigian Forests

    PubMed Central

    2017-01-01

    Background This paper is a comment on the idea of matrix-free Cladistics. Demonstration of this idea’s efficiency is a major goal of the study. Within the proposed framework, the ordinary (phenetic) matrix is necessary only as “source” of Hennigian trees, not as a primary subject of the analysis. Switching from the matrix-based thinking to the matrix-free Cladistic approach clearly reveals that optimizations of the character-state changes are related not to the real processes, but to the form of the data representation. Methods We focused our study on the binary data. We wrote the simple Ruby-based script FORESTER version 1.0 that helps represent a binary matrix as an array of the rooted trees (as a “Hennigian forest”). The binary representations of the genomic (DNA) data have been made by script 1001. The Average Consensus method as well as the standard Maximum Parsimony (MP) approach has been used to analyze the data. Principal findings The binary matrix may be easily re-written as a set of rooted trees (maximal relationships). The latter might be analyzed by the Average Consensus method. Paradoxically, this method, if applied to the Hennigian forests, in principle can help to identify clades despite the absence of the direct evidence from the primary data. Our approach may handle the clock- or non clock-like matrices, as well as the hypothetical, molecular or morphological data. Discussion Our proposal clearly differs from the numerous phenetic alignment-free techniques of the construction of the phylogenetic trees. Dealing with the relations, not with the actual “data” also distinguishes our approach from all optimization-based methods, if the optimization is defined as a way to reconstruct the sequences of the character-state changes on a tree, either the standard alignment-based techniques or the “direct” alignment-free procedure. 
We are not viewing our recent framework as an alternative to the three-taxon statement analysis (3TA), but there are two major differences between our recent proposal and the 3TA, as originally designed and implemented: (1) the 3TA deals with the three-taxon statements or minimal relationships. According to the logic of 3TA, the set of the minimal trees must be established as a binary matrix and used as an input for the parsimony program. In this paper, we operate directly with maximal relationships written just as trees, not as binary matrices, while also using the Average Consensus method instead of the MP analysis. The solely ‘reversal’-based groups can always be found by our method without the separate scoring of the putative reversals before analyses. PMID:28740753
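
    The rewriting of a binary matrix as a set of rooted trees can be sketched as follows. This is an illustration only and not the FORESTER script: it assumes that state 1 marks the derived state, that each informative character yields one rooted tree in which the taxa scored 1 form a single clade, and that the trees are written in Newick notation. The function name and the example taxa are hypothetical.

```python
def hennigian_forest(taxa, matrix):
    """Turn each informative binary character into one rooted Newick tree.

    Assumptions: rows follow the order of `taxa`, columns are characters,
    and taxa scored 1 for a character are grouped into one clade. Characters
    scored 1 in fewer than two taxa, or in all taxa, group nothing and are
    skipped.
    """
    n_chars = len(matrix[0]) if matrix else 0
    trees = []
    for j in range(n_chars):
        inside = [t for t, row in zip(taxa, matrix) if row[j] == 1]
        outside = [t for t, row in zip(taxa, matrix) if row[j] != 1]
        if len(inside) < 2 or not outside:
            continue
        clade = "(" + ",".join(inside) + ")"
        trees.append("(" + ",".join(outside + [clade]) + ");")
    return trees


taxa = ["A", "B", "C", "D"]
matrix = [[0, 0, 0],   # A
          [1, 0, 0],   # B
          [1, 1, 0],   # C
          [1, 1, 1]]   # D

for tree in hennigian_forest(taxa, matrix):
    print(tree)
```

    The resulting list of trees is the kind of object that a consensus method could then take as input in place of the matrix itself.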

  13. Control for Population Structure and Relatedness for Binary Traits in Genetic Association Studies via Logistic Mixed Models

    PubMed Central

    Chen, Han; Wang, Chaolong; Conomos, Matthew P.; Stilp, Adrienne M.; Li, Zilin; Sofer, Tamar; Szpiro, Adam A.; Chen, Wei; Brehm, John M.; Celedón, Juan C.; Redline, Susan; Papanicolaou, George J.; Thornton, Timothy A.; Laurie, Cathy C.; Rice, Kenneth; Lin, Xihong

    2016-01-01

    Linear mixed models (LMMs) are widely used in genome-wide association studies (GWASs) to account for population structure and relatedness, for both continuous and binary traits. Motivated by the failure of LMMs to control type I errors in a GWAS of asthma, a binary trait, we show that LMMs are generally inappropriate for analyzing binary traits when population stratification leads to violation of the LMM’s constant-residual variance assumption. To overcome this problem, we develop a computationally efficient logistic mixed model approach for genome-wide analysis of binary traits, the generalized linear mixed model association test (GMMAT). This approach fits a logistic mixed model once per GWAS and performs score tests under the null hypothesis of no association between a binary trait and individual genetic variants. We show in simulation studies and real data analysis that GMMAT effectively controls for population structure and relatedness when analyzing binary traits in a wide variety of study designs. PMID:27018471

  14. Schizosaccharomyces japonicus: the fission yeast is a fusion of yeast and hyphae.

    PubMed

    Niki, Hironori

    2014-03-01

    The clade of Schizosaccharomyces includes 4 species: S. pombe, S. octosporus, S. cryophilus, and S. japonicus. Although all 4 species exhibit unicellular growth with a binary fission mode of cell division, S. japonicus alone is a dimorphic yeast, which can transit from unicellular yeast to long filamentous hyphae. Recently it was found that the hyphal cells respond to light and then synchronously activate cytokinesis of hyphae. In addition to hyphal growth, S. japonicus has many properties that aren't shared with other fission yeast. Mitosis of S. japonicus is referred to as semi-open mitosis because the dynamics of the nuclear membrane are intermediate between open mitosis and closed mitosis. Novel genetic tools and whole-genome sequencing of S. japonicus now provide us with an opportunity to reveal the unique characters of this dimorphic yeast. © 2013 The Author. Yeast Published by John Wiley & Sons Ltd.

  15. A neural net based architecture for the segmentation of mixed gray-level and binary pictures

    NASA Technical Reports Server (NTRS)

    Tabatabai, Ali; Troudet, Terry P.

    1991-01-01

    A neural-net-based architecture is proposed to perform segmentation in real time for mixed gray-level and binary pictures. In this approach, the composite picture is divided into 16 x 16 pixel blocks, which are identified as character blocks or image blocks on the basis of a dichotomy measure computed by an adaptive 16 x 16 neural net. For compression purposes, each image block is further divided into 4 x 4 subblocks; a one-bit nonparametric quantizer is used to encode 16 x 16 character and 4 x 4 image blocks; and the binary map and quantizer levels are obtained through a neural net segmentor over each block. The efficiency of the neural segmentation in terms of computational speed, data compression, and quality of the compressed picture is demonstrated. The effect of weight quantization is also discussed. VLSI implementations of such adaptive neural nets in CMOS technology are described and simulated in real time for a maximum block size of 256 pixels.

  16. Genome characterization of a novel binary toxin-positive strain of Clostridium difficile and comparison with the epidemic 027 and 078 strains.

    PubMed

    Peng, Zhong; Liu, Sidi; Meng, Xiujuan; Liang, Wan; Xu, Zhuofei; Tang, Biao; Wang, Yuanguo; Duan, Juping; Fu, Chenchao; Wu, Bin; Wu, Anhua; Li, Chunhui

    2017-01-01

    Clostridium difficile is an anaerobic Gram-positive spore-forming gut pathogen that causes antibiotic-associated diarrhea worldwide. A small number of C. difficile strains express the binary toxin (CDT), which is generally found in C. difficile 027 (ST1) and/or 078 (ST11) in clinic. However, we isolated a binary toxin-positive non-027, non-078 C. difficile LC693 that is associated with severe diarrhea in China. The genotype of this strain was determined as ST201. To understand the pathogenesis-basis of C. difficile ST201, the strain LC693 was chosen for whole genome sequencing, and its genome sequence was analyzed together with the other two ST201 strains VL-0104 and VL-0391 and compared to the epidemic 027/ST1 and 078/ST11 strains. The project finally generated an estimated genome size of approximately 4.07 Mbp for strain LC693. Genome size of the three ST201 strains ranged from 4.07 to 4.16 Mb, with an average GC content between 28.5 and 28.9%. Phylogenetic analysis demonstrated that the ST201 strains belonged to clade 3. The ST201 genomes contained more than 40 antibiotic resistance genes and 15 of them were predicted to be associated with vancomycin-resistance. The ST201 strains contained a larger PaLoc with a Tn6218 element inserted than the 027/ST1 and 078/ST11 strains, and encoded a truncated TcdC. In addition, the ST201 strains contained intact binary toxin coding and regulation genes which are highly homologous to the 027/ST1 strain. Genome comparison of the ST201 strains with the epidemic 027 and 078 strain identified 641 genes specific for C. difficile ST201, and a number of them were predicted as fitness and virulence associated genes. The presence of those genes also contributes to the pathogenesis of the ST201 strains. In this study, the genomic characterization of three binary toxin-positive C. difficile ST201 strains in clade 3 was discussed and compared to the genomes of the epidemic 027 and the 078 strains. 
Our analysis identified a number of fitness and virulence associated genes/loci in the ST201 genomes that contribute to the pathogenesis of C. difficile ST201.

  17. Biclustering sparse binary genomic data.

    PubMed

    van Uitert, Miranda; Meuleman, Wouter; Wessels, Lodewyk

    2008-12-01

    Genomic datasets often consist of large, binary, sparse data matrices. In such a dataset, one is often interested in finding contiguous blocks that (mostly) contain ones. This is a biclustering problem, and while many algorithms have been proposed to deal with gene expression data, only two algorithms have been proposed that specifically deal with binary matrices. None of the gene expression biclustering algorithms can handle the large number of zeros in sparse binary matrices. The two proposed binary algorithms failed to produce meaningful results. In this article, we present a new algorithm that is able to extract biclusters from sparse, binary datasets. A powerful feature is that biclusters with different numbers of rows and columns can be detected, varying from many rows to few columns and few rows to many columns. It allows the user to guide the search towards biclusters of specific dimensions. When applying our algorithm to an input matrix derived from TRANSFAC, we find transcription factors with distinctly dissimilar binding motifs, but a clear set of common targets that are significantly enriched for GO categories.

  18. Control for Population Structure and Relatedness for Binary Traits in Genetic Association Studies via Logistic Mixed Models.

    PubMed

    Chen, Han; Wang, Chaolong; Conomos, Matthew P; Stilp, Adrienne M; Li, Zilin; Sofer, Tamar; Szpiro, Adam A; Chen, Wei; Brehm, John M; Celedón, Juan C; Redline, Susan; Papanicolaou, George J; Thornton, Timothy A; Laurie, Cathy C; Rice, Kenneth; Lin, Xihong

    2016-04-07

    Linear mixed models (LMMs) are widely used in genome-wide association studies (GWASs) to account for population structure and relatedness, for both continuous and binary traits. Motivated by the failure of LMMs to control type I errors in a GWAS of asthma, a binary trait, we show that LMMs are generally inappropriate for analyzing binary traits when population stratification leads to violation of the LMM's constant-residual variance assumption. To overcome this problem, we develop a computationally efficient logistic mixed model approach for genome-wide analysis of binary traits, the generalized linear mixed model association test (GMMAT). This approach fits a logistic mixed model once per GWAS and performs score tests under the null hypothesis of no association between a binary trait and individual genetic variants. We show in simulation studies and real data analysis that GMMAT effectively controls for population structure and relatedness when analyzing binary traits in a wide variety of study designs. Copyright © 2016 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  19. Processing Of Binary Images

    NASA Astrophysics Data System (ADS)

    Hou, H. S.

    1985-07-01

    An overview of the recent progress in the area of digital processing of binary images in the context of document processing is presented here. The topics covered include input scan, adaptive thresholding, halftoning, scaling and resolution conversion, data compression, character recognition, electronic mail, digital typography, and output scan. Emphasis has been placed on illustrating the basic principles rather than descriptions of a particular system. Recent technology advances and research in this field are also mentioned.

  20. Structural model constructing for optical handwritten character recognition

    NASA Astrophysics Data System (ADS)

    Khaustov, P. A.; Spitsyn, V. G.; Maksimova, E. I.

    2017-02-01

    The article is devoted to the development of algorithms for optical handwritten character recognition based on the construction of structural models. The main advantage of these algorithms is the low requirement regarding the number of reference images. A one-pass approach to thinning the binary character representation is proposed. This approach is based on the joint use of the Zhang-Suen and Wu-Tsai algorithms. The effectiveness of the proposed approach is confirmed by experimental results. The article includes a detailed description of the steps of the structural model construction algorithm. The proposed algorithm has been implemented in a character processing application and has been tested on the MNIST handwritten character database. Algorithms that can be used when the number of reference images is limited were used for comparison.

  1. Short-Range-Order for fcc-based Binary Alloys Revisited from Microscopic Geometry

    NASA Astrophysics Data System (ADS)

    Yuge, Koretaka

    2018-04-01

    Short-range order (SRO) in disordered alloys is typically interpreted as a competition between the chemical effect of negative (or positive) energy gain from mixing the constituent elements and geometric effects that come from differences in effective atomic radius. Although a number of theoretical approaches can quantitatively estimate SRO at given temperatures, it remains unclear how to systematically understand trends in SRO for binary alloys in terms of geometric character, e.g., the effective atomic radius of the constituents. Since the chemical effect plays a significant role in SRO, it has been believed that purely geometric character cannot capture the SRO trends. Despite these considerations, based on density functional theory (DFT) calculations on 28 fcc-based equiatomic binary alloys, we find that while the conventional Goldschmidt or DFT-based atomic radii of the constituents have no significant correlation with SRO, the atomic radius for a specially selected structure, constructed purely from information about the underlying lattice, can successfully capture the magnitude of SRO. These facts strongly indicate that purely geometric information about the system plays a central role in determining the characteristic disordered structure.

  2. Binary Interval Search: a scalable algorithm for counting interval intersections.

    PubMed

    Layer, Ryan M; Skadron, Kevin; Robins, Gabriel; Hall, Ira M; Quinlan, Aaron R

    2013-01-01

    The comparison of diverse genomic datasets is fundamental to understand genome biology. Researchers must explore many large datasets of genome intervals (e.g. genes, sequence alignments) to place their experimental results in a broader context and to make new discoveries. Relationships between genomic datasets are typically measured by identifying intervals that intersect, that is, they overlap and thus share a common genome interval. Given the continued advances in DNA sequencing technologies, efficient methods for measuring statistically significant relationships between many sets of genomic features are crucial for future discovery. We introduce the Binary Interval Search (BITS) algorithm, a novel and scalable approach to interval set intersection. We demonstrate that BITS outperforms existing methods at counting interval intersections. Moreover, we show that BITS is intrinsically suited to parallel computing architectures, such as graphics processing units by illustrating its utility for efficient Monte Carlo simulations measuring the significance of relationships between sets of genomic intervals. https://github.com/arq5x/bits.
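
    Counting interval intersections with binary searches can be sketched in a few lines. The sketch assumes closed intervals given as (start, end) pairs with start <= end, and it relies on the observation that a database interval misses a query only if it ends before the query starts or starts after the query ends. It illustrates the counting idea and is not the authors' implementation (which is available at the URL above); the function name and sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def count_intersections(queries, database):
    """Count (query, database) pairs that overlap, without enumerating them.

    Assumption: intervals are closed, so (1, 5) and (5, 10) intersect.
    """
    starts = sorted(s for s, _ in database)
    ends = sorted(e for _, e in database)
    n = len(database)
    total = 0
    for q_start, q_end in queries:
        # Database intervals that end strictly before the query starts.
        ending_before = bisect_left(ends, q_start)
        # Database intervals that start strictly after the query ends.
        starting_after = n - bisect_right(starts, q_end)
        # The two groups are disjoint, so the rest must intersect the query.
        total += n - ending_before - starting_after
    return total


database = [(1, 5), (4, 8), (10, 12)]
queries = [(5, 10), (13, 20)]
print(count_intersections(queries, database))
```

    Each query costs two binary searches on pre-sorted arrays, and the queries are independent of one another, which is consistent with the abstract's remark that the approach suits parallel hardware.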

  3. A catalogue of potentially bright close binary gravitational wave sources

    NASA Technical Reports Server (NTRS)

    Webbink, Ronald F.

    1985-01-01

    This is a current print-out of results of a survey, undertaken in the spring of 1985, to identify those known binary stars which might produce significant gravitational wave amplitudes at earth, either dimensionless strain amplitudes exceeding a threshold h = 10(exp -21), or energy fluxes exceeding F = 10(exp -12) erg cm(exp -2) s(exp -1). All real or putative binaries brighter than a certain limiting magnitude (calculated as a function of primary spectral type, orbital period, orbital eccentricity, and bandpass) are included. All double degenerate binaries and Wolf-Rayet binaries with known or suspected orbital periods have also been included. The catalog consists of two parts: a listing of objects in ascending order of Right Ascension (Equinox B1950), followed by an index, listing of objects by identification number according to all major stellar catalogs. The object listing is a print-out of the spreadsheets on which the catalog is currently maintained. It should be noted that the use of this spreadsheet program imposes some limitations on the display of entries. Text entries which exceed the cell size may appear in truncated form, or may run into adjacent columns. Greek characters are not available; they are represented here by the first two or three letters of their Roman names, the first letter appearing as a capital or lower-case letter according to whether the capital or lower-case Greek character is represented. Neither superscripts nor subscripts are available; they appear here in normal position and type-face. The index provides the Right Ascension and Declination of objects sorted by catalogue number.

  4. Binary Interval Search: a scalable algorithm for counting interval intersections

    PubMed Central

    Layer, Ryan M.; Skadron, Kevin; Robins, Gabriel; Hall, Ira M.; Quinlan, Aaron R.

    2013-01-01

    Motivation: The comparison of diverse genomic datasets is fundamental to understand genome biology. Researchers must explore many large datasets of genome intervals (e.g. genes, sequence alignments) to place their experimental results in a broader context and to make new discoveries. Relationships between genomic datasets are typically measured by identifying intervals that intersect, that is, they overlap and thus share a common genome interval. Given the continued advances in DNA sequencing technologies, efficient methods for measuring statistically significant relationships between many sets of genomic features are crucial for future discovery. Results: We introduce the Binary Interval Search (BITS) algorithm, a novel and scalable approach to interval set intersection. We demonstrate that BITS outperforms existing methods at counting interval intersections. Moreover, we show that BITS is intrinsically suited to parallel computing architectures, such as graphics processing units by illustrating its utility for efficient Monte Carlo simulations measuring the significance of relationships between sets of genomic intervals. Availability: https://github.com/arq5x/bits. Contact: arq5x@virginia.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23129298
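
    A minimal sketch of how interval intersections can be counted with binary searches, as the abstract above describes at a high level. The function name, the closed-interval convention, and the tuple representation are assumptions made for illustration; the abstract does not give implementation details, so this should not be read as the BITS code itself. The idea is that a database interval fails to intersect a query only if it ends before the query starts or starts after the query ends, and both counts can be found by binary search in sorted arrays.

```python
from bisect import bisect_left, bisect_right


def count_intersections(queries, database):
    """Count (query, database) pairs of closed intervals [start, end] that overlap.

    Hypothetical helper for illustration; not taken from the BITS source code.
    """
    starts = sorted(start for start, _ in database)
    ends = sorted(end for _, end in database)
    total = 0
    for q_start, q_end in queries:
        # Database intervals that end strictly before the query starts.
        ends_before = bisect_left(ends, q_start)
        # Database intervals that start strictly after the query ends.
        starts_after = len(starts) - bisect_right(starts, q_end)
        # Everything else must overlap the query.
        total += len(database) - ends_before - starts_after
    return total


if __name__ == "__main__":
    db = [(1, 5), (4, 8), (10, 12)]
    print(count_intersections([(5, 9), (13, 20)], db))
```

    Because each query needs only two binary searches and no interval is enumerated, the queries are independent of one another, which is consistent with the abstract's remark about suitability for parallel architectures.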

  5. Phylogenetic Analysis of Genome Rearrangements among Five Mammalian Orders

    PubMed Central

    Luo, Haiwei; Arndt, William; Zhang, Yiwei; Shi, Guanqun; Alekseyev, Max; Tang, Jijun; Hughes, Austin L.; Friedman, Robert

    2015-01-01

    Evolutionary relationships among placental mammalian orders have been controversial. Whole genome sequencing and new computational methods offer opportunities to resolve the relationships among 10 genomes belonging to the mammalian orders Primates, Rodentia, Carnivora, Perissodactyla and Artiodactyla. By application of the double cut and join distance metric, where gene order is the phylogenetic character, we computed genomic distances among the sampled mammalian genomes. With a marsupial outgroup, the gene order tree supported a topology in which Rodentia fell outside the cluster of Primates, Carnivora, Perissodactyla, and Artiodactyla. Results of breakpoint reuse rate and synteny block length analyses were consistent with the prediction of the random breakage model, which provided a diagnostic test to support use of gene order as an appropriate phylogenetic character in this study. We discuss the influence of rate differences among lineages and other factors that may contribute to different resolutions of mammalian ordinal relationships by different methods of phylogenetic reconstruction. PMID:22929217

  6. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    USDA-ARS's Scientific Manuscript database

    Technical Abstract: A strategy for a genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into respective genomes. In this study, nucle...

  7. Homoplastic microinversions and the avian tree of life

    PubMed Central

    2011-01-01

    Background Microinversions are cytologically undetectable inversions of DNA sequences that accumulate slowly in genomes. Like many other rare genomic changes (RGCs), microinversions are thought to be virtually homoplasy-free evolutionary characters, suggesting that they may be very useful for difficult phylogenetic problems such as the avian tree of life. However, few detailed surveys of these genomic rearrangements have been conducted, making it difficult to assess this hypothesis or understand the impact of microinversions upon genome evolution. Results We surveyed non-coding sequence data from a recent avian phylogenetic study and found substantially more microinversions than expected based upon prior information about vertebrate inversion rates, although this is likely due to underestimation of these rates in previous studies. Most microinversions were lineage-specific or united well-accepted groups. However, some homoplastic microinversions were evident among the informative characters. Hemiplasy, which reflects differences between gene trees and the species tree, did not explain the observed homoplasy. Two specific loci were microinversion hotspots, with high numbers of inversions that included both the homoplastic as well as some overlapping microinversions. Neither stem-loop structures nor detectable sequence motifs were associated with microinversions in the hotspots. Conclusions Microinversions can provide valuable phylogenetic information, although power analysis indicates that large amounts of sequence data will be necessary to identify enough inversions (and similar RGCs) to resolve short branches in the tree of life. Moreover, microinversions are not perfect characters and should be interpreted with caution, just as with any other character type. Independent of their use for phylogenetic analyses, microinversions are important because they have the potential to complicate alignment of non-coding sequences. 
    Despite their low rate of accumulation, they have clearly contributed to genome evolution, suggesting that active identification of microinversions will prove useful in future phylogenomic studies. PMID:21612607

  8. Genomic selection for quantitative adult plant stem rust resistance in wheat

    USDA-ARS's Scientific Manuscript database

    Quantitative adult plant resistance (APR) to stem rust (Puccinia graminis f. sp. tritici) is an important breeding target in wheat (Triticum aestivum L.) and a potential target for genomic selection (GS). To evaluate the relative importance of known APR loci in applying genomic selection, we charact...

  9. Methodology for the Evaluation of the Algorithms for Text Line Segmentation Based on Extended Binary Classification

    NASA Astrophysics Data System (ADS)

    Brodic, D.

    2011-01-01

    Text line segmentation represents the key element in the optical character recognition process. Hence, testing of text line segmentation algorithms has substantial relevance. All previously proposed testing methods deal mainly with text database as a template. They are used for testing as well as for the evaluation of the text segmentation algorithm. In this manuscript, methodology for the evaluation of the algorithm for text segmentation based on extended binary classification is proposed. It is established on the various multiline text samples linked with text segmentation. Their results are distributed according to binary classification. Final result is obtained by comparative analysis of cross linked data. At the end, its suitability for different types of scripts represents its main advantage.

  10. Fractional Gaussian noise-enhanced information capacity of a nonlinear neuron model with binary signal input

    NASA Astrophysics Data System (ADS)

    Gao, Feng-Yin; Kang, Yan-Mei; Chen, Xi; Chen, Guanrong

    2018-05-01

    This paper reveals the effect of fractional Gaussian noise with Hurst exponent H ∈ (1/2, 1) on the information capacity of a general nonlinear neuron model with binary signal input. The fGn and its corresponding fractional Brownian motion exhibit long-range, strong-dependent increments. It extends standard Brownian motion to many types of fractional processes found in nature, such as the synaptic noise. In the paper, for the subthreshold binary signal, sufficient conditions are given based on the "forbidden interval" theorem to guarantee the occurrence of stochastic resonance, while for the suprathreshold binary signal, the simulated results show that additive fGn with Hurst exponent H ∈ (1/2, 1) could increase the mutual information or bits count. The investigation indicated that the synaptic noise with the characters of long-range dependence and self-similarity might be the driving factor for the efficient encoding and decoding of the nervous system.

  11. Effect of Teosinte Cytoplasmic Genomes on Maize Phenotype

    PubMed Central

    Allen, James O.

    2005-01-01

    Determining the contribution of organelle genes to plant phenotype is hampered by several factors, including the paucity of variation in the plastid and mitochondrial genomes. To circumvent this problem, evolutionary divergence between maize (Zea mays ssp. mays) and the teosintes, its closest relatives, was utilized as a source of cytoplasmic genetic variation. Maize lines in which the maize organelle genomes were replaced through serial backcrossing by those representing the entire genus, yielding alloplasmic sublines, or cytolines were created. To avoid the confounding effects of segregating nuclear alleles, an inbred maize line was utilized. Cytolines with Z. mays teosinte cytoplasms were generally indistinguishable from maize. However, cytolines with cytoplasm from the more distantly related Z. luxurians, Z. diploperennis, or Z. perennis exhibited a plethora of differences in growth, development, morphology, and function. Significant differences were observed for 56 of the 58 characters studied. Each cytoline was significantly different from the inbred line for most characters. For a given character, variation was often greater among cytolines having cytoplasms from the same species than among those from different species. The characters differed largely independently of each other. These results suggest that the cytoplasm contributes significantly to a large proportion of plant traits and that many of the organelle genes are phenotypically important. PMID:15731518

  12. Genetic effect of the Aegilops caudata plasmon on the manifestation of the Ae. cylindrica genome.

    PubMed

    Tsunewaki, Koichiro; Mori, Naoki; Takumi, Shigeo

    2014-01-01

    In the course of reconstructing Aegilops caudata from its own genome (CC) and its plasmon, which had passed half a century in common wheat (genome AABBDD), we produced alloplasmic Ae. cylindrica (genome CCDD) with the plasmon of Ae. caudata. This line, designated (caudata)-CCDD, was found to express male sterility in its second substitution backcross generation (SB2) of (caudata)-AABBCCDD pollinated three times with the Ae. cylindrica pollen. We repeatedly backcrossed these SB2 plants with the Ae. cylindrica pollen until the SB5 generation, and SB5F2 progeny were produced by self-pollination of the SB5 plants. Thirteen morphological and physiological characters, including pollen and seed fertilities, of the (caudata)-CCDD SB5F2 were compared with those of the euplasmic Ae. cylindrica. The results indicated that the male sterility expressed by (caudata)-CCDD was due to genetic incompatibility between the Ae. cylindrica genome and Ae. caudata plasmon that did not affect any other characters of Ae. cylindrica. Also, we report that the genome integrity functions in keeping the univalent transmission rate high.

  13. Template protection and its implementation in 3D face recognition systems

    NASA Astrophysics Data System (ADS)

    Zhou, Xuebing

    2007-04-01

    As biometric recognition systems are widely applied in various application areas, security and privacy risks have recently attracted the attention of the biometric community. Template protection techniques prevent stored reference data from revealing private biometric information and enhance the security of biometrics systems against attacks such as identity theft and cross matching. This paper concentrates on a template protection algorithm that merges methods from cryptography, error correction coding and biometrics. The key component of the algorithm is to convert biometric templates into binary vectors. It is shown that the binary vectors should be robust, uniformly distributed, statistically independent and collision-free so that authentication performance can be optimized and information leakage can be avoided. Depending on statistical character of the biometric template, different approaches for transforming biometric templates into compact binary vectors are presented. The proposed methods are integrated into a 3D face recognition system and tested on the 3D facial images of the FRGC database. It is shown that the resulting binary vectors provide an authentication performance that is similar to the original 3D face templates. A high security level is achieved with reasonable false acceptance and false rejection rates of the system, based on an efficient statistical analysis. The algorithm estimates the statistical character of biometric templates from a number of biometric samples in the enrollment database. For the FRGC 3D face database, the small distinction of robustness and discriminative power between the classification results under the assumption of uniquely distributed templates and the ones under the assumption of Gaussian distributed templates is shown in our tests.

  14. The Mitochondrial Genome and Transcriptome of the Basal Dinoflagellate Hematodinium sp.: Character Evolution within the Highly Derived Mitochondrial Genomes of Dinoflagellates

    PubMed Central

    Gornik, S. G.; Waller, R. F.

    2012-01-01

    The sister phyla dinoflagellates and apicomplexans inherited a drastically reduced mitochondrial genome (mitochondrial DNA, mtDNA) containing only three protein-coding (cob, cox1, and cox3) genes and two ribosomal RNA (rRNA) genes. In apicomplexans, single copies of these genes are encoded on the smallest known mtDNA chromosome (6 kb). In dinoflagellates, however, the genome has undergone further substantial modifications, including massive genome amplification and recombination resulting in multiple copies of each gene and gene fragments linked in numerous combinations. Furthermore, protein-encoding genes have lost standard stop codons, trans-splicing of messenger RNAs (mRNAs) is required to generate complete cox3 transcripts, and extensive RNA editing recodes most genes. From taxa investigated to date, it is unclear when many of these unusual dinoflagellate mtDNA characters evolved. To address this question, we investigated the mitochondrial genome and transcriptome character states of the deep branching dinoflagellate Hematodinium sp. Genomic data show that like later-branching dinoflagellates Hematodinium sp. also contains an inflated, heavily recombined genome of multicopy genes and gene fragments. Although stop codons are also lacking for cox1 and cob, cox3 still encodes a conventional stop codon. Extensive editing of mRNAs also occurs in Hematodinium sp. The mtDNA of basal dinoflagellate Hematodinium sp. indicates that much of the mtDNA modification in dinoflagellates occurred early in this lineage, including genome amplification and recombination, and decreased use of standard stop codons. Trans-splicing, on the other hand, occurred after Hematodinium sp. diverged. Only RNA editing presents a nonlinear pattern of evolution in dinoflagellates as this process occurs in Hematodinium sp. 
    but is absent in some later-branching taxa indicating that this process was either lost in some lineages or developed more than once during the evolution of the highly unusual dinoflagellate mtDNA. PMID:22113794

  15. The mitochondrial genome and transcriptome of the basal dinoflagellate Hematodinium sp.: character evolution within the highly derived mitochondrial genomes of dinoflagellates.

    PubMed

    Jackson, C J; Gornik, S G; Waller, R F

    2012-01-01

    The sister phyla dinoflagellates and apicomplexans inherited a drastically reduced mitochondrial genome (mitochondrial DNA, mtDNA) containing only three protein-coding (cob, cox1, and cox3) genes and two ribosomal RNA (rRNA) genes. In apicomplexans, single copies of these genes are encoded on the smallest known mtDNA chromosome (6 kb). In dinoflagellates, however, the genome has undergone further substantial modifications, including massive genome amplification and recombination resulting in multiple copies of each gene and gene fragments linked in numerous combinations. Furthermore, protein-encoding genes have lost standard stop codons, trans-splicing of messenger RNAs (mRNAs) is required to generate complete cox3 transcripts, and extensive RNA editing recodes most genes. From taxa investigated to date, it is unclear when many of these unusual dinoflagellate mtDNA characters evolved. To address this question, we investigated the mitochondrial genome and transcriptome character states of the deep branching dinoflagellate Hematodinium sp. Genomic data show that like later-branching dinoflagellates Hematodinium sp. also contains an inflated, heavily recombined genome of multicopy genes and gene fragments. Although stop codons are also lacking for cox1 and cob, cox3 still encodes a conventional stop codon. Extensive editing of mRNAs also occurs in Hematodinium sp. The mtDNA of basal dinoflagellate Hematodinium sp. indicates that much of the mtDNA modification in dinoflagellates occurred early in this lineage, including genome amplification and recombination, and decreased use of standard stop codons. Trans-splicing, on the other hand, occurred after Hematodinium sp. diverged. Only RNA editing presents a nonlinear pattern of evolution in dinoflagellates as this process occurs in Hematodinium sp. 
    but is absent in some later-branching taxa indicating that this process was either lost in some lineages or developed more than once during the evolution of the highly unusual dinoflagellate mtDNA.

  16. Applying Agrep to r-NSA to solve multiple sequences approximate matching.

    PubMed

    Ni, Bing; Wong, Man-Hon; Lam, Chi-Fai David; Leung, Kwong-Sak

    2014-01-01

    This paper addresses the approximate matching problem in a database consisting of multiple DNA sequences, where the proposed approach applies Agrep to a new truncated suffix array, r-NSA. The construction time of the structure is linear to the database size, and the computations of indexing a substring in the structure are constant. The number of characters processed in applying Agrep is analysed theoretically, and the theoretical upper-bound can approximate closely the empirical number of characters, which is obtained through enumerating the characters in the actual structure built. Experiments are carried out using (synthetic) random DNA sequences, as well as (real) genome sequences including Hepatitis-B Virus and X-chromosome. Experimental results show that, compared to the straight-forward approach that applies Agrep to multiple sequences individually, the proposed approach solves the matching problem in much shorter time. The speed-up of our approach depends on the sequence patterns, and for highly similar homologous genome sequences, which are the common cases in real-life genomes, it can be up to several orders of magnitude.

  17. Lateral gene transfers have polished animal genomes: lessons from nematodes

    PubMed Central

    Danchin, Etienne G. J.; Rosso, Marie-Noëlle

    2012-01-01

    It is now accepted that lateral gene transfers (LGT) have significantly contributed to the composition of bacterial genomes. The amplitude of the phenomenon is considered so high in prokaryotes that it challenges the traditional view of a binary hierarchical tree of life to correctly represent the evolutionary history of species. Given the plethora of transfers between prokaryotes, it is currently impossible to infer the last common ancestral gene set for any extant species. For this ensemble of reasons, it has been proposed that the Darwinian binary tree of life may be inappropriate to correctly reflect the actual relations between species, at least in prokaryotes. In contrast, the contribution of LGT to the composition of animal genomes is less documented. In the light of recent analyses that reported series of LGT events in nematodes, we discuss the importance of this phenomenon in the evolutionary history and in the current composition of an animal genome. Far from being neutral, it appears that besides having contributed to nematode genome contents, LGT have favored the emergence of important traits such as plant-parasitism. PMID:22919619

  18. TRACTS: a program to map oligopurine.oligopyrimidine and other binary DNA tracts

    PubMed Central

    Gal, Moshe; Katz, Tzvi; Ovadia, Amir; Yagil, Gad

    2003-01-01

    A program to map the locations and frequencies of DNA tracts composed of only two bases (‘Binary DNA’) is described. The program, TRACTS (URL http://bioportal.weizmann.ac.il/tracts/tracts.html and/or http://bip.weizmann.ac.il/miwbin/servers/tracts) is of interest because long tracts composed of only two bases are highly over-represented in most genomes. In eukaryotes, oligopurine.oligopyrimidine tracts (‘R.Y tracts’) are found in the highest excess. In prokaryotes, W tracts predominate (A,T ‘rich’). A pre-program, ANEX, parses database annotation files of GenBank and EMBL, to produce a convenient one-line list of every gene (exon, intron) in a genome. The main unit lists and analyzes tracts of the three possible binary pairs (R.Y, K.M and S.W). As an example, the results of R.Y tract mapping of mammalian gene p53 are described. PMID:12824393
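
    To illustrate what a two-base tract is, the sketch below scans one strand for maximal runs drawn from a single IUPAC two-base class (R = A/G, Y = C/T, K = G/T, M = A/C, S = C/G, W = A/T). The function name, the default minimum length, and the output format are assumptions for illustration only; the abstract does not state the thresholds or algorithm that TRACTS itself uses.

```python
import re

# Standard IUPAC codes for the six two-base classes.
CLASSES = {"R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT"}


def find_tracts(seq, iupac_class, min_len=10):
    """Return (start, end, tract) for maximal runs of one two-base class.

    Coordinates are 0-based and end-exclusive. Hypothetical helper; the
    min_len default is an arbitrary illustrative value.
    """
    bases = CLASSES[iupac_class]
    # The greedy quantifier makes each match a maximal run of the two bases.
    pattern = re.compile("[%s]{%d,}" % (bases, min_len))
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_tracts("CCAGGAAGTTTT", "R", min_len=4))
```

    A purine (R) run on one strand pairs with a pyrimidine (Y) run on the complementary strand, so scanning a single strand for both R and Y runs covers the R.Y case.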

  19. Adhesion and friction of iron-base binary alloys in contact with silicon carbide in vacuum

    NASA Technical Reports Server (NTRS)

    Miyoshi, K.; Buckley, D. H.

    1980-01-01

    Single pass sliding friction experiments were conducted with various iron base binary alloys (alloying elements were Ti, Cr, Mn, Ni, Rh, and W) in contact with a single crystal silicon carbide /0001/ surface in vacuum. Results indicate that atomic size and concentration of alloying elements play an important role in controlling adhesion and friction properties of iron base binary alloys. The coefficient of friction generally increases with an increase in solute concentration. The coefficient of friction increases linearly as the solute to iron atomic radius ratio increases or decreases from unity. The chemical activity of the alloying elements was also an important parameter in controlling adhesion and friction of alloys, as these latter properties are highly dependent upon the d bond character of the elements.

  20. A simplified approach to construct infectious cDNA clones of a tobamovirus in a binary vector.

    PubMed

    Junqueira, Bruna Rayane Teodoro; Nicolini, Cícero; Lucinda, Natalia; Orílio, Anelise Franco; Nagata, Tatsuya

    2014-03-01

    Infectious cDNA clones of RNA viruses are important tools to study molecular processes such as replication and host-virus interactions. However, the cloning steps necessary for construction of cDNAs of viral RNA genomes in binary vectors are generally laborious. In this study, a simplified method of producing an agro-infectious Pepper mild mottle virus (PMMoV) clone is described in detail. Initially, the complete genome of PMMoV was amplified by a single-step RT-PCR, cloned, and subcloned into a small plasmid vector under the T7 RNA polymerase promoter to confirm the infectivity of the cDNA clone through transcript inoculation. The complete genome was then transferred to a binary vector using a single-step, overlap-extension PCR. The selected clones were agro-infiltrated to Nicotiana benthamiana plants and shown to be infectious, causing typical PMMoV symptoms. No differences in host responses were observed when the wild-type PMMoV isolate, the T7 RNA polymerase-derived transcripts and the agroinfiltration-derived viruses were inoculated to N. benthamiana, Capsicum chinense PI 159236 and Capsicum annuum plants. Copyright © 2013 Elsevier B.V. All rights reserved.

  1. Visually testing the dynamic character of a blazed-angle adjustable grating by digital holographic microscopy.

    PubMed

    Qin, Chuan; Zhao, Jianlin; Di, Jianglei; Wang, Le; Yu, Yiting; Yuan, Weizheng

    2009-02-10

    We employed digital holographic microscopy to visually test microoptoelectromechanical systems (MOEMS). The sample is a blazed-angle adjustable grating. Considering the periodic structure of the sample, a local area unwrapping method based on a binary template was adopted to demodulate the fringes obtained by referring to a reference hologram. A series of holograms at different deformation states due to different drive voltages were captured to analyze the dynamic character of the MOEMS, and the uniformity of different microcantilever beams was also inspected. The results show this testing method is effective for a periodic structure.

  2. On defining a unique phylogenetic tree with homoplastic characters.

    PubMed

    Goloboff, Pablo A; Wilkinson, Mark

    2018-05-01

    This paper discusses the problem of whether creating a matrix with all the character state combinations that have a fixed number of steps (or extra steps) on a given tree T, produces the same tree T when analyzed with maximum parsimony or maximum likelihood. Exhaustive enumeration of cases up to 20 taxa for binary characters, and up to 12 taxa for 4-state characters, shows that the same tree is recovered (as unique most likely or most parsimonious tree) as long as the number of extra steps is within 1/4 of the number of taxa. This dependence, 1/4 of the number of taxa, is discussed with a general argumentation, in terms of the spread of the character changes on the tree used to select character state distributions. The present finding allows creating matrices which have as much homoplasy as possible for the most parsimonious or likely tree to be predictable, and examination of these matrices with hill-climbing search algorithms provides additional evidence on the (lack of a) necessary relationship between homoplasy and the ability of search methods to find optimal trees. Copyright © 2018 Elsevier Inc. All rights reserved.

  3. A New Experiment on Bengali Character Recognition

    NASA Astrophysics Data System (ADS)

    Barman, Sumana; Bhattacharyya, Debnath; Jeon, Seung-Whan; Kim, Tai-Hoon; Kim, Haeng-Kon

    This paper presents a method to use a view-based approach in a Bangla Optical Character Recognition (OCR) system, providing a reduced data set to the ANN classification engine rather than the traditional OCR methods. It describes how Bangla characters are processed, trained and then recognized with the use of a backpropagation artificial neural network. This is the first published account of using a segmentation-free optical character recognition system for Bangla using a view-based approach. The methodology presented here assumes that the OCR pre-processor has presented the input images to the classification engine described here. The size and the font face used to render the characters are also significant in both training and classification. The images are first converted into greyscale and then to binary images; these images are then scaled to fit a pre-determined area with a fixed but significant number of pixels. The feature vectors are then formed by extracting the characteristic points, which in this case are simply a series of 0s and 1s of fixed length. Finally, an artificial neural network is chosen for the training and classification process.
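
    The preprocessing chain described above (greyscale, then binary, then scaling to a fixed area, then a fixed-length vector of 0s and 1s) can be sketched roughly as follows. The threshold of 128, the nearest-neighbour scaling, the choice of 1 for dark pixels, and the function name are all assumptions for illustration; the abstract does not specify how the authors binarize or scale their images.

```python
def to_feature_vector(gray, size=(4, 4), threshold=128):
    """Binarize a greyscale image and scale it to a fixed-length 0/1 vector.

    gray is a list of rows of 0-255 intensities. Hypothetical helper; the
    threshold and nearest-neighbour sampling are illustrative assumptions.
    """
    rows, cols = len(gray), len(gray[0])
    out_h, out_w = size
    vector = []
    for r in range(out_h):
        # Nearest-neighbour sampling: map each output row to a source row.
        src_r = r * rows // out_h
        for c in range(out_w):
            src_c = c * cols // out_w
            # Dark pixels (ink) become 1, light pixels (background) become 0.
            vector.append(1 if gray[src_r][src_c] < threshold else 0)
    return vector


if __name__ == "__main__":
    image = [[0, 255], [255, 0]]
    print(to_feature_vector(image, size=(2, 2)))
```

    Whatever the input image size, the output always has out_h * out_w entries, which is what allows it to feed a network with a fixed number of input units.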

  4. A binary search approach to whole-genome data analysis.

    PubMed

    Brodsky, Leonid; Kogan, Simon; Benjacob, Eshel; Nevo, Eviatar

    2010-09-28

    A sequence analysis-oriented binary search-like algorithm was transformed to a sensitive and accurate analysis tool for processing whole-genome data. The advantage of the algorithm over previous methods is its ability to detect the margins of both short and long genome fragments, enriched by up-regulated signals, at equal accuracy. The score of an enriched genome fragment reflects the difference between the actual concentration of up-regulated signals in the fragment and the chromosome signal baseline. The "divide-and-conquer"-type algorithm detects a series of nonintersecting fragments of various lengths with locally optimal scores. The procedure is applied to detected fragments in a nested manner by recalculating the lower-than-baseline signals in the chromosome. The algorithm was applied to simulated whole-genome data, and its sensitivity/specificity were compared with those of several alternative algorithms. The algorithm was also tested with four biological tiling array datasets comprising Arabidopsis (i) expression and (ii) histone 3 lysine 27 trimethylation CHIP-on-chip datasets; Saccharomyces cerevisiae (iii) spliced intron data and (iv) chromatin remodeling factor binding sites. The analyses' results demonstrate the power of the algorithm in identifying both the short up-regulated fragments (such as exons and transcription factor binding sites) and the long--even moderately up-regulated zones--at their precise genome margins. The algorithm generates an accurate whole-genome landscape that could be used for cross-comparison of signals across the same genome in evolutionary and general genomic studies.

  5. CoGI: Towards Compressing Genomes as an Image.

    PubMed

    Xie, Xiaojing; Zhou, Shuigeng; Guan, Jihong

    2015-01-01

    Genomic science is now facing an explosive increase of data thanks to the fast development of sequencing technology. This situation poses serious challenges to genomic data storage and transferring. It is desirable to compress data to reduce storage and transferring cost, and thus to boost data distribution and utilization efficiency. Up to now, a number of algorithms / tools have been developed for compressing genomic sequences. Unlike the existing algorithms, most of which treat genomes as one-dimensional text strings and compress them based on dictionaries or probability models, this paper proposes a novel approach called CoGI (the abbreviation of Compressing Genomes as an Image) for genome compression, which transforms the genomic sequences to a two-dimensional binary image (or bitmap), then applies a rectangular partition coding algorithm to compress the binary image. CoGI can be used as either a reference-based compressor or a reference-free compressor. For the former, we develop two entropy-based algorithms to select a proper reference genome. Performance evaluation is conducted on various genomes. Experimental results show that the reference-based CoGI significantly outperforms two state-of-the-art reference-based genome compressors GReEn and RLZ-opt in both compression ratio and compression efficiency. It also achieves comparable compression ratio but two orders of magnitude higher compression efficiency in comparison with XM--one state-of-the-art reference-free genome compressor. Furthermore, our approach performs much better than Gzip--a general-purpose and widely-used compressor, in both compression speed and compression ratio. So, CoGI can serve as an effective and practical genome compressor. The source code and other related documents of CoGI are available at: http://admis.fudan.edu.cn/projects/cogi.htm.

  6. DiscML: an R package for estimating evolutionary rates of discrete characters using maximum likelihood.

    PubMed

    Kim, Tane; Hao, Weilong

    2014-09-27

    The study of discrete characters is crucial for the understanding of evolutionary processes. Even though great advances have been made in the analysis of nucleotide sequences, computer programs for non-DNA discrete characters are often dedicated to specific analyses and lack flexibility. Discrete characters often have different transition rate matrices, variable rates among sites and sometimes contain unobservable states. To obtain the ability to accurately estimate a variety of discrete characters, programs with sophisticated methodologies and flexible settings are desired. DiscML performs maximum likelihood estimation for evolutionary rates of discrete characters on a provided phylogeny with the options that correct for unobservable data, rate variations, and unknown prior root probabilities from the empirical data. It gives users options to customize the instantaneous transition rate matrices, or to choose pre-determined matrices from models such as birth-and-death (BD), birth-death-and-innovation (BDI), equal rates (ER), symmetric (SYM), general time-reversible (GTR) and all rates different (ARD). Moreover, we show application examples of DiscML on gene family data and on intron presence/absence data. DiscML was developed as a unified R program for estimating evolutionary rates of discrete characters with no restriction on the number of character states, and with flexibility to use different transition models. DiscML is ideal for the analyses of binary (1s/0s) patterns, multi-gene families, and multistate discrete morphological characteristics.

  7. Effect of pattern complexity on the visual span for Chinese and alphabet characters

    PubMed Central

    Wang, Hui; He, Xuanzi; Legge, Gordon E.

    2014-01-01

    The visual span for reading is the number of letters that can be recognized without moving the eyes and is hypothesized to impose a sensory limitation on reading speed. Factors affecting the size of the visual span have been studied using alphabet letters. There may be common constraints applying to recognition of other scripts. The aim of this study was to extend the concept of the visual span to Chinese characters and to examine the effect of the greater complexity of these characters. We measured visual spans for Chinese characters and alphabet letters in the central vision of bilingual subjects. Perimetric complexity was used as a metric to quantify the pattern complexity of binary character images. The visual span tests were conducted with four sets of stimuli differing in complexity—lowercase alphabet letters and three groups of Chinese characters. We found that the size of visual spans decreased with increasing complexity, ranging from 10.5 characters for alphabet letters to 4.5 characters for the most complex Chinese characters studied. A decomposition analysis revealed that crowding was the dominant factor limiting the size of the visual span, and the amount of crowding increased with complexity. Errors in the spatial arrangement of characters (mislocations) had a secondary effect. We conclude that pattern complexity has a major effect on the size of the visual span, mediated in large part by crowding. Measuring the visual span for Chinese characters is likely to have high relevance to understanding visual constraints on Chinese reading performance. PMID:24993020

  8. Analysis of binary responses with outcome-specific misclassification probability in genome-wide association studies.

    PubMed

    Rekaya, Romdhane; Smith, Shannon; Hay, El Hamidi; Farhat, Nourhene; Aggrey, Samuel E

    2016-01-01

    Errors in the binary status of some response traits are frequent in human, animal, and plant applications. These error rates tend to differ between cases and controls because diagnostic and screening tests have different sensitivity and specificity. This increases the inaccuracies of classifying individuals into correct groups, giving rise to both false-positive and false-negative cases. The analysis of these noisy binary responses due to misclassification will undoubtedly reduce the statistical power of genome-wide association studies (GWAS). A threshold model that accommodates varying diagnostic errors between cases and controls was investigated. A simulation study was carried out where several binary data sets (case-control) were generated with varying effects for the most influential single nucleotide polymorphisms (SNPs) and different diagnostic error rates for cases and controls. Each simulated data set consisted of 2000 individuals. Ignoring misclassification resulted in biased estimates of true influential SNP effects and inflated estimates for true noninfluential markers. A substantial reduction in bias and increase in accuracy ranging from 12% to 32% was observed when the misclassification procedure was invoked. In fact, the majority of influential SNPs that were not identified using the noisy data were captured using the proposed method. Additionally, truly misclassified binary records were identified with high probability using the proposed method. The superiority of the proposed method was maintained across different simulation parameters (misclassification rates and odds ratios) attesting to its robustness.
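
    To make the effect of outcome-specific misclassification concrete, the sketch below applies the elementary mixing identity: the probability of being recorded as a case is the true case probability times the chance a case is recorded correctly, plus the true control probability times the chance a control is mislabeled. It is not the threshold model of the paper. The names `err_case` and `err_control` and all numeric values are made up for illustration.

```python
def observed_case_probability(p_true, err_case, err_control):
    """Probability that an individual is *recorded* as a case.

    err_case: probability a true case is recorded as a control.
    err_control: probability a true control is recorded as a case.
    (Both names are hypothetical.)
    """
    return p_true * (1.0 - err_case) + (1.0 - p_true) * err_control


def odds(p):
    return p / (1.0 - p)


# Made-up true case probabilities for carriers and non-carriers of a SNP allele.
p_carrier, p_noncarrier = 0.4, 0.2
true_or = odds(p_carrier) / odds(p_noncarrier)

obs_carrier = observed_case_probability(p_carrier, err_case=0.10, err_control=0.05)
obs_noncarrier = observed_case_probability(p_noncarrier, err_case=0.10, err_control=0.05)
observed_or = odds(obs_carrier) / odds(obs_noncarrier)

print(f"true odds ratio: {true_or:.3f}, observed odds ratio: {observed_or:.3f}")
```

    In this toy setting the observed odds ratio is pulled toward 1, which is one simple way to see why ignoring misclassification can bias SNP effect estimates.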

  9. Hable con Ella (Talk to Her) through the lens of gender.

    PubMed

    Yanof, Judith A

    2008-04-01

    In the 2002 film Hable con Ella (Talk to Her), Spanish writer-director Pedro Almodóvar plays with the ambiguity of gender, transcending conventional assumptions about "masculinity" and "femininity." Each of the four main characters holds complex, varied, and, in some cases, gender-bending gender identifications. The theme of gender plasticity is a prominent motif in this film. However, underlying the narrative, there is also a perverse subtext that relies on rigidly binary gender stereotypes to define relationships between men and women. Both these views of gender, which operate dialectically, create a complex tapestry through which Almodóvar explores his characters' problems in attaining intimacy.

  10. Association mapping of agro-morphological characters among the global collection of finger millet genotypes using genomic SSR markers.

    PubMed

    Kalyana Babu, B; Agrawal, P K; Pandey, Dinesh; Jaiswal, J P; Kumar, Anil

    2014-08-01

    Identification of alleles responsible for various agro-morphological characters is a major concern to further improve the finger millet germplasm. Forty-six genomic SSRs were used for genetic analysis and population structure analysis of a global collection of 190 finger millet genotypes, and fifteen agro-morphological characters were evaluated. The overall results showed that Asian genotypes were shorter in height, with smaller flag leaf length, lower basal tiller number, earlier flowering and maturity, smaller ear head length, and a shorter longest finger. The 46 SSRs yielded 90 scorable alleles, and the polymorphism information content values varied from 0.292 to 0.703 with an average of 0.442. The gene diversity was in the range of 0.355 to 0.750 with an average value of 0.528. The 46 genomic SSR loci grouped the 190 finger millet genotypes into two major clusters based on their geographical origin, by both phylogenetic clustering and population structure analysis with the STRUCTURE software. Association mapping of QTLs for 15 agro-morphological characters with 46 genomic SSRs identified five markers linked to QTLs of four traits at significance thresholds (P) of ≤ 0.01 and ≤ 0.001. The QTL for basal tiller number was strongly associated with the locus UGEP81 at a P value of 0.001, explaining 10.8% of the phenotypic variance (R²). The QTL for days to 50% flowering was linked to two SSR loci, UGEP77 and UGEP90, which explained 10% and 8.7% of R², respectively, at a P value of 0.01. The SSR marker FM9 was strongly associated with two agro-morphological traits, flag leaf width (P = 0.001, R² = 14.1%) and plant height (P = 0.001, R² = 11.2%). The markers linked to the QTLs for the above agro-morphological characters can be used for cloning of the full-length genes, fine mapping, and marker-assisted breeding programmes for introgression of alleles into locally well-adapted germplasm.
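
    The abstract reports polymorphism information content (PIC) and gene diversity per marker without stating the formulas. The sketch below assumes the commonly used definitions, namely gene diversity as one minus the sum of squared allele frequencies, and PIC as gene diversity minus the sum over allele pairs of 2·p_i²·p_j². Whether the authors used exactly these definitions is an assumption, and the allele frequencies are made up.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC under the commonly used definition (assumed, not from the abstract)."""
    pair_term = sum(2.0 * p * p * q * q for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - pair_term


# Made-up allele frequencies at a single SSR locus (must sum to 1).
allele_freqs = [0.5, 0.3, 0.2]
print(gene_diversity(allele_freqs))
print(polymorphism_information_content(allele_freqs))
```

    Under these definitions PIC never exceeds gene diversity, which is consistent with the reported averages (0.442 versus 0.528).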

  11. When outgroups fail; phylogenomics of rooting the emerging pathogen, Coxiella burnetii.

    PubMed

    Pearson, Talima; Hornstra, Heidie M; Sahl, Jason W; Schaack, Sarah; Schupp, James M; Beckstrom-Sternberg, Stephen M; O'Neill, Matthew W; Priestley, Rachael A; Champion, Mia D; Beckstrom-Sternberg, James S; Kersh, Gilbert J; Samuel, James E; Massung, Robert F; Keim, Paul

    2013-09-01

    Rooting phylogenies is critical for understanding evolution, yet the importance, intricacies and difficulties of rooting are often overlooked. For rooting, polymorphic characters among the group of interest (ingroup) must be compared to those of a relative (outgroup) that diverged before the last common ancestor (LCA) of the ingroup. Problems arise if an outgroup does not exist, is unknown, or is so distant that few characters are shared, in which case duplicated genes originating before the LCA can be used as proxy outgroups to root diverse phylogenies. Here, we describe a genome-wide expansion of this technique that can be used to solve problems at the other end of the evolutionary scale: where ingroup individuals are all very closely related to each other, but the next closest relative is very distant. We used shared orthologous single nucleotide polymorphisms (SNPs) from 10 whole genome sequences of Coxiella burnetii, the causative agent of Q fever in humans, to create a robust, but unrooted phylogeny. To maximize the number of characters informative about the rooting, we searched entire genomes for polymorphic duplicated regions where orthologs of each paralog could be identified so that the paralogs could be used to root the tree. Recent radiations, such as those of emerging pathogens, often pose rooting challenges due to a lack of ingroup variation and large genomic differences with known outgroups. Using a phylogenomic approach, we created a robust, rooted phylogeny for C. burnetii. [Coxiella burnetii; paralog SNPs; pathogen evolution; phylogeny; recent radiation; root; rooting using duplicated genes.].

  12. Effects of normal aging on memory for multiple contextual features.

    PubMed

    Gagnon, Sylvain; Soulard, Kathleen; Brasgold, Melissa; Kreller, Joshua

    2007-08-01

    Twenty-four younger (18-35 years) and 24 older adult participants (65 or older) were exposed to three experimental conditions involving the memorization of words and their associated contextual features, with contextual feature complexity increasing from Conditions 1 to 3. In Condition 1, words presented varied only on one binary feature (color, size, or character), while in Conditions 2 and 3, words presented varied on two and three binary features, respectively. Each condition was carried out as follows: (1) learning of a word list; (2) encoding of words and their contextual features; (3) delay; and (4) memory for contextual features through a discrimination task. Results indicated that young adults discriminated more features than older adults on all conditions. In both age groups, contextual feature discrimination accuracy decreased as the number of features increased. Moreover, older adults demonstrated near floor performance when tested with two or more binary features. We conclude that increasing context complexity strains attentional resources.

  13. Sequencing of whole plastid genomes and nuclear ribosomal DNA of Diospyros species (Ebenaceae) endemic to New Caledonia: many species, little divergence

    PubMed Central

    Turner, Barbara; Paun, Ovidiu; Munzinger, Jérôme; Chase, Mark W.; Samuel, Rosabelle

    2016-01-01

    Background and Aims Some plant groups, especially on islands, have been shaped by strong ancestral bottlenecks and rapid, recent radiation of phenotypic characters. Single molecular markers are often not informative enough for phylogenetic reconstruction in such plant groups. Whole plastid genomes and nuclear ribosomal DNA (nrDNA) are viewed by many researchers as sources of information for phylogenetic reconstruction of groups in which expected levels of divergence in standard markers are low. Here we evaluate the usefulness of these data types to resolve phylogenetic relationships among closely related Diospyros species. Methods Twenty-two closely related Diospyros species from New Caledonia were investigated using whole plastid genomes and nrDNA data from low-coverage next-generation sequencing (NGS). Phylogenetic trees were inferred using maximum parsimony, maximum likelihood and Bayesian inference on separate plastid and nrDNA and combined matrices. Key Results The plastid and nrDNA sequences were, singly and together, unable to provide well supported phylogenetic relationships among the closely related New Caledonian Diospyros species. In the nrDNA, a 6-fold greater percentage of parsimony-informative characters compared with plastid DNA was found, but the total number of informative sites was greater for the much larger plastid DNA genomes. Combining the plastid and nuclear data improved resolution. Plastid results showed a trend towards geographical clustering of accessions rather than following taxonomic species. Conclusions In plant groups in which multiple plastid markers are not sufficiently informative, an investigation at the level of the entire plastid genome may also not be sufficient for detailed phylogenetic reconstruction. Sequencing of complete plastid genomes and nrDNA repeats seems to clarify some relationships among the New Caledonian Diospyros species, but the higher percentage of parsimony-informative characters in nrDNA compared with plastid DNA did not help to resolve the phylogenetic tree because the total number of variable sites was much lower than in the entire plastid genome. The geographical clustering of the individuals against a background of overall low sequence divergence could indicate transfer of plastid genomes due to hybridization and introgression following secondary contact. PMID:27098088
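
    The contrast drawn above between the percentage and the total number of parsimony-informative characters can be illustrated with a short sketch. It uses the usual definition: an alignment column is parsimony-informative when at least two different states each occur in at least two sequences. Skipping gap and ambiguity symbols, and the toy alignment itself, are assumptions made for the example.

```python
from collections import Counter


def is_parsimony_informative(column, ignore="-?N"):
    """True if at least two states each occur in at least two sequences.

    Symbols listed in `ignore` are skipped (an assumption of this sketch).
    """
    counts = Counter(state for state in column if state not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative_sites(alignment):
    """Count parsimony-informative columns in equal-length aligned sequences."""
    return sum(is_parsimony_informative(col) for col in zip(*alignment))


# Toy alignment (made up), one string per taxon.
alignment = ["ACGTA", "ACGTC", "ATGAC", "ATGAA"]
informative = count_informative_sites(alignment)
percentage = 100.0 * informative / len(alignment[0])
print(informative, percentage)
```

    A short region can therefore have a high percentage of informative sites yet contribute fewer informative characters in total than a much longer, less variable region.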

  14. Genomic prediction of continuous and binary fertility traits of females in a composite beef cattle breed

    USDA-ARS's Scientific Manuscript database

    Reproduction efficiency is a major factor in the profitability of the beef cattle industry. Genomic selection (GS) is a promising tool that may improve the predictive accuracy and genetic gain of fertility traits. There is a wide range of traits used to measure fertility in dairy and beef cattle inc...

  15. To share or not to share: a randomized trial of consent for data sharing in genome research.

    PubMed

    McGuire, Amy L; Oliver, Jill M; Slashinski, Melody J; Graves, Jennifer L; Wang, Tao; Kelly, P Adam; Fisher, William; Lau, Ching C; Goss, John; Okcu, Mehmet; Treadwell-Deering, Diane; Goldman, Alica M; Noebels, Jeffrey L; Hilsenbeck, Susan G

    2011-11-01

    Despite growing concerns toward maintaining participants' privacy, individual investigators collecting tissue and other biological specimens for genomic analysis are encouraged to obtain informed consent for broad data sharing. Our purpose was to assess the effect on research enrollment and data sharing decisions of three different consent types (traditional, binary, or tiered) with varying levels of control and choices regarding data sharing. A single-blinded, randomized controlled trial was conducted with 323 eligible adult participants being recruited into one of six genome studies at Baylor College of Medicine in Houston, Texas, between January 2008 and August 2009. Participants were randomly assigned to one of three experimental consent documents (traditional, n = 110; binary, n = 103; and tiered, n = 110). Debriefing in follow-up visits provided participants a detailed review of all consent types and the chance to change data sharing choices or decline genome study participation. Before debriefing, 83.9% of participants chose public data release. After debriefing, 53.1% chose public data release, 33.1% chose restricted (controlled access database) release, and 13.7% opted out of data sharing. Only one participant declined genome study participation due to data sharing concerns. Our findings indicate that most participants are willing to publicly release their genomic data; however, a significant portion prefers restricted release. These results suggest discordance between existing data sharing policies and participants' judgments and desires.

  16. A New Literary Metaphor for the Genome or Proteome

    ERIC Educational Resources Information Center

    Pappas, Gus

    2005-01-01

    Previously, the idea of a blueprint has been used to explain the genome. The concept of a play's cast of characters, the Dramatis Personae, is a more fluid metaphor that allows for mutations and time-dependent phenomena to be taken into account. It also provides an educational and mnemonic exercise for students.

  17. SIMULTANEOUS PRODUCTION OF TWO CAPSULAR POLYSACCHARIDES BY PNEUMOCOCCUS

    PubMed Central

    Austrian, Robert; Bernheimer, Harriet P.; Smith, Evelyn E. B.; Mills, George T.

    1959-01-01

    Study of the capsular genome of pneumococcus has shown that it controls a multiplicity of biochemical reactions essential to the synthesis of capsular polysaccharide. Mutation affecting any one of several biochemical reactions concerned with capsular synthesis may result in loss of capsulation without alteration of other biochemical functions similarly concerned. Mutations affecting the synthesis of uronic acids are an important cause of loss of capsulation and of virulence by strains of pneumococcus Type I and Type III. The capsular genome appears to have a specific location in the total genome of the cell, this locus being occupied by the capsular genome of whatever capsular type is expressed by the cell. Transformation of capsulated or of non-capsulated pneumococci to heterologous capsular type results probably from a genetic exchange followed by the development of a new biosynthetic pathway in the transformed cell. The new capsular genome is transferred to the transformed cell as a single particle of DNA. Binary capsulation results from the simultaneous presence within the pneumococcal cell of two capsular genomes, one mutated, the other normal. Interaction between the biochemical pathways controlled by the two capsular genomes leads to augmentation of the phenotypic expression of the product controlled by one and to partial suppression of the product determined by the other. Knowledge of the biochemical basis of binary capsulation can be used to indicate the presence of uronic acid in the capsular polysaccharide of a pneumococcal type the composition of the capsule of which is unknown. PMID:13795197

  18. BigWig and BigBed: enabling browsing of large distributed datasets.

    PubMed

    Kent, W J; Zweig, A S; Barber, G; Hinrichs, A S; Karolchik, D

    2010-09-01

    BigWig and BigBed files are compressed binary indexed files containing data at several resolutions that allow the high-performance display of next-generation sequencing experiment results in the UCSC Genome Browser. The visualization is implemented using a multi-layered software approach that takes advantage of specific capabilities of web-based protocols and Linux and UNIX operating systems files, R trees and various indexing and compression tricks. As a result, only the data needed to support the current browser view is transmitted rather than the entire file, enabling fast remote access to large distributed data sets. Binaries for the BigWig and BigBed creation and parsing utilities may be downloaded at http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/. Source code for the creation and visualization software is freely available for non-commercial use at http://hgdownload.cse.ucsc.edu/admin/jksrc.zip, implemented in C and supported on Linux. The UCSC Genome Browser is available at http://genome.ucsc.edu.

  19. Quantitative complete tooth variation among east Asians and Native Americans: developmental biology as a tool for the assessment of human divergence.

    PubMed

    Shields, E D

    1996-01-01

    The quantification of total tooth structure derived from X-rays of Vietnamese, Southern Chinese, Mongolians, Western Eskimos, and Peruvian pre-Inca (Huari Empire) populations was used to examine dental divergence and the morphogenetics of change. Multivariate derived distances between the samples helped identify a quasicontinuous web of ethnic groups with two binary clusters ensconced within the web. One cluster was composed of Mongolians, Western Eskimos, and pre-Inca, and the other group consisted of the Southern Chinese and Vietnamese. Mongolians entered the quasicontinuum from a divergent angle (externally influenced) from that of the Southeast Asians. The Chinese and pre-Inca formed the polar samples of the distance superstructure. The pre-Inca sample was the most isolated, its closest neighbor being the Western Eskimos. Univariate and multivariate analyses suggested that the pre-Inca, whose ancestors arrived in America perhaps approximately 30,000 years ago, was the least derived sample. Clearly, microevolutionary change occurred among the samples, but the dental phenotype was resistant to environmental developmental perturbations. An assessment of dental divergence and developmental biology suggested that the overall dental phenotype is a complex multigenic morphological character, and that the observed variation evolved through total genomic drift. The quantified dental phenotype is greater than its highly multigenic algorithm and its development homeostasis is tightly controlled, or canalized, by the deterministic organization of a complex nonlinear epigenetic milieu. The overall dental phenotype quantified here was selectively neutral and a good character to help reconstruct the sequence of human evolution, but if the outlying homeostatic threshold was or will be exceeded in antecedents and descendants, respectively, evolutionary saltation occurs.

  20. Extracting the information of coastline shape and its multiple representations

    NASA Astrophysics Data System (ADS)

    Liu, Ying; Li, Shujun; Tian, Zhen; Chen, Huirong

    2007-06-01

    Based on a study of coastlines, this paper puts forward a new method of multiple representation. The method simulates the way humans think when they generalize: it builds an appropriate mathematical model, describes the coastline graphically, and extracts the various kinds of coastline shape information. Automatic coastline generalization is then carried out using knowledge rules and arithmetic operators. Representing coastline shape information by building a Douglas binary tree of the curve reveals the shape character of the coastline at both the micro and the macro scale. The extracted coastline information includes the local characteristic points and their orientation, the curve structure, and the topological traits. The curve structure can be divided into single curves and curve clusters. Finally, by fixing the knowledge rules of coastline generalization, the generalization scale, and its shape parameters, an automatic coastline generalization model is established. The multiple-scale representation method presented here has several strengths: it follows the human mode of thinking and preserves the natural character of the original curve, and the binary tree structure can control coastline comparability, avoid self-intersection, and maintain consistent topological relationships.
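
    The abstract does not define its "Douglas binary tree". One plausible reading, sketched here as an assumption, is the recursion tree of Douglas-Peucker line simplification: each node stores the point farthest from the chord joining the ends of a curve segment, and its two children describe the sub-curves on either side. All function names and the sample points are hypothetical.

```python
import math


def distance_to_chord(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:  # degenerate chord: fall back to point distance
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def build_tree(points, lo, hi):
    """Binary tree of split points for the curve points[lo..hi]."""
    if hi - lo < 2:
        return None  # no interior point left to split on
    index, dist = max(
        ((i, distance_to_chord(points[i], points[lo], points[hi]))
         for i in range(lo + 1, hi)),
        key=lambda item: item[1],
    )
    return {"index": index, "distance": dist,
            "left": build_tree(points, lo, index),
            "right": build_tree(points, index, hi)}


def generalize(points, tree, tolerance):
    """Keep the end points plus split points reached with distance > tolerance."""
    keep = {0, len(points) - 1}

    def walk(node):
        if node is None or node["distance"] <= tolerance:
            return  # as in Douglas-Peucker, stop refining this segment
        keep.add(node["index"])
        walk(node["left"])
        walk(node["right"])

    walk(tree)
    return [points[i] for i in sorted(keep)]


coast = [(0, 0), (1, 0), (2, 3), (3, 0), (4, 0)]
tree = build_tree(coast, 0, len(coast) - 1)
print(generalize(coast, tree, tolerance=1.0))
print(generalize(coast, tree, tolerance=0.5))
```

    Because the tree is built once, representations at several scales can be read off by walking it with different tolerances, which fits the multiple-representation idea described above.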

  1. Candidate Binary Microlensing Events from the MACHO Project

    NASA Astrophysics Data System (ADS)

    Becker, A. C.; Alcock, C.; Allsman, R. A.; Alves, D. R.; Axelrod, T. S.; Bennett, D. P.; Cook, K. H.; Drake, A. J.; Freeman, K. C.; Griest, K.; King, L. J.; Lehner, M. J.; Marshall, S. L.; Minniti, D.; Peterson, B. A.; Popowski, P.; Pratt, M. R.; Quinn, P. J.; Rodgers, A. W.; Stubbs, C. W.; Sutherland, W.; Tomaney, A.; Vandehei, T.; Welch, D. L.; Baines, D.; Brakel, A.; Crook, B.; Howard, J.; Leach, T.; McDowell, D.; McKeown, S.; Mitchell, J.; Moreland, J.; Pozza, E.; Purcell, P.; Ring, S.; Salmon, A.; Ward, K.; Wyper, G.; Heller, A.; Kaspi, S.; Kovo, O.; Maoz, D.; Retter, A.; Rhie, S. H.; Stetson, P.; Walker, A.; MACHO Collaboration

    1998-12-01

    We present the lightcurves of 22 gravitational microlensing events from the first six years of the MACHO Project gravitational microlensing survey which are likely examples of lensing by binary systems. These events were selected from a total sample of ~ 300 events which were either detected by the MACHO Alert System or discovered through retrospective analyses of the MACHO database. Many of these events appear to have undergone a caustic or cusp crossing, and 2 of the events are well fit with lensing by binary systems with large mass ratios, indicating secondary companions of approximately planetary mass. The event rate is roughly consistent with predictions based upon our knowledge of the properties of binary stars. The utility of binary lensing in helping to solve the Galactic dark matter problem is demonstrated with analyses of 3 binary microlensing events seen towards the Magellanic Clouds. Source star resolution during caustic crossings in 2 of these events allows us to estimate the location of the lensing systems, assuming each source is a single star and not a short period binary.
    * MACHO LMC-9 appears to be a binary lensing event with a caustic crossing partially resolved in 2 observations. The resulting lens proper motion appears too small for a single source and LMC disk lens. However, it is considerably less likely to be a single source star and Galactic halo lens. We estimate the a priori probability of a short period binary source with a detectable binary character to be ~ 10 %. If the source is also a binary, then we currently have no constraints on the lens location.
    * The most recent of these events, MACHO 98-SMC-1, was detected in real-time. Follow-up observations by the MACHO/GMAN, PLANET, MPS, EROS and OGLE microlensing collaborations lead to the robust conclusion that the lens likely resides in the SMC.

  2. When Outgroups Fail; Phylogenomics of Rooting the Emerging Pathogen, Coxiella burnetii

    PubMed Central

    Pearson, Talima; Hornstra, Heidie M.; Sahl, Jason W.; Schaack, Sarah; Schupp, James M.; Beckstrom-Sternberg, Stephen M.; O'Neill, Matthew W.; Priestley, Rachael A.; Champion, Mia D.; Beckstrom-Sternberg, James S.; Kersh, Gilbert J.; Samuel, James E.; Massung, Robert F.; Keim, Paul

    2013-01-01

    Rooting phylogenies is critical for understanding evolution, yet the importance, intricacies and difficulties of rooting are often overlooked. For rooting, polymorphic characters among the group of interest (ingroup) must be compared to those of a relative (outgroup) that diverged before the last common ancestor (LCA) of the ingroup. Problems arise if an outgroup does not exist, is unknown, or is so distant that few characters are shared, in which case duplicated genes originating before the LCA can be used as proxy outgroups to root diverse phylogenies. Here, we describe a genome-wide expansion of this technique that can be used to solve problems at the other end of the evolutionary scale: where ingroup individuals are all very closely related to each other, but the next closest relative is very distant. We used shared orthologous single nucleotide polymorphisms (SNPs) from 10 whole genome sequences of Coxiella burnetii, the causative agent of Q fever in humans, to create a robust, but unrooted phylogeny. To maximize the number of characters informative about the rooting, we searched entire genomes for polymorphic duplicated regions where orthologs of each paralog could be identified so that the paralogs could be used to root the tree. Recent radiations, such as those of emerging pathogens, often pose rooting challenges due to a lack of ingroup variation and large genomic differences with known outgroups. Using a phylogenomic approach, we created a robust, rooted phylogeny for C. burnetii. [Coxiella burnetii; paralog SNPs; pathogen evolution; phylogeny; recent radiation; root; rooting using duplicated genes.] PMID:23736103

  3. Fast genomic predictions via Bayesian G-BLUP and multilocus models of threshold traits including censored Gaussian data.

    PubMed

    Kärkkäinen, Hanni P; Sillanpää, Mikko J

    2013-09-04

    Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed.
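
    The threshold approach mentioned above can be illustrated with the usual liability formulation, in which an unobserved Gaussian liability is cut at fixed thresholds to give the observed category. The sketch assumes a unit-variance residual (a probit link) and computes category probabilities for a given linear predictor. It does not reproduce the authors' generalized expectation maximization algorithm, and the names `eta` and `thresholds` are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probabilities(eta, thresholds):
    """Probabilities of each ordered category under a threshold model.

    The liability is eta + e with e ~ N(0, 1) (assumed); category k is
    observed when the liability falls between consecutive thresholds.
    A single threshold gives the binary case.
    """
    cuts = [-math.inf] + sorted(thresholds) + [math.inf]
    return [normal_cdf(hi - eta) - normal_cdf(lo - eta)
            for lo, hi in zip(cuts, cuts[1:])]


# Binary trait: one threshold at 0, so P(y = 1) = Phi(eta).
print(category_probabilities(eta=0.8, thresholds=[0.0]))
# Ordinal trait with three categories (made-up thresholds).
print(category_probabilities(eta=0.8, thresholds=[0.0, 1.5]))
```

    Dichotomizing an ordinal trait amounts to merging adjacent categories, which discards the information carried by the intermediate thresholds.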

  4. Fast Genomic Predictions via Bayesian G-BLUP and Multilocus Models of Threshold Traits Including Censored Gaussian Data

    PubMed Central

    Kärkkäinen, Hanni P.; Sillanpää, Mikko J.

    2013-01-01

    Because of the increased availability of genome-wide sets of molecular markers along with reduced cost of genotyping large samples of individuals, genomic estimated breeding values have become an essential resource in plant and animal breeding. Bayesian methods for breeding value estimation have proven to be accurate and efficient; however, the ever-increasing data sets are placing heavy demands on the parameter estimation algorithms. Although a commendable number of fast estimation algorithms are available for Bayesian models of continuous Gaussian traits, there is a shortage for corresponding models of discrete or censored phenotypes. In this work, we consider a threshold approach of binary, ordinal, and censored Gaussian observations for Bayesian multilocus association models and Bayesian genomic best linear unbiased prediction and present a high-speed generalized expectation maximization algorithm for parameter estimation under these models. We demonstrate our method with simulated and real data. Our example analyses suggest that the use of the extra information present in an ordered categorical or censored Gaussian data set, instead of dichotomizing the data into case-control observations, increases the accuracy of genomic breeding values predicted by Bayesian multilocus association models or by Bayesian genomic best linear unbiased prediction. Furthermore, the example analyses indicate that the correct threshold model is more accurate than the directly used Gaussian model with a censored Gaussian data, while with a binary or an ordinal data the superiority of the threshold model could not be confirmed. PMID:23821618

  5. Insights into archaeal evolution and symbiosis from the genomes of a nanoarchaeon and its inferred crenarchaeal host from Obsidian Pool, Yellowstone National Park.

    PubMed

    Podar, Mircea; Makarova, Kira S; Graham, David E; Wolf, Yuri I; Koonin, Eugene V; Reysenbach, Anna-Louise

    2013-04-22

    A single cultured marine organism, Nanoarchaeum equitans, represents the Nanoarchaeota branch of symbiotic Archaea, with a highly reduced genome and unusual features such as multiple split genes. The first terrestrial hyperthermophilic member of the Nanoarchaeota was collected from Obsidian Pool, a thermal feature in Yellowstone National Park, separated by single cell isolation, and sequenced together with its putative host, a Sulfolobales archaeon. Both the new Nanoarchaeota (Nst1) and N. equitans lack most biosynthetic capabilities, and phylogenetic analysis of ribosomal RNA and protein sequences indicates that the two form a deep-branching archaeal lineage. However, the Nst1 genome is more than 20% larger, and encodes a complete gluconeogenesis pathway as well as the full complement of archaeal flagellum proteins. With a larger genome, a smaller repertoire of split protein encoding genes and no split non-contiguous tRNAs, Nst1 appears to have experienced less severe genome reduction than N. equitans. These findings imply that, rather than representing ancestral characters, the extremely compact genomes and multiple split genes of Nanoarchaeota are derived characters associated with their symbiotic or parasitic lifestyle. The inferred host of Nst1 is potentially autotrophic, with a streamlined genome and simplified central and energetic metabolism as compared to other Sulfolobales. Comparison of the N. equitans and Nst1 genomes suggests that the marine and terrestrial lineages of Nanoarchaeota share a common ancestor that was already a symbiont of another archaeon. The two distinct Nanoarchaeota-host genomic data sets offer novel insights into the evolution of archaeal symbiosis and parasitism, enabling further studies of the cellular and molecular mechanisms of these relationships. This article was reviewed by Patrick Forterre, Bettina Siebers (nominated by Michael Galperin) and Purification Lopez-Garcia.

  6. Chloroplast DNA sequence of the green alga Oedogonium cardiacum (Chlorophyceae): Unique genome architecture, derived characters shared with the Chaetophorales and novel genes acquired through horizontal transfer

    PubMed Central

    Brouard, Jean-Simon; Otis, Christian; Lemieux, Claude; Turmel, Monique

    2008-01-01

    Background To gain insight into the branching order of the five main lineages currently recognized in the green algal class Chlorophyceae and to expand our understanding of chloroplast genome evolution, we have undertaken the sequencing of chloroplast DNA (cpDNA) from representative taxa. The complete cpDNA sequences previously reported for Chlamydomonas (Chlamydomonadales), Scenedesmus (Sphaeropleales), and Stigeoclonium (Chaetophorales) revealed tremendous variability in their architecture, the retention of only few ancestral gene clusters, and derived clusters shared by Chlamydomonas and Scenedesmus. Unexpectedly, our recent phylogenies inferred from these cpDNAs and the partial sequences of three other chlorophycean cpDNAs disclosed two major clades, one uniting the Chlamydomonadales and Sphaeropleales (CS clade) and the other uniting the Oedogoniales, Chaetophorales and Chaetopeltidales (OCC clade). Although molecular signatures provided strong support for this dichotomy and for the branching of the Oedogoniales as the earliest-diverging lineage of the OCC clade, more data are required to validate these phylogenies. We describe here the complete cpDNA sequence of Oedogonium cardiacum (Oedogoniales). Results Like its three chlorophycean homologues, the 196,547-bp Oedogonium chloroplast genome displays a distinctive architecture. This genome is one of the most compact among photosynthetic chlorophytes. It has an atypical quadripartite structure, is intron-rich (17 group I and 4 group II introns), and displays 99 different conserved genes and four long open reading frames (ORFs), three of which are clustered in the spacious inverted repeat of 35,493 bp. Intriguingly, two of these ORFs (int and dpoB) revealed high similarities to genes not usually found in cpDNA. At the gene content and gene order levels, the Oedogonium genome most closely resembles its Stigeoclonium counterpart. 
Characters shared by these chlorophyceans but missing in members of the CS clade include the retention of psaM, rpl32 and trnL(caa), the loss of petA, the disruption of three ancestral clusters and the presence of five derived gene clusters. Conclusion The Oedogonium chloroplast genome disclosed additional characters that bolster the evidence for a close alliance between the Oedogoniales and Chaetophorales. Our unprecedented finding of int and dpoB in this cpDNA provides a clear example that novel genes were acquired by the chloroplast genome through horizontal transfers, possibly from a mitochondrial genome donor. PMID:18558012

  7. Holographic implementation of a binary associative memory for improved recognition

    NASA Astrophysics Data System (ADS)

    Bandyopadhyay, Somnath; Ghosh, Ajay; Datta, Asit K.

    1998-03-01

    Neural network associative memory has found wide application in pattern recognition techniques. We propose an associative memory model for binary character recognition. The interconnection strengths of the memory are binary valued. The concept of sparse coding is used to enhance the storage efficiency of the model. The question of imposed preconditioning of pattern vectors, which is inherent in a sparsely coded conventional memory, is eliminated by using a multistep correlation technique, and the ability of correct association is enhanced in a real-time application. A potential optoelectronic implementation of the proposed associative memory is also described. Learning and recall are possible by using digital optical matrix-vector multiplication, where full use of the parallelism and connectivity of optics is made. A hologram is used in the experiment as a long-term memory (LTM) for storing all input information. The short-term memory or the interconnection weight matrix required during the recall process is configured by retrieving the necessary information from the holographic LTM.

  8. A Novel Partial Sequence Alignment Tool for Finding Large Deletions

    PubMed Central

    Aruk, Taner; Ustek, Duran; Kursun, Olcay

    2012-01-01

    Finding large deletions in genome sequences has become increasingly useful in bioinformatics, such as in clinical research and diagnosis. Although there are a number of publicly available next generation sequencing mapping and sequence alignment programs, these software packages do not correctly align fragments containing deletions larger than one kb. We present a fast alignment software package, BinaryPartialAlign, that can be used by wet lab scientists to find long structural variations in their experiments. For BinaryPartialAlign, we make use of the Smith-Waterman (SW) algorithm with a binary-search-based approach for alignment with large gaps that we called partial alignment. The BinaryPartialAlign implementation is compared with other straightforward applications of SW. Simulation results on mtDNA fragments demonstrate the effectiveness (runtime and accuracy) of the proposed method. PMID:22566777
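
    As a rough illustration of the Smith-Waterman step mentioned in this abstract, the sketch below computes a plain local alignment score with a linear gap penalty. It is not the BinaryPartialAlign method: the function name and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for the example, and the binary-search-based partial alignment is not reproduced here.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)          # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# A read that spans a long deletion: the penalty for bridging the deleted
# stretch outweighs the second flank, so only one flank is aligned.
reference = "ACGTACGT" + "A" * 50 + "GGCCGGCC"
read = "ACGTACGT" + "GGCCGGCC"
print(smith_waterman_score(read, reference))
print(smith_waterman_score("ACGTACGT", reference))
```

    With these scores, the full read scores no higher than a single flank on its own, which is one way to see why a straightforward SW application can miss a large deletion.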

  9. Scheme for Entering Binary Data Into a Quantum Computer

    NASA Technical Reports Server (NTRS)

    Williams, Colin

    2005-01-01

    A quantum algorithm provides for the encoding of an exponentially large number of classical data bits by use of a smaller (polynomially large) number of quantum bits (qubits). The development of this algorithm was prompted by the need, heretofore not satisfied, for a means of entering real-world binary data into a quantum computer. The data format provided by this algorithm is suitable for subsequent ultrafast quantum processing of the entered data. Potential applications lie in disciplines (e.g., genomics) in which one needs to search for matches between parts of very long sequences of data. For example, the algorithm could be used to encode the N-bit-long human genome in only log2N qubits. The resulting log2N-qubit state could then be used for subsequent quantum data processing - for example, to perform rapid comparisons of sequences.
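
    The log2N scaling quoted above can be made concrete with a small calculation. The sketch below only counts the qubits needed to index N classical bits; it does not implement the encoding algorithm. The function name, the genome length of about 3.1 billion bases, and the 2 bits per base are assumptions made for the example.

```python
def qubits_for_bits(n_bits: int) -> int:
    """Smallest number of qubits whose basis states can index n_bits positions."""
    if n_bits < 1:
        raise ValueError("n_bits must be positive")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return max(1, (n_bits - 1).bit_length())


# Assumed figures: ~3.1e9 bases at 2 bits per base.
genome_bits = 2 * 3_100_000_000
print(qubits_for_bits(genome_bits))
```

    Integer arithmetic is used instead of a floating-point logarithm so that exact powers of two are handled without rounding surprises.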

  10. DNABIT Compress - Genome compression algorithm.

    PubMed

    Rajarajeswari, Pothuraju; Apparao, Allam

    2011-01-22

    Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress", for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non-repetitive DNA sequences. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genomes. Significantly better compression results show that the "DNABIT Compress" algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (genomes), our new DNABIT Compress algorithm significantly improves the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve a compression ratio as low as 1.58 bits/base, where the existing best methods could not achieve a ratio less than 1.72 bits/base.
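
    For context on the bits/base figures, the sketch below shows the naive fixed-width baseline of 2 bits per base that DNA compressors aim to beat. It is not the DNABIT Compress algorithm: the code table, the function names, and the handling of the final partial byte are all assumptions made for this example, and repeat-aware coding is not attempted.

```python
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}  # assumed code table
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte (2 bits each)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))   # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases[:length])


packed = pack("ACGTAC")
print(len(packed), unpack(packed, 6))
```

    The original length has to be stored alongside the packed bytes, because the padding in the last byte is indistinguishable from real "A" bases.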

  11. High quality draft genome sequence of the moderately halophilic bacterium Pontibacillus yanchengensis Y32(T) and comparison among Pontibacillus genomes.

    PubMed

    Huang, Jing; Qiao, Zi Xu; Tang, Jing Wei; Wang, Gejiao

    2015-01-01

    Pontibacillus yanchengensis Y32(T) is an aerobic, motile, Gram-positive, endospore-forming, and moderately halophilic bacterium isolated from a salt field. In this study, we describe the features of P. yanchengensis strain Y32(T) together with a comparison with four other Pontibacillus genomes. The 4,281,464 bp high-quality-draft genome of strain Y32(T) is arranged into 153 contigs containing 3,965 protein-coding genes and 77 RNA encoding genes. The genome of strain Y32(T) possesses many genes related to its halophilic character, flagellar assembly and chemotaxis to support its survival in a salt-rich environment.

  12. KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation.

    PubMed

    Wang, Dapeng; Xu, Jiayue; Yu, Jun

    2015-09-16

    The K-mer approach, treating genomic sequences as simple characters and counting the relative abundance of each string upon a fixed K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we develop a versatile database, KGCAK ( http://kgcak.big.ac.cn/KGCAK/ ), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogeny based on genomic elements in an alignment-free fashion and provides in-depth data processing enabling users to compare the complexity of genome sequences based on K-mer distribution. We hope that KGCAK becomes a powerful tool for exploring relationship within and among groups of species in a tree of life based on genomic data.
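
    The K-mer counting described in this abstract can be sketched in a few lines. The example below computes relative K-mer abundances for a sequence and a simple distance between two abundance profiles. The function names and the choice of a summed absolute difference as the distance are assumptions for illustration; they are not taken from KGCAK.

```python
from collections import Counter


def kmer_frequencies(seq: str, k: int) -> dict:
    """Relative abundance of each overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:                      # sequence shorter than k
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def profile_distance(p: dict, q: dict) -> float:
    """Sum of absolute differences between two k-mer profiles (assumed metric)."""
    return sum(abs(p.get(kmer, 0.0) - q.get(kmer, 0.0)) for kmer in set(p) | set(q))


a = kmer_frequencies("ACGACG", 3)
b = kmer_frequencies("ACGTTT", 3)
print(a)
print(profile_distance(a, b))
```

    Because the profiles are computed per sequence, no alignment between the two sequences is needed, which is the sense in which such comparisons are alignment-free.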

  13. Optical Neural Classification Of Binary Patterns

    NASA Astrophysics Data System (ADS)

    Gustafson, Steven C.; Little, Gordon R.

    1988-05-01

    Binary pattern classification that may be implemented using optical hardware and neural network algorithms is considered. Pattern classification problems that have no concise description (as in classifying handwritten characters) or no concise computation (as in NP-complete problems) are expected to be particularly amenable to this approach. For example, optical processors that efficiently classify binary patterns in accordance with their Boolean function complexity might be designed. As a candidate for such a design, an optical neural network model is discussed that is designed for binary pattern classification and that consists of an optical resonator with a dynamic multiplex-recorded reflection hologram and a phase conjugate mirror with thresholding and gain. In this model, learning or training examples of binary patterns may be recorded on the hologram such that one bit in each pattern marks the pattern class. Any input pattern, including one with an unknown class or marker bit, will be modified by a large number of parallel interactions with the reflection hologram and nonlinear mirror. After perhaps several seconds and 100 billion interactions, a steady-state pattern may develop with a marker bit that represents a minimum-Boolean-complexity classification of the input pattern. Computer simulations are presented that illustrate progress in understanding the behavior of this model and in developing a processor design that could have commanding and enduring performance advantages compared to current pattern classification techniques.

  14. Optimal rates for phylogenetic inference and experimental design in the era of genome-scale datasets.

    PubMed

    Dornburg, Alex; Su, Zhuo; Townsend, Jeffrey P

    2018-06-25

    With the rise of genome-scale datasets, there has been a call for increased data scrutiny and careful selection of loci appropriate for attempting the resolution of a phylogenetic problem. Such loci are desired to maximize phylogenetic information content while minimizing the risk of homoplasy. Theory posits the existence of characters that evolve under such an optimum rate, and efforts to determine optimal rates of inference have been a cornerstone of phylogenetic experimental design for over two decades. However, both theoretical and empirical investigations of optimal rates have varied dramatically in their conclusions, ranging from no relationship to a tight relationship between the rate of change and phylogenetic utility. Here we synthesize these apparently contradictory views, demonstrating both empirical and theoretical conditions under which each is correct. We find that optimal rates of characters, not genes, are generally robust to most experimental design decisions. Moreover, consideration of site rate heterogeneity within a given locus is critical to accurate predictions of utility. Factors such as taxon sampling or the targeted number of characters providing support for a topology are additionally critical to the predictions of phylogenetic utility based on the rate of character change. Further, optimality of rates and predictions of phylogenetic utility are not equivalent, demonstrating the need for further development of a comprehensive theory of phylogenetic experimental design.

  15. An efficient genome-wide association test for mixed binary and continuous phenotypes with applications to substance abuse research.

    PubMed

    Buu, Anne; Williams, L Keoki; Yang, James J

    2018-03-01

    We propose a new genome-wide association test for mixed binary and continuous phenotypes that uses an efficient numerical method to estimate the empirical distribution of Fisher's combination statistic under the null hypothesis. Our simulation study shows that the proposed method controls the type I error rate and also maintains its power at the level of the permutation method. More importantly, the computational efficiency of the proposed method is much higher than that of the permutation method. The simulation results also indicate that the power of the test increases when the genetic effect increases, the minor allele frequency increases, and the correlation between responses decreases. The statistical analysis on the database of the Study of Addiction: Genetics and Environment demonstrates that the proposed method combining multiple phenotypes can increase the power of identifying markers that might not otherwise be chosen using marginal tests.
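
    For readers unfamiliar with Fisher's combination statistic, the sketch below computes it from a list of marginal p-values, together with the textbook chi-square reference p-value. That reference distribution holds only when the p-values are independent; the abstract's method instead estimates the null distribution empirically, and that estimation step is not reproduced here. The function names are assumptions made for this example.

```python
import math


def fisher_statistic(p_values) -> float:
    """Fisher's combination statistic: -2 * sum of ln(p_i)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def independent_combined_p(p_values) -> float:
    """Combined p-value assuming INDEPENDENT tests (chi-square, 2k d.f.).

    For even degrees of freedom 2k the chi-square survival function has the
    closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    half = fisher_statistic(p_values) / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


# Two marginal p-values, e.g. one binary and one continuous phenotype.
print(fisher_statistic([0.05, 0.05]))
print(independent_combined_p([0.05, 0.05]))
```

    When the phenotypes are correlated, the independence-based p-value above is generally miscalibrated, which is the motivation for estimating the null distribution by permutation or by a faster numerical method.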

  16. The structural, electronic, magnetic and optical properties of the half-metallic binary alloys ZCl3 (Z=Be, Mg, Ca, Sr): A first-principles study

    NASA Astrophysics Data System (ADS)

    Song, Jun-Tao; Zhang, Jian-Min

    2018-06-01

    The investigations of the electronic and magnetic properties show that the binary Heusler alloys ZCl3 (Z = Be, Mg, Ca, Sr) are half-metallic (HM) ferromagnets with an integer magnetic moment (Mt) of 1 μB/f.u. The alloy BeCl3 is thermodynamically metastable, while the other alloys are thermodynamically stable according to their cohesive energies and formation energies. Moreover, the wide HM regions for the alloys ZCl3 (Z = Be, Mg, Ca, Sr) show that their HM characters are robust when the lattices are expanded or compressed under uniform and tetragonal strains. Finally, some optical properties are analyzed in detail, such as the dielectric function, the absorption coefficient, the refractive index and the extinction coefficient.

  17. A Robust Semi-Parametric Test for Detecting Trait-Dependent Diversification.

    PubMed

    Rabosky, Daniel L; Huang, Huateng

    2016-03-01

    Rates of species diversification vary widely across the tree of life and there is considerable interest in identifying organismal traits that correlate with rates of speciation and extinction. However, it has been challenging to develop methodological frameworks for testing hypotheses about trait-dependent diversification that are robust to phylogenetic pseudoreplication and to directionally biased rates of character change. We describe a semi-parametric test for trait-dependent diversification that explicitly requires replicated associations between character states and diversification rates to detect effects. To use the method, diversification rates are reconstructed across a phylogenetic tree with no consideration of character states. A test statistic is then computed to measure the association between species-level traits and the corresponding diversification rate estimates at the tips of the tree. The empirical value of the test statistic is compared to a null distribution that is generated by structured permutations of evolutionary rates across the phylogeny. The test is applicable to binary discrete characters as well as continuous-valued traits and can accommodate extremely sparse sampling of character states at the tips of the tree. We apply the test to several empirical data sets and demonstrate that the method has acceptable Type I error rates. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  18. Variability among the Most Rapidly Evolving Plastid Genomic Regions is Lineage-Specific: Implications of Pairwise Genome Comparisons in Pyrus (Rosaceae) and Other Angiosperms for Marker Choice

    PubMed Central

    Ter-Voskanyan, Hasmik; Allgaier, Martin; Borsch, Thomas

    2014-01-01

    Plastid genomes exhibit different levels of variability in their sequences, depending on the respective kinds of genomic regions. Genes are usually more conserved while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximum variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species-limits in the genus Pyrus (Rosaceae)—a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC–trnV, trnR–atpA, ndhF–rpl32, psbM–trnD, and trnQ–rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similar low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF–rpl32 and trnK–rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations. 
Sequencing whole plastid genomes to find markers for evolutionary analyses is therefore particularly useful when overall genetic distances are low. PMID:25405773
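
    The p-distance reported in this abstract is the proportion of aligned sites at which two sequences differ. A minimal sketch is given below; the function name and the decision to skip gap ("-") and ambiguous ("N") positions are assumptions for the example, since the abstract does not state how such sites were treated.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') or 'N' in either sequence are skipped
    (an assumption; other conventions exist).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


print(p_distance("ACGTACGT", "ACGTACGA"))
```

    On this scale, a value such as 0.00145 corresponds to fewer than two substitutions per thousand compared sites.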

  19. Insights into archaeal evolution and symbiosis from the genomes of a nanoarchaeon and its inferred crenarchaeal host from Obsidian Pool, Yellowstone National Park

    PubMed Central

    2013-01-01

    Background A single cultured marine organism, Nanoarchaeum equitans, represents the Nanoarchaeota branch of symbiotic Archaea, with a highly reduced genome and unusual features such as multiple split genes. Results The first terrestrial hyperthermophilic member of the Nanoarchaeota was collected from Obsidian Pool, a thermal feature in Yellowstone National Park, separated by single cell isolation, and sequenced together with its putative host, a Sulfolobales archaeon. Both the new Nanoarchaeota (Nst1) and N. equitans lack most biosynthetic capabilities, and phylogenetic analysis of ribosomal RNA and protein sequences indicates that the two form a deep-branching archaeal lineage. However, the Nst1 genome is more than 20% larger, and encodes a complete gluconeogenesis pathway as well as the full complement of archaeal flagellum proteins. With a larger genome, a smaller repertoire of split protein encoding genes and no split non-contiguous tRNAs, Nst1 appears to have experienced less severe genome reduction than N. equitans. These findings imply that, rather than representing ancestral characters, the extremely compact genomes and multiple split genes of Nanoarchaeota are derived characters associated with their symbiotic or parasitic lifestyle. The inferred host of Nst1 is potentially autotrophic, with a streamlined genome and simplified central and energetic metabolism as compared to other Sulfolobales. Conclusions Comparison of the N. equitans and Nst1 genomes suggests that the marine and terrestrial lineages of Nanoarchaeota share a common ancestor that was already a symbiont of another archaeon. The two distinct Nanoarchaeota-host genomic data sets offer novel insights into the evolution of archaeal symbiosis and parasitism, enabling further studies of the cellular and molecular mechanisms of these relationships. Reviewers This article was reviewed by Patrick Forterre, Bettina Siebers (nominated by Michael Galperin) and Purification Lopez-Garcia PMID:23607440

  20. DNABIT Compress – Genome compression algorithm

    PubMed Central

    Rajarajeswari, Pothuraju; Apparao, Allam

    2011-01-01

    Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, “DNABIT Compress”, for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non-repetitive DNA sequences. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genomes. Significantly better compression results show that the “DNABIT Compress” algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (genomes), our new DNABIT Compress algorithm significantly improves the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve a compression ratio as low as 1.58 bits/base, where the existing best methods could not achieve a ratio less than 1.72 bits/base. PMID:21383923

  1. Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.

    PubMed

    Chen, Qingyu; Zobel, Justin; Zhang, Xiuzhen; Verspoor, Karin

    2016-01-01

    First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as can experts. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All Data are available as described in the supplementary material.

  2. Solving the problem of Trans-Genomic Query with alignment tables.

    PubMed

    Parker, Douglass Stott; Hsiao, Ruey-Lung; Xing, Yi; Resch, Alissa M; Lee, Christopher J

    2008-01-01

    The trans-genomic query (TGQ) problem--enabling the free query of biological information, even across genomes--is a central challenge facing bioinformatics. Solutions to this problem can alter the nature of the field, moving it beyond the jungle of data integration and expanding the number and scope of questions that can be answered. An alignment table is a binary relationship on locations (sequence segments). An important special case of alignment tables are hit tables: tables of pairs of highly similar segments produced by alignment tools like BLAST. However, alignment tables also include general binary relationships, and can represent any useful connection between sequence locations. They can be curated, and provide a high-quality queryable backbone of connections between biological information. Alignment tables thus can be a natural foundation for TGQ, as they permit a central part of the TGQ problem to be reduced to purely technical problems involving tables of locations. Key challenges in implementing alignment tables include efficient representation and indexing of sequence locations. We define a location datatype that can be incorporated naturally into common off-the-shelf database systems. We also describe an implementation of alignment tables in BLASTGRES, an extension of the open-source POSTGRESQL database system that provides indexing and operators on locations required for querying alignment tables. This paper also reviews several successful large-scale applications of alignment tables for Trans-Genomic Query. Tables with millions of alignments have been used in queries about alternative splicing, an area of genomic analysis concerning the way in which a single gene can yield multiple transcripts. Comparative genomics is a large potential application area for TGQ and alignment tables.
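
    The idea of an alignment table as a binary relationship on locations can be sketched in memory as follows. This is only an illustration of the data model: the `Location` fields, the half-open coordinate convention, and the `aligned_to` helper are hypothetical and are not the BLASTGRES datatype or its operators, and a real system would use an index rather than a linear scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """A sequence segment: 0-based start, exclusive end (assumed convention)."""
    seq_id: str
    start: int
    end: int

    def overlaps(self, other: "Location") -> bool:
        return (self.seq_id == other.seq_id
                and self.start < other.end
                and other.start < self.end)


def aligned_to(table, query: Location):
    """Return every location paired with a segment that overlaps `query`."""
    hits = []
    for left, right in table:           # each row is one (location, location) pair
        if left.overlaps(query):
            hits.append(right)
        if right.overlaps(query):
            hits.append(left)
    return hits


table = [(Location("chr1", 100, 200), Location("chr2", 500, 600))]
print(aligned_to(table, Location("chr1", 150, 160)))
```

    Treating the table as symmetric, so that a query on either side returns the partner, is a design choice made for this sketch.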

  3. [Ethical considerations in genomic cohort study].

    PubMed

    Choi, Eun Kyung; Kim, Ock-Joo

    2007-03-01

    During the last decade, genomic cohort studies have been developed in many countries by linking health data and genetic data in stored samples. Genomic cohort studies are expected to find key genetic components that contribute to common diseases, thereby promising great advances in genome medicine. While many countries endeavor to build biobank systems, biobank-based genome research has raised important ethical concerns including genetic privacy, confidentiality, discrimination, and informed consent. Informed consent for biobanks poses an important question: whether true informed consent is possible in population-based genomic cohort research where the nature of future studies is unforeseeable when consent is obtained. Due to the sensitive character of genetic information, protecting privacy and keeping confidentiality become important topics. To minimize ethical problems and achieve scientific goals to the fullest, each country strives to build its population-based genomic cohort research project by organizing public consultation, seeking public and expert consensus on research, and providing safeguards to protect privacy and confidentiality.

  4. [The investigation of genomes of some species of the genus Gentiana in nature and in vitro cell culture].

    PubMed

    Mel'nyk, V M; Spiridonova, K V; Andrieiev, I O; Strashniuk, N M; Kunakh, V A

    2002-01-01

    A comparative study of the genomes of intact plants representing some species of the genus Gentiana L., as well as of cultured cells of G. lutea and G. punctata, was performed using restriction analysis. Species specificity of restriction fragment patterns for the studied representatives of this genus was revealed. Differences were found between the electrophoretic patterns of digested DNA purified from the rhizome and leaves of G. lutea and G. punctata. Changes in the genomes of G. lutea and G. punctata cells cultured in vitro, compared with the genomes of intact plants, were detected. The data obtained indicate that some of these changes may be nonrandom.

  5. Sequencing and comparing whole mitochondrial genomes of animals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  6. Generation of Neo Octaploid Switchgrass

    USDA-ARS's Scientific Manuscript database

    Switchgrass (Panicum virgatum L.) exists as multiple cytotypes with octaploid and tetraploid populations occupying distinct, overlapping ranges. These cytotypes tend to show differences in adaptation, yield potential, and other characters, but the specific result of whole genome duplication is not ...

  7. Data Management Requirements for the Rapid Identification and Character of Unknown Genomic Samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rosenzweig, Nicole

    2010-06-02

    Nicole Rosenzweig of OptiMetrics discusses the development of informatics infrastructure for studying bacterial pathogens on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  8. Novel Insights into Tree Biology and Genome Evolution as Revealed Through Genomics.

    PubMed

    Neale, David B; Martínez-García, Pedro J; De La Torre, Amanda R; Montanari, Sara; Wei, Xiao-Xin

    2017-04-28

    Reference genome sequences are the key to the discovery of genes and gene families that determine traits of interest. Recent progress in sequencing technologies has enabled a rapid increase in genome sequencing of tree species, allowing the dissection of complex characters of economic importance, such as fruit and wood quality and resistance to biotic and abiotic stresses. Although the number of reference genome sequences for trees lags behind those for other plant species, it is not too early to gain insight into the unique features that distinguish trees from nontree plants. Our review of the published data suggests that, although many gene families are conserved among herbaceous and tree species, some gene families, such as those involved in resistance to biotic and abiotic stresses and in the synthesis and transport of sugars, are often expanded in tree genomes. As the genomes of more tree species are sequenced, comparative genomics will further elucidate the complexity of tree genomes and how this relates to traits unique to trees.

  9. Lophotrochozoan mitochondrial genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Valles, Yvonne; Boore, Jeffrey L.

    2005-10-01

    Progress in both molecular techniques and phylogenetic methods has challenged many of the interpretations of traditional taxonomy. One example is in the recognition of the animal superphylum Lophotrochozoa (annelids, mollusks, echiurans, platyhelminthes, brachiopods, and other phyla), although the relationships within this group and the inclusion of some phyla remain uncertain. While much of this progress in phylogenetic reconstruction has been based on comparing single gene sequences, we are beginning to see the potential of comparing large-scale features of genomes, such as the relative order of genes. Even though tremendous progress is being made on the sequence determination of whole nuclear genomes, the dataset of choice for genome-level characters for many animals across a broad taxonomic range remains mitochondrial genomes. We review here what is known about mitochondrial genomes of the lophotrochozoans and discuss the promise that this dataset will enable insight into their relationships.

  10. Delayed Ionization in Transition Metal Carbon Clusters

    NASA Astrophysics Data System (ADS)

    Kooi, S. E.; Castleman, A. W., Jr.

    1997-03-01

    Mass spectrometric studies of several single and binary transition metal carbon cluster systems, produced in a laser vaporization source, reveal several species that undergo delayed ionization. Pulsed extraction and blocking electric fields, in a time-of-flight mass spectrometer, allow the study of delayed ionization over a time window after excitation with a pulsed laser. In systems where metallocarbohedrenes (Met-Cars) are produced, the Met-Cars are the dominant delayed species. Delayed ionization of binary metal Met-Cars Ti_xM_yC_12 (M=Zr,Nb,Y; x+y=8) is dependent on the ratio of the two metals. Delayed behavior is investigated over a range of photoionization wavelengths and fluences. In order to determine the degree to which the delayed ionization is thermionic in character, the experimental data have been compared to Klots's model for thermionic emission from small particles.

  11. Binary zone-plate array for a parallel joint transform correlator applied to face recognition.

    PubMed

    Kodate, K; Hashimoto, A; Thapliya, R

    1999-05-10

    Taking advantage of small aberrations, high efficiency, and compactness, we developed a new, to our knowledge, design procedure for a binary zone-plate array (BZPA) and applied it to a parallel joint transform correlator for the recognition of the human face. Pairs of reference and unknown images of faces are displayed on a liquid-crystal spatial light modulator (SLM), Fourier transformed by the BZPA, intensity recorded on an optically addressable SLM, and inversely Fourier transformed to obtain correlation signals. Consideration of the bandwidth allows the relations among the channel number, the numerical aperture of the zone plates, and the pattern size to be determined. Experimentally a five-channel parallel correlator was implemented and tested successfully with a 100-person database. The design and the fabrication of a 20-channel BZPA for phonetic character recognition are also included.

  12. Testing the impact of morphological rate heterogeneity on ancestral state reconstruction of five floral traits in angiosperms.

    PubMed

    Reyes, Elisabeth; Nadot, Sophie; von Balthazar, Maria; Schönenberger, Jürg; Sauquet, Hervé

    2018-06-21

    Ancestral state reconstruction is an important tool to study morphological evolution and often involves estimating transition rates among character states. However, various factors, including taxonomic scale and sampling density, may impact transition rate estimation and indirectly also the probability of the state at a given node. Here, we test the influence of rate heterogeneity using maximum likelihood methods on five binary perianth characters, optimized on a phylogenetic tree of angiosperms including 1230 species sampled from all families. We compare the states reconstructed by an equal-rate (Mk1) and a two-rate model (Mk2) fitted either with a single set of rates for the whole tree or as a partitioned model, allowing for different rates on five partitions of the tree. We find strong signal for rate heterogeneity among the five subdivisions for all five characters, but little overall impact of the choice of model on reconstructed ancestral states, which indicates that most of our inferred ancestral states are the same whether heterogeneity is accounted for or not.

  13. Challenging a bioinformatic tool's ability to detect microbial contaminants using in silico whole genome sequencing data.

    PubMed

    Olson, Nathan D; Zook, Justin M; Morrow, Jayne B; Lin, Nancy J

    2017-01-01

    High sensitivity methods such as next generation sequencing and polymerase chain reaction (PCR) are adversely impacted by organismal and DNA contaminants. Current methods for detecting contaminants in microbial materials (genomic DNA and cultures) are not sensitive enough and require either a known or culturable contaminant. Whole genome sequencing (WGS) is a promising approach for detecting contaminants due to its sensitivity and lack of need for a priori assumptions about the contaminant. Prior to applying WGS, we must first understand its limitations for detecting contaminants and potential for false positives. Herein we demonstrate and characterize a WGS-based approach to detect organismal contaminants using an existing metagenomic taxonomic classification algorithm. Simulated WGS datasets from ten genera as individuals and binary mixtures of eight organisms at varying ratios were analyzed to evaluate the role of contaminant concentration and taxonomy on detection. For the individual genomes the false positive contaminants reported depended on the genus, with Staphylococcus, Escherichia, and Shigella having the highest proportion of false positives. For nearly all binary mixtures the contaminant was detected in the in-silico datasets at the equivalent of 1 in 1,000 cells, though F. tularensis was not detected in any of the simulated contaminant mixtures and Y. pestis was only detected at the equivalent of one in 10 cells. Once a WGS method for detecting contaminants is characterized, it can be applied to evaluate microbial material purity, in efforts to ensure that contaminants are characterized in microbial materials used to validate pathogen detection assays, generate genome assemblies for database submission, and benchmark sequencing methods.

  14. Identifying the Basal Angiosperm Node in Chloroplast Genome Phylogenies: Sampling One's Way Out of the Felsenstein Zone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leebens-Mack, Jim; Raubeson, Linda A.; Cui, Liying

    2005-05-27

    While there has been strong support for Amborella and Nymphaeales (water lilies) as branching from basal-most nodes in the angiosperm phylogeny, this hypothesis has recently been challenged by phylogenetic analyses of 61 protein-coding genes extracted from the chloroplast genome sequences of Amborella, Nymphaea and 12 other available land plant chloroplast genomes. These character-rich analyses placed the monocots, represented by three grasses (Poaceae), as sister to all other extant angiosperm lineages. We have extracted protein-coding regions from draft sequences for six additional chloroplast genomes to test whether this surprising result could be an artifact of long-branch attraction due to limited taxon sampling. The added taxa include three monocots (Acorus, Yucca and Typha), a water lily (Nuphar), a ranunculid (Ranunculus), and a gymnosperm (Ginkgo). Phylogenetic analyses of the expanded DNA and protein datasets together with microstructural characters (indels) provided unambiguous support for Amborella and the Nymphaeales as branching from the basal-most nodes in the angiosperm phylogeny. However, their relative positions proved to be dependent on method of analysis, with parsimony favoring Amborella as sister to all other angiosperms, and maximum likelihood and neighbor-joining methods favoring an Amborella + Nymphaeales clade as sister. The maximum likelihood phylogeny supported the latter hypothesis, but the likelihood for the former hypothesis was not significantly different. Parametric bootstrap analysis, single gene phylogenies, estimated divergence dates and conflicting indel characters all help to illuminate the nature of the conflict in resolution of the most basal nodes in the angiosperm phylogeny. Molecular dating analyses provided median age estimates of 161 mya for the most recent common ancestor of all extant angiosperms and 145 mya for the most recent common ancestor of monocots, magnoliids and eudicots. Whereas long sequences reduce variance in branch lengths and molecular dating estimates, the impact of improved taxon sampling on the rooting of the angiosperm phylogeny together with the results of parametric bootstrap analyses demonstrate how long-branch attraction can mislead genome-scale phylogenetic analyses.

  15. HIV-1 Full-Genome Phylogenetics of Generalized Epidemics in Sub-Saharan Africa: Impact of Missing Nucleotide Characters in Next-Generation Sequences

    PubMed Central

    Wymant, Chris; Colijn, Caroline; Danaviah, Siva; Essex, Max; Frost, Simon; Gall, Astrid; Gaseitsiwe, Simani; Grabowski, Mary K.; Gray, Ronald; Guindon, Stephane; von Haeseler, Arndt; Kaleebu, Pontiano; Kendall, Michelle; Kozlov, Alexey; Manasa, Justen; Minh, Bui Quang; Moyo, Sikhulile; Novitsky, Vlad; Nsubuga, Rebecca; Pillay, Sureshnee; Quinn, Thomas C.; Serwadda, David; Ssemwanga, Deogratius; Stamatakis, Alexandros; Trifinopoulos, Jana; Wawer, Maria; Brown, Andy Leigh; de Oliveira, Tulio; Kellam, Paul; Pillay, Deenan; Fraser, Christophe

    2017-01-01

    To characterize HIV-1 transmission dynamics in regions where the burden of HIV-1 is greatest, the “Phylogenetics and Networks for Generalised HIV Epidemics in Africa” consortium (PANGEA-HIV) is sequencing full-genome viral isolates from across sub-Saharan Africa. We report the first 3,985 PANGEA-HIV consensus sequences from four cohort sites (Rakai Community Cohort Study, n = 2,833; MRC/UVRI Uganda, n = 701; Mochudi Prevention Project, n = 359; Africa Health Research Institute Resistance Cohort, n = 92). Next-generation sequencing success rates varied: more than 80% of the viral genome from the gag to the nef genes could be determined for all sequences from South Africa, 75% of sequences from Mochudi, 60% of sequences from MRC/UVRI Uganda, and 22% of sequences from Rakai. Partial sequencing failure was primarily associated with low viral load, increased for amplicons closer to the 3′ end of the genome, was not associated with subtype diversity except HIV-1 subtype D, and remained significantly associated with sampling location after controlling for other factors. We assessed the impact of the missing data patterns in PANGEA-HIV sequences on phylogeny reconstruction in simulations. We found a threshold in terms of taxon sampling below which the patchy distribution of missing characters in next-generation sequences (NGS) has an excess negative impact on the accuracy of HIV-1 phylogeny reconstruction, which is attributable to tree reconstruction artifacts that accumulate when branches in viral trees are long. The large number of PANGEA-HIV sequences provides unprecedented opportunities for evaluating HIV-1 transmission dynamics across sub-Saharan Africa and identifying prevention opportunities. Molecular epidemiological analyses of these data must proceed cautiously because sequence sampling remains below the identified threshold and a considerable negative impact of missing characters on phylogeny reconstruction is expected. PMID:28540766

  16. HIV-1 full-genome phylogenetics of generalized epidemics in sub-Saharan Africa: impact of missing nucleotide characters in next-generation sequences.

    PubMed

    Ratmann, Oliver; Wymant, Chris; Colijn, Caroline; Danaviah, Siva; Essex, M; Frost, Simon D W; Gall, Astrid; Gaiseitsiwe, Simani; Grabowski, Mary; Gray, Ronald; Guindon, Stephane; von Haeseler, Arndt; Kaleebu, Pontiano; Kendall, Michelle; Kozlov, Alexey; Manasa, Justen; Minh, Bui Quang; Moyo, Sikhulile; Novitsky, Vladimir; Nsubuga, Rebecca; Pillay, Sureshnee; Quinn, Thomas C; Serwadda, David; Ssemwanga, Deogratius; Stamatakis, Alexandros; Trifinopoulos, Jana; Wawer, Maria; Leigh Brown, Andrew; de Oliveira, Tulio; Kellam, Paul; Pillay, Deenan; Fraser, Christophe

    2017-05-25

    To characterize HIV-1 transmission dynamics in regions where the burden of HIV-1 is greatest, the 'Phylogenetics and Networks for Generalised HIV Epidemics in Africa' consortium (PANGEA-HIV) is sequencing full-genome viral isolates from across sub-Saharan Africa. We report the first 3,985 PANGEA-HIV consensus sequences from four cohort sites (Rakai Community Cohort Study, n=2,833; MRC/UVRI Uganda, n=701; Mochudi Prevention Project, n=359; Africa Health Research Institute Resistance Cohort, n=92). Next-generation sequencing success rates varied: more than 80% of the viral genome from the gag to the nef genes could be determined for all sequences from South Africa, 75% of sequences from Mochudi, 60% of sequences from MRC/UVRI Uganda, and 22% of sequences from Rakai. Partial sequencing failure was primarily associated with low viral load, increased for amplicons closer to the 3' end of the genome, was not associated with subtype diversity except HIV-1 subtype D, and remained significantly associated with sampling location after controlling for other factors. We assessed the impact of the missing data patterns in PANGEA-HIV sequences on phylogeny reconstruction in simulations. We found a threshold in terms of taxon sampling below which the patchy distribution of missing characters in next-generation sequences has an excess negative impact on the accuracy of HIV-1 phylogeny reconstruction, which is attributable to tree reconstruction artifacts that accumulate when branches in viral trees are long. The large number of PANGEA-HIV sequences provides unprecedented opportunities for evaluating HIV-1 transmission dynamics across sub-Saharan Africa and identifying prevention opportunities. Molecular epidemiological analyses of these data must proceed cautiously because sequence sampling remains below the identified threshold and a considerable negative impact of missing characters on phylogeny reconstruction is expected.

  17. Genomic signals of migration and continuity in Britain before the Anglo-Saxons.

    PubMed

    Martiniano, Rui; Caffell, Anwen; Holst, Malin; Hunter-Mann, Kurt; Montgomery, Janet; Müldner, Gundula; McLaughlin, Russell L; Teasdale, Matthew D; van Rheenen, Wouter; Veldink, Jan H; van den Berg, Leonard H; Hardiman, Orla; Carroll, Maureen; Roskams, Steve; Oxley, John; Morgan, Colleen; Thomas, Mark G; Barnes, Ian; McDonnell, Christine; Collins, Matthew J; Bradley, Daniel G

    2016-01-19

    The purported migrations that have formed the peoples of Britain have been the focus of generations of scholarly controversy. However, this has not benefited from direct analyses of ancient genomes. Here we report nine ancient genomes (∼ 1 ×) of individuals from northern Britain: seven from a Roman era York cemetery, bookended by earlier Iron-Age and later Anglo-Saxon burials. Six of the Roman genomes show affinity with modern British Celtic populations, particularly Welsh, but significantly diverge from populations from Yorkshire and other eastern English samples. They also show similarity with the earlier Iron-Age genome, suggesting population continuity, but differ from the later Anglo-Saxon genome. This pattern concords with profound impact of migrations in the Anglo-Saxon period. Strikingly, one Roman skeleton shows a clear signal of exogenous origin, with affinities pointing towards the Middle East, confirming the cosmopolitan character of the Empire, even at its northernmost fringes.

  18. Genomic signals of migration and continuity in Britain before the Anglo-Saxons

    PubMed Central

    Martiniano, Rui; Caffell, Anwen; Holst, Malin; Hunter-Mann, Kurt; Montgomery, Janet; Müldner, Gundula; McLaughlin, Russell L.; Teasdale, Matthew D.; van Rheenen, Wouter; Veldink, Jan H.; van den Berg, Leonard H.; Hardiman, Orla; Carroll, Maureen; Roskams, Steve; Oxley, John; Morgan, Colleen; Thomas, Mark G.; Barnes, Ian; McDonnell, Christine; Collins, Matthew J.; Bradley, Daniel G.

    2016-01-01

    The purported migrations that have formed the peoples of Britain have been the focus of generations of scholarly controversy. However, this has not benefited from direct analyses of ancient genomes. Here we report nine ancient genomes (∼1 ×) of individuals from northern Britain: seven from a Roman era York cemetery, bookended by earlier Iron-Age and later Anglo-Saxon burials. Six of the Roman genomes show affinity with modern British Celtic populations, particularly Welsh, but significantly diverge from populations from Yorkshire and other eastern English samples. They also show similarity with the earlier Iron-Age genome, suggesting population continuity, but differ from the later Anglo-Saxon genome. This pattern concords with profound impact of migrations in the Anglo-Saxon period. Strikingly, one Roman skeleton shows a clear signal of exogenous origin, with affinities pointing towards the Middle East, confirming the cosmopolitan character of the Empire, even at its northernmost fringes. PMID:26783717

  19. A nuclear DNA perspective on delineating evolutionarily significant lineages in polyploids: the case of the endangered shortnose sturgeon (Acipenser brevirostrum)

    USGS Publications Warehouse

    King, Timothy L.; Henderson, Anne P.; Kynard, Boyd E.; Kieffer, Micah C.; Peterson, Douglas L.; Aunins, Aaron W.; Brown, Bonnie L.

    2014-01-01

    The shortnose sturgeon, Acipenser brevirostrum, oft considered a phylogenetic relic, is listed as an “endangered species threatened with extinction” in the US and “Vulnerable” on the IUCN Red List. Effective conservation of A. brevirostrum depends on understanding its diversity and evolutionary processes, yet challenges associated with the polyploid nature of its nuclear genome have heretofore limited population genetic analysis to maternally inherited haploid characters. We developed a suite of polysomic microsatellite DNA markers and characterized a sample of 561 shortnose sturgeon collected from major extant populations along the North American Atlantic coast. The 181 alleles observed at 11 loci were scored as binary loci and the data were subjected to multivariate ordination, Bayesian clustering, hierarchical partitioning of variance, and among-population distance metric tests. The methods uncovered moderately high levels of gene diversity suggesting population structuring across and within three metapopulations (Northeast, Mid-Atlantic, and Southeast) that encompass seven demographically discrete and evolutionarily distinct lineages. The predicted groups are consistent with previously described behavioral patterns, especially dispersal and migration, supporting the interpretation that A. brevirostrum exhibit adaptive differences based on watershed. Combined with results of prior genetic (mitochondrial DNA) and behavioral studies, the current work suggests that dispersal is an important factor in maintaining genetic diversity in A. brevirostrum and that the basic unit for conservation management is arguably the local population.

  20. A nuclear DNA perspective on delineating evolutionarily significant lineages in polyploids: the case of the endangered shortnose sturgeon (Acipenser brevirostrum).

    PubMed

    King, Tim L; Henderson, Anne P; Kynard, Boyd E; Kieffer, Micah C; Peterson, Douglas L; Aunins, Aaron W; Brown, Bonnie L

    2014-01-01

    The shortnose sturgeon, Acipenser brevirostrum, oft considered a phylogenetic relic, is listed as an "endangered species threatened with extinction" in the US and "Vulnerable" on the IUCN Red List. Effective conservation of A. brevirostrum depends on understanding its diversity and evolutionary processes, yet challenges associated with the polyploid nature of its nuclear genome have heretofore limited population genetic analysis to maternally inherited haploid characters. We developed a suite of polysomic microsatellite DNA markers and characterized a sample of 561 shortnose sturgeon collected from major extant populations along the North American Atlantic coast. The 181 alleles observed at 11 loci were scored as binary loci and the data were subjected to multivariate ordination, Bayesian clustering, hierarchical partitioning of variance, and among-population distance metric tests. The methods uncovered moderately high levels of gene diversity suggesting population structuring across and within three metapopulations (Northeast, Mid-Atlantic, and Southeast) that encompass seven demographically discrete and evolutionarily distinct lineages. The predicted groups are consistent with previously described behavioral patterns, especially dispersal and migration, supporting the interpretation that A. brevirostrum exhibit adaptive differences based on watershed. Combined with results of prior genetic (mitochondrial DNA) and behavioral studies, the current work suggests that dispersal is an important factor in maintaining genetic diversity in A. brevirostrum and that the basic unit for conservation management is arguably the local population.

  1. NetNorM: Capturing cancer-relevant information in somatic exome mutation data with gene networks for cancer stratification and prognosis.

    PubMed

    Le Morvan, Marine; Zinovyev, Andrei; Vert, Jean-Philippe

    2017-06-01

    Genome-wide somatic mutation profiles of tumours can now be assessed efficiently and promise to move precision medicine forward. Statistical analysis of mutation profiles is however challenging due to the low frequency of most mutations, the varying mutation rates across tumours, and the presence of a majority of passenger events that hide the contribution of driver events. Here we propose a method, NetNorM, to represent whole-exome somatic mutation data in a form that enhances cancer-relevant information using a gene network as background knowledge. We evaluate its relevance for two tasks: survival prediction and unsupervised patient stratification. Using data from 8 cancer types from The Cancer Genome Atlas (TCGA), we show that it improves over the raw binary mutation data and network diffusion for these two tasks. In doing so, we also provide a thorough assessment of somatic mutations prognostic power which has been overlooked by previous studies because of the sparse and binary nature of mutations.

  2. NetNorM: Capturing cancer-relevant information in somatic exome mutation data with gene networks for cancer stratification and prognosis

    PubMed Central

    2017-01-01

    Genome-wide somatic mutation profiles of tumours can now be assessed efficiently and promise to move precision medicine forward. Statistical analysis of mutation profiles is however challenging due to the low frequency of most mutations, the varying mutation rates across tumours, and the presence of a majority of passenger events that hide the contribution of driver events. Here we propose a method, NetNorM, to represent whole-exome somatic mutation data in a form that enhances cancer-relevant information using a gene network as background knowledge. We evaluate its relevance for two tasks: survival prediction and unsupervised patient stratification. Using data from 8 cancer types from The Cancer Genome Atlas (TCGA), we show that it improves over the raw binary mutation data and network diffusion for these two tasks. In doing so, we also provide a thorough assessment of somatic mutations prognostic power which has been overlooked by previous studies because of the sparse and binary nature of mutations. PMID:28650955

  3. The thermochemical characteristics of solution of phenol and benzoic acid in water-dimethylsulfoxide and water-acetonitrile mixtures

    NASA Astrophysics Data System (ADS)

    Zakharov, A. G.; Voronova, M. I.; Batov, D. V.; Smirnova, K. V.

    2011-03-01

    The solution of phenol and benzoic acid in water-dimethylsulfoxide (DMSO) and water-acetonitrile (AN) mixtures was studied. As distinct from benzoic acid, the thermodynamic characteristics of solution of phenol sharply change at concentrations corresponding to a change in the character of cluster formation in water-DMSO and water-AN mixtures. Differences in the solvation of phenol and benzoic acid are explained by different mechanisms of the interaction of the solutes with clusters existing in binary mixtures.

  4. Comparison of gene expression in segregating families identifies genes and genomic regions involved in a novel adaptation, zinc hyperaccumulation.

    PubMed

    Filatov, Victor; Dowdle, John; Smirnoff, Nicholas; Ford-Lloyd, Brian; Newbury, H John; Macnair, Mark R

    2006-09-01

    One of the challenges of comparative genomics is to identify specific genetic changes associated with the evolution of a novel adaptation or trait. We need to be able to disassociate the genes involved with a particular character from all the other genetic changes that take place as lineages diverge. Here we show that by comparing the transcriptional profile of segregating families with that of parent species differing in a novel trait, it is possible to narrow down substantially the list of potential target genes. In addition, by assuming synteny with a related model organism for which the complete genome sequence is available, it is possible to use the cosegregation of markers differing in transcription level to identify regions of the genome which probably contain quantitative trait loci (QTLs) for the character. This novel combination of genomics and classical genetics provides a very powerful tool to identify candidate genes. We use this methodology to investigate zinc hyperaccumulation in Arabidopsis halleri, the sister species to the model plant, Arabidopsis thaliana. We compare the transcriptional profile of A. halleri with that of its sister nonaccumulator species, Arabidopsis petraea, and between accumulator and nonaccumulator F(3)s derived from the cross between the two species. We identify eight genes which consistently show greater expression in accumulator phenotypes in both roots and shoots, including two metal transporter genes (NRAMP3 and ZIP6), and cytoplasmic aconitase, a gene involved in iron homeostasis in mammals. We also show that there appear to be two QTLs for zinc accumulation, on chromosomes 3 and 7.

  5. The effects of temperament, character, and defense mechanisms on grief severity among the elderly.

    PubMed

    Gana, Kamel; K'Delant, Pascaline

    2011-01-01

    The aims of this study were to examine the relationships between Cloninger's psychobiological model of personality, defense styles, and severity of grief, and to identify the influential temperament and character dimensions that differentiate subjects with prolonged grief from those without prolonged grief. A sample of 72 bereaved elderly persons for whom the loss of a loved one occurred on average 2.58 years (SD = 1.92) prior to participation in this study were assessed using the Inventory of Complicated Grief-Revised, the Temperament and Character Inventory, and the Defense Styles Questionnaire. Using the algorithm developed by Prigerson et al. (2009) for diagnosing prolonged grief, 18 of our participants were identified as having this disorder. A multiple regression analysis revealed that time since loss, persistence, an immature defense style, and the age of the bereaved person positively predicted severity of grief, whereas cooperativeness and the age of the deceased loved one negatively predicted severity of grief. A binary logistic regression showed that gender, a close kinship relation to the deceased, time since loss, self-directedness (SD), and self-transcendence (ST) were predictors of prolonged grief, whereas the age of the deceased and cooperativeness (CO) were negatively related to prolonged grief. Our sample was small. Self-report measures of grief were not supplemented with clinical evaluation. Our results suggest that only character dimensions (high SD and ST, and low CO) are involved in the psychopathology of prolonged grief. Also, according to Cloninger's character cube (Cloninger, 2004), high SD and ST scores, and low CO scores are indicative of a fanatical character. Copyright © 2010 Elsevier B.V. All rights reserved.

  6. Guidance for the utility of linear models in meta-analysis of genetic association studies of binary phenotypes.

    PubMed

    Cook, James P; Mahajan, Anubha; Morris, Andrew P

    2017-02-01

    Linear mixed models are increasingly used for the analysis of genome-wide association studies (GWAS) of binary phenotypes because they can efficiently and robustly account for population stratification and relatedness through inclusion of random effects for a genetic relationship matrix. However, the utility of linear (mixed) models in the context of meta-analysis of GWAS of binary phenotypes has not been previously explored. In this investigation, we present simulations to compare the performance of linear and logistic regression models under alternative weighting schemes in a fixed-effects meta-analysis framework, considering designs that incorporate variable case-control imbalance, confounding factors and population stratification. Our results demonstrate that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case-control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale. Our conclusions thus provide essential recommendations for the development of robust protocols for meta-analysis of binary phenotypes with linear models.
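    The two weighting schemes named in this abstract can be sketched in a few lines. This is a minimal illustration using the standard textbook formulas for sample-size-weighted and inverse-variance fixed-effects meta-analysis, not the authors' code; the function names are hypothetical, and the effective sample size is assumed here to be 4 / (1/N_cases + 1/N_controls). The conversion of linear-model effects onto the log-odds scale is not shown, and the effect sizes passed to the second function are assumed to be on that scale already.

```python
import math


def effective_sample_size(n_cases, n_controls):
    """Effective sample size of a case-control study (assumed definition)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def z_score_meta(z_scores, n_eff):
    """Combine per-study Z-scores, weighting each by sqrt(effective N)."""
    weights = [math.sqrt(n) for n in n_eff]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def inverse_variance_meta(betas, standard_errors):
    """Combine per-study effect sizes, weighting each by 1 / SE^2.

    Returns the pooled effect size and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se
```

    Under the assumed definition, a study with extreme case-control imbalance has an effective sample size much smaller than its total sample size, so it contributes less weight to the combined Z-score than a balanced study of the same size.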

  7. On the influence of tetrahedral covalent-hybridization on electronic band structure of topological insulators from first principles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, X. M.; Xu, G. Z.; Liu, E. K.

    Based on first-principles calculations, we investigate the influence of tetrahedral covalent-hybridization between main-group and transition-metal atoms on the topological band structures of binary HgTe and ternary half-Heusler compounds, respectively. Results show that, for the binary HgTe, when its zinc-blende structure is artificially changed to a rock-salt one, the tetrahedral covalent-hybridization is removed and the topologically insulating band character is correspondingly lost. For the ternary half-Heusler system, the strength of covalent-hybridization can be tuned by varying both chemical compositions and atomic arrangements, and the competition between tetrahedral and octahedral covalent-hybridization is discussed in detail. As a result, we found that a proper strength of tetrahedral covalent-hybridization probably favors realizing the topologically insulating state with band inversion occurring at the Γ point of the Brillouin zone.

  8. Challenging a bioinformatic tool’s ability to detect microbial contaminants using in silico whole genome sequencing data

    PubMed Central

    Zook, Justin M.; Morrow, Jayne B.; Lin, Nancy J.

    2017-01-01

    High sensitivity methods such as next generation sequencing and polymerase chain reaction (PCR) are adversely impacted by organismal and DNA contaminants. Current methods for detecting contaminants in microbial materials (genomic DNA and cultures) are not sensitive enough and require either a known or culturable contaminant. Whole genome sequencing (WGS) is a promising approach for detecting contaminants due to its sensitivity and lack of need for a priori assumptions about the contaminant. Prior to applying WGS, we must first understand its limitations for detecting contaminants and potential for false positives. Herein we demonstrate and characterize a WGS-based approach to detect organismal contaminants using an existing metagenomic taxonomic classification algorithm. Simulated WGS datasets from ten genera as individuals and binary mixtures of eight organisms at varying ratios were analyzed to evaluate the role of contaminant concentration and taxonomy on detection. For the individual genomes the false positive contaminants reported depended on the genus, with Staphylococcus, Escherichia, and Shigella having the highest proportion of false positives. For nearly all binary mixtures the contaminant was detected in the in-silico datasets at the equivalent of 1 in 1,000 cells, though F. tularensis was not detected in any of the simulated contaminant mixtures and Y. pestis was only detected at the equivalent of one in 10 cells. Once a WGS method for detecting contaminants is characterized, it can be applied to evaluate microbial material purity, in efforts to ensure that contaminants are characterized in microbial materials used to validate pathogen detection assays, generate genome assemblies for database submission, and benchmark sequencing methods. PMID:28924496

  9. Repeated divergent selection on pigmentation genes in a rapid finch radiation

    PubMed Central

    Campagna, Leonardo; Repenning, Márcio; Silveira, Luís Fábio; Fontana, Carla Suertegaray; Tubaro, Pablo L.; Lovette, Irby J.

    2017-01-01

    Instances of recent and rapid speciation are suitable for associating phenotypes with their causal genotypes, especially if gene flow homogenizes areas of the genome that are not under divergent selection. We study a rapid radiation of nine sympatric bird species known as capuchino seedeaters, which are differentiated in sexually selected characters of male plumage and song. We sequenced the genomes of a phenotypically diverse set of species to search for differentiated genomic regions. Capuchinos show differences in a small proportion of their genomes, yet selection has acted independently on the same targets in different members of this radiation. Many divergent regions contain genes involved in the melanogenesis pathway, with the strongest signal originating from putative regulatory regions. Selection has acted on these same genomic regions in different lineages, likely shaping the evolution of cis-regulatory elements, which control how more conserved genes are expressed and thereby generate diversity in classically sexually selected traits. PMID:28560331

  10. Java implementation of Class Association Rule algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tamura, Makio

    2007-08-30

    Java implementation of three Class Association Rule mining algorithms: NETCAR, CARapriori, and clustering-based rule mining. The NETCAR algorithm is a novel algorithm developed by Makio Tamura. The algorithm is discussed in a paper, UCRL-JRNL-232466-DRAFT, which would be published in a peer-reviewed scientific journal. The software is used to extract combinations of genes relevant to a phenotype from a phylogenetic profile and a phenotype profile. The phylogenetic profile is represented by a binary matrix and the phenotype profile by a binary vector. The present application of this software will be in genome analysis; however, it could be applied more generally.
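
    The record above describes mining gene combinations associated with a phenotype from a binary phylogenetic profile matrix and a binary phenotype vector. The following is a minimal sketch of generic class association rule scoring (support and confidence) on such data. It is not the NETCAR or CARapriori algorithm, whose details are not given in the record; the function name, thresholds, and toy data are all assumptions made for illustration.

```python
from itertools import combinations


def class_association_rules(profile, phenotype, max_size=2,
                            min_support=0.3, min_confidence=0.8):
    """Score rules of the form {genes present} -> phenotype = 1.

    profile   : list of rows, one per genome; each row is a list of 0/1
                gene-presence values (hypothetical layout).
    phenotype : list of 0/1 values, one per genome.
    Returns a list of (gene_indices, support, confidence) tuples.
    """
    n_genomes = len(profile)
    n_genes = len(profile[0])
    rules = []
    for size in range(1, max_size + 1):
        for genes in combinations(range(n_genes), size):
            # Genomes that carry every gene in the candidate combination.
            carriers = [i for i in range(n_genomes)
                        if all(profile[i][g] for g in genes)]
            if not carriers:
                continue
            # Carriers that also show the phenotype.
            both = sum(phenotype[i] for i in carriers)
            support = both / n_genomes          # P(genes and phenotype)
            confidence = both / len(carriers)   # P(phenotype | genes)
            if support >= min_support and confidence >= min_confidence:
                rules.append((genes, support, confidence))
    return rules


# Toy data (invented): 5 genomes x 3 genes, plus a phenotype per genome.
profile = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
phenotype = [1, 1, 0, 0, 1]

rules = class_association_rules(profile, phenotype)
print(rules)
```

    In this toy data neither gene 0 nor gene 1 alone passes the confidence threshold, but the pair (0, 1) does, which is the kind of combination effect that rule mining over a binary profile is meant to surface. The exhaustive enumeration here grows combinatorially with the number of genes, so it is only suitable for small illustrations.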

  11. A universe of dwarfs and giants: genome size and chromosome evolution in the monocot family Melanthiaceae.

    PubMed

    Pellicer, Jaume; Kelly, Laura J; Leitch, Ilia J; Zomlefer, Wendy B; Fay, Michael F

    2014-03-01

    • Since the occurrence of giant genomes in angiosperms is restricted to just a few lineages, identifying where shifts towards genome obesity have occurred is essential for understanding the evolutionary mechanisms triggering this process. • Genome sizes were assessed using flow cytometry in 79 species and new chromosome numbers were obtained. Phylogenetically based statistical methods were applied to infer ancestral character reconstructions of chromosome numbers and nuclear DNA contents. • Melanthiaceae are the most diverse family in terms of genome size, with C-values ranging more than 230-fold. Our data confirmed that giant genomes are restricted to tribe Parideae, with most extant species in the family characterized by small genomes. Ancestral genome size reconstruction revealed that the most recent common ancestor (MRCA) for the family had a relatively small genome (1C = 5.37 pg). Chromosome losses and polyploidy are recovered as the main evolutionary mechanisms generating chromosome number change. • Genome evolution in Melanthiaceae has been characterized by a trend towards genome size reduction, with just one episode of dramatic DNA accumulation in Parideae. Such extreme contrasting profiles of genome size evolution illustrate the key role of transposable elements and chromosome rearrangements in driving the evolution of plant genomes. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.

  12. Genomic Repeat Abundances Contain Phylogenetic Signal

    PubMed Central

    Dodsworth, Steven; Chase, Mark W.; Kelly, Laura J.; Leitch, Ilia J.; Macas, Jiří; Novák, Petr; Piednoël, Mathieu; Weiss-Schneeweiss, Hanna; Leitch, Andrew R.

    2015-01-01

    A large proportion of genomic information, particularly repetitive elements, is usually ignored when researchers are using next-generation sequencing. Here we demonstrate the usefulness of this repetitive fraction in phylogenetic analyses, utilizing comparative graph-based clustering of next-generation sequence reads, which results in abundance estimates of different classes of genomic repeats. Phylogenetic trees are then inferred based on the genome-wide abundance of different repeat types treated as continuously varying characters; such repeats are scattered across chromosomes and in angiosperms can constitute a majority of nuclear genomic DNA. In six diverse examples, five angiosperms and one insect, this method provides generally well-supported relationships at interspecific and intergeneric levels that agree with results from more standard phylogenetic analyses of commonly used markers. We propose that this methodology may prove especially useful in groups where there is little genetic differentiation in standard phylogenetic markers. At the same time as providing data for phylogenetic inference, this method additionally yields a wealth of data for comparative studies of genome evolution. PMID:25261464

  13. [Influence of mobile phase composition on chiral separation of organic selenium racemates].

    PubMed

    Han, Xiao-qian; Qi, Bang-feng; Dun, Hui-juan; Zhu, Xin-yi; Na, Peng-jun; Jiang, Sheng-xiang; Chen, Li-ren

    2002-05-01

    The chiral separation of some chiral compounds with similar structure on the cellulose tris (3,5-dimethylphenylcarbamate) chiral stationary phase prepared by us was obtained. Ternary mobile phases influencing chiral recognition were investigated. A mode of interaction between the structural character of samples and chiral stationary phase is discussed. The results indicated that the retention and chiral separation of the analytes had a bigger change with minute addition of alcohols or acetonitrile as modifier in n-hexane/2-propanol (80/20, volume ratio) binary mobile phase.

  14. Recent advances in phytoplasma research: from genetic diversity and genome evolution to pathogenic redirection of plant stem cell fate

    USDA-ARS's Scientific Manuscript database

    Parasitizing phloem sieve cells and being transmitted by insects, phytoplasmas are a unique group of cell wall-less bacteria responsible for numerous plant diseases worldwide. Due to difficulties in establishing axenic culture of phytoplasmas, phenotypic characters suitable for conventional microbia...

  15. Stochastic model search with binary outcomes for genome-wide association studies.

    PubMed

    Russu, Alberto; Malovini, Alberto; Puca, Annibale A; Bellazzi, Riccardo

    2012-06-01

    The spread of case-control genome-wide association studies (GWASs) has stimulated the development of new variable selection methods and predictive models. We introduce a novel Bayesian model search algorithm, Binary Outcome Stochastic Search (BOSS), which addresses the model selection problem when the number of predictors far exceeds the number of binary responses. Our method is based on a latent variable model that links the observed outcomes to the underlying genetic variables. A Markov Chain Monte Carlo approach is used for model search and to evaluate the posterior probability of each predictor. BOSS is compared with three established methods (stepwise regression, logistic lasso, and elastic net) in a simulated benchmark. Two real case studies are also investigated: a GWAS on the genetic bases of longevity, and the type 2 diabetes study from the Wellcome Trust Case Control Consortium. Simulations show that BOSS achieves higher precisions than the reference methods while preserving good recall rates. In both experimental studies, BOSS successfully detects genetic polymorphisms previously reported to be associated with the analyzed phenotypes. BOSS outperforms the other methods in terms of F-measure on simulated data. In the two real studies, BOSS successfully detects biologically relevant features, some of which are missed by univariate analysis and the three reference techniques. The proposed algorithm is an advance in the methodology for model selection with a large number of features. Our simulated and experimental results showed that BOSS proves effective in detecting relevant markers while providing a parsimonious model.
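
    The record above compares methods by precision, recall, and F-measure on simulated data. As a hedged illustration of how these three quantities relate when a method returns a set of selected markers and the truly associated markers are known, a minimal sketch follows. The marker names and the helper function are hypothetical and are not taken from the BOSS paper.

```python
def precision_recall_f(selected, truth):
    """Return (precision, recall, F-measure) for a set of selected markers.

    selected : iterable of marker identifiers chosen by a method.
    truth    : iterable of marker identifiers that are truly associated
               (known only in a simulated benchmark).
    """
    selected, truth = set(selected), set(truth)
    true_positives = len(selected & truth)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Invented example: 4 markers selected, 4 truly associated, 2 in common.
selected = ["snp_a", "snp_b", "snp_c", "snp_d"]
truth = ["snp_a", "snp_b", "snp_e", "snp_f"]

result = precision_recall_f(selected, truth)
print(result)
```

    Because the F-measure is a harmonic mean, a method that gains precision by selecting very few markers is penalized if its recall collapses, which is why the abstract reports precision and recall alongside it.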

  16. Quantitative Tracking of Combinatorially Engineered Populations with Multiplexed Binary Assemblies.

    PubMed

    Zeitoun, Ramsey I; Pines, Gur; Grau, William C; Gill, Ryan T

    2017-04-21

    Advances in synthetic biology and genomics have enabled full-scale genome engineering efforts on laboratory time scales. However, the absence of sufficient approaches for mapping engineered genomes at system-wide scales onto performance has limited the adoption of more sophisticated algorithms for engineering complex biological systems. Here we report on the development and application of a robust approach to quantitatively map combinatorially engineered populations at scales up to several dozen target sites. This approach works by assembling genome engineered sites with cell-specific barcodes into a format compatible with high-throughput sequencing technologies. This approach, called barcoded-TRACE (bTRACE) was applied to assess E. coli populations engineered by recursive multiplex recombineering across both 6-target sites and 31-target sites. The 31-target library was then tracked throughout growth selections in the presence and absence of isopentenol (a potential next-generation biofuel). We also use the resolution of bTRACE to compare the influence of technical and biological noise on genome engineering efforts.

  17. How might flukes and tapeworms maintain genome integrity without a canonical piRNA pathway?

    PubMed Central

    Skinner, Danielle E.; Rinaldi, Gabriel; Koziol, Uriel; Brehm, Klaus; Brindley, Paul J.

    2014-01-01

    Surveillance by RNA interference is central to controlling the mobilization of transposable elements (TEs). In stem cells, Piwi argonaute (Ago) proteins and associated proteins repress mobilization of TEs to maintain genome integrity. This defense mechanism targeting TEs is termed the Piwi-interacting RNA (Piwi-piRNA) pathway. In this Opinion, we draw attention to the situation that the genomes of cestodes and trematodes have lost the piwi and vasa genes that are hallmark characters of the germline multipotency program. This absence of Piwi-like Agos and Vasa helicases prompts the question: how does the germline of these flatworms withstand mobilization of TEs? Here we present an interpretation of mechanisms likely to defend the germline integrity of parasitic flatworms. PMID:24485046

  18. [Analysis of horizontal transfer gene of Bombyx mori NPV].

    PubMed

    Duan, Hai-Rong; Qiu, De-Bin; Gong, Cheng-Liang; Huang, Mo-Li

    2011-06-01

    For research on genetic characters and evolutionary origin of the genome of baculoviruses, a comprehensive homology search and phylogenetic analysis of the complete genomes of Bombyx mori NPV and Bombyx mori were used. Three horizontally transferred genes (inhibitor of apoptosis, chitinase, and UDP-glucosyltransferase) were identified, and there was evidence that all of these genes were derived from the insect host. The analysis showed many differences between the features of the horizontally transferred genes and those of the whole set of genomic genes, such as nucleotide composition, codon usage bias, and selection pressure. These results reconfirmed that the horizontally transferred genes are exogenous. The analysis of gene function suggested that horizontally transferred genes acquired from an ancestral host insect can increase the efficiency of baculovirus transmission.

  19. Variable Mixed Orbital Character in the Photoelectron Angular Distribution of NO_{2}

    NASA Astrophysics Data System (ADS)

    Laws, Benjamin A.; Cavanagh, Steven J.; Lewis, Brenton R.; Gibson, Stephen T.

    2017-06-01

    NO_{2}, a key component of photochemical smog and an important species in the Earth's atmosphere, is an example of a molecule which exhibits significant mixed orbital character in the HOMO. In photoelectron experiments the geometric properties of the parent anion orbital are reflected in the photoelectron angular distribution (PAD), an area of research that has benefited largely from the ability of velocity-map imaging (VMI) to simultaneously record both the energetic and angular information, with 100% collection efficiency. Photoelectron spectra of NO_{2}^{-}, taken over a range of wavelengths (355nm-520nm) with the ANU's VMI spectrometer, reveal an anomalous jump in the anisotropy parameter near threshold. Consequently, the orbital behavior of NO_{2}^{-} appears to be quite different near threshold compared to detachment at higher photon energies. This surprising effect is due to the Wigner Threshold law, which causes p orbital character to dominate the photodetachment cross-section near threshold, before the mixed s/d orbital character becomes significant at higher electron kinetic energies. By extending recent work on binary character models to form a more general expression, the variable mixed orbital character of NO_{2}^{-} is able to be described. This study provides the first multi-wavelength NO_{2} anisotropy data, which is shown to be in decent agreement with much earlier zero-core model predictions of the anisotropy parameter. K. J. Reed, A. H. Zimmerman, H. C. Andersen, and J. I. Brauman, J. Chem. Phys. 64, 1368, (1976). doi:10.1063/1.432404 D. Khuseynov, C. C. Blackstone, L. M. Culberson, and A. Sanov, J. Chem. Phys. 141, 124312, (2014). doi:10.1063/1.4896241 W. B. Clodius, R. M. Stehman, and S. B. Woo, Phys. Rev. A. 28, 760, (1983). doi:10.1103/PhysRevA.28.760 Research supported by the Australian Research Council Discovery Project Grant DP160102585

  20. From algae to angiosperms–inferring the phylogeny of green plants (Viridiplantae) from 360 plastid genomes

    PubMed Central

    2014-01-01

    Background Next-generation sequencing has provided a wealth of plastid genome sequence data from an increasingly diverse set of green plants (Viridiplantae). Although these data have helped resolve the phylogeny of numerous clades (e.g., green algae, angiosperms, and gymnosperms), their utility for inferring relationships across all green plants is uncertain. Viridiplantae originated 700-1500 million years ago and may comprise as many as 500,000 species. This clade represents a major source of photosynthetic carbon and contains an immense diversity of life forms, including some of the smallest and largest eukaryotes. Here we explore the limits and challenges of inferring a comprehensive green plant phylogeny from available complete or nearly complete plastid genome sequence data. Results We assembled protein-coding sequence data for 78 genes from 360 diverse green plant taxa with complete or nearly complete plastid genome sequences available from GenBank. Phylogenetic analyses of the plastid data recovered well-supported backbone relationships and strong support for relationships that were not observed in previous analyses of major subclades within Viridiplantae. However, there also is evidence of systematic error in some analyses. In several instances we obtained strongly supported but conflicting topologies from analyses of nucleotides versus amino acid characters, and the considerable variation in GC content among lineages and within single genomes affected the phylogenetic placement of several taxa. Conclusions Analyses of the plastid sequence data recovered a strongly supported framework of relationships for green plants. 
    This framework includes: i) the placement of Zygnematophyceae as sister to land plants (Embryophyta), ii) a clade of extant gymnosperms (Acrogymnospermae) with cycads + Ginkgo sister to remaining extant gymnosperms and with gnetophytes (Gnetophyta) sister to non-Pinaceae conifers (Gnecup trees), and iii) within the monilophyte clade (Monilophyta), Equisetales + Psilotales are sister to Marattiales + leptosporangiate ferns. Our analyses also highlight the challenges of using plastid genome sequences in deep-level phylogenomic analyses, and we provide suggestions for future analyses that will likely incorporate plastid genome sequence data for thousands of species. We particularly emphasize the importance of exploring the effects of different partitioning and character coding strategies. PMID:24533922

  1. Construction of a plant-transformation-competent BIBAC library and genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.)

    USDA-ARS's Scientific Manuscript database

    Cotton is one of the world’s leading crops, important to the textile and energy industries, and a model species for studies of plant polyploidization, cellulose biosynthesis and cell wall biogenesis. Here, we report the construction and extensive analysis of a binary bacterial artificial chromosome (BI...

  2. Separation of non-racemic mixtures of enantiomers: an essential part of optical resolution.

    PubMed

    Faigl, Ferenc; Fogassy, Elemér; Nógrádi, Mihály; Pálovics, Emese; Schindler, József

    2010-03-07

    Non-racemic enantiomeric mixtures form homochiral and heterochiral aggregates in melt or suspension, during adsorption or recrystallization, and these diastereomeric associations determine the distribution of the enantiomers between the solid and other (liquid or vapour) phases. That distribution depends on the stability order of the homo- and heterochiral aggregates (conglomerate or racemate formation). Therefore, there is a correlation between the binary melting point phase diagrams and the experimental ee(I) vs. ee(0) curves (ee(I) refers to the crystallized enantiomeric mixtures, ee(0) is the composition of the starting ones). Accordingly, distribution of the enantiomeric mixtures between two phases is characteristic and usually significant enrichment can be achieved. There are two exceptions: no enrichment could be observed under thermodynamically controlled conditions when the starting enantiomer composition corresponded to the eutectic composition, or when the method used was unsuitable for separation. In several cases, when kinetic control governed the crystallization, the character of the ee(0)-ee(I) curve did not correlate with the melting point binary phase diagram.

  3. Correlated Temporal and Spectral Variability

    NASA Technical Reports Server (NTRS)

    Swank, Jean H.

    2007-01-01

    The variability of neutron star and black hole X-ray sources has several dimensions, because of the roles played by different important time-scales. The variations on time scales of hours, weeks, and months, ranging from 50% to orders of magnitude, arise out of changes in the flow in the disk. The most important driving forces for those changes are probably various possible instabilities in the disk, though there may be effects with other dominant causes. The changes in the rate of flow appear to be associated with changes in the flow's configuration, as the accreting material approaches the compact object, for there are generally correlated changes in both the X-ray spectra and the character of the faster temporal variability. There has been a lot of progress in tracking these correlations, both for Z and Atoll neutron star low-mass X-ray binaries, and for black hole binaries. I will discuss these correlations and review briefly what they tell us about the physical states of the systems.

  4. PHOTONICS AND NANOTECHNOLOGY Pulsed laser ablation of binary semiconductors: mechanisms of vaporisation and cluster formation

    NASA Astrophysics Data System (ADS)

    Bulgakov, A. V.; Evtushenko, A. B.; Shukhov, Yu G.; Ozerov, I.; Marin, W.

    2010-12-01

    Formation of small clusters during pulsed ablation of two binary semiconductors, zinc oxide and indium phosphide, in vacuum by UV, visible, and IR laser radiation is comparatively studied. The irradiation conditions favourable for generation of neutral and charged Zn_nO_m and In_nP_m clusters of different stoichiometry in the ablation products are found. The size and composition of the clusters, their expansion dynamics and reactivity are analysed by time-of-flight mass spectrometry. Particular attention is paid to the mechanisms of ZnO and InP ablation as a function of laser fluence, with the use of different ablation models. It is established that ZnO evaporates congruently in a wide range of irradiation conditions, while InP ablation leads to enrichment of the target surface with indium. It is shown that this radically different character of semiconductor ablation determines the composition of the nanostructures formed: zinc oxide clusters are mainly stoichiometric, whereas In_nP_m particles are significantly enriched with indium.

  5. [sgRNA design for the CRISPR/Cas9 system and evaluation of its off-target effects].

    PubMed

    Xie, Sheng-song; Zhang, Yi; Zhang, Li-sheng; Li, Guang-lei; Zhao, Chang-zhi; Ni, Pan; Zhao, Shu-hong

    2015-11-01

    The third generation of CRISPR/Cas9-mediated genome editing technology has been successfully applied to genome modification of various species including animals, plants and microorganisms. How to improve the efficiency of CRISPR/Cas9 genome editing and reduce its off-target effects has been extensively explored in this field. Using sgRNA (Small guide RNA) with high efficiency and specificity is one of the critical factors for successful genome editing. Several software tools have been developed for sgRNA design and/or off-target evaluation, each with its own advantages and disadvantages. In this review, we summarize the characteristics of 16 online and standalone software tools for sgRNA design and/or off-target evaluation and conduct a comparative analysis of these tools using 38 evaluation indexes that we developed. We also summarize 11 experimental approaches for testing genome editing efficiency and off-target effects, as well as how to screen for highly efficient and specific sgRNA.

  6. Discovering causal pathways linking genomic events to transcriptional states using Tied Diffusion Through Interacting Events (TieDIE).

    PubMed

    Paull, Evan O; Carlin, Daniel E; Niepel, Mario; Sorger, Peter K; Haussler, David; Stuart, Joshua M

    2013-11-01

    Identifying the cellular wiring that connects genomic perturbations to transcriptional changes in cancer is essential to gain a mechanistic understanding of disease initiation, progression and ultimately to predict drug response. We have developed a method called Tied Diffusion Through Interacting Events (TieDIE) that uses a network diffusion approach to connect genomic perturbations to gene expression changes characteristic of cancer subtypes. The method computes a subnetwork of protein-protein interactions, predicted transcription factor-to-target connections and curated interactions from literature that connects genomic and transcriptomic perturbations. Application of TieDIE to The Cancer Genome Atlas and a breast cancer cell line dataset identified key signaling pathways, with examples impinging on MYC activity. Interlinking genes are predicted to correspond to essential components of cancer signaling and may provide a mechanistic explanation of tumor character and suggest subtype-specific drug targets. Software is available from the Stuart lab's wiki: https://sysbiowiki.soe.ucsc.edu/tiedie. Contact: jstuart@ucsc.edu. Supplementary data are available at Bioinformatics online.

  7. [Comparative analysis of variable regions in the genomes of variola virus].

    PubMed

    Babkin, I V; Nepomniashchikh, T S; Maksiutov, R A; Gutorov, V V; Babkina, I N; Shchelkunov, S N

    2008-01-01

    Nucleotide sequences of two extended segments of the terminal variable regions in variola virus genome were determined. The size of the left segment was 13.5 kbp and of the right, 10.5 kbp. In total, over 540 kbp were sequenced for 22 variola virus strains. The conducted phylogenetic analysis and the data published earlier allowed us to find the interrelations between 70 variola virus isolates, the character of their clustering, and the degree of intergroup and intragroup variations of the clusters of variola virus strains. The most polymorphic loci of the genome segments studied were determined. It was demonstrated that these loci are localized either to noncoding genome regions or to the regions of destroyed open reading frames, characteristic of the ancestor virus. These loci are promising for development of the strategy for genotyping variola virus strains. Analysis of recombination using various methods demonstrated that, with a single exception, no statistically significant recombinational events in the genomes of variola virus strains studied were detectable.

  8. A Bayesian Framework for Generalized Linear Mixed Modeling Identifies New Candidate Loci for Late-Onset Alzheimer’s Disease

    PubMed Central

    Wang, Xulong; Philip, Vivek M.; Ananda, Guruprasad; White, Charles C.; Malhotra, Ankit; Michalski, Paul J.; Karuturi, Krishna R. Murthy; Chintalapudi, Sumana R.; Acklin, Casey; Sasner, Michael; Bennett, David A.; De Jager, Philip L.; Howell, Gareth R.; Carter, Gregory W.

    2018-01-01

    Recent technical and methodological advances have greatly enhanced genome-wide association studies (GWAS). The advent of low-cost, whole-genome sequencing facilitates high-resolution variant identification, and the development of linear mixed models (LMM) allows improved identification of putatively causal variants. While essential for correcting false positive associations due to sample relatedness and population stratification, LMMs have commonly been restricted to quantitative variables. However, phenotypic traits in association studies are often categorical, coded as binary case-control or ordered variables describing disease stages. To address these issues, we have devised a method for genomic association studies that implements a generalized LMM (GLMM) in a Bayesian framework, called Bayes-GLMM. Bayes-GLMM has four major features: (1) support of categorical, binary, and quantitative variables; (2) cohesive integration of previous GWAS results for related traits; (3) correction for sample relatedness by mixed modeling; and (4) model estimation by both Markov chain Monte Carlo sampling and maximal likelihood estimation. We applied Bayes-GLMM to the whole-genome sequencing cohort of the Alzheimer’s Disease Sequencing Project. This study contains 570 individuals from 111 families, each with Alzheimer’s disease diagnosed at one of four confidence levels. Using Bayes-GLMM we identified four variants in three loci significantly associated with Alzheimer’s disease. Two variants, rs140233081 and rs149372995, lie between PRKAR1B and PDGFA. The coded proteins are localized to the glial-vascular unit, and PDGFA transcript levels are associated with Alzheimer’s disease-related neuropathology. In summary, this work provides implementation of a flexible, generalized mixed-model approach in a Bayesian framework for association studies. PMID:29507048

  9. Lampreys as Diverse Model Organisms in the Genomics Era.

    PubMed

    McCauley, David W; Docker, Margaret F; Whyard, Steve; Li, Weiming

    2015-11-01

    Lampreys, one of the two surviving groups of ancient vertebrates, have become important models for study in diverse fields of biology. Lampreys (of which there are approximately 40 species) are being studied, for example, (a) to control pest sea lamprey in the North American Great Lakes and to restore declining populations of native species elsewhere; (b) in biomedical research, focusing particularly on the regenerative capability of lampreys; and (c) by developmental biologists studying the evolution of key vertebrate characters. Although a lack of genetic resources has hindered research on the mechanisms regulating many aspects of lamprey life history and development, formerly intractable questions are now amenable to investigation following the recent publication of the sea lamprey genome. Here, we provide an overview of the ways in which genomic tools are currently being deployed to tackle diverse research questions and suggest several areas that may benefit from the availability of the sea lamprey genome.

  10. Lampreys as Diverse Model Organisms in the Genomics Era

    PubMed Central

    McCauley, David W.; Docker, Margaret F.; Whyard, Steve; Li, Weiming

    2015-01-01

    Lampreys, one of the two surviving groups of ancient vertebrates, have become important models for study in diverse fields of biology. Lampreys (of which there are approximately 40 species) are being studied, for example, (a) to control pest sea lamprey in the North American Great Lakes and to restore declining populations of native species elsewhere; (b) in biomedical research, focusing particularly on the regenerative capability of lampreys; and (c) by developmental biologists studying the evolution of key vertebrate characters. Although a lack of genetic resources has hindered research on the mechanisms regulating many aspects of lamprey life history and development, formerly intractable questions are now amenable to investigation following the recent publication of the sea lamprey genome. Here, we provide an overview of the ways in which genomic tools are currently being deployed to tackle diverse research questions and suggest several areas that may benefit from the availability of the sea lamprey genome. PMID:26951616

  11. Genome analysis of the platypus reveals unique signatures of evolution.

    PubMed

    Warren, Wesley C; Hillier, LaDeana W; Marshall Graves, Jennifer A; Birney, Ewan; Ponting, Chris P; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P; Miethke, Pat; Waters, Paul D; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S; López-Otín, Carlos; Ordóñez, Gonzalo R; Eichler, Evan E; Chen, Lin; Cheng, Ze; Deakin, Janine E; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T; Wakefield, Matthew J; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A; Smit, Arian F A; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A; Walker, Jerilyn A; Konkel, Miriam K; Harris, Robert S; Whittington, Camilla M; Wong, Emily S W; Gemmell, Neil J; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R; Ray, David A; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H; Taylor, James; Jones, Russell C; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N; Pohl, Craig S; Smith, Scott M; Hou, Shunfeng; Nefedov, Mikhail; de Jong, Pieter J; Renfree, Marilyn B; Mardis, Elaine R; Wilson, Richard K

    2008-05-08

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.

  12. Genome analysis of the platypus reveals unique signatures of evolution

    PubMed Central

    Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P.; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T.; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S.; López-Otín, Carlos; Ordóñez, Gonzalo R.; Eichler, Evan E.; Chen, Lin; Cheng, Ze; Deakin, Janine E.; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T.; Wakefield, Matthew J.; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A.; Smit, Arian F. A.; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A.; Walker, Jerilyn A.; Konkel, Miriam K.; Harris, Robert S.; Whittington, Camilla M.; Wong, Emily S. W.; Gemmell, Neil J.; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M.; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P.; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J.; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M.; Sharp, Julie A.; Nicholas, Kevin R.; Ray, David A.; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H.; Taylor, James; Jones, Russell C.; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N.; Pohl, Craig S.; Smith, Scott M.; Hou, Shunfeng; Renfree, Marilyn B.; Mardis, Elaine R.; Wilson, Richard K.

    2009-01-01

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734

  13. A Comprehensive Study of Cyanobacterial Morphological and Ecological Evolutionary Dynamics through Deep Geologic Time.

    PubMed

    Uyeda, Josef C; Harmon, Luke J; Blank, Carrine E

    2016-01-01

    Cyanobacteria have exerted a profound influence on the progressive oxygenation of Earth. As a complementary approach to examining the geologic record, phylogenomic and trait evolutionary analyses of extant species can lead to new insights. We constructed new phylogenomic trees and analyzed phenotypic trait data using novel phylogenetic comparative methods. We elucidated the dynamics of trait evolution in Cyanobacteria over billion-year timescales, and provide evidence that major geologic events in early Earth's history have shaped, and been shaped by, evolution in Cyanobacteria. We identify a robust core cyanobacterial phylogeny and a smaller set of taxa that exhibit long-branch attraction artifacts. We estimated the age of nodes and reconstructed the ancestral character states of 43 phenotypic characters. We find high levels of phylogenetic signal for nearly all traits, indicating the phylogeny carries substantial predictive power. The earliest cyanobacterial lineages likely lived in freshwater habitats, had small cell diameters, were benthic or sessile, and possibly epilithic/endolithic with a sheath. We jointly analyzed a subset of 25 binary traits to determine whether rates of trait evolution have shifted over time in conjunction with major geologic events. Phylogenetic comparative analyses reveal an overriding signal of decreasing rates of trait evolution through time. Furthermore, the data suggest two major rate shifts in trait evolution associated with bursts of evolutionary innovation. The first rate shift occurs in the aftermath of the Great Oxidation Event and "Snowball Earth" glaciations and is associated with a decrease in the evolutionary rates around 1.8-1.6 Ga. This rate shift seems to indicate the end of a major diversification of cyanobacterial phenotypes, particularly those related to traits associated with filamentous morphology, heterocysts and motility in freshwater ecosystems. Another burst appears around the time of the Neoproterozoic Oxidation Event, and is associated with the acquisition of traits involved in planktonic growth in marine habitats. Our results demonstrate how uniting genomic and phenotypic datasets in extant bacterial species can shed light on billion-year-old events in Earth's history.

  14. A Comprehensive Study of Cyanobacterial Morphological and Ecological Evolutionary Dynamics through Deep Geologic Time

    PubMed Central

    Harmon, Luke J.; Blank, Carrine E.

    2016-01-01

    Cyanobacteria have exerted a profound influence on the progressive oxygenation of Earth. As a complementary approach to examining the geologic record, phylogenomic and trait evolutionary analyses of extant species can lead to new insights. We constructed new phylogenomic trees and analyzed phenotypic trait data using novel phylogenetic comparative methods. We elucidated the dynamics of trait evolution in Cyanobacteria over billion-year timescales, and provide evidence that major geologic events in early Earth’s history have shaped, and been shaped by, evolution in Cyanobacteria. We identify a robust core cyanobacterial phylogeny and a smaller set of taxa that exhibit long-branch attraction artifacts. We estimated the age of nodes and reconstructed the ancestral character states of 43 phenotypic characters. We find high levels of phylogenetic signal for nearly all traits, indicating the phylogeny carries substantial predictive power. The earliest cyanobacterial lineages likely lived in freshwater habitats, had small cell diameters, were benthic or sessile, and possibly epilithic/endolithic with a sheath. We jointly analyzed a subset of 25 binary traits to determine whether rates of trait evolution have shifted over time in conjunction with major geologic events. Phylogenetic comparative analyses reveal an overriding signal of decreasing rates of trait evolution through time. Furthermore, the data suggest two major rate shifts in trait evolution associated with bursts of evolutionary innovation. The first rate shift occurs in the aftermath of the Great Oxidation Event and “Snowball Earth” glaciations and is associated with a decrease in the evolutionary rates around 1.8–1.6 Ga. This rate shift seems to indicate the end of a major diversification of cyanobacterial phenotypes, particularly those related to traits associated with filamentous morphology, heterocysts and motility in freshwater ecosystems. Another burst appears around the time of the Neoproterozoic Oxidation Event, and is associated with the acquisition of traits involved in planktonic growth in marine habitats. Our results demonstrate how uniting genomic and phenotypic datasets in extant bacterial species can shed light on billion-year-old events in Earth’s history. PMID:27649395

  15. Wavelet domain textual coding of Ottoman script images

    NASA Astrophysics Data System (ADS)

    Gerek, Oemer N.; Cetin, Enis A.; Tewfik, Ahmed H.

    1996-02-01

    Image coding using wavelet transform, DCT, and similar transform techniques is well established. On the other hand, these coding methods neither take into account the special characteristics of the images in a database nor are they suitable for fast database search. In this paper, the digital archiving of Ottoman printings is considered. Ottoman documents are printed in Arabic letters. Witten et al. describe a scheme based on finding the characters in binary document images and encoding the positions of the repeated characters. This method efficiently compresses document images and is suitable for database research, but it cannot be applied to Ottoman or Arabic documents because the concept of a character is different in Ottoman or Arabic. Typically, one has to deal with compound structures consisting of a group of letters. Therefore, the matching criterion is defined in terms of those compound structures. Furthermore, the text images are gray tone or color images for Ottoman scripts for the reasons that are described in the paper. In our method the compound structure matching is carried out in the wavelet domain, which reduces the search space and increases the compression ratio. In addition to the wavelet transformation, which corresponds to the linear subband decomposition, we also used nonlinear subband decomposition. The filters in the nonlinear subband decomposition have the property of preserving edges in the low resolution subband image.

  16. The hobbit - an unexpected deficiency.

    PubMed

    Hopkinson, Joseph A; Hopkinson, Nicholas S

    2013-12-16

    Vitamin D has been proposed to have beneficial effects in a wide range of contexts. We investigate the hypothesis that vitamin D deficiency, caused by both aversion to sunlight and unwholesome diet, could also be a significant contributor to the triumph of good over evil in fantasy literature. Data on the dietary habits, moral attributes and martial prowess of various inhabitants of Middle Earth were systematically extracted from J R R Tolkien's novel The hobbit. Goodness and victoriousness of characters were scored with binary scales, and dietary intake and habitual sun exposure were used to calculate a vitamin D score (range, 0-4). The vitamin D score was significantly higher among the good and victorious characters (mean, 3.4; SD, 0.5) than the evil and defeated ones (mean, 0.2; SD, 0.4; P < 0.001). Further work is needed to see if these pilot results can be extrapolated to other fantastic situations and whether randomised intervention trials need to be imagined.

  17. PirAB protein from Xenorhabdus nematophila HB310 exhibits a binary toxin with insecticidal activity and cytotoxicity in Galleria mellonella.

    PubMed

    Yang, Qing; Zhang, Jie; Li, Tianhui; Liu, Shen; Song, Ping; Nangong, Ziyan; Wang, Qinying

    2017-09-01

    PirAB (Photorhabdus insect-related proteins, PirAB) toxin was initially found in the Photorhabdus luminescens TT01 strain and has been shown to be a binary toxin with high insecticidal activity. Based on GenBank data, this gene was also found in the Xenorhabdus nematophila genome sequence. The predicted amino acid sequence of pirA and pirB in the genome of X. nematophila showed 51% and 50% identity with those gene sequences from P. luminescens. The purpose of this experiment is to identify the relevant information for this toxin gene in X. nematophila. The pirA, pirB and pirAB genes of X. nematophila HB310 were cloned and expressed in Escherichia coli BL21 (DE3) using the pET-28a vector. A PirAB-fusion protein (PirAB-F) was constructed by linking the pirA and pirB genes with the flexible linker (Gly)4 DNA encoding sequence and then efficiently expressed in E. coli. The hemocoel and oral insecticidal activities of the recombinant proteins were analyzed against the larvae of Galleria mellonella. The results show that PirA/B alone, PirA/B mixture, co-expressed PirAB protein, and PirAB-F all had no oral insecticidal activity against the second-instar larvae of G. mellonella. Only PirA/B mixture and co-expressed PirAB protein had hemocoel insecticidal activity against G. mellonella fifth-instar larvae, with an LD50 of 2.718 μg/larva or 1.566 μg/larva, respectively. Therefore, we confirmed that PirAB protein of X. nematophila HB310 is a binary insecticidal toxin. The successful expression and purification of PirAB laid a foundation for further studies on the function, insecticidal mechanism and expression regulation of the binary toxin. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Mechanisms generating long range correlation in nucleotide composition of the Borrelia Burgdorferi genome

    NASA Astrophysics Data System (ADS)

    Mackiewicz, P.; Gierlik, A.; Kowalczuk, M.; Szczepanik, D.; Dudek, M. R.; Cebrat, S.

    1999-12-01

    We have analysed protein coding and intergenic sequences in the Borrelia burgdorferi (the Lyme disease bacterium) genome using different kinds of DNA walks. Genes occupying the leading strand of DNA have significantly different nucleotide composition from genes occupying the lagging strand. Nucleotide compositional bias of the two DNA strands reflects the amino acid composition of proteins. 96% of genes coding for ribosomal proteins lie on the leading DNA strand, which suggests that the positions of these as well as other genes are non-random. In the B. burgdorferi genome, the asymmetry in intergenic DNA sequences is lower than the asymmetry in the third positions in codons. All these characters of the B. burgdorferi genome suggest that both replication-associated mutational pressure and recombination mechanisms have established the specific structure of the genome and now any recombination leading to inversion of a gene with respect to the direction of replication is forbidden. This property of the genome allows us to assume that it is in a steady state, which enables us to fix some parameters for simulations of DNA evolution.
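As an illustration of the "DNA walk" idea mentioned in the abstract above, here is a minimal sketch. The abstract does not say which walks were used, so the rule below (step up on G, step down on C, stay level otherwise) is an assumption chosen for illustration, and the name `dna_walk` is hypothetical.

```python
from itertools import accumulate


def dna_walk(seq, up="G", down="C"):
    """Cumulative walk along a sequence.

    Assumed rule (not taken from the abstract): +1 for `up`, -1 for `down`,
    0 for any other symbol. A sustained upward or downward trend indicates
    a compositional bias of one base over the other along the strand.
    """
    steps = (1 if base == up else -1 if base == down else 0
             for base in seq.upper())
    return list(accumulate(steps))


# Toy sequence, not real B. burgdorferi data.
print(dna_walk("GGCAT"))
```

Comparing such walks for genes on the leading and lagging strands is one plausible way to visualise the strand asymmetry the record describes.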

  19. Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

    PubMed Central

    Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

    2015-01-01

    Non-model plants, i.e., species with one or more characters such as a long life cycle, difficulty of growth in the laboratory, or poor fecundity, were previously left out of sequencing projects because of the high running cost of Sanger sequencing. Consequently, information about their genomics and key biological processes is inadequate. However, the recent advent of fast and cost-effective next generation sequencing (NGS) platforms has enabled the unearthing of certain characteristic gene structures unique to these species. It has also aided in gaining insight into the mechanisms underlying gene expression and secondary metabolism, and has facilitated the development of genomic resources for diversity characterization, evolutionary analysis and marker-assisted breeding even without prior availability of genomic sequence information. In this review we explore how different NGS platforms, as well as recent advances in NGS-based high-throughput genotyping technologies, are rewarding efforts on de novo whole genome/transcriptome sequencing and on the development of genome-wide sequence-based marker resources, which are less costly than phenotyping, for the improvement of non-model crops. PMID:26734016

  20. A supermatrix analysis of genomic, morphological, and paleontological data from crown Cetacea

    PubMed Central

    2011-01-01

    Background Cetacea (dolphins, porpoises, and whales) is a clade of aquatic species that includes the most massive, deepest diving, and largest brained mammals. Understanding the temporal pattern of diversification in the group as well as the evolution of cetacean anatomy and behavior requires a robust and well-resolved phylogenetic hypothesis. Although a large body of molecular data has accumulated over the past 20 years, DNA sequences of cetaceans have not been directly integrated with the rich, cetacean fossil record to reconcile discrepancies among molecular and morphological characters. Results We combined new nuclear DNA sequences, including segments of six genes (~2800 basepairs) from the functionally extinct Yangtze River dolphin, with an expanded morphological matrix and published genomic data. Diverse analyses of these data resolved the relationships of 74 taxa that represent all extant families and 11 extinct families of Cetacea. The resulting supermatrix (61,155 characters) and its sub-partitions were analyzed using parsimony methods. Bayesian and maximum likelihood (ML) searches were conducted on the molecular partition, and a molecular scaffold obtained from these searches was used to constrain a parsimony search of the morphological partition. Based on analysis of the supermatrix and model-based analyses of the molecular partition, we found overwhelming support for 15 extant clades. When extinct taxa are included, we recovered trees that are significantly correlated with the fossil record. These trees were used to reconstruct the timing of cetacean diversification and the evolution of characters shared by "river dolphins," a non-monophyletic set of species according to all of our phylogenetic analyses. Conclusions The parsimony analysis of the supermatrix and the analysis of morphology constrained to fit the ML/Bayesian molecular tree yielded broadly congruent phylogenetic hypotheses. 
In trees from both analyses, all Oligocene taxa included in our study fell outside crown Mysticeti and crown Odontoceti, suggesting that these two clades radiated in the late Oligocene or later, contra some recent molecular clock studies. Our trees also imply that many character states shared by river dolphins evolved in their oceanic ancestors, contradicting the hypothesis that these characters are convergent adaptations to fluvial habitats. PMID:21518443

  1. A supermatrix analysis of genomic, morphological, and paleontological data from crown Cetacea.

    PubMed

    Geisler, Jonathan H; McGowen, Michael R; Yang, Guang; Gatesy, John

    2011-04-25

    Cetacea (dolphins, porpoises, and whales) is a clade of aquatic species that includes the most massive, deepest diving, and largest brained mammals. Understanding the temporal pattern of diversification in the group as well as the evolution of cetacean anatomy and behavior requires a robust and well-resolved phylogenetic hypothesis. Although a large body of molecular data has accumulated over the past 20 years, DNA sequences of cetaceans have not been directly integrated with the rich, cetacean fossil record to reconcile discrepancies among molecular and morphological characters. We combined new nuclear DNA sequences, including segments of six genes (~2800 basepairs) from the functionally extinct Yangtze River dolphin, with an expanded morphological matrix and published genomic data. Diverse analyses of these data resolved the relationships of 74 taxa that represent all extant families and 11 extinct families of Cetacea. The resulting supermatrix (61,155 characters) and its sub-partitions were analyzed using parsimony methods. Bayesian and maximum likelihood (ML) searches were conducted on the molecular partition, and a molecular scaffold obtained from these searches was used to constrain a parsimony search of the morphological partition. Based on analysis of the supermatrix and model-based analyses of the molecular partition, we found overwhelming support for 15 extant clades. When extinct taxa are included, we recovered trees that are significantly correlated with the fossil record. These trees were used to reconstruct the timing of cetacean diversification and the evolution of characters shared by "river dolphins," a non-monophyletic set of species according to all of our phylogenetic analyses. The parsimony analysis of the supermatrix and the analysis of morphology constrained to fit the ML/Bayesian molecular tree yielded broadly congruent phylogenetic hypotheses. 
In trees from both analyses, all Oligocene taxa included in our study fell outside crown Mysticeti and crown Odontoceti, suggesting that these two clades radiated in the late Oligocene or later, contra some recent molecular clock studies. Our trees also imply that many character states shared by river dolphins evolved in their oceanic ancestors, contradicting the hypothesis that these characters are convergent adaptations to fluvial habitats.

  2. Stochastic model search with binary outcomes for genome-wide association studies

    PubMed Central

    Malovini, Alberto; Puca, Annibale A; Bellazzi, Riccardo

    2012-01-01

    Objective The spread of case–control genome-wide association studies (GWASs) has stimulated the development of new variable selection methods and predictive models. We introduce a novel Bayesian model search algorithm, Binary Outcome Stochastic Search (BOSS), which addresses the model selection problem when the number of predictors far exceeds the number of binary responses. Materials and methods Our method is based on a latent variable model that links the observed outcomes to the underlying genetic variables. A Markov Chain Monte Carlo approach is used for model search and to evaluate the posterior probability of each predictor. Results BOSS is compared with three established methods (stepwise regression, logistic lasso, and elastic net) in a simulated benchmark. Two real case studies are also investigated: a GWAS on the genetic bases of longevity, and the type 2 diabetes study from the Wellcome Trust Case Control Consortium. Simulations show that BOSS achieves higher precisions than the reference methods while preserving good recall rates. In both experimental studies, BOSS successfully detects genetic polymorphisms previously reported to be associated with the analyzed phenotypes. Discussion BOSS outperforms the other methods in terms of F-measure on simulated data. In the two real studies, BOSS successfully detects biologically relevant features, some of which are missed by univariate analysis and the three reference techniques. Conclusion The proposed algorithm is an advance in the methodology for model selection with a large number of features. Our simulated and experimental results showed that BOSS proves effective in detecting relevant markers while providing a parsimonious model. PMID:22534080
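The precision, recall and F-measure used to compare the methods in the abstract above can be sketched as follows. This assumes the balanced F-measure (F1), since the abstract does not state a weighting; the function name and the marker identifiers are hypothetical.

```python
def precision_recall_f1(selected, truth):
    """Score a set of selected predictors against the truly associated ones.

    precision = TP / |selected|, recall = TP / |truth|,
    F1 = harmonic mean of precision and recall (assumed balanced weighting).
    """
    selected, truth = set(selected), set(truth)
    true_pos = len(selected & truth)
    precision = true_pos / len(selected) if selected else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical marker names: 4 selected, 3 truly associated, 2 in common.
p, r, f = precision_recall_f1({"m1", "m2", "m3", "m4"}, {"m1", "m2", "m5"})
print(p, r, f)
```

A method that selects few but mostly correct markers scores high on precision, which is the behaviour the abstract attributes to a parsimonious model.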

  3. How might flukes and tapeworms maintain genome integrity without a canonical piRNA pathway?

    PubMed

    Skinner, Danielle E; Rinaldi, Gabriel; Koziol, Uriel; Brehm, Klaus; Brindley, Paul J

    2014-03-01

    Surveillance by RNA interference is central to controlling the mobilization of transposable elements (TEs). In stem cells, Piwi argonaute (Ago) proteins and associated proteins repress mobilization of TEs to maintain genome integrity. This defense mechanism targeting TEs is termed the Piwi-interacting RNA (piRNA) pathway. In this opinion article, we draw attention to the situation that the genomes of cestodes and trematodes have lost the piwi and vasa genes that are hallmark characters of the germline multipotency program. This absence of Piwi-like Agos and Vasa helicases prompts the question: how does the germline of these flatworms withstand mobilization of TEs? Here, we present an interpretation of mechanisms likely to defend the germline integrity of parasitic flatworms. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Agroinoculation of Beet necrotic yellow vein virus cDNA clones results in plant systemic infection and efficient Polymyxa betae transmission.

    PubMed

    Delbianco, Alice; Lanzoni, Chiara; Klein, Elodie; Rubies Autonell, Concepcion; Gilmer, David; Ratti, Claudio

    2013-05-01

    Agroinoculation is a quick and easy method for the infection of plants with viruses. This method involves the infiltration of tissue with a suspension of Agrobacterium tumefaciens carrying binary plasmids harbouring full-length cDNA copies of viral genome components. When transferred into host cells, transcription of the cDNA produces RNA copies of the viral genome that initiate infection. We produced full-length cDNA corresponding to Beet necrotic yellow vein virus (BNYVV) RNAs and derived replicon vectors expressing viral and fluorescent proteins in pJL89 binary plasmid under the control of the Cauliflower mosaic virus 35S promoter. We infected Nicotiana benthamiana and Beta macrocarpa plants with BNYVV by leaf agroinfiltration of combinations of agrobacteria carrying full-length cDNA clones of BNYVV RNAs. We validated the ability of agroclones to reproduce a complete viral cycle, from replication to cell-to-cell and systemic movement and, finally, plant-to-plant transmission by its plasmodiophorid vector. We also showed successful root agroinfection of B. vulgaris, a new tool for the assay of resistance to rhizomania, the sugar beet disease caused by BNYVV. © 2013 BSPP AND BLACKWELL PUBLISHING LTD.

  5. Development of swine-specific DNA markers for biosensor-based halal authentication.

    PubMed

    Ali, M E; Hashim, U; Kashif, M; Mustafa, S; Che Man, Y B; Abd Hamid, S B

    2012-06-29

    The pig (Sus scrofa) mitochondrial genome was targeted to design short (15-30 nucleotides) DNA markers that would be suitable for biosensor-based hybridization detection of target DNA. Short DNA markers are reported to survive harsh conditions in which longer ones are degraded into smaller fragments. The whole swine mitochondrial-genome was in silico digested with AluI restriction enzyme. Among 66 AluI fragments, five were selected as potential markers because of their convenient lengths, high degree of interspecies polymorphism and intraspecies conservatism. These were confirmed by NCBI blast analysis and ClustalW alignment analysis with 11 different meat-providing animal and fish species. Finally, we integrated a tetramethyl rhodamine-labeled 18-nucleotide AluI fragment into a 3-nm diameter citrate-tannate coated gold nanoparticle to develop a swine-specific hybrid nanobioprobe for the determination of pork adulteration in 2.5-h autoclaved pork-beef binary mixtures. This hybrid probe detected as low as 1% pork in deliberately contaminated autoclaved pork-beef binary mixtures and no cross-species detection was recorded, demonstrating the feasibility of this type of probe for biosensor-based detection of pork adulteration of halal and kosher foods.
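The in silico AluI digestion described in the abstract above can be sketched as below. AluI recognises AGCT and cuts bluntly between G and C. The function names are hypothetical, and the sequence is treated as linear for simplicity, whereas a mitochondrial genome is circular, so the first and last fragments would join in a real analysis.

```python
ALUI_SITE = "AGCT"   # AluI recognition sequence
CUT_OFFSET = 2       # blunt cut between AG and CT


def digest_linear(seq, site=ALUI_SITE, offset=CUT_OFFSET):
    """Split a linear sequence at every occurrence of the recognition site."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def candidate_markers(fragments, min_len=15, max_len=30):
    """Keep fragments in the 15-30 nucleotide range named in the abstract."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence, not pig mitochondrial DNA.
print(digest_linear("TTAGCTGGAGCTAA"))
```

Length filtering is only the first of the selection criteria in the record; interspecies polymorphism and intraspecies conservation would need alignment against other species and are not covered here.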

  6. Hairy Root Transformation Using Agrobacterium rhizogenes as a Tool for Exploring Cell Type-Specific Gene Expression and Function Using Tomato as a Model

    PubMed Central

    Ron, Mily; Kajala, Kaisa; Pauluzzi, Germain; Wang, Dongxue; Reynoso, Mauricio A.; Zumstein, Kristina; Garcha, Jasmine; Winte, Sonja; Masson, Helen; Inagaki, Soichi; Federici, Fernán; Sinha, Neelima; Deal, Roger B.; Bailey-Serres, Julia; Brady, Siobhan M.

    2014-01-01

    Agrobacterium rhizogenes (or Rhizobium rhizogenes) is able to transform plant genomes and induce the production of hairy roots. We describe the use of A. rhizogenes in tomato (Solanum spp.) to rapidly assess gene expression and function. Gene expression of reporters is indistinguishable in plants transformed by Agrobacterium tumefaciens as compared with A. rhizogenes. A root cell type- and tissue-specific promoter resource has been generated for domesticated and wild tomato (Solanum lycopersicum and Solanum pennellii, respectively) using these approaches. Imaging of tomato roots using A. rhizogenes coupled with laser scanning confocal microscopy is facilitated by the use of a membrane-tagged protein fused to a red fluorescent protein marker present in binary vectors. Tomato-optimized isolation of nuclei tagged in specific cell types and translating ribosome affinity purification binary vectors were generated and used to monitor associated messenger RNA abundance or chromatin modification. Finally, transcriptional reporters, translational reporters, and clustered regularly interspaced short palindromic repeats-associated nuclease9 genome editing demonstrate that SHORT-ROOT and SCARECROW gene function is conserved between Arabidopsis (Arabidopsis thaliana) and tomato. PMID:24868032

  7. Phylogenetic Diversity of the Enteric Pathogen Salmonella enterica subsp. enterica Inferred from Genome-Wide Reference-Free SNP Characters

    USDA-ARS's Scientific Manuscript database

    Salmonella enterica is a major cause of food-borne illness in the US, leading to more deaths than any other food-related pathogen. This is an extremely diverse bacterial species consisting of six subspecies and over 2500 named serovars. Examining the evolutionary history within Salmonella with techn...

  8. BAC Libraries from Wheat Chromosome 7D – Efficient Tool for Positional Cloning of Aphid Resistance Genes

    USDA-ARS's Scientific Manuscript database

    Positional cloning in bread wheat is a tedious task due to its huge genome size (~17 Gbp) and polyploid character. BAC libraries represent an essential tool for positional cloning. However, wheat BAC libraries comprise more than million clones, which make their screening very laborious. Here we pres...

  9. CRISPR/Cas9-Based Multiplex Genome Editing in Monocot and Dicot Plants.

    PubMed

    Ma, Xingliang; Liu, Yao-Guang

    2016-07-01

    The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-mediated genome targeting system has been applied to a variety of organisms, including plants. Compared to other genome-targeting technologies such as zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), the CRISPR/Cas9 system is easier to use and has much higher editing efficiency. In addition, multiple "single guide RNAs" (sgRNAs) with different target sequences can be designed to direct the Cas9 protein to multiple genomic sites for simultaneous multiplex editing. Here, we present a procedure for highly efficient multiplex genome targeting in monocot and dicot plants using a versatile and robust CRISPR/Cas9 vector system, emphasizing the construction of binary constructs with multiple sgRNA expression cassettes in one round of cloning using Golden Gate ligation. We also describe the genotyping of targeted mutations in transgenic plants by direct Sanger sequencing followed by decoding of superimposed sequencing chromatograms containing biallelic or heterozygous mutations using the Web-based tool DSDecode. © 2016 by John Wiley & Sons, Inc.

  10. Putative floral brood-site mimicry, loss of autonomous selfing, and reduced vegetative growth are significantly correlated with increased diversification in Asarum (Aristolochiaceae).

    PubMed

    Sinn, Brandon T; Kelly, Lawrence M; Freudenstein, John V

    2015-08-01

    The drivers of angiosperm diversity have long been sought and the flower-arthropod association has often been invoked as the most powerful driver of the angiosperm radiation. We now know that features that influence arthropod interactions cannot only affect the diversification of lineages, but also expedite or constrain their rate of extinction, which can equally influence the observed asymmetric richness of extant angiosperm lineages. The genus Asarum (Aristolochiaceae; ∼100 species) is widely distributed in north temperate forests, with substantial vegetative and floral divergence between its three major clades, Euasarum, Geotaenium, and Heterotropa. We used Binary-State Speciation and Extinction Model (BiSSE) Net Diversification tests of character state distributions on a Maximum Likelihood phylogram and a Coalescent Bayesian species tree, inferred from seven chloroplast markers and nuclear rDNA, to test for signal of asymmetric diversification, character state transition, and extinction rates of floral and vegetative characters. We found that reduction in vegetative growth, loss of autonomous self-pollination, and the presence of putative fungal-mimicking floral structures are significantly correlated with increased diversification in Asarum. No significant difference in model likelihood was identified between symmetric and asymmetric rates of character state transitions or extinction. We conclude that the flowers of the Heterotropa clade may have converged on some aspects of basidiomycete sporocarp morphology and that brood-site mimicry, coupled with a reduction in vegetative growth and the loss of autonomous self-pollination, may have driven diversification within Asarum. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Whole genome sequencing of Chinese clearhead icefish, Protosalanx hyalocranius.

    PubMed

    Liu, Kai; Xu, Dongpo; Li, Jia; Bian, Chao; Duan, Jinrong; Zhou, Yanfeng; Zhang, Minying; You, Xinxin; You, Yang; Chen, Jieming; Yu, Hui; Xu, Gangchun; Fang, Di-An; Qiang, Jun; Jiang, Shulun; He, Jie; Xu, Junmin; Shi, Qiong; Zhang, Zhiyong; Xu, Pao

    2017-04-01

    Chinese clearhead icefish, Protosalanx hyalocranius, is a representative icefish species with economic importance and special appearance. Due to its great economic value in China, the fish was introduced into Lake Dianchi and several other lakes from Lake Taihu half a century ago. Similar to the Sinocyclocheilus cavefish, the clearhead icefish has certain cavefish-like traits, such as transparent body and nearly scaleless skin. Here, we provide the whole genome sequence of this surface-dwelling fish and generate a draft genome assembly, aiming at exploring the molecular mechanisms behind these traits of biological interest. A total of 252.1 Gb of raw reads were sequenced. Subsequently, a novel draft genome assembly was generated, with the scaffold N50 reaching 1.163 Mb. The genome completeness was estimated to be 98.39% by using the CEGMA evaluation. Finally, we annotated 19,884 protein-coding genes and observed that repeat sequences account for 24.43% of the genome assembly. We report the first draft genome of the Chinese clearhead icefish. The genome assembly will provide a solid foundation for further molecular breeding and germplasm resource protection in Chinese clearhead icefish, as well as other icefishes. It is also a valuable genetic resource for revealing the molecular mechanisms for the cavefish-like characters. © The Authors 2017. Published by Oxford University Press.
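
    The scaffold N50 quoted in the abstract above is a standard assembly contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the total assembly. The following is a minimal sketch of that definition only; the function name `n50` and the example lengths are hypothetical and are not taken from the icefish assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the largest length L such that scaffolds of length >= L
    together account for at least half of the summed assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half of the assembly
            return length


# Hypothetical toy assembly: total 20, and 6 + 5 = 11 covers half of it.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))
```

    Comparing `2 * running` against `total` keeps the arithmetic in integers, so no rounding decision is needed when the total length is odd.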

  12. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

    PubMed Central

    Harris, R. Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P.; Hong, Chibo; Downey, Sara L.; Johnson, Brett E.; Fouse, Shaun D.; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J.; Gu, Junchen; Echipare, Lorigail; O’Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B.; Bernstein, Bradley E.; Hawkins, R. David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q.; Haussler, David; Ecker, Joseph; Li, Wei; Farnham, Peggy J.; Waterland, Robert A.; Meissner, Alexander; Marra, Marco A.; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F.

    2010-01-01

    Sequencing-based DNA methylation profiling methods are comprehensive and, as accuracy and affordability improve, will increasingly supplant microarrays for genome-scale analyses. Here, four sequencing-based methodologies were applied to biological replicates of human embryonic stem cells to compare their CpG coverage genome-wide and in transposons, resolution, cost, concordance and its relationship with CpG density and genomic context. The two bisulfite methods reached concordance of 82% for CpG methylation levels and 99% for non-CpG cytosine methylation levels. Using binary methylation calls, two enrichment methods were 99% concordant, while regions assessed by all four methods were 97% concordant. To achieve comprehensive methylome coverage while reducing cost, an approach integrating two complementary methods was examined. The integrative methylome profile along with histone methylation, RNA, and SNP profiles derived from the sequence reads allowed genome-wide assessment of allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635
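
    The concordance figures for binary methylation calls in the abstract above can be read as the fraction of sites assessed by both methods that receive the same call. The sketch below illustrates that reading under stated assumptions: the site keys, the 0/1 encoding, and the function name `concordance` are hypothetical, and the authors' actual thresholds and region definitions are not reproduced here.

```python
def concordance(calls_a, calls_b):
    """Fraction of shared sites on which two methods make the same binary call.

    calls_a, calls_b: dicts mapping a site identifier to 0 (unmethylated)
    or 1 (methylated). Only sites present in both dicts are compared.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("the two methods share no assessed sites")
    agree = sum(calls_a[site] == calls_b[site] for site in shared)
    return agree / len(shared)


# Hypothetical calls from two methods; three sites are shared, two agree.
method_a = {"chr1:100": 1, "chr1:200": 0, "chr1:300": 1, "chr2:50": 0}
method_b = {"chr1:100": 1, "chr1:200": 0, "chr1:300": 0, "chr3:10": 1}
print(concordance(method_a, method_b))
```

    Restricting the comparison to shared sites matters because the methods differ in CpG coverage, so a site missing from one method is treated as not assessed rather than as a disagreement.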

  13. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  14. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    DOE PAGES

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2014-09-29

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  15. Evaluation of the efficacy of twelve mitochondrial protein-coding genes as barcodes for mollusk DNA barcoding.

    PubMed

    Yu, Hong; Kong, Lingfeng; Li, Qi

    2016-01-01

    In this study, we evaluated the efficacy of 12 mitochondrial protein-coding genes from 238 mitochondrial genomes of 140 molluscan species as potential DNA barcodes for mollusks. Three barcoding methods (distance, monophyly and character-based methods) were used in species identification. The species recovery rates based on genetic distances for the 12 genes ranged from 70.83 to 83.33%. There were no significant differences in intra- or interspecific variability among the 12 genes. The monophyly and character-based methods provided higher resolution than the distance-based method in species delimitation. Especially in closely related taxa, the character-based method showed some advantages. The results suggested that besides the standard COI barcode, other 11 mitochondrial protein-coding genes could also be potentially used as a molecular diagnostic for molluscan species discrimination. Our results also showed that the combination of mitochondrial genes did not enhance the efficacy for species identification and a single mitochondrial gene would be fully competent.

  16. Characterization and assessment of an avian repetitive DNA sequence as an icterid phylogenetic marker.

    PubMed

    Quinn, J S; Guglich, E; Seutin, G; Lau, R; Marsolais, J; Parna, L; Boag, P T; White, B N

    1992-02-01

    The first tandemly repeated sequence examined in a passerine bird, a 431-bp PstI fragment named pMAT1, has been cloned from the genome of the brown-headed cowbird (Molothrus ater). The sequence represents about 5-10% of the genome (about 4 × 10(5) copies) and yields prominent ethidium bromide stained bands when genomic DNA cut with a variety of restriction enzymes is electrophoresed in agarose gels. A particularly striking ladder of fragments is apparent when the DNA is cut with HinfI, indicative of a tandem arrangement of the monomer. The cloned PstI monomer has been sequenced, revealing no internal repeated structure. There are sequences that hybridize with pMAT1 found in related nine-primaried oscines but not in more distantly related oscines, suboscines, or nonpasserine species. Little sequence similarity to tandemly repeated PstI cut sequences from the merlin (Falco columbarius), sarus crane (Grus antigone), or Puerto Rican parrot (Amazona vittata) or to HinfI digested sequence from the Toulouse goose (Anser anser) was detected. The isolated sequence was used as a probe to examine DNA samples of eight members of the tribe Icterini. This examination revealed phylogenetically informative characters. The repeat contains cutting sites from a number of restriction enzymes, which, if sufficiently polymorphic, would provide new phylogenetic characters. Sequences like these, conserved within a species, but variable between closely related species, may be very useful for phylogenetic studies of closely related taxa.

  17. Analysis of photometric light curves solution for massive contact OB binary stars. LY Aurigae, BH Centauri, SV Centauri

    NASA Astrophysics Data System (ADS)

    Avvakumova, E. A.

    2010-01-01

    We searched for signs of the presence of circumstellar gaseous matter in photometric data for massive contact early-type binaries by analyzing residual curves (the dependence of the difference between the observed and theoretical brightness variations on the orbital-period phase) for three such stars. The residual curves make it possible to estimate the influence of gas in the common envelope on the observed light curves for different phase intervals and to qualitatively describe the character of the distortion of the light from the system’s components. Changes of the residual curves from filter to filter indicate varying conditions in the circumstellar matter. Changes of the residual curves from one observation epoch to another indicate varying conditions in the circumstellar matter. We compared the residual curves obtained for different photometric bands and epochs via a correlation analysis. The distortion of light from the components of LY Aurigae in the ultraviolet differs from that in the visual. The distortion of light from the components of SV Centauri is appreciable, but not selective, and does not vary in time, while the distortion of light from BH Centauri possesses a strong selective component. A comparison of the radii computed for the components of BH Centauri and SV Centauri shows that the gas distribution near these binaries varies in time.

  18. The telomeric sync model of speciation: species-wide telomere erosion triggers cycles of transposon-mediated genomic rearrangements, which underlie the saltatory appearance of nonadaptive characters

    NASA Astrophysics Data System (ADS)

    Stindl, Reinhard

    2014-03-01

    Charles Darwin knew that the fossil record is not overwhelmingly supportive of genetic and phenotypic gradualism; therefore, he developed the core of his theory on the basis of breeding experiments. Here, I present evidence for the existence of a cell biological mechanism that strongly points to the almost forgotten European concept of saltatory evolution of nonadaptive characters, which is in perfect agreement with the gaps in the fossil record. The standard model of chromosomal evolution has always been handicapped by a paradox, namely, how speciation can occur by spontaneous chromosomal rearrangements that are known to decrease the fertility of heterozygotes in a population. However, the hallmark of almost all closely related species is a differing chromosome complement and therefore chromosomal rearrangements seem to be crucial for speciation. Telomeres, the caps of eukaryotic chromosomes, erode in somatic tissues during life, but have been thought to remain stable in the germline of a species. Recently, a large human study spanning three healthy generations clearly found a cumulative telomere effect, which is indicative of transgenerational telomere erosion in the human species. The telomeric sync model of speciation presented here is based on telomere erosion between generations, which leads to identical fusions of chromosomes and triggers a transposon-mediated genomic repatterning in the germline of many individuals of a species. The phenotypic outcome of the telomere-triggered transposon activity is the saltatory appearance of nonadaptive characters simultaneously in many individuals. Transgenerational telomere erosion is therefore the material basis of aging at the species level.

  19. Genome editing in plants: Advancing crop transformation and overview of tools.

    PubMed

    Shah, Tariq; Andleeb, Tayyaba; Lateef, Sadia; Noor, Mehmood Ali

    2018-05-07

    Genome manipulation technology is an emerging field that is bringing a real revolution to genetic engineering and biotechnology. Targeted editing of genomes paves the way to address a wide range of goals, not only improving the quality and productivity of crops but also permitting investigation of the fundamental roots of biological systems. These goals include the creation of plants with valued compositional properties and with characters that confer resistance to numerous biotic and abiotic stresses. Numerous novel genome editing systems have been introduced during the past few years; these comprise zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats/Cas9 (CRISPR/Cas9). Genome editing is a consistent technique for improving average yield to meet the growing demands of the world's existing food shortage and to launch a feasible and environmentally safe agricultural scheme that is more specific, productive, cost-effective and eco-friendly. These exciting novel methods, concisely reviewed herein, have proven themselves to be efficient and reliable tools for the genetic improvement of plants. Copyright © 2018 Elsevier Masson SAS. All rights reserved.

  20. Signatures of adaptation to plant parasitism in nematode genomes.

    PubMed

    Bird, David McK; Jones, John T; Opperman, Charles H; Kikuchi, Taisei; Danchin, Etienne G J

    2015-02-01

    Plant-parasitic nematodes cause considerable damage to global agriculture. The ability to parasitize plants is a derived character that appears to have independently emerged several times in the phylum Nematoda. Morphological convergence to feeding style has been observed, but whether this is emergent from molecular convergence is less obvious. To address this, we assess whether genomic signatures can be associated with plant parasitism by nematodes. In this review, we report genomic features and characteristics that appear to be common in plant-parasitic nematodes while absent or rare in animal parasites, predators or free-living species. Candidate horizontal acquisitions of parasitism genes have systematically been found in all plant-parasitic species investigated at the sequence level. Presence of peptides that mimic plant hormones also appears to be a trait of plant-parasitic species. Annotations of the few genomes of plant-parasitic nematodes available to date have revealed a set of apparently species-specific genes on every occasion. Effector genes, important for parasitism, are frequently found among those species-specific genes, indicating poor overlap. Overall, nematodes appear to have developed convergent genomic solutions to adapt to plant parasitism.

  1. Theoretical estimates of exposure timescales of protein binding sites on DNA regulated by nucleosome kinetics.

    PubMed

    Parmar, Jyotsana J; Das, Dibyendu; Padinhateeri, Ranjith

    2016-02-29

    It is being increasingly realized that nucleosome organization on DNA crucially regulates DNA-protein interactions and the resulting gene expression. While the spatial character of the nucleosome positioning on DNA has been experimentally and theoretically studied extensively, the temporal character is poorly understood. Accounting for ATPase activity and DNA-sequence effects on nucleosome kinetics, we develop a theoretical method to estimate the time of continuous exposure of binding sites of non-histone proteins (e.g. transcription factors and TATA binding proteins) along any genome. Applying the method to Saccharomyces cerevisiae, we show that the exposure timescales are determined by cooperative dynamics of multiple nucleosomes, and their behavior is often different from expectations based on static nucleosome occupancy. Examining exposure times in the promoters of GAL1 and PHO5, we show that our theoretical predictions are consistent with known experiments. We apply our method genome-wide and discover huge gene-to-gene variability of mean exposure times of TATA boxes and patches adjacent to TSS (+1 nucleosome region); the resulting timescale distributions have non-exponential tails. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

    PubMed Central

    Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

    2013-01-01

    Background Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. Methodology/Principal Findings We combined 454 sequencing with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520

  3. The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

    PubMed

    Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

    2013-01-01

    Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. We combined 454 sequencing with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

  4. Genetic Competence Drives Genome Diversity in Bacillus subtilis

    PubMed Central

    Chevreux, Bastien; Serra, Cláudia R; Schyns, Ghislain; Henriques, Adriano O

    2018-01-01

    Prokaryote genomes are the result of a dynamic flux of genes, with increases achieved via horizontal gene transfer and reductions occurring through gene loss. The ecological and selective forces that drive this genomic flexibility vary across species. Bacillus subtilis is a naturally competent bacterium that occupies various environments, including plant-associated, soil, and marine niches, and the gut of both invertebrates and vertebrates. Here, we quantify the genomic diversity of B. subtilis and infer the genome dynamics that explain the high genetic and phenotypic diversity observed. Phylogenomic and comparative genomic analyses of 42 B. subtilis genomes uncover a remarkable genome diversity that translates into a core genome of 1,659 genes and an asymptotic pangenome growth rate of 57 new genes per new genome added. This diversity is due to a large proportion of low-frequency genes that are acquired from closely related species. We find no gene-loss bias among wild isolates, which explains why the cloud genome, 43% of the species pangenome, represents only a small proportion of each genome. We show that B. subtilis can acquire xenologous copies of core genes that propagate laterally among strains within a niche. While not excluding the contributions of other mechanisms, our results strongly suggest a process of gene acquisition that is largely driven by competence, where the long-term maintenance of acquired genes depends on local and global fitness effects. This competence-driven genomic diversity provides B. subtilis with its generalist character, enabling it to occupy a wide range of ecological niches and cycle through them. PMID:29272410

  5. Properties of true quaternary fission of nuclei with allowance for its multistep and sequential character

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kadmensky, S. G., E-mail: kadmensky@phys.vsu.ru; Titova, L. V.; Bulychev, A. O.

    An analysis of basic mechanisms of binary and ternary fission of nuclei led to the conclusion that true ternary and quaternary fission of nuclei has a sequential two-step (three-step) character, where, at the first step, a fissile nucleus emits a third light particle (third and fourth light particles) under shakeup effects associated with a nonadiabatic character of its collective deformation motion, whereupon the residual nucleus undergoes fission to two fission fragments. Owing to this, the formulas derived earlier for the widths with respect to sequential two- and three-step decays of nuclei in constructing the theory of two-step two-proton decays and multistep decays in chains of genetically related nuclei could be used to describe the relative yields and angular and energy distributions of third and fourth light particles emitted in (α, α), (t, t), and (α, t) pairs upon the true quaternary spontaneous fission of 252Cf and thermal-neutron-induced fission of 235U and 233U target nuclei. Mechanisms that explain a sharp decrease in the yield of particles appearing second in time and entering into the composition of light-particle pairs that originate from true quaternary fission of nuclei in relation to the yields of analogous particles in true ternary fission of nuclei are proposed.

  6. Use of qualitative environmental and phenotypic variables in the context of allele distribution models: detecting signatures of selection in the genome of Lake Victoria cichlids.

    PubMed

    Joost, Stéphane; Kalbermatten, Michael; Bezault, Etienne; Seehausen, Ole

    2012-01-01

    When searching for loci possibly under selection in the genome, an alternative to population genetics theoretical models is to establish allele distribution models (ADM) for each locus to directly correlate allelic frequencies and environmental variables such as precipitation, temperature, or sun radiation. Such an approach implementing multiple logistic regression models in parallel was implemented within a computing program named MATSAM: . Recently, this application was improved in order to support qualitative environmental predictors as well as to permit the identification of associations between genomic variation and individual phenotypes, allowing the detection of loci involved in the genetic architecture of polymorphic characters. Here, we present the corresponding methodological developments and compare the results produced by software implementing population genetics theoretical models (DFDIST: and BAYESCAN: ) and ADM (MATSAM: ) in an empirical context to detect signatures of genomic divergence associated with speciation in Lake Victoria cichlid fishes.

  7. Independent and combined analyses of sequences from all three genomic compartments converge on the root of flowering plant phylogeny

    PubMed Central

    Barkman, Todd J.; Chenery, Gordon; McNeal, Joel R.; Lyons-Weiler, James; Ellisens, Wayne J.; Moore, Gerry; Wolfe, Andrea D.; dePamphilis, Claude W.

    2000-01-01

    Plant phylogenetic estimates are most likely to be reliable when congruent evidence is obtained independently from the mitochondrial, plastid, and nuclear genomes with all methods of analysis. Here, results are presented from separate and combined genomic analyses of new and previously published data, including six and nine genes (8,911 bp and 12,010 bp, respectively) for different subsets of taxa that suggest Amborella + Nymphaeales (water lilies) are the first-branching angiosperm lineage. Before and after tree-independent noise reduction, most individual genomic compartments and methods of analysis estimated the Amborella + Nymphaeales basal topology with high support. Previous phylogenetic estimates placing Amborella alone as the first extant angiosperm branch may have been misled because of a series of specific problems with paralogy, suboptimal outgroups, long-branch taxa, and method dependence. Ancestral character state reconstructions differ between the two topologies and affect inferences about the features of early angiosperms. PMID:11069280

  8. Relations between Shannon entropy and genome order index in segmenting DNA sequences.

    PubMed

    Zhang, Yi

    2009-04-01

    Shannon entropy H and genome order index S are used in segmenting DNA sequences. Zhang [Phys. Rev. E 72, 041917 (2005)] found that the two schemes are equivalent when a DNA sequence is converted to a binary sequence of S (strong H bond) and W (weak H bond). They left the mathematical proof to mathematicians who are interested in this issue. In this paper, a possible mathematical explanation is given. Moreover, we find that Chargaff parity rule 2 is the necessary condition of the equivalence, and the equivalence disappears when a DNA sequence is regarded as a four-symbol sequence. At last, we propose that S - 2^(-H) may be related to species evolution.
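
    To make the two quantities in the abstract above concrete, the following sketch computes both for a DNA string converted to the binary strong/weak (S/W) alphabet. It assumes the genome order index is the sum of squared symbol frequencies (the binary analogue of a^2 + c^2 + g^2 + t^2); that definition, the function names, and the example sequence are assumptions for illustration and are not taken from the paper's text.

```python
import math
from collections import Counter

STRONG = set("GC")  # bases pairing with three hydrogen bonds
WEAK = set("AT")    # bases pairing with two hydrogen bonds


def to_strong_weak(seq):
    """Convert a DNA string to a binary string over the symbols S and W."""
    out = []
    for base in seq.upper():
        if base in STRONG:
            out.append("S")
        elif base in WEAK:
            out.append("W")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return "".join(out)


def shannon_entropy(symbols):
    """Shannon entropy in bits of the symbol frequencies."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def order_index(symbols):
    """Sum of squared symbol frequencies (assumed form of the order index)."""
    n = len(symbols)
    return sum((c / n) ** 2 for c in Counter(symbols).values())


# Hypothetical example sequence, not from the paper.
binary = to_strong_weak("ATGCGCGGAT")
h = shannon_entropy(binary)
s = order_index(binary)
print(binary, h, s, s - 2 ** (-h))
```

    In the binary case both H and this form of S depend only on the strong-base fraction p, and both are monotone in |p - 1/2|, which is one informal way to see why segmentations based on either could coincide for two symbols.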

  9. ArrayExpress update--trends in database growth and links to data analysis tools.

    PubMed

    Rustici, Gabriella; Kolesnikov, Nikolay; Brandizi, Marco; Burdett, Tony; Dylag, Miroslaw; Emam, Ibrahim; Farne, Anna; Hastings, Emma; Ison, Jon; Keays, Maria; Kurbatova, Natalja; Malone, James; Mani, Roby; Mupo, Annalisa; Pedro Pereira, Rui; Pilicheva, Ekaterina; Rung, Johan; Sharma, Anjan; Tang, Y Amy; Ternent, Tobias; Tikhonov, Andrew; Welter, Danielle; Williams, Eleanor; Brazma, Alvis; Parkinson, Helen; Sarkans, Ugis

    2013-01-01

    The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.

  10. Non-monotonic dynamics of water in its binary mixture with 1,2-dimethoxy ethane: A combined THz spectroscopic and MD simulation study.

    PubMed

    Das Mahanta, Debasish; Patra, Animesh; Samanta, Nirnay; Luong, Trung Quan; Mukherjee, Biswaroop; Mitra, Rajib Kumar

    2016-10-28

    A combined experimental (mid- and far-infrared FTIR spectroscopy and THz time domain spectroscopy (TTDS) (0.3-1.6 THz)) and molecular dynamics (MD) simulation approach is used to understand the evolution of the structure and dynamics of water in its binary mixture with 1,2-dimethoxy ethane (DME) over the entire concentration range. The cooperative hydrogen bond dynamics of water obtained from Debye relaxation of TTDS data reveals a non-monotonous behaviour in which the collective dynamics is much faster in the low X_w region (where X_w is the mole fraction of water in the mixture), whereas in the X_w ∼ 0.8 region, the dynamics gets slower than that of pure water. The concentration dependence of the reorientation times of water, calculated from the MD simulations, also captures this non-monotonous character. The MD simulation trajectories reveal the presence of large amplitude angular jumps, which dominate the orientational relaxation. We rationalize the non-monotonous, concentration dependent orientational dynamics by identifying two different physical mechanisms which operate at high and low water concentration regimes.

  11. Transgenic fertile Scoparia dulcis L., a folk medicinal plant, conferred with a herbicide-resistant trait using an Ri binary vector.

    PubMed

    Yamazaki, M; Son, L; Hayashi, T; Morita, N; Asamizu, T; Mourakoshi, I; Saito, K

    1996-01-01

    Transgenic herbicide-resistant Scoparia dulcis plants were obtained by using an Ri binary vector system. The chimeric bar gene encoding phosphinothricin acetyltransferase flanked by the promoter for cauliflower mosaic virus 35S RNA and the terminal sequence for nopaline synthase was introduced in the plant genome by Agrobacterium-mediated transformation by means of scratching young plants. Hairy roots resistant to bialaphos were selected and plantlets (R0) were regenerated. Progenies (S1) were obtained by self-fertilization. The transgenic state was confirmed by DNA-blot hybridization and assaying of neomycin phosphotransferase II. Expression of the bar gene in the transgenic R0 and S1 progenies was indicated by the activity of phosphinothricin acetyltransferase. Transgenic plants accumulated scopadulcic acid B, a specific secondary metabolite of S. dulcis, in amounts of 15-60% compared with that in normal plants. The transgenic plants and progenies showed resistant trait towards bialaphos and phosphinothricin. These results suggest that an Ri binary system is one of the useful tools for the transformation of medicinal plants for which a regeneration protocol has not been established.

  12. High-Pressure Combustion of Binary Fuel Sprays

    NASA Technical Reports Server (NTRS)

    Williams, F. A.; Dietrich, Daniel L.

    2001-01-01

    The research addressed here represents a small cooperative project between the US and Japan. The authors have now been involved in this project for a number of years. In previous workshops, the presentation has focused narrowly on the specific most recent accomplishment. If this tradition were followed again, then material about to be published would form the basis of the present write-up. At the present stage, however, it may be of greater interest to step back and take a longer look at the overall character of the project and its history. The recent accomplishments therefore will be covered here only in an abbreviated manner.

  13. Mass correlation between light and heavy reaction products in multinucleon transfer 197Au+130Te collisions

    NASA Astrophysics Data System (ADS)

    Galtarossa, F.; Corradi, L.; Szilner, S.; Fioretto, E.; Pollarolo, G.; Mijatović, T.; Montanari, D.; Ackermann, D.; Bourgin, D.; Courtin, S.; Fruet, G.; Goasduff, A.; Grebosz, J.; Haas, F.; Jelavić Malenica, D.; Jeong, S. C.; Jia, H. M.; John, P. R.; Mengoni, D.; Milin, M.; Montagnoli, G.; Scarlassara, F.; Skukan, N.; Soić, N.; Stefanini, A. M.; Strano, E.; Tokić, V.; Ur, C. A.; Valiente-Dobón, J. J.; Watanabe, Y. X.

    2018-05-01

    We studied multinucleon transfer reactions in the 197Au+130Te system at Elab=1.07 GeV by employing the PRISMA magnetic spectrometer coupled to a coincident detector. For each light fragment we constructed, in coincidence, the distribution in mass of the heavy partner of the reaction. With a Monte Carlo method, starting from the binary character of the reaction, we simulated the de-excitation process of the produced heavy fragments to be able to understand their final mass distribution. The total cross sections for pure neutron transfer channels have also been extracted and compared with calculations performed with the grazing code.

  14. NGSPanPipe: A Pipeline for Pan-genome Identification in Microbial Strains from Experimental Reads.

    PubMed

    Kulsum, Umay; Kapil, Arti; Singh, Harpreet; Kaur, Punit

    2018-01-01

    Recent advancements in sequencing technologies have decreased both time span and cost for sequencing the whole bacterial genome. High-throughput Next-Generation Sequencing (NGS) technology has led to the generation of enormous data concerning microbial populations publicly available across various repositories. As a consequence, it has become possible to study and compare the genomes of different bacterial strains within a species or genus in terms of evolution, ecology and diversity. Studying the pan-genome provides insights into deciphering microevolution, global composition and diversity in virulence and pathogenesis of a species. It can also assist in identifying drug targets and proposing vaccine candidates. The effective analysis of these large genome datasets necessitates the development of robust tools. Current methods to develop pan-genome do not support direct input of raw reads from the sequencer machine but require preprocessing of reads as an assembled protein/gene sequence file or the binary matrix of orthologous genes/proteins. We have designed an easy-to-use integrated pipeline, NGSPanPipe, which can directly identify the pan-genome from short reads. The output from the pipeline is compatible with other pan-genome analysis tools. We evaluated our pipeline with other methods for developing pan-genome, i.e. reference-based assembly and de novo assembly using simulated reads of Mycobacterium tuberculosis. The single script pipeline (pipeline.pl) is applicable for all bacterial strains. It integrates multiple in-house Perl scripts and is freely accessible from https://github.com/Biomedinformatics/NGSPanPipe.
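
    The "binary matrix of orthologous genes/proteins" mentioned above can be illustrated with a short sketch. The Python below is not part of NGSPanPipe (which the abstract describes as a Perl pipeline); it assumes a hypothetical presence/absence layout (one 0/1 row per gene family, one column per strain) and uses made-up gene and strain names. It only shows how such a matrix splits gene families into core, accessory and strain-specific sets.

```python
from typing import Dict, List, Set, Tuple


def classify_pan_genome(
    presence: Dict[str, List[int]], strains: List[str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split gene families into (core, accessory, unique) sets.

    `presence` maps a gene family name to a list of 0/1 flags, one per
    strain, in the same order as `strains` (hypothetical layout).
    """
    core: Set[str] = set()
    accessory: Set[str] = set()
    unique: Set[str] = set()
    n_strains = len(strains)
    for gene, row in presence.items():
        if len(row) != n_strains:
            raise ValueError(f"row for {gene} does not match the strain list")
        count = sum(row)
        if count == n_strains:
            core.add(gene)        # present in every strain
        elif count == 1:
            unique.add(gene)      # present in exactly one strain
        elif count > 1:
            accessory.add(gene)   # present in some, but not all, strains
        # families absent from all strains are ignored
    return core, accessory, unique


# Illustrative toy data only; not taken from any real dataset.
strains = ["strainA", "strainB", "strainC"]
presence = {
    "geneA": [1, 1, 1],
    "geneB": [1, 0, 1],
    "geneC": [0, 0, 1],
    "geneD": [1, 1, 1],
}
core, accessory, unique = classify_pan_genome(presence, strains)
pan_genome_size = len(core) + len(accessory) + len(unique)
print(sorted(core), sorted(accessory), sorted(unique), pan_genome_size)
```

    In this toy layout the pan-genome is the union of the three sets; real tools differ in how they define orthologous families and in the thresholds they use for "core".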

  15. Polarization effects in the reactions p + 3 He → π+ + 4 He, π+ + 4 He → p + 3 He and quantum character of spin correlations in the final (p, 3 He) system

    NASA Astrophysics Data System (ADS)

    Lyuboshitz, Valery V.; Lyuboshitz, Vladimir L.

    2017-12-01

    The general consequences of T invariance for the direct and inverse binary reactions a + b → c + d, c + d → a + b with spin-1/2 particles a, b and unpolarized particles c, d are considered. Using the formalism of helicity amplitudes, the polarization effects are studied in the reaction p + 3 He → π+ + 4 He and in the inverse process π+ + 4 He → p + 3 He. It is shown that in the reaction π+ + 4 He → p + 3 He the spins of the final proton and 3 He nucleus are strongly correlated. A structural expression through helicity amplitudes, corresponding to arbitrary emission angles, is obtained for the correlation tensor. It is established that in the reaction π+ + 4 He → p + 3 He one of the “classical” incoherence inequalities of the Bell type for diagonal components of the correlation tensor is necessarily violated and, thus, the spin correlations of the final particles have a strongly pronounced quantum character.

  16. Genomic structural variation contributes to phenotypic change of industrial bioethanol yeast Saccharomyces cerevisiae.

    PubMed

    Zhang, Ke; Zhang, Li-Jie; Fang, Ya-Hong; Jin, Xin-Na; Qi, Lei; Wu, Xue-Chang; Zheng, Dao-Qiong

    2016-03-01

    Genomic structural variation (GSV) is a ubiquitous phenomenon observed in the genomes of Saccharomyces cerevisiae strains with different genetic backgrounds; however, the physiological and phenotypic effects of GSV are not well understood. Here, we first revealed the genetic characteristics of a widely used industrial S. cerevisiae strain, ZTW1, by whole genome sequencing. ZTW1 was identified as an aneuploidy strain and a large-scale GSV was observed in the ZTW1 genome compared with the genome of a diploid strain YJS329. These GSV events led to copy number variations (CNVs) in many chromosomal segments as well as one whole chromosome in the ZTW1 genome. Changes in the DNA dosage of certain functional genes directly affected their expression levels and the resultant ZTW1 phenotypes. Moreover, CNVs of large chromosomal regions triggered an aneuploidy stress in ZTW1. This stress decreased the proliferation ability and tolerance of ZTW1 to various stresses, while aneuploidy response stress may also provide some benefits to the fermentation performance of the yeast, including increased fermentation rates and decreased byproduct generation. This work reveals genomic characters of the bioethanol S. cerevisiae strain ZTW1 and suggests that GSV is an important kind of mutation that changes the traits of industrial S. cerevisiae strains.

  17. An intergeneric hybrid of a native minnow, the golden shiner, and an exotic minnow, the rudd

    USGS Publications Warehouse

    Burkhead, N.M.; Williams, J.D.

    1991-01-01

    The hybrid golden shiner Notemigonus crysoleucas × rudd Scardinius erythrophthalmus is the first known nonsalmonid, intergeneric hybrid of an exotic species and a North American native species. The cross is also the first valid record of a viable hybrid involving the native golden shiner. Meristic and mensural characters of 30 artificially produced hybrids of male golden shiners and female rudds were analyzed. Forty-seven percent of the meristic traits exhibited character states intermediate between those of parents. Twenty-seven percent of the meristic characters were supernumerary, suggesting developmental instability of the hybrid genome. Mensural hybrid characters were significantly skewed to the golden shiner phenotype. The skewed mensural inheritance and other skewed patterns of morphological inheritance also suggest problems in canalization of the hybrid phenome or atypical patterns of dominance. All hybrids were identifiable by intermediate squamation of the cultrate abdomen: the keel was mostly scaled but exhibited a small fleshy ridge posteriorly. This minnow hybrid allows general inferences to be made about the phylogenetic affinity of the golden shiner to other cultrate cyprinids of Eurasia. The hybrid cross has important management and conservation implications for fishes in North America. The hybrid is an example of how an exotic species may negatively affect a native species.

  18. Using a color-coded ambigraphic nucleic acid notation to visualize conserved palindromic motifs within and across genomes

    PubMed Central

    2014-01-01

    Background Ambiscript is a graphically-designed nucleic acid notation that uses symbol symmetries to support sequence complementation, highlight biologically-relevant palindromes, and facilitate the analysis of consensus sequences. Although the original Ambiscript notation was designed to easily represent consensus sequences for multiple sequence alignments, the notation’s black-on-white ambiguity characters are unable to reflect the statistical distribution of nucleotides found at each position. We now propose a color-augmented ambigraphic notation to encode the frequency of positional polymorphisms in these consensus sequences. Results We have implemented this color-coding approach by creating an Adobe Flash® application (http://www.ambiscript.org) that shades and colors modified Ambiscript characters according to the prevalence of the encoded nucleotide at each position in the alignment. The resulting graphic helps viewers perceive biologically-relevant patterns in multiple sequence alignments by uniquely combining color, shading, and character symmetries to highlight palindromes and inverted repeats in conserved DNA motifs. Conclusion Juxtaposing an intuitive color scheme over the deliberate character symmetries of an ambigraphic nucleic acid notation yields a highly-functional nucleic acid notation that maximizes information content and successfully embodies key principles of graphic excellence put forth by the statistician and graphic design theorist, Edward Tufte. PMID:24447494

  19. Molecular characterization of natural orchid in South slopes of Mount Merapi, Sleman regency, Yogyakarta

    NASA Astrophysics Data System (ADS)

    Ferdiani, Defika I.; Devi, Fera L.; Koentjana, Johan P.; Milasari, Asri F.; Nur'aini, Indah; Semiarti, Endang

    2015-09-01

    Natural orchids are among the most important components of tropical biodiversity. Indonesia hosts approximately 6000 of the world's 30000 orchid species, of which approximately 60 occur on Mount Merapi. Repeated eruptions of Merapi have wiped out orchid biodiversity, so efforts to conserve the orchids and to establish a database of the natural orchids of Mount Merapi are needed. The orchid database can be built from DNA analysis and DNA barcoding. DNA barcodes can be used as molecular markers. Differences in morphological characters usually correspond to different patterns in DNA fragments. This research aims to characterize the phenotype and genotype of the natural orchids of Mt. Merapi based on morphology and on the DNA structure of the trnL-F intergenic region of orchid chloroplast DNA. The Amplified Fragment Length Polymorphism (AFLP) technique was used to characterize the molecular types of orchids in silico for the intergenic spacer region of the orchid chloroplast. In this study, 11 species of orchids were characterized based on morphological and molecular characters. The molecular characters were obtained from the trnL-F intergenic region of leaf chloroplasts. The data indicate that there is a conserved DNA pattern in all orchids, along with distinctive characters in some orchids. Based on the trnL-F intergenic region of the chloroplast genome, the phylogenetic tree revealed that the 11 species of orchids at Mt. Merapi can be grouped into 2 clades, which matched the morphological characters.

  20. Simultaneous gene finding in multiple genomes.

    PubMed

    König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

    2016-11-15

    As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or, if not, where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/

  1. Phylogenomic evidence for ancient hybridization in the genomes of living cats (Felidae)

    PubMed Central

    Li, Gang; Davis, Brian W.; Eizirik, Eduardo; Murphy, William J.

    2016-01-01

    Inter-species hybridization has been recently recognized as potentially common in wild animals, but the extent to which it shapes modern genomes is still poorly understood. Distinguishing historical hybridization events from other processes leading to phylogenetic discordance among different markers requires a well-resolved species tree that considers all modes of inheritance and overcomes systematic problems due to rapid lineage diversification by sampling large genomic character sets. Here, we assessed genome-wide phylogenetic variation across a diverse mammalian family, Felidae (cats). We combined genotypes from a genome-wide SNP array with additional autosomal, X- and Y-linked variants to sample ∼150 kb of nuclear sequence, in addition to complete mitochondrial genomes generated using light-coverage Illumina sequencing. We present the first robust felid time tree that accounts for unique maternal, paternal, and biparental evolutionary histories. Signatures of phylogenetic discordance were abundant in the genomes of modern cats, in many cases indicating hybridization as the most likely cause. Comparison of big cat whole-genome sequences revealed a substantial reduction of X-linked divergence times across several large recombination cold spots, which were highly enriched for signatures of selection-driven post-divergence hybridization between the ancestors of the snow leopard and lion lineages. These results highlight the mosaic origin of modern felid genomes and the influence of sex chromosomes and sex-biased dispersal in post-speciation gene flow. A complete resolution of the tree of life will require comprehensive genomic sampling of biparental and sex-limited genetic variation to identify and control for phylogenetic conflict caused by ancient admixture and sex-biased differences in genomic transmission. PMID:26518481

  2. Approximate strip exchanging.

    PubMed

    Roy, Swapnoneel; Thakur, Ashok Kumar

    2008-01-01

    Genome rearrangements have been modelled by a variety of primitives such as reversals, transpositions, block moves and block interchanges. We consider one such genome rearrangement primitive, strip exchanges. Given a permutation, the challenge is to sort it using the minimum number of strip exchanges. A strip exchanging move interchanges the positions of two chosen strips so that they merge with other strips. The strip exchange problem is to sort a permutation using the minimum number of strip exchanges. We present here the first non-trivial 2-approximation algorithm for this problem. We also observe that sorting by strip exchanges is fixed-parameter tractable. Lastly, we discuss the application of strip exchanges in a different area, Optical Character Recognition (OCR), with an example.
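
    A strip exchange move can be sketched in a few lines of Python. This is not the authors' 2-approximation algorithm; it only shows the primitive itself. It assumes the definition of a strip that is common in the block-move literature (a maximal run of entries that increase by exactly 1), which may differ in detail from the paper's definition. The function names `find_strips` and `exchange_strips` are hypothetical.

```python
from typing import List


def find_strips(perm: List[int]) -> List[List[int]]:
    """Split a permutation into maximal runs of consecutive increasing values."""
    strips: List[List[int]] = []
    for value in perm:
        if strips and value == strips[-1][-1] + 1:
            strips[-1].append(value)   # extends the current strip
        else:
            strips.append([value])     # starts a new strip
    return strips


def exchange_strips(perm: List[int], i: int, j: int) -> List[int]:
    """Return the permutation obtained by swapping the i-th and j-th strips."""
    strips = find_strips(perm)
    strips[i], strips[j] = strips[j], strips[i]
    return [value for strip in strips for value in strip]


perm = [4, 5, 3, 1, 2]
strips_before = find_strips(perm)        # [[4, 5], [3], [1, 2]]
sorted_perm = exchange_strips(perm, 0, 2)
strips_after = find_strips(sorted_perm)  # a single strip once sorted
print(strips_before, sorted_perm, strips_after)
```

    In this example a single exchange merges all three strips, so the permutation is sorted in one move. Finding the minimum number of such moves in general is the optimization problem the abstract addresses.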

  3. A simple Gateway-assisted construction system of TALEN genes for plant genome editing.

    PubMed

    Kusano, Hiroaki; Onodera, Hitomi; Kihira, Miho; Aoki, Hiromi; Matsuzaki, Hikaru; Shimada, Hiroaki

    2016-07-25

    TALEN is an artificial nuclease being applied for sequence-specific genome editing. For the plant genome editing, a pair of TALEN genes is expressed in the cells, and a binary plasmid for Agrobacterium-mediated transformation should be assembled. We developed a novel procedure using the Gateway-assisted plasmids, named Emerald-Gateway TALEN system. We constructed entry vectors, pPlat plasmids, for construction of a desired TALEN gene using Platinum Gate TALEN kit. We also created a destination plasmid, pDual35SGw1301, which allowed two TALEN genes, one for each DNA strand, to be recruited using Gateway technology. Resultant TALEN genes were evaluated by the single-strand annealing (SSA) assay in E. coli cells. By this assay, the TALENs recognized the corresponding targets in the divided luciferase gene, and induced a specific recombination to generate an active luciferase gene. Using the TALEN genes constructed, we created transformant potato cells in which a site-specific mutation occurred at the target site of the GBSS gene. This suggested that our system worked effectively and was applicable as a convenient tool for the plant genome editing.

  4. Viral genetic variation accounts for a third of variability in HIV-1 set-point viral load in Europe.

    PubMed

    Blanquart, François; Wymant, Chris; Cornelissen, Marion; Gall, Astrid; Bakker, Margreet; Bezemer, Daniela; Hall, Matthew; Hillebregt, Mariska; Ong, Swee Hoe; Albert, Jan; Bannert, Norbert; Fellay, Jacques; Fransen, Katrien; Gourlay, Annabelle J; Grabowski, M Kate; Gunsenheimer-Bartmeyer, Barbara; Günthard, Huldrych F; Kivelä, Pia; Kouyos, Roger; Laeyendecker, Oliver; Liitsola, Kirsi; Meyer, Laurence; Porter, Kholoud; Ristola, Matti; van Sighem, Ard; Vanham, Guido; Berkhout, Ben; Kellam, Paul; Reiss, Peter; Fraser, Christophe

    2017-06-01

    HIV-1 set-point viral load, the approximately stable value of viraemia in the first years of chronic infection, is a strong predictor of clinical outcome and is highly variable across infected individuals. To better understand HIV-1 pathogenesis and the evolution of the viral population, we must quantify the heritability of set-point viral load, which is the fraction of variation in this phenotype attributable to viral genetic variation. However, current estimates of heritability vary widely, from 6% to 59%. Here we used a dataset of 2,028 seroconverters infected between 1985 and 2013 from 5 European countries (Belgium, Switzerland, France, the Netherlands and the United Kingdom) and estimated the heritability of set-point viral load at 31% (CI 15%-43%). Specifically, heritability was measured using models of character evolution describing how viral load evolves on the phylogeny of whole-genome viral sequences. In contrast to previous studies, (i) we measured viral loads using standardized assays on a sample collected in a strict time window of 6 to 24 months after infection, from which the viral genome was also sequenced; (ii) we compared 2 models of character evolution, the classical "Brownian motion" model and another model ("Ornstein-Uhlenbeck") that includes stabilising selection on viral load; (iii) we controlled for covariates, including age and sex, which may inflate estimates of heritability; and (iv) we developed a goodness of fit test based on the correlation of viral loads in cherries of the phylogenetic tree, showing that both models of character evolution fit the data well. An overall heritability of 31% (CI 15%-43%) is consistent with other studies based on regression of viral load in donor-recipient pairs. Thus, about a third of variation in HIV-1 virulence is attributable to viral genetic variation.

  5. Determinación de miembros, binaridad y metalicidad de gigantes rojas en el cúmulo abierto de edad intermedia NGC 2354

    NASA Astrophysics Data System (ADS)

    Clariá, J. J.; Mermilliod, J. C.; Piatti, A. E.

    We present new Coravel radial-velocity observations and photoelectric photometry in the UBV, DDO and Washington systems for a sample of red giant candidates in the field of the intermediate-age open cluster NGC 2354. Photometric membership probabilities show very good agreement with those obtained from Coravel radial velocities. The analysis of the photometric and kinematical data allows us to confirm cluster membership for 9 red giants, one of them being a spectroscopic binary, while 4 confirmed spectroscopic binaries appear to be probable members. We have also discovered 4 spectroscopic binaries not belonging to the cluster. A mean radial velocity of (33.40±0.27)km s-1 and a mean reddening E(B-V)= 0.13±0.03 were derived for the cluster giants. NGC 2354 has a mean ultraviolet excess <δ(U-B)>=-0.03±0.01, relative to the field K giants, and a mean new cyanogen anomaly ΔCN=-0.035±0.007, both implying [Fe/H]≈-0.3. The moderately metal-poor character of NGC 2354 is confirmed using five different metal abundance indicators of the Washington system. The cluster giant branch is formed by a well defined clump of 7 stars and 4 stars with high membership probabilities seem to define an ascending giant branch. The whole red giant locus cannot be reproduced by any theoretical track. This paper will appear in Astron. & Astrophys. Suppl. (1999).

  6. The genomic architecture and association genetics of adaptive characters using a candidate SNP approach in boreal black spruce

    PubMed Central

    2013-01-01

    Background The genomic architecture of adaptive traits remains poorly understood in non-model plants. Various approaches can be used to bridge this gap, including the mapping of quantitative trait loci (QTL) in pedigrees, and genetic association studies in non-structured populations. Here we present results on the genomic architecture of adaptive traits in black spruce, which is a widely distributed conifer of the North American boreal forest. As an alternative to the usual candidate gene approach, a candidate SNP approach was developed for association testing. Results A genetic map containing 231 gene loci was used to identify QTL that were related to budset timing and to tree height assessed over multiple years and sites. Twenty-two unique genomic regions were identified, including 20 that were related to budset timing and 6 that were related to tree height. From results of outlier detection and bulk segregant analysis for adaptive traits using DNA pool sequencing of 434 genes, 52 candidate SNPs were identified and subsequently tested in genetic association studies for budset timing and tree height assessed over multiple years and sites. A total of 34 (65%) SNPs were significantly associated with budset timing, or tree height, or both. Although the percentages of explained variance (PVE) by individual SNPs were small, several significant SNPs were shared between sites and among years. Conclusions The sharing of genomic regions and significant SNPs between budset timing and tree height indicates pleiotropic effects. Significant QTLs and SNPs differed quite greatly among years, suggesting that different sets of genes for the same characters are involved at different stages in the tree’s life history. The functional diversity of genes carrying significant SNPs and low observed PVE further indicated that a large number of polymorphisms are involved in adaptive genetic variation. 
    Accordingly, for undomesticated species such as black spruce with natural populations of large effective size and low linkage disequilibrium, efficient marker systems that are predictive of adaptation should require the survey of large numbers of SNPs. Candidate SNP approaches like the one developed in the present study could contribute to reducing these numbers. PMID:23724860

  7. Distinctive characters of Nostoc genomes in cyanolichens.

    PubMed

    Gagunashvili, Andrey N; Andrésson, Ólafur S

    2018-06-05

    Cyanobacteria of the genus Nostoc are capable of forming symbioses with a wide range of organisms, including a diverse assemblage of cyanolichens. Only certain lineages of Nostoc appear to be able to form a close, stable symbiosis, raising the question whether symbiotic competence is determined by specific sets of genes and functionalities. We present the complete genome sequencing, annotation and analysis of two lichen Nostoc strains. Comparison with other Nostoc genomes allowed identification of genes potentially involved in symbioses with a broad range of partners including lichen mycobionts. The presence of additional genes necessary for symbiotic competence is likely reflected in larger genome sizes of symbiotic Nostoc strains. Some of the identified genes are presumably involved in the initial recognition and establishment of the symbiotic association, while others may confer advantage to cyanobionts during cohabitation with a mycobiont in the lichen symbiosis. Our study presents the first genome sequencing and genome-scale analysis of lichen-associated Nostoc strains. These data provide insight into the molecular nature of the cyanolichen symbiosis and pinpoint candidate genes for further studies aimed at deciphering the genetic mechanisms behind the symbiotic competence of Nostoc. Since many phylogenetic studies have shown that Nostoc is a polyphyletic group that includes several lineages, this work also provides an improved molecular basis for demarcation of a Nostoc clade with symbiotic competence.

  8. Threshold models for genome-enabled prediction of ordinal categorical traits in plant breeding.

    PubMed

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo; Eskridge, Kent; Crossa, José

    2014-12-23

    Categorical scores for disease susceptibility or resistance often are recorded in plant breeding. The aim of this study was to introduce genomic models for analyzing ordinal characters and to assess the predictive ability of genomic predictions for ordered categorical phenotypes using a threshold model counterpart of the Genomic Best Linear Unbiased Predictor (i.e., TGBLUP). The threshold model was used to relate a hypothetical underlying scale to the outward categorical response. We present an empirical application where a total of nine models, five without interaction and four with genomic × environment interaction (G×E) and genomic additive × additive × environment interaction (G×G×E), were used. We assessed the proposed models using data consisting of 278 maize lines genotyped with 46,347 single-nucleotide polymorphisms and evaluated for disease resistance [with ordinal scores from 1 (no disease) to 5 (complete infection)] in three environments (Colombia, Zimbabwe, and Mexico). Models with G×E captured a sizeable proportion of the total variability, which indicates the importance of introducing interaction to improve prediction accuracy. Relative to models based on main effects only, the models that included G×E achieved 9-14% gains in prediction accuracy; adding additive × additive interactions did not increase prediction accuracy consistently across locations. Copyright © 2015 Montesinos-López et al.
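
    The threshold idea described above, in which a hypothetical underlying scale is cut into ordered categories, can be sketched as follows. This is not the TGBLUP implementation from the study. The threshold values are invented for illustration, the probit (standard normal) link is an assumption that the abstract does not state, and the function names are hypothetical.

```python
import math
from bisect import bisect_left
from typing import List


def liability_to_category(liability: float, thresholds: List[float]) -> int:
    """Map a latent liability to an ordinal score 1..len(thresholds)+1.

    Category c is assigned when t_(c-1) < liability <= t_c, with the
    thresholds sorted in increasing order.
    """
    return bisect_left(thresholds, liability) + 1


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probabilities(eta: float, thresholds: List[float]) -> List[float]:
    """P(category = c) for linear predictor eta under an assumed probit link."""
    bounds = [-math.inf] + list(thresholds) + [math.inf]
    return [
        normal_cdf(bounds[k + 1] - eta) - normal_cdf(bounds[k] - eta)
        for k in range(len(bounds) - 1)
    ]


# Four illustrative thresholds give five ordered categories (scores 1 to 5).
thresholds = [-1.0, 0.0, 1.0, 2.0]
liabilities = [-1.5, -0.2, 0.5, 1.7, 2.4]
scores = [liability_to_category(l, thresholds) for l in liabilities]
probs = category_probabilities(0.3, thresholds)
print(scores, [round(p, 3) for p in probs])
```

    In a genomic prediction setting the linear predictor would come from marker and environment effects; here it is just a number chosen to show that the category probabilities sum to one.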

  9. Phylogenetics of modern birds in the era of genomics

    PubMed Central

    Edwards, Scott V; Bryan Jennings, W; Shedlock, Andrew M

    2005-01-01

    In the 14 years since the first higher-level bird phylogenies based on DNA sequence data, avian phylogenetics has witnessed the advent and maturation of the genomics era, the completion of the chicken genome and a suite of technologies that promise to add considerably to the agenda of avian phylogenetics. In this review, we summarize current approaches and data characteristics of recent higher-level bird studies and suggest a number of as yet untested molecular and analytical approaches for the unfolding tree of life for birds. A variety of comparative genomics strategies, including adoption of objective quality scores for sequence data, analysis of contiguous DNA sequences provided by large-insert genomic libraries, and the systematic use of retroposon insertions and other rare genomic changes all promise an integrated phylogenetics that is solidly grounded in genome evolution. The avian genome is an excellent testing ground for such approaches because of the more balanced representation of single-copy and repetitive DNA regions than in mammals. Although comparative genomics has a number of obvious uses in avian phylogenetics, its application to large numbers of taxa poses a number of methodological and infrastructural challenges, and can be greatly facilitated by a ‘community genomics’ approach in which the modest sequencing throughputs of single PI laboratories are pooled to produce larger, complementary datasets. Although the polymerase chain reaction era of avian phylogenetics is far from complete, the comparative genomics era—with its ability to vastly increase the number and type of molecular characters and to provide a genomic context for these characters—will usher in a host of new perspectives and opportunities for integrating genome evolution and avian phylogenetics. PMID:16024355

  10. Phylogenic study of Lemnoideae (duckweeds) through complete chloroplast genomes for eight accessions.

    PubMed

    Ding, Yanqiang; Fang, Yang; Guo, Ling; Li, Zhidan; He, Kaize; Zhao, Yun; Zhao, Hai

    2017-01-01

    Phylogenetic relationship within different genera of Lemnoideae, a kind of small aquatic monocotyledonous plants, was not well resolved, using either morphological characters or traditional markers. Given that rich genetic information in chloroplast genome makes them particularly useful for phylogenetic studies, we used chloroplast genomes to clarify the phylogeny within Lemnoideae. DNAs were sequenced with next-generation sequencing. The duckweeds chloroplast genomes were indirectly filtered from the total DNA data, or directly obtained from chloroplast DNA data. To test the reliability of assembling the chloroplast genome based on the filtration of the total DNA, two methods were used to assemble the chloroplast genome of Landoltia punctata strain ZH0202. A phylogenetic tree was built on the basis of the whole chloroplast genome sequences using MrBayes v.3.2.6 and PhyML 3.0. Eight complete duckweeds chloroplast genomes were assembled, with lengths ranging from 165,775 bp to 171,152 bp, and each contains 80 protein-coding sequences, four rRNAs, 30 tRNAs and two pseudogenes. The identity of L. punctata strain ZH0202 chloroplast genomes assembled through two methods was 100%, and their sequences and lengths were completely identical. The chloroplast genome comparison demonstrated that the differences in chloroplast genome sizes among the Lemnoideae primarily resulted from variation in non-coding regions, especially from repeat sequence variation. The phylogenetic analysis demonstrated that the different genera of Lemnoideae are derived from each other in the following order: Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia. This study demonstrates potential of whole chloroplast genome DNA as an effective option for phylogenetic studies of Lemnoideae. It also showed the possibility of using chloroplast DNA data to elucidate those phylogenies which were not yet solved well by traditional methods even in plants other than duckweeds.

  11. Phylogenic study of Lemnoideae (duckweeds) through complete chloroplast genomes for eight accessions

    PubMed Central

    Ding, Yanqiang; Fang, Yang; Guo, Ling; Li, Zhidan; He, Kaize

    2017-01-01

    Background: Phylogenetic relationship within different genera of Lemnoideae, a kind of small aquatic monocotyledonous plants, was not well resolved, using either morphological characters or traditional markers. Given that rich genetic information in chloroplast genome makes them particularly useful for phylogenetic studies, we used chloroplast genomes to clarify the phylogeny within Lemnoideae. Methods: DNAs were sequenced with next-generation sequencing. The duckweeds chloroplast genomes were indirectly filtered from the total DNA data, or directly obtained from chloroplast DNA data. To test the reliability of assembling the chloroplast genome based on the filtration of the total DNA, two methods were used to assemble the chloroplast genome of Landoltia punctata strain ZH0202. A phylogenetic tree was built on the basis of the whole chloroplast genome sequences using MrBayes v.3.2.6 and PhyML 3.0. Results: Eight complete duckweeds chloroplast genomes were assembled, with lengths ranging from 165,775 bp to 171,152 bp, and each contains 80 protein-coding sequences, four rRNAs, 30 tRNAs and two pseudogenes. The identity of L. punctata strain ZH0202 chloroplast genomes assembled through two methods was 100%, and their sequences and lengths were completely identical. The chloroplast genome comparison demonstrated that the differences in chloroplast genome sizes among the Lemnoideae primarily resulted from variation in non-coding regions, especially from repeat sequence variation. The phylogenetic analysis demonstrated that the different genera of Lemnoideae are derived from each other in the following order: Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia. Discussion: This study demonstrates potential of whole chloroplast genome DNA as an effective option for phylogenetic studies of Lemnoideae. It also showed the possibility of using chloroplast DNA data to elucidate those phylogenies which were not yet solved well by traditional methods even in plants other than duckweeds. PMID:29302399

  12. Complete mitochondrial DNA genome of bonnethead shark, Sphyrna tiburo, and phylogenetic relationships among main superorders of modern elasmobranchs

    PubMed Central

    Díaz-Jaimes, Píndaro; Bayona-Vásquez, Natalia J.; Adams, Douglas H.; Uribe-Alcocer, Manuel

    2015-01-01

    Elasmobranchs are one of the most diverse groups in the marine realm, represented by 18 orders, 55 families and about 1200 reported species, but also one of the most vulnerable to exploitation and to climate change. Phylogenetic relationships among main orders have been controversial since the emergence of the Hypnosqualean hypothesis by Shirai (1992) that considered batoids as a sister group of sharks. The use of the complete mitochondrial DNA (mtDNA) may shed light to further validate this hypothesis by increasing the number of informative characters. We report the mtDNA genome of the bonnethead shark Sphyrna tiburo, and compare it with mitogenomes of 48 other species to assess phylogenetic relationships. The mtDNA genome of S. tiburo is quite similar in size to that of congeneric species but also similar to the reported mtDNA genome of other Carcharhinidae species. Like most vertebrate mitochondrial genomes, it contained 13 protein coding genes, two rRNA genes and 22 tRNA genes and the control region of 1086 bp (D-loop). The Bayesian analysis of the 49 mitogenomes supported the view that sharks and batoids are separate groups. PMID:27014583

  13. Ecological and evolutionary significance of genomic GC content diversity in monocots

    PubMed Central

    Šmarda, Petr; Bureš, Petr; Horová, Lucie; Leitch, Ilia J.; Mucina, Ladislav; Pacini, Ettore; Tichý, Lubomír; Grulich, Vít; Rotreklová, Olga

    2014-01-01

    Genomic DNA base composition (GC content) is predicted to significantly affect genome functioning and species ecology. Although several hypotheses have been put forward to address the biological impact of GC content variation in microbial and vertebrate organisms, the biological significance of GC content diversity in plants remains unclear because of a lack of sufficiently robust genomic data. Using flow cytometry, we report genomic GC contents for 239 species representing 70 of 78 monocot families and compare them with genomic characters, a suite of life history traits and climatic niche data using phylogeny-based statistics. GC content of monocots varied between 33.6% and 48.9%, with several groups exceeding the GC content known for any other vascular plant group, highlighting their unusual genome architecture and organization. GC content showed a quadratic relationship with genome size, with the decreases in GC content in larger genomes possibly being a consequence of the higher biochemical costs of GC base synthesis. Dramatic decreases in GC content were observed in species with holocentric chromosomes, whereas increased GC content was documented in species able to grow in seasonally cold and/or dry climates, possibly indicating an advantage of GC-rich DNA during cell freezing and desiccation. We also show that genomic adaptations associated with changing GC content might have played a significant role in the evolution of the Earth’s contemporary biota, such as the rise of grass-dominated biomes during the mid-Tertiary. One of the major selective advantages of GC-rich DNA is hypothesized to be facilitating more complex gene regulation. PMID:25225383
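
    As a point of reference for the quantity discussed in this record, GC content is the percentage of G and C bases among all called bases. The study above measured it by flow cytometry, not from sequence, so the following is only a minimal sketch of the definition applied to a sequence string; the function name `gc_content` and the example sequence are hypothetical.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a DNA sequence.

    Only A, C, G and T are counted; ambiguous symbols such as N are ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Hypothetical toy sequence: 4 of the 8 called bases are G or C.
    print(gc_content("ATGCGCNNAT"))
```

    Ignoring ambiguous bases keeps the percentage comparable between assemblies with different amounts of uncalled sequence; other conventions exist.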

  14. Phylogenetic analysis of the true water bugs (Insecta: Hemiptera: Heteroptera: Nepomorpha): evidence from mitochondrial genomes

    PubMed Central

    Hua, Jimeng; Li, Ming; Dong, Pengzhi; Cui, Ying; Xie, Qiang; Bu, Wenjun

    2009-01-01

    Background: The true water bugs are grouped in infraorder Nepomorpha (Insecta: Hemiptera: Heteroptera) and are of great economic importance. The phylogenetic relationships within Nepomorpha and the taxonomic hierarchies of Pleoidea and Aphelocheiroidea are uncertain. Most of the previous studies were based on morphological characters without algorithmic assessment. In the latest study, the molecular markers employed in phylogenetic analyses were partial sequences of 16S rDNA and 18S rDNA with a total length about 1 kb. Up to now, no mitochondrial genome of the true water bugs has been sequenced, which is one of the largest data sets that could be compared across animal taxa. In this study we analyzed the unresolved problems in Nepomorpha using evidence from mitochondrial genomes. Results: Nine mitochondrial genomes of Nepomorpha and five of other hemipterans were sequenced. These mitochondrial genomes contain the commonly found 37 genes without gene rearrangements. Based on the nucleotide sequences of mt-genomes, Pleoidea is not a member of the Nepomorpha and Aphelocheiroidea should be grouped back into Naucoroidea. Phylogenetic relationships among the superfamilies of Nepomorpha were resolved robustly. Conclusion: The mt-genome is an effective data source for resolving intraordinal phylogenetic problems at the superfamily level within Heteroptera. The mitochondrial genomes of the true water bugs are typical insect mt-genomes. Based on the nucleotide sequences of the mt-genomes, we propose the Pleoidea to be a separate heteropteran infraorder. The infraorder Nepomorpha consists of five superfamilies with the relationships (Corixoidea + ((Naucoroidea + Notonectoidea) + (Ochteroidea + Nepoidea))). PMID:19523246

  15. Concordance and discordance of sequence survey methods for molecular epidemiology

    PubMed Central

    Hasan, Nur A.; Cebula, Thomas A.; Colwell, Rita R.; Robison, Richard A.; Johnson, W. Evan; Crandall, Keith A.

    2015-01-01

    The post-genomic era is characterized by the direct acquisition and analysis of genomic data with many applications, including the enhancement of the understanding of microbial epidemiology and pathology. However, there are a number of molecular approaches to survey pathogen diversity, and the impact of these different approaches on parameter estimation and inference are not entirely clear. We sequenced whole genomes of bacterial pathogens, Burkholderia pseudomallei, Yersinia pestis, and Brucella spp. (60 new genomes), and combined them with 55 genomes from GenBank to address how different molecular survey approaches (whole genomes, SNPs, and MLST) impact downstream inferences on molecular evolutionary parameters, evolutionary relationships, and trait character associations. We selected isolates for sequencing to represent temporal, geographic origin, and host range variability. We found that substitution rate estimates vary widely among approaches, and that SNP and genomic datasets yielded different but strongly supported phylogenies. MLST yielded poorly supported phylogenies, especially in our low diversity dataset, i.e., Y. pestis. Trait associations showed that B. pseudomallei and Y. pestis phylogenies are significantly associated with geography, irrespective of the molecular survey approach used, while Brucella spp. phylogeny appears to be strongly associated with geography and host origin. We contrast inferences made among monomorphic (clonal) and non-monomorphic bacteria, and between intra- and inter-specific datasets. We also discuss our results in light of underlying assumptions of different approaches. PMID:25737810

  16. DNA variation of the mammalian major histocompatibility complex reflects genomic diversity and population history

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yuhki, Naoya; O'Brien, S.J.

    1990-01-01

    The major histocompatibility complex (MHC) is a multigene complex of tightly linked homologous genes that encode cell surface antigens that play a key role in immune regulation and response to foreign antigens. In most species, MHC gene products display extreme antigenic polymorphism, and their variability has been interpreted to reflect an adaptive strategy for accommodating rapidly evolving infectious agents that periodically afflict natural populations. Determination of the extent of MHC variation has been limited to populations in which skin grafting is feasible or for which serological reagents have been developed. The authors present here a quantitative analysis of restriction fragment length polymorphism of MHC class I genes in several mammalian species (cats, rodents, humans) known to have very different levels of genetic diversity based on functional MHC assays and on allozyme surveys. When homologous class I probes were employed, a notable concordance was observed between the extent of MHC restriction fragment variation and functional MHC variation detected by skin grafts or genome-wide diversity estimated by allozyme screens. These results confirm the genetically depauperate character of the African cheetah, Acinonyx jubatus, and the Asiatic lion, Panthera leo persica; further, they support the use of class I MHC molecular reagents in estimating the extent and character of genetic diversity in natural populations.

  17. DNA variation of the mammalian major histocompatibility complex reflects genomic diversity and population history.

    PubMed Central

    Yuhki, N; O'Brien, S J

    1990-01-01

    The major histocompatibility complex (MHC) is a multigene complex of tightly linked homologous genes that encode cell surface antigens that play a key role in immune regulation and response to foreign antigens. In most species, MHC gene products display extreme antigenic polymorphism, and their variability has been interpreted to reflect an adaptive strategy for accommodating rapidly evolving infectious agents that periodically afflict natural populations. Determination of the extent of MHC variation has been limited to populations in which skin grafting is feasible or for which serological reagents have been developed. We present here a quantitative analysis of restriction fragment length polymorphism of MHC class I genes in several mammalian species (cats, rodents, humans) known to have very different levels of genetic diversity based on functional MHC assays and on allozyme surveys. When homologous class I probes were employed, a notable concordance was observed between the extent of MHC restriction fragment variation and functional MHC variation detected by skin grafts or genome-wide diversity estimated by allozyme screens. These results confirm the genetically depauperate character of the African cheetah, Acinonyx jubatus, and the Asiatic lion, Panthera leo persica; further, they support the use of class I MHC molecular reagents in estimating the extent and character of genetic diversity in natural populations. Images PMID:1967831

  18. Variants in TTC25 affect autistic trait in patients with autism spectrum disorder and general population.

    PubMed

    Vojinovic, Dina; Brison, Nathalie; Ahmad, Shahzad; Noens, Ilse; Pappa, Irene; Karssen, Lennart C; Tiemeier, Henning; van Duijn, Cornelia M; Peeters, Hilde; Amin, Najaf

    2017-08-01

    Autism spectrum disorder (ASD) is a highly heritable neurodevelopmental disorder with a complex genetic architecture. To identify genetic variants underlying ASD, we performed single-variant and gene-based genome-wide association studies using a dense genotyping array containing over 2.3 million single-nucleotide variants in a discovery sample of 160 families with at least one child affected with non-syndromic ASD using a binary (ASD yes/no) phenotype and a quantitative autistic trait. Replication of the top findings was performed in Psychiatric Genomics Consortium and Erasmus Rucphen Family (ERF) cohort study. Significant association of quantitative autistic trait was observed with the TTC25 gene at 17q21.2 (effect size=10.2, P-value=3.4 × 10^-7) in the gene-based analysis. The gene also showed nominally significant association in the cohort-based ERF study (effect=1.75, P-value=0.05). Meta-analysis of discovery and replication improved the association signal (P-value_meta=1.5 × 10^-8). No genome-wide significant signal was observed in the single-variant analysis of either the binary ASD phenotype or the quantitative autistic trait. Our study has identified a novel gene TTC25 to be associated with quantitative autistic trait in patients with ASD. The replication of association in a cohort-based study and the effect estimate suggest that variants in TTC25 may also be relevant for broader ASD phenotype in the general population. TTC25 is overexpressed in frontal cortex and testis and is known to be involved in cilium movement and thus an interesting candidate gene for autistic trait.

  19. Genotoxic potential of the binary mixture of cyanotoxins microcystin-LR and cylindrospermopsin.

    PubMed

    Hercog, Klara; Maisanaba, Sara; Filipič, Metka; Jos, Ángeles; Cameán, Ana M; Žegura, Bojana

    2017-12-01

    Increased eutrophication of water bodies promotes cyanobacterial blooming that is hazardous due to the production of various bioactive compounds. Microcystin-LR (MCLR) is among the most widespread cyanotoxins classified as possible human carcinogen, while cylindrospermopsin (CYN) has only recently been recognized as health concern. Both cyanotoxins are genotoxic; however, the mechanisms of their action differ. They are ubiquitously present in water environment and are often detected together. Therefore, we studied genotoxic potential of the binary mixture of these cyanotoxins. Human hepatoma cells (HepG2) were exposed to a single dose of MCLR (1 μg/mL), graded doses of CYN (0.01-0.5 μg/mL), and their combinations. Comet and Cytokinesis block micronucleus assays were used to detect induction of DNA strand breaks (sb) and genomic instability, respectively, along with the transcriptional analyses of the expression of selected genes involved in xenobiotic metabolism, immediate/early cell response and DNA-damage response. MCLR induced DNA sb that were only transiently present after 4 h exposure, whereas CYN, after 24 h exposure, induced DNA sb and genomic instability. The MCLR/CYN mixture induced DNA sb after 24 h exposure, but to lesser extent as CYN alone. On the other hand, induction of genomic instability by the MCLR/CYN mixture was comparable to that induced by CYN alone. In addition, patterns of changes in the expression of selected genes induced by the MCLR/CYN mixture were not significantly different from those induced by CYN alone. Our results indicate that CYN exerts higher genotoxic potential than MCLR and that genotoxic potential of the MCLR/CYN mixture is comparable to that of CYN alone. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Development of Chloroplast and Nuclear DNA Markers for Chinese Oaks (Quercus Subgenus Quercus) and Assessment of Their Utility as DNA Barcodes

    PubMed Central

    Yang, Jia; Vázquez, Lucía; Chen, Xiaodan; Li, Huimin; Zhang, Hao; Liu, Zhanlin; Zhao, Guifang

    2017-01-01

    Chloroplast DNA (cpDNA) is frequently used for species demography, evolution, and species discrimination of plants. However, the lack of efficient and universal markers often brings particular challenges for genetic studies across different plant groups. In this study, chloroplast genomes from two closely related species (Quercus rubra and Castanea mollissima) in Fagaceae were compared to explore universal cpDNA markers for the Chinese oak species in Quercus subgenus Quercus, a diverse species group without sufficient molecular differentiation. With the comparison, nine and 14 plastid markers were selected as barcoding and phylogeographic candidates for the Chinese oaks. Five (psbA-trnH, matK-trnK, ycf3-trnS, matK, and ycf1) of the nine plastid candidate barcodes, with the addition of newly designed ITS and a single-copy nuclear gene (SAP), were then tested on 35 Chinese oak species employing four different barcoding approaches (genetic distance-, BLAST-, character-, and tree-based methods). The four methods showed different species identification powers with character-based method performing the best. Of the seven barcodes tested, a barcoding gap was absent in all of them across the Chinese oaks, while ITS and psbA-trnH provided the highest species resolution (30.30%) with the character- and BLAST-based methods, respectively. The six-marker combination (psbA-trnH + matK-trnK + matK + ycf1 + ITS + SAP) showed the best species resolution (84.85%) using the character-based method for barcoding the Chinese oaks. The barcoding results provided additional implications for taxonomy of the Chinese oaks in subg. Quercus, basically identifying three major infrageneric clades of the Chinese oaks (corresponding to Groups Quercus, Cerris, and Ilex) referenced to previous phylogenetic classification of Quercus, while the morphology-based allocations proposed for the Chinese oaks in subg. Quercus were challenged. A low variation rate of the chloroplast genome, and complex speciation patterns involving incomplete lineage sorting, interspecific hybridization and introgression, possibly have negative impacts on the species assignment and phylogeny of oak species. PMID:28579999
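
    The "barcoding gap" mentioned in this record is commonly understood as the margin by which the smallest between-species distance exceeds the largest within-species distance. The abstract does not say how the authors computed it, so the sketch below only illustrates that common definition; the function name `barcoding_gap`, the sample labels and the distances are all hypothetical.

```python
from itertools import combinations


def barcoding_gap(distances, species_of):
    """Return min(interspecific distance) - max(intraspecific distance).

    distances:  dict mapping frozenset({sample_a, sample_b}) -> genetic distance
    species_of: dict mapping sample -> species label
    A positive result means a barcoding gap exists for this marker.
    """
    intra, inter = [], []
    for a, b in combinations(sorted(species_of), 2):
        d = distances[frozenset((a, b))]
        if species_of[a] == species_of[b]:
            intra.append(d)
        else:
            inter.append(d)
    if not intra or not inter:
        raise ValueError("need both within- and between-species pairs")
    return min(inter) - max(intra)


if __name__ == "__main__":
    # Hypothetical data: two samples from each of two species.
    species_of = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}
    distances = {
        frozenset(("s1", "s2")): 0.010,
        frozenset(("s3", "s4")): 0.020,
        frozenset(("s1", "s3")): 0.050,
        frozenset(("s1", "s4")): 0.060,
        frozenset(("s2", "s3")): 0.015,
        frozenset(("s2", "s4")): 0.070,
    }
    gap = barcoding_gap(distances, species_of)
    print("gap present" if gap > 0 else "no barcoding gap")
```

    In this toy data one between-species pair (0.015) is closer than one within-species pair (0.020), so no gap exists, which is the situation the record reports for all seven barcodes.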

  1. How to become a superhero

    NASA Astrophysics Data System (ADS)

    Gleiser, Pablo M.

    2007-09-01

    We analyze a collaboration network based on the Marvel Universe comic books. First, we consider the system as a binary network, where two characters are connected if they appear in the same publication. The analysis of degree correlations reveals that, in contrast to most real social networks, the Marvel Universe presents a disassortative mixing on the degree. Then, we use a weight measure to study the system as a weighted network. This allows us to find and characterize well defined communities. Through the analysis of the community structure and the clustering as a function of the degree we show that the network presents a hierarchical structure. Finally, we comment on possible mechanisms responsible for the particular motifs observed.
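
    The binary network and the degree correlations described in this record can be illustrated with a small sketch. It assumes the usual degree assortativity coefficient (the Pearson correlation between the degrees at the two ends of each edge); the abstract does not state which measure the author used. The function names and the cast lists are hypothetical and are not taken from the Marvel data.

```python
from itertools import combinations


def coappearance_edges(publications):
    """Link two characters if they appear in the same publication."""
    edges = set()
    for cast in publications:
        for a, b in combinations(sorted(set(cast)), 2):
            edges.add((a, b))
    return edges


def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of each edge.

    Negative values indicate disassortative mixing: high-degree nodes
    tend to connect to low-degree nodes.
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for a, b in edges:
        xs += [degree[a], degree[b]]
        ys += [degree[b], degree[a]]
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("all nodes have the same degree; coefficient undefined")
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    return cov / var


if __name__ == "__main__":
    # Hypothetical cast lists: one hub character paired with three others.
    issues = [["Hub", "A"], ["Hub", "B"], ["Hub", "C"]]
    print(degree_assortativity(coappearance_edges(issues)))
```

    A star-shaped network like this toy example is the extreme disassortative case, with a coefficient of -1.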

  2. ELECTROSTATIC MEMORY SYSTEM

    DOEpatents

    Chu, J.C.

    1958-09-23

    An improved electrostatic memory system is described for a digital computer wherein a plurality of storage tubes are adapted to operate in either of two possible modes. According to the present invention, duplicate storage tubes are provided for each denominational order of the several binary digits. A single discriminator system is provided between corresponding duplicate tubes to determine the character of the information stored in each. If either tube produces the selected type signal, corresponding to binary "1" in the preferred embodiment, a "1" is regenerated in both tubes. In one mode of operation each bit of information is stored in two corresponding tubes, while in the other mode of operation each bit is stored in only one tube in the conventional manner.
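
    The regeneration rule in the redundant mode amounts to a bitwise OR across the two duplicate tubes. The following is a toy software model of that rule only, not of the patented circuit; the function name `regenerate` and the bit patterns are hypothetical.

```python
def regenerate(tube_a, tube_b):
    """Redundant mode: a '1' read from either duplicate tube is rewritten into both.

    tube_a and tube_b are equal-length lists of 0/1 values, one entry per
    stored bit position. Returns the regenerated contents of both tubes.
    """
    if len(tube_a) != len(tube_b):
        raise ValueError("duplicate tubes must hold the same number of bits")
    merged = [a | b for a, b in zip(tube_a, tube_b)]
    return merged, list(merged)


if __name__ == "__main__":
    # Hypothetical contents: the last bit has decayed to 0 in tube A only.
    tube_a = [1, 0, 1, 0]
    tube_b = [1, 0, 1, 1]
    print(regenerate(tube_a, tube_b))
```

    Under this rule a "1" lost in one tube is recovered from its duplicate. By the same logic a spurious "1" in either tube would be copied into both, so the OR rule protects against only one direction of error.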

  3. Inferring phylogenetic trees from the knowledge of rare evolutionary events.

    PubMed

    Hellmuth, Marc; Hernandez-Rosales, Maribel; Long, Yangjing; Stadler, Peter F

    2018-06-01

    Rare events have played an increasing role in molecular phylogenetics as potentially homoplasy-poor characters. In this contribution we analyze the phylogenetic information content from a combinatorial point of view by considering the binary relation on the set of taxa defined by the existence of a single event separating two taxa. We show that the graph-representation of this relation must be a tree. Moreover, we characterize completely the relationship between the tree of such relations and the underlying phylogenetic tree. With directed operations such as tandem-duplication-random-loss events in mind we demonstrate how non-symmetric information constrains the position of the root in the partially reconstructed phylogeny.
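
    The record states that the graph of the "separated by a single event" relation must be a tree. A simple consistency check on observed data is therefore to test whether the relation graph is a tree, meaning it is connected and has no cycles. The sketch below does this with a union-find structure; the function name `is_tree` and the taxa are hypothetical, and the check is only the necessary condition named in the abstract, not the authors' reconstruction method.

```python
def is_tree(nodes, edges):
    """Return True if the undirected graph (nodes, edges) is a tree."""
    parent = {v: v for v in nodes}

    def find(v):
        # Follow parent pointers to the set representative, halving the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # this edge would close a cycle
        parent[root_a] = root_b
    # An acyclic graph is connected exactly when it has |V| - 1 edges.
    return len(edges) == len(nodes) - 1


if __name__ == "__main__":
    taxa = ["a", "b", "c", "d"]
    single_event_pairs = [("a", "b"), ("b", "c"), ("b", "d")]
    print(is_tree(taxa, single_event_pairs))
```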

  4. A Phylogenomic Census of Molecular Functions Identifies Modern Thermophilic Archaea as the Most Ancient Form of Cellular Life

    PubMed Central

    Kim, Kyung Mo; Caetano-Anollés, Gustavo

    2014-01-01

    The origins of diversified life remain mysterious despite considerable efforts devoted to untangling the roots of the universal tree of life. Here we reconstructed phylogenies that described the evolution of molecular functions and the evolution of species directly from a genomic census of gene ontology (GO) definitions. We sampled 249 free-living genomes spanning organisms in the three superkingdoms of life, Archaea, Bacteria, and Eukarya, and used the abundance of GO terms as molecular characters to produce rooted phylogenetic trees. Results revealed an early thermophilic origin of Archaea that was followed by genome reduction events in microbial superkingdoms. Eukaryal genomes displayed extraordinary functional diversity and were enriched with hundreds of novel molecular activities not detected in the akaryotic microbial cells. Remarkably, the majority of these novel functions appeared quite late in evolution, synchronized with the diversification of the eukaryal superkingdom. The distribution of GO terms in superkingdoms confirms that Archaea appears to be the simplest and most ancient form of cellular life, while Eukarya is the most diverse and recent. PMID:25249790

  5. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.

    2005-02-01

    We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.

  6. The UCSC genome browser and associated tools

    PubMed Central

    Haussler, David; Kent, W. James

    2013-01-01

    The UCSC Genome Browser (http://genome.ucsc.edu) is a graphical viewer for genomic data now in its 13th year. Since the early days of the Human Genome Project, it has presented an integrated view of genomic data of many kinds. Now home to assemblies for 58 organisms, the Browser presents visualization of annotations mapped to genomic coordinates. The ability to juxtapose annotations of many types facilitates inquiry-driven data mining. Gene predictions, mRNA alignments, epigenomic data from the ENCODE project, conservation scores from vertebrate whole-genome alignments and variation data may be viewed at any scale from a single base to an entire chromosome. The Browser also includes many other widely used tools, including BLAT, which is useful for alignments from high-throughput sequencing experiments. Private data uploaded as Custom Tracks and Data Hubs in many formats may be displayed alongside the rich compendium of precomputed data in the UCSC database. The Table Browser is a full-featured graphical interface, which allows querying, filtering and intersection of data tables. The Saved Session feature allows users to store and share customized views, enhancing the utility of the system for organizing multiple trains of thought. Binary Alignment/Map (BAM), Variant Call Format and the Personal Genome Single Nucleotide Polymorphisms (SNPs) data formats are useful for visualizing a large sequencing experiment (whole-genome or whole-exome), where the differences between the data set and the reference assembly may be displayed graphically. Support for high-throughput sequencing extends to compact, indexed data formats, such as BAM, bigBed and bigWig, allowing rapid visualization of large datasets from RNA-seq and ChIP-seq experiments via local hosting. PMID:22908213

  7. The UCSC genome browser and associated tools.

    PubMed

    Kuhn, Robert M; Haussler, David; Kent, W James

    2013-03-01

    The UCSC Genome Browser (http://genome.ucsc.edu) is a graphical viewer for genomic data now in its 13th year. Since the early days of the Human Genome Project, it has presented an integrated view of genomic data of many kinds. Now home to assemblies for 58 organisms, the Browser presents visualization of annotations mapped to genomic coordinates. The ability to juxtapose annotations of many types facilitates inquiry-driven data mining. Gene predictions, mRNA alignments, epigenomic data from the ENCODE project, conservation scores from vertebrate whole-genome alignments and variation data may be viewed at any scale from a single base to an entire chromosome. The Browser also includes many other widely used tools, including BLAT, which is useful for alignments from high-throughput sequencing experiments. Private data uploaded as Custom Tracks and Data Hubs in many formats may be displayed alongside the rich compendium of precomputed data in the UCSC database. The Table Browser is a full-featured graphical interface, which allows querying, filtering and intersection of data tables. The Saved Session feature allows users to store and share customized views, enhancing the utility of the system for organizing multiple trains of thought. Binary Alignment/Map (BAM), Variant Call Format and the Personal Genome Single Nucleotide Polymorphisms (SNPs) data formats are useful for visualizing a large sequencing experiment (whole-genome or whole-exome), where the differences between the data set and the reference assembly may be displayed graphically. Support for high-throughput sequencing extends to compact, indexed data formats, such as BAM, bigBed and bigWig, allowing rapid visualization of large datasets from RNA-seq and ChIP-seq experiments via local hosting.

  8. Are we Genomic Mosaics? Variations of the Genome of Somatic Cells can Contribute to Diversify our Phenotypes.

    PubMed

    Astolfi, P A; Salamini, F; Sgaramella, V

    2010-09-01

    Theoretical and experimental evidences support the hypothesis that the genomes and the epigenomes may be different in the somatic cells of complex organisms. In the genome, the differences range from single base substitutions to chromosome number; in the epigenome, they entail multiple postsynthetic modifications of the chromatin. Somatic genome variations (SGV) may accumulate during development in response both to genetic programs, which may differ from tissue to tissue, and to environmental stimuli, which are often undetected and generally irreproducible. SGV may jeopardize physiological cellular functions, but also create novel coding and regulatory sequences, to be exposed to intraorganismal Darwinian selection. Genomes acknowledged as comparatively poor in genes, such as humans', could thus increase their pristine informational endowment. A better understanding of SGV will contribute to basic issues such as the "nature vs nurture" dualism and the inheritance of acquired characters. On the applied side, they may explain the low yield of cloning via somatic cell nuclear transfer, provide clues to some of the problems associated with transdifferentiation, and interfere with individual DNA analysis. SGV may be unique in the different cells types and in the different developmental stages, and thus explain the several hundred gaps persisting in the human genomes "completed" so far. They may compound the variations associated to our epigenomes and make of each of us an "(epi)genomic" mosaic. An ensuing paradigm is the possibility that a single genome (the ephemeral one assembled at fertilization) has the capacity to generate several different brains in response to different environments.

  9. WEbcoli: an interactive and asynchronous web application for in silico design and analysis of genome-scale E.coli model.

    PubMed

    Jung, Tae-Sung; Yeo, Hock Chuan; Reddy, Satty G; Cho, Wan-Sup; Lee, Dong-Yup

    2009-11-01

    WEbcoli is a WEb application for in silico designing, analyzing and engineering Escherichia coli metabolism. It is devised and implemented using advanced web technologies, thereby leading to enhanced usability and dynamic web accessibility. As a main feature, the WEbcoli system provides a user-friendly rich web interface, allowing users to virtually design and synthesize mutant strains derived from the genome-scale wild-type E.coli model and to customize pathways of interest through a graph editor. In addition, constraints-based flux analysis can be conducted for quantifying metabolic fluxes and characterizing the physiological and metabolic states under various genetic and/or environmental conditions. WEbcoli is freely accessible at http://webcoli.org. Contact: cheld@nus.edu.sg.

  10. [Advance of genetics and genomics in neurology].

    PubMed

    Ginter, E K; Illarioshkin, S N

    2012-01-01

    Studies of the genomic background of neurological disorders are highly topical in view of the disorders' high population prevalence, severe course, serious contribution to patients' disability, and progressive mental and physical maladaptation. This paper discusses in detail the genetic heterogeneity of hereditary neurological disorders and the character of the respective genetic burden in the regions of the Russian Federation; analyzes the 'dynamic' type of mutation (an increase in the number of microsatellite repeat copies) attributable to many neurodegenerative diseases; and presents the achievements of Russian researchers in identifying genes for hereditary neurological disorders and in carrying out pilot gene therapy protocols. Problems related to studies of genetic predisposition to common multifactorial diseases of the nervous system are also discussed.

  11. Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives.

    PubMed

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong

    2013-11-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.

  12. Interactome INSIDER: a structural interactome browser for genomic studies.

    PubMed

    Meyer, Michael J; Beltrán, Juan Felipe; Liang, Siqi; Fragoza, Robert; Rumack, Aaron; Liang, Jin; Wei, Xiaomu; Yu, Haiyuan

    2018-01-01

    We present Interactome INSIDER, a tool to link genomic variant information with structural protein-protein interactomes. Underlying this tool is the application of machine learning to predict protein interaction interfaces for 185,957 protein interactions with previously unresolved interfaces in human and seven model organisms, including the entire experimentally determined human binary interactome. Predicted interfaces exhibit functional properties similar to those of known interfaces, including enrichment for disease mutations and recurrent cancer mutations. Through 2,164 de novo mutagenesis experiments, we show that mutations of predicted and known interface residues disrupt interactions at a similar rate and much more frequently than mutations outside of predicted interfaces. To spur functional genomic studies, Interactome INSIDER (http://interactomeinsider.yulab.org) enables users to identify whether variants or disease mutations are enriched in known and predicted interaction interfaces at various resolutions. Users may explore known population variants, disease mutations, and somatic cancer mutations, or they may upload their own set of mutations for this purpose.

  13. bwtool: a tool for bigWig files

    PubMed Central

    Pohl, Andy; Beato, Miguel

    2014-01-01

    BigWig files are a compressed, indexed, binary format for genome-wide signal data for calculations (e.g. GC percent) or experiments (e.g. ChIP-seq/RNA-seq read depth). bwtool is a tool designed to read bigWig files rapidly and efficiently, providing functionality for extracting data and summarizing it in several ways, globally or at specific regions. Additionally, the tool enables the conversion of the positions of signal data from one genome assembly to another, also known as ‘lifting’. We believe bwtool can be useful for the analyst frequently working with bigWig data, which is becoming a standard format to represent functional signals along genomes. The article includes supplementary examples of running the software. Availability and implementation: The C source code is freely available under the GNU public license v3 at http://cromatina.crg.eu/bwtool. Contact: andrew.pohl@crg.eu, andypohl@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24489365

  14. Search-based optimization

    NASA Technical Reports Server (NTRS)

    Wheeler, Ward C.

    2003-01-01

    The problem of determining the minimum cost hypothetical ancestral sequences for a given cladogram is known to be NP-complete (Wang and Jiang, 1994). Traditionally, point estimations of hypothetical ancestral sequences have been used to gain heuristic, upper bounds on cladogram cost. These include procedures with such diverse approaches as non-additive optimization of multiple sequence alignment, direct optimization (Wheeler, 1996), and fixed-state character optimization (Wheeler, 1999). A method is proposed here which, by extending fixed-state character optimization, replaces the estimation process with a search. This form of optimization examines a diversity of potential state solutions for cost-efficient hypothetical ancestral sequences and can result in greatly more parsimonious cladograms. Additionally, such an approach can be applied to other NP-complete phylogenetic optimization problems such as genomic break-point analysis. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.
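
    The idea of bounding cladogram cost with a fixed set of candidate ancestral states can be sketched as a small dynamic program. This is only an illustration under stated assumptions: the candidate states are taken to be the observed leaf sequences, the cost between two states is a unit-cost edit distance, and the tree is a nested tuple of strings. The function names (`edit_distance`, `fixed_state_cost`) are hypothetical and are not taken from the abstract or from any published software.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # match or substitution
        prev = cur
    return prev[-1]


def fixed_state_cost(tree, states):
    """Minimum tree cost when every internal node must take one of `states`.

    `tree` is a leaf (a str) or a 2-tuple of subtrees (assumed binary).
    The result is an upper bound on the unrestricted cladogram cost,
    because ancestors are restricted to the supplied candidate states.
    """
    def down(node):
        # Map each candidate state to the best cost of the subtree below.
        if isinstance(node, str):
            return {s: (0 if s == node else float("inf")) for s in states}
        left, right = (down(child) for child in node)
        return {
            s: min(left[t] + edit_distance(s, t) for t in states)
               + min(right[t] + edit_distance(s, t) for t in states)
            for s in states
        }
    return min(down(tree).values())


if __name__ == "__main__":
    demo_tree = (("ACGT", "ACGA"), ("ACT", "ACT"))   # made-up sequences
    demo_states = {"ACGT", "ACGA", "ACT"}
    print(fixed_state_cost(demo_tree, demo_states))
```

    Enlarging `states` beyond the observed sequences is one way to picture the search the abstract describes, since a richer candidate set can only lower the bound.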

  15. Editing Citrus Genome via SaCas9/sgRNA System

    PubMed Central

    Jia, Hongge; Xu, Jin; Orbović, Vladimir; Zhang, Yunzeng; Wang, Nian

    2017-01-01

    SaCas9/sgRNA, derived from Staphylococcus aureus, is an alternative system for genome editing to Streptococcus pyogenes SpCas9/sgRNA. The smaller SaCas9 recognizes a different protospacer adjacent motif (PAM) sequence from SpCas9. SaCas9/sgRNA has been employed to edit the genomes of Arabidopsis, tobacco and rice. In this study, we aimed to test its potential in genome editing of citrus. Transient expression of SaCas9/sgRNA in Duncan grapefruit via Xcc-facilitated agroinfiltration showed it can successfully modify CsPDS and Cs2g12470. Subsequently, binary vector GFP-p1380N-SaCas9/35S-sgRNA1:AtU6-sgRNA2 was developed to edit two target sites of Cs7g03360 in transgenic Carrizo citrange. Twelve GFP-positive Carrizo transformants were successfully established, designated as #Cz1 to #Cz12. Based on targeted next generation sequencing results, the mutation rates for the two targets ranged from 15.55 to 39.13% for sgRNA1 and 49.01 to 79.67% for sgRNA2. Therefore, SaCas9/sgRNA can be used as an alternative tool to SpCas9/sgRNA for citrus genome editing. PMID:29312390

  16. Structural classification of proteins using texture descriptors extracted from the cellular automata image.

    PubMed

    Kavianpour, Hamidreza; Vasighi, Mahdi

    2017-02-01

    Knowledge of the cellular attributes of proteins plays an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Knowledge of protein structural class is used by various methods for better understanding protein functionality and folding patterns. Computational methods and intelligent systems can play an important role in the structural classification of proteins. Most protein sequences are stored in databanks as character strings, and a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced based on reduced amino acid alphabets according to the surrounding hydrophobicity index. Many important features hidden in these long binary sequences can be clearly displayed through their cellular automata images. The features extracted from these images are used to build a classification model with a support vector machine. Compared with previous studies on several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help reveal some inherent features deeply hidden in protein sequences and improve the quality of protein structural class prediction.
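
    The pipeline described above (residues to bits, bits to a cellular automaton image) can be sketched as follows. The two-class hydrophobic grouping, the one-bit-per-residue code, the choice of rule 90, and the wrap-around boundary are all assumptions made for illustration; the abstract does not specify the reduced alphabet, the code length, or the automaton rule.

```python
# Hypothetical reduced alphabet: 1 = hydrophobic, 0 = everything else.
# This grouping is an illustrative assumption, not the paper's table.
HYDROPHOBIC = set("AVLIMFWC")


def encode_binary(sequence):
    """Map each residue to 1 (hydrophobic) or 0 (any other residue)."""
    return [1 if residue in HYDROPHOBIC else 0 for residue in sequence.upper()]


def ca_step(cells, rule=90):
    """Apply one step of an elementary cellular automaton (wrap-around)."""
    n = len(cells)
    nxt = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right
        # Bit `neighbourhood` of the rule number gives the next cell state.
        nxt.append((rule >> neighbourhood) & 1)
    return nxt


def ca_image(sequence, steps=8, rule=90):
    """Stack the encoded sequence and its successive CA generations as rows."""
    rows = [encode_binary(sequence)]
    for _ in range(steps):
        rows.append(ca_step(rows[-1], rule))
    return rows


if __name__ == "__main__":
    # Made-up peptide, used only to show the shape of the output.
    for row in ca_image("MKVLAAGIVGLW", steps=6):
        print("".join("#" if cell else "." for cell in row))
```

    Texture descriptors would then be computed on the resulting 0/1 matrix; that step is omitted here because the abstract does not name the descriptors used.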

  17. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA.

    PubMed

    Wu, Zheyang; Zhao, Hongyu

    2012-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies.

  18. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA

    PubMed Central

    Wu, Zheyang; Zhao, Hongyu

    2013-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies. PMID:23956610

  19. An Improved Binary Differential Evolution Algorithm to Infer Tumor Phylogenetic Trees.

    PubMed

    Liang, Ying; Liao, Bo; Zhu, Wen

    2017-01-01

    Tumourigenesis is a mutation accumulation process, which is likely to start with a mutated founder cell. The evolutionary nature of tumor development makes phylogenetic models suitable for inferring tumor evolution from genetic variation data. Copy number variation (CNV) is the major genetic marker of the genome with more genes, disease loci, and functional elements involved. Fluorescence in situ hybridization (FISH) accurately measures the copy numbers of multiple genes in hundreds of single cells. We propose an improved binary differential evolution algorithm, BDEP, to infer tumor phylogenetic trees from FISH data. Topology analysis of the tumor progression trees shows that the pathway of tumor subcell expansion varies greatly during different stages of tumor formation. The classification experiment shows that tree-based features distinguish tumors better than data-based features. The constructed phylogenetic trees characterize the tumor development process well, and the method outperforms other similar algorithms.

  20. Heritability and linkage analysis of personality in bipolar disorder.

    PubMed

    Greenwood, Tiffany A; Badner, Judith A; Byerley, William; Keck, Paul E; McElroy, Susan L; Remick, Ronald A; Dessa Sadovnick, A; Kelsoe, John R

    2013-11-01

    The many attempts that have been made to identify genes for bipolar disorder (BD) have met with limited success, which may reflect an inadequacy of diagnosis as an informative and biologically relevant phenotype for genetic studies. Here we have explored aspects of personality as quantitative phenotypes for bipolar disorder through the use of the Temperament and Character Inventory (TCI), which assesses personality in seven dimensions. Four temperament dimensions are assessed: novelty seeking (NS), harm avoidance (HA), reward dependence (RD), and persistence (PS). Three character dimensions are also included: self-directedness (SD), cooperativeness (CO), and self-transcendence (ST). We compared personality scores between diagnostic groups and assessed heritability in a sample of 101 families collected for genetic studies of BD. A genome-wide SNP linkage analysis was then performed in the subset of 51 families for which genetic data was available. Significant group differences were observed between BD subjects, their first-degree relatives, and independent controls for all but RD and PS, and all but HA and RD were found to be significantly heritable in this sample. Linkage analysis of the heritable dimensions produced several suggestive linkage peaks for NS (chromosomes 7q21 and 10p15), PS (chromosomes 6q16, 12p13, and 19p13), and SD (chromosomes 4q35, 8q24, and 18q12). The relatively small size of our linkage sample likely limited our ability to reach genome-wide significance in this study. While not genome-wide significant, these results suggest that aspects of personality may prove useful in the identification of genes underlying BD susceptibility. © 2013 Elsevier B.V. All rights reserved.

  1. A three-genome phylogeny of malaria parasites (Plasmodium and closely related genera): evolution of life-history traits and host switches.

    PubMed

    Martinsen, Ellen S; Perkins, Susan L; Schall, Jos J

    2008-04-01

    Phylogenetic analysis of genomic data allows insights into the evolutionary history of pathogens, especially the events leading to host switching and diversification, as well as alterations of the life cycle (life-history traits). Hundreds, perhaps thousands, of malaria parasite species exploit squamate reptiles, birds, and mammals as vertebrate hosts as well as many genera of dipteran vectors, but the evolutionary and ecological events that led to this diversification and success remain unresolved. For a century, systematic parasitologists classified malaria parasites into genera based on morphology, life cycle, and vertebrate and insect host taxa. Molecular systematic studies based on single genes challenged the phylogenetic significance of these characters, but several significant nodes were not well supported. We recovered the first well resolved large phylogeny of Plasmodium and related haemosporidian parasites using sequence data for four genes from the parasites' three genomes by combining all data, correcting for variable rates of substitution by gene and site, and using both Bayesian and maximum parsimony analyses. Major clades are associated with vector shifts into different dipteran families, with other characters used in traditional parasitological studies, such as morphology and life-history traits, having variable phylogenetic significance. The common parasites of birds now placed into the genus Haemoproteus are found in two divergent clades, and the genus Plasmodium is paraphyletic with respect to Hepatocystis, a group of species with very different life history and morphology. The Plasmodium of mammal hosts form a well supported clade (including Plasmodium falciparum, the most important human malaria parasite), and this clade is associated with specialization to Anopheles mosquito vectors. The Plasmodium of birds and squamate reptiles all fall within a single clade, with evidence for repeated switching between birds and squamate hosts.

  2. Adaptation of Organisms by Resonance of RNA Transcription with the Cellular Redox Cycle

    NASA Technical Reports Server (NTRS)

    Stolc, Viktor

    2012-01-01

    Sequence variation in organisms differs across the genome and the majority of mutations are caused by oxidation, yet its origin is not fully understood. It has also been shown that the reduction-oxidation reaction cycle is the fundamental biochemical cycle that coordinates the timing of all biochemical processes in that cell, including energy production, DNA replication, and RNA transcription. It is shown that the temporal resonance of transcriptome biosynthesis with the oscillating binary state of the reduction-oxidation reaction cycle serves as a basis for non-random sequence variation at specific genome-wide coordinates that change faster than by accumulation of chance mutations. This work demonstrates evidence for a universal, persistent and iterative feedback mechanism between the environment and heredity, whereby acquired variation between cell divisions can outweigh inherited variation.

  3. The Role of Balanced Training and Testing Data Sets for Binary Classifiers in Bioinformatics

    PubMed Central

    Wei, Qiong; Dunbrack, Roland L.

    2013-01-01

    Training and testing of conventional machine learning models on binary classification problems depend on the proportions of the two outcomes in the relevant data sets. This may be especially important in practical terms when real-world applications of the classifier are either highly imbalanced or occur in unknown proportions. Intuitively, it may seem sensible to train machine learning models on data similar to the target data in terms of proportions of the two binary outcomes. However, we show that this is not the case using the example of prediction of deleterious and neutral phenotypes of human missense mutations in human genome data, for which the proportion of the binary outcome is unknown. Our results indicate that using balanced training data (50% neutral and 50% deleterious) results in the highest balanced accuracy (the average of True Positive Rate and True Negative Rate), Matthews correlation coefficient, and area under ROC curves, no matter what the proportions of the two phenotypes are in the testing data. Besides balancing the data by undersampling the majority class, other techniques in machine learning include oversampling the minority class, interpolating minority-class data points and various penalties for misclassifying the minority class. However, these techniques are not commonly used in either the missense phenotype prediction problem or in the prediction of disordered residues in proteins, where the imbalance problem is substantial. The appropriate approach depends on the amount of available data and the specific problem at hand. PMID:23874456
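
    The evaluation measures named above, and the undersampling step used to balance training data, can be written out in a few lines. This is a minimal sketch with hypothetical function names; labels are assumed to be 0/1, and both classes are assumed to be present so that the rates are defined.

```python
import math
import random


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(tp, tn, fp, fn):
    """Average of the true positive rate and the true negative rate."""
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def undersample_majority(labels, seed=0):
    """Indices of a 50/50 subset obtained by randomly dropping majority items."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(pos), len(neg))
    return sorted(rng.sample(pos, k) + rng.sample(neg, k))


if __name__ == "__main__":
    # Toy labels, invented purely to exercise the functions.
    truth = [1, 1, 0, 0, 0, 0]
    predicted = [1, 0, 0, 0, 0, 1]
    counts = confusion_counts(truth, predicted)
    print(balanced_accuracy(*counts), matthews_corrcoef(*counts))
    print(undersample_majority(truth))
```

    Unlike plain accuracy, balanced accuracy weights the two per-class rates equally, so it does not shift when the class proportions in the test set change.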

  4. System for line drawings interpretation

    NASA Astrophysics Data System (ADS)

    Boatto, L.; Consorti, Vincenzo; Del Buono, Monica; Eramo, Vincenzo; Esposito, Alessandra; Melcarne, F.; Meucci, Mario; Mosciatti, M.; Tucci, M.; Morelli, Arturo

    1992-08-01

    This paper describes an automatic system that extracts information from line drawings, in order to feed CAD or GIS systems. The line drawings that we analyze contain interconnected thin lines, dashed lines, text, and symbols. Characters and symbols may overlap with lines. Our approach is based on the properties of the run representation of a binary image that allow giving the image a graph structure. Using this graph structure, several algorithms have been designed to identify, directly in the raster image, straight segments, dashed lines, text, symbols, hatching lines, etc. Straight segments and dashed lines are converted into vectors, with high accuracy and good noise immunity. Characters and symbols are recognized by means of a recognizer, specifically developed for this application, designed to be insensitive to rotation and scaling. Subsequent processing steps include an 'intelligent' search through the graph in order to detect closed polygons, dashed lines, text strings, and other higher-level logical entities, followed by the identification of relationships (adjacency, inclusion, etc.) between them. Relationships are further translated into a formal description of the drawing. The output of the system can be used as input to a Geographic Information System package. The system is currently used by the Italian Land Register Authority to process cadastral maps.

  5. The historical biogeography of Mammalia

    PubMed Central

    Springer, Mark S.; Meredith, Robert W.; Janecka, Jan E.; Murphy, William J.

    2011-01-01

    Palaeobiogeographic reconstructions are underpinned by phylogenies, divergence times and ancestral area reconstructions, which together yield ancestral area chronograms that provide a basis for proposing and testing hypotheses of dispersal and vicariance. Methods for area coding include multi-state coding with a single character, binary coding with multiple characters and string coding. Ancestral reconstruction methods are divided into parsimony versus Bayesian/likelihood approaches. We compared nine methods for reconstructing ancestral areas for placental mammals. Ambiguous reconstructions were a problem for all methods. Important differences resulted from coding areas based on the geographical ranges of extant species versus the geographical provenance of the oldest fossil for each lineage. Africa and South America were reconstructed as the ancestral areas for Afrotheria and Xenarthra, respectively. Most methods reconstructed Eurasia as the ancestral area for Boreoeutheria, Euarchontoglires and Laurasiatheria. The coincidence of molecular dates for the separation of Afrotheria and Xenarthra at approximately 100 Ma with the plate tectonic sundering of Africa and South America hints at the importance of vicariance in the early history of Placentalia. Dispersal has also been important including the origins of Madagascar's endemic mammal fauna. Further studies will benefit from increased taxon sampling and the application of new ancestral area reconstruction methods. PMID:21807730
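
    Two of the area-coding schemes mentioned above can be contrasted with a small sketch. The taxon names and ranges below are placeholders invented for illustration, not data from the study, and the function names are hypothetical.

```python
# Illustrative area list; the order fixes the column (or state) index.
AREAS = ["Africa", "Eurasia", "South America"]


def binary_coding(ranges, areas):
    """One presence/absence character per area (multiple binary characters)."""
    return {taxon: [1 if area in occupied else 0 for area in areas]
            for taxon, occupied in ranges.items()}


def multistate_coding(ranges, areas):
    """A single character whose states are area indices.

    Taxa found in several areas receive several states, i.e. they are
    scored as polymorphic for the single character.
    """
    return {taxon: sorted(areas.index(area) for area in occupied)
            for taxon, occupied in ranges.items()}


if __name__ == "__main__":
    # Placeholder taxa and ranges, not taken from the paper.
    demo_ranges = {
        "taxon_A": {"Africa"},
        "taxon_B": {"Africa", "Eurasia"},
        "taxon_C": {"South America"},
    }
    print(binary_coding(demo_ranges, AREAS))
    print(multistate_coding(demo_ranges, AREAS))
```

    The binary scheme lets each area be reconstructed independently at ancestral nodes, while the single multi-state character forces a choice among areas; this difference is one reason the coding method can change the reconstructed ancestral area.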

  6. Models for Accretion-Disk Fluctuations through Self-Organized Criticality Including Relativistic Effects

    NASA Astrophysics Data System (ADS)

    Xiong, Ying; Wiita, Paul J.; Bao, Gang

    2000-12-01

    The possibility that some of the observed X-ray and optical variability in active galactic nuclei and galactic black hole candidates is produced in accretion disks through the development of a self-organized critical state is reconsidered. New simulations, including more complete calculations of relativistic effects, do show that this model can produce light-curves and power-spectra for the variability which agree with the range observed in optical and X-ray studies of AGN and X-ray binaries. However, the universality of complete self-organized criticality has not quite been achieved. This is mainly because the character of the variations depends quite substantially on the extent of the unstable disk region. If it extends close to the innermost stable orbit, a physical scale is introduced and the scale-free character of self-organized criticality is vitiated. A significant dependence of the power spectrum density slope on the type of diffusion within the disk and a weaker dependence on the amount of differential rotation are noted. When general-relativistic effects are incorporated in the models, additional substantial differences are produced if the disk is viewed from directions far from the accretion disk axis.

  7. Genome-wide identification and evolution of the PIN-FORMED (PIN) gene family in Glycine max.

    PubMed

    Liu, Yuan; Wei, Haichao

    2017-07-01

    Soybean (Glycine max) is one of the most important crop plants. Wild and cultivated soybean varieties have significant differences worth further investigation, such as plant morphology, seed size, and seed coat development; these characters may be related to auxin biology. The PIN gene family encodes essential transport proteins in cell-to-cell auxin transport, but little research on soybean PIN genes (GmPIN genes) has been done, especially with respect to the evolution and differences between wild and cultivated soybean. In this study, we retrieved 23 GmPIN genes from the latest updated G. max genome database; six GmPIN protein sequences were changed compared with the previous database. Based on the Plant Genome Duplication Database, 18 GmPIN genes have been involved in segment duplication. Three pairs of GmPIN genes arose after the second soybean genome duplication, and six occurred after the first genome duplication. The duplicated GmPIN genes retained similar expression patterns. All the duplicated GmPIN genes experienced purifying selection (Ka/Ks < 1) to prevent accumulation of non-synonymous mutations and thus remained more similar. In addition, we also focused on the artificial selection of the soybean PIN genes. Five artificially selected GmPIN genes were identified by comparing the genome sequence of 17 wild and 14 cultivated soybean varieties. Our research provides useful and comprehensive basic information for understanding GmPIN genes.

  8. Inter-sectional hybrids obtained from reciprocal crosses between Begonia semperflorens (section Begonia) and B. ‘Orange Rubra’ (section Gaerdita × section Pritzelia)

    PubMed Central

    Chen, Yen-Ming; Mii, Masahiro

    2012-01-01

    Inter-sectional hybrids were successfully obtained by reciprocal crosses between 11 cultivars (including 6 diploids and 5 tetraploids) of Begonia semperflorens (SS & SSSS genomes) and B. ‘Orange Rubra’ (RR genome) with the aid of in vitro culture of mature or immature seeds on MS medium containing 0.1 mg l−1 α-naphthylacetic acid, 0.1 mg l−1 6-benzyladenine, 10 mg l−1 gibberellic acid, 30 g l−1 sucrose and 2.5 g l−1 gellan gum. Embryo rescue as ovary culture with immature seeds 12–16 days after pollination (DAP) generally gave higher efficiency of plantlet formation, but in some cross combinations, culture of mature seeds (30 DAP) resulted in a higher yield of plantlets. Flow cytometric analysis revealed that the plantlets comprised plants with various genomic combinations (RS, RR, RSS, RRS, RRSS and RRRRSS), as estimated from the DNA contents of both parents. Hybridity of these plants with various genomic combinations including RR was confirmed by random amplified polymorphic DNA analysis. These results suggested that unreduced gamete formation and spontaneous chromosome doubling were involved in the hybrid formation of various ploidy levels and genomic combinations. These hybrids showed various levels of intermediate traits between both parents according to the genomic compositions, and some of them had desirable characters of both parents. PMID:23136522

  9. Inter-sectional hybrids obtained from reciprocal crosses between Begonia semperflorens (section Begonia) and B. 'Orange Rubra' (section Gaerdita × section Pritzelia).

    PubMed

    Chen, Yen-Ming; Mii, Masahiro

    2012-06-01

    Inter-sectional hybrids were successfully obtained by reciprocal crosses between 11 cultivars (including 6 diploids and 5 tetraploids) of Begonia semperflorens (SS & SSSS genomes) and B. 'Orange Rubra' (RR genome) with the aid of in vitro culture of mature or immature seeds on MS medium containing 0.1 mg l−1 α-naphthylacetic acid, 0.1 mg l−1 6-benzyladenine, 10 mg l−1 gibberellic acid, 30 g l−1 sucrose and 2.5 g l−1 gellan gum. Embryo rescue as ovary culture with immature seeds 12–16 days after pollination (DAP) generally gave higher efficiency of plantlet formation, but in some cross combinations, culture of mature seeds (30 DAP) resulted in a higher yield of plantlets. Flow cytometric analysis revealed that the plantlets comprised plants with various genomic combinations (RS, RR, RSS, RRS, RRSS and RRRRSS), as estimated from the DNA contents of both parents. Hybridity of these plants with various genomic combinations including RR was confirmed by random amplified polymorphic DNA analysis. These results suggested that unreduced gamete formation and spontaneous chromosome doubling were involved in the hybrid formation of various ploidy levels and genomic combinations. These hybrids showed various levels of intermediate traits between both parents according to the genomic compositions, and some of them had desirable characters of both parents.

  10. Eclipsing Binaries with Possible Tertiary Components

    NASA Astrophysics Data System (ADS)

    Snyder, LeRoy F.

    2013-05-01

    Many eclipsing binary star systems (EBS) show long-term variations in their orbital periods which are evident in their O-C (observed minus calculated) diagrams. This research analyzed 324 eclipsing binary systems taken from Bob Nelson's O-C Files database. Of these, 18 systems displayed evidence of periodic variations in the arrival times of the eclipses. These variations are sinusoidal, and their sinusoidal character is suggestive of Keplerian motion caused by an orbiting companion. The reason for these changes is unknown, but mass loss, apsidal motion, magnetic activity and the presence of a third body have been proposed. This paper assumes that the sinusoidal variations are a light-time effect caused by the gravitational pull of a tertiary companion orbiting the eclipsing binary system. An O-C diagram was plotted for each of the 324 systems using a quadratic ephemeris to determine whether the system displayed a sinusoidal trend in the O-C residuals. After analysis of the 18 systems, seven systems (AW UMa, BB Peg, OO Aql, V508 Oph, VW Cep, W Crv and YY Eri) met the criteria for a possible orbiting companion. The other 11 systems displayed a sinusoidal variation in the O-C residuals of the primary eclipses, but their entries in Bob Nelson's O-C Files did not contain times of minimum (Tmin) for the secondary eclipses and were therefore not conclusive in determining the presence of the effects of a tertiary companion. An analysis of the residuals of the seven systems yields the light-time semi-amplitude, orbital period, eccentricity and mass of the tertiary companion, as the amplitude of the variation is proportional to the mass, period and inclination of the third orbiting body. Given the low mass of the tertiary body in the seven cases, the possibility that five of these tertiary companions are brown dwarfs is discussed.
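
    The O-C computation described above can be sketched as follows. The ephemeris form C = T0 + P·E + Q·E² is the usual quadratic ephemeris; the circular-orbit sine term, the function names, and the numerical values in the demo are illustrative assumptions and are not taken from the paper or from the O-C Files database.

```python
import math


def o_minus_c(t_obs, t0, period, quad=0.0):
    """Return (cycle number E, O-C residual in days) for one observed minimum.

    The calculated time is C = t0 + period * E + quad * E**2. E is estimated
    from the linear part only, which assumes the quadratic term and the
    residual are small compared with half a period.
    """
    epoch = round((t_obs - t0) / period)
    t_calc = t0 + period * epoch + quad * epoch ** 2
    return epoch, t_obs - t_calc


def light_time_term(epoch, semi_amplitude, cycles_per_orbit, phase=0.0):
    """Sinusoidal light-time delay (days) at cycle E, circular orbit assumed."""
    return semi_amplitude * math.sin(
        2.0 * math.pi * epoch / cycles_per_orbit + phase)


if __name__ == "__main__":
    # Synthetic minima: made-up epoch, period, and third-body signal.
    T0, P = 2450000.0, 0.5
    for e in range(0, 1001, 250):
        t_obs = T0 + P * e + light_time_term(e, 0.005, 1000)
        print(o_minus_c(t_obs, T0, P))
```

    Fitting `semi_amplitude`, `cycles_per_orbit` and `phase` to the residuals would give the light-time semi-amplitude and the third-body period; an eccentric orbit needs a fuller light-time expression than the plain sine used here.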

  11. Genome activation by raspberry bushy dwarf virus coat protein.

    PubMed

    Macfarlane, Stuart A; McGavin, Wendy J

    2009-03-01

    Two sets of infectious cDNA clones of raspberry bushy dwarf virus (RBDV) have been constructed, enabling either the synthesis of infectious RNA transcripts or the delivery of infectious binary plasmid DNA by infiltration of Agrobacterium tumefaciens. In whole plants and in protoplasts, inoculation of RBDV RNA1 and RNA2 transcripts led to a low level of infection, which was greatly increased by the addition of RNA3, a subgenomic RNA coding for the RBDV coat protein (CP). Agroinfiltration of RNA1 and RNA2 constructs did not produce a detectable infection but, again, inclusion of a construct encoding the CP led to high levels of infection. Thus, RBDV replication is greatly stimulated by the presence of the CP, a mechanism that also operates with ilarviruses and alfalfa mosaic virus, where it is referred to as genome activation. Mutation to remove amino acids from the N terminus of the CP showed that the first 15 RBDV CP residues are not required for genome activation. Other experiments, in which overlapping regions at the CP N terminus were fused to the monomeric red fluorescent protein, showed that sequences downstream of the first 48 aa are not absolutely required for genome activation.

  12. Insights from the complete chloroplast genome into the evolution of Sesamum indicum L.

    PubMed

    Zhang, Haiyang; Li, Chun; Miao, Hongmei; Xiong, Songjin

    2013-01-01

    Sesame (Sesamum indicum L.) is one of the oldest oilseed crops. In order to investigate the evolutionary characters according to the Sesame Genome Project, apart from sequencing its nuclear genome, we sequenced the complete chloroplast genome of S. indicum cv. Yuzhi 11 (white seeded) using Illumina and 454 sequencing. Comparisons of chloroplast genomes between S. indicum and the 18 other higher plants were then analyzed. The chloroplast genome of cv. Yuzhi 11 contains 153,338 bp and a total of 114 unique genes (KC569603). The number of chloroplast genes in sesame is the same as that in Nicotiana tabacum, Vitis vinifera and Platanus occidentalis. The variation in the length of the large single-copy (LSC) regions and inverted repeats (IR) in sesame compared to 18 other higher plant species was the main contributor to size variation in the cp genome in these species. The 77 functional chloroplast genes, except for ycf1 and ycf2, were highly conserved. The deletion of the cp ycf1 gene sequence in cp genomes may be due either to its transfer to the nuclear genome, as has occurred in sesame, or direct deletion, as has occurred in Panax ginseng and Cucumis sativus. The sesame ycf2 gene is only 5,721 bp in length and has lost about 1,179 bp. Nucleotides 1-585 of ycf2 when queried in BLAST had hits in the sesame draft genome. Five repeats (R10, R12, R13, R14 and R17) were unique to the sesame chloroplast genome. We also found that IR contraction/expansion in the cp genome alters its rate of evolution. Chloroplast genes and repeats display the signature of convergent evolution in sesame and other species. These findings provide a foundation for further investigation of cp genome evolution in Sesamum and other higher plants.

  13. Comparative genomics of Fructobacillus spp. and Leuconostoc spp. reveals niche-specific evolution of Fructobacillus spp.

    DOE PAGES

    Endo, Akihito; Tanizawa, Yasuhiro; Tanaka, Naoto; ...

    2015-12-29

    In this study, Fructobacillus spp. in fructose-rich niches belong to the family Leuconostocaceae. They were originally classified as Leuconostoc spp., but were later grouped into a novel genus, Fructobacillus, based on their phylogenetic position, morphology and specific biochemical characteristics. The unique characters, so called fructophilic characteristics, had not been reported in the group of lactic acid bacteria, suggesting unique evolution at the genome level. Here we studied four draft genome sequences of Fructobacillus spp. and compared their metabolic properties against those of Leuconostoc spp. As a result, Fructobacillus species possess significantly less protein coding sequences in their small genomes. The number of genes was significantly smaller in carbohydrate transport and metabolism. Several other metabolic pathways, including TCA cycle, ubiquinone and other terpenoid-quinone biosynthesis and phosphotransferase systems, were characterized as discriminative pathways between the two genera. The adhE gene for bifunctional acetaldehyde/alcohol dehydrogenase, and genes for subunits of the pyruvate dehydrogenase complex were absent in Fructobacillus spp. The two genera also show different levels of GC contents, which are mainly due to the different GC contents at the third codon position. In conclusion, the present genome characteristics in Fructobacillus spp. suggest reductive evolution that took place to adapt to specific niches.

  14. Comparative genomics of Fructobacillus spp. and Leuconostoc spp. reveals niche-specific evolution of Fructobacillus spp.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Endo, Akihito; Tanizawa, Yasuhiro; Tanaka, Naoto

    In this study, Fructobacillus spp. in fructose-rich niches belong to the family Leuconostocaceae. They were originally classified as Leuconostoc spp., but were later grouped into a novel genus, Fructobacillus, based on their phylogenetic position, morphology and specific biochemical characteristics. The unique characters, so called fructophilic characteristics, had not been reported in the group of lactic acid bacteria, suggesting unique evolution at the genome level. Here we studied four draft genome sequences of Fructobacillus spp. and compared their metabolic properties against those of Leuconostoc spp. As a result, Fructobacillus species possess significantly less protein coding sequences in their small genomes. The number of genes was significantly smaller in carbohydrate transport and metabolism. Several other metabolic pathways, including TCA cycle, ubiquinone and other terpenoid-quinone biosynthesis and phosphotransferase systems, were characterized as discriminative pathways between the two genera. The adhE gene for bifunctional acetaldehyde/alcohol dehydrogenase, and genes for subunits of the pyruvate dehydrogenase complex were absent in Fructobacillus spp. The two genera also show different levels of GC contents, which are mainly due to the different GC contents at the third codon position. In conclusion, the present genome characteristics in Fructobacillus spp. suggest reductive evolution that took place to adapt to specific niches.

  15. The mitochondrial genome of Priapulus caudatus Lamarck (Priapulida: Priapulidae).

    PubMed

    Webster, Bonnie L; Mackenzie-Dodds, Jacqueline A; Telford, Maximilian J; Littlewood, D Timothy J

    2007-03-01

    We sequenced and annotated the complete mitochondrial (mt) genome of the priapulid Priapulus caudatus in order to provide a source of phylogenetic characters including an assessment of gene order arrangement. The genome was 14,919 bp in its entirety with few, short non-coding regions. A number of protein-coding and tRNA genes overlapped, making the genome relatively compact. The gene order was: cox1, cox2, trnK, trnD, atp8, atp6, cox3, trnG, nad3, trnA, trnR, trnN, rrnS, trnV, rrnL, trnL(yaa), trnL(nag), nad1, -trnS(nga), -cob, -nad6, trnP, -trnT, nad4L, nad4, trnH, nad5, trnF, -trnE, -trnS(nct), trnI, -trnQ, trnM, nad2, trnW, -trnC, -trnY; where '-' indicates genes transcribed on the opposite strand. The gene order, although unique amongst Metazoa, shared the greatest number of gene boundaries and the longest contiguous fragments with the chelicerate Limulus polyphemus. The mt genomes of these taxa differed only by a single inversion of 18 contiguous genes bounded by rrnS and trnS(nct). Other arthropods and nematodes shared fewer gene boundaries but considerably more than the most similar non-ecdysozoan.

  16. Genomic distribution and possible functions of DNA hydroxymethylation in the brain.

    PubMed

    Wen, Lu; Tang, Fuchou

    2014-11-01

    DNA methylation (5-methylcytosine, 5mC) is involved in many cellular processes and emerges as an important epigenetic player in brain development and memory formation. The recent discovery that 5mC can be oxidized to 5-hydroxymethylcytosine (5hmC) by TET (Ten-Eleven-Translocation) proteins provides novel insights into the dynamic character of 5mC in the brain. The content of 5hmC is remarkably high in the brain, adding further complexity. In this review, we discuss how recent advances have improved our understanding of the possible biological roles of 5hmC and TET proteins in the brain. These advances attribute to various approaches, including the genome-wide approach to map 5hmC in different genomic contexts, the gene knockout/knockdown approach to elucidate the functions of TET proteins and 5hmC, and the biochemical approach to uncover potential 5hmC readers. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. Analysis of genome-wide copy number variations in Chinese indigenous and western pig breeds by 60 K SNP genotyping arrays.

    PubMed

    Wang, Yanan; Tang, Zhonglin; Sun, Yaqi; Wang, Hongyang; Wang, Chao; Yu, Shaobo; Liu, Jing; Zhang, Yu; Fan, Bin; Li, Kui; Liu, Bang

    2014-01-01

    Copy number variations (CNVs) represent a substantial source of structural variants in mammals and contribute to both normal phenotypic variability and disease susceptibility. Although low-resolution CNV maps are produced in many domestic animals, and several reports have been published about the CNVs of porcine genome, the differences between Chinese and western pigs still remain to be elucidated. In this study, we used Porcine SNP60 BeadChip and PennCNV algorithm to perform a genome-wide CNV detection in 302 individuals from six Chinese indigenous breeds (Tongcheng, Laiwu, Luchuan, Bama, Wuzhishan and Ningxiang pigs), three western breeds (Yorkshire, Landrace and Duroc) and one hybrid (Tongcheng×Duroc). A total of 348 CNV Regions (CNVRs) across genome were identified, covering 150.49 Mb of the pig genome or 6.14% of the autosomal genome sequence. In these CNVRs, 213 CNVRs were found to exist only in the six Chinese indigenous breeds, and 60 CNVRs only in the three western breeds. The characters of CNVs in four Chinese normal size breeds (Luchuan, Tongcheng and Laiwu pigs) and two minipig breeds (Bama and Wuzhishan pigs) were also analyzed in this study. Functional annotation suggested that these CNVRs possess a great variety of molecular function and may play important roles in phenotypic and production traits between Chinese and western breeds. Our results are important complementary to the CNV map in pig genome, which provide new information about the diversity of Chinese and western pig breeds, and facilitate further research on porcine genome CNVs.

  18. Analysis of Genome-Wide Copy Number Variations in Chinese Indigenous and Western Pig Breeds by 60 K SNP Genotyping Arrays

    PubMed Central

    Sun, Yaqi; Wang, Hongyang; Wang, Chao; Yu, Shaobo; Liu, Jing; Zhang, Yu; Fan, Bin; Li, Kui; Liu, Bang

    2014-01-01

    Copy number variations (CNVs) represent a substantial source of structural variants in mammals and contribute to both normal phenotypic variability and disease susceptibility. Although low-resolution CNV maps are produced in many domestic animals, and several reports have been published about the CNVs of porcine genome, the differences between Chinese and western pigs still remain to be elucidated. In this study, we used Porcine SNP60 BeadChip and PennCNV algorithm to perform a genome-wide CNV detection in 302 individuals from six Chinese indigenous breeds (Tongcheng, Laiwu, Luchuan, Bama, Wuzhishan and Ningxiang pigs), three western breeds (Yorkshire, Landrace and Duroc) and one hybrid (Tongcheng×Duroc). A total of 348 CNV Regions (CNVRs) across genome were identified, covering 150.49 Mb of the pig genome or 6.14% of the autosomal genome sequence. In these CNVRs, 213 CNVRs were found to exist only in the six Chinese indigenous breeds, and 60 CNVRs only in the three western breeds. The characters of CNVs in four Chinese normal size breeds (Luchuan, Tongcheng and Laiwu pigs) and two minipig breeds (Bama and Wuzhishan pigs) were also analyzed in this study. Functional annotation suggested that these CNVRs possess a great variety of molecular function and may play important roles in phenotypic and production traits between Chinese and western breeds. Our results are important complementary to the CNV map in pig genome, which provide new information about the diversity of Chinese and western pig breeds, and facilitate further research on porcine genome CNVs. PMID:25198154

  19. Complete chloroplast genome sequence of a major invasive species, crofton weed (Ageratina adenophora).

    PubMed

    Nie, Xiaojun; Lv, Shuzuo; Zhang, Yingxin; Du, Xianghong; Wang, Le; Biradar, Siddanagouda S; Tan, Xiufang; Wan, Fanghao; Weining, Song

    2012-01-01

    Crofton weed (Ageratina adenophora) is one of the most hazardous invasive plant species, which causes serious economic losses and environmental damages worldwide. However, the sequence resource and genome information of A. adenophora are rather limited, making phylogenetic identification and evolutionary studies very difficult. Here, we report the complete sequence of the A. adenophora chloroplast (cp) genome based on Illumina sequencing. The A. adenophora cp genome is 150,689 bp in length including a small single-copy (SSC) region of 18,358 bp and a large single-copy (LSC) region of 84,815 bp separated by a pair of inverted repeats (IRs) of 23,755 bp. The genome contains 130 unique genes and 18 duplicated in the IR regions, with the gene content and organization similar to other Asteraceae cp genomes. Comparative analysis identified five DNA regions (ndhD-ccsA, psbI-trnS, ndhF-ycf1, ndhI-ndhG and atpA-trnR) containing parsimony-informative characters higher than 2%, which may be potential informative markers for barcoding and phylogenetic analysis. Repeat structure, codon usage and contraction of the IR were also investigated to reveal the pattern of evolution. Phylogenetic analysis demonstrated a sister relationship between A. adenophora and Guizotia abyssinica and supported a monophyly of the Asterales. We have assembled and analyzed the chloroplast genome of A. adenophora in this study, which was the first sequenced plastome in the Eupatorieae tribe. The complete chloroplast genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family.

  20. Trichuris spp. (Nematoda: Trichuridae) from two rodents, Mastomys natalensis and Gerbilliscus vicinus in Tanzania.

    PubMed

    Ribas, Alexis; López, Sergi; Makundi, Rhodes H; Leirs, Herwig; de Bellocq, Joëlle Goüy

    2013-10-01

    During a survey of the helminth community of several rodent species in the Morogoro region (Tanzania), Trichuris whipworms (Nematoda: Trichuridae) were found in the ceca of the Natal multimammate mouse, Mastomys natalensis and a gerbil, Gerbilliscus vicinus (both Rodentia: Muridae). The taxonomic literature regarding Trichuris from African native rodents describes 10 species, but includes few metric and morphologic characters that discriminate between some of the pairs. The whipworms we sampled in Tanzanian Natal multimammate mice and gerbils were morphologically identified, respectively, as Trichuris mastomysi Verster, 1960 and Trichuris carlieri Gedoelst, 1916 sensu lato, but with characters that overlap or partially overlap with the cosmopolitan Murinae whipworm, Trichuris muris , already reported from several rodents in Africa. To clarify our identification, we sequenced the ITS-1, 5.8S, and ITS-2 ribosomal DNA region of the worms' nuclear genome. The genetic analyses clearly distinguish the whipworms we found in M. natalensis from those found in the gerbil, and both of these from T. muris whipworm reference sequences. The overlap of morphological characters between rodent whipworms suggests that reports of T. muris from rodent species not closely related to Murinae in other parts of Africa should be treated with caution.

  1. Morphological characters are compatible with mitogenomic data in resolving the phylogeny of nymphalid butterflies (lepidoptera: papilionoidea: nymphalidae).

    PubMed

    Shi, Qing-Hui; Sun, Xiao-Yan; Wang, Yun-Liang; Hao, Jia-Sheng; Yang, Qun

    2015-01-01

    Nymphalidae is the largest family of butterflies with their phylogenetic relationships not adequately approached to date. The mitochondrial genomes (mitogenomes) of 11 new nymphalid species were reported and a comparative mitogenomic analysis was conducted together with other 22 available nymphalid mitogenomes. A phylogenetic analysis of the 33 species from all 13 currently recognized nymphalid subfamilies was done based on the mitogenomic data set with three Lycaenidae species as the outgroups. The mitogenome comparison showed that the eleven new mitogenomes were similar with those of other butterflies in gene content and order. The reconstructed phylogenetic trees reveal that the nymphalids are made up of five major clades (the nymphaline, heliconiine, satyrine, danaine and libytheine clades), with sister relationship between subfamilies Cyrestinae and Biblidinae, and most likely between subfamilies Morphinae and Satyrinae. This whole mitogenome-based phylogeny is generally congruent with those of former studies based on nuclear-gene and mitogenomic analyses, but differs considerably from the result of morphological cladistic analysis, such as the basal position of Libytheinae in morpho-phylogeny is not confirmed in molecular studies. However, we found that the mitogenomic phylogeny established herein is compatible with selected morphological characters (including developmental and adult morpho-characters).

  2. Genomic and Genetic Diversity within the Pseudomonas fluorescens Complex

    PubMed Central

    Garrido-Sanz, Daniel; Meier-Kolthoff, Jan P.; Göker, Markus; Martín, Marta; Rivilla, Rafael; Redondo-Nieto, Miguel

    2016-01-01

    The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs, show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains were enough to explain most of the CDSs diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as PGPR. PMID:26915094

  3. Excision of Nucleopolyhedrovirus Form Transgenic Silkworm Using the CRISPR/Cas9 System.

    PubMed

    Dong, Zhanqi; Dong, Feifan; Yu, Xinbo; Huang, Liang; Jiang, Yaming; Hu, Zhigang; Chen, Peng; Lu, Cheng; Pan, Minhui

    2018-01-01

    The CRISPR/Cas9-mediated genome engineering has been shown to efficiently suppress infection by disrupting genes of the pathogen. We recently constructed transgenic lines expressing CRISPR/Cas9 and the double sgRNA target Bombyx mori nucleopolyhedrovirus (BmNPV) immediate early-1 ( ie-1 ) gene in the silkworm, respectively, and obtained four transgenic hybrid lines by G1 generation hybridization: Cas9(-)/sgRNA(-), Cas9(+)/sgRNA(-), Cas9(-)/sgRNA(+), and Cas9(+)/sgRNA(+). We demonstrated that the Cas9(+)/sgRNA(+) transgenic lines effectively edited the target site of the BmNPV genome, and large fragment deletion was observed after BmNPV infection. Further antiviral analysis of the Cas9(+)/sgRNA(+) transgenic lines shows that the median lethal dose (LD50) is 1,000-fold higher than the normal lines after inoculation with occlusion bodies. The analysis of economic characters and off-target efficiency of Cas9(+)/sgRNA(+) transgenic hybrid line showed no significant difference compared with the normal lines. Our findings indicate that CRISPR/Cas9-mediated genome engineering more effectively targets the BmNPV genomes and could be utilized as an insect antiviral treatment.

  4. Excision of Nucleopolyhedrovirus Form Transgenic Silkworm Using the CRISPR/Cas9 System

    PubMed Central

    Dong, Zhanqi; Dong, Feifan; Yu, Xinbo; Huang, Liang; Jiang, Yaming; Hu, Zhigang; Chen, Peng; Lu, Cheng; Pan, Minhui

    2018-01-01

    The CRISPR/Cas9-mediated genome engineering has been shown to efficiently suppress infection by disrupting genes of the pathogen. We recently constructed transgenic lines expressing CRISPR/Cas9 and the double sgRNA target Bombyx mori nucleopolyhedrovirus (BmNPV) immediate early-1 (ie-1) gene in the silkworm, respectively, and obtained four transgenic hybrid lines by G1 generation hybridization: Cas9(-)/sgRNA(-), Cas9(+)/sgRNA(-), Cas9(-)/sgRNA(+), and Cas9(+)/sgRNA(+). We demonstrated that the Cas9(+)/sgRNA(+) transgenic lines effectively edited the target site of the BmNPV genome, and large fragment deletion was observed after BmNPV infection. Further antiviral analysis of the Cas9(+)/sgRNA(+) transgenic lines shows that the median lethal dose (LD50) is 1,000-fold higher than the normal lines after inoculation with occlusion bodies. The analysis of economic characters and off-target efficiency of Cas9(+)/sgRNA(+) transgenic hybrid line showed no significant difference compared with the normal lines. Our findings indicate that CRISPR/Cas9-mediated genome engineering more effectively targets the BmNPV genomes and could be utilized as an insect antiviral treatment. PMID:29503634

  5. Langevin Dynamics Simulations of Genome Packing in Bacteriophage

    PubMed Central

    Forrey, Christopher; Muthukumar, M.

    2006-01-01

    We use Langevin dynamics simulations to study the process by which a coarse-grained DNA chain is packaged within an icosahedral container. We focus our inquiry on three areas of interest in viral packing: the evolving structure of the packaged DNA condensate; the packing velocity; and the internal buildup of energy and resultant forces. Each of these areas has been studied experimentally, and we find that we can qualitatively reproduce experimental results. However, our findings also suggest that the phage genome packing process is fundamentally different than that suggested by the inverse spool model. We suggest that packing in general does not proceed in the deterministic fashion of the inverse-spool model, but rather is stochastic in character. As the chain configuration becomes compressed within the capsid, the structure, energy, and packing velocity all become dependent upon polymer dynamics. That many observed features of the packing process are rooted in condensed-phase polymer dynamics suggests that statistical mechanics, rather than mechanics, should serve as the proper theoretical basis for genome packing. Finally we suggest that, as a result of an internal protein unique to bacteriophage T7, the T7 genome may be significantly more ordered than is true for bacteriophage in general. PMID:16617089

  6. Langevin dynamics simulations of genome packing in bacteriophage.

    PubMed

    Forrey, Christopher; Muthukumar, M

    2006-07-01

    We use Langevin dynamics simulations to study the process by which a coarse-grained DNA chain is packaged within an icosahedral container. We focus our inquiry on three areas of interest in viral packing: the evolving structure of the packaged DNA condensate; the packing velocity; and the internal buildup of energy and resultant forces. Each of these areas has been studied experimentally, and we find that we can qualitatively reproduce experimental results. However, our findings also suggest that the phage genome packing process is fundamentally different than that suggested by the inverse spool model. We suggest that packing in general does not proceed in the deterministic fashion of the inverse-spool model, but rather is stochastic in character. As the chain configuration becomes compressed within the capsid, the structure, energy, and packing velocity all become dependent upon polymer dynamics. That many observed features of the packing process are rooted in condensed-phase polymer dynamics suggests that statistical mechanics, rather than mechanics, should serve as the proper theoretical basis for genome packing. Finally we suggest that, as a result of an internal protein unique to bacteriophage T7, the T7 genome may be significantly more ordered than is true for bacteriophage in general.

  7. Genome-Wide Convergence during Evolution of Mangroves from Woody Plants.

    PubMed

    Xu, Shaohua; He, Ziwen; Guo, Zixiao; Zhang, Zhang; Wyckoff, Gerald J; Greenberg, Anthony; Wu, Chung-I; Shi, Suhua

    2017-04-01

    When living organisms independently invade a new environment, the evolution of similar phenotypic traits is often observed. An interesting but contentious issue is whether the underlying molecular biology also converges in the new habitat. Independent invasions of tropical intertidal zones by woody plants, collectively referred to as mangrove trees, represent some dramatic examples. The high salinity, hypoxia, and other stressors in the new habitat might have affected both genomic features and protein structures. Here, we developed a new method for detecting convergence at conservative Sites (CCS) and applied it to the genomic sequences of mangroves. In simulations, the CCS method drastically reduces random convergence at rapidly evolving sites as well as falsely inferred convergence caused by the misinferences of the ancestral character. In mangrove genomes, we estimated ∼400 genes that have experienced convergence over the background level of convergence in the nonmangrove relatives. The convergent genes are enriched in pathways related to stress response and embryo development, which could be important for mangroves' adaptation to the new habitat. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  8. Genomic Predictions and Genome-Wide Association Study of Resistance Against Piscirickettsia salmonis in Coho Salmon (Oncorhynchus kisutch) Using ddRAD Sequencing

    PubMed Central

    Barría, Agustín; Christensen, Kris A.; Yoshida, Grazyella M.; Correa, Katharina; Jedlicki, Ana; Lhorente, Jean P.; Davidson, William S.; Yáñez, José M.

    2018-01-01

    Piscirickettsia salmonis is one of the main infectious diseases affecting coho salmon (Oncorhynchus kisutch) farming, and current treatments have been ineffective for the control of this disease. Genetic improvement for P. salmonis resistance has been proposed as a feasible alternative for the control of this infectious disease in farmed fish. Genotyping by sequencing (GBS) strategies allow genotyping of hundreds of individuals with thousands of single nucleotide polymorphisms (SNPs), which can be used to perform genome wide association studies (GWAS) and predict genetic values using genome-wide information. We used double-digest restriction-site associated DNA (ddRAD) sequencing to dissect the genetic architecture of resistance against P. salmonis in a farmed coho salmon population and to identify molecular markers associated with the trait. We also evaluated genomic selection (GS) models in order to determine the potential to accelerate the genetic improvement of this trait by means of using genome-wide molecular information. A total of 764 individuals from 33 full-sib families (17 highly resistant and 16 highly susceptible) were experimentally challenged against P. salmonis and their genotypes were assayed using ddRAD sequencing. A total of 9,389 SNPs markers were identified in the population. These markers were used to test genomic selection models and compare different GWAS methodologies for resistance measured as day of death (DD) and binary survival (BIN). Genomic selection models showed higher accuracies than the traditional pedigree-based best linear unbiased prediction (PBLUP) method, for both DD and BIN. The models showed an improvement of up to 95% and 155% respectively over PBLUP. One SNP related with B-cell development was identified as a potential functional candidate associated with resistance to P. salmonis defined as DD. PMID:29440129

  9. Automated ensemble assembly and validation of microbial genomes.

    PubMed

    Koren, Sergey; Treangen, Todd J; Hill, Christopher M; Pop, Mihai; Phillippy, Adam M

    2014-05-03

    The continued democratization of DNA sequencing has sparked a new wave of development of genome assembly and assembly validation methods. As individual research labs, rather than centralized centers, begin to sequence the majority of new genomes, it is important to establish best practices for genome assembly. However, recent evaluations such as GAGE and the Assemblathon have concluded that there is no single best approach to genome assembly. Instead, it is preferable to generate multiple assemblies and validate them to determine which is most useful for the desired analysis; this is a labor-intensive process that is often impossible or unfeasible. To encourage best practices supported by the community, we present iMetAMOS, an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. We demonstrate the utility of the ensemble process on 225 previously unassembled Mycobacterium tuberculosis genomes as well as a Rhodobacter sphaeroides benchmark dataset. On these real data, iMetAMOS reliably produces validated assemblies and identifies potential contamination without user intervention. In addition, intelligent parameter selection produces assemblies of R. sphaeroides comparable to or exceeding the quality of those from the GAGE-B evaluation, affecting the relative ranking of some assemblers. Ensemble assembly with iMetAMOS provides users with multiple, validated assemblies for each genome. 
    Although computationally limited to small or mid-sized genomes, this approach is the most effective and reproducible means for generating high-quality assemblies and enables users to select an assembly best tailored to their specific needs.

  10. Species tree phylogeny and character evolution in the genus Centipeda (Asteraceae): evidence from DNA sequences from coding and non-coding loci from the plastid and nuclear genomes.

    PubMed

    Nylinder, Stephan; Cronholm, Bodil; de Lange, Peter J; Walsh, Neville; Anderberg, Arne A

    2013-08-01

    A species tree phylogeny of the Australian/New Zealand genus Centipeda (Asteraceae) is estimated based on nucleotide sequence data. We analysed sequences of nuclear ribosomal DNA (ETS, ITS) and three plastid loci (ndhF, psbA-trnH, and trnL-F) using the multi-species coalescent module in BEAST. A total of 129 individuals from all 10 recognised species of Centipeda were sampled throughout the species distribution ranges, including two subspecies. We conclude that the inferred species tree topology largely conforms to previous assumptions on species relationships. Centipeda racemosa (Snuffweed), the only consistently perennial representative in the genus, is sister to the remaining species. Centipeda pleiocephala (Tall Sneezeweed) and C. nidiformis (Cotton Sneezeweed) constitute a species pair, as do C. borealis and C. minima (Spreading Sneezeweed), all sharing the symplesiomorphic characters of spherical capitulum and convex receptacle with C. racemosa. Another species group, comprising C. thespidioides (Desert Sneezeweed), C. cunninghamii (Old man weed, or Common sneeze-weed) and C. crateriformis, is well supported but then includes the morphologically aberrant C. aotearoana, all sharing the character of having capitula that mature more slowly relative to the subtending shoot. Centipeda elatinoides takes on a weakly supported intermediate position between the two mentioned groups, and is difficult to relate to any of the former groups based on morphological characters. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. Making sense of genetic risk: A qualitative focus-group study of healthy participants in genomic research.

    PubMed

    Viberg Johansson, Jennifer; Segerdahl, Pär; Ugander, Ulrika Hösterey; Hansson, Mats G; Langenskiöld, Sophie

    2018-03-01

    It is well known that research participants want to receive genetic risk information that is about high risks, serious diseases and potential preventive measures. The aim of this study was to explore, by qualitative means, something less well known: how do healthy research participants themselves make sense of genetic risk information? A phenomenographic approach was chosen to explore research participants' understanding and assessment of genetic risk. We conducted four focus-group (N=16) interviews with participants in a research programme designed to identify biomarkers for cardiopulmonary disease. Among the research participants, we found four ways of understanding genetic risk: as a binary concept, as an explanation, as revealing who I am (knowledge of oneself) and as affecting life ahead. Research participants tend to understand genetic risk as a binary concept. This does not necessarily imply a misunderstanding of, or an irrational approach to, genetic risk. Rather, it may have a heuristic function in decision-making. Risk communication may be enhanced by tailoring the communication to the participants' own lay conceptions. For example, researchers and counselors should address risk in binary terms, maybe looking out for how individual participants search for threshold figures. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. A proteome-scale map of the human interactome network

    PubMed Central

    Rolland, Thomas; Taşan, Murat; Charloteaux, Benoit; Pevzner, Samuel J.; Zhong, Quan; Sahni, Nidhi; Yi, Song; Lemmens, Irma; Fontanillo, Celia; Mosca, Roberto; Kamburov, Atanas; Ghiassian, Susan D.; Yang, Xinping; Ghamsari, Lila; Balcha, Dawit; Begg, Bridget E.; Braun, Pascal; Brehme, Marc; Broly, Martin P.; Carvunis, Anne-Ruxandra; Convery-Zupan, Dan; Corominas, Roser; Coulombe-Huntington, Jasmin; Dann, Elizabeth; Dreze, Matija; Dricot, Amélie; Fan, Changyu; Franzosa, Eric; Gebreab, Fana; Gutierrez, Bryan J.; Hardy, Madeleine F.; Jin, Mike; Kang, Shuli; Kiros, Ruth; Lin, Guan Ning; Luck, Katja; MacWilliams, Andrew; Menche, Jörg; Murray, Ryan R.; Palagi, Alexandre; Poulin, Matthew M.; Rambout, Xavier; Rasla, John; Reichert, Patrick; Romero, Viviana; Ruyssinck, Elien; Sahalie, Julie M.; Scholz, Annemarie; Shah, Akash A.; Sharma, Amitabh; Shen, Yun; Spirohn, Kerstin; Tam, Stanley; Tejeda, Alexander O.; Trigg, Shelly A.; Twizere, Jean-Claude; Vega, Kerwin; Walsh, Jennifer; Cusick, Michael E.; Xia, Yu; Barabási, Albert-László; Iakoucheva, Lilia M.; Aloy, Patrick; De Las Rivas, Javier; Tavernier, Jan; Calderwood, Michael A.; Hill, David E.; Hao, Tong; Roth, Frederick P.; Vidal, Marc

    2014-01-01

    SUMMARY Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ~14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ~30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant inter-connectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high quality interactome models will help “connect the dots” of the genomic revolution. PMID:25416956

  13. Universal Plant DNA Barcode Loci May Not Work in Complex Groups: A Case Study with Indian Berberis Species

    PubMed Central

    Roy, Sribash; Tyagi, Antariksh; Shukla, Virendra; Kumar, Anil; Singh, Uma M.; Chaudhary, Lal Babu; Datt, Bhaskar; Bag, Sumit K.; Singh, Pradhyumna K.; Nair, Narayanan K.; Husain, Tariq; Tuli, Rakesh

    2010-01-01

    Background The concept of DNA barcoding for species identification has gained considerable momentum in animals because of fairly successful species identification using cytochrome oxidase I (COI). In plants, matK and rbcL have been proposed as standard barcodes. However, barcoding in complex genera is a challenging task. Methodology and Principal Findings We investigated the species discriminatory power of four reportedly most promising plant DNA barcoding loci (one from nuclear genome- ITS, and three from plastid genome- trnH-psbA, rbcL and matK) in species of Indian Berberis L. (Berberidaceae) and two other genera, Ficus L. (Moraceae) and Gossypium L. (Malvaceae). Berberis species were delineated using morphological characters. These characters resulted in a well resolved species tree. Applying both nucleotide distance and nucleotide character-based approaches, we found that none of the loci, either singly or in combinations, could discriminate the species of Berberis. ITS resolved all the tested species of Ficus and Gossypium and trnH-psbA resolved 82% of the tested species in Ficus. The highly regarded matK and rbcL could not resolve all the species. Finally, we employed amplified fragment length polymorphism test in species of Berberis to determine their relationships. Using ten primer pair combinations in AFLP, the data demonstrated incomplete species resolution. Further, AFLP analysis showed that there was a tendency of the Berberis accessions to cluster according to their geographic origin rather than species affiliation. Conclusions/Significance We reconfirm the earlier reports that the concept of universal barcode in plants may not work in a number of genera. Our results also suggest that the matK and rbcL, recommended as universal barcode loci for plants, may not work in all the genera of land plants. 
    Morphological, geographical and molecular data analyses of Indian species of Berberis suggest probable reticulate evolution and thus barcode markers may not work in this case. PMID:21060687

  14. Viral genetic variation accounts for a third of variability in HIV-1 set-point viral load in Europe

    PubMed Central

    Wymant, Chris; Cornelissen, Marion; Gall, Astrid; Bakker, Margreet; Bezemer, Daniela; Hall, Matthew; Hillebregt, Mariska; Ong, Swee Hoe; Albert, Jan; Bannert, Norbert; Fellay, Jacques; Fransen, Katrien; Gourlay, Annabelle J.; Grabowski, M. Kate; Gunsenheimer-Bartmeyer, Barbara; Günthard, Huldrych F.; Kivelä, Pia; Kouyos, Roger; Laeyendecker, Oliver; Liitsola, Kirsi; Meyer, Laurence; Porter, Kholoud; Ristola, Matti; van Sighem, Ard; Vanham, Guido; Berkhout, Ben; Kellam, Paul; Reiss, Peter; Fraser, Christophe

    2017-01-01

    HIV-1 set-point viral load—the approximately stable value of viraemia in the first years of chronic infection—is a strong predictor of clinical outcome and is highly variable across infected individuals. To better understand HIV-1 pathogenesis and the evolution of the viral population, we must quantify the heritability of set-point viral load, which is the fraction of variation in this phenotype attributable to viral genetic variation. However, current estimates of heritability vary widely, from 6% to 59%. Here we used a dataset of 2,028 seroconverters infected between 1985 and 2013 from 5 European countries (Belgium, Switzerland, France, the Netherlands and the United Kingdom) and estimated the heritability of set-point viral load at 31% (CI 15%–43%). Specifically, heritability was measured using models of character evolution describing how viral load evolves on the phylogeny of whole-genome viral sequences. In contrast to previous studies, (i) we measured viral loads using standardized assays on a sample collected in a strict time window of 6 to 24 months after infection, from which the viral genome was also sequenced; (ii) we compared 2 models of character evolution, the classical “Brownian motion” model and another model (“Ornstein–Uhlenbeck”) that includes stabilising selection on viral load; (iii) we controlled for covariates, including age and sex, which may inflate estimates of heritability; and (iv) we developed a goodness of fit test based on the correlation of viral loads in cherries of the phylogenetic tree, showing that both models of character evolution fit the data well. An overall heritability of 31% (CI 15%–43%) is consistent with other studies based on regression of viral load in donor–recipient pairs. Thus, about a third of variation in HIV-1 virulence is attributable to viral genetic variation. PMID:28604782

  15. Cnidarian phylogenetic relationships as revealed by mitogenomics

    PubMed Central

    2013-01-01

    Background Cnidaria (corals, sea anemones, hydroids, jellyfish) is a phylum of relatively simple aquatic animals characterized by the presence of the cnidocyst: a cell containing a giant capsular organelle with an eversible tubule (cnida). Species within Cnidaria have life cycles that involve one or both of the two distinct body forms, a typically benthic polyp, which may or may not be colonial, and a typically pelagic mostly solitary medusa. The currently accepted taxonomic scheme subdivides Cnidaria into two main assemblages: Anthozoa (Hexacorallia + Octocorallia) – cnidarians with a reproductive polyp and the absence of a medusa stage – and Medusozoa (Cubozoa, Hydrozoa, Scyphozoa, Staurozoa) – cnidarians that usually possess a reproductive medusa stage. Hypothesized relationships among these taxa greatly impact interpretations of cnidarian character evolution. Results We expanded the sampling of cnidarian mitochondrial genomes, particularly from Medusozoa, to reevaluate phylogenetic relationships within Cnidaria. Our phylogenetic analyses based on a mitochogenomic dataset support many prior hypotheses, including monophyly of Hexacorallia, Octocorallia, Medusozoa, Cubozoa, Staurozoa, Hydrozoa, Carybdeida, Chirodropida, and Hydroidolina, but reject the monophyly of Anthozoa, indicating that the Octocorallia + Medusozoa relationship is not the result of sampling bias, as proposed earlier. Further, our analyses contradict Scyphozoa [Discomedusae + Coronatae], Acraspeda [Cubozoa + Scyphozoa], as well as the hypothesis that Staurozoa is the sister group to all the other medusozoans. Conclusions Cnidarian mitochondrial genomic data contain phylogenetic signal informative for understanding the evolutionary history of this phylum. Mitogenome-based phylogenies, which reject the monophyly of Anthozoa, provide further evidence for the polyp-first hypothesis. 
    By rejecting the traditional Acraspeda and Scyphozoa hypotheses, these analyses suggest that the shared morphological characters in these groups are plesiomorphies that originated in the branch leading to Medusozoa. The expansion of mitogenomic data along with improvements in phylogenetic inference methods and use of additional nuclear markers will further enhance our understanding of the phylogenetic relationships and character evolution within Cnidaria. PMID:23302374

  16. Cnidarian phylogenetic relationships as revealed by mitogenomics.

    PubMed

    Kayal, Ehsan; Roure, Béatrice; Philippe, Hervé; Collins, Allen G; Lavrov, Dennis V

    2013-01-09

    Cnidaria (corals, sea anemones, hydroids, jellyfish) is a phylum of relatively simple aquatic animals characterized by the presence of the cnidocyst: a cell containing a giant capsular organelle with an eversible tubule (cnida). Species within Cnidaria have life cycles that involve one or both of the two distinct body forms, a typically benthic polyp, which may or may not be colonial, and a typically pelagic mostly solitary medusa. The currently accepted taxonomic scheme subdivides Cnidaria into two main assemblages: Anthozoa (Hexacorallia + Octocorallia) - cnidarians with a reproductive polyp and the absence of a medusa stage - and Medusozoa (Cubozoa, Hydrozoa, Scyphozoa, Staurozoa) - cnidarians that usually possess a reproductive medusa stage. Hypothesized relationships among these taxa greatly impact interpretations of cnidarian character evolution. We expanded the sampling of cnidarian mitochondrial genomes, particularly from Medusozoa, to reevaluate phylogenetic relationships within Cnidaria. Our phylogenetic analyses based on a mitochogenomic dataset support many prior hypotheses, including monophyly of Hexacorallia, Octocorallia, Medusozoa, Cubozoa, Staurozoa, Hydrozoa, Carybdeida, Chirodropida, and Hydroidolina, but reject the monophyly of Anthozoa, indicating that the Octocorallia + Medusozoa relationship is not the result of sampling bias, as proposed earlier. Further, our analyses contradict Scyphozoa [Discomedusae + Coronatae], Acraspeda [Cubozoa + Scyphozoa], as well as the hypothesis that Staurozoa is the sister group to all the other medusozoans. Cnidarian mitochondrial genomic data contain phylogenetic signal informative for understanding the evolutionary history of this phylum. Mitogenome-based phylogenies, which reject the monophyly of Anthozoa, provide further evidence for the polyp-first hypothesis. 
    By rejecting the traditional Acraspeda and Scyphozoa hypotheses, these analyses suggest that the shared morphological characters in these groups are plesiomorphies that originated in the branch leading to Medusozoa. The expansion of mitogenomic data along with improvements in phylogenetic inference methods and use of additional nuclear markers will further enhance our understanding of the phylogenetic relationships and character evolution within Cnidaria.

  17. Evolution of the mitochondrial genome in snakes: Gene rearrangements and phylogenetic relationships

    PubMed Central

    Yan, Jie; Li, Hongdan; Zhou, Kaiya

    2008-01-01

    Background Snakes as a major reptile group display a variety of morphological characteristics pertaining to their diverse behaviours. Despite abundant analyses of morphological characters, molecular studies using mitochondrial and nuclear genes are limited. As a result, the phylogeny of snakes remains controversial. Previous studies on mitochondrial genomes of snakes have demonstrated duplication of the control region and translocation of trnL to be two notable features of the alethinophidian (all serpents except blindsnakes and threadsnakes) mtDNAs. Our purpose is to further investigate the gene organizations, evolution of the snake mitochondrial genome, and phylogenetic relationships among several major snake families. Results The mitochondrial genomes were sequenced for four taxa representing four different families, and each had a different gene arrangement. Comparative analyses with other snake mitochondrial genomes allowed us to summarize six types of mitochondrial gene arrangement in snakes. Phylogenetic reconstruction with commonly used methods of phylogenetic inference (BI, ML, MP, NJ) arrived at a similar topology, which was used to reconstruct the evolution of mitochondrial gene arrangements in snakes. Conclusion The phylogenetic relationships among the major families of snakes are in accordance with the mitochondrial genomes in terms of gene arrangements. The gene arrangement in Ramphotyphlops braminus mtDNA is inferred to be ancestral for snakes. After the divergence of the early Ramphotyphlops lineage, three types of rearrangements occurred. These changes involve translocations within the IQM tRNA gene cluster and the duplication of the CR. All phylogenetic methods support the placement of Enhydris plumbea outside of the (Colubridae + Elapidae) cluster, providing mitochondrial genomic evidence for the familial rank of Homalopsidae. PMID:19038056

  18. A new computational method for the detection of horizontal gene transfer events.

    PubMed

    Tsirigos, Aristotelis; Rigoutsos, Isidore

    2005-01-01

    In recent years, the increase in the amount of available genomic data has made it easier to appreciate the extent to which organisms increase their genetic diversity through horizontally transferred genetic material. Such transfers have the potential to give rise to extremely dynamic genomes where a significant proportion of their coding DNA has been contributed by external sources. Because of the impact of these horizontal transfers on the ecological and pathogenic character of the recipient organisms, methods are continuously sought that are able to computationally determine which of the genes of a given genome are products of transfer events. In this paper, we introduce and discuss a novel computational method for identifying horizontal transfers that relies on a gene's nucleotide composition and obviates the need for knowledge of codon boundaries. In addition to being applicable to individual genes, the method can be easily extended to the case of clusters of horizontally transferred genes. With the help of an extensive and carefully designed set of experiments on 123 archaeal and bacterial genomes, we demonstrate that the new method exhibits significant improvement in sensitivity when compared to previously published approaches. In fact, it achieves an average relative improvement across genomes of between 11 and 41% compared to the Codon Adaptation Index method in distinguishing native from foreign genes. Our method's horizontal gene transfer predictions for 123 microbial genomes are available online at http://cbcsrv.watson.ibm.com/HGT/.

  19. compendiumdb: an R package for retrieval and storage of functional genomics data.

    PubMed

    Nandal, Umesh K; van Kampen, Antoine H C; Moerland, Perry D

    2016-09-15

    Currently, the Gene Expression Omnibus (GEO) contains public data of over 1 million samples from more than 40 000 microarray-based functional genomics experiments. This provides a rich source of information for novel biological discoveries. However, unlocking this potential often requires retrieving and storing a large number of expression profiles from a wide range of different studies and platforms. The compendiumdb R package provides an environment for downloading functional genomics data from GEO, parsing the information into a local or remote database and interacting with the database using dedicated R functions, thus enabling seamless integration with other tools available in R/Bioconductor. The compendiumdb package is written in R, MySQL and Perl. Source code and binaries are available from CRAN (http://cran.r-project.org/web/packages/compendiumdb/) for all major platforms (Linux, MS Windows and OS X) under the GPLv3 license. Contact: p.d.moerland@amc.uva.nl. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.

  20. Genome-wide regression and prediction with the BGLR statistical package.

    PubMed

    Pérez, Paulino; de los Campos, Gustavo

    2014-10-01

    Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis. Copyright © 2014 by the Genetics Society of America.

  1. Mitochondrial DNA of Vitis vinifera and the issue of rampant horizontal gene transfer.

    PubMed

    Goremykin, Vadim V; Salamini, Francesco; Velasco, Riccardo; Viola, Roberto

    2009-01-01

    The mitochondrial genome of grape (Vitis vinifera), the largest organelle genome sequenced so far, is presented. The genome is 773,279 nt long and has the highest coding capacity among known angiosperm mitochondrial DNAs (mtDNAs). The proportion of promiscuous DNA of plastid origin in the genome is also the largest ever reported for an angiosperm mtDNA, both in absolute and relative terms. In all, 42.4% of chloroplast genome of Vitis has been incorporated into its mitochondrial genome. In order to test if horizontal gene transfer (HGT) has also contributed to the gene content of the grape mtDNA, we built phylogenetic trees with the coding sequences of mitochondrial genes of grape and their homologs from plant mitochondrial genomes. Many incongruent gene tree topologies were obtained. However, the extent of incongruence between these gene trees is not significantly greater than that observed among optimal trees for chloroplast genes, the common ancestry of which has never been in doubt. In both cases, we attribute this incongruence to artifacts of tree reconstruction, insufficient numbers of characters, and gene paralogy. This finding leads us to question the recent phylogenetic interpretation of Bergthorsson et al. (2003, 2004) and Richardson and Palmer (2007) that rampant HGT into the mtDNA of Amborella best explains phylogenetic incongruence between mitochondrial gene trees for angiosperms. The only evidence for HGT into the Vitis mtDNA found involves fragments of two coding sequences stemming from two closteroviruses that cause the leaf roll disease of this plant. We also report that analysis of sequences shared by both chloroplast and mitochondrial genomes provides evidence for a previously unknown gene transfer route from the mitochondrion to the chloroplast.

  2. Complete chloroplast genome sequences of Praxelis (Eupatorium catarium Veldkamp), an important invasive species.

    PubMed

    Zhang, Ying; Li, Lei; Yan, Ting Liang; Liu, Qiang

    2014-10-01

    Praxelis (Eupatorium catarium Veldkamp) is a new hazardous invasive plant species that has caused serious economic losses and environmental damage in the Northern hemisphere tropical and subtropical regions. Although previous studies focused on detecting the biological characteristics of this plant to prevent its expansion, little effort has been made to understand the impact of Praxelis on the ecosystem from an evolutionary perspective. The genetic information of Praxelis is required for further phylogenetic identification and evolutionary studies. Here, we report the complete Praxelis chloroplast (cp) genome sequence. The Praxelis chloroplast genome is 151,410 bp in length, including a small single-copy region (18,547 bp) and a large single-copy region (85,311 bp) separated by a pair of inverted repeats (IRs; 23,776 bp each). The genome contains 85 unique genes and 18 genes duplicated in the IR region. The gene content and organization are similar to other Asteraceae tribe cp genomes. We also analyzed the whole cp genome sequence, repeat structure, codon usage, contraction of the IR and gene structure/organization features between native and invasive Asteraceae plants, in order to understand the evolution of organelle genomes between native and invasive Asteraceae. Comparative analysis identified 14 markers containing greater than 2% parsimony-informative characters, indicating that they are potentially informative markers for barcoding and phylogenetic analysis. Moreover, a sister relationship between Praxelis and seven other species in Asteraceae was found based on phylogenetic analysis of 28 protein-coding sequences. Complete cp genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family. Copyright © 2014 Elsevier B.V. All rights reserved.
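
    The region lengths quoted in this abstract can be checked against the reported total, assuming the usual quadripartite plastome layout in which the 23,776 bp figure is the length of one of the two inverted-repeat copies. The short Python sketch below is only an illustrative consistency check; the function and variable names are hypothetical and are not taken from the paper.

```python
# Region lengths in base pairs, as quoted in the abstract above.
LSC = 85_311  # large single-copy region
SSC = 18_547  # small single-copy region
IR = 23_776   # one inverted repeat (assumed: the genome carries two copies)


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular plastome made of LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total)  # 151410, which matches the reported genome length
```

    Under that assumption the four regions sum exactly to the reported 151,410 bp.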

  3. Calibration of the modulation transfer function of surface profilometers with binary pseudo-random test standards: expanding the application range

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yashchuk, Valeriy V.; Anderson, Erik H.; Barber, Samuel K.

    2011-03-14

    A modulation transfer function (MTF) calibration method based on binary pseudo-random (BPR) gratings and arrays [Proc. SPIE 7077-7 (2007), Opt. Eng. 47, 073602 (2008)] has been proven to be an effective MTF calibration method for a number of interferometric microscopes and a scatterometer [Nucl. Instr. and Meth. A616, 172 (2010)]. Here we report on a further expansion of the application range of the method. We describe the MTF calibration of a 6 inch phase shifting Fizeau interferometer. Beyond providing a direct measurement of the interferometer's MTF, tests with a BPR array surface have revealed an asymmetry in the instrument's data processing algorithm that fundamentally limits its bandwidth. Moreover, the tests have illustrated the effects of the instrument's detrending and filtering procedures on power spectral density measurements. The details of the development of a BPR test sample suitable for calibration of scanning and transmission electron microscopes are also presented. Such a test sample is realized as a multilayer structure with the layer thicknesses of two materials corresponding to BPR sequence. The investigations confirm the universal character of the method that makes it applicable to a large variety of metrology instrumentation with spatial wavelength bandwidths from a few nanometers to hundreds of millimeters.

  4. Binary pseudo-random patterned structures for modulation transfer function calibration and resolution characterization of a full-field transmission soft x-ray microscope

    DOE PAGES

    Yashchuk, V. V.; Fischer, P. J.; Chan, E. R.; ...

    2015-12-09

    We present a modulation transfer function (MTF) calibration method based on binary pseudo-random (BPR) one-dimensional sequences and two-dimensional arrays as an effective method for spectral characterization in the spatial frequency domain of a broad variety of metrology instrumentation, including interferometric microscopes, scatterometers, phase shifting Fizeau interferometers, scanning and transmission electron microscopes, and at this time, x-ray microscopes. The inherent power spectral density of BPR gratings and arrays, which has a deterministic white-noise-like character, allows a direct determination of the MTF with a uniform sensitivity over the entire spatial frequency range and field of view of an instrument. We demonstrate the MTF calibration and resolution characterization over the full field of a transmission soft x-ray microscope using a BPR multilayer (ML) test sample with 2.8 nm fundamental layer thickness. We show that beyond providing a direct measurement of the microscope's MTF, tests with the BPRML sample can be used to fine tune the instrument's focal distance. Finally, our results confirm the universality of the method that makes it applicable to a large variety of metrology instrumentation with spatial wavelength bandwidths from a few nanometers to hundreds of millimeters.

  5. Network-Forming Nanoclusters in Binary As-S/Se Glasses: From Ab Initio Quantum Chemical Modeling to Experimental Evidences.

    PubMed

    Hyla, M

    2017-12-01

    Network-forming As₂(S/Se)ₘ nanoclusters are employed to recognize expected variations in a vicinity of some remarkable compositions in binary As-Se/S glassy systems accepted as signatures of optimally constrained intermediate topological phases in earlier temperature-modulated differential scanning calorimetry experiments. The ab initio quantum chemical calculations performed using the cation-interlinking network cluster approach show similar oscillating character in tendency to local chemical decomposition but obvious step-like behavior in preference to global phase separation on boundary chemical compounds (pure chalcogen and stoichiometric arsenic chalcogenides). The onsets of stability are defined for chalcogen-rich glasses, these being connected with As₂Se₅ (Z = 2.29) and As₂S₆ (Z = 2.25) nanoclusters for As-Se and As-S glasses, respectively. The physical aging effects result preferentially from global phase separation in As-S glass system due to high localization of covalent bonding and local demixing on neighboring As₂Seₘ₊₁ and As₂Seₘ₋₁ nanoclusters in As-Se system. These nanoclusters well explain the lower limits of reversibility windows in temperature-modulated differential scanning calorimetry, but they cannot be accepted as signatures of topological phase transitions in respect to the rigidity theory.

  6. The sea cucumber genome provides insights into morphological evolution and visceral regeneration

    PubMed Central

    Dai, Hui; Hamel, Jean-François; Liu, Chengzhang; Yu, Yang; Liu, Shilin; Lin, Wenchao; Guo, Kaimin; Jin, Songjun; Xu, Peng; Storey, Kenneth B.; Huan, Pin; Zhang, Tao; Zhou, Yi; Zhang, Jiquan; Lin, Chenggang; Li, Xiaoni; Xing, Lili; Huo, Da; Sun, Mingzhe; Wang, Lei; Mercier, Annie; Li, Fuhua; Yang, Hongsheng

    2017-01-01

    Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs. PMID:29023486

  7. The sea cucumber genome provides insights into morphological evolution and visceral regeneration.

    PubMed

    Zhang, Xiaojun; Sun, Lina; Yuan, Jianbo; Sun, Yamin; Gao, Yi; Zhang, Libin; Li, Shihao; Dai, Hui; Hamel, Jean-François; Liu, Chengzhang; Yu, Yang; Liu, Shilin; Lin, Wenchao; Guo, Kaimin; Jin, Songjun; Xu, Peng; Storey, Kenneth B; Huan, Pin; Zhang, Tao; Zhou, Yi; Zhang, Jiquan; Lin, Chenggang; Li, Xiaoni; Xing, Lili; Huo, Da; Sun, Mingzhe; Wang, Lei; Mercier, Annie; Li, Fuhua; Yang, Hongsheng; Xiang, Jianhai

    2017-10-01

    Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs.
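
    The contig and scaffold N50 values quoted in this abstract are a standard assembly-contiguity statistic: the length L such that sequences of length L or longer together cover at least half of the total assembled bases. The sketch below illustrates that definition only; the function name `n50` and the toy lengths are assumptions for illustration, not values or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy lengths (hypothetical), not taken from the A. japonicus assembly.
    print(n50([2, 3, 4, 5, 6]))
```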

  8. Half-metallic Co-based quaternary Heusler alloys for spintronics: Defect- and pressure-induced transitions and properties

    DOE PAGES

    Enamullah, .; Johnson, D. D.; Suresh, K. G.; ...

    2016-11-07

    Heusler compounds offer potential as spintronic devices due to their spin polarization and half-metallicity properties, where electron spin-majority (minority) manifold exhibits states (band gap) at the electronic chemical potential, yielding full spin polarization in a single manifold. Yet, Heuslers often exhibit intrinsic disorder that degrades their half-metallicity and spin polarization. Using density-functional theory, we analyze the electronic and magnetic properties of equiatomic Heusler (L2₁) CoMnCrAl and CoFeCrGe alloys for effects of hydrostatic pressure and intrinsic disorder (thermal antisites, binary swaps, and vacancies). Under pressure, CoMnCrAl undergoes a metallic transition, while half-metallicity in CoFeCrGe is retained for a limited range. Antisite disorder between Cr-Al pair in CoMnCrAl alloy is energetically the most favorable, and retains half-metallic character in Cr-excess regime. However, Co-deficient samples in both alloys undergo a transition from half-metallic to metallic, with a discontinuity in the saturation magnetization. For binary swaps, configurations that compete with the ground state are identified and show no loss of half-metallicity; however, the minority-spin band gap and magnetic moments vary depending on the atoms swapped. For single binary swaps, there is a significant energy cost in CoMnCrAl but with no loss of half-metallicity. Although a few configurations in CoFeCrGe energetically compete with the ground state, the minority-spin band gap and magnetic moments vary depending on the atoms swapped. Furthermore, this information should help in controlling these potential spintronic materials.

  9. Half-metallic Co-based quaternary Heusler alloys for spintronics: Defect- and pressure-induced transitions and properties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Enamullah, .; Johnson, D. D.; Suresh, K. G.

    Heusler compounds offer potential as spintronic devices due to their spin polarization and half-metallicity properties, where electron spin-majority (minority) manifold exhibits states (band gap) at the electronic chemical potential, yielding full spin polarization in a single manifold. Yet, Heuslers often exhibit intrinsic disorder that degrades their half-metallicity and spin polarization. Using density-functional theory, we analyze the electronic and magnetic properties of equiatomic Heusler (L2₁) CoMnCrAl and CoFeCrGe alloys for effects of hydrostatic pressure and intrinsic disorder (thermal antisites, binary swaps, and vacancies). Under pressure, CoMnCrAl undergoes a metallic transition, while half-metallicity in CoFeCrGe is retained for a limited range. Antisite disorder between Cr-Al pair in CoMnCrAl alloy is energetically the most favorable, and retains half-metallic character in Cr-excess regime. However, Co-deficient samples in both alloys undergo a transition from half-metallic to metallic, with a discontinuity in the saturation magnetization. For binary swaps, configurations that compete with the ground state are identified and show no loss of half-metallicity; however, the minority-spin band gap and magnetic moments vary depending on the atoms swapped. For single binary swaps, there is a significant energy cost in CoMnCrAl but with no loss of half-metallicity. Although a few configurations in CoFeCrGe energetically compete with the ground state, the minority-spin band gap and magnetic moments vary depending on the atoms swapped. Furthermore, this information should help in controlling these potential spintronic materials.

  10. GAC: Gene Associations with Clinical, a web based application.

    PubMed

    Zhang, Xinyan; Rupji, Manali; Kowalski, Jeanne

    2017-01-01

    We present GAC, a Shiny R-based tool for interactive visualization of clinical associations based on high-dimensional data. The tool provides a web-based suite to perform supervised principal component analysis (SuperPC), an approach that combines high-dimensional data, such as gene expression, with clinical data to infer clinical associations. We extended the approach to address binary outcomes, in addition to continuous and time-to-event data in our package, thereby increasing the use and flexibility of SuperPC. Additionally, the tool provides an interactive visualization for summarizing results based on a forest plot for both binary and time-to-event data. In summary, the GAC suite of tools provides a one-stop shop for conducting statistical analysis to identify and visualize the association between a clinical outcome of interest and high-dimensional data types, such as genomic data. Our GAC package has been implemented in R and is available via http://shinygispa.winship.emory.edu/GAC/. The developmental repository is available at https://github.com/manalirupji/GAC.

  11. Re-criticizing RNA-mediated cell evolution: a radical perspective

    NASA Astrophysics Data System (ADS)

    Kotakis, Christos

    2016-01-01

    Genetic inter-communication of the nucleic-organellar dual in eukaryotes is dominated by DNA-directed phenomena. RNA regulatory circuits have also been observed in artificial laboratory prototypes where gene transfer events are reconstructed, but they are excluded from the primary norm due to their rarity. Recent technical advances in organellar biotechnology, genome engineering and single-molecule tracking give novel experimental insights on RNA metabolism not only at cellular level, but also on organismal survival. Here, I put forward a hypothesis for RNA's involvement in gene piece transfer, taken together the current knowledge on the primitive RNA character as a biochemical modulator with model organisms from peculiar natural habitats. It is proposed that RNA molecules of special structural signature and functional identity can drive evolution, integrating the ecological pressure of environmental oscillations into genome imprinting by buffering-out epigenetic aberrancies.

  12. First instalment in resolution of the Banksia spinulosa complex (Proteaceae): B. neoanglica, a new species supported by phenetic analysis, ecology and geography

    PubMed Central

    Stimpson, Margaret L.; Weston, Peter H.; Telford, Ian R.H.; Bruhl, Jeremy J.

    2012-01-01

    Taxa in the Banksia spinulosa Sm. complex (Proteaceae) have populations with sympatric, parapatric and allopatric distributions and unclear or disputed boundaries. Our hypothesis is that, under biological, phenetic and diagnosable species concepts, each of the currently named taxa within the Banksia spinulosa complex is a separate species. Based on specimens collected as part of this study, and data recorded from specimens in six Australian herbaria, complemented by phenetic analysis (semi–strong multidimensional scaling and UPGMA clustering) and a detailed morphological study, we investigated both morphological variation and geographic distribution in the Banksia spinulosa complex. All specimens used for this study are held at the N.C.W. Beadle Herbarium or the National Herbarium of New South Wales. In total 23 morphological characters (11 quantitative, five binary, and seven multistate characters) were analysed phenetically for 89 specimens. Ordination and cluster analysis resulted in individuals grouping strongly, allowing recognition of distinct groups consistent with their recognition as separate species. Additional morphological analysis was completed on all specimens using leaf, floral, fruit and stem morphology, providing clear-cut diagnosable groups and strong support for the recognition of Banksia spinulosa var. cunninghamii and Banksia spinulosa var. neoanglica as species. PMID:23170073

  13. The archetype-genome exemplar in molecular dynamics and continuum mechanics

    NASA Astrophysics Data System (ADS)

    Greene, M. Steven; Li, Ying; Chen, Wei; Liu, Wing Kam

    2014-04-01

    We argue that mechanics and physics of solids rely on a fundamental exemplar: the apparent properties of a system depend on the building blocks that comprise it. Building blocks are referred to as archetypes and apparent system properties as the system genome. Three entities are of importance: the archetype properties, the conformation of archetypes, and the properties of interactions activated by that conformation. The combination of these entities into the system genome is called assembly. To show the utility of the archetype-genome exemplar, this work presents the mathematical ingredients and computational implementation of theories in solid mechanics that are (1) molecular and (2) continuum manifestations of the assembly process. Both coarse-grained molecular dynamics (CGMD) and the archetype-blending continuum (ABC) theories are formulated then applied to polymer nanocomposites (PNCs) to demonstrate the impact the components of the assembly triplet have on a material genome. CGMD simulations demonstrate the sensitivity of nanocomposite viscosities and diffusion coefficients to polymer chain types (archetype), polymer-nanoparticle interaction potentials (interaction), and the structural configuration (conformation) of dispersed nanoparticles. ABC simulations show the contributions of bulk polymer (archetype) properties, occluded region of bound rubber (interaction) properties, and microstructural binary images (conformation) to predictions of linear damping properties, the Payne effect, and localization/size effects in the same class of PNC material. The paper is light on mathematics. Instead, the focus is on the usefulness of the archetype-genome exemplar to predict system behavior inaccessible to classical theories by transitioning mechanics away from heuristic laws to mechanism-based ones. There are two core contributions of this research: (1) presentation of a fundamental axiom, the archetype-genome exemplar, to guide theory development in computational mechanics, and (2) demonstrations of its utility in modern theoretical realms: CGMD, and generalized continuum mechanics.

  14. Secure communication of static information by electronic means

    DOEpatents

    Gritton, Dale G.

    1994-01-01

    A method and apparatus (10) for the secure transmission of static data (16) from a tag (11) to a remote reader (12). Each time the static data (16) is to be transmitted to the reader (12), the 10 bits of static data (16) are combined with 54 bits of binary data (21), which constantly change from one transmission to the next, into a 64-bit number (22). This number is then encrypted and transmitted to the remote reader (12) where it is decrypted (26) to produce the same 64 bit number that was encrypted in the tag (11). With a continual change in the value of the 64 bit number (22) in the tag, the encrypted numbers transmitted to the reader (12) will appear to be dynamic in character rather than being static.
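
    The combination step described above (10 static bits joined with 54 changing bits into one 64-bit number) can be sketched as simple bit packing. This is an illustrative sketch under stated assumptions: the abstract does not say which bit positions hold the static data, how the changing bits are generated, or which cipher is used, so the layout (static data in the low 10 bits), the use of random bits, and all names below are hypothetical. The encryption step is deliberately left out.

```python
import secrets

STATIC_BITS = 10    # width of the static data, as stated in the abstract
CHANGING_BITS = 54  # width of the per-transmission changing data


def pack(static_data, changing):
    """Combine both fields into one 64-bit integer (assumed layout: static in low bits)."""
    if not 0 <= static_data < (1 << STATIC_BITS):
        raise ValueError("static data must fit in 10 bits")
    if not 0 <= changing < (1 << CHANGING_BITS):
        raise ValueError("changing data must fit in 54 bits")
    return (changing << STATIC_BITS) | static_data


def unpack(word):
    """Split a 64-bit integer back into (static_data, changing)."""
    return word & ((1 << STATIC_BITS) - 1), word >> STATIC_BITS


def fresh_word(static_data):
    """Build a new 64-bit word; random changing bits are an assumption of this sketch."""
    return pack(static_data, secrets.randbits(CHANGING_BITS))


if __name__ == "__main__":
    word = fresh_word(0x2A5)
    # The word would be encrypted before transmission; the reader would decrypt, then unpack.
    print(f"{word:016x}", unpack(word)[0] == 0x2A5)
```

    Because the changing field differs between transmissions, the packed word (and hence its ciphertext) differs each time even though the static field is constant, which is the effect the abstract describes.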

  15. Integral Phylogenomic Approach over Ilex L. Species from Southern South America

    PubMed Central

    Cascales, Jimena; Bracco, Mariana; Garberoglio, Mariana J.; Poggio, Lidia; Gottlieb, Alexandra M.

    2017-01-01

    The use of molecular markers with inadequate variation levels has resulted in poorly resolved phylogenetic relationships within Ilex. Focusing on southern South American and Asian species, we aimed at contributing informative plastid markers. Also, we intended to gain insights into the nature of morphological and physiological characters used to identify species. We obtained the chloroplast genomes of I. paraguariensis and I. dumosa, and combined these with all the congeneric plastomes currently available to accomplish interspecific comparisons and multilocus analyses. We selected seven introns and nine IGSs as variable non-coding markers that were used in phylogenomic analyses. Eight extra IGSs were proposed as candidate markers. Southern South American species formed one lineage, except for I. paraguariensis, I. dumosa and I. argentina, which occupied intermediate positions among sampled taxa; Euroasiatic species formed two lineages. Some concordant relationships were retrieved from nuclear sequence data. We also conducted integral analyses, involving a supernetwork of molecular data, and a simultaneous analysis of quantitative and qualitative morphological and phytochemical characters, together with molecular data. The total evidence tree was used to study the evolution of non-molecular data, evidencing fifteen non-ambiguous synapomorphic character states and consolidating the relationships among southern South American species. More South American representatives should be incorporated to elucidate their origin. PMID:29165335

  16. Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks

    PubMed Central

    Zhao, Yongan; Carey, Knox; Lloyd, David; Sofia, Heidi; Baker, Dixie; Flicek, Paul; Shringarpure, Suyash; Bustamante, Carlos; Wang, Shuang; Jiang, Xiaoqian; Ohno-Machado, Lucila; Tang, Haixu; Wang, XiaoFeng; Hubaux, Jean-Pierre

    2018-01-01

    The Global Alliance for Genomics and Health (GA4GH) created the Beacon Project as a means of testing the willingness of data holders to share genetic data in the simplest technical context—a query for the presence of a specified nucleotide at a given position within a chromosome. Each participating site (or “beacon”) is responsible for assuring that genomic data are exposed through the Beacon service only with the permission of the individual to whom the data pertains and in accordance with the GA4GH policy and standards. While recognizing the inference risks associated with large-scale data aggregation, and the fact that some beacons contain sensitive phenotypic associations that increase privacy risk, the GA4GH adjudged the risk of re-identification based on the binary yes/no allele-presence query responses as acceptable. However, recent work demonstrated that, given a beacon with specific characteristics (including relatively small sample size and an adversary who possesses an individual’s whole genome sequence), the individual’s membership in a beacon can be inferred through repeated queries for variants present in the individual’s genome. In this paper, we propose three practical strategies for reducing re-identification risks in beacons. The first two strategies manipulate the beacon such that the presence of rare alleles is obscured; the third strategy budgets the number of accesses per user for each individual genome. Using a beacon containing data from the 1000 Genomes Project, we demonstrate that the proposed strategies can effectively reduce re-identification risk in beacon-like datasets. PMID:28339683
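
    The binary yes/no allele-presence query described above, together with a limit on how many queries a user may make, can be sketched as follows. This is a deliberately simplified illustration, not the paper's method: the abstract describes budgeting accesses per user for each individual genome, whereas this sketch only caps the total number of queries per user. The class name `Beacon`, its parameters, and the variant tuples are all hypothetical.

```python
from collections import defaultdict


class Beacon:
    """Toy beacon answering allele-presence queries with a per-user query cap."""

    def __init__(self, variants, max_queries_per_user):
        # Each variant is a (chromosome, position, allele) tuple.
        self._variants = set(variants)
        self._max_queries = max_queries_per_user
        self._used = defaultdict(int)

    def query(self, user, chrom, pos, allele):
        """Return True or False for allele presence, or None once the budget is spent."""
        if self._used[user] >= self._max_queries:
            return None
        self._used[user] += 1
        return (chrom, pos, allele) in self._variants


if __name__ == "__main__":
    beacon = Beacon({("1", 100, "A"), ("2", 5000, "T")}, max_queries_per_user=2)
    print(beacon.query("alice", "1", 100, "A"))
    print(beacon.query("alice", "1", 100, "C"))
    print(beacon.query("alice", "2", 5000, "T"))
```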

  17. Genetic dissection of ethanol tolerance in the budding yeast Saccharomyces cerevisiae.

    PubMed

    Hu, X H; Wang, M H; Tan, T; Li, J R; Yang, H; Leach, L; Zhang, R M; Luo, Z W

    2007-03-01

    Uncovering genetic control of variation in ethanol tolerance in natural populations of yeast Saccharomyces cerevisiae is essential for understanding the evolution of fermentation, the dominant lifestyle of the species, and for improving efficiency of selection for strains with high ethanol tolerance, a character of great economic value for the brewing and biofuel industries. To date, as many as 251 genes have been predicted to be involved in influencing this character. Candidacy of these genes was determined from a tested phenotypic effect following gene knockout, from an induced change in gene function under an ethanol stress condition, or by mutagenesis. This article represents the first genomics approach for dissecting genetic variation in ethanol tolerance between two yeast strains with a highly divergent trait phenotype. We developed a simple but reliable experimental protocol for scoring the phenotype and a set of STR/SNP markers evenly covering the whole genome. We created a mapping population comprising 319 segregants from crossing the parental strains. On the basis of the data sets, we find that the tolerance trait has a high heritability and that additive genetic variance dominates genetic variation of the trait. Segregation at five QTL detected has explained approximately 50% of phenotypic variation; in particular, the major QTL mapped on yeast chromosome 9 has accounted for a quarter of the phenotypic variation. We integrated the QTL analysis with the predicted candidacy of ethanol resistance genes and found that only a few of these candidates fall in the QTL regions.

  18. Importance of multi-modal approaches to effectively identify cataract cases from electronic health records.

    PubMed

    Peissig, Peggy L; Rasmussen, Luke V; Berg, Richard L; Linneman, James G; McCarty, Catherine A; Waudby, Carol; Chen, Lin; Denny, Joshua C; Wilke, Russell A; Pathak, Jyotishman; Carrell, David; Kho, Abel N; Starren, Justin B

    2012-01-01

    There is increasing interest in using electronic health records (EHRs) to identify subjects for genomic association studies, due in part to the availability of large amounts of clinical data and the expected cost efficiencies of subject identification. We describe the construction and validation of an EHR-based algorithm to identify subjects with age-related cataracts. We used a multi-modal strategy consisting of structured database querying, natural language processing on free-text documents, and optical character recognition on scanned clinical images to identify cataract subjects and related cataract attributes. Extensive validation on 3657 subjects compared the multi-modal results to manual chart review. The algorithm was also implemented at participating electronic MEdical Records and GEnomics (eMERGE) institutions. An EHR-based cataract phenotyping algorithm was successfully developed and validated, resulting in positive predictive values (PPVs) >95%. The multi-modal approach increased the identification of cataract subject attributes by a factor of three compared to single-mode approaches while maintaining high PPV. Components of the cataract algorithm were successfully deployed at three other institutions with similar accuracy. A multi-modal strategy incorporating optical character recognition and natural language processing may increase the number of cases identified while maintaining similar PPVs. Such algorithms, however, require that the needed information be embedded within clinical documents. We have demonstrated that algorithms to identify and characterize cataracts can be developed utilizing data collected via the EHR. These algorithms provide a high level of accuracy even when implemented across multiple EHRs and institutional boundaries.

  19. Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences

    PubMed Central

    Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J

    2005-01-01

    Background Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when they separate the targets in a progressively binary way. Results We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences, and could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity and sub-sequence length and interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects. Conclusion The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences. PMID:15817134
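
    The idea of identifying sequences by the presence/absence of shared sub-sequences can be sketched with a small greedy search. This is an assumption-laden illustration: the abstract does not describe the authors' search procedure, and a greedy choice is not guaranteed to find a minimal set. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all sub-sequences of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def choose_subsequences(seqs, k):
    """Greedily pick k-mers whose presence/absence pattern separates every pair of sequences."""
    present = {name: kmers(s, k) for name, s in seqs.items()}
    candidates = sorted(set().union(*present.values()))
    unresolved = set(combinations(sorted(seqs), 2))
    chosen = []
    while unresolved:
        def gain(kmer):
            # Number of still-unresolved pairs that this k-mer tells apart.
            return sum((kmer in present[a]) != (kmer in present[b]) for a, b in unresolved)

        best = max(candidates, key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("sequences cannot be separated with sub-sequences of this length")
        chosen.append(best)
        unresolved = {(a, b) for a, b in unresolved
                      if (best in present[a]) == (best in present[b])}
    return chosen


def signature(seq, chosen):
    """Presence/absence pattern of the chosen sub-sequences, usable like a dichotomous key."""
    return tuple(sub in seq for sub in chosen)


if __name__ == "__main__":
    toy = {"a": "AAAA", "b": "AACC", "c": "CCCC", "d": "CCAA"}  # hypothetical sequences
    picked = choose_subsequences(toy, 2)
    for name, seq in toy.items():
        print(name, signature(seq, picked))
```

    Each chosen sub-sequence plays the role of one yes/no question in a dichotomous key; a probe set built this way needs far fewer probes than one specific probe per target when the sub-sequences split the targets roughly in half.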

  20. The Complete Chloroplast Genome of 17 Individuals of Pest Species Jacobaea vulgaris: SNPs, Microsatellites and Barcoding Markers for Population and Phylogenetic Studies

    PubMed Central

    Doorduin, Leonie; Gravendeel, Barbara; Lammers, Youri; Ariyurek, Yavuz; Chin-A-Woeng, Thomas; Vrieling, Klaas

    2011-01-01

    Invasive individuals from the pest species Jacobaea vulgaris show different allocation patterns in defence and growth compared with native individuals. To examine if these changes are caused by fast evolution, it is necessary to identify native source populations and compare these with invasive populations. For this purpose, we are in need of intraspecific polymorphic markers. We therefore sequenced the complete chloroplast genomes of 12 native and 5 invasive individuals of J. vulgaris with next generation sequencing and discovered single-nucleotide polymorphisms (SNPs) and microsatellites. This is the first study in which the chloroplast genome of that many individuals within a single species was sequenced. Thirty-two SNPs and 34 microsatellite regions were found. In none of the individuals were differences found between the inverted repeats. Furthermore, being the first chloroplast genome sequenced in the Senecioneae clade, we compared it with four other members of the Asteraceae family to identify new regions for phylogenetic inference within this clade and also within the Asteraceae family. Five markers (ndhC-trnV, ndhC-atpE, rps18-rpl20, clpP and psbM-trnD) contained parsimony-informative characters higher than 2%. Finally, we compared two procedures of preparing chloroplast DNA for next generation sequencing. PMID:21444340

  1. Gene Editing and Crop Improvement Using CRISPR-Cas9 System

    PubMed Central

    Arora, Leena; Narula, Alka

    2017-01-01

    Advancements in genome editing technologies have revolutionized the fields of functional genomics and crop improvement. CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 is a multipurpose technology for genetic engineering that relies on the complementarity of the guide RNA (gRNA) to a specific sequence and the Cas9 endonuclease activity. It has broadened the agricultural research area, bringing in new opportunities to develop novel plant varieties with deletion of detrimental traits or addition of significant characters. This RNA-guided genome editing technology is turning out to be a groundbreaking innovation in distinct branches of plant biology. CRISPR technology is constantly advancing, including options for various genetic manipulations like generating knockouts, making precise modifications, multiplex genome engineering, and activation and repression of target genes. The review highlights the progression throughout the CRISPR legacy. We have studied the rapid evolution of CRISPR/Cas9 tools with myriad functionalities, capabilities, and specialized applications. Among varied diligences, plant nutritional improvement, enhancement of plant disease resistance and production of drought tolerant plants are reviewed. The review also includes some information on traditional delivery methods of Cas9-gRNA complexes into plant cells and incorporates the advent of CRISPR ribonucleoproteins (RNPs) that came up as a solution to various limitations that prevailed with the plasmid-based CRISPR system. PMID:29167680

  2. Gene Editing and Crop Improvement Using CRISPR-Cas9 System.

    PubMed

    Arora, Leena; Narula, Alka

    2017-01-01

    Advancements in genome editing technologies have revolutionized the fields of functional genomics and crop improvement. CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 is a multipurpose technology for genetic engineering that relies on the complementarity of the guide RNA (gRNA) to a specific sequence and the Cas9 endonuclease activity. It has broadened the agricultural research area, bringing in new opportunities to develop novel plant varieties with deletion of detrimental traits or addition of significant characters. This RNA-guided genome editing technology is turning out to be a groundbreaking innovation in distinct branches of plant biology. CRISPR technology is constantly advancing, including options for various genetic manipulations like generating knockouts, making precise modifications, multiplex genome engineering, and activation and repression of target genes. The review highlights the progression throughout the CRISPR legacy. We have studied the rapid evolution of CRISPR/Cas9 tools with myriad functionalities, capabilities, and specialized applications. Among varied diligences, plant nutritional improvement, enhancement of plant disease resistance and production of drought tolerant plants are reviewed. The review also includes some information on traditional delivery methods of Cas9-gRNA complexes into plant cells and incorporates the advent of CRISPR ribonucleoproteins (RNPs) that came up as a solution to various limitations that prevailed with the plasmid-based CRISPR system.

  3. A Drosophila Toolkit for the Visualization and Quantification of Viral Replication Launched from Transgenic Genomes

    PubMed Central

    Wernet, Mathias F.; Klovstad, Martha; Clandinin, Thomas R.

    2014-01-01

    Arthropod RNA viruses pose a serious threat to human health, yet many aspects of their replication cycle remain incompletely understood. Here we describe a versatile Drosophila toolkit of transgenic, self-replicating genomes (‘replicons’) from Sindbis virus that allow rapid visualization and quantification of viral replication in vivo. We generated replicons expressing Luciferase for the quantification of viral replication, serving as useful new tools for large-scale genetic screens for identifying cellular pathways that influence viral replication. We also present a new binary system in which replication-deficient viral genomes can be activated ‘in trans’, through co-expression of an intact replicon contributing an RNA-dependent RNA polymerase. The utility of this toolkit for studying virus biology is demonstrated by the observation of stochastic exclusion between replicons expressing different fluorescent proteins, when co-expressed under control of the same cellular promoter. This process is analogous to ‘superinfection exclusion’ between virus particles in cell culture, a process that is incompletely understood. We show that viral polymerases strongly prefer to replicate the genome that encoded them, and that almost invariably only a single virus genome is stochastically chosen for replication in each cell. Our in vivo system now makes this process amenable to detailed genetic dissection. Thus, this toolkit allows the cell-type specific, quantitative study of viral replication in a genetic model organism, opening new avenues for molecular, genetic and pharmacological dissection of virus biology and tool development. PMID:25386852

  4. CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells.

    PubMed

    Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H

    2015-09-22

    Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis.

  5. Genome-Based Comparison of Clostridioides difficile: Average Amino Acid Identity Analysis of Core Genomes.

    PubMed

    Cabal, Adriana; Jun, Se-Ran; Jenjaroenpun, Piroon; Wanchai, Visanu; Nookaew, Intawat; Wongsurawat, Thidathip; Burgess, Mary J; Kothari, Atul; Wassenaar, Trudy M; Ussery, David W

    2018-02-14

    Infections due to Clostridioides difficile (previously known as Clostridium difficile) are a major problem in hospitals, where cases can be caused by community-acquired strains as well as by nosocomial spread. Whole genome sequences from clinical samples contain a lot of information, but it needs to be analyzed and compared in such a way that the outcome is useful for clinicians or epidemiologists. Here, we compare 663 publicly available complete genome sequences of C. difficile using average amino acid identity (AAI) scores. This analysis revealed that most of these genomes (640, 96.5%) clearly belong to the same species, while the remaining 23 genomes produce four distinct clusters within the Clostridioides genus. The main C. difficile cluster can be further divided into sub-clusters, depending on the chosen cutoff. We demonstrate that MLST, either based on partial or full gene-length, results in biased estimates of genetic differences and does not capture the true degree of similarity or differences of complete genomes. Presence of genes coding for C. difficile toxins A and B (ToxA/B), as well as the binary C. difficile toxin (CDT), was deduced from their unique PfamA domain architectures. Out of the 663 C. difficile genomes, 535 (80.7%) contained at least one copy of ToxA or ToxB, while these genes were missing from 128 genomes. Although some clusters were enriched for toxin presence, these genes are variably present in a given genetic background. The CDT genes were found in 191 genomes, which were restricted to a few clusters only, and only one cluster lacked the toxin A/B genes consistently. A total of 310 genomes contained ToxA/B without CDT (47%). Further, published metagenomic data from stools were used to assess the presence of C. difficile sequences in blinded cases of C. difficile infection (CDI) and controls, to test if metagenomic analysis is sensitive enough to detect the pathogen, and to establish strain relationships between cases from the same hospital. We conclude that metagenomics can contribute to the identification of CDI and can assist in characterization of the most probable causative strain in CDI patients.
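
    As a quick consistency check on the figures quoted in the abstract above, the short Python sketch below recomputes the reported percentages from the reported genome counts. The counts are taken from the abstract; the variable and function names are illustrative only and do not come from the study.

```python
# Genome counts as quoted in the abstract (663 C. difficile genomes in total).
TOTAL_GENOMES = 663
SAME_SPECIES = 640        # genomes in the main C. difficile cluster
WITH_TOXA_OR_TOXB = 535   # genomes with at least one copy of ToxA or ToxB
TOXAB_WITHOUT_CDT = 310   # genomes with ToxA/B but no CDT


def percent(count: int, total: int = TOTAL_GENOMES) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Complements stated in the abstract: 23 outlier genomes, 128 without ToxA/B.
outlier_genomes = TOTAL_GENOMES - SAME_SPECIES
toxin_negative = TOTAL_GENOMES - WITH_TOXA_OR_TOXB

print(percent(SAME_SPECIES), outlier_genomes)
print(percent(WITH_TOXA_OR_TOXB), toxin_negative)
print(percent(TOXAB_WITHOUT_CDT))
```

    The recomputed values agree with the abstract at the precision it reports (the ToxA/B-without-CDT share is about 46.8%, which the abstract rounds to 47%).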

  6. Retrospective Binary-Trait Association Test Elucidates Genetic Architecture of Crohn Disease

    PubMed Central

    Jiang, Duo; Zhong, Sheng; McPeek, Mary Sara

    2016-01-01

    In genetic association testing, failure to properly control for population structure can lead to severely inflated type 1 error and power loss. Meanwhile, adjustment for relevant covariates is often desirable and sometimes necessary to protect against spurious association and to improve power. Many recent methods to account for population structure and covariates are based on linear mixed models (LMMs), which are primarily designed for quantitative traits. For binary traits, however, LMM is a misspecified model and can lead to deteriorated performance. We propose CARAT, a binary-trait association testing approach based on a mixed-effects quasi-likelihood framework, which exploits the dichotomous nature of the trait and achieves computational efficiency through estimating equations. We show in simulation studies that CARAT consistently outperforms existing methods and maintains high power in a wide range of population structure settings and trait models. Furthermore, CARAT is based on a retrospective approach, which is robust to misspecification of the phenotype model. We apply our approach to a genome-wide analysis of Crohn disease, in which we replicate association with 17 previously identified regions. Moreover, our analysis on 5p13.1, an extensively reported region of association, shows evidence for the presence of multiple independent association signals in the region. This example shows how CARAT can leverage known disease risk factors to shed light on the genetic architecture of complex traits. PMID:26833331

  7. Rhizome of life, catastrophes, sequence exchanges, gene creations, and giant viruses: how microbial genomics challenges Darwin

    PubMed Central

    Merhej, Vicky; Raoult, Didier

    2012-01-01

    Darwin's theory about the evolution of species has been the object of considerable dispute. In this review, we have described seven key principles in Darwin's book The Origin of Species and tried to present how genomics challenges each of these concepts and improves our knowledge about evolution. Darwin believed that species evolution consists of a positive directional selection ensuring the “survival of the fittest.” The most developed state of the species is characterized by increasing complexity. Darwin proposed the theory of “descent with modification” according to which all species evolve from a single common ancestor through a gradual process of small modification of their vertical inheritance. Finally, the process of evolution can be depicted in the form of a tree. However, microbial genomics showed that evolution is better described as the “biological changes over time.” The mode of change is not unidirectional and does not necessarily favor advantageous mutations to increase fitness; it is rather subject to random selection as a result of catastrophic stochastic processes. Complexity is not necessarily the completion of development: several complex organisms have gone extinct and many microbes including bacteria with intracellular lifestyle have streamlined highly effective genomes. Genomes evolve through large events of gene deletions, duplications, insertions, and genome rearrangements rather than a gradual adaptive process. Genomes are dynamic and chimeric entities with gene repertoires that result from vertical and horizontal acquisitions as well as de novo gene creation. The chimeric character of microbial genomes excludes the possibility of finding a single common ancestor for all the genes recorded currently. Genomes are collections of genes with different evolutionary histories that cannot be represented by a single tree of life (TOL). A forest, a network or a rhizome of life may be more accurate to represent evolutionary relationships among species. PMID:22973559

  8. The complete mitochondrial genomes of two band-winged grasshoppers, Gastrimargus marmoratus and Oedaleus asiaticus

    PubMed Central

    Ma, Chuan; Liu, Chunxiang; Yang, Pengcheng; Kang, Le

    2009-01-01

    Background The two closely related species of band-winged grasshoppers, Gastrimargus marmoratus and Oedaleus asiaticus, display significant differences in distribution, biological characteristics and habitat preferences. They are so similar to their respective congeneric species that it is difficult to differentiate them from other species within each genus. Hoppers of the two species have quite similar morphologies to that of Locusta migratoria, hence causing confusion in species identification. Thus we determined and compared the mitochondrial genomes of G. marmoratus and O. asiaticus to address these questions. Results The complete mitochondrial genomes of G. marmoratus and O. asiaticus are 15,924 bp and 16,259 bp in size, respectively, with O. asiaticus being the largest among all known mitochondrial genomes in Orthoptera. Both mitochondrial genomes contain a standard set of 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and an A+T-rich region in the same order as those of the other analysed caeliferan species, but different from those of the ensiferan species by the rearrangement of trnD and trnK. The putative initiation codon for the cox1 gene in the two species is ATC. The presence of different sized tandem repeats in the A+T-rich region leads to size variation between their mitochondrial genomes. Except for nad2, nad4L, and nad6, most of the caeliferan mtDNA genes exhibit low levels of divergence. In phylogenetic analyses, the species from the suborder Caelifera form a monophyletic group, as is the case for the Ensifera. Furthermore, the two suborders cluster as sister groups, supporting the monophyly of Orthoptera. Conclusion The mitochondrial genomes of both G. marmoratus and O. asiaticus harbor the typical 37 genes and an A+T-rich region, exhibiting similar characters to those of other grasshopper species. Characterization of the two mitochondrial genomes has enriched our knowledge on mitochondrial genomes of Orthoptera. 
PMID:19361334

  9. The neomuran origin of archaebacteria, the negibacterial root of the universal tree and bacterial megaclassification.

    PubMed

    Cavalier-Smith, T

    2002-01-01

    Prokaryotes constitute a single kingdom, Bacteria, here divided into two new subkingdoms: Negibacteria, with a cell envelope of two distinct genetic membranes, and Unibacteria, comprising the new phyla Archaebacteria and Posibacteria, with only one. Other new bacterial taxa are established in a revised higher-level classification that recognizes only eight phyla and 29 classes. Morphological, palaeontological and molecular data are integrated into a unified picture of large-scale bacterial cell evolution despite occasional lateral gene transfers. Archaebacteria and eukaryotes comprise the clade neomura, with many common characters, notably obligately co-translational secretion of N-linked glycoproteins, signal recognition particle with 7S RNA and translation-arrest domain, protein-spliced tRNA introns, eight-subunit chaperonin, prefoldin, core histones, small nucleolar ribonucleoproteins (snoRNPs), exosomes and similar replication, repair, transcription and translation machinery. Eubacteria (posibacteria and negibacteria) are paraphyletic, neomura having arisen from Posibacteria within the new subphylum Actinobacteria (possibly from the new class Arabobacteria, from which eukaryotic cholesterol biosynthesis probably came). Replacement of eubacterial peptidoglycan by glycoproteins and adaptation to thermophily are the keys to neomuran origins. All 19 common neomuran character suites probably arose essentially simultaneously during the radical modification of an actinobacterium. At least 11 were arguably adaptations to thermophily. Most unique archaebacterial characters (prenyl ether lipids; flagellar shaft of glycoprotein, not flagellin; DNA-binding protein 10b; specially modified tRNA; absence of Hsp90) were subsequent secondary adaptations to hyperthermophily and/or hyperacidity. The insertional origin of protein-spliced tRNA introns and an insertion in proton-pumping ATPase also support the origin of neomura from eubacteria. 
Molecular co-evolution between histones and DNA-handling proteins, and in novel protein initiation and secretion machineries, caused quantum evolutionary shifts in their properties in stem neomura. Proteasomes probably arose in the immediate common ancestor of neomura and Actinobacteria. Major gene losses (e.g. peptidoglycan synthesis, hsp90, secA) and genomic reduction were central to the origin of archaebacteria. Ancestral archaebacteria were probably heterotrophic, anaerobic, sulphur-dependent hyperthermoacidophiles; methanogenesis and halophily are secondarily derived. Multiple lateral gene transfers from eubacteria helped secondary archaebacterial adaptations to mesophily and genome re-expansion. The origin from a drastically altered actinobacterium of neomura, and the immediately subsequent simultaneous origins of archaebacteria and eukaryotes, are the most extreme and important cases of quantum evolution since cells began. All three strikingly exemplify De Beer's principle of mosaic evolution: the fact that, during major evolutionary transformations, some organismal characters are highly innovative and change remarkably swiftly, whereas others are largely static, remaining conservatively ancestral in nature. This phenotypic mosaicism creates character distributions among taxa that are puzzling to those mistakenly expecting uniform evolutionary rates among characters and lineages. The mixture of novel (neomuran or archaebacterial) and ancestral eubacteria-like characters in archaebacteria primarily reflects such vertical mosaic evolution, not chimaeric evolution by lateral gene transfer. No symbiogenesis occurred. Quantum evolution of the basic neomuran characters, and between sister paralogues in gene duplication trees, makes many sequence trees exaggerate greatly the apparent age of archaebacteria. 
Fossil evidence is compelling for the extreme antiquity of eubacteria [over 3500 million years (My)] but, like their eukaryote sisters, archaebacteria probably arose only 850 My ago. Negibacteria are the most ancient, radiating rapidly into six phyla. Evidence from molecular sequences, ultrastructure, evolution of photosynthesis, envelope structure and chemistry and motility mechanisms fits the view that the cenancestral cell was a photosynthetic negibacterium, specifically an anaerobic green non-sulphur bacterium, and that the universal tree is rooted at the divergence between sulphur and non-sulphur green bacteria. The negibacterial outer membrane was lost once only in the history of life, when Posibacteria arose about 2800 My ago after their ancestors diverged from Cyanobacteria.

  10. Archaeal Clusters of Orthologous Genes (arCOGs): An Update and Application for Analysis of Shared Features between Thermococcales, Methanococcales, and Methanobacteriales

    PubMed Central

    Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.

    2015-01-01

    With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for orthology identification combined with extensive manual curation, including incorporation of the results of several completed and ongoing research projects in archaeal genomics. A new level of classification is introduced, superclusters that unite two or more arCOGs and more completely reflect gene family evolution than individual, disconnected arCOGs. Assessment of the current archaeal genome annotation in public databases indicates that consistent use of arCOGs can significantly improve the annotation quality. In addition to their utility for genome annotation, arCOGs also are a platform for phylogenomic analysis. We explore this aspect of arCOGs by performing a phylogenomic study of the Thermococci that are traditionally viewed as the basal branch of the Euryarchaeota. The results of phylogenomic analysis that involved both comparison of multiple phylogenetic trees and a search for putative derived shared characters by using phyletic patterns extracted from the arCOGs reveal a likely evolutionary relationship between the Thermococci, Methanococci, and Methanobacteria. The arCOGs are expected to be instrumental for a comprehensive phylogenomic study of the archaea. PMID:25764277

  11. Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits.

    PubMed

    Larsson, John; Nylander, Johan Aa; Bergman, Birgitta

    2011-06-30

    Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential; while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities. 
A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets.

  12. Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits

    PubMed Central

    2011-01-01

    Background Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. Results A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. Conclusions The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential; while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities. 
A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets. PMID:21718514

  13. The Echinococcus canadensis (G7) genome: a key knowledge of parasitic platyhelminth human diseases.

    PubMed

    Maldonado, Lucas L; Assis, Juliana; Araújo, Flávio M Gomes; Salim, Anna C M; Macchiaroli, Natalia; Cucher, Marcela; Camicia, Federico; Fox, Adolfo; Rosenzvit, Mara; Oliveira, Guilherme; Kamenetzky, Laura

    2017-02-27

    The parasite Echinococcus canadensis (G7) (phylum Platyhelminthes, class Cestoda) is one of the causative agents of echinococcosis. Echinococcosis is a worldwide chronic zoonosis affecting humans as well as domestic and wild mammals, which has been reported as a prioritized neglected disease by the World Health Organisation. No genomic data, comparative genomic analyses or efficient therapeutic and diagnostic tools are available for this severe disease. The information presented in this study will help to understand the peculiar biological characters and to design species-specific control tools. We sequenced, assembled and annotated the 115-Mb genome of E. canadensis (G7). Comparative genomic analyses using whole genome data of three Echinococcus species not only confirmed the status of E. canadensis (G7) as a separate species but also demonstrated a high nucleotide sequences divergence in relation to E. granulosus (G1). The E. canadensis (G7) genome contains 11,449 genes with a core set of 881 orthologs shared among five cestode species. Comparative genomics revealed that there are more single nucleotide polymorphisms (SNPs) between E. canadensis (G7) and E. granulosus (G1) than between E. canadensis (G7) and E. multilocularis. This result was unexpected since E. canadensis (G7) and E. granulosus (G1) were considered to belong to the species complex E. granulosus sensu lato. We described SNPs in known drug targets and metabolism genes in the E. canadensis (G7) genome. Regarding gene regulation, we analysed three particular features: CpG island distribution along the three Echinococcus genomes, DNA methylation system and small RNA pathway. The results suggest the occurrence of yet unknown gene regulation mechanisms in Echinococcus. This is the first work that addresses Echinococcus comparative genomics. The resources presented here will promote the study of mechanisms of parasite development as well as new tools for drug discovery. 
The availability of a high-quality genome assembly is critical for fully exploring the biology of a pathogenic organism. The E. canadensis (G7) genome presented in this study provides a unique opportunity to address the genetic diversity among the genus Echinococcus and its particular developmental features. At present, there is no unequivocal taxonomic classification of Echinococcus species; however, the genome-wide SNPs analysis performed here revealed the phylogenetic distance among these three Echinococcus species. Additional cestode genomes need to be sequenced to be able to resolve their phylogeny.

  14. QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments.

    PubMed

    Pelgas, Betty; Bousquet, Jean; Meirmans, Patrick G; Ritland, Kermit; Isabel, Nathalie

    2011-03-10

    The genomic architecture of bud phenology and height growth remains poorly known in most forest trees. In non model species, QTL studies have shown limited application because most often QTL data could not be validated from one experiment to another. The aim of our study was to overcome this limitation by basing QTL detection on the construction of genetic maps highly-enriched in gene markers, and by assessing QTLs across pedigrees, years, and environments. Four saturated individual linkage maps representing two unrelated mapping populations of 260 and 500 clonally replicated progeny were assembled from 471 to 570 markers, including from 283 to 451 gene SNPs obtained using a multiplexed genotyping assay. Thence, a composite linkage map was assembled with 836 gene markers. For individual linkage maps, a total of 33 distinct quantitative trait loci (QTLs) were observed for bud flush, 52 for bud set, and 52 for height growth. For the composite map, the corresponding numbers of QTL clusters were 11, 13, and 10. About 20% of QTLs were replicated between the two mapping populations and nearly 50% revealed spatial and/or temporal stability. Three to four occurrences of overlapping QTLs between characters were noted, indicating regions with potential pleiotropic effects. Moreover, some of the genes involved in the QTLs were also underlined by recent genome scans or expression profile studies. Overall, the proportion of phenotypic variance explained by each QTL ranged from 3.0 to 16.4% for bud flush, from 2.7 to 22.2% for bud set, and from 2.5 to 10.5% for height growth. Up to 70% of the total character variance could be accounted for by QTLs for bud flush or bud set, and up to 59% for height growth. This study provides a basic understanding of the genomic architecture related to bud flush, bud set, and height growth in a conifer species, and a useful indicator to compare with Angiosperms. 
It will serve as a basic reference to functional and association genetic studies of adaptation and growth in Picea taxa. The putative QTNs identified will be tested for associations in natural populations, with potential applications in molecular breeding and gene conservation programs. QTLs mapping consistently across years and environments could also be the most important targets for breeding, because they represent genomic regions that may be least affected by G × E interactions.

  15. QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments

    PubMed Central

    2011-01-01

    Background The genomic architecture of bud phenology and height growth remains poorly known in most forest trees. In non model species, QTL studies have shown limited application because most often QTL data could not be validated from one experiment to another. The aim of our study was to overcome this limitation by basing QTL detection on the construction of genetic maps highly-enriched in gene markers, and by assessing QTLs across pedigrees, years, and environments. Results Four saturated individual linkage maps representing two unrelated mapping populations of 260 and 500 clonally replicated progeny were assembled from 471 to 570 markers, including from 283 to 451 gene SNPs obtained using a multiplexed genotyping assay. Thence, a composite linkage map was assembled with 836 gene markers. For individual linkage maps, a total of 33 distinct quantitative trait loci (QTLs) were observed for bud flush, 52 for bud set, and 52 for height growth. For the composite map, the corresponding numbers of QTL clusters were 11, 13, and 10. About 20% of QTLs were replicated between the two mapping populations and nearly 50% revealed spatial and/or temporal stability. Three to four occurrences of overlapping QTLs between characters were noted, indicating regions with potential pleiotropic effects. Moreover, some of the genes involved in the QTLs were also underlined by recent genome scans or expression profile studies. Overall, the proportion of phenotypic variance explained by each QTL ranged from 3.0 to 16.4% for bud flush, from 2.7 to 22.2% for bud set, and from 2.5 to 10.5% for height growth. Up to 70% of the total character variance could be accounted for by QTLs for bud flush or bud set, and up to 59% for height growth. Conclusions This study provides a basic understanding of the genomic architecture related to bud flush, bud set, and height growth in a conifer species, and a useful indicator to compare with Angiosperms. 
It will serve as a basic reference to functional and association genetic studies of adaptation and growth in Picea taxa. The putative QTNs identified will be tested for associations in natural populations, with potential applications in molecular breeding and gene conservation programs. QTLs mapping consistently across years and environments could also be the most important targets for breeding, because they represent genomic regions that may be least affected by G × E interactions. PMID:21392393

  16. M13-Tailed Simple Sequence Repeat (SSR) Markers in Studies of Genetic Diversity and Population Structure of Common Oat Germplasm.

    PubMed

    Onyśk, Agnieszka; Boczkowska, Maja

    2017-01-01

    Simple Sequence Repeat (SSR) markers are one of the most frequently used molecular markers in studies of crop diversity and population structure. This is due to their uniform distribution in the genome, the high polymorphism, reproducibility, and codominant character. Additional advantages are the possibility of automatic analysis and simple interpretation of the results. The M13 tagged PCR reaction significantly reduces the costs of analysis by the automatic genetic analyzers. Here, we also disclose a short protocol of SSR data analysis.

  17. The complete mitochondrial genome sequence of Aesopia cornuta (Pleuronectiformes: Soleidae).

    PubMed

    Wang, Shu-Ying; Shi, Wei; Wang, Zhong-Ming; Gong, Li; Kong, Xiao-Yu

    2015-02-01

    Aesopia cornuta belongs to the family Soleidae of Pleuronectiformes, and its morphological characters are very similar to those of Zebrias. In this article, we sequenced, characterized, and compared the complete mitogenome of A. cornuta for the first time. The genome is 16,737 base pairs in length, and typically consists of 37 genes, including 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, as well as a putative L-strand replication origin and a putative control region. The gene organization is identical to that of typical bony fishes. The overall base composition is 29.1, 28.3, 26.8 and 15.8% for C, A, T and G, respectively, with a slight AT bias of 55.1%. This result is expected to contribute to understanding the systematic evolution of the genus Aesopia and further taxonomic and phylogenetic studies of Soleidae and Pleuronectiformes.
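
    The gene tally and AT bias quoted in the abstract above follow directly from the listed components. The Python sketch below works through that arithmetic as an illustration; the numbers come from the abstract, while the dictionary and variable names are assumptions made for this example.

```python
# Base composition (percent) of the A. cornuta mitogenome, as quoted above.
base_composition = {"C": 29.1, "A": 28.3, "T": 26.8, "G": 15.8}

# Gene categories quoted above; they should add up to the stated 37 genes.
gene_counts = {"protein_coding": 13, "rRNA": 2, "tRNA": 22}

# AT content is the sum of the A and T percentages; GC is the remainder.
at_content = round(base_composition["A"] + base_composition["T"], 1)
gc_content = round(base_composition["G"] + base_composition["C"], 1)

# The four percentages should account for the whole genome.
total_percent = round(sum(base_composition.values()), 1)
total_genes = sum(gene_counts.values())

print(f"AT = {at_content}%, GC = {gc_content}%, total = {total_percent}%")
print(f"genes = {total_genes}")
```

    Rounding to one decimal place mirrors the precision of the reported percentages and avoids spurious floating-point digits in the sums.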

  18. Chemometrical characterization of four Italian rice varieties based on genetic and chemical analyses.

    PubMed

    Brandolini, Vincenzo; Coïsson, Jean Daniel; Tedeschi, Paola; Barile, Daniela; Cereti, Elisabetta; Maietti, Annalisa; Vecchiati, Giorgio; Martelli, Aldo; Arlorio, Marco

    2006-12-27

    This paper describes a method for achieving qualitative identification of four rice varieties from two different Italian regions. To estimate the presence of genetic diversity among the four rice varieties, we used polymerase chain reaction-randomly amplified polymorphic DNA (PCR-RAPD) markers, and to elucidate whether a relationship exists between the ground and the specific characteristics of the product, we studied proximate composition, fatty acid composition, mineral content, and total antioxidant capacity. Using principal component analysis on genomic and compositional data, we were able to classify rice samples according to their variety and their district of production. This work also examined the discrimination ability of different parameters. It was found that genomic data give the best discrimination based on varieties, indicating that RAPD assays could be useful in discriminating among closely related species, while compositional analyses do not depend on the genetic characters only but are related to the production area.

  19. Confounders of mutation-rate estimators: selection and phenotypic lag in Thermus thermophilus

    PubMed Central

    Kissling, Grace E.; Grogan, Dennis W.; Drake, John W.

    2015-01-01

    In a recent description of the rate and character of spontaneous mutation in the hyperthermophilic bacterium Thermus thermophilus, the mutation rate was observed to be substantially lower than seen in several mesophiles. Subsequently, a report appeared indicating that this bacterium maintains an average of about 4.5 genomes per cell. This number of genomes might result in a segregation lag for the expression of a recessive mutation and might therefore lead to an underestimate of the rate of mutation. Here we describe some kinds of problems that may arise when estimating mutation rates and outline ways to adjust the rates accordingly. The emphasis is mainly on differential rates of growth of mutants versus their parents and on various kinds of phenotypic lag. We then apply these methods to the T. thermophilus data and conclude that there is as yet no reliable impact on a previously described rate. PMID:23916418

  20. Phylogenetic relationships in Epidendroideae (Orchidaceae), one of the great flowering plant radiations: progressive specialization and diversification.

    PubMed

    Freudenstein, John V; Chase, Mark W

    2015-03-01

    The largest subfamily of orchids, Epidendroideae, represents one of the most significant diversifications among flowering plants in terms of pollination strategy, vegetative adaptation and number of species. Although many groups in the subfamily have been resolved, significant relationships in the tree remain unclear, limiting conclusions about diversification and creating uncertainty in the classification. This study brings together DNA sequences from nuclear, plastid and mitochondrial genomes in order to clarify relationships, to test associations of key characters with diversification and to improve the classification. Sequences from seven loci were concatenated in a supermatrix analysis for 312 genera representing most of epidendroid diversity. Maximum-likelihood and parsimony analyses were performed on this matrix and on subsets of the data to generate trees and to investigate the effect of missing values. Statistical character-associated diversification analyses were performed. Likelihood and parsimony analyses yielded highly resolved trees that are in strong agreement and show significant support for many key clades. Many previously proposed relationships among tribes and subtribes are supported, and some new relationships are revealed. Analyses of subsets of the data suggest that the relatively high number of missing data for the full analysis is not problematic. Diversification analyses show that epiphytism is most strongly associated with diversification among epidendroids, followed by expansion into the New World and anther characters that are involved with pollinator specificity, namely early anther inflexion, cellular pollinium stalks and the superposed pollinium arrangement. All tested characters show significant association with speciation in Epidendroideae, suggesting that no single character accounts for the success of this group. Rather, it appears that a succession of key features appeared that have contributed to diversification, sometimes in parallel. © The Author 2015. Published by Oxford University Press on behalf of the Annals of Botany Company.

  1. Revealing the Character of Orbits in a Binary System Consisting of a Primary Galaxy and a Satellite Companion

    NASA Astrophysics Data System (ADS)

    Zotos, Euaggelos E.

    2013-02-01

    In this article, we present a galactic gravitational model of three degrees of freedom (3D), in order to study and reveal the character of the orbits of the stars, in a binary stellar system composed of a primary quiet or active galaxy and a small satellite companion galaxy. Our main dynamical analysis will be focused on the behaviour of the primary galaxy. We investigate in detail the regular or chaotic nature of motion, in two different cases: (i) the time-independent model in both 2D and 3D dynamical systems and (ii) the time-evolving 3D model. For the description of the structure of the 2D system, we use the classical method of the Poincaré (x, p_x), y = 0, p_y < 0 phase plane. In order to study the structure of the phase space of the 3D system, we take sections in the plane y = 0 of the 3D orbits, whose initial conditions differ from the plane parent periodic orbits, only by the z component. The set of the four-dimensional points in the (x, p_x, z, p_z) phase space is projected on the (z, p_z) plane. The maximum Lyapunov characteristic exponent is used in order to make an estimation of the chaoticity of our galactic system, in both 2D and 3D dynamical models. Our numerical calculations indicate that the percentage of the chaotic orbits increases when the primary galaxy has a dense and massive nucleus. The presence of the dense galactic core also increases the stellar velocities near the center of the galaxy. Moreover, for small values of the distance R between the two bodies, low-energy stars display chaotic motion, near the central region of the galaxy, while for larger values of the distance R, the motion in active galaxies is entirely regular for low-energy stars. Our simulations suggest that in galaxies with a satellite companion, the chaotic nature of motion is not only a result of the galactic interaction between the primary galaxy and its companion, but also a result caused by the presence of the dense nucleus in the core of the primary galaxy. Theoretical arguments are presented in order to support and interpret the numerically derived outcomes. Furthermore, we follow the 3D evolution of the primary galaxy, when mass is transported adiabatically from the disk to the nucleus. Our numerical results are in satisfactory agreement with observational data obtained from the M51-type binary stellar systems. A comparison between the present research and similar and earlier work is also made.

  2. Extensive gene tree discordance and hemiplasy shaped the genomes of North American columnar cacti.

    PubMed

    Copetti, Dario; Búrquez, Alberto; Bustamante, Enriquena; Charboneau, Joseph L M; Childs, Kevin L; Eguiarte, Luis E; Lee, Seunghee; Liu, Tiffany L; McMahon, Michelle M; Whiteman, Noah K; Wing, Rod A; Wojciechowski, Martin F; Sanderson, Michael J

    2017-11-07

    Few clades of plants have proven as difficult to classify as cacti. One explanation may be an unusually high level of convergent and parallel evolution (homoplasy). To evaluate support for this phylogenetic hypothesis at the molecular level, we sequenced the genomes of four cacti in the especially problematic tribe Pachycereeae, which contains most of the large columnar cacti of Mexico and adjacent areas, including the iconic saguaro cactus (Carnegiea gigantea) of the Sonoran Desert. We assembled a high-coverage draft genome for saguaro and lower coverage genomes for three other genera of tribe Pachycereeae (Pachycereus, Lophocereus, and Stenocereus) and a more distant outgroup cactus, Pereskia. We used these to construct 4,436 orthologous gene alignments. Species tree inference consistently returned the same phylogeny, but gene tree discordance was high: 37% of gene trees having at least 90% bootstrap support conflicted with the species tree. Evidently, discordance is a product of long generation times and moderately large effective population sizes, leading to extensive incomplete lineage sorting (ILS). In the best supported gene trees, 58% of apparent homoplasy at amino sites in the species tree is due to gene tree-species tree discordance rather than parallel substitutions in the gene trees themselves, a phenomenon termed "hemiplasy." The high rate of genomic hemiplasy may contribute to apparent parallelisms in phenotypic traits, which could confound understanding of species relationships and character evolution in cacti. Published under the PNAS license.

  3. Extensive gene tree discordance and hemiplasy shaped the genomes of North American columnar cacti

    PubMed Central

    Búrquez, Alberto; Bustamante, Enriquena; Charboneau, Joseph L. M.; Childs, Kevin L.; Eguiarte, Luis E.; Lee, Seunghee; Liu, Tiffany L.; McMahon, Michelle M.; Whiteman, Noah K.; Wing, Rod A.; Wojciechowski, Martin F.; Sanderson, Michael J.

    2017-01-01

    Few clades of plants have proven as difficult to classify as cacti. One explanation may be an unusually high level of convergent and parallel evolution (homoplasy). To evaluate support for this phylogenetic hypothesis at the molecular level, we sequenced the genomes of four cacti in the especially problematic tribe Pachycereeae, which contains most of the large columnar cacti of Mexico and adjacent areas, including the iconic saguaro cactus (Carnegiea gigantea) of the Sonoran Desert. We assembled a high-coverage draft genome for saguaro and lower coverage genomes for three other genera of tribe Pachycereeae (Pachycereus, Lophocereus, and Stenocereus) and a more distant outgroup cactus, Pereskia. We used these to construct 4,436 orthologous gene alignments. Species tree inference consistently returned the same phylogeny, but gene tree discordance was high: 37% of gene trees having at least 90% bootstrap support conflicted with the species tree. Evidently, discordance is a product of long generation times and moderately large effective population sizes, leading to extensive incomplete lineage sorting (ILS). In the best supported gene trees, 58% of apparent homoplasy at amino sites in the species tree is due to gene tree-species tree discordance rather than parallel substitutions in the gene trees themselves, a phenomenon termed “hemiplasy.” The high rate of genomic hemiplasy may contribute to apparent parallelisms in phenotypic traits, which could confound understanding of species relationships and character evolution in cacti. PMID:29078296

  4. Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences

    PubMed Central

    Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

    2016-01-01

    Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. PMID:27289096

  5. Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics.

    PubMed

    Straub, Shannon C K; Parks, Matthew; Weitemier, Kevin; Fishbein, Mark; Cronn, Richard C; Liston, Aaron

    2012-02-01

    Just as Sanger sequencing did more than 20 years ago, next-generation sequencing (NGS) is poised to revolutionize plant systematics. By combining multiplexing approaches with NGS throughput, systematists may no longer need to choose between more taxa or more characters. Here we describe a genome skimming (shallow sequencing) approach for plant systematics. Through simulations, we evaluated optimal sequencing depth and performance of single-end and paired-end short read sequences for assembly of nuclear ribosomal DNA (rDNA) and plastomes and addressed the effect of divergence on reference-guided plastome assembly. We also used simulations to identify potential phylogenetic markers from low-copy nuclear loci at different sequencing depths. We demonstrated the utility of genome skimming through phylogenetic analysis of the Sonoran Desert clade (SDC) of Asclepias (Apocynaceae). Paired-end reads performed better than single-end reads. Minimum sequencing depths for high quality rDNA and plastome assemblies were 40× and 30×, respectively. Divergence from the reference significantly affected plastome assembly, but relatively similar references are available for most seed plants. Deeper rDNA sequencing is necessary to characterize intragenomic polymorphism. The low-copy fraction of the nuclear genome was readily surveyed, even at low sequencing depths. Nearly 160000 bp of sequence from three organelles provided evidence of phylogenetic incongruence in the SDC. Adoption of NGS will facilitate progress in plant systematics, as whole plastome and rDNA cistrons, partial mitochondrial genomes, and low-copy nuclear markers can now be efficiently obtained for molecular phylogenetics studies.

  6. Fast ancestral gene order reconstruction of genomes with unequal gene content.

    PubMed

    Feijão, Pedro; Araujo, Eloi

    2016-11-11

    During evolution, genomes are modified by large scale structural events, such as rearrangements, deletions or insertions of large blocks of DNA. Of particular interest, in order to better understand how this type of genomic evolution happens, is the reconstruction of ancestral genomes, given a phylogenetic tree with extant genomes at its leaves. One way of solving this problem is to assume a rearrangement model, such as Double Cut and Join (DCJ), and find a set of ancestral genomes that minimizes the number of events on the input tree. Since this problem is NP-hard for most rearrangement models, exact solutions are practical only for small instances, and heuristics have to be used for larger datasets. This type of approach can be called event-based. Another common approach is based on finding conserved structures between the input genomes, such as adjacencies between genes, possibly also assigning weights that indicate a measure of confidence or probability that this particular structure is present on each ancestral genome, and then finding a set of non-conflicting adjacencies that optimize some given function, usually trying to maximize total weight and minimize character changes in the tree. We call methods of this type homology-based. In previous work, we proposed an ancestral reconstruction method that combines homology- and event-based ideas, using the concept of intermediate genomes that arise in DCJ rearrangement scenarios. This method showed a better rate of correctly reconstructed adjacencies than other methods, while also being faster, since the use of intermediate genomes greatly reduces the search space. Here, we generalize the intermediate genome concept to genomes with unequal gene content, extending our method to account for gene insertions and deletions of any length. In many of the simulated datasets, our proposed method had better results than MLGO and MGRA, two state-of-the-art algorithms for ancestral reconstruction with unequal gene content, while running much faster, making it more scalable to larger datasets. Studying ancestral reconstruction problems in a new light, using the concept of intermediate genomes, allows the design of very fast algorithms by greatly reducing the solution search space, while also giving very good results. The algorithms introduced in this paper were implemented in an open-source software called RINGO (ancestral Reconstruction with INtermediate GenOmes), available at https://github.com/pedrofeijao/RINGO .
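
    The homology-based idea of looking for conserved adjacencies can be illustrated with a small sketch. It assumes a common textbook encoding (not taken from RINGO itself): a linear chromosome is a list of signed integers, each gene has a tail ("t") and a head ("h") extremity, and an adjacency is the unordered pair of extremities that touch between two consecutive genes. All names below are hypothetical.

```python
def extremities(gene):
    """Return the (first, second) extremities of a signed gene in reading order."""
    name = abs(gene)
    if gene > 0:
        return (name, "t"), (name, "h")
    # A reversed gene is read head first, then tail.
    return (name, "h"), (name, "t")


def adjacencies(chromosome):
    """Set of adjacencies of a linear chromosome given as a list of signed genes."""
    result = set()
    for left, right in zip(chromosome, chromosome[1:]):
        # The last extremity of the left gene touches the first of the right gene.
        result.add(frozenset([extremities(left)[1], extremities(right)[0]]))
    return result


def conserved_adjacencies(genome_a, genome_b):
    """Adjacencies present in both genomes (single chromosomes in this sketch)."""
    return adjacencies(genome_a) & adjacencies(genome_b)


# Reading a chromosome from the other end does not change its adjacencies.
assert adjacencies([1, 2, 3]) == adjacencies([-3, -2, -1])
# Inverting gene 2 breaks both adjacencies around it.
assert conserved_adjacencies([1, 2, 3], [1, -2, 3]) == set()
```

    A homology-based method would go on to weight such adjacencies and select a non-conflicting subset for each ancestor; that selection step is not shown here.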

  7. STELLAR: fast and exact local alignments

    PubMed Central

    2011-01-01

    Background: Large-scale comparison of genomic sequences requires reliable tools for the search of local alignments. Practical local aligners are in general fast, but heuristic, and hence sometimes miss significant matches. Results: We present here the local pairwise aligner STELLAR that has full sensitivity for ε-alignments, i.e. guarantees to report all local alignments of a given minimal length and maximal error rate. The aligner is composed of two steps, filtering and verification. We apply the SWIFT algorithm for lossless filtering, and have developed a new verification strategy that we prove to be exact. Our results on simulated and real genomic data confirm and quantify the conjecture that heuristic tools like BLAST or BLAT miss a large percentage of significant local alignments. Conclusions: STELLAR is very practical and fast on very long sequences which makes it a suitable new tool for finding local alignments between genomic sequences under the edit distance model. Binaries are freely available for Linux, Windows, and Mac OS X at http://www.seqan.de/projects/stellar. The source code is freely distributed with the SeqAn C++ library version 1.3 and later at http://www.seqan.de. PMID:22151882
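
    The acceptance criterion described above (a minimal length and a maximal error rate) can be written as a simple predicate. The sketch below is not the STELLAR or SWIFT algorithm; it only checks an alignment that has already been computed. It assumes the alignment is given as two gapped rows of equal length, that "-" marks a gap, and that the error rate is the number of mismatch or gap columns divided by the number of columns. The function name and parameters are hypothetical.

```python
def is_epsilon_match(row_a, row_b, min_length, max_error_rate):
    """Check whether a gapped pairwise alignment meets both thresholds.

    row_a, row_b   -- alignment rows of equal length, '-' marks a gap
    min_length     -- minimal number of alignment columns
    max_error_rate -- maximal fraction of columns that are edit errors
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    columns = len(row_a)
    # A column is an error if the symbols differ or if it contains a gap.
    errors = sum(1 for a, b in zip(row_a, row_b) if a != b or a == "-")
    return columns >= min_length and errors <= max_error_rate * columns


# One mismatch in 10 columns: accepted at a 10% error rate.
print(is_epsilon_match("ACGTACGTAC", "ACGTTCGTAC", 10, 0.1))
# One gap in 8 columns exceeds a 10% error rate.
print(is_epsilon_match("ACGT-CGT", "ACGTACGT", 8, 0.1))
```

    An exact aligner has to guarantee that every alignment passing such a test is reported; the predicate itself only states what "passing" means under these assumptions.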

  8. Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks.

    PubMed

    Raisaro, Jean Louis; Tramèr, Florian; Ji, Zhanglong; Bu, Diyue; Zhao, Yongan; Carey, Knox; Lloyd, David; Sofia, Heidi; Baker, Dixie; Flicek, Paul; Shringarpure, Suyash; Bustamante, Carlos; Wang, Shuang; Jiang, Xiaoqian; Ohno-Machado, Lucila; Tang, Haixu; Wang, XiaoFeng; Hubaux, Jean-Pierre

    2017-07-01

    The Global Alliance for Genomics and Health (GA4GH) created the Beacon Project as a means of testing the willingness of data holders to share genetic data in the simplest technical context: a query for the presence of a specified nucleotide at a given position within a chromosome. Each participating site (or "beacon") is responsible for assuring that genomic data are exposed through the Beacon service only with the permission of the individual to whom the data pertains and in accordance with the GA4GH policy and standards. While recognizing the inference risks associated with large-scale data aggregation, and the fact that some beacons contain sensitive phenotypic associations that increase privacy risk, the GA4GH adjudged the risk of re-identification based on the binary yes/no allele-presence query responses as acceptable. However, recent work demonstrated that, given a beacon with specific characteristics (including relatively small sample size and an adversary who possesses an individual's whole genome sequence), the individual's membership in a beacon can be inferred through repeated queries for variants present in the individual's genome. In this paper, we propose three practical strategies for reducing re-identification risks in beacons. The first two strategies manipulate the beacon such that the presence of rare alleles is obscured; the third strategy budgets the number of accesses per user for each individual genome. Using a beacon containing data from the 1000 Genomes Project, we demonstrate that the proposed strategies can effectively reduce re-identification risk in beacon-like datasets. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association.
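
    A yes/no allele-presence query combined with an access budget can be sketched as follows. This is a deliberately simplified illustration: it caps the total number of queries per user, whereas the strategy in the paper budgets accesses per user for each individual genome. The class, its methods and the example variants are hypothetical and do not reflect the GA4GH Beacon API.

```python
class BudgetedBeacon:
    """Toy beacon that answers allele-presence queries under a per-user cap."""

    def __init__(self, variants, max_queries_per_user):
        # Each variant is a (chromosome, position, allele) tuple.
        self._variants = set(variants)
        self._max_queries = max_queries_per_user
        self._used = {}  # user id -> number of queries answered so far

    def query(self, user, chromosome, position, allele):
        """Return True if the allele is present; refuse once the budget is spent."""
        used = self._used.get(user, 0)
        if used >= self._max_queries:
            raise PermissionError("query budget exhausted for this user")
        self._used[user] = used + 1
        return (chromosome, position, allele) in self._variants


beacon = BudgetedBeacon({("1", 12345, "A"), ("2", 67890, "T")}, max_queries_per_user=2)
print(beacon.query("alice", "1", 12345, "A"))  # True
print(beacon.query("alice", "1", 12345, "G"))  # False
```

    A third query by the same user raises PermissionError, which limits how many variants from a known genome an adversary can test against the beacon.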

  9. CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics.

    PubMed

    Gai, Xiaowu; Perin, Juan C; Murphy, Kevin; O'Hara, Ryan; D'arcy, Monica; Wenocur, Adam; Xie, Hongbo M; Rappaport, Eric F; Shaikh, Tamim H; White, Peter S

    2010-02-04

    Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist. We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV. To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and is an ideal platform for coordinating multiple associated projects. Available on the web at: http://sourceforge.net/projects/cnv.

  10. Local Atomic Arrangements and Band Structure of Boron Carbide.

    PubMed

    Rasim, Karsten; Ramlau, Reiner; Leithe-Jasper, Andreas; Mori, Takao; Burkhardt, Ulrich; Borrmann, Horst; Schnelle, Walter; Carbogno, Christian; Scheffler, Matthias; Grin, Yuri

    2018-05-22

    Boron carbide, the simple chemical combination of boron and carbon, is one of the best-known binary ceramic materials. Despite that, a coherent description of its crystal structure and physical properties represents one of the most challenging problems in materials science. By combining ab initio computational studies, precise crystal structure determination from diffraction experiments, and state-of-the-art high-resolution transmission electron microscopy imaging, this concerted investigation reveals hitherto unknown local structure modifications together with the known structural alterations. The mixture of different local atomic arrangements within the real crystal structure reduces the electron deficiency of the pristine structure CBC+B12, answering the question about the electron-precise character of boron carbide and introducing new electronic states within the band gap, which allow a better understanding of physical properties. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. An alkyl polyglucoside-mixed emulsifier as stabilizer of emulsion systems: the influence of colloidal structure on emulsions skin hydration potential.

    PubMed

    Savic, Snezana; Lukic, Milica; Jaksic, Ivana; Reichl, Stephan; Tamburic, Slobodanka; Müller-Goymann, Christel

    2011-06-01

    To be considered as a suitable vehicle for drugs/cosmetic actives, an emulsion system should have a number of desirable properties mainly dependent on surfactant used for its stabilization. In the current study, C(12-14) alkyl polyglucoside (APG)-mixed emulsifier of natural origin has been investigated in a series of binary (emulsifier concentration 10-25% (w/w)) and ternary systems with fixed emulsifier content (15% (w/w)) with or without glycerol. To elucidate the systems' colloidal structure the following physicochemical techniques were employed: polarization and transmission electron microscopy, X-ray diffraction (WAXD and SAXD), thermal analysis (DSC and TGA), complex rheological, pH, and conductivity measurements. Additionally, the emulsion vehicles' skin hydration potential was tested in vivo, on human skin under occlusion. In a series of binary systems with fixed emulsifier/water ratios ranging from 10/90 to 25/75 the predominance of a lamellar mesophase was found, changing its character from a liquid crystalline to a gel crystalline type. The same was observed in gel emulsions containing equal amounts of emulsifier and oil (15% (w/w)), but varying in glycerol content (0-25%). Different emulsion samples exhibited different water distribution modes in the structure, reflecting their rheological behavior and also their skin hydration capacity. Copyright © 2011 Elsevier Inc. All rights reserved.

  12. Achieving robust n-type nitrogen-doped graphene via a binary-doping approach

    NASA Astrophysics Data System (ADS)

    Kim, Hyo Seok; Kim, Han Seul; Kim, Seong Sik; Kim, Yong-Hoon

    2014-03-01

    Among various dopant candidates, nitrogen (N) atoms are considered the most effective dopants for improving the diverse properties of graphene. Unfortunately, recent experimental and theoretical studies have revealed that different N-doped graphene (NGR) conformations can result in both p- and n-type characters depending on the bonding nature of N atoms (substitutional, pyridinic, pyrrolic, and nitrilic). To overcome this obstacle in achieving reliable graphene doping, we have carried out density functional theory calculations and explored the feasibility of converting p-type NGRs into n-type by introducing additional dopant candidate atoms (B, C, O, F, Al, Si, P, S, and Cl). Evaluating the relative formation energies of various binary-doped NGRs and the change in their electronic structure, we conclude that B and P atoms are promising candidates to achieve robust n-type NGRs. The origin of such p- to n-type change is analyzed based on the crystal orbital Hamiltonian population analysis. Implications of our findings in the context of electronic and energy device applications will also be discussed. This work was supported by the Basic Science Research Grant (No. 2012R1A1A2044793), Global Frontier Program (No. 2013-073298), and Nano-Material Technology Development Program (2012M3A7B4049888) of the National Research Foundation funded by the Ministry of Education, Science and Technology of Korea.

  13. Calibration of the modulation transfer function of surface profilometers with binary pseudo-random test standards: Expanding the application range

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yashchuk, Valeriy V; Anderson, Erik H.; Barber, Samuel K.

    2010-07-26

    A modulation transfer function (MTF) calibration method based on binary pseudo-random (BPR) gratings and arrays [Proc. SPIE 7077-7 (2007), Opt. Eng. 47(7), 073602-1-5 (2008)] has been proven to be an effective MTF calibration method for a number of interferometric microscopes and a scatterometer [Nucl. Instr. and Meth. A 616, 172-82 (2010)]. Here we report on a significant expansion of the application range of the method. We describe the MTF calibration of a 6 inch phase shifting Fizeau interferometer. Beyond providing a direct measurement of the interferometer's MTF, tests with a BPR array surface have revealed an asymmetry in the instrument's data processing algorithm that fundamentally limits its bandwidth. Moreover, the tests have illustrated the effects of the instrument's detrending and filtering procedures on power spectral density measurements. The details of the development of a BPR test sample suitable for calibration of scanning and transmission electron microscopes are also presented. Such a test sample is realized as a multilayer structure with the layer thicknesses of two materials corresponding to the BPR sequence. The investigations confirm the universal character of the method that makes it applicable to a large variety of metrology instrumentation with spatial wavelength bandwidths from a few nanometers to hundreds of millimeters.

  14. A phase field model for segregation and precipitation induced by irradiation in alloys

    NASA Astrophysics Data System (ADS)

    Badillo, A.; Bellon, P.; Averback, R. S.

    2015-04-01

    A phase field model is introduced to model the evolution of multicomponent alloys under irradiation, including radiation-induced segregation and precipitation. The thermodynamic and kinetic components of this model are derived using a mean-field model. The mobility coefficient and the contribution of chemical heterogeneity to free energy are rescaled by the cell size used in the phase field model, yielding microstructural evolutions that are independent of the cell size. A new treatment is proposed for point defect clusters, using a mixed discrete-continuous approach to capture the stochastic character of defect cluster production in displacement cascades, while retaining the efficient modeling of the fate of these clusters using diffusion equations. The model is tested on unary and binary alloy systems using two-dimensional simulations. In a unary system, the evolution of point defects under irradiation is studied in the presence of defect clusters, either pre-existing ones or those created by irradiation, and compared with rate theory calculations. Binary alloys with zero and positive heats of mixing are then studied to investigate the effect of point defect clustering on radiation-induced segregation and precipitation in undersaturated solid solutions. Lastly, irradiation conditions and alloy parameters leading to irradiation-induced homogeneous precipitation are investigated. The results are discussed in the context of experimental results reported for Ni-Si and Al-Zn undersaturated solid solutions subjected to irradiation.

  15. IRiS: construction of ARG networks at genomic scales.

    PubMed

    Javed, Asif; Pybus, Marc; Melé, Marta; Utro, Filippo; Bertranpetit, Jaume; Calafell, Francesc; Parida, Laxmi

    2011-09-01

    Given a set of extant haplotypes, IRiS first detects high confidence recombination events in their shared genealogy. Next, using the local sequence topology defined by each detected event, it integrates these recombinations into an ancestral recombination graph. While the current system has been calibrated for human population data, it is easily extendible to other species as well. IRiS (Identification of Recombinations in Sequences) binary files are available for non-commercial use in both Linux and Microsoft Windows, 32 and 64 bit environments, from https://researcher.ibm.com/researcher/view_project.php?id=2303. Contact: parida@us.ibm.com.

  16. SCRAM: a pipeline for fast index-free small RNA read alignment and visualization.

    PubMed

    Fletcher, Stephen J; Boden, Mikael; Mitter, Neena; Carroll, Bernard J

    2018-03-15

    Small RNAs play key roles in gene regulation, defense against viral pathogens and maintenance of genome stability, though many aspects of their biogenesis and function remain to be elucidated. SCRAM (Small Complementary RNA Mapper) is a novel, simple-to-use short read aligner and visualization suite that enhances exploration of small RNA datasets. The SCRAM pipeline is implemented in Go and Python, and is freely available under MIT license. Source code, multiplatform binaries and a Docker image can be accessed via https://sfletc.github.io/scram/. Contact: s.fletcher@uq.edu.au. Supplementary data are available at Bioinformatics online.

  17. Genomic Prediction of Testcross Performance in Canola (Brassica napus)

    PubMed Central

    Jan, Habib U.; Abbadi, Amine; Lücke, Sophie; Nichols, Richard A.; Snowdon, Rod J.

    2016-01-01

    Genomic selection (GS) is a modern breeding approach where genome-wide single-nucleotide polymorphism (SNP) marker profiles are simultaneously used to estimate performance of untested genotypes. In this study, the potential of genomic selection methods to predict testcross performance for hybrid canola breeding was assessed for various agronomic traits based on genome-wide marker profiles. A total of 475 genetically diverse spring-type canola pollinator lines were genotyped at 24,403 single-copy, genome-wide SNP loci. In parallel, the 950 F1 testcross combinations between the pollinators and two representative testers were evaluated for a number of important agronomic traits including seedling emergence, days to flowering, lodging, oil yield and seed yield along with essential seed quality characters including seed oil content and seed glucosinolate content. A ridge-regression best linear unbiased prediction (RR-BLUP) model was applied in combination with 500 cross-validations for each trait to predict testcross performance, both across the whole population as well as within individual subpopulations or clusters, based solely on SNP profiles. Subpopulations were determined using multidimensional scaling and K-means clustering. Genomic prediction accuracy across the whole population was highest for seed oil content (0.81) followed by oil yield (0.75) and lowest for seedling emergence (0.29). For seed yield, seed glucosinolate, lodging resistance and days to onset of flowering (DTF), prediction accuracies were 0.45, 0.61, 0.39 and 0.56, respectively. Prediction accuracies could be increased for some traits by treating subpopulations separately, a strategy which only led to moderate improvements for some traits with low heritability, like seedling emergence. No useful or consistent increase in accuracy was obtained by inclusion of a population substructure covariate in the model. Testcross performance prediction using genome-wide SNP markers shows considerable potential for pre-selection of promising hybrid combinations prior to resource-intensive field testing over multiple locations and years. PMID:26824924
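
    The prediction scheme described in this abstract (ridge regression on marker profiles, with accuracy measured as the correlation between predicted and observed values in held-out lines) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the simulated marker matrix, the shrinkage value `lam`, the 20 resampling rounds and the 80/20 split are all assumptions made for the example, and none of the numbers are data from the study.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Ridge (RR-BLUP-style) marker effects on centered data.

    Uses the dual form b = Xc' (Xc Xc' + lam*I)^-1 (y - mean), which is
    algebraically identical to the usual (Xc'Xc + lam*I)^-1 Xc'(y - mean).
    """
    mu = y.mean()
    col_means = X.mean(axis=0)
    Xc = X - col_means
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y)), y - mu)
    return mu, col_means, Xc.T @ alpha

def cv_accuracy(X, y, lam, n_rounds=20, test_frac=0.2, seed=0):
    """Mean Pearson correlation between predicted and observed values
    over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    accuracies = []
    for _ in range(n_rounds):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        mu, col_means, b = ridge_marker_effects(X[train], y[train], lam)
        predicted = mu + (X[test] - col_means) @ b
        accuracies.append(np.corrcoef(predicted, y[test])[0, 1])
    return float(np.mean(accuracies))

# Simulated toy data (hypothetical, for illustration only).
rng = np.random.default_rng(42)
n_lines, n_markers = 200, 200
X = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))   # homozygous marker codes
effects = rng.normal(0.0, 1.0, n_markers)                 # true marker effects
genetic_value = X @ effects
noise_var = 0.25 * genetic_value.var()                    # heritability of about 0.8
y = genetic_value + rng.normal(0.0, np.sqrt(noise_var), n_lines)

lam = noise_var / 1.0   # residual variance / marker-effect variance
acc = cv_accuracy(X, y, lam)
print(f"mean cross-validated prediction accuracy: {acc:.2f}")
```

    In RR-BLUP the shrinkage parameter is normally estimated from the data as a ratio of variance components; here it is simply set from the known simulation settings to keep the sketch short.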

  18. Complete genome sequence of DSM 30083(T), the type strain (U5/41(T)) of Escherichia coli, and a proposal for delineating subspecies in microbial taxonomy.

    PubMed

    Meier-Kolthoff, Jan P; Hahnke, Richard L; Petersen, Jörn; Scheuner, Carmen; Michael, Victoria; Fiebig, Anne; Rohde, Christine; Rohde, Manfred; Fartmann, Berthold; Goodwin, Lynne A; Chertkov, Olga; Reddy, Tbk; Pati, Amrita; Ivanova, Natalia N; Markowitz, Victor; Kyrpides, Nikos C; Woyke, Tanja; Göker, Markus; Klenk, Hans-Peter

    2014-01-01

    Although Escherichia coli is the most widely studied bacterial model organism and often considered to be the model bacterium per se, its type strain was until now forgotten from microbial genomics. As a part of the Genomic Encyclopedia of Bacteria and Archaea project, we here describe the features of E. coli DSM 30083(T) together with its genome sequence and annotation as well as novel aspects of its phenotype. The genome sequence, comprising 5,038,133 bp, includes 4,762 protein-coding genes and 175 RNA genes as well as a single plasmid. Affiliation of a set of 250 genome-sequenced E. coli strains, Shigella and outgroup strains to the type strain of E. coli was investigated using digital DNA:DNA-hybridization (dDDH) similarities and differences in genomic G+C content. As in the majority of previous studies, results show Shigella spp. embedded within E. coli and in most cases forming a single subgroup of it. Phylogenomic trees also recover the proposed E. coli phylotypes as monophyla with minor exceptions and place DSM 30083(T) in phylotype B2 with E. coli S88 as its closest neighbor. The widely used lab strain K-12 is not only genomically but also physiologically strongly different from the type strain. The phylotypes do not express a uniform level of character divergence as measured using dDDH, however, thus an alternative arrangement is proposed and discussed in the context of bacterial subspecies. Analyses of the genome sequences of a large number of E. coli strains and of strains from > 100 other bacterial genera indicate a value of 79-80% dDDH as the most promising threshold for delineating subspecies, which in turn suggests the presence of five subspecies within E. coli.

  19. Higher level phylogenetic relationships within the bamboos (Poaceae: Bambusoideae) based on five plastid markers.

    PubMed

    Kelchner, Scot A

    2013-05-01

    Bamboos are large perennial grasses of temperate and tropical forests worldwide. Two general growth forms exist: the economically and ecologically important woody bamboos (tribes Arundinarieae and Bambuseae), and the understory herbaceous bamboos (tribe Olyreae). Evolutionary relationships among the 1400+ described species have been difficult to resolve with confidence. Comparative analysis of bamboo plastid (chloroplast) DNA has revealed three to five major lineages that show distinct biogeographic distributions. Taxon sampling across tribes and subtribes has been incomplete and most published data sets include a relatively small number of nucleotide characters. Branching order among lineages is often poorly supported, and in more than one study herbaceous bamboos form a clade within the woody bamboos. In this paper, the Bamboo Phylogeny Group presents the most complete phylogeny estimation to date of bamboo tribes and subtribes using 6.7 kb of coding and noncoding sequence data and 37 microstructural characters from the chloroplast genome. Quality of data is assessed, as is the possibility of long branch attraction, the degree of character conflict at key nodes in the tree, and the legitimacy of three alternative hypotheses of relationship. Four major plastid lineages are recognized: temperate woody, paleotropical woody, neotropical woody, and herbaceous bamboos. Woody bamboos are resolved as paraphyletic with respect to Olyreae but SH tests cannot reject monophyly of woody species (Arundinarieae+Bambuseae). Published by Elsevier Inc.

  20. Molecular Species Delimitation in the Racomitrium canescens Complex (Grimmiaceae) and Implications for DNA Barcoding of Species Complexes in Mosses

    PubMed Central

    Stech, Michael; Veldman, Sarina; Larraín, Juan; Muñoz, Jesús; Quandt, Dietmar; Hassel, Kristian; Kruijer, Hans

    2013-01-01

    In bryophytes a morphological species concept is still most commonly employed, but delimitation of closely related species based on morphological characters is often difficult. Here we test morphological species circumscriptions in a species complex of the moss genus Racomitrium, the R. canescens complex, based on variable DNA sequence markers from the plastid (rps4-trnT-trnL region) and nuclear (nrITS) genomes. The extensive morphological variability within the complex has led to different opinions about the number of species and intraspecific taxa to be distinguished. Molecular phylogenetic reconstructions allowed clear distinction of all eight currently recognised species of the complex plus a ninth species that was inferred to belong to the complex in earlier molecular analyses. The taxonomic significance of intraspecific sequence variation is discussed. The present molecular data do not support the division of the R. canescens complex into two groups of species (subsections or sections). Most morphological characters, albeit being in part difficult to apply, are reliable for species identification in the R. canescens complex. However, misidentification of collections that were morphologically intermediate between species questioned the suitability of leaf shape as a diagnostic character. Four partitions of the molecular markers (rps4-trnT, trnT-trnL, ITS1, ITS2) that could potentially be used for molecular species identification (DNA barcoding) performed almost equally well concerning amplification and sequencing success. Of these, ITS1 provided the highest species discrimination capacity and should be considered as a DNA barcoding marker for mosses, especially in complexes of closely related species. Molecular species identification should be complemented by redefining morphological characters, to develop a set of easy-to-use molecular and non-molecular identification tools for improving biodiversity assessments and ecological research including mosses. PMID:23341927

  1. Construction of a plant-transformation-competent BIBAC library and genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.)

    PubMed Central

    2013-01-01

    Background Cotton, one of the world’s leading crops, is important to the world’s textile and energy industries, and is a model species for studies of plant polyploidization, cellulose biosynthesis and cell wall biogenesis. Here, we report the construction of a plant-transformation-competent binary bacterial artificial chromosome (BIBAC) library and comparative genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.) with one of its diploid putative progenitor species, G. raimondii Ulbr. Results We constructed the cotton BIBAC library in a vector competent for high-molecular-weight DNA transformation in different plant species through either Agrobacterium or particle bombardment. The library contains 76,800 clones with an average insert size of 135 kb, providing an approximate 99% probability of obtaining at least one positive clone from the library using a single-copy probe. The quality and utility of the library were verified by identifying BIBACs containing genes important for fiber development, fiber cellulose biosynthesis, seed fatty acid metabolism, cotton-nematode interaction, and bacterial blight resistance. In order to gain an insight into the Upland cotton genome and its relationship with G. raimondii, we sequenced nearly 10,000 BIBAC ends (BESs) randomly selected from the library, generating approximately one BES for every 250 kb along the Upland cotton genome. The retroelement Gypsy/DIRS1 family predominates in the Upland cotton genome, accounting for over 77% of all transposable elements. From the BESs, we identified 1,269 simple sequence repeats (SSRs), of which 1,006 were new, thus providing additional markers for cotton genome research. Surprisingly, comparative sequence analysis showed that Upland cotton is much more diverged from G. raimondii at the genomic sequence level than expected. There seems to be no significant difference between the relationships of the Upland cotton D- and A-subgenomes with the G. raimondii genome, even though G. raimondii contains a D genome (D5). Conclusions The library represents the first BIBAC library in cotton and related species, thus providing tools useful for integrative physical mapping, large-scale genome sequencing and large-scale functional analysis of the Upland cotton genome. Comparative sequence analysis provides insights into the Upland cotton genome, and a possible mechanism underlying the divergence and evolution of polyploid Upland cotton from its diploid putative progenitor species, G. raimondii. PMID:23537070
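
    The "approximate 99% probability" quoted for the library can be checked with the standard library-coverage (Clarke-Carbon) calculation, P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one insert and N is the number of clones. The sketch below is illustrative only. The abstract does not state the genome size the authors used; the 2.5 Gb figure is an assumption inferred here from "nearly 10,000" end sequences at "one BES for every 250 kb", and the function names are hypothetical.

```python
import math

def prob_at_least_one_clone(n_clones, insert_bp, genome_bp):
    """Probability that a single-copy sequence is present in at least one clone."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones

def clones_needed(probability, insert_bp, genome_bp):
    """Smallest number of clones giving at least the requested probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))

# Assumed genome size: about 10,000 end sequences spaced roughly every 250 kb.
GENOME_BP = 10_000 * 250_000          # 2.5 Gb (assumption, not stated in the abstract)
N_CLONES = 76_800                     # from the abstract
INSERT_BP = 135_000                   # average insert size from the abstract

coverage = N_CLONES * INSERT_BP / GENOME_BP
p = prob_at_least_one_clone(N_CLONES, INSERT_BP, GENOME_BP)
n_for_99 = clones_needed(0.99, INSERT_BP, GENOME_BP)

print(f"genome equivalents: {coverage:.2f}")
print(f"P(at least one positive clone): {p:.4f}")
print(f"clones needed for P = 0.99: {n_for_99}")
```

    Under this assumed genome size the library holds a little over four genome equivalents and the probability comes out slightly above 98%, in line with the approximate figure in the abstract; a somewhat smaller genome size estimate pushes the value closer to 99%.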

  2. Importance of multi-modal approaches to effectively identify cataract cases from electronic health records

    PubMed Central

    Rasmussen, Luke V; Berg, Richard L; Linneman, James G; McCarty, Catherine A; Waudby, Carol; Chen, Lin; Denny, Joshua C; Wilke, Russell A; Pathak, Jyotishman; Carrell, David; Kho, Abel N; Starren, Justin B

    2012-01-01

    Objective There is increasing interest in using electronic health records (EHRs) to identify subjects for genomic association studies, due in part to the availability of large amounts of clinical data and the expected cost efficiencies of subject identification. We describe the construction and validation of an EHR-based algorithm to identify subjects with age-related cataracts. Materials and methods We used a multi-modal strategy consisting of structured database querying, natural language processing on free-text documents, and optical character recognition on scanned clinical images to identify cataract subjects and related cataract attributes. Extensive validation on 3657 subjects compared the multi-modal results to manual chart review. The algorithm was also implemented at participating electronic MEdical Records and GEnomics (eMERGE) institutions. Results An EHR-based cataract phenotyping algorithm was successfully developed and validated, resulting in positive predictive values (PPVs) >95%. The multi-modal approach increased the identification of cataract subject attributes by a factor of three compared to single-mode approaches while maintaining high PPV. Components of the cataract algorithm were successfully deployed at three other institutions with similar accuracy. Discussion A multi-modal strategy incorporating optical character recognition and natural language processing may increase the number of cases identified while maintaining similar PPVs. Such algorithms, however, require that the needed information be embedded within clinical documents. Conclusion We have demonstrated that algorithms to identify and characterize cataracts can be developed utilizing data collected via the EHR. These algorithms provide a high level of accuracy even when implemented across multiple EHRs and institutional boundaries. PMID:22319176
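
    The evaluation logic in this abstract (cases flagged by several modes are pooled, then compared with manual chart review to obtain a positive predictive value) can be sketched in a few lines. The subject identifiers and set contents below are hypothetical toy values chosen for illustration, not data from the study, and the function name is an assumption.

```python
def positive_predictive_value(flagged, confirmed):
    """PPV = true positives / all flagged cases."""
    flagged = set(flagged)
    if not flagged:
        raise ValueError("no flagged cases: PPV is undefined")
    true_positives = len(flagged & set(confirmed))
    return true_positives / len(flagged)

# Hypothetical subject IDs flagged by each mode.
structured_query = {"s01", "s02", "s03"}
nlp_free_text = {"s03", "s04", "s05"}
ocr_scanned_images = {"s05", "s06"}

# Hypothetical gold standard from manual chart review.
chart_review_positive = {"s01", "s02", "s03", "s04", "s05", "s07"}

# Multi-modal result: a subject counts as a case if any mode flags it.
combined = structured_query | nlp_free_text | ocr_scanned_images

ppv_structured = positive_predictive_value(structured_query, chart_review_positive)
ppv_combined = positive_predictive_value(combined, chart_review_positive)

print(f"structured only: {len(structured_query)} cases, PPV = {ppv_structured:.2f}")
print(f"multi-modal:     {len(combined)} cases, PPV = {ppv_combined:.2f}")
```

    Pooling by set union increases the number of identified cases, and the PPV shows whether the extra cases come at the cost of more false positives, which is the trade-off the abstract reports on.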

  3. Analog hardware implementation of neocognitron networks

    NASA Astrophysics Data System (ADS)

    Inigo, Rafael M.; Bonde, Allen, Jr.; Holcombe, Bradford

    1990-08-01

    This paper deals with the analog implementation of neocognitron based neural networks. All of Fukushima's and related work on the neocognitron is based on digital computer simulations. To fully take advantage of the power of this network paradigm an analog electronic approach is proposed. We first implemented a 6-by-6 sensor network with discrete analog components and fixed weights. The network was given weight values to recognize the characters U, L and F. These characters are recognized regardless of their location on the sensor and with various levels of distortion and noise. The network performance has also shown an excellent correlation with software simulation results. Next we implemented a variable weight network which can be trained to recognize simple patterns by means of self-organization. The adaptable weights were implemented with FETs configured as voltage-controlled resistors. To implement a variable weight there must be some type of "memory" to store the weight value and hold it while the value is reinforced or incremented. Two methods were evaluated: an analog sample-hold circuit and a digital storage scheme using binary counters. The latter is preferable for VLSI implementation because it uses standard components and does not require the use of capacitors. The analog design and implementation of these small-scale networks demonstrates the feasibility of implementing more complicated ANNs in electronic hardware. The circuits developed can also be designed for VLSI implementation.

  4. A Validated All-Pressure Fluid Drop Model and Lewis Number Effects for a Binary Mixture

    NASA Technical Reports Server (NTRS)

    Harstad, K.; Bellan, J.

    1999-01-01

    The differences between subcritical liquid drop and supercritical fluid drop behavior are discussed. Under subcritical, evaporative high emission rate conditions, a film layer is present in the inner part of the drop surface which contributes to the unique determination of the boundary conditions; it is this film layer which contributes to the solution's convective-diffusive character. In contrast, under supercritical conditions the boundary conditions contain a degree of arbitrariness due to the absence of a surface, and the solution then has a purely diffusive character. Results from simulations of a free fluid drop under no-gravity conditions are compared to microgravity experimental data from suspended, large drop experiments at high, low and intermediary temperatures and in a range of pressures encompassing the sub- and supercritical regimes. Despite the difference between the conditions of the simulations and experiments (suspension vs. free floating), the time rate of variation of the drop diameter squared is remarkably well predicted in the linear curve regime. The drop diameter is determined in the simulations from the location of the maximum density gradient, and agrees well with the data. It is also shown that the classical calculation of the Lewis number gives qualitatively erroneous results at supercritical conditions, but that an effective Lewis number previously defined gives qualitatively correct estimates of the length scales for heat and mass transfer at all pressures.
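
    The "time rate of variation of the drop diameter squared" in the linear regime is the quantity usually extracted by fitting a straight line to d² against time, as in the classical d²-law form d² = d0² - K·t. The abstract does not name that law or give a fitting procedure, so the following is only a sketch under that assumption; the function name and the noise-free synthetic values are hypothetical and are not data from the paper.

```python
import numpy as np

def fit_d2_law(times, diameters):
    """Least-squares straight-line fit of d^2 against t.

    Returns (d0_squared, K) for the model d^2 = d0_squared - K * t.
    """
    t = np.asarray(times, dtype=float)
    d_squared = np.asarray(diameters, dtype=float) ** 2
    slope, intercept = np.polyfit(t, d_squared, 1)
    return intercept, -slope

# Synthetic, noise-free illustration (assumed values).
t = np.linspace(0.0, 2.0, 11)                 # time, s
d0_true, K_true = 2.0, 1.5                    # initial diameter (mm), rate (mm^2/s)
d = np.sqrt(d0_true ** 2 - K_true * t)        # diameters following the linear d^2 regime

d0_sq, K = fit_d2_law(t, d)
print(f"fitted d0^2 = {d0_sq:.3f} mm^2, K = {K:.3f} mm^2/s")
```

    Comparing the fitted K from simulated and measured diameter histories is one simple way to express the agreement the abstract describes.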

  5. MAJIQ-SPEL: Web-tool to interrogate classical and complex splicing variations from RNA-Seq data.

    PubMed

    Green, Christopher J; Gazzara, Matthew R; Barash, Yoseph

    2017-09-11

    Analysis of RNA sequencing (RNA-Seq) data has highlighted the fact that most genes undergo alternative splicing (AS) and that these patterns are tightly regulated. Many of these events are complex, resulting in numerous possible isoforms that quickly become difficult to visualize, interpret, and experimentally validate. To address these challenges we developed MAJIQ-SPEL, a web-tool that takes as input local splicing variations (LSVs) quantified from RNA-Seq data and provides users with visualization and quantification of gene isoforms associated with those. Importantly, MAJIQ-SPEL is able to handle both classical (binary) and complex, non-binary, splicing variations. Using a matching primer design algorithm it also suggests possible primers to users for experimental validation by RT-PCR and displays those, along with the matching protein domains affected by the LSV, on UCSC Genome Browser for further downstream analysis. Program and code will be available at http://majiq.biociphers.org/majiq-spel. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.

  6. GAC: Gene Associations with Clinical, a web based application

    PubMed Central

    Zhang, Xinyan; Rupji, Manali; Kowalski, Jeanne

    2018-01-01

    We present GAC, a Shiny R-based tool for interactive visualization of clinical associations based on high-dimensional data. The tool provides a web-based suite to perform supervised principal component analysis (SuperPC), an approach that combines high-dimensional data, such as gene expression, with clinical data to infer clinical associations. We extended the approach to address binary outcomes, in addition to continuous and time-to-event data in our package, thereby increasing the use and flexibility of SuperPC. Additionally, the tool provides an interactive visualization for summarizing results based on a forest plot for both binary and time-to-event data. In summary, the GAC suite of tools provides a one-stop shop for conducting statistical analysis to identify and visualize the association between a clinical outcome of interest and high-dimensional data types, such as genomic data. Our GAC package has been implemented in R and is available via http://shinygispa.winship.emory.edu/GAC/. The developmental repository is available at https://github.com/manalirupji/GAC. PMID:29263780

  7. Analysis of the Genome Structure of the Nonpathogenic Probiotic Escherichia coli Strain Nissle 1917

    PubMed Central

    Grozdanov, Lubomir; Raasch, Carsten; Schulze, Jürgen; Sonnenborn, Ulrich; Gottschalk, Gerhard; Hacker, Jörg; Dobrindt, Ulrich

    2004-01-01

    Nonpathogenic Escherichia coli strain Nissle 1917 (O6:K5:H1) is used as a probiotic agent in medicine, mainly for the treatment of various gastroenterological diseases. To gain insight on the genetic level into its properties of colonization and commensalism, this strain's genome structure has been analyzed by three approaches: (i) sequence context screening of tRNA genes as a potential indication of chromosomal integration of horizontally acquired DNA, (ii) sequence analysis of 280 kb of genomic islands (GEIs) coding for important fitness factors, and (iii) comparison of Nissle 1917 genome content with that of other E. coli strains by DNA-DNA hybridization. PCR-based screening of 324 nonpathogenic and pathogenic E. coli isolates of different origins revealed that some chromosomal regions are frequently detectable in nonpathogenic E. coli and also among extraintestinal and intestinal pathogenic strains. Many known fitness factor determinants of strain Nissle 1917 are localized on four GEIs which have been partially sequenced and analyzed. Comparison of these data with the available knowledge of the genome structure of E. coli K-12 strain MG1655 and of uropathogenic E. coli O6 strains CFT073 and 536 revealed structural similarities on the genomic level, especially between the E. coli O6 strains. The lack of defined virulence factors (i.e., alpha-hemolysin, P-fimbrial adhesins, and the semirough lipopolysaccharide phenotype) combined with the expression of fitness factors such as microcins, different iron uptake systems, adhesins, and proteases, which may support its survival and successful colonization of the human gut, most likely contributes to the probiotic character of E. coli strain Nissle 1917. PMID:15292145

  8. Molecular genetic characterization of the RD-114 gene family of endogenous feline retroviral sequences.

    PubMed Central

    Reeves, R H; O'Brien, S J

    1984-01-01

    RD-114 is a replication-competent, xenotropic retrovirus which is homologous to a family of moderately repetitive DNA sequences present at ca. 20 copies in the normal cellular genome of domestic cats. To examine the extent and character of genomic divergence of the RD-114 gene family as well as to assess their positional association within the cat genome, we have prepared a series of molecular clones of endogenous RD-114 DNA segments from a genomic library of cat cellular DNA. Their restriction endonuclease maps were compared with each other as well as to that of the prototype-inducible RD-114 which was molecularly cloned from a chronically infected human cell line. The endogenous sequences analyzed were similar to each other in that they were colinear with RD-114 proviral DNA, were bounded by long terminal redundancies, and conserved many restriction sites in the gag and pol regions. However, the env regions of many of the sequences examined were substantially deleted. Several of the endogenous RD-114 genomes contained a novel envelope sequence which was unrelated to the env gene of the prototype RD-114 but which, like RD-114 and endogenous feline leukemia virus provirus, was found only in species of the genus Felis, and not in other closely related Felidae genera. The endogenous RD-114 sequences each had a distinct cellular flank which indicates that these sequences are not tandem but dispersed nonspecifically throughout the genome. Southern analysis of cat cellular DNA confirmed the conclusions about conserved restriction sites in endogenous sequences and indicated that a single locus may be responsible for the production of the major inducible form of RD-114. PMID:6090693

  9. Ancestral chromosomal blocks are triplicated in Brassiceae species with varying chromosome number and genome size.

    PubMed

    Lysak, Martin A; Cheung, Kwok; Kitschke, Michaela; Bures, Petr

    2007-10-01

    The paleopolyploid character of genomes of the economically important genus Brassica and closely related species (tribe Brassiceae) is still fairly controversial. Here, we report on the comparative painting analysis of block F of the crucifer Ancestral Karyotype (AK; n = 8), consisting of 24 conserved genomic blocks, in 10 species traditionally treated as members of the tribe Brassiceae. Three homeologous copies of block F were identified per haploid chromosome complement in Brassiceae species with 2n = 14, 18, 20, 32, and 36. In high-polyploid (n ≥ 30) species Crambe maritima (2n = 60), Crambe cordifolia (2n = 120), and Vella pseudocytisus (2n = 68), six, 12, and six copies of the analyzed block have been revealed, respectively. Homeologous regions resembled the ancestral structure of block F within the AK or were altered by inversions and/or translocations. In two species of the subtribe Zillineae, two of the three homeologous regions were combined via a reciprocal translocation onto one chromosome. Altogether, these findings provide compelling evidence of an ancient hexaploidization event and corresponding whole-genome triplication shared by the tribe Brassiceae. No direct relationship between chromosome number and genome size variation (1.2-2.5 pg/2C) has been found in Brassiceae species with 2n = 14 to 36. Only two homeologous copies of block F suggest a whole-genome duplication but not the triplication event in Orychophragmus violaceus (2n = 24), and confirm a phylogenetic position of this species outside the tribe Brassiceae. Chromosome duplication detected in Orychophragmus as well as chromosome rearrangements shared by Zillineae species demonstrate the usefulness of comparative cytogenetics for elucidation of phylogenetic relationships.

  10. Ancestral Chromosomal Blocks Are Triplicated in Brassiceae Species with Varying Chromosome Number and Genome Size

    PubMed Central

    Lysak, Martin A.; Cheung, Kwok; Kitschke, Michaela; Bureš, Petr

    2007-01-01

    The paleopolyploid character of genomes of the economically important genus Brassica and closely related species (tribe Brassiceae) is still fairly controversial. Here, we report on the comparative painting analysis of block F of the crucifer Ancestral Karyotype (AK; n = 8), consisting of 24 conserved genomic blocks, in 10 species traditionally treated as members of the tribe Brassiceae. Three homeologous copies of block F were identified per haploid chromosome complement in Brassiceae species with 2n = 14, 18, 20, 32, and 36. In high-polyploid (n ≥ 30) species Crambe maritima (2n = 60), Crambe cordifolia (2n = 120), and Vella pseudocytisus (2n = 68), six, 12, and six copies of the analyzed block have been revealed, respectively. Homeologous regions resembled the ancestral structure of block F within the AK or were altered by inversions and/or translocations. In two species of the subtribe Zillineae, two of the three homeologous regions were combined via a reciprocal translocation onto one chromosome. Altogether, these findings provide compelling evidence of an ancient hexaploidization event and corresponding whole-genome triplication shared by the tribe Brassiceae. No direct relationship between chromosome number and genome size variation (1.2–2.5 pg/2C) has been found in Brassiceae species with 2n = 14 to 36. Only two homeologous copies of block F suggest a whole-genome duplication but not the triplication event in Orychophragmus violaceus (2n = 24), and confirm a phylogenetic position of this species outside the tribe Brassiceae. Chromosome duplication detected in Orychophragmus as well as chromosome rearrangements shared by Zillineae species demonstrate the usefulness of comparative cytogenetics for elucidation of phylogenetic relationships. PMID:17720758

  11. Adaptive genomic evolution of opsins reveals that early mammals flourished in nocturnal environments.

    PubMed

    Borges, Rui; Johnson, Warren E; O'Brien, Stephen J; Gomes, Cidália; Heesy, Christopher P; Antunes, Agostinho

    2018-02-05

    Based on evolutionary patterns of the vertebrate eye, Walls (1942) hypothesized that early placental mammals evolved primarily in nocturnal habitats. However, not only Eutheria, but all mammals show photic characteristics (i.e. dichromatic vision, rod-dominated retina) suggestive of a scotopic eye design. Here, we used integrative comparative genomic and phylogenetic methodologies employing the photoreceptive opsin gene family in 154 mammals to test the likelihood of a nocturnal period in the emergence of all mammals. We showed that mammals possess genomic patterns concordant with a nocturnal ancestry. The loss of the RH2, VA, PARA, PARIE and OPN4x opsins in all mammals led us to advance a probable and most-parsimonious hypothesis of a global nocturnal bottleneck that explains the loss of these genes in the emerging lineage (> > 215.5 million years ago). In addition, ancestral character reconstruction analyses provided strong evidence that ancestral mammals possessed a nocturnal lifestyle, ultra-violet-sensitive vision, low visual acuity and low orbit convergence (i.e. panoramic vision). Overall, this study provides insight into the evolutionary history of the mammalian eye while discussing important ecological aspects of the photic paleo-environments ancestral mammals have occupied.

  12. Evaluating phylogenetic congruence in the post-genomic era.

    PubMed

    Leigh, Jessica W; Lapointe, François-Joseph; Lopez, Philippe; Bapteste, Eric

    2011-01-01

    Congruence is a broadly applied notion in evolutionary biology used to justify multigene phylogeny or phylogenomics, as well as in studies of coevolution, lateral gene transfer, and as evidence for common descent. Existing methods for identifying incongruence or heterogeneity using character data were designed for data sets that are both small and expected to be rarely incongruent. At the same time, methods that assess incongruence using comparison of trees test a null hypothesis of uncorrelated tree structures, which may be inappropriate for phylogenomic studies. As such, they are ill-suited for the growing number of available genome sequences, most of which are from prokaryotes and viruses, either for phylogenomic analysis or for studies of the evolutionary forces and events that have shaped these genomes. Specifically, many existing methods scale poorly with large numbers of genes, cannot accommodate high levels of incongruence, and do not adequately model patterns of missing taxa for different markers. We propose the development of novel incongruence assessment methods suitable for the analysis of the molecular evolution of the vast majority of life and support the investigation of homogeneity of evolutionary process in cases where markers do not share identical tree structures.

  13. Evaluating Phylogenetic Congruence in the Post-Genomic Era

    PubMed Central

    Leigh, Jessica W.; Lapointe, François-Joseph; Lopez, Philippe; Bapteste, Eric

    2011-01-01

    Congruence is a broadly applied notion in evolutionary biology used to justify multigene phylogeny or phylogenomics, as well as in studies of coevolution, lateral gene transfer, and as evidence for common descent. Existing methods for identifying incongruence or heterogeneity using character data were designed for data sets that are both small and expected to be rarely incongruent. At the same time, methods that assess incongruence using comparison of trees test a null hypothesis of uncorrelated tree structures, which may be inappropriate for phylogenomic studies. As such, they are ill-suited for the growing number of available genome sequences, most of which are from prokaryotes and viruses, either for phylogenomic analysis or for studies of the evolutionary forces and events that have shaped these genomes. Specifically, many existing methods scale poorly with large numbers of genes, cannot accommodate high levels of incongruence, and do not adequately model patterns of missing taxa for different markers. We propose the development of novel incongruence assessment methods suitable for the analysis of the molecular evolution of the vast majority of life and support the investigation of homogeneity of evolutionary process in cases where markers do not share identical tree structures. PMID:21712432

  14. Hybridization capture reveals evolution and conservation across the entire Koala retrovirus genome.

    PubMed

    Tsangaras, Kyriakos; Siracusa, Matthew C; Nikolaidis, Nikolas; Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M; Roca, Alfred L; Greenwood, Alex D

    2014-01-01

    The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin.

  15. Hybridization Capture Reveals Evolution and Conservation across the Entire Koala Retrovirus Genome

    PubMed Central

    Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M.; Roca, Alfred L.; Greenwood, Alex D.

    2014-01-01

    The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin. PMID:24752422

  16. International Guidelines for Privacy in Genomic Biobanking (or the Unexpected Virtue of Pluralism).

    PubMed

    Thorogood, Adrian; Zawati, Ma'n H

    2015-01-01

    This article reviews international privacy norms governing human genomic biobanks and databases, and how they address issues related to consent, secondary use, de-identification, access, security, and governance. A range of international instruments were identified, varying in substance - e.g., human rights, data protection, research ethics, biobanks, and genetics - and legal character. Some norms detail processes for broad consent, namely, that even where potential participants cannot consent to specific users and uses, they should be given clear information on access policies, procedures, and governance structures. Some also give guidance about the conditions under which secondary use of data and samples without consent is appropriate, e.g., where consent is impracticable. International norms exhibit a confusing range of terminology relating to de-identification. They also continue to rely heavily on consent and anonymity as the basis for privacy protection, though governance is becoming more prominent. It may not be fatal that such a plurality of norms apply to biobanking; what is essential is that governance be built on shared values, our common interest in the success of genomic research, and practical tools that incentivize responsible, global sharing. © 2015 American Society of Law, Medicine & Ethics, Inc.

  17. The genomic potential of Marinobacter aquaeolei - A biogeochemical opportunotroph

    NASA Astrophysics Data System (ADS)

    Singer, E.; Webb, E.; Nelson, W.; Heidelberg, J.; Edwards, K. J.

    2009-12-01

    The family of Marinobacter is one of the most ubiquitous in the ocean. Members of this genus are found throughout the water column, in the deep sea, and are often associated with hydrothermal plume particles and marine snow. They are known to degrade hydrocarbons and show some extremophilic lifestyles, such as psychrophily, oligotrophy and halotolerance. This study has determined the genomic potential of one particular strain - Marinobacter aquaeolei VT8, which relies on a very large set of survival strategies. Isolated from an oil well in Southern Vietnam, M. aquaeolei was known to be a facultative anaerobe with the ability to utilize various carbon sources. Fitting with these observations, genome annotation has revealed: four variations of the TCA cycle, complete pathways of glycolysis and the degradation of more complex hydrocarbons (including octane oxidation and cyclohexanol degradation), alternative phosphorus and nitrogen sources, genes for the use of nitrate and sulfate as electron acceptors as well as complete pathways for sulfite oxidation, denitrification and iron oxidation. The versatility and interrelatedness of these metabolic potentials coin the opportunistic character of M. aquaeolei and help to more completely define the biogeochemical niche of the genus.

  18. Continuous Morphological Variation Correlated with Genome Size Indicates Frequent Introgressive Hybridization among Diphasiastrum Species (Lycopodiaceae) in Central Europe

    PubMed Central

    Hanušová, Kristýna; Ekrt, Libor; Vít, Petr; Kolář, Filip; Urfus, Tomáš

    2014-01-01

    Introgressive hybridization is an important evolutionary process frequently contributing to diversification and speciation of angiosperms. Its extent in other groups of land plants has only rarely been studied, however. We therefore examined the levels of introgression in the genus Diphasiastrum, a taxonomically challenging group of Lycopodiophytes, using flow cytometry and numerical and geometric morphometric analyses. Patterns of morphological and cytological variation were evaluated in an extensive dataset of 561 individuals from 57 populations of six taxa from Central Europe, the region with the largest known taxonomic complexity. In addition, genome size values of 63 individuals from Northern Europe were acquired for comparative purposes. Within Central European populations, we detected a continuous pattern in both morphological variation and genome size (strongly correlated together) suggesting extensive levels of interspecific gene flow within this region, including several large hybrid swarm populations. The secondary character of habitats of Central European hybrid swarm populations suggests that man-made landscape changes might have enhanced unnatural contact of species, resulting in extensive hybridization within this area. On the contrary, a distinct pattern of genome size variation among individuals from other parts of Europe indicates that pure populations prevail outside Central Europe. All in all, introgressive hybridization among Diphasiastrum species in Central Europe represents a unique case of extensive interspecific gene flow among spore producing vascular plants that cause serious complications of taxa delimitation. PMID:24932509

  19. Evolutionary and biotechnology implications of plastid genome variation in the inverted-repeat-lacking clade of legumes.

    PubMed

    Sabir, Jamal; Schwarz, Erika; Ellison, Nicholas; Zhang, Jin; Baeshen, Nabih A; Mutwakil, Muhammed; Jansen, Robert; Ruhlman, Tracey

    2014-08-01

    Land plant plastid genomes (plastomes) provide a tractable model for evolutionary study in that they are relatively compact and gene dense. Among the groups that display an appropriate level of variation for structural features, the inverted-repeat-lacking clade (IRLC) of papilionoid legumes presents the potential to advance general understanding of the mechanisms of genomic evolution. Here, six complete plastome sequences are presented from economically important species of the IRLC, a lineage previously represented by only five completed plastomes. A number of characters are compared across the IRLC including gene retention and divergence, synteny, repeat structure and functional gene transfer to the nucleus. The loss of clpP intron 2 was identified in one newly sequenced member of IRLC, Glycyrrhiza glabra. Deeply sequenced nuclear transcriptomes from two species helped clarify the nature of the functional transfer of accD to the nucleus in Trifolium, which likely occurred in the lineage leading to subgenus Trifolium. Legumes are second only to cereal crops in agricultural importance based on area harvested and total production. Genetic improvement via plastid transformation of IRLC crop species is an appealing proposition. Comparative analyses of intergenic spacer regions emphasize the need for complete genome sequences for developing transformation vectors for plastid genetic engineering of legume crops. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  20. Complete mitochondrial genomes of Trisidos kiyoni and Potiarca pilula: Varied mitochondrial genome size and highly rearranged gene order in Arcidae

    PubMed Central

    Sun, Shao’e; Li, Qi; Kong, Lingfeng; Yu, Hong

    2016-01-01

    We present the complete mitochondrial genomes (mitogenomes) of Trisidos kiyoni and Potiarca pilula, both important species from the family Arcidae (Arcoida: Arcacea). Typical bivalve mtDNA features were described, such as the relatively conserved gene number (36 and 37), a high A + T content (62.73% and 61.16%), the preference for A + T-rich codons, and the evidence of non-optimal codon usage. The mitogenomes of Arcidae species are exceptional for their extraordinarily large and variable sizes and substantial gene rearrangements. The mitogenomes of T. kiyoni (19,614 bp) and P. pilula (28,470 bp) are the two smallest Arcidae mitogenomes. The compact mitogenomes are weakly associated with gene number and primarily reflect shrinkage of the non-coding regions. The varied size of Arcidae mitogenomes reflects a dynamic history of expansion. A significant positive correlation is observed between mitogenome size and the combined length of cox1-3, the lengths of Cytb, and the combined length of rRNAs (rrnS and rrnL) (P < 0.001). Rearrangements of both protein coding genes (PCGs) and tRNAs are observed in the P. pilula and T. kiyoni mitogenomes. This analysis implies that the complicated gene rearrangement in the mitochondrial genome could be considered one of the key characters in inferring higher-level phylogenetic relationships of Arcidae. PMID:27653979

  1. Genome-wide molecular dissection of serotype M3 group A Streptococcus strains causing two epidemics of invasive infections.

    PubMed

    Beres, Stephen B; Sylva, Gail L; Sturdevant, Daniel E; Granville, Chanel N; Liu, Mengyao; Ricklefs, Stacy M; Whitney, Adeline R; Parkins, Larye D; Hoe, Nancy P; Adams, Gerald J; Low, Donald E; DeLeo, Frank R; McGeer, Allison; Musser, James M

    2004-08-10

    Molecular factors that contribute to the emergence of new virulent bacterial subclones and epidemics are poorly understood. We hypothesized that analysis of a population-based strain sample of serotype M3 group A Streptococcus (GAS) recovered from patients with invasive infection by using genome-wide investigative methods would provide new insight into this fundamental infectious disease problem. Serotype M3 GAS strains (n = 255) cultured from patients in Ontario, Canada, over 11 years and representing two distinct infection peaks were studied. Genetic diversity was indexed by pulsed-field gel electrophoresis, DNA-DNA microarray, whole-genome PCR scanning, prophage genotyping, targeted gene sequencing, and single-nucleotide polymorphism genotyping. All variation in gene content was attributable to acquisition or loss of prophages, a molecular process that generated unique combinations of proven or putative virulence genes. Distinct serotype M3 genotypes experienced rapid population expansion and caused infections that differed significantly in character and severity. Molecular genetic analysis, combined with immunologic studies, implicated a 4-aa duplication in the extreme N terminus of M protein as a factor contributing to an epidemic wave of serotype M3 invasive infections. This finding has implications for GAS vaccine research. Genome-wide analysis of population-based strain samples cultured from clinically well defined patients is crucial for understanding the molecular events underlying bacterial epidemics.

  2. The complete mitochondrial genome of the onychophoran Epiperipatus biolleyi reveals a unique transfer RNA set and provides further support for the ecdysozoa hypothesis.

    PubMed

    Podsiadlowski, Lars; Braband, Anke; Mayer, Georg

    2008-01-01

    Onychophora (velvet worms) play a crucial role in current discussions on position of arthropods. The ongoing Articulata/Ecdysozoa debate is in need of additional ground pattern characters for Panarthropoda (Arthropoda, Tardigrada, and Onychophora). Hence, Onychophora is an important outgroup taxon in resolving the relationships among arthropods, irrespective of whether morphological or molecular data are used. To date, there has been a noticeable lack of mitochondrial genome data from onychophorans. Here, we present the first complete mitochondrial genome sequence of an onychophoran, Epiperipatus biolleyi (Peripatidae), which shows several characteristic features. Specifically, the gene order is considerably different from that in other arthropods and other bilaterians. In addition, there is a lack of 9 tRNA genes usually present in bilaterian mitochondrial genomes. All these missing tRNAs have anticodon sequences corresponding to 4-fold degenerate codons, whereas the persisting 13 tRNAs all have anticodons pairing with 2-fold degenerate codons. Sequence-based phylogenetic analysis of the mitochondrial protein-coding genes provides a robust support for a clade consisting of Onychophora, Priapulida, and Arthropoda, which confirms the Ecdysozoa hypothesis. However, resolution of the internal ecdysozoan relationships suffers from a cluster of long-branching taxa (including Nematoda and Platyhelminthes) and a lack of data from Tardigrada and further nemathelminth taxa in addition to nematodes and priapulids.

  3. Genome-wide association study using deregressed breeding values for cryptorchidism and scrotal/inguinal hernia in two pig lines.

    PubMed

    Sevillano, Claudia A; Lopes, Marcos S; Harlizius, Barbara; Hanenberg, Egiel H A T; Knol, Egbert F; Bastiaansen, John W M

    2015-03-21

    Cryptorchidism and scrotal/inguinal hernia are the most frequent congenital defects in pigs. Identification of genomic regions that control these congenital defects is of great interest to breeding programs, both from an animal welfare point of view as well as for economic reasons. The aim of this genome-wide association study (GWAS) was to identify single nucleotide polymorphisms (SNPs) that are strongly associated with these congenital defects. Genotypes were available for 2570 Large White (LW) and 2272 Landrace (LR) pigs. Breeding values were estimated based on 1 359 765 purebred and crossbred male offspring, using a binary trait animal model. Estimated breeding values were deregressed (DEBV) and taken as the response variable in the GWAS. Heritability estimates were equal to 0.26 ± 0.02 for cryptorchidism and to 0.31 ± 0.01 for scrotal/inguinal hernia. Seven and 31 distinct QTL regions were associated with cryptorchidism in the LW and LR datasets, respectively. The top SNP per region explained between 0.96% and 1.10% and between 0.48% and 2.77% of the total variance of cryptorchidism incidence in the LW and LR populations, respectively. Five distinct QTL regions associated with scrotal/inguinal hernia were detected in both LW and LR datasets. The top SNP per region explained between 1.22% and 1.60% and between 1.15% and 1.46% of the total variance of scrotal/inguinal hernia incidence in the LW and LR populations, respectively. For each trait, we identified one overlapping region between the LW and LR datasets, i.e. a region on SSC8 (Sus scrofa chromosome) between 65 and 73 Mb for cryptorchidism and a region on SSC13 between 34 and 37 Mb for scrotal/inguinal hernia. The use of DEBV in combination with a binary trait model was a powerful approach to detect regions associated with difficult traits such as cryptorchidism and scrotal/inguinal hernia that have a low incidence and for which affected animals are generally not available for genotyping. 
    Several novel QTL regions were detected for cryptorchidism and scrotal/inguinal hernia, and for several previously known QTL regions, the confidence interval was narrowed down.

  4. SNPassoc: an R package to perform whole genome association studies.

    PubMed

    González, Juan R; Armengol, Lluís; Solé, Xavier; Guinó, Elisabet; Mercader, Josep M; Estivill, Xavier; Moreno, Víctor

    2007-03-01

    The popularization of large-scale genotyping projects has led to the widespread adoption of genetic association studies as the tool of choice in the search for single nucleotide polymorphisms (SNPs) underlying susceptibility to complex diseases. Although the analysis of individual SNPs is a relatively trivial task, when the number is large and multiple genetic models need to be explored, a tool to automate the analyses becomes necessary. To address this issue, we developed SNPassoc, an R package that carries out the most common analyses in whole genome association studies. These analyses include descriptive statistics and exploratory analysis of missing values, calculation of Hardy-Weinberg equilibrium, analysis of association based on generalized linear models (either for quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). Package SNPassoc is available at CRAN from http://cran.r-project.org. A tutorial is available on Bioinformatics online and at http://davinci.crg.es/estivill_lab/snpassoc.
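
    The Hardy-Weinberg step mentioned in this abstract can be illustrated with a minimal sketch. It is not SNPassoc's own routine (the package is written in R and its internals are not described here); the function name `hwe_chi_square` and the choice of a Pearson chi-square test are assumptions made for illustration. Given genotype counts for one biallelic SNP, the sketch estimates the allele frequency, derives the expected genotype counts, and compares them with the observed counts.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic (1 degree of freedom) for deviation
    from Hardy-Weinberg proportions at one biallelic SNP.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Frequency of allele A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Expected counts under Hardy-Weinberg equilibrium: p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)

    # Classes with an expected count of zero (monomorphic SNP) are skipped.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

    A statistic above roughly 3.84 would indicate deviation from equilibrium at the 5% level with one degree of freedom; for small samples or rare alleles an exact test is usually preferred over this approximation.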

  5. Gene Expression Dynamics Inspector (GEDI): for integrative analysis of expression profiles

    NASA Technical Reports Server (NTRS)

    Eichler, Gabriel S.; Huang, Sui; Ingber, Donald E.

    2003-01-01

    Genome-wide expression profiles contain global patterns that evade visual detection in current gene clustering analysis. Here, a Gene Expression Dynamics Inspector (GEDI) is described that uses self-organizing maps to translate high-dimensional expression profiles of time courses or sample classes into animated, coherent and robust mosaic images. GEDI facilitates identification of interesting patterns of molecular activity simultaneously across gene, time and sample space without prior assumption of any structure in the data, and then permits the user to retrieve genes of interest. Important changes in genome-wide activities may be quickly identified based on 'Gestalt' recognition and hence, GEDI may be especially useful for non-specialist end users, such as physicians. AVAILABILITY: GEDI v1.0 is written in Matlab, and binary Matlab.dll files which require Matlab to run can be downloaded for free by academic institutions at http://www.chip.org/ge/gedihome.html. Supplementary information: http://www.chip.org/ge/gedihome.html.

  6. Sorting protein lists with nwCompare: a simple and fast algorithm for n-way comparison of proteomic data files.

    PubMed

    Pont, Frédéric; Fournié, Jean Jacques

    2010-03-01

    MS, the reference technology for proteomics, routinely produces large numbers of protein lists whose fast comparison would prove very useful. Unfortunately, most software only allows comparisons of two to three lists at once. We introduce here nwCompare, a simple tool for n-way comparison of several protein lists without any query language, and exemplify its use with differential and shared cancer cell proteomes. As the software compares character strings, it can be applied to any type of data mining, such as genomic or metabolomic datalists.
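
    The idea of an n-way comparison of string lists can be sketched as follows. This is not the nwCompare algorithm itself, which the abstract does not describe; the function name `n_way_compare` and the grouping by membership pattern are assumptions chosen for illustration. Each identifier is assigned to the exact set of lists that contain it, so shared and list-specific entries fall out of a single pass.

```python
from collections import defaultdict


def n_way_compare(lists):
    """Group identifiers by the exact set of lists in which they occur.

    `lists` maps a list name to an iterable of identifier strings.
    Returns a dict: frozenset of list names -> sorted identifiers.
    """
    # Record, for every identifier, which lists contain it.
    membership = defaultdict(set)
    for name, items in lists.items():
        for item in items:
            membership[item].add(name)

    # Invert the mapping: one group per distinct membership pattern.
    groups = defaultdict(list)
    for item, names in membership.items():
        groups[frozenset(names)].append(item)

    return {pattern: sorted(items) for pattern, items in groups.items()}


# Hypothetical protein lists from three samples.
example = {
    "A": ["P1", "P2", "P3"],
    "B": ["P2", "P3"],
    "C": ["P3", "P4"],
}
result = n_way_compare(example)
```

    Because only string equality is used, the same sketch applies to gene or metabolite identifiers; the cost grows linearly with the total number of entries rather than with the number of list pairs.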

  7. Complete Genome Sequence of the Filamentous Anoxygenic Phototrophic Bacterium Chloroflexus aurantiacus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang, Kuo-Hsiang; Barry, Kerrie; Chertkov, Olga

    Chloroflexus aurantiacus is a thermophilic filamentous anoxygenic phototrophic (FAP) bacterium, and can grow phototrophically under anaerobic conditions or chemotrophically under aerobic and dark conditions. According to 16S rRNA analysis, Chloroflexi species are the earliest branching bacteria capable of photosynthesis, and Cfl. aurantiacus has been long regarded as a key organism to resolve the obscurity of the origin and early evolution of photosynthesis. Cfl. aurantiacus contains a chimeric photosystem that comprises some characters of green sulfur bacteria and purple photosynthetic bacteria, and also has some unique electron transport proteins compared to other photosynthetic bacteria.

  8. Long-Term Circulation of Vaccine-Derived Poliovirus That Causes Paralytic Disease

    PubMed Central

    Cherkasova, Elena A.; Korotkova, Ekaterina A.; Yakovenko, Maria L.; Ivanova, Olga E.; Eremeeva, Tatyana P.; Chumakov, Konstantin M.; Agol, Vadim I.

    2002-01-01

    Successful implementation of the global poliomyelitis eradication program raises the problem of vaccination against poliomyelitis in the posteradication era. One of the options under consideration envisions completely stopping worldwide the use of the Sabin vaccine. This strategy is based on the assumption that the natural circulation of attenuated strains and their derivatives is strictly limited. Here, we report the characterization of a highly evolved derivative of the Sabin vaccine strain isolated in a case of paralytic poliomyelitis from a 7-month-old immunocompetent baby in an apparently adequately immunized population. Analysis of the genome of this isolate showed that it is a double (type 1-type 2-type 1) vaccine-derived recombinant. The number of mutations accumulated in both the type 1-derived and type 2-derived portions of the recombinant genome suggests that both had diverged from their vaccine predecessors ∼2 years before the onset of the illness. This fact, along with other recent observations, points to the possibility of long-term circulation of Sabin vaccine strain derivatives associated with an increase in their neurovirulence. Comparison of genomic sequences of this and other evolved vaccine-derived isolates reveals some general features of natural poliovirus evolution. They include a very high preponderance and nonrandom distribution of synonymous substitutions, conservation of secondary structures of important cis-acting elements of the genome, and an apparently adaptive character of most of the amino acid mutations, with only a few of them occurring in the antigenic determinants. Another interesting feature is a frequent occurrence of tripartite intertypic recombinants with either type 1 or type 3 homotypic genomic ends. PMID:12050392

  9. The clc Element of Pseudomonas sp. Strain B13, a Genomic Island with Various Catabolic Properties

    PubMed Central

    Gaillard, Muriel; Vallaeys, Tatiana; Vorhölter, Frank Jörg; Minoia, Marco; Werlen, Christoph; Sentchilo, Vladimir; Pühler, Alfred; van der Meer, Jan Roelof

    2006-01-01

    Pseudomonas sp. strain B13 is a bacterium known to degrade chloroaromatic compounds. The properties to use 3- and 4-chlorocatechol are determined by a self-transferable DNA element, the clc element, which normally resides at two locations in the cell's chromosome. Here we report the complete nucleotide sequence of the clc element, demonstrating the unique catabolic properties while showing its relatedness to genomic islands and integrative and conjugative elements rather than to other known catabolic plasmids. As far as catabolic functions, the clc element harbored, in addition to the genes for chlorocatechol degradation, a complete functional operon for 2-aminophenol degradation and genes for a putative aromatic compound transport protein and for a multicomponent aromatic ring dioxygenase similar to anthranilate hydroxylase. The genes for catabolic functions were inducible under various conditions, suggesting a network of catabolic pathway induction. For about half of the open reading frames (ORFs) on the clc element, no clear functional prediction could be given, although some indications were found for functions that were similar to plasmid conjugation. The region in which these ORFs were situated displayed a high overall conservation of nucleotide sequence and gene order to genomic regions in other recently completed bacterial genomes or to other genomic islands. Most notably, except for two discrete regions, the clc element was almost 100% identical over the whole length to a chromosomal region in Burkholderia xenovorans LB400. This indicates the dynamic evolution of this type of element and the continued transition between elements with a more pathogenic character and those with catabolic properties. PMID:16484212

  10. Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences.

    PubMed

    Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

    2016-07-12

    Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  11. Electric Field Induced Interfacial Instabilities

    NASA Technical Reports Server (NTRS)

    Kusner, Robert E.; Min, Kyung Yang; Wu, Xiao-lun; Onuki, Akira

    1999-01-01

    The study of the interface in a charge-free, critical and near-critical binary fluid in the presence of an externally applied electric field is presented. At sufficiently large fields, the interface between the two phases of the binary fluid should become unstable and exhibit an undulation with a predefined wavelength on the order of the capillary length. As the critical point is approached, this wavelength is reduced, potentially approaching length-scales such as the correlation length or critical nucleation radius. At this point the critical properties of the system may be affected. In this paper, the flat interface of a marginally polar binary fluid mixture is stressed by a perpendicular alternating electric field and the resulting instability is characterized by the critical electric field $E_c$ and the pattern observed. The character of the surface dynamics at the onset of instability is found to be strongly dependent on the frequency $f$ of the field applied. The plot of $E_c$ vs. $f$ for a fixed temperature shows a sigmoidal shape, whose low and high frequency limits are well described by a power-law relationship, $E_c = \epsilon^{\zeta}$ with $\zeta = 0.35$ and $\zeta = 0.08$, respectively. The low-limit exponent compares well with the value $\zeta = 4$ for a system of conducting and non-conducting fluids. On the other hand, the high-limit exponent coincides with what was first predicted by Onuki. The instability manifests itself as the conducting phase penetrates the non-conducting phase. As the frequency increases, the shape of the pattern changes from an array of bifurcating strings to an array of column-like (or rod-like) protrusions, each of which spans the space between the plane interface and one of the electrodes. For an extremely high frequency, the disturbance quickly grows into a parabolic cone pointing toward the upper plate. As a result, the interface itself changes its shape from that of a plane to that of a high sloping pyramid.

  12. Cross-cultural invariance of NPI-13: Entitlement as culturally specific, leadership and grandiosity as culturally universal.

    PubMed

    Żemojtel-Piotrowska, Magdalena; Piotrowski, Jarosław; Rogoza, Radosław; Baran, Tomasz; Hitokoto, Hidefumi; Maltby, John

    2018-04-15

    The current study explores the problem of the lack of measurement invariance for the Narcissistic Personality Inventory (NPI) by addressing two issues: conceptual heterogeneity of narcissism and methodological issues related to the binary character of data. We examine the measurement invariance of the 13-item version of the NPI in three populations in Japan, Poland and the UK. Analyses revealed that leadership/authority and grandiose exhibitionism dimensions of the NPI were cross-culturally invariant, while entitlement/exploitativeness was culturally specific. Therefore, we proposed NPI-9 as indicating scalar invariance, and we examined the pattern of correlations between NPI-9 and other variables across three countries. The results suggest that NPI-9 is a valid brief scale measuring general levels of narcissism in cross-cultural studies, while the NPI-13 remains suitable for research within specific countries. © 2018 International Union of Psychological Science.

  13. Digital image compression for a 2f multiplexing optical setup

    NASA Astrophysics Data System (ADS)

    Vargas, J.; Amaya, D.; Rueda, E.

    2016-07-01

    In this work a virtual 2f multiplexing system was implemented in combination with digital image compression techniques and redundant information elimination. Depending on the image type to be multiplexed, a memory-usage saving of as much as 99% was obtained. The feasibility of the system was tested using three types of images: binary characters, QR codes, and grey-level images. A multiplexing step was implemented digitally, while a demultiplexing step was implemented in a virtual 2f optical setup following real experimental parameters. To avoid cross-talk noise, each image was codified with a specially designed phase diffraction carrier that would allow the separation and relocation of the multiplexed images on the observation plane by simple light propagation. A description of the system is presented together with simulations that corroborate the method. The present work may allow future experimental implementations that will make use of all the parallel processing capabilities of optical systems.

  14. First-principles study of structural stability, electronic, optical and elastic properties of binary intermetallic: PtZr

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pagare, Gitanjali, E-mail: gita-pagare@yahoo.co.in; Jain, Ekta, E-mail: jainekta05@gmail.com; Sanyal, S. P., E-mail: sps.physicsbu@gmail.com

    2016-05-06

    Structural, electronic, optical and elastic properties of PtZr have been studied using the full-potential linearized augmented plane wave (FP-LAPW) method within density functional theory (DFT). The energy against volume and enthalpy vs. pressure variation in three different structures, i.e. B1, B2 and B3, for PtZr has been presented. The equilibrium lattice parameter, bulk modulus and its pressure derivative have been obtained using optimization method for all the three phases. Furthermore, electronic structure was discussed to reveal the metallic character of the present compound. The linear optical properties are also studied under zero pressure for the first time. Results on elastic properties are obtained using generalized gradient approximation (GGA) for exchange correlation potentials. Ductile nature of PtZr compound is predicted in accordance with Pugh’s criteria.

  15. Integrated system for automated financial document processing

    NASA Astrophysics Data System (ADS)

    Hassanein, Khaled S.; Wesolkowski, Slawo; Higgins, Ray; Crabtree, Ralph; Peng, Antai

    1997-02-01

    A system was developed that integrates intelligent document analysis with multiple character/numeral recognition engines in order to achieve high accuracy automated financial document processing. In this system, images are accepted in both their grayscale and binary formats. A document analysis module starts by extracting essential features from the document to help identify its type (e.g. personal check, business check, etc.). These features are also utilized to conduct a full analysis of the image to determine the location of interesting zones such as the courtesy amount and the legal amount. These fields are then made available to several recognition knowledge sources such as courtesy amount recognition engines and legal amount recognition engines through a blackboard architecture. This architecture allows all the available knowledge sources to contribute incrementally and opportunistically to the solution of the given recognition query. Performance results on a test set of machine printed business checks using the integrated system are also reported.

  16. ID card number detection algorithm based on convolutional neural network

    NASA Astrophysics Data System (ADS)

    Zhu, Jian; Ma, Hanjie; Feng, Jie; Dai, Leiyan

    2018-04-01

    In this paper, a new detection algorithm based on a convolutional neural network (CNN) is presented to enable fast and convenient extraction of ID information in multiple scenarios. The algorithm runs on a mobile device with the Android operating system to locate and extract the ID number. It exploits the distinctive color distribution of the ID card to select the appropriate channel component, then binarizes the image using threshold segmentation, noise processing and morphological processing. When the image is tilted, image rotation and the projection method are used for horizontal correction. Finally, single characters are extracted by the projection method and recognized with the CNN. Tests show that processing a single ID number image, from extraction to identification, takes about 80 ms with an accuracy of about 99%, so the method can be applied in practical production and everyday environments.
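
    The character-extraction step named above (the projection method) can be illustrated with a minimal sketch. It assumes the image has already been binarized into rows of 0/1 values with 1 marking ink; the function name and the toy image are hypothetical and are not taken from the paper.

```python
def segment_by_projection(binary_image):
    """Split a binary image (rows of 0/1, 1 = ink) into column ranges, one per character."""
    if not binary_image:
        return []
    width = len(binary_image[0])
    # Vertical projection: count the ink pixels in every column.
    projection = [sum(row[x] for row in binary_image) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(projection):
        if count > 0 and start is None:
            start = x                      # a character begins at the first inked column
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column closes the character
            start = None
    if start is not None:
        segments.append((start, width))    # character touching the right edge
    return segments


# Hypothetical 3x7 image containing two characters separated by blank columns.
image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1],
]
print(segment_by_projection(image))
```

    Each returned pair is a half-open column range that could be cropped and passed to a recognizer; touching characters would need extra handling that this sketch does not attempt.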

  17. Identification and growth characteristics of pink pigmented oxidative bacteria, Methylobacterium mesophilicum and biovars isolated from chlorinated and raw water supplies.

    PubMed

    O'Brien, J R; Murphy, J M

    1993-01-01

    Pink pigmented bacteria were isolated from a blood bank water purification unit, a municipal town water supply (tap water), and an island (untreated) ground water source. A total of thirteen strains including two reference strains of pink pigmented bacteria were compared in a numerical phenotypic study using 119 binary characters. Three clusters were derived, one major cluster of eleven strains was subdivided into two sub-clusters on the basis of methanol utilization. Five strains were facultative methylotrophs and were classified as Methylobacterium mesophilicum biovar 1. The other six strains did not utilize methanol, but on the basis of high phenotypic similarity of 83.6% were classified as M. mesophilicum biovar 2. The single reference strain comprising cluster 2 Pseudomonas extorquens NCIB 9399 was assigned to the genus Methylobacterium and classified as M. extorquens. Cluster 3 was the single reference strain Rhizobium CB 376.
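
    A phenotypic similarity percentage such as the 83.6% quoted above is computed from pairs of binary character vectors. The abstract does not say which coefficient was used, so the sketch below assumes the simple matching coefficient (shared states divided by characters compared); the function name and the strain vectors are hypothetical.

```python
def simple_matching(a, b):
    """Fraction of binary characters on which two strains agree (both 1 or both 0)."""
    if len(a) != len(b):
        raise ValueError("strains must be scored for the same characters")
    if not a:
        raise ValueError("at least one character is required")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


# Hypothetical strains scored for six binary characters (1 = positive, 0 = negative).
strain_a = [1, 0, 1, 1, 0, 1]
strain_b = [1, 0, 0, 1, 0, 1]
print(f"{100 * simple_matching(strain_a, strain_b):.1f}% similarity")
```

    Other coefficients (for example Jaccard, which ignores shared negatives) would give different values on the same data, so the choice of coefficient matters when clusters are compared.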

  18. Nuclear DNA amounts in angiosperms: progress, problems and prospects.

    PubMed

    Bennett, M D; Leitch, I J

    2005-01-01

    The nuclear DNA amount in an unreplicated haploid chromosome complement (1C-value) is a key diversity character with many uses. Angiosperm C-values have been listed for reference purposes since 1976, and pooled in an electronic database since 1997 (http://www.kew.org/cval/homepage). Such lists are cited frequently and provide data for many comparative studies. The last compilation was published in 2000, so a further supplementary list is timely to monitor progress against targets set at the first plant genome size workshop in 1997 and to facilitate new goal setting. The present work lists DNA C-values for 804 species including first values for 628 species from 88 original sources, not included in any previous compilation, plus additional values for 176 species included in a previous compilation. 1998-2002 saw striking progress in our knowledge of angiosperm C-values. At least 1700 first values for species were measured (the most in any five-year period) and familial representation rose from 30 % to 50 %. The loss of many densitometers used to measure DNA C-values proved less serious than feared, owing to the development of relatively inexpensive flow cytometers and computer-based image analysis systems. New uses of the term genome (e.g. in 'complete' genome sequencing) can cause confusion. The Arabidopsis Genome Initiative C-value for Arabidopsis thaliana (125 Mb) was a gross underestimate, and an exact C-value based on genome sequencing alone is unlikely to be obtained soon for any angiosperm. Lack of this expected benchmark poses a quandary as to what to use as the basal calibration standard for angiosperms. The next decade offers exciting prospects for angiosperm genome size research. The database (http://www.kew.org/cval/homepage) should become sufficiently representative of the global flora to answer most questions without needing new estimations. DNA amount variation will remain a key interest as an integrated strand of holistic genomics.

  19. Development of Brassica oleracea-nigra monosomic alien addition lines: genotypic, cytological and morphological analyses.

    PubMed

    Tan, Chen; Cui, Cheng; Xiang, Yi; Ge, Xianhong; Li, Zaiyun

    2017-12-01

    We report the development and characterization of Brassica oleracea - nigra monosomic alien addition lines (MAALs) to dissect the Brassica B genome. Brassica nigra (2n = 16, BB) represents the diploid Brassica B genome, which carries many useful genes and traits for breeding but has received limited study. To dissect the B genome from B. nigra, the triploid F1 hybrid (2n = 26, CCB) obtained previously from the cross B. oleracea var. alboglabra (2n = 18, CC) × B. nigra was used as the maternal parent and backcrossed successively to parental B. oleracea. The progenies in the BC1 to BC3 generations were analyzed by FISH and SSR markers to screen for MAALs with each of eight different B-genome chromosomes added to the C genome (2n = 19, CC + 1B1-8), and seven different MAALs were established, except for the one with chromosome B2, which existed in one triple addition. Most of these MAALs were distinguishable morphologically from each other, as they expressed the characters from B. nigra differently and to variable extents. The alien chromosome remained unpaired as a univalent in 86.24% of pollen mother cells at diakinesis or metaphase I, and formed a trivalent with two C-genome chromosomes in 13.76% of cells. Transmission frequency of all the added chromosomes was far higher through the ovules (14.40% on average) than through the pollen (2.64%). The B1, B4 and B5 chromosomes were transmitted by the female at much higher rates (22.38-30.00%) than the other four (B3, B6, B7, B8) (5.04-8.42%). The MAALs should be valuable for exploiting the genome structure and evolution of B. nigra.

  20. PwRn1, a novel Ty3/gypsy-like retrotransposon of Paragonimus westermani: molecular characters and its differentially preserved mobile potential according to host chromosomal polyploidy.

    PubMed

    Bae, Young-An; Ahn, Jong-Sook; Kim, Seon-Hee; Rhyu, Mun-Gan; Kong, Yoon; Cho, Seung-Yull

    2008-10-14

    Retrotransposons are known to be involved in the remodeling and evolution of host genomes. These reverse transcribing elements, which show a complex evolutionary pathway with diverse intermediate forms, have been comprehensively analyzed from a wide range of host genomes, while the information remains limited to only a few species in the phylum Platyhelminthes. An LTR retrotransposon and its homologs with a strong phylogenetic affinity toward CsRn1 of Clonorchis sinensis were isolated from a trematode parasite Paragonimus westermani via a degenerate PCR method and from an insect species Anopheles gambiae by in silico analysis of the whole mosquito genome, respectively. These elements, designated PwRn1 and AgCR-1 to AgCR-14, conserved unique features including a tRNA-Trp primer binding site and the unusual CHCC signature of Gag proteins. Their flanking LTRs displayed >97% nucleotide identities and thus, these elements were likely to have expanded recently in the trematode and insect genomes. They evolved heterogeneous expression strategies: a single fused ORF, two separate ORFs with an identical reading frame and two ORFs overlapped by -1 frameshifting. Phylogenetic analyses suggested that the elements with the separate ORFs had evolved from an ancestral form(s) with the overlapped ORFs. The mobile potential of PwRn1 was likely to be maintained differentially in association with the karyotype of host genomes, as was examined by the presence/absence of intergenomic polymorphism and mRNA transcripts. Our results on the structural diversity of CsRn1-like elements can provide a molecular tool to dissect a more detailed evolutionary episode of LTR retrotransposons. The PwRn1-associated genomic polymorphism, which is substantial in diploids, will also be informative in addressing genomic diversification following inter-/intra-specific hybridization in P. westermani populations.

  1. Chromosome Numbers and Genome Size Variation in Indian Species of Curcuma (Zingiberaceae)

    PubMed Central

    Leong-Škorničková, Jana; Šída, Otakar; Jarolímová, Vlasta; Sabu, Mamyil; Fér, Tomáš; Trávníček, Pavel; Suda, Jan

    2007-01-01

    Background and Aims Genome size and chromosome numbers are important cytological characters that significantly influence various organismal traits. However, geographical representation of these data is seriously unbalanced, with tropical and subtropical regions being largely neglected. In the present study, an investigation was made of chromosomal and genome size variation in the majority of Curcuma species from the Indian subcontinent, and an assessment was made of the value of these data for taxonomic purposes. Methods Genome size of 161 homogeneously cultivated plant samples classified into 51 taxonomic entities was determined by propidium iodide flow cytometry. Chromosome numbers were counted in actively growing root tips using conventional rapid squash techniques. Key Results Six different chromosome counts (2n = 22, 42, 63, >70, 77 and 105) were found, the last two representing new generic records. The 2C-values varied from 1·66 pg in C. vamana to 4·76 pg in C. oligantha, representing a 2·87-fold range. Three groups of taxa with significantly different homoploid genome sizes (Cx-values) and distinct geographical distribution were identified. Five species exhibited intraspecific variation in nuclear DNA content, reaching up to 15·1 % in cultivated C. longa. Chromosome counts and genome sizes of three Curcuma-like species (Hitchenia caulina, Kaempferia scaposa and Paracautleya bhatii) corresponded well with typical hexaploid (2n = 6x = 42) Curcuma spp. Conclusions The basic chromosome number in the majority of Indian taxa (belonging to subgenus Curcuma) is x = 7; published counts correspond to 6x, 9x, 11x, 12x and 15x ploidy levels. Only a few species-specific C-values were found, but karyological and/or flow cytometric data may support taxonomic decisions in some species alliances with morphological similarities. Close evolutionary relationships among some cytotypes are suggested based on the similarity in homoploid genome sizes and geographical grouping. A new species combination, Curcuma scaposa (Nimmo) Škorničk. & M. Sabu, comb. nov., is proposed. PMID:17686760
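
    The arithmetic linking the chromosome counts above to ploidy levels can be checked with a short sketch. It assumes only the basic number x = 7 and the counts and 2C-values quoted in the abstract; the function name is hypothetical.

```python
def ploidy_level(chromosome_count, base_number=7):
    """Return the ploidy level (count / x) when the count is an exact multiple of x, else None."""
    quotient, remainder = divmod(chromosome_count, base_number)
    return quotient if remainder == 0 else None


# Exact chromosome counts reported in the abstract (the ">70" class is omitted).
for count in (22, 42, 63, 77, 105):
    level = ploidy_level(count)
    label = f"{level}x" if level is not None else "not a multiple of x = 7"
    print(f"2n = {count}: {label}")

# Fold range between the largest and smallest reported 2C-values (in pg).
fold_range = 4.76 / 1.66
print(f"2C-value range: {fold_range:.2f}-fold")
```

    The count 2n = 22 is not a multiple of 7, which is consistent with the abstract restricting x = 7 to the majority of taxa rather than to all of them.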

  2. Evolution of early embryogenesis in rhabditid nematodes

    PubMed Central

    Brauchle, Michael; Kiontke, Karin; MacMenamin, Philip; Fitch, David H. A.; Piano, Fabio

    2009-01-01

    The cell biological events that guide early embryonic development occur with great precision within species but can be quite diverse across species. How these cellular processes evolve and which molecular components underlie evolutionary changes is poorly understood. To begin to address these questions, we systematically investigated early embryogenesis, from the one- to the four-cell embryo, in 34 nematode species related to C. elegans. We found 40 cell-biological characters that captured the phenotypic differences between these species. By tracing the evolutionary changes on a molecular phylogeny, we found that these characters evolved multiple times and independently of one another. Strikingly, all these phenotypes are mimicked by single-gene RNAi experiments in C. elegans. We use these comparisons to hypothesize the molecular mechanisms underlying the evolutionary changes. For example, we predict that a cell polarity module was altered during the evolution of the Protorhabditis group and show that PAR-1, a kinase localized asymmetrically in C. elegans early embryos, is symmetrically localized in the one-cell stage of Protorhabditis group species. Our genome-wide approach identifies candidate molecules—and thereby modules—associated with evolutionary changes in cell-biological phenotypes. PMID:19643102

  3. Heritability and Genome-Wide Association Studies for Hair Color in a Dutch Twin Family Based Sample

    PubMed Central

    Lin, Bochao Danae; Mbarek, Hamdi; Willemsen, Gonneke; Dolan, Conor V.; Fedko, Iryna O.; Abdellaoui, Abdel; de Geus, Eco J.; Boomsma, Dorret I.; Hottenga, Jouke-Jan

    2015-01-01

    Hair color is one of the most visible and heritable traits in humans. Here, we estimated heritability by structural equation modeling (N = 20,142), and performed a genome wide association (GWA) analysis (N = 7091) and a GCTA study (N = 3340) on hair color within a large cohort of twins, their parents and siblings from the Netherlands Twin Register (NTR). Self-reported hair color was analyzed as five binary phenotypes, namely “blond versus non-blond”, “red versus non-red”, “brown versus non-brown”, “black versus non-black”, and “light versus dark”. The broad-sense heritability of hair color was estimated between 73% and 99% and the genetic component included non-additive genetic variance. Assortative mating for hair color was significant, except for red and black hair color. From GCTA analyses, at most 24.6% of the additive genetic variance in hair color was explained by 1000G well-imputed SNPs. Genome-wide association analysis for each hair color showed that SNPs in the MC1R region were significantly associated with red, brown and black hair, and also with light versus dark hair color. Five other known genes (HERC2, TPCN2, SLC24A4, IRF4, and KITLG) gave genome-wide significant hits for blond, brown and light versus dark hair color. We did not find and replicate any new loci for hair color. PMID:26184321

  4. ESTimating plant phylogeny: lessons from partitioning

    PubMed Central

    de la Torre, Jose EB; Egan, Mary G; Katari, Manpreet S; Brenner, Eric D; Stevenson, Dennis W; Coruzzi, Gloria M; DeSalle, Rob

    2006-01-01

    Background While Expressed Sequence Tags (ESTs) have proven a viable and efficient way to sample genomes, particularly those for which whole-genome sequencing is impractical, phylogenetic analysis using ESTs remains difficult. Sequencing errors and orthology determination are the major problems when using ESTs as a source of characters for systematics. Here we develop methods to incorporate EST sequence information in a simultaneous analysis framework to address controversial phylogenetic questions regarding the relationships among the major groups of seed plants. We use an automated, phylogenetically derived approach to orthology determination called OrthologID to generate a phylogeny based on 43 process partitions, many of which are derived from ESTs, and examine several measures of support to assess the utility of EST data for phylogenies. Results A maximum parsimony (MP) analysis resulted in a single tree with relatively high support at all nodes in the tree despite rampant conflict among trees generated from the separate analysis of individual partitions. In a comparison of broader-scale groupings based on cellular compartment (i.e. chloroplast, mitochondrial or nuclear) or function, only the nuclear partition tree (based largely on EST data) was found to be topologically identical to the tree based on the simultaneous analysis of all data. Despite topological conflict among the broader-scale groupings examined, only the tree based on morphological data showed statistically significant differences. Conclusion Based on the amount of character support contributed by EST data, which make up a majority of the nuclear data set, and the lack of conflict of the nuclear data set with the simultaneous analysis tree, we conclude that the inclusion of EST data does provide a viable and efficient approach to address phylogenetic questions within a parsimony framework on a genomic scale, if problems of orthology determination and potential sequencing errors can be overcome. In addition, approaches that examine conflict and support in a simultaneous analysis framework allow for a more precise understanding of the evolutionary history of individual process partitions and may be a novel way to understand functional aspects of different kinds of cellular classes of gene products. PMID:16776834

  5. Identifying Pleiotropic Genes in Genome-Wide Association Studies for Multivariate Phenotypes with Mixed Measurement Scales

    PubMed Central

    Williams, L. Keoki; Buu, Anne

    2017-01-01

    We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of the Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains the power at the level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches–dichotomizing all observed phenotypes or treating them as continuous variables–could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not be, otherwise, chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
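
    The Fisher's combination statistic mentioned above is the quantity -2 times the sum of the natural logarithms of the per-phenotype p-values. The sketch below computes only that statistic; the function name and example p-values are hypothetical, and the latent response model and empirical null distribution described in the abstract are not reproduced here.

```python
import math


def fisher_combination(p_values):
    """Fisher's combination statistic: -2 * sum(ln p) over the marginal p-values."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    return -2.0 * sum(math.log(p) for p in p_values)


# Hypothetical marginal p-values for one marker tested against three phenotypes.
marginal_p = [0.04, 0.20, 0.01]
print(f"Fisher statistic: {fisher_combination(marginal_p):.3f}")
```

    With k independent tests the statistic follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. Phenotypes measured on the same individuals are generally correlated, which is why the abstract estimates the null distribution empirically instead of relying on that reference distribution.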

  6. Integrated rare variant-based risk gene prioritization in disease case-control sequencing studies.

    PubMed

    Lin, Jhih-Rong; Zhang, Quanwei; Cai, Ying; Morrow, Bernice E; Zhang, Zhengdong D

    2017-12-01

    Rare variants of major effect play an important role in human complex diseases and can be discovered by sequencing-based genome-wide association studies. Here, we introduce an integrated approach that combines the rare variant association test with gene network and phenotype information to identify risk genes implicated by rare variants for human complex diseases. Our data integration method follows a 'discovery-driven' strategy without relying on prior knowledge about the disease and thus maintains the unbiased character of genome-wide association studies. Simulations reveal that our method can outperform a widely-used rare variant association test method by 2 to 3 times. In a case study of a small disease cohort, we uncovered putative risk genes and the corresponding rare variants that may act as genetic modifiers of congenital heart disease in 22q11.2 deletion syndrome patients. These variants were missed by a conventional approach that relied on the rare variant association test alone.

  7. Why individual thermo sensation and pain perception varies? Clue of disruptive mutations in TRPVs from 2504 human genome data.

    PubMed

    Ghosh, Arijit; Kaur, Navneet; Kumar, Abhishek; Goswami, Chandan

    2016-09-02

    Every individual varies in character and so do their sensory functions and perceptions. The molecular mechanism and the molecular candidates involved in these processes are assumed to be similar if not the same. So far, several molecular factors have been identified which are fairly conserved across the phylogenetic tree and are involved in these complex sensory functions. Among all, members belonging to Transient Receptor Potential (TRP) channels have been widely characterized for their involvement in thermo-sensation. These include TRPV1 to TRPV4 channels, which reveal complex thermo-gating behavior in response to changes in temperature. The molecular evolution of these channels is highly correlative with the thermal response of different species. However, recent data from 2504 human genomes suggest that these thermo-sensitive TRPV channels are highly variable and carry possible deleterious mutations in the human population. These unexpected findings may explain the individual differences in terms of complex sensory functions.

  8. Next-generation sequencing of the yellowfin tuna mitochondrial genome reveals novel phylogenetic relationships within the genus Thunnus.

    PubMed

    Guo, Liang; Li, Mingming; Zhang, Heng; Yang, Sen; Chen, Xinghan; Meng, Zining; Lin, Haoran

    2016-05-01

    Recently, next-generation sequencing (NGS) technology has become a powerful tool for sequencing the teleost mitochondrial genome (mitogenome). Here, we used this technology to determine the mitogenome of the yellowfin tuna (Thunnus albacares). A total of 41,378 reads were generated by the Illumina platform with an average depth of 250×. The mitogenome (16,528 bp in length) contained 37 mitochondrial genes with a gene order similar to that of other typical teleosts. These mitochondrial genes were encoded on the heavy strand except for ND6 and eight tRNA genes. The result of phylogenetic analysis supported two distinct clades dividing the genus Thunnus, but the tuna species in these two genetic clades differed from those of the two recognized subgenera based on anatomical characters and geographical distribution. Our results might help to understand the structure, function, and evolutionary history of the yellowfin tuna mitogenome and also provide valuable new insights into the phylogenetic affinity of tuna species.

  9. Missing Data and Influential Sites: Choice of Sites for Phylogenetic Analysis Can Be As Important As Taxon Sampling and Model Choice

    PubMed Central

    Shavit Grievink, Liat; Penny, David; Holland, Barbara R.

    2013-01-01

    Phylogenetic studies based on molecular sequence alignments are expected to become more accurate as the number of sites in the alignments increases. With the advent of genomic-scale data, where alignments have very large numbers of sites, bootstrap values close to 100% and posterior probabilities close to 1 are the norm, suggesting that the number of sites is now seldom a limiting factor on phylogenetic accuracy. This provokes the question, should we be fussy about the sites we choose to include in a genomic-scale phylogenetic analysis? If some sites contain missing data, ambiguous character states, or gaps, then why not just throw them away before conducting the phylogenetic analysis? Indeed, this is exactly the approach taken in many phylogenetic studies. Here, we present an example where the decision on how to treat sites with missing data is of equal importance to decisions on taxon sampling and model choice, and we introduce a graphical method for illustrating this. PMID:23471508

  10. Unusual DNA Structures Associated With Germline Genetic Activity in Caenorhabditis elegans

    PubMed Central

    Fire, Andrew; Alcazar, Rosa; Tan, Frederick

    2006-01-01

    We describe a surprising long-range periodicity that underlies a substantial fraction of C. elegans genomic sequence. Extended segments (up to several hundred nucleotides) of the C. elegans genome show a strong bias toward occurrence of AA/TT dinucleotides along one face of the helix while little or no such constraint is evident on the opposite helical face. Segments with this characteristic periodicity are highly overrepresented in intron sequences and are associated with a large fraction of genes with known germline expression in C. elegans. In addition to altering the path and flexibility of DNA in vitro, sequences of this character have been shown by others to constrain DNA∷nucleosome interactions, potentially producing a structure that could resist the assembly of highly ordered (phased) nucleosome arrays that have been proposed as a precursor to heterochromatin. We propose a number of ways that the periodic occurrence of An/Tn clusters could reflect evolution and function of genes that express in the germ cell lineage of C. elegans. PMID:16648589

  11. Prediction of whole-genome risk for selection and management of hyperketonemia in Holstein dairy cattle.

    PubMed

    Weigel, K A; Pralle, R S; Adams, H; Cho, K; Do, C; White, H M

    2017-06-01

    Hyperketonemia (HYK), a common early postpartum health disorder characterized by elevated blood concentrations of β-hydroxybutyrate (BHB), affects millions of dairy cows worldwide and leads to significant economic losses and animal welfare concerns. In this study, blood concentrations of BHB were assessed for 1,453 Holstein cows using electronic handheld meters at four time points between 5 and 18 days postpartum. Incidence rates of subclinical (1.2 ≤ maximum BHB ≤ 2.9 mmol/L) and clinical ketosis (maximum BHB ≥ 3.0 mmol/L) were 24.0 and 2.4%, respectively. Variance components, estimated breeding values, and predicted HYK phenotypes were computed on the original, square-root, and binary scales. Heritability estimates for HYK ranged from 0.058 to 0.072 in pedigree-based analyses, as compared to estimates that ranged from 0.071 to 0.093 when pedigrees were augmented with 60,671 single nucleotide polymorphism genotypes of 959 cows and 801 male ancestors. On average, predicted HYK phenotypes from the genome-enhanced analysis ranged from 0.55 mmol/L for first-parity cows in the best contemporary group to 1.40 mmol/L for fourth-parity cows in the worst contemporary group. Genome-enhanced predictions of HYK phenotypes were more closely associated with actual phenotypes than pedigree-based predictions in five-fold cross-validation, and transforming phenotypes to reduce skewness and kurtosis also improved predictive ability. This study demonstrates the feasibility of using repeated cowside measurement of blood BHB concentration in early lactation to construct a reference population that can be used to estimate HYK breeding values for genomic selection programmes and predict HYK phenotypes for genome-guided management decisions. © 2017 Blackwell Verlag GmbH.
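
    The binary-scale phenotypes above derive from thresholds on each cow's maximum BHB reading. A minimal sketch of that classification follows, using the cut-offs quoted in the abstract; the function name, the "normal" label and the example readings are hypothetical, and readings are assumed to be reported to one decimal place so that no value falls between 2.9 and 3.0 mmol/L.

```python
def classify_hyk(bhb_readings):
    """Classify a cow from its repeated blood BHB readings (mmol/L) using the maximum value."""
    if not bhb_readings:
        raise ValueError("at least one BHB reading is required")
    peak = max(bhb_readings)
    if peak >= 3.0:
        return "clinical"        # maximum BHB >= 3.0 mmol/L
    if peak >= 1.2:
        return "subclinical"     # 1.2 <= maximum BHB <= 2.9 mmol/L
    return "normal"              # hypothetical label for cows below both thresholds


# Hypothetical readings at four time points between 5 and 18 days postpartum.
print(classify_hyk([0.6, 1.3, 0.9, 1.0]))
print(classify_hyk([0.8, 3.4, 2.1, 1.5]))
print(classify_hyk([0.5, 0.7, 0.6, 0.9]))
```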

  12. Initial implementation of a comparative data analysis ontology.

    PubMed

    Prosdocimi, Francisco; Chisham, Brandon; Pontelli, Enrico; Thompson, Julie D; Stoltzfus, Arlin

    2009-07-03

    Comparative analysis is used throughout biology. When entities under comparison (e.g. proteins, genomes, species) are related by descent, evolutionary theory provides a framework that, in principle, allows N-ary comparisons of entities, while controlling for non-independence due to relatedness. Powerful software tools exist for specialized applications of this approach, yet it remains under-utilized in the absence of a unifying informatics infrastructure. A key step in developing such an infrastructure is the definition of a formal ontology. The analysis of use cases and existing formalisms suggests that a significant component of evolutionary analysis involves a core problem of inferring a character history, relying on key concepts: "Operational Taxonomic Units" (OTUs), representing the entities to be compared; "character-state data" representing the observations compared among OTUs; "phylogenetic tree", representing the historical path of evolution among the entities; and "transitions", the inferred evolutionary changes in states of characters that account for observations. Using the Web Ontology Language (OWL), we have defined these and other fundamental concepts in a Comparative Data Analysis Ontology (CDAO). CDAO has been evaluated for its ability to represent token data sets and to support simple forms of reasoning. With further development, CDAO will provide a basis for tools (for semantic transformation, data retrieval, validation, integration, etc.) that make it easier for software developers and biomedical researchers to apply evolutionary methods of inference to diverse types of data, so as to integrate this powerful framework for reasoning into their research.

  13. Eosinophil Activities Modulate the Immune/Inflammatory Character of Allergic Respiratory Responses in Mice

    PubMed Central

    Jacobsen, Elizabeth A.; LeSuer, William E.; Willetts, Lian; Zellner, Katie R.; Mazzolini, Kirea; Antonios, Nathalie; Beck, Brandon; Protheroe, Cheryl; Ochkur, Sergei I.; Colbert, Dana; Lacy, Paige; Moqbel, Redwan; Appleton, Judith; Lee, Nancy A.; Lee, James J.

    2014-01-01

    Background The importance and specific role(s) of eosinophils in modulating the immune/inflammatory phenotype of allergic pulmonary disease remain to be defined. Established animal models assessing the role(s) of eosinophils as contributors and/or causative agents of disease have relied on congenitally deficient mice where the developmental consequences of eosinophil depletion are unknown. Methods We developed a novel conditional eosinophil-deficient strain of mice (iPHIL) through a gene knock-in strategy inserting the human diphtheria toxin (DT) receptor (DTR) into the endogenous eosinophil peroxidase genomic locus. Results Expression of DTR rendered resistant mouse eosinophil progenitors sensitive to DT without affecting any other cell types. The presence of eosinophils was shown to be unnecessary during the sensitization phase of either ovalbumin (OVA) or house dust mite (HDM) acute asthma models. However, eosinophil ablation during airway challenge led to a predominantly neutrophilic phenotype (>15% neutrophils) accompanied by allergen-induced histopathologies and airway hyperresponsiveness in response to methacholine indistinguishable from eosinophilic wild type mice. Moreover, the iPHIL neutrophilic airway phenotype was shown to be a steroid-resistant allergic respiratory variant that was reversible upon restoration of peripheral eosinophils. Conclusions Eosinophil contributions to allergic immune/inflammatory responses appear to be limited to the airway challenge and not the sensitization phase of allergen provocation models. The reversible steroid-resistant character of the iPHIL neutrophilic airway variant suggests underappreciated mechanisms by which eosinophils shape the character of allergic respiratory responses. PMID:24266710

  14. Bi-temporal analysis of landscape changes in the easternmost mediterranean deltas using binary and classified change information.

    PubMed

    Alphan, Hakan

    2013-03-01

    The aim of this study is (1) to quantify landscape changes in the easternmost Mediterranean deltas using a bi-temporal binary change detection approach and (2) to analyze relationships between conservation/management designations and various categories of change that indicate type, degree and severity of human impact. For this purpose, image differencing and ratioing were applied to Landsat TM images of 1984 and 2006. A total of 136 candidate change images including normalized difference vegetation index (NDVI) and principal component analysis (PCA) difference images were tested to understand the performance of bi-temporal pre-classification analysis procedures in the Mediterranean delta ecosystems. Results showed that visible image algebra provided higher accuracies than did NDVI and PCA differencing. On the other hand, Band 5 differencing had one of the lowest change detection performances. Seven superclasses of change were identified using from/to change categories between the earlier and later dates. These classes were used to understand the spatial character of anthropogenic impacts in the study area and derive qualitative and quantitative change information within and outside of the conservation/management areas. Change analysis indicated that natural site and wildlife reserve designations fell short of protecting sand dunes from agricultural expansion in the west. The east of the study area, however, was exposed to the least human impact owing to the fact that nature conservation status kept human interference at a minimum. Implications of these changes were discussed and solutions were proposed to deal with management problems leading to environmental change.
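
    As a rough illustration of the bi-temporal NDVI differencing mentioned in this abstract, the sketch below computes NDVI = (NIR - Red) / (NIR + Red) for two dates and flags a pixel as changed when the absolute difference exceeds a threshold. The 2x2 reflectance grids, the 0.2 threshold, and the function names are all hypothetical; the abstract does not state how the authors thresholded their difference images, so this is only one plausible reading.

```python
def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red); 0.0 where both bands are 0."""
    return [
        [(n - r) / (n + r) if (n + r) != 0 else 0.0 for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]


def binary_change(early, late, threshold):
    """Mark a pixel as changed (1) when |late - early| exceeds the threshold."""
    return [
        [1 if abs(b - a) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(early, late)
    ]


# Hypothetical 2x2 reflectance grids for the two image dates (not real data).
red_1984 = [[0.10, 0.10], [0.20, 0.30]]
nir_1984 = [[0.50, 0.50], [0.40, 0.30]]
red_2006 = [[0.10, 0.30], [0.20, 0.30]]
nir_2006 = [[0.50, 0.30], [0.40, 0.30]]

# Assumed threshold of 0.2 on the NDVI difference; a real study would tune this.
change_mask = binary_change(
    ndvi(nir_1984, red_1984), ndvi(nir_2006, red_2006), threshold=0.2
)
```

    Only the top-right pixel loses vegetation signal between the two dates in this toy input, so it is the only one flagged in the binary change mask.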

  15. Genomic regression of claw keratin, taste receptor and light-associated genes provides insights into biology and evolutionary origins of snakes.

    PubMed

    Emerling, Christopher A

    2017-10-01

    Regressive evolution of anatomical traits often corresponds with the regression of genomic loci underlying such characters. As such, studying patterns of gene loss can be instrumental in addressing questions of gene function, resolving conflicting results from anatomical studies, and understanding the evolutionary history of clades. The evolutionary origins of snakes involved the regression of a number of anatomical traits, including limbs, taste buds and the visual system, and by analyzing serpent genomes, I was able to test three hypotheses associated with the regression of these features. The first concerns two keratins that are putatively specific to claws. Both genes that encode these keratins are pseudogenized/deleted in snake genomes, providing additional evidence of claw-specificity. The second hypothesis is that snakes lack taste buds, an issue complicated by conflicting results in the literature. I found evidence that different snakes have lost one or more taste receptors, but all snakes examined retained at least one gustatory channel. The final hypothesis addressed is that the earliest snakes were adapted to a dim light niche. I found evidence of deleted and pseudogenized genes with light-associated functions in snakes, demonstrating a pattern of gene loss similar to other dim light-adapted clades. Molecular dating estimates suggest that dim light adaptation preceded the loss of limbs, providing some bearing on interpretations of the ecological origins of snakes. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Genome-wide scans for candidate genes involved in the aquatic adaptation of dolphins.

    PubMed

    Sun, Yan-Bo; Zhou, Wei-Ping; Liu, He-Qun; Irwin, David M; Shen, Yong-Yi; Zhang, Ya-Ping

    2013-01-01

    Since their divergence from the terrestrial artiodactyls, cetaceans have fully adapted to an aquatic lifestyle, which represents one of the most dramatic transformations in mammalian evolutionary history. Numerous morphological and physiological characters of cetaceans have been acquired in response to this drastic habitat transition, such as thickened blubber, echolocation, and ability to hold their breath for a long period of time. However, knowledge about the molecular basis underlying these adaptations is still limited. The sequence of the genome of Tursiops truncatus provides an opportunity for comparative genomic analyses to examine the molecular adaptation of this species. Here, we constructed 11,838 high-quality orthologous gene alignments culled from the dolphin and four other terrestrial mammalian genomes and screened for positive selection occurring in the dolphin lineage. In total, 368 (3.1%) of the genes were identified as having undergone positive selection by the branch-site model. Functional characterization of these genes showed that they are significantly enriched in the categories of lipid transport and localization, ATPase activity, sense perception of sound, and muscle contraction, areas that are potentially related to cetacean adaptations. In contrast, we did not find a similar pattern in the cow, a closely related species. We resequenced some of the positively selected sites (PSSs), within the positively selected genes, and showed that most of our identified PSSs (50/52) could be replicated. The results from this study should have important implications for our understanding of cetacean evolution and their adaptations to the aquatic environment.

  17. Phylogeny and mitochondrial gene order variation in Lophotrochozoa in the light of new mitogenomic data from Nemertea

    PubMed Central

    Podsiadlowski, Lars; Braband, Anke; Struck, Torsten H; von Döhren, Jörn; Bartolomaeus, Thomas

    2009-01-01

    Background The new animal phylogeny established several taxa which were not identified by morphological analyses, most prominently the Ecdysozoa (arthropods, roundworms, priapulids and others) and Lophotrochozoa (molluscs, annelids, brachiopods and others). Lophotrochozoan interrelationships are under discussion, e.g. regarding the position of Nemertea (ribbon worms), which were discussed to be sister group to e.g. Mollusca, Brachiozoa or Platyhelminthes. Mitochondrial genomes contributed well with sequence data and gene order characters to the deep metazoan phylogeny debate. Results In this study we present the first complete mitochondrial genome record for a member of the Nemertea, Lineus viridis. Except for two genes, trnP and trnT, all genes are located on the same strand. While gene order is most similar to that of the brachiopod Terebratulina retusa, sequence based analyses of mitochondrial genes place nemerteans close to molluscs, phoronids and entoprocts without clear preference for one of these taxa as sister group. Conclusion Almost all recent analyses with large datasets show good support for a taxon comprising Annelida, Mollusca, Brachiopoda, Phoronida and Nemertea. But the relationships among these taxa vary between different studies. The analysis of gene order differences gives evidence for a multiple independent occurrence of a large inversion in the mitochondrial genome of Lophotrochozoa and a re-inversion of the same part in gastropods. We hypothesize that some regions of the genome have a higher chance for intramolecular recombination than others and gene order data have to be analysed carefully to detect convergent rearrangement events. PMID:19660126

  18. Avian comparative genomics: reciprocal chromosome painting between domestic chicken (Gallus gallus) and the stone curlew (Burhinus oedicnemus, Charadriiformes)—An atypical species with low diploid number

    PubMed Central

    2009-01-01

    The chicken is the most extensively studied species in birds and thus constitutes an ideal reference for comparative genomics in birds. Comparative cytogenetic studies indicate that the chicken has retained many chromosome characters of the ancestral avian karyotype. The homology between chicken macrochromosomes (1–9 and Z) and their counterparts in more than 40 avian species of 10 different orders has been established by chromosome painting. However, the avian homologues of chicken microchromosomes remain to be defined. Moreover, no reciprocal chromosome painting in birds has been performed due to the lack of chromosome-specific probes from other avian species. Here we have generated a set of chromosome-specific paints using flow cytometry that cover the whole genome of the stone curlew (Burhinus oedicnemus, Charadriiformes), a species with one of the lowest diploid numbers so far reported in birds, as well as paints from more microchromosomes of the chicken. A genome-wide comparative map between the chicken and the stone curlew has been constructed for the first time based on reciprocal chromosome painting. The results indicate that extensive chromosome fusions underlie the sharp decrease in the diploid number in the stone curlew. To a lesser extent, chromosome fissions and inversions also occurred during the evolution of the stone curlew. It is anticipated that this complete set of chromosome painting probes from the first Neoaves species will become an invaluable tool for avian comparative cytogenetics. PMID:19172404

  19. A global perspective on Campanulaceae: Biogeographic, genomic, and floral evolution.

    PubMed

    Crowl, Andrew A; Miles, Nicholas W; Visger, Clayton J; Hansen, Kimberly; Ayers, Tina; Haberle, Rosemarie; Cellinese, Nico

    2016-02-01

    The Campanulaceae are a diverse clade of flowering plants encompassing more than 2300 species in myriad habitats from tropical rainforests to arctic tundra. A robust, multigene phylogeny, including all major lineages, is presented to provide a broad, evolutionary perspective of this cosmopolitan clade. We used a phylogenetic framework, in combination with divergence dating, ancestral range estimation, chromosome modeling, and morphological character reconstruction analyses to infer phylogenetic placement and timing of major biogeographic, genomic, and morphological changes in the history of the group and provide insights into the diversification of this clade across six continents. Ancestral range estimation supports an out-of-Africa diversification following the Cretaceous-Tertiary extinction event. Chromosomal modeling, with corroboration from the distribution of synonymous substitutions among gene duplicates, provides evidence for as many as 20 genome-wide duplication events before large radiations. Morphological reconstructions support the hypothesis that switches in floral symmetry and anther dehiscence were important in the evolution of secondary pollen presentation mechanisms. This study provides a broad, phylogenetic perspective on the evolution of the Campanulaceae clade. The remarkable habitat diversity and cosmopolitan distribution of this lineage appears to be the result of a complex history of genome duplications and numerous long-distance dispersal events. We failed to find evidence for an ancestral polyploidy event for this clade, and our analyses indicate an ancestral base number of nine for the group. This study will serve as a framework for future studies in diverse areas of research in Campanulaceae. © 2016 Botanical Society of America.

  20. Phylogeny of the Acanthocephala based on morphological characters.

    PubMed

    Monks, S

    2001-02-01

    Only four previous studies of relationships among acanthocephalans have included cladistic analyses, and knowledge of the phylogeny of the group has not kept pace with that of other taxa. The purpose of this study is to provide a more comprehensive analysis of the phylogenetic relationships among members of the phylum Acanthocephala using morphological characters. The most appropriate outgroups are those that share a common early cell-cleavage pattern (polar placement of centrioles), such as the Rotifera, rather than the Priapulida (meridional placement of centrioles) to provide character polarity based on common ancestry rather than a general similarity likely due to convergence of body shapes. The phylogeny of 22 species of the Acanthocephala was evaluated based on 138 binary and multistate characters derived from comparative morphological and ontogenetic studies. Three assumptions of cement gland structure were tested: (i) the plesiomorphic type of cement glands in the Rotifera, as the sister group, is undetermined; (ii) non-syncytial cement glands are plesiomorphic; and (iii) syncytial cement glands are plesiomorphic. The results were used to test an early move of Tegorhynchus pectinarius to Koronacantha and to evaluate the relationship between Tegorhynchus and Illiosentis. Analysis of the data-set for each of these assumptions of cement gland structure produced the same single most parsimonious tree topology. Using Assumptions i and ii for the cement glands, the trees were the same length (length = 404 steps, CI = 0.545, CIX = 0.517, HI = 0.455, HIX = 0.483, RI = 0.670, RC = 0.365). Using Assumption iii, the tree was three steps longer (length = 408 steps, CI = 0.539, CIX = 0.512, HI = 0.461, HIX = 0.488, RI = 0.665, RC = 0.359). The tree indicates that the Palaeacanthocephala and Eoacanthocephala both are monophyletic and are sister taxa. The members of the Archiacanthocephala are basal to the other two clades, but do not themselves form a clade. The results provide strong support for the Palaeacanthocephala and the Eoacanthocephala and the hypothesis that the Eoacanthocephala is the most primitive group is not supported. Little support for the Archiacanthocephala as a monophyletic group was provided by the analysis. Support is provided for the recognition of Tegorhynchus and Illiosentis as distinct taxa, as well as the transfer of T. pectinarius to Koronacantha.
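
    The tree statistics quoted in this abstract are linked by two standard identities in parsimony analysis: the homoplasy index is HI = 1 - CI, and the rescaled consistency index is RC = CI x RI. The short sketch below checks the reported values against those identities. The function names and the tolerance are illustrative assumptions; the tolerance allows for the published figures being rounded to three decimals.

```python
def homoplasy_index(ci):
    """HI is the complement of the consistency index."""
    return 1.0 - ci


def rescaled_consistency_index(ci, ri):
    """RC is the product of the consistency and retention indices."""
    return ci * ri


# Values as printed in the abstract for the two tree lengths.
reported = {
    "assumptions i and ii": {"CI": 0.545, "HI": 0.455, "RI": 0.670, "RC": 0.365},
    "assumption iii": {"CI": 0.539, "HI": 0.461, "RI": 0.665, "RC": 0.359},
}


def is_consistent(values, tol=0.0015):
    """True when HI and RC match the identities within a rounding tolerance (assumed)."""
    hi_ok = abs(homoplasy_index(values["CI"]) - values["HI"]) <= tol
    rc_ok = abs(rescaled_consistency_index(values["CI"], values["RI"]) - values["RC"]) <= tol
    return hi_ok and rc_ok


checks = {label: is_consistent(values) for label, values in reported.items()}
```

    Both sets of reported indices pass this check, which suggests the figures survived extraction intact.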

  1. A reference genetic map of C. clementina hort. ex Tan.; citrus evolution inferences from comparative mapping

    PubMed Central

    2012-01-01

    Background Most modern citrus cultivars have an interspecific origin. As a foundational step towards deciphering the interspecific genome structures, a reference whole genome sequence was produced by the International Citrus Genome Consortium from a haploid derived from Clementine mandarin. The availability of a saturated genetic map of Clementine was identified as an essential prerequisite to assist the whole genome sequence assembly. Clementine is believed to be a ‘Mediterranean’ mandarin × sweet orange hybrid, and sweet orange likely arose from interspecific hybridizations between mandarin and pummelo gene pools. The primary goals of the present study were to establish a Clementine reference map using codominant markers, and to perform comparative mapping of pummelo, sweet orange, and Clementine. Results Five parental genetic maps were established from three segregating populations, which were genotyped with Single Nucleotide Polymorphism (SNP), Simple Sequence Repeats (SSR) and Insertion-Deletion (Indel) markers. An initial medium density reference map (961 markers for 1084.1 cM) of the Clementine was established by combining male and female Clementine segregation data. This Clementine map was compared with two pummelo maps and a sweet orange map. The linear order of markers was highly conserved in the different species. However, significant differences in map size were observed, which suggests a variation in the recombination rates. Skewed segregations were much higher in the male than female Clementine mapping data. The mapping data confirmed that Clementine arose from hybridization between ‘Mediterranean’ mandarin and sweet orange. The results identified nine recombination break points for the sweet orange gamete that contributed to the Clementine genome. Conclusions A reference genetic map of citrus, used to facilitate the chromosome assembly of the first citrus reference genome sequence, was established. The high conservation of marker order observed at the interspecific level should allow reasonable inferences of most citrus genome sequences by mapping next-generation sequencing (NGS) data in the reference genome sequence. The genome of the haploid Clementine used to establish the citrus reference genome sequence appears to have been inherited primarily from the ‘Mediterranean’ mandarin. The high frequency of skewed allelic segregations in the male Clementine data underlines the probable extent of deviation from Mendelian segregation for characters controlled by heterozygous loci in male parents. PMID:23126659

  2. 'Drawing' a Molecular Portrait of CIN and Cervical Cancer: a Review of Genome-Wide Molecular Profiling Data.

    PubMed

    Kurmyshkina, Olga V; Kovchur, Pavel I; Volkova, Tatyana O

    2015-01-01

    In this review we summarize the results of studies employing high-throughput methods of profiling of HPV-associated cervical intraepithelial neoplasia (CIN) and squamous cell cervical cancers at key intracellular regulatory levels to demonstrate the unique identity of the landscape of molecular changes underlying this oncopathology, and to show how these changes are related to the 'natural history' of cervical cancer progression and the formation of clinically significant properties of tumors. A step-wise character of cervical cancer progression is a morphologically well-described fact and, as evidenced by genome-wide screenings, it is indeed the consistent change of the molecular profiles of HPV-infected epithelial cells through which they progressively acquire the phenotypic hallmarks of cancerous cells. In this sense, CIN/cervical cancer is a unique model for studying the driving forces and mechanisms of carcinogenesis. Recent research has allowed definition of the whole-genome spectrum of both random and regular molecular alterations, as well as changes either common to processes of carcinogenesis or specific for cervical cancer. Despite the existence of questions that are still to be investigated, these findings are of great value for the future development of approaches for the diagnostics and treatment of cervical neoplasms.

  3. Genomic Characterization Reveals Insights Into Patulin Biosynthesis and Pathogenicity in Penicillium Species.

    PubMed

    Li, Boqiang; Zong, Yuanyuan; Du, Zhenglin; Chen, Yong; Zhang, Zhanquan; Qin, Guozheng; Zhao, Wenming; Tian, Shiping

    2015-06-01

    Penicillium species are fungal pathogens that infect crop plants worldwide. P. expansum differs from P. italicum and P. digitatum, all major postharvest pathogens of pome and citrus, in that the former is able to produce the mycotoxin patulin and has a broader host range. The molecular basis of host-specificity of fungal pathogens has now become the focus of recent research. The present report provides the whole genome sequence of P. expansum (33.52 Mb) and P. italicum (28.99 Mb) and identifies differences in genome structure, important pathogenic characters, and secondary metabolite (SM) gene clusters in Penicillium species. We identified a total of 55 gene clusters potentially related to secondary metabolism, including a cluster of 15 genes (named PePatA to PePatO), that may be involved in patulin biosynthesis in P. expansum. Functional studies confirmed that PePatL and PePatK play crucial roles in the biosynthesis of patulin and that patulin production is not related to virulence of P. expansum. Collectively, P. expansum contains more pathogenic genes and SM gene clusters, in particular, an intact patulin cluster, than P. italicum or P. digitatum. These findings provide important information relevant to understanding the molecular network of patulin biosynthesis and mechanisms of host-specificity in Penicillium species.

  4. Phylogeny of Eleusine (Poaceae: Chloridoideae) based on nuclear ITS and plastid trnT-trnF sequences.

    PubMed

    Neves, Susana S; Swire-Clark, Ginger; Hilu, Khidir W; Baird, Wm Vance

    2005-05-01

    Phylogenetic relationships in the genus Eleusine (Poaceae: Chloridoideae) were investigated using nuclear ITS and plastid trnT-trnF sequences. Separate and combined data sets were analyzed using parsimony, distance, and likelihood based methods, including Bayesian. Data congruence was examined using character and topological measures. Significant data heterogeneity was detected, but there was little conflict in the topological substructure measures for triplets and quartets, and resolution and clade support increased in the combined analysis. Data incongruence may be a result of noise and insufficient information in the slower evolving trnT-trnF. Monophyly of Eleusine is strongly supported in all analyses, but basal relationships in the genus remain uncertain. There is good support for a CAIK clade (E. coracana subsp. coracana and africana, E. indica, and E. kigeziensis), with E. tristachya as its sister group. Two putative ITS homeologues (A and B loci) were identified in the allotetraploid E. coracana; the 'B' locus sequence type was not found in the remaining species. Eleusine coracana and its putative 'A' genome donor, the diploid E. indica, are confirmed close allies, but sequence data contradicts the hypothesis that E. floccifolia is its second genome donor. The 'B' genome donor remains unidentified and may be extinct.

  5. Reverse genetics in high throughput: rapid generation of complete negative strand RNA virus cDNA clones and recombinant viruses thereof.

    PubMed

    Nolden, T; Pfaff, F; Nemitz, S; Freuling, C M; Höper, D; Müller, T; Finke, Stefan

    2016-04-05

    Reverse genetics approaches are indispensable tools for proof of concepts in virus replication and pathogenesis. For negative strand RNA viruses (NSVs) the limited number of infectious cDNA clones represents a bottleneck as clones are often generated from cell culture adapted or attenuated viruses, with limited potential for pathogenesis research. We developed a system in which cDNA copies of complete NSV genomes were directly cloned into reverse genetics vectors by linear-to-linear RedE/T recombination. Rapid cloning of multiple rabies virus (RABV) full length genomes and identification of clones identical to field virus consensus sequence confirmed the approach's reliability. Recombinant viruses were recovered from field virus cDNA clones. Similar growth kinetics of parental and recombinant viruses, preservation of field virus characters in cell type specific replication and virulence in the mouse model were confirmed. Reduced titers after reporter gene insertion indicated that the low level of field virus replication is affected by gene insertions. The flexibility of the strategy was demonstrated by cloning multiple copies of an orthobunyavirus L genome segment. This important step in reverse genetics technology development opens novel avenues for the analysis of virus variability combined with phenotypical characterization of recombinant viruses at a clonal level.

  6. Sampling gene diversity across the supergroup Amoebozoa: large EST data sets from Acanthamoeba castellanii, Hartmannella vermiformis, Physarum polycephalum, Hyperamoeba dachnaya and Hyperamoeba sp.

    PubMed

    Watkins, Russell F; Gray, Michael W

    2008-04-01

    From comparative analysis of EST data for five taxa within the eukaryotic supergroup Amoebozoa, including two free-living amoebae (Acanthamoeba castellanii, Hartmannella vermiformis) and three slime molds (Physarum polycephalum, Hyperamoeba dachnaya and Hyperamoeba sp.), we obtained new broad-range perspectives on the evolution and biosynthetic capacity of this assemblage. Together with genome sequences for the amoebozoans Dictyostelium discoideum and Entamoeba histolytica, and including partial genome sequence available for A. castellanii, we used the EST data to identify genes that appear to be exclusive to the supergroup, and to specific clades therein. Many of these genes are likely involved in cell-cell communication or differentiation. In examining on a broad scale a number of characters that previously have been considered in simpler cross-species comparisons, typically between Dictyostelium and Entamoeba, we find that Amoebozoa as a whole exhibits striking variation in the number and distribution of biosynthetic pathways, for example, ones for certain critical stress-response molecules, including trehalose and mannitol. Finally, we report additional compelling cases of lateral gene transfer within Amoebozoa, further emphasizing that although this process has influenced genome evolution in all examined amoebozoan taxa, it has done so to a variable extent.

  7. Mitogenomes from type specimens, a genotyping tool for morphologically simple species: ten genomes of agar-producing red algae.

    PubMed

    Boo, Ga Hun; Hughey, Jeffery R; Miller, Kathy Ann; Boo, Sung Min

    2016-10-14

    DNA sequences from type specimens provide independent, objective characters that enhance the value of type specimens and permit the correct application of species names to phylogenetic clades and specimens. We provide mitochondrial genomes (mitogenomes) from archival type specimens of ten species in agar-producing red algal genera Gelidium and Pterocladiella. The genomes contain 43-44 genes, ranging in size from 24,910 to 24,970 bp with highly conserved gene synteny. Low Ka/Ks ratios of apocytochrome b and cytochrome oxidase genes support their utility as markers. Phylogenies of mitogenomes and cox1+rbcL sequences clarified classification at the genus and species levels. Three species formerly in Gelidium and Pterocladia are transferred to Pterocladiella: P. media comb. nov., P. musciformis comb. nov., and P. luxurians comb. and stat. nov. Gelidium sinicola is merged with G. coulteri because they share identical cox1 and rbcL sequences. We describe a new species, Gelidium millariana sp. nov., previously identified as G. isabelae from Australia. We demonstrate that mitogenomes from type specimens provide a new tool for typifying species in the Gelidiales and that there is an urgent need for analyzing mitogenomes from type specimens of red algae and other morphologically simple organisms for insight into their nomenclature, taxonomy and evolution.

  8. Mitogenomes from type specimens, a genotyping tool for morphologically simple species: ten genomes of agar-producing red algae

    PubMed Central

    Boo, Ga Hun; Hughey, Jeffery R.; Miller, Kathy Ann; Boo, Sung Min

    2016-01-01

    DNA sequences from type specimens provide independent, objective characters that enhance the value of type specimens and permit the correct application of species names to phylogenetic clades and specimens. We provide mitochondrial genomes (mitogenomes) from archival type specimens of ten species in agar-producing red algal genera Gelidium and Pterocladiella. The genomes contain 43–44 genes, ranging in size from 24,910 to 24,970 bp with highly conserved gene synteny. Low Ka/Ks ratios of apocytochrome b and cytochrome oxidase genes support their utility as markers. Phylogenies of mitogenomes and cox1+rbcL sequences clarified classification at the genus and species levels. Three species formerly in Gelidium and Pterocladia are transferred to Pterocladiella: P. media comb. nov., P. musciformis comb. nov., and P. luxurians comb. and stat. nov. Gelidium sinicola is merged with G. coulteri because they share identical cox1 and rbcL sequences. We describe a new species, Gelidium millariana sp. nov., previously identified as G. isabelae from Australia. We demonstrate that mitogenomes from type specimens provide a new tool for typifying species in the Gelidiales and that there is an urgent need for analyzing mitogenomes from type specimens of red algae and other morphologically simple organisms for insight into their nomenclature, taxonomy and evolution. PMID:27739454

  9. Genomic Diversification of Enterococci in Hosts: The Role of the Mobilome

    PubMed Central

    Santagati, Maria; Campanile, Floriana; Stefani, Stefania

    2012-01-01

    Enterococci are ubiquitous lactic acid bacteria, possessing a flexible nature that allows them to colonize various environments and hosts but also to be opportunistic pathogens. Many papers have contributed to a better understanding of: (i) the taxonomy of this complex group of microorganisms; (ii) intra-species variability; (iii) the role of different pathogenicity traits; and (iv) some markers related to the character of host-specificity, but the reasons for such incredible success of adaptability are still far from being fully explained. Recently, genomic-based studies have improved our understanding of the genome diversity of the most studied species, i.e., E. faecalis and E. faecium. From these studies, what is becoming evident is the role of the mobilome in adding new abilities to colonize new hosts and environments, and eventually in driving their evolution: specific clones associated with human infections or specific hosts can exist, but probably the consideration of these populations as strictly clonal groups is only partially correct. The variable presence of mobile genetic elements may, indeed, be one of the factors involved in the evolution of one specific group in a specific host and/or environment. Certainly more extensive studies using new high throughput technologies are mandatory to fully understand the evolution of predominant clones and species in different hosts and environments. PMID:22435066

  10. Genomic diversification of enterococci in hosts: the role of the mobilome.

    PubMed

    Santagati, Maria; Campanile, Floriana; Stefani, Stefania

    2012-01-01

    Enterococci are ubiquitous lactic acid bacteria, possessing a flexible nature that allows them to colonize various environments and hosts but also to be opportunistic pathogens. Many papers have contributed to a better understanding of: (i) the taxonomy of this complex group of microorganisms; (ii) intra-species variability; (iii) the role of different pathogenicity traits; and (iv) some markers related to the character of host-specificity, but the reasons for such incredible success of adaptability are still far from being fully explained. Recently, genomic-based studies have improved our understanding of the genome diversity of the most studied species, i.e., E. faecalis and E. faecium. From these studies, what is becoming evident is the role of the mobilome in adding new abilities to colonize new hosts and environments, and eventually in driving their evolution: specific clones associated with human infections or specific hosts can exist, but probably the consideration of these populations as strictly clonal groups is only partially correct. The variable presence of mobile genetic elements may, indeed, be one of the factors involved in the evolution of one specific group in a specific host and/or environment. Certainly more extensive studies using new high throughput technologies are mandatory to fully understand the evolution of predominant clones and species in different hosts and environments.

  11. Screening synteny blocks in pairwise genome comparisons through integer programming.

    PubMed

    Tang, Haibao; Lyons, Eric; Pedersen, Brent; Schnable, James C; Paterson, Andrew H; Freeling, Michael

    2011-04-18

    It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirements of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or more generally "many-to-many" matchings. However this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs that are not of interest. For many downstream analyses, we need to eliminate weak, low scoring alignments from pairwise genome comparisons. Our goal is to objectively select subset of synteny blocks whose total scores are maximized while respecting the duplication history of the genomes in comparison. We call this "quota-based" screening of synteny blocks in order to appropriately fill a quota of syntenic relationships within one genome or between two genomes having WGD events. We have formulated the synteny block screening as an optimization problem known as "Binary Integer Programming" (BIP), which is solved using existing linear programming solvers. The computer program QUOTA-ALIGN performs this task by creating a clear objective function that maximizes the compatible set of synteny blocks under given constraints on overlaps and depths (corresponding to the duplication history in respective genomes). Such a procedure is useful for any pairwise synteny alignments, but is most useful in lineages affected by multiple WGDs, like plants or fish lineages. For example, there should be a 1:2 ploidy relationship between genome A and B if genome B had an independent WGD subsequent to the divergence of the two genomes. 
We show through simulations and real examples using plant genomes in the rosid superorder that the quota-based screening can eliminate ambiguous synteny blocks and focus on specific genomic evolutionary events, like the divergence of lineages (in cross-species comparisons) and the most recent WGD (in self comparisons). The QUOTA-ALIGN algorithm screens a set of synteny blocks to retain only those compatible with a user-specified ploidy relationship between two genomes. These blocks, in turn, may be used for additional downstream analyses such as identifying true orthologous regions in interspecific comparisons. There are two major contributions of QUOTA-ALIGN: 1) reducing the block screening task to a BIP problem, which is novel; 2) providing an efficient software pipeline starting from all-against-all BLAST to the screened synteny blocks with dot plot visualizations. Python code and full documentation are publicly available at http://github.com/tanghaibao/quota-alignment. The QUOTA-ALIGN program is also integrated as a major component in SynMap (http://genomevolution.com/CoGe/SynMap.pl), offering easier access to thousands of genomes for non-programmers. © 2011 Tang et al; licensee BioMed Central Ltd.
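
    The quota-based screening described in this abstract can be illustrated with a minimal sketch. It is not the QUOTA-ALIGN implementation: the toy block tuples, the function names `max_depth` and `screen_blocks`, and the reading of a 1:2 quota as "at most two retained blocks over any position of genome A and at most one over any position of genome B" are assumptions made for illustration. Exhaustive enumeration of the 0/1 choices stands in for the linear programming solver mentioned in the abstract, so it is only practical for a handful of blocks.

```python
from itertools import combinations

# Hypothetical synteny blocks: (score, (a_start, a_end), (b_start, b_end)),
# with inclusive gene-index ranges on genome A and genome B.
blocks = [
    (100, (0, 9), (0, 9)),     # strong block
    (80, (0, 9), (20, 29)),    # second copy of the same A region in genome B
    (30, (0, 9), (40, 49)),    # weak block, e.g. from an older shared duplication
    (60, (10, 19), (0, 9)),    # competes with block 0 for the same B region
]


def max_depth(intervals):
    """Return the largest number of intervals covering any single position."""
    events = []
    for start, end in intervals:
        events.append((start, 1))      # interval opens
        events.append((end + 1, -1))   # interval closes just past its end
    depth = best = 0
    for _, delta in sorted(events):    # closings sort before openings at a tie
        depth += delta
        best = max(best, depth)
    return best


def screen_blocks(blocks, depth_a, depth_b):
    """Pick the 0/1 selection of blocks with maximal total score such that
    coverage depth never exceeds depth_a on genome A or depth_b on genome B."""
    best_score, best_subset = 0, ()
    indices = range(len(blocks))
    for size in range(1, len(blocks) + 1):
        for subset in combinations(indices, size):
            a_spans = [blocks[i][1] for i in subset]
            b_spans = [blocks[i][2] for i in subset]
            if max_depth(a_spans) > depth_a or max_depth(b_spans) > depth_b:
                continue  # violates the quota
            score = sum(blocks[i][0] for i in subset)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


one_to_two = screen_blocks(blocks, depth_a=2, depth_b=1)
one_to_one = screen_blocks(blocks, depth_a=1, depth_b=1)
```

    On the toy data, the 1:2 quota retains the two highest-scoring blocks over the shared region of genome A and discards the weak third copy, while a 1:1 quota forces a single block per region on both genomes.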

  12. Genome-Based Characterization of Biological Processes That Differentiate Closely Related Bacteria

    PubMed Central

    Palmer, Marike; Steenkamp, Emma T.; Coetzee, Martin P. A.; Blom, Jochen; Venter, Stephanus N.

    2018-01-01

    Bacteriologists have strived toward attaining a natural classification system based on evolutionary relationships for nearly 100 years. In the early twentieth century it was accepted that a phylogeny-based system would be the most appropriate, but in the absence of molecular data, this approach proved exceedingly difficult. Subsequent technical advances and the increasing availability of genome sequencing have allowed for the generation of robust phylogenies at all taxonomic levels. In this study, we explored the possibility of linking biological characters to higher-level taxonomic groups in bacteria by making use of whole genome sequence information. For this purpose, we specifically targeted the genus Pantoea and its four main lineages. The shared gene sets were determined for Pantoea, the four lineages within the genus, as well as its sister-genus Tatumella. This was followed by functional characterization of the gene sets using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. In comparison to Tatumella, various traits involved in nutrient cycling were identified within Pantoea, providing evidence for increased efficacy in recycling of metabolites within the genus. Additionally, a number of traits associated with pathogenicity were identified within species often associated with opportunistic infections, with some support for adaptation toward overcoming host defenses. Some traits were also only conserved within specific lineages, potentially acquired in an ancestor to the lineage and subsequently maintained. It was also observed that the species isolated from the most diverse sources were generally the most versatile in their carbon metabolism. By investigating evolution, based on the more variable genomic regions, it may be possible to detect biologically relevant differences associated with the course of evolution and speciation. PMID:29467735

  13. Genetic and Molecular Epidemiological Characterization of a Novel Adenovirus in Antarctic Penguins Collected between 2008 and 2013

    PubMed Central

    Lee, Sook-Young; Kim, Jeong-Hoon; Seo, Tae-Kun; No, Jin Sun; Kim, Hankyeom; Kim, Won-keun; Choi, Han-Gu; Kang, Sung-Ho; Song, Jin-Won

    2016-01-01

    Antarctica is considered a relatively uncontaminated region with regard to the infectious diseases because of its extreme environment, and isolated geography. For the genetic characterization and molecular epidemiology of the newly found penguin adenovirus in Antarctica, entire genome sequencing and annual survey of penguin adenovirus were conducted. The entire genome sequences of penguin adenoviruses were completed for two Chinstrap penguins (Pygoscelis antarctica) and two Gentoo penguins (Pygoscelis papua). The whole genome lengths and G+C content of penguin adenoviruses were found to be 24,630–24,662 bp and 35.5–35.6%, respectively. Notably, the presence of putative sialidase gene was not identified in penguin adenoviruses by Rapid Amplification of cDNA Ends (RACE-PCR) as well as consensus specific PCR. The penguin adenoviruses were demonstrated to be a new species within the genus Siadenovirus, with a distance of 29.9–39.3% (amino acid, 32.1–47.9%) in DNA polymerase gene, and showed the closest relationship with turkey adenovirus 3 (TAdV-3) in phylogenetic analysis. During the 2008–2013 study period, the penguin adenoviruses were annually detected in 22 of 78 penguins (28.2%), and the molecular epidemiological study of the penguin adenovirus indicates a predominant infection in Chinstrap penguin population (12/30, 40%). Interestingly, the genome of penguin adenovirus could be detected in several internal samples, except the lymph node and brain. In conclusion, an analysis of the entire adenoviral genomes from Antarctic penguins was conducted, and the penguin adenoviruses, containing unique genetic character, were identified as a new species within the genus Siadenovirus. Moreover, it was annually detected in Antarctic penguins, suggesting its circulation within the penguin population. PMID:27309961

  14. Comparison of space flight and heavy ion radiation induced genomic/epigenomic mutations in rice (Oryza sativa)

    NASA Astrophysics Data System (ADS)

    Shi, Jinming; Lu, Weihong; Sun, Yeqing

    2014-04-01

    Rice seeds, after space flight and low-dose heavy ion radiation treatment, were cultured on the ground. Leaves of the mature plants were obtained for examination of genomic/epigenomic mutations by using the amplified fragment length polymorphism (AFLP) and methylation sensitive amplification polymorphism (MSAP) methods, respectively. The mutation sites were identified by fragment recovery and sequencing. The heritability of the mutations was detected in the next generation. Results showed that both space flight and low-dose heavy ion radiation can induce significant alterations in the rice genome and epigenome (P < 0.05). For both genetic and epigenetic assays, while there was no significant difference in mutation rates or in their inheritance by the next generation, the sites of mutations differed between the space flight and radiation treated groups. More than 50% of the mutation sites were shared by the two radiation treated groups, irradiated with different LET values and doses, while only about 20% of the mutation sites were shared by the space flight group and the radiation treated group. Moreover, in the space flight group, we found that DNA methylation changes were more prone to occur on CNG sequences than on CG sequences. Sequencing results proved that both space flight and heavy ion radiation induced mutations were widely spread across the rice genome, including coding regions and repeated regions. Our study described and compared the characters of space flight and low-dose heavy ion radiation induced genomic/epigenomic mutations. Our data revealed the mechanisms of application of space environment for mutagenesis and crop breeding. Furthermore, this work implies that the nature of mutations induced under space flight conditions may involve factors beyond ion radiation.

  15. SiGN-SSM: open source parallel software for estimating gene networks with state space models.

    PubMed

    Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru

    2011-04-15

    SiGN-SSM is an open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), which is a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint that is effective in stabilizing the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM are available on our web site. tamada@ims.u-tokyo.ac.jp.

  16. PAM-Dependent Target DNA Recognition and Cleavage by C2c1 CRISPR-Cas Endonuclease

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Hui; Gao, Pu; Rajashankar, Kanagalaghatta R.

    C2c1 is a newly identified guide RNA-mediated type V-B CRISPR-Cas endonuclease that site-specifically targets and cleaves both strands of target DNA. We have determined crystal structures of Alicyclobacillus acidoterrestris C2c1 (AacC2c1) bound to sgRNA as a binary complex and to target DNAs as ternary complexes, thereby capturing catalytically competent conformations of AacC2c1 with both target and non-target DNA strands independently positioned within a single RuvC catalytic pocket. Moreover, C2c1-mediated cleavage results in a staggered seven-nucleotide break of target DNA. crRNA adopts a pre-ordered five-nucleotide A-form seed sequence in the binary complex, with release of an inserted tryptophan, facilitating zippering up of 20-bp guide RNA:target DNA heteroduplex on ternary complex formation. Notably, the PAM-interacting cleft adopts a “locked” conformation on ternary complex formation. Structural comparison of C2c1 ternary complexes with their Cas9 and Cpf1 counterparts highlights the diverse mechanisms adopted by these distinct CRISPR-Cas systems, thereby broadening and enhancing their applicability as genome editing tools.

  17. cit: hypothesis testing software for mediation analysis in genomic applications.

    PubMed

    Millstein, Joshua; Chen, Gary K; Breton, Carrie V

    2016-08-01

    The challenges of successfully applying causal inference methods include: (i) satisfying underlying assumptions, (ii) limitations in data/models accommodated by the software and (iii) low power of common multiple testing approaches. The causal inference test (CIT) is based on hypothesis testing rather than estimation, allowing the testable assumptions to be evaluated in the determination of statistical significance. A user-friendly software package provides P-values and optionally permutation-based FDR estimates (q-values) for potential mediators. It can handle single and multiple binary and continuous instrumental variables, binary or continuous outcome variables and adjustment covariates. Also, the permutation-based FDR option provides a non-parametric implementation. Simulation studies demonstrate the validity of the cit package and show a substantial advantage of permutation-based FDR over other common multiple testing strategies. The cit open-source R package is freely available from the CRAN website (https://cran.r-project.org/web/packages/cit/index.html) with embedded C++ code that utilizes the GNU Scientific Library, also freely available (http://www.gnu.org/software/gsl/). joshua.millstein@usc.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  18. Hemichordate models.

    PubMed

    Tagawa, Kuni

    2016-08-01

    Hemichordates are marine animals with two different lifestyles. The solitary, free-living enteropneusts or acorn worms resemble polychaetes or earthworms, while the tiny, colonial, sessile pterobranchs are similar to bryozoans and phoronids. Hemichordates, together with echinoderms, comprise the clade Ambulacraria and are a sister group to the Chordata. As adults, they exhibit cardinal chordate characters, such as gill slits. Their embryogenesis and dipleurula-type (tornaria) larvae are very similar to those of echinoderms. Recent advances in comparative genomics and molecular developmental biology of hemichordates, especially the vermiform enteropneusts, have shed light on deuterostome ancestors. This paper briefly reviews the numerous recent studies on the Phylum Hemichordata. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Genetic diversity and population structure of Musa accessions in ex situ conservation

    PubMed Central

    2013-01-01

    Background Banana cultivars are mostly derived from hybridization between wild diploid subspecies of Musa acuminata (A genome) and M. balbisiana (B genome), and they exhibit various levels of ploidy and genomic constitution. The Embrapa ex situ Musa collection contains over 220 accessions, of which only a few have been genetically characterized. Knowledge regarding the genetic relationships and diversity between modern cultivars and wild relatives would assist in conservation and breeding strategies. Our objectives were to determine the genomic constitution based on Internal Transcribed Spacer (ITS) region polymorphism and the ploidy of all accessions by flow cytometry and to investigate the population structure of the collection using Simple Sequence Repeat (SSR) loci as co-dominant markers based on Structure software, not previously performed in Musa. Results From the 221 accessions analyzed by flow cytometry, the correct ploidy was confirmed or established for 212 (95.9%), whereas digestion of the ITS region confirmed the genomic constitution of 209 (94.6%). Neighbor-joining clustering analysis derived from SSR binary data allowed the detection of two major groups, essentially distinguished by the presence or absence of the B genome, while subgroups were formed according to the genomic composition and commercial classification. The co-dominant nature of SSR was explored to analyze the structure of the population based on a Bayesian approach, detecting 21 subpopulations. Most of the subpopulations were in agreement with the clustering analysis. Conclusions The data generated by flow cytometry, ITS and SSR supported the hypothesis about the occurrence of homeologue recombination between A and B genomes, leading to discrepancies in the number of sets or portions from each parental genome. These phenomena have been largely disregarded in the evolution of banana, as the “single-step domestication” hypothesis had long predominated. 
These findings will have an impact in future breeding approaches. Structure analysis enabled the efficient detection of ancestry of recently developed tetraploid hybrids by breeding programs, and for some triploids. However, for the main commercial subgroups, Structure appeared to be less efficient to detect the ancestry in diploid groups, possibly due to sampling restrictions. The possibility of inferring the membership among accessions to correct the effects of genetic structure opens possibilities for its use in marker-assisted selection by association mapping. PMID:23497122
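
    As a small illustration of the clustering input mentioned in this abstract, the sketch below computes pairwise distances between accessions from binary (band present/absent) marker profiles. The abstract does not state which distance measure was used; the Jaccard distance, the accession labels and the marker profiles here are assumptions chosen only to show the kind of matrix a neighbor-joining program would take as input.

```python
def jaccard_distance(x, y):
    """Jaccard distance between two equal-length 0/1 marker profiles."""
    shared = sum(1 for a, b in zip(x, y) if a and b)   # bands present in both
    either = sum(1 for a, b in zip(x, y) if a or b)    # bands present in at least one
    return 0.0 if either == 0 else 1.0 - shared / either


def distance_matrix(profiles):
    """Return {(name_i, name_j): distance} for every unordered pair of accessions."""
    names = sorted(profiles)
    return {
        (names[i], names[j]): jaccard_distance(profiles[names[i]], profiles[names[j]])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }


# Hypothetical accessions scored for five SSR bands (1 = band present).
profiles = {
    "AA_1": [1, 1, 0, 0, 1],
    "AA_2": [1, 1, 0, 0, 0],
    "AAB_1": [1, 0, 1, 1, 0],
}
distances = distance_matrix(profiles)
```

    In this toy matrix the two accessions without the extra bands sit closer to each other than either does to the third, which is the kind of signal a distance-based tree would pick up.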

  20. Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.

    PubMed

    Guzzi, Pietro Hiram; Cannataro, Mario

    2013-08-01

    A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate), regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays that were not allowed in μ-CS. The Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. 
the Affymetrix Power Tools), (ii) the manual loading of preprocessing libraries, and (iii) the management of intermediate files, such as results and metadata. Micro-Analyzer users can directly manage Affymetrix binary data without worrying about locating and invoking the proper preprocessing tools and chip-specific libraries. Moreover, users of the Micro-Analyzer tool can load the preprocessed data directly into the well-known TM4 platform, extending in such a way also the TM4 capabilities. Consequently, Micro-Analyzer offers the following advantages: (i) it reduces possible errors in the preprocessing and further analysis phases, e.g. due to the incorrect choice of parameters or due to the use of old libraries, (ii) it enables the combined and centralized pre-processing of different arrays, (iii) it may enhance the quality of further analysis by storing the workflow, i.e. information about the preprocessing steps, and (iv) finally Micro-Analyzer is freely available as a standalone application at the project web site http://sourceforge.net/projects/microanalyzer/. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  1. Assessing information content and interactive relationships of subgenomic DNA sequences of the MHC using complexity theory approaches based on the non-extensive statistical mechanics

    NASA Astrophysics Data System (ADS)

    Karakatsanis, L. P.; Pavlos, G. P.; Iliopoulos, A. C.; Pavlos, E. G.; Clark, P. M.; Duke, J. L.; Monos, D. S.

    2018-09-01

    This study combines two independent domains of science, the high throughput DNA sequencing capabilities of Genomics and complexity theory from Physics, to assess the information encoded by the different genomic segments of exonic, intronic and intergenic regions of the Major Histocompatibility Complex (MHC) and identify possible interactive relationships. The dynamic and non-extensive statistical characteristics of two well characterized MHC sequences from the homozygous cell lines, PGF and COX, in addition to two other genomic regions of comparable size, used as controls, have been studied using the reconstructed phase space theorem and the non-extensive statistical theory of Tsallis. The results reveal similar non-linear dynamical behavior as far as complexity and self-organization features. In particular, the low-dimensional deterministic nonlinear chaotic and non-extensive statistical character of the DNA sequences was verified with strong multifractal characteristics and long-range correlations. The nonlinear indices repeatedly verified that MHC sequences, whether exonic, intronic or intergenic include varying levels of information and reveal an interaction of the genes with intergenic regions, whereby the lower the number of genes in a region, the less the complexity and information content of the intergenic region. Finally we showed the significance of the intergenic region in the production of the DNA dynamics. The findings reveal interesting content information in all three genomic elements and interactive relationships of the genes with the intergenic regions. The results most likely are relevant to the whole genome and not only to the MHC. These findings are consistent with the ENCODE project, which has now established that the non-coding regions of the genome remain to be of relevance, as they are functionally important and play a significant role in the regulation of expression of genes and coordination of the many biological processes of the cell.

  2. solveME: fast and reliable solution of nonlinear ME models.

    PubMed

    Yang, Laurence; Ma, Ding; Ebrahim, Ali; Lloyd, Colton J; Saunders, Michael A; Palsson, Bernhard O

    2016-09-22

    Genome-scale models of metabolism and macromolecular expression (ME) significantly expand the scope and predictive capabilities of constraint-based modeling. ME models present considerable computational challenges: they are much (>30 times) larger than corresponding metabolic reconstructions (M models), are multiscale, and growth maximization is a nonlinear programming (NLP) problem, mainly due to macromolecule dilution constraints. Here, we address these computational challenges. We develop a fast and numerically reliable solution method for growth maximization in ME models using a quad-precision NLP solver (Quad MINOS). Our method was up to 45% faster than binary search for six significant digits in growth rate. We also develop a fast, quad-precision flux variability analysis that is accelerated (up to 60× speedup) via solver warm-starts. Finally, we employ the tools developed to investigate growth-coupled succinate overproduction, accounting for proteome constraints. Just as genome-scale metabolic reconstructions have become an invaluable tool for computational and systems biologists, we anticipate that these fast and numerically reliable ME solution methods will accelerate the widespread adoption of ME models for researchers in these fields.
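
    The binary search baseline that this abstract compares against can be sketched as a bisection on the growth rate. The sketch assumes that, for a fixed growth rate, feasibility of the model can be checked by some solver call and that every rate below the optimum is feasible; `toy_is_feasible` is a hypothetical stand-in for that check (a real ME model would solve an optimization problem at each step), and the bracket and tolerance are illustrative values.

```python
def bisect_growth_rate(is_feasible, lo, hi, rel_tol=1e-6):
    """Bisect on the growth rate until the bracket [lo, hi] is tight to about
    rel_tol relative width. Returns (largest feasible rate found, number of
    feasibility checks made inside the loop)."""
    if not 0.0 <= lo < hi:
        raise ValueError("expected 0 <= lo < hi")
    if not is_feasible(lo):
        raise ValueError("lower bound must be feasible")
    if is_feasible(hi):
        raise ValueError("upper bound must be infeasible")
    checks = 0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        checks += 1
        if is_feasible(mid):
            lo = mid   # optimum is at or above mid
        else:
            hi = mid   # optimum is below mid
    return lo, checks


def toy_is_feasible(mu, mu_max=0.8437):
    """Hypothetical feasibility oracle: rates up to mu_max are attainable."""
    return mu <= mu_max


mu_star, n_checks = bisect_growth_rate(toy_is_feasible, 0.0, 2.0)
```

    Each halving costs one feasibility check, so about six significant digits from a bracket of this width takes on the order of twenty solves; that repeated-solve cost is what a direct NLP solution avoids.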

  3. MFCompress: a compression tool for FASTA and multi-FASTA data.

    PubMed

    Pinho, Armando J; Pratas, Diogo

    2014-01-01

    The data deluge phenomenon is becoming a serious problem in most genomic centers. To alleviate it, general purpose tools, such as gzip, are used to compress the data. However, although pervasive and easy to use, these tools fall short when the intention is to reduce as much as possible the data, for example, for medium- and long-term storage. A number of algorithms have been proposed for the compression of genomics data, but unfortunately only a few of them have been made available as usable and reliable compression tools. In this article, we describe one such tool, MFCompress, specially designed for the compression of FASTA and multi-FASTA files. In comparison to gzip and applied to multi-FASTA files, MFCompress can provide additional average compression gains of almost 50%, i.e. it potentially doubles the available storage, although at the cost of some more computation time. On highly redundant datasets, and in comparison with gzip, 8-fold size reductions have been obtained. Both source code and binaries for several operating systems are freely available for non-commercial use at http://bioinformatics.ua.pt/software/mfcompress/.
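
    The step from "gains of almost 50%" to "doubles the available storage" in this abstract is simple arithmetic, sketched below. The function names and the byte counts are illustrative assumptions, not measurements from MFCompress or gzip.

```python
def additional_gain(baseline_bytes, improved_bytes):
    """Fraction of space saved relative to a baseline compressor's output."""
    return 1.0 - improved_bytes / baseline_bytes


def capacity_factor(gain):
    """How many times more data fit in the same storage for a given gain."""
    return 1.0 / (1.0 - gain)


# Hypothetical sizes: a file that gzip stores in 100 bytes and a specialized
# tool stores in 50 bytes, and a highly redundant one going from 800 to 100.
typical_gain = additional_gain(100, 50)
redundant_gain = additional_gain(800, 100)
typical_capacity = capacity_factor(typical_gain)
redundant_capacity = capacity_factor(redundant_gain)
```

    A 50% gain over the baseline corresponds to a capacity factor of 2, and an 8-fold size reduction corresponds to a gain of 87.5%.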

  4. Genotype harmonizer: automatic strand alignment and format conversion for genotype data integration.

    PubMed

    Deelen, Patrick; Bonder, Marc Jan; van der Velde, K Joeri; Westra, Harm-Jan; Winder, Erwin; Hendriksen, Dennis; Franke, Lude; Swertz, Morris A

    2014-12-11

    To gain statistical power or to allow fine mapping, researchers typically want to pool data before meta-analyses or genotype imputation. However, the necessary harmonization of genetic datasets is currently error-prone because of many different file formats and lack of clarity about which genomic strand is used as reference. Genotype Harmonizer (GH) is a command-line tool to harmonize genetic datasets by automatically solving issues concerning genomic strand and file format. GH solves the unknown strand issue by aligning ambiguous A/T and G/C SNPs to a specified reference, using linkage disequilibrium patterns without prior knowledge of the used strands. GH supports many common GWAS/NGS genotype formats including PLINK, binary PLINK, VCF, SHAPEIT2 & Oxford GEN. GH is implemented in Java and a large part of the functionality can also be used as Java 'Genotype-IO' API. All software is open source under license LGPLv3 and available from http://www.molgenis.org/systemsgenetics. GH can be used to harmonize genetic datasets across different file formats and can be easily integrated as a step in routine meta-analysis and imputation pipelines.
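
    The strand problem described in this abstract can be made concrete with a short sketch. It only classifies a SNP by comparing its allele pair with a reference allele pair; the linkage-disequilibrium step that GH uses to resolve ambiguous A/T and G/C SNPs is not implemented here. The function names and return labels are assumptions for illustration and are not part of the Genotype Harmonizer or Genotype-IO API.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """A/T and G/C SNPs read the same on both strands, so alleles alone
    cannot tell which strand was used."""
    return COMPLEMENT[allele1] == allele2


def classify_snp(study_alleles, reference_alleles):
    """Compare a study SNP's allele pair with the reference allele pair."""
    if is_strand_ambiguous(*study_alleles):
        return "ambiguous"      # would need LD patterns to resolve
    study = frozenset(study_alleles)
    reference = frozenset(reference_alleles)
    if study == reference:
        return "same_strand"
    if frozenset(COMPLEMENT[a] for a in study_alleles) == reference:
        return "flip"           # study data are on the opposite strand
    return "mismatch"           # alleles disagree even after flipping


results = [
    classify_snp(("A", "G"), ("A", "G")),
    classify_snp(("A", "G"), ("T", "C")),
    classify_snp(("A", "T"), ("A", "T")),
    classify_snp(("A", "C"), ("A", "G")),
]
```

    Only the "ambiguous" class needs information beyond the alleles themselves, which is why the abstract singles out A/T and G/C SNPs.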

  5. The complete sequences and gene organisation of the mitochondrial genomes of the heterodont bivalves Acanthocardia tuberculata and Hiatella arctica – and the first record for a putative Atpase subunit 8 gene in marine bivalves

    PubMed Central

    Dreyer, Hermann; Steiner, Gerhard

    2006-01-01

    Background Mitochondrial (mt) gene arrangement is highly variable among molluscs and especially among bivalves. Of the 30 complete molluscan mt-genomes published to date, only one is of a heterodont bivalve, although this is the most diverse taxon in terms of species numbers. We determined the complete sequence of the mitochondrial genomes of Acanthocardia tuberculata and Hiatella arctica (Mollusca, Bivalvia, Heterodonta) and describe their gene contents and genome organisations to assess the variability of these features among the Bivalvia and their value for phylogenetic inference. Results The size of the mt-genome in Acanthocardia tuberculata is 16,104 basepairs (bp), and in Hiatella arctica 18,244 bp. The Acanthocardia mt-genome contains 12 of the typical protein coding genes, lacking the Atpase subunit 8 (atp8) gene, as in all published marine bivalves. In contrast, a complete atp8 gene is present in Hiatella arctica. In addition, we found a putative truncated atp8 gene when re-annotating the mt-genome of Venerupis philippinarum. Both mt-genomes reported here encode all genes on the same strand and have an additional trnM. In Acanthocardia several large non-coding regions are present. One of these contains 3.5 nearly identical copies of a 167 bp motif. In Hiatella, the 3' end of the NADH dehydrogenase subunit (nad)6 gene is duplicated together with the adjacent non-coding region. The gene arrangement of Hiatella is markedly different from all other known molluscan mt-genomes; that of Acanthocardia shows few identities with that of Venerupis philippinarum. Phylogenetic analyses on amino acid and nucleotide levels robustly support the Heterodonta and the sister group relationship of Acanthocardia and Venerupis. Monophyletic Bivalvia are resolved only by a Bayesian inference of the nucleotide data set. In all other analyses the two unionid species, being the only ones with genes located on both strands, do not group with the remaining bivalves. 
Conclusion The two mt-genomes reported here add to and underline the high variability of gene order and presence of duplications in bivalve and molluscan taxa. Some genomic traits like the loss of the atp8 gene or the encoding of all genes on the same strand are homoplastic among the Bivalvia. These characters, gene order, and the nucleotide sequence data show considerable potential of resolving phylogenetic patterns at lower taxonomic levels. PMID:16948842

  6. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.

    2006-01-20

    Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnL-trnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Euasterids II, that of Panax ginseng (Araliaceae, 156,318 bp, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at fine level and of RNA editing by comparison to available EST sequences. 
In addition, since both of these genomes are crop plants, their complete genome sequence will facilitate development of chloroplast genetic engineering technology, as in recent studies from Daniell's lab. Knowing the exact sequence from spacer regions is crucial for introducing transgenes into the chloroplast genome.

  7. Quantitative and Qualitative Differences in Morphological Traits Revealed between Diploid Fragaria Species

    PubMed Central

    SARGENT, DANIEL J.; GEIBEL, M.; HAWKINS, J. A.; WILKINSON, M. J.; BATTEY, N. H.; SIMPSON, D. W.

    2004-01-01

    • Background and Aims The aims of this investigation were to highlight the qualitative and quantitative diversity apparent between nine diploid Fragaria species and produce interspecific populations segregating for a large number of morphological characters suitable for quantitative trait loci analysis. • Methods A qualitative comparison of eight described diploid Fragaria species was performed and measurements were taken of 23 morphological traits from 19 accessions including eight described species and one previously undescribed species. A principal components analysis was performed on 14 mathematically unrelated traits from these accessions, which partitioned the species accessions into distinct morphological groups. Interspecific crosses were performed with accessions of species that displayed significant quantitative divergence and, from these, populations that should segregate for a range of quantitative traits were raised. • Key Results Significant differences between species were observed for all 23 morphological traits quantified and three distinct groups of species accessions were observed after the principal components analysis. Interspecific crosses were performed between these groups, and F2 and backcross populations were raised that should segregate for a range of morphological characters. In addition, the study highlighted a number of distinctive morphological characters in many of the species studied. • Conclusions Diploid Fragaria species are morphologically diverse, yet remain highly interfertile, making the group an ideal model for the study of the genetic basis of phenotypic differences between species through map-based investigation using quantitative trait loci. The segregating interspecific populations raised will be ideal for such investigations and could also provide insights into the nature and extent of genome evolution within this group. PMID:15469944

  8. Next-generation phenomics for the Tree of Life.

    PubMed

    Burleigh, J Gordon; Alphonse, Kenzley; Alverson, Andrew J; Bik, Holly M; Blank, Carrine; Cirranello, Andrea L; Cui, Hong; Daly, Marymegan; Dietterich, Thomas G; Gasparich, Gail; Irvine, Jed; Julius, Matthew; Kaufman, Seth; Law, Edith; Liu, Jing; Moore, Lisa; O'Leary, Maureen A; Passarotti, Maria; Ranade, Sonali; Simmons, Nancy B; Stevenson, Dennis W; Thacker, Robert W; Theriot, Edward C; Todorovic, Sinisa; Velazco, Paúl M; Walls, Ramona L; Wolfe, Joanna M; Yu, Mengjie

    2013-06-26

    The phenotype represents a critical interface between the genome and the environment in which organisms live and evolve. Phenotypic characters also are a rich source of biodiversity data for tree building, and they enable scientists to reconstruct the evolutionary history of organisms, including most fossil taxa, for which genetic data are unavailable. Therefore, phenotypic data are necessary for building a comprehensive Tree of Life. In contrast to molecular sequencing, which has become faster and cheaper through recent technological advances, phenotypic data collection often remains prohibitively slow and expensive. The next-generation phenomics project is a collaborative, multidisciplinary effort to leverage advances in image analysis, crowdsourcing, and natural language processing to develop and implement novel approaches for discovering and scoring the phenome, the collection of phenotypic characters for a species. This research represents a new approach to data collection that has the potential to transform phylogenetics research and to enable rapid advances in constructing the Tree of Life. Our goal is to assemble large phenomic datasets built using new methods and to provide the public and scientific community with tools for phenomic data assembly that will enable rapid and automated study of phenotypes across the Tree of Life.

  9. Single nucleotide polymorphism analysis of Korean native chickens using next generation sequencing data.

    PubMed

    Seo, Dong-Won; Oh, Jae-Don; Jin, Shil; Song, Ki-Duk; Park, Hee-Bok; Heo, Kang-Nyeong; Shin, Younhee; Jung, Myunghee; Park, Junhyung; Jo, Cheorun; Lee, Hak-Kyo; Lee, Jun-Heon

    2015-02-01

    There are five native chicken lines in Korea, which are mainly classified by plumage colors (black, white, red, yellow, gray). These five lines are very important genetic resources in the Korean poultry industry. Based on a next generation sequencing technology, whole genome sequence and reference assemblies were performed using Gallus_gallus_4.0 (NCBI) with whole genome sequences from these lines to identify common and novel single nucleotide polymorphisms (SNPs). We obtained 36,660,731,136 ± 1,257,159,120 bp of raw sequence and average 26.6-fold of 25-29 billion reference assembly sequences representing 97.288 % coverage. Also, 4,006,068 ± 97,534 SNPs were observed from 29 autosomes and the Z chromosome and, of these, 752,309 SNPs are the common SNPs across lines. Among the identified SNPs, the number of novel- and known-location assigned SNPs was 1,047,951 ± 14,956 and 2,948,648 ± 81,414, respectively. The number of unassigned known SNPs was 1,181 ± 150 and unassigned novel SNPs was 8,238 ± 1,019. Synonymous SNPs, non-synonymous SNPs, and SNPs having character changes were 26,266 ± 1,456, 11,467 ± 604, 8,180 ± 458, respectively. Overall, 443,048 ± 26,389 SNPs in each bird were identified by comparing with dbSNP in NCBI. The presently obtained genome sequence and SNP information in Korean native chickens have wide applications for further genome studies such as genetic diversity studies to detect causative mutations for economic and disease related traits.

  10. Mission Control Operations: Employing a New High Performance Design for Communications Links Supporting Exploration Programs

    NASA Technical Reports Server (NTRS)

    Jackson, Dan E., Jr.

    2015-01-01

    The planetary exploration programs demand a totally new examination of data multiplexing, digital communications protocols and data transmission principles for both ground and spacecraft operations. Highly adaptive communications devices on-board and on the ground must provide the greatest possible transmitted data density between deployed crew personnel, spacecraft and ground control teams. Regarding these requirements, this proposal borrows from research into quantum mechanical computing by applying the concept of a qubit, a single bit that represents 16 states, to radio frequency (RF) communications link design for exploration programs. This concept of placing multiple character values into a single data bit can easily make the evolutionary steps needed to meet exploration mission demands. To move the qubit from the quantum mechanical research laboratory into long distance RF data transmission, this proposal utilizes polarization modulation of the RF carrier signal to represent numbers from zero to fifteen. It introduces the concept of a binary-to-hexadecimal converter that quickly chops any data stream into 16-bit words and connects variously polarized feedhorns to a single-frequency radio transmitter. Further, the concept relies on development of a receiver that uses low-noise amplifiers and an antenna array to quickly assess carrier polarity and perform hexadecimal to binary conversion. Early testbed experiments using the International Space Station (ISS) as an operations laboratory can be implemented to provide the most cost-effective return for research investment. The improvement in signal-to-noise ratio while supporting greater baseband data rates that could be achieved through this concept justifies its consideration for long-distance exploration programs.

  11. Biomimetic hydrophobic surface fabricated by chemical etching method from hierarchically structured magnesium alloy substrate

    NASA Astrophysics Data System (ADS)

    Liu, Yan; Yin, Xiaoming; Zhang, Jijia; Wang, Yaming; Han, Zhiwu; Ren, Luquan

    2013-09-01

    As one of the lightest metal materials, magnesium alloy plays an important role in industries such as automobiles, aircraft and electronic products. However, its use is hindered by its high chemical activity and susceptibility to corrosion. Here, inspired by typical plant surfaces with super-hydrophobic character, such as lotus leaves and red rose petals, a new hydrophobic surface is fabricated on magnesium alloy by a two-step method to improve its corrosion resistance. The samples are first processed by laser, then immersed and etched in aqueous AgNO3 solutions at concentrations of 0.1 mol/L, 0.3 mol/L and 0.5 mol/L for 15 s, 40 s and 60 s, respectively, and finally modified with DTS (CH3(CH2)11Si(OCH3)3). The microstructure, chemical composition, wettability and corrosion resistance are characterized by SEM, XPS, water contact angle measurement and electrochemical methods. Hydrophobic surfaces with a binary structure of microscale crater-like and nanoscale flower-like features are obtained. After DTS treatment the surface contains the low-energy material. The contact angle reaches up to 138.4 ± 2°; this hydrophobicity is related both to the micro-nano binary structure and to the chemical composition. Electrochemical measurements show that the corrosion resistance of the magnesium alloy is improved. Furthermore, this research is expected to suggest ideas drawn from nature for improving the corrosion resistance of magnesium alloy, and the method can be easily extended to other metal materials.

  12. Investigation of the interactions of enteric and hydrophilic polymers to enhance dissolution of griseofulvin following hot melt extrusion processing.

    PubMed

    Bennett, Ryan C; Keen, Justin M; Bi, Yunxia Vivian; Porter, Stuart; Dürig, Thomas; McGinity, James W

    2015-07-01

    This study focuses on the application of hot melt extrusion (HME) to produce solid dispersions containing griseofulvin (GF) and investigates the in-vitro dissolution performance of HME powders and resulting tablet compositions containing HME-processed dispersions. Binary, ternary and quaternary dispersions containing GF, enteric polymer (Eudragit L100-55 or AQOAT-LF) and/or vinyl pyrrolidone-based polymer (Plasdone K-12 povidone or S-630 copovidone) were processed by HME. Two plasticizers, triethyl citrate (TEC) and acetyl tributyl citrate (ATBC), were incorporated to aid in melt processing and to modify release of GF in neutral media following a pH-change in dissolution. Products were characterized for GF recovery, degrees of compositional amorphous character, intermolecular interactions and non-sink dissolution performance. Binary dispersions exhibited lower maximum observed concentration values and magnitudes of supersaturated GF in neutral media dissolution in comparison with the ternary dispersions. The quaternary HME products, 1 : 2 : 1 : 0.6 GF : L100-55 : S-630 : ATBC and GF : AQOAT-LF : K-12 : ATBC, were determined as the most optimal concentration-enhancing compositions due to increased hydrogen bonding of enteric functional groups with carbonyl/acetate groups of vinyl pyrrolidone-based polymers, reduced compositional crystallinity and presence of incorporated hydrophobic plasticizer. HME products containing combinations of concentration-enhancing polymers can supersaturate and sustain GF dissolution to greater magnitudes in neutral media following the pH-transition and be compressed into immediate-release tablets exhibiting similar dissolution profiles. © 2015 Royal Pharmaceutical Society.

  13. A deer (subfamily Cervinae) genetic linkage map and the evolution of ruminant genomes.

    PubMed Central

    Slate, Jon; Van Stijn, Tracey C; Anderson, Rayna M; McEwan, K Mary; Maqbool, Nauman J; Mathias, Helen C; Bixley, Matthew J; Stevens, Deirdre R; Molenaar, Adrian J; Beever, Jonathan E; Galloway, Susan M; Tate, Michael L

    2002-01-01

    Comparative maps between ruminant species and humans are increasingly important tools for the discovery of genes underlying economically important traits. In this article we present a primary linkage map of the deer genome derived from an interspecies hybrid between red deer (Cervus elaphus) and Père David's deer (Elaphurus davidianus). The map is approximately 2500 cM long and contains >600 markers including both evolutionary conserved type I markers and highly polymorphic type II markers (microsatellites). Comparative mapping by annotation and sequence similarity (COMPASS) was demonstrated to be a useful tool for mapping bovine and ovine ESTs in deer. Using marker order as a phylogenetic character and comparative map information from human, mouse, deer, cattle, and sheep, we reconstructed the karyotype of the ancestral Pecoran mammal and identified the chromosome rearrangements that have occurred in the sheep, cattle, and deer lineages. The deer map and interspecies hybrid pedigrees described here are a valuable resource for (1) predicting the location of orthologs to human genes in ruminants, (2) mapping QTL in farmed and wild deer populations, and (3) ruminant phylogenetic studies. PMID:11973312

  14. The evolution of sex: A new hypothesis based on mitochondrial mutational erosion: Mitochondrial mutational erosion in ancestral eukaryotes would favor the evolution of sex, harnessing nuclear recombination to optimize compensatory nuclear coadaptation.

    PubMed

    Havird, Justin C; Hall, Matthew D; Dowling, Damian K

    2015-09-01

    The evolution of sex in eukaryotes represents a paradox, given the "twofold" fitness cost it incurs. We hypothesize that the mutational dynamics of the mitochondrial genome would have favored the evolution of sexual reproduction. Mitochondrial DNA (mtDNA) exhibits a high-mutation rate across most eukaryote taxa, and several lines of evidence suggest that this high rate is an ancestral character. This seems inexplicable given that mtDNA-encoded genes underlie the expression of life's most salient functions, including energy conversion. We propose that negative metabolic effects linked to mitochondrial mutation accumulation would have invoked selection for sexual recombination between divergent host nuclear genomes in early eukaryote lineages. This would provide a mechanism by which recombinant host genotypes could be rapidly shuffled and screened for the presence of compensatory modifiers that offset mtDNA-induced harm. Under this hypothesis, recombination provides the genetic variation necessary for compensatory nuclear coadaptation to keep pace with mitochondrial mutation accumulation. © 2015 WILEY Periodicals, Inc.

  15. The rate and character of spontaneous mutation in an RNA virus.

    PubMed Central

    Malpica, José M; Fraile, Aurora; Moreno, Ignacio; Obies, Clara I; Drake, John W; García-Arenal, Fernando

    2002-01-01

    Estimates of spontaneous mutation rates for RNA viruses are few and uncertain, most notably due to their dependence on tiny mutation reporter sequences that may not well represent the whole genome. We report here an estimate of the spontaneous mutation rate of tobacco mosaic virus using an 804-base cognate mutational target, the viral MP gene that encodes the movement protein (MP). Selection against newly arising mutants was countered by providing MP function from a transgene. The estimated genomic mutation rate was on the lower side of the range previously estimated for lytic animal riboviruses. We also present the first unbiased riboviral mutational spectrum. The proportion of base substitutions is the same as that in a retrovirus but is lower than that in most DNA-based organisms. Although the MP mutant frequency was 0.02-0.05, 35% of the sequenced mutants contained two or more mutations. Therefore, the mutation process in populations of TMV and perhaps of riboviruses generally differs profoundly from that in populations of DNA-based microbes and may be strongly influenced by a subpopulation of mutator polymerases. PMID:12524327

  16. Mutational influences of low-dose and high-LET ionizing radiation in Drosophila melanogaster

    NASA Astrophysics Data System (ADS)

    Lei, Huang; Fanjun, Kong; Sun, Yeqing

    Because the cosmic environment contains various kinds of radiation particles, including high-Z and high-energy ions characterized by low dose and high RBE, it is important to determine the possible biological effects of high-LET radiation on human beings. To analyse the possible mutational influences of low-dose, high-LET ionizing radiation, wild fruit flies (Drosophila melanogaster) were irradiated with 12C6+ ions at two LET levels (63.3 and 30 keV/µm) and four low doses (2, 20, 200 and 2000 mGy) at HIRFL (Heavy Ion Radiation Facility Laboratory, Lanzhou, China). Within the same LET group, the average polymorphic frequency rose with increasing irradiation dose, and the frequency in the 2000 mGy samples was significantly higher than in the other samples (p<0.01). These results suggest that genomic DNA sequence can be affected by low-dose, high-LET ionizing radiation and that irradiation dose is an important factor in the frequency of genomic mutations.

  17. Structural Analysis of Biodiversity

    PubMed Central

    Sirovich, Lawrence; Stoeckle, Mark Y.; Zhang, Yu

    2010-01-01

    Large, recently-available genomic databases cover a wide range of life forms, suggesting opportunity for insights into genetic structure of biodiversity. In this study we refine our recently-described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vector approach generates correlation matrices, dubbed Klee diagrams, which represent a novel way of assembling and viewing large genomic datasets. To explore its potential utility, here we apply the improved algorithm to a collection of almost 17000 DNA barcode sequences covering 12 widely-separated animal taxa, demonstrating that indicator vectors for classification gave correct assignment in all 11000 test cases. Indicator vector analysis revealed discontinuities corresponding to species- and higher-level taxonomic divisions, suggesting an efficient approach to classification of organisms from poorly-studied groups. As compared to standard distance metrics, indicator vectors preserve diagnostic character probabilities, enable automated classification of test sequences, and generate high-information density single-page displays. These results support application of indicator vectors for comparative analysis of large nucleotide data sets and raise prospect of gaining insight into broad-scale patterns in the genetic structure of biodiversity. PMID:20195371

  18. De-identification of clinical notes via recurrent neural network and conditional random field.

    PubMed

    Liu, Zengjian; Tang, Buzhou; Wang, Xiaolong; Chen, Qingcai

    2017-11-01

    De-identification, identifying information from data, such as protected health information (PHI) present in clinical data, is a critical step to enable data to be shared or published. The 2016 Centers of Excellence in Genomic Science (CEGS) Neuropsychiatric Genome-scale and RDOC Individualized Domains (N-GRID) clinical natural language processing (NLP) challenge contains a track on de-identifying electronic medical records (EMRs) (i.e., track 1). The challenge organizers provide 1000 annotated mental health records for this track, 600 of which are used as a training set and 400 as a test set. We develop a hybrid system for the de-identification task on the training set. First, four individual subsystems, that is, a subsystem based on bidirectional LSTM (long short-term memory, a variant of recurrent neural network), a subsystem based on bidirectional LSTM with features, a subsystem based on conditional random field (CRF) and a rule-based subsystem, are used to identify PHI instances. Then, an ensemble learning-based classifier is deployed to combine all PHI instances predicted by the above three machine learning-based subsystems. Finally, the results of the ensemble learning-based classifier and the rule-based subsystem are merged together. Experiments conducted on the official test set show that our system achieves the highest micro F1-scores of 93.07%, 91.43% and 95.23% under the "token", "strict" and "binary token" criteria respectively, ranking first in the 2016 CEGS N-GRID NLP challenge. In addition, on the dataset of the 2014 i2b2 NLP challenge, our system achieves the highest micro F1-scores of 96.98%, 95.11% and 98.28% under the "token", "strict" and "binary token" criteria respectively, outperforming other state-of-the-art systems. All these experiments demonstrate the effectiveness of our proposed method. Copyright © 2017. Published by Elsevier Inc.

  19. A computational approach to candidate gene prioritization for X-linked mental retardation using annotation-based binary filtering and motif-based linear discriminatory analysis

    PubMed Central

    2011-01-01

    Background Several computational candidate gene selection and prioritization methods have recently been developed. These in silico selection and prioritization techniques are usually based on two central approaches - the examination of similarities to known disease genes and/or the evaluation of functional annotation of genes. Each of these approaches has its own caveats. Here we employ a previously described method of candidate gene prioritization based mainly on gene annotation, together with a technique based on the evaluation of pertinent sequence motifs or signatures, in an attempt to refine the gene prioritization approach. We apply this approach to X-linked mental retardation (XLMR), a group of heterogeneous disorders for which some of the underlying genetics is known. Results The gene annotation-based binary filtering method yielded a ranked list of putative XLMR candidate genes with good plausibility of being associated with the development of mental retardation. In parallel, a motif finding approach based on linear discriminatory analysis (LDA) was employed to identify short sequence patterns that may discriminate XLMR from non-XLMR genes. High rates (>80%) of correct classification were achieved, suggesting that the identification of these motifs effectively captures genomic signals associated with XLMR vs. non-XLMR genes. The computational tools developed for the motif-based LDA are integrated into the freely available genomic analysis portal Galaxy (http://main.g2.bx.psu.edu/). Nine genes (APLN, ZC4H2, MAGED4, MAGED4B, RAP2C, FAM156A, FAM156B, TBL1X, and UXT) were highlighted as highly-ranked XLMR methods. Conclusions The combination of gene annotation information and sequence motif-orientated computational candidate gene prediction methods highlights an added benefit in generating a list of plausible candidate genes, as has been demonstrated for XLMR.
Reviewers: This article was reviewed by Dr Barbara Bardoni (nominated by Prof Juergen Brosius); Prof Neil Smalheiser and Dr Dustin Holloway (nominated by Prof Charles DeLisi). PMID:21668950

  20. Genetic factors controlling wool shedding in a composite Easycare sheep flock.

    PubMed

    Matika, O; Bishop, S C; Pong-Wong, R; Riggio, V; Headon, D J

    2013-12-01

    Historically, sheep have been selectively bred for desirable traits including wool characteristics. However, recent moves towards extensive farming and reduced farm labour have seen a renewed interest in Easycare breeds. The aim of this study was to quantify the underlying genetic architecture of wool shedding in an Easycare flock. Wool shedding scores were collected from 565 pedigreed commercial Easycare sheep from 2002 to 2010. The wool scoring system was based on a 10-point (0-9) scale, with score 0 for animals retaining full fleece and 9 for those completely shedding. DNA was sampled from 200 animals of which 48 with extreme phenotypes were genotyped using a 50-k SNP chip. Three genetic analyses were performed: heritability analysis, complex segregation analysis to test for a major gene hypothesis and a genome-wide association study to map regions in the genome affecting the trait. Phenotypes were treated as a continuous or binary variable and categories. High estimates of heritability (0.80 when treated as a continuous, 0.65-0.75 as binary and 0.75 as categories) for shedding were obtained from linear mixed model analyses. Complex segregation analysis gave similar estimates (0.80 ± 0.06) to those above with additional evidence for a major gene with dominance effects. Mixed model association analyses identified four significant (P < 0.05) SNPs. Further analyses of these four SNPs in all 200 animals revealed that one of the SNPs displayed dominance effects similar to those obtained from the complex segregation analyses. In summary, we found strong genetic control for wool shedding, demonstrated the possibility of a single putative dominant gene controlling this trait and identified four SNPs that may be in partial linkage disequilibrium with gene(s) controlling shedding. © 2013 University of Edinburgh, Animal Genetics © 2013 Stichting International Foundation for Animal Genetics.

  1. Deleterious Mutations, Apparent Stabilizing Selection and the Maintenance of Quantitative Variation

    PubMed Central

    Kondrashov, A. S.; Turelli, M.

    1992-01-01

    Apparent stabilizing selection on a quantitative trait that is not causally connected to fitness can result from the pleiotropic effects of unconditionally deleterious mutations, because as N. Barton noted, "... individuals with extreme values of the trait will tend to carry more deleterious alleles ...." We use a simple model to investigate the dependence of this apparent selection on the genomic deleterious mutation rate, U; the equilibrium distribution of K, the number of deleterious mutations per genome; and the parameters describing directional selection against deleterious mutations. Unlike previous analyses, we allow for epistatic selection against deleterious alleles. For various selection functions and realistic parameter values, the distribution of K, the distribution of breeding values for a pleiotropically affected trait, and the apparent stabilizing selection function are all nearly Gaussian. The additive genetic variance for the quantitative trait is kQa², where k is the average number of deleterious mutations per genome, Q is the proportion of deleterious mutations that affect the trait, and a² is the variance of pleiotropic effects for individual mutations that do affect the trait. In contrast, when the trait is measured in units of its additive standard deviation, the apparent fitness function is essentially independent of Q and a²; and β, the intensity of selection, measured as the ratio of additive genetic variance to the "variance" of the fitness curve, is very close to s = U/k, the selection coefficient against individual deleterious mutations at equilibrium. Therefore, this model predicts appreciable apparent stabilizing selection if s exceeds about 0.03, which is consistent with various data. However, the model also predicts that β must equal V_m/V_G, the ratio of new additive variance for the trait introduced each generation by mutation to the standing additive variance. Most, although not all, estimates of this ratio imply apparent stabilizing selection weaker than generally observed. A qualitative argument suggests that even when direct selection is responsible for most of the selection observed on a character, it may be essentially irrelevant to the maintenance of variation for the character by mutation-selection balance. Simple experiments can indicate the fraction of observed stabilizing selection attributable to the pleiotropic effects of deleterious mutations. PMID:1427047
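
    The two closed-form relations quoted in this abstract (additive variance kQa² and selection coefficient s = U/k, which the abstract says β closely tracks) can be illustrated with a short calculation. This is only a sketch: the function name and the parameter values are hypothetical and chosen for easy arithmetic, not taken from the paper.

```python
def pleiotropy_summary(U, k, Q, a2):
    """Evaluate the two relations stated in the abstract.

    U  : genomic deleterious mutation rate
    k  : average number of deleterious mutations per genome
    Q  : proportion of deleterious mutations that affect the trait
    a2 : variance of pleiotropic effects of mutations that affect the trait
    """
    additive_variance = k * Q * a2   # V_A = k * Q * a^2
    s = U / k                        # selection coefficient; beta is close to s
    return additive_variance, s


# Hypothetical illustrative values (assumptions, not estimates from the paper).
va, s = pleiotropy_summary(U=1.0, k=20.0, Q=0.1, a2=0.5)
appreciable = s > 0.03  # threshold quoted in the abstract
print(va, s, appreciable)
```

    With these made-up inputs s lies above the 0.03 threshold that the abstract associates with appreciable apparent stabilizing selection.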

  2. Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: Pathway description and gene discovery for production of next-generation biofuels

    PubMed Central

    2011-01-01

    Background Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:21401935

  3. ISSR, ERIC and RAPD techniques to detect genetic diversity in the aphid pathogen Pandora neoaphidis.

    PubMed

    Tymon, Anna M; Pell, Judith K

    2005-03-01

    The entomopathogenic fungus Pandora neoaphidis is an important natural enemy of aphids. ISSR, ERIC (Enterobacterial Repetitive Intergenic Consensus) and RAPD PCR-based DNA fingerprint analyses were undertaken to study intra-specific variation amongst 30 isolates of P. neoaphidis worldwide, together with six closely related species of Entomophthorales. All methods yielded scorable binary characters, and distance matrices were constructed from both individual and combined data sets. Neighbour-joining was used to construct consensus phylogenetic trees which showed that although P. neoaphidis isolates were highly polymorphic they separated into a monophyletic group compared with the other Entomophthorales tested. Three distinct subclades were found, with UK isolates occupying two of these. No specific correlation with aphid host species was established for any of the isolates apart from those in one cluster which contained isolates obtained from nettle aphid, Microlophium carnosum. ERIC, ISSR and RAPD analysis allowed the rapid genetic characterisation and differentiation of isolates with the generation of potential isolate- and cluster-specific diagnostic DNA markers.
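
    As a rough illustration of how scored binary characters (band present = 1, absent = 0) become a distance matrix, the sketch below uses the Jaccard distance. The abstract does not state which distance measure the authors used, so the choice of Jaccard, the function names, and the three toy band profiles are all assumptions made for the example.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length 0/1 band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - shared / either


def distance_matrix(profiles):
    """Return sorted isolate names and the pairwise distance matrix."""
    names = sorted(profiles)
    matrix = [[jaccard_distance(profiles[p], profiles[q]) for q in names]
              for p in names]
    return names, matrix


# Hypothetical band scores for three isolates (not data from the study).
profiles = {
    "iso_A": [1, 1, 0, 1, 0],
    "iso_B": [1, 0, 0, 1, 1],
    "iso_C": [0, 0, 1, 0, 1],
}
names, matrix = distance_matrix(profiles)
for name, row in zip(names, matrix):
    print(name, row)
```

    A matrix of this kind is the usual input to a neighbour-joining routine; combining data sets would amount to concatenating the band profiles before computing distances.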

  4. Unique Pressure Dependence of the Order-Disorder Transition Temperature of a Series of PEP-PDMS Diblock Copolymers

    NASA Astrophysics Data System (ADS)

    Mortensen, K.; Almdal, K.; Schwahn, D.; Frielinghaus, H.

    1997-03-01

    Studies of the phase behavior of polymer systems have proven that the sensitivity to fluctuations is much more distinct than originally anticipated based on theoretical arguments. In blends of homopolymers, studies have revealed that fluctuations give rise to significant renormalized critical behavior. It has been argued that the free volume causes an entropic contribution to the Flory-Huggins interaction parameter, χ, and is thereby responsible for the renormalized behavior. In block copolymers, fluctuations have even more pronounced effects, as they change the second-order critical point at f=0.5 to first order and stabilize additional complex phases. Measurements of the structure factor S(q) of PEP-PDMS diblock copolymers have revealed unique character in the phase diagram, with a re-entrant ordered structure. Moreover, an unexpected singularity in the conformational compressibility, as identified from the peak position, q, is observed. In contrast to binary polymer blends, pressure does not affect the Ginzburg number.

  5. Phylogenetic systematics of Schacontia Dyar with descriptions of eight new species (Lepidoptera, Crambidae)

    PubMed Central

    Goldstein, Paul Z.; Metz, Mark A.; Solis, M. Alma

    2013-01-01

    Abstract The Neotropical genus Schacontia Dyar (1914) is reviewed and revised to include eleven species. Schacontia replica Dyar, 1914, syn. n. and Schacontia pfeifferi Amsel, 1956, syn. n. are synonymized with Schacontia chanesalis (Druce, 1899) and eight new species are described: Schacontia umbra, sp. n., Schacontia speciosa, sp. n., Schacontia themis, sp. n., Schacontia rasa, sp. n., Schacontia nyx, sp. n., Schacontia clotho, sp. n., Schacontia lachesis, sp. n., and Schacontia atropos, sp. n. Three species, Schacontia medalba, Schacontia chanesalis, and Schacontia ysticalis, are re-described. An analysis of 64 characters (56 binary, 8 multistate; 5 head, 13 thoracic, 13 abdominal, 25 male genitalic, and 8 female genitalic) scored for all Schacontia and three outgroup species (Eustixia pupula Hübner, 1823, Glaphyria sesquistrialis Hübner, 1823, and Hellula undalis (Fabricius, 1781)) retrieved 8 equally most parsimonious trees (L=102, CI=71, RI=84) of which the strict consensus is: [[[[medalba + umbra] + chanesalis] + speciosa] + [ysticalis + [rasa + themis + [atropos + lachesis + nyx + clotho

  6. One where the kid actually is "all right": the queering of Iva in Marilyn Hacker's Love, Death, and the Changing of the Seasons.

    PubMed

    Gardner, Jax Lee

    2013-01-01

    This article explores Marilyn Hacker's 1986 sonnet sequence, Love, Death, and the Changing of the Seasons, for its depiction of lesbian parenting. Hacker moves beyond the simply erotic to focus on a truly subversive act present within the queer community, namely that of child-rearing. Lesbian parenting is a private world, one not subject to the male gaze in the ways that other seemingly private worlds (like sex) are still commodified. The daughter character of Iva exemplifies the construction of self in a queer environment. Children of queer parents have the unique subject position of being "queered" themselves regardless of their ultimate sexual orientation. While this queering would seem to primarily affect their understandings of gender and sexuality, this article argues that such early "othering" serves to deconstruct one's understanding of binaries and social conformity on a large scale, thereby encouraging qualities of acceptance and compassion and increasing the intimate family bond.

  7. SWT voting-based color reduction for text detection in natural scene images

    NASA Astrophysics Data System (ADS)

    Ikica, Andrej; Peer, Peter

    2013-12-01

    In this article, we propose a novel stroke width transform (SWT) voting-based color reduction method for detecting text in natural scene images. Unlike other text detection approaches that mostly rely on either text structure or color, the proposed method combines both by supervising the text-oriented color reduction process with additional SWT information. SWT pixels mapped to color space vote in favor of the color they correspond to. Colors receiving a high SWT vote most likely belong to text areas and are blocked from being mean-shifted away. The literature does not explicitly address the SWT search direction issue; thus, we propose an adaptive sub-block method for determining the correct SWT direction. Both the SWT voting-based color reduction and the SWT direction determination methods are evaluated on binary (text/non-text) images obtained from a challenging Computer Vision Lab optical character recognition database. The SWT voting-based color reduction method outperforms the state-of-the-art text-oriented color reduction approach.

  8. Autonomous celestial navigation based on Earth ultraviolet radiance and fast gradient statistic feature extraction

    NASA Astrophysics Data System (ADS)

    Lu, Shan; Zhang, Hanmo

    2016-01-01

    To meet the requirement of autonomous orbit determination, this paper proposes a fast curve-fitting method based on Earth ultraviolet features to obtain an accurate Earth vector direction and thereby achieve high-precision autonomous navigation. First, combining the stable characters of Earth ultraviolet radiance with atmospheric radiation transmission model software, the paper simulates the Earth ultraviolet radiation model at different times and chooses a proper observation band. Then a fast, improved edge extraction method combining the Sobel operator and the local binary pattern (LBP) is used, which can both eliminate noise efficiently and extract Earth ultraviolet limb features accurately. The Earth's centroid locations on simulated images are estimated via least-squares fitting using part of the limb edges. Taking advantage of the estimated Earth vector direction and Earth distance, an Extended Kalman Filter (EKF) is finally applied to realize autonomous navigation. Experimental results indicate that the proposed method can achieve sub-pixel Earth centroid location estimation and greatly enhance autonomous celestial navigation precision.
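
    The local binary pattern mentioned in this abstract can be illustrated with a minimal sketch. It shows only the basic 8-neighbour LBP code on a small grid of intensities; it is not the authors' combined Sobel/LBP edge extractor, and the function name `local_binary_pattern` and the clockwise bit order are assumptions made for illustration.

```python
def local_binary_pattern(image):
    """Return 8-neighbour LBP codes for the interior pixels of a 2-D list."""
    # Neighbour offsets, clockwise from the top-left pixel (an assumed bit order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for dr, dc in offsets:
                # Each neighbour contributes one bit: 1 if it is >= the centre pixel.
                code = (code << 1) | int(image[r + dr][c + dc] >= center)
            row_codes.append(code)
        codes.append(row_codes)
    return codes
```

    Each interior pixel becomes an 8-bit code, which is why the pattern is called binary: the code records only whether each neighbour is at least as bright as the centre.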

  9. Grain boundary phase transformations in PtAu and relevance to thermal stabilization of bulk nanocrystalline metals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    O’Brien, C. J.; Barr, C. M.; Price, P. M.

    There has recently been a great deal of interest in employing immiscible solutes to stabilize nanocrystalline microstructures. Existing modeling efforts largely rely on mesoscale Monte Carlo approaches that employ a simplified model of the microstructure and result in highly homogeneous segregation to grain boundaries. However, there is ample evidence from experimental and modeling studies that demonstrates segregation to grain boundaries is highly non-uniform and sensitive to boundary character. This work employs a realistic nanocrystalline microstructure with experimentally relevant global solute concentrations to illustrate inhomogeneous boundary segregation. Furthermore, experiments quantifying segregation in thin films are reported that corroborate the prediction that grain boundary segregation is highly inhomogeneous. In addition to grain boundary structure modifying the degree of segregation, the existence of a phase transformation between low and high solute content grain boundaries is predicted. In order to conduct this study, new embedded atom method interatomic potentials are developed for Pt, Au, and the PtAu binary alloy.

  10. Grain boundary phase transformations in PtAu and relevance to thermal stabilization of bulk nanocrystalline metals

    DOE PAGES

    O’Brien, C. J.; Barr, C. M.; Price, P. M.; ...

    2017-10-31

    There has recently been a great deal of interest in employing immiscible solutes to stabilize nanocrystalline microstructures. Existing modeling efforts largely rely on mesoscale Monte Carlo approaches that employ a simplified model of the microstructure and result in highly homogeneous segregation to grain boundaries. However, there is ample evidence from experimental and modeling studies that demonstrates segregation to grain boundaries is highly non-uniform and sensitive to boundary character. This work employs a realistic nanocrystalline microstructure with experimentally relevant global solute concentrations to illustrate inhomogeneous boundary segregation. Furthermore, experiments quantifying segregation in thin films are reported that corroborate the prediction that grain boundary segregation is highly inhomogeneous. In addition to grain boundary structure modifying the degree of segregation, the existence of a phase transformation between low and high solute content grain boundaries is predicted. In order to conduct this study, new embedded atom method interatomic potentials are developed for Pt, Au, and the PtAu binary alloy.

  11. Identification of a microsporidian isolate from Cnaphalocrocis medinalis and its pathogenicity to Bombyx mori.

    PubMed

    Huang, Xuhua; Qi, Guangjun; Pan, Zhixin; Zhu, Fangrong; Huang, Yuanjiao; Wu, Yonghu

    2014-11-01

    A microsporidian, CmM2, was isolated from Cnaphalocrocis medinalis. The biological characters, molecular analysis and pathogenicity of CmM2 were studied. The spore of CmM2 is long-oval in shape and 3.45 ± 0.25 × 1.68 ± 0.18 µm in size. The life cycle includes meronts, sporonts, sporoblasts, and spores, with a typical diplokaryon in each stage, and propagation is by binary fission. There is a positive coagulation reaction between CmM2 and the polyclonal antibody of Nosema bombycis (N.b.). CmM2 spores are binuclear and have 10-12 polar filament coils. The small subunit ribosomal RNA (SSU rRNA) gene sequence of CmM2 was obtained by PCR amplification and sequencing, a phylogenetic tree based on SSU rRNA sequences was constructed, and the similarity and genetic distance of the SSU rRNA sequences were analyzed, showing that CmM2 grouped in the Nosema clade. The 50% infectious concentration of CmM2 to Bombyx mori is 4.72 × 10(4) spores ml(-1), and the germinative infection rate is 12.33%. The results showed that CmM2 is classified into the genus Nosema, as Nosema sp. CmM2, and is highly infective to B. mori. The results also indicate that it is valuable to base the taxonomic determination of microsporidian isolates on both biological characters and molecular evidence. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Orion Script Generator

    NASA Technical Reports Server (NTRS)

    Dooling, Robert J.

    2012-01-01

    NASA Engineering's Orion Script Generator (OSG) is a program designed to run on Exploration Flight Test One Software. The script generator creates a SuperScript file that, when run, accepts the filename for a listing of Compact Unique Identifiers (CUIs). These CUIs will correspond to different variables on the Orion spacecraft, such as the temperature of a component X, the active or inactive status of another component Y, and so on. OSG will use a linked database to retrieve the value for each CUI, such as "100 05," "True," and so on. Finally, OSG writes SuperScript code to display each of these variables before outputting the ssi file that allows recipients to view a graphical representation of Orion Flight Test One's status through these variables. This project's main challenge was creating flexible software that accepts and transfers many types of data, from Boolean (true or false) values to "Unsigned Long Long" values (any number from 0 to 18,446,744,073,709,551,615). We also needed to allow bit manipulation for each variable, requiring us to program functions that could convert any of the multiple types of data into binary code. Throughout the project, we explored different methods to optimize the speed of working with the CUI database and long binary numbers. For example, the program handled extended binary numbers much more efficiently when we stored them as collections of Boolean values (true or false representing 1 or 0) instead of as collections of character strings or numbers. We also strove to make OSG as user-friendly and accommodating of different needs as possible; its default behavior is to display a current CUI's maximum value and minimum value with three to five intermediate values in between, all in descending order. Fortunately, users can also add other input on the same lines as each CUI name to request different high values, low values, display options (ascending, sine, and so on), and interval sizes for generating intermediate values. 
Developing input validation took up quite a bit of time, but OSG's flexibility in the end was worth it.
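
    The Boolean-collection representation and the default descending display described above can be sketched as follows. This is an illustration in Python under stated assumptions, not OSG's SuperScript code: the names `to_bits`, `from_bits` and `display_values`, the most-significant-bit-first ordering, and the even spacing of the intermediate values are all hypothetical.

```python
def to_bits(value, width=64):
    """Return `value` as a list of Booleans, most significant bit first."""
    if not 0 <= value < 2 ** width:
        raise ValueError("value does not fit in the requested width")
    return [bool((value >> shift) & 1) for shift in range(width - 1, -1, -1)]


def from_bits(bits):
    """Rebuild the integer from a most-significant-first Boolean list."""
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit)
    return value


def display_values(low, high, steps=3):
    """Return high, `steps` evenly spaced intermediate values, then low (descending)."""
    span = high - low
    return [high - span * i // (steps + 1) for i in range(steps + 2)]
```

    With these definitions, `to_bits(18446744073709551615)` yields 64 `True` values, the largest "Unsigned Long Long" quoted in the abstract, and `display_values(0, 100)` walks from the maximum down to the minimum through three intermediate values.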

  13. Pan-Genomic Analysis Permits Differentiation of Virulent and Non-virulent Strains of Xanthomonas arboricola That Cohabit Prunus spp. and Elucidate Bacterial Virulence Factors

    PubMed Central

    Garita-Cambronero, Jerson; Palacio-Bielsa, Ana; López, María M.; Cubero, Jaime

    2017-01-01

    Xanthomonas arboricola is a plant-associated bacterial species that causes diseases on several plant hosts. One of the most virulent pathovars within this species is X. arboricola pv. pruni (Xap), the causal agent of bacterial spot disease of stone fruit trees and almond. Recently, a non-virulent Xap-look-a-like strain isolated from Prunus was characterized and its genome compared to pathogenic strains of Xap, revealing differences in the profile of virulence factors, such as the genes related to the type III secretion system (T3SS) and type III effectors (T3Es). The existence of this atypical strain raises several questions associated with the abundance, the pathogenicity, and the evolutionary context of X. arboricola on Prunus hosts. After an initial characterization of a collection of Xanthomonas strains isolated from Prunus bacterial spot outbreaks in Spain during the past decade, six Xap-look-a-like strains that did not cluster with the pathogenic strains of Xap according to a multilocus sequence analysis were identified. Pathogenicity of these strains was analyzed and the genome sequences of two Xap-look-a-like strains, CITA 14 and CITA 124, non-virulent to Prunus spp., were obtained and compared to the available genomes of X. arboricola associated with this host plant. Differences were found among the genomes of the virulent and the Prunus non-virulent strains in several characters related to the pathogenesis process. Additionally, a pan-genomic analysis that included the available genomes of X. arboricola revealed that the atypical strains associated with Prunus were related to a group of non-virulent or low virulent strains isolated from a wide host range. The repertoire of the genes related to T3SS and T3Es varied among the strains of this cluster and those strains related to the most virulent pathovars of the species, corylina, juglandis, and pruni.
This variability provides information about the potential evolutionary process associated with the acquisition of pathogenicity and host specificity in X. arboricola. Finally, based on the genomic differences observed between the virulent and the non-virulent strains isolated from Prunus, a sensitive and specific real-time PCR protocol was designed to detect and identify Xap strains. This method avoids misidentifications due to atypical strains of X. arboricola that can cohabit Prunus. PMID:28450852

  14. Diversity in Genetic In Vivo Methods for Protein-Protein Interaction Studies: from the Yeast Two-Hybrid System to the Mammalian Split-Luciferase System

    PubMed Central

    Stynen, Bram; Tournu, Hélène; Tavernier, Jan

    2012-01-01

    Summary: The yeast two-hybrid system pioneered the field of in vivo protein-protein interaction methods and undisputedly gave rise to a palette of ingenious techniques that are constantly pushing further the limits of the original method. Sensitivity and selectivity have improved because of various technical tricks and experimental designs. Here we present an exhaustive overview of the genetic approaches available to study in vivo binary protein interactions, based on two-hybrid and protein fragment complementation assays. These methods have been engineered and employed successfully in microorganisms such as Saccharomyces cerevisiae and Escherichia coli, but also in higher eukaryotes. From single binary pairwise interactions to whole-genome interactome mapping, the self-reassembly concept has been employed widely. Innovative studies report the use of proteins such as ubiquitin, dihydrofolate reductase, and adenylate cyclase as reconstituted reporters. Protein fragment complementation assays have extended the possibilities in protein-protein interaction studies, with technologies that enable spatial and temporal analyses of protein complexes. In addition, one-hybrid and three-hybrid systems have broadened the types of interactions that can be studied and the findings that can be obtained. Applications of these technologies are discussed, together with the advantages and limitations of the available assays. PMID:22688816

  15. Genomic Biomarkers for the Prediction of Stage and Prognosis of Upper Tract Urothelial Carcinoma.

    PubMed

    Bagrodia, Aditya; Cha, Eugene K; Sfakianos, John P; Zabor, Emily C; Bochner, Bernard H; Al-Ahmadie, Hikmat A; Solit, David B; Coleman, Jonathan A; Iyer, Gopa; Scott, Sasinya N; Shah, Ronak; Ostrovnaya, Irina; Lee, Byron; Desai, Neil B; Ren, Qinghu; Rosenberg, Jonathan E; Dalbagni, Guido; Bajorin, Dean F; Reuter, Victor E; Berger, Michael F

    2016-06-01

    Genomic characterization of radical nephroureterectomy specimens in patients with upper tract urothelial carcinoma may allow for thoughtful integration of systemic and targeted therapies. We sought to determine whether genomic alterations in upper tract urothelial carcinoma are associated with adverse pathological and clinical outcomes. Next generation exon capture sequencing of 300 cancer associated genes was performed in 83 patients with upper tract urothelial carcinoma. Genomic alterations were assessed individually and also grouped into core signal transduction pathways or canonical cell functions for association with clinicopathological outcomes. Binary outcomes, including grade (high vs low), T stage (pTa/T1/T2 vs pT3/T4) and organ confined status (pT2 or less and N0/Nx vs greater than pT2 or N+) were assessed with the Kruskal-Wallis and Fisher exact tests as appropriate. Associations between alterations and survival were estimated using the Kaplan-Meier method and Cox regression. Of the 24 most commonly altered genes in 9 pathways TP53/MDM2 alterations and FGFR3 mutations were the only 2 alterations uniformly associated with high grade, advanced stage, nonorgan confined disease, and recurrence-free and cancer specific survival. TP53/MDM2 alterations were associated with adverse clinicopathological outcomes whereas FGFR3 mutations were associated with favorable outcomes. We created a risk score using TP53/MDM2 and FGFR3 status that was able to discriminate between adverse pathological and clinical outcomes, including in the subset of patients with high grade disease. The study is limited by small numbers and lack of validation. Our data indicate that specific genomic alterations in radical nephroureterectomy specimens correlate with tumor grade, stage and cancer specific survival outcomes. Copyright © 2016 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
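
    As a reminder of how a Fisher exact test treats a binary outcome, the sketch below computes a two-sided p-value for a 2x2 table from the hypergeometric distribution. The function name `fisher_exact_two_sided` is an assumption, and any counts passed to it here are made-up illustration values, not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
```

    For a hypothetical table such as altered/unaltered gene status against high/low grade with counts 8, 2, 1 and 5, the call `fisher_exact_two_sided(8, 2, 1, 5)` gives a p-value of about 0.035.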

  16. HIA: a genome mapper using hybrid index-based sequence alignment.

    PubMed

    Choi, Jongpill; Park, Kiejung; Cho, Seong Beom; Chung, Myungguen

    2015-01-01

    A number of alignment tools have been developed to align sequencing reads to the human reference genome. The scale of information from next-generation sequencing (NGS) experiments, however, is increasing rapidly. Recent studies based on NGS technology have routinely produced exome or whole-genome sequences from several hundreds or thousands of samples. To accommodate the increasing need of analyzing very large NGS data sets, it is necessary to develop faster, more sensitive and accurate mapping tools. HIA uses two indices, a hash table index and a suffix array index. The hash table performs direct lookup of a q-gram, and the suffix array performs very fast lookup of variable-length strings by exploiting binary search. We observed that combining hash table and suffix array (hybrid index) is much faster than the suffix array method for finding a substring in the reference sequence. Here, we define a matching region (MR) as a longest common substring between a reference and a read, and a candidate alignment region (CAR) as a list of MRs that are close to each other. The hybrid index is used to find CARs between a reference and a read. We found that aligning only the unmatched regions in the CAR is much faster than aligning the whole CAR. In benchmark analysis, HIA outperformed the other aligners in mapping speed, without significant loss of mapping accuracy. Our experiments show that the hybrid of hash table and suffix array is useful in terms of speed for mapping NGS sequencing reads to the human reference genome sequence. In conclusion, our tool is appropriate for aligning massive data sets generated by NGS sequencing.
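
    The two index types named in this abstract can be sketched in a few lines. The code is a toy illustration, not HIA's implementation: the function names are hypothetical, and the suffix array is built naively by sorting suffixes, which would not scale to a genome.

```python
from collections import defaultdict


def build_qgram_index(reference, q):
    """Hash table mapping each q-gram to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - q + 1):
        index[reference[i:i + q]].append(i)
    return index


def build_suffix_array(reference):
    """Naive suffix array: suffix start positions in lexicographic order."""
    return sorted(range(len(reference)), key=lambda i: reference[i:])


def find_with_suffix_array(reference, suffix_array, pattern):
    """Binary search for the block of suffixes that start with `pattern`."""
    m = len(pattern)
    lo, hi = 0, len(suffix_array)
    while lo < hi:  # lower bound: first suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if reference[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(suffix_array)
    while lo < hi:  # upper bound: first suffix whose m-prefix is > pattern
        mid = (lo + hi) // 2
        if reference[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[start:lo])
```

    The hash table answers fixed-length q-gram queries in one lookup, while the suffix array handles patterns of any length at the cost of a logarithmic number of string comparisons, which is the trade-off a hybrid index tries to exploit.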

  17. MultiMetEval: Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

    PubMed Central

    Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko

    2012-01-01

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the context of multiple cellular objectives. Here, we present the user-friendly software framework Multi-Metabolic Evaluator (MultiMetEval), built upon SurreyFBA, which allows the user to compose collections of metabolic models that together can be subjected to flux balance analysis. Additionally, MultiMetEval implements functionalities for multi-objective analysis by calculating the Pareto front between two cellular objectives. Using a previously generated dataset of 38 actinobacterial genome-scale metabolic models, we show how these approaches can lead to exciting novel insights. Firstly, after incorporating several pathways for the biosynthesis of natural products into each of these models, comparative flux balance analysis predicted that species like Streptomyces that harbour the highest diversity of secondary metabolite biosynthetic gene clusters in their genomes do not necessarily have the metabolic network topology most suitable for compound overproduction. Secondly, multi-objective analysis of biomass production and natural product biosynthesis in these actinobacteria shows that the well-studied occurrence of discrete metabolic switches during the change of cellular objectives is inherent to their metabolic network architecture. Comparative and multi-objective modelling can lead to insights that could not be obtained by normal flux balance analyses. MultiMetEval provides a powerful platform that makes these analyses straightforward for biologists. Sources and binaries of MultiMetEval are freely available from https://github.com/PiotrZakrzewski/MetEval/downloads. PMID:23272111
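
    The Pareto front between two objectives can be illustrated independently of any metabolic model. The sketch below assumes that candidate solutions have already been evaluated as (objective 1, objective 2) pairs, both to be maximized; it does not use MultiMetEval or SurreyFBA, and the name `pareto_front` is hypothetical.

```python
def pareto_front(points):
    """Return the non-dominated points when both objectives are maximized."""
    front = []
    best_second = float("-inf")
    # Visit points by the first objective (descending), ties by the second (descending).
    for first, second in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        # A point survives only if it beats every earlier point on the second objective.
        if second > best_second:
            front.append((first, second))
            best_second = second
    return front
```

    Applied to pairs such as (biomass flux, product flux), the returned list traces the trade-off curve: no retained point can be improved on one objective without losing on the other.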

  18. Fast and accurate phylogeny reconstruction using filtered spaced-word matches

    PubMed Central

    Sohrabi-Jahromi, Salma; Morgenstern, Burkhard

    2017-01-01

    Abstract Motivation: Word-based or ‘alignment-free’ algorithms are increasingly used for phylogeny reconstruction and genome comparison, since they are much faster than traditional approaches that are based on full sequence alignments. Existing alignment-free programs, however, are less accurate than alignment-based methods. Results: We propose Filtered Spaced Word Matches (FSWM), a fast alignment-free approach to estimate phylogenetic distances between large genomic sequences. For a pre-defined binary pattern of match and don’t-care positions, FSWM rapidly identifies spaced word-matches between input sequences, i.e. gap-free local alignments with matching nucleotides at the match positions and with mismatches allowed at the don’t-care positions. We then estimate the number of nucleotide substitutions per site by considering the nucleotides aligned at the don’t-care positions of the identified spaced-word matches. To reduce the noise from spurious random matches, we use a filtering procedure where we discard all spaced-word matches for which the overall similarity between the aligned segments is below a threshold. We show that our approach can accurately estimate substitution frequencies even for distantly related sequences that cannot be analyzed with existing alignment-free methods; phylogenetic trees constructed with FSWM distances are of high quality. A program run on a pair of eukaryotic genomes of a few hundred Mb each takes a few minutes. Availability and Implementation: The program source code for FSWM including a documentation, as well as the software that we used to generate artificial genome sequences are freely available at http://fswm.gobics.de/ Contact: chris.leimeister@stud.uni-goettingen.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28073754
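
    The idea of a spaced-word match under a binary pattern can be sketched as follows. This is a simplified illustration, not the FSWM program: the function names are hypothetical, '1' marks a match position and '0' a don't-care position, and the `min_identity` cut-off is a crude stand-in for the similarity filter described in the abstract.

```python
from collections import defaultdict


def spaced_word_matches(seq_a, seq_b, pattern):
    """Yield (i, j) where seq_a[i:] and seq_b[j:] agree at every '1' position."""
    match_pos = [k for k, c in enumerate(pattern) if c == "1"]
    span = len(pattern)
    words = defaultdict(list)
    for j in range(len(seq_b) - span + 1):
        words[tuple(seq_b[j + k] for k in match_pos)].append(j)
    for i in range(len(seq_a) - span + 1):
        key = tuple(seq_a[i + k] for k in match_pos)
        for j in words.get(key, []):
            yield i, j


def dont_care_mismatch_rate(seq_a, seq_b, pattern, min_identity=0.0):
    """Fraction of mismatches at '0' positions over the retained matches."""
    dont_care = [k for k, c in enumerate(pattern) if c == "0"]
    span = len(pattern)
    mismatches = compared = 0
    for i, j in spaced_word_matches(seq_a, seq_b, pattern):
        same = sum(seq_a[i + k] == seq_b[j + k] for k in range(span))
        if same / span < min_identity:  # assumed filter: drop low-similarity matches
            continue
        for k in dont_care:
            compared += 1
            mismatches += seq_a[i + k] != seq_b[j + k]
    return mismatches / compared if compared else None
```

    The mismatch rate at the don't-care positions is the raw quantity from which a substitution frequency would then be estimated; the correction from observed mismatches to substitutions per site is not shown here.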

  19. Fast and accurate phylogeny reconstruction using filtered spaced-word matches.

    PubMed

    Leimeister, Chris-André; Sohrabi-Jahromi, Salma; Morgenstern, Burkhard

    2017-04-01

    Word-based or 'alignment-free' algorithms are increasingly used for phylogeny reconstruction and genome comparison, since they are much faster than traditional approaches that are based on full sequence alignments. Existing alignment-free programs, however, are less accurate than alignment-based methods. We propose Filtered Spaced Word Matches (FSWM), a fast alignment-free approach to estimate phylogenetic distances between large genomic sequences. For a pre-defined binary pattern of match and don't-care positions, FSWM rapidly identifies spaced word-matches between input sequences, i.e. gap-free local alignments with matching nucleotides at the match positions and with mismatches allowed at the don't-care positions. We then estimate the number of nucleotide substitutions per site by considering the nucleotides aligned at the don't-care positions of the identified spaced-word matches. To reduce the noise from spurious random matches, we use a filtering procedure where we discard all spaced-word matches for which the overall similarity between the aligned segments is below a threshold. We show that our approach can accurately estimate substitution frequencies even for distantly related sequences that cannot be analyzed with existing alignment-free methods; phylogenetic trees constructed with FSWM distances are of high quality. A program run on a pair of eukaryotic genomes of a few hundred Mb each takes a few minutes. The program source code for FSWM including a documentation, as well as the software that we used to generate artificial genome sequences are freely available at http://fswm.gobics.de/. Contact: chris.leimeister@stud.uni-goettingen.de. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  20. Identification of a Retroelement from the Resurrection Plant Boea hygrometrica That Confers Osmotic and Alkaline Tolerance in Arabidopsis thaliana

    PubMed Central

    Shen, Chun-Ying; Xu, Guang-Hui; Chen, Shi-Xuan; Song, Li-Zhen; Li, Mei-Jing; Wang, Li-Li; Zhu, Yan; Lv, Wei-Tao; Gong, Zhi-Zhong; Liu, Chun-Ming; Deng, Xin

    2014-01-01

    Functional genomic elements, including transposable elements, small RNAs and non-coding RNAs, are involved in regulation of gene expression in response to plant stress. To identify genomic elements that regulate dehydration and alkaline tolerance in Boea hygrometrica, a resurrection plant that inhabits drought and alkaline Karst areas, a genomic DNA library from B. hygrometrica was constructed and subsequently transformed into Arabidopsis using binary bacterial artificial chromosome (BIBAC) vectors. Transgenic lines were screened under osmotic and alkaline conditions, leading to the identification of Clone L1-4 that conferred osmotic and alkaline tolerance. Sequence analyses revealed that L1-4 contained a 49-kb retroelement fragment from B. hygrometrica, of which only a truncated sequence was present in L1-4 transgenic Arabidopsis plants. Additional subcloning revealed that activity resided in a 2-kb sequence, designated Osmotic and Alkaline Resistance 1 (OAR1). In addition, transgenic Arabidopsis lines carrying an OAR1-homologue also showed similar stress tolerance phenotypes. Physiological and molecular analyses demonstrated that OAR1-transgenic plants exhibited improved photochemical efficiency and membrane integrity and biomarker gene expression under both osmotic and alkaline stresses. Short transcripts that originated from OAR1 were increased under stress conditions in both B. hygrometrica and Arabidopsis carrying OAR1. The relative copy number of OAR1 was stable in transgenic Arabidopsis under stress but increased in B. hygrometrica. Taken together, our results indicated a potential role of OAR1 element in plant tolerance to osmotic and alkaline stresses, and verified the feasibility of the BIBAC transformation technique to identify functional genomic elements from physiological model species. PMID:24851859

  1. Hemp (Cannabis sativa L.).

    PubMed

    Feeney, Mistianne; Punja, Zamir K

    2015-01-01

    Hemp (Cannabis sativa L.) suspension culture cells were transformed with Agrobacterium tumefaciens strain EHA101 carrying the binary plasmid pNOV3635. The plasmid contains a phosphomannose isomerase (PMI) selectable marker gene. Cells transformed with PMI are capable of metabolizing the selective agent mannose, whereas cells not expressing the gene are incapable of using the carbon source and will stop growing. Callus masses proliferating on selection medium were screened for PMI expression using a chlorophenol red assay. Genomic DNA was extracted from putatively transformed callus lines, and the presence of the PMI gene was confirmed using PCR and Southern hybridization. Using this method, an average transformation frequency of 31.23% ± 0.14 was obtained for all transformation experiments, with a range of 15.1-55.3%.

  2. Genome-wide association study for ketosis in US Jerseys using producer-recorded data.

    PubMed

    Parker Gaddis, K L; Megonigal, J H; Clay, J S; Wolfe, C W

    2018-01-01

    Ketosis is one of the most frequently reported metabolic health events in dairy herds. Several genetic analyses of ketosis in dairy cattle have been conducted; however, few have focused specifically on Jersey cattle. The objectives of this research included estimating variance components for susceptibility to ketosis and identification of genomic regions associated with ketosis in Jersey cattle. Voluntary producer-recorded health event data related to ketosis were available from Dairy Records Management Systems (Raleigh, NC). Standardization was implemented to account for the various acronyms used by producers to designate an incidence of ketosis. Events were restricted to the first reported incidence within 60 d after calving in first through fifth parities. After editing, there were a total of 42,233 records from 23,865 cows. A total of 1,750 genotyped animals were used for genomic analyses using 60,671 markers. Because of the binary nature of the trait, a threshold animal model was fitted using THRGIBBS1F90 (version 2.110) using only pedigree information, and genomic information was incorporated using a single-step genomic BLUP approach. Individual single nucleotide polymorphism (SNP) effects and the proportion of variance explained by 10-SNP windows were calculated using postGSf90 (version 1.38). Heritability of susceptibility to ketosis was 0.083 [standard deviation (SD) = 0.021] and 0.078 (SD = 0.018) in pedigree-based and genomic analyses, respectively. The marker with the largest associated effect was located on chromosome 10 at 66.3 Mbp. The 10-SNP window explaining the largest proportion of variance (0.70%) was located on chromosome 6 beginning at 56.1 Mbp. Gene Ontology (GO) and Medical Subject Heading (MeSH) enrichment analyses identified several overrepresented processes and terms related to immune function. 
Our results indicate that there is a genetic component related to ketosis susceptibility in Jersey cattle and, as such, genetic selection for improved resistance to ketosis is feasible. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  3. Comparative Genomics Evidence That Only Protein Toxins are Tagging Bad Bugs

    PubMed Central

    Georgiades, Kalliopi; Raoult, Didier

    2011-01-01

    The term toxin was introduced by Roux and Yersin and describes macromolecular substances that, when produced during infection or when introduced parenterally or orally, cause an impairment of physiological functions that lead to disease or to the death of the infected organism. Long after the discovery of toxins, early genetic studies on bacterial virulence demonstrated that removing a certain number of genes from pathogenic bacteria decreases their capacity to infect hosts. Each of the removed factors was therefore referred to as a “virulence factor,” and it was speculated that non-pathogenic bacteria lack such supplementary factors. However, many recent comparative studies demonstrate that the specialization of bacteria to eukaryotic hosts is associated with massive gene loss. We recently demonstrated that the only features that seem to characterize 12 epidemic bacteria are toxin–antitoxin (TA) modules, which are addiction molecules in host bacteria. In this study, we investigated if protein toxins are indeed the only molecules specific to pathogenic bacteria by comparing 14 epidemic bacterial killers (“bad bugs”) with their 14 closest non-epidemic relatives (“controls”). We found protein toxins in significantly higher numbers in all of the “bad bugs.” For the first time, statistical principal components analysis, including genome size, GC%, TA modules, restriction enzymes, and toxins, revealed that toxins are the only proteins other than TA modules that are correlated with the pathogenic character of bacteria. Moreover, intracellular toxins appear to be more correlated with the pathogenic character of bacteria than secreted toxins.
In conclusion, we hypothesize that the only truly identifiable phenomena witnessing the convergent evolution of the bacteria most pathogenic for humans are the loss of metabolic activities (i.e., the outcome of the loss of regulatory and transcription factors) and the presence of protein toxins, alone or coupled as TA modules. PMID:22919573

  4. Independent origins of neurons and synapses: insights from ctenophores

    PubMed Central

    Moroz, Leonid L.; Kohn, Andrea B.

    2016-01-01

    There is more than one way to develop neuronal complexity, and animals frequently use different molecular toolkits to achieve similar functional outcomes. Genomics and metabolomics data from basal metazoans suggest that neural signalling evolved independently in ctenophores and cnidarians/bilaterians. This polygenesis hypothesis explains the lack of pan-neuronal and pan-synaptic genes across metazoans, including remarkable examples of lineage-specific evolution of neurogenic and signalling molecules as well as synaptic components. Sponges and placozoans are two lineages without neural and muscular systems. The possibility of secondary loss of neurons and synapses in the Porifera/Placozoa clades is a highly unlikely and less parsimonious scenario. We conclude that acetylcholine, serotonin, histamine, dopamine, octopamine and gamma-aminobutyric acid (GABA) were recruited as transmitters in the neural systems in cnidarian and bilaterian lineages. By contrast, ctenophores independently evolved numerous secretory peptides, indicating extensive adaptations within the clade and suggesting that early neural systems might be peptidergic. Comparative analysis of glutamate signalling also shows numerous lineage-specific innovations, implying the extensive use of this ubiquitous metabolite and intercellular messenger over the course of convergent and parallel evolution of mechanisms of intercellular communication. Therefore: (i) we view a neuron as a functional character but not a genetic character, and (ii) any given neural system cannot be considered as a single character because it is composed of different cell lineages with distinct genealogies, origins and evolutionary histories. Thus, when reconstructing the evolution of nervous systems, we ought to start with the identification of particular cell lineages by establishing distant neural homologies or examples of convergent evolution. 
In a corollary of the hypothesis of the independent origins of neurons, our analyses suggest that both electrical and chemical synapses evolved more than once. PMID:26598724

  5. Genetic Diversity and Population Structure of Rice Varieties Cultivated in Temperate Regions.

    PubMed

    Reig-Valiente, Juan L; Viruel, Juan; Sales, Ester; Marqués, Luis; Terol, Javier; Gut, Marta; Derdak, Sophia; Talón, Manuel; Domingo, Concha

    2016-12-01

    After its domestication, rice cultivation expanded from tropical regions towards northern latitudes with temperate climate in a progressive process to overcome limiting photoperiod and temperature conditions. This process has generated a wide range of diversity that can be regarded as a valuable resource for crop improvement. In general, current rice breeding programs have to deal with a lack of both germplasm accessions specifically adapted to local agro-environmental conditions and adapted donors carrying desired agronomical traits. Comprehensive maps of genome variability and population structure would facilitate genome-wide association studies of complex traits, functional gene investigations and the selection of appropriate donors for breeding purposes. A collection of 217 rice varieties mainly cultivated in temperate regions was generated. The collection encompasses modern elite and old cultivars, as well as traditional landraces covering a wide genetic diversity available for rice breeders. Whole Genome Sequencing was performed on 14 cultivars representative of the collection and the genomic profiles of all cultivars were constructed using a panel of 2697 SNPs with wide coverage throughout the rice genome, obtained from the sequencing data. The population structure and genetic relationship analyses showed a strong substructure in the temperate rice population, predominantly based on grain type and the origin of the cultivars. The dendrogram also agrees with the population structure results. Based on SNP markers, we have elucidated the genetic relationship and the degree of genetic diversity among a collection of 217 temperate rice varieties possessing an enormous variety of agromorphological and physiological characters. Taken together, the data indicated the occurrence of relatively high gene flow and elevated rates of admixture between cultivars grown in remote regions, probably favoured by local breeding activities. 
The results of this study significantly expand the current genetic resources available for temperate varieties of rice, providing a valuable tool for future association mapping studies.

  6. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae).

    PubMed

    McNeal, Joel R; Arumugunathan, Kathiravetpilla; Kuehl, Jennifer V; Boore, Jeffrey L; Depamphilis, Claude W

    2007-12-13

    The genus Cuscuta L. (Convolvulaceae), commonly known as dodders, are epiphytic vines that invade the stems of their host with haustorial feeding structures at the points of contact. Although they lack expanded leaves, some species are noticeably chlorophyllous, especially as seedlings and in maturing fruits. Some species are reported as crop pests of worldwide distribution, whereas others are extremely rare and have local distributions and apparent niche specificity. A strong phylogenetic framework for this large genus is essential to understand the interesting ecological, morphological and molecular phenomena that occur within these parasites in an evolutionary context. Here we present a well-supported phylogeny of Cuscuta using sequences of the nuclear ribosomal internal transcribed spacer and plastid rps2, rbcL and matK from representatives across most of the taxonomic diversity of the genus. We use the phylogeny to interpret morphological and plastid genome evolution within the genus. At least three currently recognized taxonomic sections are not monophyletic and subgenus Cuscuta is unequivocally paraphyletic. Plastid genes are extremely variable with regards to evolutionary constraint, with rbcL exhibiting even higher levels of purifying selection in Cuscuta than photosynthetic relatives. Nuclear genome size is highly variable within Cuscuta, particularly within subgenus Grammica, and in some cases may indicate the existence of cryptic species in this large clade of morphologically similar species. Some morphological characters traditionally used to define major taxonomic splits within Cuscuta are homoplastic and are of limited use in defining true evolutionary groups. Chloroplast genome evolution seems to have evolved in a punctuated fashion, with episodes of loss involving suites of genes or tRNAs followed by stabilization of gene content in major clades. 
Nearly all species of Cuscuta retain some photosynthetic ability, most likely for nutrient apportionment to their seeds, while complete loss of photosynthesis and possible loss of the entire chloroplast genome is limited to a single small clade of outcrossing species found primarily in western South America.

  7. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae)

    PubMed Central

    McNeal, Joel R; Arumugunathan, Kathiravetpilla; Kuehl, Jennifer V; Boore, Jeffrey L; dePamphilis, Claude W

    2007-01-01

    Background: The genus Cuscuta L. (Convolvulaceae), commonly known as dodders, are epiphytic vines that invade the stems of their host with haustorial feeding structures at the points of contact. Although they lack expanded leaves, some species are noticeably chlorophyllous, especially as seedlings and in maturing fruits. Some species are reported as crop pests of worldwide distribution, whereas others are extremely rare and have local distributions and apparent niche specificity. A strong phylogenetic framework for this large genus is essential to understand the interesting ecological, morphological and molecular phenomena that occur within these parasites in an evolutionary context. Results: Here we present a well-supported phylogeny of Cuscuta using sequences of the nuclear ribosomal internal transcribed spacer and plastid rps2, rbcL and matK from representatives across most of the taxonomic diversity of the genus. We use the phylogeny to interpret morphological and plastid genome evolution within the genus. At least three currently recognized taxonomic sections are not monophyletic and subgenus Cuscuta is unequivocally paraphyletic. Plastid genes are extremely variable with regards to evolutionary constraint, with rbcL exhibiting even higher levels of purifying selection in Cuscuta than photosynthetic relatives. Nuclear genome size is highly variable within Cuscuta, particularly within subgenus Grammica, and in some cases may indicate the existence of cryptic species in this large clade of morphologically similar species. Conclusion: Some morphological characters traditionally used to define major taxonomic splits within Cuscuta are homoplastic and are of limited use in defining true evolutionary groups. Chloroplast genome evolution seems to have evolved in a punctuated fashion, with episodes of loss involving suites of genes or tRNAs followed by stabilization of gene content in major clades. 
Nearly all species of Cuscuta retain some photosynthetic ability, most likely for nutrient apportionment to their seeds, while complete loss of photosynthesis and possible loss of the entire chloroplast genome is limited to a single small clade of outcrossing species found primarily in western South America. PMID:18078516

  8. First Nuclear DNA Amounts in more than 300 Angiosperms

    PubMed Central

    ZONNEVELD, B. J. M.; LEITCH, I. J.; BENNETT, M. D.

    2005-01-01

    Background and Aims: Genome size (DNA C-value) data are key biodiversity characters of fundamental significance used in a wide variety of biological fields. Since 1976, Bennett and colleagues have made scattered published and unpublished genome size data more widely accessible by assembling them into user-friendly compilations. Initially these were published as hard copy lists, but since 1997 they have also been made available electronically (see the Plant DNA C-values database www.kew.org/cval/homepage.html). Nevertheless, at the Second Plant Genome Size Meeting in 2003, Bennett noted that as many as 1000 DNA C-value estimates were still unpublished and hence unavailable. Scientists were strongly encouraged to communicate such unpublished data. The present work combines the databasing experience of the Kew-based authors with the unpublished C-values produced by Zonneveld to make a large body of valuable genome size data available to the scientific community. Methods: C-values for angiosperm species, selected primarily for their horticultural interest, were estimated by flow cytometry using the fluorochrome propidium iodide. The data were compiled into a table whose form is similar to previously published lists of DNA amounts by Bennett and colleagues. Key Results and Conclusions: The present work contains C-values for 411 taxa including first values for 308 species not listed previously by Bennett and colleagues. Based on a recent estimate of the global published output of angiosperm DNA C-value data (i.e. 200 first C-value estimates per annum) the present work equals 1.5 years of average global published output; and constitutes over 12% of the latest 5-year global target set by the Second Plant Genome Size Workshop (see www.kew.org/cval/workshopreport.html). Hopefully, the present example will encourage others to unveil further valuable data which otherwise may lie forever unpublished and unavailable for comparative analyses. PMID:15905300

  9. [The great virus comeback].

    PubMed

    Forterre, Patrick

    2013-01-01

    Viruses have been considered for a long time as by-products of biological evolution. This view is changing now as a result of several recent discoveries. Viral ecologists have shown that viral particles are the most abundant biological entities on our planet, whereas metagenomic analyses have revealed an unexpected abundance and diversity of viral genes in the biosphere. Comparative genomics have highlighted the uniqueness of viral sequences, in contradiction with the traditional view of viruses as pickpockets of cellular genes. On the contrary, cellular genomes, especially eukaryotic ones, turned out to be full of genes derived from viruses or related elements (plasmids, transposons, retroelements and so on). The discovery of unusual viruses infecting archaea has shown that the viral world is much more diverse than previously thought, ruining the traditional dichotomy between bacteriophages and viruses. Finally, the discovery of giant viruses has blurred the traditional image of viruses as small entities. Furthermore, essential clues on virus history have been obtained in the last ten years. In particular, structural analyses of capsid proteins have uncovered deeply rooted homologies between viruses infecting different cellular domains, suggesting that viruses originated before the last universal common ancestor (LUCA). These studies have shown that several lineages of viruses originated independently, i.e., viruses are polyphyletic. From the time of LUCA, viruses have coevolved with their hosts, and viral lineages can be viewed as lianas wrapping around the trunk, branches and leaves of the tree of life. Although viruses are very diverse, with genomes encoding from one to more than one thousand proteins, they can all be simply defined as organisms producing virions. 
Virions themselves can be defined as infectious particles made of at least one protein associated with the viral nucleic acid, endowed with the capability to protect the viral genome and ensure its delivery to the infected cell. These definitions, which clearly distinguish viruses from plasmids, suggest that infectious RNA molecules that only encode an RNA replicase presently classified among viruses by the ICTV (International Committee for the Taxonomy of Viruses) into families of Endornaviridae and Hypoviridae are in fact RNA plasmids. Since a viral genome should encode for at least one structural protein, these definitions also imply that viruses originated after the emergence of the ribosome in an RNA-protein cellular world. Although virions are the hallmarks of viruses, viruses and virions should not be confused. The infection transforms the ribocell (cell encoding ribosomes and dividing by binary fission) into a virocell (cell producing virions) or ribovirocell (cell that produces virions but can still divide by binary fission). In the ribovirocell, two different organisms, defined by their distinct evolutionary histories, coexist in symbiosis in the same cell. The virocells or ribovirocells are the living forms of the virus, which can be in fine considered to be a living organism. In the virocell, the metabolism is reorganized for the production of virions, while the ability to capture and store free energy is retained, as in other cellular organisms. In the virocell, viral genomes replicate, recombine and evolve, leading to the emergence of new viral proteins and potentially novel functions. Some of these new functions can be later on transferred to the cell, explaining how viruses can play a major (often underestimated) role in the evolution of cellular organisms. 
The virocell concept thus helps to understand recent hypotheses suggesting that viruses played a critical role in major evolutionary transitions, such as the origin of DNA genomes or the origin of the eukaryotic nucleus. Finally, it is increasingly recognized that viruses are the major source of variation and selection in living organisms (both viruses and cells), the two pillars of Darwinism. One can thus conclude that the continuous interaction between viruses and cells, all along the history of life, has been, and still is, a major engine of biological evolution. © Société de Biologie, 2013.

  10. Physical locations of 5S and 18S-25S rDNA in Asian and American diploid Hordeum species with the I genome.

    PubMed

    Taketa, S; Ando, H; Takeda, K; von Bothmer, R

    2001-05-01

    The physical locations of 5S and 18S-25S rDNA sequences in 15 diploid Hordeum species with the I genome were examined by double-target in situ hybridization with pTa71 (18S-25S rDNA) and pTa794 (5S rDNA) clones as probes. All three Asian species had a species-specific rDNA pattern. In the 12 American species studied, eight different rDNA types were found. The type reported previously in H. chilense (the 'chilense' type) was observed in eight American species. The chilense type had double 5S rDNA sites - two sites on one chromosome arm separated by a short distance - and two pairs of major 18S-25S rDNA sites on two pairs of satellite chromosomes. The other seven types found in American species were similar to the chilense type and could be derived from the chilense type through deletion, reduction or addition of an rDNA site. Intraspecific polymorphisms were observed in three American species. The overall similarity in rDNA patterns among American species indicates the close relationships between North and South American species and their derivation from a single ancestral source. The differences in the distribution patterns of 5S and 18S-25S rDNA between Asian and American species suggest differentiation between the I genomes of Asian and American species. The 5S and 18S-25S rDNA sites are useful chromosome markers for delimiting Asian species, but have limited value as a taxonomic character in American species. On the basis of rDNA patterns, karyotype evolution and phylogeny of the I-genome diploid species are discussed.

  11. Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae

    PubMed Central

    Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

    2017-01-01

    Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusually long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences. PMID:28617867
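
    The region lengths reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual quadripartite layout of a chloroplast genome (one LSC, one SSC and two IR copies) and assumes that the 21,022 bp figure is the length of each IR copy; the function name is illustrative only.

```python
# Region lengths in bp, taken from the abstract above.
LSC = 79_732
SSC = 12_521
IR = 21_022            # assumption: length of EACH of the two inverted repeat copies
REPORTED_TOTAL = 134_297


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
gene_total = 82 + 38 + 8   # protein-coding + tRNA + rRNA genes

print(total, total == REPORTED_TOTAL)
print(gene_total)
```

    Under these assumptions the four regions sum to the reported genome length, and the three gene categories sum to the reported 128 predicted genes.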

  12. Scanning the genome for gene single nucleotide polymorphisms involved in adaptive population differentiation in white spruce

    PubMed Central

    Namroud, Marie-Claire; Beaulieu, Jean; Juge, Nicolas; Laroche, Jérôme; Bousquet, Jean

    2008-01-01

    Conifers are characterized by a large genome size and a rapid decay of linkage disequilibrium, most often within gene limits. Genome scans based on noncoding markers are less likely to detect molecular adaptation linked to genes in these species. In this study, we assessed the effectiveness of a genome-wide single nucleotide polymorphism (SNP) scan focused on expressed genes in detecting local adaptation in a conifer species. Samples were collected from six natural populations of white spruce (Picea glauca) moderately differentiated for several quantitative characters. A total of 534 SNPs representing 345 expressed genes were analysed. Genes potentially under natural selection were identified by estimating the differentiation in SNP frequencies among populations (FST) and identifying outliers, and by estimating local differentiation using a Bayesian approach. Both average expected heterozygosity and population differentiation estimates (HE = 0.270 and FST = 0.006) were comparable to those obtained with other genetic markers. Of all genes, 5.5% were identified as outliers with FST at the 95% confidence level, while 14% were identified as candidates for local adaptation with the Bayesian method. There was some overlap between the two gene sets. More than half of the candidate genes for local adaptation were specific to the warmest population, about 20% to the most arid population, and 15% to the coldest and most humid higher altitude population. These adaptive trends were consistent with the genes’ putative functions and the divergence in quantitative traits noted among the populations. The results suggest that an approach separating the locus and population effects is useful to identify genes potentially under selection. These candidates are worth exploring in more detail at the physiological and ecological levels. PMID:18662225
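
    The idea of scanning SNPs for unusually high differentiation can be illustrated with a minimal sketch. It is not the authors' method: it uses the textbook heterozygosity-based estimate FST = (HT - HS) / HT with equally weighted populations, and it flags the top fraction of the empirical distribution rather than testing against a simulated neutral expectation. The SNP names and allele frequencies are invented for illustration.

```python
def fst_per_snp(freqs):
    """Heterozygosity-based FST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population
    (populations are weighted equally, a simplifying assumption).
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)              # expected heterozygosity, pooled
    if h_t == 0:                               # monomorphic SNP: no differentiation
        return 0.0
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within-population
    return (h_t - h_s) / h_t


def top_fraction_outliers(fst_by_snp, fraction=0.05):
    """Names of the SNPs in the top `fraction` of FST values (at least one)."""
    ranked = sorted(fst_by_snp, key=fst_by_snp.get, reverse=True)
    n_keep = max(1, round(len(ranked) * fraction))
    return ranked[:n_keep]


# Hypothetical frequencies in six populations (not data from the study).
snps = {
    "snp_a": [0.50, 0.50, 0.50, 0.50, 0.50, 0.50],
    "snp_b": [0.10, 0.10, 0.10, 0.90, 0.90, 0.90],
    "snp_c": [0.40, 0.45, 0.50, 0.50, 0.55, 0.60],
}
fst = {name: fst_per_snp(freqs) for name, freqs in snps.items()}
outliers = top_fraction_outliers(fst)
print(fst)
print(outliers)
```

    A real scan would replace the percentile cut-off with a null distribution conditioned on heterozygosity, which is what allows a confidence level such as the 95% quoted above to be attached to each outlier.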

  13. Enhanced degradation of perfluorooctanoic acid by a genome shuffling-modified Pseudomonas parafulva YAB-1.

    PubMed

    Yi, Langbo; Peng, Qingzhong; Liu, Deming; Zhou, Lulu; Tang, Chongjian; Zhou, Yaoyu; Chai, Liyuan

    2018-05-02

    Perfluorooctanoic acid (PFOA), an emerging persistent organic pollutant, is hard to degrade by conventional methods because of its stable physical and chemical properties. Microbial transformation is an attractive remediation approach to prevent and clean up PFOA contamination. To date, several strains of wild microbes have been reported to have limited capacity to degrade PFOA, so the selection of superior PFOA-degrading strains has become urgently necessary. Here, we report the application of genome shuffling to improve the PFOA-degrading bacterium Pseudomonas parafulva YAB-1. The initial mutant populations of strain YAB-1 were generated by nitrosoguanidine and ultraviolet irradiation mutagenesis, respectively, resulting in mutants YM-9 and YM-19 with slightly improved PFOA-degrading ability. YM-9 and YM-19 were used as the starting strains for three rounds of recursive protoplast fusion. The positive mutants were screened on inorganic salt medium plates containing different concentrations of PFOA and selected based on their PFOA degradability in a shake-flask fermentation test. The best-performing recombinant, F3-52, was isolated after three rounds of genome shuffling. In batch fermentation, the PFOA degradation rate of mutant F3-52 was up to 58.6%, which was 1.8-fold higher than that of the parent strain YAB-1 and 1.6-fold higher than that of the initial mutants YM-9 and YM-19. A pass-generation test indicated that the hereditary character of F3-52 was stable. The results demonstrated that genome shuffling was an efficient method for improving PFOA degradation by Pseudomonas parafulva YAB-1. The bred mutant F3-52, with a 58.6% PFOA-degrading rate, could be used for the environmental control of PFOA pollution.

  14. From Mendel's discovery on pea to today's plant genetics and breeding : Commemorating the 150th anniversary of the reading of Mendel's discovery.

    PubMed

    Smýkal, Petr; K Varshney, Rajeev; K Singh, Vikas; Coyne, Clarice J; Domoney, Claire; Kejnovský, Eduard; Warkentin, Thomas

    2016-12-01

    This work discusses several selected topics of plant genetics and breeding in relation to the 150th anniversary, celebrated in 2015, of the presentation of the seminal work of Gregor Johann Mendel. While Darwin's theory of evolution was based on differential survival and differential reproductive success, Mendel's theory of heredity relies on equality and stability throughout all stages of the life cycle. Darwin's concepts were continuous variation and "soft" heredity; Mendel espoused discontinuous variation and "hard" heredity. Thus, the combination of Mendelian genetics with Darwin's theory of natural selection was the process that resulted in the modern synthesis of evolutionary biology. Although biology, genetics, and genomics have been revolutionized in recent years, modern genetics will forever rely on simple principles founded on pea breeding using seven single gene characters. Purposeful use of mutants to study gene function is one of the essential tools of modern genetics. Today, over 100 plant species genomes have been sequenced. Mapping populations and their use in segregation of molecular markers and marker-trait association to map and isolate genes were developed on the basis of Mendel's work. Genome-wide or genomic selection is a recent approach for the development of improved breeding lines. The analysis of complex traits has been enhanced by high-throughput phenotyping and developments in statistical and modeling methods for the analysis of phenotypic data. Introgression of novel alleles from landraces and wild relatives widens genetic diversity and improves traits; transgenic methodologies allow for the introduction of novel genes from diverse sources, and gene editing approaches offer possibilities to manipulate genes in a precise manner.

  15. Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae.

    PubMed

    Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

    2017-01-01

    Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusually long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences.

  16. AFRESh: an adaptive framework for compression of reads and assembled sequences with random access functionality.

    PubMed

    Paridaens, Tom; Van Wallendael, Glenn; De Neve, Wesley; Lambert, Peter

    2017-05-15

    The past decade has seen the introduction of new technologies that lowered the cost of genomic sequencing increasingly. We can even observe that the cost of sequencing is dropping significantly faster than the cost of storage and transmission. The latter motivates a need for continuous improvements in the area of genomic data compression, not only at the level of effectiveness (compression rate), but also at the level of functionality (e.g. random access), configurability (effectiveness versus complexity, coding tool set …) and versatility (support for both sequenced reads and assembled sequences). In that regard, we can point out that current approaches mostly do not support random access, requiring full files to be transmitted, and that current approaches are restricted to either read or sequence compression. We propose AFRESh, an adaptive framework for no-reference compression of genomic data with random access functionality, targeting the effective representation of the raw genomic symbol streams of both reads and assembled sequences. AFRESh makes use of a configurable set of prediction and encoding tools, extended by a Context-Adaptive Binary Arithmetic Coding scheme (CABAC), to compress raw genetic codes. To the best of our knowledge, our paper is the first to describe an effective implementation of CABAC outside of its original application. By applying CABAC, the compression effectiveness improves by up to 19% for assembled sequences and up to 62% for reads. By applying AFRESh to the genomic symbols of the MPEG genomic compression test set for reads, a compression gain is achieved of up to 51% compared to SCALCE, 42% compared to LFQC and 44% compared to ORCOM. When comparing to generic compression approaches, a compression gain is achieved of up to 41% compared to GNU Gzip and 22% compared to 7-Zip at the Ultra setting. 
Additionally, when compressing assembled sequences of the Human Genome, a compression gain is achieved of up to 34% compared to GNU Gzip and 16% compared to 7-Zip at the Ultra setting. A Windows executable version can be downloaded at https://github.com/tparidae/AFresh. Contact: tom.paridaens@ugent.be. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
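
    To give a feel for what "context-adaptive binary" modelling means for nucleotide data, the sketch below covers only the modelling half of such a scheme: each base is binarized into two bits, every bit gets a probability from counts gathered in its context, and the ideal code length (-log2 p) is accumulated. A real arithmetic coder would approach this length but is not implemented here. The 2-bit binarization, the order-2 context, the count-based estimator and all names are assumptions made for illustration; none of them are taken from AFRESh or from any CABAC specification.

```python
import math
from collections import defaultdict

# Hypothetical 2-bit binarization of the four nucleotides.
BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def adaptive_code_length(seq, order=2):
    """Ideal code length in bits of `seq` under an adaptive binary model.

    The context of the first bit is the previous `order` bases; the context
    of the second bit additionally includes the first bit of the same base.
    """
    counts = defaultdict(lambda: [1, 1])   # [zeros, ones], Laplace-smoothed
    total_bits = 0.0
    history = ()
    for base in seq:
        first, second = BITS[base]
        for ctx, bit in (((history, 0), first), ((history, 1, first), second)):
            zeros_ones = counts[ctx]
            p = zeros_ones[bit] / (zeros_ones[0] + zeros_ones[1])
            total_bits -= math.log2(p)     # cost of coding this bit
            zeros_ones[bit] += 1           # adapt the model after coding
        history = (history + (base,))[-order:] if order > 0 else ()
    return total_bits


repetitive = "ACGT" * 200
bits = adaptive_code_length(repetitive)
print(f"{bits / len(repetitive):.3f} bits per base (fixed 2-bit code: 2.000)")
```

    On a highly repetitive input the model quickly becomes confident and the cost per base falls well below the two bits of a fixed code, which is the effect an adaptive coder exploits.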

  17. Genomic predictions can accelerate selection for resistance against Piscirickettsia salmonis in Atlantic salmon (Salmo salar).

    PubMed

    Bangera, Rama; Correa, Katharina; Lhorente, Jean P; Figueroa, René; Yáñez, José M

    2017-01-31

    Salmon Rickettsial Syndrome (SRS) caused by Piscirickettsia salmonis is a major disease affecting the Chilean salmon industry. Genomic selection (GS) is a method wherein genome-wide markers and phenotype information of full-sibs are used to predict genomic EBV (GEBV) of selection candidates and is expected to have increased accuracy and response to selection over traditional pedigree based Best Linear Unbiased Prediction (PBLUP). Widely used GS methods such as genomic BLUP (GBLUP), SNPBLUP, Bayes C and Bayesian Lasso may perform differently with respect to accuracy of GEBV prediction. Our aim was to compare the accuracy, in terms of reliability of genome-enabled prediction, from different GS methods with PBLUP for resistance to SRS in an Atlantic salmon breeding program. Number of days to death (DAYS), binary survival status (STATUS) phenotypes, and 50 K SNP array genotypes were obtained from 2601 smolts challenged with P. salmonis. The reliability of different GS methods at different SNP densities with and without pedigree were compared to PBLUP using a five-fold cross validation scheme. Heritability estimated from GS methods was significantly higher than PBLUP. Pearson's correlation between predicted GEBV from PBLUP and GS models ranged from 0.79 to 0.91 and 0.79-0.95 for DAYS and STATUS, respectively. The relative increase in reliability from different GS methods for DAYS and STATUS with 50 K SNP ranged from 8 to 25% and 27-30%, respectively. All GS methods outperformed PBLUP at all marker densities. DAYS and STATUS showed superior reliability over PBLUP even at the lowest marker density of 3 K and 500 SNP, respectively. 20 K SNP showed close to maximal reliability for both traits with little improvement using higher densities. These results indicate that genomic predictions can accelerate genetic progress for SRS resistance in Atlantic salmon and implementation of this approach will contribute to the control of SRS in Chile. 
We recommend GBLUP for routine GS evaluation because this method is computationally faster and the results are very similar with other GS methods. The use of lower density SNP or the combination of low density SNP and an imputation strategy may help to reduce genotyping costs without compromising gain in reliability.

  18. Biolistic transformation of Scoparia dulcis L.

    PubMed

    Srinivas, Kota; Muralikrishna, Narra; Kumar, Kalva Bharath; Raghu, Ellendula; Mahender, Aileni; Kiranmayee, Kasula; Yashodahara, Velivela; Sadanandam, Abbagani

    2016-01-01

    Here, we report for the first time the optimized conditions for microprojectile bombardment-mediated genetic transformation in Vassourinha (Scoparia dulcis L.), a Plantaginaceae medicinal plant species. Transformation was achieved by bombardment of axenic leaf segments with the binary vector pBI121 harbouring the β-glucuronidase gene (GUS) as a reporter and the neomycin phosphotransferase II gene (npt II) as a selectable marker. The influence of the physical parameters of the particle gun, viz. acceleration pressure, flight distance, gap width and macroprojectile travel distance, on the frequency of transient GUS expression and stable expression (survival of putative transformants) has been investigated. Biolistic delivery of pBI121 yielded the best (80.0%) transient expression of the GUS gene when bombarded at a flight distance of 6 cm and a rupture disc pressure/acceleration pressure of 650 psi. The highest stable expression of 52.0% was noticed in putative transformants on RMBI-K medium. Integration of the GUS and npt II genes in the nuclear genome was confirmed through primer-specific PCR. DNA blot analysis showed more than one transgene copy in the transformed plantlet genomes. The present study may be used for metabolic engineering and production of biopharmaceuticals by transplastomic technology in this valuable medicinal plant.

  19. SlideSort: all pairs similarity search for short reads

    PubMed Central

    Shimizu, Kana; Tsuda, Koji

    2011-01-01

    Motivation: Recent progress in DNA sequencing technologies calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of short reads. Searching similar pairs from a string pool is a fundamental process of de novo genome assembly, genome-wide alignment and other important analyses. Results: In this study, we designed and implemented an exact algorithm SlideSort that finds all similar pairs from a string pool in terms of edit distance. Using an efficient pattern growth algorithm, SlideSort discovers chains of common k-mers to narrow down the search. Compared to existing methods based on single k-mers, our method is more effective in reducing the number of edit distance calculations. In comparison to backtracking methods such as BWA, our method is much faster in finding remote matches, scaling easily to tens of millions of sequences. Our software has an additional function of single link clustering, which is useful in summarizing short reads for further processing. Availability: Executable binary files and C++ libraries are available at http://www.cbrc.jp/~shimizu/slidesort/ for Linux and Windows. Contact: slidesort@m.aist.go.jp; shimizu-kana@aist.go.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21148542
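
The filter-then-verify idea in this abstract can be illustrated with a minimal Python sketch. It is not SlideSort's pattern-growth algorithm over chains of k-mers; it is the simpler single k-mer baseline that the abstract compares against, and the function names are hypothetical. It relies on the pigeonhole argument that if a read is cut into d+1 blocks and fewer than d+1 edits are applied, at least one block survives unchanged, so any pair within edit distance d must share a k-mer of length floor(min_len / (d+1)).

```python
from collections import defaultdict
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # match or substitute
        prev = curr
    return prev[-1]


def similar_pairs(reads, max_dist):
    """Return sorted index pairs (i, j), i < j, with edit distance <= max_dist."""
    k = min(len(r) for r in reads) // (max_dist + 1)
    if k == 0:
        # Reads are too short for the k-mer filter: fall back to all pairs.
        candidates = set(combinations(range(len(reads)), 2))
    else:
        # Index every k-mer; reads sharing a k-mer become candidate pairs.
        index = defaultdict(set)
        for i, read in enumerate(reads):
            for p in range(len(read) - k + 1):
                index[read[p:p + k]].add(i)
        candidates = set()
        for ids in index.values():
            candidates.update(combinations(sorted(ids), 2))
    # Verify each candidate with the exact edit distance.
    return sorted((i, j) for i, j in candidates
                  if edit_distance(reads[i], reads[j]) <= max_dist)
```

Calling `similar_pairs(reads, 1)` on a small list of reads returns the index pairs that differ by at most one edit. On repetitive data many reads share a k-mer, so the candidate set grows and more pairs need verification, which is the cost the chained k-mer approach described above is meant to reduce.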

  20. Regulatory changes raise troubling questions for genomic testing.

    PubMed

    Evans, Barbara J; Dorschner, Michael O; Burke, Wylie; Jarvik, Gail P

    2014-11-01

    By 6 October 2014, many laboratories in the United States must begin honoring new individual data access rights created by recent changes to federal privacy and laboratory regulations. These access rights are more expansive than has been widely understood and pose complex challenges for genomic testing laboratories. This article analyzes regulatory texts and guidances to explore which laboratories are affected. It offers the first published analysis of which parts of the vast trove of data generated during next-generation sequencing will be accessible to patients and research subjects. Persons tested at affected laboratories seemingly will have access, upon request, to uninterpreted gene variant information contained in their stored variant call format, binary alignment/map, and FASTQ files. A defect in the regulations will subject some non-CLIA-regulated research laboratories to these new access requirements unless the Department of Health and Human Services takes swift action to avert this apparently unintended consequence. More broadly, all affected laboratories face a long list of daunting operational, business, compliance, and bioethical issues as they adapt to this change and to the Food and Drug Administration's recently announced plan to publish draft guidance outlining a new oversight framework for lab-developed tests.

  1. [Agrobacterium-mediated sunflower transformation (Helianthus annuus L.) in vitro and in Planta using strain of LBA4404 harboring binary vector pBi2E with dsRNA-suppressor proline dehydrogenase gene].

    PubMed

    Tishchenko, E N; Komisarenko, A G; Mikhal'skaia, S I; Sergeeva, L E; Adamenko, N I; Morgun, B V; Kochetov, A V

    2014-01-01

    To assess whether suppressing the proline dehydrogenase gene increases the tolerance of sunflower (Helianthus annuus L.) to water deficit and salinity, we used Agrobacterium strain LBA4404 harboring pBi2E with a double-stranded RNA suppressor prepared on the basis of the Arabidopsis ProDH1 gene. Techniques for Agrobacterium-mediated transformation of sunflower in vitro and in planta during fertilization are proposed. Integration of T-DNA into the sunflower genome was shown to be genotype-dependent. PCR analysis showed that ProDH1 is present in the genome of inbred lines transformed in planta, as well as in the T1 and T2 generations. Transgenic regenerants showed substantial accumulation of free L-proline during the early stages of in vitro cultivation under normal conditions. Substantial accumulation of free proline was also established in transgenic regenerants during cultivation under lethal stress pressure (0.4 M mannitol and 2.0% sea water salts), followed by a decline during the recovery period. These data indicate that suppression of sunflower ProDH is effective and that the gene participates in processes connected with osmotolerance.

  2. Comparative Genome Analysis and Global Phylogeny of the Toxin Variant Clostridium difficile PCR Ribotype 017 Reveals the Evolution of Two Independent Sublineages

    PubMed Central

    Cairns, M. D.; Preston, M. D.; Hall, C. L.; Gerding, D. N.; Hawkey, P. M.; Kato, H.; Kim, H.; Kuijper, E. J.; Lawley, T. D.; Pituch, H.; Reid, S.; Kullin, B.; Riley, T. V.; Solomon, K.; Tsai, P. J.; Weese, J. S.

    2016-01-01

    ABSTRACT The diarrheal pathogen Clostridium difficile consists of at least six distinct evolutionary lineages. The RT017 lineage is anomalous, as strains only express toxin B, compared to strains from other lineages that produce toxins A and B and, occasionally, binary toxin. Historically, RT017 initially was reported in Asia but now has been reported worldwide. We used whole-genome sequencing and phylogenetic analysis to investigate the patterns of global spread and population structure of 277 RT017 isolates from animal and human origins from six continents, isolated between 1990 and 2013. We reveal two distinct evenly split sublineages (SL1 and SL2) of C. difficile RT017 that contain multiple independent clonal expansions. All 24 animal isolates were contained within SL1 along with human isolates, suggesting potential transmission between animals and humans. Genetic analyses revealed an overrepresentation of antibiotic resistance genes. Phylogeographic analyses show a North American origin for RT017, as has been found for the recently emerged epidemic RT027 lineage. Despite having only one toxin, RT017 strains have evolved in parallel from at least two independent sources and can readily transmit between continents. PMID:28031436

  3. Agrobacterium-mediated transformation of the haploid liverwort Marchantia polymorpha L., an emerging model for plant biology.

    PubMed

    Ishizaki, Kimitsune; Chiyoda, Shota; Yamato, Katsuyuki T; Kohchi, Takayuki

    2008-07-01

    Agrobacterium-mediated transformation has not been practical in pteridophytes, bryophytes and algae to date, although it is commonly used in model plants including Arabidopsis and rice. Here we present a rapid Agrobacterium-mediated transformation system for the haploid liverwort Marchantia polymorpha L. using immature thalli developed from spores. Hundreds of hygromycin-resistant plants per sporangium were obtained by co-cultivation of immature thalli with Agrobacterium carrying the binary vector that contains a reporter, the beta-glucuronidase (GUS) gene with an intron, and a selection marker, the hygromycin phosphotransferase (hpt) gene. In this system, individual gemmae, which arise asexually from single initial cells, were analyzed as isogenic transformants. GUS activity staining showed that all hygromycin-resistant plants examined expressed the GUS transgene in planta. DNA analyses verified random integration of 1-5 copies of the intact T-DNA between the right and the left borders into the M. polymorpha genome. The efficient and rapid Agrobacterium-mediated transformation of M. polymorpha should provide molecular techniques to facilitate comparative genomics, taking advantage of this unique model plant that retains many features of the common ancestor of land plants.

  4. Development of simple sequence repeat (SSR) markers from a genome survey of Chinese bayberry (Myrica rubra)

    PubMed Central

    2012-01-01

    Background Chinese bayberry (Myrica rubra Sieb. and Zucc.) is a subtropical evergreen tree originating in China. It has been cultivated in southern China for several thousand years, and annual production has reached 1.1 million tons. The taste and high level of health promoting characters identified in the fruit in recent years has stimulated its extension in China and introduction to Australia. A limited number of co-dominant markers have been developed and applied in genetic diversity and identity studies. Here we report, for the first time, a survey of whole genome shotgun data to develop a large number of simple sequence repeat (SSR) markers to analyse the genetic diversity of the common cultivated Chinese bayberry and the relationship with three other Myrica species. Results The whole genome shotgun survey of Chinese bayberry produced 9.01Gb of sequence data, about 26x coverage of the estimated genome size of 323 Mb. The genome sequences were highly heterozygous, but with little duplication. From the initial assembled scaffold covering 255 Mb sequence data, 28,602 SSRs (≥5 repeats) were identified. Dinucleotide was the most common repeat motif with a frequency of 84.73%, followed by 13.78% trinucleotide, 1.34% tetranucleotide, 0.12% pentanucleotide and 0.04% hexanucleotide. From 600 primer pairs, 186 polymorphic SSRs were developed. Of these, 158 were used to screen 29 Chinese bayberry accessions and three other Myrica species: 91.14%, 89.87% and 46.84% SSRs could be used in Myrica adenophora, Myrica nana and Myrica cerifera, respectively. The UPGMA dendrogram tree showed that cultivated Myrica rubra is closely related to Myrica adenophora and Myrica nana, originating in southwest China, and very distantly related to Myrica cerifera, originating in America. These markers can be used in the construction of a linkage map and for genetic diversity studies in Myrica species. 
Conclusion Myrica rubra has a small genome of about 323 Mb with a high level of heterozygosity. A large number of SSRs were identified, and 158 polymorphic SSR markers developed, 91% of which can be transferred to other Myrica species. PMID:22621340
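
As an illustration of the SSR screen summarized above (motifs of two to six bases with at least 5 repeats), here is a minimal Python sketch. It is not the tool used in the study; the function names are hypothetical, and it only finds perfect, uninterrupted repeats.

```python
import re
from collections import Counter

# The lazy quantifier {2,6}? tries the shortest motif first, so (AT)x10 is
# reported as a dinucleotide repeat rather than as (ATAT)x5.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")
MOTIF_CLASS = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def find_ssrs(seq):
    """Yield (start, motif, repeat_count) for perfect SSRs with >= 5 repeats."""
    for m in SSR_PATTERN.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip mononucleotide runs such as AAAAAAAAAA
        yield m.start(), motif, len(m.group(0)) // len(motif)


def motif_frequencies(ssrs):
    """Percentage of SSRs in each motif-length class (di, tri, ...)."""
    counts = Counter(MOTIF_CLASS[len(motif)] for _, motif, _ in ssrs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: 100.0 * n / total for cls, n in counts.items()}
```

Running `motif_frequencies(list(find_ssrs(scaffold)))` over assembled scaffolds would give a breakdown of the same kind as the dinucleotide, trinucleotide and longer-motif percentages reported in the abstract.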

  5. The first complete mitogenome of the South China deep-sea giant isopod Bathynomus sp. (Crustacea: Isopoda: Cirolanidae) allows insights into the early mitogenomic evolution of isopods.

    PubMed

    Shen, Yanjun; Kou, Qi; Zhong, Zaixuan; Li, Xinzheng; He, Lisheng; He, Shunping; Gan, Xiaoni

    2017-03-01

    In this study, the complete mitochondrial (mt) genome sequence of the South China deep-sea giant isopod Bathynomus sp. was determined, and this study is the first to explore in detail the mt genome of a deep-sea member of the order Isopoda. This species belongs to the genus Bathynomus, the members of which are saprophagous residents of the deep-sea benthic environment; based on their large size, Bathynomus is included in the "supergiant group" of isopods. The mt genome of Bathynomus sp. is 14,965 bp in length and consists of 13 protein-coding genes, two ribosomal RNA genes, only 18 transfer RNA genes, and a noncoding control region 362 bp in length, which is the smallest control region discovered in Isopoda to date. Although the overall genome organization is typical for metazoans, the mt genome of Bathynomus sp. shows a number of derived characters, such as an inversion of 10 genes when compared to the pancrustacean ground pattern. Rearrangements in some genes (e.g., cob, trnT, nad5, and trnF) are shared by nearly all isopod mt genomes analyzed thus far, and when compared to the putative isopod ground pattern, five rearrangements were found in Bathynomus sp. Two tRNAs exhibit modified secondary structures: the TΨC arm is absent from trnQ, and trnC lacks the DHU. Within the class Malacostraca, trnC arm loss is only found in other isopods. Phylogenetic analysis revealed that Bathynomus sp. (Cymothoida) and Sphaeroma serratum (Sphaeromatidea) form a single clade, although it is unclear whether Cymothoida is monophyletic or paraphyletic. Moreover, the evolutionary rate of Bathynomus sp. (dN/dS [nonsynonymous mutational rate/synonymous mutational rate] = 0.0705) is the slowest measured to date among Cymothoida, which may be associated with its relatively constant deep-sea environment. Overall, our results may provide useful information for understanding the evolution of deep-sea Isopoda species.

  6. Mitogenomics does not resolve deep molluscan relationships (yet?).

    PubMed

    Stöger, I; Schrödl, M

    2013-11-01

    The origin of molluscs among lophotrochozoan metazoans is unresolved and interclass relationships are contradictory between morphology-based, multi-locus, and recent phylogenomic analyses. Within the "Deep Metazoan Phylogeny" framework, all available molluscan mitochondrial genomes were compiled, covering 6 of 8 classes. Genomes were reannotated, and 13 protein coding genes (PCGs) were analyzed in various taxon settings, under multiple masking and coding regimes. Maximum Likelihood based methods were used for phylogenetic reconstructions. In all cases, molluscs result mixed up with lophotrochozoan outgroups, and most molluscan classes with more than single representatives available are non-monophyletic. We discuss systematic errors such as long branch attraction to cause aberrant, basal positions of fast evolving ingroups such as scaphopods, patellogastropods and, in particular, the gastropod subgroup Heterobranchia. Mitochondrial sequences analyzed either as amino acids or nucleotides may perform well in some (Cephalopoda) but not in other palaeozoic molluscan groups; they are not suitable to reconstruct deep (Cambrian) molluscan evolution. Supposedly "rare" mitochondrial genome level features have long been promoted as phylogenetically informative. In our newly annotated data set, features such as genome size, transcription on one or both strands, and certain coupled pairs of PCGs show a homoplastic, but obviously non-random distribution. Apparently congruent (but not unambiguous) signal for non-trivial subclades, e.g. for a clade composed of pteriomorph and heterodont bivalves, needs confirmation from a more comprehensive bivalve sampling. We found that larger clusters not only of PCGs but also of rRNAs and even tRNAs can bear local phylogenetic signal; adding trnG-trnE to the end of the ancestral cluster trnM-trnC-trnY-trnW-trnQ might be synapomorphic for Mollusca. 
Mitochondrial gene arrangement and other genome level features explored and reviewed herein thus failed as golden bullets, but are promising as additional characters or evidence supporting deep molluscan clades revealed by other data sets. A representative and dense sampling of molluscan subgroups may contribute to resolve contentious interclass relationships in the future, and is vital for exploring the evolution of especially diverse mitochondrial genomes in molluscs. Copyright © 2012 Elsevier Inc. All rights reserved.

  7. Archaea: The First Domain of Diversified Life

    PubMed Central

    Caetano-Anollés, Gustavo; Nasir, Arshan; Zhou, Kaiyue; Caetano-Anollés, Derek; Mittenthal, Jay E.; Sun, Feng-Jie; Kim, Kyung Mo

    2014-01-01

    The study of the origin of diversified life has been plagued by technical and conceptual difficulties, controversy, and apriorism. It is now popularly accepted that the universal tree of life is rooted in the akaryotes and that Archaea and Eukarya are sister groups to each other. However, evolutionary studies have overwhelmingly focused on nucleic acid and protein sequences, which partially fulfill only two of the three main steps of phylogenetic analysis, formulation of realistic evolutionary models, and optimization of tree reconstruction. In the absence of character polarization, that is, the ability to identify ancestral and derived character states, any statement about the rooting of the tree of life should be considered suspect. Here we show that macromolecular structure and a new phylogenetic framework of analysis that focuses on the parts of biological systems instead of the whole provide both deep and reliable phylogenetic signal and enable us to put forth hypotheses of origin. We review over a decade of phylogenomic studies, which mine information in a genomic census of millions of encoded proteins and RNAs. We show how the use of process models of molecular accumulation that comply with Weston's generality criterion supports a consistent phylogenomic scenario in which the origin of diversified life can be traced back to the early history of Archaea. PMID:24987307

  8. Morphological Characters and Transcriptome Profiles Associated with Black Skin and Red Skin in Crimson Snapper (Lutjanus erythropterus)

    PubMed Central

    Zhang, Yan-Ping; Wang, Zhong-Duo; Guo, Yu-Song; Liu, Li; Yu, Juan; Zhang, Shun; Liu, Shao-Jun; Liu, Chu-Wu

    2015-01-01

    In this study, morphological observation and Illumina sequencing were performed on two differently colored skin regions of crimson snapper (Lutjanus erythropterus), the black zone and the red zone. Three types of chromatophores, melanophores, iridophores and xanthophores, were organized in the skin. The main differences between the two colorations were in the amount and distribution of the three chromatophores. Comparison of the two transcriptomes revealed 9200 unigenes with significantly different expression (ratio change ≥ 2 and q-value ≤ 0.05), of which 5972 were up-regulated in black skin and 3228 were up-regulated in red skin. Through functional annotation, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the differentially transcribed genes, we identified a number of uncharacterized candidate pigment genes as well as conserved genes affecting pigmentation in crimson snapper. The expression patterns of 14 pigment genes in the two skin colors were confirmed by quantitative real-time PCR. Overall, this study presents a global survey of the morphological characters and transcriptomes of the differently colored skins in crimson snapper, and provides valuable cellular and genetic information for uncovering the mechanism of pigment pattern formation in snappers. PMID:26569232

  9. Mendel’s Genes: Toward a Full Molecular Characterization

    PubMed Central

    Reid, James B.; Ross, John J.

    2011-01-01

    The discipline of classical genetics is founded on the hereditary behavior of the seven genes studied by Gregor Mendel. The advent of molecular techniques has unveiled much about the identity of these genes. To date, four genes have been sequenced: A (flower color), LE (stem length), I (cotyledon color), and R (seed shape). Two of the other three genes, GP (pod color) and FA (fasciation), are amenable to candidate gene approaches on the basis of their function, linkage relationships, and synteny between the pea and Medicago genomes. However, even the gene (locus) identity is not known for certain for the seventh character, the pod form, although it is probably V. While the nature of the mutations used by Mendel cannot be determined with certainty, on the basis of the varieties available in Europe in the 1850s, we can speculate on their nature. It turns out that these mutations are attributable to a range of causes—from simple base substitutions and changes to splice sites to the insertion of a transposon-like element. These findings provide a fascinating connection between Mendelian genetics and molecular biology that can be used very effectively in teaching new generations of geneticists. Mendel’s characters also provide novel insights into the nature of the genes responsible for characteristics of agronomic and consumer importance. PMID:21908742

  10. Morphological Characters and Transcriptome Profiles Associated with Black Skin and Red Skin in Crimson Snapper (Lutjanus erythropterus).

    PubMed

    Zhang, Yan-Ping; Wang, Zhong-Duo; Guo, Yu-Song; Liu, Li; Yu, Juan; Zhang, Shun; Liu, Shao-Jun; Liu, Chu-Wu

    2015-11-12

    In this study, morphological observation and Illumina sequencing were performed on two differently colored skin regions of crimson snapper (Lutjanus erythropterus), the black zone and the red zone. Three types of chromatophores, melanophores, iridophores and xanthophores, were organized in the skin. The main differences between the two colorations were in the amount and distribution of the three chromatophores. Comparison of the two transcriptomes revealed 9200 unigenes with significantly different expression (ratio change ≥ 2 and q-value ≤ 0.05), of which 5972 were up-regulated in black skin and 3228 were up-regulated in red skin. Through functional annotation, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the differentially transcribed genes, we identified a number of uncharacterized candidate pigment genes as well as conserved genes affecting pigmentation in crimson snapper. The expression patterns of 14 pigment genes in the two skin colors were confirmed by quantitative real-time PCR. Overall, this study presents a global survey of the morphological characters and transcriptomes of the differently colored skins in crimson snapper, and provides valuable cellular and genetic information for uncovering the mechanism of pigment pattern formation in snappers.

  11. New encoded single-indicator sequences based on physico-chemical parameters for efficient exon identification.

    PubMed

    Meher, J K; Meher, P K; Dash, G N; Raval, M K

    2012-01-01

    The first step in gene identification problem based on genomic signal processing is to convert character strings into numerical sequences. These numerical sequences are then analysed spectrally or using digital filtering techniques for the period-3 peaks, which are present in exons (coding areas) and absent in introns (non-coding areas). In this paper, we have shown that single-indicator sequences can be generated by encoding schemes based on physico-chemical properties. Two new methods are proposed for generating single-indicator sequences based on hydration energy and dipole moments. The proposed methods produce high peak at exon locations and effectively suppress false exons (intron regions having greater peak than exon regions) resulting in high discriminating factor, sensitivity and specificity.
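
The period-3 test described above can be sketched in a few lines of Python. This sketch uses the standard four binary indicator sequences (one per base) as a baseline, not the single-indicator encodings based on hydration energy or dipole moment proposed in the paper, whose numerical values are not given here. The function names are hypothetical.

```python
import cmath


def indicator(seq, base):
    """Binary indicator sequence: 1.0 where seq has the given base, else 0.0."""
    return [1.0 if c == base else 0.0 for c in seq]


def dft_power(x, k):
    """Squared magnitude of the k-th discrete Fourier coefficient of x."""
    n = len(x)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(x))
    return abs(coeff) ** 2


def period3_power(seq):
    """Spectral power at frequency N/3, summed over the A, C, G, T indicators."""
    n = len(seq) - len(seq) % 3   # trim so that N/3 is an integer frequency bin
    seq = seq[:n].upper()
    return sum(dft_power(indicator(seq, b), n // 3) for b in "ACGT")
```

In practice `period3_power` would be evaluated over a sliding window along the genomic sequence, and windows with a pronounced peak would be flagged as candidate coding regions. A single-indicator scheme replaces the four 0/1 sequences with one numeric sequence, which needs one transform per window instead of four.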

  12. Design principles from multiscale simulations to predict nanostructure in self-assembling ionic liquids

    DOE PAGES

    Nebgen, Benjamin Tyler; Magurudeniya, Harsha D.; Kwock, Kevin Wen Chi; ...

    2017-07-18

    Molecular dynamics simulations (up to the nanoscale) were performed on the 3-methyl-1-pentylimidazolium ionic liquid cation paired with three anions; chloride, nitrate, and thiocyanate as aqueous mixtures, using the effective fragment potential (EFP) method, a computationally inexpensive way of modeling intermolecular interactions. The simulations provided insight (preferred geometries, radial distribution functions and theoretical proton NMR resonances) into the interactions within the ionic domain and are validated against 1H NMR spectroscopy and small- and wide-angle X-ray scattering experiments on 1-decyl-3-methylimidazolium. Ionic liquids containing thiocyanate typically resist gelation and form poorly ordered lamellar structures upon mixing with water. Conversely, chloride, a strongly coordinating anion, normally forms strong physical gels and produces well-ordered nanostructures adopting a variety of structural motifs over a very wide range of water compositions. Nitrate is intermediate in character, whereby upon dispersal in water it displays a range of viscosities and self-assembles into nanostructures with considerable variability in the fidelity of ordering and symmetry, as a function of water content in the binary mixtures. The observed changes in the macro and nanoscale characteristics were directly correlated to ionic domain structures and intermolecular interactions as theoretically predicted by the analysis of MD trajectories and calculated RDFs. Specifically, both chloride and nitrate are positioned in the plane of the cation. Anion to cation proximity is dependent on water content. Thiocyanate is more susceptible to water insertion into the second solvent shell. Experimental 1H NMR chemical shifts monitor the site-specific competition dependence with water content in the binary mixtures. As a result, thiocyanate preferentially sits above and below the aromatic ring plane, a state disallowing interaction with the protons on the imidazolium ring.

  13. Design principles from multiscale simulations to predict nanostructure in self-assembling ionic liquids.

    PubMed

    Nebgen, Benjamin T; Magurudeniya, Harsha D; Kwock, Kevin W C; Ringstrand, Bryan S; Ahmed, Towfiq; Seifert, Sönke; Zhu, Jian-Xin; Tretiak, Sergei; Firestone, Millicent A

    2017-12-14

    Molecular dynamics simulations (up to the nanoscale) were performed on the 3-methyl-1-pentylimidazolium ionic liquid cation paired with three anions; chloride, nitrate, and thiocyanate as aqueous mixtures, using the effective fragment potential (EFP) method, a computationally inexpensive way of modeling intermolecular interactions. The simulations provided insight (preferred geometries, radial distribution functions and theoretical proton NMR resonances) into the interactions within the ionic domain and are validated against 1H NMR spectroscopy and small- and wide-angle X-ray scattering experiments on 1-decyl-3-methylimidazolium. Ionic liquids containing thiocyanate typically resist gelation and form poorly ordered lamellar structures upon mixing with water. Conversely, chloride, a strongly coordinating anion, normally forms strong physical gels and produces well-ordered nanostructures adopting a variety of structural motifs over a very wide range of water compositions. Nitrate is intermediate in character, whereby upon dispersal in water it displays a range of viscosities and self-assembles into nanostructures with considerable variability in the fidelity of ordering and symmetry, as a function of water content in the binary mixtures. The observed changes in the macro and nanoscale characteristics were directly correlated to ionic domain structures and intermolecular interactions as theoretically predicted by the analysis of MD trajectories and calculated RDFs. Specifically, both chloride and nitrate are positioned in the plane of the cation. Anion to cation proximity is dependent on water content. Thiocyanate is more susceptible to water insertion into the second solvent shell. Experimental 1H NMR chemical shifts monitor the site-specific competition dependence with water content in the binary mixtures. Thiocyanate preferentially sits above and below the aromatic ring plane, a state disallowing interaction with the protons on the imidazolium ring.

  14. [Fuzzy logic in urology. How to reason in inaccurate terms].

    PubMed

    Vírseda Chamorro, Miguel; Salinas Casado, Jesus; Vázquez Alba, David

    2004-05-01

    Western thinking is basically binary, based on opposites. Classical logic is a systematization of this thinking. The methods of pure sciences such as physics are based on systematic measurement, analysis and synthesis, so that nature is described by deterministic differential equations. Medical knowledge does not fit the deterministic equations of physics well, so probabilistic methods are employed. However, these methods are not free of problems, both theoretical and practical, and it is often not possible to know with certainty the probabilities of most events. Moreover, the application of binary logic to medicine in general, and to urology in particular, meets serious difficulties such as the imprecise definition of most diseases and the uncertainty associated with most medical acts. As a result, many medical recommendations are expressed in literary language that is inaccurate, inconsistent and incoherent. Fuzzy logic is a way of reasoning coherently with imprecise concepts. It was proposed by Lotfi Zadeh in 1965 and is based on two principles: the theory of fuzzy sets and the use of fuzzy rules. A fuzzy set is one whose elements have a degree of membership between 0 and 1. Each fuzzy set is associated with an imprecise property or linguistic variable. Fuzzy rules apply the principles of classical logic to fuzzy sets, taking the degree of membership of each element in the reference fuzzy set as its truth value. Fuzzy logic makes it possible to formulate coherent urologic recommendations (e.g., in which patients is PSA testing indicated? what should be done in the face of an elevated PSA?) or to make diagnoses adapted to the uncertainty of diagnostic tests (e.g., data obtained from pressure-flow studies in females).
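
The two ingredients named in this abstract, sets with a degree of membership between 0 and 1 and rules evaluated on those degrees, can be sketched as follows. This is a minimal illustration assuming Zadeh's usual min/max/complement operators; the ramp-shaped membership function, its breakpoints and the function names are hypothetical and carry no clinical meaning.

```python
def ramp_membership(x, low, high):
    """Degree (0..1) to which x belongs to a fuzzy set such as 'elevated'.

    Membership is 0 at or below `low`, 1 at or above `high`, and rises
    linearly in between. Both breakpoints are illustrative placeholders.
    """
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def fuzzy_and(a, b):
    """Truth value of 'A and B': the minimum of the two degrees."""
    return min(a, b)


def fuzzy_or(a, b):
    """Truth value of 'A or B': the maximum of the two degrees."""
    return max(a, b)


def fuzzy_not(a):
    """Truth value of 'not A': the complement of the degree."""
    return 1.0 - a


def rule_strength(marker_degree, symptom_degree):
    """Strength of the hypothetical rule 'IF marker is elevated AND symptoms
    are severe THEN follow-up is indicated'."""
    return fuzzy_and(marker_degree, symptom_degree)
```

A measurement halfway between the two breakpoints has membership 0.5 in the "elevated" set, so a rule that depends on it fires at a strength of at most 0.5 instead of being forced to true or false.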

  15. Did LMC X-3 Undergo a 'Her X-1-like' Anomalous Low State?

    NASA Technical Reports Server (NTRS)

    Boyd, Patricia T.

    2008-01-01

    The black hole X-ray binary LMC X-3 has been monitored by the Rossi X-ray Timing Explorer (RXTE) from its launch to the present by the All-Sky Monitor (ASM). This well-sampled light curve is supplemented by frequent pointed observations with the PCA and HEXTE instruments which provide improved sensitivity, time resolution and spectral information. The long-term X-ray luminosity of the system is strongly modulated on timescales of hundreds of days. The mean 2-10 keV X-ray flux varies by a factor of more than 100 during this long-term cycle. This variability has been attributed to the precession of a bright, tilted, and warped accretion disk, the mechanism also invoked to explain the 35-day super-orbital period in the X-ray binary pulsar system Her X-1. The ASM light curve displays a unique episode, starting in December 2003, during which LMC X-3 displayed a very low, nearly constant flux, for about 80 days. This is markedly different from the typical low-flux excursions in LMC X-3, which smoothly evolve toward and then away from a minimum flux on about a 10-day time scale. The character of the long-term variability, as measured by amplitude and characteristic time scale, is not the same after this long low state as it was before. Similar shifts in long-term period and amplitude are seen after the so-called "anomalous low states" in Her X-1, when the 35-day X-ray modulation ceases for an unpredictable length of time. These similar shifts in the long-term amplitude and timescale in the two systems suggest they share a similar mechanism that gives rise to the anomalous low states.

  16. Empirical study of seven data mining algorithms on different characteristics of datasets for biomedical classification applications.

    PubMed

    Zhang, Yiyan; Xin, Yi; Li, Qin; Ma, Jianshe; Li, Shuai; Lv, Xiaodan; Lv, Weiqi

    2017-11-02

    New data mining algorithms are continually proposed as related disciplines develop. These algorithms differ in their applicable scope and performance. Hence, finding a suitable algorithm for a dataset is becoming an important concern for biomedical researchers who need to solve practical problems promptly. In this paper, seven well-established, actively used algorithms, namely C4.5, support vector machine, AdaBoost, k-nearest neighbor, naïve Bayes, random forest, and logistic regression, were selected as the research objects. The seven algorithms were applied to the 12 top-click UCI public datasets with the task of classification, and their performances were compared through induction and analysis. The sample size, number of attributes, number of missing values, sample size of each class, correlation coefficients between variables, class entropy of the task variable, and the ratio of the sample size of the largest class to that of the smallest class were calculated to characterize the 12 research datasets. The two ensemble algorithms reach high classification accuracy on most datasets. Moreover, random forest performs better than AdaBoost on the unbalanced dataset of the multi-class task. Simple algorithms, such as the naïve Bayes and logistic regression models, are suitable for a small dataset with high correlation between the task and other non-task attribute variables. K-nearest neighbor and C4.5 decision tree algorithms perform well on binary- and multi-class task datasets. Support vector machine is better suited to the balanced small dataset of the binary-class task. No algorithm can maintain the best performance in all datasets. The applicability of the seven data mining algorithms on the datasets with different characteristics was summarized to provide a reference for biomedical researchers or beginners in different fields.
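
Two of the dataset descriptors listed in this abstract, the class entropy of the task variable and the ratio of the largest to the smallest class, are simple to compute. The Python sketch below shows one plausible definition of each (Shannon entropy in bits and a plain count ratio); the abstract does not state the exact formulas, so both are assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def class_entropy(labels):
    """Shannon entropy (in bits) of the class-label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())
```

A perfectly balanced binary task has entropy 1 bit and ratio 1; both descriptors move away from those values as one class comes to dominate, which is the kind of imbalance the abstract relates to differences between random forest and AdaBoost.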

  17. Evaluation of methods and marker Systems in Genomic Selection of oil palm (Elaeis guineensis Jacq.).

    PubMed

    Kwong, Qi Bin; Teh, Chee Keng; Ong, Ai Ling; Chew, Fook Tim; Mayes, Sean; Kulaveerasingam, Harikrishna; Tammi, Martti; Yeoh, Suat Hui; Appleton, David Ross; Harikrishna, Jennifer Ann

    2017-12-11

    Genomic selection (GS) uses genome-wide markers as an attempt to accelerate genetic gain in breeding programs of both animals and plants. This approach is particularly useful for perennial crops such as oil palm, which have long breeding cycles, and for which the optimal method for GS is still under debate. In this study, we evaluated the effect of different marker systems and modeling methods for implementing GS in an introgressed dura family derived from a Deli dura x Nigerian dura (Deli x Nigerian) with 112 individuals. This family is an important breeding source for developing new mother palms for superior oil yield and bunch characters. The traits of interest selected for this study were fruit-to-bunch (F/B), shell-to-fruit (S/F), kernel-to-fruit (K/F), mesocarp-to-fruit (M/F), oil per palm (O/P) and oil-to-dry mesocarp (O/DM). The marker systems evaluated were simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). RR-BLUP, Bayesian A, B, Cπ, LASSO, Ridge Regression and two machine learning methods (SVM and Random Forest) were used to evaluate GS accuracy of the traits. The kinship coefficient between individuals in this family ranged from 0.35 to 0.62. S/F and O/DM had the highest genomic heritability, whereas F/B and O/P had the lowest. The accuracies using 135 SSRs were low, with accuracies of the traits around 0.20. The average accuracy of machine learning methods was 0.24, as compared to 0.20 achieved by other methods. The trait with the highest mean accuracy was F/B (0.28), while the lowest were both M/F and O/P (0.18). By using whole genomic SNPs, the accuracies for all traits, especially for O/DM (0.43), S/F (0.39) and M/F (0.30) were improved. The average accuracy of machine learning methods was 0.32, compared to 0.31 achieved by other methods. Due to high genomic resolution, the use of whole-genome SNPs improved the efficiency of GS dramatically for oil palm and is recommended for dura breeding programs. 
    Machine learning slightly outperformed other methods, but required parameter optimization for GS implementation.

  18. Genetic Control of Contagious Asexuality in the Pea Aphid

    PubMed Central

    Jaquiéry, Julie; Stoeckel, Solenn; Larose, Chloé; Nouhaud, Pierre; Rispe, Claude; Mieuzet, Lucie; Bonhomme, Joël; Mahéo, Frédérique; Legeai, Fabrice; Gauthier, Jean-Pierre; Prunier-Leterme, Nathalie; Tagu, Denis; Simon, Jean-Christophe

    2014-01-01

    Although evolutionary transitions from sexual to asexual reproduction are frequent in eukaryotes, the genetic bases of such shifts toward asexuality remain largely unknown. We addressed this issue in an aphid species where both sexual and obligate asexual lineages coexist in natural populations. These sexual and asexual lineages may occasionally interbreed because some asexual lineages maintain a residual production of males potentially able to mate with the females produced by sexual lineages. Hence, this species is an ideal model to study the genetic basis of the loss of sexual reproduction with quantitative genetic and population genomic approaches. Our analysis of the co-segregation of ∼300 molecular markers and reproductive phenotype in experimental crosses pinpointed an X-linked region controlling obligate asexuality, this state of character being recessive. A population genetic analysis (>400-marker genome scan) on wild sexual and asexual genotypes from geographically distant populations under divergent selection for reproductive strategies detected a strong signature of divergent selection in the genomic region identified by the experimental crosses. These population genetic data confirm the implication of the candidate region in the control of reproductive mode in wild populations originating from 700 km apart. Patterns of genetic differentiation along chromosomes suggest bidirectional gene flow between populations with distinct reproductive modes, supporting contagious asexuality as a prevailing route to permanent parthenogenesis in pea aphids. This genetic system provides new insights into the mechanisms of coexistence of sexual and asexual aphid lineages. PMID:25473828

  19. A m-ary linear feedback shift register with binary logic

    NASA Technical Reports Server (NTRS)

    Perlman, M. (Inventor)

    1973-01-01

    A family of m-ary linear feedback shift registers with binary logic is disclosed. Each m-ary linear feedback shift register with binary logic generates a binary representation of a nonbinary recurring sequence, producible with a m-ary linear feedback shift register without binary logic in which m is greater than 2. The state table of a m-ary linear feedback shift register without binary logic, utilizing sum modulo m feedback, is first tabulated for a given initial state. The entries in the state table are coded in binary and the binary entries are used to set the initial states of the stages of a plurality of binary shift registers. A single feedback logic unit is employed which provides a separate feedback binary digit to each binary register as a function of the states of corresponding stages of the binary registers.
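
    The first two steps described above (tabulating the state table of an m-ary register with sum-modulo-m feedback, then coding the entries in binary) can be sketched as follows. This is a minimal illustration, not the disclosed circuit: the function names, the tap positions, and the choice m = 3 are assumptions, and the binary feedback logic unit itself is not modeled.

```python
def mary_lfsr_states(initial, taps, m, steps):
    """Tabulate successive states of an m-ary LFSR with sum-modulo-m feedback."""
    state = list(initial)
    table = [tuple(state)]
    for _ in range(steps):
        feedback = sum(state[i] for i in taps) % m
        state = [feedback] + state[:-1]  # shift by one stage, feed back into stage 0
        table.append(tuple(state))
    return table


def to_bit_planes(state, m):
    """Code each m-ary digit in binary; plane k holds bit k of every stage."""
    width = max(1, (m - 1).bit_length())
    return [tuple((digit >> k) & 1 for digit in state) for k in range(width)]


# Hypothetical example: ternary (m = 3), two stages, both stages tapped.
table = mary_lfsr_states(initial=(1, 0), taps=(0, 1), m=3, steps=8)
for state in table:
    print(state, to_bit_planes(state, 3))
```

    Each bit plane corresponds to the contents of one binary shift register, so a ternary register needs two binary registers in this coding.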

  20. Interplay between tetrel and triel bonds in RC6H4CN⋯MF3CN⋯BX3 complexes: A combined symmetry-adapted perturbation theory, Møller-Plesset, and quantum theory of atoms-in-molecules study.

    PubMed

    Yourdkhani, Sirous; Korona, Tatiana; Hadipour, Nasser L

    2015-12-15

    Intermolecular ternary complexes composed of: (1) the centrally placed trifluoroacetonitrile or its higher analogs with central carbon exchanged by silicon or germanium (M = C, Si, Ge), (2) the benzonitrile molecule or its para derivatives on one side, and (3) the boron trifluoride or trichloride molecule (X = F, Cl) on the opposite side, as well as the corresponding intermolecular tetrel- and triel-bonded binary complexes, were investigated by symmetry-adapted perturbation theory (SAPT) and the supermolecular Møller-Plesset method (MP2) at the complete basis set limit for optimized geometries. The character of the interactions was studied by quantum theory of atoms-in-molecules (QTAIM). A comparison of interaction energies and QTAIM bond descriptors for dimers and trimers reveals that tetrel and triel bonds increase in their strength if present together in the trimer. For the triel-bonded complex, this growth leads to a change of the bond character from closed-shell to partly covalent for Si or Ge tetrel atoms, so the resulting bonding scheme corresponds to a preliminary stage of the SN2 reaction. Limitations of the Lewis theory of acids and bases were shown by its failure in predicting the stability order of the triel complexes. The necessity of including interaction energy terms beyond the electrostatic component for an elucidation of the nature of σ- and π-holes was presented by a SAPT energy decomposition and by a study of differences in monomer electrostatic potentials obtained either from isolated monomer densities, or from densities resulting from a perturbation with the effective field of another monomer. © 2015 Wiley Periodicals, Inc.

  1. Ribosomal DNA Integrating rAAV-rDNA Vectors Allow for Stable Transgene Expression

    PubMed Central

    Lisowski, Leszek; Lau, Ashley; Wang, Zhongya; Zhang, Yue; Zhang, Feijie; Grompe, Markus; Kay, Mark A

    2012-01-01

    Although recombinant adeno-associated virus (rAAV) vectors are proving to be efficacious in clinical trials, the episomal character of the delivered transgene restricts their effectiveness to use in quiescent tissues, and may not provide lifelong expression. In contrast, integrating vectors enhance the risk of insertional mutagenesis. In an attempt to overcome both of these limitations, we created new rAAV-rDNA vectors, with an expression cassette flanked by ribosomal DNA (rDNA) sequences capable of homologous recombination into genomic rDNA. We show that after in vivo delivery the rAAV-rDNA vectors integrated into the genomic rDNA locus 8–13 times more frequently than control vectors, providing an estimate that 23–39% of the integrations were specific to the rDNA locus. Moreover, a rAAV-rDNA vector containing a human factor IX (hFIX) expression cassette resulted in sustained therapeutic levels of serum hFIX even after repeated manipulations to induce liver regeneration. Because of the relative safety of integration in the rDNA locus, these vectors expand the usage of rAAV for therapeutics requiring long-term gene transfer into dividing cells. PMID:22990671

  2. What Defines the "Kingdom" Fungi?

    PubMed

    Richards, Thomas A; Leonard, Guy; Wideman, Jeremy G

    2017-06-01

    The application of environmental DNA techniques and increased genome sequencing of microbial diversity, combined with detailed study of cellular characters, has consistently led to the reexamination of our understanding of the tree of life. This has challenged many of the definitions of taxonomic groups, especially higher taxonomic ranks such as eukaryotic kingdoms. The Fungi is an example of a kingdom which, together with the features that define it and the taxa that are grouped within it, has been in a continual state of flux. In this article we aim to summarize multiple lines of data pertinent to understanding the early evolution and definition of the Fungi. These include ongoing cellular and genomic comparisons that, we will argue, have generally undermined all attempts to identify a synapomorphic trait that defines the Fungi. This article will also summarize ongoing work focusing on taxon discovery, combined with phylogenomic analysis, which has identified novel groups that lie proximate/adjacent to the fungal clade-wherever the boundary that defines the Fungi may be. Our hope is that, by summarizing these data in the form of a discussion, we can illustrate the ongoing efforts to understand what drove the evolutionary diversification of fungi.

  3. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges

    NASA Astrophysics Data System (ADS)

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress.

  4. Introduction to the symposium--barnacle biology: essential aspects and contemporary approaches.

    PubMed

    Zardus, John D

    2012-09-01

    Barnacles have evolved a number of specialized features peculiar for crustaceans: they produce a calcified, external shell; they exhibit sexual strategies involving dioecy and androdioecy; and some have become internal parasites of other Crustacea. The thoroughly sessile habit of adults also belies the highly mobile and complex nature of their larval stages. Given these and other remarkable innovations in their natural history, it is perhaps not surprising that barnacles present a spectrum of opportunities for study. This symposium integrates research on barnacles in the areas of larval biology, biofouling, reproduction, biogeography, speciation, population genetics, ecological genomics, and phylogenetics. Pioneering comparisons are presented of metamorphosis among barnacles from three major lineages. Biofouling is investigated from the perspectives of biochemical and biomechanical mechanisms. Tradeoffs in reproductive specializations are scrutinized through theoretical modeling and empirical validation. Patterns of endemism and diversity are delineated in Australia and intricate species boundaries in the genus Chthamalus are elucidated for the Indo-Pacific. General methodological concerns with population expansion studies in crustaceans are highlighted using barnacle models. Data from the first, draft barnacle genome are employed to examine location-specific selection. Lastly, barnacle evolution is framed in a deep phylogenetic context and hypothetical origins of defined characters are outlined and tested.

  5. [Medicinal plant DNA marker assisted breeding (Ⅱ) the assistant identification of SNPs assisted identification and breeding research of high yield Perilla frutescens new variety].

    PubMed

    Shen, Qi; Zhang, Dong; Sun, Wei; Zhang, Yu-Jun; Shang, Zhi-Wei; Chen, Shi-Lin

    2017-05-01

    Perilla frutescens is one of 60 kinds of food and medicine plants in the initial directory announced by the health ministry of China. With the development of the Perilla domain in recent years, the breeding and application of good varieties has become the main bottleneck of its development. This study reports the breeding of Perilla varieties by systematic selection combined with a marker-assisted method. Through whole genome sequencing and consistency matching, the mutation loci were annotated according to the genome data and compared with the Perilla common variants database; finally, 30 non-synonymous mutation SNPs were selected as characteristic markers of Zhongyan Feishu No.1. These SNP markers were used as the selection standard for Perilla varieties. The new Perilla variety Zhongyan Feishu No.1 was finally bred; it has the characters of dual use of leaf and seed, high yield and high resistance, and could be used as green fertilizer. Zhongyan Feishu No.1 acquired the new plant variety identification of Beijing city, with identification number 2016054. Marker-assisted identification guides the breeding of new plant varieties and can provide a new reference for the breeding of medicinal plants. Copyright© by the Chinese Pharmaceutical Association.

  6. Postprocessing for character recognition using pattern features and linguistic information

    NASA Astrophysics Data System (ADS)

    Yoshikawa, Takatoshi; Okamoto, Masayosi; Horii, Hiroshi

    1993-04-01

    We propose a new method of post-processing for character recognition using pattern features and linguistic information. This method corrects errors in the recognition of handwritten Japanese sentences containing Kanji characters. This post-processing method is characterized by having two types of character recognition. Improving the recognition rate of Japanese characters is made difficult by the large number of characters and the existence of characters with similar patterns. Therefore, it is not practical for a character recognition system to recognize all characters in detail. First, this post-processing method generates a candidate character table by recognizing the simplest features of characters. Then, it selects words corresponding to the character from the candidate character table by referring to a word and grammar dictionary before selecting suitable words. If the correct character is included in the candidate character table, this process can correct an error; however, if the character is not included, it cannot correct an error. Therefore, this method uses linguistic information (word and grammar dictionary) to presume a character that does not exist in the candidate character table, and then verifies the presumed character by character recognition using complex features. When this method is applied to an online character recognition system, the accuracy of character recognition improves from 93.5% to 94.7%. This proved to be the case when it was used for the editorials of a Japanese newspaper (Asahi Shinbun).
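
    The dictionary lookup step described above can be sketched as a search over the candidate character table. This is a simplified illustration under assumptions: the function name is hypothetical, Latin letters stand in for Kanji, the dictionary is a plain set of words with no grammar information, and the second-stage recognition with complex features is not modeled.

```python
from itertools import product


def correct_with_dictionary(candidates, dictionary):
    """Return the dictionary words that can be spelled by picking one
    candidate character per position (enumeration order of itertools.product)."""
    matches = []
    for combo in product(*candidates):
        word = "".join(combo)
        if word in dictionary:
            matches.append(word)
    return matches


# Hypothetical candidate table: one list of candidate characters per position.
candidate_table = [["c", "e"], ["a", "o"], ["t", "l"]]
word_dictionary = {"cat", "eat", "cot"}
print(correct_with_dictionary(candidate_table, word_dictionary))
```

    An empty result corresponds to the case in the abstract where the correct character is missing from the candidate table, which is when the method falls back on linguistic information and detailed recognition.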

  7. BamTools: a C++ API and toolkit for analyzing and managing BAM files.

    PubMed

    Barnett, Derek W; Garrison, Erik K; Quinlan, Aaron R; Strömberg, Michael P; Marth, Gabor T

    2011-06-15

    Analysis of genomic sequencing data requires efficient, easy-to-use access to alignment results and flexible data management tools (e.g. filtering, merging, sorting, etc.). However, the enormous amount of data produced by current sequencing technologies is typically stored in compressed, binary formats that are not easily handled by the text-based parsers commonly used in bioinformatics research. We introduce a software suite for programmers and end users that facilitates research analysis and data management using BAM files. BamTools provides both the first C++ API publicly available for BAM file support as well as a command-line toolkit. BamTools was written in C++, and is supported on Linux, Mac OSX and MS Windows. Source code and documentation are freely available at http://github.org/pezmaster31/bamtools.

  8. Sexual reproduction and genetic exchange in parasitic protists.

    PubMed

    Weedall, Gareth D; Hall, Neil

    2015-02-01

    A key part of the life cycle of an organism is reproduction. For a number of important protist parasites that cause human and animal disease, their sexuality has been a topic of debate for many years. Traditionally, protists were considered to be primitive relatives of the 'higher' eukaryotes, which may have diverged prior to the evolution of sex and to reproduce by binary fission. More recent views of eukaryotic evolution suggest that sex, and meiosis, evolved early, possibly in the common ancestor of all eukaryotes. However, detecting sex in these parasites is not straightforward. Recent advances, particularly in genome sequencing technology, have allowed new insights into parasite reproduction. Here, we review the evidence on reproduction in parasitic protists. We discuss protist reproduction in the light of parasitic life cycles and routes of transmission among hosts.

  9. Archaeal phylogenomics provides evidence in support of a methanogenic origin of the Archaea and a thaumarchaeal origin for the eukaryotes.

    PubMed

    Kelly, S; Wickstead, B; Gull, K

    2011-04-07

    We have developed a machine-learning approach to identify 3537 discrete orthologue protein sequence groups distributed across all available archaeal genomes. We show that treating these orthologue groups as binary detection/non-detection data is sufficient to capture the majority of archaeal phylogeny. We subsequently use the sequence data from these groups to infer a method and substitution-model-independent phylogeny. By holding this phylogeny constrained and interrogating the intersection of this large dataset with both the Eukarya and the Bacteria using Bayesian and maximum-likelihood approaches, we propose and provide evidence for a methanogenic origin of the Archaea. By the same criteria, we also provide evidence in support of an origin for Eukarya either within or as sisters to the Thaumarchaea.
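
    Treating orthologue groups as binary detection/non-detection data, as described above, amounts to building a presence/absence matrix over genomes. The sketch below shows only that encoding plus a simple Hamming distance between genomes; the genome names, group identifiers, and the use of Hamming distance are assumptions for illustration and are not the Bayesian or maximum-likelihood analyses used in the study.

```python
def presence_matrix(genome_groups, all_groups):
    """Encode each genome as a 0/1 vector: 1 if the orthologue group was detected."""
    return {
        genome: [1 if group in detected else 0 for group in all_groups]
        for genome, detected in genome_groups.items()
    }


def hamming_distance(a, b):
    """Number of orthologue groups detected in exactly one of the two genomes."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical genomes and orthologue-group identifiers.
genome_groups = {
    "genome_A": {"og1", "og2"},
    "genome_B": {"og2", "og3"},
}
all_groups = ["og1", "og2", "og3"]
matrix = presence_matrix(genome_groups, all_groups)
print(matrix)
print(hamming_distance(matrix["genome_A"], matrix["genome_B"]))
```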

  10. Efficient production of human acidic fibroblast growth factor in pea (Pisum sativum L.) plants by agroinfection of germinated seeds

    PubMed Central

    2011-01-01

    Background For efficient and large scale production of recombinant proteins in plants, transient expression by agroinfection has a number of advantages over stable transformation. Simple manipulation, rapid analysis and high expression efficiency are possible. In pea, Pisum sativum, a Virus Induced Gene Silencing System using the pea early browning virus has been converted into an efficient agroinfection system by converting the two RNA genomes of the virus into binary expression vectors for Agrobacterium transformation. Results By vacuum infiltration (0.08 MPa, 1 min) of germinating pea seeds with 2-3 cm roots with Agrobacteria carrying the binary vectors, expression of the gene for Green Fluorescent Protein as marker and the gene for the human acidic fibroblast growth factor (aFGF) was obtained in 80% of the infiltrated developing seedlings. Maximal production of the recombinant proteins was achieved 12-15 days after infiltration. Conclusions Compared to the leaf injection method, vacuum infiltration of germinated seeds is highly efficient, allowing large scale production of plants transiently expressing recombinant proteins. The production cycle of plants for harvesting the recombinant protein was shortened from 30 days for leaf injection to 15 days by applying vacuum infiltration. The synthesized aFGF was purified by heparin-affinity chromatography and its mitogenic activity on NIH 3T3 cells was confirmed to be similar to a commercial product. PMID:21548923

  11. An Improved Single-Step Cloning Strategy Simplifies the Agrobacterium tumefaciens-Mediated Transformation (ATMT)-Based Gene-Disruption Method for Verticillium dahliae.

    PubMed

    Wang, Sheng; Xing, Haiying; Hua, Chenlei; Guo, Hui-Shan; Zhang, Jie

    2016-06-01

    The soilborne fungal pathogen Verticillium dahliae infects a broad range of plant species to cause severe diseases. The availability of Verticillium genome sequences has provided opportunities for large-scale investigations of individual gene function in Verticillium strains using Agrobacterium tumefaciens-mediated transformation (ATMT)-based gene-disruption strategies. Traditional ATMT vectors require multiple cloning steps and elaborate characterization procedures to achieve successful gene replacement; thus, these vectors are not suitable for high-throughput ATMT-based gene deletion. Several advancements have been made that either involve simplification of the steps required for gene-deletion vector construction or increase the efficiency of the technique for rapid recombinant characterization. However, an ATMT binary vector that is both simple and efficient is still lacking. Here, we generated a USER-ATMT dual-selection (DS) binary vector, which combines both the advantages of the USER single-step cloning technique and the efficiency of the herpes simplex virus thymidine kinase negative-selection marker. Highly efficient deletion of three different genes in V. dahliae using the USER-ATMT-DS vector enabled verification that this newly-generated vector not only facilitates the cloning process but also simplifies the subsequent identification of fungal homologous recombinants. The results suggest that the USER-ATMT-DS vector is applicable for efficient gene deletion and suitable for large-scale gene deletion in V. dahliae.

  12. Predicting binary, discrete and continued lncRNA-disease associations via a unified framework based on graph regression.

    PubMed

    Shi, Jian-Yu; Huang, Hua; Zhang, Yan-Ning; Long, Yu-Xi; Yiu, Siu-Ming

    2017-12-21

    In human genomes, long non-coding RNAs (lncRNAs) have attracted more and more attention because their dysfunctions are involved in many diseases. However, the associations between lncRNAs and diseases (LDA) still remain unknown in most cases. While identifying disease-related lncRNAs in vivo is costly, computational approaches are promising to not only accelerate the possible identification of associations but also provide clues on the underlying mechanism of various lncRNA-caused diseases. Former computational approaches usually only focus on predicting new associations between lncRNAs having known associations with diseases and other lncRNA-associated diseases. They also only work on binary lncRNA-disease associations (whether the pair has an association or not), which cannot reflect and reveal other biological facts, such as the number of proteins involved in LDA or how strong the association is (i.e., the intensity of LDA). To address abovementioned issues, we propose a graph regression-based unified framework (GRUF). In particular, our method can work on lncRNAs, which have no previously known disease association and diseases that have no known association with any lncRNAs. Also, instead of only a binary answer for the association, our method tries to uncover more biological relationship between a pair of lncRNA and disease, which may provide better clues for researchers. We compared GRUF with three state-of-the-art approaches and demonstrated the superiority of GRUF, which achieves 5%~16% improvement in terms of the area under the receiver operating characteristic curve (AUC). GRUF also provides a predicted confidence score for the predicted LDA, which reveals the significant correlation between the score and the number of RNA-Binding Proteins involved in LDAs. Lastly, three out of top-5 LDA candidates generated by GRUF in novel prediction are verified indirectly by medical literature and known biological facts. 
    The proposed GRUF has two advantages over existing approaches. Firstly, it can be used to work on lncRNAs that have no known disease association and diseases that have no known association with any lncRNAs. Secondly, instead of providing a binary answer (with or without association), GRUF works for both discrete and continued LDA, which helps reveal the pathological implications between lncRNAs and diseases.
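
    The comparison above is reported as the area under the receiver operating characteristic curve (AUC). As a reminder of what that number measures, the sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as one half. The function name and the toy scores are hypothetical; this is not the GRUF evaluation code.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    labels are 1 (association) or 0 (no association); ties count as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted confidence scores and true binary labels.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_truth = [1, 0, 1, 0]
print(auc(toy_scores, toy_truth))
```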

  13. C-CAT: a computer software used to analyze and select Chinese characters and character components for psychological research.

    PubMed

    Lo, Ming; Hue, Chih-Wei

    2008-11-01

    The Character-Component Analysis Toolkit (C-CAT) software was designed to assist researchers in constructing experimental materials using traditional Chinese characters. The software package contains two sets of character stocks: one suitable for research using literate adults as subjects and one suitable for research using schoolchildren as subjects. The software can identify linguistic properties, such as the number of strokes contained, the character-component pronunciation regularity, and the arrangement of character components within a character. Moreover, it can compute a character's linguistic frequency, neighborhood size, and phonetic validity with respect to a user-selected character stock. It can also search the selected character stock for similar characters or for character components with user-specified linguistic properties.

  14. Automatic extraction of via in the CT image of PCB

    NASA Astrophysics Data System (ADS)

    Liu, Xifeng; Hu, Yuwei

    2018-04-01

    In modern industry, the nondestructive testing of printed circuit boards (PCBs) can effectively prevent system failure and is becoming more and more important. In order to detect vias in the PCB automatically, accurately and reliably based on the CT image, a novel algorithm for via extraction based on weighted stacking combined with the morphologic character of vias is designed. The slice data in the vertical direction of the PCB are superimposed to enhance the via targets. The OTSU algorithm is used to segment the slice image. The OTSU algorithm for thresholding gray-level images is efficient for separating an image into two classes where two fairly distinct classes exist in the image. The Randomized Hough Transform was used to locate the via regions in the segmented binary image. Then the 3D reconstruction of the vias based on the sequence of slice images was done by volume rendering. The accuracy of via positioning and detection from CT images of the PCB was demonstrated with the proposed algorithm. It was found that the method is accurate and stable for detecting vias in three dimensions.
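
    The segmentation step above relies on Otsu thresholding, which picks the gray level that maximizes the between-class variance of the two resulting pixel classes. The sketch below is a plain-Python illustration on a flat list of integer gray levels; the function name, the 256-level assumption, and the toy pixel data are hypothetical, and the weighted stacking and Hough transform stages are not modeled.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance.

    Pixels <= t form one class and pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0  # pixel count and intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Hypothetical slice: dark background (10) and bright via pixels (200).
toy_pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(toy_pixels)
binary_image = [1 if p > threshold else 0 for p in toy_pixels]
print(threshold, sum(binary_image))
```

    When several thresholds give the same variance, this sketch keeps the lowest one; other implementations may break ties differently.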

  15. Odor detection of mixtures of homologous carboxylic acids and coffee aroma compounds by humans.

    PubMed

    Miyazawa, Toshio; Gallagher, Michele; Preti, George; Wise, Paul M

    2009-11-11

    Mixture summation among homologous carboxylic acids, that is, the relationship between detection probabilities for mixtures and detection probabilities for their unmixed components, varies with similarity in carbon-chain length. The current study examined detection of acetic, butyric, hexanoic, and octanoic acids mixed with three other model odorants that differ greatly from the acids in both structure and odor character, namely, 2-hydroxy-3-methylcyclopent-2-en-1-one, furan-2-ylmethanethiol, and (3-methyl-3-sulfanylbutyl) acetate. Psychometric functions were measured for both single compounds and binary mixtures (2 of 5, forced-choice method). An air dilution olfactometer delivered stimuli, with vapor-phase calibration using gas chromatography-mass spectrometry. Across the three odorants that differed from the acids, acetic and butyric acid showed approximately additive (or perhaps even supra-additive) summation at low perithreshold concentrations, but subadditive interactions at high perithreshold concentrations. In contrast, the medium-chain acids showed subadditive interactions across a wide range of concentrations. Thus, carbon-chain length appears to influence not only summation with other carboxylic acids but also summation with at least some unrelated compounds.

  16. Anthracene + Pyrene Solid Mixtures: Eutectic and Azeotropic Character

    PubMed Central

    Rice, James W.; Fu, Jinxia; Suuberg, Eric M.

    2010-01-01

    To better characterize the thermodynamic behavior of a binary polycyclic aromatic hydrocarbon mixture, thermochemical and vapor pressure experiments were used to examine the phase behavior of the anthracene (1) + pyrene (2) system. A solid-liquid phase diagram was mapped for the mixture. A eutectic point occurs at 404 K at x1 = 0.22. A model based on eutectic formation can be used to predict the enthalpy of fusion associated with the mixture. For mixtures that contain x1 < 0.90, the enthalpy of fusion is near that of pure pyrene. This and X-ray diffraction results indicate that mixtures of anthracene and pyrene have pyrene-like crystal structures and energetics until the composition nears that of pure anthracene. Solid-vapor equilibrium studies show that mixtures of anthracene and pyrene form solid azeotropes at x1 of 0.03 and 0.14. Additionally, mixtures at x1 = 0.99 sublime at the vapor pressure of pure anthracene, suggesting that anthracene behavior is not significantly influenced by x2 = 0.01 in the crystal structure. PMID:21116474

  17. Automatic detection and recognition of signs from natural scenes.

    PubMed

    Chen, Xilin; Yang, Jie; Zhang, Jing; Waibel, Alex

    2004-01-01

    In this paper, we present an approach to automatic detection and recognition of signs from natural scenes, and its application to a sign translation task. The proposed approach embeds multiresolution and multiscale edge detection, adaptive searching, color analysis, and affine rectification in a hierarchical framework for sign detection, with different emphases at each phase to handle the text in different sizes, orientations, color distributions and backgrounds. We use affine rectification to recover deformation of the text regions caused by an inappropriate camera view angle. The procedure can significantly improve text detection rate and optical character recognition (OCR) accuracy. Instead of using binary information for OCR, we extract features from an intensity image directly. We propose a local intensity normalization method to effectively handle lighting variations, followed by a Gabor transform to obtain local features, and finally a linear discriminant analysis (LDA) method for feature selection. We have applied the approach in developing a Chinese sign translation system, which can automatically detect and recognize Chinese signs as input from a camera, and translate the recognized text into English.

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yashchuk, V.V.; Takacs, P.; Anderson, E.H.

    A modulation transfer function (MTF) calibration method based on binary pseudorandom (BPR) gratings and arrays has been proven to be an effective MTF calibration method for interferometric microscopes and a scatterometer. Here we report on a further expansion of the application range of the method. We describe the MTF calibration of a 6 in. phase shifting Fizeau interferometer. Beyond providing a direct measurement of the interferometer's MTF, tests with a BPR array surface have revealed an asymmetry in the instrument's data processing algorithm that fundamentally limits its bandwidth. Moreover, the tests have illustrated the effects of the instrument's detrending and filtering procedures on power spectral density measurements. The details of the development of a BPR test sample suitable for calibration of scanning and transmission electron microscopes are also presented. Such a test sample is realized as a multilayer structure with the layer thicknesses of two materials corresponding to the BPR sequence. The investigations confirm the universal character of the method that makes it applicable to a large variety of metrology instrumentation with spatial wavelength bandwidths from a few nanometers to hundreds of millimeters.
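To illustrate why a binary pseudorandom pattern suits MTF calibration, here is a minimal sketch that assumes the BPR sequence is a maximum-length sequence generated by a linear-feedback shift register; the abstract does not state which construction the authors use. The function names `mls` and `circular_autocorrelation` and the tap choice are illustrative only.

```python
def mls(taps, nbits):
    """Return one period (2**nbits - 1 bits) of a Fibonacci LFSR sequence.

    taps: 1-based register positions XORed to form the feedback bit.
    The period is maximal only when the taps correspond to a primitive
    polynomial; (3, 4) with nbits=4 is one such choice.
    """
    state = [1] * nbits              # any nonzero seed works
    out = []
    for _ in range(2 ** nbits - 1):
        out.append(state[-1])        # output the last register bit
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]   # shift and insert the feedback bit
    return out


def circular_autocorrelation(bits):
    """Circular autocorrelation of the sequence after mapping 0/1 to -1/+1."""
    x = [2 * b - 1 for b in bits]
    n = len(x)
    return [sum(x[i] * x[(i + lag) % n] for i in range(n)) for lag in range(n)]


if __name__ == "__main__":
    seq = mls((3, 4), 4)
    print(seq)
    print(circular_autocorrelation(seq))
```

For a maximum-length sequence the circular autocorrelation is a single peak at zero lag with a constant value of -1 elsewhere, so the power spectrum is flat apart from the zero-frequency term. A test surface with that property excites all spatial frequencies equally, and any roll-off in the measured spectrum can then be attributed to the instrument.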

  19. Correlational latent heat by nonlocal quantum kinetic theory

    NASA Astrophysics Data System (ADS)

    Morawetz, K.

    2018-05-01

    A kinetic equation of nonlocal and noninstantaneous character unifies the achievements of transport in dense quantum gases with the Landau theory of quasiclassical transport in Fermi systems. Large cancellations in the off-shell motion appear, which are usually hidden in non-Markovian behaviors. The remaining corrections are expressed in terms of shifts in space and time that characterize the nonlocality of the scattering process. In this way, it is possible to recast quantum transport into a quasiclassical picture. In addition to the quasiparticle, the balance equations for density, momentum, energy, and entropy also include correlated two-particle contributions beyond the Landau theory. The medium effects on binary collisions are shown to mediate the latent heat, i.e., an energy conversion between correlation and thermal energy. For Maxwellian particles with time-dependent s-wave scattering, the correlated parts of the observables are calculated and a sign change of the latent heat is reported at a universal ratio of scattering length to the thermal de Broglie wavelength. This is interpreted as a change from correlational heating to cooling.

  20. Distinct Processes Drive Diversification in Different Clades of Gesneriaceae.

    PubMed

    Roalson, Eric H; Roberts, Wade R

    2016-07-01

    Using a time-calibrated phylogenetic hypothesis including 768 Gesneriaceae species (out of ~3300 species) and more than 29,000 aligned bases from 26 gene regions, we test Gesneriaceae for diversification rate shifts and the possible proximal drivers of these shifts: geographic distributions, growth forms, and pollination syndromes. Bayesian Analysis of Macroevolutionary Mixtures analyses found five significant rate shifts in Beslerieae, core Nematanthus, core Columneinae, core Streptocarpus, and Pacific Cyrtandra. These rate shifts correspond with shifts in diversification rates, as inferred by Binary State Speciation and Extinction Model and Geographic State Speciation and Extinction model, associated with hummingbird pollination, epiphytism, unifoliate growth, and geographic area. Our results suggest that diversification processes are extremely variable across Gesneriaceae clades with different combinations of characters influencing diversification rates in different clades. Diversification patterns between New and Old World lineages show dramatic differences, suggesting that the processes of diversification in Gesneriaceae are very different in these two geographic regions. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
