extensive sequence-level divergence: Topics by Science.gov

Sample records for extensive sequence-level divergence

Sequence modelling and an extensible data model for genomic database

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Peter Wei-Der

1992-01-01

The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.
Length and sequence variability in mitochondrial control region of the milkfish, Chanos chanos.

PubMed

Ravago, Rachel G; Monje, Virginia D; Juinio-Meñez, Marie Antonette

2002-01-01

Extensive length variability was observed in the mitochondrial control region of the milkfish, Chanos chanos. The nucleotide sequence of the control region and flanking regions was determined. Length variability and heteroplasmy were due to the presence of varying numbers of a 41-bp tandemly repeated sequence and a 48-bp insertion/deletion (indel). The structure and organization of the milkfish control region are similar to those of other teleost fish and vertebrates. However, extensive variation in the copy number of tandem repeats (4-20 copies) and the presence of a relatively large (48-bp) indel are apparently uncommon in teleost fish control region sequences reported to date. High sequence variability of control region peripheral domains indicates the potential utility of selected regions as markers for population-level studies.
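The size of the length variation described in this abstract can be worked out from the figures it gives. The sketch below is only an illustration: the function name is hypothetical, it counts only the 41-bp repeat array and the 48-bp indel (not the rest of the control region), and it assumes the indel varies independently of the repeat copy number, which the abstract does not state.

```python
REPEAT_BP = 41   # length of one tandem repeat unit, from the abstract
INDEL_BP = 48    # length of the insertion/deletion, from the abstract


def variable_length_bp(copies: int, has_indel: bool) -> int:
    """Base pairs contributed by the repeat array plus the optional indel.

    Hypothetical helper: assumes the indel is independent of copy number.
    """
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return copies * REPEAT_BP + (INDEL_BP if has_indel else 0)


# Extremes implied by the reported 4-20 copies, with and without the indel.
shortest = variable_length_bp(4, has_indel=False)
longest = variable_length_bp(20, has_indel=True)
print(shortest, longest, longest - shortest)
```

Under these assumptions the variable portion alone could differ by several hundred base pairs between molecules, which is consistent with the abstract's description of the variability as extensive.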
The UK10K project identifies rare variants in health and disease.

PubMed

Walter, Klaudia; Min, Josine L; Huang, Jie; Crooks, Lucy; Memari, Yasin; McCarthy, Shane; Perry, John R B; Xu, ChangJiang; Futema, Marta; Lawson, Daniel; Iotchkova, Valentina; Schiffels, Stephan; Hendricks, Audrey E; Danecek, Petr; Li, Rui; Floyd, James; Wain, Louise V; Barroso, Inês; Humphries, Steve E; Hurles, Matthew E; Zeggini, Eleftheria; Barrett, Jeffrey C; Plagnol, Vincent; Richards, J Brent; Greenwood, Celia M T; Timpson, Nicholas J; Durbin, Richard; Soranzo, Nicole

2015-10-01

The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density lipoprotein cholesterol (LDLR and RGAG1) from single-marker and rare variant aggregation tests. We describe population structure and functional annotation of rare and low-frequency variants, use the data to estimate the benefits of sequencing for association studies, and summarize lessons from disease-specific collections. Finally, we make available an extensive resource, including individual-level genetic and phenotypic data and web-based tools to facilitate the exploration of association results.
Deconvoluting simulated metagenomes: the performance of hard- and soft- clustering algorithms applied to metagenomic chromosome conformation capture (3C)

PubMed Central

DeMaere, Matthew Z.

2016-01-01

Background Chromosome conformation capture, coupled with high throughput DNA sequencing in protocols like Hi-C and 3C-seq, has been proposed as a viable means of generating data to resolve the genomes of microorganisms living in naturally occurring environments. Metagenomic Hi-C and 3C-seq datasets have begun to emerge, but the feasibility of resolving genomes when closely related organisms (strain-level diversity) are present in the sample has not yet been systematically characterised. Methods We developed a computational simulation pipeline for metagenomic 3C and Hi-C sequencing to evaluate the accuracy of genomic reconstructions at, above, and below an operationally defined species boundary. We simulated datasets and measured accuracy over a wide range of parameters. Five clustering algorithms were evaluated (2 hard, 3 soft) using an adaptation of the extended B-cubed validation measure. Results When all genomes in a sample are below 95% sequence identity, all of the tested clustering algorithms performed well. When sequence data contains genomes above 95% identity (our operational definition of strain-level diversity), a naive soft-clustering extension of the Louvain method achieves the highest performance. Discussion Previously, only hard-clustering algorithms have been applied to metagenomic 3C and Hi-C data, yet none of these perform well when strain-level diversity exists in a metagenomic sample. Our simple extension of the Louvain method performed the best in these scenarios; however, accuracy remained well below the levels observed for samples without strain-level diversity. Strain resolution is also highly dependent on the amount of available 3C sequence data, suggesting that depth of sequencing must be carefully considered during experimental design. Finally, there appears to be great scope to improve the accuracy of strain resolution through further algorithm development. PMID:27843713
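The 95% identity boundary in this abstract can be sketched as a simple threshold test. This is an assumption-laden illustration, not the authors' pipeline: the function names are hypothetical, identity is computed column by column on two already-aligned sequences of equal length (gap columns skipped), and whole-genome comparisons would normally use a dedicated average-identity tool instead.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of matching columns between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def is_strain_level(aligned_a: str, aligned_b: str,
                    threshold: float = 95.0) -> bool:
    """True when identity is above the threshold (strain-level per the abstract)."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Toy example: one mismatch in 20 columns gives exactly 95%, which is not
# above the boundary; identical sequences are.
a = "ACGT" * 5
b = "ACGT" * 4 + "ACGA"
print(percent_identity(a, b), is_strain_level(a, b), is_strain_level(a, a))
```

Whether a pair sitting exactly at 95% counts as strain-level is a choice made here for the sketch; the abstract only distinguishes "below" and "above" the boundary.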
Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

PubMed Central

Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

2015-01-01

Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
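The observation that the motifs are identical in sequence but not conserved in orientation or position suggests scanning both strands when looking for them. The sketch below is a minimal illustration of that idea, assuming plain exact matching on an A/C/G/T sequence; the function names, the example motif, and the example sequence are hypothetical and are not taken from the study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif_any_orientation(seq: str, motif: str) -> list[tuple[int, str]]:
    """Return (start, strand) for every exact motif hit on either strand.

    '+' means the motif itself occurs at that 0-based position; '-' means
    its reverse complement does. Overlapping hits are reported.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = set()
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.add((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Illustrative only: a short made-up sequence with one hit on each strand.
print(find_motif_any_orientation("AAGATAAGGCTTATCTT", "GATAA"))
```

Because motifs this short occur frequently by chance, a scan like this can only nominate candidates; as the abstract notes, functional tests are what establish which occurrences matter.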
Iterative refinement of structure-based sequence alignments by Seed Extension

PubMed Central

Kim, Changhoon; Tai, Chin-Hsien; Lee, Byungkook

2009-01-01

Background Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment. Results RSE uses SE (Seed Extension) in its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs. Conclusion RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. It can be used as a standalone procedure following a regular structure-based sequence alignment or to replace the traditional iterative refinement procedures based on residue-level dynamic programming algorithm in many structure alignment programs. PMID:19589133
Crimean-Congo Hemorrhagic Fever

DTIC Science & Technology

2004-01-01

aminocaproic acid were also indicated. Much emphasis was also placed on preventing reinfection, including the necessity of removing blood crusts from...The sequence is approximately 60% identical both at the nucleotide and amino acid levels to the L segment of Dugbe virus, the only other Nairovirus...However, more recent data based on nucleic acid sequence analysis have revealed extensive genetic diversity. The first published CCHFV sequence
High levels of diversity characterize mandrill (Mandrillus sphinx) Mhc-DRB sequences.

PubMed

Abbott, Kristin M; Wickings, E Jean; Knapp, Leslie A

2006-08-01

The major histocompatibility complex (MHC) is highly polymorphic in most primate species studied thus far. The rhesus macaque (Macaca mulatta) has been studied extensively and the Mhc-DRB region demonstrates variability similar to humans. The extent of MHC diversity is relatively unknown for other Old World monkeys (OWM), especially among genera other than Macaca. A molecular survey of the Mhc-DRB region in mandrills (Mandrillus sphinx) revealed extensive variability, suggesting that other OWMs may also possess high levels of Mhc-DRB polymorphism. In the present study, 33 Mhc-DRB loci were identified from only 13 animals. Eleven were wild-born and presumed to be unrelated and two were captive-born twins. Two to seven different sequences were identified for each individual, suggesting that some mandrills may have as many as four Mhc-DRB loci on a single haplotype. From these sequences, representatives of at least six Mhc-DRB loci or lineages were identified. As observed in other primates, some new lineages may have arisen through the process of gene conversion. These findings indicate that mandrills have Mhc-DRB diversity not unlike rhesus macaques and humans.
Phylogenetic analyses of complete mitochondrial genome sequences suggest a basal divergence of the enigmatic rodent Anomalurus

PubMed Central

Horner, David S; Lefkimmiatis, Konstantinos; Reyes, Aurelio; Gissi, Carmela; Saccone, Cecilia; Pesole, Graziano

2007-01-01

Background Phylogenetic relationships between Lagomorpha, Rodentia and Primates and their allies (Euarchontoglires) have long been debated. While it is now generally agreed that Rodentia constitutes a monophyletic sister-group of Lagomorpha and that this clade (Glires) is sister to Primates and Dermoptera, higher-level relationships within Rodentia remain contentious. Results We have sequenced and performed extensive evolutionary analyses on the mitochondrial genome of the scaly-tailed flying squirrel Anomalurus sp., an enigmatic rodent whose phylogenetic affinities have been obscure and extensively debated. Our phylogenetic analyses of the coding regions of available complete mitochondrial genome sequences from Euarchontoglires suggest that Anomalurus is a sister taxon to the Hystricognathi, and that this clade represents the most basal divergence among sampled Rodentia. Bayesian dating methods incorporating a relaxed molecular clock provide divergence-time estimates which are consistently in agreement with the fossil record and which indicate a rapid radiation within Glires around 60 million years ago. Conclusion Taken together, the data presented provide a working hypothesis as to the phylogenetic placement of Anomalurus, underline the utility of mitochondrial sequences in the resolution of even relatively deep divergences and go some way to explaining the difficulty of conclusively resolving higher-level relationships within Glires with available data and methodologies. PMID:17288612
Impact of longer-term modest climate shifts on architecture of high-frequency sequences (Cyclothems), Pennsylvanian of midcontinent U.S.A

USGS Publications Warehouse

Feldman, H.R.; Franseen, E.K.; Joeckel, R.M.; Heckel, P.H.

2005-01-01

Pennsylvanian glacioeustatic cyclothems exposed in Kansas and adjacent areas provide a unique opportunity to test models of the impact of relative sea level and climate on stratal architecture. A succession of eight of these high-frequency sequences, traced along dip for 500 km, reveal that modest climate shifts from relatively dry-seasonal to relatively wet-seasonal with a duration of several sequences (~600,000 to 1 million years) had a dominant impact on facies, sediment dispersal patterns, and sequence architecture. The climate shifts documented herein are intermediate, both in magnitude and duration, between previously documented longer-term climate shifts throughout much of the Pennsylvanian and shorter-term shifts described within individual sequences. Climate indicators are best preserved at sequence boundaries and in incised-valley fills of the lowstand systems tracts (LST). Relatively drier climate indicators include high-chroma paleosols, typically with pedogenic carbonates, and plant assemblages that are dominated by gymnosperms, mostly xerophytic walchian conifers. The associated valleys are small (4 km wide and >20 m deep), and dominated by quartz sandstones derived from distant source areas, reflecting large drainage networks. Transgressive systems tracts (TST) in all eight sequences generally are characterized by thin, extensive limestones and thin marine shales, suggesting that the dominant control on TST facies distribution was the sequestration of siliciclastic sediment in updip positions. Highstand systems tracts (HST) were significantly impacted by the intermediate-scale climate cycle in that HSTs from relatively drier climates consist of thin marine shales overlain by extensive, thick regressive limestones, whereas HSTs from relatively wetter climates are dominated by thick marine shales. Previously documented relative sea-level changes do not track the climate cycles, indicating that climate played a role distinct from that of relative sea-level change. These intermediate-scale modest climate shifts had a dominant impact on sequence architecture. This independent measure of climate and relative sea level may allow the testing of models of climate and sediment supply based on modern systems. Copyright © 2005, SEPM.
A Sequence-Independent, Unstructured Internal Ribosome Entry Site Is Responsible for Internal Expression of the Coat Protein of Turnip Crinkle Virus

PubMed Central

May, Jared; Johnson, Philip; Saleem, Huma

2017-01-01

ABSTRACT To maximize the coding potential of viral genomes, internal ribosome entry sites (IRES) can be used to bypass the traditional requirement of a 5′ cap and some/all of the associated translation initiation factors. Although viral IRES typically contain higher-order RNA structure, an unstructured sequence of about 84 nucleotides (nt) immediately upstream of the Turnip crinkle virus (TCV) coat protein (CP) open reading frame (ORF) has been found to promote internal expression of the CP from the genomic RNA (gRNA) both in vitro and in vivo. An absence of extensive RNA structure was predicted using RNA folding algorithms and confirmed by selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) RNA structure probing. Analysis of the IRES region in vitro by use of both the TCV gRNA and reporter constructs did not reveal any sequence-specific elements but rather suggested that an overall lack of structure was an important feature for IRES activity. The CP IRES is A-rich, independent of orientation, and strongly conserved among viruses in the same genus. The IRES was dependent on eIF4G, but not eIF4E, for activity. Low levels of CP accumulated in vivo in the absence of detectable TCV subgenomic RNAs, strongly suggesting that the IRES was active in the gRNA in vivo. Since the TCV CP also serves as the viral silencing suppressor, early translation of the CP from the viral gRNA is likely important for countering host defenses. Cellular mRNA IRES also lack extensive RNA structures or sequence conservation, suggesting that this viral IRES and cellular IRES may have similar strategies for internal translation initiation. IMPORTANCE Cap-independent translation is a common strategy among positive-sense, single-stranded RNA viruses for bypassing the host cell requirement of a 5′ cap structure. Viral IRES, in general, contain extensive secondary structure that is critical for activity. In contrast, we demonstrate that a region of viral RNA devoid of extensive secondary structure has IRES activity and produces low levels of viral coat protein in vitro and in vivo. Our findings may be applicable to cellular mRNA IRES that also have little or no sequences/structures in common. PMID:28179526
Species Identification of Clinical Prevotella Isolates by Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry

PubMed Central

Soetens, Oriane; De Bel, Annelies; Echahidi, Fedoua; Vancutsem, Ellen; Vandoorslaer, Kristof; Piérard, Denis

2012-01-01

The performance of matrix-assisted laser desorption–ionization time of flight mass spectrometry (MALDI-TOF MS) for species identification of Prevotella was evaluated and compared with 16S rRNA gene sequencing. Using a Bruker database, 62.7% of the 102 clinical isolates were identified to the species level and 73.5% to the genus level. Extension of the commercial database improved these figures to, respectively, 83.3% and 89.2%. MALDI-TOF MS identification of Prevotella is reliable but needs a more extensive database. PMID:22301022
Does Literacy Skill Level Predict Performance in Community College Courses: A Replication and Extension

ERIC Educational Resources Information Center

Allen, Nancy J.; DeLauro, Kimberly A.; Perry, Julia K.; Carman, Carol A.

2017-01-01

Previous research has found a positive relationship between students who had completed a sequence of developmental reading and writing courses and success in a reading-intensive college-level course. This study replicates and expands upon the previous research of Goldstein and Perin (2008) by utilizing a differently diverse sample and an…
Recombination of polynucleotide sequences using random or defined primers

DOEpatents

Arnold, Frances H.; Shao, Zhixin; Affholter, Joseph A.; Zhao, Huimin H; Giver, Lorraine J.

2000-01-01

A method for in vitro mutagenesis and recombination of polynucleotide sequences based on polymerase-catalyzed extension of primer oligonucleotides is disclosed. The method involves priming template polynucleotide(s) with random-sequence or defined-sequence primers to generate a pool of short DNA fragments with a low level of point mutations. The DNA fragments are subjected to denaturation followed by annealing and further enzyme-catalyzed DNA polymerization. This procedure is repeated a sufficient number of times to produce full-length genes which comprise mutants of the original template polynucleotides. These genes can be further amplified by the polymerase chain reaction and cloned into a vector for expression of the encoded proteins.
Recombination of polynucleotide sequences using random or defined primers

DOEpatents

Arnold, Frances H.; Shao, Zhixin; Affholter, Joseph A.; Zhao, Huimin; Giver, Lorraine J.

2001-01-01

A method for in vitro mutagenesis and recombination of polynucleotide sequences based on polymerase-catalyzed extension of primer oligonucleotides is disclosed. The method involves priming template polynucleotide(s) with random-sequence or defined-sequence primers to generate a pool of short DNA fragments with a low level of point mutations. The DNA fragments are subjected to denaturation followed by annealing and further enzyme-catalyzed DNA polymerization. This procedure is repeated a sufficient number of times to produce full-length genes which comprise mutants of the original template polynucleotides. These genes can be further amplified by the polymerase chain reaction and cloned into a vector for expression of the encoded proteins.
Sequence Requirements of the 5-Enolpyruvylshikimate-3-phosphate Synthase 5′-Upstream Region for Tissue-Specific Expression in Flowers and Seedlings.

PubMed Central

Benfey, PN; Takatsuji, H; Ren, L; Shah, DM; Chua, NH

1990-01-01

We have analyzed expression from deletion derivatives of the 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) 5′-upstream region in transgenic petunia flowers and seedlings. In seedlings, expression was strongest in root cortex cells and in trichomes. High-level expression in petals and in seedling roots was conferred by large (>500 base-pair) stretches of sequence, but was lost when smaller fragments were analyzed individually. This apparent requirement for extensive sequence suggests that combinations of cis-elements that are widely separated control tissue-specific expression from the EPSPS promoter. We have also used the high-level, petal-specific expression of the EPSPS promoter to change petal color in two mutant petunia lines. PMID:12354968
Foundations for a syntactic pattern recognition system for genomic DNA sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Searles, D.B.

1993-03-01

The goal of the proposed work is the creation of a software system that will perform sophisticated pattern recognition and related functions at a level of abstraction and with expressive power beyond current general-purpose pattern-matching systems for biological sequences; and with a more uniform language, environment, and graphical user interface, and with greater flexibility, extensibility, embeddability, and ability to incorporate other algorithms, than current special-purpose analytic software.
Sequence and facies architecture of the upper Blackhawk Formation and the Lower Castlegate Sandstone (Upper Cretaceous), Book Cliffs, Utah, USA

NASA Astrophysics Data System (ADS)

Yoshida, S.

2000-11-01

High-frequency stratigraphic sequences that comprise the Desert Member of the Blackhawk Formation, the Lower Castlegate Sandstone, and the Buck Tongue in the Green River area of Utah display changes in sequence architecture from marine deposits to marginal marine deposits to an entirely nonmarine section. Facies and sequence architecture differ above and below the regionally extensive Castlegate sequence boundary, which separates two low-frequency (10^6-year cyclicity) sequences. Below this surface, high-frequency sequences are identified and interpreted as comprising the highstand systems tract of the low-frequency Blackhawk sequence. Each high-frequency sequence has a local incised valley system on top of the wave-dominated delta, and coastal plain to shallow marine deposits are preserved. Above the Castlegate sequence boundary, in contrast, a regionally extensive sheet sandstone of fluvial to estuarine origin with laterally continuous internal erosional surfaces occurs. These deposits above the Castlegate sequence boundary are interpreted as the late lowstand to early transgressive systems tracts of the low-frequency Castlegate sequence. The base-level changes that generated both the low- and high-frequency sequences are attributed to crustal response to fluctuations in compressive intraplate stress on two different time scales. The low-frequency stratigraphic sequences are attributed to changes in the long-term regional subsidence rate and regional tilting of foreland basin fill. High-frequency sequences probably reflect the response of anisotropic basement to tectonism. Sequence architecture changes rapidly across the faulted margin of the underlying Paleozoic Paradox Basin. The high-frequency sequences are deeply eroded and stack above the Paradox Basin, but display less relief and become conformable updip. These features indicate that the area above the Paradox Basin was more prone to vertical structural movements during formation of the Blackhawk-Lower Castlegate succession.
Small subunit ribosomal RNA genes of tabanids and hippoboscids (Diptera: Brachycera): evolutionary relationships and comparison with other Diptera.

PubMed

Carreno, R A; Barta, J R

1998-11-01

The small subunit ribosomal RNA (SSU rRNA) genes of hippoboscid (Ornithoica vicina Walker) and tabanid (Chrysops niger Macquart) Diptera were sequenced to determine their phylogenetic position within the order and to determine whether or not extensive hypervariable regions in this gene are widespread in the Diptera. A parsimony analysis of an alignment containing 8 dipteran sequences produced a single most parsimonious tree that placed O. vicina as sister group to Drosophila melanogaster Meigen. The tabanid Chrysops niger was sister group to the asilomorphan taxa, and the sister group to the Brachycera was a Tipula sp. although this relationship was not supported by bootstrap analysis. The hippoboscid and tabanid sequences contain extensive hypervariable regions in the V2, V4, V6, and V7 regions as do other Diptera. When these regions of the alignment were excluded from the phylogenetic analysis, a single most parsimonious tree was found. This tree had an identical overall topology to the tree obtained from the total data set. The hypervariable regions in parts of the dipteran SSU rRNA genes were more extensive in the nematocerous dipteran sequences used in this study than in the other dipteran representatives; these hypervariable regions may be of more utility in inferring relationship among species and subspecies than at the suprageneric level.

Whole exome sequencing to estimate alloreactivity potential between donors and recipients in stem cell transplantation

PubMed Central

Sampson, Juliana K.; Sheth, Nihar U.; Koparde, Vishal N.; Scalora, Allison F.; Serrano, Myrna G.; Lee, Vladimir; Roberts, Catherine H.; Jameson-Lee, Max; Ferreira-Gonzalez, Andrea; Manjili, Masoud H.; Buck, Gregory A.; Neale, Michael C.; Toor, Amir A.

2016-01-01

Summary Whole exome sequencing (WES) was performed on stem cell transplant donor-recipient (D-R) pairs to determine the extent of potential antigenic variation at a molecular level. In a small cohort of D-R pairs, a high frequency of sequence variation was observed between the donor and recipient exomes independent of human leucocyte antigen (HLA) matching. Nonsynonymous, nonconservative single nucleotide polymorphisms were approximately twice as frequent in HLA-matched unrelated, compared with related D-R pairs. When mapped to individual chromosomes, these polymorphic nucleotides were uniformly distributed across the entire exome. In conclusion, WES reveals extensive nucleotide sequence variation in the exomes of HLA-matched donors and recipients. PMID:24749631
Foundations for a syntactic pattern recognition system for genomic DNA sequences. [Annual] report, 1 December 1991--31 March 1993

DOE Office of Scientific and Technical Information (OSTI.GOV)

Searles, D.B.

1993-03-01

The goal of the proposed work is the creation of a software system that will perform sophisticated pattern recognition and related functions at a level of abstraction and with expressive power beyond current general-purpose pattern-matching systems for biological sequences; and with a more uniform language, environment, and graphical user interface, and with greater flexibility, extensibility, embeddability, and ability to incorporate other algorithms, than current special-purpose analytic software.
Phylogenetic relationships and taxonomic revision of Paranoplocephala Lühe, 1910 sensu lato (Cestoda, Cyclophyllidea, Anoplocephalidae)

USDA-ARS's Scientific Manuscript database

An extensive phylogenetic analysis and genus-level taxonomic revision of Paranoplocephala Lühe, 1910 -like cestodes (Cyclophyllidea, Anoplocephalidae) are presented. The phylogenetic analysis is based on DNA sequences of two partial mitochondrial genes, i.e. cytochrome c oxidase subunit 1 (cox1) and...
Identification of Genes Related to Learning and Memory in the Brain Transcriptome of the Mollusc, "Hermissenda Crassicornis"

ERIC Educational Resources Information Center

Tamvacakis, Arianna N.; Senatore, Adriano; Katz, Paul S.

2015-01-01

The sea slug "Hermissenda crassicornis" (Mollusca, Gastropoda, Nudibranchia) has been studied extensively in associative learning paradigms. However, lack of genetic information previously hindered molecular-level investigations. Here, the "Hermissenda" brain transcriptome was sequenced and assembled de novo, producing 165,743…
Assessing information content and interactive relationships of subgenomic DNA sequences of the MHC using complexity theory approaches based on the non-extensive statistical mechanics

NASA Astrophysics Data System (ADS)

Karakatsanis, L. P.; Pavlos, G. P.; Iliopoulos, A. C.; Pavlos, E. G.; Clark, P. M.; Duke, J. L.; Monos, D. S.

2018-09-01

This study combines two independent domains of science, the high throughput DNA sequencing capabilities of Genomics and complexity theory from Physics, to assess the information encoded by the different genomic segments of exonic, intronic and intergenic regions of the Major Histocompatibility Complex (MHC) and identify possible interactive relationships. The dynamic and non-extensive statistical characteristics of two well characterized MHC sequences from the homozygous cell lines, PGF and COX, in addition to two other genomic regions of comparable size, used as controls, have been studied using the reconstructed phase space theorem and the non-extensive statistical theory of Tsallis. The results reveal similar non-linear dynamical behavior as far as complexity and self-organization features. In particular, the low-dimensional deterministic nonlinear chaotic and non-extensive statistical character of the DNA sequences was verified with strong multifractal characteristics and long-range correlations. The nonlinear indices repeatedly verified that MHC sequences, whether exonic, intronic or intergenic include varying levels of information and reveal an interaction of the genes with intergenic regions, whereby the lower the number of genes in a region, the less the complexity and information content of the intergenic region. Finally we showed the significance of the intergenic region in the production of the DNA dynamics. The findings reveal interesting content information in all three genomic elements and interactive relationships of the genes with the intergenic regions. The results most likely are relevant to the whole genome and not only to the MHC. These findings are consistent with the ENCODE project, which has now established that the non-coding regions of the genome remain to be of relevance, as they are functionally important and play a significant role in the regulation of expression of genes and coordination of the many biological processes of the cell.
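The non-extensive statistics mentioned in this abstract are built on the Tsallis entropy, S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to the Shannon entropy as q approaches 1. The following is only a minimal sketch of that formula, not the authors' analysis pipeline: the abstract does not say how the DNA sequences were mapped to probabilities, so using single-base frequencies (the hypothetical helper `base_frequencies`) is purely an illustrative assumption, and the Boltzmann constant is set to 1.

```python
import math
from collections import Counter


def tsallis_entropy(probabilities, q):
    """Tsallis entropy S_q = (1 - sum(p_i^q)) / (q - 1), with k = 1.

    For q == 1 the Shannon entropy (natural log) is returned, which is
    the limit of S_q as q -> 1.
    """
    probs = [p for p in probabilities if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def base_frequencies(seq):
    """Hypothetical helper: relative frequencies of A, C, G, T in a sequence.

    Assumption for illustration only; characters other than A/C/G/T are ignored.
    """
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T characters")
    return [counts[b] / total for b in "ACGT"]
```

For example, `tsallis_entropy(base_frequencies("ACGT"), 2)` evaluates the entropy of a uniform four-letter distribution at q = 2.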
ECB deacylase mutants

DOEpatents

Arnold, Frances H.; Shao, Zhixin; Zhao, Huimin; Giver, Lorraine J.

2002-01-01

A method for in vitro mutagenesis and recombination of polynucleotide sequences based on polymerase-catalyzed extension of primer oligonucleotides is disclosed. The method involves priming template polynucleotide(s) with random-sequences or defined-sequence primers to generate a pool of short DNA fragments with a low level of point mutations. The DNA fragments are subjected to denaturization followed by annealing and further enzyme-catalyzed DNA polymerization. This procedure is repeated a sufficient number of times to produce full-length genes which comprise mutants of the original template polynucleotides. These genes can be further amplified by the polymerase chain reaction and cloned into a vector for expression of the encoded proteins.
pyPaSWAS: Python-based multi-core CPU and GPU sequence alignment.

PubMed

Warris, Sven; Timal, N Roshan N; Kempenaar, Marcel; Poortinga, Arne M; van de Geest, Henri; Varbanescu, Ana L; Nap, Jan-Peter

2018-01-01

Our previously published CUDA-only application PaSWAS for Smith-Waterman (SW) sequence alignment of any type of sequence on NVIDIA-based GPUs is platform-specific and therefore adopted less than it could be. The OpenCL language is supported more widely and allows use on a variety of hardware platforms. Moreover, there is a need to promote the adoption of parallel computing in bioinformatics by making its use and extension simpler through more and better application of high-level languages commonly used in bioinformatics, such as Python. The novel application pyPaSWAS presents the parallel SW sequence alignment code fully packed in Python. It is a generic SW implementation running on several hardware platforms with multi-core systems and/or GPUs that provides accurate sequence alignments that also can be inspected for alignment details. Additionally, pyPaSWAS supports the affine gap penalty. Python libraries are used for automated system configuration, I/O and logging. This way, the Python environment will stimulate further extension and use of pyPaSWAS. pyPaSWAS presents an easy Python-based environment for accurate and retrievable parallel SW sequence alignments on GPUs and multi-core systems. The strategy of integrating Python with high-performance parallel compute languages to create a developer- and user-friendly environment should be considered for other computationally intensive bioinformatics algorithms.
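For readers unfamiliar with the Smith-Waterman algorithm named in this abstract, a minimal single-threaded sketch in plain Python follows. It is not the pyPaSWAS API and makes no use of GPUs or OpenCL. It also assumes a simple linear gap penalty rather than the affine penalty the abstract mentions, and the function name and the scoring values (`match`, `mismatch`, `gap`) are illustrative choices.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Uses a linear gap penalty (an assumption made to keep the sketch short).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; cells never drop below zero,
    # which is what makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `smith_waterman("TTACGTTT", "GGACGTGG")` aligns the shared `ACGT` core. Every cell depends only on its upper, left and upper-left neighbours, so the cells along an anti-diagonal are independent of one another, which is the usual starting point for parallelising this recurrence.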
Whole exome sequencing to estimate alloreactivity potential between donors and recipients in stem cell transplantation.

PubMed

Sampson, Juliana K; Sheth, Nihar U; Koparde, Vishal N; Scalora, Allison F; Serrano, Myrna G; Lee, Vladimir; Roberts, Catherine H; Jameson-Lee, Max; Ferreira-Gonzalez, Andrea; Manjili, Masoud H; Buck, Gregory A; Neale, Michael C; Toor, Amir A

2014-08-01

Whole exome sequencing (WES) was performed on stem cell transplant donor-recipient (D-R) pairs to determine the extent of potential antigenic variation at a molecular level. In a small cohort of D-R pairs, a high frequency of sequence variation was observed between the donor and recipient exomes independent of human leucocyte antigen (HLA) matching. Nonsynonymous, nonconservative single nucleotide polymorphisms were approximately twice as frequent in HLA-matched unrelated, compared with related D-R pairs. When mapped to individual chromosomes, these polymorphic nucleotides were uniformly distributed across the entire exome. In conclusion, WES reveals extensive nucleotide sequence variation in the exomes of HLA-matched donors and recipients. © 2014 John Wiley & Sons Ltd.
Pulseq: A rapid and hardware-independent pulse sequence prototyping framework.

PubMed

Layton, Kelvin J; Kroboth, Stefan; Jia, Feng; Littin, Sebastian; Yu, Huijun; Leupold, Jochen; Nielsen, Jon-Fredrik; Stöcker, Tony; Zaitsev, Maxim

2017-04-01

Implementing new magnetic resonance experiments, or sequences, often involves extensive programming on vendor-specific platforms, which can be time consuming and costly. This situation is exacerbated when research sequences need to be implemented on several platforms simultaneously, for example, at different field strengths. This work presents an alternative programming environment that is hardware-independent, open-source, and promotes rapid sequence prototyping. A novel file format is described to efficiently store the hardware events and timing information required for an MR pulse sequence. Platform-dependent interpreter modules convert the file to appropriate instructions to run the sequence on MR hardware. Sequences can be designed in high-level languages, such as MATLAB, or with a graphical interface. Spin physics simulation tools are incorporated into the framework, allowing for comparison between real and virtual experiments. Minimal effort is required to implement relatively advanced sequences using the tools provided. Sequences are executed on three different MR platforms, demonstrating the flexibility of the approach. A high-level, flexible and hardware-independent approach to sequence programming is ideal for the rapid development of new sequences. The framework is currently not suitable for large patient studies or routine scanning although this would be possible with deeper integration into existing workflows. Magn Reson Med 77:1544-1552, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Symmetry of proprioceptive sense in female soccer players.

PubMed

Iwańska, Dagmara; Karczewska, Magdalena; Madej, Anna; Urbanik, Czesław

2015-01-01

The purpose of the study was to assess the symmetry of proprioceptive sense among female soccer players when trying to reproduce isometric knee extensions (right and left) and to analyze the impact of a given level of muscle force on proprioception. The study involved 12 soccer players aged 19.5 ± 2.65 years. Soccer players performed a control measurement of a maximum 3s (knee at the 90°) position in the joint. Subsequently, 70%, 50%, and 30% of the maximum voluntary contraction (MVC) were all calculated and then reproduced by each subject with feedback. Next, the players reproduced the predefined muscle contraction values in three sequences: A - 50%, 70%, 30%; B - 50%, 30%, 70%; C - 70%, 30%, 50% of MVC without visual control. In every sequence, the participants found obtaining the value of 30% of MVC the most difficult. The value they reproduced most accurately was 70% of MVC. Both trial II and trial III demonstrated that the symmetry index SI significantly differed from values considered acceptable (SIRa). In each successive sequence the largest asymmetry occurred while reproducing the lowest values of MVC (30%) (p < 0.05). A high level of proprioceptive sense is important to soccer players due to the extensive overload associated with dynamic stops or changes in direction while running. Special attention should be paid to developing skills in sensing force of varying levels. It was much harder to reproduce the predefined values if there was no feedback.
Some New Sets of Sequences of Fuzzy Numbers with Respect to the Partial Metric

PubMed Central

Ozluk, Muharrem

2015-01-01

In this paper, we essentially deal with Köthe-Toeplitz duals of fuzzy level sets defined using a partial metric. Since the utilization of Zadeh's extension principle is quite difficult in practice, we prefer the idea of level sets in order to construct some classical notions. In this paper, we present the sets of bounded, convergent, and null series and the set of sequences of bounded variation of fuzzy level sets, based on the partial metric. We examine the relationships between these sets and their classical forms and give some properties including definitions, propositions, and various kinds of partial metric spaces of fuzzy level sets. Furthermore, we study some of their properties like completeness and duality. Finally, we obtain the Köthe-Toeplitz duals of fuzzy level sets with respect to the partial metric based on a partial ordering. PMID:25695102
Accurate phylogenetic classification of DNA fragments based on sequence composition

DOE Office of Scientific and Technical Information (OSTI.GOV)

McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis

2006-05-01

Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
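"Sequence composition" in this abstract refers to features computed from the fragment itself rather than from marker genes. As a rough illustration only, the sketch below computes normalised k-mer (oligonucleotide) frequencies for a DNA fragment. The abstract does not describe PhyloPythia's actual features or classifier, so the function name `kmer_composition`, the default `k=4`, and the choice to skip windows containing ambiguous bases are all assumptions.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq, k=4):
    """Return a dict mapping every A/C/G/T k-mer to its relative frequency.

    Windows containing characters other than A, C, G, T (for example N)
    are ignored. Illustrative sketch, not PhyloPythia's feature extraction.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}
```

A fixed-length vector such as `list(kmer_composition(fragment).values())` could then be passed to any standard supervised classifier trained on fragments of known taxonomic origin.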
An N-terminal peptide extension results in efficient expression, but not secretion, of a synthetic horseradish peroxidase gene in transgenic tobacco.

PubMed

Kis, Mihaly; Burbridge, Emma; Brock, Ian W; Heggie, Laura; Dix, Philip J; Kavanagh, Tony A

2004-03-01

Native horseradish (Armoracia rusticana) peroxidase, HRP (EC 1.11.1.7), isoenzyme C is synthesized with N-terminal and C-terminal peptide extensions, believed to be associated with protein targeting. This study aimed to explore the specific functions of these extensions, and to generate transgenic plants with expression patterns suitable for exploring the role of peroxidase in plant development and defence. Transgenic Nicotiana tabacum (tobacco) plants expressing different versions of a synthetic horseradish peroxidase, HRP, isoenzyme C gene were constructed. The gene was engineered to include additional sequences coding for either the natural N-terminal or the C-terminal extension or both. These constructs were placed under the control of a constitutive promoter (CaMV-35S) or the tobacco RUBISCO-SSU light inducible promoter (SSU) and introduced into tobacco using Agrobacterium-mediated transformation. To study the effects of the N- and C-terminal extensions, the localization of recombinant peroxidase was determined using biochemical and molecular techniques. Transgenic tobacco plants can exhibit a ten-fold increase in peroxidase activity compared with wild-type tobacco levels, and the majority of this activity is located in the symplast. The N-terminal extension is essential for the production of high levels of recombinant protein, while the C-terminal extension has little effect. Differences in levels of enzyme activity and recombinant protein are reflected in transcript levels. There is no evidence to support either preferential secretion or vacuolar targeting of recombinant peroxidase in this heterologous expression system. This leads us to question the postulated targeting roles of these peptide extensions. The N-terminal extension is essential for high level expression and appears to influence transcript stability or translational efficiency. 
Plants have been generated with greatly elevated cytosolic peroxidase activity, and smaller increases in apoplastic activity. These will be valuable for exploring the role of these enzymes in stress amelioration and plant development.
An N‐terminal Peptide Extension Results in Efficient Expression, but not Secretion, of a Synthetic Horseradish Peroxidase Gene in Transgenic Tobacco

PubMed Central

KIS, MIHALY; BURBRIDGE, EMMA; BROCK, IAN W.; HEGGIE, LAURA; DIX, PHILIP J.; KAVANAGH, TONY A.

2004-01-01

• Background and Aims Native horseradish (Armoracia rusticana) peroxidase, HRP (EC 1.11.1.7), isoenzyme C is synthesized with N‐terminal and C‐terminal peptide extensions, believed to be associated with protein targeting. This study aimed to explore the specific functions of these extensions, and to generate transgenic plants with expression patterns suitable for exploring the role of peroxidase in plant development and defence. • Methods Transgenic Nicotiana tabacum (tobacco) plants expressing different versions of a synthetic horseradish peroxidase, HRP, isoenzyme C gene were constructed. The gene was engineered to include additional sequences coding for either the natural N‐terminal or the C‐terminal extension or both. These constructs were placed under the control of a constitutive promoter (CaMV‐35S) or the tobacco RUBISCO‐SSU light inducible promoter (SSU) and introduced into tobacco using Agrobacterium‐mediated transformation. To study the effects of the N‐ and C‐terminal extensions, the localization of recombinant peroxidase was determined using biochemical and molecular techniques. • Key Results Transgenic tobacco plants can exhibit a ten‐fold increase in peroxidase activity compared with wild‐type tobacco levels, and the majority of this activity is located in the symplast. The N‐terminal extension is essential for the production of high levels of recombinant protein, while the C‐terminal extension has little effect. Differences in levels of enzyme activity and recombinant protein are reflected in transcript levels. • Conclusions There is no evidence to support either preferential secretion or vacuolar targeting of recombinant peroxidase in this heterologous expression system. This leads us to question the postulated targeting roles of these peptide extensions. The N‐terminal extension is essential for high level expression and appears to influence transcript stability or translational efficiency. 
Plants have been generated with greatly elevated cytosolic peroxidase activity, and smaller increases in apoplastic activity. These will be valuable for exploring the role of these enzymes in stress amelioration and plant development. PMID:14749254
Sequence analysis of a bitter taste receptor gene repertoires in different ruminant species

USDA-ARS's Scientific Manuscript database

Bitter taste has been extensively studied in mammalian species and is associated with sensitivity to toxins and with food choices that avoid dangerous substances in the diet. At the molecular level, bitter compounds are sensed by bitter taste receptor proteins (T2R) present at the surface of taste r...
Identifying the North American plum species phylogenetic signal using nuclear, mitochondrial, and chloroplast DNA markers

USDA-ARS's Scientific Manuscript database

Premise of the study: Prunus L. phylogeny has been extensively studied using cpDNA sequences. CpDNA has a slow rate of evolution, which is beneficial for determining species relationships at a deeper level. However, a limitation of the chloroplast-based phylogenies is its transfer by interspecific hybridizat...
Parallel gene analysis with allele-specific padlock probes and tag microarrays

PubMed Central

Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats

2003-01-01

Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
Early Miocene sequence development across the New Jersey margin

USGS Publications Warehouse

Monteverde, D.H.; Mountain, Gregory S.; Miller, K.G.

2008-01-01

Sequence stratigraphy provides an understanding of the interplay between eustasy, sediment supply and accommodation in the sedimentary construction of passive margins. We used this approach to follow the early to middle Miocene growth of the New Jersey margin and analyse the connection between relative changes of sea level and variable sediment supply. Eleven candidate sequence boundaries were traced in high-resolution multi-channel seismic profiles across the inner margin and matched to geophysical log signatures and lithologic changes in ODP Leg 150X onshore coreholes. Chronologies at these drill sites were then used to assign ages to the intervening seismic sequences. We conclude that the regional and global correlation of early Miocene sequences suggests a dominant role of global sea-level change but margin progradation was controlled by localized sediment contribution and that local conditions played a large role in sequence formation and preservation. Lowstand deposits were regionally restricted and their locations point to both single and multiple sediment sources. The distribution of highstand deposits, by contrast, documents redistribution by along shelf currents. We find no evidence that sea level fell below the elevation of the clinoform rollover, and the existence of extensive lowstand deposits seaward of this inflection point indicates efficient cross-shelf sediment transport mechanisms despite the apparent lack of well-developed fluvial drainage. © 2008 The Authors. Journal compilation © 2008 Blackwell Publishing.
PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities.

PubMed

Troshin, Peter V; Postis, Vincent Lg; Ashworth, Denise; Baldwin, Stephen A; McPherson, Michael J; Barton, Geoffrey J

2011-03-07

Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.
Production of Supra-regular Spatial Sequences by Macaque Monkeys.

PubMed

Jiang, Xinjian; Long, Tenghai; Cao, Weicong; Li, Junru; Dehaene, Stanislas; Wang, Liping

2018-06-18

Understanding and producing embedded sequences in language, music, or mathematics, is a central characteristic of our species. These domains are hypothesized to involve a human-specific competence for supra-regular grammars, which can generate embedded sequences that go beyond the regular sequences engendered by finite-state automata. However, is this capacity truly unique to humans? Using a production task, we show that macaque monkeys can be trained to produce time-symmetrical embedded spatial sequences whose formal description requires supra-regular grammars or, equivalently, a push-down stack automaton. Monkeys spontaneously generalized the learned grammar to novel sequences, including longer ones, and could generate hierarchical sequences formed by an embedding of two levels of abstract rules. Compared to monkeys, however, preschool children learned the grammars much faster using a chunking strategy. While supra-regular grammars are accessible to nonhuman primates through extensive training, human uniqueness may lie in the speed and learning strategy with which they are acquired. Copyright © 2018 Elsevier Ltd. All rights reserved.

DNA methylation dynamics during early plant life.

PubMed

Bouyer, Daniel; Kramdi, Amira; Kassam, Mohamed; Heese, Maren; Schnittger, Arp; Roudier, François; Colot, Vincent

2017-09-25

Cytosine methylation is crucial for gene regulation and silencing of transposable elements in mammals and plants. While this epigenetic mark is extensively reprogrammed in the germline and early embryos of mammals, the extent to which DNA methylation is reset between generations in plants remains largely unknown. Using Arabidopsis as a model, we uncovered distinct DNA methylation dynamics over transposable element sequences during the early stages of plant development. Specifically, transposable elements and their relics show invariably high methylation at CG sites but increasing methylation at CHG and CHH sites. This non-CG methylation culminates in mature embryos, where it reaches saturation for a large fraction of methylated CHH sites, compared to the typical 10-20% methylation level observed in seedlings or adult plants. Moreover, the increase in CHH methylation during embryogenesis matches the hypomethylated state in the early endosperm. Finally, we show that interfering with the embryo-to-seedling transition results in the persistence of high CHH methylation levels after germination, specifically over sequences that are targeted by the RNA-directed DNA methylation (RdDM) machinery. Our findings indicate the absence of extensive resetting of DNA methylation patterns during early plant life and point instead to an important role of RdDM in reinforcing DNA methylation of transposable element sequences in every cell of the mature embryo. Furthermore, we provide evidence that this elevated RdDM activity is a specific property of embryogenesis.
Use of 16S rRNA gene for identification of a broad range of clinically relevant bacterial pathogens

DOE PAGES

Srinivasan, Ramya; Karaoz, Ulas; Volegova, Marina; ...

2015-02-06

According to World Health Organization statistics of 2011, infectious diseases remain in the top five causes of mortality worldwide. However, despite sophisticated research tools for microbial detection, rapid and accurate molecular diagnostics for identification of infection in humans have not been extensively adopted. Time-consuming culture-based methods remain to the forefront of clinical microbial detection. The 16S rRNA gene, a molecular marker for identification of bacterial species, is ubiquitous to members of this domain and, thanks to ever-expanding databases of sequence information, a useful tool for bacterial identification. In this study, we assembled an extensive repository of clinical isolates (n = 617), representing 30 medically important pathogenic species and originally identified using traditional culture-based or non-16S molecular methods. This strain repository was used to systematically evaluate the ability of 16S rRNA for species level identification. To enable the most accurate species level classification based on the paucity of sequence data accumulated in public databases, we built a Naïve Bayes classifier representing a diverse set of high-quality sequences from medically important bacterial organisms. We show that for species identification, a model-based approach is superior to an alignment based method. Overall, between 16S gene based and clinical identities, our study shows a genus-level concordance rate of 96% and a species-level concordance rate of 87.5%. We point to multiple cases of probable clinical misidentification with traditional culture based identification across a wide range of gram-negative rods and gram-positive cocci as well as common gram-negative cocci.
Use of 16S rRNA Gene for Identification of a Broad Range of Clinically Relevant Bacterial Pathogens

PubMed Central

Srinivasan, Ramya; Karaoz, Ulas; Volegova, Marina; MacKichan, Joanna; Kato-Maeda, Midori; Miller, Steve; Nadarajan, Rohan; Brodie, Eoin L.; Lynch, Susan V.

2015-01-01

According to World Health Organization statistics of 2011, infectious diseases remain in the top five causes of mortality worldwide. However, despite sophisticated research tools for microbial detection, rapid and accurate molecular diagnostics for identification of infection in humans have not been extensively adopted. Time-consuming culture-based methods remain to the forefront of clinical microbial detection. The 16S rRNA gene, a molecular marker for identification of bacterial species, is ubiquitous to members of this domain and, thanks to ever-expanding databases of sequence information, a useful tool for bacterial identification. In this study, we assembled an extensive repository of clinical isolates (n = 617), representing 30 medically important pathogenic species and originally identified using traditional culture-based or non-16S molecular methods. This strain repository was used to systematically evaluate the ability of 16S rRNA for species level identification. To enable the most accurate species level classification based on the paucity of sequence data accumulated in public databases, we built a Naïve Bayes classifier representing a diverse set of high-quality sequences from medically important bacterial organisms. We show that for species identification, a model-based approach is superior to an alignment based method. Overall, between 16S gene based and clinical identities, our study shows a genus-level concordance rate of 96% and a species-level concordance rate of 87.5%. We point to multiple cases of probable clinical misidentification with traditional culture based identification across a wide range of gram-negative rods and gram-positive cocci as well as common gram-negative cocci. PMID:25658760
The contribution of alu elements to mutagenic DNA double-strand break repair.

PubMed

Morales, Maria E; White, Travis B; Streva, Vincent A; DeFreece, Cecily B; Hedges, Dale J; Deininger, Prescott L

2015-03-01

Alu elements make up the largest family of human mobile elements, numbering 1.1 million copies and comprising 11% of the human genome. As a consequence of evolution and genetic drift, Alu elements of various sequence divergence exist throughout the human genome. Alu/Alu recombination has been shown to cause approximately 0.5% of new human genetic diseases and contribute to extensive genomic structural variation. To begin understanding the molecular mechanisms leading to these rearrangements in mammalian cells, we constructed Alu/Alu recombination reporter cell lines containing Alu elements ranging in sequence divergence from 0%-30% that allow detection of both Alu/Alu recombination and large non-homologous end joining (NHEJ) deletions that range from 1.0 to 1.9 kb in size. Introduction of as little as 0.7% sequence divergence between Alu elements resulted in a significant reduction in recombination, which indicates even small degrees of sequence divergence reduce the efficiency of homology-directed DNA double-strand break (DSB) repair. Further reduction in recombination was observed in a sequence divergence-dependent manner for diverged Alu/Alu recombination constructs with up to 10% sequence divergence. With greater levels of sequence divergence (15%-30%), we observed a significant increase in DSB repair due to a shift from Alu/Alu recombination to variable-length NHEJ which removes sequence between the two Alu elements. This increase in NHEJ deletions depends on the presence of Alu sequence homeology (similar but not identical sequences). Analysis of recombination products revealed that Alu/Alu recombination junctions occur more frequently in the first 100 bp of the Alu element within our reporter assay, just as they do in genomic Alu/Alu recombination events. 
This is the first extensive study characterizing the influence of Alu element sequence divergence on DNA repair, which will inform predictions regarding the effect of Alu element sequence divergence on both the rate and nature of DNA repair events.
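To make the divergence percentages in this abstract concrete, here is a minimal sketch of how percent sequence divergence can be computed for two pre-aligned sequences of equal length. It is an illustration only: the function name and the toy sequences are hypothetical, gaps are not treated specially, and nothing here is taken from the authors' own pipeline.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percent of mismatched positions between two pre-aligned sequences.

    Assumption: both sequences are already aligned to the same length;
    this sketch does no alignment and treats every differing character
    (including a gap symbol) as one mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return 100.0 * mismatches / len(seq_a)


# Toy example (hypothetical sequences): one mismatch in ten positions.
print(percent_divergence("ACGTACGTAC", "ACGTACGTAA"))  # 10.0
```

On this measure, two mismatches across a roughly 300 bp Alu element correspond to about 0.7% divergence, the smallest non-zero level reported above.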
Completing the Task Procedure or Focusing on Form: Contextualizing Grammar Instruction via Task-Based Teaching

ERIC Educational Resources Information Center

Saraç, Hatice Sezgi

2018-01-01

This study aimed to compare two distinct methodologies of grammar instruction: task-based and form-focused teaching. During the application procedure, which lasted one academic term, two groups of tertiary-level learners (N = 53) were exposed to the same sequence of target structures, extensive writing activities and evaluation…
Characterization of kinetoplast DNA from Phytomonas serpens.

PubMed

Sá-Carvalho, D; Perez-Morga, D; Traub-Cseko, Y M

1993-01-01

The restriction enzyme digestion of kinetoplast DNA from four Phytomonas serpens isolates shows an overall similar band pattern. One minicircle from isolate 30T was cloned and sequenced, showing low levels of homology but the same general features and organization as described for minicircles of other trypanosomatids. Extensive regions of the minicircle are composed of G and T on the H strand. These regions are very repetitive and similar to regions in a minicircle of Crithidia oncopelti and to telomeric sequences of Saccharomyces cerevisiae. Conserved Sequence Block 3, present in all trypanosomatids, is one nucleotide different from the consensus in P. serpens and provides a basis to differentiate P. serpens from other trypanosomatids. Electron microscopy of kinetoplast DNA revealed a network with organization similar to other trypanosomatids, and the measurement of minicircles confirmed the size of about 1.45 kb of the sequenced minicircle.
The mass-lifetime relation

NASA Astrophysics Data System (ADS)

LoPresto, Michael C.

2018-05-01

In a recent "AstroNote," I described a simple exercise on the mass-luminosity relation for main sequence stars as an example of exposing students in a general education science course of lower mathematical level to the use of quantitative skills such as collecting and analyzing data. Here I present another attempt at a meaningful experience for such students that again involves both the gathering and analysis of numerical data and comparison with accepted results, this time on the relationship between the masses and lifetimes of main sequence stars. This experiment can stand alone or be used as an extension of the previous mass-luminosity relationship experiment.
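For readers unfamiliar with the relation, the usual textbook scaling can be sketched as follows. The exponent 3.5 in the mass-luminosity relation and the 10 Gyr solar lifetime are standard approximations assumed here; neither value is stated in the abstract.

```latex
% Lifetime scales as available fuel (mass) divided by burn rate (luminosity).
t_{\mathrm{MS}} \propto \frac{M}{L}, \qquad L \propto M^{3.5}
\;\;\Longrightarrow\;\;
t_{\mathrm{MS}} \approx 10\,\mathrm{Gyr}\,\left(\frac{M}{M_\odot}\right)^{-2.5}
% Worked example: M = 2 M_sun gives 10 Gyr x 2^{-2.5} \approx 1.8 Gyr.
```

Under these assumptions a star of twice the Sun's mass spends roughly 1.8 Gyr on the main sequence.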
Typing Clostridium difficile strains based on tandem repeat sequences

PubMed Central

2009-01-01

Background Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124
Analysis of human herpesvirus-6 IE1 sequence variation in clinical samples.

PubMed

Stanton, Richard; Wilkinson, Gavin W G; Fox, Julie D

2003-12-01

Herpesvirus immediate early (IE) proteins are known to play key roles in establishing productive infections, regulating reactivation from latency, and creating a cellular environment favourable to viral replication. Human herpesvirus-6 (HHV-6) IE genes have not been studied as intensively as their homologues in the prototype betaherpesvirus human cytomegalovirus (HCMV). Whilst the HCMV IE1 gene is relatively conserved, early studies indicated that HHV-6 IE1 exhibited a high level of sequence variation between HHV-6A and HHV-6B isolates, although the observation was based primarily on virus stocks that had been isolated and propagated in vitro. In this study, we investigated the level of HHV-6 IE1 sequence variation in vivo by direct sequencing of circulating virus in clinical samples without prior in vitro culture. Sequences exactly matching those reported for reference HHV-6 isolates were identified in clinical samples, thus the HHV-6 laboratory strains used in the majority of in vitro studies appear to be representative of virus circulating in vivo with respect to the IE1 gene. The HHV-6 IE1 sequence is also conserved in reference strains that had been passaged extensively in vitro. The high degree of divergence between variant A and B type IE1 sequences was confirmed, but interestingly HHV-6B IE1 sequences were observed to further segregate into two distinct subgroups, with the laboratory strains Z29 and HST representative of these two subgroups. Within each HHV-6B subgroup, a remarkably high level of homology was observed. Thus the HHV-6 IE1 sequence appears highly stable, underlining its potential importance to the viral life cycle. Copyright 2003 Wiley-Liss, Inc.
B-chromosome systems in the greater glider, Petauroides volans (Marsupialia: Pseudocheiridae). II. Investigation of B-chromosome DNA sequences isolated by micromanipulation and PCR.

PubMed

McQuade, L R; Hill, R J; Francis, D

1994-01-01

B chromosomes, despite their common occurrence throughout the animal and plant kingdoms, have not been investigated extensively at the molecular level. While the majority of B chromosomes occurring in animals have been described as heterochromatic, only a few researchers have examined the DNA of these chromosomes beyond this gross cytological level. This is the case in the largest of the gliding marsupial possums, the greater glider, Petauroides volans. To examine the molecular composition and localization of B-chromosome DNA sequences in P. volans, a combination of micromanipulation and the polymerase chain reaction was used in this study to isolate and then amplify the DNA of the B chromosomes. Localization of the isolated B-chromosome sequences to metaphase chromosomes was investigated using fluorescence in situ hybridization. The B chromosomes in this species are shown to be composed of a heterogeneous mixture of sequences, some of which are unique to the B chromosomes, while others exhibit homology to the centromeric regions of the autosomal complement.
76 FR 37241 - Airworthiness Directives; Airbus Model A318, A319, A320, and A321 Series Airplanes

Federal Register 2010, 2011, 2012, 2013, 2014

2011-06-27

... Aircraft Monitoring] warnings during the landing gear retraction or extension sequence. * * * * * This... [Electronic Centralised Aircraft Monitoring] warnings during the landing gear retraction or extension sequence...
Thematization of Derivative Schema in University Students: Nuances in Constructing Relations between a Function's Successive Derivatives

ERIC Educational Resources Information Center

Fuentealba, Claudio; Sánchez-Matamoros, Gloria; Badillo, Edelmira; Trigueros, María

2017-01-01

This study is part of a more extensive research project that addresses the understanding of the derivative concept in university students with prior instruction in differential calculus. In particular, we focus on the analysis of students' responses to a sequence of tasks that require a high level of understanding of the concept, and complement…
MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

PubMed Central

Lukjancenko, Oksana; Thomsen, Martin Christen Frølund; Maddalena Sperotto, Maria; Lund, Ole; Møller Aarestrup, Frank; Sicheritz-Pontén, Thomas

2017-01-01

An increasing number of species and gene identification studies rely on next generation sequence analysis of either single isolate or metagenomics samples. Several methods are available to perform taxonomic annotations, and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next generation sequence data and perform reference based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in-vitro bacterial mock community sample comprised of 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web-version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets. PMID:28467460
MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads.

PubMed

Petersen, Thomas Nordahl; Lukjancenko, Oksana; Thomsen, Martin Christen Frølund; Maddalena Sperotto, Maria; Lund, Ole; Møller Aarestrup, Frank; Sicheritz-Pontén, Thomas

2017-01-01

An increasing number of species and gene identification studies rely on next generation sequence analysis of either single isolate or metagenomics samples. Several methods are available to perform taxonomic annotations, and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next generation sequence data and perform reference based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in-vitro bacterial mock community sample comprised of 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web-version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets.
PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

PubMed Central

2011-01-01

Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349
Distinguishing Functional DNA Words; A Method for Measuring Clustering Levels

NASA Astrophysics Data System (ADS)

Moghaddasi, Hanieh; Khalifeh, Khosrow; Darooneh, Amir Hossein

2017-01-01

Functional DNA sub-sequences and genome elements are spatially clustered through the genome just as keywords are in literary texts. Therefore, some of the methods for ranking words in texts can also be used to compare different DNA sub-sequences. In analogy with literary texts, here we claim that the distribution of distances between successive sub-sequences (words) is q-exponential, which is the distribution function in non-extensive statistical mechanics. Thus the q-parameter can be used as a measure of word clustering levels. Here, we analyzed the distribution of distances between consecutive occurrences of the 16 possible dinucleotides in human chromosomes to obtain their corresponding q-parameters. We found that CG, a biologically important two-letter word because of its methylation, has the highest clustering level. This finding shows the predictive ability of the method in biology. We also proposed that chromosome 18, with the largest value of the q-parameter for promoters of genes, is more sensitive to diet and lifestyle. We extended our study to compare the genomes of some selected organisms and concluded that the clustering level of CGs increases in higher evolutionary organisms compared to lower ones.
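The raw quantity behind this analysis, the distance between consecutive occurrences of each dinucleotide, can be sketched as below. This is a hedged illustration: the function name and the toy sequence are hypothetical, overlapping occurrences are counted (an assumption the abstract does not settle), and the q-exponential fit itself is not attempted.

```python
from collections import defaultdict


def dinucleotide_gaps(sequence: str) -> dict[str, list[int]]:
    """Distances between consecutive start positions of each dinucleotide.

    Assumption: occurrences may overlap, so "AAA" contains "AA" at
    positions 0 and 1 and contributes a gap of 1.
    """
    seq = sequence.upper()
    last_seen: dict[str, int] = {}
    gaps: dict[str, list[int]] = defaultdict(list)
    for i in range(len(seq) - 1):
        word = seq[i:i + 2]
        if word in last_seen:
            # Record the distance back to the previous occurrence.
            gaps[word].append(i - last_seen[word])
        last_seen[word] = i
    return dict(gaps)


# Toy sequence (hypothetical): CG starts at positions 1, 4 and 7.
gaps = dinucleotide_gaps("ACGTCGACG")
print(gaps["CG"])  # [3, 3]
```

A histogram of each gap list is the empirical distribution to which a q-exponential would then be fitted, with the fitted q serving as the clustering measure described above.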
(S)-3-hydroxy-3-methylglutaryl coenzyme A reductase, a product of the mva operon of Pseudomonas mevalonii, is regulated at the transcriptional level.

PubMed Central

Wang, Y L; Beach, M J; Rodwell, V W

1989-01-01

We have cloned and sequenced a 505-base-pair (bp) segment of DNA situated upstream of mvaA, the structural gene for (S)-3-hydroxy-3-methylglutaryl coenzyme A reductase (EC 1.1.1.88) of Pseudomonas mevalonii. The DNA segment that we characterized includes the promoter region for the mva operon. Nuclease S1 mapping and primer extension analysis showed that mvaA is the promoter-proximal gene of the mva operon. Transcription initiates at -56 bp relative to the first A (+1) of the translation start site. Transcription in vivo was induced by mevalonate. Structural features of the mva promoter region include an 80-bp A + T-rich region, and -12, -24 consensus sequences that resemble sequences of sigma 54 promoters in enteric organisms. The relative amplitudes of catalytic activity, enzyme protein, and mvaA mRNA are consistent with a model of regulation of this operon at the transcriptional level. PMID:2477360
Characterization of a prototype strain of hepatitis E virus.

PubMed

Tsarev, S A; Emerson, S U; Reyes, G R; Tsareva, T S; Legters, L J; Malik, I A; Iqbal, M; Purcell, R H

1992-01-15

A strain of hepatitis E virus (SAR-55) implicated in an epidemic of enterically transmitted non-A, non-B hepatitis, now called hepatitis E, was characterized extensively. Six cynomolgus monkeys (Macaca fascicularis) were infected with a strain of hepatitis E virus from Pakistan. Reverse transcription-polymerase chain reaction was used to determine the pattern of virus shedding in feces, bile, and serum relative to hepatitis and induction of specific antibodies. Virtually the entire genome of SAR-55 (7195 nucleotides) was sequenced. Comparison of the sequence of SAR-55 with that of a Burmese strain revealed a high level of homology except for one region encoding 100 amino acids of a putative nonstructural polyprotein. Identification of this region as hypervariable was obtained by partial sequencing of a third isolate of hepatitis E virus from Kirgizia.
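Le Silurien de la région d'Oulad Abbou (Meseta occidentale, Maroc) : une sédimentation péritidale sous contrôle tectonique [The Silurian of the Oulad Abbou region (Western Meseta, Morocco): peritidal sedimentation under tectonic control]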

NASA Astrophysics Data System (ADS)

Attou, Ahmed; Hamoumi, Naima

2004-07-01

In the Oulad Abbou syncline, western coastal Meseta, the Silurian deposits exhibit siliciclastic or mixed siliciclastic/carbonate tidal facies that recorded alkaline basalt flows and syn-sedimentary deformations. These facies are stacked into peritidal shallowing-upward sequences reflecting the evolution from an infratidal to a supratidal environment. These sequences recorded low-amplitude and high-frequency sea-level variations. The build-up of these rhythmic sequences is related to extensional tectonics that allowed the development of a platform isolated from extensive siliciclastic influx. This tectonic event is well recorded in the palaeogeographic evolution of the northern Gondwana platform during the Lower Palaeozoic. To cite this article: A. Attou, N. Hamoumi, C. R. Geoscience 336 (2004).
Reverse Genetics and High Throughput Sequencing Methodologies for Plant Functional Genomics

PubMed Central

Ben-Amar, Anis; Daldoul, Samia; Reustle, Götz M.; Krczal, Gabriele; Mliki, Ahmed

2016-01-01

In the post-genomic era, increasingly sophisticated genetic tools are being developed with the long-term goal of understanding how the coordinated activity of genes gives rise to a complex organism. With the advent of next generation sequencing associated with effective computational approaches, a wide variety of plant species have been fully sequenced, giving a wealth of sequence data on the structure and organization of plant genomes. Since thousands of gene sequences are already known, recently developed functional genomics approaches provide powerful tools to analyze plant gene functions through various gene manipulation technologies. Integration of different omics platforms along with gene annotation and computational analysis may provide a complete view at the systems biology level. Extensive investigations of reverse genetics methodologies have been deployed for assigning biological function to a specific gene or gene product. We provide here an updated overview of these high-throughput strategies, highlighting recent advances in the knowledge of functional genomics in plants. PMID:28217003

Differences in a ribosomal DNA sequence of Strongylus species allows identification of single eggs.

PubMed

Campbell, A J; Gasser, R B; Chilton, N B

1995-03-01

In the current study, molecular techniques were evaluated for the species identification of individual strongyle eggs. Adult worms of Strongylus edentatus, S. equinus and S. vulgaris were collected at necropsy from horses from Australia and the U.S.A. Genomic DNA was isolated and a ribosomal transcribed spacer (ITS-2) amplified and sequenced using polymerase chain reaction (PCR) techniques. The length of the ITS-2 sequence of S. edentatus, S. equinus and S. vulgaris ranged between 217 and 235 nucleotides. Extensive sequence analysis demonstrated a low degree (0-0.9%) of intraspecific variation in the ITS-2 for the Strongylus species examined, whereas the levels of interspecific differences (13-29%) were significantly greater. Interspecific differences in the ITS-2 sequences allowed unequivocal species identification of single worms and eggs using PCR-linked restriction fragment length polymorphism. These results demonstrate the potential of the ribosomal spacers as genetic markers for species identification of single strongyle eggs from horse faeces.
Testing Extension Services through AKAP Models

ERIC Educational Resources Information Center

De Rosa, Marcello; Bartoli, Luca; La Rocca, Giuseppe

2014-01-01

Purpose: The aim of the paper is to analyse the attitude of Italian farms in gaining access to agricultural extension services (AES). Design/methodology/approach: The ways Italian farms use AES are described through the AKAP (Awareness, Knowledge, Adoption, Product) sequence. This article investigated the AKAP sequence by submitting a…
FlyBase: genes and gene models

PubMed Central

Drysdale, Rachel A.; Crosby, Madeline A.

2005-01-01

FlyBase (http://flybase.org) is the primary repository of genetic and molecular data of the insect family Drosophilidae. For the most extensively studied species, Drosophila melanogaster, a wide range of data are presented in integrated formats. Data types include mutant phenotypes, molecular characterization of mutant alleles and aberrations, cytological maps, wild-type expression patterns, anatomical images, transgenic constructs and insertions, sequence-level gene models and molecular classification of gene product functions. There is a growing body of data for other Drosophila species; this is expected to increase dramatically over the next year, with the completion of draft-quality genomic sequences of an additional 11 Drosophila species. PMID:15608223
Depositional facies, environments and sequence stratigraphic interpretation of the Middle Triassic-Lower Cretaceous (pre-Late Albian) succession in Arif El-Naga anticline, northeast Sinai, Egypt

NASA Astrophysics Data System (ADS)

El-Azabi, M. H.; El-Araby, A.

2005-01-01

The Middle Triassic-Lower Cretaceous (pre-Late Albian) succession of Arif El-Naga anticline comprises various distinctive facies and environments that are connected with eustatic relative sea-level changes, local/regional tectonism, variable sediment influx and base-level changes. It displays six unconformity-bounded depositional sequences. The Triassic deposits are divided into a lower clastic facies (early Middle Triassic sequence) and an upper carbonate unit (late Middle- and latest Middle/early Late Triassic sequences). The early Middle Triassic sequence consists of sandstone with shale/mudstone interbeds that formed under variable regimes, ranging from braided fluvial, lower shoreface to beach foreshore. The marine part of this sequence marks retrogradational and progradational parasequences of transgressive- and highstand systems tract deposits respectively. Deposition has taken place under warm semi-arid climate and a steady supply of clastics. The late Middle- and latest Middle/early Late Triassic sequences are carbonate facies developed on an extensive shallow marine shelf under dry-warm climate. The late Middle Triassic sequence includes retrogradational shallow subtidal oyster rudstone and progradational lower intertidal lime-mudstone parasequences that define the transgressive- and highstand systems tracts respectively. It terminates with upper intertidal oncolitic packstone with bored upper surface. The next latest Middle/early Late Triassic sequence is marked by lime-mudstone, packstone/grainstone and algal stromatolitic bindstone with minor shale/mudstone. These lower intertidal/shallow subtidal deposits of a transgressive-systems tract are followed upward by progradational highstand lower intertidal lime-mudstone deposits. The overlying Jurassic deposits encompass two different sequences. 
The Lower Jurassic sequence is made up of intercalating lower intertidal lime-mudstone and wave-dominated beach foreshore sandstone which formed during a short period of rising sea-level with a relative increase in clastic supply. The Middle-Upper Jurassic sequence is represented by cycles of cross-bedded sandstone topped with thin mudstone that accumulated by northerly flowing braided-streams accompanying regional uplift of the Arabo-Nubian shield. It is succeeded by another regressive fluvial sequence of Early Cretaceous age due to a major eustatic sea-level fall. The Lower Cretaceous sequence is dominated by sandy braided-river deposits with minor overbank fines and basal debris flow conglomerate.
An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies.

PubMed

Dai, Hongying; Wu, Guodong; Wu, Michael; Zhi, Degui

2016-01-01

Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker-single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using the Sequence Kernel Association Test (SKAT). The second stage is at the pathway level, where we perform a correlated Lancaster procedure to detect joint effects from multiple genes within a pathway. We show that the Lancaster procedure is optimal in Bahadur efficiency among all combined p-value methods. The Bahadur efficiency,[Formula: see text], compares sample sizes among different statistical tests when signals become sparse in sequencing data, i.e. ε →0. The optimal Bahadur efficiency ensures that the Lancaster procedure asymptotically requires a minimal sample size to detect sparse signals ([Formula: see text]). The Lancaster procedure can also be applied to meta-analysis. Extensive empirical assessments of exome sequencing data show that the proposed method outperforms Gene Set Enrichment Analysis (GSEA). We applied the competitive Lancaster procedure to meta-analysis data generated by the Global Lipids Genetics Consortium to identify pathways significantly associated with high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, and total cholesterol.
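The second-stage combination can be illustrated with the classical Lancaster procedure for independent p-values: each p-value is mapped to a chi-square quantile with its own degrees of freedom (the weight), and the sum is referred to a chi-square distribution with the summed degrees of freedom. The sketch below is a simplified illustration, not the authors' correlated procedure: it assumes independent p-values and, to stay within the standard library, even integer weights. All function names and example values are hypothetical.

```python
import math


def chi2_sf_even(x: float, df: int) -> float:
    """Chi-square survival function for even df (closed-form series)."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def chi2_isf_even(p: float, df: int) -> float:
    """Inverse survival function by bisection: x with sf(x, df) == p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, df) > p:  # grow the bracket until it contains x
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def lancaster_combine(p_values: list[float], weights: list[int]) -> float:
    """Combined p-value, assuming INDEPENDENT inputs and even integer weights."""
    if len(p_values) != len(weights) or not p_values:
        raise ValueError("need one weight per p-value")
    stat = sum(chi2_isf_even(p, w) for p, w in zip(p_values, weights))
    return chi2_sf_even(stat, sum(weights))


# With all weights equal to 2 this reduces to Fisher's method.
print(lancaster_combine([0.05, 0.05], [2, 2]))  # about 0.0175
```

In the setting described above, the inputs would be gene-level p-values (for example from SKAT) within one pathway. Handling correlation between genes, as the abstract's correlated Lancaster procedure does, requires an adjustment that this sketch omits.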
Regularized rare variant enrichment analysis for case-control exome sequencing data.

PubMed

Larson, Nicholas B; Schaid, Daniel J

2014-02-01

Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research. © 2013 WILEY PERIODICALS, INC.
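For orientation, a commonly used form of the sparse group LASSO objective is sketched below. The notation is an assumption for illustration and is not taken from the paper: $\ell(\beta)$ is a loss such as the negative logistic log-likelihood for case-control status, $g$ indexes genes, $\beta^{(g)}$ collects the coefficients of the $p_g$ exon-level predictors in gene $g$, and $\lambda \ge 0$ and $\alpha \in [0, 1]$ are tuning parameters.

```latex
\hat{\beta} = \arg\min_{\beta}\;
  \ell(\beta)
  + (1-\alpha)\,\lambda \sum_{g=1}^{G} \sqrt{p_g}\,\bigl\lVert \beta^{(g)} \bigr\rVert_2
  + \alpha\,\lambda\,\lVert \beta \rVert_1
% alpha = 1 recovers the ordinary LASSO; alpha = 0 recovers the group LASSO.
```

The group term can set all coefficients of a gene to zero at once, while the $\ell_1$ term selects individual exons within the genes that remain. This matches the gene-centric analysis with region-level selection described above.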
MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

ScienceCinema

Sakakibara, Yasumbumi

2018-02-13

Keio University's Yasumbumi Sakakibara on "MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.
MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sakakibara, Yasumbumi

2011-10-13

Keio University's Yasumbumi Sakakibara on "MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.
Effect of genome sequence on the force-induced unzipping of a DNA molecule.

PubMed

Singh, N; Singh, Y

2006-02-01

We considered a dsDNA polymer in which the distribution of bases is random at the base pair level but ordered at a length of 18 base pairs, and calculated its force-elongation behaviour in the constant extension ensemble. The unzipping force F(y) vs. extension y is found to have a series of maxima and minima. By changing base pairs at selected places in the molecule, we calculated the change in the F(y) curve and found that the change in the value of the force is of the order of a few pN and that the range of the effect, depending on the temperature, can spread over several base pairs. We have also discussed briefly how to calculate, in the constant force ensemble, a pause or a jump in the extension-time curve from the knowledge of F(y).
A sequence-dependent rigid-base model of DNA

NASA Astrophysics Data System (ADS)

Gonzalez, O.; Petkevičiūtė, D.; Maddocks, J. H.

2013-02-01

A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
For example, due to the presence of frustration, the model can successfully predict the nonlocal changes in the minimum energy configuration of an oligomer that are consequent upon a local change of sequence at the level of a single point mutation.
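For reference, when the two probability densities being compared are both Gaussian, the Kullback-Leibler divergence has a closed form. The expression below is the standard textbook result rather than a formula quoted from the abstract, and the symbols (means \mu_1, \mu_2, covariance matrices \Sigma_1, \Sigma_2, and dimension n of the configuration vector) are notation assumed here for illustration.

```latex
D_{\mathrm{KL}}(\rho_1 \,\|\, \rho_2)
  = \frac{1}{2}\left[
      \operatorname{tr}\!\left(\Sigma_2^{-1}\Sigma_1\right)
      + (\mu_2-\mu_1)^{\mathsf T}\,\Sigma_2^{-1}\,(\mu_2-\mu_1)
      - n
      + \ln\frac{\det\Sigma_2}{\det\Sigma_1}
    \right]
```

The divergence is zero only when the two densities coincide, and it is not symmetric in its arguments, so the order in which a model and a reference are compared matters.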
A sequence-dependent rigid-base model of DNA.

PubMed

Gonzalez, O; Petkevičiūtė, D; Maddocks, J H

2013-02-07

A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
For example, due to the presence of frustration, the model can successfully predict the nonlocal changes in the minimum energy configuration of an oligomer that are consequent upon a local change of sequence at the level of a single point mutation.
Biochemistry and genetics of actinomycete cellulases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wilson, D.B.

1992-01-01

The order Actinomycetales includes a number of genera that contain species that actively degrade cellulose and these include both mesophilic and facultative thermophilic species. Cellulases produced by strains from two of the genera containing thermophilic organisms have been studied extensively: Microbispora bispora and Thermomonospora fusca. Fractionation of M. bispora cellulases has identified six different enzymes, all of which were purified to near homogeneity and partially characterized. Two of these enzymes appear to be exocellulases and gave synergism with each other and with the endocellulases. The structural genes of five M. bispora cellulases have been cloned and one was sequenced. Fractionation of T. fusca cellulases has identified five different enzymes, all of which were purified to near homogeneity and partially characterized. One of the T. fusca enzymes gives synergism in the hydrolysis of crystalline cellulose with several T. fusca endocellulases and with Trichoderma reesei CBHI but not with T. reesei CBHII. Each T. fusca cellulase contains distinct catalytic and cellulose binding domains. The structural genes of four of the T. fusca endoglucanases have been cloned and sequenced, while three cellulase genes have been cloned from T. curvata. The T. fusca cellulase genes are expressed at a low level in Escherichia coli, but at a high level in Streptomyces lividans. Sequence comparisons have shown that there are no significant amino acid homologies between any of the catalytic domains of the four T. fusca cellulases, but each of them shows extensive homology to several other cellulases and fits in one of the five existing cellulase gene families. 73 refs., 8 figs., 4 tabs.
Characterization of a prototype strain of hepatitis E virus.

PubMed Central

Tsarev, S A; Emerson, S U; Reyes, G R; Tsareva, T S; Legters, L J; Malik, I A; Iqbal, M; Purcell, R H

1992-01-01

A strain of hepatitis E virus (SAR-55) implicated in an epidemic of enterically transmitted non-A, non-B hepatitis, now called hepatitis E, was characterized extensively. Six cynomolgus monkeys (Macaca fascicularis) were infected with a strain of hepatitis E virus from Pakistan. Reverse transcription-polymerase chain reaction was used to determine the pattern of virus shedding in feces, bile, and serum relative to hepatitis and induction of specific antibodies. Virtually the entire genome of SAR-55 (7195 nucleotides) was sequenced. Comparison of the sequence of SAR-55 with that of a Burmese strain revealed a high level of homology except for one region encoding 100 amino acids of a putative nonstructural polyprotein. Identification of this region as hypervariable was obtained by partial sequencing of a third isolate of hepatitis E virus from Kirgizia. PMID:1731327
An extension of command shaping methods for controlling residual vibration using frequency sampling

NASA Technical Reports Server (NTRS)

Singer, Neil C.; Seering, Warren P.

1992-01-01

The authors present an extension to the impulse shaping technique for commanding machines to move with reduced residual vibration. The extension, called frequency sampling, is a method for generating constraints that are used to obtain shaping sequences which minimize residual vibration in systems such as robots whose resonant frequencies change during motion. The authors present a review of impulse shaping methods, a development of the proposed extension, and a comparison of results of tests conducted on a simple model of the space shuttle robot arm. Frequency shaping provides a method for minimizing the impulse sequence duration required to give the desired insensitivity.
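As background to the impulse shaping technique that this record extends, the sketch below computes the classic two-impulse zero-vibration (ZV) shaper for a single mode and evaluates how much residual vibration it leaves at other frequencies. It is a minimal illustration built on the standard second-order-system formulas, not the frequency sampling method itself; the function names and the example frequencies are assumptions made here.

```python
import math


def zv_shaper(freq_hz, damping_ratio):
    """Two-impulse zero-vibration (ZV) shaper as [(time_s, amplitude), ...]."""
    if not 0.0 <= damping_ratio < 1.0:
        raise ValueError("damping_ratio must lie in [0, 1)")
    root = math.sqrt(1.0 - damping_ratio ** 2)
    omega_d = 2.0 * math.pi * freq_hz * root  # damped frequency, rad/s
    k = math.exp(-damping_ratio * math.pi / root)
    # Amplitudes sum to 1; the second impulse lands half a damped period later.
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / omega_d, k / (1.0 + k))]


def residual_fraction(impulses, freq_hz, damping_ratio):
    """Residual vibration at freq_hz relative to a single unit impulse."""
    omega_n = 2.0 * math.pi * freq_hz
    omega_d = omega_n * math.sqrt(1.0 - damping_ratio ** 2)
    t_last = max(t for t, _ in impulses)
    c = sum(a * math.exp(damping_ratio * omega_n * t) * math.cos(omega_d * t)
            for t, a in impulses)
    s = sum(a * math.exp(damping_ratio * omega_n * t) * math.sin(omega_d * t)
            for t, a in impulses)
    return math.exp(-damping_ratio * omega_n * t_last) * math.hypot(c, s)


if __name__ == "__main__":
    shaper = zv_shaper(1.0, 0.0)  # hypothetical 1 Hz undamped mode
    for f in (0.8, 1.0, 1.2):     # sample the residual at several frequencies
        print(f, round(residual_fraction(shaper, f, 0.0), 4))
```

Evaluating the residual at several frequencies, as in the loop above, shows why a shaper tuned to one frequency degrades when the resonance shifts during motion, which is the situation the abstract describes.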
Depositional sequence stratigraphy and architecture of the Cretaceous Ferron Sandstone: Implications for coal and coalbed methane resources - A field excursion

USGS Publications Warehouse

Garrison, J.R.; Van Den, Bergh; Barker, C.E.; Tabet, D.E.

1997-01-01

This Field Excursion will visit outcrops of the fluvial-deltaic Upper Cretaceous (Turonian) Ferron Sandstone Member of the Mancos Shale, known as the Last Chance delta or Upper Ferron Sandstone. This field guide and the field stops will outline the architecture and depositional sequence stratigraphy of the Upper Ferron Sandstone clastic wedge and explore the stratigraphic positions and compositions of major coal zones. The implications of the architecture and stratigraphy of the Ferron fluvial-deltaic complex for coal and coalbed methane resources will be discussed. Early works suggested that the southwesterly derived deltaic deposits of the upper Ferron Sandstone clastic wedge were a Type-2 third-order depositional sequence, informally called the Ferron Sequence. These works suggested that the Ferron Sequence is separated by a type-2 sequence boundary from the underlying 3rd-order Hyatti Sequence, which has its sediment source from the northwest. Within the 3rd-order depositional sequence, the deltaic events of the Ferron clastic wedge, recognized as parasequence sets, appear to be stacked into progradational, aggradational, and retrogradational patterns reflecting a generally decreasing sediment supply during an overall slow sea-level rise. The architecture of both near-marine facies and non-marine fluvial facies exhibits well-defined trends in response to this decrease in available sediment. Recent studies have concluded that, unless coincident with a depositional sequence boundary, regionally extensive coal zones occur at the tops of the parasequence sets within the Ferron clastic wedge. These coal zones consist of coal seams and their laterally equivalent fissile carbonaceous shales, mudstones, and siltstones, paleosols, and flood plain mudstones.
Although the compositions of coal zones vary along depositional dip, the presence of these laterally extensive stratigraphic horizons, above parasequence sets, provides a means of correlating and defining the tops of depositional parasequence sets in both near-marine and non-marine parts of fluvial-deltaic depositional sequences. Ongoing field studies, based on this concept of coal zone stratigraphy, and detailed stratigraphic mapping, have documented the existence of at least 12 parasequence sets within the Last Chance delta clastic wedge. These parasequence sets appear to form four high frequency, 4th-order depositional sequences. The dramatic erosional unconformities, associated with these 4th-order sequence boundaries, indicate that there was up to 20-30 m of erosion, signifying locally substantial base-level drops. These base-level drops were accompanied by a basinward shift in paleo-shorelines by as much as 5-7 km. These 4th-order Upper Ferron Sequences are superimposed on the 3rd-order sea-level rise event and the 3rd-order, sediment supply/accommodation space driven, stratigraphic architecture of the Upper Ferron Sandstone. The fluvial deltaic architecture shows little response to these 4th-order sea-level events. Coal zones generally thicken landward relative to the mean position of the landward pinch-out of the underlying parasequence set, but after some distance landward, they decrease in thickness. Coal zones also generally thin seaward relative to the mean position of the landward pinch-out of the underlying parasequence set. The coal is thickest in the region between this landward pinch-out and the position of maximum zone thickness. Data indicate that the proportion of coal in the coal zone decreases progressively landward from the landward pinch-out. The effects of differential compaction and differences in original pre-peat swamp topography have the effect of adding perturbations to the general trends.
These coal zone systematics have a major impact on approaches to exploration and production, and on the resource assessment of both coal and coalbed methane.
Comparison of Next-Generation Sequencing Systems

PubMed Central

Liu, Lin; Li, Yinhu; Li, Siliang; Hu, Ni; He, Yimin; Pong, Ray; Lin, Danni; Lu, Lihua; Law, Maggie

2012-01-01

With fast development and wide applications of next-generation sequencing (NGS) technologies, genomic sequence information is within reach to aid the achievement of goals to decode life mysteries, make better crops, detect pathogens, and improve quality of life. NGS systems are typically represented by SOLiD/Ion Torrent PGM from Life Sciences, Genome Analyzer/HiSeq 2000/MiSeq from Illumina, and GS FLX Titanium/GS Junior from Roche. Beijing Genomics Institute (BGI), which possesses the world's biggest sequencing capacity, has multiple NGS systems including 137 HiSeq 2000, 27 SOLiD, one Ion Torrent PGM, one MiSeq, and one 454 sequencer. We have accumulated extensive experience in sample handling, sequencing, and bioinformatics analysis. In this paper, technologies of these systems are reviewed, and first-hand data from extensive experience are summarized and analyzed to discuss the advantages and specifics associated with each sequencing system. Finally, applications of NGS are summarized. PMID:22829749
Underwound DNA under Tension: Structure, Elasticity, and Sequence-Dependent Behaviors

NASA Astrophysics Data System (ADS)

Sheinin, Maxim Y.; Forth, Scott; Marko, John F.; Wang, Michelle D.

2011-09-01

DNA melting under torsion plays an important role in a wide variety of cellular processes. In the present Letter, we have investigated DNA melting at the single-molecule level using an angular optical trap. By directly measuring force, extension, torque, and angle of DNA, we determined the structural and elastic parameters of torsionally melted DNA. Our data reveal that under moderate forces, the melted DNA assumes a left-handed structure as opposed to an open bubble conformation and is highly torsionally compliant. We have also discovered that at low forces melted DNA properties are highly dependent on DNA sequence. These results provide a more comprehensive picture of the global DNA force-torque phase diagram.
Phylogenetic and environmental diversity of DsrAB-type dissimilatory (bi)sulfite reductases

PubMed Central

Müller, Albert Leopold; Kjeldsen, Kasper Urup; Rattei, Thomas; Pester, Michael; Loy, Alexander

2015-01-01

The energy metabolism of essential microbial guilds in the biogeochemical sulfur cycle is based on a DsrAB-type dissimilatory (bi)sulfite reductase that either catalyzes the reduction of sulfite to sulfide during anaerobic respiration of sulfate, sulfite and organosulfonates, or acts in reverse during sulfur oxidation. Common use of dsrAB as a functional marker showed that dsrAB richness in many environments is dominated by novel sequence variants and collectively represents an extensive, largely uncharted sequence assemblage. Here, we established a comprehensive, manually curated dsrAB/DsrAB database and used it to categorize the known dsrAB diversity, reanalyze the evolutionary history of dsrAB and evaluate the coverage of published dsrAB-targeted primers. Based on a DsrAB consensus phylogeny, we introduce an operational classification system for environmental dsrAB sequences that integrates established taxonomic groups with operational taxonomic units (OTUs) at multiple phylogenetic levels, ranging from DsrAB enzyme families that reflect reductive or oxidative DsrAB types of bacterial or archaeal origin, superclusters, uncultured family-level lineages to species-level OTUs. Environmental dsrAB sequences constituted at least 13 stable family-level lineages without any cultivated representatives, suggesting that major taxa of sulfite/sulfate-reducing microorganisms have not yet been identified. Three of these uncultured lineages occur mainly in marine environments, while specific habitat preferences are not evident for members of the other 10 uncultured lineages. In summary, our publicly available dsrAB/DsrAB database, the phylogenetic framework, the multilevel classification system and a set of recommended primers provide a necessary foundation for large-scale dsrAB ecology studies with next-generation sequencing methods. PMID:25343514
extendFromReads

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, Kelly P.

2013-10-03

This package assists in genome assembly. extendFromReads takes as input a set of Illumina (e.g., MiSeq) DNA sequencing reads, a query seed sequence and a direction to extend the seed. The algorithm collects all seed-matching reads (flipping reverse-orientation hits), trims off the seed and additional sequence in the other direction, sorts the remaining sequences alphabetically, and prints them aligned without gaps from the point of seed trimming. This produces a visual display distinguishing the flanks of multi-copy seeds. A companion script hitMates.pl collects the mates of seed-hitting reads, whose alignment reveals longer extensions from the seed. The collect/trim/sort strategy was made iterative and scaled up in the script denovo.pl, for de novo contig assembly. An index is pre-built using indexReads.pl that, for each unique 21-mer found in all the reads, records its "fate" of extension (whether extendable, blocked by low coverage, or blocked by branching after a duplicated sequence) and other characteristics. Importantly, denovo.pl records all branchings that follow a branching contig endpoint, providing contig-extension information.
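The collect/trim/sort strategy described in this record can be sketched in a few lines. The example below is an illustration under stated assumptions, not the package's code (the original scripts are Perl): the function names are hypothetical, reads are plain uppercase strings, only exact seed matches are considered, and only rightward extension is shown.

```python
def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def right_flanks(reads, seed):
    """Collect reads containing the seed on either strand, trim everything
    up to and including the seed, and return the flanks sorted alphabetically."""
    flanks = []
    for read in reads:
        # Try the read as given, then flipped, so that all hits share
        # the orientation of the seed.
        for oriented in (read, reverse_complement(read)):
            pos = oriented.find(seed)
            if pos != -1:
                flank = oriented[pos + len(seed):]
                if flank:
                    flanks.append(flank)
                break
    return sorted(flanks)


if __name__ == "__main__":
    reads = ["TTACGGAAT", reverse_complement("CCACGGTTC"), "GGGGGG"]
    # Printing the sorted flanks left-aligned lines them up at the trim point.
    for flank in right_flanks(reads, "ACGG"):
        print(flank)
```

Because the flanks are sorted and printed from a common starting column, reads that diverge after the seed fall into visibly separate blocks, which is how the display distinguishes the flanks of a multi-copy seed.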
Health Education a Conceptual Approach. Growing and Developing, Interacting, Decision Making. Concept 2: Growing and Developing Follows a Predictable Sequence, Yet is Unique for Each Individual. Teacher-Student Resources.

ERIC Educational Resources Information Center

Creswell, William H., Jr.; And Others

The following resource guide is one in a series which presents extensive bibliographic material oriented around a specific concept, in this guide, the predictability and uniqueness of growing and developing. A section is devoted to selected materials related to the concept; grade levels for which each resource might be useful are indicated beside…

Single-cell sequencing and tumorigenesis: improved understanding of tumor evolution and metastasis.

PubMed

Ellsworth, Darrell L; Blackburn, Heather L; Shriver, Craig D; Rabizadeh, Shahrooz; Soon-Shiong, Patrick; Ellsworth, Rachel E

2017-12-01

Extensive genomic and transcriptomic heterogeneity in human cancer often negatively impacts treatment efficacy and survival, thus posing a significant ongoing challenge for modern treatment regimens. State-of-the-art DNA- and RNA-sequencing methods now provide high-resolution genomic and gene expression portraits of individual cells, facilitating the study of complex molecular heterogeneity in cancer. Important developments in single-cell sequencing (SCS) technologies over the past 5 years provide numerous advantages over traditional sequencing methods for understanding the complexity of carcinogenesis, but significant hurdles must be overcome before SCS can be clinically useful. In this review, we: (1) highlight current methodologies and recent technological advances for isolating single cells, single-cell whole-genome and whole-transcriptome amplification using minute amounts of nucleic acids, and SCS, (2) summarize research investigating molecular heterogeneity at the genomic and transcriptomic levels and how this heterogeneity affects clonal evolution and metastasis, and (3) discuss the promise for integrating SCS in the clinical care arena for improved patient care.
Ultra-barcoding in cacao (Theobroma spp.; Malvaceae) using whole chloroplast genomes and nuclear ribosomal DNA.

PubMed

Kane, Nolan; Sveinsson, Saemundur; Dempewolf, Hannes; Yang, Ji Yong; Zhang, Dapeng; Engels, Johannes M M; Cronk, Quentin

2012-02-01

To reliably identify lineages below the species level such as subspecies or varieties, we propose an extension to DNA-barcoding using next-generation sequencing to produce whole organellar genomes and substantial nuclear ribosomal sequence. Because this method uses much longer versions of the traditional DNA-barcoding loci in the plastid and ribosomal DNA, we call our approach ultra-barcoding (UBC). We used high-throughput next-generation sequencing to scan the genome and generate reliable sequence of high copy number regions. Using this method, we examined whole plastid genomes as well as nearly 6000 bases of nuclear ribosomal DNA sequences for nine genotypes of Theobroma cacao and an individual of the related species T. grandiflorum, as well as an additional publicly available whole plastid genome of T. cacao. All individuals of T. cacao examined were uniquely distinguished, and evidence of reticulation and gene flow was observed. Sequence variation was observed in some of the canonical barcoding regions between species, but other regions of the chloroplast were more variable both within species and between species, as were ribosomal spacers. Furthermore, no single region provides the level of data available using the complete plastid genome and rDNA. Our data demonstrate that UBC is a viable, increasingly cost-effective approach for reliably distinguishing varieties and even individual genotypes of T. cacao. This approach shows great promise for applications where very closely related or interbreeding taxa must be distinguished.
Trading genes along the silk road: mtDNA sequences and the origin of central Asian populations.

PubMed Central

Comas, D; Calafell, F; Mateu, E; Pérez-Lezaun, A; Bosch, E; Martínez-Arias, R; Clarimon, J; Facchini, F; Fiori, G; Luiselli, D; Pettener, D; Bertranpetit, J

1998-01-01

Central Asia is a vast region at the crossroads of different habitats, cultures, and trade routes. Little is known about the genetics and the history of the population of this region. We present the analysis of mtDNA control-region sequences in samples of the Kazakh, the Uighurs, the lowland Kirghiz, and the highland Kirghiz, which we have used to address both the population history of the region and the possible selective pressures that high altitude has on mtDNA genes. Central Asian mtDNA sequences present features intermediate between European and eastern Asian sequences in several parameters, such as the frequencies of certain nucleotides, the levels of nucleotide diversity, mean pairwise differences, and genetic distances. Several hypotheses could explain the intermediate position of central Asia between Europe and eastern Asia, but the most plausible would involve extensive levels of admixture between Europeans and eastern Asians in central Asia, possibly enhanced during the Silk Road trade and clearly after the eastern and western Eurasian human groups had diverged. Lowland and highland Kirghiz mtDNA sequences are very similar, and the analysis of molecular variance has revealed that the fraction of mitochondrial genetic variance due to altitude is not significantly different from zero. Thus, it seems unlikely that altitude has exerted a major selective pressure on mitochondrial genes in central Asian populations. PMID:9837835
Use of tuf Sequences for Genus-Specific PCR Detection and Phylogenetic Analysis of 28 Streptococcal Species

PubMed Central

Picard, François J.; Ke, Danbing; Boudreau, Dominique K.; Boissinot, Maurice; Huletsky, Ann; Richard, Dave; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.

2004-01-01

A 761-bp portion of the tuf gene (encoding the elongation factor Tu) from 28 clinically relevant streptococcal species was obtained by sequencing amplicons generated using broad-range PCR primers. These tuf sequences were used to select Streptococcus-specific PCR primers and to perform phylogenetic analysis. The specificity of the PCR assay was verified using 102 different bacterial species, including the 28 streptococcal species. Genomic DNA purified from all streptococcal species was efficiently detected, whereas there was no amplification with DNA from 72 of the 74 nonstreptococcal bacterial species tested. There was cross-amplification with DNAs from Enterococcus durans and Lactococcus lactis. However, the 15 to 31% nucleotide sequence divergence in the 761-bp tuf portion of these two species compared to any streptococcal tuf sequence provides ample sequence divergence to allow the development of internal probes specific to streptococci. The Streptococcus-specific assay was highly sensitive for all 28 streptococcal species tested (i.e., detection limit of 1 to 10 genome copies per PCR). The tuf sequence data was also used to perform extensive phylogenetic analysis, which was generally in agreement with phylogeny determined on the basis of 16S rRNA gene data. However, the tuf gene provided a better discrimination at the streptococcal species level that should be particularly useful for the identification of very closely related species. In conclusion, tuf appears more suitable than the 16S ribosomal RNA gene for the development of diagnostic assays for the detection and identification of streptococcal species because of its higher level of species-specific genetic divergence. PMID:15297518
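A figure such as "15 to 31% nucleotide sequence divergence" is typically an uncorrected pairwise distance between aligned sequences. The sketch below shows one simple way to compute it; whether the authors used exactly this measure is an assumption, and the function name and the rule of skipping gapped columns are choices made here for illustration.

```python
def percent_divergence(seq_a, seq_b):
    """Uncorrected percent divergence between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    print(percent_divergence("ACGTACGT", "ACGTACGA"))
```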
Spatial analysis of extension fracture systems: A process modeling approach

USGS Publications Warehouse

Ferguson, C.C.

1985-01-01

Little consensus exists on how best to analyze natural fracture spacings and their sequences. Field measurements and analyses published in geotechnical literature imply fracture processes radically different from those assumed by theoretical structural geologists. The approach adopted in this paper recognizes that disruption of rock layers by layer-parallel extension results in two spacing distributions, one representing layer-fragment lengths and another separation distances between fragments. These two distributions and their sequences reflect mechanics and history of fracture and separation. Such distributions and sequences, represented by a 2 × n matrix of lengths L, can be analyzed using a method that is history sensitive and which yields also a scalar estimate of bulk extension, e(L). The method is illustrated by a series of Monte Carlo experiments representing a variety of fracture-and-separation processes, each with distinct implications for extension history. Resulting distributions of e(L) are process-specific, suggesting that the inverse problem of deducing fracture-and-separation history from final structure may be tractable. © 1985 Plenum Publishing Corporation.
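To make the data layout concrete, the sketch below stores fragment lengths and separation distances as the two rows of a 2 × n structure and computes a plain bulk extension from them. This uses the conventional engineering-strain definition (total separation divided by total original length) and is explicitly not the paper's history-sensitive estimator, whose details the abstract does not give; the function name and the sample values are hypothetical.

```python
def bulk_extension(lengths):
    """Conventional bulk extension from a 2 x n layout.

    lengths[0] holds layer-fragment lengths and lengths[1] holds the
    separation distances between fragments, in the same units.
    """
    fragments, separations = lengths
    if len(fragments) != len(separations):
        raise ValueError("both rows must have the same number of entries")
    original_length = sum(fragments)
    if original_length <= 0:
        raise ValueError("total fragment length must be positive")
    # Extension = added length / original length.
    return sum(separations) / original_length


if __name__ == "__main__":
    matrix = ([4.0, 6.0, 10.0], [1.0, 2.0, 1.0])  # made-up measurements
    print(bulk_extension(matrix))
```

This baseline ignores the order of the entries, which is exactly the information a history-sensitive method would exploit.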
Sequencing and Characterisation of an Extensive Atlantic Salmon (Salmo salar L.) MicroRNA Repertoire

PubMed Central

Bekaert, Michaël; Lowe, Natalie R.; Bishop, Stephen C.; Bron, James E.; Taggart, John B.; Houston, Ross D.

2013-01-01

Atlantic salmon (Salmo salar L.), a member of the family Salmonidae, is a totemic species of ecological and cultural significance that is also economically important in terms of both sports fisheries and aquaculture. These factors have promoted the continuous development of genomic resources for this species, furthering both fundamental and applied research. MicroRNAs (miRNA) are small endogenous non-coding RNA molecules that control spatial and temporal expression of targeted genes through post-transcriptional regulation. While miRNA have been characterised in detail for many other species, this is not yet the case for Atlantic salmon. To identify miRNAs from Atlantic salmon, we constructed whole fish miRNA libraries for 18 individual juveniles (fry, four months post hatch) and characterised them by Illumina high-throughput sequencing (total of 354,505,167 paired-ended reads). We report an extensive and partly novel repertoire of miRNA sequences, comprising 888 miRNA genes (547 unique mature miRNA sequences), quantify their expression levels in basal conditions, examine their homology to miRNAs from other species and identify their predicted target genes. We also identify the location and putative copy number of the miRNA genes in the draft Atlantic salmon reference genome sequence. The Atlantic salmon miRNAs experimentally identified in this study provide a robust large-scale resource for functional genome research in salmonids. There is an opportunity to explore the evolution of salmonid miRNAs following the relatively recent whole genome duplication event in salmonid species and to investigate the role of miRNAs in the regulation of gene expression in particular their contribution to variation in economically and ecologically important traits. PMID:23922936
Winnowing sequences from a database search.

PubMed

Berman, P; Zhang, Z; Wolf, Y I; Koonin, E V; Miller, W

2000-01-01

In database searches for sequence similarity, matches to a distinct sequence region (e.g., protein domain) are frequently obscured by numerous matches to another region of the same sequence. In order to cope with this problem, algorithms are developed to discard redundant matches. One model for this problem begins with a list of intervals, each with an associated score; each interval gives the range of positions in the query sequence that align to a database sequence, and the score is that of the alignment. If interval I is contained in interval J, and I's score is less than J's, then I is said to be dominated by J. The problem is then to identify each interval that is dominated by at least K other intervals, where K is a given level of "tolerable redundancy." An algorithm is developed to solve the problem in O(N log N) time and O(N*) space, where N is the number of intervals and N* is a precisely defined value that never exceeds N and is frequently much smaller. This criterion for discarding database hits has been implemented in the Blast program, as illustrated herein with examples. Several variations and extensions of this approach are also described.
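The domination rule in this record can be illustrated directly. The sketch below is a naive quadratic-time check written for clarity; it is not the O(N log N) algorithm the abstract refers to, and the function name and the (start, end, score) tuple layout are assumptions made here.

```python
def dominated_intervals(intervals, k):
    """Return indices of intervals dominated by at least k other intervals.

    intervals is a list of (start, end, score) tuples. Interval I is
    dominated by J when I lies inside J and I's score is lower than J's.
    """
    discarded = []
    for i, (start_i, end_i, score_i) in enumerate(intervals):
        dominators = 0
        for j, (start_j, end_j, score_j) in enumerate(intervals):
            if i == j:
                continue
            contained = start_j <= start_i and end_i <= end_j
            if contained and score_i < score_j:
                dominators += 1
        if dominators >= k:
            discarded.append(i)
    return discarded


if __name__ == "__main__":
    hits = [(0, 100, 50), (10, 20, 5), (5, 50, 30), (12, 18, 1), (200, 300, 10)]
    print(dominated_intervals(hits, 2))
```

Raising k keeps more of the weaker, overlapping hits, so it acts as the level of tolerable redundancy described above.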
A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3

PubMed Central

Dietmann, Sabine; Park, Jong; Notredame, Cedric; Heger, Andreas; Lappe, Michael; Holm, Liisa

2001-01-01

The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families. PMID:11125048
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.

PubMed

Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu

2015-06-01

High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
Classification and Weakly Supervised Pain Localization using Multiple Segment Representation.

PubMed

Sikka, Karan; Dhall, Abhinav; Bartlett, Marian Stewart

2014-10-01

Automatic pain recognition from videos is a vital clinical application and, owing to its spontaneous nature, poses interesting challenges to automatic facial expression recognition (AFER) research. Previous pain vs no-pain systems have highlighted two major challenges: (1) ground truth is provided for the sequence, but the presence or absence of the target expression for a given frame is unknown, and (2) the time point and the duration of the pain expression event(s) in each video are unknown. To address these issues we propose a novel framework (referred to as MS-MIL) where each sequence is represented as a bag containing multiple segments, and multiple instance learning (MIL) is employed to handle this weakly labeled data in the form of sequence level ground-truth. These segments are generated via multiple clustering of a sequence or running a multi-scale temporal scanning window, and are represented using a state-of-the-art Bag of Words (BoW) representation. This work extends the idea of detecting facial expressions through 'concept frames' to 'concept segments' and argues through extensive experiments that algorithms such as MIL are needed to reap the benefits of such representation. The key advantages of our approach are: (1) joint detection and localization of painful frames using only sequence-level ground-truth, (2) incorporation of temporal dynamics by representing the data not as individual frames but as segments, and (3) extraction of multiple segments, which is well suited to signals with uncertain temporal location and duration in the video. Extensive experiments on UNBC-McMaster Shoulder Pain dataset highlight the effectiveness of the approach by achieving competitive results on both tasks of pain classification and localization in videos. We also empirically evaluate the contributions of different components of MS-MIL. 
The paper also includes the visualization of discriminative facial patches, important for pain detection, as discovered by our algorithm and relates them to Action Units that have been associated with pain expression. We conclude the paper by demonstrating that MS-MIL yields a significant improvement on another spontaneous facial expression dataset, the FEEDTUM dataset.
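The segment-generation step described in this abstract (a multi-scale temporal scanning window, with a bag of segments per sequence) can be sketched roughly as follows. This is only an illustration: the window sizes, the stride fraction, and the max-pooling rule for scoring a bag are assumptions chosen for the example, not settings or methods confirmed by the abstract, and the function names are hypothetical.

```python
def multiscale_segments(n_frames, window_sizes=(4, 8), stride_fraction=0.5):
    """Return (start, end) frame ranges, end exclusive, for every window size.

    window_sizes and stride_fraction are illustrative assumptions.
    """
    segments = []
    for size in window_sizes:
        if size > n_frames:
            continue  # a window longer than the video yields no segment
        stride = max(1, int(size * stride_fraction))
        for start in range(0, n_frames - size + 1, stride):
            segments.append((start, start + size))
    return segments


def bag_score(segment_scores):
    """Score a bag (one video) from its segment scores.

    Assumes the common MIL max rule: a bag is as positive as its most
    positive instance. The paper's actual MIL formulation may differ.
    """
    return max(segment_scores)


if __name__ == "__main__":
    for seg in multiscale_segments(10):
        print(seg)
```

Under the max rule, the highest-scoring segment also gives a rough temporal localization of the event, which is the sense in which only sequence-level labels are needed.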
SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

USGS Publications Warehouse

Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

2013-01-01

SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.
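The third module's task, finding sequences with microsatellites that conform to user-specified parameters, can be illustrated with a short regular-expression sketch. This is not SSR_pipeline's own code or interface: the function name `find_ssrs` and the parameters `motif_min`, `motif_max`, and `min_repeats` are hypothetical, and only perfect tandem repeats are detected.

```python
import re


def find_ssrs(seq, motif_min=2, motif_max=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Parameter names and defaults are illustrative assumptions. Motifs that
    are themselves homopolymers (e.g. "AA") are not filtered out.
    """
    # A lazily matched motif followed by at least (min_repeats - 1) copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (motif_min, motif_max, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    print(find_ssrs("TTGACACACACACGG"))
```

The lazy quantifier makes the shortest motif win, so a run such as ACACACACAC is reported as five copies of AC rather than as a longer compound motif.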
Novel methods to optimize genotypic imputation for low-coverage, next-generation sequence data in crop plants

USDA-ARS's Scientific Manuscript database

Next-generation sequencing technology such as genotyping-by-sequencing (GBS) made low-cost, but often low-coverage, whole-genome sequencing widely available. Extensive inbreeding in crop plants provides an untapped, high quality source of phased haplotypes for imputing missing genotypes. We introduc...
ESTs from Seeds to Assist the Selective Breeding of Jatropha curcas L. for Oil and Active Compounds

PubMed Central

Gomes, Kleber A; Almeida, Tiago C; Gesteira, Abelmon S; Lôbo, Ivon P; Guimarães, Ana Carolina R; de Miranda, Antonio B; Van Sluys, Marie-Anne; da Cruz, Rosenira S; Cascardo, Júlio CM; Carels, Nicolas

2010-01-01

We report here on the characterization of a cDNA library from seeds of Jatropha curcas L. at three stages of fruit maturation before yellowing. We sequenced a total of 2200 clones and obtained a set of 931 non-redundant sequences (unigenes) after trimming and quality control, i.e., 140 contigs and 791 singlets with PHRED quality ≥10. We found low levels of sequence redundancy and extensive metabolic coverage by homology comparison to GO. After comparison of 5841 non-redundant ESTs from a total of 13193 reads from GenBank with KEGG, we identified tags with nucleotide variations among J. curcas accessions for genes of fatty acid, terpene, alkaloid, quinone and hormone pathways of biosynthesis. More specifically, the expression level of four genes (palmitoyl-acyl carrier protein thioesterase, 3-ketoacyl-CoA thiolase B, lysophosphatidic acid acyltransferase and geranyl pyrophosphate synthase) measured by real-time PCR proved to be significantly different between leaves and fruits. Since the nucleotide polymorphism of these tags is associated with a higher level of gene expression in fruits compared to leaves, we propose this approach to speed up the search for quantitative traits in selective breeding of J. curcas. We also discuss its potential utility for the selective breeding of economically important traits in J. curcas. PMID:26217103
Quantification of effect of sequential posteromedial release on flexion and extension gaps: a computer-assisted study in cadaveric knees.

PubMed

Mullaji, Arun; Sharma, Amit; Marawar, Satyajit; Kanna, Raj

2009-08-01

A novel sequence of posteromedial release consistent with surgical technique of total knee arthroplasty was performed in 15 cadaveric knees. Medial and lateral flexion and extension gaps were measured after each step of the release using a computed tomography-free computer navigation system. A spring-loaded distractor and a manual distractor were used to distract the joint. Posterior cruciate ligament release increased flexion more than extension gap; deep medial collateral ligament release had a negligible effect; semimembranosus release increased the flexion gap medially; reduction osteotomy increased medial flexion and extension gaps; superficial medial collateral ligament release increased medial joint gap more in flexion and caused severe instability. This sequence of release led to incremental and differential effects on flexion-extension gaps and has implications in correcting varus deformity.
Extensive genetic differentiation detected within a model marsupial, the tammar wallaby (Notamacropus eugenii)

PubMed Central

Miller, Emily J.; Neaves, Linda E.; Zenger, Kyall R.; Herbert, Catherine A.

2017-01-01

The tammar wallaby (Notamacropus eugenii) is one of the most intensively studied of all macropodids and was the first Australasian marsupial to have its genome sequenced. However, comparatively little is known about genetic diversity and differentiation amongst the morphologically distinct allopatric populations of tammar wallabies found in Western (WA) and South Australia (SA). Here we compare autosomal and Y-linked microsatellite genotypes, as well as sequence data (~600 bp) from the mitochondrial DNA (mtDNA) control region (CR) in tammar wallabies from across its distribution. Levels of diversity at autosomal microsatellite loci were typically high in the WA mainland and Kangaroo Island (SA) populations (A = 8.9–10.6; He = 0.77–0.78) but significantly reduced in other endemic island populations (A = 3.8–4.1; He = 0.41–0.48). Autosomal and Y-linked microsatellite loci revealed a pattern of significant differentiation amongst populations, especially between SA and WA. The Kangaroo Island and introduced New Zealand population showed limited differentiation. Multiple divergent mtDNA CR haplotypes were identified within both SA and WA populations. The CR haplotypes of tammar wallabies from SA and WA show reciprocal monophyly and are highly divergent (14.5%), with levels of sequence divergence more typical of different species. Within WA tammar wallabies, island populations each have unique clusters of highly related CR haplotypes and each is most closely related to different WA mainland haplotypes. Y-linked microsatellite haplotypes show a similar pattern of divergence although levels of diversity are lower. In light of these differences, we suggest that two subspecies of tammar wallaby be recognized; Notamacropus eugenii eugenii in SA and N. eugenii derbianus in WA. The extensive neutral genetic diversity and inter-population differentiation identified within tammar wallabies should further increase the species value and usefulness as a model organism. PMID:28257440
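To make a figure such as the 14.5% control-region divergence concrete, the sketch below computes an uncorrected pairwise distance (p-distance) between two aligned sequences. The abstract does not state which distance measure produced its value, so treating it as a p-distance is an assumption; the function name and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap are skipped. This is the
    uncorrected p-distance, used here only as an illustration.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


if __name__ == "__main__":
    # Toy alignment: 2 of 10 sites differ.
    print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))
```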
nbCNV: a multi-constrained optimization model for discovering copy number variants in single-cell sequencing data.

PubMed

Zhang, Changsheng; Cai, Hongmin; Huang, Jingying; Song, Yan

2016-09-17

Variations in DNA copy number make an important contribution to the development of several diseases, including autism, schizophrenia and cancer. Single-cell sequencing technology allows the dissection of genomic heterogeneity at the single-cell level, thereby providing important evolutionary information about cancer cells. In contrast to traditional bulk sequencing, single-cell sequencing requires the amplification of the whole genome of a single cell to accumulate enough samples for sequencing. However, the amplification process inevitably introduces amplification bias, resulting in an over-dispersed portion of the sequencing data. A recent study has shown that the over-dispersed portion of single-cell sequencing data can be modelled well by negative binomial distributions. We developed a read-depth based method, nbCNV, to detect copy number variants (CNVs). The nbCNV method uses two constraints, sparsity and smoothness, to fit the CNV patterns under the assumption that the read signals follow a negative binomial distribution. The problem of CNV detection was formulated as a quadratic optimization problem and was solved by an efficient numerical method based on the classical alternating direction minimization method. Extensive experiments comparing nbCNV with existing benchmark models were conducted on both simulated data and empirical single-cell sequencing data. The results of those experiments demonstrate that nbCNV achieves superior performance and high robustness for the detection of CNVs in single-cell sequencing data.
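The negative binomial assumption for over-dispersed read counts can be illustrated with a small log-likelihood function. This sketch covers only the distributional term; it is not the nbCNV objective, which per the abstract also includes sparsity and smoothness constraints. The mean/dispersion parameterization (`mu`, `r`) and the function name are assumptions made for the example.

```python
import math


def nb_log_pmf(k, mu, r):
    """Log-probability of count k under a negative binomial distribution.

    Parameterized (as an assumption) by mean mu and dispersion r, so the
    variance is mu + mu**2 / r, which always exceeds the Poisson variance mu.
    """
    p = r / (r + mu)  # success probability in the standard (r, p) form
    return (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1.0 - p)
    )


if __name__ == "__main__":
    mu, r = 20.0, 5.0
    print("variance:", mu + mu ** 2 / r)
    print("log P(K=20):", nb_log_pmf(20, mu, r))
```

Smaller values of `r` give heavier over-dispersion; as `r` grows, the distribution approaches a Poisson with mean `mu`.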
Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming

PubMed Central

Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A.

2015-01-01

Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina HiSeq 2500 instrument. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data: especially from chloroplast and mitochondrial DNAs. PMID:26656830
Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

PubMed

Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A

2015-01-01

Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data: especially from chloroplast and mitochondrial DNAs.
A fast sequence assembly method based on compressed data structures.

PubMed

Liang, Peifeng; Zhang, Yancong; Lin, Kui; Hu, Jinglu

2014-01-01

Assembling a large genome using next generation sequencing reads requires large computer memory and a long execution time. To reduce these requirements, a memory- and time-efficient assembler, called FMJ-Assembler, is presented by applying the FM-index in JR-Assembler, where FM stands for the FMR-index derived from the FM-index and BWT, and J stands for jumping extension. The FMJ-Assembler uses an expanded FM-index and BWT to compress read data and save memory, and the jumping extension method reduces CPU time. An extensive comparison of the FMJ-Assembler with current assemblers shows that the FMJ-Assembler achieves a better or comparable overall assembly quality and requires lower memory use and less CPU time. All these advantages indicate that the FMJ-Assembler will be an efficient assembly method for next generation sequencing technology.
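For readers unfamiliar with the underlying data structures, the sketch below shows a textbook Burrows-Wheeler transform (BWT) and the FM-index backward search that counts pattern occurrences. It does not implement the FMR-index or the jumping extension of FMJ-Assembler, whose details are not given in the abstract; the naive rotation sort and the uncompressed occurrence counting are simplifications assumed for readability.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (naive, for illustration)."""
    text += sentinel  # the sentinel must sort before every other character
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_text, pattern):
    """Count occurrences of pattern in the original text by backward search."""
    counts = {}
    for ch in bwt_text:
        counts[ch] = counts.get(ch, 0) + 1
    # c_table[ch]: number of characters in the text strictly smaller than ch
    c_table = {}
    total = 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    def occ(ch, i):
        # Occurrences of ch in bwt_text[:i]; a real FM-index samples these.
        return bwt_text[:i].count(ch)

    lo, hi = 0, len(bwt_text)
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ(ch, lo)
        hi = c_table[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)
    print(count_occurrences(transformed, "ana"))
```

The memory savings in practice come from storing the BWT in compressed form together with sampled occurrence counts, rather than the full set of reads.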
Cloning metallothionein gene in Zacco platypus and its potential as an exposure biomarker against cadmium.

PubMed

Lee, Sangwoo; Kim, Cheolmin; Kim, Jungkon; Kim, Woo-Keun; Shin, Hyun Suk; Lim, Eun-Suk; Lee, Jin Wuk; Kim, Sunmi; Kim, Ki-Tae; Lee, Sung-Kyu; Choi, Cheol Young; Choi, Kyungho

2015-07-01

Zacco platypus, the pale chub, is an indigenous freshwater fish of East Asia, including Korea, and has many useful characteristics as an indicator species for water pollution. While the utility of Z. platypus as an experimental species has been recognized, genetic-level information is very limited and warrants extensive research. Metallothionein (MT) is a widely used and well-known biomarker for heavy metal exposure in many experimental species. In the present study, we cloned MT in Z. platypus and evaluated its utility as a biomarker for metal exposure. For this purpose, we sequenced the complete complementary DNA (cDNA) of MT in Z. platypus and carried out phylogenetic analysis with its sequences. The transcription-level responses of the MT gene following exposure to CdCl2 were also assessed to validate the utility of this gene as an exposure biomarker. Analysis of the cDNA sequence of the MT gene demonstrated high conformity with those of other fish. MT messenger RNA (mRNA) expression and enzymatic MT content significantly increased following CdCl2 exposure in a concentration-dependent manner. The level of CdCl2 that resulted in significant MT changes in Z. platypus was within the range that was reported from other fish. The MT gene of Z. platypus sequenced in the present study can be used as a useful biomarker for heavy metal exposure in the aquatic environment of Korea and other countries where this freshwater fish species represents the ecosystem.

Annotation of Alternatively Spliced Proteins and Transcripts with Protein-Folding Algorithms and Isoform-Level Functional Networks.

PubMed

Li, Hongdong; Zhang, Yang; Guan, Yuanfang; Menon, Rajasree; Omenn, Gilbert S

2017-01-01

Tens of thousands of splice isoforms of proteins have been catalogued as predicted sequences from transcripts in humans and other species. Relatively few have been characterized biochemically or structurally. With the extensive development of protein bioinformatics, the characterization and modeling of isoform features, isoform functions, and isoform-level networks have advanced notably. Here we present applications of the I-TASSER family of algorithms for folding and functional predictions and the IsoFunc, MIsoMine, and Hisonet data resources for isoform-level analyses of network and pathway-based functional predictions and protein-protein interactions. Hopefully, predictions and insights from protein bioinformatics will stimulate many experimental validation studies.
Single-molecule analysis of DNA cross-links using nanopore technology

NASA Astrophysics Data System (ADS)

Wolna, Anna H.

The alpha-hemolysin (alpha-HL) protein ion channel is a potential next-generation sequencing platform that has been extensively used to study nucleic acids at a single-molecule level. After applying a potential across a lipid bilayer, the imbedded alpha-HL allows monitoring of the duration and current levels of DNA translocation and immobilization. Because this method does not require DNA amplification prior to sequencing, all the DNA damage present in the cell at any given time will be present during the sequencing experiment. The goal of this research is to determine if these damage sites give distinguishable current levels beyond those observed for the canonical nucleobases. Because DNA cross-links are one of the most prevalent types of DNA damage occurring in vivo, the blockage current levels were determined for thymine-dimers, guanine(C8)-thymine(N3) cross-links and platinum adducts. All of these cross-links give a different blockage current level compared to the undamaged strands when immobilized in the ion channel, and they all can easily translocate across the alpha-HL channel. Additionally, the alpha-HL nanopore technique presents a unique opportunity to study the effects of DNA cross-links, such as thymine-dimers, on the secondary structure of DNA G-quadruplexes folded from the human telomere sequence. Using this single-molecule nanopore technique we can detect subtle structural differences that cannot be easily addressed using conventional methods. The human telomere plays crucial roles in maintaining genome stability. In the presence of suitable cations, the repetitive 5'-TTAGGG human telomere sequence can fold into G-quadruplexes that adopt the hybrid fold in vivo. The telomere sequence is hypersensitive to UV-induced thymine-dimer (T=T) formation, and yet the presence of thymine dimers does not cause telomere shortening. 
The potential structural disruption and thermodynamic stability of the T=T-containing natural telomere sequences were studied to understand how this damage is tolerated in telomeric DNA. The alpha-HL experiments determined that T=Ts disrupt double-chain reversal loop formation but are well tolerated in edgewise and diagonal loops of the hybrid G-quadruplexes. These studies demonstrated the power of the alpha-HL ion channel to analyze DNA modifications and secondary structures at a single-molecule level.
Multifaceted biological insights from a draft genome sequence of the tobacco hornworm moth, Manduca sexta

PubMed Central

Kanost, Michael R.; Arrese, Estela L.; Cao, Xiaolong; Chen, Yun-Ru; Chellapilla, Sanjay; Goldsmith, Marian R; Grosse-Wilde, Ewald; Heckel, David G.; Herndon, Nicolae; Jiang, Haobo; Papanicolaou, Alexie; Qu, Jiaxin; Soulages, Jose L.; Vogel, Heiko; Walters, James; Waterhouse, Robert M.; Ahn, Seung-Joon; Almeida, Francisca C.; An, Chunju; Aqrawi, Peshtewani; Bretschneider, Anne; Bryant, William B.; Bucks, Sascha; Chao, Hsu; Chevignon, Germain; Christen, Jayne M.; Clarke, David F.; Dittmer, Neal T.; Ferguson, Laura C.F.; Garavelou, Spyridoula; Gordon, Karl H.J.; Gunaratna, Ramesh T.; Han, Yi; Hauser, Frank; He, Yan; Heidel-Fischer, Hanna; Hirsh, Ariana; Hu, Yingxia; Jiang, Hongbo; Kalra, Divya; Klinner, Christian; König, Christopher; Kovar, Christie; Kroll, Ashley R.; Kuwar, Suyog S.; Lee, Sandy L.; Lehman, Rüdiger; Li, Kai; Li, Zhaofei; Liang, Hanquan; Lovelace, Shanna; Lu, Zhiqiang; Mansfield, Jennifer H.; McCulloch, Kyle J.; Mathew, Tittu; Morton, Brian; Muzny, Donna M.; Neunemann, David; Ongeri, Fiona; Pauchet, Yannick; Pu, Ling-Ling; Pyrousis, Ioannis; Rao, Xiang-Jun; Redding, Amanda; Roesel, Charles; Sanchez-Gracia, Alejandro; Schaack, Sarah; Shukla, Aditi; Tetreau, Guillaume; Wang, Yang; Xiong, Guang-Hua; Traut, Walther; Walsh, Tom K.; Worley, Kim C.; Wu, Di; Wu, Wenbi; Wu, Yuan-Qing; Zhang, Xiufeng; Zou, Zhen; Zucker, Hannah; Briscoe, Adriana D.; Burmester, Thorsten; Clem, Rollie J.; Feyereisen, René; Grimmelikhuijzen, Cornelis J.P; Hamodrakas, Stavros J.; Hansson, Bill S.; Huguet, Elisabeth; Jermiin, Lars S.; Lan, Que; Lehman, Herman K.; Lorenzen, Marce; Merzendorfer, Hans; Michalopoulos, Ioannis; Morton, David B.; Muthukrishnan, Subbaratnam; Oakeshott, John G.; Palmer, Will; Park, Yoonseong; Passarelli, A. Lorena; Rozas, Julio; Schwartz, Lawrence M.; Smith, Wendy; Southgate, Agnes; Vilcinskas, Andreas; Vogt, Richard; Wang, Ping; Werren, John; Yu, Xiao-Qiang; Zhou, Jing-Jiang; Brown, Susan J.; Scherer, Steven E.; Richards, Stephen; Blissard, Gary W.

2016-01-01

Manduca sexta, known as the tobacco hornworm or Carolina sphinx moth, is a lepidopteran insect that is used extensively as a model system for research in insect biochemistry, physiology, neurobiology, development, and immunity. One important benefit of this species as an experimental model is its extremely large size, reaching more than 10 g in the larval stage. M. sexta larvae feed on solanaceous plants and thus must tolerate a substantial challenge from plant allelochemicals, including nicotine. We report the sequence and annotation of the M. sexta genome, and a survey of gene expression in various tissues and developmental stages. The Msex_1.0 genome assembly resulted in a total genome size of 419.4 Mbp. Repetitive sequences accounted for 25.8% of the assembled genome. The official gene set is comprised of 15,451 protein-coding genes, of which 2498 were manually curated. Extensive RNA-seq data from many tissues and developmental stages were used to improve gene models and for insights into gene expression patterns. Genome wide synteny analysis indicated a high level of macrosynteny in the Lepidoptera. Annotation and analyses were carried out for gene families involved in a wide spectrum of biological processes, including apoptosis, vacuole sorting, growth and development, structures of exoskeleton, egg shells, and muscle, vision, chemosensation, ion channels, signal transduction, neuropeptide signaling, neurotransmitter synthesis and transport, nicotine tolerance, lipid metabolism, and immunity. This genome sequence, annotation, and analysis provide an important new resource from a well-studied model insect species and will facilitate further biochemical and mechanistic experimental studies of many biological systems in insects. PMID:27522922
Sliding over the Blocks in Enzyme-Free RNA Copying – One-Pot Primer Extension in Ice

PubMed Central

Löffler, Philipp M. G.; Groen, Joost; Dörr, Mark; Monnard, Pierre-Alain

2013-01-01

Template-directed polymerization of RNA in the absence of enzymes is the basis for an information transfer in the ‘RNA-world’ hypothesis and in novel nucleic acid based technology. Previous investigations established that only cytidine rich strands are efficient templates in bulk aqueous solutions while a few specific sequences completely block the extension of hybridized primers. We show that a eutectic water/ice system can support Pb2+/Mg2+-ion catalyzed extension of a primer across such sequences, i.e. AA, AU and AG, in a one-pot synthesis. Using mixtures of imidazole activated nucleotide 5′-monophosphates, the two first “blocking” residues could be passed during template-directed polymerization, i.e., formation of triply extended products containing a high fraction of faithful copies was demonstrated. Across the AG sequence, a mismatch sequence was formed in similar amounts to the correct product due to U·G wobble pairing. Thus, the template-directed extension occurs both across pyrimidine and purine rich sequences and insertions of pyrimidines did not inhibit the subsequent insertions. Products were mainly formed with 2′-5′-phosphodiester linkages, however, the abundance of 3′–5′-linkages was higher than previously reported for pyrimidine insertions. When enzyme-free, template-directed RNA polymerization is performed in a eutectic water ice environment, various intrinsic reaction limitations observed in bulk solution can then be overcome. PMID:24058695
Cenozoic global sea level, sequences, and the New Jersey transect: Results from coastal plain and continental slope drilling

USGS Publications Warehouse

Miller, K.G.; Mountain, Gregory S.; Browning, J.V.; Kominz, M.; Sugarman, P.J.; Christie-Blick, N.; Katz, M.E.; Wright, J.D.

1998-01-01

The New Jersey Sea Level Transect was designed to evaluate the relationships among global sea level (eustatic) change, unconformity-bounded sequences, and variations in subsidence, sediment supply, and climate on a passive continental margin. By sampling and dating Cenozoic strata from coastal plain and continental slope locations, we show that sequence boundaries correlate (within ±0.5 myr) regionally (onshore-offshore) and interregionally (New Jersey-Alabama-Bahamas), implicating a global cause. Sequence boundaries correlate with δ18O increases for at least the past 42 myr, consistent with an ice volume (glacioeustatic) control, although a causal relationship is not required because of uncertainties in ages and correlations. Evidence for a causal connection is provided by preliminary Miocene data from slope Site 904 that directly link δ18O increases with sequence boundaries. We conclude that variation in the size of ice sheets has been a primary control on the formation of sequence boundaries since ~42 Ma. We speculate that prior to this, the growth and decay of small ice sheets caused small-amplitude sea level changes (<20 m) in this supposedly ice-free world because Eocene sequence boundaries also appear to correlate with minor δ18O increases. Subsidence estimates (backstripping) indicate amplitudes of short-term (million-year scale) lowerings that are consistent with estimates derived from δ18O studies (25-50 m in the Oligocene-middle Miocene and 10-20 m in the Eocene) and a long-term lowering of 150-200 m over the past 65 myr, consistent with estimates derived from volume changes on mid-ocean ridges. Although our results are consistent with the general number and timing of Paleocene to middle Miocene sequences published by workers at Exxon Production Research Company, our estimates of sea level amplitudes are substantially lower than theirs. 
Lithofacies patterns within sequences follow repetitive, predictable patterns: (1) coastal plain sequences consist of basal transgressive sands overlain by regressive highstand silts and quartz sands; and (2) although slope lithofacies variations are subdued, reworked sediments constitute lowstand deposits, causing the strongest, most extensive seismic reflections. Despite a primary eustatic control on sequence boundaries, New Jersey sequences were also influenced by changes in tectonics, sediment supply, and climate. During the early to middle Eocene, low siliciclastic and high pelagic input associated with warm climates resulted in widespread carbonate deposition and thin sequences. Late middle Eocene and earliest Oligocene cooling events curtailed carbonate deposition in the coastal plain and slope, respectively, resulting in a switch to siliciclastic sedimentation. In onshore areas, Oligocene sequences are thin owing to low siliciclastic and pelagic input, and their distribution is patchy, reflecting migration or progradation of depocenters; in contrast, Miocene onshore sequences are thicker, reflecting increased sediment supply, and they are more complete downdip owing to simple tectonics. We conclude that the New Jersey margin provides a natural laboratory for unraveling complex interactions of eustasy, tectonics, changes in sediment supply, and climate change.
Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

PubMed

Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

2014-07-04

Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between the mitochondrial genomes of pepper and tobacco, both members of the Solanaceae, revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In the comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, a large number of small sequence segments that were absent from, or found at different locations in, the Jeju mitochondrial genome were combined together. The incorporation of repeats and the overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in the evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. 
Although a large portion of the sequence context was shared by the mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes were located on the edges of highly rearranged CMS-specific DNA regions and near repeat sequences. These characteristics were also detected among CMS-associated genes in other species, implying that a common mechanism might be involved in the evolution of CMS-associated genes.
The maize stripe virus major noncapsid protein messenger RNA transcripts contain heterogeneous leader sequences at their 5' termini.

PubMed

Huiet, L; Feldstein, P A; Tsai, J H; Falk, B W

1993-12-01

Primer extension analyses and a PCR-based cloning strategy were used to identify and characterize 5' nucleotide sequences on the maize stripe virus (MStV) RNA4 mRNA transcripts encoding the major noncapsid protein (NCP). Direct RNA sequence analysis by primer extension showed that the NCP mRNA transcripts had 10-15 nucleotides beyond the 5' terminus of the MStV RNA4 nucleotide sequence. MStV genomic RNAs isolated from ribonucleoprotein particles (RNPs) lacked the additional 5' nucleotides. cDNA clones representing the 5' region of the mRNA transcripts were constructed, and the nucleotide sequences of the 5' regions were determined for 16 clones. Each was found to have a distinct 10-15 nucleotide sequence immediately 5' of the MStV RNA4 sequence. Eleven of 16 clones had the correct MStV RNA4 5' nucleotide sequence, while five showed minor variations at or near the 5' most MStV RNA4 nucleotide. These characteristics show strong similarities to other viral mRNA transcripts which are synthesized by cap snatching.
Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides

NASA Astrophysics Data System (ADS)

McMillen, Chelsea L.; Wright, Patience M.; Cassady, Carolyn J.

2016-05-01

Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence-informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charge-localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence-informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence-informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.
Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides.

PubMed

McMillen, Chelsea L; Wright, Patience M; Cassady, Carolyn J

2016-05-01

Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence-informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charge-localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence-informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence-informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.
Dendrites, deep learning, and sequences in the hippocampus.

PubMed

Bhalla, Upinder S

2017-10-12

The hippocampus places us both in time and space. It does so over remarkably large spans: milliseconds to years, and centimeters to kilometers. This works for sensory representations, for memory, and for behavioral context. How does it fit in such wide ranges of time and space scales, and keep order among the many dimensions of stimulus context? A key organizing principle for a wide sweep of scales and stimulus dimensions is that of order in time, or sequences. Sequences of neuronal activity are ubiquitous in sensory processing, in motor control, in planning actions, and in memory. Against this strong evidence for the phenomenon, there are currently more models than definite experiments about how the brain generates ordered activity. The flip side of sequence generation is discrimination. Discrimination of sequences has been extensively studied at the behavioral, systems, and modeling level, but again physiological mechanisms are fewer. It is against this backdrop that I discuss two recent developments in neural sequence computation, that at face value share little beyond the label "neural." These are dendritic sequence discrimination, and deep learning. One derives from channel physiology and molecular signaling, the other from applied neural network theory - apparently extreme ends of the spectrum of neural circuit detail. I suggest that each of these topics has deep lessons about the possible mechanisms, scales, and capabilities of hippocampal sequence computation. © 2017 Wiley Periodicals, Inc.
PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural, and network features in a machine-learning framework.

PubMed

Song, Jiangning; Li, Fuyi; Takemoto, Kazuhiro; Haffari, Gholamreza; Akutsu, Tatsuya; Chou, Kuo-Chen; Webb, Geoffrey I

2018-04-14

Determining the catalytic residues in an enzyme is critical to understanding the relationship between protein sequence, structure, and function, and to enhancing our ability to design novel enzymes and their inhibitors. Although many enzymes have been sequenced, and their primary and tertiary structures determined, experimental methods for enzyme functional characterization lag behind. Because experimental methods used for identifying catalytic residues are resource- and labor-intensive, computational approaches have considerable value and are highly desirable for their ability to complement experimental studies in identifying catalytic residues and helping to bridge the sequence-structure-function gap. In this study, we describe a new computational method called PREvaIL for predicting enzyme catalytic residues. This method was developed by leveraging a comprehensive set of informative features extracted from multiple levels, including sequence, structure, and residue-contact network, in a random forest machine-learning framework. Extensive benchmarking experiments on eight different datasets based on 10-fold cross-validation and independent tests, as well as side-by-side performance comparisons with seven modern sequence- and structure-based methods, showed that PREvaIL achieved competitive predictive performance, with an area under the receiver operating characteristic curve and area under the precision-recall curve ranging from 0.896 to 0.973 and from 0.294 to 0.523, respectively. We demonstrated that this method was able to capture useful signals arising from different levels, leveraging these distinct but useful types of features and allowing us to significantly improve the performance of catalytic residue prediction. We believe that this new method can be utilized as a valuable tool for both understanding the complex sequence-structure-function relationships of proteins and facilitating the characterization of novel enzymes lacking functional annotations.
Copyright © 2018 Elsevier Ltd. All rights reserved.
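The two metrics quoted in the abstract above can be illustrated with a minimal Python sketch. It is not taken from PREvaIL: the function names, the toy labels, and the choice of step-wise average precision as the estimate of the area under the precision-recall curve are all assumptions made for illustration. When positives are rare, as catalytic residues presumably are among all residues, the precision-recall area is typically much lower than the ROC area, which would be consistent with the ranges reported.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Hypothetical toy data: 1 = catalytic residue, 0 = non-catalytic residue.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
roc_area = auroc(toy_labels, toy_scores)
pr_area = average_precision(toy_labels, toy_scores)
```

In a cross-validation setting such as the 10-fold scheme mentioned above, these functions would be applied to the held-out predictions of each fold.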
Multivariate analysis of ultrasound-recorded dorsal strain sequences: Investigation of dynamic neck extensions in women with chronic whiplash associated disorders.

PubMed

Peolsson, Anneli; Peterson, Gunnel; Trygg, Johan; Nilsson, David

2016-08-03

Whiplash Associated Disorders (WAD) refers to the multifaceted and chronic burden that is common after a whiplash injury. Tools to assist in the diagnosis of WAD and an increased understanding of neck muscle behaviour are needed. We examined the multilayer dorsal neck muscle behaviour in nine women with chronic WAD versus healthy controls during the entire sequence of a dynamic low-loaded neck extension exercise, which was recorded using real-time ultrasound movies with high frame rates. Principal component analysis and orthogonal partial least squares were used to analyse mechanical muscle strain (deformation in elongation and shortening). The WAD group showed more shortening during the neck extension phase in the trapezius muscle and during both the neck extension and the return to neutral phase in the multifidus muscle. For the first time, a novel non-invasive method is presented that is capable of detecting altered dorsal muscle strain in women with WAD during an entire exercise sequence. This method may be a breakthrough for the future diagnosis and treatment of WAD.
Multivariate analysis of ultrasound-recorded dorsal strain sequences: Investigation of dynamic neck extensions in women with chronic whiplash associated disorders

PubMed Central

Peolsson, Anneli; Peterson, Gunnel; Trygg, Johan; Nilsson, David

2016-01-01

Whiplash Associated Disorders (WAD) refers to the multifaceted and chronic burden that is common after a whiplash injury. Tools to assist in the diagnosis of WAD and an increased understanding of neck muscle behaviour are needed. We examined the multilayer dorsal neck muscle behaviour in nine women with chronic WAD versus healthy controls during the entire sequence of a dynamic low-loaded neck extension exercise, which was recorded using real-time ultrasound movies with high frame rates. Principal component analysis and orthogonal partial least squares were used to analyse mechanical muscle strain (deformation in elongation and shortening). The WAD group showed more shortening during the neck extension phase in the trapezius muscle and during both the neck extension and the return to neutral phase in the multifidus muscle. For the first time, a novel non-invasive method is presented that is capable of detecting altered dorsal muscle strain in women with WAD during an entire exercise sequence. This method may be a breakthrough for the future diagnosis and treatment of WAD. PMID:27484361
Multivariate analysis of ultrasound-recorded dorsal strain sequences: Investigation of dynamic neck extensions in women with chronic whiplash associated disorders

NASA Astrophysics Data System (ADS)

Peolsson, Anneli; Peterson, Gunnel; Trygg, Johan; Nilsson, David

2016-08-01

Whiplash Associated Disorders (WAD) refers to the multifaceted and chronic burden that is common after a whiplash injury. Tools to assist in the diagnosis of WAD and an increased understanding of neck muscle behaviour are needed. We examined the multilayer dorsal neck muscle behaviour in nine women with chronic WAD versus healthy controls during the entire sequence of a dynamic low-loaded neck extension exercise, which was recorded using real-time ultrasound movies with high frame rates. Principal component analysis and orthogonal partial least squares were used to analyse mechanical muscle strain (deformation in elongation and shortening). The WAD group showed more shortening during the neck extension phase in the trapezius muscle and during both the neck extension and the return to neutral phase in the multifidus muscle. For the first time, a novel non-invasive method is presented that is capable of detecting altered dorsal muscle strain in women with WAD during an entire exercise sequence. This method may be a breakthrough for the future diagnosis and treatment of WAD.
Sturgeon conservation genomics: SNP discovery and validation using RAD sequencing.

PubMed

Ogden, R; Gharbi, K; Mugue, N; Martinsohn, J; Senn, H; Davey, J W; Pourkazemi, M; McEwing, R; Eland, C; Vidotto, M; Sergeev, A; Congiu, L

2013-06-01

Caviar-producing sturgeons belonging to the genus Acipenser are considered to be one of the most endangered species groups in the world. Continued overfishing in spite of increasing legislation, zero catch quotas and extensive aquaculture production have led to the collapse of wild stocks across Europe and Asia. The evolutionary relationships among Adriatic, Russian, Persian and Siberian sturgeons are complex because of past introgression events and remain poorly understood. Conservation management, traceability and enforcement suffer from a lack of appropriate DNA markers for the genetic identification of sturgeon at the species, population and individual level. This study employed RAD sequencing to discover and characterize single nucleotide polymorphism (SNP) DNA markers for use in sturgeon conservation in these four tetraploid species over three biological levels, using a single sequencing lane. Four population meta-samples and eight individual samples from one family were barcoded separately before sequencing. Analysis of 14.4 Gb of paired-end RAD data focused on the identification of SNPs in the paired-end contig, with subsequent in silico and empirical validation of candidate markers. Thousands of putatively informative markers were identified including, for the first time, SNPs that show population-wide differentiation between Russian and Persian sturgeons, representing an important advance in our ability to manage these cryptic species. The results highlight the challenges of genotyping-by-sequencing in polyploid taxa, while establishing the potential genetic resources for developing a new range of caviar traceability and enforcement tools. © 2013 John Wiley & Sons Ltd.
Preferential access to genetic information from endogenous hominin ancient DNA and accurate quantitative SNP-typing via SPEX

PubMed Central

Brotherton, Paul; Sanchez, Juan J.; Cooper, Alan; Endicott, Phillip

2010-01-01

The analysis of targeted genetic loci from ancient, forensic and clinical samples is usually built upon polymerase chain reaction (PCR)-generated sequence data. However, many studies have shown that PCR amplification from poor-quality DNA templates can create sequence artefacts at significant levels. With hominin (human and other hominid) samples, the pervasive presence of highly PCR-amplifiable human DNA contaminants in the vast majority of samples can lead to the creation of recombinant hybrids and other non-authentic artefacts. The resulting PCR-generated sequences can then be difficult, if not impossible, to authenticate. In contrast, single primer extension (SPEX)-based approaches can genotype single nucleotide polymorphisms from ancient fragments of DNA as accurately as modern DNA. A single SPEX-type assay can amplify just one of the duplex DNA strands at target loci and generate a multi-fold depth-of-coverage, with non-authentic recombinant hybrids reduced to undetectable levels. Crucially, SPEX-type approaches can preferentially access genetic information from damaged and degraded endogenous ancient DNA templates over modern human DNA contaminants. The development of SPEX-type assays offers the potential for highly accurate, quantitative genotyping from ancient hominin samples. PMID:19864251
Barcode extension for analysis and reconstruction of structures

NASA Astrophysics Data System (ADS)

Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L.; Gootenberg, Jonathan S.; Yin, Peng

2017-03-01

Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.
Barcode extension for analysis and reconstruction of structures.

PubMed

Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng

2017-03-13

Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.
Barcode extension for analysis and reconstruction of structures

PubMed Central

Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng

2017-01-01

Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures. PMID:28287117
Overcoming Sequence Misalignments with Weighted Structural Superposition

PubMed Central

Khazanov, Nickolay A.; Damm-Ganamet, Kelly L.; Quang, Daniel X.; Carlson, Heather A.

2012-01-01

An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD’s robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, SSM, CE, and DaliLite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structure-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs. PMID:22733542
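The idea of a Gaussian-weighted RMSD can be sketched in a few lines of Python. This is a minimal illustration, not the HwRMSD implementation: the weight form exp(-d²/c), the scaling constant c = 5.0 Å², the function names, and the toy coordinates are assumptions that the abstract does not state, and the sketch scores an existing overlay without performing the superposition or the seed-extension step.

```python
import math


def gaussian_weights(coords_a, coords_b, c=5.0):
    """Weight each residue pair by exp(-d^2 / c); close pairs get weights near 1.

    The functional form and the default c are assumptions for illustration.
    """
    weights = []
    for a, b in zip(coords_a, coords_b):
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        weights.append(math.exp(-d2 / c))
    return weights


def weighted_rmsd(coords_a, coords_b, weights):
    """Weighted RMSD over paired coordinates that are already superposed."""
    num = 0.0
    for a, b, w in zip(coords_a, coords_b, weights):
        num += w * sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(num / sum(weights))


def pairs_within(coords_a, coords_b, cutoff):
    """Count paired residues whose distance is at most `cutoff` angstroms."""
    return sum(1 for a, b in zip(coords_a, coords_b) if math.dist(a, b) <= cutoff)


# Hypothetical C-alpha coordinates: two well-matched pairs and one outlier.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, 0.0, 6.0)]
w = gaussian_weights(structure_a, structure_b)
standard = weighted_rmsd(structure_a, structure_b, [1.0, 1.0, 1.0])
weighted = weighted_rmsd(structure_a, structure_b, w)
```

With these toy coordinates the outlier pair dominates the standard RMSD but contributes almost nothing to the weighted value. That is the property that would let a weighted overlay focus on the regions that superpose well, and `pairs_within` mirrors the 1 Å and 2 Å counts used in the comparison above.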

Extensive concerted evolution of rice paralogs and the road to regaining independence.

PubMed

Wang, Xiyin; Tang, Haibao; Bowers, John E; Feltus, Frank A; Paterson, Andrew H

2007-11-01

Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the approximately 0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, approximately 8% of japonica paralogs produced 5-7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while approximately 70-MY-old "paleologs" resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice-sorghum divergence approximately 41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity--that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5-7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization.
Cloning and characterization of a Prevotella melaninogenica hemolysin.

PubMed Central

Allison, H E; Hillman, J D

1997-01-01

Hemolysins have been proven to be important virulence factors in many medically relevant pathogenic organisms. Their production has also been implicated in the etiology of periodontal disease. Hemolytic strain 361B of Prevotella melaninogenica, a putative etiologic agent of periodontal disease, was used in this study. The cloning, sequencing, and characterization of phyA, the structural gene for a P. melaninogenica hemolysin, is described. No extensive sequence homology could be identified between phyA and any reported sequence at either the nucleotide or amino acid level. As predicted from sequence analysis, this gene produces a 39-kDa protein which has hemolytic activity as measured by zymogram analysis. Unlike many Ca2+-dependent bacterial hemolysins, both the cloned and native PhyA proteins were enhanced by the presence of EDTA in a dose-dependent fashion with 40 mM EDTA allowing maximum activity. Ca2+ and Mg2+ were found to be inhibitory. The hemolytic activity also was found to have a dose-dependent endpoint. Through recovery of hemolytic activity from a spent reaction, this endpoint was shown to be the result of end product inhibition. This is the first report describing the cloning and sequencing of a gene from P. melaninogenica. PMID:9199448
Cloning and characterization of a Prevotella melaninogenica hemolysin.

PubMed

Allison, H E; Hillman, J D

1997-07-01

Hemolysins have been proven to be important virulence factors in many medically relevant pathogenic organisms. Their production has also been implicated in the etiology of periodontal disease. Hemolytic strain 361B of Prevotella melaninogenica, a putative etiologic agent of periodontal disease, was used in this study. The cloning, sequencing, and characterization of phyA, the structural gene for a P. melaninogenica hemolysin, is described. No extensive sequence homology could be identified between phyA and any reported sequence at either the nucleotide or amino acid level. As predicted from sequence analysis, this gene produces a 39-kDa protein which has hemolytic activity as measured by zymogram analysis. Unlike many Ca2+-dependent bacterial hemolysins, both the cloned and native PhyA proteins were enhanced by the presence of EDTA in a dose-dependent fashion with 40 mM EDTA allowing maximum activity. Ca2+ and Mg2+ were found to be inhibitory. The hemolytic activity also was found to have a dose-dependent endpoint. Through recovery of hemolytic activity from a spent reaction, this endpoint was shown to be the result of end product inhibition. This is the first report describing the cloning and sequencing of a gene from P. melaninogenica.
Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata.

PubMed

Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun

2012-01-01

The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.
PlantTFDB: a comprehensive plant transcription factor database

PubMed Central

Guo, An-Yuan; Chen, Xin; Gao, Ge; Zhang, He; Zhu, Qi-Hui; Liu, Xiao-Chuan; Zhong, Ying-Fu; Gu, Xiaocheng; He, Kun; Luo, Jingchu

2008-01-01

Transcription factors (TFs) play key roles in controlling gene expression. Systematic identification and annotation of TFs, followed by construction of TF databases may serve as useful resources for studying the function and evolution of transcription factors. We developed a comprehensive plant transcription factor database PlantTFDB (http://planttfdb.cbi.pku.edu.cn), which contains 26 402 TFs predicted from 22 species, including five model organisms with available whole genome sequence and 17 plants with available EST sequences. To provide comprehensive information for those putative TFs, we made extensive annotation at both family and gene levels. A brief introduction and key references were presented for each family. Functional domain information and cross-references to various well-known public databases were available for each identified TF. In addition, we predicted putative orthologs of those TFs among the 22 species. PlantTFDB has a simple interface to allow users to search the database by IDs or free texts, to make sequence similarity search against TFs of all or individual species, and to download TF sequences for local analysis. PMID:17933783
Development of novel types of plastid transformation vectors and evaluation of factors controlling expression.

PubMed

Herz, Stefan; Füssl, Monika; Steiger, Sandra; Koop, Hans-Ulrich

2005-12-01

Two new vector types for plastid transformation were developed and uidA reporter gene expression was compared to standard transformation vectors. The first vector type does not contain any plastid promoter; instead, it relies on extension of existing plastid operons and was therefore named "operon-extension" vector. When a strongly expressed plastid operon like psbA was extended by the reporter gene with this vector type, the expression level was superior to that of a standard vector under control of the 16S rRNA promoter. Different insertion sites, promoters and 5'-UTRs were analysed for their effect on reporter gene expression with standard and operon-extension vectors. The 5'-UTR of phage T7 gene 10 in combination with a modified N-terminus was found to yield the highest expression levels. Expression levels were also strongly dependent on external factors like plant or leaf age or light intensity. In the second vector type, named "split" plastid transformation vector, modules of the expression cassette were distributed on two separate vectors. Upon co-transformation of plastids with these vectors, the complete expression cassette became inserted into the plastome. This result can be explained by successive co-integration of the split vectors and final loop-out recombination of the duplicated sequences. The split vector concept was validated with different vector pairs.
Selection of the initial design for the two-stage continual reassessment method.

PubMed

Jia, Xiaoyu; Ivanova, Anastasia; Lee, Shing M

2017-01-01

In the two-stage continual reassessment method (CRM), model-based dose escalation is preceded by a pre-specified escalating sequence starting from the lowest dose level. This is appealing to clinicians because it allows a sufficient number of patients to be assigned to each of the lower dose levels before escalating to higher dose levels. While a theoretical framework to build the two-stage CRM has been proposed, the selection of the initial dose-escalating sequence, generally referred to as the initial design, remains arbitrary, either by specifying cohorts of three patients or by trial and error through extensive simulations. Motivated by a currently ongoing oncology dose-finding study for which clinicians explicitly stated their desire to assign at least one patient to each of the lower dose levels, we proposed a systematic approach for selecting the initial design for the two-stage CRM. The initial design obtained using the proposed algorithm yields better operating characteristics than a cohort-of-three initial design with a calibrated CRM. The proposed algorithm simplifies the selection of the initial design for the two-stage CRM. Moreover, initial designs to be used as a reference for planning a two-stage CRM are provided.
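The notion of an initial design can be made concrete with a small Python sketch. It is illustrative only and does not reproduce the selection algorithm proposed in the study: the function names and the example cohort sizes are hypothetical, and the rule that the trial hands over to the model-based stage at the first observed toxicity is an assumption about how two-stage designs are commonly run, not something the abstract states.

```python
def build_initial_design(cohort_sizes):
    """Expand per-dose cohort sizes into a non-decreasing dose sequence.

    cohort_sizes[i] is the number of patients planned at dose level i + 1.
    """
    design = []
    for level, size in enumerate(cohort_sizes, start=1):
        design.extend([level] * size)
    return design


def run_first_stage(initial_design, toxicity_outcomes):
    """Follow the initial design until the first toxicity is observed.

    Returns (assigned_doses, switched). `switched` is True when a toxicity
    occurred, at which point dosing would pass to the model-based stage
    (assumed switching rule, see the lead-in).
    """
    assigned = []
    for dose, toxic in zip(initial_design, toxicity_outcomes):
        assigned.append(dose)
        if toxic:
            return assigned, True
    return assigned, False


# Hypothetical example: one patient at each of the two lowest doses, then three.
design = build_initial_design([1, 1, 3])
doses, switched = run_first_stage(design, [False, False, True, False, False])
```

A cohort-of-three initial design would correspond to `build_initial_design([3, 3, 3])`, whereas the clinicians' request for at least one patient per lower dose level corresponds to cohort sizes that may be as small as one.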
Facilitating protein solubility by use of peptide extensions

DOEpatents

Freimuth, Paul I; Zhang, Yian-Biao; Howitt, Jason

2013-09-17

Expression vectors for expression of a protein or polypeptide of interest as a fusion product composed of the protein or polypeptide of interest fused at one terminus to a solubility enhancing peptide extension are provided. Sequences encoding the peptide extensions are provided. The invention further comprises antibodies which bind specifically to one or more of the solubility enhancing peptide extensions.
Recognition of the Xenopus ribosomal core promoter by the transcription factor xUBF involves multiple HMG box domains and leads to an xUBF interdomain interaction.

PubMed

Leblanc, B; Read, C; Moss, T

1993-02-01

The interaction of the ribosomal transcription factor xUBF with the RNA polymerase I core promoter of Xenopus laevis has been studied both at the DNA and protein levels. It is shown that a single xUBF-DNA complex forms over the 40S initiation site (+1) and involves at least the DNA sequences between -20 and +60 bp. DNA sequences upstream of +10 and downstream of +18 are each sufficient to direct complex formation independently. HMG box 1 of xUBF independently recognizes the sequences -20 to -1 and +1 to +22 and the addition of the N-terminal dimerization domain to HMG box 1 stabilizes its interaction with these sequences approximately 10-fold. HMG boxes 2/3 interact with the DNA downstream of +22 and can independently position xUBF across the initiation site. The C-terminal segment of xUBF, HMG boxes 4, 5 or the acidic domain, directly or indirectly interact with HMG box 1, making the core promoter sequences between -11 and -15 hypersensitive to DNase. This interaction also requires the DNA sequences between +17 and +32, i.e. the HMG box 2/3 binding site. The data suggest extensive folding of the core promoter within the xUBF complex.
A bio-inspired system for spatio-temporal recognition in static and video imagery

NASA Astrophysics Data System (ADS)

Khosla, Deepak; Moore, Christopher K.; Chelian, Suhas

2007-04-01

This paper presents a bio-inspired method for spatio-temporal recognition in static and video imagery. It builds upon and extends our previous work on a bio-inspired Visual Attention and object Recognition System (VARS). The VARS approach locates and recognizes objects in a single frame. This work presents two extensions of VARS. The first extension is a Scene Recognition Engine (SCE) that learns to recognize spatial relationships between objects that compose a particular scene category in static imagery. This could be used for recognizing the category of a scene, e.g., office vs. kitchen scene. The second extension is the Event Recognition Engine (ERE) that recognizes spatio-temporal sequences or events in sequences. This extension uses a working memory model to recognize events and behaviors in video imagery by maintaining and recognizing ordered spatio-temporal sequences. The working memory model is based on an ARTSTORE neural network that combines an ART-based neural network with a cascade of sustained temporal order recurrent (STORE) neural networks. A series of Default ARTMAP classifiers ascribes event labels to these sequences. Our preliminary studies have shown that this extension is robust to variations in an object's motion profile. We evaluated the performance of the SCE and ERE on real datasets. The SCE module was tested on a visual scene classification task using the LabelMe dataset. The ERE was tested on real world video footage of vehicles and pedestrians in a street scene. Our system is able to recognize the events in this footage involving vehicles and pedestrians.
Quantitative analysis of ribosome–mRNA complexes at different translation stages

PubMed Central

Shirokikh, Nikolay E.; Alkalaeva, Elena Z.; Vassilenko, Konstantin S.; Afonina, Zhanna A.; Alekhina, Olga M.; Kisselev, Lev L.; Spirin, Alexander S.

2010-01-01

Inhibition of primer extension by ribosome–mRNA complexes (toeprinting) is a proven and powerful technique for studying mechanisms of mRNA translation. Here we have assayed an advanced toeprinting approach that employs fluorescently labeled DNA primers, followed by capillary electrophoresis utilizing standard instruments for sequencing and fragment analysis. We demonstrate that this improved technique is not merely fast and cost-effective, but also brings the primer extension inhibition method up to the next level. The electrophoretic pattern of the primer extension reaction can be characterized with a precision unattainable by the common toeprint analysis utilizing radioactive isotopes. This method allows us to detect and quantify stable ribosomal complexes at all stages of translation, including initiation, elongation and termination, generated during the complete translation process in both the in vitro reconstituted translation system and the cell lysate. We also point out the unique advantages of this new methodology, including the ability to assay sites of the ribosomal complex assembly on several mRNA species in the same reaction mixture. PMID:19910372
Exposure dating and glacial reconstruction at Mt. Field, Tasmania, Australia, identifies MIS 3 and MIS 2 glacial advances and climatic variability

NASA Astrophysics Data System (ADS)

Mackintosh, A. N.; Barrows, T. T.; Colhoun, E. A.; Fifield, L. K.

2006-05-01

Tasmania is important for understanding Quaternary climatic change because it is one of only three areas that experienced extensive mid-latitude Southern Hemisphere glaciation and it lies in a dominantly oceanic environment at a great distance from Northern Hemisphere ice sheet feedbacks. We applied exposure dating using 36Cl to an extensive sequence of moraines from the last glacial at Mt. Field, Tasmania. Glaciers advanced at 41-44 ka during Marine Oxygen Isotope Stage (MIS) 3 and at 18 ka during MIS 2. Both advances occurred in response to an ELA lowering greater than 1100 m below the present-day mean summer freezing level, and a possible temperature reduction of 7-8°C. Deglaciation was rapid and complete by ca. 16 ka. The overall story emerging from studies of former Tasmanian glaciers is that the MIS 2 glaciation was of limited extent and that some glaciers were more extensive during earlier parts of the last glacial cycle.
Probabilistic topic modeling for the analysis and classification of genomic sequences

PubMed Central

2015-01-01

Background: Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, representing a well-defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mer representation and text mining techniques. Methods: The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied to DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions: We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra-short sequences and it exhibits a smooth decrease of performance, at every taxonomic level, when the sequence length is decreased. PMID:25916734
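The k-mer representation mentioned in this abstract can be illustrated with a minimal sketch. It only shows how a DNA sequence might be turned into a bag of fixed-length k-mers (the "words" of a document); it is not the authors' implementation, the topic-model (LDA) training step is omitted, and the function names `kmer_counts` and `kmer_document` are hypothetical.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count overlapping k-mers in a DNA sequence.

    Windows containing symbols other than A, C, G, T are skipped
    (an assumption made for this sketch).
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_document(sequence, k):
    """Render a sequence as a space-separated 'document' of k-mer words."""
    counts = kmer_counts(sequence, k)
    words = []
    for kmer in sorted(counts):
        words.extend([kmer] * counts[kmer])
    return " ".join(words)


if __name__ == "__main__":
    print(kmer_counts("ACGTACGT", 4))
    print(kmer_document("ACGTACGT", 4))
```

A document produced this way could then be passed to any standard bag-of-words topic-modelling tool; which tool and which value of k the authors used is not stated in the abstract.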
Optimal packaging of FIV genomic RNA depends upon a conserved long-range interaction and a palindromic sequence within gag.

PubMed

Rizvi, Tahir A; Kenyon, Julia C; Ali, Jahabar; Aktar, Suriya J; Phillip, Pretty S; Ghazawi, Akela; Mustafa, Farah; Lever, Andrew M L

2010-10-15

The feline immunodeficiency virus (FIV) is a lentivirus that is related to human immunodeficiency virus (HIV), causing a similar pathology in cats. It is a potential small animal model for AIDS and the FIV-based vectors are also being pursued for human gene therapy. Previous studies have mapped the FIV packaging signal (ψ) to two or more discontinuous regions within the 5' 511 nt of the genomic RNA and structural analyses have determined its secondary structure. The 5' and 3' sequences within ψ region interact through extensive long-range interactions (LRIs), including a conserved heptanucleotide interaction between R/U5 and gag. Other secondary structural elements identified include a conserved 150 nt stem-loop (SL2) and a small palindromic stem-loop within gag open reading frame that might act as a viral dimerization initiation site. We have performed extensive mutational analysis of these sequences and structures and ascertained their importance in FIV packaging using a trans-complementation assay. Disrupting the conserved heptanucleotide LRI to prevent base pairing between R/U5 and gag reduced packaging by 2.8-5.5 fold. Restoration of pairing using an alternative, non-wild type (wt) LRI sequence restored RNA packaging and propagation to wt levels, suggesting that it is the structure of the LRI, rather than its sequence, that is important for FIV packaging. Disrupting the palindrome within gag reduced packaging by 1.5-3-fold, but substitution with a different palindromic sequence did not restore packaging completely, suggesting that the sequence of this region as well as its palindromic nature is important. Mutation of individual regions of SL2 did not have a pronounced effect on FIV packaging, suggesting that either it is the structure of SL2 as a whole that is necessary for optimal packaging, or that there is redundancy within this structure. The mutational analysis presented here has further validated the previously predicted RNA secondary structure of FIV ψ. 
Copyright © 2010 Elsevier Ltd. All rights reserved.
Construction of random sheared fosmid library from Chinese cabbage and its use for Brassica rapa genome sequencing project.

PubMed

Park, Tae-Ho; Park, Beom-Seok; Kim, Jin-A; Hong, Joon Ki; Jin, Mina; Seol, Young-Joo; Mun, Jeong-Hwan

2011-01-01

As a part of the Multinational Genome Sequencing Project of Brassica rapa, linkage groups R9 and R3 were sequenced using a bacterial artificial chromosome (BAC) by BAC strategy. The current physical contigs are expected to cover approximately 90% of the euchromatin of both chromosomes. As the project progresses, BAC selection for sequence extension becomes more limited because BAC libraries are restriction enzyme-specific. To support the project, a random sheared fosmid library was constructed. The library consists of 97536 clones with an average insert size of approximately 40 kb, corresponding to seven genome equivalents, assuming a Chinese cabbage genome size of 550 Mb. The library was screened with primers designed at the ends of sequences at nine scaffold gaps where BAC clones cannot be selected to extend the physical contigs. The selected positive clones were end-sequenced to check the overlap between the fosmid clones and the adjacent BAC clones. Nine fosmid clones were selected and fully sequenced. The sequences revealed two completed gap fillings and seven sequence extensions, which can be used for further selection of BAC clones, confirming that the fosmid library will facilitate the sequence completion of B. rapa. Copyright © 2011. Published by Elsevier Ltd.
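The "seven genome equivalents" figure follows from simple arithmetic on the numbers given in the abstract (97536 clones, about 40 kb average insert, 550 Mb genome). The sketch below reproduces that calculation; the second function applies the standard Clarke-Carbon formula for the chance that a given locus is represented in a random library, which is an addition for illustration and is not mentioned in the abstract. Function names are hypothetical.

```python
def genome_equivalents(n_clones, insert_bp, genome_bp):
    """Library coverage expressed as multiples of the genome size."""
    return n_clones * insert_bp / genome_bp


def locus_representation_probability(n_clones, insert_bp, genome_bp):
    """Clarke-Carbon estimate: P = 1 - (1 - insert/genome) ** N.

    Assumes clones are drawn uniformly at random from the genome.
    """
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


if __name__ == "__main__":
    clones, insert, genome = 97536, 40_000, 550_000_000
    print(round(genome_equivalents(clones, insert, genome), 2))  # about 7.09
    print(locus_representation_probability(clones, insert, genome))
```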
Computation of repetitions and regularities of biologically weighted sequences.

PubMed

Christodoulakis, M; Iliopoulos, C; Mouchard, L; Perdikuri, K; Tsakalidis, A; Tsichlas, K

2006-01-01

Biological weighted sequences are used extensively in molecular biology as profiles for protein families, in the representation of binding sites and often for the representation of sequences produced by a shotgun sequencing strategy. In this paper, we address three fundamental problems in the area of biologically weighted sequences: (i) computation of repetitions, (ii) pattern matching, and (iii) computation of regularities. Our algorithms can be used as basic building blocks for more sophisticated algorithms applied on weighted sequences.
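To make the notion of a weighted sequence concrete: each position carries a probability for every symbol rather than a single letter, and a pattern "occurs" at a position with the product of the per-position probabilities. The sketch below is a naive quadratic-time scan under that definition, with a hypothetical probability threshold; it is not one of the algorithms from the paper, and the name `occurrences` and the dict-per-position data layout are assumptions.

```python
def occurrences(weighted_seq, pattern, threshold):
    """Return (start, probability) for each window where the pattern
    occurs with probability >= threshold.

    weighted_seq is a list of dicts mapping symbol -> probability;
    symbols missing from a dict are treated as probability 0.
    """
    hits = []
    m = len(pattern)
    for start in range(len(weighted_seq) - m + 1):
        prob = 1.0
        for offset, symbol in enumerate(pattern):
            prob *= weighted_seq[start + offset].get(symbol, 0.0)
            if prob < threshold:
                break  # this window can no longer reach the threshold
        else:
            hits.append((start, prob))
    return hits


if __name__ == "__main__":
    ws = [
        {"A": 1.0},
        {"C": 0.5, "G": 0.5},
        {"G": 1.0},
        {"A": 0.25, "T": 0.75},
    ]
    print(occurrences(ws, "CG", 0.5))
```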
Subtype Distribution of Blastocystis Isolates in Sebha, Libya

PubMed Central

Abdulsalam, Awatif M.; Ithoi, Init; Al-Mekhlafi, Hesham M.; Al-Mekhlafi, Abdulsalam M.; Ahmed, Abdulhamid; Surin, Johari

2013-01-01

Background: Blastocystis is a genetically diverse and common intestinal parasite of humans with a controversial pathogenic potential. This study was carried out to identify the Blastocystis subtypes and their association with demographic and socioeconomic factors among outpatients living in Sebha city, Libya. Methods/Findings: Blastocystis in stool samples were cultured followed by isolation, PCR amplification of a partial SSU rDNA gene, cloning, and sequencing. The DNA sequences of isolated clones showed 98.3% to 100% identity with the reference Blastocystis isolates from GenBank. Multiple sequence alignment showed polymorphism from one to seven base substitutions and/or insertions/deletions in several groups of non-identical nucleotide clones. Phylogenetic analysis revealed three assemblage subtypes (ST) with ST1 as the most prevalent (51.1%) followed by ST2 (24.4%), ST3 (17.8%) and mixed infections of two concurrent subtypes (6.7%). Blastocystis ST1 infection was significantly associated with female gender (P = 0.009) and low educational level (P = 0.034). ST2 was also significantly associated with low educational level (P = 0.008) and ST3 with diarrhoea (P = 0.008). Conclusion: Phylogenetic analysis of Libyan Blastocystis isolates identified three different subtypes, with ST1 being the predominant subtype; its infection was significantly associated with female gender and low educational level. More extensive studies are needed in order to relate each Blastocystis subtype with clinical symptoms and potential transmission sources in this community. PMID:24376805
Subtype distribution of Blastocystis isolates in Sebha, Libya.

PubMed

Abdulsalam, Awatif M; Ithoi, Init; Al-Mekhlafi, Hesham M; Al-Mekhlafi, Abdulsalam M; Ahmed, Abdulhamid; Surin, Johari

2013-01-01

Blastocystis is a genetically diverse and common intestinal parasite of humans with a controversial pathogenic potential. This study was carried out to identify the Blastocystis subtypes and their association with demographic and socioeconomic factors among outpatients living in Sebha city, Libya. Blastocystis in stool samples were cultured followed by isolation, PCR amplification of a partial SSU rDNA gene, cloning, and sequencing. The DNA sequences of isolated clones showed 98.3% to 100% identity with the reference Blastocystis isolates from GenBank. Multiple sequence alignment showed polymorphism from one to seven base substitutions and/or insertions/deletions in several groups of non-identical nucleotide clones. Phylogenetic analysis revealed three assemblage subtypes (ST) with ST1 as the most prevalent (51.1%) followed by ST2 (24.4%), ST3 (17.8%) and mixed infections of two concurrent subtypes (6.7%). ST1 infection was significantly associated with female gender (P = 0.009) and low educational level (P = 0.034). ST2 was also significantly associated with low educational level (P = 0.008) and ST3 with diarrhoea (P = 0.008). Phylogenetic analysis of Libyan Blastocystis isolates identified three different subtypes, with ST1 being the predominant subtype; its infection was significantly associated with female gender and low educational level. More extensive studies are needed in order to relate each Blastocystis subtype with clinical symptoms and potential transmission sources in this community.
Molecular Characterization of Transgene Integration by Next-Generation Sequencing in Transgenic Cattle

PubMed Central

Zhang, Ran; Yin, Yinliang; Zhang, Yujun; Li, Kexin; Zhu, Hongxia; Gong, Qin; Wang, Jianwu; Hu, Xiaoxiang; Li, Ning

2012-01-01

As the number of transgenic livestock increases, reliable detection and molecular characterization of transgene integration sites and copy number are crucial not only for interpreting the relationship between the integration site and the specific phenotype but also for commercial and economic demands. However, the ability of conventional PCR techniques to detect incomplete and multiple integration events is limited, making it technically challenging to characterize transgenes. Next-generation sequencing has enabled cost-effective, routine and widespread high-throughput genomic analysis. Here, we demonstrate the use of next-generation sequencing to extensively characterize cattle harboring a 150-kb human lactoferrin transgene that was initially analyzed by chromosome walking without success. Using this approach, the sites upstream and downstream of the target gene integration site in the host genome were identified at the single nucleotide level. The sequencing result was verified by event-specific PCR for the integration sites and FISH for the chromosomal location. Sequencing depth analysis revealed that multiple copies of the incomplete target gene and the vector backbone were present in the host genome. Upon integration, complex recombination was also observed between the target gene and the vector backbone. These findings indicate that next-generation sequencing is a reliable and accurate approach for the molecular characterization of the transgene sequence, integration sites and copy number in transgenic species. PMID:23185606
Resolving the tips of the tree of life: How much mitochondrial data do we need?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bonett, Ronald M.; Macey, J. Robert; Boore, Jeffrey L.

2005-04-29

Mitochondrial (mt) DNA sequences are used extensively to reconstruct evolutionary relationships among recently diverged animals, and have constituted the most widely used markers for species- and generic-level relationships for the last decade or more. However, most studies to date have employed relatively small portions of the mt-genome. In contrast, complete mt-genomes primarily have been used to investigate deep divergences, including several studies of the amount of mt sequence necessary to recover ancient relationships. We sequenced and analyzed 24 complete mt-genomes from a group of salamander species exhibiting divergences typical of those in many species-level studies. We present the first comprehensive investigation of the amount of mt sequence data necessary to consistently recover the mt-genome tree at this level, using parsimony and Bayesian methods. Both methods of phylogenetic analysis revealed extremely similar results. A surprising number of well supported, yet conflicting, relationships were found in trees based on fragments less than ~2000 nucleotides (nt), typical of the vast majority of the thousands of mt-based studies published to date. Large amounts of data (11,500+ nt) were necessary to consistently recover the whole mt-genome tree. Some relationships consistently were recovered with fragments of all sizes, but many nodes required the majority of the mt-genome to stabilize, particularly those associated with short internal branches. Although moderate amounts of data (2000-3000 nt) were adequate to recover mt-based relationships for which most nodes were congruent with the whole mt-genome tree, many thousands of nucleotides were necessary to resolve rapid bursts of evolution.
Recent advances in genomics are making collection of large amounts of sequence data highly feasible, and our results provide the basis for comparative studies of other closely related groups to optimize mt sequence sampling and phylogenetic resolution at the "tips" of the Tree of Life.

Palindromic Sequence Artifacts Generated during Next Generation Sequencing Library Preparation from Historic and Ancient DNA

PubMed Central

Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel

2014-01-01

Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties – the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
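The artifact described here, a read whose 3′ end is the reverse complement of its own 5′ end, can be screened for with a simple check. The sketch below is an illustration only: it requires an exact match, uses a hypothetical minimum length of 4 bases, and is not the detection procedure used in the study. The function names are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_palindrome_length(read, min_len=4):
    """Length of the longest 3' terminal stretch that exactly equals the
    reverse complement of the 5' terminal stretch of the same length.

    Returns 0 when no stretch of at least min_len bases matches.
    The two stretches are not allowed to overlap.
    """
    for k in range(len(read) // 2, min_len - 1, -1):
        if read[-k:] == reverse_complement(read[:k]):
            return k
    return 0


if __name__ == "__main__":
    # 5' end AAGCT, unrelated middle, 3' end AGCTT (reverse complement of AAGCT)
    print(terminal_palindrome_length("AAGCTGGGTCAAGCTT"))
    print(terminal_palindrome_length("ACGTACGTAA"))
```

Real reads would call for tolerance of mismatches and of sequencing errors, which this exact-match version does not attempt.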
East Asian mtDNA haplogroup determination in Koreans: haplogroup-level coding region SNP analysis and subhaplogroup-level control region sequence analysis.

PubMed

Lee, Hwan Young; Yoo, Ji-Eun; Park, Myung Jin; Chung, Ukhee; Kim, Chong-Youl; Shin, Kyoung-Jin

2006-11-01

The present study analyzed 21 coding region SNP markers and one deletion motif for the determination of East Asian mitochondrial DNA (mtDNA) haplogroups by designing three multiplex systems which apply single base extension methods. Using two multiplex systems, all 593 Korean mtDNAs were allocated into 15 haplogroups: M, D, D4, D5, G, M7, M8, M9, M10, M11, R, R9, B, A, and N9. As the D4 haplotypes occurred most frequently in Koreans, the third multiplex system was used to further define D4 subhaplogroups: D4a, D4b, D4e, D4g, D4h, and D4j. This method allowed the complementation of coding region information with control region mutation motifs and the resultant findings also suggest reliable control region mutation motifs for the assignment of East Asian mtDNA haplogroups. These three multiplex systems produce good results in degraded samples as they contain small PCR products (101-154 bp) for single base extension reactions. SNP scoring was performed in 101 old skeletal remains using these three systems to prove their utility in degraded samples. The sequence analysis of mtDNA control region with high incidence of haplogroup-specific mutations and the selective scoring of highly informative coding region SNPs using the three multiplex systems are useful tools for most applications involving East Asian mtDNA haplogroup determination and haplogroup-directed stringent quality control.
Correlation of the major late Jurassic–early Tertiary low- and highstand cycles of south-west Egypt and north-west Sudan

NASA Astrophysics Data System (ADS)

Wycisk, Peter

1994-12-01

The mainly continental deposits of north-west Sudan and south-west Egypt have been correlated with coeval shallow marine and marine deposits in northern Egypt along a north-south running cross-section, based on surface and subsurface data. The palaeodepth curve of northern Egypt illustrates the gradual sea-level rise, reaching its maximum during the Late Cretaceous with conspicuous advances during the Aptian and late Cenomanian. A general highstand is also recorded during the Campanian-Maastrichtian in north-west Sudan. A detailed facies correlation is given for the Aptian and late Cenomanian highstand in western Egypt. The correlation of the Cenomanian Bahariya and Maghrabi formations displays short-term relative sea-level fluctuations. The interpretation illustrates the extensiveness of related erosional processes in the hinterland, partly intensified by temporary uplift of the Uweinat-Aswan High in the south. Regional uplift and constant erosion took place in south-west Egypt during Coniacian and Santonian times. The regional stratigraphic gaps and uncertain interpretation of the Bahariya Uplift are induced by the influence of the Trans-African Lineament, especially during the Late Cretaceous. Low-stand fluvial sheet sandstones characterized by non-cyclic sequence development and high facies stability occur, especially in the Neocomian and early Turonian. During the Barremian and Albian, fluvial architecture changes to more cyclic fluvial sequences and increasing soil formation, due to increasing subsidence, more humid climatic conditions and the generally rising sea level, culminating in the extensive shallow marine Abu Ballas and Maghrabi formations.
Multifaceted biological insights from a draft genome sequence of the tobacco hornworm moth, Manduca sexta.

PubMed

Kanost, Michael R; Arrese, Estela L; Cao, Xiaolong; Chen, Yun-Ru; Chellapilla, Sanjay; Goldsmith, Marian R; Grosse-Wilde, Ewald; Heckel, David G; Herndon, Nicolae; Jiang, Haobo; Papanicolaou, Alexie; Qu, Jiaxin; Soulages, Jose L; Vogel, Heiko; Walters, James; Waterhouse, Robert M; Ahn, Seung-Joon; Almeida, Francisca C; An, Chunju; Aqrawi, Peshtewani; Bretschneider, Anne; Bryant, William B; Bucks, Sascha; Chao, Hsu; Chevignon, Germain; Christen, Jayne M; Clarke, David F; Dittmer, Neal T; Ferguson, Laura C F; Garavelou, Spyridoula; Gordon, Karl H J; Gunaratna, Ramesh T; Han, Yi; Hauser, Frank; He, Yan; Heidel-Fischer, Hanna; Hirsh, Ariana; Hu, Yingxia; Jiang, Hongbo; Kalra, Divya; Klinner, Christian; König, Christopher; Kovar, Christie; Kroll, Ashley R; Kuwar, Suyog S; Lee, Sandy L; Lehman, Rüdiger; Li, Kai; Li, Zhaofei; Liang, Hanquan; Lovelace, Shanna; Lu, Zhiqiang; Mansfield, Jennifer H; McCulloch, Kyle J; Mathew, Tittu; Morton, Brian; Muzny, Donna M; Neunemann, David; Ongeri, Fiona; Pauchet, Yannick; Pu, Ling-Ling; Pyrousis, Ioannis; Rao, Xiang-Jun; Redding, Amanda; Roesel, Charles; Sanchez-Gracia, Alejandro; Schaack, Sarah; Shukla, Aditi; Tetreau, Guillaume; Wang, Yang; Xiong, Guang-Hua; Traut, Walther; Walsh, Tom K; Worley, Kim C; Wu, Di; Wu, Wenbi; Wu, Yuan-Qing; Zhang, Xiufeng; Zou, Zhen; Zucker, Hannah; Briscoe, Adriana D; Burmester, Thorsten; Clem, Rollie J; Feyereisen, René; Grimmelikhuijzen, Cornelis J P; Hamodrakas, Stavros J; Hansson, Bill S; Huguet, Elisabeth; Jermiin, Lars S; Lan, Que; Lehman, Herman K; Lorenzen, Marce; Merzendorfer, Hans; Michalopoulos, Ioannis; Morton, David B; Muthukrishnan, Subbaratnam; Oakeshott, John G; Palmer, Will; Park, Yoonseong; Passarelli, A Lorena; Rozas, Julio; Schwartz, Lawrence M; Smith, Wendy; Southgate, Agnes; Vilcinskas, Andreas; Vogt, Richard; Wang, Ping; Werren, John; Yu, Xiao-Qiang; Zhou, Jing-Jiang; Brown, Susan J; Scherer, Steven E; Richards, Stephen; Blissard, Gary W

2016-09-01

Manduca sexta, known as the tobacco hornworm or Carolina sphinx moth, is a lepidopteran insect that is used extensively as a model system for research in insect biochemistry, physiology, neurobiology, development, and immunity. One important benefit of this species as an experimental model is its extremely large size, reaching more than 10 g in the larval stage. M. sexta larvae feed on solanaceous plants and thus must tolerate a substantial challenge from plant allelochemicals, including nicotine. We report the sequence and annotation of the M. sexta genome, and a survey of gene expression in various tissues and developmental stages. The Msex_1.0 genome assembly resulted in a total genome size of 419.4 Mbp. Repetitive sequences accounted for 25.8% of the assembled genome. The official gene set is comprised of 15,451 protein-coding genes, of which 2498 were manually curated. Extensive RNA-seq data from many tissues and developmental stages were used to improve gene models and for insights into gene expression patterns. Genome wide synteny analysis indicated a high level of macrosynteny in the Lepidoptera. Annotation and analyses were carried out for gene families involved in a wide spectrum of biological processes, including apoptosis, vacuole sorting, growth and development, structures of exoskeleton, egg shells, and muscle, vision, chemosensation, ion channels, signal transduction, neuropeptide signaling, neurotransmitter synthesis and transport, nicotine tolerance, lipid metabolism, and immunity. This genome sequence, annotation, and analysis provide an important new resource from a well-studied model insect species and will facilitate further biochemical and mechanistic experimental studies of many biological systems in insects. Copyright © 2016 Elsevier Ltd. All rights reserved.
RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome.

PubMed

Wenger, Yvan; Galliot, Brigitte

2013-03-25

Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.
RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome

PubMed Central

2013-01-01

Background: Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results: To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions: We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871
PRADA: pipeline for RNA sequencing data analysis.

PubMed

Torres-García, Wandaliz; Zheng, Siyuan; Sivachenko, Andrey; Vegesna, Rahulsimham; Wang, Qianghu; Yao, Rong; Berger, Michael F; Weinstein, John N; Getz, Gad; Verhaak, Roel G W

2014-08-01

Technological advances in high-throughput sequencing necessitate improved computational tools for processing and analyzing large-scale datasets in a systematic automated manner. For that purpose, we have developed PRADA (Pipeline for RNA-Sequencing Data Analysis), a flexible, modular and highly scalable software platform that provides many different types of information available by multifaceted analysis starting from raw paired-end RNA-seq data: gene expression levels, quality metrics, detection of unsupervised and supervised fusion transcripts, detection of intragenic fusion variants, homology scores and fusion frame classification. PRADA uses a dual-mapping strategy that increases sensitivity and refines the analytical endpoints. PRADA has been used extensively and successfully in the glioblastoma and renal clear cell projects of The Cancer Genome Atlas program. Availability: http://sourceforge.net/projects/prada/. Contact: gadgetz@broadinstitute.org or rverhaak@mdanderson.org. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Molecular Simulations of Sequence-Specific Association of Transmembrane Proteins in Lipid Bilayers

NASA Astrophysics Data System (ADS)

Doxastakis, Manolis; Prakash, Anupam; Janosi, Lorant

2011-03-01

Association of membrane proteins is central in material and information flow across the cellular membranes. Amino-acid sequence and the membrane environment are two critical factors controlling association; however, quantitative knowledge of such contributions is limited. In this work, we study the dimerization of helices in lipid bilayers using extensive parallel Monte Carlo simulations with recently developed algorithms. The dimerization of Glycophorin A is examined employing a coarse-grain model that retains a level of amino-acid specificity, in three different phospholipid bilayers. Association is driven by a balance of protein-protein and lipid-induced interactions with the latter playing a major role at short separations. Following a different approach, the effect of amino-acid sequence is studied using the four transmembrane domains of the epidermal growth factor receptor family in identical lipid environments. Detailed characterization of dimer formation and estimates of the free energy of association reveal that these helices present significant affinity to self-associate with certain dimers forming non-specific interfaces.
Prospecting Metagenomic Enzyme Subfamily Genes for DNA Family Shuffling by a Novel PCR-based Approach

PubMed Central

Wang, Qiuyan; Wu, Huili; Wang, Anming; Du, Pengfei; Pei, Xiaolin; Li, Haifeng; Yin, Xiaopu; Huang, Lifeng; Xiong, Xiaolong

2010-01-01

DNA family shuffling is a powerful method for enzyme engineering, which utilizes recombination of naturally occurring functional diversity to accelerate laboratory-directed evolution. However, the use of this technique has been hindered by the scarcity of family genes with the required level of sequence identity in the genome database. We describe here a strategy for collecting metagenomic homologous genes for DNA shuffling from environmental samples by truncated metagenomic gene-specific PCR (TMGS-PCR). Using identified metagenomic gene-specific primers, twenty-three 921-bp truncated lipase gene fragments, which shared 64–99% identity with each other and formed a distinct subfamily of lipases, were retrieved from 60 metagenomic samples. These lipase genes were shuffled, and selected active clones were characterized. The chimeric clones show extensive functional and genetic diversity, as demonstrated by functional characterization and sequence analysis. Our results indicate that homologous sequences of genes captured by TMGS-PCR can be used as suitable genetic material for DNA family shuffling with broad applications in enzyme engineering. PMID:20962349
Stratigraphic controls on seawater intrusion and implications for groundwater management, Dominguez Gap area of Los Angeles, California, USA

USGS Publications Warehouse

Nishikawa, T.; Siade, A.J.; Reichard, E.G.; Ponti, D.J.; Canales, A.G.; Johnson, T.A.

2009-01-01

Groundwater pumping has led to extensive water-level declines and seawater intrusion in coastal Los Angeles, California (USA). A SUTRA-based solute-transport model was developed to test the hydraulic implications of a sequence-stratigraphic model of the Dominguez Gap area and to assess the effects of water-management scenarios. The model is two-dimensional, vertical and follows an approximate flow line extending from the Pacific Ocean through the Dominguez Gap area. Results indicate that a newly identified fault system can provide a pathway for transport of seawater and that a stratigraphic boundary located between the Bent Spring and Upper Wilmington sequences may control the vertical movement of seawater. Three 50-year water-management scenarios were considered: (1) no change in water-management practices; (2) installation of a slurry wall; and (3) raising inland water levels to 7.6 m above sea level. Scenario 3 was the most effective by reversing seawater intrusion. The effects of an instantaneous 1-m sea-level rise were also tested using water-management scenarios 1 and 3. Results from two 100-year simulations indicate that a 1-m sea-level rise may accelerate seawater intrusion for scenario 1; however, scenario 3 remains effective for controlling seawater intrusion. © Springer-Verlag 2009.
Cambro-Ordovician sea-level fluctuations and sequence boundaries: The missing record and the evolution of new taxa

USGS Publications Warehouse

Lehnert, O.; Miller, J.F.; Leslie, Stephen A.; Repetski, J.E.; Ethington, Raymond L.

2005-01-01

The evolution of early Palaeozoic conodont faunas shows a clear connection to sea-level changes. One way that this connection manifests itself is that thick successions of carbonates are missing beneath major sequence boundaries due to karstification and erosion. From this observation arises the question of how many taxa have been lost from different conodont lineages in these incomplete successions. Although many taxa suffered extinction due to the environmental stresses associated with falling sea-levels, some must have survived in these extreme conditions. The number of taxa missing in the early Palaeozoic tropics always will be unclear, but it will be even more difficult to evaluate the missing record in detrital successions of higher latitudes. A common pattern in the evolution of Cambrian-Ordovician conodont lineages is appearances of new species at sea-level rises and disappearances at sea-level drops. This simple picture can be complicated by intervals that consistently have no representatives of a particular lineage, even after extensive sampling of the most complete sections. Presumably the lineages survived in undocumented refugia. In this paper, we give examples of evolution in Cambrian-Ordovician shallow-marine conodont faunas and highlight problems of undiscovered or truly missing segments of lineages. © The Palaeontological Association.
Extensive Horizontal Transfer and Homologous Recombination Generate Highly Chimeric Mitochondrial Genomes in Yeast.

PubMed

Wu, Baojun; Buljic, Adnan; Hao, Weilong

2015-10-01

The frequency of horizontal gene transfer (HGT) in mitochondrial DNA varies substantially. In plants, HGT is relatively common, whereas in animals it appears to be quite rare. It is of considerable importance to understand mitochondrial HGT across the major groups of eukaryotes at a genome-wide level, but so far this has been well studied only in plants. In this study, we generated ten new mitochondrial genome sequences and analyzed 40 mitochondrial genomes from the Saccharomycetaceae to assess the magnitude and nature of mitochondrial HGT in yeasts. We provide evidence for extensive, homologous-recombination-mediated, mitochondrial-to-mitochondrial HGT occurring throughout yeast mitochondrial genomes, leading to genomes that are highly chimeric evolutionarily. This HGT has led to substantial intraspecific polymorphism in both sequence content and sequence divergence, which to our knowledge has not been previously documented in any mitochondrial genome. The unexpectedly high frequency of mitochondrial HGT in yeast may be driven by frequent mitochondrial fusion, relatively low mitochondrial substitution rates and pseudohyphal fusion to produce heterokaryons. These findings suggest that mitochondrial HGT may play an important role in genome evolution of a much broader spectrum of eukaryotes than previously appreciated and that there is a critical need to systematically study the frequency, extent, and importance of mitochondrial HGT across eukaryotes. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
The Cassini Solstice Mission: Streamlining Operations by Sequencing with PIEs

NASA Technical Reports Server (NTRS)

Vandermey, Nancy; Alonge, Eleanor K.; Magee, Kari; Heventhal, William

2014-01-01

The Cassini Solstice Mission (CSM) is the second extended mission phase of the highly successful Cassini/Huygens mission to Saturn. Conducted at a much-reduced funding level, operations for the CSM have been streamlined and simplified significantly. Integration of the science timeline, which involves allocating observation time in a balanced manner to each of the five different science disciplines (with representatives from the twelve different science instruments), has long been a labor-intensive endeavor. Lessons learned from the prime mission (2004-2008) and first extended mission (Equinox mission, 2008-2010) were utilized to design a new process involving PIEs (Pre-Integrated Events) to ensure the highest priority observations for each discipline could be accomplished despite reduced work force and overall simplification of processes. Discipline-level PIE lists were managed by the Science Planning team and graphically mapped to aid timeline deconfliction meetings prior to assigning discrete segments of time to the various disciplines. Periapse segments are generally discipline-focused, with the exception of a handful of PIEs. In addition to all PIEs being documented in a spreadsheet, allocated out-of-discipline PIEs were entered into the Cassini Information Management System (CIMS) well in advance of timeline integration. The disciplines were then free to work the rest of the timeline internally, without the need for frequent interaction, debate, and negotiation with representatives from other disciplines. As a result, the number of integration meetings has been cut back extensively, freeing up workforce. The sequence implementation process was streamlined as well, combining two previous processes (and teams) into one. The new Sequence Implementation Process (SIP) schedules 22 weeks to build each 10-week-long sequence, and only 3 sequence processes overlap. This differs significantly from the prime mission, during which 5-week-long sequences were built in 24 weeks, with 6 overlapping processes.
PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

PubMed

Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

2016-07-08

The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency-based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is that it allows databases of reduced complexity to be used so that homology extension runs rapidly. The server also offers transmembrane protein (TMP) reference databases, which allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
Partitioning the Genetic Diversity of a Virus Family: Approach and Evaluation through a Case Study of Picornaviruses

PubMed Central

Lauber, Chris

2012-01-01

The recent advent of genome sequences as the only source available to classify many newly discovered viruses challenges the development of virus taxonomy by expert virologists who traditionally rely on extensive virus characterization. In this proof-of-principle study, we address this issue by presenting a computational approach (DEmARC) to classify viruses of a family into groups at hierarchical levels using a sole criterion—intervirus genetic divergence. To quantify genetic divergence, we used pairwise evolutionary distances (PEDs) estimated by maximum likelihood inference on a multiple alignment of family-wide conserved proteins. PEDs were calculated for all virus pairs, and the resulting distribution was modeled via a mixture of probability density functions. The model enables the quantitative inference of regions of distance discontinuity in the family-wide PED distribution, which define the levels of hierarchy. For each level, a limit on genetic divergence, below which two viruses join the same group, was objectively selected among a set of candidates by minimizing violations of intragroup PEDs to the limit. In a case study, we applied the procedure to hundreds of genome sequences of picornaviruses and extensively evaluated it by modulating four key parameters. It was found that the genetics-based classification largely tolerates variations in virus sampling and multiple alignment construction but is affected by the choice of protein and the measure of genetic divergence. In an accompanying paper (C. Lauber and A. E. Gorbalenya, J. Virol. 86:3905–3915, 2012), we analyze the substantial insight gained with the genetics-based classification approach by comparing it with the expert-based picornavirus taxonomy. PMID:22278230
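The grouping idea in the abstract above (two viruses join the same group when their pairwise distance falls below a limit, and candidate limits are compared by how many intragroup distances violate them) can be illustrated with a small sketch. This is not the DEmARC implementation: the single-linkage grouping, the function names `group_by_distance_limit` and `count_violations`, and the toy distance values are all assumptions made up for illustration.

```python
def group_by_distance_limit(distances, limit):
    """Single-linkage grouping: link any two items whose distance is below `limit`.

    `distances` maps (name_a, name_b) pairs to a pairwise distance.
    Returns a sorted list of groups, each a sorted list of names.
    """
    items = sorted({name for pair in distances for name in pair})
    parent = {name: name for name in items}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d < limit:
            parent[find(a)] = find(b)

    groups = {}
    for name in items:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


def count_violations(distances, groups, limit):
    """Count pairs placed in the same group whose distance is not below `limit`."""
    member = {name: i for i, group in enumerate(groups) for name in group}
    return sum(
        1
        for (a, b), d in distances.items()
        if member[a] == member[b] and d >= limit
    )


# Hypothetical pairwise distances for four viruses (values invented for illustration).
toy_peds = {
    ("A", "B"): 0.10, ("A", "C"): 0.25, ("B", "C"): 0.12,
    ("A", "D"): 0.90, ("B", "D"): 0.95, ("C", "D"): 0.85,
}

for candidate_limit in (0.11, 0.20):
    found = group_by_distance_limit(toy_peds, candidate_limit)
    print(candidate_limit, found, count_violations(toy_peds, found, candidate_limit))
```

With these toy values, the tighter limit yields smaller groups with no intragroup violation, while the looser limit merges A, B and C through chained links even though the A-C distance exceeds the limit. Counting such violations is one simple way to compare candidate limits.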
Complete Genome Sequence of Magnetospirillum gryphiswaldense MSR-1

PubMed Central

Wang, Xu; Wang, Qing; Zhang, Weijia; Wang, Yinjia; Li, Li; Wen, Tong; Zhang, Tongwei; Zhang, Yang; Xu, Jun; Hu, Junying; Li, Shuqi; Liu, Lingzi; Liu, Jinxin; Jiang, Wei; Tian, Jiesheng; Wang, Lei; Li, Jilun

2014-01-01

We report the complete genomic sequence of Magnetospirillum gryphiswaldense MSR-1 (DSM 6361), a type strain of the genus Magnetospirillum belonging to the Alphaproteobacteria. Compared to the reported draft sequence, extensive rearrangements and differences were found, indicating high genomic flexibility and “domestication” by accelerated evolution of the strain upon repeated passaging. PMID:24625872
Aspect-Oriented Subprogram Synthesizes UML Sequence Diagrams

NASA Technical Reports Server (NTRS)

Barry, Matthew R.; Osborne, Richard N.

2006-01-01

The Rational Sequence computer program described elsewhere includes a subprogram that utilizes the capability for aspect-oriented programming when that capability is present. This subprogram is denoted the Rational Sequence (AspectJ) component because it uses AspectJ, which is an extension of the Java programming language that introduces aspect-oriented programming techniques into the language.
Genome sequence of the necrotrophic plant pathogen Pythium ultimum reveals original pathogenicity mechanisms and effector repertoire.

USDA-ARS's Scientific Manuscript database

The P. ultimum DAOM BR144 (=CBS 805.95 = ATCC200006) genome (42.8 Mb) encodes 15,290 genes, and has extensive sequence similarity and synteny with related Phytophthora spp., including the potato late blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86 % o...
Integrating sequence and structural biology with DAS

PubMed Central

Prlić, Andreas; Down, Thomas A; Kulesha, Eugene; Finn, Robert D; Kähäri, Andreas; Hubbard, Tim JP

2007-01-01

Background The Distributed Annotation System (DAS) is a network protocol for exchanging biological data. It is frequently used to share annotations of genomes and protein sequences. Results Here we present several extensions to the current DAS 1.5 protocol. These provide new commands to share alignments and three-dimensional molecular structure data, add the possibility for registration and discovery of DAS servers, and define a convention for providing different types of data plots. We present examples of web sites and applications that use the new extensions. We operate a public registry of DAS sources, which now includes entries for more than 250 distinct sources. Conclusion Our DAS extensions are essential for the management of the growing number of services and exchange of diverse biological data sets. In addition, the extensions allow new types of applications to be developed and scientific questions to be addressed. The registry of DAS sources is available at PMID:17850653

Multiframe video coding for improved performance over wireless channels.

PubMed

Budagavi, M; Gibson, J D

2001-01-01

We propose and evaluate a multi-frame extension to block motion compensation (BMC) coding of videoconferencing-type video signals for wireless channels. The multi-frame BMC (MF-BMC) coder makes use of the redundancy that exists across multiple frames in typical videoconferencing sequences to achieve additional compression over that obtained by using the single frame BMC (SF-BMC) approach, such as in the base-level H.263 codec. The MF-BMC approach also has an inherent ability of overcoming some transmission errors and is thus more robust when compared to the SF-BMC approach. We model the error propagation process in MF-BMC coding as a multiple Markov chain and use Markov chain analysis to infer that the use of multiple frames in motion compensation increases robustness. The Markov chain analysis is also used to devise a simple scheme which randomizes the selection of the frame (amongst the multiple previous frames) used in BMC to achieve additional robustness. The MF-BMC coders proposed are a multi-frame extension of the base level H.263 coder and are found to be more robust than the base level H.263 coder when subjected to simulated errors commonly encountered on wireless channels.
Targeted exploration and analysis of large cross-platform human transcriptomic compendia

PubMed Central

Zhu, Qian; Wong, Aaron K; Krishnan, Arjun; Aure, Miriam R; Tadych, Alicja; Zhang, Ran; Corney, David C; Greene, Casey S; Bongo, Lars A; Kristensen, Vessela N; Charikar, Moses; Li, Kai; Troyanskaya, Olga G.

2016-01-01

We present SEEK (http://seek.princeton.edu), a query-based search engine across very large transcriptomic data collections, including thousands of human data sets from almost 50 microarray and next-generation sequencing platforms. SEEK uses a novel query-level cross-validation-based algorithm to automatically prioritize data sets relevant to the query and a robust search approach to identify query-coregulated genes, pathways, and processes. SEEK provides cross-platform handling, multi-gene query search, iterative metadata-based search refinement, and extensive visualization-based analysis options. PMID:25581801
Speciation in ancient cryptic species complexes: evidence from the molecular phylogeny of Brachionus plicatilis (Rotifera).

PubMed

Gómez, Africa; Serra, Manuel; Carvalho, Gary R; Lunt, David H

2002-07-01

Continental lake-dwelling zooplanktonic organisms have long been considered cosmopolitan species with little geographic variation in spite of the isolation of their habitats. Evidence of morphological cohesiveness and high dispersal capabilities supports this interpretation. However, this view has been challenged recently as many such species have been shown either to comprise cryptic species complexes or to exhibit marked population genetic differentiation and strong phylogeographic structuring at a regional scale. Here we investigate the molecular phylogeny of the cosmopolitan passively dispersing rotifer Brachionus plicatilis (Rotifera: Monogononta) species complex using nucleotide sequence variation from both nuclear (ribosomal internal transcribed spacer 1, ITS1) and mitochondrial (cytochrome c oxidase subunit I, COI) genes. Analysis of rotifer resting eggs from 27 salt lakes in the Iberian Peninsula plus lakes from four continents revealed nine genetically divergent lineages. The high level of sequence divergence, absence of hybridization, and extensive sympatry observed support the specific status of these lineages. Sequence divergence estimates indicate that the B. plicatilis complex began diversifying many millions of years ago, yet has shown relatively high levels of morphological stasis. We discuss these results in relation to the ecology and genetics of aquatic invertebrates possessing dispersive resting propagules and address the apparent contradiction between zooplanktonic population structure and their morphological stasis.
Riboflavin accumulation and characterization of cDNAs encoding lumazine synthase and riboflavin synthase in bitter melon (Momordica charantia).

PubMed

Tuan, Pham Anh; Kim, Jae Kwang; Lee, Sanghyun; Chae, Soo Cheon; Park, Sang Un

2012-12-05

Riboflavin (vitamin B2) is the universal precursor of the coenzymes flavin mononucleotide and flavin adenine dinucleotide--cofactors that are essential for the activity of a wide variety of metabolic enzymes in animals, plants, and microbes. Using the RACE PCR approach, cDNAs encoding lumazine synthase (McLS) and riboflavin synthase (McRS), which catalyze the last two steps in the riboflavin biosynthetic pathway, were cloned from bitter melon (Momordica charantia), a popular vegetable crop in Asia. Amino acid sequence alignments indicated that McLS and McRS share high sequence identity with other orthologous genes and carry an N-terminal extension, which is reported to be a plastid-targeting sequence. Organ expression analysis using quantitative real-time RT PCR showed that McLS and McRS were constitutively expressed in M. charantia, with the strongest expression levels observed during the last stage of fruit ripening (stage 6). This correlated with the highest level of riboflavin content, which was detected during ripening stage 6 by HPLC analysis. McLS and McRS were highly expressed in the young leaves and flowers, whereas roots exhibited the highest accumulation of riboflavin. The cloning and characterization of McLS and McRS from M. charantia may aid the metabolic engineering of vitamin B2 in crops.
Inhalable Microorganisms in Beijing’s PM2.5 and PM10 Pollutants during a Severe Smog Event

PubMed Central

2014-01-01

Particulate matter (PM) air pollution poses a formidable public health threat to the city of Beijing. Among the various hazards of PM pollutants, microorganisms in PM2.5 and PM10 are thought to be responsible for various allergies and for the spread of respiratory diseases. While the physical and chemical properties of PM pollutants have been extensively studied, much less is known about the inhalable microorganisms. Most existing data on airborne microbial communities, which use 16S or 18S rRNA gene sequencing to categorize bacteria or fungi at the family or genus level, do not provide information on their allergenic and pathogenic potentials. Here we employed metagenomic methods to analyze the microbial composition of Beijing’s PM pollutants during a severe January smog event. We show that with sufficient sequencing depth, airborne microbes including bacteria, archaea, fungi, and dsDNA viruses can be identified at the species level. Our results suggested that the majority of the inhalable microorganisms were soil-associated and nonpathogenic to humans. Nevertheless, the sequences of several respiratory microbial allergens and pathogens were identified and their relative abundance appeared to have increased with increased concentrations of PM pollution. Our findings may serve as an important reference for environmental scientists, health workers, and city planners. PMID:24456276
On-line LC-MS approach combining collision-induced dissociation (CID), electron-transfer dissociation (ETD), and CID of an isolated charge-reduced species for the trace-level characterization of proteins with post-translational modifications.

PubMed

Wu, Shiaw-Lin; Hühmer, Andreas F R; Hao, Zhiqi; Karger, Barry L

2007-11-01

We have expanded our recent on-line LC-MS platform for large peptide analysis to combine collision-induced dissociation (CID), electron-transfer dissociation (ETD), and CID of an isolated charge-reduced (CRCID) species derived from ETD to determine sites of phosphorylation and glycosylation modifications, as well as the sequence of large peptide fragments (i.e., 2000-10,000 Da) from complex proteins, such as beta-casein, epidermal growth factor receptor (EGFR), and tissue plasminogen activator (t-PA) at the low femtomol level. The incorporation of an additional CID activation step for a charge-reduced species, isolated from ETD fragment ions, improved ETD fragmentation when precursor ions with high m/z (approximately >1000) were automatically selected for fragmentation. Specifically, the identification of the exact phosphorylation sites was strengthened by the extensive coverage of the peptide sequence with a near-continuous product ion series. The identification of N-linked glycosylation sites in EGFR and an O-linked glycosylation site in t-PA were also improved through the enhanced identification of the peptide backbone sequence of the glycosylated precursors. The new strategy is a good starting survey scan to characterize enzymatic peptide mixtures over a broad range of masses using LC-MS with data-dependent acquisition, as the three activation steps can provide complementary information to each other. In general, large peptides can be extensively characterized by the ETD and CRCID steps, including sites of modification from the generated, near-continuous product ion series, supplemented by the CID-MS2 step. At the same time, small peptides (e.g.,
Extensive Concerted Evolution of Rice Paralogs and the Road to Regaining Independence

PubMed Central

Wang, Xiyin; Tang, Haibao; Bowers, John E.; Feltus, Frank A.; Paterson, Andrew H.

2007-01-01

Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the ∼0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, ∼8% of japonica paralogs produced 5–7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while ∼70-MY-old “paleologs” resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice–sorghum divergence ∼41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity—that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5–7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization. PMID:18039882
Size, Shape, and Sequence-Dependent Immunogenicity of RNA Nanoparticles.

PubMed

Guo, Sijin; Li, Hui; Ma, Mengshi; Fu, Jian; Dong, Yizhou; Guo, Peixuan

2017-12-15

RNA molecules have emerged as promising therapeutics. As with all other drugs, the safety profile and immune response are important criteria for drug evaluation. However, the literature on RNA immunogenicity has been controversial. Here, we used the approach of RNA nanotechnology to demonstrate that the immune response of RNA nanoparticles is size, shape, and sequence dependent. RNA triangles, squares, pentagons, and tetrahedrons with the same shape but different sizes, or the same size but different shapes, were used as models to investigate the immune response. The levels of pro-inflammatory cytokines induced by these RNA nanoarchitectures were assessed in macrophage-like cells and animals. It was found that RNA polygons without extension at the vertexes were immune inert. However, when single-stranded RNA with a specific sequence was extended from the vertexes of RNA polygons, strong immune responses were detected. These immunostimulations are sequence specific, because some other extended sequences induced little or no immune response. Additionally, larger RNA squares induced stronger cytokine secretion, and the 3D RNA tetrahedron showed stronger immunostimulation than the planar RNA triangle. These results suggest that the immunogenicity of RNA nanoparticles is tunable to produce either a minimal immune response, suitable for safe therapeutic vectors, or a strong immune response for cancer immunotherapy or vaccine adjuvants. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Nucleotide sequence and proposed secondary structure of Columnea latent viroid: a natural mosaic of viroid sequences.

PubMed Central

Hammond, R; Smith, D R; Diener, T O

1989-01-01

The Columnea latent viroid (CLV) occurs latently in certain Columnea erythrophae plants grown commercially. In potato and tomato, CLV causes potato spindle tuber viroid (PSTV)-like symptoms. Its nucleotide sequence and proposed secondary structure reveal that CLV consists of a single-stranded circular RNA of 370 nucleotides which can assume a rod-like structure with extensive base-pairing characteristic of all known viroids. The electrophoretic mobility of circular CLV under nondenaturing conditions suggests a potential tertiary structure. CLV contains extensive sequence homologies to the PSTV group of viroids but contains a central conserved region identical to that of hop stunt viroid (HSV). CLV also shares some biological properties with each of the two types of viroids. Most probably, CLV is the result of intracellular RNA recombination between an HSV-type and one or more PSTV-type viroids replicating in the same plant. PMID:2602114
S-matrix calculations of energy levels of sodiumlike ions

DOE PAGES

Sapirstein, J.; Cheng, K. T.

2015-06-24

A recent S-matrix-based QED calculation of energy levels of the lithium isoelectronic sequence is extended to the general case of a valence electron outside an arbitrary filled core. Emphasis is placed on modifications of the lithiumlike formulas required because more than one core state is present, and an unusual feature of the two-photon exchange contribution involving autoionizing states is discussed. Here, the method is illustrated with a calculation of the energy levels of sodiumlike ions, with results for 3s1/2, 3p1/2, and 3p3/2 energies tabulated for the range Z = 30–100. Comparison with experiment and other calculations is given, and prospects for extension of the method to ions with more complex electronic structure discussed.
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)

PubMed Central

Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto

2017-01-01

Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of a same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both concerted evolution and library hypotheses. For this purpose, we analyzed here sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation for the number and size of pericentromeric FISH signals. At genomic level, the analysis of 1000s of DNA sequences obtained by Illumina sequencing and PCR amplification allowed defining 150 haplotypes which were linked in a common minimum spanning tree, where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. In consistency with the library hypothesis, different variants for this satDNA showed high differences in abundance between species, from highly abundant to simply relictual variants. PMID:28855916
Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.

PubMed

Bokulich, Nicholas A; Kaehler, Benjamin D; Rideout, Jai Ram; Dillon, Matthew; Bolyen, Evan; Knight, Rob; Huttley, Gavin A; Gregory Caporaso, J

2018-05-17

Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. We present q2-feature-classifier ( https://github.com/qiime2/q2-feature-classifier ), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH, and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive-Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated "novel" marker-gene sequences, are available in our extensible benchmarking framework, tax-credit ( https://github.com/caporaso-lab/tax-credit-data ). Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.
Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

PubMed

Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

2016-06-01

In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.
Is cratonic sedimentation consistent with available models? An example from the Upper Proterozoic of the West African craton

NASA Astrophysics Data System (ADS)

Bertrand-Sarfati, Janine; Moussine-Pouchkine, Alexis

1988-08-01

The Atar Group, part of the Upper Proterozoic sequence covering the West African craton, stable since 2000 Ma, is characterized by an alternation of extensive carbonate beds and mixed siliciclastic and carbonate facies. The carbonate beds comprise essentially columnar stromatolite biostromes and bioherms which reflect sublittoral environments. The mixed facies contain a variety of laterally discontinuous facies which imply more variable environmental conditions. The settings of the mixed facies are not always clear but they do not contain thick sequences of high-energy facies. Few obvious facies sequences are discernable; those that are present are considered to be punctuated aggradational cycles (PACs) and they always start with biostromes of columnar stromatolites with very few sediments. Composite sequences are interpreted as due to shallowing upward or increasing energy environments that may be laterally contiguous, despite the fact that the contacts are not gradational. However, much of the stratigraphic sequence cannot be subdivided into cycles and seems to consist of unrelated individual facies, bound by sharp boundaries. The basin analysis reveals that biostromes of columnar stromatolites start after an instantaneous geological event corresponding to a sea-level rise. Consequently, their appearance can be considered as a time-line. We describe, in the Atar Group and its equivalents, three sedimentation trends, all of which are interpreted to be of shallowing upward character. The Atar Group appears to have been deposited in an epeiric sea (i.e. an extremely flat ramp). There are two contrasting styles of sedimentation: (1) after the submergence of the whole area, columnar stromatolites built extensive biostromes; (2) during the stable phase, sediments are deposited in a mosaic of laterally-discontinuous facies. 
Tidal influence cannot be recognized in the sequence, neither can a salinity increase toward the land; both common features in published epeiric sea models. A cratonic sedimentation area such as this is characterized by its size and flatness. Only during the stable phase of the cycle does small-scale topographic relief lead to deposition of a mosaic of facies. The sedimentation is storm- and wave-dominated.
Pangenome evidence for extensive interdomain horizontal transfer affecting lineage core and shell genes in uncultured planktonic thaumarchaeota and euryarchaeota.

PubMed

Deschamps, Philippe; Zivanovic, Yvan; Moreira, David; Rodriguez-Valera, Francisco; López-García, Purificación

2014-06-12

Horizontal gene transfer (HGT) is an important force in evolution, which may lead, among other things, to the adaptation to new environments by the import of new metabolic functions. Recent studies based on phylogenetic analyses of a few genome fragments containing archaeal 16S rRNA genes and fosmid-end sequences from deep-sea metagenomic libraries have suggested that marine planktonic archaea could be affected by high HGT frequency. Likewise, a composite genome of an uncultured marine euryarchaeote showed high levels of gene sequence similarity to bacterial genes. In this work, we ask whether HGT is frequent and widespread in genomes of these marine archaea, and whether HGT is an ancient and/or recurrent phenomenon. To answer these questions, we sequenced 997 fosmid archaeal clones from metagenomic libraries of deep-Mediterranean waters (1,000 and 3,000 m depth) and built comprehensive pangenomes for planktonic Thaumarchaeota (Group I archaea) and Euryarchaeota belonging to the uncultured Groups II and III Euryarchaeota (GII/III-Euryarchaeota). Comparison with available reference genomes of Thaumarchaeota and a composite marine surface euryarchaeote genome allowed us to define sets of core, lineage-specific core, and shell gene ortholog clusters for the two archaeal lineages. Molecular phylogenetic analyses of all gene clusters showed that 23.9% of marine Thaumarchaeota genes and 29.7% of GII/III-Euryarchaeota genes had been horizontally acquired from bacteria. HGT is not only extensive and directional but also ongoing, with high HGT levels in lineage-specific core (ancient transfers) and shell (recent transfers) genes. Many of the acquired genes are related to metabolism and membrane biogenesis, suggesting an adaptive value for life in cold, oligotrophic oceans. We hypothesize that the acquisition of an important amount of foreign genes by the ancestors of these archaeal groups significantly contributed to their divergence and ecological success. 
© The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Ecology of malaria parasites infecting Southeast Asian macaques: evidence from cytochrome b sequences

PubMed Central

Putaporntip, Chaturong; Jongwutiwes, Somchai; Thongaree, Siriporn; Seethamchai, Sunee; Grynberg, Priscila; Hughes, Austin L.

2010-01-01

Although malaria parasites infecting non-human primates are important models for human malaria, little is known of the ecology of infection by these parasites in the wild. We extensively sequenced cytochrome b (cytb) of malaria parasites (Apicomplexa: Haemosporida) from free-living Southeast Asian monkeys Macaca nemestrina and M. fascicularis. The two most commonly observed taxa were P. inui and Hepatocystis sp., but certain other sequences did not cluster closely with any previously sequenced species. Most of the major clades of parasites were found in both Macaca species, and the two most commonly occurring parasites infected the two Macaca species at approximately equal levels. However, P. inui showed evidence of genetic differentiation between the populations infecting the two Macaca species, suggesting limited movement of this parasite among hosts. Moreover, coinfection with Plasmodium and Hepatocystis species occurred significantly less frequently than expected on the basis of the rates of infection with either taxon alone, suggesting the possibility of competitive exclusion. The results revealed unexpectedly complex communities of Plasmodium and Hepatocystis taxa infecting wild Southeast Asian monkeys. Parasite taxa differed with respect to both the frequency of between-host movement and their frequency of coinfection. PMID:20646216
Population Structure in Nontypeable Haemophilus influenzae

PubMed Central

LaCross, Nathan C.; Marrs, Carl F.; Gilsdorf, Janet R.

2013-01-01

Nontypeable Haemophilus influenzae (NTHi) frequently colonize the human pharynx asymptomatically, and are an important cause of otitis media in children. Past studies have identified typeable H. influenzae as being clonal, but the population structure of NTHi has not been extensively characterized. The research presented here investigated the diversity and population structure in a well-characterized collection of NTHi isolated from the middle ears of children with otitis media or the pharynges of healthy children in three disparate geographic regions. Multilocus sequence typing identified 109 unique sequence types among 170 commensal and otitis media-associated NTHi isolates from Finland, Israel, and the US. The largest clonal complex contained only five sequence types, indicating a high level of genetic diversity. The eBURST v3, ClonalFrame 1.1, and structure 2.3.3 programs were used to further characterize diversity and population structure from the sequence typing data. Little clustering was apparent by either disease state (otitis media or commensalism) or geography in the ClonalFrame phylogeny. Population structure was clearly evident, with support for eight populations when all 170 isolates were analyzed. Interestingly, one population contained only commensal isolates, while two others consisted solely of otitis media isolates, suggesting associations between population structure and disease. PMID:23266487
Sedimentary evolution of the Pliocene and Pleistocene Ebro margin, northeastern Spain

USGS Publications Warehouse

Alonso, B.; Field, M.E.; Gardner, J.V.; Maldonado, A.

1990-01-01

The Pliocene and Pleistocene deposits of the Spanish Ebro margin overlie a regional unconformity and contain a major disconformity. These unconformities, named Reflector M and Reflector G, mark the bases of two seismic sequences. Except for close to the upper boundary where a few small channel deposits are recognized, the lower sequence lacks channels. The upper sequence contains nine channel-levee complexes as well as base-of-slope aprons that represent the proximal part of the Valencia turbidite system. Diverse geometries and variations in seismic units distinguish shelf, slope, base-of-slope and basin-floor facies. Four events characterize the late Miocene to Pleistocene evolution of the Ebro margin: (a) formation of a paleodrainage system and an extensive erosion-to-depositional surface during the latest Miocene (Messinian), (b) deposition of hemipelagic units during the early Pliocene, (c) development of canyons during the late Pliocene to early Pleistocene, and (d) deposition of slope wedges, channel-levee complexes, and base-of-slope aprons alternating with hemipelagic deposition during the Pleistocene. Sea-level fluctuations influenced the evolution of the sedimentary sequences of the Ebro margin, but the major control was the sediment supply from the Ebro River. © 1990.
Dissociation between the Procedural Learning of Letter Names and Motor Sequences in Developmental Dyslexia

ERIC Educational Resources Information Center

Gabay, Yafit; Schiff, Rachel; Vakil, Eli

2012-01-01

Motor sequence learning has been studied extensively in Developmental dyslexia (DD). The purpose of the present research was to examine procedural learning of letter names and motor sequences in individuals with DD and control groups. Both groups completed the Serial Search Task which enabled the assessment of learning of letter names and motor…
Comparative genomic survey, exon-intron annotation and phylogenetic analysis of NAT-homologous sequences in archaea, protists, fungi, viruses, and invertebrates

USDA-ARS's Scientific Manuscript database

We have previously published extensive genomic surveys [1-3], reporting NAT-homologous sequences in hundreds of sequenced bacterial, fungal and vertebrate genomes. We present here the results of our latest search of 2445 genomes, representing 1532 (70 archaeal, 1210 bacterial, 43 protist, 97 fungal,...

Heteroplasmy in the Mitochondrial Genomes of Human Lice and Ticks Revealed by High Throughput Sequencing

PubMed Central

Xiong, Haoyu; Barker, Stephen C.; Burger, Thomas D.; Raoult, Didier; Shao, Renfu

2013-01-01

The typical mitochondrial (mt) genomes of bilateral animals consist of 37 genes on a single circular chromosome. The mt genomes of the human body louse, Pediculus humanus, and the human head louse, Pediculus capitis, however, are extensively fragmented and contain 20 minichromosomes, with one to three genes on each minichromosome. Heteroplasmy, i.e. nucleotide polymorphisms in the mt genome within individuals, has been shown to be significantly higher in the mt cox1 gene of human lice than in humans and other animals that have the typical mt genomes. To understand whether the extent of heteroplasmy in human lice is associated with mt genome fragmentation, we sequenced the entire coding regions of all of the mt minichromosomes of six human body lice and six human head lice from Ethiopia, China and France with an Illumina HiSeq platform. For comparison, we also sequenced the entire coding regions of the mt genomes of seven species of ticks, which have the typical mitochondrial genome organization of bilateral animals. We found that the level of heteroplasmy varies significantly both among the human lice and among the ticks. The human lice from Ethiopia have significantly higher level of heteroplasmy than those from China and France (Pt<0.05). The tick, Amblyomma cajennense, has significantly higher level of heteroplasmy than other ticks (Pt<0.05). Our results indicate that heteroplasmy level can be substantially variable within a species and among closely related species, and does not appear to be determined by single factors such as genome fragmentation. PMID:24058467
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies

PubMed Central

Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong

2013-01-01

We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
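The aggregation step described in this abstract (per-study score statistics and between-variant covariance matrices combined across studies) can be illustrated with a minimal sketch. It assumes the simplest case only: a burden-type test with homogeneous genetic effects across studies and fixed variant weights. The function name, argument layout, and the numbers in the example are hypothetical and are not taken from the paper or from any software package.

```python
import math


def meta_burden_test(scores, covariances, weights):
    """Combine per-study summary statistics into one burden-type test.

    scores[k][j]          score statistic of variant j in study k
    covariances[k][i][j]  covariance between scores i and j in study k
    weights[j]            weight given to variant j (assumed fixed)

    Returns (Q, p), where Q is referred to a chi-square distribution
    with 1 degree of freedom under the null hypothesis.
    """
    m = len(weights)
    burden_score = 0.0
    burden_var = 0.0
    for u, v in zip(scores, covariances):
        # Weighted sum of the variant scores in this study.
        burden_score += sum(weights[j] * u[j] for j in range(m))
        # Variance of that weighted sum: w' V w.
        burden_var += sum(
            weights[i] * v[i][j] * weights[j]
            for i in range(m)
            for j in range(m)
        )
    q = burden_score ** 2 / burden_var
    # Upper tail of a chi-square(1) variable: P(Z^2 > q) = erfc(sqrt(q / 2)).
    p = math.erfc(math.sqrt(q / 2.0))
    return q, p


# Illustrative (made-up) summary statistics: two studies, two variants.
example_scores = [[1.0, 2.0], [0.5, 1.5]]
example_covs = [
    [[2.0, 0.5], [0.5, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
q_stat, p_value = meta_burden_test(example_scores, example_covs, [1.0, 1.0])
```

Only study-level summaries enter the calculation, so no individual-level genotype data need to be pooled. Variance component tests and heterogeneous-effect models, which the abstract also covers, are not represented in this sketch.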
Zn-metalloprotease sequences in extremophiles

NASA Astrophysics Data System (ADS)

Holden, T.; Dehipawala, S.; Golebiewska, U.; Cheung, E.; Tremberger, G., Jr.; Williams, E.; Schneider, P.; Gadura, N.; Lieberman, D.; Cheung, T.

2010-09-01

The Zn-metalloprotease family contains conserved amino acid structures such that the nucleotide fluctuation at the DNA level would exhibit correlated randomness as described by fractal dimension. A nucleotide sequence fractal dimension can be calculated from a numerical series consisting of the atomic numbers of each nucleotide. The structure's vibration modes can also be studied using a Gaussian Network Model. The vibration measure and fractal dimension values form a two-dimensional plot with a standard vector metric that can be used for comparison of structures. The preference for amino acid usage in extremophiles may suppress nucleotide fluctuations that could be analyzed in terms of fractal dimension and Shannon entropy. A protein level cold adaptation study of the thermolysin Zn-metalloprotease family using molecular dynamics simulation was reported recently and our results show that the associated nucleotide fluctuation suppression is consistent with a regression pattern generated from the sequences' fractal dimension and entropy values (R-square ~ 0.98, N = 5). It was observed that cold adaptation selected for high entropy and low fractal dimension values. Extension to the Archaemetzincin M54 family in extremophiles reveals a similar regression pattern (R-square = 0.98, N = 6). It was observed that the metalloprotease sequences of extremely halophilic organisms possess high fractal dimension and low entropy values as compared with non-halophiles. The zinc atom is usually bonded to the histidine residue, which shows limited levels of vibration in the Gaussian Network Model. The variability of the fractal dimension and entropy for a given protein structure suggests that extremophiles would have evolved after mesophiles, consistent with the biased usage of non-prebiotic amino acids by extremophiles. It may be argued that extremophiles have the capacity to offer extinction protection during drastic changes in astrobiological environments.
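Two of the quantities named in this abstract, the numerical series built from nucleotide atomic numbers and the Shannon entropy of a sequence, can be sketched as follows. The abstract does not say how either was computed, so this sketch makes two labeled assumptions: the "atomic number" of a nucleotide is taken as the sum of the atomic numbers of the atoms in its base (for example adenine, C5H5N5, gives 5·6 + 5·1 + 5·7 = 70), and the entropy is taken over single-nucleotide frequencies. The function names are hypothetical, and the fractal dimension calculation is not reproduced because the abstract does not name the estimator used.

```python
import math
from collections import Counter

# Assumed mapping: sum of atomic numbers over the atoms of each base
# (A = C5H5N5, C = C4H5N3O, G = C5H5N5O, T = C5H6N2O2).
ATOMIC_NUMBER_SUM = {"A": 70, "C": 58, "G": 78, "T": 66}


def atomic_number_series(seq):
    """Convert a DNA sequence into a numerical series, one value per base."""
    return [ATOMIC_NUMBER_SUM[base] for base in seq.upper()]


def shannon_entropy(seq):
    """Shannon entropy (bits per base) of the single-nucleotide frequencies."""
    counts = Counter(seq.upper())
    total = len(seq)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under these assumptions a sequence that uses all four bases equally often reaches the maximum of 2 bits per base, while a homopolymer has zero entropy. The resulting series could then be passed to whichever fractal dimension estimator is preferred.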
Isolation and characterization of 5S rDNA sequences in catfishes genome (Heptapteridae and Pseudopimelodidae): perspectives for rDNA studies in fish by C0t method.

PubMed

Gouveia, Juceli Gonzalez; Wolf, Ivan Rodrigo; de Moraes-Manécolo, Vivian Patrícia Oliveira; Bardella, Vanessa Belline; Ferracin, Lara Munique; Giuliano-Caetano, Lucia; da Rosa, Renata; Dias, Ana Lúcia

2016-12-01

Sequences of 5S ribosomal RNA (rRNA) are extensively used in fish cytogenomic studies because they have a flexible organization at the chromosomal level, showing inter- and intra-specific variation in number and position in karyotypes. Sequences from the genome of Imparfinis schubarti (Heptapteridae) were isolated, aiming to understand the organization of 5S rDNA families in the fish genome. The isolation of 5S rDNA from the genome of I. schubarti was carried out by reassociation kinetics (C0t) and PCR amplification. The obtained sequences were cloned for the construction of a micro-library. The obtained clones were sequenced and hybridized in I. schubarti and Microglanis cottoides (Pseudopimelodidae) for chromosome mapping. An analysis of the sequence alignments with other fish groups was accomplished. Both methods were effective when using 5S rDNA for hybridization in I. schubarti genome. However, the C0t method enabled the use of a complete 5S rRNA gene, which was also successful in the hybridization of M. cottoides. Nevertheless, this gene was obtained only partially by PCR. The hybridization results and sequence analyses showed that intact 5S regions are more appropriate for the probe operation, due to conserved structure and motifs. This study contributes to a better understanding of the organization of multigene families in catfish genomes.
Removing the needle from the haystack: Enrichment of Wolbachia endosymbiont transcripts from host nematode RNA by Cappable-seq™.

PubMed

Luck, Ashley N; Slatko, Barton E; Foster, Jeremy M

2017-01-01

Efficient transcriptomic sequencing of microbial mRNA derived from host-microbe associations is often compromised by the much lower relative abundance of microbial RNA in the mixed total RNA sample. One solution to this problem is to perform extensive sequencing until an acceptable level of transcriptome coverage is obtained. More cost-effective methods include use of prokaryotic and/or eukaryotic rRNA depletion strategies, sometimes in conjunction with depletion of polyadenylated eukaryotic mRNA. Here, we report use of Cappable-seq™ to specifically enrich, in a single step, Wolbachia endobacterial mRNA transcripts from total RNA prepared from the parasitic filarial nematode, Brugia malayi. The obligate Wolbachia endosymbiont is a proven drug target for many human filarial infections, yet the precise nature of its symbiosis with the nematode host is poorly understood. Insightful analysis of the expression levels of Wolbachia genes predicted to underpin the mutualistic association and of known drug target genes at different life cycle stages or in response to drug treatments is typically challenged by low transcriptomic coverage. Cappable-seq resulted in up to ~5-fold increase in the number of reads mapping to Wolbachia. On average, coverage of Wolbachia transcripts from B. malayi microfilariae was enriched ~40-fold by Cappable-seq. This method has the additional benefit of selectively removing abundant prokaryotic ribosomal RNAs. The deeper microbial transcriptome sequencing afforded by Cappable-seq facilitates more detailed characterization of gene expression levels of pathogens and symbionts present in animal tissues.
Gear Shifting of Quadriceps during Isometric Knee Extension Disclosed Using Ultrasonography.

PubMed

Zhang, Shu; Huang, Weijian; Zeng, Yu; Shi, Wenxiu; Diao, Xianfen; Wei, Xiguang; Ling, Shan

2018-01-01

Ultrasonography has been widely employed to estimate the morphological changes of muscle during contraction. To further investigate the motion pattern of quadriceps during isometric knee extensions, we studied the relative motion pattern between femur and quadriceps under ultrasonography. An interesting observation is that although the force of isometric knee extension can be controlled to change almost linearly, the femur in the simultaneously captured ultrasound video sequences has several different piecewise moving patterns. This phenomenon resembles the quadriceps having several forward gear ratios, like a car starting from rest towards maximal voluntary contraction (MVC) and then returning to rest. Therefore, to verify this assumption, we captured several ultrasound video sequences of isometric knee extension and collected the torque/force signal simultaneously. We then extracted the shapes of the femur from these ultrasound video sequences using video processing techniques and studied the motion pattern both qualitatively and quantitatively. The phenomenon can be seen more easily via a comparison between the torque signal and the relative spatial distance between femur and quadriceps. Furthermore, we used cluster analysis techniques to study the process, and the clustering results also provided preliminary support for the conclusion that, during both ramp increasing and decreasing phases, quadriceps contraction may have several forward gear ratios relative to the femur.
The Diversity of Vibrios Associated with Vibriosis in Pacific White Shrimp (Litopenaeus vannamei) from Extensive Shrimp Pond in Kendal District, Indonesia

NASA Astrophysics Data System (ADS)

Sarjito; Harjuno Condro Haditomo, Alfabetian; Desrina; Djunaedi, Ali; Budi Prayitno, Slamet

2018-02-01

Vibriosis outbreaks frequently occur in extensive shrimp farming. This study was conducted to determine the clinical signs of white shrimp infected by Vibrio and to identify the bacteria associated with vibriosis in the Pacific white shrimp, Litopenaeus vannamei. Bacterial isolates were obtained from the hepatopancreas and telson of moribund shrimp collected from extensive shrimp ponds in Kendal District, Indonesia, and cultured on Thiosulfate Citrate Bile Salts Sucrose Agar (TCBSA). Isolates were clustered and identified using repetitive sequence-based polymerase chain reaction (rep-PCR). Three representative isolates (SJV 03, SJV 05 and SJV 19) were amplified by PCR using primers for 16S rRNA and sequenced for further identification. The clinical signs of shrimp affected by Vibrio were a pale hepatopancreas, a weak telson, dark and reddish coloration of the mouth, and patches of red colour on parts of the body including the carapace, pereiopods, pleopods, and telson. A total of 19 isolates were obtained, belonging to three groups of the genus Vibrio. Based on the 16S rDNA sequence analysis, the vibrios associated with vibriosis in white shrimp from extensive shrimp ponds in Kendal were closely related to Vibrio harveyi (SJV 03), V. parahaemolyticus (SJV 05) and V. alginolyticus (SJV 19).
Correlation between phenotypic antibiotic susceptibility and the resistome in Pseudomonas aeruginosa.

PubMed

Jaillard, Magali; van Belkum, Alex; Cady, Kyle C; Creely, David; Shortridge, Dee; Blanc, Bernadette; Barbu, E Magda; Dunne, W Michael; Zambardi, Gilles; Enright, Mark; Mugnier, Nathalie; Le Priol, Christophe; Schicklin, Stéphane; Guigon, Ghislaine; Veyrieras, Jean-Baptiste

2017-08-01

Genetic determinants of antibiotic resistance (AR) have been extensively investigated. High-throughput sequencing allows for the assessment of the relationship between genotype and phenotype. A panel of 672 Pseudomonas aeruginosa strains was analysed, including representatives of globally disseminated multidrug-resistant and extensively drug-resistant clones; genomes and multiple antibiograms were available. This panel was annotated for AR gene presence and polymorphism, defining a resistome in which integrons were included. Integrons were present in >70 distinct cassettes, with In5 being the most prevalent. Some cassettes closely associated with clonal complexes, whereas others spread across the phylogenetic diversity, highlighting the importance of horizontal transfer. A resistome-wide association study (RWAS) was performed for clinically relevant antibiotics by correlating the variability in minimum inhibitory concentration (MIC) values with resistome data. Resistome annotation identified 147 loci associated with AR. These loci consisted mainly of acquired genomic elements and intrinsic genes. The RWAS allowed for correct identification of resistance mechanisms for meropenem, amikacin, levofloxacin and cefepime, and added 46 novel mutations. Among these, 29 were variants of the oprD gene associated with variation in meropenem MIC. Using genomic and MIC data, phenotypic AR was successfully correlated with molecular determinants at the whole-genome sequence level. Copyright © 2017 Elsevier B.V. and International Society of Chemotherapy. All rights reserved.
Processing of the precursor of protamine P2 in mouse. Peptide mapping and N-terminal sequence analysis of intermediates.

PubMed Central

Carré-Eusèbe, D; Lederer, F; Lê, K H; Elsevier, S M

1991-01-01

Protamine P2, the major basic chromosomal protein of mouse spermatozoa, is synthesized as a precursor almost twice as long as the mature protein, its extra length arising from an N-terminal extension of 44 amino acid residues. This precursor is integrated into chromatin of spermatids, and the extension is processed during chromatin condensation in the haploid cells. We have studied processing in the mouse and have identified two intermediates generated by proteolytic cleavage of the precursor. H.p.l.c. separated protamine P2 from four other spermatid proteins, including the precursor and three proteins known to possess physiological characteristics expected of processing intermediates. Peptide mapping indicated that all of these proteins were structurally similar. Two major proteins were further purified by PAGE, transferred to poly(vinylidene difluoride) membranes and submitted to automated N-terminal sequence analysis. Both sequences were found within the deduced sequence of the precursor extension. The N-terminus of the larger intermediate, PP2C, was Gly-12, whereas the N-terminus of the smaller, PP2D, was His-21. Both processing sites involved a peptide bond in which the carbonyl function was contributed by an acidic amino acid. PMID:1854346
Substantial genome synteny preservation among woody angiosperm species: comparative genomics of Chinese chestnut (Castanea mollissima) and plant reference genomes.

PubMed

Staton, Margaret; Zhebentyayeva, Tetyana; Olukolu, Bode; Fang, Guang Chen; Nelson, Dana; Carlson, John E; Abbott, Albert G

2015-10-05

Chinese chestnut (Castanea mollissima) has emerged as a model species for the Fagaceae family with extensive genomic resources including a physical map, a dense genetic map and quantitative trait loci (QTLs) for chestnut blight resistance. These resources enable comparative genomics analyses relative to model plants. We assessed the degree of conservation between the chestnut genome and other well annotated and assembled plant genomic sequences, focusing on the QTL regions of most interest to the chestnut breeding community. The integrated physical and genetic map of Chinese chestnut has been improved to now include 858 shared sequence-based markers. The utility of the integrated map has also been improved through the addition of 42,970 BAC (bacterial artificial chromosome) end sequences spanning over 26 million bases of the estimated 800 Mb chestnut genome. Synteny analysis between chestnut and ten model plant species was conducted on a macro-syntenic scale using sequences from both individual probes and BAC end sequences across the chestnut physical map. Blocks of synteny with chestnut were found in all ten reference species, with the percentage of the chestnut physical map that could be aligned ranging from 10 to 39 %. The integrated genetic and physical map was utilized to identify BACs that spanned the three previously identified QTL regions conferring blight resistance. The clones were pooled and sequenced, yielding 396 sequence scaffolds covering 13.9 Mbp. Comparative genomic analysis on a micro-syntenic scale, using the QTL-associated genomic sequence, identified synteny from chestnut to other plant genomes, with 5.4 to 12.9 % of the genome sequences aligning. On both the macro- and micro-synteny levels, the peach, grape and poplar genomes were found to be the most structurally conserved with chestnut. Interestingly, these results did not strictly follow the expectation that decreased phylogenetic distance would correspond to increased levels of genome preservation, but rather suggest the additional influence of life-history traits on preservation of synteny. The regions of synteny that were detected provide an important tool for defining and cataloging genes in the QTL regions for advancing chestnut blight resistance research.
Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions and MapReduce.

PubMed

Yehdego, Daniel T; Zhang, Boyu; Kodimala, Vikram K R; Johnson, Kyle L; Taufer, Michela; Leung, Ming-Ying

2013-05-01

Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structure of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and the optimized methods. Each step of searching for inversions, chunking, and prediction can be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are computationally possible. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structures.
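The notion of an inversion used in this abstract (a stretch of nucleotides followed closely by its inverse complementary sequence) can be illustrated with a small search routine. The sketch below is a naive exact-match scan, not the authors' inversion-excursion scoring and not their MapReduce implementation; the function names and the stem_len and max_gap parameters are hypothetical, chosen to mirror the "stem lengths and gap sizes" the abstract mentions.

```python
# Naive illustration only: exact Watson-Crick matches, no G-U wobble pairs,
# no excursion statistics. All names here are assumptions for the example.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(seq):
    """Return the inverse complementary sequence of an RNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inversions(rna, stem_len, max_gap):
    """Return (left_start, right_start) index pairs where the stem
    rna[left_start:left_start + stem_len] is followed, after a gap of
    at most max_gap bases, by its inverse complement."""
    hits = []
    n = len(rna)
    for i in range(n - 2 * stem_len + 1):
        target = reverse_complement(rna[i:i + stem_len])
        first = i + stem_len                      # gap of zero
        last = min(first + max_gap, n - stem_len)  # largest allowed gap
        for j in range(first, last + 1):
            if rna[j:j + stem_len] == target:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    # GGG ... CCC forms a 3-base stem around a 4-base loop.
    print(find_inversions("GGGAAAACCC", stem_len=3, max_gap=4))
```

Because each starting position is examined independently, a scan of this kind splits naturally into parallel tasks, which is consistent with the abstract's remark that the inversion search can be performed in parallel.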
A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome.

PubMed

Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

2012-06-15

The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development.
A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome

PubMed Central

Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

2012-01-01

The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development. PMID:22508961
Amino-acid sequence and predicted three-dimensional structure of pea seed (Pisum sativum) ferritin.

PubMed Central

Lobreaux, S; Yewdall, S J; Briat, J F; Harrison, P M

1992-01-01

The iron storage protein, ferritin, is widely distributed in the living kingdom. Here the complete cDNA and derived amino-acid sequence of pea seed ferritin are described, together with its predicted secondary structure, namely a four-helix-bundle fold similar to those of mammalian ferritins, with a fifth short helix at the C-terminus. An N-terminal extension of 71 residues contains a transit peptide (first 47 residues) responsible for plastid targeting as in other plant ferritins, and this is cleaved before assembly. The second part of the extension (24 residues) belongs to the mature subunit; it is cleaved during germination. The amino-acid sequence of pea seed ferritin is aligned with those of other ferritins (49% amino-acid identity with H-chains and 40% with L-chains of human liver ferritin in the aligned region). A three-dimensional model has been constructed by fitting the aligned sequence to the coordinates of human H-chains, with appropriate modifications. A folded conformation with an 11-residue helix is predicted for the N-terminal extension. As in mammalian ferritins, 24 subunits assemble into a hollow shell. In pea seed ferritin, its N-terminal extension is exposed on the outside surface of the shell. Within each pea subunit is a ferroxidase centre resembling those of human ferritin H-chains except for a replacement of Glu-62 by His. The channel at the 4-fold-symmetry axes, defined by E-helices, is predicted to be hydrophilic in plant ferritins, whereas it is hydrophobic in mammalian ferritins. PMID:1472006
Not All Order Memory Is Equal: Test Demands Reveal Dissociations in Memory for Sequence Information

ERIC Educational Resources Information Center

Jonker, Tanya R.; MacLeod, Colin M.

2017-01-01

Remembering the order of a sequence of events is a fundamental feature of episodic memory. Indeed, a number of formal models represent temporal context as part of the memory system, and memory for order has been researched extensively. Yet, the nature of the code(s) underlying sequence memory is still relatively unknown. Across 4 experiments that…
Detection and characterization of Pasteuria 16S rRNA gene sequences from nematodes and soils.

PubMed

Duan, Y P; Castro, H F; Hewlett, T E; White, J H; Ogram, A V

2003-01-01

Various bacterial species in the genus Pasteuria have great potential as biocontrol agents against plant-parasitic nematodes, although study of this important genus is hampered by the current inability to cultivate Pasteuria species outside their host. To aid in the study of this genus, an extensive 16S rRNA gene sequence phylogeny was constructed and this information was used to develop cultivation-independent methods for detection of Pasteuria in soils and nematodes. Thirty new clones of Pasteuria 16S rRNA genes were obtained directly from nematodes and soil samples. These were sequenced and used to construct an extensive phylogeny of this genus. These sequences were divided into two deeply branching clades within the low-G + C, Gram-positive division; some sequences appear to represent novel species within the genus Pasteuria. In addition, a surprising degree of 16S rRNA gene sequence diversity was observed within what had previously been designated a single strain of Pasteuria penetrans (P-20). PCR primers specific to Pasteuria 16S rRNA for detection of Pasteuria in soils were also designed and evaluated. Detection limits for soil DNA were 100-10,000 Pasteuria endospores per gram of soil.
cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on CPU+GPU.

PubMed

Zhang, Jing; Wang, Hao; Feng, Wu-Chun

2017-01-01

BLAST, short for Basic Local Alignment Search Tool, is a ubiquitous tool used in the life sciences for pairwise sequence search. However, with the advent of next-generation sequencing (NGS), whether at the outset or downstream from NGS, the exponential growth of sequence databases is outstripping our ability to analyze the data. While recent studies have utilized the graphics processing unit (GPU) to speed up the BLAST algorithm for searching protein sequences (i.e., BLASTP), these studies use coarse-grained parallelism, where one sequence alignment is mapped to only one thread. Such an approach does not efficiently utilize the capabilities of a GPU, particularly due to the irregularity of BLASTP in both execution paths and memory-access patterns. To address the above shortcomings, we present a fine-grained approach to parallelize BLASTP, where each individual phase of sequence search is mapped to many threads on a GPU. This approach, which we refer to as cuBLASTP, reorders data-access patterns and reduces divergent branches of the most time-consuming phases (i.e., hit detection and ungapped extension). In addition, cuBLASTP optimizes the remaining phases (i.e., gapped extension and alignment with trace back) on a multicore CPU and overlaps their execution with the phases running on the GPU.
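To make the "hit detection" phase named in this abstract more concrete, the following is a deliberately simplified, single-threaded sketch of seeding by word lookup. It is not cuBLASTP and not the BLASTP implementation: it matches exact words only, whereas BLASTP also accepts similar "neighborhood" words scored with a substitution matrix, and it runs on the CPU. The function names and the default word length w=3 are assumptions made for the illustration.

```python
# Simplified illustration of word-based hit detection (assumed names, exact
# matches only). Real BLASTP also seeds on high-scoring neighborhood words.
from collections import defaultdict


def build_word_index(query, w=3):
    """Map every length-w word of the query to its start positions."""
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)
    return index


def detect_hits(query, subject, w=3):
    """Return (query_pos, subject_pos, diagonal) for each shared word.
    Hits on the same diagonal are the candidates that a later
    ungapped-extension step would try to extend."""
    index = build_word_index(query, w)
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], ()):
            hits.append((i, j, j - i))
    return hits


if __name__ == "__main__":
    print(detect_hits("MKVLA", "AAKVLT"))
```

The number of hits found per subject position varies from zero to many, which gives a feel for the irregular execution paths and memory accesses that the abstract identifies as the obstacle to efficient GPU use.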
Investigation of deformation twinning under complex stress states in a rolled magnesium alloy

DOE PAGES

Wu, Wei; Chuang, Chih-Pin; Qiao, Dongxiao; ...

2016-05-15

We employed a specially designed semi-circular notch specimen in the current study to generate various strain conditions, including uniaxial, biaxial, shear, and plane strains, which were utilized to explore the evolution of different deformation twinning systems under complex loading conditions. Using an in situ synchrotron X-ray diffraction mapping method, we found that extensive double twins were activated during loading, while nearly no extension twinning activity was detected. After the formation of {10.1} and {10.3} compression twins, they transformed into {10.1}-{10.2} and {10.3}-{10.2} double twins instantaneously at the early stage of deformation. The lattice strain evolutions in different hkls were mapped at selected load levels during the loading-unloading sequence. Finally, the relationship between the macroscopic straining and microscopic response was established.
SNP-VISTA: An Interactive SNPs Visualization Tool

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shah, Nameeta; Teplitsky, Michael V.; Pennacchio, Len A.

2005-07-05

Recent advances in sequencing technologies promise better diagnostics for many diseases as well as better understanding of evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it is possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease and then screen for causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples makes possible more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at http://genome.lbl.gov/vista/snpvista.

Case Study of a Small Scale Polytechnic Entrepreneurship Capstone Course Sequence

ERIC Educational Resources Information Center

Webster, Rustin D.; Kopp, Richard

2017-01-01

A multidisciplinary entrepreneurial senior capstone has been created for engineering technology students at a research I land-grant university statewide extension. The two semester course sequence welcomes students from Mechanical Engineering Technology, Electrical Engineering Technology, Computer Graphics Technology, and Organizational…
Long-term Quaternary uplift rates inferred from limestone caves in Sarawak, Malaysia

NASA Astrophysics Data System (ADS)

Farrant, Andrew R.; Smart, Peter L.; Whitaker, Fiona F.; Tarling, Donald H.

1995-04-01

The rate of long-term (2 m.y.) base-level lowering estimated in an extensive sequence of limestone caves in Sarawak, Malaysia, from uranium series, electron spin resonance, and paleomagnetic dating is 0.19 +0.03/-0.04 m/ka. This rate has remained constant over at least the last 700 ka, as shown by comparison of the number and spacing of wall notches formed during phases of interstadial and interglacial aggradation with peaks in the deep-sea oxygen isotope curve. It is argued that base-level lowering occurs in response to epirogenic uplift of the more resistant limestones due to regional denudation of the softer shales, and to flexural isostasy associated with high rates of offshore sedimentation.
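As a rough check on what this rate implies, the arithmetic can be worked through directly. The sketch below assumes the quoted rate applied uniformly over the full 2 m.y. (2000 ka) interval, which is a simplification: the abstract demonstrates constancy only for the last 700 ka. The variable and function names are illustrative.

```python
# Back-of-envelope arithmetic under a constant-rate assumption (not a result
# reported in the abstract): total lowering = rate * duration.
def total_lowering_m(rate_m_per_ka, duration_ka):
    """Total base-level lowering in metres for a constant rate."""
    return rate_m_per_ka * duration_ka


RATE = 0.19          # m/ka, best estimate quoted in the abstract
RATE_PLUS = 0.03     # m/ka, upper uncertainty
RATE_MINUS = 0.04    # m/ka, lower uncertainty
DURATION_KA = 2000.0  # 2 m.y. expressed in ka

best = total_lowering_m(RATE, DURATION_KA)
low = total_lowering_m(RATE - RATE_MINUS, DURATION_KA)
high = total_lowering_m(RATE + RATE_PLUS, DURATION_KA)

if __name__ == "__main__":
    print(f"best estimate: {best:.0f} m, range: {low:.0f} to {high:.0f} m")
```

Under that assumption the quoted rate corresponds to roughly 380 m of base-level lowering over 2 m.y., with the stated uncertainty spanning about 300 to 440 m.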
Major histocompatibility complex variation in the endangered Przewalski's horse.

PubMed Central

Hedrick, P W; Parker, K M; Miller, E L; Miller, P S

1999-01-01

The major histocompatibility complex (MHC) is a fundamental part of the vertebrate immune system, and the high variability in many MHC genes is thought to play an essential role in recognition of parasites. The Przewalski's horse is extinct in the wild and all the living individuals descend from 13 founders, most of whom were captured around the turn of the century. One of the primary genetic concerns in endangered species is whether they have ample adaptive variation to respond to novel selective factors. In examining 14 Przewalski's horses that are broadly representative of the living animals, we found six different class II DRB major histocompatibility sequences. The sequences showed extensive nonsynonymous variation, concentrated in the putative antigen-binding sites, and little synonymous variation. Individuals had from two to four sequences as determined by single-stranded conformation polymorphism (SSCP) analysis. On the basis of the SSCP data, phylogenetic analysis of the nucleotide sequences, and segregation in a family group, we conclude that four of these sequences are from one gene (although one sequence codes for a nonfunctional allele because it contains a stop codon) and two other sequences are from another gene. The position of the stop codon is at the same amino-acid position as in a closely related sequence from the domestic horse. Because other organisms have extensive variation at homologous loci, the Przewalski's horse may have quite low variation in this important adaptive region. PMID:10430594
Applications of the rep-PCR DNA fingerprinting technique to study microbial diversity, ecology and evolution.

PubMed

Ishii, Satoshi; Sadowsky, Michael J

2009-04-01

A large number of repetitive DNA sequences are found in multiple sites in the genomes of numerous bacteria, archaea and eukarya. While the functions of many of these repetitive sequence elements are unknown, they have proven to be useful as the basis of several powerful tools for use in molecular diagnostics, medical microbiology, epidemiological analyses and environmental microbiology. The repetitive sequence-based PCR or rep-PCR DNA fingerprint technique uses primers targeting several of these repetitive elements and PCR to generate unique DNA profiles or 'fingerprints' of individual microbial strains. Although this technique has been extensively used to examine diversity among a variety of prokaryotic microorganisms, rep-PCR DNA fingerprinting can also be applied to microbial ecology and microbial evolution studies since it has the power to distinguish microbes at the strain or isolate level. Recent advances in rep-PCR methodology have resulted in increased accuracy, reproducibility and throughput. In this minireview, we summarize recent improvements in rep-PCR DNA fingerprinting methodology, and discuss its applications to address fundamentally important questions in microbial ecology and evolution.
Mutations that alter a conserved element upstream of the potato virus X triple block and coat protein genes affect subgenomic RNA accumulation.

PubMed

Kim, K H; Hemenway, C

1997-05-26

The putative subgenomic RNA (sgRNA) promoter regions upstream of the potato virus X (PVX) triple block and coat protein (CP) genes contain sequences common to other potexviruses. The importance of these sequences to PVX sgRNA accumulation was determined by inoculation of Nicotiana tabacum NT1 cell suspension protoplasts with transcripts derived from wild-type and modified PVX cDNA clones. Analyses of RNA accumulation by S1 nuclease digestion and primer extension indicated that a conserved octanucleotide sequence element and the spacing between this element and the start-site for sgRNA synthesis are critical for accumulation of the two major sgRNA species. The impact of mutations on CP sgRNA levels was also reflected in the accumulation of CP. In contrast, genomic minus- and plus-strand RNA accumulation were not significantly affected by mutations in these regions. Studies involving inoculation of tobacco plants with the modified transcripts suggested that the conserved octanucleotide element functions in sgRNA accumulation and some other aspect of the infection process.
Nucleospora cyclopteri n. sp., an intranuclear microsporidian infecting wild lumpfish, Cyclopterus lumpus L., in Icelandic waters

PubMed Central

2013-01-01

Background: Commercial fisheries of lumpfish Cyclopterus lumpus have been carried out in Iceland for centuries. Traditionally the most valuable part is the eggs, which are harvested for use as a caviar substitute. Previously reported parasitic infections from lumpfish include an undescribed intranuclear microsporidian associated with abnormal kidneys and mortalities in captive lumpfish in Canada. During Icelandic lumpfish fisheries in spring 2011, extensive enlargements to the kidneys were observed in some fish during processing. The aim of this study was to identify the pathogen responsible for these abnormalities. Methods: Lumpfish from the Icelandic coast were examined for the causative agent of kidney enlargement. Fish were dissected and used in histological and molecular studies. Results: Lumpfish, with various grades of clinical signs, were observed at 12 of the 43 sites sampled around Iceland. From a total of 77 fish examined, 18 had clear clinical signs, the most prominent of which was an extensive enlargement and pallor of the kidneys. The histopathology of the most severely affected fish consisted of extensive degeneration and necrosis of kidney tubules and vacuolar degeneration of the haematopoietic tissue. Intranuclear microsporidians were detected in all organs examined in fish with prominent clinical signs and most organs of apparently healthy fish using the new PCR and histological examination. One or multiple uniformly oval shaped spores measuring 3.12 ± 0.15 × 1.30 ± 0.12 μm were observed in the nucleus of affected lymphocytes and lymphocyte precursor cells. DNA sequencing provided a ribosomal DNA sequence that was strongly supported in phylogenetic analyses in a clade containing other microsporidian parasites from the Enterocytozoonidae, showing highest similarity to the intranuclear microsporidian Nucleospora salmonis. Conclusions: Intranuclear microsporidian infections are common in wild-caught lumpfish from around the Icelandic coast. Infections can cause severe clinical signs and extensive histopathological changes, but are also present, at lower levels, in fish that do not show clinical signs. Some common features exist with the intranuclear microsporidian previously reported from captive Canadian lumpfish, but DNA sequence data is required from Canadian fish to confirm conspecificity. Based on phylogenetic analysis and the intranuclear location of the parasite, the name Nucleospora cyclopteri n. sp. is proposed. PMID:23445616
Low level of sequence diversity at merozoite surface protein-1 locus of Plasmodium ovale curtisi and P. ovale wallikeri from Thai isolates.

PubMed

Putaporntip, Chaturong; Hughes, Austin L; Jongwutiwes, Somchai

2013-01-01

The merozoite surface protein-1 (MSP-1) is a candidate target for the development of blood stage vaccines against malaria. Polymorphism in MSP-1 can be useful as a genetic marker for strain differentiation in malarial parasites. Although sequence diversity in the MSP-1 locus has been extensively analyzed in field isolates of Plasmodium falciparum and P. vivax, the extent of variation in its homologues in P. ovale curtisi and P. ovale wallikeri, remains unknown. Analysis of the mitochondrial cytochrome b sequences of 10 P. ovale isolates from symptomatic malaria patients from diverse endemic areas of Thailand revealed co-existence of P. ovale curtisi (n = 5) and P. ovale wallikeri (n = 5). Direct sequencing of the PCR-amplified products encompassing the entire coding region of MSP-1 of P. ovale curtisi (PocMSP-1) and P. ovale wallikeri (PowMSP-1) has identified 3 imperfect repeated segments in the former and one in the latter. Most amino acid differences between these proteins were located in the interspecies variable domains of malarial MSP-1. Synonymous nucleotide diversity (πS) exceeded nonsynonymous nucleotide diversity (πN) for both PocMSP-1 and PowMSP-1, albeit at a non-significant level. However, when MSP-1 of both these species was considered together, πS was significantly greater than πN (p<0.0001), suggesting that purifying selection has shaped diversity at this locus prior to speciation. Phylogenetic analysis based on conserved domains has placed PocMSP-1 and PowMSP-1 in a distinct bifurcating branch that probably diverged from each other around 4.5 million years ago. The MSP-1 sequences support that P. ovale curtisi and P. ovale wallikeri are distinct species. Both species are sympatric in Thailand. The low level of sequence diversity in PocMSP-1 and PowMSP-1 among Thai isolates could stem from persistent low prevalence of these species, limiting the chance of outcrossing at this locus.
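The diversity statistics in this abstract (πS and πN) are both forms of nucleotide diversity, the average number of pairwise differences per site among aligned sequences. The sketch below computes only the plain, unpartitioned π for a toy alignment. Splitting π into synonymous and nonsynonymous components requires a codon-based method such as Nei-Gojobori, which is not shown here, and nothing in this example reproduces the authors' data or software. The function name and the toy sequences are hypothetical.

```python
# Plain nucleotide diversity (pi) for equal-length, gap-free aligned sequences.
# Illustration only; the pi_S / pi_N partition needs codon-aware site counting.
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average proportion of differing sites over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        total_diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # made-up sequences
    print(nucleotide_diversity(toy_alignment))
```

When πS exceeds πN, as reported above, silent changes are accumulating faster than amino-acid-altering ones, which is the pattern the authors interpret as evidence of purifying selection.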
Low Level of Sequence Diversity at Merozoite Surface Protein-1 Locus of Plasmodium ovale curtisi and P. ovale wallikeri from Thai Isolates

PubMed Central

Putaporntip, Chaturong; Hughes, Austin L.; Jongwutiwes, Somchai

2013-01-01

Background: The merozoite surface protein-1 (MSP-1) is a candidate target for the development of blood stage vaccines against malaria. Polymorphism in MSP-1 can be useful as a genetic marker for strain differentiation in malarial parasites. Although sequence diversity in the MSP-1 locus has been extensively analyzed in field isolates of Plasmodium falciparum and P. vivax, the extent of variation in its homologues in P. ovale curtisi and P. ovale wallikeri, remains unknown. Methodology/Principal Findings: Analysis of the mitochondrial cytochrome b sequences of 10 P. ovale isolates from symptomatic malaria patients from diverse endemic areas of Thailand revealed co-existence of P. ovale curtisi (n = 5) and P. ovale wallikeri (n = 5). Direct sequencing of the PCR-amplified products encompassing the entire coding region of MSP-1 of P. ovale curtisi (PocMSP-1) and P. ovale wallikeri (PowMSP-1) has identified 3 imperfect repeated segments in the former and one in the latter. Most amino acid differences between these proteins were located in the interspecies variable domains of malarial MSP-1. Synonymous nucleotide diversity (πS) exceeded nonsynonymous nucleotide diversity (πN) for both PocMSP-1 and PowMSP-1, albeit at a non-significant level. However, when MSP-1 of both these species was considered together, πS was significantly greater than πN (p<0.0001), suggesting that purifying selection has shaped diversity at this locus prior to speciation. Phylogenetic analysis based on conserved domains has placed PocMSP-1 and PowMSP-1 in a distinct bifurcating branch that probably diverged from each other around 4.5 million years ago. Conclusion/Significance: The MSP-1 sequences support that P. ovale curtisi and P. ovale wallikeri are distinct species. Both species are sympatric in Thailand. The low level of sequence diversity in PocMSP-1 and PowMSP-1 among Thai isolates could stem from persistent low prevalence of these species, limiting the chance of outcrossing at this locus. PMID:23536840
Diversity and Evolution of Mycobacterium tuberculosis: Moving to Whole-Genome-Based Approaches

PubMed Central

Niemann, Stefan; Supply, Philip

2014-01-01

Genotyping of clinical Mycobacterium tuberculosis complex (MTBC) strains has become a standard tool for epidemiological tracing and for the investigation of the local and global strain population structure. Of special importance is the analysis of the expansion of multidrug (MDR) and extensively drug-resistant (XDR) strains. Classical genotyping and, more recently, whole-genome sequencing have revealed that the strains of the MTBC are more diverse than previously anticipated. Globally, several phylogenetic lineages can be distinguished whose geographical distribution is markedly variable. Strains of particular (sub)lineages, such as Beijing, seem to be more virulent and associated with enhanced resistance levels and fitness, likely fueling their spread in certain world regions. The upcoming generalization of whole-genome sequencing approaches will expectedly provide more comprehensive insights into the molecular and epidemiological mechanisms involved and lead to better diagnostic and therapeutic tools. PMID:25190252
Genomic big data hitting the storage bottleneck.

PubMed

Papageorgiou, Louis; Eleni, Picasi; Raftopoulou, Sofia; Mantaiou, Meropi; Megalooikonomou, Vasileios; Vlachakis, Dimitrios

2018-01-01

Over the last decades, bioinformatics has seen a vast data explosion. Big data centres are trying to face this data crisis by reaching high storage capacity levels. Although several scientific giants are examining how to handle the enormous pile of information in their cupboards, the problem remains unsolved. On a daily basis, a massive quantity of extensive information is permanently lost due to infrastructure and storage space problems. The motivation for sequencing has fallen behind. Sometimes, the time spent solving storage space problems is longer than the time dedicated to collecting and analysing data. To bring sequencing to the foreground, scientists have to get past such obstacles and find alternative ways to approach the issue of data volume. The scientific community is experiencing a data crisis era, in which out-of-the-box solutions may ease the typical research workflow until technological development meets the needs of bioinformatics.
Robot Task Commander with Extensible Programming Environment

NASA Technical Reports Server (NTRS)

Hart, Stephen W (Inventor); Wightman, Brian J (Inventor); Dinh, Duy Paul (Inventor); Yamokoski, John D. (Inventor); Gooding, Dustin R (Inventor)

2014-01-01

A system for developing distributed robot application-level software includes a robot having an associated control module which controls motion of the robot in response to a commanded task, and a robot task commander (RTC) in networked communication with the control module over a network transport layer (NTL). The RTC includes a script engine(s) and a GUI, with a processor and a centralized library of library blocks constructed from an interpretive computer programming code and having input and output connections. The GUI provides access to a Visual Programming Language (VPL) environment and a text editor. In executing a method, the VPL is opened, a task for the robot is built from the code library blocks, and data is assigned to input and output connections identifying input and output data for each block. A task sequence(s) is sent to the control module(s) over the NTL to command execution of the task.
Integration, warehousing, and analysis strategies of Omics data.

PubMed

Gedela, Srinubabu

2011-01-01

"-Omics" is a current suffix for numerous types of large-scale biological data generation procedures, which naturally demand the development of novel algorithms for data storage and analysis. With next generation genome sequencing burgeoning, it is pivotal to decipher a coding site on the genome, a gene's function, and information on transcripts next to the pure availability of sequence information. To explore a genome and downstream molecular processes, we need umpteen results at the various levels of cellular organization by utilizing different experimental designs, data analysis strategies and methodologies. Here comes the need for controlled vocabularies and data integration to annotate, store, and update the flow of experimental data. This chapter explores key methodologies to merge Omics data by semantic data carriers, discusses controlled vocabularies as eXtensible Markup Languages (XML), and provides practical guidance, databases, and software links supporting the integration of Omics data.
Basement geology of the National Petroleum Reserve Alaska (NPRA), Northern Alaska

USGS Publications Warehouse

Saltus, R.W.; Hudson, T.L.; Phillips, J.D.; Kulander, C.; Dumoulin, Julie A.; Potter, C.

2002-01-01

Gravity, aeromagnetic, seismic, and borehole information enable mapping of crustal basement characteristics within the National Petroleum Reserve Alaska (NPRA). In general, the pre-Mississippian basement of the southern portion of the NPRA is different from that in the north in that it is deeper and thinner, is made up of dense magnetic rocks, is cut by more normal faults, and underlies thicker accumulations of Mississippian to Triassic Ellesmerian sequence sedimentary rocks. Mafic igneous rocks within the basement and locally within the deeper Ellesmerian sequence sedimentary section could explain the observed density and magnetic variations. Because these variations spatially overlap thicker Ellesmerian sequence sediment accumulations, they may have developed, at least in part, during Mississippian to Triassic extension and basin formation. If this period of extension, and postulated mafic magmatism, was accompanied by higher heat flow, then early Ellesmerian sequence clastic sediments may have become mature for hydrocarbon generation (Magoon and Bird, 1988). This could have produced an early petroleum system in the Colville basin.
Front-End Electron Transfer Dissociation Coupled to a 21 Tesla FT-ICR Mass Spectrometer for Intact Protein Sequence Analysis

NASA Astrophysics Data System (ADS)

Weisbrod, Chad R.; Kaiser, Nathan K.; Syka, John E. P.; Early, Lee; Mullen, Christopher; Dunyach, Jean-Jacques; English, A. Michelle; Anderson, Lissa C.; Blakney, Greg T.; Shabanowitz, Jeffrey; Hendrickson, Christopher L.; Marshall, Alan G.; Hunt, Donald F.

2017-09-01

High resolution mass spectrometry is a key technology for in-depth protein characterization. High-field Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) enables high-level interrogation of intact proteins in the most detail to date. However, an appropriate complement of fragmentation technologies must be paired with FTMS to provide comprehensive sequence coverage, as well as characterization of sequence variants, and post-translational modifications. Here we describe the integration of front-end electron transfer dissociation (FETD) with a custom-built 21 tesla FT-ICR mass spectrometer, which yields unprecedented sequence coverage for proteins ranging from 2.8 to 29 kDa, without the need for extensive spectral averaging (e.g., 60% sequence coverage for apo-myoglobin with four averaged acquisitions). The system is equipped with a multipole storage device separate from the ETD reaction device, which allows accumulation of multiple ETD fragment ion fills. Consequently, an optimally large product ion population is accumulated prior to transfer to the ICR cell for mass analysis, which improves mass spectral signal-to-noise ratio, dynamic range, and scan rate. We find a linear relationship between protein molecular weight and minimum number of ETD reaction fills to achieve optimum sequence coverage, thereby enabling more efficient use of instrument data acquisition time. Finally, real-time scaling of the number of ETD reaction fills during method-based acquisition is shown, and the implications for LC-MS/MS top-down analysis are discussed.
Genomic structure of the human D-site binding protein (DBP) gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shutler, G.; Glassco, T.; Kang, Xiaolin

1996-06-15

The human gene for the D-Site Binding Protein (DBP) has been sequenced and characterized. This gene is a member of the b/ZIP family of transcription factors and is one of three genes forming the PAR sub-family. DBP has been implicated in the diurnal regulation of a variety of liver-specific genes. Examination of the genomic structure of DBP reveals that the gene is divided into four exons and is contained within a relatively compact region of approximately 6 kb. These exons appear to correspond to functional divisions of the DBP protein. Exon 1 contains a long 5′ UTR, and conservation between the rat and the human genes of the presence of small open reading frames within this region suggests that it may play a role in translational control. Exon 2 contains a limited region of similarity to the other PAR domain genes, which may be part of a potential activation domain. Exon 3 contains the PAR domain and differs by only 1 of 71 amino acids between rat and human. Exon 4, containing both the basic and the leucine zipper domains, is likewise highly conserved. The overall degree of homology between the rat and the human cDNA sequences is 82% for the nucleic acid sequence and 92% for the protein sequence. Comparison of the rat and human proximal promoters reveals extensive sequence conservation, with two previously characterized DNA binding sites being conserved at the functional and sequence levels. 31 refs., 4 figs.
Mapping the Space of Genomic Signatures

PubMed Central

Kari, Lila; Hill, Kathleen A.; Sayem, Abu S.; Karamichalis, Rallis; Bryans, Nathaniel; Davis, Katelyn; Dattani, Nikesh S.

2015-01-01

We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. 
This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and that the sequence most different from it in this dataset belongs to a cucumber. PMID:26000734
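The Chaos Game Representation mentioned above can be illustrated with a minimal Python sketch that bins k-mer occurrences into a 2^k × 2^k grid. The corner assignment (A, C, G, T) and the function name `cgr_counts` are assumptions made for illustration; the abstract does not specify the layout, and the DSSIM image distance used by the authors is not implemented here.

```python
# Hypothetical corner layout for the four bases (an assumption, not from the paper).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_counts(seq, k):
    """Return a 2^k x 2^k grid of k-mer counts arranged as a CGR image."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # skip k-mers containing N or other ambiguity codes
        x = y = 0
        # The most recent base selects the coarsest quadrant, earlier bases
        # refine the position, mirroring the "move halfway to the corner" rule.
        for base in reversed(kmer):
            cx, cy = CORNERS[base]
            x = 2 * x + cx
            y = 2 * y + cy
        grid[y][x] += 1
    return grid


if __name__ == "__main__":
    for row in cgr_counts("ACGTACGTTTGA", 2):
        print(row)
```

Two such grids, once normalised, can be compared with any image distance; the choice of distance (DSSIM in the abstract) is what turns the grids into a Molecular Distance Map.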
Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller.

PubMed

Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun

2017-01-03

Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
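The idea behind molecular barcoding can be sketched in a few lines of Python: reads sharing a barcode derive from the same original molecule, so disagreement among them points to amplification or sequencing error. This is a simple majority-vote illustration only, not smCounter's Bayesian model; the function name and the `min_reads` and `min_fraction` thresholds are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=2, min_fraction=0.8):
    """Collapse reads at one genomic position into per-barcode consensus bases.

    reads: iterable of (barcode, base) pairs.
    Returns {barcode: base} for barcodes with enough mutually agreeing reads.
    """
    by_barcode = defaultdict(list)
    for barcode, base in reads:
        by_barcode[barcode].append(base)

    consensus = {}
    for barcode, bases in by_barcode.items():
        if len(bases) < min_reads:
            continue  # a single read cannot be checked against its siblings
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_fraction:
            consensus[barcode] = base
    return consensus


if __name__ == "__main__":
    reads = [("AAA", "T"), ("AAA", "T"), ("AAA", "T"),
             ("CCC", "G"), ("CCC", "T"), ("GGG", "T")]
    print(consensus_by_barcode(reads))
```

Allele fractions would then be counted over barcodes rather than over raw reads, which is what makes low-fraction variants distinguishable from random errors.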
htsint: a Python library for sequencing pipelines that combines data through gene set generation.

PubMed

Richards, Adam J; Herrel, Anthony; Bonneaud, Camille

2015-09-24

Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.
Dynamic Encoding of Speech Sequence Probability in Human Temporal Cortex

PubMed Central

Leonard, Matthew K.; Bouchard, Kristofer E.; Tang, Claire

2015-01-01

Sensory processing involves identification of stimulus features, but also integration with the surrounding sensory and cognitive context. Previous work in animals and humans has shown fine-scale sensitivity to context in the form of learned knowledge about the statistics of the sensory environment, including relative probabilities of discrete units in a stream of sequential auditory input. These statistics are a defining characteristic of one of the most important sequential signals humans encounter: speech. For speech, extensive exposure to a language tunes listeners to the statistics of sound sequences. To address how speech sequence statistics are neurally encoded, we used high-resolution direct cortical recordings from human lateral superior temporal cortex as subjects listened to words and nonwords with varying transition probabilities between sound segments. In addition to their sensitivity to acoustic features (including contextual features, such as coarticulation), we found that neural responses dynamically encoded the language-level probability of both preceding and upcoming speech sounds. Transition probability first negatively modulated neural responses, followed by positive modulation of neural responses, consistent with coordinated predictive and retrospective recognition processes, respectively. Furthermore, transition probability encoding was different for real English words compared with nonwords, providing evidence for online interactions with high-order linguistic knowledge. These results demonstrate that sensory processing of deeply learned stimuli involves integrating physical stimulus features with their contextual sequential structure. Despite not being consciously aware of phoneme sequence statistics, listeners use this information to process spoken input and to link low-level acoustic representations with linguistic information about word identity and meaning. PMID:25948269
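As a rough illustration of what "transition probability between sound segments" means, the following Python sketch estimates forward transition probabilities P(next | current) from a toy list of segment sequences. The function name and the toy transcriptions are hypothetical; the study's actual corpus and estimation procedure are not described in the abstract.

```python
from collections import Counter


def transition_probabilities(sequences):
    """Estimate P(next segment | current segment) from segment sequences."""
    pair_counts = Counter()
    first_counts = Counter()
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            pair_counts[(current, nxt)] += 1
            first_counts[current] += 1
    return {
        (current, nxt): n / first_counts[current]
        for (current, nxt), n in pair_counts.items()
    }


if __name__ == "__main__":
    # Toy segment sequences, invented for illustration only.
    words = [["k", "ae", "t"], ["k", "ae", "b"], ["k", "ih", "t"]]
    for pair, prob in sorted(transition_probabilities(words).items()):
        print(pair, round(prob, 3))
```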
Modic Type 1 Changes: Detection Performance of Fat-Suppressed Fluid-Sensitive MRI Sequences.

PubMed

Finkenstaedt, Tim; Del Grande, Filippo; Bolog, Nicolae; Ulrich, Nils; Tok, Sina; Kolokythas, Orpheus; Steurer, Johann; Andreisek, Gustav; Winklhofer, Sebastian

2018-02-01

To assess the performance of fat-suppressed fluid-sensitive MRI sequences compared to T1-weighted (T1w) / T2w sequences for the detection of Modic 1 end-plate changes on lumbar spine MRI. Sagittal T1w, T2w, and fat-suppressed fluid-sensitive MRI images of 100 consecutive patients (consequently 500 vertebral segments; 52 female, mean age 74 ± 7.4 years; 48 male, mean age 71 ± 6.3 years) were retrospectively evaluated. We recorded the presence (yes/no) and extension (i.e., Likert-scale of height, volume, and end-plate extension) of Modic I changes in T1w/T2w sequences and compared the results to fat-suppressed fluid-sensitive sequences (McNemar/Wilcoxon-signed-rank test). Fat-suppressed fluid-sensitive sequences revealed significantly more Modic I changes compared to T1w/T2w sequences (156 vs. 93 segments, respectively; p < 0.001). The extension of Modic I changes in fat-suppressed fluid-sensitive sequences was significantly larger compared to T1w/T2w sequences (height: 2.53 ± 0.82 vs. 2.27 ± 0.79, volume: 2.35 ± 0.76 vs. 2.1 ± 0.65, end-plate: 2.46 ± 0.76 vs. 2.19 ± 0.81), (p < 0.05). Modic I changes that were only visible in fat-suppressed fluid-sensitive sequences but not in T1w/T2w sequences were significantly smaller compared to Modic I changes that were also visible in T1w/T2w sequences (p < 0.05). In conclusion, fat-suppressed fluid-sensitive MRI sequences revealed significantly more Modic I end-plate changes and demonstrated a greater extent compared to standard T1w/T2w imaging. · When the Modic classification was defined in 1988, T2w sequences were heavily T2-weighted and thus virtually fat-suppressed. · Nowadays, the bright fat signal in T2w images masks edema-like changes. · The conventional definition of Modic I changes is not fully applicable anymore. · Fat-suppressed fluid-sensitive MRI sequences revealed more/greater extent of Modic I changes.
Finkenstaedt T, Del Grande F, Bolog N et al. Modic Type 1 Changes: Detection Performance of Fat-Suppressed Fluid-Sensitive MRI Sequences. Fortschr Röntgenstr 2018; 190: 152 - 160. © Georg Thieme Verlag KG Stuttgart · New York.
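The McNemar test named in the abstract compares two paired yes/no readings of the same segments and depends only on the discordant pairs. A minimal sketch of the exact (binomial) version in Python follows; the function name is an assumption, and the abstract does not report the discordant counts, so any numbers passed to it are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs positive on method 1 only; c: pairs positive on method 2 only.
    Under the null hypothesis each discordant pair is equally likely to fall
    in either cell, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical discordant counts, for illustration only.
    print(mcnemar_exact(3, 20))
```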

Cell-type-specific profiling of protein-DNA interactions without cell isolation using targeted DamID with next-generation sequencing.

PubMed

Marshall, Owen J; Southall, Tony D; Cheetham, Seth W; Brand, Andrea H

2016-09-01

This protocol is an extension to: Nat. Protoc. 2, 1467-1478 (2007); doi:10.1038/nprot.2007.148; published online 7 June 2007. The ability to profile transcription and chromatin binding in a cell-type-specific manner is a powerful aid to understanding cell-fate specification and cellular function in multicellular organisms. We recently developed targeted DamID (TaDa) to enable genome-wide, cell-type-specific profiling of DNA- and chromatin-binding proteins in vivo without cell isolation. As a protocol extension, this article describes substantial modifications to an existing protocol, and it offers additional applications. TaDa builds upon DamID, a technique for detecting genome-wide DNA-binding profiles of proteins, by coupling it with the GAL4 system in Drosophila to enable both temporal and spatial resolution. TaDa ensures that Dam-fusion proteins are expressed at very low levels, thus avoiding toxicity and potential artifacts from overexpression. The modifications to the core DamID technique presented here also increase the speed of sample processing and throughput, and adapt the method to next-generation sequencing technology. TaDa is robust, reproducible and highly sensitive. Compared with other methods for cell-type-specific profiling, the technique requires no cell-sorting, cross-linking or antisera, and binding profiles can be generated from as few as 10,000 total induced cells. By profiling the genome-wide binding of RNA polymerase II (Pol II), TaDa can also identify transcribed genes in a cell-type-specific manner. Here we describe a detailed protocol for carrying out TaDa experiments and preparing the material for next-generation sequencing. Although we developed TaDa in Drosophila, it should be easily adapted to other organisms with an inducible expression system. Once transgenic animals are obtained, the entire experimental procedure, from collecting tissue samples to generating sequencing libraries, can be accomplished within 5 d.
Comparative phenotypic analysis of Gossypium raimondii with Upland cotton

USDA-ARS's Scientific Manuscript database

Gossypium raimondii Ulbr., a wild species with a diploid genome, has been sequenced due to its small genome size and sequence similarity with the polyploid cultivated Gossypium species. Accessibility of the G. raimondii genome has made the species a reference used extensively in cotton genomic and...
Contrasting fluvial styles across the mid-Pleistocene climate transition in the northern shelf of the South China Sea: Evidence from 3D seismic data

NASA Astrophysics Data System (ADS)

Zhuo, Haiteng; Wang, Yingmin; Shi, Hesheng; He, Min; Chen, Weitao; Li, Hua; Wang, Ying; Yan, Weiyao

2015-12-01

Multiple successions of buried fluvial channel systems were identified in the Quaternary section of the mid-shelf region of the northern South China Sea, providing a new case study for understanding the interplay between sea level variations and climate change. Using three commercial 3D seismic surveys, accompanied by several 2D lines and a few shallow boreholes, the sequence stratigraphy, seismic geomorphology and stratal architecture of these fluvial channels were carefully investigated. Based on their origin, dimensions, planform geometries and infill architectures, six classes of channel systems, from Class 1 to Class 6, were recognized within five sequences of Quaternary section (SQ1 to SQ5). Three types of fluvial systems among them are incised in their nature, including the trunk incised valleys (Class 1), medium incised valleys (Class 2) and incised tributaries (Class 3). The other three types are unincised, which comprise the trunk channels (Class 4), lateral migrating channels (Class 5) and the stable channels (Class 6). The trunk channels and/or the major valleys that contain braided channels at their base are hypothesized to be a product of deposition from the "big rivers" that have puzzled the sedimentologists for the last decade, providing evidence for the existence of such rivers in the ancient record. Absolute age dates from a few shallow boreholes indicate that the landscapes that were associated with these fluvial systems changed significantly near the completion of the mid-Pleistocene climate transition (MPT), which approximately corresponds to horizon SB2 with an age of ∼0.6 Ma BP. Below SB2, the Early Pleistocene sequence (SQ1) is dominated by a range of different types of unincised fluvial systems. Evidence of incised valleys is absent in SQ1. In contrast, extensive fluvial incision occurred in the successions above horizon SB2 (within SQ2-SQ5). 
Although recent studies attribute increased incision to a climate-controlled increase in river discharge, the down-dip location of our study area suggests that relative sea level change was the most important control of the evolution of fluvial systems. However, it is acknowledged that climate change was also important through its role in regulating glacio-eustasy. We speculate that the small amplitude and periodicity of sea level cycles before and during the MPT were not sufficient to fully expose the shelf and cause extensive fluvial incisions. Completion of the MPT as well as the onset of 100 ky climate cycles at ∼0.6 Ma, during which the duration of cycles and magnitude of sea level change both increased, are considered to be the triggering events for extensive development of incised fluvial systems. In addition to the eustatically driven causes of enhanced incision, the intensification of the East Asia monsoon at 0.9 Ma and 0.6 Ma driven by the episodic uplift of the Tibetan Plateau may have also significantly enhanced the amplitude of sea level falls and thus the fluvial incisions of the northern shelf of the South China Sea.
Analysis of Monoclonal Antibodies in Human Serum as a Model for Clinical Monoclonal Gammopathy by Use of 21 Tesla FT-ICR Top-Down and Middle-Down MS/MS

NASA Astrophysics Data System (ADS)

He, Lidong; Anderson, Lissa C.; Barnidge, David R.; Murray, David L.; Hendrickson, Christopher L.; Marshall, Alan G.

2017-05-01

With the rapid growth of therapeutic monoclonal antibodies (mAbs), stringent quality control is needed to ensure clinical safety and efficacy. Monoclonal antibody primary sequence and post-translational modifications (PTM) are conventionally analyzed with labor-intensive, bottom-up tandem mass spectrometry (MS/MS), which is limited by incomplete peptide sequence coverage and introduction of artifacts during the lengthy analysis procedure. Here, we describe top-down and middle-down approaches with the advantages of fast sample preparation with minimal artifacts, ultrahigh mass accuracy, and extensive residue cleavages by use of 21 tesla FT-ICR MS/MS. The ultrahigh mass accuracy yields an RMS error of 0.2-0.4 ppm for antibody light chain, heavy chain, heavy chain Fc/2, and Fd subunits. The corresponding sequence coverages are 81%, 38%, 72%, and 65% with MS/MS RMS error 4 ppm. Extension to a monoclonal antibody in human serum as a monoclonal gammopathy model yielded 53% sequence coverage from two nano-LC MS/MS runs. A blind analysis of five therapeutic monoclonal antibodies at clinically relevant concentrations in human serum resulted in correct identification of all five antibodies. Nano-LC 21 T FT-ICR MS/MS provides nonpareil mass resolution, mass accuracy, and sequence coverage for mAbs, and sets a benchmark for MS/MS analysis of multiple mAbs in serum. This is the first time that extensive cleavages for both variable and constant regions have been achieved for mAbs in a human serum background.
Amazonian phylogeography: mtDNA sequence variation in arboreal echimyid rodents (Caviomorpha).

PubMed

da Silva, M N; Patton, J L

1993-09-01

Patterns of evolutionary relationships among haplotype clades of sequences of the mitochondrial cytochrome b DNA gene are examined for five genera of arboreal rodents of the Caviomorph family Echimyidae from the Amazon Basin. Data are available for 798 bp of sequence from a total of 24 separate localities in Peru, Venezuela, Bolivia, and Brazil for Mesomys, Isothrix, Makalata, Dactylomys, and Echimys. Sequence divergence, corrected for multiple hits, is extensive, ranging from less than 1% for comparisons within populations to over 20% among geographic units within genera. Both the degree of differentiation and the geographic patterning of the variation suggest that more than one species composes the Amazonian distribution of the currently recognized Mesomys hispidus, Isothrix bistriata, Makalata didelphoides, and Dactylomys dactylinus. There is general concordance in the geographic range of haplotype clades for each of these taxa, and the overall level of differentiation within them is largely equivalent. These observations suggest that a common vicariant history underlies the respective diversification of each genus. However, estimated times of divergence based on the rate of third position transversion substitutions for the major clades within each genus typically range above 1 million years. Thus, allopatric isolation precipitating divergence must have been considerably earlier than the late Pleistocene forest fragmentation events commonly invoked for Amazonian biota.
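"Corrected for multiple hits" means the observed proportion of differing sites is adjusted upward to account for repeated substitutions at the same site. The abstract does not say which correction was used; the sketch below shows the Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), purely as one common example. The function names are assumptions.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]  # ignore gaps and ambiguity codes
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        raise ValueError("correction undefined for p >= 0.75 (saturation)")
    return -0.75 * math.log(1 - 4 * p / 3)


if __name__ == "__main__":
    p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
    print(p, jukes_cantor(p))
```

The correction is negligible at small p and grows as sequences approach saturation, which is why it matters at the upper end of the divergence range reported above.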
Mitochondrial DNA variation of indigenous goats in Narok and Isiolo counties of Kenya.

PubMed

Kibegwa, F M; Githui, K E; Jung'a, J O; Badamana, M S; Nyamu, M N

2016-06-01

Phylogenetic relationships among and genetic variability within 60 goats from two different indigenous breeds in Narok and Isiolo counties in Kenya and 22 published goat samples were analysed using mitochondrial control region sequences. The results showed that there were 54 polymorphic sites in a 481-bp sequence and 29 haplotypes were determined. The mean haplotype diversity and nucleotide diversity were 0.981 ± 0.006 and 0.019 ± 0.001, respectively. The phylogenetic analysis in combination with goat haplogroup reference sequences from GenBank showed that all goat sequences were clustered into two haplogroups (A and G), of which haplogroup A was the commonest in the two populations. A very high percentage (99.90%) of the genetic variation was distributed within the regions, and a smaller percentage (0.10%) was distributed among regions, as revealed by the analysis of molecular variance (AMOVA). These AMOVA results showed that the divergence between regions was not statistically significant. We concluded that the high levels of intrapopulation diversity in Isiolo and Narok goats and the weak phylogeographic structuring suggest strong gene flow among goat populations, probably caused by extensive transportation of goats in history. © 2015 Blackwell Verlag GmbH.
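The two summary statistics reported above can be computed from a set of aligned sequences using Nei's standard definitions: haplotype diversity h = n/(n-1) · (1 - Σ p_i²) and nucleotide diversity π as the mean number of pairwise differences per site. The following Python sketch uses toy sequences, not the study's data, and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity for a sample of aligned sequences."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site across all pairs of sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / (len(pairs) * length)


if __name__ == "__main__":
    sample = ["AAAA", "AAAT", "AATT", "AAAA"]  # toy alignment
    print(haplotype_diversity(sample), nucleotide_diversity(sample))
```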
Phylogenetic study of Oryzoideae species and related taxa of the Poaceae based on atpB-rbcL and ndhF DNA sequences.

PubMed

Zeng, Xu; Yuan, Zhengrong; Tong, Xin; Li, Qiushi; Gao, Weiwei; Qin, Minjian; Liu, Zhihua

2012-05-01

Oryzoideae (Poaceae) plants have economic and ecological value. However, the phylogenetic position of some plants is not clear, such as Hygroryza aristata (Retz.) Nees. and Porteresia coarctata (Roxb.) Tateoka (syn. Oryza coarctata). Comprehensive molecular phylogenetic studies have been carried out on many genera in the Poaceae. Different DNA sequences, including nuclear and chloroplast sequences, have been extensively employed to determine relationships at both higher and lower taxonomic levels in the Poaceae. Chloroplast DNA ndhF gene and atpB-rbcL spacer were used to construct phylogenetic trees and estimate the divergence time of Oryzoideae, Bambusoideae, Panicoideae, Pooideae and so on. Complete sequences of atpB-rbcL and ndhF were generated for 17 species representing six species of the Oryzoideae and related subfamilies. Nicotiana tabacum L. was the outgroup species. The two DNA datasets were analyzed using Maximum Parsimony and Bayesian analysis methods. The molecular phylogeny revealed that H. aristata (Retz.) Nees was the sister to Chikusichloa aquatica Koidz. Moreover, P. coarctata (Roxb.) Tateoka was in the genus Oryza. Furthermore, the result of evolution analysis, which was based on the ndhF marker, indicated that the time of origin of Oryzoideae might be 31 million years ago.
Analysing the performance of personal computers based on Intel microprocessors for sequence aligning bioinformatics applications.

PubMed

Nair, Pradeep S; John, Eugene B

2007-01-01

Aligning specific sequences against a very large number of other sequences is a central aspect of bioinformatics. With the widespread availability of personal computers in biology laboratories, sequence alignment is now often performed locally. This makes it necessary to analyse the performance of personal computers for sequence aligning bioinformatics benchmarks. In this paper, we analyse the performance of a personal computer for the popular BLAST and FASTA sequence alignment suites. Results indicate that these benchmarks have a large number of recurring operations and use memory operations extensively. It seems that the performance can be improved with a bigger L1-cache.
Integer sequence discovery from small graphs

PubMed Central

Hoppe, Travis; Petrone, Anna

2015-01-01

We have exhaustively enumerated all simple, connected graphs of a finite order and have computed a selection of invariants over this set. Integer sequences were constructed from these invariants and checked against the Online Encyclopedia of Integer Sequences (OEIS). 141 new sequences were added and six sequences were extended. From the graph database, we were able to programmatically suggest relationships among the invariants. It will be shown that we can readily visualize any sequence of graphs with a given criteria. The code has been released as an open-source framework for further analysis and the database was constructed to be extensible to invariants not considered in this work. PMID:27034526
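As a rough illustration of how an integer sequence can arise from enumerating small graphs, the sketch below counts simple connected graphs on n vertices up to isomorphism by brute force. It is not the authors' released framework; the function names are hypothetical and the approach (trying every vertex relabelling to obtain a canonical form) is only practical for very small n.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search from vertex 0; connected if every vertex is reached."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def canonical(n, edges):
    """Smallest relabelled edge list over all vertex permutations."""
    return min(
        tuple(sorted(tuple(sorted((p[a], p[b]))) for a, b in edges))
        for p in permutations(range(n))
    )


def count_connected_graphs(n):
    """Number of simple connected graphs on n vertices, up to isomorphism."""
    all_edges = list(combinations(range(n), 2))
    classes = set()
    for r in range(len(all_edges) + 1):
        for edges in combinations(all_edges, r):
            if is_connected(n, edges):
                classes.add(canonical(n, edges))
    return len(classes)


print([count_connected_graphs(n) for n in range(1, 6)])
```

The resulting counts are the kind of sequence that can then be looked up in the OEIS; replacing the count with another invariant (for example, the number of graphs with a given property) yields other candidate sequences.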
Complete genome sequence of a ciprofloxacin resistant Salmonella enterica subsp. enterica serovar Kentucky strain, PU131, isolated from a human patient in Washington State.

USDA-ARS's Scientific Manuscript database

A ciprofloxacin resistant (CipR) Salmonella enterica subsp. enterica serovar Kentucky ST198 has rapidly and extensively disseminated globally to become a major food-safety and public health concern. Here, we report a complete genome sequence of a CipR S. Kentucky ST198 strain PU131 isolated from a ...
Automated quantification of lumbar vertebral kinematics from dynamic fluoroscopic sequences

NASA Astrophysics Data System (ADS)

Camp, Jon; Zhao, Kristin; Morel, Etienne; White, Dan; Magnuson, Dixon; Gay, Ralph; An, Kai-Nan; Robb, Richard

2009-02-01

We hypothesize that the vertebra-to-vertebra patterns of spinal flexion and extension motion of persons with lower back pain will differ from those of persons who are pain-free. Thus, it is our goal to measure the motion of individual lumbar vertebrae noninvasively from dynamic fluoroscopic sequences. Two-dimensional normalized mutual information-based image registration was used to track frame-to-frame motion. Software was developed that required the operator to identify each vertebra on the first frame of the sequence using a four-point "caliper" placed at the posterior and anterior edges of the inferior and superior end plates of the target vertebrae. The program then resolved the individual motions of each vertebra independently throughout the entire sequence. To validate the technique, 6 cadaveric lumbar spine specimens were potted in polymethylmethacrylate and instrumented with optoelectric sensors. The specimens were then placed in a custom dynamic spine simulator and moved through flexion-extension cycles while kinematic data and fluoroscopic sequences were simultaneously acquired. We found strong correlation between the absolute flexion-extension range of motion of each vertebra as recorded by the optoelectric system and as determined from the fluoroscopic sequence via registration. We conclude that this method is a viable way of noninvasively assessing two-dimensional vertebral motion.
Protecting unknown two-qubit entangled states by nesting Uhrig's dynamical decoupling sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mukhtar, Musawwadah; Soh, Wee Tee; Saw, Thuan Beng

2010-11-15

Future quantum technologies rely heavily on good protection of quantum entanglement against environment-induced decoherence. A recent study showed that an extension of Uhrig's dynamical decoupling (UDD) sequence can (in theory) lock an arbitrary but known two-qubit entangled state to the Nth order using a sequence of N control pulses [Mukhtar et al., Phys. Rev. A 81, 012331 (2010)]. By nesting three layers of explicitly constructed UDD sequences, here we first consider the protection of unknown two-qubit states as superposition of two known basis states, without making assumptions of the system-environment coupling. It is found that the obtained decoherence suppression can be highly sensitive to the ordering of the three UDD layers and can be remarkably effective with the correct ordering. The detailed theoretical results are useful for general understanding of the nature of controlled quantum dynamics under nested UDD. As an extension of our three-layer UDD, it is finally pointed out that a completely unknown two-qubit state can be protected by nesting four layers of UDD sequences. This work indicates that when UDD is applicable (e.g., when the environment has a sharp frequency cutoff and when control pulses can be taken as instantaneous pulses), dynamical decoupling using nested UDD sequences is a powerful approach for entanglement protection.
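For orientation, the pulse timing of a single UDD layer can be sketched as follows, using the standard UDD rule that the j-th of N pulses is applied at t_j = T sin^2(j*pi/(2N+2)); the abstract does not restate this rule. The nesting shown, where an inner UDD sequence is rescaled into each interval of an outer one, is the usual construction and is assumed here; the function names are hypothetical and the sketch says nothing about which pulse operators the paper assigns to each layer.

```python
import math


def udd_pulse_times(n_pulses, total_time=1.0):
    """Standard UDD timings: t_j = T * sin^2(j*pi / (2N + 2)), j = 1..N."""
    return [
        total_time * math.sin(math.pi * j / (2 * n_pulses + 2)) ** 2
        for j in range(1, n_pulses + 1)
    ]


def nested_udd_times(n_outer, n_inner, total_time=1.0):
    """Two-layer nesting (assumed construction): an inner UDD sequence is
    rescaled into every interval delimited by the outer pulses."""
    outer = udd_pulse_times(n_outer, total_time)
    edges = [0.0] + outer + [total_time]
    inner = [
        [start + t for t in udd_pulse_times(n_inner, end - start)]
        for start, end in zip(edges, edges[1:])
    ]
    return outer, inner


print(udd_pulse_times(3))
print(nested_udd_times(2, 1))
```

Adding a third or fourth layer repeats the same step, which is why the total pulse count grows multiplicatively with the number of layers.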
UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences.

PubMed

Du, Pu-Feng; Zhao, Wei; Miao, Yang-Yang; Wei, Le-Yi; Wang, Likun

2017-11-14

With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences with various lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program will be needed. In this paper, we propose the UltraPse as a universal software platform for this problem. The function of the UltraPse is not only to generate various existing sequence representation modes, but also to simplify all future programming works in developing novel representation modes. The extensibility of UltraPse is particularly enhanced. It allows the users to define their own representation mode, their own physicochemical properties, or even their own types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.
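The core idea of mapping sequences of different lengths to fixed-length numerical vectors can be illustrated with a plain k-mer composition, one of the simplest representation modes. This is a hedged sketch only: it does not use or reproduce UltraPse's interface, and the function name and defaults are hypothetical.

```python
from itertools import product


def kmer_composition(seq, k=2, alphabet="ACGT"):
    """Return normalised k-mer frequencies as a vector of length len(alphabet)**k."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing symbols outside the alphabet
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


# Sequences of different lengths map to vectors of the same length (16 for k=2).
print(len(kmer_composition("ACGT")), len(kmer_composition("ACGTACGTAC")))
```

Pseudo-composition modes of the kind the abstract mentions extend such vectors with terms derived from physicochemical properties; those are not shown here.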
Familial Mediterranean fever with a single MEFV mutation: Where is the second hit?

PubMed Central

Booty, Matthew G.; Chae, Jae Jin; Masters, Seth L.; Remmers, Elaine F.; Barham, Beverly; Lee, Julie M.; Barron, Karyl S.; Holland, Steve; Kastner, Daniel L.; Aksentijevich, Ivona

2009-01-01

Objective FMF has traditionally been considered an autosomal recessive disease; however, it has been observed that a substantial number of patients with clinical FMF possess only one demonstrable MEFV mutation. Here, an extensive search for a second MEFV mutation was performed in 46 patients clinically diagnosed with FMF and carrying only one high-penetrance FMF mutation. Methods MEFV and other candidate genes were sequenced by standard capillary electrophoresis. The entire 15 kb MEFV genomic region was re-sequenced in 10 patients using a hybridization-based chip technology. MEFV gene expression levels were determined by qRT-PCR and pyrin protein levels were examined by Western blotting. Results A second MEFV mutation was not identified in any of the screened patients. Haplotype analysis did not identify a common haplotype that might be associated with the transmission of a second FMF allele. Western blots did not demonstrate a significant difference in pyrin levels between single and double variant patients; however, FMF patients of both types showed higher protein expression compared to controls and non-FMF patients with active inflammation. Screening of genes encoding pyrin-interacting proteins identified rare variants in a small number of patients, suggesting the possibility of digenic inheritance. Conclusion Our data underscore the existence of a significant subset of FMF patients who are carriers of only one MEFV mutation and demonstrate that complete MEFV sequencing is not likely to yield a second mutation. Screening for the set of most common mutations appears sufficient in the presence of clinical symptoms to diagnose FMF and initiate a trial of colchicine. PMID:19479870
GigaTON: an extensive publicly searchable database providing a new reference transcriptome in the pacific oyster Crassostrea gigas.

PubMed

Riviere, Guillaume; Klopp, Christophe; Ibouniyamine, Nabihoudine; Huvet, Arnaud; Boudry, Pierre; Favrel, Pascal

2015-12-02

The Pacific oyster, Crassostrea gigas, is one of the most important aquaculture shellfish resources worldwide. Important efforts have been undertaken towards a better knowledge of its genome and transcriptome, which is now making C. gigas a model organism among lophotrochozoans, the under-described sister clade of ecdysozoans within protostomes. These massive sequencing efforts offer the opportunity to assemble gene expression data and make such a resource accessible and exploitable for the scientific community. Therefore, we undertook this assembly into an up-to-date publicly available transcriptome database: the GigaTON (Gigas TranscriptOme pipeliNe) database. We assembled 2204 million sequences obtained from 114 publicly available RNA-seq libraries that were realized using all embryo-larval development stages, adult organs, different environmental stressors including heavy metals, temperature, salinity and exposure to air, which were mostly performed as part of the Crassostrea gigas genome project. This data was analyzed in silico and resulted in 56621 newly assembled contigs that were deposited into a publicly available database, the GigaTON database. This database also provides powerful and user-friendly request tools to browse and retrieve information about annotation, expression level, UTRs, splice and polymorphism, and gene ontology associated with all the contigs within each library and between all libraries. The GigaTON database provides a convenient, potent and versatile interface to browse, retrieve, confront and compare massive transcriptomic information in an extensive range of conditions, tissues and developmental stages in Crassostrea gigas. To our knowledge, the GigaTON database constitutes the most extensive transcriptomic database to date in marine invertebrates, and thereby a new reference transcriptome in the oyster, a highly valuable resource to physiologists and evolutionary biologists.
CIDR

Science.gov Websites

NIH CIDR Program Studies For whole exome sequencing projects, we pretest all samples using a high-density SNP array (>200,000 markers). For custom targeted sequencing, we pretest all samples using a 96 SNP GoldenGate assay. This extensive pretesting allows us to unambiguously tie
Comparative transcriptome analysis in Sclerotinia sclerotiorum and S. trifoliorum by 454 Titanium RNA sequencing

USDA-ARS's Scientific Manuscript database

Sclerotinia sclerotiorum and S. trifoliorum are two closely related devastating plant pathogens. Extensive research has been conducted on S. sclerotiorum and its genome sequences are available. To take advantage of the genomic information of S. sclerotiorum, we compared the transcriptome of S. tr...
A trace display and editing program for data from fluorescence based sequencing machines.

PubMed

Gleeson, T; Hillier, L

1991-12-11

'Ted' (Trace editor) is a graphical editor for sequence and trace data from automated fluorescence sequencing machines. It provides facilities for viewing sequence and trace data (in top or bottom strand orientation), for editing the base sequence, for automated or manual trimming of the head (vector) and tail (uncertain data) from the sequence, for vertical and horizontal trace scaling, for keeping a history of sequence editing, and for output of the edited sequence. Ted has been used extensively in the C.elegans genome sequencing project, both as a stand-alone program and integrated into the Staden sequence assembly package, and has greatly aided in the efficiency and accuracy of sequence editing. It runs in the X windows environment on Sun workstations and is available from the authors. Ted currently supports sequence and trace data from the ABI 373A and Pharmacia A.L.F. sequencers.
Labeled Nucleoside Triphosphates with Reversibly Terminating Aminoalkoxyl Groups

PubMed Central

Hutter, Daniel; Kim, Myong-Jung; Karalkar, Nilesh; Leal, Nicole A.; Chen, Fei; Guggenheim, Evan; Visalakshi, Visa; Olejnik, Jerzy; Gordon, Steven; Benner, Steven A.

2013-01-01

Nucleoside triphosphates having a 3′-ONH2 blocking group have been prepared with and without fluorescent tags on their nucleobases. DNA polymerases were identified that accepted these, adding a single nucleotide to the 3′-end of a primer in a template-directed extension reaction that then stops. Nitrite chemistry was developed to cleave the 3′-ONH2 group under mild conditions to allow continued primer extension. Extension-cleavage-extension cycles in solution were demonstrated with untagged nucleotides and mixtures of tagged and untagged nucleotides. Multiple extension-cleavage-extension cycles were demonstrated on an Intelligent Bio-Systems Sequencer, showing the potential of the 3′-ONH2 blocking group in “next generation sequencing”. PMID:21128174
The complete plastome of macaw palm [Acrocomia aculeata (Jacq.) Lodd. ex Mart.] and extensive molecular analyses of the evolution of plastid genes in Arecaceae.

PubMed

de Santana Lopes, Amanda; Gomes Pacheco, Túlio; Nimz, Tabea; do Nascimento Vieira, Leila; Guerra, Miguel P; Nodari, Rubens O; de Souza, Emanuel Maltempi; de Oliveira Pedrosa, Fábio; Rogalski, Marcelo

2018-04-01

The plastome of macaw palm was sequenced, allowing analyses of evolution and molecular markers. Additionally, we demonstrated that more than half of plastid protein-coding genes in Arecaceae underwent positive selection. Macaw palm is a native species from tropical and subtropical Americas. It shows high production of oil per hectare, reaching up to 70% of oil content in fruits, and an interesting plasticity to grow in different ecosystems. Its domestication and breeding are still in the beginning, which makes the development of molecular markers essential to assess natural populations and germplasm collections. Therefore, we sequenced and characterized in detail the plastome of macaw palm. A total of 221 SSR loci were identified in the plastome of macaw palm. Additionally, eight polymorphism hotspots were characterized at the level of subfamily and tribe. Moreover, several events of gain and loss of RNA editing sites were found within the subfamily Arecoideae. Aiming to uncover evolutionary events in Arecaceae, we also analyzed extensively the evolution of plastid genes. The analyses show that highly divergent genes seem to evolve in a species-specific manner, suggesting that gene degeneration events may be occurring within Arecaceae at the level of genus or species. Unexpectedly, we found that more than half of plastid protein-coding genes are under positive selection, including genes for photosynthesis, gene expression machinery and other essential plastid functions. Furthermore, we performed a phylogenomic analysis using whole plastomes of 40 taxa, representing all subfamilies of Arecaceae, which placed the macaw palm within the tribe Cocoseae. Finally, the data shown here are important for genetic studies in macaw palm and provide new insights into the evolution of plastid genes and environmental adaptation in Arecaceae.

Interim Reliability Evaluation Program: analysis of the Browns Ferry, Unit 1, nuclear plant. Main report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mays, S.E.; Poloski, J.P.; Sullivan, W.H.

1982-07-01

A probabilistic risk assessment (PRA) was made of the Browns Ferry, Unit 1, nuclear plant as part of the Nuclear Regulatory Commission's Interim Reliability Evaluation Program (IREP). Specific goals of the study were to identify the dominant contributors to core melt, develop a foundation for more extensive use of PRA methods, expand the cadre of experienced PRA practitioners, and apply procedures for extension of IREP analyses to other domestic light water reactors. Event tree and fault tree analyses were used to estimate the frequency of accident sequences initiated by transients and loss of coolant accidents. External events such as floods, fires, earthquakes, and sabotage were beyond the scope of this study and were, therefore, excluded. From these sequences, the dominant contributors to probable core melt frequency were chosen. Uncertainty and sensitivity analyses were performed on these sequences to better understand the limitations associated with the estimated sequence frequencies. Dominant sequences were grouped according to common containment failure modes and corresponding release categories on the basis of comparison with analyses of similar designs rather than on the basis of detailed plant-specific calculations.
Extensive gene conversion at the PMS2 DNA mismatch repair locus.

PubMed

Hayward, Bruce E; De Vos, Michel; Valleley, Elizabeth M A; Charlton, Ruth S; Taylor, Graham R; Sheridan, Eamonn; Bonthron, David T

2007-05-01

Mutations of the PMS2 DNA repair gene predispose to a characteristic range of malignancies, with either childhood onset (when both alleles are mutated) or a partially penetrant adult onset (if heterozygous). These mutations have been difficult to detect, due to interference from a family of pseudogenes located on chromosome 7. One of these, the PMS2CL pseudogene, lies within a 100-kb inverted duplication (inv dup), 700 kb centromeric to PMS2 itself on 7p22. Here, we show that the reference genomic sequences cannot be relied upon to distinguish PMS2 from PMS2CL, because of sequence transfer between the two loci. The 7p22 inv dup occurred prior to the divergence of modern ape species (15 million years ago [Mya]), but has undergone extensive sequence homogenization. This process appears to be ongoing, since there is considerable allelic diversity within the duplicated region, much of it derived from sequence exchange between PMS2 and PMS2CL. This sequence diversity can result in both false-positive and false-negative mutation analysis at this locus. Great caution is still needed in the design and interpretation of PMS2 mutation screens. 2007 Wiley-Liss, Inc.
Elbow kinematics during sit-to-stand and stand-to-sit movements.

PubMed

Packer, T L; Wyss, U P; Costigan, P A

1993-11-01

The sit-to-stand and stand-to-sit movements of 10 healthy women (mean age 52.4 years) were subjected to a descriptive analysis that yielded a definition of phases, determination of the peak angles reached, maximum angular velocity during each movement, and the sequencing of key events. While subjects showed little intrasubject variability, intersubject variability was evident. Subjects differed in the joint angles and angular velocity recorded, but the sequence of flexion/extension and rotation events were unchanged. Changes in direction of flexion/extension and rotation tended to occur very close in time, if not at the same time. Copyright © 1993. Published by Elsevier Ltd.
La formation de l'inkisi (Supergroupe ouest-congolien) en Afrique centrale (Congo et Bas-Zaïre): un delta d'âge Paléozoïque comblant un bassin en extension

NASA Astrophysics Data System (ADS)

Alvarez, Philippe; Maurin, Jean-Christophe; Vicat, Jean-Paul

1995-02-01

The Inkisi Formation (West Congolian Supergroup) corresponds to a large deltaic body, which extends through Congo, Lower Zaire and Angola. In the Congo and Lower Zaire areas, the lower part of this formation is characterized by a fluvial conglomerate with elliptic pebbles. The red arkosic, channelized series from the Brazzaville-Kinshasa area involves delta plain distributary channels and delta front sequences. The transport direction of continental material is from north to south and the source area is the Chaillu basement. Glacial quartzitic pebbles are probably reworked from the fluvio-lacustrine Upper Diamictite Formation. The classical subdivisions of the Inkisi Formation - basal conglomerate (I 0), Lower part (I 1) and Upper part (I 2) - are not used. These subdivisions correspond to a fluvial conglomerate and to delta front and delta plain facies. The coastal onlap progressively covered the conglomerate, and the distributary channels in the delta plain were prograding onto the delta front. The prodelta sequence could correspond to the Upper level of the Mpioka molassic Formation. The Inkisi delta was on the northern edge of an extensional basin controlled by NE-SW normal faults. The extension phase is clearly post Pan-African and occurred during the Palaeozoic, probably in relation to the Permian Karoo phase, and is also known in Angola.
Familial hypercholesterolemia with extensive coronary artery disease and tuberous and tendinous xanthomas: A case report and mutation analysis.

PubMed

Agirbasli, Deniz; Hyatt, Tommy; Agirbasli, Mehmet

2018-04-26

This is a case report of a 38-year-old Syrian refugee male with early-onset extensive atherosclerosis. The physical and laboratory examination were remarkable with severe xanthomas in the upper and lower extremities and with low-density lipoprotein cholesterol (LDL-C) 417 mg/dL, total cholesterol 495 mg/dL, high-density lipoprotein cholesterol 30 mg/dL, and triglycerides 242 mg/dL. LDL-C level responded poorly to the high-dose statin treatment. The genetic analysis indicated that the patient had a large homozygous deletion in LDL receptor gene including the exons 7-14. A 12-kb deletion had occurred between the 2 Alu repetitive sequences that were oriented in opposite directions, one in intron 6 and the other in intron 14. This deletion eliminated exons 7-14, which exactly corresponded to the entire exon sequence coding the epidermal growth factor precursor homology domain. This deletion in LDL receptor was previously reported. This rare case of homozygous familial hypercholesterolemia presenting with multiple large and widely distributed xanthomas implicates the need for novel treatment options in familial hypercholesterolemia patients. The case is a Syrian refugee and emphasizes the urgent need to address orphan disease in refugee populations throughout the world. Copyright © 2018 National Lipid Association. Published by Elsevier Inc. All rights reserved.
Breaking symmetry: the zebrafish as a model for understanding left-right asymmetry in the developing brain.

PubMed

Roussigne, Myriam; Blader, Patrick; Wilson, Stephen W

2012-03-01

How does left-right asymmetry develop in the brain and how does the resultant asymmetric circuitry impact on brain function and lateralized behaviors? By enabling scientists to address these questions at the levels of genes, neurons, circuitry and behavior, the zebrafish model system provides a route to resolve the complexity of brain lateralization. In this review, we present the progress made towards characterizing the nature of the gene networks and the sequence of morphogenetic events involved in the asymmetric development of zebrafish epithalamus. In an attempt to integrate the recent extensive knowledge into a working model and to identify the future challenges, we discuss how insights gained at a cellular/developmental level can be linked to the data obtained at a molecular/genetic level. Finally, we present some evolutionary thoughts and discuss how significant discoveries made in zebrafish should provide entry points to better understand the evolutionary origins of brain lateralization.
Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions

PubMed Central

2014-01-01

Deep sequencing harnesses the high throughput nature of next generation sequencing technologies to generate population samples, treating information contained in individual reads as meaningful. Here, we review applications of deep sequencing to pathogen evolution. Pioneering deep sequencing studies from the virology literature are discussed, such as whole genome Roche-454 sequencing analyses of the dynamics of the rapidly mutating pathogens hepatitis C virus and HIV. Extension of the deep sequencing approach to bacterial populations is then discussed, including the impacts of emerging sequencing technologies. While it is clear that deep sequencing has unprecedented potential for assessing the genetic structure and evolutionary history of pathogen populations, bioinformatic challenges remain. We summarise current approaches to overcoming these challenges, in particular methods for detecting low frequency variants in the context of sequencing error and reconstructing individual haplotypes from short reads. PMID:24428920
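The problem of separating low-frequency variants from sequencing error can be illustrated with a deliberately simple model, which is an assumption of this sketch rather than a method taken from the review: if every read at a site carries an erroneous base independently with probability `error_rate`, the chance of seeing at least `k` variant reads among `depth` reads by error alone is a binomial tail probability. The function name and parameter values are hypothetical.

```python
from math import comb


def error_only_tail(k, depth, error_rate):
    """P(at least k variant reads out of `depth`) if errors alone produce them,
    under an independent per-read error model (an assumption of this sketch)."""
    below = sum(
        comb(depth, i) * error_rate ** i * (1 - error_rate) ** (depth - i)
        for i in range(k)
    )
    return 1 - below


# With a 0.1% error rate and 1000x depth, 10 variant reads (1% frequency)
# are very unlikely to be explained by error alone under this model.
print(error_only_tail(10, 1000, 0.001))
```

Real sequencing errors are neither independent nor uniform across sites, strands and base qualities, so this model only conveys the basic trade-off between depth, error rate and detectable variant frequency.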
Orthology detection combining clustering and synteny for very large datasets.

PubMed

Lechner, Marcus; Hernandez-Rosales, Maribel; Doerr, Daniel; Wieseke, Nicolas; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K; Prohaska, Sonja J; Stadler, Peter F

2014-01-01

The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets.
Orthology Detection Combining Clustering and Synteny for Very Large Datasets

PubMed Central

Lechner, Marcus; Hernandez-Rosales, Maribel; Doerr, Daniel; Wieseke, Nicolas; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K.; Prohaska, Sonja J.; Stadler, Peter F.

2014-01-01

The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets. PMID:25137074
Layered Deposits of Arabia Terra and Meridiani Planum: Keys to the Habitability of Ancient Mars

NASA Technical Reports Server (NTRS)

Allen, Carlton C.; Oehler, Dorothy Z.; Paris, Kristen N.; Venechuk, Elizabeth M.

2006-01-01

Understanding the habitability of ancient Mars is a key goal in the exploration of that planet. Evidence for conditions favorable to early life must be sought in ancient sedimentary rocks, such as those of Arabia Terra and Meridiani Planum. Arabia Terra, the northernmost extension of the ancient highlands, is dominated by cratered plains and minor ridged units. These plains extend south into the adjacent Meridiani Planum. The Opportunity rover landed in northern Meridiani, close to the border with Arabia. High resolution MOC images reveal extensive layered sequences across much of the Arabia and Meridiani region. These layers have been interpreted as eroded remnants of sedimentary rock deposits (Edgett, 2005). The layered sequences are concentrated in the SW quadrant of Arabia and in northern Meridiani. Preliminary mapping by Edgett (2005) distinguished four large scale layered sequences in the Arabia and Meridiani region. These have dimensions of hundreds to more than 1,000 km. MOLA altimetry shows that each of the sequences can attain a thickness of 200 to 400 m, with a total thickness greater than 1 km. The sequences are generally flat lying, with regional slopes of a few degrees. Much finer layering is evident within a number of craters. The plains and ridged units of the Arabia and Meridiani region were originally mapped as Noachian based on crater statistics, particularly the number of large craters (Scott and Carr, 1978). The layered sequences in the current study postdate many, but not all, of these large craters. The layered sequences have partially or totally filled a number of craters with diameters ranging from 20 to over 50 km. The topmost layered sequence, as well as the lower two sequences, have intermediate thermal inertia, as derived from THEMIS, indicative of moderate induration. The TES spectra from the lower sequences include features indicative of basalt. 
Some areas of the topmost sequence, which includes the Opportunity landing site, have TES spectra dominated by hematite. Just below this topmost sequence lies a sequence with higher thermal inertia, indicative of more indurated or coarser grained material. The TES spectra of this sequence lack distinctive mineral features, and the rocks may be obscured by a thin coating of dust. The layers have been extensively eroded. The uppermost sequences are characterized by deeply scalloped boundaries. Filled craters have been partially exhumed. Finely layered deposits within craters have been strongly dissected. Landforms uniquely attributable to wind erosion are rare, but erosive styles and geomorphology characteristic of water and possibly ice are present. The layered sequences in Arabia Terra and Meridiani Planum likely reflect an epoch when the planet was much more habitable than it is today. Several areas in these layered sequences are under intensive study as candidate landing sites for the 2009 Mars Science Laboratory.
Cloning and expression of a cDNA coding for a human monocyte-derived plasminogen activator inhibitor.

PubMed

Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P

1988-02-01

Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators.
Cloning and expression of a cDNA coding for a human monocyte-derived plasminogen activator inhibitor.

PubMed Central

Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P

1988-01-01

Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators. PMID:3257578
DNA Sequencing

DOEpatents

Tabor, Stanley; Richardson, Charles C.

1995-04-25

A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.
Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

USDA-ARS's Scientific Manuscript database

We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the minimum information about any (x) sequence (MIxS). The standards are the minimum information about a single amplified genome (MISAG) and the ...
Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

Treesearch

Shannon C.K. Straub; Mark Fishbein; Tatyana Livshult; Zachary Foster; Matthew Parks; Kevin Weitemier; Richard C. Cronn; Aaron Liston

2011-01-01

Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in...
Applications and Extensions of pClust to Big Microbial Proteomic Data

ERIC Educational Resources Information Center

Lockwood, Svetlana

2016-01-01

The goal of biological sciences is to understand the biomolecular mechanics of living organisms. Proteins serve as the foundation for organisms' functional analysis, and sequence analysis has been shown to be invaluable in answering questions about individual organisms. The first step in any sequence analysis is alignment and it is common that even…
Speed, Accuracy, and Serial Order in Sequence Production

ERIC Educational Resources Information Center

Pfordresher, Peter Q.; Palmer, Caroline; Jungers, Melissa K.

2007-01-01

The production of complex sequences like music or speech requires the rapid and temporally precise production of events (e.g., notes and chords), often at fast rates. Memory retrieval in these circumstances may rely on the simultaneous activation of both the current event and the surrounding context (Lashley, 1951). We describe an extension to a…
SPMBR: a scalable algorithm for mining sequential patterns based on bitmaps

NASA Astrophysics Data System (ADS)

Xu, Xiwei; Zhang, Changhai

2013-12-01

Some sequential pattern mining algorithms generate too many candidate sequences, which increases the processing cost of support counting. We therefore present an effective and scalable algorithm called SPMBR (Sequential Patterns Mining based on Bitmap Representation) for mining sequential patterns in large databases. Our method differs from previous work on sequential pattern mining mainly in that the database is represented by bitmaps, for which a simplified bitmap structure is first presented. The algorithm first generates candidate sequences by SE (Sequence Extension) and IE (Item Extension), and then obtains all frequent sequences by comparing the original bitmap with the extended item bitmap. This method simplifies the problem of mining sequential patterns and avoids the high processing cost of support counting. Both theory and experiments indicate that SPMBR performs well on large transaction databases, that much less memory is required for storing temporal data during mining, and that all sequential patterns can be mined feasibly.
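The abstract above does not give SPMBR's bitmap layout, so the following is only a minimal sketch of how sequence extension and item extension are commonly done on vertical bitmaps. The data layout (one Python integer per sequence, bit j standing for the j-th itemset) and all function names are assumptions made for illustration, not the structures of the paper.

```python
from collections import defaultdict


def build_bitmaps(db):
    """Map each item to a list of per-sequence bitmaps (bit j = itemset j)."""
    bitmaps = defaultdict(lambda: [0] * len(db))
    for sid, seq in enumerate(db):
        for pos, itemset in enumerate(seq):
            for item in itemset:
                bitmaps[item][sid] |= 1 << pos
    return dict(bitmaps)


def s_transform(bits, length):
    """Set every position strictly after the first set bit; clear the rest."""
    if bits == 0:
        return 0
    lowest = bits & -bits                      # isolate the first occurrence
    return ((1 << length) - 1) & ~((lowest << 1) - 1)


def s_extend(pattern_bm, item_bm, lengths):
    """Sequence extension: the item must occur in a later itemset."""
    return [s_transform(p, n) & i
            for p, i, n in zip(pattern_bm, item_bm, lengths)]


def i_extend(pattern_bm, item_bm):
    """Item extension: the item must occur in the same itemset."""
    return [p & i for p, i in zip(pattern_bm, item_bm)]


def support(bm):
    """Number of sequences in which the pattern occurs at least once."""
    return sum(1 for b in bm if b)


# Toy database (hypothetical): three sequences of itemsets.
db = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a", "b"}, {"c"}],
    [{"b"}, {"a"}, {"b", "c"}],
]
lengths = [len(seq) for seq in db]
bm = build_bitmaps(db)
a_then_b = s_extend(bm["a"], bm["b"], lengths)   # pattern <(a)(b)>
a_then_bc = i_extend(a_then_b, bm["c"])          # pattern <(a)(b c)>
print(support(a_then_b), support(a_then_bc))
```

In this layout, support counting reduces to testing each per-sequence bitmap for a nonzero value, which is the kind of saving the abstract attributes to the bitmap representation.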
Molecular evolution of the CYP2D subfamily in primates: purifying selection on substrate recognition sites without the frequent or long-tract gene conversion.

PubMed

Yasukochi, Yoshiki; Satta, Yoko

2015-03-25

The human cytochrome P450 (CYP) 2D6 gene is a member of the CYP2D gene subfamily, along with the CYP2D7P and CYP2D8P pseudogenes. Although the CYP2D6 enzyme has been studied extensively because of its clinical importance, the evolution of the CYP2D subfamily has not yet been fully understood. Therefore, the goal of this study was to reveal the evolutionary process of the human drug metabolic system. Here, we investigate molecular evolution of the CYP2D subfamily in primates by comparing 14 CYP2D sequences from humans to New World monkey genomes. Window analysis and statistical tests revealed that entire genomic sequences of paralogous genes were extensively homogenized by gene conversion during molecular evolution of CYP2D genes in primates. A neighbor-joining tree based on genomic sequences at the nonsubstrate recognition sites showed that CYP2D6 and CYP2D8 genes were clustered together due to gene conversion. In contrast, a phylogenetic tree using amino acid sequences at substrate recognition sites did not cluster the CYP2D6 and CYP2D8 genes, suggesting that the functional constraint on substrate specificity is one of the causes for purifying selection at the substrate recognition sites. Our results suggest that the CYP2D gene subfamily in primates has evolved to maintain the regioselectivity for a substrate hydroxylation activity between individual enzymes, even though extensive gene conversion has occurred across CYP2D coding sequences. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Molecular Evolution of the CYP2D Subfamily in Primates: Purifying Selection on Substrate Recognition Sites without the Frequent or Long-Tract Gene Conversion

PubMed Central

Yasukochi, Yoshiki; Satta, Yoko

2015-01-01

The human cytochrome P450 (CYP) 2D6 gene is a member of the CYP2D gene subfamily, along with the CYP2D7P and CYP2D8P pseudogenes. Although the CYP2D6 enzyme has been studied extensively because of its clinical importance, the evolution of the CYP2D subfamily has not yet been fully understood. Therefore, the goal of this study was to reveal the evolutionary process of the human drug metabolic system. Here, we investigate molecular evolution of the CYP2D subfamily in primates by comparing 14 CYP2D sequences from humans to New World monkey genomes. Window analysis and statistical tests revealed that entire genomic sequences of paralogous genes were extensively homogenized by gene conversion during molecular evolution of CYP2D genes in primates. A neighbor-joining tree based on genomic sequences at the nonsubstrate recognition sites showed that CYP2D6 and CYP2D8 genes were clustered together due to gene conversion. In contrast, a phylogenetic tree using amino acid sequences at substrate recognition sites did not cluster the CYP2D6 and CYP2D8 genes, suggesting that the functional constraint on substrate specificity is one of the causes for purifying selection at the substrate recognition sites. Our results suggest that the CYP2D gene subfamily in primates has evolved to maintain the regioselectivity for a substrate hydroxylation activity between individual enzymes, even though extensive gene conversion has occurred across CYP2D coding sequences. PMID:25808902

DNA-DNA hybridization values and their relationship to whole-genome sequence similarities.

PubMed

Goris, Johan; Konstantinidis, Konstantinos T; Klappenbach, Joel A; Coenye, Tom; Vandamme, Peter; Tiedje, James M

2007-01-01

DNA-DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA-DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of "species". Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.
5’-Terminal AUGs in Escherichia coli mRNAs with Shine-Dalgarno Sequences: Identification and Analysis of Their Roles in Non-Canonical Translation Initiation

PubMed Central

Beck, Heather J.; Fleming, Ian M. C.

2016-01-01

Analysis of the Escherichia coli transcriptome identified a unique subset of messenger RNAs (mRNAs) that contain a conventional untranslated leader and Shine-Dalgarno (SD) sequence upstream of the gene’s start codon while also containing an AUG triplet at the mRNA’s 5’-terminus (5’-uAUG). Fusion of the coding sequence specified by the 5’-terminal putative AUG start codon to a lacZ reporter gene, as well as primer extension inhibition assays, reveal that the majority of the 5’-terminal upstream open reading frames (5’-uORFs) tested support some level of lacZ translation, indicating that these mRNAs can function both as leaderless and canonical SD-leadered mRNAs. Although some of the uORFs were expressed at low levels, others were expressed at levels close to that of the respective downstream genes and as high as the naturally leaderless cI mRNA of bacteriophage λ. These 5’-terminal uORFs potentially encode peptides of varying lengths, but their functions, if any, are unknown. In an effort to determine whether expression from the 5’-terminal uORFs impacts expression of the immediately downstream cistron, we examined expression from the downstream coding sequence after mutations were introduced that inhibit efficient 5’-uORF translation. These mutations were found to affect expression from the downstream cistrons to varying degrees, suggesting that some 5’-uORFs may play roles in downstream regulation. Since the 5’-uAUGs found on these conventionally leadered mRNAs can function to bind ribosomes and initiate translation, this indicates that canonical mRNAs containing 5’-uAUGs should be examined for their potential to function also as leaderless mRNAs. PMID:27467758
Le graben de l'Anti-Atlas occidental (Maroc) : contrôle tectonique de la paléogéographie et des séquences au Cambrien inférieur / The Lower-Cambrian western Anti-Atlasic graben: tectonic control of palaeogeography and sequential organisation

NASA Astrophysics Data System (ADS)

Benssaou, Mohammed; Hamoumi, Naı̈ma

2003-03-01

In the Moroccan western Anti-Atlas, extensional tectonic events combined with a long-term sea-level rise are the main factor in building vertically stacked transgressive-regressive sequences. Along the Ait Abdallah-Boussafene axis, subsidence followed by abrupt platform tilting generated an elongated NE-SW graben. This is evidence that the Anti-Atlasic rifting process persisted during the last part of the Lower-Cambrian succession.
The CADSS design automation system. [computerized design language for small digital systems]

NASA Technical Reports Server (NTRS)

Franke, E. A.

1973-01-01

This research was designed to implement and extend a previously defined design automation system for the design of small digital structures. A description is included of the higher level language developed to describe systems as a sequence of register transfer operations. The system simulator, which is used to determine whether the original description is correct, is also discussed. The design automation system produces tables describing the state transitions of the system and the operation of all registers. In addition, all Boolean equations specifying system operation are minimized and converted to NAND gate structures. Suggestions for further extensions to the system are also given.
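The abstract does not say how CADSS maps minimized equations onto NAND gates. One standard textbook route, shown here purely as an illustrative sketch, is the two-level NAND-NAND form of a sum-of-products expression obtained from De Morgan's law. The term encoding and function names below are assumptions and are not taken from the report.

```python
from itertools import product


def nand(*inputs):
    """N-input NAND gate; with one input it acts as an inverter."""
    return not all(inputs)


def eval_sop(terms, env):
    """Evaluate a sum of products; each term is a list of (variable, polarity)."""
    return any(all(env[v] == pol for v, pol in term) for term in terms)


def eval_nand_nand(terms, env):
    """Evaluate the same function using NAND gates only."""
    first_level = []
    for term in terms:
        # A complemented literal is realised with a single-input NAND.
        literals = [env[v] if pol else nand(env[v]) for v, pol in term]
        first_level.append(nand(*literals))
    # NAND of the first-level outputs equals the OR of the product terms.
    return nand(*first_level)


# Hypothetical example: f = a AND (NOT b) OR c
terms = [[("a", True), ("b", False)], [("c", True)]]
names = ["a", "b", "c"]
equivalent = all(
    eval_sop(terms, dict(zip(names, values)))
    == eval_nand_nand(terms, dict(zip(names, values)))
    for values in product([False, True], repeat=len(names))
)
print(equivalent)
```

The exhaustive comparison over all input combinations is a cheap sanity check for small equations; it plays a role loosely similar to the simulator mentioned in the abstract, which checks a description before hardware is generated.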
[Coronary disease extension determines mobilization of endothelial progenitor cells and cytokines after a first myocardial infarction with ST elevation].

PubMed

Jiménez-Navarro, Manuel F; González, Francisco Jesús; Caballero-Borrego, Juan; Marchal, Juan Antonio; Rodríguez-Losada, Noela; Carrillo, Esmeralda; García-Pinilla, José Manuel; Hernández-García, José M; Pérez-González, Rita; Ramírez, Gemma; Aránega, Antonia; de Teresa Galván, Eduardo

2011-12-01

Multivessel coronary disease is still a postinfarction prognostic marker despite new forms of reperfusion, such as primary angioplasty. The aim of this study was to determine the time sequence of various sets of endothelial progenitor cells and angiogenic cytokines (vascular endothelial growth factor, hepatocyte growth factor) according to the degree of extension of the postinfarction coronary disease. We studied the release kinetics in 32 patients admitted for a first myocardial infarction with ST elevation, grouped according to whether they had single or multivessel disease, and 26 controls. The patients had a higher number of endothelial progenitor cells and angiogenic cytokines than the controls at all 3 measurements (admission, day 3, and day 7) of the following subsets: CD34, CD34+CD133+, CD34+KDR+, and CD34+CD133+KDR+CD45+(weak); this latter was higher on day 7. The levels of these cell subsets were all higher in the patients with single-vessel disease and at all 3 measurements. The vascular endothelial growth factor levels were raised during the first week and the hepatocyte growth factor showed an early peak on admission for infarction. No significant differences were seen in the cytokines according to coronary disease extension. Although the release kinetics of different subsets of endothelial progenitor cells in patients with a first acute myocardial infarction with ST elevation was similar in those with single vessel disease to those with multivessel disease, the number of circulating endothelial progenitor cells was greater in the patients with single vessel disease. The vascular endothelial growth factor levels were raised during the first postinfarction week and the hepatocyte growth factor levels were higher on admission. Copyright © 2011 Sociedad Española de Cardiología. Published by Elsevier Espana. All rights reserved.
The delta-subunit of murine guanine nucleotide exchange factor eIF-2B. Characterization of cDNAs predicts isoforms differing at the amino-terminal end.

PubMed

Henderson, R A; Krissansen, G W; Yong, R Y; Leung, E; Watson, J D; Dholakia, J N

1994-12-02

Protein synthesis in mammalian cells is regulated at the level of the guanine nucleotide exchange factor, eIF-2B, which catalyzes the exchange of eukaryotic initiation factor 2-bound GDP for GTP. We have isolated and sequenced cDNA clones encoding the delta-subunit of murine eIF-2B. The cDNA sequence encodes a polypeptide of 544 amino acids with molecular mass of 60 kDa. Antibodies against a synthetic polypeptide of 30 amino acids deduced from the cDNA sequence specifically react with the delta-subunit of mammalian eIF-2B. The cDNA-derived amino acid sequence shows significant homology with the yeast translational regulator Gcd2, supporting the hypothesis that Gcd2 may be the yeast homolog of the delta-subunit of mammalian eIF-2B. Primer extension studies and anchor polymerase chain reaction analysis were performed to determine the 5'-end of the transcript for the delta-subunit of eIF-2B. Results of these experiments demonstrate two different mRNAs for the delta-subunit of eIF-2B in murine cells. The isolation and characterization of two different full-length cDNAs also predicts the presence of two alternate forms of the delta-subunit of eIF-2B in murine cells. These differ at their amino-terminal end but have identical nucleotide sequences coding for amino acids 31-544.
Characterization of 25 full-length S-RNase alleles, including flanking regions, from a pool of resequenced apple cultivars.

PubMed

De Franceschi, Paolo; Bianco, Luca; Cestaro, Alessandro; Dondini, Luca; Velasco, Riccardo

2018-06-01

Data obtained from Illumina resequencing of 63 apple cultivars were used to obtain full-length S-RNase sequences using a strategy based on both alignment and de novo assembly of reads. The reproductive biology of apple is regulated by the S-RNase-based gametophytic self-incompatibility system, which is genetically controlled by the single, multi-genic and multi-allelic S locus. Resequencing of apple cultivars provided a huge amount of genetic data that can be aligned to the reference genome in order to characterize variation at a genome-wide level. However, this approach is not immediately adaptable to the S-locus, due to some peculiar features such as the high degree of polymorphism, lack of colinearity between haplotypes and extensive presence of repetitive elements. In this study we describe a dedicated procedure aimed at characterizing S-RNase alleles from resequenced cultivars. The S-genotype of 63 apple accessions is reported; the full length coding sequence was determined for the 25 S-RNase alleles present in the 63 resequenced cultivars; these included 10 previously incomplete sequences (S5, S6a, S6b, S8, S11, S23, S39, S46, S50 and S58). Moreover, sequence divergence clearly suggests that alleles S6a and S6b, proposed to be neutral variants of the same allele, should instead be considered different specificities. The promoter sequences have also been analyzed, highlighting regions of homology conserved among all the alleles.
Analysing grouping of nucleotides in DNA sequences using lumped processes constructed from Markov chains.

PubMed

Guédon, Yann; d'Aubenton-Carafa, Yves; Thermes, Claude

2006-03-01

The most commonly used models for analysing local dependencies in DNA sequences are (high-order) Markov chains. Incorporating knowledge about the possible grouping of the nucleotides makes it possible to define dedicated sub-classes of Markov chains. The problem of formulating lumpability hypotheses for a Markov chain is therefore addressed. In the classical approach to lumpability, this problem can be formulated as the determination of an appropriate state space (smaller than the original state space) such that the lumped chain defined on this state space retains the Markov property. We propose a different perspective on lumpability where the state space is fixed and the partitioning of this state space is represented by a one-to-many probabilistic function within a two-level stochastic process. Three nested classes of lumped processes can be defined in this way as sub-classes of first-order Markov chains. These lumped processes enable parsimonious reparameterizations of Markov chains that help to reveal relevant partitions of the state space. Characterizations of the lumped processes on the original transition probability matrix are derived. Different model selection methods relying either on hypothesis testing or on penalized log-likelihood criteria are presented as well as extensions to lumped processes constructed from high-order Markov chains. The relevance of the proposed approach to lumpability is illustrated by the analysis of DNA sequences. In particular, the use of lumped processes makes it possible to highlight differences between intronic sequences and gene untranslated region sequences.
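As a small illustration of the classical notion of lumpability that the abstract contrasts itself with (not of the authors' two-level lumped processes), the sketch below tests the standard strong-lumpability condition: within each block of the partition, every state must have the same total transition probability into each block. The transition matrix and the purine/pyrimidine grouping are made-up example values.

```python
def lump(P, partition, tol=1e-12):
    """Return the lumped transition matrix if P is strongly lumpable
    with respect to `partition` (a list of lists of state indices),
    otherwise return None."""
    lumped = []
    for block in partition:
        # For each state in the block, probability of moving into each block.
        rows = [[sum(P[s][t] for t in target) for target in partition]
                for s in block]
        reference = rows[0]
        for row in rows[1:]:
            if any(abs(x - y) > tol for x, y in zip(row, reference)):
                return None            # states in the block disagree
        lumped.append(reference)
    return lumped


# Hypothetical first-order chain on nucleotides, state order A, C, G, T.
P = [
    [0.1, 0.2, 0.5, 0.2],   # from A
    [0.3, 0.4, 0.0, 0.3],   # from C
    [0.4, 0.1, 0.2, 0.3],   # from G
    [0.2, 0.5, 0.1, 0.2],   # from T
]
purine_pyrimidine = [[0, 2], [1, 3]]    # {A, G} and {C, T}
print(lump(P, purine_pyrimidine))       # a 2 x 2 matrix: lumpable
print(lump(P, [[0, 1], [2, 3]]))        # None: {A, C} / {G, T} is not
```

A lumpable partition reduces the number of free parameters, which is the parsimony argument the abstract makes for grouping nucleotides.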
Life in the dark: metagenomic evidence that a microbial slime community is driven by inorganic nitrogen metabolism.

PubMed

Tetu, Sasha G; Breakwell, Katy; Elbourne, Liam D H; Holmes, Andrew J; Gillings, Michael R; Paulsen, Ian T

2013-06-01

Beneath Australia's large, dry Nullarbor Plain lies an extensive underwater cave system, where dense microbial communities known as 'slime curtains' are found. These communities exist in isolation from photosynthetically derived carbon and are presumed to be chemoautotrophic. Earlier work found high levels of nitrite and nitrate in the cave waters and a high relative abundance of Nitrospirae in bacterial 16S rRNA clone libraries. This suggested that these communities may be supported by nitrite oxidation; however, details of the inorganic nitrogen cycling in these communities remained unclear. Here we report analysis of 16S rRNA amplicon and metagenomic sequence data from the Weebubbie cave slime curtain community. The microbial community comprises a diverse assortment of bacterial and archaeal genera, including an abundant population of Thaumarchaeota. Sufficient thaumarchaeotal sequence was recovered to enable a partial genome sequence to be assembled, which showed considerable synteny with the corresponding regions in the genome of the autotrophic ammonia oxidiser Nitrosopumilus maritimus SCM1. This partial genome sequence contained regions with high sequence identity to the ammonia mono-oxygenase operon and carbon fixing 3-hydroxypropionate/4-hydroxybutyrate cycle genes of N. maritimus SCM1. Additionally, the community as a whole included genes encoding key enzymes for inorganic nitrogen transformations, including nitrification and denitrification. We propose that the Weebubbie slime curtain community represents a distinctive microbial ecosystem, in which primary productivity is due to the combined activity of archaeal ammonia-oxidisers and bacterial nitrite oxidisers.
The Mitochondrial Cytochrome Oxidase Subunit I Gene Occurs on a Minichromosome with Extensive Heteroplasmy in Two Species of Chewing Lice, Geomydoecus aurei and Thomomydoecus minor

PubMed Central

Pietan, Lucas L.; Spradling, Theresa A.

2016-01-01

In animals, mitochondrial DNA (mtDNA) typically occurs as a single circular chromosome with 13 protein-coding genes and 22 tRNA genes. The various species of lice examined previously, however, have shown mitochondrial genome rearrangements with a range of chromosome sizes and numbers. Our research demonstrates that the mitochondrial genomes of two species of chewing lice found on pocket gophers, Geomydoecus aurei and Thomomydoecus minor, are fragmented with the 1,536 base-pair (bp) cytochrome-oxidase subunit I (cox1) gene occurring as the only protein-coding gene on a 1,916–1,964 bp minicircular chromosome in the two species, respectively. The cox1 gene of T. minor begins with an atypical start codon, while that of G. aurei does not. Components of the non-protein coding sequence of G. aurei and T. minor include a tRNA (isoleucine) gene, inverted repeat sequences consistent with origins of replication, and an additional non-coding region that is smaller than the non-coding sequence of other lice with such fragmented mitochondrial genomes. Sequences of cox1 minichromosome clones for each species reveal extensive length and sequence heteroplasmy in both coding and noncoding regions. The highly variable non-gene regions of G. aurei and T. minor have little sequence similarity with one another except for a 19-bp region of phylogenetically conserved sequence with unknown function. PMID:27589589
Transposon diversity in Arabidopsis thaliana

PubMed Central

Le, Quang Hien; Wright, Stephen; Yu, Zhihui; Bureau, Thomas

2000-01-01

Recent availability of extensive genome sequence information offers new opportunities to analyze genome organization, including transposon diversity and accumulation, at a level of resolution that was previously unattainable. In this report, we used sequence similarity search and analysis protocols to perform a fine-scale analysis of a large sample (≈17.2 Mb) of the Arabidopsis thaliana (Columbia) genome for transposons. Consistent with previous studies, we report that the A. thaliana genome harbors diverse representatives of most known superfamilies of transposons. However, our survey reveals a higher density of transposons of which over one-fourth could be classified into a single novel transposon family designated as Basho, which appears unrelated to any previously known superfamily. We have also identified putative transposase-coding ORFs for miniature inverted-repeat transposable elements (MITEs), providing clues into the mechanism of mobility and origins of the most abundant transposons associated with plant genes. In addition, we provide evidence that most mined transposons have a clear distribution preference for A + T-rich sequences and show that structural variation for many mined transposons is partly due to interelement recombination. Taken together, these findings further underscore the complexity of transposons within the compact genome of A. thaliana. PMID:10861007
Using string alignment in a query-by-humming system for real world applications

NASA Astrophysics Data System (ADS)

Sailer, Christian

2005-09-01

Though query by humming (i.e., retrieving music or information about music by singing a characteristic melody) has been a popular research topic during the past decade, few approaches have reached a level of usefulness beyond mere scientific interest. One of the main problems is the inherent contradiction between error tolerance and discriminative power in conventional melody matching algorithms that rely on a melody contour approach to handle intonation or transcription errors. Adopting the string matching/alignment techniques from bioinformatics to melody sequences allows the similarity between two melodies to be assessed directly. This method takes an MPEG-7 compliant melody sequence (i.e., a list of note intervals and length ratios) as query and evaluates the steps necessary to transform it into the reference sequence. By introducing a musically founded cost-of-replace function and adequate post-processing, this method yields a measure for melodic similarity. Thus it is possible to construct a query by humming system that can properly discriminate between thousands of melodies and still be sufficiently error tolerant to be used by untrained singers. The robustness has been verified in extensive tests and real world applications.
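The alignment idea described above can be sketched as a weighted edit distance over sequences of note intervals. This is only an illustration: the replacement cost below (proportional to the difference in semitones, capped at one octave), the gap cost, and the function names are assumptions, since the abstract does not specify the author's cost function, and note length ratios are ignored for brevity.

```python
def replace_cost(a, b):
    """Hypothetical cost of replacing interval a by b (in semitones):
    small pitch errors are cheap, anything beyond an octave costs 1."""
    return min(abs(a - b) / 12.0, 1.0)


def melody_distance(query, reference, gap=1.0):
    """Weighted edit distance between two interval sequences
    (dynamic programming, keeping only two rows of the table)."""
    m = len(reference)
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, len(query) + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            cur[j] = min(
                prev[j] + gap,                       # drop a query note
                cur[j - 1] + gap,                    # skip a reference note
                prev[j - 1] + replace_cost(query[i - 1], reference[j - 1]),
            )
        prev = cur
    return prev[m]


# Hypothetical melodies as semitone intervals between successive notes.
reference = [2, 2, -4, 5]
sung_flat = [2, 1, -4, 5]     # one interval off by a semitone
unrelated = [-7, 12, -1, -1]
print(melody_distance(sung_flat, reference))
print(melody_distance(unrelated, reference))
```

A graded replacement cost keeps a slightly mistuned query close to its reference while an unrelated melody stays far away, which is the balance between error tolerance and discriminative power that the abstract describes.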
Molecular mechanics of silk nanostructures under varied mechanical loading.

PubMed

Bratzel, Graham; Buehler, Markus J

2012-06-01

Spider dragline silk is a self-assembling tunable protein composite fiber that rivals many engineering fibers in tensile strength, extensibility, and toughness, making it one of the most versatile biocompatible materials and most inviting for synthetic mimicry. While experimental studies have shown that the peptide sequence and molecular structure of silk have a direct influence on the stiffness, toughness, and failure strength of silk, few molecular-level analyses of the nanostructure of silk assemblies, in particular, under variations of genetic sequences have been reported. In this study, atomistic-level structures of wildtype as well as modified MaSp1 protein from the Nephila clavipes spider dragline silk sequences, obtained using an in silico approach based on replica exchange molecular dynamics and explicit water molecular dynamics, are subjected to simulated nanomechanical testing using different force-control loading conditions including stretch, pull-out, and peel. The authors have explored the effects of the poly-alanine length of the N. clavipes MaSp1 peptide sequence and identify differences in nanomechanical loading conditions on the behavior of a unit cell of 15 strands with 840-990 total residues used to represent a cross-linking β-sheet crystal node in the network within a fibril of the dragline silk thread. The specific loading condition used, representing concepts derived from the protein network connectivity at larger scales, has a significant effect on the mechanical behavior. Our analysis incorporates stretching, pull-out, and peel testing to connect biochemical features to mechanical behavior. The method used in this study could find broad applications in de novo design of silk-like tunable materials for an array of applications. Copyright © 2011 Wiley Periodicals, Inc.
Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome.

PubMed

Wu, Jia Qian; Du, Jiang; Rozowsky, Joel; Zhang, Zhengdong; Urban, Alexander E; Euskirchen, Ghia; Weissman, Sherman; Gerstein, Mark; Snyder, Michael

2008-01-03

Recent studies of the mammalian transcriptome have revealed a large number of additional transcribed regions and extraordinary complexity in transcript diversity. However, there is still much uncertainty regarding precisely what portion of the genome is transcribed, the exact structures of these novel transcripts, and the levels of the transcripts produced. We have interrogated the transcribed loci in 420 selected ENCyclopedia Of DNA Elements (ENCODE) regions using rapid amplification of cDNA ends (RACE) sequencing. We analyzed annotated known gene regions, but primarily we focused on novel transcriptionally active regions (TARs), which were previously identified by high-density oligonucleotide tiling arrays and on random regions that were not believed to be transcribed. We found RACE sequencing to be very sensitive and were able to detect low levels of transcripts in specific cell types that were not detectable by microarrays. We also observed many instances of sense-antisense transcripts; further analysis suggests that many of the antisense transcripts (but not all) may be artifacts generated from the reverse transcription reaction. Our results show that the majority of the novel TARs analyzed (60%) are connected to other novel TARs or known exons. Of previously unannotated random regions, 17% were shown to produce overlapping transcripts. Furthermore, it is estimated that 9% of the novel transcripts encode proteins. We conclude that RACE sequencing is an efficient, sensitive, and highly accurate method for characterization of the transcriptome of specific cell/tissue types. Using this method, it appears that much of the genome is represented in polyA+ RNA. Moreover, a fraction of the novel RNAs can encode protein and are likely to be functional.
Stable centromere positioning in diverse sequence contexts of complex and satellite centromeres of maize and wild relatives.

PubMed

Gent, Jonathan I; Wang, Na; Dawe, R Kelly

2017-06-21

Paradoxically, centromeres are known both for their characteristic repeat sequences (satellite DNA) and for being epigenetically defined. Maize (Zea mays mays) is an attractive model for studying centromere positioning because many of its large (~2 Mb) centromeres are not dominated by satellite DNA. These centromeres, which we call complex centromeres, allow for both assembly into reference genomes and for mapping short reads from ChIP-seq with antibodies to centromeric histone H3 (cenH3). We found frequent complex centromeres in maize and its wild relatives Z. mays parviglumis, Z. mays mexicana, and particularly Z. mays huehuetenangensis. Analysis of individual plants reveals minor variation in the positions of complex centromeres among siblings. However, such positional shifts are stochastic and not heritable, consistent with prior findings that centromere positioning is stable at the population level. Centromeres are also stable in multiple F1 hybrid contexts. Analysis of repeats in Z. mays and other species (Zea diploperennis, Zea luxurians, and Tripsacum dactyloides) reveals tenfold differences in abundance of the major satellite CentC, but similar high levels of sequence polymorphism in individual CentC copies. Deviation from the CentC consensus has little or no effect on binding of cenH3. These data indicate that complex centromeres are neither a peculiarity of cultivation nor inbreeding in Z. mays. While extensive arrays of CentC may be the norm for other Zea and Tripsacum species, these data also reveal that a wide diversity of DNA sequences and multiple types of genetic elements in and near centromeres support centromere function and constrain centromere positions.
Comparison of Macroscopic Pathology Measurements With Magnetic Resonance Imaging and Assessment of Microscopic Pathology Extension for Colorectal Liver Metastases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mendez Romero, Alejandra, E-mail: a.mendezromero@erasmusmc.nl; Verheij, Joanne; Dwarkasing, Roy S.

2012-01-01

Purpose: To compare pathology macroscopic tumor dimensions with magnetic resonance imaging (MRI) measurements and to establish the microscopic tumor extension of colorectal liver metastases. Methods and Materials: In a prospective pilot study we included patients with colorectal liver metastases planned for surgery and eligible for MRI. A liver MRI was performed within 48 hours before surgery. Directly after surgery, an MRI of the specimen was acquired to measure the degree of tumor shrinkage. The specimen was fixed in formalin for 48 hours, and another MRI was performed to assess the specimen/tumor shrinkage. All MRI sequences were imported into our radiotherapy treatment planning system, where the tumor and the specimen were delineated. For the macroscopic pathology analyses, photographs of the sliced specimens were used to delineate and reconstruct the tumor and the specimen volumes. Microscopic pathology analyses were conducted to assess the infiltration depth of tumor cell nests. Results: Between February 2009 and January 2010 we included 13 patients for analysis with 21 colorectal liver metastases. Specimen and tumor shrinkage after resection and fixation was negligible. The best tumor volume correlations between MRI and pathology were found for T1-weighted (w) echo gradient sequence (r_s = 0.99, slope = 1.06), and the T2-w fast spin echo (FSE) single-shot sequence (r_s = 0.99, slope = 1.08), followed by the T2-w FSE fat saturation sequence (r_s = 0.99, slope = 1.23), and the T1-w gadolinium-enhanced sequence (r_s = 0.98, slope = 1.24). We observed 39 tumor cell nests beyond the tumor border in 12 metastases. Microscopic extension was found between 0.2 and 10 mm from the main tumor, with 90% of the cases within 6 mm. Conclusions: MRI tumor dimensions showed a good agreement with the macroscopic pathology suggesting that MRI can be used for accurate tumor delineation. However, microscopic extensions found beyond the tumor border indicate that caution is needed in selecting appropriate tumor margins.
The CMS High Level Trigger System: Experience and Future Development

NASA Astrophysics Data System (ADS)

Bauer, G.; Behrens, U.; Bowen, M.; Branson, J.; Bukowiec, S.; Cittolin, S.; Coarasa, J. A.; Deldicque, C.; Dobson, M.; Dupont, A.; Erhan, S.; Flossdorf, A.; Gigi, D.; Glege, F.; Gomez-Reino, R.; Hartl, C.; Hegeman, J.; Holzner, A.; Hwong, Y. L.; Masetti, L.; Meijers, F.; Meschi, E.; Mommsen, R. K.; O'Dell, V.; Orsini, L.; Paus, C.; Petrucci, A.; Pieri, M.; Polese, G.; Racz, A.; Raginel, O.; Sakulin, H.; Sani, M.; Schwick, C.; Shpakov, D.; Simon, S.; Spataru, A. C.; Sumorok, K.

2012-12-01

The CMS experiment at the LHC features a two-level trigger system. Events accepted by the first level trigger, at a maximum rate of 100 kHz, are read out by the Data Acquisition system (DAQ), and subsequently assembled in memory in a farm of computers running a software high-level trigger (HLT), which selects interesting events for offline storage and analysis at a rate of the order of a few hundred Hz. The HLT algorithms consist of sequences of offline-style reconstruction and filtering modules, executed on a farm of O(10000) CPU cores built from commodity hardware. Experience from the operation of the HLT system in the collider run 2010/2011 is reported. The current architecture of the CMS HLT, its integration with the CMS reconstruction framework and the CMS DAQ, are discussed in the light of future development. The possible short- and medium-term evolution of the HLT software infrastructure to support extensions of the HLT computing power, and to address remaining performance and maintenance issues, is discussed.
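The rates quoted in this abstract imply a rough per-event time budget, which can be worked out as below. This is only a back-of-the-envelope sketch: the farm size is taken as 10,000 cores, "a few hundred Hz" is assumed to be 300 Hz, and each core is assumed to process one event at a time. The function names are illustrative and do not come from CMS software.

```python
# Figures taken from the abstract, except where marked as assumptions.
l1_rate_hz = 100_000   # maximum first-level trigger accept rate (100 kHz)
n_cores = 10_000       # order-of-magnitude farm size (about 10,000 CPU cores)
hlt_output_hz = 300    # ASSUMPTION: "a few hundred Hz" taken as 300 Hz


def per_event_budget_s(input_rate_hz, cores):
    """Mean processing time available per event, assuming one event per core at a time."""
    return cores / input_rate_hz


def rejection_factor(input_rate_hz, output_rate_hz):
    """How many input events there are for every event kept for storage."""
    return input_rate_hz / output_rate_hz


budget = per_event_budget_s(l1_rate_hz, n_cores)
print(f"mean time budget per event: {budget * 1000:.0f} ms")
print(f"rejection factor: about {rejection_factor(l1_rate_hz, hlt_output_hz):.0f}")
```

Under these assumptions the farm has on the order of 100 ms per event on average, and keeps roughly one event in a few hundred.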
Electricity forecasting on the individual household level enhanced based on activity patterns

PubMed Central

Gajowniczek, Krzysztof; Ząbkowski, Tomasz

2017-01-01

Leveraging smart metering solutions to support energy efficiency on the individual household level poses novel research challenges in monitoring usage and providing accurate load forecasting. Forecasting electricity usage is an especially important component that can provide intelligence to smart meters. In this paper, we propose an enhanced approach for load forecasting at the household level. The impacts of residents’ daily activities and appliance usages on the power consumption of the entire household are incorporated to improve the accuracy of the forecasting model. The contributions of this paper are threefold: (1) we addressed short-term electricity load forecasting for 24 hours ahead, not on the aggregate but on the individual household level, which fits into the Residential Power Load Forecasting (RPLF) methods; (2) for the forecasting, we utilized a household specific dataset of behaviors that influence power consumption, which was derived using segmentation and sequence mining algorithms; and (3) an extensive load forecasting study using different forecasting algorithms enhanced by the household activity patterns was undertaken. PMID:28423039
Electricity forecasting on the individual household level enhanced based on activity patterns.

PubMed

Gajowniczek, Krzysztof; Ząbkowski, Tomasz

2017-01-01

Leveraging smart metering solutions to support energy efficiency on the individual household level poses novel research challenges in monitoring usage and providing accurate load forecasting. Forecasting electricity usage is an especially important component that can provide intelligence to smart meters. In this paper, we propose an enhanced approach for load forecasting at the household level. The impacts of residents' daily activities and appliance usages on the power consumption of the entire household are incorporated to improve the accuracy of the forecasting model. The contributions of this paper are threefold: (1) we addressed short-term electricity load forecasting for 24 hours ahead, not on the aggregate but on the individual household level, which fits into the Residential Power Load Forecasting (RPLF) methods; (2) for the forecasting, we utilized a household specific dataset of behaviors that influence power consumption, which was derived using segmentation and sequence mining algorithms; and (3) an extensive load forecasting study using different forecasting algorithms enhanced by the household activity patterns was undertaken.
MRI to delineate the gross tumor volume of nasopharyngeal cancers: which sequences and planes should be used?

PubMed

Popovtzer, Aron; Ibrahim, Mohannad; Tatro, Daniel; Feng, Felix Y; Ten Haken, Randall K; Eisbruch, Avraham

2014-09-01

Magnetic resonance imaging (MRI) has been found to be better than computed tomography for defining the extent of primary gross tumor volume (GTV) in advanced nasopharyngeal cancer. It is routinely applied for target delineation in planning radiotherapy. However, the specific MRI sequences/planes that should be used are unknown. Twelve patients with nasopharyngeal cancer underwent primary GTV evaluation with gadolinium-enhanced axial T1 weighted image (T1) and T2 weighted image (T2), coronal T1, and sagittal T1 sequences. Each sequence was registered with the planning computed tomography scans. Planning target volumes (PTVs) were derived by uniform expansions of the GTVs. The volumes encompassed by the various sequences/planes, and the volumes common to all sequences/planes, were compared quantitatively and anatomically to the volume delineated by the commonly used axial T1-based dataset. Addition of the axial T2 sequence increased the axial T1-based GTV by 12% on average (p = 0.004), and composite evaluations that included the coronal T1 and sagittal T1 planes increased the axial T1-based GTVs by 30% on average (p = 0.003). The axial T1-based PTVs were increased by 20% by the additional sequences (p = 0.04). Each sequence/plane added unique volume extensions. The GTVs common to all the T1 planes accounted for 38% of the total volumes of all the T1 planes. Anatomically, addition of the coronal and sagittal-based GTVs extended the axial T1-based GTV caudally and cranially, notably to the base of the skull. Adding MRI planes and sequences to the traditional axial T1 sequence yields significant quantitative and anatomically important extensions of the GTVs and PTVs. For accurate target delineation in nasopharyngeal cancer, we recommend that GTVs be outlined in all MRI sequences/planes and registered with the planning computed tomography scans.

Single nucleotide primer extension to detect genetic diseases: Experimental application to hemophilia B (factor IX) and cystic fibrosis genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuppuswamy, M.N.; Hoffmann, J.W.; Spitzer, S.G.

1991-02-15

In this report, the authors describe an approach to detect the presence of abnormal alleles in those genetic diseases in which frequency of occurrence of the same mutation is high (e.g., hemophilia B). Initially, from each subject, the DNA fragment containing the putative mutation site is amplified by the polymerase chain reaction. For each fragment two reaction mixtures are then prepared. Each contains the amplified fragment, a primer (18-mer or longer) whose sequence is identical to the coding sequence of the normal gene immediately flanking the 5′ end of the mutation site, and either an α-³²P-labeled nucleotide corresponding to the normal coding sequence at the mutation site or an α-³²P-labeled nucleotide corresponding to the mutant sequence. An essential feature of the present methodology is that the base immediately 3′ to the template-bound primer is one of those altered in the mutant, since in this way an extension of the primer by a single base will give an extended molecule characteristic of either the mutant or the wild type. The method is rapid and should be useful in carrier detection and prenatal diagnosis of every genetic disease with a known sequence variation.
Telomere extension by telomerase and ALT generates variant repeats by mechanistically distinct processes

PubMed Central

Lee, Michael; Hills, Mark; Conomos, Dimitri; Stutz, Michael D.; Dagg, Rebecca A.; Lau, Loretta M.S.; Reddel, Roger R.; Pickett, Hilda A.

2014-01-01

Telomeres are terminal repetitive DNA sequences on chromosomes, and are considered to comprise almost exclusively hexameric TTAGGG repeats. We have evaluated telomere sequence content in human cells using whole-genome sequencing followed by telomere read extraction in a panel of mortal cell strains and immortal cell lines. We identified a wide range of telomere variant repeats in human cells, and found evidence that variant repeats are generated by mechanistically distinct processes during telomerase- and ALT-mediated telomere lengthening. Telomerase-mediated telomere extension resulted in biased repeat synthesis of variant repeats that differed from the canonical sequence at positions 1 and 3, but not at positions 2, 4, 5 or 6. This indicates that telomerase is most likely an error-prone reverse transcriptase that misincorporates nucleotides at specific positions on the telomerase RNA template. In contrast, cell lines that use the ALT pathway contained a large range of variant repeats that varied greatly between lines. This is consistent with variant repeats spreading from proximal telomeric regions throughout telomeres in a stochastic manner by recombination-mediated templating of DNA synthesis. The presence of unexpectedly large numbers of variant repeats in cells utilizing either telomere maintenance mechanism suggests a conserved role for variant sequences at human telomeres. PMID:24225324
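The position-specific bias described in this abstract (variants at positions 1 and 3 of the hexamer) suggests a simple tally, sketched below. This is an illustrative toy, not the authors' pipeline: it assumes reads are already in frame with the canonical TTAGGG repeat and counts only hexamers that differ from it at exactly one base. The function name and the example read are hypothetical.

```python
from collections import Counter

CANONICAL = "TTAGGG"


def variant_positions(read, canonical=CANONICAL):
    """Tally which position (1-6) differs in hexamers with exactly one mismatch.

    ASSUMPTION: the read starts in frame with the canonical repeat.
    """
    k = len(canonical)
    counts = Counter()
    for i in range(0, len(read) - k + 1, k):
        hexamer = read[i:i + k]
        mismatches = [pos + 1
                      for pos, (a, b) in enumerate(zip(hexamer, canonical))
                      if a != b]
        if len(mismatches) == 1:
            counts[mismatches[0]] += 1
    return counts


# Made-up read: canonical, GTAGGG (pos 1), TTGGGG (pos 3), canonical, CTAGGG (pos 1)
read = "TTAGGG" + "GTAGGG" + "TTGGGG" + "TTAGGG" + "CTAGGG"
print(dict(variant_positions(read)))
```

A real analysis would also need to handle frame detection, reverse-complement reads, and sequencing error, none of which are modelled here.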
Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

PubMed Central

Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia

2011-01-01

Background: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358
Use of dolutegravir in two INI-experienced patients with multiclass resistance resulted in excellent virological and immunological responses

PubMed Central

Marije Hofstra, Laura; Nijhuis, Monique; Mudrikova, Tania; Fun, Axel; Schipper, Pauline; Schneider, Margriet; Wensing, Annemarie

2014-01-01

Introduction: Dolutegravir is a second generation integrase inhibitor with a proposed high genetic barrier to resistance. However, in clinical trials, decreased virological response was seen in a subset of patients with prior exposure to raltegravir and multiple integrase resistance mutations. Methods: We describe two cases of HIV subtype B-infected patients starting dolutegravir after previous failure on a raltegravir-containing regimen with extensive resistance. Genotypic analysis was performed using population sequencing and 454 ultradeep sequencing of integrase at time of raltegravir exposure. Results: Both patients were diagnosed in the early 1990s and received mono- and dual therapy, followed by several cART regimens. Due to the presence of extensive resistance, the genotypic susceptibility score of these regimens never exceeded 2, and the regimens never resulted in sustained virological suppression despite good adherence. In early 2012, the clinical condition of patient 1 worsened during persistent failure of a mega-cART regimen despite excellent drug levels. Six major PI, six minor PI, seven NRTI, six NNRTI and two INI mutations plus DM-virus were detected (Table 1). Ultra-deep sequencing of integrase showed the selection of Q148R, E138K+Q148K, and N155H variants and phenotypic raltegravir resistance was demonstrated. After addition of dolutegravir and enfuvirtide to the failing regimen (zidovudine, lamivudine, tenofovir, etravirine, darunavir/ritonavir, maraviroc), viral load (VL) decreased from 244,000 to <20 cps/mL within five months, CD4-count increased (33 to 272 cells/mm³) and the clinical condition improved substantially. In patient 2, similar worsening of the clinical condition was observed in late 2012 during persistent failure on mega-cART. Five major PI, six minor PI, nine NRTI, seven NNRTI and one INI mutation plus DM-virus were detected. Ultra-deep sequencing showed selection of N155H, followed by Q95K and V151I variants and phenotypic raltegravir resistance was demonstrated. Dolutegravir was added to his failing regimen (zidovudine, lamivudine, etravirine, atazanavir/ritonavir, maraviroc) at a VL of 39,000 cps/mL. Sustained virological suppression was reached within five months with considerable increase of CD4-count (41 to 175 cells/mm³) and slight improvement of clinical condition. Conclusions: We present the first patients with extensive integrase resistance who were treated with dolutegravir in clinical practice and who achieved excellent virological and immunological success. These cases demonstrate the high genetic barrier of dolutegravir. PMID:25397500
Super Normal Vector for Human Activity Recognition with Depth Cameras.

PubMed

Yang, Xiaodong; Tian, YingLi

2017-05-01

The advent of cost-effective and easy-to-operate depth cameras has facilitated a variety of visual recognition tasks including human activity recognition. This paper presents a novel framework for recognizing human activities from video sequences captured by depth cameras. We extend the surface normal to polynormal by assembling local neighboring hypersurface normals from a depth sequence to jointly characterize local motion and shape information. We then propose a general scheme of super normal vector (SNV) to aggregate the low-level polynormals into a discriminative representation, which can be viewed as a simplified version of the Fisher kernel representation. In order to globally capture the spatial layout and temporal order, an adaptive spatio-temporal pyramid is introduced to subdivide a depth video into a set of space-time cells. In extensive experiments, the proposed approach achieves superior performance to the state-of-the-art methods on four public benchmark datasets, i.e., MSRAction3D, MSRDailyActivity3D, MSRGesture3D, and MSRActionPairs3D.
ExoLocator--an online view into genetic makeup of vertebrate proteins.

PubMed

Khoo, Aik Aun; Ogrizek-Tomas, Mario; Bulovic, Ana; Korpar, Matija; Gürler, Ece; Slijepcevic, Ivan; Šikic, Mile; Mihalek, Ivana

2014-01-01

ExoLocator (http://exolocator.eopsf.org) collects in a single place information needed for comparative analysis of protein-coding exons from vertebrate species. The main source of data--the genomic sequences, and the existing exon and homology annotation--is the ENSEMBL database of completed vertebrate genomes. To these, ExoLocator adds the search for ostensibly missing exons in orthologous protein pairs across species, using an extensive computational pipeline to narrow down the search region for the candidate exons and find a suitable template in the other species, as well as state-of-the-art implementations of pairwise alignment algorithms. The resulting complements of exons are organized in a way currently unique to ExoLocator: multiple sequence alignments, both on the nucleotide and on the peptide levels, clearly indicating the exon boundaries. The alignments can be inspected in the web-embedded viewer, downloaded or used on the spot to produce an estimate of conservation within orthologous sets, or functional divergence across paralogues.
Syntactic sequencing in Hebbian cell assemblies.

PubMed

Wennekers, Thomas; Palm, Günther

2009-12-01

Hebbian cell assemblies provide a theoretical framework for the modeling of cognitive processes that grounds them in the underlying physiological neural circuits. Recently we have presented an extension of cell assemblies by operational components that allows aspects of language, rules, and complex behaviour to be modelled. In the present work we study the generation of syntactic sequences using operational cell assemblies timed by unspecific trigger signals. Syntactic patterns are implemented in terms of hetero-associative transition graphs in attractor networks which cause a directed flow of activity through the neural state space. We provide regimes for parameters that enable an unspecific excitatory control signal to switch reliably between attractors in accordance with the implemented syntactic rules. If several target attractors are possible in a given state, noise in the system in conjunction with a winner-takes-all mechanism can randomly choose a target. Disambiguation can also be guided by context signals or specific additional external signals. Given a permanently elevated level of external excitation the model can enter an autonomous mode, where it generates temporal grammatical patterns continuously.
Fine-tuning gene networks using simple sequence repeats

PubMed Central

Egbert, Robert G.; Klavins, Eric

2012-01-01

The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382
Language Model Combination and Adaptation Using Weighted Finite State Transducers

NASA Technical Reports Server (NTRS)

Liu, X.; Gales, M. J. F.; Hieronymus, J. L.; Woodland, P. C.

2010-01-01

In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. They can be used either to represent different genres or tasks found in diverse text sources, or to capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimal change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences.
Identification of novel Theileria genotypes from Grant's gazelle

PubMed Central

Hooge, Janis; Howe, Laryssa; Ezenwa, Vanessa O.

2015-01-01

Blood samples collected from Grant's gazelles (Nanger granti) in Kenya were screened for hemoparasites using a combination of microscopic and molecular techniques. All 69 blood smears examined by microscopy were positive for hemoparasites. In addition, Theileria/Babesia DNA was detected in all 65 samples screened by PCR for a ~450-base pair fragment of the V4 hypervariable region of the 18S rRNA gene. Sequencing and BLAST analysis of a subset of PCR amplicons revealed widespread co-infection (25/39) and the existence of two distinct Grant's gazelle Theileria subgroups. One group of 11 isolates clustered as a subgroup with previously identified Theileria ovis isolates from small ruminants from Europe, Asia and Africa; another group of 3 isolates clustered with previously identified Theileria spp. isolates from other African antelope. Based on extensive levels of sequence divergence (1.2–2%) from previously reported Theileria species within Kenya and worldwide, the Theileria isolates detected in Grant's gazelles appear to represent at least two novel Theileria genotypes. PMID:25973394
Identification of novel Theileria genotypes from Grant's gazelle.

PubMed

Hooge, Janis; Howe, Laryssa; Ezenwa, Vanessa O

2015-08-01

Blood samples collected from Grant's gazelles (Nanger granti) in Kenya were screened for hemoparasites using a combination of microscopic and molecular techniques. All 69 blood smears examined by microscopy were positive for hemoparasites. In addition, Theileria/Babesia DNA was detected in all 65 samples screened by PCR for a ~450-base pair fragment of the V4 hypervariable region of the 18S rRNA gene. Sequencing and BLAST analysis of a subset of PCR amplicons revealed widespread co-infection (25/39) and the existence of two distinct Grant's gazelle Theileria subgroups. One group of 11 isolates clustered as a subgroup with previously identified Theileria ovis isolates from small ruminants from Europe, Asia and Africa; another group of 3 isolates clustered with previously identified Theileria spp. isolates from other African antelope. Based on extensive levels of sequence divergence (1.2-2%) from previously reported Theileria species within Kenya and worldwide, the Theileria isolates detected in Grant's gazelles appear to represent at least two novel Theileria genotypes.
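A divergence figure such as the 1.2-2% quoted in this abstract is commonly reported as an uncorrected pairwise distance (p-distance). The abstract does not say how the authors computed theirs, so the sketch below is only one plausible reading. It assumes two pre-aligned sequences of equal length and skips gap columns. The function name and toy sequences are hypothetical.

```python
def percent_divergence(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences, as a percentage.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * differences / compared


# Toy 50-base alignment with a single substitution at index 10 (G -> T).
ref = "ACGT" * 12 + "AC"
query = ref[:10] + "T" + ref[11:]
print(percent_divergence(ref, query))  # 1 difference over 50 columns
```

At this rate, a ~450-base fragment with 1.2-2% divergence corresponds to only about five to nine differing positions, so alignment quality matters a great deal.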
Limited utility of residue masking for positive-selection inference.

PubMed

Spielman, Stephanie J; Dawson, Eric T; Wilke, Claus O

2014-09-01

Errors in multiple sequence alignments (MSAs) can reduce accuracy in positive-selection inference. Therefore, it has been suggested to filter MSAs before conducting further analyses. One widely used filter, Guidance, allows users to remove MSA positions aligned with low confidence. However, Guidance's utility in positive-selection inference has been disputed in the literature. We have conducted an extensive simulation-based study to characterize fully how Guidance impacts positive-selection inference, specifically for protein-coding sequences of realistic divergence levels. We also investigated whether novel scoring algorithms, which phylogenetically corrected confidence scores, and a new gap-penalization score-normalization scheme improved Guidance's performance. We found that no filter, including original Guidance, consistently benefitted positive-selection inferences. Moreover, all improvements detected were exceedingly minimal, and in certain circumstances, Guidance-based filters worsened inferences.
Association between mitochondrial DNA variations and Alzheimer's Disease in the ADNI cohort

PubMed Central

Lakatos, Anita; Derbeneva, Olga; Younes, Danny; Keator, David; Bakken, Trygve; Lvova, Maria; Brandon, Marty; Guffanti, Guia; Reglodi, Dora; Saykin, Andrew; Weiner, Michael; Macciardi, Fabio; Schork, Nicholas; Wallace, Douglas C.; Potkin, Steven G.

2010-01-01

Despite the central role of amyloid deposition in the development of Alzheimer's disease (AD), the pathogenesis of AD still remains elusive at the molecular level. Increasing evidence suggests that compromised mitochondrial function contributes to the aging process and thus may increase the risk of AD. Dysfunctional mitochondria contribute to reactive oxygen species (ROS) which can lead to extensive macromolecule oxidative damage and the progression of amyloid pathology. Oxidative stress and amyloid toxicity leave neurons chemically vulnerable. Because the brain relies on aerobic metabolism, it is apparent that mitochondria are critical for the cerebral function. Mitochondrial DNA sequence-changes could shift cell dynamics and facilitate neuronal vulnerability. Therefore we postulated that mitochondrial DNA sequence polymorphisms may increase the risk of AD. We evaluated the role of mitochondrial haplogroups derived from 138 mitochondrial polymorphisms in 358 Caucasian ADNI subjects. Our results indicate that the mitochondrial haplogroup UK may confer genetic susceptibility to AD independently of the APOE4 allele. PMID:20538375
Genomic region operation kit for flexible processing of deep sequencing data.

PubMed

Ovaska, Kristian; Lyly, Lauri; Sahu, Biswajyoti; Jänne, Olli A; Hautaniemi, Sampsa

2013-01-01

Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to language amenable for computational analysis. With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison. GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments. GROK is freely available with a user guide from http://csbi.ltdk.helsinki.fi/grok/.
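To make the set-algebra idea in this abstract concrete, the sketch below implements union and intersection on genomic regions in plain Python. It does not use or imitate the GROK API; the function names, the `(chromosome, start, end)` tuple layout, and the half-open coordinate convention are all assumptions chosen for illustration.

```python
def merge(regions):
    """Union: merge overlapping or touching half-open intervals per chromosome."""
    merged = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersect(a, b):
    """Intersection of two region sets, using a two-pointer sweep over merged input."""
    a, b = merge(a), merge(b)
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        chrom_a, start_a, end_a = a[i]
        chrom_b, start_b, end_b = b[j]
        if chrom_a != chrom_b:
            # Advance whichever side is on the earlier chromosome.
            if chrom_a < chrom_b:
                i += 1
            else:
                j += 1
            continue
        lo, hi = max(start_a, start_b), min(end_a, end_b)
        if lo < hi:
            out.append((chrom_a, lo, hi))
        # Advance the interval that ends first.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return out


# Hypothetical peak sets for two transcription factors.
peaks_tf1 = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
peaks_tf2 = [("chr1", 250, 400), ("chr2", 10, 60)]
print(merge(peaks_tf1))
print(intersect(peaks_tf1, peaks_tf2))
```

Sorted lists keep the sketch short. A tree-based structure of the kind the abstract mentions would suit repeated insertions and queries better.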
Tertiary structural propensities reveal fundamental sequence/structure relationships.

PubMed

Zheng, Fan; Zhang, Jian; Grigoryan, Gevorg

2015-05-05

Extracting useful generalizations from the continually growing Protein Data Bank (PDB) is of central importance. We hypothesize that the PDB contains valuable quantitative information on the level of local tertiary structural motifs (TERMs). We show that by breaking a protein structure into its constituent TERMs, and querying the PDB to characterize the natural ensemble matching each, we can estimate the compatibility of the structure with a given amino acid sequence through a metric we term "structure score." Considering submissions from recent Critical Assessment of Structure Prediction (CASP) experiments, we found a strong correlation (R = 0.69) between structure score and model accuracy, with poorly predicted regions readily identifiable. This performance exceeds that of leading atomistic statistical energy functions. Furthermore, TERM-based analysis of two prototypical multi-state proteins rapidly produced structural insights fully consistent with prior extensive experimental studies. We thus find that TERM-based analysis should have considerable utility for protein structural biology. Copyright © 2015 Elsevier Ltd. All rights reserved.
Image encryption using random sequence generated from generalized information domain

NASA Astrophysics Data System (ADS)

Xia-Yan, Zhang; Guo-Ji, Zhang; Xuan, Li; Ya-Zhou, Ren; Jie-Hua, Wu

2016-05-01

A novel image encryption method based on the random sequence generated from the generalized information domain and permutation-diffusion architecture is proposed. The random sequence is generated by reconstruction from the generalized information file and discrete trajectory extraction from the data stream. The trajectory address sequence is used to generate a P-box to shuffle the plain image while random sequences are treated as keystreams. A new factor called drift factor is employed to accelerate and enhance the performance of the random sequence generator. An initial value is introduced to make the encryption method an approximately one-time pad. Experimental results show that the random sequences pass the NIST statistical test with a high ratio and extensive analysis demonstrates that the new encryption scheme has superior security.
Rapidly rotating polytropes in general relativity

NASA Technical Reports Server (NTRS)

Cook, Gregory B.; Shapiro, Stuart L.; Teukolsky, Saul A.

1994-01-01

We construct an extensive set of equilibrium sequences of rotating polytropes in general relativity. We determine a number of important physical parameters of such stars, including maximum mass and maximum spin rate. The stability of the configurations against quasi-radial perturbations is diagnosed. Two classes of evolutionary sequences of fixed rest mass and entropy are explored: normal sequences which behave very much like Newtonian evolutionary sequences, and supramassive sequences which exist solely because of relativistic effects. Dissipation leading to loss of angular momentum causes a star to evolve in a quasi-stationary fashion along an evolutionary sequence. Supramassive sequences evolve towards eventual catastrophic collapse to a black hole. Prior to collapse, the star must spin up as it loses angular momentum, an effect which may provide an observational precursor to gravitational collapse to a black hole.
Open source database of images DEIMOS: extension for large-scale subjective image quality assessment

NASA Astrophysics Data System (ADS)

Vítek, Stanislav

2014-09-01

DEIMOS (Database of Images: Open Source) is an open-source database of images and video sequences for testing, verification and comparison of various image and/or video processing techniques such as compression, reconstruction and enhancement. This paper describes an extension of the database that enables large-scale, web-based subjective image quality assessment. The extension implements both an administrative and a client interface. The proposed system is aimed mainly at mobile communication devices and takes advantage of HTML5 technology, so participants do not need to install any application and the assessment can be performed in a web browser. The assessment campaign administrator can select images from the large database and then apply rules defined by various test procedure recommendations. The standard test procedures may be fully customized and saved as a template. Alternatively, the administrator can define a custom test, using images from the pool and other components, such as evaluation forms and ongoing questionnaires. The image sequence is delivered to the online client, e.g. a smartphone or tablet, as a fully automated assessment sequence, or the viewer can decide on the timing of the assessment if required. Environmental data and viewing conditions (e.g. illumination, vibrations, GPS coordinates, etc.) may be collected and subsequently analyzed.
Status of the Microbial Census

PubMed Central

Schloss, Patrick D.; Handelsman, Jo

2004-01-01

Over the past 20 years, more than 78,000 16S rRNA gene sequences have been deposited in GenBank and the Ribosomal Database Project, making the 16S rRNA gene the most widely studied gene for reconstructing bacterial phylogeny. While there is a general appreciation that these sequences are largely unique and derived from diverse species of bacteria, there has not been a quantitative attempt to describe the extent of sequencing efforts to date. We constructed rarefaction curves for each bacterial phylum and for the entire bacterial domain to assess the current state of sampling and the relative taxonomic richness of each phylum. This analysis quantifies the general sense among microbiologists that we are a long way from a complete census of the bacteria on Earth. Moreover, the analysis indicates that current sampling strategies might not be the most effective ones to describe novel diversity because there remain numerous phyla that are globally distributed yet poorly sampled. Based on the current level of sampling, it is not possible to estimate the total number of bacterial species on Earth, but the minimum species richness is 35,498. Considering previous global species richness estimates of 10^7 to 10^9, we are certain that this estimate will increase with additional sequencing efforts. The data support previous calls for extensive surveys of multiple chemically disparate environments and of specific phylogenetic groups to advance the census most rapidly. PMID:15590780
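A rarefaction curve of the kind mentioned in this abstract plots the expected number of taxa observed against the number of sequences sampled. The sketch below uses the standard analytical (hypergeometric) expectation. It is not the authors' procedure, and the function name `rarefied_richness` and the toy counts are assumptions for illustration.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of taxa seen in a random subsample of n sequences.

    counts: number of sequences observed for each taxon.
    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), where N is the
    total number of sequences and N_i is the count for taxon i.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total sequence count")
    denom = comb(total, n)
    # comb(total - c, n) is 0 whenever n > total - c, so a taxon that
    # cannot be missed contributes exactly 1 to the expectation.
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


toy_counts = [5, 3, 2]  # hypothetical sequence counts for three taxa
curve = [rarefied_richness(toy_counts, n) for n in range(sum(toy_counts) + 1)]
```

A curve that is still rising at the full sample size indicates that further sequencing would be expected to reveal more taxa, which is the sense in which the abstract describes the census as incomplete.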
High resolution seismic stratigraphy of Tampa Bay, Florida

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tihansky, A.B.; Hine, A.C.; Locker, S.D.

1993-03-01

Tampa Bay is one of two large embayments that interrupt the broad, regional nature of the carbonate ramp of the west coast of the Florida carbonate platform. It is believed to have formed as a result of preferential dissolution of the Cenozoic limestones beneath it. Highly reactive freshwater systems became hydrologically focused in the bay region as the surface and groundwater systems established themselves during sea-level lowstands. This weakening of the underlying limestone resulted in extensive karstification, including warping, subsidence, sinkhole and spring formation. Over 120 miles of high resolution seismic reflection data were collected within Tampa Bay. This record has been tied into 170 core borings taken from within the bay. This investigation has found three major seismic stratigraphic sequences beneath the bay. The lowermost sequence is probably of Miocene age. Its surface is highly irregular due to erosion and dissolution and exhibits a great deal of vertical relief as well as gentler undulations or warping. Much of the middle sequence consists of low angle clinoforms that gently downlap and fill in the underlying karst features. The uppermost sequence is a discontinuous unit comprised of horizontal to low angle clinoforms that are local in their extent. The recent drainage and sedimentation patterns within the bay area are related to the underlying structure controlled by the Miocene karst activity.

RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

PubMed

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

2012-01-01

RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications for sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information such as the total number of nucleotide bases and ATGC base contents along with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available at http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer - Random DNA Analyser; GUI - Graphical user interface; XAML - Extensible Application Markup Language.
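The Nussinov dynamic programming algorithm named in this abstract maximizes the number of complementary base pairs in a single strand. The sketch below is the textbook recurrence applied to DNA (A-T and G-C pairs only), not RDNAnalyzer's extended version. The function name and the `min_loop` parameter (minimum number of unpaired bases enclosed by a pair) are assumptions for illustration.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested A-T / G-C pairs in a single DNA strand."""
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is left unpaired
            # case 2: base j pairs with some base k, enclosing at least
            # min_loop unpaired bases between them.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


hairpin_pairs = nussinov_max_pairs("GGGAAACCC")
```

Recovering the actual pairings, as opposed to only their count, would require a traceback through the `dp` table, which is omitted here for brevity.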
Ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses.

PubMed

Fouquier, Jennifer; Rideout, Jai Ram; Bolyen, Evan; Chase, John; Shiffer, Arron; McDonald, Daniel; Knight, Rob; Caporaso, J Gregory; Kelley, Scott T

2016-02-24

Fungi play critical roles in many ecosystems, cause serious diseases in plants and animals, and pose significant threats to human health and structural integrity problems in built environments. While most fungal diversity remains unknown, the development of PCR primers for the internal transcribed spacer (ITS) combined with next-generation sequencing has substantially improved our ability to profile fungal microbial diversity. Although the high sequence variability in the ITS region facilitates more accurate species identification, it also makes multiple sequence alignment and phylogenetic analysis unreliable across evolutionarily distant fungi because the sequences are hard to align accurately. To address this issue, we created ghost-tree, a bioinformatics tool that integrates sequence data from two genetic markers into a single phylogenetic tree that can be used for diversity analyses. Our approach starts with a "foundation" phylogeny based on one genetic marker whose sequences can be aligned across organisms spanning divergent taxonomic groups (e.g., fungal families). Then, "extension" phylogenies are built for more closely related organisms (e.g., fungal species or strains) using a second more rapidly evolving genetic marker. These smaller phylogenies are then grafted onto the foundation tree by mapping taxonomic names such that each corresponding foundation-tree tip would branch into its new "extension tree" child. We applied ghost-tree to graft fungal extension phylogenies derived from ITS sequences onto a foundation phylogeny derived from fungal 18S sequences. Our analysis of simulated and real fungal ITS data sets found that phylogenetic distances between fungal communities computed using ghost-tree phylogenies explained significantly more variance than non-phylogenetic distances. 
The phylogenetic metrics also improved our ability to distinguish small differences (effect sizes) between microbial communities, though results were similar to non-phylogenetic methods for larger effect sizes. The Silva/UNITE-based ghost tree presented here can be easily integrated into existing fungal analysis pipelines to enhance the resolution of fungal community differences and improve understanding of these communities in built environments. The ghost-tree software package can also be used to develop phylogenetic trees for other marker gene sets that afford different taxonomic resolution, or for bridging genome trees with amplicon trees. ghost-tree is pip-installable. All source code, documentation, and test code are available under the BSD license at https://github.com/JTFouquier/ghost-tree .
Neural pathways mediating control of reproductive behaviour in male Japanese quail

PubMed Central

Wild, J Martin; Balthazart, Jacques

2012-01-01

The sexually dimorphic medial preoptic nucleus (POM) in Japanese quail has for many years been the focus of intensive investigations into its role in reproductive behaviour. The present paper delineates a sequence of descending pathways that finally reach sacral levels of the spinal cord housing motor neurons innervating cloacal muscles involved in reproductive behaviour. We first retrogradely labeled the motor neurons innervating the large cloacal sphincter muscle (mSC) that forms part of the foam gland complex (Seiwert and Adkins-Regan, 1998, Brain Behav Evol 52:61–80) and then putative premotor nuclei in the brainstem, one of which was nucleus retroambigualis (RAm) in the caudal medulla. Anterograde tracing from RAm defined a bulbospinal pathway, terminations of which overlapped the distribution of mSC motor neurons and their extensive dorsally directed dendrites. Descending input to RAm arose from an extensive dorsomedial nucleus of the intercollicular complex (DM-ICo), electrical stimulation of which drove vocalizations. POM neurons were retrogradely labeled by injections of tracer into DM-ICo, but POM projections largely surrounded DM, rather than penetrated it. Thus, although a POM projection to ICo was shown, a POM projection to DM must be inferred. Nevertheless, the sequence of projections in the male quail from POM to cloacal motor neurons strongly resembles that in rats, cats and monkeys for the control of reproductive behaviour, as largely defined by Holstege and co-workers (e.g., Holstege et al., 1997, Neuroscience 80: 587–598). PMID:23225613
Origin of secondary potash deposits; a case from Miocene evaporites of NW Central Iran

NASA Astrophysics Data System (ADS)

Rahimpour-Bonab, H.; Kalantarzadeh, Z.

2005-04-01

In early Miocene times, an extensive carbonate shelf developed in Central Iran and during several cycles of sea-level fluctuations, evaporite-bearing carbonate sequences of the Qom Formation were deposited. However, in the early-middle Miocene, development of restricted marine conditions led to a facies change from shelf carbonates of the Qom Formation to the evaporite series of the M 1 member of the overlying Lower Red Formation. This member is a facies mosaic of lagoonal and salina evaporites (mainly halite beds) admixed with wadi siliciclastics. The purpose of this study, which focuses on two salt mines in the northwestern portion of Central Iran in the Zanjan province, was to reveal the origin, sedimentary environment, and diagenesis of these potash-bearing evaporite sequences. Petrographic examination revealed the following mineral assemblage: halite, gypsum, anhydrite and carnallite as primary precipitates, and langbeinite and aphthitalite as secondary metamorphic potash salts. In the Iljaq mine, distorted halite beds are dominated by burial and deformational textures and a great deal of secondary potash salts. In the Qarah-Aghaje mine, however, the bedded halite shows pristine primary textures and is devoid of the secondary potash salts. High bromine content of most evaporite minerals suggests their marine origin, and confirms the absence of the extensive meteoric alterations and subsequent bromine depletions. Potash salts are mainly secondary, and resulted from diagenetic replacements of distorted halite beds during thermal and dynamic metamorphism in a burial setting.
Draft genome sequence of an extensively drug-resistant Pseudomonas aeruginosa isolate belonging to ST644 isolated from a footpad infection in a Magellanic penguin (Spheniscus magellanicus).

PubMed

Sellera, Fábio P; Fernandes, Miriam R; Moura, Quézia; Souza, Tiago A; Nascimento, Cristiane L; Cerdeira, Louise; Lincopan, Nilton

2018-03-01

The incidence of multidrug-resistant bacteria in wildlife animals has been investigated to improve our knowledge of the spread of clinically relevant antimicrobial resistance genes. The aim of this study was to report the first draft genome sequence of an extensively drug-resistant (XDR) Pseudomonas aeruginosa ST644 isolate recovered from a Magellanic penguin with a footpad infection (bumblefoot) undergoing a rehabilitation process. The genome was sequenced on an Illumina NextSeq® platform using 150-bp paired-end reads. De novo genome assembly was performed using Velvet v.1.2.10, and the whole genome sequence was evaluated using bioinformatics approaches from the Center of Genomic Epidemiology, whereas an in-house method (mapping of raw whole genome sequence reads) was used to identify chromosomal point mutations. The genome size was calculated at 6,436,450 bp, with 6357 protein-coding sequences and the presence of genes conferring resistance to aminoglycosides, β-lactams, phenicols, sulphonamides, tetracyclines, quinolones and fosfomycin; in addition, mutations in the genes gyrA (Thr83Ile), parC (Ser87Leu), phoQ (Arg61His) and pmrB (Tyr345His), conferring resistance to quinolones and polymyxins, respectively, were confirmed. This draft genome sequence can provide useful information for comparative genomic analysis regarding the dissemination of clinically significant antibiotic resistance genes and XDR bacterial species at the human-animal interface. Copyright © 2017 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.
Systematic Characterization and Comparative Analysis of the Rabbit Immunoglobulin Repertoire

PubMed Central

Lavinder, Jason J.; Hoi, Kam Hon; Reddy, Sai T.; Wine, Yariv; Georgiou, George

2014-01-01

Rabbits have been used extensively as a model system for the elucidation of the mechanism of immunoglobulin diversification and for the production of antibodies. We employed Next Generation Sequencing to analyze Ig germline V and J gene usage, CDR3 length and amino acid composition, and gene conversion frequencies within the functional (transcribed) IgG repertoire of the New Zealand white rabbit (Oryctolagus cuniculus). Several previously unannotated rabbit heavy chain variable (VH) and light chain variable (VL) germline elements were deduced bioinformatically using multidimensional scaling and k-means clustering methods. We estimated the gene conversion frequency in the rabbit at 23% of IgG sequences with a mean gene conversion tract length of 59±36 bp. Sequencing and gene conversion analysis of the chicken, human, and mouse repertoires revealed that gene conversion occurs much more extensively in the chicken (frequency 70%, tract length 79±57 bp), was observed to a small, yet statistically significant extent in humans, but was virtually absent in mice. PMID:24978027
A report on extensive lateral genetic reciprocation between arsenic resistant Bacillus subtilis and Bacillus pumilus strains analyzed using RAPD-PCR.

PubMed

Khowal, Sapna; Siddiqui, Md Zulquarnain; Ali, Shadab; Khan, Mohd Taha; Khan, Mather Ali; Naqvi, Samar Husain; Wajid, Saima

2017-02-01

The study involves isolation of arsenic resistant bacteria from soil samples. The characterization of bacteria isolates was based on 16S rRNA gene sequences. The phylogenetic consanguinity among isolates was studied employing rpoB and gltX gene sequence. RAPD-PCR technique was used to analyze genetic similarity between arsenic resistant isolates. In accordance with the results Bacillus subtilis and Bacillus pumilus strains may exhibit extensive horizontal gene transfer. Arsenic resistant potency in Bacillus sonorensis and high arsenite tolerance in Bacillus pumilus strains was identified. The RAPD-PCR primer OPO-02 amplified a 0.5kb DNA band specific to B. pumilus 3ZZZ strain and 0.75kb DNA band specific to B. subtilis 3PP. These unique DNA bands may have potential use as SCAR (Sequenced Characterized Amplified Region) molecular markers for identification of arsenic resistant B. pumilus and B. subtilis strains. Copyright © 2016 Elsevier Inc. All rights reserved.
Extensive characterization of peptides from Panax ginseng C. A. Meyer using mass spectrometric approach.

PubMed

Ye, Xueting; Zhao, Nan; Yu, Xi; Han, Xiaoli; Gao, Huiyuan; Zhang, Xiaozhe

2016-11-01

Panax ginseng is an important herb that has clear effects on the treatment of diverse diseases. Until now, the natural peptide constitution of this herb has remained unclear. Here, we conduct an extensive characterization of the Ginseng peptidome using MS-based data mining and sequencing. A screen of the charge states of precursor ions indicated that Ginseng is a peptide-rich herb compared with a number of commonly used herbs. The Ginseng peptides were then extracted and submitted to nano-LC-MS/MS analysis using different fragmentation modes, including CID, high-energy collisional dissociation, and electron transfer dissociation. Further database search and de novo sequencing allowed the identification of a total of 308 peptides, some of which might have important biological activities. This study illustrates the abundance and sequences of endogenous Ginseng peptides, thus providing more candidates for the screening of active compounds in future biological research and drug discovery studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Plasmids encoding therapeutic agents

DOEpatents

Keener, William K [Idaho Falls, ID]

2007-08-07

Plasmids encoding anti-HIV and anti-anthrax therapeutic agents are disclosed. Plasmid pWKK-500 encodes a fusion protein containing DP178 as a targeting moiety, the ricin A chain, an HIV protease cleavable linker, and a truncated ricin B chain. N-terminal extensions of the fusion protein include the maltose binding protein and a Factor Xa protease site. C-terminal extensions include a hydrophobic linker, an L domain motif peptide, a KDEL ER retention signal, another Factor Xa protease site, an out-of-frame buforin II coding sequence, the lacZα peptide, and a polyhistidine tag. More than twenty derivatives of plasmid pWKK-500 are described. Plasmids pWKK-700 and pWKK-800 are similar to pWKK-500 wherein the DP178-encoding sequence is substituted by RANTES- and SDF-1-encoding sequences, respectively. Plasmid pWKK-900 is similar to pWKK-500 wherein the HIV protease cleavable linker is substituted by a lethal factor (LF) peptide-cleavable linker.
Whole-Genome Sequence of Coxiella burnetii Nine Mile RSA439 (Phase II, Clone 4), a Laboratory Workhorse Strain

PubMed Central

Beare, Paul A.; Moses, Abraham S.; Martens, Craig A.; Heinzen, Robert A.

2017-01-01

Here, we report the whole-genome sequence of Coxiella burnetii Nine Mile RSA439 (phase II, clone 4), a laboratory strain used extensively to investigate the biology of this intracellular bacterial pathogen. The genome consists of a 1.97-Mb chromosome and a 37.32-kb plasmid. PMID:28596399
Whole-Genome Sequence of Coxiella burnetii Nine Mile RSA439 (Phase II, Clone 4), a Laboratory Workhorse Strain.

PubMed

Millar, Jess A; Beare, Paul A; Moses, Abraham S; Martens, Craig A; Heinzen, Robert A; Raghavan, Rahul

2017-06-08

Here, we report the whole-genome sequence of Coxiella burnetii Nine Mile RSA439 (phase II, clone 4), a laboratory strain used extensively to investigate the biology of this intracellular bacterial pathogen. The genome consists of a 1.97-Mb chromosome and a 37.32-kb plasmid. Copyright © 2017 Millar et al.
Selective 2'-hydroxyl acylation analyzed by primer extension and mutational profiling (SHAPE-MaP) for direct, versatile and accurate RNA structure analysis.

PubMed

Smola, Matthew J; Rice, Greggory M; Busan, Steven; Siegfried, Nathan A; Weeks, Kevin M

2015-11-01

Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemistries exploit small electrophilic reagents that react with 2'-hydroxyl groups to interrogate RNA structure at single-nucleotide resolution. Mutational profiling (MaP) identifies modified residues by using reverse transcriptase to misread a SHAPE-modified nucleotide and then counting the resulting mutations by massively parallel sequencing. The SHAPE-MaP approach measures the structure of large and transcriptome-wide systems as accurately as can be done for simple model RNAs. This protocol describes the experimental steps, implemented over 3 d, that are required to perform SHAPE probing and to construct multiplexed SHAPE-MaP libraries suitable for deep sequencing. Automated processing of MaP sequencing data is accomplished using two software packages. ShapeMapper converts raw sequencing files into mutational profiles, creates SHAPE reactivity plots and provides useful troubleshooting information. SuperFold uses these data to model RNA secondary structures, identify regions with well-defined structures and visualize probable and alternative helices, often in under 1 d. SHAPE-MaP can be used to make nucleotide-resolution biophysical measurements of individual RNA motifs, rare components of complex RNA ensembles and entire transcriptomes.
Coiled-coil length: Size does matter.

PubMed

Surkont, Jaroslaw; Diekmann, Yoan; Ryder, Pearl V; Pereira-Leal, Jose B

2015-12-01

Protein evolution is governed by processes that alter primary sequence but also the length of proteins. Protein length may change in different ways, but insertions, deletions and duplications are the most common. An optimal protein size is a trade-off between sequence extension, which may change protein stability or lead to acquisition of a new function, and shrinkage that decreases metabolic cost of protein synthesis. Despite the general tendency for length conservation across orthologous proteins, the propensity to accept insertions and deletions is heterogeneous along the sequence. For example, protein regions rich in repetitive peptide motifs are well known to extensively vary their length across species. Here, we analyze length conservation of coiled-coils, domains formed by a ubiquitous, repetitive peptide motif present in all domains of life, that frequently plays a structural role in the cell. We observed that, despite the repetitive nature, the length of coiled-coil domains is generally highly conserved throughout the tree of life, even when the remaining parts of the protein change, including globular domains. Length conservation is independent of primary amino acid sequence variation, and represents a conservation of domain physical size. This suggests that the conservation of domain size is due to functional constraints. © 2015 Wiley Periodicals, Inc.
HGVS Recommendations for the Description of Sequence Variants: 2016 Update.

PubMed

den Dunnen, Johan T; Dalgleish, Raymond; Maglott, Donna R; Hart, Reece K; Greenblatt, Marc S; McGowan-Jordan, Jean; Roux, Anne-Francoise; Smith, Timothy; Antonarakis, Stylianos E; Taschner, Peter E M

2016-06-01

The consistent and unambiguous description of sequence variants is essential to report and exchange information on the analysis of a genome. In particular, DNA diagnostics critically depends on accurate and standardized description and sharing of the variants detected. The sequence variant nomenclature system proposed in 2000 by the Human Genome Variation Society has been widely adopted and has developed into an internationally accepted standard. The recommendations are currently commissioned through a Sequence Variant Description Working Group (SVD-WG) operating under the auspices of three international organizations: the Human Genome Variation Society (HGVS), the Human Variome Project (HVP), and the Human Genome Organization (HUGO). Requests for modifications and extensions go through the SVD-WG following a standard procedure including a community consultation step. Version numbers are assigned to the nomenclature system to allow users to specify the version used in their variant descriptions. Here, we present the current recommendations, HGVS version 15.11, and briefly summarize the changes that were made since the 2000 publication. Most focus has been on removing inconsistencies and tightening definitions allowing automatic data processing. An extensive version of the recommendations is available online, at http://www.HGVS.org/varnomen. © 2016 WILEY PERIODICALS, INC.
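Tightened definitions that allow automatic data processing, as this abstract emphasizes, mean that variant descriptions can be checked by software. The sketch below handles only one narrow case, a simple DNA substitution on a coding (`c.`) or genomic (`g.`) reference, and is in no way a complete implementation of the nomenclature. The regular expression, the RefSeq-style accession pattern, the function name `parse_substitution`, and the example string are all assumptions for illustration.

```python
import re

# Assumed shape: <accession>.<version>:<c|g>.<position><ref>><alt>
SUBSTITUTION = re.compile(
    r"^(?P<reference>[A-Z]{2}_\d+\.\d+):"
    r"(?P<coordinate>[cg])\."
    r"(?P<position>\d+)"
    r"(?P<ref_base>[ACGT])>(?P<alt_base>[ACGT])$"
)


def parse_substitution(description):
    """Return the parts of a simple substitution, or None if it does not match."""
    match = SUBSTITUTION.match(description)
    if match is None:
        return None
    fields = match.groupdict()
    fields["position"] = int(fields["position"])
    return fields


parsed = parse_substitution("NM_004006.2:c.4375C>T")
```

Deletions, insertions, duplications, intronic offsets and protein-level descriptions each have their own syntax and would need a proper grammar rather than a single pattern.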
Russell body inducing threshold depends on the variable domain sequences of individual human IgG clones and the cellular protein homeostasis.

PubMed

Stoops, Janelle; Byrd, Samantha; Hasegawa, Haruki

2012-10-01

Russell bodies are intracellular aggregates of immunoglobulins. Although the mechanism of Russell body biogenesis has been extensively studied by using truncated mutant heavy chains, the importance of the variable domain sequences in this process and in immunoglobulin biosynthesis remains largely unknown. Using a panel of structurally and functionally normal human immunoglobulin Gs, we show that individual immunoglobulin G clones possess distinctive Russell body inducing propensities that can surface differently under normal and abnormal cellular conditions. Russell body inducing predisposition unique to each immunoglobulin G clone was corroborated by the intrinsic physicochemical properties encoded in the heavy chain variable domain/light chain variable domain sequence combinations that define each immunoglobulin G clone. While the sequence based intrinsic factors predispose certain immunoglobulin G clones to be more prone to induce Russell bodies, extrinsic factors such as stressful cell culture conditions also play roles in unmasking Russell body propensity from immunoglobulin G clones that are normally refractory to developing Russell bodies. By taking advantage of heterologous expression systems, we dissected the roles of individual subunit chains in Russell body formation and examined the effect of non-cognate subunit chain pair co-expression on Russell body forming propensity. The results suggest that the properties embedded in the variable domain of individual light chain clones and their compatibility with the partnering heavy chain variable domain sequences underscore the efficiency of immunoglobulin G biosynthesis, the threshold for Russell body induction, and the level of immunoglobulin G secretion. We propose that an interplay between the unique properties encoded in variable domain sequences and the state of protein homeostasis determines whether an immunoglobulin G expressing cell will develop the Russell body phenotype in a dynamic cellular setting. 
Copyright © 2012 Elsevier B.V. All rights reserved.
A genome sequence resource for the aye-aye (Daubentonia madagascariensis), a nocturnal lemur from Madagascar.

PubMed

Perry, George H; Reeves, Darryl; Melsted, Páll; Ratan, Aakrosh; Miller, Webb; Michelini, Katelyn; Louis, Edward E; Pritchard, Jonathan K; Mason, Christopher E; Gilad, Yoav

2012-01-01

We present a high-coverage draft genome assembly of the aye-aye (Daubentonia madagascariensis), a highly unusual nocturnal primate from Madagascar. Our assembly totals ~3.0 billion bp (3.0 Gb), roughly the size of the human genome, comprised of ~2.6 million scaffolds (N50 scaffold size = 13,597 bp) based on short paired-end sequencing reads. We compared the aye-aye genome sequence data with four other published primate genomes (human, chimpanzee, orangutan, and rhesus macaque) as well as with the mouse and dog genomes as nonprimate outgroups. Unexpectedly, we observed strong evidence for a relatively slow substitution rate in the aye-aye lineage compared with these and other primates. In fact, the aye-aye branch length is estimated to be ~10% shorter than that of the human lineage, which is known for its low substitution rate. This finding may be explained, in part, by the protracted aye-aye life-history pattern, including late weaning and age of first reproduction relative to other lemurs. Additionally, the availability of this draft lemur genome sequence allowed us to polarize nucleotide and protein sequence changes to the ancestral primate lineage, a critical period in primate evolution for which the relevant fossil record is sparse. Finally, we identified 293,800 high-confidence single nucleotide polymorphisms in the donor individual for our aye-aye genome sequence, a captive-born individual from two wild-born parents. The resulting heterozygosity estimate of 0.051% is the lowest of any primate studied to date, which is understandable considering the aye-aye's extensive home-range size and relatively low population densities. Yet this level of genetic diversity also suggests that conservation efforts benefiting this unusual species should be prioritized, especially in the face of the accelerating degradation and fragmentation of Madagascar's forests.
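The N50 scaffold size quoted in this abstract is a standard assembly contiguity statistic: the length L such that scaffolds of length L or longer contain at least half of the assembled bases. A minimal sketch of that definition follows. The function name `n50` and the toy lengths are assumptions for illustration and are not taken from the aye-aye assembly.

```python
def n50(lengths):
    """Length of the scaffold at which the running total, taken from the
    longest scaffold downwards, first reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("N50 is undefined for an empty assembly")


toy_n50 = n50([2, 3, 4, 5, 6])  # hypothetical scaffold lengths
```

Because the statistic is weighted by length, an assembly with millions of short scaffolds can still have a moderate N50 if a minority of long scaffolds carry most of the sequence.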
RNA sequencing confirms similarities between PPI-responsive oesophageal eosinophilia and eosinophilic oesophagitis.

PubMed

Peterson, K A; Yoshigi, M; Hazel, M W; Delker, D A; Lin, E; Krishnamurthy, C; Consiglio, N; Robson, J; Yandell, M; Clayton, F

2018-06-04

Although current American guidelines distinguish proton pump inhibitor-responsive oesophageal eosinophilia (PPI-REE) from eosinophilic oesophagitis (EoE), these entities are broadly similar. While two microarray studies showed that they have similar transcriptomes, more extensive RNA sequencing studies have not been done previously. We aimed to determine whether RNA sequencing identifies genetic markers distinguishing PPI-REE from EoE. We retrospectively examined 13 PPI-REE and 14 EoE biopsies, matched for tissue eosinophil content, and 14 normal controls. Patients and controls were not PPI-treated at the time of biopsy. We performed RNA sequencing on formalin-fixed, paraffin-embedded tissue, with differential expression confirmation by quantitative polymerase chain reaction (PCR). We validated the use of formalin-fixed, paraffin-embedded vs RNAlater-preserved tissue, and compared our formalin-fixed, paraffin-embedded EoE results to a prior EoE study. By RNA sequencing, no genes were differentially expressed between the EoE and PPI-REE groups at the false discovery rate (FDR) ≤0.01 level. Compared to normal controls, 1996 genes were differentially expressed in the PPI-REE group and 1306 genes in the EoE group. By less stringent criteria, only MAPK8IP2 was differentially expressed between PPI-REE and EoE (FDR = 0.029, 2.2-fold less in EoE than in PPI-REE), with similar results by PCR. KCNJ2, which was differentially expressed in a prior study, was similar in the EoE and PPI-REE groups by both RNA sequencing and real-time PCR. Eosinophilic oesophagitis and PPI-REE have comparable transcriptomes, confirming that they are part of the same disease continuum. © 2018 John Wiley & Sons Ltd.
Nurture Versus Nature: Accounting for the Differences Between the Taiwan and Timor active arc-continent collisions

NASA Astrophysics Data System (ADS)

Harris, R. A.

2011-12-01

The active Banda arc/continent collision of the Timor region provides many important contrasts to what is observed in Taiwan, most of which are a function of differences in the nature of the subducting plate. One of the most important differences is the thermal state of the respective continental margins: the 30 Ma China passive margin versus the 160 Ma NW Australian continental margin. The subduction of the cold and strong NW Australian passive margin beneath the Banda trench provides many new constraints for resolving longstanding issues about the formative stages of collision and accretion of continental crust. Some of these issues include evidence for slab rollback and subduction erosion, deep continental subduction, emplacement or demise of forearc basement, relative amounts of uplift from crustal vs. lithospheric processes, influence of inherited structure, partitioning of strain away from the thrust front, extent of mélange development, metamorphic conditions and exhumation mechanisms, continental contamination and accretion of volcanic arcs, whether the slab tears, and whether subduction polarity reverses. Most of these issues link to the profound control of lower plate crustal heterogeneity, thermal state and inherited structure. The thermomechanical characteristics of subducting an old continental margin allow for extensive underthrusting of lower plate cover units beneath the forearc and emplacement and uplift of extensive nappes of forearc basement. They also promote subduction of continental crust to levels deep enough to experience high pressure metamorphism (not found in Taiwan) and extensive contamination of the volcanic arc. Seismic tomography confirms subduction of continental lithosphere beneath the Banda Arc to at least 400 km with no evidence for slab tear. Slab rollback during this process results in massive subduction erosion and extension of the upper plate.
The subducting plates in Taiwan and Timor also differ in the lateral continuity of their continental margins. The northern Australian continental margin is highly irregular, with many rift basins subducting parallel to their axes. This feature gives rise to irregularities in the uplift pattern of the collision and its continental margin parallel structural grain. Another major difference between Taiwan and Timor is the mechanical stratigraphy entering the trench. The Australian continental margin bears a carbonate rich pre and post rift sequence that is separated by a 1000 m thick, over pressured mudstone unit that acts as a major detachment and promotes extensive mud diapirism. The post breakup Australian Passive Margin Sequence is incorporated into the orogenic wedge by frontal accretion and forms a classic imbricate thrust stack near the front of the Banda forearc. The pre breakup Gondwana Sequence below the detachment continues to a depth of at least 30 km in the subduction channel beneath the Banda forearc upper plate and stacks up into a duplex zone that forms structural culminations throughout Timor. The upper plate of both collisions is similar in nature but is deformed in different ways due to the strong influence of the lower plate. However, both show extensive subduction erosion, demise of the forearc, and systematic accretion of the arc.
Sequencing intractable DNA to close microbial genomes.

PubMed

Hurt, Richard A; Brown, Steven D; Podar, Mircea; Palumbo, Anthony V; Elias, Dwayne A

2012-01-01

Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.
Phylogenetic Characterizations of Highly Mutated EV-B106 Recombinants Showing Extensive Genetic Exchanges with Other EV-B in Xinjiang, China.

PubMed

Song, Yang; Zhang, Yong; Fan, Qin; Cui, Hui; Yan, Dongmei; Zhu, Shuangli; Tang, Haishu; Sun, Qiang; Wang, Dongyan; Xu, Wenbo

2017-02-23

Human enterovirus B106 (EV-B106) is a new member of the enterovirus B species. To date, only three nucleotide sequences of EV-B106 have been published, and only one full-length genome sequence (the Yunnan strain 148/YN/CHN/12) is available in the GenBank database. In this study, we conducted phylogenetic characterisation of four EV-B106 strains isolated in Xinjiang, China. Pairwise comparisons of the nucleotide sequences and the deduced amino acid sequences revealed that the four Xinjiang EV-B106 strains had only 80.5-80.8% nucleotide identity and 95.4-97.3% amino acid identity with the Yunnan EV-B106 strain, indicating high mutagenicity. Similarity plots and bootscanning analyses revealed that frequent intertypic recombination occurred in all four Xinjiang EV-B106 strains in the non-structural region. These four strains may share a donor sequence with the EV-B85 strain, which circulated in Xinjiang in 2011, indicating extensive genetic exchanges between these strains. All Xinjiang EV-B106 strains were temperature-sensitive. An antibody seroprevalence study against EV-B106 in two Xinjiang prefectures also showed low titres of neutralizing antibodies, suggesting limited exposure and transmission in the population. This study contributes the whole genome sequences of EV-B106 to the GenBank database and provides valuable information regarding the molecular epidemiology of EV-B106 in China.
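The nucleotide and amino acid identity figures in the abstract above come from pairwise comparisons of aligned sequences. As a rough illustration of how such a percentage can be computed, the sketch below counts matching positions between two pre-aligned sequences and skips any column containing a gap. This gap convention is an assumption (alignment tools differ on it), and the function name and toy sequences are hypothetical rather than taken from the study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped; this is one
    of several conventions in use, chosen here only for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 9 ungapped columns, 8 of them identical.
print(round(percent_identity("ATGCATGCAT", "ATGAATGC-T"), 1))
```

Applied to full-length genome alignments, the same calculation yields values on the scale of those reported for the Xinjiang and Yunnan strains, although the exact numbers depend on the alignment and the gap convention used.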

Phylogenetic Characterizations of Highly Mutated EV-B106 Recombinants Showing Extensive Genetic Exchanges with Other EV-B in Xinjiang, China

PubMed Central

Song, Yang; Zhang, Yong; Fan, Qin; Cui, Hui; Yan, Dongmei; Zhu, Shuangli; Tang, Haishu; Sun, Qiang; Wang, Dongyan; Xu, Wenbo

2017-01-01

Human enterovirus B106 (EV-B106) is a new member of the enterovirus B species. To date, only three nucleotide sequences of EV-B106 have been published, and only one full-length genome sequence (the Yunnan strain 148/YN/CHN/12) is available in the GenBank database. In this study, we conducted phylogenetic characterisation of four EV-B106 strains isolated in Xinjiang, China. Pairwise comparisons of the nucleotide sequences and the deduced amino acid sequences revealed that the four Xinjiang EV-B106 strains had only 80.5–80.8% nucleotide identity and 95.4–97.3% amino acid identity with the Yunnan EV-B106 strain, indicating high mutagenicity. Similarity plots and bootscanning analyses revealed that frequent intertypic recombination occurred in all four Xinjiang EV-B106 strains in the non-structural region. These four strains may share a donor sequence with the EV-B85 strain, which circulated in Xinjiang in 2011, indicating extensive genetic exchanges between these strains. All Xinjiang EV-B106 strains were temperature-sensitive. An antibody seroprevalence study against EV-B106 in two Xinjiang prefectures also showed low titres of neutralizing antibodies, suggesting limited exposure and transmission in the population. This study contributes the whole genome sequences of EV-B106 to the GenBank database and provides valuable information regarding the molecular epidemiology of EV-B106 in China. PMID:28230168
Biodegradation of engine oil by fungi from mangrove habitat.

PubMed

Ameen, Fuad; Hadi, Sarfaraz; Moslem, Mohamed; Al-Sabri, Ahmed; Yassin, Mohamed A

2015-01-01

The pollution of land and water by petroleum compounds is a matter of growing concern necessitating the development of methodologies, including microbial biodegradation, to minimize the impending impacts. It has been extensively reported that fungi from polluted habitats have the potential to degrade pollutants, including petroleum compounds. The Red Sea is used extensively for the transport of oil and is substantially polluted, due to leaks, spills, and occasional accidents. Tidal water, floating debris, and soil sediment were collected from mangrove stands on three polluted sites along the Red Sea coast of Saudi Arabia and forty-five fungal isolates belonging to 13 genera were recovered from these samples. The isolates were identified on the basis of a sequence analysis of the 18S rRNA gene fragment. Nine of these isolates were found to be able to grow in association with engine oil, as the sole carbon source, under in vitro conditions. These selected isolates and their consortium accumulated greater biomass, liberated more CO2, and produced higher levels of extracellular enzymes, during cultivation with engine oil as compared with the controls. These observations were authenticated by gas chromatography-mass spectrophotometry (GC-MS) analysis, which indicated that many high mass compounds present in the oil before treatment either disappeared or showed diminished levels.
A proteomic analysis of the chromoplasts isolated from sweet orange fruits [Citrus sinensis (L.) Osbeck].

PubMed

Zeng, Yunliu; Pan, Zhiyong; Ding, Yuduan; Zhu, Andan; Cao, Hongbo; Xu, Qiang; Deng, Xiuxin

2011-11-01

Here, a comprehensive proteomic analysis of the chromoplasts purified from sweet orange using Nycodenz density gradient centrifugation is reported. A GeLC-MS/MS shotgun approach was used to identify the proteins of pooled chromoplast samples. A total of 493 proteins were identified from purified chromoplasts, of which 418 are putative plastid proteins based on in silico sequence homology and functional analyses. Based on the predicted functions of these identified plastid proteins, a large proportion (∼60%) of the chromoplast proteome of sweet orange is constituted by proteins involved in carbohydrate metabolism, amino acid/protein synthesis, and secondary metabolism. Of note, HDS (hydroxymethylbutenyl 4-diphosphate synthase), PAP (plastid-lipid-associated protein), and psHSPs (plastid small heat shock proteins) involved in the synthesis or storage of carotenoid and stress response are among the most abundant proteins identified. A comparison of chromoplast proteomes between sweet orange and tomato suggested a high level of conservation in a broad range of metabolic pathways. However, the citrus chromoplast was characterized by more extensive carotenoid synthesis, extensive amino acid synthesis without nitrogen assimilation, and evidence for lipid metabolism concerning jasmonic acid synthesis. In conclusion, this study provides an insight into the major metabolic pathways as well as some unique characteristics of the sweet orange chromoplasts at the whole proteome level.
An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

PubMed Central

Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S

1999-01-01

A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. "Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it." (Milne, 1926) PMID:10471707
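The density and percentage figures in the abstract above can be checked with simple arithmetic. The sketch below assumes the region is exactly 2.9 Mb (2,900 kb), as stated in the record title; the abstract itself says only "nearly 3 Mb", so the region size is an assumption. The counts are taken from the abstract.

```python
# Region size assumed from the record title ("2.9-Mb region").
region_kb = 2900

protein_coding_genes = 218   # predicted protein-coding genes
transposable_elements = 17   # transposable element sequences
est_matches = 95             # genes matching a Drosophila EST
protein_similarities = 144   # genes similar to proteins in other organisms

# Average spacing in kb per feature.
kb_per_gene = region_kb / protein_coding_genes
kb_per_te = region_kb / transposable_elements

# Fractions of the 218 known and predicted genes, as percentages.
est_pct = 100 * est_matches / protein_coding_genes
similarity_pct = 100 * protein_similarities / protein_coding_genes

print(round(kb_per_gene), round(kb_per_te))    # 13 171
print(round(est_pct), round(similarity_pct))   # 44 66
```

Under that assumption the rounded values agree with the abstract: one gene per 13 kb, one transposable element per 171 kb, and 44% and 66% of genes with EST and protein matches, respectively.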
DNA sequencing with pyrophosphatase

DOEpatents

Tabor, S.; Richardson, C.C.

1996-03-12

A kit or solution is disclosed for use in extension of an oligonucleotide primer having a first single-stranded region on a template molecule and having a second single-stranded region homologous to the first single-stranded region. The first agent is able to cause extension of the first single-stranded region of the primer on the second single-stranded region of the template in a reaction mixture. The second agent is able to reduce the amount of pyrophosphate in the reaction mixture below the amount produced during the extension in the absence of the second agent.
DNA sequencing with pyrophosphatase

DOEpatents

Tabor, Stanley; Richardson, Charles C.

1996-03-12

A kit or solution for use in extension of an oligonucleotide primer having a first single-stranded region on a template molecule having a second single-stranded region homologous to the first single-stranded region, comprising a first agent able to cause extension of the first single-stranded region of the primer on the second single-stranded region of the template in a reaction mixture, and a second agent able to reduce the amount of pyrophosphate in the reaction mixture below the amount produced during the extension in the absence of the second agent.
Transcriptome analysis of stem development in the tumourous stem mustard Brassica juncea var. tumida Tsen et Lee by RNA sequencing.

PubMed

Sun, Quan; Zhou, Guanfan; Cai, Yingfan; Fan, Yonghong; Zhu, Xiaoyan; Liu, Yihua; He, Xiaohong; Shen, Jinjuan; Jiang, Huaizhong; Hu, Daiwen; Pan, Zheng; Xiang, Liuxin; He, Guanghua; Dong, Daiwen; Yang, Jianping

2012-04-21

Tumourous stem mustard (Brassica juncea var. tumida Tsen et Lee) is an economically and nutritionally important vegetable crop of the Cruciferae family that also provides the raw material for Fuling mustard. The genetics breeding, physiology, biochemistry and classification of mustards have been extensively studied, but little information is available on tumourous stem mustard at the molecular level. To gain greater insight into the molecular mechanisms underlying stem swelling in this vegetable and to provide additional information for molecular research and breeding, we sequenced the transcriptome of tumourous stem mustard at various stem developmental stages and compared it with that of a mutant variety lacking swollen stems. Using Illumina short-read technology with a tag-based digital gene expression (DGE) system, we performed de novo transcriptome assembly and gene expression analysis. In our analysis, we assembled genetic information for tumourous stem mustard at various stem developmental stages. In addition, we constructed five DGE libraries, which covered the strains Yong'an and Dayejie at various development stages. Illumina sequencing identified 146,265 unigenes, including 11,245 clusters and 135,020 singletons. The unigenes were subjected to a BLAST search and annotated using the GO and KO databases. We also compared the gene expression profiles of three swollen stem samples with those of two non-swollen stem samples. A total of 1,042 genes with significantly different expression levels occurring simultaneously in the six comparison groups were screened out. Finally, the altered expression levels of a number of randomly selected genes were confirmed by quantitative real-time PCR. 
Our data provide comprehensive gene expression information at the transcriptional level and the first insight into the understanding of the molecular mechanisms and regulatory pathways of stem swelling and development in this plant, and will help define new mechanisms of stem development in non-model plant organisms.
Comparative genome analysis of Prevotella ruminicola and Prevotella bryantii: insights into their environmental niche.

PubMed

Purushe, Janaki; Fouts, Derrick E; Morrison, Mark; White, Bryan A; Mackie, Roderick I; Coutinho, Pedro M; Henrissat, Bernard; Nelson, Karen E

2010-11-01

The Prevotellas comprise a diverse group of bacteria that has received surprisingly limited attention at the whole genome-sequencing level. In this communication, we present the comparative analysis of the genomes of Prevotella ruminicola 23 (GenBank: CP002006) and Prevotella bryantii B(1)4 (GenBank: ADWO00000000), two gastrointestinal isolates. Both P. ruminicola and P. bryantii have acquired an extensive repertoire of glycoside hydrolases that are targeted towards non-cellulosic polysaccharides, especially GH43 bifunctional enzymes. Our analysis demonstrates the diversity of this genus. The results from these analyses highlight their role in the gastrointestinal tract, and provide a template for additional work on genetic characterization of these species.
Image encryption using a synchronous permutation-diffusion technique

NASA Astrophysics Data System (ADS)

Enayatifar, Rasul; Abdullah, Abdul Hanan; Isnin, Ismail Fauzi; Altameem, Ayman; Lee, Malrey

2017-03-01

In the past decade, interest in digital image security has increased among scientists. A synchronous permutation and diffusion technique is designed to protect gray-level image content while it is sent over the internet. To implement the proposed method, the two-dimensional plain image is converted to one dimension. Afterward, to reduce the time of the sending process, the permutation and diffusion steps for each pixel are performed at the same time. The permutation step uses a chaotic map and deoxyribonucleic acid (DNA) to permute a pixel, while diffusion employs a DNA sequence and a DNA operator to encrypt the pixel. Experimental results and extensive security analyses demonstrate the feasibility and validity of the proposed image encryption method.
Power Spectrum of Long Eigenlevel Sequences in Quantum Chaotic Systems.

PubMed

Riser, Roman; Osipov, Vladimir Al; Kanzieper, Eugene

2017-05-19

We present a nonperturbative analysis of the power spectrum of energy level fluctuations in fully chaotic quantum structures. Focusing on systems with broken time-reversal symmetry, we employ a finite-N random matrix theory to derive an exact multidimensional integral representation of the power spectrum. The N→∞ limit of the exact solution furnishes the main result of this study-a universal, parameter-free prediction for the power spectrum expressed in terms of a fifth Painlevé transcendent. Extensive numerics lends further support to our theory which, as discussed at length, invalidates a traditional assumption that the power spectrum is merely determined by the spectral form factor of a quantum system.
Effect of Next-Generation Exome Sequencing Depth for Discovery of Diagnostic Variants.

PubMed

Kim, Kyung; Seong, Moon-Woo; Chung, Won-Hyong; Park, Sung Sup; Leem, Sangseob; Park, Won; Kim, Jihyun; Lee, KiYoung; Park, Rae Woong; Kim, Namshin

2015-06-01

Sequencing depth, which is directly related to the cost and time required for the generation, processing, and maintenance of next-generation sequencing data, is an important factor in the practical utilization of such data in clinical fields. Unfortunately, identifying an exome sequencing depth adequate for clinical use is a challenge that has not been addressed extensively. Here, we investigate the effect of exome sequencing depth on the discovery of sequence variants for clinical use. Toward this, we sequenced ten germ-line blood samples from breast cancer patients on the Illumina platform GAII(x) at a high depth of ~200×. We observed that most function-related diverse variants in the human exonic regions could be detected at a sequencing depth of 120×. Furthermore, investigation using a diagnostic gene set showed that the number of clinical variants identified using exome sequencing reached a plateau at an average sequencing depth of about 120×. Moreover, the phenomena were consistent across the breast cancer samples.
Identification of new polymorphisms of the angiotensin I-converting enzyme (ACE) gene, and study of their relationship to plasma ACE levels by two-QTL segregation-linkage analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Villard, E.; Soubrier, F.; Tiret, L.

1996-06-01

Plasma angiotensin I-converting enzyme (ACE) levels are highly genetically determined. A previous segregation-linkage analysis suggested the existence of a functional mutation located within or close to the ACE locus, in almost complete linkage disequilibrium (LD) with the ACE insertion/deletion (I/D) polymorphism and accounting for half the ACE variance. In order to identify the functional variant at the molecular level, we compared ACE gene sequences between four subjects selected for having contrasted ACE levels and I/D genotypes. We identified 10 new polymorphisms, among which 8 were genotyped in 95 healthy nuclear families, in addition to the I/D polymorphism. These polymorphisms could be divided into two groups: five polymorphisms in the 5′ region and three in the coding sequence and the 3′ UTR. Within each group, polymorphisms were in nearly complete association, whereas polymorphisms from the two groups were in strong negative LD. After adjustment for the I/D polymorphism, all polymorphisms of the 5′ group remained significantly associated with ACE levels, which suggests the existence of two quantitative trait loci (QTL) acting additively on ACE levels. Segregation-linkage analyses including one or two ACE-linked QTLs in LD with two ACE markers were performed to test this hypothesis. The two QTLs and the two markers were assumed to be in complete LD. Results supported the existence of two ACE-linked QTLs, which would explain 38% and 49% of the ACE variance in parents and offspring, respectively. One of these QTLs might be the I/D polymorphism itself or the newly characterized 4656(CT)2/3 polymorphism. The second QTL would have a frequency of ~0.20, which is incompatible with any of the yet-identified polymorphisms. More extensive sequencing and extended analyses in larger samples and in other populations will be necessary to characterize definitely the functional variants. 30 refs., 1 fig., 6 tabs.
Identification of a Novel Rhabdovirus in Spodoptera frugiperda Cell Lines

PubMed Central

Ma, Hailun; Galvin, Teresa A.; Glasner, Dustin R.; Shaheduzzaman, Syed

2014-01-01

ABSTRACT The Sf9 cell line, derived from Spodoptera frugiperda, is used as a cell substrate for biological products, and no viruses have been reported in this cell line after extensive testing. We used degenerate PCR assays and massively parallel sequencing (MPS) to identify a novel RNA virus belonging to the order Mononegavirales in Sf9 cells. Sequence analysis of the assembled virus genome showed the presence of five open reading frames (ORFs) corresponding to the genes for the N, P, M, G, and L proteins in other rhabdoviruses and an unknown ORF of 111 amino acids located between the G- and L-protein genes. BLAST searches indicated that the S. frugiperda rhabdovirus (Sf-rhabdovirus) was related in a limited region of the L-protein gene to Taastrup virus, a newly discovered member of the Mononegavirales from a leafhopper (Hemiptera), and also to plant rhabdoviruses, particularly in the genus Cytorhabdovirus. Phylogenetic analysis of sequences in the L-protein gene indicated that Sf-rhabdovirus is a novel virus that branched with Taastrup virus. Rhabdovirus morphology was confirmed by transmission electron microscopy of filtered supernatant samples from Sf9 cells. Infectivity studies indicated potential transient infection by Sf-rhabdovirus in other insect cell lines, but there was no evidence of entry or virus replication in human cell lines. Sf-rhabdovirus sequences were also found in the Sf21 parental cell line of Sf9 cells but not in other insect cell lines, such as BT1-TN-5B1-4 (Tn5; High Five) cells and Schneider's Drosophila line 2 [D.Mel.(2); SL2] cells, indicating a species-specific infection. The results indicate that conventional methods may be complemented by state-of-the-art technologies with extensive bioinformatics analysis for identification of novel viruses. IMPORTANCE The Spodoptera frugiperda Sf9 cell line is used as a cell substrate for the development and manufacture of biological products. 
Extensive testing has not previously identified any viruses in this cell line. This paper reports on the identification and characterization of a novel rhabdovirus in Sf9 cells. This was accomplished through the use of next-generation sequencing platforms, de novo assembly tools, and extensive bioinformatics analysis. Rhabdovirus identification was further confirmed by transmission electron microscopy. Infectivity studies showed the lack of replication of Sf-rhabdovirus in human cell lines. The overall study highlights the use of a combinatorial testing approach including conventional methods and new technologies for evaluation of cell lines for unexpected viruses and use of comprehensive bioinformatics strategies for obtaining confident next-generation sequencing results. PMID:24672045
Evolution of meiotic recombination genes in maize and teosinte.

PubMed

Sidhu, Gaganpreet K; Warzecha, Tomasz; Pawlowski, Wojciech P

2017-01-25

Meiotic recombination is a major source of genetic variation in eukaryotes. The role of recombination in evolution is recognized but little is known about how evolutionary forces affect the recombination pathway itself. Although the recombination pathway is fundamentally conserved across different species, genetic variation in recombination components and outcomes has been observed. Theoretical predictions and empirical studies suggest that changes in the recombination pathway are likely to provide adaptive abilities to populations experiencing directional or strong selection pressures, such as those occurring during species domestication. We hypothesized that adaptive changes in recombination may be associated with adaptive evolution patterns of genes involved in meiotic recombination. To examine how maize evolution and domestication affected meiotic recombination genes, we studied patterns of sequence polymorphism and divergence in eleven genes controlling key steps in the meiotic recombination pathway in a diverse set of maize inbred lines and several accessions of teosinte, the wild ancestor of maize. We discovered that, even though the recombination genes generally exhibited high sequence conservation expected in a pathway controlling a key cellular process, they showed substantial levels and diverse patterns of sequence polymorphism. Among others, we found differences in sequence polymorphism patterns between tropical and temperate maize germplasms. Several recombination genes displayed patterns of polymorphism indicative of adaptive evolution. Despite their ancient origin and overall sequence conservation, meiotic recombination genes can exhibit extensive and complex patterns of molecular evolution. Changes in these genes could affect the functioning of the recombination pathway, and may have contributed to the successful domestication of maize and its expansion to new cultivation areas.
Spatio-temporal analysis of aftershock sequences in terms of Non Extensive Statistical Physics.

NASA Astrophysics Data System (ADS)

Chochlaki, Kalliopi; Vallianatos, Filippos

2017-04-01

Earth's seismicity is considered an extremely complicated process in which long-range interactions and fracturing exist (Vallianatos et al., 2016). For this reason, in order to analyze it, we use an innovative methodological approach introduced by Tsallis (Tsallis, 1988; 2009), named Non Extensive Statistical Physics. This approach introduces a generalization of Boltzmann-Gibbs statistical mechanics and is based on the definition of the Tsallis entropy Sq, whose maximization leads to the so-called q-exponential function, which expresses the probability distribution function that maximizes Sq. In the present work, we utilize the concept of Non Extensive Statistical Physics in order to analyze the spatiotemporal properties of several aftershock series. Marekova (Marekova, 2014) suggested that the probability densities of the inter-event distances between successive aftershocks follow a beta distribution. Using the same data set, we analyze the inter-event distance distribution of several aftershock sequences in different geographic regions by calculating non-extensive parameters that determine the behavior of the system and by fitting the q-exponential function, which expresses the degree of non-extensivity of the investigated system. Furthermore, the inter-event time distribution of the aftershocks as well as the frequency-magnitude distribution have been analyzed. The results support the applicability of Non Extensive Statistical Physics ideas to aftershock sequences, where a strong correlation exists along with memory effects. References: C. Tsallis, Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys. 52 (1988) 479-487. doi:10.1007/BF01016429. C. Tsallis, Introduction to nonextensive statistical mechanics: Approaching a complex world, 2009. doi:10.1007/978-0-387-85359-8. E. Marekova, Analysis of the spatial distribution between successive earthquakes in aftershocks series, Annals of Geophysics, 57, 5, 2014. doi:10.4401/ag-6556. F. Vallianatos, G. Papadakis, G. Michas, Generalized statistical mechanics approaches to earthquakes and tectonics, Proc. R. Soc. A, 472, 20160497, 2016.
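The Tsallis entropy and q-exponential named in the abstract above can be sketched numerically. The following is a minimal illustration in Python using the standard textbook definitions; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def q_exponential(x, q):
    """Tsallis q-exponential: [1 + (1 - q) x]^(1 / (1 - q)); reduces to exp(x) as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0  # standard cut-off for q < 1
        raise ValueError("x lies outside the domain of the q-exponential for q > 1")
    return base ** (1.0 / (1.0 - q))

def tsallis_entropy(probs, q, k=1.0):
    """Tsallis entropy S_q = k (1 - sum p_i^q) / (q - 1); Boltzmann-Gibbs form as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return -k * sum(p * math.log(p) for p in probs if p > 0.0)
    return k * (1.0 - sum(p ** q for p in probs)) / (q - 1.0)

# Hypothetical values: a heavier-than-exponential tail appears for q > 1.
tail_q = q_exponential(-1.0, 2.0)      # [1 + 1]^(-1)
tail_bg = q_exponential(-1.0, 1.0)     # ordinary exp(-1)
entropy_q2 = tsallis_entropy([0.5, 0.5], 2.0)
```

For q > 1 the q-exponential decays as a power law rather than exponentially, which is why it is a natural candidate for fitting inter-event distributions with long tails.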
Multiple Cis-acting elements modulate programmed -1 ribosomal frameshifting in Pea enation mosaic virus

PubMed Central

Gao, Feng; Simon, Anne E.

2016-01-01

Programmed -1 ribosomal frameshifting (-1 PRF) is used by many positive-strand RNA viruses for translation of required products. Despite extensive studies, it remains unresolved how cis-elements just downstream of the recoding site promote a precise level of frameshifting. The Umbravirus Pea enation mosaic virus RNA2 expresses its RNA polymerase by -1 PRF of the 5′-proximal ORF (p33). Three hairpins located in the vicinity of the recoding site are phylogenetically conserved among Umbraviruses. The central Recoding Stimulatory Element (RSE), located downstream of the p33 termination codon, is a large hairpin with two asymmetric internal loops. Mutational analyses revealed that sequences throughout the RSE and the RSE lower stem (LS) structure are important for frameshifting. SHAPE probing of mutants indicated the presence of higher order structure, and sequences in the LS may also adopt an alternative conformation. Long-distance pairing between the RSE and a 3′ terminal hairpin was less critical when the LS structure was stabilized. A basal level of frameshifting occurring in the absence of the RSE increases to 72% of wild-type when a hairpin upstream of the slippery site is also deleted. These results suggest that suppression of frameshifting may be needed in the absence of an active RSE conformation. PMID:26578603
The 2-micron plasmid as a nonselectable, stable, high copy number yeast vector

NASA Technical Reports Server (NTRS)

Ludwig, D. L.; Bruschi, C. V.

1991-01-01

The endogenous 2-microns plasmid of Saccharomyces cerevisiae has been used extensively for the construction of yeast cloning and expression plasmids because it is a native yeast plasmid that is able to be maintained stably in cells at high copy number. Almost invariably, these plasmid constructs, containing some or all 2-microns sequences, exhibit copy number levels lower than 2-microns and are maintained stably only under selective conditions. We were interested in determining if there was a means by which 2-microns could be utilized for vector construction, without forfeiting either copy number or nonselective stability. We identified sites in the 2-microns plasmid that could be used for the insertion of genetic sequences without disrupting 2-microns coding elements and then assessed subsequent plasmid constructs for stability and copy number in vivo. We demonstrate the utility of a previously described 2-microns recombination chimera, pBH-2L, for the manipulation and transformation of 2-microns as a pure yeast plasmid vector. We show that the HpaI site near the STB element in the 2-microns plasmid can be utilized to clone yeast DNA of at least 3.9 kb with no loss of plasmid stability. Additionally, the copy number of these constructs is as high as levels reported for the endogenous 2-microns.
Extensive computation of allowed and forbidden transition probabilities in the potassium isoelectronic sequence

NASA Astrophysics Data System (ADS)

Dixit, Gopal; Deshmukh, Pranawa C.; Manson, Steven T.; Majumder, Sonjoy

2007-06-01

Our primary aim in this work is to present both allowed and forbidden transition amplitudes and the corresponding wavelengths and oscillator strengths for a few ions in the 19-electron potassium isoelectronic sequence. All of these ions have the configuration [Ar] 3^2D3/2 as their ground state, except in the case of K and Ca^+, where it is [Ar] 4^2S1/2. This difference in ground state configuration arises due to strong contributions of correlation effects in the energy levels of these systems [1]. Allowed and forbidden transitions in these systems are of great importance in astrophysics [2] and in laboratory plasma research [3]. We apply in the present work the relativistic coupled-cluster (RCC) theory [4] to evaluate the energy levels and wave functions of these systems and study the electric and magnetic dipole transition amplitudes as well as the electric quadrupole transition amplitudes. The contributions of various electron correlation effects to the transition amplitudes are estimated in some detail using the RCC theory. [1] Gopal Dixit et al., Astrophys. J (submitted); arXiv.org: physics/0702066. [2] C. R. Cowley and G. M. Wahlgren, Astronomy & Astrophysics, 447, 681 (2002). [3] J. E. Vernazza, E. M. Reeves, Astrophys. J. Suppl. 37, 485 (1978). [4] I. Lindgren, Physica Scripta, 36, 591 (1987).
Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

PubMed Central

Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

2016-01-01

Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which, such as genes coding for restriction-modification systems, have previously been shown to be laterally transferable among different species. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from the genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of the order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141
Holocene sea-level variations and geomorphological response: An example from northern Brittany (France)

NASA Astrophysics Data System (ADS)

Regnauld, H.; Jennings, S.; Delaney, C.; Lemasson, L.

In northern Brittany an important geomorphological response to Holocene sea-level rise has been the development of coastal dunes with associated lagoons and marshes. At Anse du Verger, a marsh has formed behind a dune system which has been developing in situ for the last 4000 years. The lithostratigraphy of the marsh comprises extensive peat formation, with sands, silts and occasional sand lenses, the latter probably associated with storm surges. The sequence dates from 10,320±120 BP. After 3000 BP, flood episodes on the marsh are more common, while the upper marsh deposits can be correlated with the recent period of dune building. Prehistoric artifacts (remains of cooking implements) have been found on a cliff to the east of the marsh and are buried by washover deposits, which indicates a sudden abandonment of a settlement possibly due to a storm surge soon after 2460±80 BP. Surge levels are proposed as a controlling factor on dune crest elevation.

Dissipative N-point-vortex Models in the Plane

NASA Astrophysics Data System (ADS)

Shashikanth, Banavara N.

2010-02-01

A method is presented for constructing point vortex models in the plane that dissipate the Hamiltonian function at any prescribed rate and yet conserve the level sets of the invariants of the Hamiltonian model arising from the SE(2) symmetries. The method is purely geometric in that it uses the level sets of the Hamiltonian and the invariants to construct the dissipative field and is based on elementary classical geometry in ℝ³. Extension to higher-dimensional spaces, such as the point vortex phase space, is done using exterior algebra. The method is in fact general enough to apply to any smooth finite-dimensional system with conserved quantities, and, for certain special cases, the dissipative vector field constructed can be associated with an appropriately defined double Nambu-Poisson bracket. The most interesting feature of this method is that it allows for an infinite sequence of such dissipative vector fields to be constructed by repeated application of a symmetric linear operator (matrix) at each point of the intersection of the level sets.
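As a rough illustration of the geometric idea in three dimensions, one elementary way to decrease a function H while staying on a level set of an invariant C is the double cross product X = ∇C × (∇C × ∇H). The sketch below assumes this particular construction and uses hypothetical choices of H and C; it is not necessarily the vector field built in the paper.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def dissipative_field(grad_h, grad_c):
    """X = grad C x (grad C x grad H): tangent to level sets of C, never increases H."""
    return cross(grad_c, cross(grad_c, grad_h))

# Hypothetical example: H = (x^2 + 2 y^2 + 3 z^2) / 2 and C = (x^2 + y^2 + z^2) / 2.
def grad_h(p):
    return (p[0], 2.0 * p[1], 3.0 * p[2])

def grad_c(p):
    return p

point = (1.0, 2.0, 3.0)
x_field = dissipative_field(grad_h(point), grad_c(point))

rate_of_c = dot(x_field, grad_c(point))  # dC/dt along X, zero by construction
rate_of_h = dot(x_field, grad_h(point))  # dH/dt along X, non-positive by Cauchy-Schwarz
```

Expanding the double cross product gives dH/dt = (∇C·∇H)² − |∇C|²|∇H|², which is non-positive and vanishes only where the two gradients are parallel.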
From non-random molecular structure to life and mind

NASA Technical Reports Server (NTRS)

Fox, S. W.

1989-01-01

The evolutionary hierarchy molecular structure-->macromolecular structure-->protobiological structure-->biological structure-->biological functions has been traced by experiments. The sequence always moves through protein. Extension of the experiments traces the formation of nucleic acids instructed by proteins. The proteins themselves were, in this picture, instructed by the self-sequencing of precursor amino acids. While the sequence indicated explains the thread of the emergence of life, protein in cellular membrane also provides the only known material basis for the emergence of mind in the context of emergence of life.
Transcriptome sequencing of different narrow-leafed lupin tissue types provides a comprehensive uni-gene assembly and extensive gene-based molecular markers

PubMed Central

Kamphuis, Lars G; Hane, James K; Nelson, Matthew N; Gao, Lingling; Atkins, Craig A; Singh, Karam B

2015-01-01

Narrow-leafed lupin (NLL; Lupinus angustifolius L.) is an important grain legume crop that is valuable for sustainable farming and is becoming recognized as a human health food. NLL breeding is directed at improving grain production, disease resistance, drought tolerance and health benefits. However, genetic and genomic studies have been hindered by a lack of extensive genomic resources for the species. Here, the generation, de novo assembly and annotation of transcriptome datasets derived from five different NLL tissue types of the reference accession cv. Tanjil are described. The Tanjil transcriptome was compared to transcriptomes of an early domesticated cv. Unicrop, a wild accession P27255, as well as accession 83A:476, together being the founding parents of two recombinant inbred line (RIL) populations. In silico predictions for transcriptome-derived gene-based length and SNP polymorphic markers were conducted and corroborated using a survey assembly sequence for NLL cv. Tanjil. This yielded extensive indel and SNP polymorphic markers for the two RIL populations. A total of 335 transcriptome-derived markers and 66 BAC-end sequence-derived markers were evaluated, and 275 polymorphic markers were selected to genotype the reference NLL 83A:476 × P27255 RIL population. This significantly improved the completeness, marker density and quality of the reference NLL genetic map. PMID:25060816
Design and cloning strategies for constructing shRNA expression vectors

PubMed Central

McIntyre, Glen J; Fanning, Gregory C

2006-01-01

Background Short hairpin RNA (shRNA) encoded within an expression vector has proven an effective means of harnessing the RNA interference (RNAi) pathway in mammalian cells. A survey of the literature revealed that shRNA vector construction can be hindered by high mutation rates and the ensuing sequencing is often problematic. Current options for constructing shRNA vectors include the use of annealed complementary oligonucleotides (74 % of surveyed studies), a PCR approach using hairpin containing primers (22 %) and primer extension of hairpin templates (4 %). Results We considered primer extension the most attractive method in terms of cost. However, in initial experiments we encountered a mutation frequency of 50 % compared to a reported 20 – 40 % for other strategies. By modifying the technique to be an isothermal reaction using the DNA polymerase Phi29, we reduced the error rate to 10 %, making primer extension the most efficient and cost-effective approach tested. We also found that inclusion of a restriction site in the loop could be exploited for confirming construct integrity by automated sequencing, while maintaining intended gene suppression. Conclusion In this study we detail simple improvements for constructing and sequencing shRNA that overcome current limitations. We also compare the advantages of our solutions against proposed alternatives. Our technical modifications will be of tangible benefit to researchers looking for a more efficient and reliable shRNA construction process. PMID:16396676
Using hidden Markov models and observed evolution to annotate viral genomes.

PubMed

McCauley, Stephen; Hein, Jotun

2006-06-01

ssRNA (single-stranded RNA) viral genomes are generally constrained in length and utilize overlapping reading frames to maximally exploit the coding potential within the genome length restrictions. This overlapping coding phenomenon leads to complex evolutionary constraints operating on the genome. In regions which code for more than one protein, silent mutations in one reading frame generally have a protein coding effect in another. To maximize coding flexibility in all reading frames, overlapping regions are often compositionally biased towards amino acids which are 6-fold degenerate with respect to the 64 codon alphabet. Previous methodologies have used this fact in an ad hoc manner to look for overlapping genes by motif matching. In this paper, differentiated nucleotide compositional patterns in overlapping regions are incorporated into a probabilistic hidden Markov model (HMM) framework which is used to annotate ssRNA viral genomes. This work focuses on single sequence annotation and applies an HMM framework to ssRNA viral annotation. A description of how the HMM is parameterized whilst annotating within a missing data framework is given. A Phylogenetic HMM (Phylo-HMM) extension, as applied to 14 aligned HIV2 sequences, is also presented. This evolutionary extension serves as an illustration of the potential of the Phylo-HMM framework for ssRNA viral genomic annotation. The single sequence annotation procedure (SSA) is applied to 14 different strains of the HIV2 virus. Further results on alternative ssRNA viral genomes are presented to illustrate more generally the performance of the method. The results of the SSA method are encouraging; however, there is still room for improvement, and since there is overwhelming evidence to indicate that comparative methods can improve coding sequence (CDS) annotation, the SSA method is extended to a Phylo-HMM to incorporate evolutionary information. The Phylo-HMM extension is applied to the same set of 14 HIV2 sequences, which are pre-aligned. The performance improvement that results from including the evolutionary information in the analysis is illustrated.
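The general idea of annotating a single sequence with an HMM can be sketched with a generic Viterbi decoder. The example below is only a toy under stated assumptions: the two states ("single" for singly coding and "overlap" for overlapping regions), all probabilities, and the input sequence are hypothetical, and the model described in the abstract is considerably richer.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for the observations (computed in log space)."""
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in states}]
    back = []
    for symbol in observations[1:]:
        prev = scores[-1]
        current, pointers = {}, {}
        for s in states:
            # Best predecessor state for arriving in state s at this position.
            best = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            current[s] = (prev[best] + math.log(trans_p[best][s])
                          + math.log(emit_p[s][symbol]))
            pointers[s] = best
        scores.append(current)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical parameters: the "overlap" state is given a G/C-biased composition.
STATES = ("single", "overlap")
START = {"single": 0.9, "overlap": 0.1}
TRANS = {"single": {"single": 0.9, "overlap": 0.1},
         "overlap": {"single": 0.1, "overlap": 0.9}}
EMIT = {"single": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
        "overlap": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}

genome = "ATATGCGCGCGCATAT"  # made-up sequence
annotation = viterbi(genome, STATES, START, TRANS, EMIT)
```

Working in log space avoids numerical underflow on genome-length inputs; a Phylo-HMM would replace the per-nucleotide emission term with the likelihood of an alignment column under a phylogenetic model.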
Quantitative controls on location and architecture of carbonate depositional sequences: Upper Miocene, Cabo de Gata region, SE Spain

USGS Publications Warehouse

Franseen, E.K.; Goldstein, R.H.; Farr, M.R.

1997-01-01

Sequence stratigraphy, pinning-point relative sea-level curves, and magnetostratigraphy provide the quantitative data necessary to understand how rates of sea-level change and different substrate paleoslopes are dominant controls on accumulation rate, carbonate depositional sequence location, and internal architecture. Five third-order (1-10 my) and fourth-order (0.1-1.0 my) upper Miocene carbonate depositional sequences (DS1A, DS1B, DS2, DS3, TCC) formed with superimposed higher-frequency sea-level cycles in an archipelago setting in SE Spain. Overall, our study indicates when areas of high substrate slope (> 15°) are in shallow water, independent of climate, the location and internal architecture of carbonate deposits are not directly linked to sea-level position but, instead, are controlled by location of gently sloping substrates and processes of bypass. In contrast, if carbonate sediments are generated where substrates of low slope ( 15.6 cm/ky to ??? 2 cm/ky and overall relative sea level rose at rates of 17-21.4 cm/ky. Higher frequency sea-level rates were about 111 to more than 260 cm/ky, producing onlapping, fining- (deepening-) upward cycles. Decreasing accumulation rates resulted from decreasing surface area for shallow-water sediment production, drowning of shallow-water substrates, and complex sediment dispersal related to the archipelago setting. Typical systems tract and parasequence development should not be expected in "bypass ramp" settings; facies of onlapping strata do not track base level and are likely to be significantly different compared to onlapping strata associated with coastal onlap. Basal and upper DS2 reef megabreccias (indicating the transition from cool to warmer climatic conditions) were eroded from steep upslope positions and redeposited downslope onto areas of gentle substrate during rapid sea-level falls (> 22.7 cm/ky) of short duration.
Such rapid sea-level falls and presence of steep slopes are not conducive to formation of forced regressive systems tracts composed of down-stepping reef clinoforms. The DS3 reefal platform formed where shallow water coincided with gently sloping substrates created by earlier deposition. Slow progradation (0.39-1.45 km/my) is best explained by the lack of an extensive bank top, progressively falling sea level, and low productivity resulting from siliciclastic debris and excess nutrients shed from nearby volcanic islands. Although DS3 strata were deposited during a third-order relative sea-level cycle, a typical transgressive systems tract is not recognizable, indicating that the initial relative rise in sea level was too rapid (??? 19 cm/ky). Downstepping reefs, forming a forced regressive systems tract, were deposited during the relative sea-level fall at the end of DS3, indicating that relatively slow rates of fall (10 cm/ky or less) over favorable paleoslope conditions are conducive to generation of forced regressive systems tracts consisting of downstepping reef clinoforms. The TCC sequence consists of four shallow-water sedimentary cycles that were deposited during a 400 ky to 100 ky time span. Such shallow-water cycles, typical of many platforms, form only where shallow water intersects gently sloping substrates. The relative thicknesses of cycles (< 2 m to 15 m thick), magnitudes of relative sea-level fluctuations associated with each cycle (25-30 m), high rates of relative sea-level fluctuations (minimum of 25-120 cm/ky), and the widespread distribution of similar TCC cycles in the Mediterranean and elsewhere are supportive of a glacio-eustatic
Quantitative controls on location and architecture of carbonate depositional sequences: Upper Miocene, Cabo de Gata region, SE Spain

USGS Publications Warehouse

Franseen, E.K.; Goldstein, R.H.; Farr, M.R.

1998-01-01

Sequence stratigraphy, pinning-point relative sea-level curves, and magnetostratigraphy provide the quantitative data necessary to understand how rates of sea-level change and different substrate paleoslopes are dominant controls on accumulation rate, carbonate depositional sequence location, and internal architecture. Five third-order (1-10 my) and fourth-order (0.1-1.0 my) upper Miocene carbonate depositional sequences (DS1A, DS1B, DS2, DS3, TCC) formed with superimposed higher-frequency sea-level cycles in an archipelago setting in SE Spain. Overall, our study indicates when areas of high substrate slope (> 15°) are in shallow water, independent of climate, the location and internal architecture of carbonate deposits are not directly linked to sea-level position but, instead, are controlled by location of gently sloping substrates and processes of bypass. In contrast, if carbonate sediments are generated where substrates of low slope ( 15.6 cm/ky to ??? 2 cm/ky and overall relative sea level rose at rates of 17-21.4 cm/ky. Higher frequency sea-level rates were about 111 to more than 260 cm/ky, producing onlapping, fining- (deepening-) upward cycles. Decreasing accumulation rates resulted from decreasing surface area for shallow-water sediment production, drowning of shallow-water substrates, and complex sediment dispersal related to the archipelago setting. Typical systems tract and parasequence development should not be expected in "bypass ramp" settings; facies of onlapping strata do not track base level and are likely to be significantly different compared to onlapping strata associated with coastal onlap. Basal and upper DS2 reef megabreccias (indicating the transition from cool to warmer climatic conditions) were eroded from steep upslope positions and redeposited downslope onto areas of gentle substrate during rapid sea-level falls (> 22.7 cm/ky) of short duration.
Such rapid sea-level falls and presence of steep slopes are not conducive to formation of forced regressive systems tracts composed of downstepping reef clinoforms. The DS3 reefal platform formed where shallow water coincided with gently sloping substrates created by earlier deposition. Slow progradation (0.39-1.45 km/my) is best explained by the lack of an extensive bank top, progressively falling sea level, and low productivity resulting from siliciclastic debris and excess nutrients shed from nearby volcanic islands. Although DS3 strata were deposited during a third-order relative sea-level cycle, a typical transgressive systems tract is not recognizable, indicating that the initial relative rise in sea level was too rapid (??? 19 cm/ky). Downstepping reefs, forming a forced regressive systems tract, were deposited during the relative sea-level fall at the end of DS3, indicating that relatively slow rates of fall (10 cm/ky or less) over favorable paleoslope conditions are conducive to generation of forced regressive systems tracts consisting of downstepping reef clinoforms. The TCC sequence consists of four shallow-water sedimentary cycles that were deposited during a 400 ky to 100 ky time span. Such shallow-water cycles, typical of many platforms, form only where shallow water intersects gently sloping substrates. The relative thicknesses of cycles (< 2 m to 15 m thick), magnitudes of relative sea-level fluctuations associated with each cycle (25-30 m), high rates of relative sea-level fluctuations (minimum of 25-120 cm/ky), and the widespread distribution of similar TCC cycles in the Mediterranean and elsewhere are supportive of a glacio-eustati
Gene Expression Profiles in Paired Gingival Biopsies from Periodontitis-Affected and Healthy Tissues Revealed by Massively Parallel Sequencing

PubMed Central

Båge, Tove; Lagervall, Maria; Jansson, Leif; Lundeberg, Joakim; Yucel-Lindberg, Tülay

2012-01-01

Periodontitis is a chronic inflammatory disease affecting the soft tissue and bone that surrounds the teeth. Despite extensive research, distinctive genes responsible for the disease have not been identified. The objective of this study was to elucidate transcriptome changes in periodontitis, by investigating gene expression profiles in gingival tissue obtained from periodontitis-affected and healthy gingiva from the same patient, using RNA-sequencing. Gingival biopsies were obtained from a disease-affected and a healthy site from each of 10 individuals diagnosed with periodontitis. Enrichment analysis performed among uniquely expressed genes for the periodontitis-affected and healthy tissues revealed several regulated pathways indicative of inflammation for the periodontitis-affected condition. Hierarchical clustering of the sequenced biopsies demonstrated clustering according to the degree of inflammation, as observed histologically in the biopsies, rather than clustering at the individual level. Among the top 50 upregulated genes in periodontitis-affected tissues, we investigated two genes which have not previously been demonstrated to be involved in periodontitis. These included interferon regulatory factor 4 and chemokine (C-C motif) ligand 18, which were also expressed at the protein level in gingival biopsies from patients with periodontitis. In conclusion, this study provides a first step towards a quantitative comprehensive insight into the transcriptome changes in periodontitis. We demonstrate for the first time site-specific local variation in gene expression profiles of periodontitis-affected and healthy tissues obtained from patients with periodontitis, using RNA-seq. Further, we have identified novel genes expressed in periodontitis tissues, which may constitute potential therapeutic targets for future treatment strategies of periodontitis. PMID:23029519
Engineering Signal Peptides for Enhanced Protein Secretion from Lactococcus lactis

PubMed Central

Ng, Daphne T. W.

2013-01-01

Lactococcus lactis is an attractive vehicle for biotechnological production of proteins and clinical delivery of therapeutics. In many such applications using this host, it is desirable to maximize secretion of recombinant proteins into the extracellular space, which is typically achieved by using the native signal peptide from a major secreted lactococcal protein, Usp45. In order to further increase protein secretion from L. lactis, inherent limitations of the Usp45 signal peptide (Usp45sp) must be elucidated. Here, we performed extensive mutagenesis on Usp45sp to probe the effects of both the mRNA sequence (silent mutations) and the peptide sequence (amino acid substitutions) on secretion. We screened signal peptides based on their resulting secretion levels of Staphylococcus aureus nuclease and further evaluated them for secretion of Bacillus subtilis α-amylase. Silent mutations alone gave an increase of up to 16% in the secretion of α-amylase through a mechanism consistent with relaxed mRNA folding around the ribosome binding site and enhanced translation. Targeted amino acid mutagenesis in Usp45sp, combined with additional silent mutations from the best clone in the initial screen, yielded an increase of up to 51% in maximum secretion of α-amylase while maintaining secretion at lower induction levels. The best sequence from our screen preserves the tripartite structure of the native signal peptide but increases the positive charge of the n-region. Our study presents the first example of an engineered L. lactis signal peptide with a higher secretion yield than Usp45sp and, more generally, provides strategies for further enhancing protein secretion in bacterial hosts. PMID:23124224
Extensive de novo mutation rate variation between individuals and across the genome of Chlamydomonas reinhardtii

PubMed Central

Ness, Rob W.; Morgan, Andrew D.; Vasanthakrishnan, Radhakrishnan B.; Colegrave, Nick; Keightley, Peter D.

2015-01-01

Describing the process of spontaneous mutation is fundamental for understanding the genetic basis of disease, the threat posed by declining population size in conservation biology, and much of evolutionary biology. Directly studying spontaneous mutation has been difficult, however, because new mutations are rare. Mutation accumulation (MA) experiments overcome this by allowing mutations to build up over many generations in the near absence of natural selection. Here, we sequenced the genomes of 85 MA lines derived from six genetically diverse strains of the green alga Chlamydomonas reinhardtii. We identified 6843 new mutations, more than any other study of spontaneous mutation. We observed sevenfold variation in the mutation rate among strains and that mutator genotypes arose, increasing the mutation rate approximately eightfold in some replicates. We also found evidence for fine-scale heterogeneity in the mutation rate, with certain sequence motifs mutating at much higher rates, and clusters of multiple mutations occurring at closely linked sites. There was little evidence, however, for mutation rate heterogeneity between chromosomes or over large genomic regions of 200 kbp. We generated a predictive model of the mutability of sites based on their genomic properties, including local GC content, gene expression level, and local sequence context. Our model accurately predicted the average mutation rate and natural levels of genetic diversity of sites across the genome. Notably, trinucleotides vary 17-fold in rate between the most and least mutable sites. Our results uncover a rich heterogeneity in the process of spontaneous mutation both among individuals and across the genome. PMID:26260971
Engineering signal peptides for enhanced protein secretion from Lactococcus lactis.

PubMed

Ng, Daphne T W; Sarkar, Casim A

2013-01-01

Lactococcus lactis is an attractive vehicle for biotechnological production of proteins and clinical delivery of therapeutics. In many such applications using this host, it is desirable to maximize secretion of recombinant proteins into the extracellular space, which is typically achieved by using the native signal peptide from a major secreted lactococcal protein, Usp45. In order to further increase protein secretion from L. lactis, inherent limitations of the Usp45 signal peptide (Usp45sp) must be elucidated. Here, we performed extensive mutagenesis on Usp45sp to probe the effects of both the mRNA sequence (silent mutations) and the peptide sequence (amino acid substitutions) on secretion. We screened signal peptides based on their resulting secretion levels of Staphylococcus aureus nuclease and further evaluated them for secretion of Bacillus subtilis α-amylase. Silent mutations alone gave an increase of up to 16% in the secretion of α-amylase through a mechanism consistent with relaxed mRNA folding around the ribosome binding site and enhanced translation. Targeted amino acid mutagenesis in Usp45sp, combined with additional silent mutations from the best clone in the initial screen, yielded an increase of up to 51% in maximum secretion of α-amylase while maintaining secretion at lower induction levels. The best sequence from our screen preserves the tripartite structure of the native signal peptide but increases the positive charge of the n-region. Our study presents the first example of an engineered L. lactis signal peptide with a higher secretion yield than Usp45sp and, more generally, provides strategies for further enhancing protein secretion in bacterial hosts.
Complete Genome Sequence of the Avian Pathogenic Escherichia coli Strain APEC O78

PubMed Central

Mangiamele, Paul; Nicholson, Bryon; Wannemuehler, Yvonne; Seemann, Torsten; Logue, Catherine M.; Li, Ganwu; Tivendale, Kelly A.

2013-01-01

Colibacillosis, caused by avian pathogenic Escherichia coli (APEC), is a significant disease, causing extensive animal and financial losses globally. Because of the significance of this disease, more knowledge is needed regarding APEC's mechanisms of virulence. Here, we present the fully closed genome sequence of a typical avian pathogenic E. coli strain belonging to the serogroup O78. PMID:23516182
Whole-genome sequence of Escherichia coli serotype O157:H7 strain B6914-ARS

USDA-ARS's Scientific Manuscript database

Escherichia coli serotype O157:H7 strain B6914-MS1 is a Shiga toxin-deficient human fecal isolate obtained by the Centers for Disease Control and Prevention that has been used extensively in applied research studies. Here we report the genome sequence of strain B6914-ARS, a B6914-MS1 clone that has ...
Deciphering the complexities of the wheat flour proteome using quantitative two-dimensional electrophoresis, three proteases and tandem mass spectrometry

PubMed Central

2011-01-01

Background Wheat flour is one of the world's major food ingredients, in part because of the unique end-use qualities conferred by the abundant glutamine- and proline-rich gluten proteins. Many wheat flour proteins also present dietary problems for consumers with celiac disease or wheat allergies. Despite the importance of these proteins it has been particularly challenging to use MS/MS to distinguish the many proteins in a flour sample and relate them to gene sequences. Results Grain from the extensively characterized spring wheat cultivar Triticum aestivum 'Butte 86' was milled to white flour from which proteins were extracted, then separated and quantified by 2-DE. Protein spots were identified by separate digestions with three proteases, followed by tandem mass spectrometry analysis of the peptides. The spectra were used to interrogate an improved protein sequence database and results were integrated using the Scaffold program. Inclusion of cultivar specific sequences in the database greatly improved the results, and 233 spots were identified, accounting for 93.1% of normalized spot volume. Identified proteins were assigned to 157 wheat sequences, many for proteins unique to wheat and nearly 40% from Butte 86. Alpha-gliadins accounted for 20.4% of flour protein, low molecular weight glutenin subunits 18.0%, high molecular weight glutenin subunits 17.1%, gamma-gliadins 12.2%, omega-gliadins 10.5%, amylase/protease inhibitors 4.1%, triticins 1.6%, serpins 1.6%, purinins 0.9%, farinins 0.8%, beta-amylase 0.5%, globulins 0.4%, other enzymes and factors 1.9%, and all other 3%. Conclusions This is the first successful effort to identify the majority of abundant flour proteins for a single wheat cultivar, relate them to individual gene sequences and estimate their relative levels. Many genes for wheat flour proteins are not expressed, so this study represents further progress in describing the expressed wheat genome. 
Use of cultivar-specific contigs helped to overcome the difficulties of matching peptides to gene sequences for members of highly similar, rapidly evolving storage protein families. Prospects for simplifying this process for routine analyses are discussed. The ability to measure expression levels for individual flour protein genes complements information gained from efforts to sequence the wheat genome and is essential for studies of effects of environment on gene expression. PMID:21314956
The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific

PubMed Central

Rusch, Douglas B; Halpern, Aaron L; Sutton, Granger; Heidelberg, Karla B; Williamson, Shannon; Yooseph, Shibu; Wu, Dongying; Eisen, Jonathan A; Hoffman, Jeff M; Remington, Karin; Beeson, Karen; Tran, Bao; Smith, Hamilton; Baden-Tillson, Holly; Stewart, Clare; Thorpe, Joyce; Freeman, Jason; Andrews-Pfannkoch, Cynthia; Venter, Joseph E; Li, Kelvin; Kravitz, Saul; Heidelberg, John F; Utterback, Terry; Rogers, Yu-Hui; Falcón, Luisa I; Souza, Valeria; Bonilla-Rosso, Germán; Eguiarte, Luis E; Karl, David M; Sathyendranath, Shubha; Platt, Trevor; Bermingham, Eldredge; Gallardo, Victor; Tamayo-Castillo, Giselle; Ferrari, Michael R; Strausberg, Robert L; Nealson, Kenneth; Friedman, Robert; Frazier, Marvin; Venter, J. Craig

2007-01-01

The world's oceans contain a complex mixture of micro-organisms that are for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed “fragment recruitment,” addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed “extreme assembly,” made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference. 
We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS. PMID:17355176
Unique transposon landscapes are pervasive across Drosophila melanogaster genomes

PubMed Central

Rahman, Reazur; Chirn, Gung-wei; Kanodia, Abhay; Sytnikova, Yuliya A.; Brembs, Björn; Bergman, Casey M.; Lau, Nelson C.

2015-01-01

To understand how transposon landscapes (TLs) vary across animal genomes, we describe a new method called the Transposon Insertion and Depletion AnaLyzer (TIDAL) and a database of >300 TLs in Drosophila melanogaster (TIDAL-Fly). Our analysis reveals pervasive TL diversity across cell lines and fly strains, even for identically named sub-strains from different laboratories such as the ISO1 strain used for the reference genome sequence. On average, >500 novel insertions exist in every lab strain, inbred strains of the Drosophila Genetic Reference Panel (DGRP), and fly isolates in the Drosophila Genome Nexus (DGN). A minority (<25%) of transposon families comprise the majority (>70%) of TL diversity across fly strains. A sharp contrast between insertion and depletion patterns indicates that many transposons are unique to the ISO1 reference genome sequence. Although TL diversity from fly strains reaches asymptotic limits with increasing sequencing depth, rampant TL diversity causes unsaturated detection of TLs in pools of flies. Finally, we show novel transposon insertions negatively correlate with Piwi-interacting RNA (piRNA) levels for most transposon families, except for the highly-abundant roo retrotransposon. Our study provides a useful resource for Drosophila geneticists to understand how transposons create extensive genomic diversity in fly cell lines and strains. PMID:26578579
Nostoc thermotolerans sp. nov., a soil-dwelling species of Nostoc (Cyanobacteria).

PubMed

Suradkar, Archana; Villanueva, Chelsea; Gaysina, Lira A; Casamatta, Dale A; Saraf, Aniket; Dighe, Gandhali; Mergu, Ratnaprabha; Singh, Prashant

2017-05-01

A filamentous, soil-dwelling cyanobacterial strain (9C-PST) was isolated from Mandsaur, Madhya Pradesh, India, and is described as a new species of the genus Nostoc. Extensive morphological and molecular characterization along with a thorough assessment of ecology was performed. The style of filament orientation, type and nature of the sheath (e.g. distribution and visibility across the trichome), and vegetative and heterocyte cell dimensions and shape were assessed for over one year using both the laboratory grown culture and the naturally occurring samples. Sequencing of the 16S rRNA gene showed 94 % similarity with Nostoc piscinale CENA21 while analyses of the secondary structures of the 16S-23S ITS region showed unique folding patterns that differentiated this strain from other species of Nostoc. The level of rbcL and rpoC1 gene sequence similarity was 91 and 94 % to Nostoc sp. PCC 7524 and Nostoc piscinale CENA21, respectively, while the nifD gene sequence similarity was found to be 99 % with Nostoc piscinale CENA21. The phenotypic, ecological, genetic and phylogenetic observations indicate that the strain 9C-PST represents a novel species of the genus Nostoc with the name proposed being Nostoc thermotolerans sp. nov. according to the International Code of Nomenclature for Algae, Fungi, and Plants.
Genetic characterization of Anaplasma marginale strains from Tunisia using single and multiple gene typing reveals novel variants with an extensive genetic diversity.

PubMed

Ben Said, Mourad; Ben Asker, Alaa; Belkahia, Hanène; Ghribi, Raoua; Selmi, Rachid; Messadi, Lilia

2018-05-12

Anaplasma marginale, which is responsible for bovine anaplasmosis in tropical and subtropical regions, is a tick-borne obligatory intraerythrocytic bacterium of cattle and wild ruminants. In Tunisia, information about the genetic diversity and the phylogeny of A. marginale strains is limited to the msp4 gene analysis. The purpose of this study is to investigate A. marginale isolates infecting 16 cattle located in different bioclimatic areas of northern Tunisia with single gene analysis and multilocus sequence typing methods on the basis of seven partial genes (dnaA, ftsZ, groEL, lipA, secY, recA and sucB). The single gene analysis confirmed the presence of different and novel heterogenic A. marginale strains infecting cattle from the north of Tunisia. The concatenated sequence analysis showed a phylogeographical resolution at the global level and that most of the Tunisian sequence types (STs) formed a separate cluster from a South African isolate and from all New World isolates and strains. By combining the characteristics of each single locus with those of the multi-loci scheme, these results provide a more detailed understanding of the diversity and the evolution of Tunisian A. marginale strains. Copyright © 2018 Elsevier GmbH. All rights reserved.
Sequence-based characterization of Listeria monocytogenes strains isolated from domestic retail meat in the Tokyo metropolitan area of Japan.

PubMed

Yoshikawa, Yuko; Ochiai, Yoshitsugu; Mochizuki, Mariko; Takano, Takashi; Hondo, Ryo; Ueda, Fukiko

2018-05-31

To assess the level of Listeria monocytogenes contamination of domestic retail meat in Tokyo, Japan, we compared isolates from 2004 to 2007 with those isolated before 2003. The overall prevalence of L. monocytogenes among these samples significantly diminished over time (1998-2003, 28.0%; 2004-2007, 17.6%) reflecting a significant decrease in the frequency of contamination of beef. Serotype 1/2a was isolated most frequently, reflecting a change in the predominant serotype in pork from 1/2c to 1/2a. We performed a simple genetic subtyping method based on three genes, iap, sigB, and actA, as well as traditional multilocus sequence typing to classify the allele types (ATs). No extensive variation among sequence types was detected; however, increased genetic diversity among the ATs of the three genes in the 2004-2007 isolates was evident. We identified AT 26 of the iap gene, not previously reported in Japanese isolates, and six ATs of the sigB gene, including four with nonsense mutations not currently registered in L. monocytogenes DNA databases. sigB is an evolutionally conserved gene that plays a role in the stress response. Our results indicate that the sigB gene may be relatively unstable among L. monocytogenes strains circulating in Japan.
Sequence-based evidence for major histocompatibility complex-disassortative mating in a colonial seabird.

PubMed

Juola, Frans A; Dearborn, Donald C

2012-01-07

The major histocompatibility complex (MHC) is a polymorphic gene family associated with immune defence, and it can play a role in mate choice. Under the genetic compatibility hypothesis, females choose mates that differ genetically from their own MHC genotypes, avoiding inbreeding and/or enhancing the immunocompetence of their offspring. We tested this hypothesis of disassortative mating based on MHC genotypes in a population of great frigatebirds (Fregata minor) by sequencing the second exon of MHC class II B. Extensive haploid cloning yielded two to four alleles per individual, suggesting the amplification of two genes. MHC similarity between mates was not significantly different between pairs that did (n = 4) or did not (n = 42) exhibit extra-pair paternity. Comparing all 46 mated pairs to a distribution based on randomized re-pairings, we observed the following: (i) no evidence for mate choice based on maximal or intermediate levels of MHC allele sharing, (ii) significantly disassortative mating based on similarity of MHC amino acid sequences, and (iii) no evidence for mate choice based on microsatellite alleles, as measured by either allele sharing or similarity in allele size. This suggests that females choose mates that differ genetically from themselves at MHC loci, but not as an inbreeding-avoidance mechanism.

Menzerath-Altmann law in mammalian exons reflects the dynamics of gene structure evolution.

PubMed

Nikolaou, Christoforos

2014-12-01

Genomic sequences exhibit self-organization properties at various hierarchical levels. One such is the gene structure of higher eukaryotes with its complex exon/intron arrangement. Exon sizes and exon numbers in genes have been shown to conform to a law derived from statistical linguistics and formulated by Menzerath and Altmann, according to which the mean size of the constituents of an entity is inversely related to the number of these constituents. We herein perform a detailed analysis of this property in the complete exon set of the mouse genome in correlation to the sequence conservation of each exon and the transcriptional complexity of each gene locus. We show that extensive linear fits, representative of accordance to Menzerath-Altmann law are restricted to a particular subset of genes that are formed by exons under low or intermediate sequence constraints and have a small number of alternative transcripts. Based on this observation we propose a hypothesis for the law of Menzerath-Altmann in mammalian genes being predominantly due to genes that are more versatile in function and thus, more prone to undergo changes in their structure. To this end we demonstrate one test case where gene categories of different functionality also show differences in the extent of conformity to Menzerath-Altmann law. Copyright © 2014 Elsevier Ltd. All rights reserved.
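
The inverse relation described in this abstract can be illustrated with a small fit. The sketch below assumes a simplified power-law form of the Menzerath-Altmann law, mean size = a * count^b with b < 0, fitted by least squares in log-log space. The functional form, the function name `fit_power_law`, and the data values are illustrative assumptions, not the procedure or data of the study.

```python
import math

def fit_power_law(counts, mean_sizes):
    """Least-squares fit of log(mean_size) = log(a) + b * log(count).

    Returns (a, b). A negative b means that mean constituent size
    shrinks as the number of constituents grows.
    """
    xs = [math.log(c) for c in counts]
    ys = [math.log(s) for s in mean_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope in log-log space
    a = math.exp(mean_y - b * mean_x)  # intercept mapped back to linear scale
    return a, b

# Hypothetical data: number of exons per gene and mean exon size (bp).
exon_counts = [2, 4, 8, 16]
mean_exon_sizes = [400.0, 200.0, 100.0, 50.0]

a, b = fit_power_law(exon_counts, mean_exon_sizes)
print(f"a = {a:.1f}, b = {b:.2f}")
```

A gene set that conforms to the law would give a clearly negative exponent with small residuals; the abstract reports that such good fits are restricted to a subset of genes.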
Single-Molecule Counting of Point Mutations by Transient DNA Binding

NASA Astrophysics Data System (ADS)

Su, Xin; Li, Lidan; Wang, Shanshan; Hao, Dandan; Wang, Lei; Yu, Changyuan

2017-03-01

High-confidence detection of point mutations is important for disease diagnosis and clinical practice. Hybridization probes are extensively used, but are hindered by their poor single-nucleotide selectivity. Shortening the length of DNA hybridization probes weakens the stability of the probe-target duplex, leading to transient binding between complementary sequences. The kinetics of probe-target binding events are highly dependent on the number of complementary base pairs. Here, we present a single-molecule assay for point mutation detection based on transient DNA binding and use of total internal reflection fluorescence microscopy. Statistical analysis of single-molecule kinetics enabled us to effectively discriminate between wild type DNA sequences and single-nucleotide variants at the single-molecule level. A higher single-nucleotide discrimination is achieved than in our previous work by optimizing the assay conditions, which is guided by statistical modeling of kinetics with a gamma distribution. The KRAS c.34 A mutation can be clearly differentiated from the wild type sequence (KRAS c.34 G) at a relative abundance as low as 0.01% mutant to WT. To demonstrate the feasibility of this method for analysis of clinically relevant biological samples, we used this technology to detect mutations in single-stranded DNA generated from asymmetric RT-PCR of mRNA from two cancer cell lines.
[Engineered spider silk: the intelligent biomaterial of the future. Part I].

PubMed

Florczak, Anna; Piekoś, Konrad; Kaźmierska, Katarzyna; Mackiewicz, Andrzej; Dams-Kozłowska, Hanna

2011-06-17

The unique properties of spider silk such as strength, extensibility, toughness, biocompatibility and biodegradability are the reasons for the recent development in silk biomaterial technology. For a long time scientific progress was impeded by limited access to spider silk. However, the development of the molecular biology strategy was a breaking point in synthetic spider silk protein design. The sequences of engineered spider silk are based on the consensus motifs of the corresponding natural equivalents. Moreover, the engineered silk proteins may be modified in order to gain a new function. The strategy of the hybrid proteins constructed on the DNA level combines the sequence of engineered silk, which is responsible for the biomaterial structure, with the sequence of polypeptide which allows functionalization of the silk biomaterial. The functional domains may comprise receptor binding sites, enzymes, metal or sugar binding sites and others. Currently, advanced research is being conducted, which on the one hand focuses on establishing the particular silk structure and understanding the process of silk thread formation in nature. On the other hand, there are attempts to improve methods of engineered spider silk protein production. Due to acquired knowledge and recent progress in synthetic protein technology, the engineered silk will turn into an intelligent biomaterial of the future, while its industrial production scale will trigger a biotechnological revolution.
Einstein Observatory magnitude-limited X-ray survey of late-type giant and supergiant stars

NASA Technical Reports Server (NTRS)

Maggio, A.; Vaiana, G. S.; Haisch, B. M.; Stern, R. A.; Bookbinder, J.

1990-01-01

Results are presented of an extensive X-ray survey of 380 giant and supergiant stars of spectral types from F to M, carried out with the Einstein Observatory. It was found that the observed F giants or subgiants (slightly evolved stars with a mass M less than about 2 solar masses) are X-ray emitters at the same level as main-sequence stars of similar spectral type. The G giants show a range of emissions more than 3 orders of magnitude wide; some single G giants exist with X-ray luminosities comparable to RS CVn systems, while some nearby large G giants have upper limits on the X-ray emission below typical solar values. The K giants have an observed X-ray emission level significantly lower than F and G giants. None of the 29 M giants were detected, except for one spectroscopic binary.
The Arabidopsis thaliana mobilome and its impact at the species level.

PubMed

Quadrana, Leandro; Bortolini Silveira, Amanda; Mayhew, George F; LeBlanc, Chantal; Martienssen, Robert A; Jeddeloh, Jeffrey A; Colot, Vincent

2016-06-03

Transposable elements (TEs) are powerful motors of genome evolution yet a comprehensive assessment of recent transposition activity at the species level is lacking for most organisms. Here, using genome sequencing data for 211 Arabidopsis thaliana accessions taken from across the globe, we identify thousands of recent transposition events involving half of the 326 TE families annotated in this plant species. We further show that the composition and activity of the 'mobilome' vary extensively between accessions in relation to climate and genetic factors. Moreover, TEs insert equally throughout the genome and are rapidly purged by natural selection from gene-rich regions because they frequently affect genes, in multiple ways. Remarkably, loci controlling adaptive responses to the environment are the most frequent transposition targets observed. These findings demonstrate the pervasive, species-wide impact that a rich mobilome can have and the importance of transposition as a recurrent generator of large-effect alleles.
Assessing Date Palm Genetic Diversity Using Different Molecular Markers.

PubMed

Atia, Mohamed A M; Sakr, Mahmoud M; Adawy, Sami S

2017-01-01

Molecular marker technologies which rely on DNA analysis provide powerful tools to assess biodiversity at different levels, i.e., among and within species. A range of different molecular marker techniques have been developed and extensively applied for detecting variability in date palm at the DNA level. Recently, the employment of gene-targeting molecular marker approaches to study biodiversity and genetic variations in many plant species has increased the attention of researchers interested in date palm to carry out phylogenetic studies using these novel marker systems. Molecular markers are good indicators of genetic distances among accessions, because DNA-based markers are neutral in the face of selection. Here we describe the employment of multidisciplinary molecular marker approaches: amplified fragment length polymorphism (AFLP), start codon targeted (SCoT) polymorphism, conserved DNA-derived polymorphism (CDDP), intron-targeted amplified polymorphism (ITAP), simple sequence repeats (SSR), and random amplified polymorphic DNA (RAPD) to assess genetic diversity in date palm.
Theory of positive disintegration as a model of adolescent development.

PubMed

Laycraft, Krystyna

2011-01-01

This article introduces a conceptual model of the adolescent development based on the theory of positive disintegration combined with theory of self-organization. Dabrowski's theory of positive disintegration, which was created almost a half century ago, still attracts psychologists' and educators' attention, and is extensively applied into studies of gifted and talented people. The positive disintegration is the mental development described by the process of transition from lower to higher levels of mental life and stimulated by tension, inner conflict, and anxiety. This process can be modeled by a sequence of patterns of organization (attractors) as a developmental potential (a control parameter) changes. Three levels of disintegration (unilevel disintegration, spontaneous multilevel disintegration, and organized multilevel disintegration) are analyzed in detail and it is proposed that they represent behaviour of early, middle and late periods of adolescence. In the discussion, recent research on the adolescent brain development is included.
Hardware fault insertion and instrumentation system: Mechanization and validation

NASA Technical Reports Server (NTRS)

Benson, J. W.

1987-01-01

Automated test capability for extensive low-level hardware fault insertion testing is developed. The test capability is used to calibrate fault detection coverage and associated latency times as relevant to projecting overall system reliability. Described are modifications made to the NASA Ames Reconfigurable Flight Control System (RDFCS) Facility to fully automate the total test loop involving the Draper Laboratories' Fault Injector Unit. The automated capability provided included the application of sequences of simulated low-level hardware faults, the precise measurement of fault latency times, the identification of fault symptoms, and bulk storage of test case results. A PDP-11/60 served as a test coordinator, and a PDP-11/04 as an instrumentation device. The fault injector was controlled by applications test software in the PDP-11/60, rather than by manual commands from a terminal keyboard. The time base was especially developed for this application to use a variety of signal sources in the system simulator.
Development of ten microsatellite loci in the invasive giant African land snail, Achatina (=Lissachatina) fulica Bowdich, 1822

USGS Publications Warehouse

Morrison, Cheryl L.; Springmann, Marcus J.; Iwanowicz, Deborah D.; Wade, Christopher M.

2015-01-01

A suite of tetra-nucleotide microsatellite loci were developed for the invasive giant African land snail, Achatina (=Lissachatina) fulica Bowdich, 1822, from Ion Torrent next-generation sequencing data. Ten of the 96 primer sets tested amplified consistently in 30 snails from Miami, Florida, plus 12 individuals representative of their native East Africa, Indian and Pacific Ocean regions. The loci displayed moderate levels of allelic diversity (average 5.6 alleles/locus) and heterozygosity (average 42 %). Levels of genetic diversity were sufficient to produce unique multi-locus genotypes and detect phylogeographic structuring among regional samples. The invasive A. fulica can cause extensive damage to important food crops and natural resources, including native flora and fauna. The loci characterized here will be useful for determining the origins and tracking the spread of invasions, detecting fine-scale spatial structuring and estimating demographic parameters.
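
The per-locus statistics quoted in this abstract (alleles per locus and heterozygosity) can be computed from diploid genotypes as sketched below. The sketch assumes observed heterozygosity, i.e. the fraction of individuals carrying two different alleles; the abstract does not say whether observed or expected heterozygosity was reported. The function name `locus_summary` and the genotype values are hypothetical.

```python
def locus_summary(genotypes):
    """Summarize one microsatellite locus.

    genotypes: list of (allele_1, allele_2) tuples, one per individual,
    with alleles given as fragment sizes in base pairs.
    Returns (number of distinct alleles, observed heterozygosity).
    """
    alleles = {allele for genotype in genotypes for allele in genotype}
    heterozygotes = sum(1 for a1, a2 in genotypes if a1 != a2)
    return len(alleles), heterozygotes / len(genotypes)

# Hypothetical genotypes for four snails at a single locus.
locus = [(180, 184), (180, 180), (184, 188), (188, 188)]
n_alleles, h_obs = locus_summary(locus)
print(n_alleles, h_obs)
```

Averaging both values over all loci gives summary figures of the kind reported in the abstract.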
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis

PubMed Central

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

2012-01-01

RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or users can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications for sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence specific information such as total number of nucleotide bases, ATGC base contents along with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user friendly environment for sequence analysis. It is freely available. Availability http://www.cemb.edu.pk/sw.html Abbreviations RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
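
Several of the operations named in this abstract are short enough to sketch. The Python below is not RDNAnalyzer's code (the tool is written in C#) and does not reproduce its extensions; it is a minimal illustration of reverse-complement generation, base percentages, and the textbook Nussinov recurrence for maximizing base pairs. All function names and the `min_loop` parameter are assumptions made for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def base_percentages(seq):
    """Return the percentage of each of A, T, G and C in the sequence."""
    seq = seq.upper()
    return {base: 100.0 * seq.count(base) / len(seq) for base in "ATGC"}

def nussinov_max_pairs(seq, min_loop=0):
    """Maximum number of nested A-T / G-C pairs (textbook Nussinov).

    min_loop is the minimum number of unpaired bases that must lie
    between two paired positions.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: position j pairs with k
                if COMPLEMENT[seq[k]] == seq[j]:
                    left = dp[i][k - 1] if k > i else 0
                    inner = dp[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    best = max(best, left + 1 + inner)
            dp[i][j] = best
    return dp[0][n - 1]

print(reverse_complement("ATGC"))
print(base_percentages("AATG"))
print(nussinov_max_pairs("GGGAAACCC", min_loop=3))
```

The recurrence only counts pairs; recovering the actual pairing would need a traceback over the `dp` table, which is omitted here to keep the sketch short.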
Coarse-grained sequences for protein folding and design.

PubMed

Brown, Scott; Fawzi, Nicolas J; Head-Gordon, Teresa

2003-09-16

We present the results of sequence design on our off-lattice minimalist model in which no specification of native-state tertiary contacts is needed. We start with a sequence that adopts a target topology and build on it through sequence mutation to produce new sequences that comprise distinct members within a target fold class. In this work, we use the alpha/beta ubiquitin fold class and design two new sequences that, when characterized through folding simulations, reproduce the differences in folding mechanism seen experimentally for proteins L and G. The primary implication of this work is that patterning of hydrophobic and hydrophilic residues is the physical origin for the success of relative contact-order descriptions of folding, and that these physics-based potentials provide a predictive connection between free energy landscapes and amino acid sequence (the original protein folding problem). We present results of the sequence mapping from a 20- to the three-letter code for determining a sequence that folds into the WW domain topology to illustrate future extensions to protein design.
Coarse-grained sequences for protein folding and design

PubMed Central

Brown, Scott; Fawzi, Nicolas J.; Head-Gordon, Teresa

2003-01-01

We present the results of sequence design on our off-lattice minimalist model in which no specification of native-state tertiary contacts is needed. We start with a sequence that adopts a target topology and build on it through sequence mutation to produce new sequences that comprise distinct members within a target fold class. In this work, we use the α/β ubiquitin fold class and design two new sequences that, when characterized through folding simulations, reproduce the differences in folding mechanism seen experimentally for proteins L and G. The primary implication of this work is that patterning of hydrophobic and hydrophilic residues is the physical origin for the success of relative contact-order descriptions of folding, and that these physics-based potentials provide a predictive connection between free energy landscapes and amino acid sequence (the original protein folding problem). We present results of the sequence mapping from a 20- to the three-letter code for determining a sequence that folds into the WW domain topology to illustrate future extensions to protein design. PMID:12963815
Purification and Characterization of Four β-Expansins (Zea m 1 Isoforms) from Maize Pollen

PubMed Central

Li, Lian-Chao; Bedinger, Patricia A.; Volk, Carol; Jones, A. Daniel; Cosgrove, Daniel J.

2003-01-01

Four proteins with wall extension activity on grass cell walls were purified from maize (Zea mays) pollen by conventional column chromatography and high-performance liquid chromatography. Each is a basic glycoprotein (isoelectric point = 9.1–9.5) of approximately 28 kD and was identified by immunoblot analysis as an isoform of Zea m 1, the major group 1 allergen of maize pollen and member of the β-expansin family. Four distinctive cDNAs for Zea m 1 were identified by cDNA library screening and by GenBank analysis. One pair (GenBank accession nos. AY104999 and AY104125) was much closer in sequence to well-characterized allergens such as Lol p 1 and Phl p 1 from ryegrass (Lolium perenne) and Phleum pratense, whereas a second pair was much more divergent. The N-terminal sequence and mass spectrometry fingerprint of the most abundant isoform (Zea m 1d) matched that predicted for AY197353, whereas N-terminal sequences of the other isoforms matched or nearly matched AY104999 and AY104125. Highly purified Zea m 1d induced extension of a variety of grass walls but not dicot walls. Wall extension activity of Zea m 1d was biphasic with respect to protein concentration, had a broad pH optimum between 5 and 6, required more than 50 μg mL⁻¹ for high activity, and led to cell wall breakage after only approximately 10% extension. These characteristics differ from those of α-expansins. Some of the distinctive properties of Zea m 1 may not be typical of β-expansins as a class but may relate to the specialized function of this β-expansin in pollen function. PMID:12913162
A putative carbohydrate-binding domain of the lactose-binding Cytisus sessilifolius anti-H(O) lectin has a similar amino acid sequence to that of the L-fucose-binding Ulex europaeus anti-H(O) lectin.

PubMed

Konami, Y; Yamamoto, K; Osawa, T; Irimura, T

1995-04-01

The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).
Single-cell sequencing technologies: current and future.

PubMed

Liang, Jialong; Cai, Wanshi; Sun, Zhongsheng

2014-10-20

Intensively developed in the last few years, single-cell sequencing technologies now present numerous advantages over traditional sequencing methods for solving the problems of biological heterogeneity and low quantities of available biological materials. The application of single-cell sequencing technologies has profoundly changed our understanding of a series of biological phenomena, including gene transcription, embryo development, and carcinogenesis. However, before single-cell sequencing technologies can be used extensively, researchers face the serious challenge of overcoming inherent issues of high amplification bias, low accuracy and reproducibility. Here, we simply summarize the techniques used for single-cell isolation, and review the current technologies used in single-cell genomic, transcriptomic, and epigenomic sequencing. We discuss the merits, defects, and scope of application of single-cell sequencing technologies and then speculate on the direction of future developments. Copyright © 2014 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
Evidence of automatic processing in sequence learning using process-dissociation

PubMed Central

Mong, Heather M.; McCabe, David P.; Clegg, Benjamin A.

2012-01-01

This paper proposes a way to apply process-dissociation to sequence learning in addition and extension to the approach used by Destrebecqz and Cleeremans (2001). Participants were trained on two sequences separated from each other by a short break. Following training, participants self-reported their knowledge of the sequences. A recognition test was then performed which required discrimination of two trained sequences, either under the instructions to call any sequence encountered in the experiment “old” (the inclusion condition), or only sequence fragments from one half of the experiment “old” (the exclusion condition). The recognition test elicited automatic and controlled process estimates using the process dissociation procedure, and suggested both processes were involved. Examining the underlying processes supporting performance may provide more information on the fundamental aspects of the implicit and explicit constructs than has been attainable through awareness testing. PMID:22679465
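
The process dissociation procedure mentioned in this abstract is usually computed from the proportion of "old" responses under the two instruction conditions. The sketch below uses the standard equations, inclusion = C + A(1 - C) and exclusion = A(1 - C), where C is the controlled and A the automatic estimate. Whether the study applied exactly this form is an assumption, since the abstract gives no estimation details, and the function name and input values are hypothetical.

```python
def process_dissociation(p_inclusion, p_exclusion):
    """Estimate controlled (C) and automatic (A) contributions.

    Standard process dissociation equations:
        inclusion = C + A * (1 - C)
        exclusion = A * (1 - C)
    so C = inclusion - exclusion and A = exclusion / (1 - C).
    """
    controlled = p_inclusion - p_exclusion
    if controlled >= 1.0:
        raise ValueError("automatic estimate is undefined when C equals 1")
    automatic = p_exclusion / (1.0 - controlled)
    return controlled, automatic

# Hypothetical proportions of "old" responses in the two conditions.
c_est, a_est = process_dissociation(p_inclusion=0.8, p_exclusion=0.3)
print(f"controlled = {c_est:.2f}, automatic = {a_est:.2f}")
```

Estimates of A and C that are both above zero would correspond to the abstract's conclusion that both processes contribute to recognition performance.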
Extensive Copy Number Variation in Fermentation-Related Genes Among Saccharomyces cerevisiae Wine Strains.

PubMed

Steenwyk, Jacob; Rokas, Antonis

2017-05-05

Due to the importance of Saccharomyces cerevisiae in wine-making, the genomic variation of wine yeast strains has been extensively studied. One of the major insights stemming from these studies is that wine yeast strains harbor low levels of genetic diversity in the form of single nucleotide polymorphisms (SNPs). Genomic structural variants, such as copy number (CN) variants, are another major type of variation segregating in natural populations. To test whether genetic diversity in CN variation is also low across wine yeast strains, we examined genome-wide levels of CN variation in 132 whole-genome sequences of S. cerevisiae wine strains. We found an average of 97.8 CN variable regions (CNVRs) affecting ∼4% of the genome per strain. Using two different measures of CN diversity, we found that gene families involved in fermentation-related processes such as copper resistance (CUP), flocculation (FLO), and glucose metabolism (HXT), as well as the SNO gene family whose members are expressed before or during the diauxic shift, showed substantial CN diversity across the 132 strains examined. Importantly, these same gene families have been shown, through comparative transcriptomic and functional assays, to be associated with adaptation to the wine fermentation environment. Our results suggest that CN variation is a substantial contributor to the genomic diversity of wine yeast strains, and identify several candidate loci whose levels of CN variation may affect the adaptation and performance of wine yeast strains during fermentation. Copyright © 2017 Steenwyk and Rokas.
Application of Genotyping during an Extensive Outbreak of Waterborne Giardiasis in Bergen, Norway, during Autumn and Winter 2004†

PubMed Central

Robertson, L. J.; Hermansen, L.; Gjerde, B. K.; Strand, E.; Alvsvåg, J. O.; Langeland, N.

2006-01-01

During the autumn and winter of 2004 and 2005, an extensive outbreak of waterborne giardiasis occurred in Bergen, Norway. Over 1,500 patients were diagnosed with giardiasis. Analysis of water from the implicated source revealed low numbers of Giardia cysts, but the initial contamination event probably occurred up to 10 weeks previously. While sewage leakage from a residential area is now considered to be the probable source of contamination, during the episode waste from one particular septic tank was thought to be a possible source. Genotyping of cysts from the septic tank demonstrated that they were assemblage A cysts, although the sequences were not identical to any previously published sequences. For the β-giardin gene, the closest published subgenotype was subgenotype A3; for the gdh gene, the closest published subgenotype was subgenotype A2. Genotyping of cysts from 21 patient samples revealed that they were assemblage B cysts; thus, the septic tank was unlikely to be the contamination source. Sequencing of the β-giardin and gdh genes from patient samples and a comparison of the sequences gave complex results. For the β-giardin gene, three isolates had sequences identical to subgenotype B3 sequences. However, other isolates had between one and four single-nucleotide polymorphisms (SNPs). For the gdh gene, none of the sequences were identical to the sequence published for subgenotype B3, and the sequences had between one and three SNPs. One isolate, which was identical to subgenotype B3 at the β-giardin gene, was more similar to subgenotype B2 at the gdh gene. Grouping the isolates on the basis of SNPs resulted in different groups for the two genes. The results are discussed in relation to giardiasis in Norway and to other Giardia genotyping studies. PMID:16517674
Blind Predictions of DNA and RNA Tweezers Experiments with Force and Torque

PubMed Central

Chou, Fang-Chieh; Lipfert, Jan; Das, Rhiju

2014-01-01

Single-molecule tweezers measurements of double-stranded nucleic acids (dsDNA and dsRNA) provide unprecedented opportunities to dissect how these fundamental molecules respond to forces and torques analogous to those applied by topoisomerases, viral capsids, and other biological partners. However, tweezers data are still most commonly interpreted post facto in the framework of simple analytical models. Testing falsifiable predictions of state-of-the-art nucleic acid models would be more illuminating but has not been performed. Here we describe a blind challenge in which numerical predictions of nucleic acid mechanical properties were compared to experimental data obtained recently for dsRNA under applied force and torque. The predictions were enabled by the HelixMC package, first presented in this paper. HelixMC advances crystallography-derived base-pair level models (BPLMs) to simulate kilobase-length dsDNAs and dsRNAs under external forces and torques, including their global linking numbers. These calculations recovered the experimental bending persistence length of dsRNA within the error of the simulations and accurately predicted that dsRNA's “spring-like” conformation would give a two-fold decrease of stretch modulus relative to dsDNA. Further blind predictions of helix torsional properties, however, exposed inaccuracies in current BPLM theory, including three-fold discrepancies in torsional persistence length at the high force limit and the incorrect sign of dsRNA link-extension (twist-stretch) coupling. Beyond these experiments, HelixMC predicted that ‘nucleosome-excluding’ poly(A)/poly(T) is at least two-fold stiffer than random-sequence dsDNA in bending, stretching, and torsional behaviors; Z-DNA to be at least three-fold stiffer than random-sequence dsDNA, with a near-zero link-extension coupling; and non-negligible effects from base pair step correlations. We propose that experimentally testing these predictions should be powerful next steps for understanding the flexibility of dsDNA and dsRNA in sequence contexts and under mechanical stresses relevant to their biology. PMID:25102226
Impact of alemtuzumab on HIV persistence in an HIV-infected individual on antiretroviral therapy with Sezary syndrome.

PubMed

Rasmussen, Thomas A; McMahon, James; Chang, J Judy; Symons, Jori; Roche, Michael; Dantanarayana, Ashanti; Okoye, Afam; Hiener, Bonnie; Palmer, Sarah; Lee, Wen Shi; Kent, Stephen J; Van Der Weyden, Carrie; Prince, H Miles; Cameron, Paul U; Lewin, Sharon R

2017-08-24

To study the effects of alemtuzumab on HIV persistence in an HIV-infected individual on antiretroviral therapy (ART) with Sezary syndrome, a rare malignancy of CD4 T cells. Case report. Blood was collected 30 and 18 months prior to presentation with Sezary syndrome, at the time of presentation and during alemtuzumab. T-cell subsets in malignant (CD7-CD26-TCR-VBeta2+) and nonmalignant cells were quantified by flow cytometry. HIV-DNA in total CD4 T cells, in sorted malignant and nonmalignant CD4 T cells, was quantified by PCR and clonal expansion of HIV-DNA assessed by full-length next-generation sequencing. HIV-hepatitis B virus coinfection was diagnosed and antiretroviral therapy initiated 4 years prior to presentation with Sezary syndrome and primary cutaneous anaplastic large cell lymphoma. The patient received alemtuzumab 10 mg three times per week for 4 weeks but died 6 weeks post alemtuzumab. HIV-DNA was detected in nonmalignant but not in malignant CD4 T cells, consistent with expansion of a noninfected CD4 T-cell clone. Full-length HIV-DNA sequencing demonstrated multiple defective viruses but no identical or expanded sequences. Alemtuzumab extensively depleted T cells, including more than 1 log reduction in total T cells and more than 3 log reduction in CD4 T cells. Finally, alemtuzumab decreased HIV-DNA in CD4 T cells by 57% but HIV-DNA remained detectable at low levels even after depletion of nearly all CD4 T cells. Alemtuzumab extensively depleted multiple T-cell subsets and decreased the frequency of but did not eliminate HIV-infected CD4 T cells. Studying the effects on HIV persistence following immune recovery in HIV-infected individuals who require alemtuzumab for malignancy or in animal studies may provide further insights into novel cure strategies.

Formation and tectonic evolution of the Pattani Basin, Gulf of Thailand

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bustin, R.M.; Chonchawalit, A.

The stratigraphic and structural evolution of the Pattani Basin, the most prolific petroleum basin in Thailand, reflects the extensional tectonic regime of continental Southeast Asia. E-W extension resulting from the northward collision of India with Eurasia since the Early Tertiary resulted in the formation of a series of N-S-trending sedimentary basins, which include the Pattani Basin. The sedimentary succession in the Pattani Basin is divisible into synrift and postrift sequences. Deposition of the synrift sequence accompanied rifting and extension, with episodic block faulting and rapid subsidence. The synrift sequence comprises three stratigraphic units: (1) Upper Eocene to Lower Oligocene alluvial-fan, braided-river, and floodplain deposits; (2) Upper Oligocene to Lower Miocene floodplain and channel deposits; and (3) a Lower Miocene regressive package consisting of marine to nonmarine sediments. The post-rift succession comprises: (1) a Lower to Middle Miocene regressive package of shallow marine sediments through floodplain and channel deposits; (2) an upper Lower Miocene transgressive sequence; and (3) an Upper Miocene to Pleistocene transgressive succession. The post-rift phase is characterized by slower subsidence and decreased sediment influx. The present-day shallow-marine condition in the Gulf of Thailand is the continuation of this latest transgressive phase. The subsidence and thermal history of the Pattani Basin is consistent with a nonuniform lithospheric-stretching model. The amount of extension as well as surface heat flow generally increases from the margin to the basin center. The crustal stretching factor (β) varies from 1.3 at the basin margin to 2.8 in the center. The subcrustal stretching factor (δ) ranges from 1.3 at the basin margin to more than 3.0 in the basin center. 31 refs., 13 figs., 4 tabs.
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis

NASA Astrophysics Data System (ADS)

Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.

1998-03-01

Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS to the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis, such as cystic fibrosis caused by base deletion and point mutation, have also been demonstrated. Experimental details will be presented in the meeting.
A middle Pleistocene through middle Miocene moraine sequence in the central Transantarctic Mountains, Antarctica

NASA Astrophysics Data System (ADS)

Balter, A.; Bromley, G. R.; Balco, G.; Thomas, H.; Jackson, M. S.

2017-12-01

Ice-free areas at high elevation in the central Transantarctic Mountains preserve extensive moraine sequences and drift deposits that comprise a geologic record of former East Antarctic Ice Sheet thickness and extent. We are applying cosmogenic-nuclide exposure dating to determine the ages of these moraine sequences at Roberts Massif and Otway Massif, at the heads of the Shackleton and Beardmore Glaciers, respectively. Moraines at these sites are for the most part openwork boulder belts characteristic of deposition by cold-based ice, which is consistent with present climate and glaciological conditions. To develop our chronology, we collected samples from 30 distinct ice-marginal landforms and have so far measured >100 ³He, ¹⁰Be, and ²¹Ne exposure ages. Apparent exposure ages range from 1-14 Ma, which shows that these landforms record glacial events between the middle Pleistocene and middle Miocene. These data show that the thickness of the East Antarctic Ice Sheet in this region was similar to or thicker than present for long periods between the middle Miocene and today. The time range represented by these moraine sequences indicates that they may also provide direct geologic evidence for East Antarctic Ice Sheet behavior during past periods of warmer-than-present climate, specifically the Miocene and Pliocene. As the East Antarctic Ice Sheet is the largest ice sheet on Earth, understanding its sensitivity to warm-climate conditions is critical for projections of ice sheet behavior and sea-level rise in future warm climates.
High-throughput sequence-based analysis of the bacterial composition of kefir and an associated kefir grain.

PubMed

Dobson, Alleson; O'Sullivan, Orla; Cotter, Paul D; Ross, Paul; Hill, Colin

2011-07-01

Lacticin 3147 is a two-peptide broad spectrum lantibiotic produced by Lactococcus lactis DPC3147 shown to inhibit a number of clinically relevant Gram-positive pathogens. Initially isolated from an Irish kefir grain, lacticin 3147 is one of the most extensively studied lantibiotics to date. In this study, the bacterial diversity of the Irish kefir grain from which L. lactis DPC3147 was originally isolated was for the first time investigated using a high-throughput parallel sequencing strategy. A total of 17 416 unique V4 variable regions of the 16S rRNA gene were analysed from both the kefir starter grain and its derivative kefir-fermented milk. Firmicutes (which includes the lactic acid bacteria) was the dominant phylum accounting for > 92% of sequences. Within the Firmicutes, dramatic differences in abundance were observed when the starter grain and kefir milk fermentate were compared. The kefir grain-associated bacterial community was largely composed of the Lactobacillaceae family while Streptococcaceae (primarily Lactococcus spp.) was the dominant family within the kefir milk fermentate. Sequencing data confirmed previous findings that the microbiota of kefir milk and the starter grain are quite different while at the same time, establishing that the microbial diversity of the starter grain is not uniform with a greater level of diversity associated with the interior kefir starter grain compared with the exterior. © 2011 Teagasc Food Research Centre, Moorepark. FEMS Microbiology Letters © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd.
Structural and Functional Insights from the Metagenome of an Acidic Hot Spring Microbial Planktonic Community in the Colombian Andes

PubMed Central

Jiménez, Diego Javier; Andreote, Fernando Dini; Chaves, Diego; Montaña, José Salvador; Osorio-Forero, Cesar; Junca, Howard; Zambrano, María Mercedes; Baena, Sandra

2012-01-01

A taxonomic and annotated functional description of microbial life was deduced from 53 Mb of metagenomic sequence retrieved from a planktonic fraction of the Neotropical high Andean (3,973 meters above sea level) acidic hot spring El Coquito (EC). A classification of unassembled metagenomic reads using different databases showed a high proportion of Gammaproteobacteria and Alphaproteobacteria (in total read affiliation), and through taxonomic affiliation of 16S rRNA gene fragments we observed the presence of Proteobacteria, micro-algae chloroplast and Firmicutes. Reads mapped against the genomes Acidiphilium cryptum JF-5, Legionella pneumophila str. Corby and Acidithiobacillus caldus revealed the presence of transposase-like sequences, potentially involved in horizontal gene transfer. Functional annotation and hierarchical comparison with different datasets obtained by pyrosequencing in different ecosystems showed that the microbial community also contained extensive DNA repair systems, possibly to cope with ultraviolet radiation at such high altitudes. Analysis of genes involved in the nitrogen cycle indicated the presence of dissimilatory nitrate reduction to N2 (narGHI, nirS, norBCDQ and nosZ), associated with Proteobacteria-like sequences. Genes involved in the sulfur cycle (cysDN, cysNC and aprA) indicated adenylsulfate and sulfite production that were affiliated to several bacterial species. In summary, metagenomic sequence data provided insight regarding the structure and possible functions of this hot spring microbial community, describing some groups potentially involved in the nitrogen and sulfur cycling in this environment. PMID:23251687
Force measurements on the molecular interactions between ligand (RGD) and human platelet αIIbβ3 receptor system

NASA Astrophysics Data System (ADS)

Lee, ImShik; Marchant, Roger E.

2001-10-01

The peptide sequence arginine-glycine-aspartate (RGD) found in fibrinogen, von Willebrand factor, fibronectin, and vitronectin, plays a critical role in platelet adhesion and thrombus formation, when bound to the platelet αIIbβ3 integrin receptor. Using atomic force microscopy (AFM), we have measured the debonding interaction between an RGD peptide-modified AFM probe tip and a human platelet surface from pN to nN levels of force. The peptide sequence, GSSSGRGDSPA, which contains the biologically active RGDSP sequence with a hydrophilic spacer sequence (GSSSG), was covalently coupled to AFM probe tips. Direct measurements on the debonding force for the RGD ligand-αIIbβ3 platelet receptor system were carried out in Tyrode buffer at room temperature. Our results show three distinct distributions of debonding forces at a loading rate of 12 nN/s, from which we estimate the debonding force for the single ligand-receptor to be ~93 pN. The results also show evidence for considerable extension in the flexible sample surface during the debonding process, and a linear correlation between the debonding force and the logarithm of the rate of loading. From our analysis, the zero kinetic off-rate Koff(0), the single molecular binding energy Eb, and the transition state xB, assuming rigid binding, were extracted from the data, and estimated to be 22.6 s⁻¹, −2.64×10⁻²⁰ J and 0.1 nm, respectively.
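A linear dependence of debonding force on the logarithm of loading rate is what the widely used Bell-Evans model of forced unbinding predicts. The abstract does not say which model the authors fitted, so the expression below is only a sketch under that assumption, with F* the most probable debonding force, r the loading rate, and k_B T the thermal energy (symbols not given in the abstract); Koff(0) and xB are as named above.

```latex
F^{*} \;=\; \frac{k_{B}T}{x_{B}}\,
\ln\!\left(\frac{r\,x_{B}}{K_{\mathrm{off}}(0)\,k_{B}T}\right)
```

Under this form, the slope of F* against ln r gives k_B T / x_B, and the intercept then fixes Koff(0).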
Comparative transcriptome sequencing and de novo analysis of Vaccinium corymbosum during fruit and color development.

PubMed

Li, Lingli; Zhang, Hehua; Liu, Zhongshuai; Cui, Xiaoyue; Zhang, Tong; Li, Yanfang; Zhang, Lingyun

2016-10-12

Blueberry is an economically important fruit crop in the Ericaceae family. The substantial quantities of flavonoids in blueberry have been implicated in a broad range of health benefits. However, transcriptome-level information regarding fruit development and flavonoid metabolites is still limited. In the present study, transcriptome and gene expression profiling over berry development, especially during color development, was carried out. A total of approximately 13.67 Gbp of data were obtained and assembled into 186,962 transcripts and 80,836 unigenes from three stages of blueberry fruit and color development. A large number of simple sequence repeats (SSRs) and candidate genes, which are potentially involved in plant development, metabolic and hormone pathways, were identified. A total of 6429 sequences containing 8796 SSRs were characterized from 15,457 unigenes and 1763 unigenes contained more than one SSR. The expression profiles of key genes involved in anthocyanin biosynthesis were also studied. In addition, a comparison between our dataset and other published results was carried out. The high quality reads produced in this study are an important advancement and provide a new resource for the interpretation of high-throughput data for blueberry species, in terms of both sequencing depth and species coverage. This transcriptome data will serve as a valuable public information database for studies of the blueberry genome and would greatly boost research on fruit and color development, flavonoid metabolism and regulation, and the breeding of more healthful blueberries.
Inter-Species Grafting Caused Extensive and Heritable Alterations of DNA Methylation in Solanaceae Plants

PubMed Central

Lin, Yan; Ma, Yiqiao; Liu, Gang; Yu, Xiaoming; Zhong, Silin; Liu, Bao

2013-01-01

Background: Grafting has been extensively used to enhance the performance of horticultural crops. Since Charles Darwin coined the term “graft hybrid”, meaning that asexual combination of different plant species may generate products that are genetically distinct, highly discrepant opinions have existed for and against the concept. Recent studies have documented that grafting enables exchanges of both RNA and DNA molecules between the grafting partners, thus providing a molecular basis for grafting-induced genetic variation. DNA methylation is known to be prone to alterations as a result of perturbation of internal and external conditions. Given the characteristics of grafting, it is interesting to test whether the process may cause an alteration of this epigenetic marker in the grafted organismal products. Methodology/Principal Findings: We analyzed relative global DNA methylation levels and locus-specific methylation patterns by the MSAP marker and locus-specific bisulfite-sequencing in the seed plants (wild-type controls), self- and hetero-grafted scions/rootstocks, selfed progenies of scions and their seed-plant controls, involving three Solanaceae species. We quantified expression of putative genes involved in establishing and/or maintaining DNA methylation by q-(RT)-PCR. We found that (1) hetero-grafting caused extensive alteration of DNA methylation patterns in a locus-specific manner, especially in scions, although relative methylation levels remain largely unaltered; (2) the altered methylation patterns in the hetero-grafting-derived scions could be inherited by sexual progenies with some sites showing further alterations or revisions; (3) hetero-grafting caused dynamic changes in steady-state transcript abundance of genes encoding a set of enzymes functionally relevant to DNA methylation. Conclusions/Significance: Our results demonstrate that inter-species grafting in plants could produce extensive and heritable alterations in DNA methylation. We suggest that these readily altered, yet heritable, epigenetic modifications due to interspecies hetero-grafting may shed one facet of insight into the molecular underpinnings for the still contentious concept of graft hybrid. PMID:23614002
Extensive characterization of Tupaia belangeri neuropeptidome using an integrated mass spectrometric approach.

PubMed

Petruzziello, Filomena; Fouillen, Laetitia; Wadensten, Henrik; Kretz, Robert; Andren, Per E; Rainer, Gregor; Zhang, Xiaozhe

2012-02-03

Neuropeptidomics is used to characterize endogenous peptides in the brain of tree shrews (Tupaia belangeri). Tree shrews are small animals similar to rodents in size but close relatives of primates, and are excellent models for brain research. Currently, tree shrews have no complete proteome information available on which direct database search can be allowed for neuropeptide identification. To increase the capability in the identification of neuropeptides in tree shrews, we developed an integrated mass spectrometry (MS)-based approach that combines methods including data-dependent, directed, and targeted liquid chromatography (LC)-Fourier transform (FT)-tandem MS (MS/MS) analysis, database construction, de novo sequencing, precursor protein search, and homology analysis. Using this integrated approach, we identified 107 endogenous peptides that have sequences identical or similar to those from other mammalian species. High accuracy MS and tandem MS information, with BLAST analysis and chromatographic characteristics were used to confirm the sequences of all the identified peptides. Interestingly, further sequence homology analysis demonstrated that tree shrew peptides have a significantly higher degree of homology to equivalent sequences in humans than those in mice or rats, consistent with the close phylogenetic relationship between tree shrews and primates. Our results provide the first extensive characterization of the peptidome in tree shrews, which now permits characterization of their function in nervous and endocrine system. As the approach developed fully used the conservative properties of neuropeptides in evolution and the advantage of high accuracy MS, it can be portable for identification of neuropeptides in other species for which the fully sequenced genomes or proteomes are not available.
Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus.

PubMed

Li, Fagen; Zhou, Changpin; Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

2015-01-01

Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.
Universal Recurrence Time Statistics of Characteristic Earthquakes

NASA Astrophysics Data System (ADS)

Goltz, C.; Turcotte, D. L.; Abaimov, S.; Nadeau, R. M.

2006-12-01

Characteristic earthquakes are defined to occur quasi-periodically on major faults. Do recurrence time statistics of such earthquakes follow a particular statistical distribution? If so, which one? The answer is fundamental and has important implications for hazard assessment. The problem cannot be solved by comparing the goodness of statistical fits as the available sequences are too short. The Parkfield sequence of M ≈ 6 earthquakes, one of the most extensive reliable data sets available, has grown to merely seven events with the last earthquake in 2004, for example. Recently, however, advances in seismological monitoring and improved processing methods have unveiled so-called micro-repeaters, micro-earthquakes which recur exactly in the same location on a fault. It seems plausible to regard these earthquakes as a miniature version of the classic characteristic earthquakes. Micro-repeaters are much more frequent than major earthquakes, leading to longer sequences for analysis. Due to their recent discovery, however, available sequences contain fewer than 20 events at present. In this paper we present results for the analysis of recurrence times for several micro-repeater sequences from Parkfield and adjacent regions. To improve the statistical significance of our findings, we combine several sequences into one by rescaling the individual sets by their respective mean recurrence intervals and Weibull exponents. This novel approach of rescaled combination yields the most extensive data set possible. We find that the resulting statistics can be fitted well by an exponential distribution, confirming the universal applicability of the Weibull distribution to characteristic earthquakes. A similar result is obtained from rescaled combination, however, with regard to the lognormal distribution.
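The rescaled-combination idea can be sketched in a few lines. The example below covers only the first step named in the abstract, dividing each sequence's recurrence intervals by that sequence's own mean so that short sequences can be pooled; the additional rescaling by Weibull exponents is not shown, and the function name and event times are hypothetical.

```python
from statistics import mean


def rescale_and_pool(sequences):
    """Pool recurrence intervals from several event-time sequences.

    Each sequence is a list of event times in increasing order. Intervals
    are divided by the mean interval of their own sequence, so every
    sequence contributes dimensionless intervals with mean 1.
    """
    pooled = []
    for event_times in sequences:
        intervals = [later - earlier
                     for earlier, later in zip(event_times, event_times[1:])]
        if not intervals:
            continue  # fewer than two events: no recurrence interval
        mean_interval = mean(intervals)
        pooled.extend(dt / mean_interval for dt in intervals)
    return sorted(pooled)


# Hypothetical event times (arbitrary units), for illustration only.
pooled_intervals = rescale_and_pool([[0, 1, 3], [0, 10, 30]])
```

The pooled, dimensionless intervals could then be compared against candidate distributions such as the Weibull or lognormal.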
Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus

PubMed Central

Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

2015-01-01

Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430
DNA barcode and identification of the varieties and provenances of Taiwan's domestic and imported made teas using ribosomal internal transcribed spacer 2 sequences.

PubMed

Lee, Shih-Chieh; Wang, Chia-Hsiang; Yen, Cheng-En; Chang, Chieh

2017-04-01

The major aim of made tea identification is to identify the variety and provenance of the tea plant. The present experiment used 113 tea plants [Camellia sinensis (L.) O. Kuntze] housed at the Tea Research and Extension Substation, from which 113 internal transcribed spacer 2 (ITS2) fragments, 104 trnL intron, and 98 trnL-trnF intergenic sequence region DNA sequences were successfully sequenced. The similarity of the ITS2 nucleotide sequences between tea plants housed at the Tea Research and Extension Substation was 0.379-0.994. In this polymerase chain reaction-amplified noncoding region, no varieties possessed identical sequences. Compared with the trnL intron and trnL-trnF intergenic sequence fragments of chloroplast cpDNA, the proportion of ITS2 nucleotide sequence variation was large and is more suitable for establishing a DNA barcode database to identify tea plant varieties. After establishing the database, 30 imported teas and 35 domestic made teas were used in this model system to explore the feasibility of using ITS2 sequences to identify the varieties and provenances of made teas. A phylogenetic tree was constructed using ITS2 sequences with the unweighted pair group method with arithmetic mean, which indicated that the same variety of tea plant is likely to be successfully categorized into one cluster, but contamination from other tea plants was also detected. This result provides molecular evidence that the similarity between important tea varieties in Taiwan remains high. We suggest a direct, wide collection of made tea and original samples of tea plants to establish an ITS2 sequence molecular barcode identification database to identify the varieties and provenances of tea plants. The DNA barcode comparison method can satisfy the need for a rapid, low-cost, frontline differentiation of the large amount of made teas from Taiwan and abroad, and can provide molecular evidence of their varieties and provenances. Copyright © 2016. Published by Elsevier B.V.
Molecular Phylogeny and Phylogeography of the Australian Freshwater Fish Genus Galaxiella, with an Emphasis on Dwarf Galaxias (G. pusilla)

PubMed Central

Unmack, Peter J.; Bagley, Justin C.; Adams, Mark; Hammer, Michael P.; Johnson, Jerald B.

2012-01-01

The freshwater fauna of Southern Australia is primarily restricted to the southwestern and southeastern corners of the continent, and is separated by a large, arid region that is inhospitable to this biota. This geographic phenomenon has attracted considerable interest from biogeographers looking to explain evolutionary diversification in this region. Here, we employed phylogenetic and phylogeographic approaches to evaluate the effect of this barrier on a group of four galaxiid fish species (Galaxiella) endemic to temperate Southern Australia. We also tested if continental shelf width has influenced connectivity among populations during low sea levels when rivers, now isolated, could have been connected. We addressed these questions by sampling each species across its range using multiple molecular markers (mitochondrial cytochrome b sequences, nuclear S7 intron sequences, and 49 allozyme loci). These data also allowed us to assess species boundaries, to refine phylogenetic affinities, and to estimate species ages. Interestingly, we found compelling evidence for cryptic species in G. pusilla, manifesting as allopatric eastern and western taxa. Our combined phylogeny and dating analysis point to an origin for the genus dating to the early Cenozoic, with three of the four species originating during the Oligocene-Miocene. Each Galaxiella species showed high levels of genetic divergences between all but the most proximate populations. Despite extensive drainage connections during recent low sea levels in southeastern Australia, populations of both species within G. pusilla maintained high levels of genetic structure. All populations experienced Late Pleistocene-Holocene population growth, possibly in response to the relaxation of arid conditions after the last glacial maximum. High levels of genetic divergence and the discovery of new cryptic species have important implications for the conservation of this already threatened group of freshwater species. PMID:22693638
Evidence for Differential Glycosylation of Trophoblast Cell Types

PubMed Central

Chen, Qiushi; Pang, Poh-Choo; Cohen, Marie E.; Longtine, Mark S.; Schust, Danny J.; Haslam, Stuart M.; Blois, Sandra M.; Dell, Anne; Clark, Gary F.

2016-01-01

Human placental villi are surfaced by the syncytiotrophoblast (STB), with a layer of cytotrophoblasts (CTB) positioned just beneath the STB. STB in normal term pregnancies is exposed to maternal immune cells in the placental intervillous space. Extravillous cytotrophoblasts (EVT) invade the decidua and spiral arteries, where they act in conjunction with natural killer (NK) cells to convert the spiral arteries into flaccid conduits for maternal blood that support a 3–4 fold increase in the rate of maternal blood flow into the placental intervillous space. The functional roles of these distinct trophoblast subtypes during pregnancy suggested that they could be differentially glycosylated. Glycomic analysis of these trophoblasts has revealed the expression of elevated levels of biantennary N-glycans in STB and CTB, with the majority of them bearing a bisecting GlcNAc. N-glycans terminated with polylactosamine extensions were also detected at low levels. A subset of the N-glycans linked to these trophoblasts were sialylated, primarily with terminal NeuAcα2–3Gal sequences. EVT were decorated with the same N-glycans as STB and CTB, except in different proportions. The level of bisecting type N-glycans was reduced, but the level of N-glycans decorated with polylactosamine sequences was substantially elevated compared with the other types of trophoblasts. The level of triantennary and tetraantennary N-glycans was also elevated in EVT. The sialylated N-glycans derived from EVT were completely susceptible to an α2–3 specific neuraminidase (sialidase S). The possibility exists that the N-glycans associated with these different trophoblast subpopulations could act as functional groups. These potential relationships will be considered. PMID:26929217
Evidence for Differential Glycosylation of Trophoblast Cell Types.

PubMed

Chen, Qiushi; Pang, Poh-Choo; Cohen, Marie E; Longtine, Mark S; Schust, Danny J; Haslam, Stuart M; Blois, Sandra M; Dell, Anne; Clark, Gary F

2016-06-01

Human placental villi are surfaced by the syncytiotrophoblast (STB), with a layer of cytotrophoblasts (CTB) positioned just beneath the STB. STB in normal term pregnancies is exposed to maternal immune cells in the placental intervillous space. Extravillous cytotrophoblasts (EVT) invade the decidua and spiral arteries, where they act in conjunction with natural killer (NK) cells to convert the spiral arteries into flaccid conduits for maternal blood that support a 3-4 fold increase in the rate of maternal blood flow into the placental intervillous space. The functional roles of these distinct trophoblast subtypes during pregnancy suggested that they could be differentially glycosylated. Glycomic analysis of these trophoblasts has revealed the expression of elevated levels of biantennary N-glycans in STB and CTB, with the majority of them bearing a bisecting GlcNAc. N-glycans terminated with polylactosamine extensions were also detected at low levels. A subset of the N-glycans linked to these trophoblasts were sialylated, primarily with terminal NeuAcα2-3Gal sequences. EVT were decorated with the same N-glycans as STB and CTB, except in different proportions. The level of bisecting type N-glycans was reduced, but the level of N-glycans decorated with polylactosamine sequences was substantially elevated compared with the other types of trophoblasts. The level of triantennary and tetraantennary N-glycans was also elevated in EVT. The sialylated N-glycans derived from EVT were completely susceptible to an α2-3 specific neuraminidase (sialidase S). The possibility exists that the N-glycans associated with these different trophoblast subpopulations could act as functional groups. These potential relationships will be considered. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Ancient DNA and the population genetics of cave bears (Ursus spelaeus) through space and time.

PubMed

Orlando, Ludovic; Bonjean, Dominique; Bocherens, Herve; Thenot, Aurelie; Argant, Alain; Otte, Marcel; Hänni, Catherine

2002-11-01

The cave bear spread from Western Europe to the Near East during the Riss glaciation (250 KYA) before becoming extinct approximately 12 KYA. During that period, the climatic conditions were highly dynamic, oscillating between glacial and temperate episodes. Such events constrained the geographic distribution of species and the movements of populations, and shaped their genetic diversity. We retrieved and analyzed ancient DNA from 21 samples from five European caves ranging from 40 to 130 KYA. Combined with available data, our data set accounts for a total of 41 sequences of cave bear, coming from 18 European caves. We distinguish four haplogroups at the level of the mitochondrial DNA control region. The large population size of cave bear could account for the maintenance of such polymorphism. Extensive gene flow seems to have connected European populations because two haplogroups cover wide geographic areas. Furthermore, the extensive sampling of the deposits of the Scladina cave located in Belgium allowed us to correlate changes in climatic conditions with the intrapopulational genetic diversity over 90 KY.
An object-oriented data reduction system in Fortran

NASA Technical Reports Server (NTRS)

Bailey, J.

1992-01-01

A data reduction system for the AAO two-degree field project is being developed using an object-oriented approach. Rather than use an object-oriented language (such as C++) the system is written in Fortran and makes extensive use of existing subroutine libraries provided by the UK Starlink project. Objects are created using the extensible N-dimensional Data Format (NDF) which itself is based on the Hierarchical Data System (HDS). The software consists of a class library, with each class corresponding to a Fortran subroutine with a standard calling sequence. The methods of the classes provide operations on NDF objects at a similar level of functionality to the applications of conventional data reduction systems. However, because they are provided as callable subroutines, they can be used as building blocks for more specialist applications. The class library is not dependent on a particular software environment, though it can be used effectively in ADAM applications. It can also be used from standalone Fortran programs. It is intended to develop a graphical user interface for use with the class library to form the 2dF data reduction system.
Zero-profile hybrid fusion construct versus 2-level plate fixation to treat adjacent-level disease in the cervical spine.

PubMed

Healy, Andrew T; Sundar, Swetha J; Cardenas, Raul J; Mageswaran, Prasath; Benzel, Edward C; Mroz, Thomas E; Francis, Todd B

2014-11-01

Single-level anterior cervical discectomy and fusion (ACDF) is an established surgical treatment for cervical myelopathy. Within 10 years of undergoing ACDF, 19.2% of patients develop symptomatic adjacent-level degeneration. Performing ACDF adjacent to prior fusion requires exposure and removal of previously placed hardware, which may increase the risk of adverse outcomes. Zero-profile cervical implants combine an interbody spacer with an anterior plate into a single device that does not extend beyond the intervertebral disc space, potentially obviating the need to remove prior hardware. This study compared the biomechanical stability and adjacent-level range of motion (ROM) following placement of a zero-profile device (ZPD) adjacent to a single-level ACDF against a standard 2-level ACDF. In this in vitro biomechanical cadaveric study, multidirectional flexibility testing was performed by a robotic spine system that simulates flexion-extension, lateral bending, and axial rotation by applying a continuous pure moment load. Testing conditions were as follows: 1) intact, 2) C5-6 ACDF, 3) C4-5 ZPD supraadjacent to simulated fusion at C5-6, and 4) 2-level ACDF (C4-6). The sequence of the latter 2 test conditions was randomized. An unconstrained pure moment of 1.5 Nm with a 40-N simulated head weight load was first applied to the intact condition in all 3 planes of motion; then, using the hybrid test protocol, the overall intact kinematics were replicated for each surgical test condition. Intersegmental rotations were measured optoelectronically. Mean segmental ROM for operated levels and adjacent levels was recorded, normalized to the intact condition, and expressed as a percent change from intact. A repeated-measures ANOVA was used to compare ROM between test conditions at a significance level of 0.05 (95% confidence).
No statistically significant differences in immediate construct stability were found between test conditions 3 and 4 in any plane of motion (p > 0.05). At the operated level, C4-5, the zero-profile construct showed greater decreases in axial rotation (-45% vs -36%) and lateral bending (-55% vs -38%), whereas the 2-level ACDF showed greater decreases in flexion-extension (-40% vs -34%). These differences were marginal and not statistically significant. Adjacent-level motion was nearly equivalent, with minor differences in flexion-extension. When treating degeneration adjacent to a single-level ACDF, a zero-profile implant showed stabilizing potential at the operated level statistically similar to that of the standard revision with a 2-level plate. Revision for adjacent-level disease is common, and using a ZPD in this setting should be investigated clinically because it may be a faster, safer alternative.
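The normalization described above, ROM expressed as a percent change from the intact condition, is simple arithmetic. The sketch below assumes the usual definition (condition minus intact, divided by intact); the ROM values in degrees are hypothetical and chosen only to reproduce a -45% figure like the one reported for axial rotation.

```python
def percent_change_from_intact(rom_condition_deg: float, rom_intact_deg: float) -> float:
    """Percent change in range of motion relative to the intact specimen.

    Negative values mean the construct reduced motion at that level.
    """
    if rom_intact_deg <= 0:
        raise ValueError("intact ROM must be positive")
    return 100.0 * (rom_condition_deg - rom_intact_deg) / rom_intact_deg


# Hypothetical example: intact axial rotation of 10.0 degrees at C4-5,
# reduced to 5.5 degrees after instrumentation.
change = percent_change_from_intact(5.5, 10.0)
print(f"{change:.0f}% change from intact")
```

Normalizing each specimen to its own intact state lets specimens with different baseline flexibility be compared on one scale.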

Comparative immunogenomics of molluscs.

PubMed

Schultz, Jonathan H; Adema, Coen M

2017-10-01

Comparative immunology, studying both vertebrates and invertebrates, provided the earliest descriptions of phagocytosis as a general immune mechanism. However, the large scale of animal diversity challenges all-inclusive investigations and the field of immunology has developed by mostly emphasizing study of a few vertebrate species. In addressing the lack of comprehensive understanding of animal immunity, especially that of invertebrates, comparative immunology helps toward management of invertebrates that are food sources, agricultural pests, pathogens, or disease vectors, and helps interpret the evolution of animal immunity. Initial studies showed that the Mollusca (second largest animal phylum), and invertebrates in general, possess innate defenses but lack the lymphocytic immune system that characterizes vertebrate immunology. Recognizing the reality of both common and taxon-specific immune features, and applying up-to-date cell and molecular research capabilities, in-depth studies of a select number of bivalve and gastropod species continue to reveal novel aspects of molluscan immunity. The genomics era heralded a new stage of comparative immunology; large-scale efforts yielded an initial set of full molluscan genome sequences that is available for analyses of full complements of immune genes and regulatory sequences. Next-generation sequencing (NGS), due to lower cost and effort required, allows individual researchers to generate large sequence datasets for growing numbers of molluscs. RNAseq provides expression profiles that enable discovery of immune genes, and genome sequences reveal the distribution and diversity of immune factors across molluscan phylogeny. Although computational de novo sequence assembly will benefit from continued development and automated annotation may require some experimental validation, NGS is a powerful tool for comparative immunology, especially increasing coverage of the extensive molluscan diversity.
To date, immunogenomics has revealed new levels of complexity of molluscan defense by indicating sequence heterogeneity in individual snails and bivalves, and members of expanded immune gene families are expressed differentially to generate pathogen-specific defense responses. Copyright © 2017 Elsevier Ltd. All rights reserved.
Anatomy of major coal successions: Facies analysis and sequence architecture of a brown coal-bearing valley fill to lacustrine tract (Upper Valdarno Basin, Northern Apennines, Italy)

NASA Astrophysics Data System (ADS)

Ielpi, Alessandro

2012-07-01

A late Pliocene incised valley fill to lacustrine succession, which contains an interbedded brown coal seam (< 20 m thick), is examined in terms of facies analysis, physical stratigraphy and sequence architecture. The succession (< 50 m thick) constitutes the first depositional event of the Castelnuovo Synthem, which is the oldest unconformity bounded stratigraphic unit of the nonmarine Upper Valdarno Basin, Northern Apennines (Italy). The integration of field surveys and borehole logs identified the following event sequence: first valley filling stages by coarse alluvial fan and channelised streams; the progressive setting of low gradient floodbasins with shallow floodplain lakes; subsequent major waterlogging and extensive peat mire development; and system drowning and establishment of permanent lacustrine conditions. The deposits are grouped in a set of nested valley fills and are arranged as high-frequency depositional sequences. The sequences are bounded by minor erosive truncations and have distinctive upward trends: lowstand system tract thinning; transgressive system tract thickening; highstand system tract thinning and eventual non-deposition; and the smoothing of along-sequence boundary sub-aerial incisions. Such features fit in with the notion of an idealised model where second-order (high-frequency) fluctuations, modulated by first-order (low-frequency) base-level rising, have short-lived standing + falling phases and prolonged transgressions, respectively. Furthermore, the general sequence architecture reveals how a mixed palustrine-siliciclastic system differs substantially from a purely siliciclastic one. In the transgressive phases, terrigenous starvation induces prevailing peat accumulation, generating abnormally thick transgressive system tracts that eventually come to occupy much of the same transgression-generated accommodation space. 
In the highstand phases, the development of thick highstand system tracts is then prevented by sediment upstream trapping due to retrogressive fluvial aggradations, probably coupled with low-accommodation settings inherited from the transgressive phases.
Intra- and inter-isolate variation of ribosomal and protein-coding genes in Pleurotus: implications for molecular identification and phylogeny on fungal groups.

PubMed

He, Xiao-Lan; Li, Qian; Peng, Wei-Hong; Zhou, Jie; Cao, Xue-Lian; Wang, Di; Huang, Zhong-Qian; Tan, Wei; Li, Yu; Gan, Bing-Cheng

2017-06-26

The internal transcribed spacer (ITS), RNA polymerase II second largest subunit (RPB2), and elongation factor 1-alpha (EF1α) are often used in fungal taxonomy and phylogenetic analysis. An ideal molecular marker for molecular identification and phylogenetic studies is homogeneous within species, with interspecific variation exceeding intraspecific variation. However, while sequencing ITS, RPB2, and EF1α in Pleurotus spp., we found that intra-isolate sequence polymorphism might be present in these genes because direct sequencing of PCR products failed in some isolates. Therefore, we detected intra- and inter-isolate variation of the three genes in Pleurotus by polymerase chain reaction amplification and cloning in this study. Results showed that intra-isolate variation of ITS was not uncommon but the polymorphic level in each isolate was relatively low in Pleurotus; intra-isolate variations of EF1α and RPB2 sequences were present in an unexpectedly high amount. The polymorphism level differed significantly between ITS, RPB2, and EF1α in the same individual, and the intra-isolate heterogeneity level of each gene varied between isolates within the same species. Intra-isolate and intraspecific variation of ITS in the tested isolates was less than interspecific variation, and intra-isolate and intraspecific variation of RPB2 was probably equal to interspecific divergence. Meanwhile, intra-isolate and intraspecific variation of EF1α could exceed interspecific divergence. These findings suggested that RPB2 and EF1α are not desirable barcoding candidates for Pleurotus. We also discuss possible reasons why rDNA and protein-coding genes show variants within a single isolate in Pleurotus, although this question must be addressed in further research. Our study demonstrated that intra-isolate variation of ribosomal and protein-coding genes is likely widespread in fungi.
This has implications for studies on fungal evolution, taxonomy, phylogenetics, and population genetics. More extensive sampling of these genes and other candidates will be required to ensure reliability as phylogenetic markers and DNA barcodes.
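The barcoding criterion used above (interspecific variation should exceed intraspecific variation) can be checked with a minimal sketch. It assumes pre-aligned sequences and uses the uncorrected p-distance; the study's own distance measure is not stated in the abstract, and the species labels and sequences below are hypothetical toy data.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites over gap-free columns of two aligned sequences."""
    sites = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def barcode_gap(records):
    """Return (max intraspecific distance, min interspecific distance).

    records is a list of (species_label, aligned_sequence) tuples.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra, default=0.0), min(inter, default=float("inf"))


# Hypothetical toy data: two clones per species.
toy = [
    ("species_1", "ACGTACGT"),
    ("species_1", "ACGTACGA"),
    ("species_2", "TCGTTCGA"),
    ("species_2", "TCGTTCGA"),
]
max_intra, min_inter = barcode_gap(toy)
print(f"max intraspecific = {max_intra:.3f}, min interspecific = {min_inter:.3f}")
print("barcode gap present" if min_inter > max_intra else "no barcode gap")
```

Under this criterion, a marker such as EF1α, whose intraspecific variation can exceed interspecific divergence, would show no barcode gap.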
Lack of robustness of life extension associated with several single-gene P element mutations in Drosophila melanogaster.

PubMed

Mockett, Robin J; Nobles, Amber C

2013-10-01

The hypothesis tested in this study was that single-gene mutations found previously to extend the life span of Drosophila melanogaster could do so consistently in both long-lived y w and standard w1118 genetic backgrounds. GAL4 drivers were used to express upstream activation sequence (UAS)-responder transgenes globally or in the nervous system. Transgenes associated with oxidative damage prevention (UAS-hSOD1 and UAS-GCLc) or removal (EP-UAS-Atg8a and UAS-dTOR(FRB)) failed to increase mean life spans in any expression pattern in either genetic background. Flies containing a UAS-EGFP-bMSRA(C) transgene associated with protein repair were found not to exhibit life extension or detectable enhanced green fluorescent protein (EGFP) activity. The presence of UAS-responder transgenes was confirmed by PCR amplification and sequencing at the 5' and 3' end of each insertion. These results cast doubt on the robustness of life extension in flies carrying single-gene mutations and suggest that the effects of all such mutations should be tested independently in multiple genetic backgrounds and laboratory environments.
Sequence-Dependent Persistence Length of Long DNA

NASA Astrophysics Data System (ADS)

Chuang, Hui-Min; Reifenberger, Jeffrey G.; Cao, Han; Dorfman, Kevin D.

2017-12-01

Using a high-throughput genome-mapping approach, we obtained circa 50 million measurements of the extension of internal human DNA segments in a 41 nm × 41 nm nanochannel. The underlying DNA sequences, obtained by mapping to the reference human genome, are 2.5-393 kilobase pairs long and contain percent GC contents between 32.5% and 60%. Using Odijk's theory for a channel-confined wormlike chain, these data reveal that the DNA persistence length increases by almost 20% as the percent GC content increases. The increased persistence length is rationalized by a model, containing no adjustable parameters, that treats the DNA as a statistical terpolymer with a sequence-dependent intrinsic persistence length and a sequence-independent electrostatic persistence length.
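How a measured extension maps to a persistence length can be sketched under stated assumptions. In the Odijk (deflection) regime, the mean fractional extension of a wormlike chain in a square channel of width D is commonly written as X/L = 1 - alpha * (D/P)^(2/3), where P is the persistence length. The prefactor alpha = 0.18274 is a commonly quoted value for a square channel and is an assumption here, as are the function names; the study may have used an effective channel width or further corrections that the abstract does not describe.

```python
ALPHA_SQUARE = 0.18274  # assumed prefactor for a square channel (not from the abstract)


def fractional_extension(persistence_nm: float, width_nm: float,
                         alpha: float = ALPHA_SQUARE) -> float:
    """Odijk-regime fractional extension X/L for persistence length P and width D."""
    return 1.0 - alpha * (width_nm / persistence_nm) ** (2.0 / 3.0)


def persistence_from_extension(frac_ext: float, width_nm: float,
                               alpha: float = ALPHA_SQUARE) -> float:
    """Invert the formula above: P = D * (alpha / (1 - X/L))**1.5."""
    if not 0.0 < frac_ext < 1.0:
        raise ValueError("fractional extension must lie strictly between 0 and 1")
    return width_nm * (alpha / (1.0 - frac_ext)) ** 1.5


# Hypothetical fractional extensions in a 41 nm channel (illustrative only).
for x in (0.84, 0.85, 0.86):
    print(f"X/L = {x:.2f} -> P = {persistence_from_extension(x, 41.0):.1f} nm")
```

Because P scales as (1 - X/L)^(-3/2), small shifts in fractional extension translate into comparatively large changes in the inferred persistence length, which is why a very large number of measurements helps resolve a sequence-dependent effect.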
Whole-Genome Sequence of Escherichia coli Serotype O157:H7 Strain B6914-ARS.

PubMed

Uhlich, Gaylen A; Reichenberger, Erin R; Cottrell, Bryan J; Fratamico, Pina; Andreozzi, Elisa

2017-11-02

Escherichia coli serotype O157:H7 strain B6914-MS1 is an isolate from the Centers for Disease Control and Prevention that is missing both Shiga toxin genes and has been used extensively in applied research studies. Here we report the genome sequence of strain B6914-ARS, a B6914-MS1 clone that has unique biofilm properties.
Draft Genome Sequences of Escherichia coli Isolates from Wounded Military Personnel.

PubMed

Arivett, Brock A; Ream, Dave C; Fiester, Steven E; Kidane, Destaalem; Actis, Luis A

2016-08-11

Members of the Escherichia coli bacterial family have been grouped as ESKAPE (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species) pathogens because of their extensive drug resistance phenotypes and increasing threat to human health. The genomes of six extended-spectrum β-lactamase (ESBL)-producing E. coli strains isolated from wounded military personnel were sequenced and annotated. Copyright © 2016 Arivett et al.
Recombination in Enteroviruses Is a Biphasic Replicative Process Involving the Generation of Greater-than Genome Length ‘Imprecise’ Intermediates

PubMed Central

Lowry, Kym; Woodman, Andrew; Cook, Jonathan; Evans, David J.

2014-01-01

Recombination in enteroviruses provides an evolutionary mechanism for acquiring extensive regions of novel sequence, is suggested to have a role in genotype diversity and is known to have been key to the emergence of novel neuropathogenic variants of poliovirus. Despite the importance of this evolutionary mechanism, the recombination process remains relatively poorly understood. We investigated heterologous recombination using a novel reverse genetic approach that resulted in the isolation of intermediate chimeric intertypic polioviruses bearing genomes with extensive duplicated sequences at the recombination junction. Serial passage of viruses exhibiting such imprecise junctions yielded progeny with increased fitness which had lost the duplicated sequences. Mutations or inhibitors that changed polymerase fidelity or the coalescence of replication complexes markedly altered the yield of recombinants (but did not influence non-replicative recombination) indicating both that the process is replicative and that it may be possible to enhance or reduce recombination-mediated viral evolution if required. We propose that extant recombinants result from a biphasic process in which an initial recombination event is followed by a process of resolution, deleting extraneous sequences and optimizing viral fitness. This process has implications for our wider understanding of ‘evolution by duplication’ in the positive-strand RNA viruses. PMID:24945141
Evolution of thermotolerance in hot spring cyanobacteria of the genus Synechococcus

NASA Technical Reports Server (NTRS)

Miller, S. R.; Castenholz, R. W.

2000-01-01

The extension of ecological tolerance limits may be an important mechanism by which microorganisms adapt to novel environments, but it may come at the evolutionary cost of reduced performance under ancestral conditions. We combined a comparative physiological approach with phylogenetic analyses to study the evolution of thermotolerance in hot spring cyanobacteria of the genus Synechococcus. Among the 20 laboratory clones of Synechococcus isolated from collections made along an Oregon hot spring thermal gradient, four different 16S rRNA gene sequences were identified. Phylogenies constructed by using the sequence data indicated that the clones were polyphyletic but that three of the four sequence groups formed a clade. Differences in thermotolerance were observed for clones with different 16S rRNA gene sequences, and comparison of these physiological differences within a phylogenetic framework provided evidence that more thermotolerant lineages of Synechococcus evolved from less thermotolerant ancestors. The extension of the thermal limit in these bacteria was correlated with a reduction in the breadth of the temperature range for growth, which provides evidence that enhanced thermotolerance has come at the evolutionary cost of increased thermal specialization. This study illustrates the utility of using phylogenetic comparative methods to investigate how evolutionary processes have shaped historical patterns of ecological diversification in microorganisms.
The development of the red giant branch. I - Theoretical evolutionary sequences

NASA Technical Reports Server (NTRS)

Sweigart, Allen V.; Greggio, Laura; Renzini, Alvio

1989-01-01

A grid of 100 evolutionary sequences extending from the zero-age main sequence to the onset of helium burning has been computed for stellar masses between 1.4 and 3.4 solar masses, helium abundances of 0.20 and 0.30, and heavy-element abundances of 0.004, 0.01, and 0.04. Using these computations the transition in the morphology of the red giant branch (RGB) between low-mass stars, which have an extended and luminous first RGB phase prior to helium ignition, and intermediate-mass stars, which do not, is investigated. Extensive tabulations of the numerical results are provided to aid in applying these sequences. The effects of the first dredge-up on the surface helium and CNO abundances of the sequences are discussed.
galaxie--CGI scripts for sequence identification through automated phylogenetic analysis.

PubMed

Nilsson, R Henrik; Larsson, Karl-Henrik; Ursing, Björn M

2004-06-12

The prevalent use of similarity searches like BLAST to identify sequences and species implicitly assumes the reference database to be of extensive sequence sampling. This is often not the case, restraining the correctness of the outcome as a basis for sequence identification. Phylogenetic inference outperforms similarity searches in retrieving correct phylogenies and consequently sequence identities, and a project was initiated to design a freely available script package for sequence identification through automated Web-based phylogenetic analysis. Three CGI scripts were designed to facilitate qualified sequence identification from a Web interface. Query sequences are aligned to pre-made alignments or to alignments made by ClustalW with entries retrieved from a BLAST search. The subsequent phylogenetic analysis is based on the PHYLIP package for inferring neighbor-joining and parsimony trees. The scripts are highly configurable. A service installation and a version for local use are found at http://andromeda.botany.gu.se/galaxiewelcome.html and http://galaxie.cgb.ki.se
MitoRes: a resource of nuclear-encoded mitochondrial genes and their products in Metazoa.

PubMed

Catalano, Domenico; Licciulli, Flavio; Turi, Antonio; Grillo, Giorgio; Saccone, Cecilia; D'Elia, Domenica

2006-01-24

Mitochondria are sub-cellular organelles that have a central role in energy production and in other metabolic pathways of all eukaryotic respiring cells. In the last few years, with more and more genomes being sequenced, a huge amount of data has been generated providing an unprecedented opportunity to use the comparative analysis approach in studies of evolution and functional genomics with the aim of shedding light on molecular mechanisms regulating mitochondrial biogenesis and metabolism. In this context, the problem of the optimal extraction of representative datasets of genomic and proteomic data assumes a crucial importance. Specialised resources for nuclear-encoded mitochondria-related proteins already exist; however, no mitochondrial database is currently available with the same features as MitoRes, which is an update of the MitoNuc database extensively modified in its structure, data sources and graphical interface. It contains data on nuclear-encoded mitochondria-related products for any metazoan species for which this type of data is available and also provides comprehensive sequence datasets (gene, transcript and protein) as well as useful tools for their extraction and export. MitoRes (http://www2.ba.itb.cnr.it/MitoRes/) consolidates information from public external sources and automatically annotates it into a relational database. Additionally, it clusters proteins on the basis of their sequence similarity and interconnects them with genomic data. The search engine and sequence management tools allow the query/retrieval of the database content and the extraction and export of sequences (gene, transcript, protein) and related sub-sequences (intron, exon, UTR, CDS, signal peptide and gene flanking regions) ready to be used for in silico analysis. The tool we describe here has been developed to support lab scientists and bioinformaticians alike in the characterization of molecular features and evolution of mitochondrial targeting sequences.
The way it provides for the retrieval and extraction of sequences allows the user to overcome the obstacles encountered in the integrative use of different bioinformatic resources and the completeness of the sequence collection allows intra- and interspecies comparison at different biological levels (gene, transcript and protein).
RNAstructure: software for RNA secondary structure prediction and analysis.

PubMed

Reuter, Jessica S; Mathews, David H

2010-03-15

To understand an RNA sequence's mechanism of action, the structure must be known. Furthermore, target RNA structure is an important consideration in the design of small interfering RNAs and antisense DNA oligonucleotides. RNA secondary structure prediction, using thermodynamics, can be used to develop hypotheses about the structure of an RNA sequence. RNAstructure is a software package for RNA secondary structure prediction and analysis. It uses thermodynamics and utilizes the most recent set of nearest neighbor parameters from the Turner group. It includes methods for secondary structure prediction (using several algorithms), prediction of base pair probabilities, bimolecular structure prediction, and prediction of a structure common to two sequences. This contribution describes new extensions to the package, including a library of C++ classes for incorporation into other programs, a user-friendly graphical user interface written in JAVA, and new Unix-style text interfaces. The original graphical user interface for Microsoft Windows is still maintained. The extensions to RNAstructure serve to make RNA secondary structure prediction user-friendly. The package is available for download from the Mathews lab homepage at http://rna.urmc.rochester.edu/RNAstructure.html.
Extension of the COG and arCOG databases by amino acid and nucleotide sequences

PubMed Central

Meereis, Florian; Kaufmann, Michael

2008-01-01

Background: The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. Results: Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases) as an extended version including all nucleotide sequences and in addition the amino acid sequences originally utilized to construct the current COG and arCOG databases. We make available three comprehensive single XML files containing the complete databases including all sequence information. In addition, we provide a web interface as a utility suitable to browse the NUCOCOG database for sequence retrieval. The database is accessible at . Conclusion: NUCOCOG offers the possibility to analyze any sequence-related property in the context of the COG and arCOG framework simply by using script languages such as PERL applied to a large but single XML document. PMID:19014535
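The script-based analysis mentioned in the Conclusion can be illustrated with a minimal Python sketch (Python is used here in place of Perl). The abstract does not describe the NUCOCOG XML schema, so the element and attribute names below (`nucocog`, `entry`, `cog`, `nucleotide`) and the sample data are hypothetical; the point is only the streaming pattern for a single large XML document.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical miniature document; the real NUCOCOG schema is not given in the abstract.
SAMPLE_XML = b"""<nucocog>
  <entry cog="COG0001"><nucleotide>ATGGCGCC</nucleotide></entry>
  <entry cog="COG0001"><nucleotide>ATATATGC</nucleotide></entry>
  <entry cog="COG0002"><nucleotide>GGGGCCCC</nucleotide></entry>
</nucocog>"""

def gc_content_by_cog(source):
    """Stream an XML document and return the pooled GC fraction per COG identifier."""
    totals = defaultdict(lambda: [0, 0])  # cog -> [GC count, total bases]
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "entry":
            continue
        seq = (elem.findtext("nucleotide") or "").upper()
        cog = elem.get("cog")
        totals[cog][0] += seq.count("G") + seq.count("C")
        totals[cog][1] += len(seq)
        elem.clear()  # drop the processed entry's content to limit memory use
    return {cog: gc / n for cog, (gc, n) in totals.items() if n}

result = gc_content_by_cog(io.BytesIO(SAMPLE_XML))
print(result)
```

Passing a file path instead of the in-memory sample works the same way, since `iterparse` accepts either a filename or a file object.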
Thermodynamic stability of biomolecules and evolution.

PubMed

Chakravarty, Ashim K

2017-08-01

The thermodynamic stability of biomolecules from the perspective of evolution is a complex issue that needs discussion. Intramolecular bonds maintain the structure and the internal energy (E) of a biomolecule at "local minima". This communication discusses the possibility that changes in these bonds lower the internal energy of a biomolecule, which might give the molecule greater thermodynamic stability. In the process, variations in the structure and function of the molecule could occur. Thus, the E of a biomolecule is likely to tend toward minimization. Such a change in energy status is an intrinsic factor by which evolving biomolecules gain stability and generate variations in the structure and function of DNA molecules undergoing natural selection. Thus, the variations might well contribute to the process of evolution. A brief discussion of conserved sequences in light of this proposition is given at the end. Extension of the idea may resolve certain standing problems in evolution, such as the maintenance of conserved sequences in the genomes of diverse species, pre- versus post-adaptive mutations, 'orthogenesis', etc. Copyright © 2017 Elsevier Ltd. All rights reserved.
Utilizing Gene Tree Variation to Identify Candidate Effector Genes in Zymoseptoria tritici

PubMed Central

McDonald, Megan C.; McGinness, Lachlan; Hane, James K.; Williams, Angela H.; Milgate, Andrew; Solomon, Peter S.

2016-01-01

Zymoseptoria tritici is a host-specific, necrotrophic pathogen of wheat. Infection by Z. tritici is characterized by its extended latent period, which typically lasts 2 wks, and is followed by extensive host cell death, and rapid proliferation of fungal biomass. This work characterizes the level of genomic variation in 13 isolates, for which we have measured virulence on 11 wheat cultivars with differential resistance genes. Between the reference isolate, IPO323, and the 13 Australian isolates we identified over 800,000 single nucleotide polymorphisms, of which ∼10% had an effect on the coding regions of the genome. Furthermore, we identified over 1700 probable presence/absence polymorphisms in genes across the Australian isolates using de novo assembly. Finally, we developed a gene tree sorting method that quickly identifies groups of isolates within a single gene alignment whose sequence haplotypes correspond with virulence scores on a single wheat cultivar. Using this method, we have identified < 100 candidate effector genes whose gene sequence correlates with virulence toward a wheat cultivar carrying a major resistance gene. PMID:26837952
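The idea of matching sequence haplotypes to virulence scores can be illustrated with a deliberately simplified Python sketch. It is not the authors' gene tree sorting method: instead of building a tree, it groups isolates by identical aligned sequence and reports any group that coincides exactly with the set of virulent isolates. The isolate names, sequences, and phenotype calls are hypothetical.

```python
from collections import defaultdict

def haplotype_groups(alignment):
    """Group isolate names by identical aligned sequence (one group per haplotype)."""
    groups = defaultdict(list)
    for isolate, seq in alignment.items():
        groups[seq].append(isolate)
    return list(groups.values())

def groups_matching_virulence(alignment, virulent):
    """Return haplotype groups whose members are exactly the virulent isolates."""
    target = set(virulent)
    return [sorted(g) for g in haplotype_groups(alignment) if set(g) == target]

# Hypothetical single-gene alignment and virulence calls on one cultivar.
ALIGNMENT = {"iso1": "ATGA", "iso2": "ATGA", "iso3": "ATGC", "iso4": "ATGC"}
VIRULENT = {"iso3", "iso4"}

matches = groups_matching_virulence(ALIGNMENT, VIRULENT)
print(matches)
```

A gene for which `matches` is non-empty would be a candidate under this toy criterion; a tree-based approach could additionally tolerate near-identical haplotypes within a clade.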
Characterization and in situ localization of a salt-induced tomato peroxidase mRNA.

PubMed

Botella, M A; Quesada, M A; Kononowicz, A K; Bressan, R A; Pliego, F; Hasegawa, P M; Valpuesta, V

1994-04-01

NaCl treatment of tomato plants in hydroponic culture at concentrations as low as 50 mM resulted in enhanced accumulation of transcripts of TPX1, a full-length cDNA clone that we had isolated from a library of NaCl-treated tomato plants using a peroxidase-specific oligonucleotide probe. Although the overall amino acid sequence identity of TPX1 to other peroxidase genes was less than 45%, there was a very high degree of identity in all of the conserved domains. The deduced amino acid sequence included the presence of an N-terminal signal peptide but not the C-terminal extension present in peroxidases targeted to the vacuole. The mature protein has a theoretical pI value of 7.5. Transcripts that hybridized to TPX1 were detected only in the roots with higher levels of mRNA in epidermal and subepidermal cell layers. Isoelectric focusing of root extracts showed two major bands of peroxidase activity at pI 5.9 and 6.2. Both activities increased with salt treatment. Southern analysis indicated the presence of only a single TPX1 gene in tomato.
Crystal structures of the SAM-III/SMK riboswitch reveal the SAM-dependent translation inhibition mechanism

PubMed Central

Lu, Changrui; Smith, Angela M; Fuchs, Ryan T; Ding, Fang; Rajashankar, Kanagalaghatta; Henkin, Tina M; Ke, Ailong

2011-01-01

Three distinct classes of S-adenosyl-l-methionine (SAM)-responsive riboswitches have been identified that regulate bacterial gene expression at the levels of transcription attenuation or translation inhibition. The SMK box (SAM-III) translational riboswitch has been identified in the SAM synthetase gene in members of the Lactobacillales. Here we report the 2.2-Å crystal structure of the Enterococcus faecalis SMK box riboswitch. The Y-shaped riboswitch organizes its conserved nucleotides around a three-way junction for SAM recognition. The Shine-Dalgarno sequence, which is sequestered by base-pairing with the anti–Shine-Dalgarno sequence in response to SAM binding, also directly participates in SAM recognition. The riboswitch makes extensive interactions with the adenosine and sulfonium moieties of SAM but does not appear to recognize the tail of the methionine moiety. We captured a structural snapshot of the SMK box riboswitch sampling the near-cognate ligand S-adenosyl-l-homocysteine (SAH) in which SAH was found to adopt an alternative conformation and fails to make several key interactions. PMID:18806797
The resurrection genome of Boea hygrometrica: A blueprint for survival of dehydration.

PubMed

Xiao, Lihong; Yang, Ge; Zhang, Liechi; Yang, Xinhua; Zhao, Shuang; Ji, Zhongzhong; Zhou, Qing; Hu, Min; Wang, Yu; Chen, Ming; Xu, Yu; Jin, Haijing; Xiao, Xuan; Hu, Guipeng; Bao, Fang; Hu, Yong; Wan, Ping; Li, Legong; Deng, Xin; Kuang, Tingyun; Xiang, Chengbin; Zhu, Jian-Kang; Oliver, Melvin J; He, Yikun

2015-05-05

"Drying without dying" is an essential trait in land plant evolution. Unraveling how a unique group of angiosperms, the Resurrection Plants, survive desiccation of their leaves and roots has been hampered by the lack of a foundational genome perspective. Here we report the ∼1,691-Mb sequenced genome of Boea hygrometrica, an important resurrection plant model. The sequence revealed evidence for two historical genome-wide duplication events, a complement of 49,374 protein-coding genes, 29.15% of which are unique (orphan) to Boea and 20% of which (9,888) significantly respond to desiccation at the transcript level. Expansion of early light-inducible protein (ELIP) and 5S rRNA genes highlights the importance of the protection of the photosynthetic apparatus during drying and the rapid resumption of protein synthesis in the resurrection capability of Boea. Transcriptome analysis reveals extensive alternative splicing of transcripts and a focus on cellular protection strategies. The lack of desiccation tolerance-specific genome organizational features suggests the resurrection phenotype evolved mainly by an alteration in the control of dehydration response genes.
Crystal structures of the SAM-III/SMK riboswitch reveal the SAM-dependent translation inhibition mechanism

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lu, C.; Smith, A.M.; Fuchs, R.T.

2010-01-07

Three distinct classes of S-adenosyl-L-methionine (SAM)-responsive riboswitches have been identified that regulate bacterial gene expression at the levels of transcription attenuation or translation inhibition. The SMK box (SAM-III) translational riboswitch has been identified in the SAM synthetase gene in members of the Lactobacillales. Here we report the 2.2-Å crystal structure of the Enterococcus faecalis SMK box riboswitch. The Y-shaped riboswitch organizes its conserved nucleotides around a three-way junction for SAM recognition. The Shine-Dalgarno sequence, which is sequestered by base-pairing with the anti-Shine-Dalgarno sequence in response to SAM binding, also directly participates in SAM recognition. The riboswitch makes extensive interactions with the adenosine and sulfonium moieties of SAM but does not appear to recognize the tail of the methionine moiety. We captured a structural snapshot of the SMK box riboswitch sampling the near-cognate ligand S-adenosyl-L-homocysteine (SAH) in which SAH was found to adopt an alternative conformation and fails to make several key interactions.

Evolution of MHC class I genes in the endangered loggerhead sea turtle (Caretta caretta) revealed by 454 amplicon sequencing.

PubMed

Stiebens, Victor A; Merino, Sonia E; Chain, Frédéric J J; Eizaguirre, Christophe

2013-04-30

In evolutionary and conservation biology, parasitism is often highlighted as a major selective pressure. To fight against parasites and pathogens, genetic diversity of the immune genes of the major histocompatibility complex (MHC) is particularly important. However, the extensive degree of polymorphism observed in these genes makes it difficult to conduct thorough population screenings. We utilized a genotyping protocol that uses 454 amplicon sequencing to characterize the MHC class I in the endangered loggerhead sea turtle (Caretta caretta) and to investigate their evolution at multiple relevant levels of organization. MHC class I genes revealed signatures of trans-species polymorphism across several reptile species. In the studied loggerhead turtle individuals, this results in the maintenance of two ancient allelic lineages. We also found that individuals carrying an intermediate number of MHC class I alleles are larger than those with either a low or high number of alleles. Multiple modes of evolution seem to maintain MHC diversity in the loggerhead turtles, with relatively high polymorphism for an endangered species.
Programmable and highly resolved in vitro detection of 5-methylcytosine by TALEs.

PubMed

Kubik, Grzegorz; Schmidt, Moritz J; Penner, Johanna E; Summerer, Daniel

2014-06-02

Gene expression is extensively regulated by specific patterns of genomic 5-methylcytosine (mC), but the ability to directly detect this modification at user-defined genomic loci is limited. One reason is the lack of molecules that discriminate between mC and cytosine (C) and at the same time provide inherent, programmable sequence-selectivity. Programmable transcription-activator-like effectors (TALEs) have been observed to exhibit mC-sensitivity in vivo, but to only a limited extent in vitro. We report an mC-detection assay based on TALE control of DNA replication that displays unexpectedly strong mC-discrimination ability in vitro. The status and level of mC modification at single positions in oligonucleotides can be determined unambiguously by this assay, independently of the overall target sequence. Moreover, discrimination is reliably observed for positions bound by N-terminal and central regions of TALEs. This indicates the wide scope and robustness of the approach for highly resolved mC detection and enabled the detection of a single mC in a large, eukaryotic genome. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
HuH-7 reference genome profile: complex karyotype composed of massive loss of heterozygosity.

PubMed

Kasai, Fumio; Hirayama, Noriko; Ozawa, Midori; Satoh, Motonobu; Kohara, Arihiro

2018-05-17

Human cell lines represent a valuable resource as in vitro experimental models. A hepatoma cell line, HuH-7 (JCRB0403), has been used extensively in various research fields and a number of studies using this line have been published continuously since it was established in 1982. However, an accurate genome profile, which can serve as a reliable reference, has not been available. In this study, we performed M-FISH, SNP microarray and amplicon sequencing to characterize the cell line. Single cell analysis of metaphases revealed a high level of heterogeneity with a mode of 60 chromosomes. Cytogenetic results demonstrated chromosome abnormalities involving every chromosome in addition to a massive loss of heterozygosity, which accounts for 55.3% of the genome, consistent with the homozygous variants seen in the sequence analysis. We provide empirical data that the HuH-7 cell line is composed of highly heterogeneous cell populations, suggesting that besides cell line authentication, the quality of cell lines needs to be taken into consideration in the future use of tumor cell lines.
Spliced leader RNA of trypanosomes: in vivo mutational analysis reveals extensive and distinct requirements for trans splicing and cap4 formation.

PubMed Central

Lücke, S; Xu, G L; Palfi, Z; Cross, M; Bellofatto, V; Bindereif, A

1996-01-01

In trypanosomes mRNAs are generated through trans splicing. The spliced leader (SL) RNA, which donates the 5'-terminal mini-exon to each of the protein coding exons, plays a central role in the trans splicing process. We have established in vivo assays to study in detail trans splicing, cap4 modification, and RNP assembly of the SL RNA in the trypanosomatid species Leptomonas seymouri. First, we found that extensive sequences within the mini-exon are required for SL RNA function in vivo, although a conserved length of 39 nt is not essential. In contrast, the intron sequence appears to be surprisingly tolerant to mutation; only the stem-loop II structure is indispensable. The asymmetry of the sequence requirements in the stem I region suggests that this domain may exist in different functional conformations. Second, distinct mini-exon sequences outside the modification site are important for efficient cap4 formation. Third, all SL RNA mutations tested allowed core RNP assembly, suggesting flexible requirements for core protein binding. In sum, the results of our mutational analysis provide evidence for a discrete domain structure of the SL RNA and help to explain the strong phylogenetic conservation of the mini-exon sequence and of the overall SL RNA secondary structure; they also suggest that there may be certain differences between trans splicing in nematodes and trypanosomes. This approach provides a basis for studying RNA-RNA interactions in the trans spliceosome. PMID:8861965
Mechanism for DNA transposons to generate introns on genomic scales

PubMed Central

Huff, Jason T.; Zilberman, Daniel; Roy, Scott W.

2017-01-01

Discovered four decades ago, the existence of introns was one of the most unexpected findings in molecular biology [1]. Introns are sequences interrupting genes that must be removed as part of mRNA production. Genome sequencing projects have documented that most eukaryotic genes contain at least one and frequently many introns [2,3]. Comparison of these genomes reveals a history of long evolutionary periods with little intron gain punctuated by episodes of rapid, extensive gain [2,3]. However, no detailed mechanism for such episodic intron generation has been empirically supported on a sufficient scale, despite several proposals [4–8]. Here we show how short non-autonomous DNA transposons independently generated hundreds to thousands of introns in the prasinophyte Micromonas pusilla and the pelagophyte Aureococcus anophagefferens. Each transposon carries one splice site. The other splice site is co-opted from gene sequence duplicated upon transposon insertion, allowing perfect splicing out of RNA. The distributions of sequences that can be co-opted are biased with respect to codons, and phasing of transposon-generated introns is similarly biased. These transposons insert between preexisting nucleosomes, so that multiple nearby insertions generate nucleosome-sized intervening segments. Thus, transposon insertion and sequence co-option may explain the intron phase biases [2] and prevalence of nucleosome-sized exons [9] observed in eukaryotes. Overall, the two independent examples of proliferating elements illustrate a general DNA transposon mechanism plausibly accounting for episodes of rapid, extensive intron gain during eukaryotic evolution [2,3]. PMID:27760113
Dynamic evolution at pericentromeres.

PubMed

Hall, Anne E; Kettler, Gregory C; Preuss, Daphne

2006-03-01

Pericentromeres are exceptional genomic regions: in animals they contain extensive segmental duplications implicated in gene creation, and in plants they sustain rearrangements and insertions uncommon in euchromatin. To examine the mechanisms and patterns of plant pericentromere evolution, we compared pericentromere sequence from four Brassicaceae species separated by <15 million years (Myr). This flowering plant family is ideal for studying relationships between genome reorganization and pericentromere evolution: its members have undergone recent polyploidization and hybridization, with close relatives changing in genome size and chromosome number. Through sequence and hybridization analyses, we examined regions from Arabidopsis arenosa, Capsella rubella, and Olimarabidopsis pumila that are homologous to Arabidopsis thaliana pericentromeres (peri-CENs) III and V, and used FISH to demonstrate they have been maintained near centromere satellite arrays in each species. Sequence analysis revealed a set of highly conserved genes, yet we discovered substantial differences in intergenic length and species-specific changes in sequence content and gene density. We discovered that A. thaliana has undergone recent, significant expansions within its pericentromeres, in some cases measuring hundreds of kilobases; these findings are in marked contrast to euchromatic segments in these species that exhibit only minor length changes. While plant pericentromeres do contain some duplications, we did not find evidence of extensive segmental duplications, as has been documented in primates. Our data support a model in which plant pericentromeres may experience selective pressures distinct from euchromatin, tolerating rapid, dynamic changes in structure and sequence content, including large insertions of mobile elements, 5S rDNA arrays and pseudogenes.
Identification, cloning, and sequencing of a fragment of Amsacta moorei entomopoxvirus DNA containing the spheroidin gene and three vaccinia virus-related open reading frames.

PubMed Central

Hall, R L; Moyer, R W

1991-01-01

Entomopoxvirus virions are frequently contained within crystalline occlusion bodies, which are composed of primarily a single protein, spheroidin, which is analogous to the polyhedrin protein of baculovirus. The spheroidin gene of Amsacta moorei entomopoxvirus was identified following the microsequencing of polypeptides generated from cyanogen bromide treatment of spheroidin and the subsequent synthesis of oligonucleotide hybridization probes. DNA sequencing of a 6.8-kb region of DNA containing the spheroidin gene showed that the spheroidin protein is derived from a 3.0-kb open reading frame potentially encoding a protein of 115 kDa. Three copies of the heptanucleotide, TTTTTNT, a sequence associated with early gene transcription in the vertebrate poxviruses, and four in-frame translational termination signals were found within 60 bp upstream of the putative spheroidin gene promoter (TAAATG). The spheroidin gene promoter region contains the sequence TAAATG, which is found in many late promoters of the vertebrate poxviruses and which serves as the site of transcriptional initiation, as shown by primer extension. Primer extension experiments also showed that spheroidin gene transcripts contain 5' poly(A) sequences typical of vertebrate poxvirus late transcripts. The 92 bases upstream of the initiating TAAATG are unusually A + T rich and contain only 7 G or C residues. An analysis of open reading frames around the spheroidin gene suggests that the colinear core of "essential genes" typical of the vertebrate poxviruses is absent in A. moorei entomopoxvirus. PMID:1942245
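The two sequence features quoted in this abstract, the TTTTTNT heptanucleotide and the A + T richness of the upstream region, are easy to check computationally. The Python sketch below is illustrative only: the test sequence is made up (the abstract does not give the actual upstream sequence), and only the 92-base and 7-residue figures come from the abstract.

```python
import re

# Lookahead pattern so that overlapping occurrences are also reported; N = any base.
HEPTANUCLEOTIDE = re.compile(r"(?=(TTTTT[ACGT]T))")

def find_motif(seq):
    """Return 0-based start positions of TTTTTNT in seq."""
    return [m.start() for m in HEPTANUCLEOTIDE.finditer(seq.upper())]

def at_fraction(seq):
    """Return the fraction of A or T residues in seq."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)

# Made-up example sequence, not taken from the paper.
EXAMPLE = "GGTTTTTATCCTTTTTTTGG"
positions = find_motif(EXAMPLE)

# From the abstract: 92 upstream bases containing only 7 G or C residues.
reported_at = (92 - 7) / 92
print(positions, round(at_fraction(EXAMPLE), 2), round(reported_at, 3))
```

The last figure shows that the reported region is roughly 92% A + T.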
GobyWeb: Simplified Management and Analysis of Gene Expression and DNA Methylation Sequencing Data

PubMed Central

Dorff, Kevin C.; Chambwe, Nyasha; Zeno, Zachary; Simi, Manuele; Shaknovich, Rita; Campagne, Fabien

2013-01-01

We present GobyWeb, a web-based system that facilitates the management and analysis of high-throughput sequencing (HTS) projects. The software provides integrated support for a broad set of HTS analyses and offers a simple plugin extension mechanism. Analyses currently supported include quantification of gene expression for messenger and small RNA sequencing, estimation of DNA methylation (i.e., reduced bisulfite sequencing and whole genome methyl-seq), or the detection of pathogens in sequenced data. In contrast to previous analysis pipelines developed for analysis of HTS data, GobyWeb requires significantly less storage space, runs analyses efficiently on a parallel grid, scales gracefully to process tens or hundreds of multi-gigabyte samples, yet can be used effectively by researchers who are comfortable using a web browser. We conducted performance evaluations of the software and found it to either outperform or have similar performance to analysis programs developed for specialized analyses of HTS data. We found that most biologists who took a one-hour GobyWeb training session were readily able to analyze RNA-Seq data with state of the art analysis tools. GobyWeb can be obtained at http://gobyweb.campagnelab.org and is freely available for non-commercial use. GobyWeb plugins are distributed in source code and licensed under the open source LGPL3 license to facilitate code inspection, reuse and independent extensions http://github.com/CampagneLaboratory/gobyweb2-plugins. PMID:23936070
Integrating restriction site-associated DNA sequencing (RAD-seq) with morphological cladistic analysis clarifies evolutionary relationships among major species groups of bee orchids

PubMed Central

Sramkó, Gábor; Paun, Ovidiu

2018-01-01

Background and Aims: Bee orchids (Ophrys) have become the most popular model system for studying reproduction via insect-mediated pseudo-copulation and for exploring the consequent, putatively adaptive, evolutionary radiations. However, despite intensive past research, both the phylogenetic structure and species diversity within the genus remain highly contentious. Here, we integrate next-generation sequencing and morphological cladistic techniques to clarify the phylogeny of the genus. Methods: At least two accessions of each of the ten species groups previously circumscribed from large-scale cloned nuclear ribosomal internal transcribed spacer (nrITS) sequencing were subjected to restriction site-associated sequencing (RAD-seq). The resulting matrix of 4159 single nucleotide polymorphisms (SNPs) for 34 accessions was used to construct an unrooted network and a rooted maximum likelihood phylogeny. A parallel morphological cladistic matrix of 43 characters generated both polymorphic and non-polymorphic sets of parsimony trees before being mapped across the RAD-seq topology. Key Results: RAD-seq data strongly support the monophyly of nine out of ten groups previously circumscribed using nrITS and resolve three major clades; in contrast, supposed microspecies are barely distinguishable. Strong incongruence separated the RAD-seq trees from both the morphological trees and traditional classifications; mapping of the morphological characters across the RAD-seq topology rendered them far more homoplastic. Conclusions: The comparatively high level of morphological homoplasy reflects extensive convergence, whereas the derived placement of the fusca group is attributed to paedomorphic simplification. The phenotype of the most recent common ancestor of the extant lineages is inferred, but it post-dates the majority of the character-state changes that typify the genus. 
RAD-seq may represent the high-water mark of the contribution of molecular phylogenetics to understanding evolution within Ophrys; further progress will require large-scale population-level studies that integrate phenotypic and genotypic data in a cogent conceptual framework. PMID:29325077
Evidence for Deep Regulatory Similarities in Early Developmental Programs across Highly Diverged Insects

PubMed Central

Zhang, Yinan; Samee, Md. Abul Hassan; Halfon, Marc S.; Sinha, Saurabh

2014-01-01

Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like “long germband” development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250–350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as “training data” to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. 
Our work represents an important step toward being able to trace the evolutionary history of gene regulatory networks and defining the mechanisms underlying insect evolution. PMID:25173756
Evidence for deep regulatory similarities in early developmental programs across highly diverged insects.

PubMed

Kazemian, Majid; Suryamohan, Kushal; Chen, Jia-Yu; Zhang, Yinan; Samee, Md Abul Hassan; Halfon, Marc S; Sinha, Saurabh

2014-09-01

Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like "long germband" development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250-350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as "training data" to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. 
Our work represents an important step toward being able to trace the evolutionary history of gene regulatory networks and defining the mechanisms underlying insect evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Integrating restriction site-associated DNA sequencing (RAD-seq) with morphological cladistic analysis clarifies evolutionary relationships among major species groups of bee orchids.

PubMed

Bateman, Richard M; Sramkó, Gábor; Paun, Ovidiu

2018-01-25

Bee orchids (Ophrys) have become the most popular model system for studying reproduction via insect-mediated pseudo-copulation and for exploring the consequent, putatively adaptive, evolutionary radiations. However, despite intensive past research, both the phylogenetic structure and species diversity within the genus remain highly contentious. Here, we integrate next-generation sequencing and morphological cladistic techniques to clarify the phylogeny of the genus. At least two accessions of each of the ten species groups previously circumscribed from large-scale cloned nuclear ribosomal internal transcribed spacer (nrITS) sequencing were subjected to restriction site-associated sequencing (RAD-seq). The resulting matrix of 4159 single nucleotide polymorphisms (SNPs) for 34 accessions was used to construct an unrooted network and a rooted maximum likelihood phylogeny. A parallel morphological cladistic matrix of 43 characters generated both polymorphic and non-polymorphic sets of parsimony trees before being mapped across the RAD-seq topology. RAD-seq data strongly support the monophyly of nine out of ten groups previously circumscribed using nrITS and resolve three major clades; in contrast, supposed microspecies are barely distinguishable. Strong incongruence separated the RAD-seq trees from both the morphological trees and traditional classifications; mapping of the morphological characters across the RAD-seq topology rendered them far more homoplastic. The comparatively high level of morphological homoplasy reflects extensive convergence, whereas the derived placement of the fusca group is attributed to paedomorphic simplification. The phenotype of the most recent common ancestor of the extant lineages is inferred, but it post-dates the majority of the character-state changes that typify the genus. 
RAD-seq may represent the high-water mark of the contribution of molecular phylogenetics to understanding evolution within Ophrys; further progress will require large-scale population-level studies that integrate phenotypic and genotypic data in a cogent conceptual framework. © The Author(s) 2018. Published by Oxford University Press on behalf of the Annals of Botany Company.
FQC Dashboard: integrates FastQC results into a web-based, interactive, and extensible FASTQ quality control tool

DOE PAGES

Brown, Joseph; Pirrung, Meg; McCue, Lee Ann

2017-06-09

FQC is software that facilitates quality control of FASTQ files by carrying out a QC protocol using FastQC, parsing results, and aggregating quality metrics into an interactive dashboard designed to richly summarize individual sequencing runs. The dashboard groups samples in dropdowns for navigation among the data sets, utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data.
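As a rough illustration of the "parsing results and aggregating quality metrics" step, the sketch below reads the per-module pass/warn/fail flags from the text of a FastQC `fastqc_data.txt` report and collects them per sample. This is not FQC's own code: the function names and sample labels are hypothetical, and the sketch assumes the usual FastQC layout in which each module opens with a `>>Name<TAB>status` line and closes with `>>END_MODULE`.

```python
def module_statuses(report_text):
    """Return {module name: status} from the text of a fastqc_data.txt report.

    Assumes each module starts with a line '>>Name<TAB>status' and ends
    with '>>END_MODULE'.
    """
    statuses = {}
    for line in report_text.splitlines():
        if line.startswith(">>") and line.strip() != ">>END_MODULE":
            name, _, status = line[2:].partition("\t")
            statuses[name.strip()] = status.strip()
    return statuses


def aggregate(reports):
    """Map each sample label to its module statuses (reports: {label: text})."""
    return {label: module_statuses(text) for label, text in reports.items()}


def failing_modules(summary):
    """List (sample, module) pairs whose status is 'fail', sorted for display."""
    return sorted(
        (sample, module)
        for sample, modules in summary.items()
        for module, status in modules.items()
        if status == "fail"
    )
```

A dashboard layer could then render the dictionary returned by `aggregate` as one table row per sample; that rendering step is left out here.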
Rapid Diagnostics of Onboard Sequences

NASA Technical Reports Server (NTRS)

Starbird, Thomas W.; Morris, John R.; Shams, Khawaja S.; Maimone, Mark W.

2012-01-01

Keeping track of sequences onboard a spacecraft is challenging. When reviewing Event Verification Records (EVRs) of sequence executions on the Mars Exploration Rover (MER), operators often found themselves wondering which version of a named sequence the EVR corresponded to. The lack of this information drastically impacts the operators' diagnostic capabilities as well as their situational awareness with respect to the commands the spacecraft has executed, since the EVRs do not provide argument values or explanatory comments. Having this information immediately available can be instrumental in diagnosing critical events and can significantly enhance the overall safety of the spacecraft. This software provides auditing capability that can eliminate that uncertainty while diagnosing critical conditions. Furthermore, the RESTful interface provides a simple way for sequencing tools to automatically retrieve binary compiled sequence SCMFs (Space Command Message Files) on demand. It also enables developers to change the underlying database, while maintaining the same interface to the existing applications. The logging capabilities are also beneficial to operators when they are trying to recall how they solved a similar problem many days ago: this software enables automatic recovery of SCMF and RML (Robot Markup Language) sequence files directly from the command EVRs, eliminating the need for people to find and validate the corresponding sequences. To address the lack of auditing capability for sequences onboard a spacecraft during earlier missions, extensive logging support was added on the Mars Science Laboratory (MSL) sequencing server. This server is responsible for generating all MSL binary SCMFs from RML input sequences. The sequencing server logs every SCMF it generates into a MySQL database, as well as the high-level RML file and dictionary name inputs used to create the SCMF.
The SCMF is then indexed by a hash value that is automatically included in all command EVRs by the onboard flight software. Both the binary SCMF result and the RML input file can be retrieved simply by specifying the hash to a RESTful web interface. This interface enables command line tools as well as large sophisticated programs to download the SCMF and RMLs on demand from the database, enabling a vast array of tools to be built on top of it. One such command line tool can retrieve and display RML files, or annotate a list of EVRs by interleaving them with the original sequence commands. This software has been integrated with the MSL sequencing pipeline where it will serve sequences useful in diagnostics, debugging, and situational awareness throughout the mission.
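The hash-indexed logging described in this record can be pictured with a small in-memory sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the hash function (SHA-256 truncated to eight hex characters is used below), the real store is a MySQL database behind a web interface rather than a Python dictionary, and the class and function names are hypothetical.

```python
import hashlib


class SequenceArchive:
    """Toy stand-in for the sequencing server's log of generated SCMFs."""

    def __init__(self):
        self._records = {}

    def log(self, scmf, rml, dictionary):
        """Store one generated SCMF with its inputs; return its hash key."""
        # Assumption: SHA-256, truncated. The real hash scheme is not given.
        key = hashlib.sha256(scmf).hexdigest()[:8]
        self._records[key] = {"scmf": scmf, "rml": rml, "dictionary": dictionary}
        return key

    def get(self, key):
        """Return the record for a hash key, or None if it was never logged."""
        return self._records.get(key)


def annotate(evrs, archive):
    """Interleave (hash, message) EVR pairs with the RML source they came from."""
    lines = []
    for key, message in evrs:
        lines.append(f"EVR {message}")
        record = archive.get(key)
        if record is not None:
            lines.append(f"    source: {record['rml']}")
    return lines
```

Because the key is derived from the SCMF bytes, two versions of a sequence that share a name still map to different records, which is the ambiguity the record says operators faced on MER.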
The Oral Microbiota in Health and Disease: An Overview of Molecular Findings.

PubMed

Siqueira, José F; Rôças, Isabela N

2017-01-01

Culture-independent nucleic acid technologies have been extensively applied to the analysis of oral bacterial communities associated with healthy and diseased conditions. These methods have confirmed and substantially expanded the findings from culture studies to reveal the oral microbial inhabitants and candidate pathogens associated with the major oral diseases. Over 1000 distinct bacterial species-level taxa have been identified in the oral cavity, and studies using next-generation DNA sequencing approaches indicate that the breadth of bacterial diversity may be even larger. Nucleic acid technologies have also been helpful in profiling bacterial communities and identifying disease-related patterns. This chapter provides an overview of the diversity and taxonomy of oral bacteria associated with health and disease.
Identification and Molecular Characterization of Genes Coding Pharmaceutically Important Enzymes from Halo-Thermo Tolerant Bacillus

PubMed Central

Safary, Azam; Moniri, Rezvan; Hamzeh-Mivehroud, Maryam; Dastmalchi, Siavoush

2016-01-01

Purpose: Robust pharmaceutical and industrial enzymes from extremophile microorganisms are a main source of enzymes with tremendous stability under harsh conditions, which makes them potential tools for commercial and biotechnological applications. Methods: The genome of a Gram-positive halo-thermotolerant Bacillus sp. SL1, a new isolate from Saline Lake, was investigated for the presence of genes coding for potentially pharmaceutical enzymes. We determined gene sequences for the enzymes laccase (CotA), l-asparaginase (ansA3, ansA1), glutamate-specific endopeptidase (blaSE), l-arabinose isomerase (araA2), endo-1,4-β mannosidase (gmuG), glutaminase (glsA), pectate lyase (pelA), cellulase (bglC1), aldehyde dehydrogenase (ycbD) and allantoinases (pucH) in the genome of Bacillus sp. SL1. Results: Based on the DNA sequence alignment results, six of the studied enzymes of Bacillus sp. SL-1 showed 100% similarity at the nucleotide level to the same genes of B. licheniformis 14580, demonstrating an extensive organizational relationship between these two strains. Despite high similarities between the B. licheniformis and Bacillus sp. SL-1 genomes, there are minor differences in the sequences of some enzymes. Approximately 30% of the enzyme sequences revealed more than 99% identity, with some variations in nucleotides leading to amino acid substitutions in protein sequences. Conclusion: Molecular characterization of this new isolate provides useful information regarding the evolutionary relationship between B. subtilis and B. licheniformis species. Since most industrial processes are performed in harsh conditions, enzymes from such halo-thermotolerant bacteria may provide economically and industrially appealing biocatalysts to be used under specific physicochemical situations in medical, pharmaceutical, chemical and other industries. PMID:28101462
NifH-Harboring Bacterial Community Composition across an Alaskan Permafrost Thaw Gradient

PubMed Central

Penton, C. Ryan; Yang, Caiyun; Wu, Liyou; Wang, Qiong; Zhang, Jin; Liu, Feifei; Qin, Yujia; Deng, Ye; Hemme, Christopher L.; Zheng, Tianling; Schuur, Edward A. G.; Tiedje, James; Zhou, Jizhong

2016-01-01

Since nitrogen (N) is often limiting in permafrost soils, we investigated the N2-fixing genetic potential and the inferred taxa harboring those genes by sequencing nifH gene fragments in samples taken along a permafrost thaw gradient in an Alaskan boreal soil. Samples from minimally, moderately and extensively thawed sites were taken to a depth of 79 cm to encompass zones above and below the depth of the water table. NifH reads were translated with frameshift correction and 112,476 sequences were clustered at 5% amino acid dissimilarity resulting in 1,631 OTUs. Sample depth in relation to water table depth was correlated to differences in the NifH sequence classes with those most closely related to group I nifH-harboring Alpha- and Beta-Proteobacteria in higher abundance above water table depth while those related to group III nifH-harboring Delta Proteobacteria more abundant below. The most dominant below water table depth NifH sequences, comprising 1/3 of the total, were distantly related to Verrucomicrobia-Opitutaceae. Overall, these results suggest that permafrost thaw alters the class-level composition of N2-fixing communities in the thawed soil layers and that this distinction corresponds to the depth of the water table. These nifH data were also compared to nifH sequences obtained from a study at an Alaskan taiga site, and to those of other geographically distant, non-permafrost sites. The two Alaska sites were differentiated largely by changes in relative abundances of the same OTUs, whereas the non-Alaska sites were differentiated by the lack of many Alaskan OTUs, and the presence of unique halophilic, sulfate- and iron-reducing taxa in the Alaska sites. PMID:27933054
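To make "clustered at 5% amino acid dissimilarity" concrete, here is a minimal sketch of greedy threshold clustering. It assumes pre-aligned, equal-length protein sequences and a simple mismatch fraction as the dissimilarity measure. The abstract does not say which clustering tool or distance the authors used, so the function names and the greedy first-match strategy are illustrative only.

```python
def dissimilarity(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def greedy_otus(sequences, threshold=0.05):
    """Assign each sequence to the first cluster whose seed is within threshold.

    A sequence that matches no existing seed starts a new cluster (OTU).
    The result depends on input order, as with any greedy scheme.
    """
    clusters = []  # each cluster is a list whose first element is the seed
    for seq in sequences:
        for cluster in clusters:
            if dissimilarity(seq, cluster[0]) <= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters
```

With this measure, a 40-residue read differing from a seed at one position (2.5%) joins that seed's OTU, while a read differing at four positions (10%) seeds a new one.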
Comparative Transcriptome Analysis of the Accessory Sex Gland and Testis from the Chinese Mitten Crab (Eriocheir sinensis)

PubMed Central

He, Lin; Jiang, Hui; Cao, Dandan; Liu, Lihua; Hu, Songnian; Wang, Qun

2013-01-01

The accessory sex gland (ASG) is an important component of the male reproductive system, which functions to enhance the fertility of spermatozoa during male reproduction. Certain proteins secreted by the ASG are known to bind to the spermatozoa membrane and affect its function. The ASG gene expression profile in Chinese mitten crab (Eriocheir sinensis) has not been extensively studied, and limited genetic research has been conducted on this species. The advent of high-throughput sequencing technologies enables the generation of genomic resources within a short period of time and at minimal cost. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for the ASG of E. sinensis using Illumina sequencing technology. This analysis yielded a total of 33,221,284 sequencing reads, including 2.6 Gb of total nucleotides. Reads were assembled into 85,913 contigs (average 218 bp), or 58,567 scaffold sequences (average 292 bp), that identified 37,955 unigenes (average 385 bp). We assembled all unigenes and compared them with the published testis transcriptome from E. sinensis. In order to identify which genes may be involved in ASG function, as it pertains to modification of spermatozoa, we compared the ASG and testis transcriptome of E. sinensis. Our analysis identified specific genes with both higher and lower tissue expression levels in the two tissues, and the functions of these genes were analyzed to elucidate their potential roles during maturation of spermatozoa. Availability of detailed transcriptome data from ASG and testis in E. sinensis can assist our understanding of the molecular mechanisms involved with spermatozoa conservation, transport, maturation and capacitation and potentially acrosome activation. PMID:23342039
Breakpoint structure of the Anopheles gambiae 2Rb chromosomal inversion.

PubMed

Lobo, Neil F; Sangaré, Djibril M; Regier, Allison A; Reidenbach, Kyanne R; Bretz, David A; Sharakhova, Maria V; Emrich, Scott J; Traore, Sekou F; Costantini, Carlo; Besansky, Nora J; Collins, Frank H

2010-10-25

Alternative arrangements of chromosome 2 inversions in Anopheles gambiae are important sources of population structure, and are associated with adaptation to environmental heterogeneity. The forces responsible for their origin and maintenance are incompletely understood. Molecular characterization of inversion breakpoints provides insight into how they arose, and provides the basis for development of molecular karyotyping methods useful in future studies. Sequence comparison of regions near the cytological breakpoints of 2Rb allowed the molecular delineation of breakpoint boundaries. Comparisons were made between the standard 2R+b arrangement in the An. gambiae PEST reference genome and the inverted 2Rb arrangements in the An. gambiae M and S genome assemblies. Sequence differences between alternative 2Rb arrangements were exploited in the design of a PCR diagnostic assay, which was evaluated against the known chromosomal banding pattern of laboratory colonies and field-collected samples from Mali and Cameroon. The breakpoints of the 7.55 Mb 2Rb inversion are flanked by extensive runs of the same short (72 bp) tandemly organized sequence, which was likely responsible for chromosomal breakage and rearrangement. Application of the molecular diagnostic assay suggested that 2Rb has a single common origin in An. gambiae and its sibling species, Anopheles arabiensis, and also that the standard arrangement (2R+b) may have arisen twice through breakpoint reuse. The molecular diagnostic was reliable when applied to laboratory colonies, but its accuracy was lower in natural populations. The complex repetitive sequence flanking the 2Rb breakpoint region may be prone to structural and sequence-level instability. The 2Rb molecular diagnostic has immediate application in studies based on laboratory colonies, but its usefulness in natural populations awaits development of complementary molecular tools.

Identification of the promoter of the myelomonocytic leukocyte integrin CD11b.

PubMed Central

Hickstein, D D; Baker, D M; Gollahon, K A; Back, A L

1992-01-01

The CD11b (or macrophage-1 antigen; MAC-1) subunit of the leukocyte integrin family forms a noncovalently associated heterodimeric structure with the CD18 (beta) subunit on the surface of human granulocytes and monocyte/macrophages, where it enables these myeloid cells to participate in a variety of adherence-related activities. Expression of the CD11b subunit is restricted to cells of the myelomonocytic lineage and depends upon the stage of differentiation with the most mature myeloid cells expressing the highest levels of CD11b. To study the regulation of CD11b expression, a genomic clone corresponding to the 5' region of the CD11b gene was isolated from a human chromosome 16 library. Primer extension and RNase protection assays identified two major transcriptional start sites, located 90 base pairs and 54 base pairs upstream from the initiation methionine. DNA sequence analysis of 1.7 kilobases of the 5' flanking sequence of the CD11b gene indicated the absence of a "CAAT" or "TATA" box; however, potential binding sites for the transcription activators Sp1, PU.1, ets, and AP-2 are present, as well as retinoic acid response elements. The 1.7-kilobase CD11b promoter sequence displayed functional activity in transient transfection assays in the monocytic cell line THP-1 and the myeloid cell line HL-60. In contrast, this 1.7-kilobase promoter sequence did not display functional activity in the Jurkat T-lymphoid cell line. Detailed characterization of the CD11b promoter sequence should provide insight into the molecular events regulating the tissue-specific and developmental stage-specific expression of the CD11b molecule in myelomonocytic cells. PMID:1347945
Characterization and Exploitation of CRISPR Loci in Bifidobacterium longum

PubMed Central

Hidalgo-Cantabrana, Claudio; Crawley, Alexandra B.; Sanchez, Borja; Barrangou, Rodolphe

2017-01-01

Diverse CRISPR-Cas systems provide adaptive immunity in many bacteria and most archaea, via a DNA-encoded, RNA-mediated, nucleic-acid targeting mechanism. Over time, CRISPR loci expand via iterative uptake of invasive DNA sequences into the CRISPR array during the adaptation process. These genetic vaccination cards thus provide insights into the exposure of strains to phages and plasmids in space and time, revealing the historical predatory exposure of a strain. These genetic loci thus constitute a unique basis for genotyping of strains, with potential of resolution at the strain-level. Here, we investigate the occurrence and diversity of CRISPR-Cas systems in the genomes of various Bifidobacterium longum strains across three sub-species. Specifically, we analyzed the genomic content of 66 genomes belonging to B. longum subsp. longum, B. longum subsp. infantis and B. longum subsp. suis, and identified 25 strains that carry 29 total CRISPR-Cas systems. We identify various Type I and Type II CRISPR-Cas systems that are widespread in this species, notably I-C, I-E, and II-C. Noteworthy, Type I-C systems showed extended CRISPR arrays, with extensive spacer diversity. We show how these hypervariable loci can be used to gain insights into strain origin, evolution and phylogeny, and can provide discriminatory sequences to distinguish even clonal isolates. By investigating CRISPR spacer sequences, we reveal their origin and implicate phages and prophages as drivers of CRISPR immunity expansion in this species, with redundant targeting of select prophages. Analysis of CRISPR spacer origin also revealed novel PAM sequences. Our results suggest that CRISPR-Cas immune systems are instrumental in mounting diversified viral resistance in B. longum, and show that these sequences are useful for typing across three subspecies. PMID:29033911
Ginkgo and Welwitschia Mitogenomes Reveal Extreme Contrasts in Gymnosperm Mitochondrial Evolution.

PubMed

Guo, Wenhu; Grewe, Felix; Fan, Weishu; Young, Gregory J; Knoop, Volker; Palmer, Jeffrey D; Mower, Jeffrey P

2016-06-01

Mitochondrial genomes (mitogenomes) of flowering plants are well known for their extreme diversity in size, structure, gene content, and rates of sequence evolution and recombination. In contrast, little is known about mitogenomic diversity and evolution within gymnosperms. Only a single complete genome sequence is available, from the cycad Cycas taitungensis, while limited information is available for the one draft sequence, from Norway spruce (Picea abies). To examine mitogenomic evolution in gymnosperms, we generated complete genome sequences for the ginkgo tree (Ginkgo biloba) and a gnetophyte (Welwitschia mirabilis). There is great disparity in size, sequence conservation, levels of shared DNA, and functional content among gymnosperm mitogenomes. The Cycas and Ginkgo mitogenomes are relatively small, have low substitution rates, and possess numerous genes, introns, and edit sites; we infer that these properties were present in the ancestral seed plant. By contrast, the Welwitschia mitogenome has an expanded size coupled with accelerated substitution rates and extensive loss of these functional features. The Picea genome has expanded further, to more than 4 Mb. With regard to structural evolution, the Cycas and Ginkgo mitogenomes share a remarkable amount of intergenic DNA, which may be related to the limited recombinational activity detected at repeats in Ginkgo. Conversely, the Welwitschia mitogenome shares almost no intergenic DNA with any other seed plant. By conducting the first measurements of rates of DNA turnover in seed plant mitogenomes, we discovered that turnover rates vary by orders of magnitude among species. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
CRAWview: for viewing splicing variation, gene families, and polymorphism in clusters of ESTs and full-length sequences.

PubMed

Chou, A; Burke, J

1999-05-01

DNA sequence clustering has become a valuable method in support of gene discovery and gene expression analysis. Our interest lies in leveraging the sequence diversity within clusters of expressed sequence tags (ESTs) to model gene structure for the study of gene variants that arise from, among other things, alternative mRNA splicing, polymorphism, and divergence after gene duplication, fusion, and translocation events. In previous work, CRAW was developed to discover gene variants from assembled clusters of ESTs. Most importantly, novel gene features (the differing units between gene variants, for example alternative exons, polymorphisms, transposable elements, etc.) that are specialized to tissue, disease, population, or developmental states can be identified when these tools collate DNA source information with gene variant discrimination. While the goal is complete automation of novel feature and gene variant detection, current methods are far from perfect, and hence the development of effective tools for visualization and exploratory data analysis is of paramount importance in the process of sifting through candidate genes and validating targets. We present CRAWview, a Java-based visualization extension to CRAW. Features that vary between gene forms are displayed using an automatically generated color-coded index. The reporting format of CRAWview gives a brief, high-level summary report to display overlap and divergence within clusters of sequences as well as the ability to 'drill down' and see detailed information concerning regions of interest. Additionally, the alignment viewing and editing capabilities of CRAWview make it possible to interactively correct frame-shifts and otherwise edit cluster assemblies. We have implemented CRAWview as a Java application across Windows NT/95 and UNIX platforms. A beta version of CRAWview will be freely available to academic users from Pangea Systems (http://www.pangeasystems.com).
Mitochondrial Genome Sequences of Nematocera (Lower Diptera): Evidence of Rearrangement following a Complete Genome Duplication in a Winter Crane Fly

PubMed Central

Beckenbach, Andrew T.

2012-01-01

The complete mitochondrial DNA sequences of eight representatives of lower Diptera, suborder Nematocera, along with nearly complete sequences from two other species, are presented. These taxa represent eight families not previously represented by complete mitochondrial DNA sequences. Most of the sequences retain the ancestral dipteran mitochondrial gene arrangement, while one sequence, that of the midge Arachnocampa flava (family Keroplatidae), has an inversion of the trnE gene. The most unusual result is the extensive rearrangement of the mitochondrial genome of a winter crane fly, Paracladura trichoptera (family Trichocera). The pattern of rearrangement indicates that the mechanism of rearrangement involved a tandem duplication of the entire mitochondrial genome, followed by random and nonrandom loss of one copy of each gene. Another winter crane fly retains the ancestral dipteran gene arrangement. A preliminary mitochondrial phylogeny of the Diptera is also presented. PMID:22155689
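The rearrangement mechanism named in this record, duplication of the whole genome followed by loss of one copy of each gene, can be illustrated with a short sketch. The gene names and the choice of which copies survive are made up for the example and are not the arrangement reported for Paracladura trichoptera; the function name is hypothetical.

```python
def duplicate_and_lose(order, keep_from_first):
    """Model a whole-genome tandem duplication followed by single-copy loss.

    order: ancestral gene order (list of gene names).
    keep_from_first: genes whose FIRST copy survives; for every other gene
    the second copy survives instead.
    Returns the derived gene order.
    """
    keep = set(keep_from_first)
    unknown = keep - set(order)
    if unknown:
        raise ValueError(f"genes not in the genome: {sorted(unknown)}")
    # Tandem duplication: the full gene order appears twice in a row.
    doubled = [(gene, 1) for gene in order] + [(gene, 2) for gene in order]
    # Loss: keep copy 1 of the chosen genes and copy 2 of all the others.
    return [gene for gene, copy in doubled if (copy == 1) == (gene in keep)]
```

For an ancestral order `["cox1", "trnL", "cox2", "trnK"]` with the first copies of `cox1` and `cox2` retained, the derived order is `["cox1", "cox2", "trnL", "trnK"]`: every gene is still present exactly once, but the order has been shuffled into two interleaved blocks.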
Phylogenetic shadowing of primate sequences to find functional regions of the human genome.

PubMed

Boffelli, Dario; McAuliffe, Jon; Ovcharenko, Dmitriy; Lewis, Keith D; Ovcharenko, Ivan; Pachter, Lior; Rubin, Edward M

2003-02-28

Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies. We used sequence comparisons of an extensive set of Old World and New World monkeys and hominoids to identify functional regions in the human genome. Analysis of these data enabled the discovery of primate-specific gene regulatory elements and the demarcation of the exons of multiple genes. Much of the information content of the comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates. These results demonstrate the utility of intraprimate sequence comparisons to discover common mammalian as well as primate-specific functional elements in the human genome, which are unattainable through the evaluation of more evolutionarily distant species.
Genetic Code Analysis Toolkit: A novel tool to explore the coding properties of the genetic code and DNA sequences

NASA Astrophysics Data System (ADS)

Kraljić, K.; Strüngmann, L.; Fimmel, E.; Gumbel, M.

2018-01-01

The genetic code is degenerate and it is assumed that redundancy provides error detection and correction mechanisms in the translation process. However, the biological meaning of the code's structure is still under current research. This paper presents a Genetic Code Analysis Toolkit (GCAT) which provides workflows and algorithms for the analysis of the structure of nucleotide sequences. In particular, sets or sequences of codons can be transformed and tested for circularity, comma-freeness, dichotomic partitions and others. GCAT comes with a fertile editor custom-built to work with the genetic code and a batch mode for multi-sequence processing. With the ability to read FASTA files or load sequences from GenBank, the tool can be used for the mathematical and statistical analysis of existing sequence data. GCAT is Java-based and provides a plug-in concept for extensibility. Availability: Open source. Homepage: http://www.gcat.bio/
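One of the properties named in the abstract, comma-freeness, can be illustrated with a short sketch. This is not GCAT code (GCAT is Java-based); the function name `is_comma_free` is hypothetical, and the check uses the textbook definition: a set of trinucleotides is comma-free if no codon of the set appears in a shifted reading frame of any two concatenated codons from the set.

```python
from itertools import product


def is_comma_free(codons):
    """Return True if no codon of the set occurs in frame +1 or +2
    of any concatenation of two codons from the set."""
    code = set(codons)
    for x, y in product(code, repeat=2):
        pair = x + y
        # pair[1:4] and pair[2:5] are the two shifted reading frames
        if pair[1:4] in code or pair[2:5] in code:
            return False
    return True


if __name__ == "__main__":
    print(is_comma_free({"ACG", "TCG"}))  # no shifted frame hits the set
    print(is_comma_free({"ACG", "CGA"}))  # "ACG"+"ACG" contains "CGA" in frame +1
```

The pairwise check is quadratic in the size of the set, which is negligible for codon sets (at most 64 trinucleotides).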
Protein Interaction Profile Sequencing (PIP-seq).

PubMed

Foley, Shawn W; Gregory, Brian D

2016-10-10

Every eukaryotic RNA transcript undergoes extensive post-transcriptional processing from the moment of transcription up through degradation. This regulation is performed by a distinct cohort of RNA-binding proteins which recognize their target transcript by both its primary sequence and secondary structure. Here, we describe protein interaction profile sequencing (PIP-seq), a technique that uses ribonuclease-based footprinting followed by high-throughput sequencing to globally assess both protein-bound RNA sequences and RNA secondary structure. PIP-seq utilizes single- and double-stranded RNA-specific nucleases in the absence of proteins to infer RNA secondary structure. These libraries are also compared to samples that undergo nuclease digestion in the presence of proteins in order to find enriched protein-bound sequences. Combined, these four libraries provide a comprehensive, transcriptome-wide view of RNA secondary structure and RNA protein interaction sites from a single experimental technique. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.
Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

NASA Astrophysics Data System (ADS)

Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

2014-03-01

Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.
Whole-genome sequencing and analyses identify high genetic heterogeneity, diversity and endemicity of rotavirus genotype P[6] strains circulating in Africa.

PubMed

Nyaga, Martin M; Tan, Yi; Seheri, Mapaseka L; Halpin, Rebecca A; Akopov, Asmik; Stucker, Karla M; Fedorova, Nadia B; Shrivastava, Susmita; Duncan Steele, A; Mwenda, Jason M; Pickett, Brett E; Das, Suman R; Jeffrey Mphahlele, M

2018-05-18

Rotavirus A (RVA) exhibits a wide genotype diversity globally. Little is known about the genetic composition of genotype P[6] from Africa. This study investigated possible evolutionary mechanisms leading to genetic diversity of genotype P[6] VP4 sequences. Phylogenetic analyses on 167 P[6] VP4 full-length sequences were conducted, which included six porcine-origin sequences. Of the 167 sequences, 57 were newly acquired through whole genome sequencing as part of this study. The other 110 sequences were all publicly-available global P[6] VP4 full-length sequences downloaded from GenBank. The strength of association between the phenotypic features and the phylogeny was also determined. A number of reassortment and mixed infections of RVA genotype P[6] strains were observed in this study. Phylogenetic analyses demonstrated the extensive genetic diversity that exists among human P[6] strains, porcine-like strains, their concomitant clades/subclades and estimated that P[6] VP4 gene has a higher substitution rate with the mean of 1.05E-3 substitutions/site/year. Further, the phylogenetic analyses indicated that genotype P[6] strains were endemic in Africa, characterised by an extensive genetic diversity and long-time local evolution of the viruses. This was also supported by phylogeographic clustering and G-genotype clustering of the P[6] strains when Bayesian Tip-association Significance testing (BaTS) was applied, clearly supporting that the viruses evolved locally in Africa instead of spatial mixing among different regions. Overall, the results demonstrated that multiple mechanisms such as reassortment events, various mutations and possibly interspecies transmission account for the enormous diversity of genotype P[6] strains in Africa. These findings highlight the need for continued global surveillance of rotavirus diversity. Copyright © 2018 Elsevier B.V. All rights reserved.
The primary structures of two yeast enolase genes. Homology between the 5' noncoding flanking regions of yeast enolase and glyceraldehyde-3-phosphate dehydrogenase genes.

PubMed

Holland, M J; Holland, J P; Thill, G P; Jackson, K A

1981-02-10

Segments of yeast genomic DNA containing two enolase structural genes have been isolated by subculture cloning procedures using a cDNA hybridization probe synthesized from purified yeast enolase mRNA. Based on restriction endonuclease and transcriptional maps of these two segments of yeast DNA, each hybrid plasmid contains a region of extensive nucleotide sequence homology which forms hybrids with the cDNA probe. The DNA sequences which flank this homologous region in the two hybrid plasmids are nonhomologous indicating that these sequences are nontandemly repeated in the yeast genome. The complete nucleotide sequence of the coding as well as the flanking noncoding regions of these genes has been determined. The amino acid sequence predicted from one reading frame of both structural genes is extremely similar to that determined for yeast enolase (Chin, C. C. Q., Brewer, J. M., Eckard, E., and Wold, F. (1981) J. Biol. Chem. 256, 1370-1376), confirming that these isolated structural genes encode yeast enolase. The nucleotide sequences of the coding regions of the genes are approximately 95% homologous, and neither gene contains an intervening sequence. Codon utilization in the enolase genes follows the same biased pattern previously described for two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes (Holland, J. P., and Holland, M. J. (1980) J. Biol. Chem. 255, 2596-2605). DNA blotting analysis confirmed that the isolated segments of yeast DNA are colinear with yeast genomic DNA and that there are two nontandemly repeated enolase genes per haploid yeast genome. The noncoding portions of the two enolase genes adjacent to the initiation and termination codons are approximately 70% homologous and contain sequences thought to be involved in the synthesis and processing messenger RNA. 
Finally, there are regions of extensive homology between the two enolase structural genes and two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes within the 5' noncoding portions of these glycolytic genes.
Development of simple sequence repeat (SSR) markers from a genome survey of Chinese bayberry (Myrica rubra)

PubMed Central

2012-01-01

Background Chinese bayberry (Myrica rubra Sieb. and Zucc.) is a subtropical evergreen tree originating in China. It has been cultivated in southern China for several thousand years, and annual production has reached 1.1 million tons. The taste and high level of health promoting characters identified in the fruit in recent years has stimulated its extension in China and introduction to Australia. A limited number of co-dominant markers have been developed and applied in genetic diversity and identity studies. Here we report, for the first time, a survey of whole genome shotgun data to develop a large number of simple sequence repeat (SSR) markers to analyse the genetic diversity of the common cultivated Chinese bayberry and the relationship with three other Myrica species. Results The whole genome shotgun survey of Chinese bayberry produced 9.01Gb of sequence data, about 26x coverage of the estimated genome size of 323 Mb. The genome sequences were highly heterozygous, but with little duplication. From the initial assembled scaffold covering 255 Mb sequence data, 28,602 SSRs (≥5 repeats) were identified. Dinucleotide was the most common repeat motif with a frequency of 84.73%, followed by 13.78% trinucleotide, 1.34% tetranucleotide, 0.12% pentanucleotide and 0.04% hexanucleotide. From 600 primer pairs, 186 polymorphic SSRs were developed. Of these, 158 were used to screen 29 Chinese bayberry accessions and three other Myrica species: 91.14%, 89.87% and 46.84% SSRs could be used in Myrica adenophora, Myrica nana and Myrica cerifera, respectively. The UPGMA dendrogram tree showed that cultivated Myrica rubra is closely related to Myrica adenophora and Myrica nana, originating in southwest China, and very distantly related to Myrica cerifera, originating in America. These markers can be used in the construction of a linkage map and for genetic diversity studies in Myrica species. 
Conclusion Myrica rubra has a small genome of about 323 Mb with a high level of heterozygosity. A large number of SSRs were identified, and 158 polymorphic SSR markers developed, 91% of which can be transferred to other Myrica species. PMID:22621340
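The SSR identification step described above (di- to hexanucleotide motifs with at least five repeats) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name `find_ssrs` and its parameters are hypothetical, and only perfect (uninterrupted) tandem repeats are reported.

```python
import re


def find_ssrs(seq, min_repeats=5, max_motif=6):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem
    repeats with motif lengths 2..max_motif and at least min_repeats copies."""
    seq = seq.upper()
    hits = []
    for k in range(2, max_motif + 1):
        # a k-mer followed by at least (min_repeats - 1) further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # skip motifs that are repeats of a shorter unit (e.g. "ATAT", "AA")
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GG" + "AT" * 6 + "CC" + "GCA" * 5 + "T"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

Counting hits per motif length over an assembly would give motif-class frequencies of the kind reported in the abstract; dedicated SSR tools additionally handle compound and imperfect repeats.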
Seismic stratigraphic characteristics of upper Louisiana continental slope: an area east of Green Canyon

USGS Publications Warehouse

Bouma, Arnold H.; Feeley, Mary H.; Kindinger, Jack G.; Stelting, Charles E.; Hilde, Thomas W.C.

1981-01-01

A high-resolution seismic reflection survey was conducted in a small area of the upper Louisiana Continental Slope known as Green Canyon Area. This area includes tracts 427, 428, 471, 472, 515, and 516, that will be offered for sale in March 1982 as part of Lease Sale 67. The sea floor of this region is slightly hummocky and is underlain by salt diapirs that are mantled by early Tertiary shale. Most of the shale is overlain by younger Tertiary and Quaternary deposits, although locally some of the shale protrudes the sea floor. Because of proximity to older Mississippi River sources, the sediments are thick. The sediment cover shows an abundance of geologic phenomena such as horsts, grabens, growth faults, normal faults, and consolidation faults, zones with distinct and indistinct parallel reflections, semi-transparent zones, distorted zones, and angular unconformities. The major feature of this region is a N-S linear zone of uplifted and intruded sedimentary deposits formed due to diapiric intrusion. Small scale graben development over the crest of the structure can be attributed to extension and collapse. Large scale undulations of reflections well off the flanks of the uplifted structure suggest sediment creep and slumping. Dipping of parallel reflections show block faulting and tilting. Air gun (5 and 40 cubic inch) records reveal at least five major sequences that show masked onlap and slumping in their lower parts grading into more distinct parallel reflections in their upper parts. Such sequences can be related to local uplift and sea level changes. Minisparker records of this area show similar sequences but on a smaller scale. The distinct parallel reflections often onlap the diapir flanks. The highly reflective parts of these sequences may represent turbidite-type deposition, possibly at times of lower sea level. 
The acoustically more transparent parts of each sequence may represent deposits containing primarily hemipelagic and pelagic sediment. A complex ridge system is present along the west side of the area and distinct parallel reflections onlap onto this structure primarily from the east. Much of this deposition may be ascribed to sedimentation within a submarine canyon whose position is controlled by this ridge.
Local alignment of two-base encoded DNA sequence

PubMed Central

Homer, Nils; Merriman, Barry; Nelson, Stanley F

2009-01-01

Background DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
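To make concrete why decoding and alignment are coupled, here is a minimal sketch of two-base encoding itself. It is not the authors' alignment algorithm. It assumes the commonly used color-space convention in which each color is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3); the function names are hypothetical.

```python
BASES = "ACGT"  # assumed 2-bit codes: A=0, C=1, G=2, T=3


def encode(seq):
    """Return (first_base, colors); each color is the XOR of adjacent base codes."""
    codes = [BASES.index(b) for b in seq]
    return seq[0], [a ^ b for a, b in zip(codes, codes[1:])]


def decode(first_base, colors):
    """Decode colors back into bases, starting from the known first base."""
    code = BASES.index(first_base)
    out = [first_base]
    for c in colors:
        code ^= c  # each color determines the next base from the previous one
        out.append(BASES[code])
    return "".join(out)


if __name__ == "__main__":
    first, colors = encode("ACGGTCA")
    print(colors, decode(first, colors))

    # A single color measurement error changes every base decoded after it.
    corrupted = list(colors)
    corrupted[2] ^= 1
    print(decode(first, corrupted))
```

Because one miscalled color alters all downstream bases under naive decoding, decoding first and aligning afterwards handles errors poorly; this is the motivation the abstract gives for performing both steps simultaneously in the dynamic program.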
On the nature and dynamics of the seismogenetic systems of North California, USA: An analysis based on Non-Extensive Statistical Physics

NASA Astrophysics Data System (ADS)

Efstathiou, Angeliki; Tzanis, Andreas; Vallianatos, Filippos

2017-09-01

We examine the nature of the seismogenetic system in North California, USA, by searching for evidence of complexity and non-extensivity in the earthquake record. We attempt to determine whether earthquakes are generated by a self-excited Poisson process, in which case they obey Boltzmann-Gibbs thermodynamics, or by a Critical process, in which long-range interactions in non-equilibrium states are expected (correlation) and the thermodynamics deviate from the Boltzmann-Gibbs formalism. Emphasis is given to background seismicity since it is generally agreed that aftershock sequences comprise correlated sets. We use the complete and homogeneous earthquake catalogue published by the North California Earthquake Data Centre, in which aftershocks are either included, or have been removed by a stochastic declustering procedure. We examine multivariate cumulative frequency distributions of earthquake magnitudes, interevent time and interevent distance in the context of Non-Extensive Statistical Physics, which is a generalization of extensive Boltzmann-Gibbs thermodynamics to non-equilibrating (non-extensive) systems. Our results indicate that the seismogenetic systems of North California are generally sub-extensive complex and non-Poissonian. The background seismicity exhibits long-range interaction as evidenced by the overall increase of correlation observed by declustering the earthquake catalogues, as well as by the high correlation observed for earthquakes separated by long interevent distances. It is also important to emphasize that two subsystems with rather different properties appear to exist. The correlation observed along the Sierra Nevada Range - Walker Lane is quasi-stationary and indicates a Self-Organized Critical fault system. 
Conversely, the north segment of the San Andreas Fault exhibits changes in the level of correlation with reference to the large Loma Prieta event of 1989 and thus has attributes of Critical Point behaviour albeit without acceleration of seismic release rates. SOC appears to be a likely explanation of complexity mechanisms but since there are other ways by which complexity may emerge, additional work is required before assertive conclusions can be drawn.
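For readers unfamiliar with Non-Extensive Statistical Physics, the generalization can be illustrated with the q-exponential function that replaces the ordinary exponential in Tsallis statistics. The abstract does not give the authors' formulas, so this is only a sketch of the standard definition, exp_q(x) = [1 + (1 - q) x]^(1/(1 - q)), which reduces to exp(x) as the entropic index q approaches 1 (the Boltzmann-Gibbs limit). The function name `q_exp` is hypothetical.

```python
import math


def q_exp(x, q):
    """Tsallis q-exponential; equals math.exp(x) in the limit q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0  # standard cutoff for q < 1
        raise ValueError("q-exponential is undefined here for q > 1")
    return base ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    # For q > 1 the decay is a power law, slower than exponential:
    for x in (-1.0, -5.0, -20.0):
        print(x, math.exp(x), q_exp(x, 1.5))
```

The heavier-than-exponential tail for q > 1 is what distinguishes a correlated (non-Poissonian) process from a Poissonian one in distributions of quantities such as interevent time.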
Identification of a novel rhabdovirus in Spodoptera frugiperda cell lines.

PubMed

Ma, Hailun; Galvin, Teresa A; Glasner, Dustin R; Shaheduzzaman, Syed; Khan, Arifa S

2014-06-01

The Sf9 cell line, derived from Spodoptera frugiperda, is used as a cell substrate for biological products, and no viruses have been reported in this cell line after extensive testing. We used degenerate PCR assays and massively parallel sequencing (MPS) to identify a novel RNA virus belonging to the order Mononegavirales in Sf9 cells. Sequence analysis of the assembled virus genome showed the presence of five open reading frames (ORFs) corresponding to the genes for the N, P, M, G, and L proteins in other rhabdoviruses and an unknown ORF of 111 amino acids located between the G- and L-protein genes. BLAST searches indicated that the S. frugiperda rhabdovirus (Sf-rhabdovirus) was related in a limited region of the L-protein gene to Taastrup virus, a newly discovered member of the Mononegavirales from a leafhopper (Hemiptera), and also to plant rhabdoviruses, particularly in the genus Cytorhabdovirus. Phylogenetic analysis of sequences in the L-protein gene indicated that Sf-rhabdovirus is a novel virus that branched with Taastrup virus. Rhabdovirus morphology was confirmed by transmission electron microscopy of filtered supernatant samples from Sf9 cells. Infectivity studies indicated potential transient infection by Sf-rhabdovirus in other insect cell lines, but there was no evidence of entry or virus replication in human cell lines. Sf-rhabdovirus sequences were also found in the Sf21 parental cell line of Sf9 cells but not in other insect cell lines, such as BT1-TN-5B1-4 (Tn5; High Five) cells and Schneider's Drosophila line 2 [D.Mel.(2); SL2] cells, indicating a species-specific infection. The results indicate that conventional methods may be complemented by state-of-the-art technologies with extensive bioinformatics analysis for identification of novel viruses. The Spodoptera frugiperda Sf9 cell line is used as a cell substrate for the development and manufacture of biological products. 
Extensive testing has not previously identified any viruses in this cell line. This paper reports on the identification and characterization of a novel rhabdovirus in Sf9 cells. This was accomplished through the use of next-generation sequencing platforms, de novo assembly tools, and extensive bioinformatics analysis. Rhabdovirus identification was further confirmed by transmission electron microscopy. Infectivity studies showed the lack of replication of Sf-rhabdovirus in human cell lines. The overall study highlights the use of a combinatorial testing approach including conventional methods and new technologies for evaluation of cell lines for unexpected viruses and use of comprehensive bioinformatics strategies for obtaining confident next-generation sequencing results. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

Development of a Multiplex Single Base Extension Assay for Mitochondrial DNA Haplogroup Typing

PubMed Central

Nelson, Tahnee M.; Just, Rebecca S.; Loreille, Odile; Schanfield, Moses S.; Podini, Daniele

2007-01-01

Aim To provide a screening tool to reduce time and sample consumption when attempting mtDNA haplogroup typing. Methods A single base primer extension assay was developed to enable typing, in a single reaction, of twelve mtDNA haplogroup specific polymorphisms. For validation purposes a total of 147 samples were tested including 73 samples successfully haplogroup typed using mtDNA control region (CR) sequence data, 21 samples inconclusively haplogroup typed by CR data, 20 samples previously haplogroup typed using restriction fragment length polymorphism (RFLP) analysis, and 31 samples of known ancestral origin without previous haplogroup typing. Additionally, two highly degraded human bones embalmed and buried in the early 1950s were analyzed using the single nucleotide polymorphisms (SNP) multiplex. Results When the SNP multiplex was used to type the 96 previously CR sequenced specimens, an increase in haplogroup or macrohaplogroup assignment relative to conventional CR sequence analysis was observed. The single base extension assay was also successfully used to assign a haplogroup to decades-old, embalmed skeletal remains dating to World War II. Conclusion The SNP multiplex was successfully used to obtain haplogroup status of highly degraded human bones, and demonstrated the ability to eliminate possible contributors. The SNP multiplex provides a low-cost, high throughput method for typing of mtDNA haplogroups A, B, C, D, E, F, G, H, L1/L2, L3, M, and N that could be useful for screening purposes for human identification efforts and anthropological studies. PMID:17696300
Analysis of short tandem repeat polymorphisms using infrared fluorescence with M13 tailed primers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oetting, W.S.; Wiesner, G.; Laken, S.

Short tandem repeat polymorphisms (STRPs) are becoming increasingly important as markers for linkage analysis due to their large numbers in the human genome and their high degree of polymorphism. Fluorescence-based detection of the STRP pattern using the LI-COR model 4000S automated DNA sequencer eliminates the need for radioactivity and produces a digitized image that can be used for the analysis of the polymorphisms. In an effort to reduce the cost of STRP analysis, we have synthesized primers with a 19 bp extension complementary to the sequence of the M13 primer on the 5' end of one of the two primers used in the amplification of the STRP instead of using primers with direct conjugation of the infrared fluorescent dye. Up to 5 primer pairs can be multiplexed together with the M13 primer-dye conjugate as the sole primer conjugated to the fluorescent dye. Comparisons between primers that have been directly conjugated to the fluor with those having the M13 sequence extension show no difference in the ability to determine the STRP pattern. At present, the entire Weber 4A set of STRP markers is available with the M13 5' extension. We are currently using this technique for linkage analysis of familial breast cancer and asthma. The combination of STRP analysis using fluorescence detection will allow this technique to be fully automated for allele scoring and linkage analysis.
A proteomic analysis of the chromoplasts isolated from sweet orange fruits [Citrus sinensis (L.) Osbeck]

PubMed Central

Zeng, Yunliu; Pan, Zhiyong; Ding, Yuduan; Zhu, Andan; Cao, Hongbo; Xu, Qiang; Deng, Xiuxin

2011-01-01

Here, a comprehensive proteomic analysis of the chromoplasts purified from sweet orange using Nycodenz density gradient centrifugation is reported. A GeLC-MS/MS shotgun approach was used to identify the proteins of pooled chromoplast samples. A total of 493 proteins were identified from purified chromoplasts, of which 418 are putative plastid proteins based on in silico sequence homology and functional analyses. Based on the predicted functions of these identified plastid proteins, a large proportion (∼60%) of the chromoplast proteome of sweet orange is constituted by proteins involved in carbohydrate metabolism, amino acid/protein synthesis, and secondary metabolism. Of note, HDS (hydroxymethylbutenyl 4-diphosphate synthase), PAP (plastid-lipid-associated protein), and psHSPs (plastid small heat shock proteins) involved in the synthesis or storage of carotenoid and stress response are among the most abundant proteins identified. A comparison of chromoplast proteomes between sweet orange and tomato suggested a high level of conservation in a broad range of metabolic pathways. However, the citrus chromoplast was characterized by more extensive carotenoid synthesis, extensive amino acid synthesis without nitrogen assimilation, and evidence for lipid metabolism concerning jasmonic acid synthesis. In conclusion, this study provides an insight into the major metabolic pathways as well as some unique characteristics of the sweet orange chromoplasts at the whole proteome level. PMID:21841170
Complete genome sequence of Streptococcus mutans GS-5, a serotype c strain.

PubMed

Biswas, Saswati; Biswas, Indranil

2012-09-01

Streptococcus mutans, a principal causative agent of dental caries, is considered to be the most cariogenic among all oral streptococci. Of the four S. mutans serotypes (c, e, f, and k), serotype c strains predominate in the oral cavity. Here, we present the complete genome sequence of S. mutans GS-5, a serotype c strain originally isolated from human carious lesions, which is extensively used as a laboratory strain worldwide.
Draft Genome Sequences of Pseudomonas aeruginosa Isolates from Wounded Military Personnel.

PubMed

Arivett, Brock A; Ream, Dave C; Fiester, Steven E; Kidane, Destaalem; Actis, Luis A

2016-08-11

Pseudomonas aeruginosa, a Gram-negative bacterium that causes severe hospital-acquired infections, is grouped as an ESKAPE (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species) pathogen because of its extensive drug resistance phenotypes and effects on human health worldwide. Five multidrug resistant P. aeruginosa strains isolated from wounded military personnel were sequenced and annotated in this work. Copyright © 2016 Arivett et al.
Draft Genome Sequences of Acinetobacter baumannii Isolates from Wounded Military Personnel.

PubMed

Arivett, Brock A; Ream, Dave C; Fiester, Steven E; Kidane, Destaalem; Actis, Luis A

2016-08-25

Acinetobacter baumannii is a Gram-negative bacterium capable of causing hospital-acquired infections that has been grouped with Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species as ESKAPE pathogens because of their extensive drug resistance phenotypes and increasing risk to human health. Twenty-four multidrug-resistant A. baumannii strains isolated from wounded military personnel were sequenced and annotated. Copyright © 2016 Arivett et al.
Deletion of L4 domains reveals insights into the importance of ribosomal protein extensions in eukaryotic ribosome assembly.

PubMed

Gamalinda, Michael; Woolford, John L

2014-11-01

Numerous ribosomal proteins have a striking bipartite architecture: a globular body positioned on the ribosomal exterior and an internal loop buried deep into the rRNA core. In eukaryotes, a significant number of conserved r-proteins have evolved extra amino- or carboxy-terminal tail sequences, which thread across the solvent-exposed surface. The biological importance of these extended domains remains to be established. In this study, we have investigated the universally conserved internal loop and the eukaryote-specific extensions of yeast L4. We show that in contrast to findings with bacterial L4, deleting the internal loop of yeast L4 causes severely impaired growth and reduced levels of large ribosomal subunits. We further report that while depleting the entire L4 protein blocks early assembly steps in yeast, deletion of only its extended internal loop affects later steps in assembly, revealing a second role for L4 during ribosome biogenesis. Surprisingly, deletion of the entire eukaryote-specific carboxy-terminal tail of L4 has no effect on viability, production of 60S subunits, or translation. These unexpected observations provide impetus to further investigate the functions of ribosomal protein extensions, especially eukaryote-specific examples, in ribosome assembly and function. © 2014 Gamalinda and Woolford; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Encoding and choice in the task span paradigm.

PubMed

Reiman, Kaitlin M; Weaver, Starla M; Arrington, Catherine M

2015-03-01

Cognitive control during sequences of planned behaviors requires both plan-level processes such as generating, maintaining, and monitoring the plan, as well as task-level processes such as selecting, establishing and implementing specific task sets. The task span paradigm (Logan in J Exp Psychol Gen 133:218-236, 2004) combines two common cognitive control paradigms, task switching and working memory span, to investigate the integration of plan-level and task-level processes during control of sequential behavior. The current study expands past task span research to include measures of encoding processes and choice behavior with volitional sequence generation, using the standard task span as well as a novel voluntary task span paradigm. In two experiments, we consider how sequence complexity, defined separately for plan-level and task-level complexity, influences sequence encoding (Experiment 1), sequence choice (Experiment 2), sequence memory, and task performance of planned sequences of action. Results indicate that participants were sensitive to sequence complexity, but that different aspects of behavior are most strongly influenced by different types of complexity. Hierarchical complexity at the plan level best predicts voluntary sequence generation and memory; while switch frequency at the task level best predicts encoding of externally defined sequences and task performance. Furthermore, performance RTs were similar for externally and internally defined plans, whereas memory was improved for internally defined sequences. Finally, participants demonstrated a significant sequence choice bias in the voluntary task span. Consistent with past research on choice behavior, volitional selection of plans was markedly influenced by both the ease of memory and performance.
Archaebacterial rhodopsin sequences: Implications for evolution

NASA Technical Reports Server (NTRS)

Lanyi, J. K.

1991-01-01

It was proposed over 10 years ago that the archaebacteria represent a separate kingdom which diverged very early from the eubacteria and eukaryotes. It follows that investigations of archaebacterial characteristics might reveal features of early evolution. So far, two genes, one for bacteriorhodopsin and another for halorhodopsin, both from Halobacterium halobium, have been sequenced. We cloned and sequenced the gene coding for the polypeptide of another one of these rhodopsins, a halorhodopsin in Natronobacterium pharaonis. Peptide sequencing of cyanogen bromide fragments, and immuno-reactions of the protein and synthetic peptides derived from the C-terminal gene sequence, confirmed that the open reading frame was the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences of this gene, as well as those of other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences were calculated. These indicate very considerable evolutionary distance between each pair of genes, even in the same organism. In spite of this, the three protein sequences show extensive similarities, indicating strong selective pressures.
Application of discrete Fourier inter-coefficient difference for assessing genetic sequence similarity.

PubMed

King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach

2014-01-01

Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
Raising the continental crust

NASA Astrophysics Data System (ADS)

Campbell, Ian H.; Davies, D. Rhodri

2017-02-01

The changes that occur at the boundary between the Archean and Proterozoic eons are arguably the most fundamental to affect the evolution of Earth's continental crust. The principal component of Archean continental crust is Granite-Greenstone Terranes (GGTs), with granites always dominant. The greenstones consist of a lower sequence of submarine komatiites and basalts, which erupted onto a pre-existing Tonalite-Trondhjemite-Granodiorite (TTG) crust. These basaltic rocks pass upwards initially into evolved volcanic rocks, such as andesites and dacites and, subsequently, into reworked felsic pyroclastic material and immature sediments. This transition coincides with widespread emplacement of granitoids, which stabilised (cratonised) the continental crust. Proterozoic supra-crustal rocks, on the other hand, are dominated by extensive flat-lying platform sequences of mature sediments, which were deposited on stable cratonic basements, with basaltic rocks appreciably less abundant. The siliceous TTGs cannot be produced by direct melting of the mantle, with most hypotheses for their origin requiring them to be underlain by a complementary dense amphibole-garnet-pyroxenite root, which we suggest acted as ballast to the early continents. Ubiquitous continental pillow basalts in Archean lower greenstone sequences require the early continental crust to have been submarine, whereas the appearance of abundant clastic sediments, at higher stratigraphic levels, shows that it had emerged above sea level by the time of sedimentation. We hypothesise that the production of komatiites and associated basalts, the rise of the continental crust, widespread melting of the continental crust, the onset of sedimentation and subsequent cratonisation form a continuum that is the direct result of removal of the continent's dense amphibole-garnet-pyroxenite roots, triggered at a regional scale by the arrival of a mantle plume at the base of the lithosphere. 
Our idealised calculations suggest that the removal of 40 km of the amphibole-garnet-pyroxenite root would have raised the average level of the continental crust by ∼3 km. The emergence of the continental crust was an essential precursor to the rise of oxygen, which started some 200 Myr later.
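The abstract does not give the details of the idealised calculation, so the following is only a minimal sketch of how such an estimate could be made with a simple Airy-type isostatic balance. It assumes that a dense root of given thickness is replaced by ordinary mantle and that the column re-equilibrates at a fixed compensation depth. The function name and the density values (3550 kg/m³ for the root, 3300 kg/m³ for the mantle) are illustrative assumptions, not figures from the paper.

```python
def uplift_from_root_removal(root_thickness_km, root_density,
                             mantle_density, load_density=0.0):
    """Isostatic uplift (km) when a dense root is replaced by mantle.

    Balancing column mass above a fixed compensation depth gives
        uplift = h * (rho_root - rho_mantle) / (rho_mantle - rho_load)
    where rho_load is the density of whatever the rising surface
    displaces (0 for air, about 1030 kg/m^3 for sea water).
    All densities are in kg/m^3.
    """
    if not (root_density > mantle_density > load_density >= 0.0):
        raise ValueError("expected root > mantle > load >= 0")
    return (root_thickness_km * (root_density - mantle_density)
            / (mantle_density - load_density))


# Assumed, illustrative densities (not taken from the paper).
uplift_km = uplift_from_root_removal(40.0, 3550.0, 3300.0)
print(f"uplift for a 40 km root: {uplift_km:.2f} km")
```

With these assumed densities the root is about 7.5% denser than the mantle that replaces it, which gives an uplift of roughly 3 km for a 40 km root, the same order as the figure quoted in the abstract. A water-loaded surface (nonzero `load_density`) would rise somewhat more.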
RNA sequencing reveals sexually dimorphic gene expression before gonadal differentiation in chicken and allows comprehensive annotation of the W-chromosome

PubMed Central

2013-01-01

Background Birds have a ZZ male: ZW female sex chromosome system and while the Z-linked DMRT1 gene is necessary for testis development, the exact mechanism of sex determination in birds remains unsolved. This is partly due to the poor annotation of the W chromosome, which is speculated to carry a female determinant. Few genes have been mapped to the W and little is known of their expression. Results We used RNA-seq to produce a comprehensive profile of gene expression in chicken blastoderms and embryonic gonads prior to sexual differentiation. We found robust sexually dimorphic gene expression in both tissues pre-dating gonadogenesis, including sex-linked and autosomal genes. This supports the hypothesis that sexual differentiation at the molecular level is at least partly cell autonomous in birds. Different sets of genes were sexually dimorphic in the two tissues, indicating that molecular sexual differentiation is tissue specific. Further analyses allowed the assembly of full-length transcripts for 26 W chromosome genes, providing a view of the W transcriptome in embryonic tissues. This is the first extensive analysis of W-linked genes and their expression profiles in early avian embryos. Conclusion Sexual differentiation at the molecular level is established in chicken early in embryogenesis, before gonadal sex differentiation. We find that the W chromosome is more transcriptionally active than previously thought, expand the number of known genes to 26 and present complete coding sequences for these W genes. This includes two novel W-linked sequences and three small RNAs reassigned to the W from the Un_Random chromosome. PMID:23531366
Napoleon Bonaparte and the fate of an Amazonian rat: new data on the taxonomy of Mesomys hispidus (Rodentia: Echimyidae).

PubMed

Orlando, Ludovic; Mauffrey, Jean-François; Cuisin, Jacques; Patton, James L; Hänni, Catherine; Catzeflis, François

2003-04-01

The spiny rat Mesomys hispidus is one of many South American rodents that lack adequate taxonomic definition. The few sampled populations of this broadly distributed trans-Amazonian arboreal rat have come from widely separated regions and are typically highly divergent. The holotype was described in 1817 by A.-G. Desmarest, after Napoleon's army brought it to Paris following the plunder of Lisbon in 1808; however, the locality of origin has remained unknown. Here we examine the taxonomic status of this species by direct comparison of 50 extant individuals with the holotype at the morphometric and genetic levels, the latter based on 331 bp of the mitochondrial cytochrome b gene retrieved from a small skin fragment of the holotype with ancient DNA technology. Extensive sequence divergence is present among samples of M. hispidus collected from throughout its range, from French Guiana across Amazonia to Bolivia and Peru, with at least seven mitochondrial clades recognized (average divergence of 7.7% Kimura 2-parameter distance). Sequence from the holotype is, however, only weakly divergent from those of recent samples from French Guiana. Moreover, the holotype clusters with greater than 99% posterior probability with samples from this part of Amazonia in a discriminant analysis based on 22 cranial and dental measurements. Thus, we suggest that the holotype was originally obtained in eastern Amazonia north of the Amazon River, most likely in the Brazilian state of Amapá. Despite the high level of sequence diversity and marked morphological differences in size across the range of M. hispidus, we continue to regard this assemblage as a single species until additional samples and analyses suggest otherwise. Copyright 2002 Elsevier Science (USA)
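The divergence figure above is reported as a Kimura 2-parameter (K2P) distance. As a minimal sketch of what that measure computes, the function below applies the standard K2P formula, d = -½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name and the example sequences are hypothetical; the authors' actual software and settings are not stated in the abstract.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped. Raises ValueError if the sequences differ in length, share
    no comparable sites, or are too divergent for the logarithm to be
    defined.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical 10 bp alignment with a single A<->G transition.
print(round(k2p_distance("ACGTACGTAC", "ACGTGCGTAC"), 4))
```

Because the formula corrects for multiple substitutions at the same site, the K2P distance is slightly larger than the raw proportion of differing sites (here about 0.11 versus 0.10).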
CMS: A Web-Based System for Visualization and Analysis of Genome-Wide Methylation Data of Human Cancers

PubMed Central

Huang, Yi-Wen; Roa, Juan C.; Goodfellow, Paul J.; Kizer, E. Lynette; Huang, Tim H. M.; Chen, Yidong

2013-01-01

Background DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it is now possible to efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Methodology/Principal Findings Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimens) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, and interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. Conclusions/Significance CMS provides visualization and analytic functions for cancer methylome datasets. 
A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/. PMID:23630576
Assessment of snake DNA barcodes based on mitochondrial COI and Cytb genes revealed multiple putative cryptic species in Thailand.

PubMed

Laopichienpong, Nararat; Muangmai, Narongrit; Supikamolseni, Arrjaree; Twilprawat, Panupon; Chanhome, Lawan; Suntrarachun, Sunutcha; Peyachoknagul, Surin; Srikulnath, Kornsorn

2016-12-15

DNA barcodes of mitochondrial cytochrome c oxidase I (COI), cytochrome b (Cytb) genes, and their combined data sets were constructed from 35 snake species in Thailand. No barcoding gap was detected in either of the two genes from the observed intra- and interspecific sequence divergences. Intra- and interspecific sequence divergences of the COI gene differed 14 times, with barcode cut-off scores ranging over 2%-4% for threshold values differentiated among most of the different species; the Cytb gene differed 6 times with cut-off scores ranging over 2%-6%. Thirty-five specific nucleotide mutations were also found at interspecific level in the COI gene, identifying 18 snake species, but no specific nucleotide mutation was observed for Cytb in any single species. This suggests that COI barcoding was a better marker than Cytb. Phylogenetic clustering analysis indicated that most species were represented by monophyletic clusters, suggesting that these snake species could be clearly differentiated using COI barcodes. However, the two-marker combination of both COI and Cytb was more effective, differentiating snake species by over 2%-4%, and reducing species numbers in the overlap value between intra- and interspecific divergences. Three species delimitation algorithms (general mixed Yule-coalescent, automatic barcoding gap detection, and statistical parsimony network analysis) were extensively applied to a wide range of snakes based on both barcodes. This revealed cryptic diversity for eleven snake species in Thailand. In addition, eleven accessions from the database previously grouped under the same species were represented at different species level, suggesting either high genetic diversity, or the misidentification of these sequences in the database as a consequence of cryptic species. Copyright © 2016 Elsevier B.V. All rights reserved.
Extending SEQenv: a taxa-centric approach to environmental annotations of 16S rDNA sequences

PubMed Central

Jeffries, Thomas C.; Ijaz, Umer Z.; Hamonts, Kelly

2017-01-01

Understanding how the environment selects a given taxon and the diversity patterns that emerge as a result of environmental filtering can dramatically improve our ability to analyse any environment in depth as well as advancing our knowledge on how the response of different taxa can impact each other and ecosystem functions. Most of the work investigating microbial biogeography has been site-specific, and logical environmental factors, rather than geographical location, may be more influential on microbial diversity. SEQenv, a novel pipeline aiming to provide environmental annotations of sequences, emerged to provide a consistent description of the environmental niches using the ENVO ontology. While the pipeline provides a list of environmental terms on the basis of sample datasets and, therefore, the annotations obtained are at the dataset level, it lacks a taxa-centric approach to environmental annotation. The work here describes an extension developed to enhance the SEQenv pipeline, which provided the means to directly generate environmental annotations for taxa under different contexts. 16S rDNA amplicon datasets belonging to distinct biomes were selected to illustrate the applicability of the extended SEQenv pipeline. A literature survey of the results demonstrates the immense importance of sequence-level environmental annotations by illustrating the distribution of both taxa across environments as well as the various environmental sources of a specific taxon. Significantly enhancing the SEQenv pipeline in the process, this information would be valuable to any biologist seeking to understand the various taxa present in the habitat and the environment they originated from, enabling a more thorough analysis of which lineages are abundant in certain habitats and the recovery of patterns in taxon distribution across different habitats and environmental gradients. PMID:29038749
CMS: a web-based system for visualization and analysis of genome-wide methylation data of human cancers.

PubMed

Gu, Fei; Doderer, Mark S; Huang, Yi-Wen; Roa, Juan C; Goodfellow, Paul J; Kizer, E Lynette; Huang, Tim H M; Chen, Yidong

2013-01-01

DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it is now possible to efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimens) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, and interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. 
CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/.
Molecular Characterization and Expression Analysis of Chloroplast Protein Import Components in Tomato (Solanum lycopersicum)

PubMed Central

Yan, Jianmin; Campbell, James H.; Glick, Bernard R.; Smith, Matthew D.; Liang, Yan

2014-01-01

The translocon at the outer envelope membrane of chloroplasts (Toc) mediates the recognition and initial import into the organelle of thousands of nucleus-encoded proteins. These proteins are translated in the cytosol as precursor proteins with cleavable amino-terminal targeting sequences called transit peptides. The majority of the known Toc components that mediate chloroplast protein import were originally identified in pea, and more recently have been studied most extensively in Arabidopsis. With the completion of the tomato genome sequencing project, it is now possible to identify putative homologues of the chloroplast import components in tomato. In the work reported here, the Toc GTPase cDNAs from tomato were identified, cloned and analyzed. The analysis revealed that there are four Toc159 homologues (slToc159-1, -2, -3 and -4) and two Toc34 homologues (slToc34-1 and -2) in tomato, and it was shown that tomato Toc159 and Toc34 homologues share high sequence similarity with the comparable import apparatus components from Arabidopsis and pea. Thus, tomato is a valid model for further study of this system. The expression level of Toc complex components was also investigated in different tissues during tomato development. The two tomato Toc34 homologues are expressed at higher levels in non-photosynthetic tissues, whereas the expression of two tomato Toc159 homologues, slToc159-1 and slToc159-4, was higher in photosynthetic tissues; the expression pattern of slToc159-2 was not significantly different in photosynthetic and non-photosynthetic tissues, and slToc159-3 expression was limited to a few select tissues. PMID:24751891
Global DNA methylation analysis reveals miR-214-3p contributes to cisplatin resistance in pediatric intracranial nongerminomatous malignant germ cell tumors.

PubMed

Hsieh, Tsung-Han; Liu, Yun-Ru; Chang, Ting-Yu; Liang, Muh-Lii; Chen, Hsin-Hung; Wang, Hsei-Wei; Yen, Yun; Wong, Tai-Tong

2018-03-27

Pediatric central nervous system germ cell tumors (CNSGCTs) are rare and heterogeneous neoplasms, which can be divided into germinomas and nongerminomatous germ cell tumors (NGGCTs). NGGCTs are further subdivided into mature teratomas and nongerminomatous malignant GCTs (NGMGCTs). Clinical outcomes suggest that NGMGCTs have poor prognosis and survival and that they require more extensive radiotherapy and adjuvant chemotherapy. However, the mechanisms underlying this difference are still unclear. DNA methylation alteration is generally acknowledged to cause therapeutic resistance in cancers. We hypothesized that the pediatric NGMGCTs exhibit a different genome-wide DNA methylation pattern, which is involved in the mechanism of its therapeutic resistance. We performed methylation and hydroxymethylation DNA immunoprecipitation sequencing, mRNA expression microarray, and small RNA sequencing (smRNA-seq) to determine methylation-regulated genes, including microRNAs (miRNAs). The expression levels of 97 genes and 8 miRNAs were correlated with promoter DNA methylation and hydroxymethylation status, such as the miR-199/-214 cluster, and treatment with DNA demethylating agent 5-aza-2'-deoxycytidine elevated its expression level. Furthermore, smRNA-seq analysis showed 27 novel miRNA candidates with differential expression between germinomas and NGMGCTs. Overexpression of miR-214-3p in NCCIT cells leads to reduced expression of the pro-apoptotic protein BCL2-like 11 and induces cisplatin resistance. We interrogated the differential DNA methylation patterns between germinomas and NGMGCTs and proposed a mechanism for chemoresistance in NGMGCTs. In addition, our sequencing data provide a roadmap for further pediatric CNSGCT research and potential targets for the development of new therapeutic strategies.
Analysis of noise-induced temporal correlations in neuronal spike sequences

NASA Astrophysics Data System (ADS)

Reinoso, José A.; Torrent, M. C.; Masoller, Cristina

2016-11-01

We investigate temporal correlations in sequences of noise-induced neuronal spikes, using a symbolic method of time-series analysis. We focus on the sequence of time-intervals between consecutive spikes (inter-spike-intervals, ISIs). The analysis method, known as ordinal analysis, transforms the ISI sequence into a sequence of ordinal patterns (OPs), which are defined in terms of the relative ordering of consecutive ISIs. The ISI sequences are obtained from extensive simulations of two neuron models (FitzHugh-Nagumo, FHN, and integrate-and-fire, IF), with correlated noise. We find that, as the noise strength increases, temporal order gradually emerges, revealed by the existence of more frequent ordinal patterns in the ISI sequence. While in the FHN model the most frequent OP depends on the noise strength, in the IF model it is independent of the noise strength. In both models, the correlation time of the noise affects the OP probabilities but does not modify the most probable pattern.
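The ordinal analysis described above can be sketched in a few lines: each window of consecutive inter-spike intervals is replaced by the permutation that sorts it, and the relative frequencies of those permutations are then compared. This is a minimal illustration only. The function names, the pattern length of 3, and the handling of ties are assumptions, since the abstract does not specify them.

```python
from collections import Counter
from itertools import permutations


def inter_spike_intervals(spike_times):
    """Differences between consecutive spike times (the ISI sequence)."""
    return [t2 - t1 for t1, t2 in zip(spike_times, spike_times[1:])]


def ordinal_pattern_probabilities(intervals, order=3):
    """Relative frequency of each ordinal pattern of length `order`.

    A pattern is the tuple of window positions sorted by increasing
    interval value, so (0, 1, 2) means three successively longer ISIs.
    Equal values are ordered by position (an assumed convention).
    """
    counts = Counter()
    for i in range(len(intervals) - order + 1):
        window = intervals[i:i + order]
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least `order` intervals")
    # Report every possible pattern, including those never observed.
    return {p: counts[p] / total for p in permutations(range(order))}


# Hypothetical spike times (arbitrary units), for illustration only.
spikes = [0.0, 1.0, 3.0, 6.0, 7.5, 8.0, 10.0, 13.0]
probs = ordinal_pattern_probabilities(inter_spike_intervals(spikes))
for pattern, prob in sorted(probs.items()):
    print(pattern, round(prob, 3))
```

For an uncorrelated ISI sequence all 3! = 6 patterns would be expected to occur with roughly equal probability 1/6, so departures from that uniform value are what signal temporal order.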

A proposal to rename the hyperthermophile Pyrococcus woesei as Pyrococcus furiosus subsp. woesei.

PubMed

Kanoksilapatham, Wirojne; González, Juan M; Maeder, Dennis L; DiRuggiero, Jocelyne; Robb, Frank T

2004-10-01

Pyrococcus species are hyperthermophilic members of the order Thermococcales, with optimal growth temperatures approaching 100 °C. All species grow heterotrophically and produce H2 or, in the presence of elemental sulfur (S⁰), H2S. Pyrococcus woesei and P. furiosus were isolated from marine sediments at the same Vulcano Island beach site and share many morphological and physiological characteristics. We report here that the rDNA operons of these strains have identical sequences, including their intergenic spacer regions and part of the 23S rRNA. Both species grow rapidly and produce H2 in the presence of 0.1% maltose and 10-100 µM sodium tungstate in S⁰-free medium. However, P. woesei shows more extensive autolysis than P. furiosus in the stationary phase. Pyrococcus furiosus and P. woesei share three closely related families of insertion sequences (ISs). A Southern blot performed with IS probes showed extensive colinearity between the genomes of P. woesei and P. furiosus. Cloning and sequencing of ISs that were in different contexts in P. woesei and P. furiosus revealed that the napA gene in P. woesei is disrupted by a type III IS element, whereas in P. furiosus, this gene is intact. A type I IS element, closely linked to the napA gene, was observed in the same context in both P. furiosus and P. woesei genomes. Our results suggest that the IS elements are implicated in genomic rearrangements and reshuffling in these closely related strains. We propose to rename P. woesei a subspecies of P. furiosus based on their identical rDNA operon sequences, many common IS elements that are shared genomic markers, and the observation that all P. woesei nucleotide sequences deposited in GenBank to date are >99% identical to P. furiosus sequences.
Molecular Population Genetics of the Alcohol Dehydrogenase Gene Region of DROSOPHILA MELANOGASTER

PubMed Central

Aquadro, Charles F.; Desse, Susan F.; Bland, Molly M.; Langley, Charles H.; Laurie-Ahlberg, Cathy C.

1986-01-01

Variation in the DNA restriction map of a 13-kb region of chromosome II including the alcohol dehydrogenase structural gene (Adh) was examined in Drosophila melanogaster from natural populations. Detailed analysis of 48 D. melanogaster lines representing four eastern United States populations revealed extensive DNA sequence variation due to base substitutions, insertions and deletions. Cloning of this region from several lines allowed characterization of length variation as due to unique sequence insertions or deletions [nine sizes; 21–200 base pairs (bp)] or transposable element insertions (several sizes, 340 bp to 10.2 kb, representing four different elements). Despite this extensive variation in sequences flanking the Adh gene, only one length polymorphism is clearly associated with altered Adh expression (a copia element approximately 250 bp 5' to the distal transcript start site). Nonetheless, the frequency spectra of transposable elements within and between Drosophila species suggest they are slightly deleterious. Strong nonrandom associations are observed among Adh region sequence variants, ADH allozyme (Fast vs. Slow), ADH enzyme activity and the chromosome inversion In(2L)t. Phylogenetic analysis of restriction map haplotypes suggests that the major twofold component of ADH activity variation (high vs. low, typical of Fast and Slow allozymes, respectively) is due to sequence variation tightly linked to and possibly distinct from that underlying the allozyme difference. The patterns of nucleotide and haplotype variation for Fast and Slow allozyme lines are consistent with the recent increase in frequency and spread of the Fast haplotype associated with high ADH activity. These data emphasize the important role of evolutionary history and strong nonrandom associations among tightly linked sequence variation as determinants of the patterns of variation observed in natural populations. PMID:3026893
Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

DOE PAGES

Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; ...

2017-08-08

Here, we present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.
Whole-gene CFTR sequencing combined with digital RT-PCR improves genetic diagnosis of cystic fibrosis.

PubMed

Straniero, Letizia; Soldà, Giulia; Costantino, Lucy; Seia, Manuela; Melotti, Paola; Colombo, Carla; Asselta, Rosanna; Duga, Stefano

2016-12-01

Despite extensive screening, 1-5% of cystic fibrosis (CF) patients lack a definite molecular diagnosis. Next-generation sequencing (NGS) is making genetic testing based on the identification of variants in extended genomic regions affordable. In this frame, we analyzed 23 CF patients and one carrier by whole-gene CFTR resequencing: 4 were previously characterized and served as controls; 17 were cases lacking a complete diagnosis after a full conventional CFTR screening; 3 were consecutive subjects referred to our centers, not previously submitted to any screening. We also included in the custom NGS design the coding portions of the SCNN1A, SCNN1B and SCNN1G genes, encoding the subunits of the sodium channel ENaC, which were found to be mutated in CF-like patients. Besides 2 novel SCNN1B missense mutations, we identified 22 previously-known CFTR mutations, including 2 large deletions (whose breakpoints were precisely mapped), and novel deep-intronic variants, whose role in splicing was excluded by ex-vivo analyses. Finally, for 2 patients, compound heterozygotes for a CFTR mutation and the intron-9 c.1210-34TG[11-12]T[5] allele (known to be associated with decreased CFTR mRNA levels), the molecular diagnosis was implemented by measuring the residual level of wild-type transcript by digital reverse transcription polymerase chain reaction performed on RNA extracted from nasal brushing.
Software Reviews.

ERIC Educational Resources Information Center

Science Software Quarterly, 1984

1984-01-01

Provides extensive reviews of computer software, examining documentation, ease of use, performance, error handling, special features, and system requirements. Includes statistics, problem-solving (TK Solver), label printing, database management, experimental psychology, Encyclopedia Britannica biology, and DNA-sequencing programs. A program for…
Structures in the transition zone of the northeast South China Sea: serpentinite dome vs mantle exhumation, or evidence of Mesozoic active subduction transferring to Cenozoic passive extension?

NASA Astrophysics Data System (ADS)

Sun, Z.; Zhou, D.

2013-12-01

Complete sedimentary sequences and weak erosion make the transition zone of the South China Sea the optimal place to study the entire evolutionary history of marginal sea basins, as well as the transition mechanism from active subduction to passive extension. 2D long-cable seismic profiles revealed that both the Baiyun and Liwan sags in the northeast South China Sea margin lack large controlling faults; in the Liwan sag in particular, syn-rift sequences undulate above the basement. Dome-like uplifts (serpentinite uplifts?) or diapirs (?) rose from below the basement and pushed up the syn-rift sequences around 36 Ma (T80). Gravity inversion based on seismic reflection indicated that the dome has a lower density and a lower layer velocity than normal crust. Also, around the Continent-Ocean Boundary (COB), a small segment similar to the lower crust is exposed. Between this exposed segment and the Cenozoic oceanic crust, mantle seems to be exhumed along the breakup point. Between the COB and roughly the shelf break, high-velocity lower crust was identified in the northeast continental margin. Structures in the northeast South China Sea seem to have many similarities with the Newfoundland-Iberia margin, namely a serpentinite (?) dome and exhumed mantle, although the spreading rate here is intermediate. In fact, the regional background suggests that there might be another interpretation: the transition from Mesozoic subduction to Cenozoic extension occurred through breakup of paleo-oceanic crust in the northeast, which in turn retained the Mesozoic subduction system beneath the northeast continental margin. Constrained by magnetic anomaly, Bouguer gravity gradient anomaly, and well-drilling lithological evidence, the Cenozoic Baiyun sag developed upon a Mesozoic fore-arc, while the Cenozoic Liwan sag developed upon a Mesozoic accretionary prism. The high-velocity lower crust was caused both by the remnant subducted slab and by ocean-continent interaction due to subduction. There might also be a serpentinite dome and exhumed mantle, but these may have been caused by extension and breakup of the paleo-oceanic slab rather than by depth-dependent extension. IODP drilling is needed to test all these scientific conjectures.
A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

PubMed Central

2012-01-01

Background: Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs) in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results: We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm, each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP) data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to gain additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions: Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesized 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discovered subtle differences in a general transcription factor (GTF) binding site motif across several data sets. 
We suggest that small differences in our discovered motif could confer specificity for one or more homologous GTF proteins. We offer a free implementation of the MotifCatcher software package at http://www.bme.ucdavis.edu/facciotti/resources_data/software/. PMID:23181585
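The random-sampling idea described in this abstract can be illustrated with a toy sketch. It is not the MotifCatcher implementation: the k-mer counter below is a hypothetical stand-in for a real motif-finding tool, and the function names, subset size, and run count are assumptions chosen only for illustration. Each run draws a random subset of sequences, looks for a k-mer shared by at least two of them, and records which sampled sequences contain it. Sequences that are rarely supported are candidates for filtering.

```python
import random
from collections import Counter


def top_shared_kmer(seqs, k):
    """Toy stand-in for a motif finder (assumption, not a real tool).

    Returns the k-mer found in the most sequences and that sequence count.
    Assumes every sequence is at least k characters long.
    """
    counts = Counter()
    for s in seqs:
        # Count each k-mer once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    # Break ties alphabetically so the result is deterministic.
    kmer = min(counts, key=lambda x: (-counts[x], x))
    return kmer, counts[kmer]


def sampling_support(seqs, k=6, subset_size=4, n_runs=400, seed=0):
    """Fraction of draws in which each sequence contained the subset's shared k-mer."""
    rng = random.Random(seed)
    hits = [0] * len(seqs)
    draws = [0] * len(seqs)
    for _ in range(n_runs):
        idx = rng.sample(range(len(seqs)), subset_size)
        for i in idx:
            draws[i] += 1
        kmer, n_seqs = top_shared_kmer([seqs[i] for i in idx], k)
        if n_seqs < 2:
            continue  # nothing shared in this subset, so no motif is called
        for i in idx:
            hits[i] += kmer in seqs[i]
    return [h / d if d else 0.0 for h, d in zip(hits, draws)]


if __name__ == "__main__":
    # Hypothetical data: four sequences with a planted motif, four without.
    with_motif = ["AAAATGACTCAAAA", "CCCCTGACTCCCCC",
                  "GGGGTGACTCGGGG", "TTTTTGACTCTTTT"]
    without_motif = ["A" * 14, "C" * 14, "G" * 14, "T" * 14]
    for seq, s in zip(with_motif + without_motif,
                      sampling_support(with_motif + without_motif)):
        print(f"{seq}  support={s:.2f}")
```

In this toy setting, sequences with low support would be dropped before a final motif search on the remainder. A real pipeline would replace `top_shared_kmer` with an actual motif finder and a more careful scoring rule.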
Clinical applicability and cost of a 46-gene panel for genomic analysis of solid tumours: Retrospective validation and prospective audit in the UK National Health Service.

PubMed

Hamblin, Angela; Wordsworth, Sarah; Fermont, Jilles M; Page, Suzanne; Kaur, Kulvinder; Camps, Carme; Kaisaki, Pamela; Gupta, Avinash; Talbot, Denis; Middleton, Mark; Henderson, Shirley; Cutts, Anthony; Vavoulis, Dimitrios V; Housby, Nick; Tomlinson, Ian; Taylor, Jenny C; Schuh, Anna

2017-02-01

Single gene tests to predict whether cancers respond to specific targeted therapies are performed increasingly often. Advances in sequencing technology, collectively referred to as next generation sequencing (NGS), mean the entire cancer genome or parts of it can now be sequenced at speed with increased depth and sensitivity. However, translation of NGS into routine cancer care has been slow. Healthcare stakeholders are unclear about the clinical utility of NGS and are concerned it could be an expensive addition to cancer diagnostics, rather than an affordable alternative to single gene testing. We validated a 46-gene hotspot cancer panel assay allowing multiple gene testing from small diagnostic biopsies. From 1 January 2013 to 31 December 2013, solid tumour samples (including non-small-cell lung carcinoma [NSCLC], colorectal carcinoma, and melanoma) were sequenced in the context of the UK National Health Service from 351 consecutively submitted prospective cases for which treating clinicians thought the patient had potential to benefit from more extensive genetic analysis. Following histological assessment, tumour-rich regions of formalin-fixed paraffin-embedded (FFPE) sections underwent macrodissection, DNA extraction, NGS, and analysis using a pipeline centred on Torrent Suite software. With a median turnaround time of seven working days, an integrated clinical report was produced indicating the variants detected, including those with potential diagnostic, prognostic, therapeutic, or clinical trial entry implications. Accompanying phenotypic data were collected, and a detailed cost analysis of the panel compared with single gene testing was undertaken to assess affordability for routine patient care. Panel sequencing was successful for 97% (342/351) of tumour samples in the prospective cohort and showed 100% concordance with known mutations (detected using cobas assays). At least one mutation was identified in 87% (296/342) of tumours. 
A locally actionable mutation (i.e., available targeted treatment or clinical trial) was identified in 122/351 patients (35%). Forty patients received targeted treatment, in 22/40 (55%) cases solely due to use of the panel. Examination of published data on the potential efficacy of targeted therapies showed theoretically actionable mutations (i.e., mutations for which targeted treatment was potentially appropriate) in 66% (71/107) and 39% (41/105) of melanoma and NSCLC patients, respectively. At a cost of £339 (US$449) per patient, the panel was less expensive locally than performing more than two or three single gene tests. Study limitations include the use of FFPE samples, which do not always provide high-quality DNA, and the use of "real world" data: submission of cases for sequencing did not always follow clinical guidelines, meaning that when mutations were detected, patients were not always eligible for targeted treatments on clinical grounds. This study demonstrates that more extensive tumour sequencing can identify mutations that could improve clinical decision-making in routine cancer care, potentially improving patient outcomes, at an affordable level for healthcare providers.
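As a quick consistency check, the proportions reported in this abstract can be recomputed from the stated counts. The snippet below is only an arithmetic sketch; the dictionary labels are shorthand introduced here, not terms from the study.

```python
# (numerator, denominator, percentage as reported in the abstract)
reported = {
    "panel sequencing successful": (342, 351, 97),
    "at least one mutation": (296, 342, 87),
    "locally actionable mutation": (122, 351, 35),
    "targeted treatment solely due to panel": (22, 40, 55),
    "theoretically actionable, melanoma": (71, 107, 66),
    "theoretically actionable, NSCLC": (41, 105, 39),
}


def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


if __name__ == "__main__":
    for label, (n, d, pct) in reported.items():
        print(f"{label}: {n}/{d} = {100 * n / d:.1f}% (reported {pct}%)")
```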
Structure and regulation of the Yersinia pestis yscBCDEF operon.

PubMed Central

Haddix, P L; Straley, S C

1992-01-01

We have investigated the physical and genetic structure and regulation of the Yersinia pestis yscBCDEF region, previously called lcrC. DNA sequence analysis showed that this region is homologous to the corresponding part of the ysc locus of Yersinia enterocolitica and suggested that the yscBCDEF cistrons belong to a single operon on the low-calcium response virulence plasmid pCD1. Promoter activity measurements of ysc subclones indicated that yscBCDEF constitutes a suboperon of the larger ysc region by revealing promoter activity in a clone containing the 3' end of yscD, intact yscE and yscF, and part of yscG. These experiments also revealed an additional weak promoter upstream of yscD. Northern (RNA) analysis with a yscD probe showed that operon transcription is thermally induced and downregulated in the presence of Ca2+. Primer extension of operon transcripts suggested that two promoters, a moderate-level constitutive one and a stronger, calcium-downregulated one, control full-length operon transcription at 37 degrees C. Primer extension provided additional support for the proposed designation of a yscBCDEF suboperon by identifying a 5' end within yscF, for which relative abundances in the presence and absence of Ca2+ revealed regulation that is distinct from that for transcripts initiating farther upstream. YscB and YscC were expressed in Escherichia coli by using a high-level transcription system. Attempts to express YscD were only partially successful, but they revealed interesting regulation at the translational level. PMID:1624469
Developing eThread pipeline using SAGA-pilot abstraction for large-scale structural bioinformatics.

PubMed

Ragothaman, Anjani; Boddu, Sairam Chowdary; Kim, Nayong; Feinstein, Wei; Brylinski, Michal; Jha, Shantenu; Kim, Joohyun

2014-01-01

While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the structural information they predict could uncover the underlying function. However, threading tools are generally compute-intensive, and the number of protein sequences from even small genomes such as those of prokaryotes is large, typically many thousands, which prohibits their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, and it is particularly amenable to small genomes such as those of prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure.
Structure and dynamics of single hydrophobic/ionic heteropolymers at the vapor-liquid interface of water.

PubMed

Vembanur, Srivathsan; Venkateshwaran, Vasudevan; Garde, Shekhar

2014-04-29

We focus on the conformational stability, structure, and dynamics of hydrophobic/charged homopolymers and heteropolymers at the vapor-liquid interface of water using extensive molecular dynamics simulations. Hydrophobic polymers collapse into globular structures in bulk water but unfold and sample a broad range of conformations at the vapor-liquid interface of water. We show that adding a pair of charges to a hydrophobic polymer at the interface can dramatically change its conformations, stabilizing hairpinlike structures, with molecular details depending on the location of the charged pair in the sequence. The translational dynamics of homopolymers and heteropolymers are also different: whereas the homopolymers skate on the interface with low drag, the tendency of charged groups to remain hydrated pulls the heteropolymers toward the liquid side of the interface, thus pinning them, increasing drag, and slowing the translational dynamics. The conformational dynamics of heteropolymers are also slower than those of the homopolymer and depend on the location of the charged groups in the sequence. Conformational dynamics are most restricted for the end-charged heteropolymer and speed up as the charge pair is moved toward the center of the sequence. We rationalize these trends using the fundamental understanding of the effects of the interface on primitive pair-level interactions between two hydrophobic groups and between oppositely charged ions in its vicinity.
Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

PubMed Central

Ragothaman, Anjani; Feinstein, Wei; Jha, Shantenu; Kim, Joohyun

2014-01-01

While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the structural information they predict could uncover the underlying function. However, threading tools are generally compute-intensive, and the number of protein sequences from even small genomes such as those of prokaryotes is large, typically many thousands, which prohibits their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, and it is particularly amenable to small genomes such as those of prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure. PMID:24995285
Detecting novel genes with sparse arrays

PubMed Central

Haiminen, Niina; Smit, Bart; Rautio, Jari; Vitikainen, Marika; Wiebe, Marilyn; Martinez, Diego; Chee, Christine; Kunkel, Joe; Sanchez, Charles; Nelson, Mary Anne; Pakula, Tiina; Saloheimo, Markku; Penttilä, Merja; Kivioja, Teemu

2014-01-01

Species-specific genes play an important role in defining the phenotype of an organism. However, current gene prediction methods can only efficiently find genes that share features such as sequence similarity or general sequence characteristics with previously known genes. Novel sequencing methods and tiling arrays can be used to find genes without prior information, and they have demonstrated that novel genes can still be found in extensively studied model organisms. Unfortunately, these methods are expensive and thus are not easily applicable, e.g., to finding genes that are expressed only in very specific conditions. We demonstrate a method for finding novel genes with sparse arrays, applying it to the 33.9 Mb genome of the filamentous fungus Trichoderma reesei. Our computational method does not require normalisations between arrays, and it takes into account the multiple-testing problem typical for analysis of microarray data. In contrast to tiling arrays, which use overlapping probes, only one 25mer microarray oligonucleotide probe was used for every 100 bases. Thus, only relatively little space on a microarray slide was required to cover the intergenic regions of a genome. The analysis was done as a by-product of a conventional microarray experiment with no additional costs. We found at least 23 good candidates for novel transcripts that could code for proteins, all of which were expressed at high levels. Candidate genes were found to neighbour ire1 and cre1 and many other regulatory genes. Our simple, low-cost method can easily be applied to finding novel species-specific genes without prior knowledge of their sequence properties. PMID:20691772
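The sparse design mentioned in this abstract (one 25mer probe per 100 bases) can be sketched as simple coordinate arithmetic. This is a hypothetical illustration, not the authors' probe-design procedure: the function name and the region coordinates are assumptions, and real probe selection would also consider sequence properties. At this spacing a probe covers 25% of the bases it spans, and tiling the entire 33.9 Mb genome would need at most about 339,000 probes; the study covered intergenic regions only, so the actual number would be smaller.

```python
def sparse_probe_starts(region_start, region_end, spacing=100, probe_len=25):
    """Start coordinates of probes placed every `spacing` bases.

    The region is half-open, [region_start, region_end); a probe is kept
    only if it fits entirely inside the region.
    """
    return [s for s in range(region_start, region_end, spacing)
            if s + probe_len <= region_end]


if __name__ == "__main__":
    # Hypothetical 1 kb intergenic region.
    starts = sparse_probe_starts(0, 1000)
    print(len(starts), "probes at", starts)
    # Upper bound if the whole 33.9 Mb genome were covered at this spacing.
    print(33_900_000 // 100, "probes at most")
```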
EggLib: processing, analysis and simulation tools for population genetics and genomics

PubMed Central

2012-01-01

Background: With the considerable growth of available nucleotide sequence data over the last decade, integrated and flexible analytical tools have become a necessity. In particular, in the field of population genetics, there is a strong need for automated and reliable procedures to conduct repeatable and rapid polymorphism analyses, coalescent simulations, data manipulation and estimation of demographic parameters under a variety of scenarios. Results: In this context, we present EggLib (Evolutionary Genetics and Genomics Library), a flexible and powerful C++/Python software package providing efficient and easy to use computational tools for sequence data management and extensive population genetic analyses on nucleotide sequence data. EggLib is a multifaceted project involving several integrated modules: an underlying computationally efficient C++ library (which can be used independently in pure C++ applications); two C++ programs; a Python package providing, among other features, a high level Python interface to the C++ library; and the egglib script which provides direct access to pre-programmed Python applications. Conclusions: EggLib has been designed to be both efficient and easy to use. A wide array of methods are implemented, including file format conversion, sequence alignment editing, coalescent simulations, neutrality tests and estimation of demographic parameters by Approximate Bayesian Computation (ABC). Classes implementing different demographic scenarios for ABC analyses can easily be developed by the user and added to the package. EggLib source code is distributed freely under the GNU General Public License (GPL) from its website http://egglib.sourceforge.net/ where full documentation and a manual can also be found and downloaded. PMID:22494792
Mediators of exposure therapy for youth obsessive-compulsive disorder: specificity and temporal sequence of client and treatment factors.

PubMed

Chu, Brian C; Colognori, Daniela B; Yang, Guang; Xie, Min-ge; Lindsey Bergman, R; Piacentini, John

2015-05-01

Behavioral engagement and cognitive coping have been hypothesized to mediate effectiveness of exposure-based therapies. Identifying which specific child factors mediate successful therapy and which therapist factors facilitate change can help make our evidence-based treatments more efficient and robust. The current study examines the specificity and temporal sequence of relations among hypothesized client and therapist mediators in exposure therapy for pediatric Obsessive Compulsive Disorder (OCD). Youth coping (cognitive, behavioral), youth safety behaviors (avoidance, escape, compulsive behaviors), therapist interventions (cognitive, exposure extensiveness), and youth anxiety were rated via observational ratings of therapy sessions of OCD youth (N=43; ages=8 - 17; 62.8% male) who had received Exposure and Response Prevention (ERP). Regression analysis using Generalized Estimation Equations and cross-lagged panel analysis (CLPA) were conducted to model anxiety change within and across sessions, to determine formal mediators of anxiety change, and to establish sequence of effects. Anxiety ratings decreased linearly across exposures within sessions. Youth coping and therapist interventions significantly mediated anxiety change across exposures, and youth-interfering behavior mediated anxiety change at the trend level. In CLPA, youth-interfering behaviors predicted, and were predicted by, changes in anxiety. Youth coping was predicted by prior anxiety change. The study provides a preliminary examination of specificity and temporal sequence among child and therapist behaviors in predicting youth anxiety. Results suggest that therapists should educate clients in the natural rebound effects of anxiety between sessions and should be aware of the negatively reinforcing properties of avoidance during exposure. Copyright © 2015. Published by Elsevier Ltd.
Evaluation of a Campylobacter fetus subspecies venerealis real-time quantitative polymerase chain reaction for direct analysis of bovine preputial samples

PubMed Central

Chaban, Bonnie; Chu, Shirley; Hendrick, Steven; Waldner, Cheryl; Hill, Janet E.

2012-01-01

The detection and subspeciation of Campylobacter fetus subsp. venerealis (CFV) from veterinary samples is important for both clinical and economic reasons. Campylobacter fetus subsp. venerealis is the causative agent of bovine genital campylobacteriosis, a venereal disease that can lead to serious reproductive problems in cattle, and strict international regulations require animals and animal products to be CFV-free for trade. This study evaluated methods reported in the literature for CFV detection and reports the translation of an extensively tested CFV-specific polymerase chain reaction (PCR) primer set, including the VenSF/VenSR primers, to a real-time, quantitative PCR (qPCR) platform using SYBR Green chemistry. Three methods of preputial sample preparation for direct qPCR were evaluated, and a heat lysis DNA extraction method was shown to allow for CFV detection at the level of approximately one cell equivalent per reaction (or 1.0 × 10^3 CFU/mL) from prepuce. The optimized sample preparation and qPCR protocols were then used to evaluate 3 western Canadian bull cohorts, which included 377 bulls, for CFV. The qPCR assay detected 11 positive bulls for the CFV-specific parA gene target. DNA sequence data confirmed the identity of the amplified product and revealed that positive samples comprised 2 sequence types: one identical to previously reported CFV parA gene sequences and one with a 9% sequence divergence. These results add valuable information towards our understanding of an important CFV subspeciation target and offer a significantly improved format for an internationally recognized PCR test. PMID:23277694
The “Naked Coral” Hypothesis Revisited – Evidence for and Against Scleractinian Monophyly

PubMed Central

Forêt, Sylvain; Huttley, Gavin; Miller, David J.; Chen, Chaolun Allen

2014-01-01

The relationship between Scleractinia and Corallimorpharia, Orders within Anthozoa distinguished by the presence of an aragonite skeleton in the former, is controversial. Although classically considered distinct groups, some phylogenetic analyses have placed the Corallimorpharia within a larger Scleractinia/Corallimorpharia clade, leading to the suggestion that the Corallimorpharia are “naked corals” that arose via skeleton loss during the Cretaceous from a Scleractinian ancestor. Scleractinian paraphyly is, however, contradicted by a number of recent phylogenetic studies based on mt nucleotide (nt) sequence data. Whereas the “naked coral” hypothesis was based on analysis of the sequences of proteins encoded by a relatively small number of mt genomes, here a much-expanded dataset was used to reinvestigate hexacorallian phylogeny. The initial observation was that, whereas analyses based on nt data support scleractinian monophyly, those based on amino acid (aa) data support the “naked coral” hypothesis, irrespective of the method and with very strong support. To better understand the bases of these contrasting results, the effects of systematic errors were examined. Compared to other hexacorallians, the mt genomes of “Robust” corals have a higher (A+T) content, codon usage is far more constrained, and the proteins that they encode have a markedly higher phenylalanine content, leading us to suggest that mt DNA repair may be impaired in this lineage. Thus the “naked coral” topology could be caused by high levels of saturation in these mitochondrial sequences, long-branch effects or model violations. The equivocal results of these extensive analyses highlight the fundamental problems of basing coral phylogeny on mitochondrial sequence data. PMID:24740380
EggLib: processing, analysis and simulation tools for population genetics and genomics.

PubMed

De Mita, Stéphane; Siol, Mathieu

2012-04-11

With the considerable growth of available nucleotide sequence data over the last decade, integrated and flexible analytical tools have become a necessity. In particular, in the field of population genetics, there is a strong need for automated and reliable procedures to conduct repeatable and rapid polymorphism analyses, coalescent simulations, data manipulation and estimation of demographic parameters under a variety of scenarios. In this context, we present EggLib (Evolutionary Genetics and Genomics Library), a flexible and powerful C++/Python software package providing efficient and easy to use computational tools for sequence data management and extensive population genetic analyses on nucleotide sequence data. EggLib is a multifaceted project involving several integrated modules: an underlying computationally efficient C++ library (which can be used independently in pure C++ applications); two C++ programs; a Python package providing, among other features, a high level Python interface to the C++ library; and the egglib script which provides direct access to pre-programmed Python applications. EggLib has been designed to be both efficient and easy to use. A wide array of methods are implemented, including file format conversion, sequence alignment editing, coalescent simulations, neutrality tests and estimation of demographic parameters by Approximate Bayesian Computation (ABC). Classes implementing different demographic scenarios for ABC analyses can easily be developed by the user and added to the package. EggLib source code is distributed freely under the GNU General Public License (GPL) from its website http://egglib.sourceforge.net/ where full documentation and a manual can also be found and downloaded.
Artificial grasping system for the paralyzed hand.

PubMed

Ferrari de Castro, M C; Cliquet, A

2000-03-01

Neuromuscular electrical stimulation has been used in upper limb rehabilitation towards restoring motor hand function. In this work, an 8 channel microcomputer controlled stimulator with monophasic square voltage output was used. Muscle activation sequences were defined to perform palmar and lateral prehension and power grip (index finger extension type). The sequences used allowed subjects to demonstrate their ability to hold and release objects that are encountered in daily living, permitting activities such as drinking, eating, writing, and typing.
