potential simple sequence: Topics by Science.gov

Sample records for potential simple sequence

Investigation of sequential properties of snoring episodes for obstructive sleep apnoea identification.

PubMed

Cavusoglu, M; Ciloglu, T; Serinagaoglu, Y; Kamasak, M; Erogul, O; Akcam, T

2008-08-01

In this paper, 'snore regularity' is studied in terms of the variations of snoring sound episode durations, separations and average powers in simple snorers and in obstructive sleep apnoea (OSA) patients. The goal was to explore the possibility of distinguishing among simple snorers and OSA patients using only sleep sound recordings of individuals and to ultimately eliminate the need for spending a whole night in the clinic for polysomnographic recording. Sequences that contain snoring episode durations (SED), snoring episode separations (SES) and average snoring episode powers (SEP) were constructed from snoring sound recordings of 30 individuals (18 simple snorers and 12 OSA patients) who were also under polysomnographic recording in Gülhane Military Medical Academy Sleep Studies Laboratory (GMMA-SSL), Ankara, Turkey. Snore regularity is quantified in terms of mean, standard deviation and coefficient of variation values for the SED, SES and SEP sequences. In all three of these sequences, OSA patients' data displayed a higher variation than those of simple snorers. To exclude the effects of slow variations in the base-line of these sequences, new sequences that contain the coefficient of variation of the sample values in a 'short' signal frame, i.e., short time coefficient of variation (STCV) sequences, were defined. The mean, the standard deviation and the coefficient of variation values calculated from the STCV sequences displayed a stronger potential to distinguish among simple snorers and OSA patients than those obtained from the SED, SES and SEP sequences themselves. Spider charts were used to jointly visualize the three parameters, i.e., the mean, the standard deviation and the coefficient of variation values of the SED, SES and SEP sequences, and the corresponding STCV sequences as two-dimensional plots. 
Our observations showed that the statistical parameters obtained from the SED and SES sequences, and the corresponding STCV sequences, possessed a strong potential to distinguish among simple snorers and OSA patients, both marginally, i.e., when the parameters are examined individually, and jointly. The parameters obtained from the SEP sequences and the corresponding STCV sequences, on the other hand, did not have a strong discrimination capability. However, the joint behaviour of these parameters showed some potential to distinguish among simple snorers and OSA patients.
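The short time coefficient of variation (STCV) described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame length, the hop size, the use of the population standard deviation, and the sample values are all assumptions, since the abstract does not state them.

```python
import statistics

def coefficient_of_variation(values):
    """Return the population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)

def stcv_sequence(values, frame_len, hop=1):
    """Return the CV of each frame of `frame_len` consecutive samples."""
    return [
        coefficient_of_variation(values[i:i + frame_len])
        for i in range(0, len(values) - frame_len + 1, hop)
    ]

# Hypothetical snoring episode durations in seconds (not study data).
sed = [1.0, 1.0, 1.0, 3.0, 1.0, 1.0]
stcv = stcv_sequence(sed, frame_len=3)
```

Because each frame is normalised by its own mean, a slow drift in the baseline of the sequence has little effect on the STCV values, which is the motivation given in the abstract.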
Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine

Treesearch

Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson

2011-01-01

Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...
Hippocampal Replay is Not a Simple Function of Experience

PubMed Central

Gupta, Anoopum S.; van der Meer, Matthijs A. A.; Touretzky, David S.; Redish, A. David

2015-01-01

Summary Replay of behavioral sequences in the hippocampus during sharp-wave-ripple-complexes (SWRs) provides a potential mechanism for memory consolidation and the learning of knowledge structures. Current hypotheses imply that replay should straightforwardly reflect recent experience. However, we find these hypotheses to be incompatible with the content of replay on a task with two distinct behavioral sequences (A&B). We observed forward and backward replay of B even when rats had been performing A for >10 minutes. Furthermore, replay of non-local sequence B occurred more often when B was infrequently experienced. Neither forward nor backward sequences preferentially represented highly-experienced trajectories within a session. Additionally, we observed the construction of never-experienced novel-path sequences. These observations challenge the idea that sequence activation during SWRs is a simple replay of recent experience. Instead, replay reflected all physically available trajectories within the environment, suggesting a potential role in active learning and maintenance of the cognitive map. PMID:20223204
Kangaroo – A pattern-matching program for biological sequences

PubMed Central

2002-01-01

Background Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats. PMID:12150718
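The kind of regular expression search described here, applied to mononucleotide repeats, might look like the sketch below. It uses Python's standard `re` module rather than Kangaroo's own query syntax, and the function name and the default minimum run length of 8 are assumptions made for illustration.

```python
import re

def find_mononucleotide_repeats(seq, min_len=8):
    """Return (start, base, length) for each run of a single base
    that is at least `min_len` nucleotides long."""
    # ([ACGT]) captures one base; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]

# Hypothetical coding-region fragment containing a run of eight A residues.
hits = find_mononucleotide_repeats("GG" + "A" * 8 + "CCT")
```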
Not all transmembrane helices are born equal: Towards the extension of the sequence homology concept to membrane proteins

PubMed Central

2011-01-01

Background Sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches composed of non-polar residues. Simple, quantitative criteria are desirable for identifying transmembrane helices (TMs) that must be included into or should be excluded from start sequence segments in similarity searches aimed at finding distant homologues. Results We found that there are two types of TMs in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as membrane-anchoring device. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring such as intra-membrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information. Conclusion For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs (and a sufficient criterion for complex TMs) in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments. Reviewers This article was reviewed by Shamil Sunyaev, L. Aravind and Arcady Mushegian. PMID:22024092
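The two quantities named in this abstract, hydrophobicity and sequence complexity of a TM segment, can be estimated as in the sketch below. The abstract does not give the authors' actual criterion, so the Kyte-Doolittle scale and Shannon entropy used here are stand-ins chosen for illustration, and the example segments are hypothetical.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy index per amino acid (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydrophobicity(segment):
    """Average hydropathy over the residues of the segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)

def shannon_entropy(segment):
    """Shannon entropy in bits of the residue composition (a complexity proxy)."""
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in Counter(segment).values())

# Hypothetical segments: an aliphatic-rich one and a more varied one.
simple_like = "LLLLVLLLLALLLLVLLLLL"
complex_like = "LGSTFWYNAVHCIMPQTGSL"
```

Under these stand-in measures, a segment dominated by a few long aliphatic residues scores high on hydrophobicity and low on entropy, matching the qualitative description of simple TMs above.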
A Simple Acronym for Doing Calculus: CAL

ERIC Educational Resources Information Center

Hathaway, Richard J.

2008-01-01

An acronym is presented that provides students a potentially useful, unifying view of the major topics covered in an elementary calculus sequence. The acronym (CAL) is based on viewing the calculus procedure for solving a calculus problem P* in three steps: (1) recognizing that the problem cannot be solved using simple (non-calculus) techniques;…
Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

PubMed

Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

2014-04-08

The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.
Single molecule targeted sequencing for cancer gene mutation detection.

PubMed

Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W; He, Jiankui

2016-05-19

With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although fast sample preparation methods are available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations using an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation, so that no biases or errors are introduced by PCR. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis.
Genetic diversity studies and identification of SSR markers associated with Fusarium wilt (Fusarium udum) resistance in cultivated pigeonpea (Cajanus cajan).

PubMed

Singh, A K; Rai, V P; Chand, R; Singh, R P; Singh, M N

2013-01-01

Genetic diversity and identification of simple sequence repeat markers correlated with Fusarium wilt resistance was performed in a set of 36 elite cultivated pigeonpea genotypes differing in levels of resistance to Fusarium wilt. Twenty-four polymorphic sequence repeat markers were screened across these genotypes, and amplified a total of 59 alleles with an average high polymorphic information content value of 0.52. Cluster analysis, done by UPGMA and PCA, grouped the 36 pigeonpea genotypes into two main clusters according to their Fusarium wilt reaction. Based on the Kruskal-Wallis ANOVA and simple regression analysis, six simple sequence repeat markers were found to be significantly associated with Fusarium wilt resistance. The phenotypic variation explained by these markers ranged from 23.7 to 56.4%. The present study helps in finding out feasibility of prescreened SSR markers to be used in genetic diversity analysis and their potential association with disease resistance.
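Polymorphic information content (PIC) for a marker is commonly computed from allele frequencies with the definition of Botstein et al. (1980). The sketch below assumes that definition, since the abstract does not state which formula was used, and the allele frequencies are hypothetical.

```python
from itertools import combinations

def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1 - homozygosity - cross_term

# Hypothetical marker with two equally frequent alleles.
pic_two_alleles = polymorphic_information_content([0.5, 0.5])
```

With two equally frequent alleles the value is 1 - 0.5 - 0.125 = 0.375; values rise as the number of alleles grows and their frequencies even out.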
Capturing chloroplast variation for molecular ecology studies: a simple next generation sequencing approach applied to a rainforest tree

PubMed Central

2013-01-01

Background With high quantity and quality data production and low cost, next generation sequencing has the potential to provide new opportunities for plant phylogeographic studies on single and multiple species. Here we present an approach for in silico chloroplast DNA assembly and single nucleotide polymorphism detection from short-read shotgun sequencing. The approach is simple and effective and can be implemented using standard bioinformatic tools. Results The chloroplast genome of Toona ciliata (Meliaceae), 159,514 base pairs long, was assembled from shotgun sequencing on the Illumina platform using de novo assembly of contigs. To evaluate its practicality, value and quality, we compared the short read assembly with an assembly completed using 454 data obtained after chloroplast DNA isolation. Sanger sequence verifications indicated that the Illumina dataset outperformed the longer read 454 data. Pooling of several individuals during preparation of the shotgun library enabled detection of informative chloroplast SNP markers. Following validation, we used the identified SNPs for a preliminary phylogeographic study of T. ciliata in Australia and to confirm low diversity across the distribution. Conclusions Our approach provides a simple method for construction of whole chloroplast genomes from shotgun sequencing of whole genomic DNA using short-read data and no available closely related reference genome (e.g. from the same species or genus). The high coverage of Illumina sequence data also renders this method appropriate for multiplexing and SNP discovery and therefore a useful approach for landscape level studies of evolutionary ecology. PMID:23497206
Detection of possible restriction sites for type II restriction enzymes in DNA sequences.

PubMed

Gagniuc, P; Cimponeriu, D; Ionescu-Tîrgovişte, C; Mihai, Andrada; Stavarachi, Monica; Mihai, T; Gavrilă, L

2011-01-01

In order to make a step forward in the knowledge of the mechanisms operating in complex polygenic disorders such as diabetes and obesity, this paper proposes a new algorithm (PRSD, possible restriction site detection) and its implementation in the Applied Genetics software. This software can be used for in silico detection of potential (hidden) recognition sites for endonucleases and for identification of nucleotide repeats. The recognition sites for endonucleases may result from hidden sequences through deletion or insertion of a specific number of nucleotides. Tests were conducted on DNA sequences downloaded from NCBI servers using specific recognition sites for common type II restriction enzymes introduced in the software database (n = 126). Each possible recognition site indicated by the PRSD algorithm implemented in Applied Genetics was checked and confirmed by NEBcutter V2.0 and Webcutter 2.0 software. In the sequence NG_008724.1 (which includes 63632 nucleotides) we found a high number of potential restriction sites for EcoRI that may be produced by deletion (n = 43 sites) or insertion (n = 591 sites) of one nucleotide. The second module of Applied Genetics has been designed to find simple repeats, which have a real future in understanding the role of SNPs (Single Nucleotide Polymorphisms) in the pathogenesis of complex metabolic disorders. We have tested the presence of simple repetitive sequences in five DNA sequences. The software indicated the exact position of each repeat detected in the tested sequences. Future development of Applied Genetics can provide an alternative to the powerful tools used to search for restriction sites or repetitive sequences, or to improve genotyping methods.
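The idea of a hidden recognition site that appears after deleting or inserting one nucleotide can be illustrated with the brute-force sketch below. It is not the PRSD algorithm itself, whose details are not given in the abstract; the function names and test sequences are hypothetical, and GAATTC is the EcoRI recognition sequence.

```python
def sites_created_by_deletion(seq, site):
    """Return positions i where deleting seq[i] yields an occurrence of
    `site` that spans the deletion point in the edited sequence."""
    seq, site = seq.upper(), site.upper()
    k = len(site)
    hits = []
    for i in range(len(seq)):
        edited = seq[:i] + seq[i + 1:]
        # Only windows that straddle the junction at index i can be affected.
        if site in edited[max(0, i - k + 1): i + k - 1]:
            hits.append(i)
    return hits

def sites_created_by_insertion(seq, site, alphabet="ACGT"):
    """Return (position, base) pairs where inserting `base` before
    seq[position] yields an occurrence of `site` containing that base."""
    seq, site = seq.upper(), site.upper()
    k = len(site)
    hits = []
    for i in range(len(seq) + 1):
        for base in alphabet:
            edited = seq[:i] + base + seq[i:]
            # Only windows that include the inserted base at index i matter.
            if site in edited[max(0, i - k + 1): i + k]:
                hits.append((i, base))
    return hits

ECORI = "GAATTC"
deletion_hits = sites_created_by_deletion("GAATTTC", ECORI)
insertion_hits = sites_created_by_insertion("GAATC", ECORI)
```

Positions inside a homopolymer run are reported separately even though they give the same edited sequence, so counts from a sketch like this need not match the site counts quoted in the abstract.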
Development of Scoring Functions for Antibody Sequence Assessment and Optimization

PubMed Central

Seeliger, Daniel

2013-01-01

Antibody development is still associated with substantial risks and difficulties as single mutations can radically change molecule properties like thermodynamic stability, solubility or viscosity. Since antibody generation methodologies cannot select and optimize for molecule properties which are important for biotechnological applications, careful sequence analysis and optimization is necessary to develop antibodies that fulfil the ambitious requirements of future drugs. While efforts to grasp the physical principles of undesired molecule properties from the very bottom are becoming increasingly powerful, the wealth of publicly available antibody sequences provides an alternative way to develop early assessment strategies for antibodies using a statistical approach which is the objective of this paper. Here, publicly available sequences were used to develop heuristic potentials for the framework regions of heavy and light chains of antibodies of human and murine origin. The potentials take into account position dependent probabilities of individual amino acids but also conditional probabilities which are inevitable for sequence assessment and optimization. It is shown that the potentials derived from human sequences clearly distinguish between human sequences and sequences from mice and, hence, can be used as a measure of humanness which compares a given sequence with the phenotypic pool of human sequences instead of comparing sequence identities to germline genes. Following this line, it is demonstrated that, using the developed potentials, humanization of an antibody can be described as a simple mathematical optimization problem and that the in-silico generated framework variants closely resemble native sequences in terms of predicted immunogenicity. PMID:24204701
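A statistical potential built from position dependent amino acid probabilities can be sketched as below. This is a simplified illustration under stated assumptions: it covers only the single-position term (the conditional probabilities mentioned in the abstract are omitted), the pseudocount is an arbitrary choice, and the three-residue training sequences are hypothetical placeholders for aligned framework regions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_position_potential(aligned_seqs, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> -log(probability)."""
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    potential = []
    for pos in range(len(aligned_seqs[0])):
        counts = Counter(seq[pos] for seq in aligned_seqs)
        potential.append(
            {aa: -math.log((counts[aa] + pseudocount) / total) for aa in AMINO_ACIDS}
        )
    return potential

def score(seq, potential):
    """Sum the per-position terms; a lower score is closer to the training pool."""
    return sum(potential[i][aa] for i, aa in enumerate(seq))

# Hypothetical aligned training pool (real framework regions are far longer).
pool = ["ACD", "ACD", "ACE"]
potential = build_position_potential(pool)
```

With a score of this form, choosing the residue that minimises each position's term is a trivial optimization; the conditional terms mentioned in the abstract would couple positions and make the search less trivial.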
Stimulus novelty, task relevance and the visual evoked potential in man

NASA Technical Reports Server (NTRS)

Courchesne, E.; Hillyard, S. A.; Galambos, R.

1975-01-01

The effect of task relevance on P3 (waveform of human evoked potential) waves and the methodologies used to deal with them are outlined. Visual evoked potentials (VEPs) were recorded from normal adult subjects performing in a visual discrimination task. Subjects counted the number of presentations of the numeral 4 which was interposed rarely and randomly within a sequence of tachistoscopically flashed background stimuli. Intrusive, task-irrelevant (not counted) stimuli were also interspersed rarely and randomly in the sequence of 2s; these stimuli were of two types: simples, which were easily recognizable, and novels, which were completely unrecognizable. It was found that the simples and the counted 4s evoked posteriorly distributed P3 waves while the irrelevant novels evoked large, frontally distributed P3 waves. These large, frontal P3 waves to novels were also found to be preceded by large N2 waves. These findings indicate that the P3 wave is not a unitary phenomenon but should be considered in terms of a family of waves, differing in their brain generators and in their psychological correlates.
Stimulus-dependent deliberation process leading to a specific motor action demonstrated via a multi-channel EEG analysis

PubMed Central

Henz, Sonja; Kutz, Dieter F.; Werner, Jana; Hürster, Walter; Kolb, Florian P.; Nida-Ruemelin, Julian

2015-01-01

The aim of the study was to determine whether a deliberative process, leading to a motor action, is detectable in high density EEG recordings. Subjects were required to press one of two buttons. In a simple motor task the subject knew which button to press, whilst in a color-word Stroop task subjects had to press the right button with the right index finger when meaning and color coincided, or the left button with the left index finger when meaning and color were disparate. EEG recordings obtained during the simple motor task showed a sequence of positive (P) and negative (N) cortical potentials (P1-N1-P2) which are assumed to be related to the processing of the movement. The sequence of cortical potentials was similar in EEG recordings of subjects having to deliberate over how to respond, but the above sequence (P1-N1-P2) was preceded by slowly increasing negativity (N0), with N0 being assumed to represent the end of the deliberation process. Our data suggest the existence of neurophysiological correlates of deliberative processes. PMID:26190987
Simple sequence repeat markers for interspecific hybrid detections in Agrostis

USDA-ARS's Scientific Manuscript database

Agrostis stolonifera L. (creeping bentgrass) and Agrostis capillaris (colonial bentgrass) are turfgrass species that are well adapted for golf course use in regions of the world where cool-season grasses are grown. Interspecific hybrids between the species do form and have the potential to incorpora...
Genome Wide Characterization of Simple Sequence Repeats in Cucumber

USDA-ARS's Scientific Manuscript database

The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
[Analysis of MAT1A gene mutations in a child affected with simple hypermethioninemia].

PubMed

Sun, Yun; Ma, Dingyuan; Wang, Yanyun; Yang, Bin; Jiang, Tao

2017-02-10

To detect potential mutations of the MAT1A gene in a child suspected of having simple hypermethioninemia by MS/MS neonatal screening. Clinical data of the child were collected. Genomic DNA was extracted by a standard method and subjected to targeted sequencing using an Ion Ampliseq TM Inherited Disease Panel. Detected mutations were verified by Sanger sequencing. The child showed no clinical features except elevated methionine. A novel compound mutation of the MAT1A gene, i.e., c.345delA and c.529C>T, was identified in the child. His father and mother were found to be heterozygous for the c.345delA mutation and c.529C>T mutation, respectively. The compound mutation c.345delA and c.529C>T of the MAT1A gene probably underlies the disease in the child. Semiconductor sequencing has provided an important means for the diagnosis of hereditary diseases.
VISA--Vector Integration Site Analysis server: a web-based server to rapidly identify retroviral integration sites from next-generation sequencing.

PubMed

Hocum, Jonah D; Battrell, Logan R; Maynard, Ryan; Adair, Jennifer E; Beard, Brian C; Rawlings, David J; Kiem, Hans-Peter; Miller, Daniel G; Trobridge, Grant D

2015-07-07

Analyzing the integration profile of retroviral vectors is a vital step in determining their potential genotoxic effects and developing safer vectors for therapeutic use. Identifying retroviral vector integration sites is also important for retroviral mutagenesis screens. We developed VISA, a vector integration site analysis server, to analyze next-generation sequencing data for retroviral vector integration sites. Sequence reads that contain a provirus are mapped to the human genome, sequence reads that cannot be localized to a unique location in the genome are filtered out, and then unique retroviral vector integration sites are determined based on the alignment scores of the remaining sequence reads. VISA offers a simple web interface to upload sequence files and results are returned in a concise tabular format to allow rapid analysis of retroviral vector integration sites.
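The filtering steps described in this abstract can be sketched roughly as follows. This is not VISA's code: the `Alignment` record, its field names, and the rule of keeping the best-scoring read per site are assumptions made for illustration, and the example reads are invented.

```python
from collections import namedtuple

# Hypothetical record for one mapped read; n_hits is the number of
# genomic locations the read aligned to.
Alignment = namedtuple("Alignment", "read_id chrom pos strand score n_hits")

def unique_integration_sites(alignments):
    """Drop reads that map to more than one location, then keep the
    best-scoring read for each (chrom, pos, strand) site."""
    best = {}
    for aln in alignments:
        if aln.n_hits != 1:
            continue  # cannot be localized to a unique genomic location
        key = (aln.chrom, aln.pos, aln.strand)
        if key not in best or aln.score > best[key].score:
            best[key] = aln
    return [best[key] for key in sorted(best)]

reads = [
    Alignment("r1", "chr1", 100, "+", 60, 1),
    Alignment("r2", "chr1", 100, "+", 55, 1),
    Alignment("r3", "chr2", 500, "-", 60, 3),
    Alignment("r4", "chr3", 42, "-", 58, 1),
]
sites = unique_integration_sites(reads)
```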
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.

PubMed

Powell, Bradford C; Hutchison, Clyde A

2006-01-19

Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
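Collecting all ORFs of at least 30 codons, the first step described above, might be sketched as follows. The abstract does not define an ORF precisely, so this sketch assumes an ATG start codon, the three standard stop codons, a length that excludes the stop, and a scan of both strands; the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def orfs_in_strand(seq, min_codons=30):
    """Yield (start, end) for each ATG...stop ORF in the three frames of
    one strand; the stop codon is included in `end` but not in the length."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield (start, i + 3)
                start = None

def find_orfs(seq, min_codons=30):
    """Return (strand, start, end) tuples; coordinates for '-' refer to
    the reverse-complemented sequence."""
    seq = seq.upper()
    forward = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    reverse = [("-", s, e) for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return forward + reverse

# Hypothetical 30-codon ORF: ATG, 29 alanine codons, then a stop.
example = "ATG" + "GCT" * 29 + "TAA"
orfs = find_orfs(example)
```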
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs

PubMed Central

Powell, Bradford C; Hutchison, Clyde A

2006-01-01

Background Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes. PMID:16423288

LookSeq: a browser-based viewer for deep sequencing data.

PubMed

Manske, Heinrich Magnus; Kwiatkowski, Dominic P

2009-11-01

Sequencing a genome to great depth can be highly informative about heterogeneity within an individual or a population. Here we address the problem of how to visualize the multiple layers of information contained in deep sequencing data. We propose an interactive AJAX-based web viewer for browsing large data sets of aligned sequence reads. By enabling seamless browsing and fast zooming, the LookSeq program assists the user to assimilate information at different levels of resolution, from an overview of a genomic region to fine details such as heterogeneity within the sample. A specific problem, particularly if the sample is heterogeneous, is how to depict information about structural variation. LookSeq provides a simple graphical representation of paired sequence reads that is more revealing about potential insertions and deletions than are conventional methods.
Mycobacterium tuberculosis and whole genome sequencing: a practical guide and online tools available for the clinical microbiologist.

PubMed

Satta, G; Atzeni, A; McHugh, T D

2017-02-01

Whole genome sequencing (WGS) has the potential to revolutionize the diagnosis of Mycobacterium tuberculosis infection but the lack of bioinformatic expertise among clinical microbiologists is a barrier for adoption. Software products for analysis should be simple, free of charge, able to accept data directly from the sequencer (FASTQ files) and to provide the basic functionalities all-in-one. The main aim of this narrative review is to provide a practical guide for the clinical microbiologist, with little or no practical experience of WGS analysis, with a specific focus on software products tailor-made for M. tuberculosis analysis. With sequencing performed by an external provider, it is now feasible to implement WGS analysis in the routine clinical practice of any microbiology laboratory, with the potential to detect resistance weeks before traditional phenotypic culture methods, but the clinical microbiologist should be aware of the limitations of this approach. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

PubMed

Gupta, P D

2016-10-01

In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time-consuming and cumbersome, and hence more expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. Efforts were made at the beginning of this century, and nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the futuristic technology.
Diatom centromeres suggest a mechanism for nuclear DNA acquisition

DOE PAGES

Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.; ...

2017-07-18

Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.
Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

USDA-ARS's Scientific Manuscript database

Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...
Translocation and gross deletion breakpoints in human inherited disease and cancer II: Potential involvement of repetitive sequence elements in secondary structure formation between DNA ends.

PubMed

Chuzhanova, Nadia; Abeysinghe, Shaun S; Krawczak, Michael; Cooper, David N

2003-09-01

Translocations and gross deletions are responsible for a significant proportion of both cancer and inherited disease. Although such gene rearrangements are nonuniformly distributed in the human genome, the underlying mutational mechanisms remain unclear. We have studied the potential involvement of various types of repetitive sequence elements in the formation of secondary structure intermediates between the single-stranded DNA ends that recombine during rearrangements. Complexity analysis was used to assess the potential of these ends to form secondary structures, the maximum decrease in complexity consequent to a gross rearrangement being used as an indicator of the type of repeat and the specific DNA ends involved. A total of 175 pairs of deletion/translocation breakpoint junction sequences available from the Gross Rearrangement Breakpoint Database [GRaBD; www.uwcm.ac.uk/uwcm/mg/grabd/grabd.html] were analyzed. Potential secondary structure was noted between the 5' flanking sequence of the first breakpoint and the 3' flanking sequence of the second breakpoint in 49% of rearrangements and between the 5' flanking sequence of the second breakpoint and the 3' flanking sequence of the first breakpoint in 36% of rearrangements. Inverted repeats, inversions of inverted repeats, and symmetric elements were found in association with gross rearrangements at approximately the same frequency. However, inverted repeats and inversions of inverted repeats accounted for the vast majority (83%) of deletions plus small insertions, symmetric elements for one-half of all antigen receptor-mediated translocations, while direct repeats appear only to be involved in mediating simple deletions. These findings extend our understanding of illegitimate recombination by highlighting the importance of secondary structure formation between single-stranded DNA ends at breakpoint junctions. Copyright 2003 Wiley-Liss, Inc.
Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...
Exponentially accurate approximations to piece-wise smooth periodic functions

NASA Technical Reports Server (NTRS)

Greer, James; Banerjee, Saheb

1995-01-01

A family of simple, periodic basis functions with 'built-in' discontinuities are introduced, and their properties are analyzed and discussed. Some of their potential usefulness is illustrated in conjunction with the Fourier series representations of functions with discontinuities. In particular, it is demonstrated how they can be used to construct a sequence of approximations which converges exponentially in the maximum norm to a piece-wise smooth function. The theory is illustrated with several examples and the results are discussed in the context of other sequences of functions which can be used to approximate discontinuous functions.
The novel primers for mammal species identification-based mitochondrial cytochrome b sequence: implication for reserved wild animals in Thailand and endangered mammal species in Southeast Asia.

PubMed

Muangkram, Yuttamol; Wajjwalku, Worawidh; Amano, Akira; Sukmak, Manakorn

2018-01-01

We present a powerful technique for species identification using a short amplicon of the mitochondrial cytochrome b gene sequence. Two faecal samples and one single hair sample of the Asian tapir were tested using the new cytochrome b primers. The results showed a high sequence similarity with the mainland Asian tapir group. The comparative sequence analysis of the reserved wild mammals in Thailand and the other endangered mammal species from Southeast Asia comprehensively verified the potential of our novel primers. The forward and reverse primers showed average sequence identities of 94.2% and 93.2%, respectively, among 77 species sequences, and the overall mean distance was 35.9%. This technique could provide a rapid, simple, and reliable tool for species confirmation. In particular, it could identify problematic biological specimens from illegal products that contain little DNA material, and assist with wildlife crime investigation of threatened species and related forensic casework.
Research Techniques Made Simple: Single-Cell RNA Sequencing and its Applications in Dermatology.

PubMed

Wu, Xiaojun; Yang, Bin; Udo-Inyang, Imo; Ji, Suyun; Ozog, David; Zhou, Li; Mi, Qing-Sheng

2018-05-01

RNA sequencing is one of the most highly reliable and reproducible methods of assessing the cell transcriptome. As high-throughput RNA sequencing libraries at the single cell level have recently been developed, single cell RNA sequencing has become more feasible and popular in biology research. Single cell RNA sequencing allows investigators to evaluate cell transcriptional profiles at the single cell level. It has become a very useful tool to perform investigations that could not be addressed by other methodologies, such as the assessment of cell-to-cell variation, the identification of rare populations, and the determination of heterogeneity within a cell population. So far, the single cell RNA sequencing technique has been widely applied to embryonic development, immune cell development, and human disease progression and treatment. Here, we describe the history of single cell technology development and its potential application in the field of dermatology. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

USDA-ARS's Scientific Manuscript database

Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...
Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

USDA-ARS's Scientific Manuscript database

Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...
Construction and characterization of an in-vivo linear covalently closed DNA vector production system.

PubMed

Nafissi, Nafiseh; Slavcev, Roderick

2012-12-06

While safer than their viral counterparts, conventional non-viral gene delivery DNA vectors offer a limited safety profile. They often result in the delivery of unwanted prokaryotic sequences, antibiotic resistance genes, and the bacterial origins of replication to the target, which may lead to the stimulation of unwanted immunological responses due to their chimeric DNA composition. Such vectors may also impart the potential for chromosomal integration, thus potentiating oncogenesis. We sought to engineer an in vivo system for the quick and simple production of safer DNA vector alternatives that were devoid of non-transgene bacterial sequences and would lethally disrupt the host chromosome in the event of an unwanted vector integration event. We constructed a parent eukaryotic expression vector possessing a specialized manufactured multi-target site called "Super Sequence", and engineered E. coli cells (R-cell) that conditionally produce phage-derived recombinase Tel (PY54), TelN (N15), or Cre (P1). Passage of the parent plasmid vector through R-cells under optimized conditions resulted in rapid, efficient, one-step in vivo generation of mini lcc (linear covalently closed; Tel/TelN-cell) or mini ccc (circular covalently closed; Cre-cell) DNA constructs, separated from the backbone plasmid DNA. Site-specific integration of lcc plasmids into the host chromosome resulted in chromosomal disruption and 10^5-fold lower viability than that seen with the ccc counterpart. We offer a high efficiency mini DNA vector production system that confers simple, rapid and scalable in vivo production of mini lcc DNA vectors that possess all the benefits of "minicircle" DNA vectors and virtually eliminate the potential for undesirable vector integration events.
A simple algorithm for quantifying DNA methylation levels on multiple independent CpG sites in bisulfite genomic sequencing electropherograms.

PubMed

Leakey, Tatiana I; Zielinski, Jerzy; Siegfried, Rachel N; Siegel, Eric R; Fan, Chun-Yang; Cooney, Craig A

2008-06-01

DNA methylation at cytosines is a widely studied epigenetic modification. Methylation is commonly detected using bisulfite modification of DNA followed by PCR and additional techniques such as restriction digestion or sequencing. These additional techniques are either laborious, require specialized equipment, or are not quantitative. Here we describe a simple algorithm that yields quantitative results from analysis of conventional four-dye-trace sequencing. We call this method Mquant and we compare it with the established laboratory method of combined bisulfite restriction assay (COBRA). This analysis of sequencing electropherograms provides a simple, easily applied method to quantify DNA methylation at specific CpG sites.
Comparison of two PCR-based methods and automated DNA sequencing for prop-1 genotyping in Ames dwarf mice.

PubMed

Gerstner, Arpad; DeFord, James H; Papaconstantinou, John

2003-07-25

Ames dwarfism is caused by a homozygous single nucleotide mutation in the pituitary specific prop-1 gene, resulting in combined pituitary hormone deficiency, reduced growth and extended lifespan. Thus, these mice serve as an important model system for endocrinological, aging and longevity studies. Because the phenotypes of wild type and heterozygous mice are indistinguishable, it is imperative for successful breeding to accurately genotype these animals. Here we report a novel, yet simple, approach for prop-1 genotyping using PCR-based allele-specific amplification (PCR-ASA). We also compare this method to other potential genotyping techniques, i.e. PCR-based restriction fragment length polymorphism analysis (PCR-RFLP) and fluorescence automated DNA sequencing. We demonstrate that the single-step PCR-ASA has several advantages over the classical PCR-RFLP because the procedure is simple, less expensive and rapid. To further increase the specificity and sensitivity of the PCR-ASA, we introduced a single-base mismatch at the 3' penultimate position of the mutant primer. Our results also reveal that the fluorescence automated DNA sequencing has limitations for detecting a single nucleotide polymorphism in the prop-1 gene, particularly in heterozygotes.
Divide and Conquer (DC) BLAST: fast and easy BLAST execution within HPC environments

DOE PAGES

Yim, Won Cheol; Cushman, John C.

2017-07-22

Bioinformatics is currently faced with very large-scale data sets that lead to computational jobs, especially sequence similarity searches, that can take absurdly long times to run. For example, the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST and BLAST+) suite, which is by far the most widely used tool for rapid similarity searching among nucleic acid or amino acid sequences, is highly central processing unit (CPU) intensive. While the BLAST suite of programs perform searches very rapidly, they have the potential to be accelerated. In recent years, distributed computing environments have become more widely accessible and used due to the increasing availability of high-performance computing (HPC) systems. Therefore, simple solutions for data parallelization are needed to expedite BLAST and other sequence analysis tools. However, existing software for parallel sequence similarity searches often requires extensive computational experience and skill on the part of the user. In order to accelerate BLAST and other sequence analysis tools, Divide and Conquer BLAST (DCBLAST) was developed to perform NCBI BLAST searches within a cluster, grid, or HPC environment by using a query sequence distribution approach. Scaling from one (1) to 256 CPU cores resulted in significant improvements in processing speed. Thus, DCBLAST dramatically accelerates the execution of BLAST searches using a simple, accessible, robust, and parallel approach. DCBLAST works across multiple nodes automatically and it overcomes the speed limitation of single-node BLAST programs. DCBLAST can be used on any HPC system, can take advantage of hundreds of nodes, and has no output limitations. Thus, this freely available tool simplifies distributed computation pipelines to facilitate the rapid discovery of sequence similarities between very large data sets.
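The "query sequence distribution approach" named in the abstract above can be illustrated with a minimal sketch. This is not DCBLAST's actual code: the function names `read_fasta` and `split_queries` and the round-robin assignment are assumptions made only for illustration. The idea is to divide a multi-record FASTA query into chunks that separate BLAST jobs could then process independently.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_queries(records, n_chunks):
    """Assign records to n_chunks lists round-robin (hypothetical scheme).

    Empty chunks are dropped, so fewer lists are returned when there are
    fewer records than requested chunks.
    """
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return [chunk for chunk in chunks if chunk]


if __name__ == "__main__":
    demo = ">a\nACGT\n>b\nGG\nCC\n>c\nTT\n"
    for chunk in split_queries(read_fasta(demo), 2):
        print([header for header, _ in chunk])
```

Each chunk would be written to its own file and searched as a separate job, with the per-chunk outputs concatenated afterwards. Round-robin assignment is only one option; balancing chunks by total sequence length may give more even run times.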
Design of nucleic acid strands with long low-barrier folding pathways.

PubMed

Condon, Anne; Kirkpatrick, Bonnie; Maňuch, Ján

2017-01-01

A major goal of natural computing is to design biomolecules, such as nucleic acid sequences, that can be used to perform computations. We design sequences of nucleic acids that are "guaranteed" to have long folding pathways relative to their length. These particular sequences with high probability follow low-barrier folding pathways that visit a large number of distinct structures. Long folding pathways are interesting, because they demonstrate that natural computing can potentially support long and complex computations. Formally, we provide the first scalable designs of molecules whose low-barrier folding pathways, with respect to a simple, stacked pair energy model, grow superlinearly with the molecule length, but for which all significantly shorter alternative folding pathways have an energy barrier that is [Formula: see text] times that of the low-barrier pathway for any [Formula: see text] and a sufficiently long sequence.
RNA circularization reveals terminal sequence heterogeneity in a double-stranded RNA virus.

PubMed

Widmer, G

1993-03-01

Double-stranded RNA viruses (dsRNA), termed LRV1, have been found in several strains of the protozoan parasite Leishmania. With the aim of constructing a full-length cDNA copy of the viral genome, including its terminal sequences, a protocol based on PCR amplification across the 3'-5' junction of circularized RNA was developed. This method proved to be applicable to dsRNA. It provided a relatively simple alternative to one-sided PCR, without loss of specificity inherent in the use of generic primers. LRV1 terminal nucleotide sequences obtained by this method showed a considerable variation in length, particularly at the 5' end of the positive strand, as well as the potential for forming 3' overhangs. The opposite genomic end terminates in 0, 1, or 2 TCA trinucleotide repeats. These results are compared with terminal sequences derived from one-sided PCR experiments.

Genetic Variation and Population Differentiation in a Medical Herb Houttuynia cordata in China Revealed by Inter-Simple Sequence Repeats (ISSRs)

PubMed Central

Wei, Lin; Wu, Xian-Jin

2012-01-01

Houttuynia cordata is an important traditional Chinese herb with unresolved genetics and taxonomy, which lead to potential problems in the conservation and utilization of the resource. Inter-simple sequence repeat (ISSR) markers were used to assess the level and distribution of genetic diversity in 226 individuals from 15 populations of H. cordata in China. ISSR analysis revealed low genetic variations within populations but high genetic differentiations among populations. This genetic structure probably mainly reflects the historical association among populations. Genetic cluster analysis showed that the basal clade is composed of populations from Southwest China, and the other populations have continuous and eastward distributions. The structure of genetic diversity in H. cordata demonstrated that this species might have survived in Southwest China during the glacial age, and subsequently experienced an eastern postglacial expansion. Based on the results of genetic analysis, it was proposed that as many as possible targeted populations for conservation be included. PMID:22942696
Asymmetric scoring functions for proteins

NASA Astrophysics Data System (ADS)

Lezon, Timothy; Holter, Neal; Maritan, Amos; Banavar, Jayanth

2003-03-01

The protein folding problem entails the prediction of the native state structure of a protein given the sequence of amino acids. In a coarse-grained description of a protein, an important ingredient for attempting this task is the determination of the effective energies of interaction between amino acids. We will discuss a simple approach for determining such interaction potentials from a training set of protein sequences and their experimentally determined native state structures. The key new ingredient in our study is the incorporation of the lack of symmetry in the effective interactions between amino acids. Our results, obtained using a set of 513 proteins, and their implications will be discussed.
Construction, Characterization, and Preliminary BAC-End Sequence Analysis of a Bacterial Artificial Chromosome Library of the Tea Plant (Camellia sinensis)

PubMed Central

Lin, Jinke; Kudrna, Dave; Wing, Rod A.

2011-01-01

We describe the construction and characterization of a publicly available BAC library for the tea plant, Camellia sinensis. Using modified methods, the library was constructed with the aim of developing public molecular resources to advance tea plant genomics research. The library consists of a total of 401,280 clones with an average insert size of 135 kb, providing an approximate coverage of 13.5 haploid genome equivalents. No empty vector clones were observed in a random sampling of 576 BAC clones. Further analysis of 182 BAC-end sequences from randomly selected clones revealed a GC content of 40.35% and low chloroplast and mitochondrial contamination. Repetitive sequence analyses indicated that LTR retrotransposons were the most predominant sequence class (86.93%–87.24%), followed by DNA retrotransposons (11.16%–11.69%). Additionally, we found 25 simple sequence repeats (SSRs) that could potentially be used as genetic markers. PMID:21234344
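The coverage figure in the abstract above follows the usual library arithmetic: coverage equals the number of clones times the mean insert size, divided by the haploid genome size. The sketch below, with hypothetical function names, uses only the numbers stated in the abstract and back-calculates the genome size they imply. That genome size is derived here and is not stated in the abstract.

```python
def library_coverage(n_clones, mean_insert_kb, genome_size_kb):
    """Haploid genome equivalents represented by a clone library."""
    return n_clones * mean_insert_kb / genome_size_kb


def implied_genome_size_kb(n_clones, mean_insert_kb, coverage):
    """Genome size (kb) implied by a stated coverage; the inverse of the above."""
    return n_clones * mean_insert_kb / coverage


if __name__ == "__main__":
    n_clones = 401_280       # from the abstract
    mean_insert_kb = 135     # from the abstract
    coverage = 13.5          # from the abstract

    total_kb = n_clones * mean_insert_kb
    genome_kb = implied_genome_size_kb(n_clones, mean_insert_kb, coverage)
    print(f"total cloned DNA: {total_kb:,} kb")
    print(f"implied haploid genome size: {genome_kb / 1e6:.2f} Gb")
```

With these inputs the total cloned DNA is 54,172,800 kb, so a 13.5-fold coverage corresponds to a haploid genome of roughly 4.0 Gb.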
Simple, multiplexed, PCR-based barcoding of DNA enables sensitive mutation detection in liquid biopsies using sequencing.

PubMed

Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E

2016-06-20

Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
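The consensus-read step described in the abstract above can be sketched as follows. This is a simplified illustration and not the authors' pipeline: the function name, the `min_reads` threshold, and the per-position majority vote are all assumptions. Reads sharing a molecular barcode are treated as copies of one original DNA molecule, so an error present in only a minority of copies is voted out.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3):
    """Collapse reads into one consensus sequence per barcode.

    reads: iterable of (barcode, sequence) pairs.
    min_reads: hypothetical minimum family size; smaller families are skipped.
    Returns a dict mapping barcode to its majority-vote consensus sequence.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_reads:
            continue
        # Compare only positions covered by every read in the family.
        length = min(len(s) for s in seqs)
        bases = []
        for i in range(length):
            base, _count = Counter(s[i] for s in seqs).most_common(1)[0]
            bases.append(base)
        consensus[barcode] = "".join(bases)
    return consensus


if __name__ == "__main__":
    demo = [
        ("AA", "ACGT"),
        ("AA", "ACGT"),
        ("AA", "ACTT"),  # one copy carries an error at position 2
        ("CC", "GGGG"),  # family too small, skipped
    ]
    print(consensus_by_barcode(demo))
```

A real pipeline would also need to handle barcode sequencing errors, base qualities, and ties between bases, none of which this sketch attempts.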
Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...
SIRW: A web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches.

PubMed

Ramu, Chenna

2003-07-01

SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.
MELOGEN: an EST database for melon functional genomics

PubMed Central

Gonzalez-Ibeas, Daniel; Blanca, José; Roig, Cristina; González-To, Mireia; Picó, Belén; Truniger, Verónica; Gómez, Pedro; Deleu, Wim; Caño-Delgado, Ana; Arús, Pere; Nuez, Fernando; Garcia-Mas, Jordi; Puigdomènech, Pere; Aranda, Miguel A

2007-01-01

Background: Melon (Cucumis melo L.) is one of the most important fleshy fruits for fresh consumption. Despite this, few genomic resources exist for this species. To facilitate the discovery of genes involved in essential traits, such as fruit development, fruit maturation and disease resistance, and to speed up the process of breeding new and better adapted melon varieties, we have produced a large collection of expressed sequence tags (ESTs) from eight normalized cDNA libraries from different tissues in different physiological conditions. Results: We determined over 30,000 ESTs that were clustered into 16,637 non-redundant sequences or unigenes, comprising 6,023 tentative consensus sequences (contigs) and 10,614 unclustered sequences (singletons). Many potential molecular markers were identified in the melon dataset: 1,052 potential simple sequence repeats (SSRs) and 356 single nucleotide polymorphisms (SNPs) were found. Sixty-nine percent of the melon unigenes showed a significant similarity with proteins in databases. Functional classification of the unigenes was carried out following the Gene Ontology scheme. In total, 9,402 unigenes were mapped to one or more ontology. Remarkably, the distributions of melon and Arabidopsis unigenes followed similar tendencies, suggesting that the melon dataset is representative of the whole melon transcriptome. Bioinformatic analyses primarily focused on potential precursors of melon micro RNAs (miRNAs) in the melon dataset, but many other genes potentially controlling disease resistance and fruit quality traits were also identified. Patterns of transcript accumulation were characterised by Real-Time-qPCR for 20 of these genes. Conclusion: The collection of ESTs characterised here represents a substantial increase on the genetic information available for melon. A database (MELOGEN) which contains all EST sequences, contig images and several tools for analysis and data mining has been created. This set of sequences constitutes also the basis for an oligo-based microarray for melon that is being used in experiments to further analyse the melon transcriptome. PMID:17767721
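Several records in this listing report counts of potential simple sequence repeats mined from sequence data. As a rough illustration of what such a scan involves, the sketch below finds perfect tandem repeats with a regular expression. It does not reproduce the tool or thresholds used in any of the cited studies: the function name and the default unit lengths and repeat counts are assumptions.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Find perfect tandem repeats in a DNA sequence.

    Returns a list of (start, unit, repeat_count) tuples. The thresholds are
    illustrative defaults, not values taken from any cited study.
    """
    # Lazy unit match, so the shortest repeating unit is tried first;
    # the backreference then requires at least min_repeats - 1 further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    print(find_ssrs("TTTAGAGAGAGAGCCC"))
    print(find_ssrs("AAGAG" * 4))
```

Note that with `min_unit=2` a long homopolymer run such as AAAAAAAA is reported as a dinucleotide repeat, and imperfect or compound repeats are not detected; dedicated SSR-mining software handles these cases.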
Evaluation of rapid and simple techniques for the enrichment of viruses prior to metagenomic virus discovery.

PubMed

Hall, Richard J; Wang, Jing; Todd, Angela K; Bissielo, Ange B; Yen, Seiha; Strydom, Hugo; Moore, Nicole E; Ren, Xiaoyun; Huang, Q Sue; Carter, Philip E; Peacey, Matthew

2014-01-01

The discovery of new or divergent viruses using metagenomics and high-throughput sequencing has become more commonplace. The preparation of a sample is known to have an effect on the representation of virus sequences within the metagenomic dataset yet comparatively little attention has been given to this. Physical enrichment techniques are often applied to samples to increase the number of viral sequences and therefore enhance the probability of detection. With the exception of virus ecology studies, there is a paucity of information available to researchers on the type of sample preparation required for a viral metagenomic study that seeks to identify an aetiological virus in an animal or human diagnostic sample. A review of published virus discovery studies revealed the most commonly used enrichment methods, that were usually quick and simple to implement, namely low-speed centrifugation, filtration, nuclease-treatment (or combinations of these) which have been routinely used but often without justification. These were applied to a simple and well-characterised artificial sample composed of bacterial and human cells, as well as DNA (adenovirus) and RNA viruses (influenza A and human enterovirus), being either non-enveloped capsid or enveloped viruses. The effect of the enrichment method was assessed by both quantitative real-time PCR and metagenomic analysis that incorporated an amplification step. Reductions in the absolute quantities of bacteria and human cells were observed for each method as determined by qPCR, but the relative abundance of viral sequences in the metagenomic dataset remained largely unchanged. A 3-step method of centrifugation, filtration and nuclease-treatment showed the greatest increase in the proportion of viral sequences. This study provides a starting point for the selection of a purification method in future virus discovery studies, and highlights the need for more data to validate the effect of enrichment methods on different sample types, amplification, bioinformatics approaches and sequencing platforms. This study also highlights the potential risks that may attend selection of a virus enrichment method without any consideration for the sample type being investigated. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

PubMed

Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

2017-01-01

Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Because these endophytes are known for the beneficial characteristics they endow upon their grass hosts, their identification has been of great interest agronomically and scientifically. Simple sequence repeat loci and the variation in their repeat elements have been used to rapidly identify endophyte species and strains; however, little is known of how the structure of repeat elements changes between species and strains, and where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
Characterization of expressed sequence tag-derived simple sequence repeat markers for Aspergillus flavus: emphasis on variability of isolates from the southern United States.

PubMed

Wang, Xinwang; Wadl, Phillip A; Wood-Jones, Alicia; Windham, Gary; Trigiano, Robert N; Scruggs, Mary; Pilgrim, Candace; Baird, Richard

2012-12-01

Simple sequence repeat (SSR) markers were developed from an Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers were designed from 362 tri-nucleotide EST-SSR sequences. Eighteen polymorphic loci were used to genotype 96 Aspergillus species isolates. The number of alleles detected per locus ranged from 2 to 24 with a mean of 8.2 alleles. Haploid diversity ranged from 0.28 to 0.91. A genetic distance matrix was used to perform principal coordinates analysis (PCA) and to generate dendrograms using the unweighted pair group method with arithmetic mean (UPGMA). Two principal coordinates explained more than 75 % of the total variation among the isolates. One clade was identified for A. flavus isolates (n = 87) with the other Aspergillus species (n = 7) using PCA, but five distinct clusters were present when the other taxa were excluded from the analysis. Six groups were noted when the EST-SSR data were compared using UPGMA. However, the latter PCA or UPGMA comparison resulted in no direct associations with host species, geographical region or aflatoxin production. Furthermore, there was no direct correlation to visible morphological features such as sclerotial types. The isolates from the Mississippi Delta region, which contained the largest percentage of isolates, did not show any unusual clustering except for isolates K32, K55, and 199. Further studies of these three isolates are warranted to evaluate their pathogenicity, aflatoxin production potential, additional gene sequences (e.g., RPB2), and morphological comparisons.
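The per-locus statistics reported above (number of alleles and haploid diversity) can be illustrated with a short sketch. The abstract does not state which estimator was used; the code below assumes Nei's unbiased haploid diversity, h = n/(n-1) * (1 - sum of squared allele frequencies), and the function names and example allele sizes are hypothetical.

```python
from collections import Counter


def allele_count(alleles):
    """Number of distinct alleles observed at one locus."""
    return len(set(alleles))


def haploid_diversity(alleles):
    """Unbiased haploid diversity for one locus (assumed estimator).

    `alleles` holds one allele call per haploid isolate, for example
    amplicon sizes in base pairs.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two isolates")
    # Allele frequencies p_i estimated from the sample
    freqs = [count / n for count in Counter(alleles).values()]
    return (n / (n - 1)) * (1.0 - sum(p * p for p in freqs))


# Hypothetical allele sizes for four isolates at a single locus
example_locus = [150, 150, 153, 156]
example_h = haploid_diversity(example_locus)
```

With every isolate carrying the same allele the value is 0, and it approaches 1 as alleles become more numerous and more evenly distributed, which matches the 0.28 to 0.91 range reported across loci.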
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
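The definition above (tandem repeats of 1-6 bp motifs) can be made concrete with a minimal sketch of a perfect-repeat scanner. This is not the PSSRdb extraction pipeline; the minimum repeat counts and the function name `find_ssrs` are illustrative assumptions.

```python
import re

# Minimum number of tandem copies per motif length.
# Illustrative assumption, not thresholds taken from PSSRdb.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect tandem repeats of 1-6 bp motifs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least min_count - 1 more copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each repeat is reported once under its shortest motif
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Hypothetical test sequence with one dinucleotide and one pentanucleotide repeat
example_hits = find_ssrs("GGC" + "AT" * 7 + "CCT" + "AAGAG" * 5 + "C")
```

Running the same scan on two strains and comparing copy numbers at matching loci would flag length-polymorphic repeats; matching loci across genomes is the harder step and is not attempted here.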
Simple sequence repeat markers that identify Claviceps species and strains

USDA-ARS's Scientific Manuscript database

Claviceps purpurea is a pathogen that infects most members of the Pooideae subfamily and causes ergot, a floral disease in which the ovary is replaced with a sclerotium. This study was initiated to develop simple sequence repeat (SSR) markers for rapid identification of C. purpurea. SSRs were desi...
Next-Generation Sequencing of the Chrysanthemum nankingense (Asteraceae) Transcriptome Permits Large-Scale Unigene Assembly and SSR Marker Discovery

PubMed Central

Wang, Haibin; Jiang, Jiafu; Chen, Sumei; Qi, Xiangyu; Peng, Hui; Li, Pirui; Song, Aiping; Guan, Zhiyong; Fang, Weimin; Liao, Yuan; Chen, Fadi

2013-01-01

Background Simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Chrysanthemum is one of the largest genera in the Asteraceae family. Only a few Chrysanthemum expressed sequence tag (EST) sequences have been acquired to date, so the number of available EST-SSR markers is very low. Methodology/Principal Findings Illumina paired-end sequencing technology produced over 53 million sequencing reads from C. nankingense mRNA. The subsequent de novo assembly yielded 70,895 unigenes, of which 45,789 (64.59%) unigenes showed similarity to the sequences in the NCBI database. Out of 45,789 sequences, 107 have hits to the Chrysanthemum Nr protein database; 679 and 277 sequences have hits to the database of Helianthus and Lactuca species, respectively. MISA software identified a large number of putative EST-SSRs, allowing 1,788 primer pairs to be designed from the de novo transcriptome sequence and a further 363 from archival EST sequence. Among 100 primer pairs randomly chosen, 81 markers have amplicons and 20 are polymorphic for genotype analysis in Chrysanthemum. The results showed that most (but not all) of the assays were transferable across species and that they exposed a significant amount of allelic diversity. Conclusions/Significance SSR markers acquired by transcriptome sequencing are potentially useful for marker-assisted breeding and genetic analysis in the genus Chrysanthemum and its related genera. PMID:23626799
Simulating protein folding initiation sites using an alpha-carbon-only knowledge-based force field

PubMed Central

Buck, Patrick M.; Bystroff, Christopher

2015-01-01

Protein folding is a hierarchical process where structure forms locally first, then globally. Some short sequence segments initiate folding through strong structural preferences that are independent of their three-dimensional context in proteins. We have constructed a knowledge-based force field in which the energy functions are conditional on local sequence patterns, as expressed in the hidden Markov model for local structure (HMMSTR). Carbon-alpha force field (CALF) builds sequence specific statistical potentials based on database frequencies for α-carbon virtual bond opening and dihedral angles, pairwise contacts and hydrogen bond donor-acceptor pairs, and simulates folding via Brownian dynamics. We introduce hydrogen bond donor and acceptor potentials as α-carbon probability fields that are conditional on the predicted local sequence. Constant temperature simulations were carried out using 27 peptides selected as putative folding initiation sites, each 12 residues in length, representing several different local structure motifs. Each 0.6 μs trajectory was clustered based on structure. Simulation convergence or representativeness was assessed by subdividing trajectories and comparing clusters. For 21 of the 27 sequences, the largest cluster made up more than half of the total trajectory. Of these 21 sequences, 14 had cluster centers that were at most 2.6 Å root mean square deviation (RMSD) from their native structure in the corresponding full-length protein. To assess the adequacy of the energy function on nonlocal interactions, 11 full length native structures were relaxed using Brownian dynamics simulations. Equilibrated structures deviated from their native states but retained their overall topology and compactness. A simple potential that folds proteins locally and stabilizes proteins globally may enable a more realistic understanding of hierarchical folding pathways. PMID:19137613
Simple and efficient identification of rare recessive pathologically important sequence variants from next generation exome sequence data.

PubMed

Carr, Ian M; Morgan, Joanne; Watson, Christopher; Melnik, Svitlana; Diggle, Christine P; Logan, Clare V; Harrison, Sally M; Taylor, Graham R; Pena, Sergio D J; Markham, Alexander F; Alkuraya, Fowzan S; Black, Graeme C M; Ali, Manir; Bonthron, David T

2013-07-01

Massively parallel ("next generation") DNA sequencing (NGS) has quickly become the method of choice for seeking pathogenic mutations in rare uncharacterized monogenic diseases. Typically, before DNA sequencing, protein-coding regions are enriched from patient genomic DNA, representing either the entire genome ("exome sequencing") or selected mapped candidate loci. Sequence variants, identified as differences between the patient's and the human genome reference sequences, are then filtered according to various quality parameters. Changes are screened against datasets of known polymorphisms, such as dbSNP and the 1000 Genomes Project, in the effort to narrow the list of candidate causative variants. An increasing number of commercial services now offer to both generate and align NGS data to a reference genome. This potentially allows small groups with limited computing infrastructure and informatics skills to utilize this technology. However, the capability to effectively filter and assess sequence variants is still an important bottleneck in the identification of deleterious sequence variants in both research and diagnostic settings. We have developed an approach to this problem comprising a user-friendly suite of programs that can interactively analyze, filter and screen data from enrichment-capture NGS data. These programs ("Agile Suite") are particularly suitable for small-scale gene discovery or for diagnostic analysis. © 2013 WILEY PERIODICALS, INC.
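The filtering steps described above (quality filtering, screening against known polymorphisms, and narrowing to candidates compatible with recessive inheritance) can be sketched as follows. This is not the Agile Suite implementation; the `Variant` record, its field names, the quality threshold and the homozygous-only rule are all assumptions made for illustration.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record."""
    chrom: str
    pos: int
    ref: str
    alt: str
    quality: float
    genotype: str  # "hom" or "het"


def filter_candidates(variants, known_polymorphisms, min_quality=30.0):
    """Keep high-quality, novel, homozygous variants.

    `known_polymorphisms` is a set of (chrom, pos, ref, alt) tuples standing in
    for a resource such as dbSNP. The default threshold is an assumption.
    """
    candidates = []
    for v in variants:
        if v.quality < min_quality:
            continue  # fails the quality filter
        if (v.chrom, v.pos, v.ref, v.alt) in known_polymorphisms:
            continue  # already catalogued as a known polymorphism
        if v.genotype != "hom":
            continue  # simplification: homozygous calls only
        candidates.append(v)
    return candidates


# Hypothetical example data
example_variants = [
    Variant("1", 1000, "A", "G", 50.0, "hom"),
    Variant("1", 2000, "C", "T", 10.0, "hom"),
    Variant("2", 3000, "G", "A", 60.0, "hom"),
    Variant("3", 4000, "T", "C", 70.0, "het"),
]
example_known = {("2", 3000, "G", "A")}
example_candidates = filter_candidates(example_variants, example_known)
```

Restricting to homozygous calls fits a consanguineous recessive model but would miss compound heterozygotes, so a real analysis would need an additional rule for genes carrying two distinct rare heterozygous variants.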
Development of genomic resources for the narrow-leafed lupin (Lupinus angustifolius): construction of a bacterial artificial chromosome (BAC) library and BAC-end sequencing

PubMed Central

2011-01-01

Background Lupinus angustifolius L, also known as narrow-leafed lupin (NLL), is becoming an important grain legume crop that is valuable for sustainable farming and is becoming recognised as a potential human health food. Recent interest is being directed at NLL to improve grain production, disease and pest management and health benefits of the grain. However, studies have been hindered by a lack of extensive genomic resources for the species. Results A NLL BAC library was constructed consisting of 111,360 clones with an average insert size of 99.7 Kbp from cv Tanjil. The library has approximately 12 × genome coverage. Both ends of 9600 randomly selected BAC clones were sequenced to generate 13985 BAC end-sequences (BESs), covering approximately 1% of the NLL genome. These BESs permitted a preliminary characterisation of the NLL genome such as organisation and composition, with the BESs having approximately 39% G:C content, 16.6% repetitive DNA and 5.4% putative gene-encoding regions. From the BESs 9966 simple sequence repeat (SSR) motifs were identified and some of these are shown to be potential markers. Conclusions The NLL BAC library and BAC-end sequences are powerful resources for genetic and genomic research on lupin. These resources will provide a robust platform for future high-resolution mapping, map-based cloning, comparative genomics and assembly of whole-genome sequencing data for the species. PMID:22014081
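The coverage figure quoted above follows from simple arithmetic: coverage = (number of clones × mean insert size) / genome size. The abstract does not state the genome size, so the sketch below only back-calculates the size implied by the reported numbers; the function names are hypothetical.

```python
def library_coverage(num_clones, mean_insert_kbp, genome_size_mbp):
    """Fold coverage of a clone library: total insert length / genome size."""
    return num_clones * mean_insert_kbp / (genome_size_mbp * 1000.0)


def implied_genome_size_mbp(num_clones, mean_insert_kbp, coverage):
    """Genome size (Mbp) implied by a reported fold coverage."""
    return num_clones * mean_insert_kbp / coverage / 1000.0


# Values reported in the abstract: 111,360 clones, 99.7 kbp mean insert, ~12x coverage
implied_size = implied_genome_size_mbp(111360, 99.7, 12.0)
```

With these inputs the implied genome size is roughly 925 Mbp. This is a consistency check on the reported figures, not a measured genome size.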
Complete nucleotide sequence of the freshwater unicellular cyanobacterium Synechococcus elongatus PCC 6301 chromosome: gene content and organization.

PubMed

Sugita, Chieko; Ogata, Koretsugu; Shikata, Masamitsu; Jikuya, Hiroyuki; Takano, Jun; Furumichi, Miho; Kanehisa, Minoru; Omata, Tatsuo; Sugiura, Masahiro; Sugita, Mamoru

2007-01-01

The entire genome of the unicellular cyanobacterium Synechococcus elongatus PCC 6301 (formerly Anacystis nidulans Berkeley strain 6301) was sequenced. The genome consisted of a circular chromosome 2,696,255 bp long. A total of 2,525 potential protein-coding genes, two sets of rRNA genes, 45 tRNA genes representing 42 tRNA species, and several genes for small stable RNAs were assigned to the chromosome by similarity searches and computer predictions. The translated products of 56% of the potential protein-coding genes showed sequence similarities to experimentally identified and predicted proteins of known function, and the products of 35% of the genes showed sequence similarities to the translated products of hypothetical genes. The remaining 9% of genes lacked significant similarities to genes for predicted proteins in the public DNA databases. Some 139 genes coding for photosynthesis-related components were identified. Thirty-seven genes for two-component signal transduction systems were also identified. This is the smallest number of such genes identified in cyanobacteria, except for marine cyanobacteria, suggesting that only simple signal transduction systems are found in this strain. The gene arrangement and nucleotide sequence of Synechococcus elongatus PCC 6301 were nearly identical to those of a closely related strain Synechococcus elongatus PCC 7942, except for the presence of a 188.6 kb inversion. The sequences as well as the gene information shown in this paper are available in the Web database, CYORF (http://www.cyano.genome.jp/).
Metavisitor, a Suite of Galaxy Tools for Simple and Rapid Detection and Discovery of Viruses in Deep Sequence Data

PubMed Central

Vernick, Kenneth D.

2017-01-01

Metavisitor is a software package that allows biologists and clinicians without specialized bioinformatics expertise to detect and assemble viral genomes from deep sequence datasets. The package is composed of a set of modular bioinformatic tools and workflows that are implemented in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can use existing Metavisitor workflows or adapt them to suit specific needs by adding or modifying analysis modules. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can use a combination of de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide here executable Metavisitor use cases, which increase the accessibility and transparency of the software, ultimately enabling biologists or clinicians to focus on biological or medical questions. PMID:28045932

Whole transcriptome analysis using next-generation sequencing of model species Setaria viridis to support C4 photosynthesis research.

PubMed

Xu, Jiajia; Li, Yuanyuan; Ma, Xiuling; Ding, Jianfeng; Wang, Kai; Wang, Sisi; Tian, Ye; Zhang, Hui; Zhu, Xin-Guang

2013-09-01

Setaria viridis is an emerging model species for genetic studies of C4 photosynthesis. Many basic molecular resources need to be developed to support research on this species. In this paper, we performed a comprehensive transcriptome analysis from multiple developmental stages and tissues of S. viridis using next-generation sequencing technologies. Sequencing of the transcriptome from multiple tissues across three developmental stages (seed germination, vegetative growth, and reproduction) yielded a total of 71 million single-end 100 bp reads. Reference-based assembly using the Setaria italica genome as a reference generated 42,754 transcripts. De novo assembly generated 60,751 transcripts. In addition, 9,576 and 7,056 potential simple sequence repeats (SSRs) covering the S. viridis genome were identified when using the reference-based assembled transcripts and the de novo assembled transcripts, respectively. The transcripts and SSRs identified in this study can be used for both reverse and forward genetic studies based on S. viridis.
Rapid microsatellite identification from Illumina paired-end genomic sequencing in two birds and a snake

USGS Publications Warehouse

Castoe, Todd A.; Poole, Alexander W.; de Koning, A. P. Jason; Jones, Kenneth L.; Tomback, Diana F.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Lance, Stacey L.; Streicher, Jeffrey W.; Smith, Eric N.; Pollock, David D.

2012-01-01

Identification of microsatellites, or simple sequence repeats (SSRs), can be a time-consuming and costly investment requiring enrichment, cloning, and sequencing of candidate loci. Recently, however, high throughput sequencing (with or without prior enrichment for specific SSR loci) has been utilized to identify SSR loci. The direct "Seq-to-SSR" approach has an advantage over enrichment-based strategies in that it does not require a priori selection of particular motifs, or prior knowledge of genomic SSR content. It has been more expensive per SSR locus recovered, however, particularly for genomes with few SSR loci, such as bird genomes. The longer but relatively more expensive 454 reads have been preferred over less expensive Illumina reads. Here, we use Illumina paired-end sequence data to identify potentially amplifiable SSR loci (PALs) from a snake (the Burmese python, Python molurus bivittatus), and directly compare these results to those from 454 data. We also compare the python results to results from Illumina sequencing of two bird genomes (Gunnison Sage-grouse, Centrocercus minimus, and Clark's Nutcracker, Nucifraga columbiana), which have considerably fewer SSRs than the python. We show that direct Illumina Seq-to-SSR can identify and characterize thousands of potentially amplifiable SSR loci for as little as $10 per sample – a fraction of the cost of 454 sequencing. Given that Illumina Seq-to-SSR is effective, inexpensive, and reliable even for species such as birds that have few SSR loci, it seems that there are now few situations for which prior hybridization is justifiable.
Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

PubMed Central

2010-01-01

Background Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84, were detected for the first time in G. lucidum. Thirteen (13) potential SSR-motif microsatellite loci were also identified. Conclusion The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644
Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

PubMed Central

2010-01-01

Background Expressed sequence tags (ESTs) have been a cost-effective tool in molecular biology and represent an abundant, valuable resource for genome annotation, gene expression, and comparative genomics in plants. Results In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 high-quality expressed sequence tag (EST) sequences. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons, which have been deposited in NCBI (accession IDs: GW868575 - GW873047); among these, 1,294 unique ESTs had known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. Marker polymorphism was high in plum (65%) and lower in peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. Conclusions We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further study such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species. PMID:20626882
Simple Epidemiological Dynamics Explain Phylogenetic Clustering of HIV from Patients with Recent Infection

PubMed Central

Volz, Erik M.; Koopman, James S.; Ward, Melissa J.; Brown, Andrew Leigh; Frost, Simon D. W.

2012-01-01

Phylogenies of highly genetically variable viruses such as HIV-1 are potentially informative of epidemiological dynamics. Several studies have demonstrated the presence of clusters of highly related HIV-1 sequences, particularly among recently HIV-infected individuals, which have been used to argue for a high transmission rate during acute infection. Using a large set of HIV-1 subtype B pol sequences collected from men who have sex with men, we demonstrate that virus from recent infections tends to be phylogenetically clustered at a greater rate than virus from patients with chronic infection (‘excess clustering’) and also tends to cluster with other recent HIV infections rather than chronic, established infections (‘excess co-clustering’), consistent with previous reports. To determine the role that a higher infectivity during acute infection may play in excess clustering and co-clustering, we developed a simple model of HIV infection that incorporates an early period of intensified transmission, and explicitly considers the dynamics of phylogenetic clusters alongside the dynamics of acute and chronic infected cases. We explored the potential for clustering statistics to be used for inference of acute stage transmission rates and found that no single statistic explains very much variance in parameters controlling acute stage transmission rates. We demonstrate that high transmission rates during the acute stage are not the main cause of excess clustering of virus from patients with early/acute infection compared to chronic infection, which may simply reflect the shorter time since transmission in acute infection. Higher transmission during acute infection can result in excess co-clustering of sequences, while the extent of clustering observed is most sensitive to the fraction of infections sampled. PMID:22761556
Genetic and Chemical Profiling of Gymnema sylvestre Accessions from Central India: Its Implication for Quality Control and Therapeutic Potential of Plant

PubMed Central

Verma, Ashutosh Kumar; Dhawan, Sunita Singh; Singh, Seema; Bharati, Kumar Avinash; Jyotsana

2016-01-01

Background: Gymnema sylvestre, a vulnerable plant species, is mentioned in Indian Pharmacopeia as an antidiabetic drug. Objective: To study genetic and chemical diversity and its implications in accessions of G. sylvestre. Materials and Methods: Fourteen accessions of G. sylvestre were collected from Central India, and their genetic and chemical diversity was assessed using ISSR (inter simple sequence repeat) and HPLC (high performance liquid chromatography) fingerprinting methods. Results: Among the 40 ISSR primers screened, 15 were found to be polymorphic and collectively produced nine unique accession-specific bands. The maximum and minimum numbers of amplicons were noted for ISSR-15 and ISSR-11, respectively. ISSR-11 and ISSR-13 revealed 100% polymorphism. HPLC chromatograms showed that the accessions possess secondary metabolites of mid-polarity with considerable variability. Unknown peaks with retention times of 2.63, 3.41, 23.83, 24.50, and 44.67 were found to be universal. Comparative hierarchical clustering analysis based on the aforesaid fingerprints indicates that both techniques have equal potential to discriminate accessions according to the percentage of gymnemic acid in their leaf tissue. The second approach was more efficient at separating accessions according to their agro-climatic/collection site. Conclusion: Highly polymorphic ISSRs could be utilized as molecular probes for further selection of high gymnemic acid yielding accessions. The observed accession-specific bands may be used as descriptors for the protection of plant accessions and converted into sequence tagged site markers. The five universal peaks identified could be helpful in identifying various G. sylvestre-based herbal preparations. Summary: Nine accession-specific unique bands; five marker peaks for G. sylvestre; suitability of genetic and chemical fingerprinting. Abbreviations used: HPLC: High Performance Liquid Chromatography, ISSR: Inter Simple Sequence Repeats, CTAB: Cetyl Trimethylammonium Bromide, DNTP: Deoxynucleotide Triphosphates. PMID:27761067
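As an illustration of how band data of this kind are commonly scored (a minimal sketch, not the authors' pipeline), the Python below treats each accession as a row of presence/absence calls and derives the percentage of polymorphic bands, a Jaccard similarity between two accessions, and the accession-specific bands. The accession names and the band matrix are made-up toy data.

```python
from typing import Dict, List

# Hypothetical presence (1) / absence (0) calls for five band positions.
TOY_BANDS: Dict[str, List[int]] = {
    "acc_A": [1, 1, 0, 1, 0],
    "acc_B": [1, 0, 0, 1, 1],
    "acc_C": [1, 1, 1, 1, 0],
}


def percent_polymorphic(matrix: Dict[str, List[int]]) -> float:
    """Percentage of band positions that differ among accessions."""
    rows = list(matrix.values())
    n_bands = len(rows[0])
    polymorphic = sum(
        1 for j in range(n_bands) if len({row[j] for row in rows}) > 1
    )
    return 100.0 * polymorphic / n_bands


def jaccard(a: List[int], b: List[int]) -> float:
    """Shared bands divided by bands present in at least one profile."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


def unique_bands(matrix: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Band indices present in exactly one accession, keyed by accession."""
    n_bands = len(next(iter(matrix.values())))
    result: Dict[str, List[int]] = {name: [] for name in matrix}
    for j in range(n_bands):
        carriers = [name for name, row in matrix.items() if row[j] == 1]
        if len(carriers) == 1:
            result[carriers[0]].append(j)
    return result


if __name__ == "__main__":
    print(percent_polymorphic(TOY_BANDS))
    print(jaccard(TOY_BANDS["acc_A"], TOY_BANDS["acc_B"]))
    print(unique_bands(TOY_BANDS))
```

A pairwise similarity matrix built this way is the usual input to hierarchical clustering; the clustering step itself is left out of the sketch.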
Genetic and Chemical Profiling of Gymnema sylvestre Accessions from Central India: Its Implication for Quality Control and Therapeutic Potential of Plant.

PubMed

Verma, Ashutosh Kumar; Dhawan, Sunita Singh; Singh, Seema; Bharati, Kumar Avinash; Jyotsana

2016-07-01

Gymnema sylvestre, a vulnerable plant species, is mentioned in Indian Pharmacopeia as an antidiabetic drug. The objective was to study genetic and chemical diversity and its implications in accessions of G. sylvestre. Fourteen accessions of G. sylvestre were collected from Central India, and their genetic and chemical diversity was assessed using ISSR (inter simple sequence repeat) and HPLC (high performance liquid chromatography) fingerprinting methods. Among the 40 ISSR primers screened, 15 were found to be polymorphic and collectively produced nine unique accession-specific bands. The maximum and minimum numbers of amplicons were noted for ISSR-15 and ISSR-11, respectively. ISSR-11 and ISSR-13 revealed 100% polymorphism. HPLC chromatograms showed that the accessions possess secondary metabolites of mid-polarity with considerable variability. Unknown peaks with retention times of 2.63, 3.41, 23.83, 24.50, and 44.67 were found to be universal. Comparative hierarchical clustering analysis based on the aforesaid fingerprints indicates that both techniques have equal potential to discriminate accessions according to the percentage of gymnemic acid in their leaf tissue. The second approach was more efficient at separating accessions according to their agro-climatic/collection site. Highly polymorphic ISSRs could be utilized as molecular probes for further selection of high gymnemic acid yielding accessions. The observed accession-specific bands may be used as descriptors for the protection of plant accessions and converted into sequence tagged site markers. The five universal peaks identified could be helpful in identifying various G. sylvestre-based herbal preparations. Summary: Nine accession-specific unique bands; five marker peaks for G. sylvestre; suitability of genetic and chemical fingerprinting. Abbreviations used: HPLC: High Performance Liquid Chromatography, ISSR: Inter Simple Sequence Repeats, CTAB: Cetyl Trimethylammonium Bromide, DNTP: Deoxynucleotide Triphosphates.
Cultivar identification, pedigree verification, and diversity analysis among Peach (Prunus persica L. Batsch) Cultivars based on Simple Sequence Repeat markers

USDA-ARS's Scientific Manuscript database

The genetic relationships and pedigree inferences among peach (Prunus persica (L.) Batsch) accessions and breeding lines used in genetic improvement were evaluated using 15 simple sequence repeat (SSR) markers. A total of 80 alleles were detected among the 37 peach accessions with an average of 5.53...
THE USE OF INTER SIMPLE SEQUENCE REPEATS (ISSR) IN DISTINGUISHING NEIGHBORING DOUGLAS-FIR TREES AS A MEANS TO IDENTIFYING TREE ROOTS WITH ABOVE-GROUND BIOMASS

EPA Science Inventory

We are attempting to identify specific root fragments from soil cores with individual trees. We successfully used Inter Simple Sequence Repeats (ISSR) to distinguish neighboring old-growth Douglas-fir trees from one another, while maintaining identity among each tree's parts. W...
An integrated genetic linkage map of watermelon and genetic diversity based on single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers

USDA-ARS's Scientific Manuscript database

Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...
Analysis of mutational changes at the HLA locus in single human sperm.

PubMed

Huang, M M; Erlich, H A; Goodman, M F; Arnheim, N

1995-01-01

Using a simple and efficient single sperm PCR and direct sequencing method, we screened for HLA-DPB1 gene mutations that may give rise to new alleles at this highly polymorphic locus. More than 800 single sperm were studied from a heterozygous individual whose two alleles carried 16 nucleotide sequence differences clustered in six polymorphic regions. A potential microgene conversion event was detected. Unrepaired heteroduplex DNA similar to that which gives rise to postmeiotic segregation events in yeast was observed in three cases. Control experiments also revealed unusual sperm from DPB1 homozygous individuals. The data may help explain allelic diversity in the MHC and suggest that a possible source of human mosaicism may be incomplete DNA mismatch repair during gametogenesis.
A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

USDA-ARS's Scientific Manuscript database

A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...
M13-Tailed Simple Sequence Repeat (SSR) Markers in Studies of Genetic Diversity and Population Structure of Common Oat Germplasm.

PubMed

Onyśk, Agnieszka; Boczkowska, Maja

2017-01-01

Simple Sequence Repeat (SSR) markers are one of the most frequently used molecular markers in studies of crop diversity and population structure. This is due to their uniform distribution in the genome, the high polymorphism, reproducibility, and codominant character. Additional advantages are the possibility of automatic analysis and simple interpretation of the results. The M13 tagged PCR reaction significantly reduces the costs of analysis by the automatic genetic analyzers. Here, we also disclose a short protocol of SSR data analysis.
Reactivation, Replay, and Preplay: How It Might All Fit Together

PubMed Central

Buhry, Laure; Azizi, Amir H.; Cheng, Sen

2011-01-01

Sequential activation of neurons that occurs during “offline” states, such as sleep or awake rest, is correlated with neural sequences recorded during preceding exploration phases. This so-called reactivation, or replay, has been observed in a number of different brain regions such as the striatum, prefrontal cortex, primary visual cortex and, most prominently, the hippocampus. Reactivation largely co-occurs together with hippocampal sharp-waves/ripples, brief high-frequency bursts in the local field potential. Here, we first review the mounting evidence for the hypothesis that reactivation is the neural mechanism for memory consolidation during sleep. We then discuss recent results that suggest that offline sequential activity in the waking state might not be simple repetitions of previously experienced sequences. Some offline sequential activity occurs before animals are exposed to a novel environment for the first time, and some sequences activated offline correspond to trajectories never experienced by the animal. We propose a conceptual framework for the dynamics of offline sequential activity that can parsimoniously describe a broad spectrum of experimental results. These results point to a potentially broader role of offline sequential activity in cognitive functions such as maintenance of spatial representation, learning, or planning. PMID:21918724
Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics.

PubMed

Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A

2011-01-01

Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.
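The demultiplexing idea described above can be sketched in a few lines of Python. This is not the BarcodeCrucher implementation, whose internals the abstract does not describe; it is a hypothetical illustration that assumes every read begins with its 10-bp multiplexing identifier, that identifiers must match exactly, and that the barcode-to-sample table (invented here) is known in advance.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def demultiplex(
    reads: List[str], barcode_to_sample: Dict[str, str], barcode_len: int = 10
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group reads by sample using an exact match on the leading identifier.

    Returns (reads per sample with the identifier trimmed, unassigned reads).
    """
    by_sample: Dict[str, List[str]] = defaultdict(list)
    unassigned: List[str] = []
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(tag)
        if sample is None:
            unassigned.append(read)  # unknown identifier: keep for inspection
        else:
            by_sample[sample].append(insert)
    return dict(by_sample), unassigned


def to_fasta(by_sample: Dict[str, List[str]]) -> str:
    """Render the grouped reads as FASTA text, ordered by sample name."""
    lines: List[str] = []
    for sample in sorted(by_sample):
        for i, seq in enumerate(by_sample[sample], start=1):
            lines.append(f">{sample}_read{i}")
            lines.append(seq)
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy identifiers and reads, invented for the example.
    table = {"ACGTACGTAC": "taxon1", "TGCATGCATG": "taxon2"}
    reads = ["ACGTACGTACGGGTTT", "TGCATGCATGCCCAAA", "NNNNNNNNNNACGT"]
    grouped, leftover = demultiplex(reads, table)
    print(to_fasta(grouped))
    print("unassigned:", leftover)
```

A production tool would also tolerate sequencing errors in the identifier and apply quality filters; both are omitted here to keep the sketch short.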
Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies.

PubMed

DeMaere, Matthew Z; Darling, Aaron E

2018-02-01

Chromosome conformation capture (3C) and Hi-C DNA sequencing methods have rapidly advanced our understanding of the spatial organization of genomes and metagenomes. Many variants of these protocols have been developed, each with their own strengths. Currently there is no systematic means for simulating sequence data from this family of sequencing protocols, potentially hindering the advancement of algorithms to exploit this new datatype. We describe a computational simulator that, given simple parameters and reference genome sequences, will simulate Hi-C sequencing on those sequences. The simulator models the basic spatial structure in genomes that is commonly observed in Hi-C and 3C datasets, including the distance-decay relationship in proximity ligation, differences in the frequency of interaction within and across chromosomes, and the structure imposed by cells. A means to model the 3D structure of randomly generated topologically associating domains is provided. The simulator considers several sources of error common to 3C and Hi-C library preparation and sequencing methods, including spurious proximity ligation events and sequencing error. We have introduced the first comprehensive simulator for 3C and Hi-C sequencing protocols. We expect the simulator to have use in testing of Hi-C data analysis algorithms, as well as more general value for experimental design, where questions such as the required depth of sequencing, enzyme choice, and other decisions can be made in advance in order to ensure adequate statistical power with respect to experimental hypothesis testing.
Targeted Amplicon Sequencing (TAS): A Scalable Next-Gen Approach to Multilocus, Multitaxa Phylogenetics

PubMed Central

Bybee, Seth M.; Bracken-Grissom, Heather; Haynes, Benjamin D.; Hermansen, Russell A.; Byers, Robert L.; Clement, Mark J.; Udall, Joshua A.; Wilcox, Edward R.; Crandall, Keith A.

2011-01-01

Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach. PMID:22002916
Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

USDA-ARS's Scientific Manuscript database

Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...

The glycan-specific sulfotransferase (R77W)GalNAc-4-ST1 putatively responsible for peeling skin syndrome has normal properties consistent with a simple sequence polymorphism.

PubMed

Fiete, Dorothy; Mi, Yiling; Beranek, Mary; Baenziger, Nancy L; Baenziger, Jacques U

2017-05-01

Expanded access to DNA sequencing now fosters ready detection of site-specific human genome alterations whose actual significance requires in-depth functional study to rule in or out disease-causing mutations. This is a particular concern for genomic sequence differences in glycosyltransferases, whose implications are often difficult to assess. A recent whole-exome sequencing study identifies (c.229 C > T) in the GalNAc-4-ST1 glycosyltransferase (CHST8) as a disease-causing missense R77W mutation yielding the genodermatosis peeling skin syndrome (PSS) when homozygous. Cabral et al. (Genomics. 2012;99:202-208) cite this sequence change as reducing keratinocyte GalNAc-4-ST1 activity, thus decreasing glycosaminoglycan sulfation, as the mechanism for this blistering disorder. Such an identification could point toward potential clinical and/or prenatal diagnosis of a harmful medical condition. However, GalNAc-4-ST1 has minimal activity toward glycosaminoglycans, instead modifying terminal β1,4-linked GalNAc on N- and O-linked oligosaccharides on specific glycoproteins. We find expression, processing and catalytic activity of GalNAc-4-ST1 completely equivalent between wild type and (R77W) sulfotransferases. Moreover, keratinocytes have little or no GalNAc-4-ST1 mRNA, indicating that they do not express GalNAc-4-ST1. In addition, loss-of-function of GalNAc-4-ST1 primarily presents as reproductive system aberrations rather than skin effects. These findings, an allele frequency of 0.004357, and a 10-fold difference in prevalence of CHST8 (c.229 C > T, R77W) across different ethnic groups, suggest that this sequence represents a "passenger" distributed polymorphism, a simple sequence variant form of the enzyme having normal activity, rather than a "driver" disease-causing mutation that accounts for PSS. This study presents an example for guiding biomedical research initiatives, as well as medical and personal/family perspectives, regarding newly-identified genomic sequence differences. © The Author 2017. Published by Oxford University Press. All rights reserved.
A rapid and simple method of detection of Blepharisma japonicum using PCR and immobilisation on FTA paper

PubMed Central

Hide, Geoff; Hughes, Jacqueline M; McNuff, Robert

2003-01-01

Background: The rapid expansion in the availability of genome and DNA sequence information has opened up new possibilities for the development of methods for detecting free-living protozoa in environmental samples. The protozoan Blepharisma japonicum was used to investigate a rapid and simple detection system based on polymerase chain reaction (PCR) amplification from organisms immobilised on FTA paper. Results: Using primers designed from the α-tubulin genes of Blepharisma, specific and sensitive detection to the equivalent of a single Blepharisma cell could be achieved. Similar detection levels were found using water samples containing Blepharisma that were dried onto Whatman FTA paper. Conclusion: This system has potential as a sensitive, convenient detection system for Blepharisma and could be applied to other protozoan organisms. PMID:14516472
Simple chained guide trees give high-quality protein multiple sequence alignments

PubMed Central

Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.

2014-01-01

Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random. PMID:25002495
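To make the term concrete, a chained guide tree is a fully unbalanced (caterpillar) tree in which each sequence is added to the growing alignment one at a time. The sketch below, which is only an illustration and not code from the cited work, builds such a tree in Newick notation from a list of sequence names; the function name, the `shuffle` flag and the `seed` parameter are assumptions made for this example.

```python
import random
from typing import List, Optional


def chained_guide_tree(
    names: List[str], shuffle: bool = False, seed: Optional[int] = None
) -> str:
    """Return a chained (caterpillar) guide tree as a Newick string.

    The first two names form the innermost pair; every later name is
    joined to the tree built so far, giving (((a,b),c),d);
    """
    if not names:
        raise ValueError("at least one sequence name is required")
    order = list(names)
    if shuffle:
        # Random order, mirroring the abstract's remark that the effect
        # holds even when sequence order in the guide tree is random.
        random.Random(seed).shuffle(order)
    tree = order[0]
    for name in order[1:]:
        tree = f"({tree},{name})"
    return tree + ";"


if __name__ == "__main__":
    print(chained_guide_tree(["s1", "s2", "s3", "s4"]))
    print(chained_guide_tree(["s1", "s2", "s3", "s4"], shuffle=True, seed=0))
```

Because construction is a single pass over the names, it costs linear time and needs no pairwise distance matrix, which is consistent with the abstract's point that these are the fastest guide trees to construct.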
Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

PubMed Central

Huang, Yongjie; Mrázek, Jan

2014-01-01

Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877
Taxonomic evaluation of putative Streptomyces scabiei strains held in the ARS Culture Collection (NRRL) using multi-locus sequence analysis.

PubMed

Labeda, David P

2016-03-01

Multi-locus sequence analysis has been demonstrated to be a useful tool for identification of Streptomyces species and was previously applied to phylogenetically differentiate the type strains of species pathogenic on potatoes (Solanum tuberosum L.). The ARS Culture Collection (NRRL) contains 43 strains identified as Streptomyces scabiei deposited at various times since the 1950s and these were subjected to multi-locus sequence analysis utilising partial sequences of the house-keeping genes atpD, gyrB, recA, rpoB and trpB. Phylogenetic analyses confirmed the identity of 17 of these strains as Streptomyces scabiei, 9 of the strains as the potato-pathogenic species Streptomyces europaeiscabiei and 6 strains as potentially new phytopathogenic species. Of the 16 other strains, 12 were identified as members of previously described non-pathogenic Streptomyces species while the remaining 4 strains may represent heretofore unrecognised non-pathogenic species. This study demonstrated the value of this technique for the relatively rapid, simple and sensitive molecular identification of Streptomyces strains held in culture collections.
Simulations Using Random-Generated DNA and RNA Sequences

ERIC Educational Resources Information Center

Bryce, C. F. A.

1977-01-01

Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
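The original program was written in BASIC and is not reproduced in the record. As a rough modern analogue of the exercise, the Python sketch below generates a random DNA sequence and supports the student tasks mentioned: complementary strands, base composition and triplet codon frequencies. All function names are hypothetical, and the transcription helper assumes the input is the coding strand.

```python
import random
from collections import Counter
from typing import Dict, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def random_dna(length: int, seed: Optional[int] = None) -> str:
    """Random DNA sequence with equal probability for each base."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def complement(dna: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return dna.translate(_COMPLEMENT)


def reverse_complement(dna: str) -> str:
    """Complementary strand read 5' to 3'."""
    return complement(dna)[::-1]


def transcribe(dna: str) -> str:
    """RNA equivalent of a coding-strand DNA sequence (T replaced by U)."""
    return dna.replace("T", "U")


def base_composition(seq: str) -> Dict[str, int]:
    """Count of each base in the sequence."""
    return dict(Counter(seq))


def codon_counts(seq: str) -> Dict[str, int]:
    """Counts of non-overlapping triplets in frame 1; a trailing partial
    codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return dict(Counter(seq[i:i + 3] for i in range(0, usable, 3)))


if __name__ == "__main__":
    dna = random_dna(30, seed=42)
    print(dna)
    print(complement(dna))
    print(transcribe(dna))
    print(base_composition(dna))
    print(codon_counts(dna))
```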
A SSR-based genetic linkage map of cultivated peanut (Arachis hypogaea L.)

USDA-ARS's Scientific Manuscript database

The objective of this study was to construct a molecular linkage map of cultivated tetraploid peanut using simple sequence repeat (SSR) markers derived primarily from peanut genomic sequences, expressed sequence tags (ESTs), and by "data mining" sequences released in GenBank. Three recombinant inbre...
Development and transferability of black and red raspberry microsatellite markers from short-read sequences

USDA-ARS's Scientific Manuscript database

The advent of next-generation sequencing technologies has been a boon to the cost-effective development of molecular markers, particularly in non-model species. Here, we demonstrate the efficiency of microsatellite or simple sequence repeat (SSR) marker development from short-read sequences using th...
Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

USDA-ARS's Scientific Manuscript database

Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...
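To show what such an SSR motif search involves, here is a minimal regular-expression sketch in Python. It is not one of the bioinformatics programs the record refers to; the thresholds (motifs of 2 to 6 bases, at least four tandem copies) and the function names are assumptions chosen for illustration, and real tools typically use motif-length-specific thresholds and also design the flanking primers.

```python
import re
from typing import List, Tuple


def is_primitive(motif: str) -> bool:
    """True if the motif is not a repetition of a shorter unit
    (for example 'ATAT' is just 'AT' twice, so it is not primitive)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(
    seq: str, min_len: int = 2, max_len: int = 6, min_repeats: int = 4
) -> List[Tuple[int, str, int]]:
    """Return (start, motif, copies) for perfect tandem repeats.

    Simplification: matches are taken left to right without overlap, so a
    repeat that begins inside a discarded non-primitive match can be missed.
    """
    seq = seq.upper()
    hits: List[Tuple[int, str, int]] = []
    for k in range(min_len, max_len + 1):
        # One captured unit of length k followed by min_repeats-1 or more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GG" + "AT" * 5 + "C" + "GAA" * 4 + "TT"  # invented test sequence
    for start, motif, copies in find_ssrs(toy):
        print(start, motif, copies)
```

The reported start positions are 0-based; sequence on either side of each hit is where flanking primers would be placed.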
Development of chloroplast simple sequence repeats (cpSSRs) for the intraspecific study of Gracilaria tenuistipitata (Gracilariales, Rhodophyta) from different populations

PubMed Central

2014-01-01

Background: Gracilaria tenuistipitata is an agarophyte with substantial economic potential because of its high growth rate and tolerance to a wide range of environmental factors. This red seaweed is intensively cultured in China for the production of agar and fodder for abalone. Microsatellite markers were developed from the chloroplast genome of G. tenuistipitata var. liui to differentiate G. tenuistipitata obtained from six different localities: four from Peninsular Malaysia, one from Thailand and one from Vietnam. Eighty G. tenuistipitata specimens were analyzed using eight simple sequence repeat (SSR) primer-pairs that we developed for polymerase chain reaction (PCR) amplification. Findings: Five mononucleotide primer-pairs and one trinucleotide primer-pair exhibited monomorphic alleles, whereas the other two primer-pairs separated the G. tenuistipitata specimens into two main clades. G. tenuistipitata from Thailand and Vietnam were grouped into one clade, and the populations from Batu Laut, Middle Banks and Kuah (Malaysia) were grouped into another clade. The combined dataset of these two primer-pairs separated G. tenuistipitata obtained from Kelantan, Malaysia from that obtained from other localities. Conclusions: Based on the variations in repeated nucleotides of microsatellite markers, our results suggested that the populations of G. tenuistipitata were distributed into two main geographical regions: (i) populations on the west coast of Peninsular Malaysia and (ii) populations facing the South China Sea. The correct identification of G. tenuistipitata strains with traits of high economic potential will be advantageous for the mass cultivation of seaweeds. PMID:24490797
Molecular genetic variation and structure of Southeast Asian crocodile (Tomistoma schlegelii): Comparative potentials of SSRs versus ISSRs.

PubMed

Shafiei-Astani, Behnam; Ong, Alan Han Kiat; Valdiani, Alireza; Tan, Soon Guan; Yien, Christina Yong Seok; Ahmady, Fatemeh; Alitheen, Noorjahan Banu; Ng, Wei Lun; Kuar, Taranjeet

2015-10-15

Tomistoma schlegelii, also referred to as the "false gharial", is one of the most exclusive and least known of the world's freshwater crocodilians, limited to Southeast Asia. Indeed, the lack of economic value of its skin has led to neglect of the species' biodiversity. The current study aimed to investigate this issue using 40 simple sequence repeat (SSR) primer pairs and 45 inter-simple sequence repeat (ISSR) primers. DNA analysis of 17 T. schlegelii samples using the SSR and ISSR markers produced a total of 49 and 108 polymorphic bands, respectively. Furthermore, the SSR- and ISSR-based cluster analyses both generated two main clusters. However, the SSR-based results were found to be more in line with the geographical distribution of the crocodile samples collected across the country than the ISSR-based results. The observed heterozygosity (HO) and expected heterozygosity (HE) of the polymorphic SSRs ranged from 0.588 to 1 and from 0.470 to 0.891, respectively. The present results suggest that the Malaysian T. schlegelii populations originated from a core population of crocodiles. In cooperation with the SSR markers, the ISSRs showed high potential for studying the genetic variation of T. schlegelii, and these markers are suitable for use in conservation genetic programs for this endangered species. Both SSR- and ISSR-based STRUCTURE analyses suggested that all the individuals of T. schlegelii are genetically similar to each other. Copyright © 2015 Elsevier B.V. All rights reserved.
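For readers unfamiliar with the two heterozygosity statistics quoted above, the sketch below shows how they are conventionally computed for one codominant locus: HO is the fraction of heterozygous individuals, and HE is one minus the sum of squared allele frequencies. This is a textbook formulation, not the authors' software; it omits the small-sample correction some programs apply, and the genotypes are invented allele sizes.

```python
from collections import Counter
from typing import List, Tuple

Genotype = Tuple[int, int]  # two allele sizes (bp) at one SSR locus


def observed_heterozygosity(genotypes: List[Genotype]) -> float:
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(genotypes: List[Genotype]) -> float:
    """1 - sum(p_i^2) over allele frequencies p_i (no sample-size correction)."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


if __name__ == "__main__":
    # Hypothetical genotypes for four individuals at a single locus.
    toy = [(150, 154), (150, 150), (154, 158), (150, 154)]
    print("HO =", observed_heterozygosity(toy))
    print("HE =", expected_heterozygosity(toy))
```

In the toy data three of four individuals are heterozygous, so HO is 0.75; the allele counts are 4, 3 and 1 out of 8, so HE is 1 - 26/64, about 0.594.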
Genetic diversity of an Azorean endemic and endangered plant species inferred from inter-simple sequence repeat markers.

PubMed

Lopes, Maria S; Mendonça, Duarte; Bettencourt, Sílvia X; Borba, Ana R; Melo, Catarina; Baptista, Cláudio; da Câmara Machado, Artur

2014-06-26

Knowledge of the levels and distribution of genetic diversity is important for designing conservation strategies for threatened and endangered species so as to guarantee sustainable survival of populations and to preserve their evolutionary potential. Picconia azorica is a valuable Azorean endemic species recently classified as endangered. To contribute information useful for the establishment of conservation programmes, the genetic variability and differentiation among 230 samples from 11 populations collected on three Azorean islands were assessed with eight inter-simple sequence repeat markers. A total of 64 polymorphic loci were detected. The majority of genetic variability was found within populations, and no genetic structure was detected between populations or between islands. Also, the coefficient of genetic differentiation and the level of gene flow indicate that geographical distances do not act as barriers to gene flow. In order to ensure the survival of populations, in situ and ex situ management practices should be considered, including artificial propagation through the use of plant tissue culture techniques, not only for the restoration of habitat but also for the sustainable use of its valuable wood. Published by Oxford University Press on behalf of the Annals of Botany Company.
Determining Phylogenetic Relationships Among Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeat (ISSR) Markers.

PubMed

Haider, Nadia

2017-01-01

Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.
G4RNA: an RNA G-quadruplex database

PubMed Central

Garant, Jean-Michel; Luce, Mikael J.; Scott, Michelle S.

2015-01-01

G-quadruplexes (G4) are tetrahelical structures formed from planar arrangement of guanines in nucleic acids. A simple, regular motif was originally proposed to describe G4-forming sequences. More recently, however, formation of G4 was discovered to depend, at least in part, on the contextual backdrop of neighboring sequences. Prediction of G4 folding is thus becoming more challenging as G4 outlier structures, not described by the originally proposed motif, are increasingly reported. Recent observations thus call for a comprehensive tool, capable of consolidating the expanding information on tested G4s, in order to conduct systematic comparative analyses of G4-promoting sequences. The G4RNA Database we propose was designed to help meet the need for easily-retrievable data on known RNA G4s. A user-friendly, flexible query system allows for data retrieval on experimentally tested sequences, from many separate genes, to assess G4-folding potential. Query output sorts data according to sequence position, G4 likelihood, experimental outcomes and associated bibliographical references. G4RNA also provides an ideal foundation to collect and store additional sequence and experimental data, considering the growing interest G4s currently generate. Database URL: scottgroup.med.usherbrooke.ca/G4RNA PMID:26200754
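The abstract does not spell out the "simple, regular motif". The sketch below assumes the commonly cited canonical form, four runs of at least three guanines separated by loops of one to seven nucleotides, and scans an RNA sequence for it with a regular expression. It is an illustration of that assumed pattern only; it is unrelated to the G4RNA database's own query system and, as the abstract notes, it will miss outlier G4s that do not fit the motif.

```python
import re
from typing import List, Tuple

# Assumed canonical motif: G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}
G4_MOTIF = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)


def find_canonical_g4(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for non-overlapping motif hits.

    DNA input is accepted and converted to RNA letters (T to U) first.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(0)) for m in G4_MOTIF.finditer(rna)]


if __name__ == "__main__":
    # Invented sequences: the first has four GGG runs, the second only three.
    print(find_canonical_g4("AAGGGAGGGCUGGGAUGGGCC"))
    print(find_canonical_g4("AAGGGAGGGCUGGAUGGGCC"))
```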
Characterization and Amplification of Gene-Based Simple Sequence Repeat (SSR) Markers in Date Palm.

PubMed

Zhao, Yongli; Keremane, Manjunath; Prakash, Channapatna S; He, Guohao

2017-01-01

The paucity of molecular markers limits the application of genetic and genomic research in date palm (Phoenix dactylifera L.). Availability of expressed sequence tag (EST) sequences in date palm may provide a good resource for developing gene-based markers. This study characterizes a substantial fraction of transcriptome sequences containing simple sequence repeats (SSRs) from the EST sequences in date palm. The EST sequences studied are mainly homologous to those of Elaeis guineensis and Musa acuminata. A total of 911 gene-based SSR markers, characterized with functional annotations, have provided a useful basis not only for discovering candidate genes and understanding genetic basis of traits of interest but also for developing genetic and genomic tools for molecular research in date palm, such as diversity study, quantitative trait locus (QTL) mapping, and molecular breeding. The procedures of DNA extraction, polymerase chain reaction (PCR) amplification of these gene-based SSR markers, and gel electrophoresis of PCR products are described in this chapter.
Construction and Characterization of an in-vivo Linear Covalently Closed DNA Vector Production System

PubMed Central

2012-01-01

Background: While safer than their viral counterparts, conventional non-viral gene delivery DNA vectors offer a limited safety profile. They often result in the delivery of unwanted prokaryotic sequences, antibiotic resistance genes, and the bacterial origins of replication to the target, which may lead to the stimulation of unwanted immunological responses due to their chimeric DNA composition. Such vectors may also impart the potential for chromosomal integration, thus potentiating oncogenesis. We sought to engineer an in vivo system for the quick and simple production of safer DNA vector alternatives that were devoid of non-transgene bacterial sequences and would lethally disrupt the host chromosome in the event of an unwanted vector integration event. Results: We constructed a parent eukaryotic expression vector possessing a specialized manufactured multi-target site called “Super Sequence”, and engineered E. coli cells (R-cell) that conditionally produce phage-derived recombinase Tel (PY54), TelN (N15), or Cre (P1). Passage of the parent plasmid vector through R-cells under optimized conditions resulted in rapid, efficient, one-step in vivo generation of mini lcc (linear covalently closed; Tel/TelN-cell) or mini ccc (circular covalently closed; Cre-cell) DNA constructs, separated from the backbone plasmid DNA. Site-specific integration of lcc plasmids into the host chromosome resulted in chromosomal disruption and 10^5-fold lower viability than that seen with the ccc counterpart. Conclusion: We offer a high efficiency mini DNA vector production system that confers simple, rapid and scalable in vivo production of mini lcc DNA vectors that possess all the benefits of “minicircle” DNA vectors and virtually eliminate the potential for undesirable vector integration events. PMID:23216697
Microsatellites for Lindera species

Treesearch

Craig S. Echt; D. Deemer; T.L. Kubisiak; C.D. Nelson

2006-01-01

Microsatellite markers were developed for conservation genetic studies of Lindera melissifolia (pondberry), a federally endangered shrub of southern bottomland ecosystems. Microsatellite sequences were obtained from DNA libraries that were enriched for the (AC)n simple sequence repeat motif. From 35 clone sequences, 20 primer...
Improved detection of genetic markers of antimicrobial resistance by hybridization probe-based melting curve analysis using primers to mask proximal mutations: examples include the influenza H275Y substitution.

PubMed

Whiley, David M; Jacob, Kevin; Nakos, Jennifer; Bletchly, Cheryl; Nimmo, Graeme R; Nissen, Michael D; Sloots, Theo P

2012-06-01

Numerous real-time PCR assays have been described for detection of the influenza A H275Y alteration. However, the performance of these methods can be undermined by sequence variation in the regions flanking the codon of interest. This is a problem encountered more broadly in microbial diagnostics. In this study, we developed a modification of hybridization probe-based melting curve analysis, whereby primers are used to mask proximal mutations in the sequence targets of hybridization probes, so as to limit the potential for sequence variation to interfere with typing. The approach was applied to the H275Y alteration of the influenza A (H1N1) 2009 strain, as well as a Neisseria gonorrhoeae mutation associated with antimicrobial resistance. Assay performances were assessed using influenza A and N. gonorrhoeae strains characterized by DNA sequencing. The modified hybridization probe-based approach proved successful in limiting the effects of proximal mutations, with the results of melting curve analyses being 100% consistent with the results of DNA sequencing for all influenza A and N. gonorrhoeae strains tested. Notably, these included influenza A and N. gonorrhoeae strains exhibiting additional mutations in hybridization probe targets. Of particular interest was that the H275Y assay correctly typed influenza A strains harbouring a T822C nucleotide substitution, previously shown to interfere with H275Y typing methods. Overall our modified hybridization probe-based approach provides a simple means of circumventing problems caused by sequence variation, and offers improved detection of the influenza A H275Y alteration and potentially other resistance mechanisms.
Hierarchy and extremes in selections from pools of randomized proteins

PubMed Central

Boyer, Sébastien; Biswas, Dipanwita; Kumar Soshee, Ananda; Scaramozzino, Natale; Nizak, Clément; Rivoire, Olivier

2016-01-01

Variation and selection are the core principles of Darwinian evolution, but quantitatively relating the diversity of a population to its capacity to respond to selection is challenging. Here, we examine this problem at a molecular level in the context of populations of partially randomized proteins selected for binding to well-defined targets. We built several minimal protein libraries, screened them in vitro by phage display, and analyzed their response to selection by high-throughput sequencing. A statistical analysis of the results reveals two main findings. First, libraries with the same sequence diversity but built around different “frameworks” typically have vastly different responses; second, the distribution of responses of the best binders in a library follows a simple scaling law. We show how an elementary probabilistic model based on extreme value theory rationalizes the latter finding. Our results have implications for designing synthetic protein libraries, estimating the density of functional biomolecules in sequence space, characterizing diversity in natural populations, and experimentally investigating evolvability (i.e., the potential for future evolution). PMID:26969726
Hierarchy and extremes in selections from pools of randomized proteins.

PubMed

Boyer, Sébastien; Biswas, Dipanwita; Kumar Soshee, Ananda; Scaramozzino, Natale; Nizak, Clément; Rivoire, Olivier

2016-03-29

Variation and selection are the core principles of Darwinian evolution, but quantitatively relating the diversity of a population to its capacity to respond to selection is challenging. Here, we examine this problem at a molecular level in the context of populations of partially randomized proteins selected for binding to well-defined targets. We built several minimal protein libraries, screened them in vitro by phage display, and analyzed their response to selection by high-throughput sequencing. A statistical analysis of the results reveals two main findings. First, libraries with the same sequence diversity but built around different "frameworks" typically have vastly different responses; second, the distribution of responses of the best binders in a library follows a simple scaling law. We show how an elementary probabilistic model based on extreme value theory rationalizes the latter finding. Our results have implications for designing synthetic protein libraries, estimating the density of functional biomolecules in sequence space, characterizing diversity in natural populations, and experimentally investigating evolvability (i.e., the potential for future evolution).

Identification and characterization of 43 microsatellite markers derived from expressed sequence tags of the sea cucumber (Apostichopus japonicus)

NASA Astrophysics Data System (ADS)

Jiang, Qun; Li, Qi; Yu, Hong; Kong, Lingfeng

2011-06-01

The sea cucumber Apostichopus japonicus is a commercially and ecologically important species in China. A total of 3056 potential unigenes were generated after assembling 7597 A. japonicus expressed sequence tags (ESTs) downloaded from GenBank. Two hundred and fifty microsatellite-containing ESTs (8.18%) and 299 simple sequence repeats (SSRs) were detected. The average density of SSRs was 1 per 7.403 kb of EST after redundancy elimination. Di-nucleotide repeat motifs appeared to be the most abundant type with a percentage of 69.90%. Of the 126 primer pairs designed, 90 amplified the expected products and 43 showed polymorphism in 30 individuals tested. The number of alleles per locus ranged from 2 to 26 with an average of 7.0 alleles, and the observed and expected heterozygosities varied from 0.067 to 1.000 and from 0.066 to 0.959, respectively. These new EST-derived microsatellite markers would provide sufficient polymorphism for population genetic studies and genome mapping of this sea cucumber species.
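The observed and expected heterozygosities reported above are standard per-locus summaries. A minimal sketch of how they can be computed from diploid genotypes is shown here, assuming the uncorrected estimator He = 1 - sum(p_i^2); the abstract does not say which estimator or software the authors used, and the genotype data below are hypothetical.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Expected heterozygosity uses 1 - sum(p_i^2) without small-sample correction.
    """
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected

# Hypothetical genotypes (allele sizes in bp) for four individuals at one locus
sample = [(152, 156), (152, 152), (156, 156), (152, 156)]
ho, he = heterozygosity(sample)
```

With two alleles at equal frequency, both values in this toy example come out at 0.5.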
Development of Novel SSR Markers for Flax (Linum usitatissimum L.) Using Reduced-Representation Genome Sequencing.

PubMed

Wu, Jianzhong; Zhao, Qian; Wu, Guangwen; Zhang, Shuquan; Jiang, Tingbo

2016-01-01

Flax (Linum usitatissimum L.) is a major fiber and oil yielding crop grown in northeastern China. Identification of flax molecular markers is a key step toward improving flax yield and quality via marker-assisted breeding. Simple sequence repeat (SSR) markers, which are based on genomic structural variation, are considered the most valuable type of genetic marker for this purpose. In this study, we screened 1574 microsatellites from Linum usitatissimum L. obtained using reduced representation genome sequencing (RRGS) to systematically identify SSR markers. The resulting set of microsatellites consisted mainly of trinucleotide (56.10%) and dinucleotide (35.23%) repeats, with each motif consisting of 5-8 repeats. We then evaluated marker sensitivity and specificity based on samples of 48 flax isolates obtained from northeastern China. Using the new SSR panel, the results demonstrated that fiber flax and oilseed flax varieties clustered into two well separated groups. The novel SSR markers developed in this study show potential value for selection of varieties for use in flax breeding programs.
Reducing DNA context dependence in bacterial promoters

PubMed Central

Carr, Swati B.; Densmore, Douglas M.

2017-01-01

Variation in the DNA sequence upstream of bacterial promoters is known to affect the expression levels of the products they regulate, sometimes dramatically. While neutral synthetic insulator sequences have been found to buffer promoters from upstream DNA context, there are no established methods for designing effective insulator sequences with predictable effects on expression levels. We address this problem with Degenerate Insulation Screening (DIS), a novel method based on a randomized 36-nucleotide insulator library and a simple, high-throughput, flow-cytometry-based screen that randomly samples from a library of 4^36 potential insulated promoters. The results of this screen can then be compared against a reference uninsulated device to select a set of insulated promoters providing a precise level of expression. We verify this method by insulating the constitutive, inducible, and repressible promoters of a four transcriptional-unit inverter (NOT-gate) circuit, finding both that order dependence is largely eliminated by insulation and that circuit performance is also significantly improved, with a 5.8-fold mean improvement in on/off ratio. PMID:28422998
FOUNTAIN: A JAVA open-source package to assist large sequencing projects

PubMed Central

Buerstedde, Jean-Marie; Prill, Florian

2001-01-01

Background Better automation, lower cost per reaction and a heightened interest in comparative genomics has led to a dramatic increase in DNA sequencing activities. Although the large sequencing projects of specialized centers are supported by in-house bioinformatics groups, many smaller laboratories face difficulties managing the appropriate processing and storage of their sequencing output. The challenges include documentation of clones, templates and sequencing reactions, and the storage, annotation and analysis of the large number of generated sequences. Results We describe here a new program, named FOUNTAIN, for the management of large sequencing projects. FOUNTAIN uses the JAVA computer language and data storage in a relational database. Starting with a collection of sequencing objects (clones), the program generates and stores information related to the different stages of the sequencing project using a web browser interface for user input. The generated sequences are subsequently imported and annotated based on BLAST searches against the public databases. In addition, simple algorithms to cluster sequences and determine putative polymorphic positions are implemented. Conclusions A simple, but flexible and scalable software package is presented to facilitate data generation and storage for large sequencing projects. Open source and largely platform and database independent, we wish FOUNTAIN to be improved and extended in a community effort. PMID:11591214
Early forest fire detection using principal component analysis of infrared video

NASA Astrophysics Data System (ADS)

Saghri, John A.; Radjabi, Ryan; Jacobs, John T.

2011-09-01

A land-based early forest fire detection scheme which exploits the infrared (IR) temporal signature of fire plume is described. Unlike common land-based and/or satellite-based techniques which rely on measurement and discrimination of fire plume directly from its infrared and/or visible reflectance imagery, this scheme is based on exploitation of fire plume temporal signature, i.e., temperature fluctuations over the observation period. The method is simple and relatively inexpensive to implement. The false alarm rate is expected to be lower than that of the existing methods. Land-based infrared (IR) cameras are installed in a step-stare-mode configuration in potential fire-prone areas. The sequence of IR video frames from each camera is digitally processed to determine if there is a fire within the camera's field of view (FOV). The process involves applying a principal component transformation (PCT) to each nonoverlapping sequence of video frames from the camera to produce a corresponding sequence of temporally-uncorrelated principal component (PC) images. Since pixels that form a fire plume exhibit statistically similar temporal variation (i.e., have a unique temporal signature), PCT conveniently renders the footprint/trace of the fire plume in low-order PC images. The PC image which best reveals the trace of the fire plume is then selected and spatially filtered via simple threshold and median filter operations to remove the background clutter, such as traces of moving tree branches due to wind.
Utility of the heteroduplex assay (HDA) as a simple and cost-effective tool for the identification of HIV type 1 dual infections in resource-limited settings.

PubMed

Powell, Rebecca L R; Urbanski, Mateusz M; Burda, Sherri; Nanfack, Aubin; Kinge, Thompson; Nyambi, Phillipe N

2008-01-01

The predominance of unique recombinant forms (URFs) of HIV-1 in Cameroon suggests that dual infection, the concomitant or sequential infection with genetically distinct HIV-1 strains, occurs frequently in this region; yet, identifying dual infection among large HIV cohorts in local, resource-limited settings is uncommon, since this generally relies on labor-intensive and costly sequencing methods. Consequently, there is a need to develop an effective, cost-efficient method appropriate to the developing world to identify these infections. In the present study, the heteroduplex assay (HDA) was used to verify dual or single infection status, as shown by traditional sequence analysis, for 15 longitudinally sampled study subjects from Cameroon. Heteroduplex formation, indicative of a dual infection, was identified for all five study subjects shown by sequence analysis to be dually infected. Conversely, heteroduplex formation was not detectable for all 10 HDA reactions of the singly infected study subjects. These results suggest that the HDA is a simple yet powerful and inexpensive tool for the detection of both intersubtype and intrasubtype dual infections, and that the HDA harbors significant potential for reliable, high-throughput screening for dual infection. As these infections and the recombinants they generate facilitate leaps in HIV-1 evolution, and may present major challenges for treatment and vaccine design, this assay will be critical for monitoring the continuing pandemic in regions of the world where HIV-1 viral diversity is broad.
Evaluation of sampling and storage procedures on preserving the community structure of stool microbiota: A simple at-home toilet-paper collection method.

PubMed

Al, Kait F; Bisanz, Jordan E; Gloor, Gregory B; Reid, Gregor; Burton, Jeremy P

2018-01-01

The increasing interest in the impact of the gut microbiota on health and disease has resulted in multiple human microbiome-related studies emerging. However, multiple sampling methods are being used, making cross-comparison of results difficult. To avoid additional clinic visits and increase patient recruitment to these studies, there is the potential to utilize at-home stool sampling. The aim of this pilot study was to compare simple self-sampling collection and storage methods. To simulate storage conditions, stool samples from three volunteers were freshly collected, placed on toilet tissue, and stored at four temperatures (-80, 7, 22 and 37°C), either dry or in the presence of a stabilization agent (RNAlater®) for 3 or 7 days. Using 16S rRNA gene sequencing by Illumina, the effect of storage variations for each sample was compared to a reference community from fresh, unstored counterparts. Fastq files may be accessed in the NCBI Sequence Read Archive: Bioproject ID PRJNA418287. Microbial diversity and composition were not significantly altered by any storage method. Samples were always separable based on participant, regardless of storage method, suggesting there was no need for sample preservation by a stabilization agent. In summary, if immediate sample processing is not feasible, short term storage of unpreserved stool samples on toilet paper offers a reliable way to assess the microbiota composition by 16S rRNA gene sequencing.
Differential effects of simple repeating DNA sequences on gene expression from the SV40 early promoter.

PubMed

Amirhaeri, S; Wohlrab, F; Wells, R D

1995-02-17

The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.
Simple-MSSM: a simple and efficient method for simultaneous multi-site saturation mutagenesis.

PubMed

Cheng, Feng; Xu, Jian-Miao; Xiang, Chao; Liu, Zhi-Qiang; Zhao, Li-Qing; Zheng, Yu-Guo

2017-04-01

To develop a practically simple and robust multi-site saturation mutagenesis (MSSM) method that enables simultaneous recombination of amino acid positions for focused mutant library generation. A general restriction enzyme-free and ligase-free MSSM method (Simple-MSSM) based on prolonged overlap extension PCR (POE-PCR) and Simple Cloning techniques. As a proof of principle of Simple-MSSM, the gene of eGFP (enhanced green fluorescent protein) was used as a template gene for simultaneous mutagenesis of five codons. Forty-eight randomly selected clones were sequenced. Sequencing revealed that all the 48 clones showed at least one mutant codon (mutation efficiency = 100%), and 46 out of the 48 clones had mutations at all the five codons. The obtained diversities at these five codons are 27, 24, 26, 26 and 22, respectively, which correspond to 84, 75, 81, 81, 69% of the theoretical diversity offered by NNK-degeneration (32 codons; NNK, K = T or G). The enzyme-free Simple-MSSM method can simultaneously and efficiently saturate five codons within one day, and therefore avoid missing interactions between residues in interacting amino acid networks.
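The diversity percentages quoted above follow directly from the size of the NNK codon set. The short sketch below enumerates the NNK codons (N = A, C, G or T; K = G or T, as stated in the abstract) and recomputes the reported coverage figures; the variable names are illustrative only.

```python
from itertools import product

# NNK degeneracy: any base at positions 1 and 2, G or T at position 3
nnk_codons = ["".join(bases) for bases in product("ACGT", "ACGT", "GT")]
theoretical = len(nnk_codons)

# Distinct codons observed at each of the five mutated positions (from the abstract)
observed = [27, 24, 26, 26, 22]
coverage_percent = [round(100 * n / theoretical) for n in observed]
```

The enumeration gives 32 codons, and the rounded coverages reproduce the 84, 75, 81, 81 and 69% stated in the abstract.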
Method for Constructing Composite Response Surfaces by Combining Neural Networks with Polynomial Interpolation or Estimation Techniques

NASA Technical Reports Server (NTRS)

Rai, Man Mohan (Inventor); Madavan, Nateri K. (Inventor)

2007-01-01

A method and system for data modeling that incorporates the advantages of both traditional response surface methodology (RSM) and neural networks is disclosed. The invention partitions the parameters into a first set of s simple parameters, where observable data are expressible as low order polynomials, and c complex parameters that reflect more complicated variation of the observed data. Variation of the data with the simple parameters is modeled using polynomials; and variation of the data with the complex parameters at each vertex is analyzed using a neural network. Variations with the simple parameters and with the complex parameters are expressed using a first sequence of shape functions and a second sequence of neural network functions. The first and second sequences are multiplicatively combined to form a composite response surface, dependent upon the parameter values, that can be used to identify an accurate mode
Graph-based optimization of epitope coverage for vaccine antigen design

DOE PAGES

Theiler, James Patrick; Korber, Bette Tina Marie

2017-01-29

Epigraph is a recently developed algorithm that enables the computationally efficient design of single or multi-antigen vaccines to maximize the potential epitope coverage for a diverse pathogen population. Potential epitopes are defined as short contiguous stretches of proteins, comparable in length to T-cell epitopes. This optimal coverage problem can be formulated in terms of a directed graph, with candidate antigens represented as paths that traverse this graph. Epigraph protein sequences can also be used as the basis for designing peptides for experimental evaluation of immune responses in natural infections to highly variable proteins. The epigraph tool suite also enables rapid characterization of populations of diverse sequences from an immunological perspective. Fundamental distance measures are based on immunologically relevant shared potential epitope frequencies, rather than simple Hamming or phylogenetic distances. Here, we provide a mathematical description of the epigraph algorithm, include a comparison of different heuristics that can be used when graphs are not acyclic, and we describe an additional tool we have added to the web-based epigraph tool suite that provides frequency summaries of all distinct potential epitopes in a population. Lastly, we also show examples of the graphical output and summary tables that can be generated using the epigraph tool suite and explain their content and applications.
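The quantity being maximized, potential epitope coverage, can be illustrated with a small sketch. It treats every contiguous k-mer of a protein as a potential epitope and scores a candidate antigen by the fraction of k-mer occurrences in the population that it contains. The function names, the value of k and the toy sequences are assumptions for illustration; this is a scoring sketch only and does not implement the graph-based epigraph optimization itself.

```python
from collections import Counter

def kmers(sequence, k):
    """Return all contiguous k-mers (potential epitopes) of a sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def epitope_coverage(antigen, population, k=9):
    """Fraction of k-mer occurrences in the population that appear in the antigen."""
    population_counts = Counter(km for seq in population for km in kmers(seq, k))
    total = sum(population_counts.values())
    if total == 0:
        return 0.0
    antigen_kmers = set(kmers(antigen, k))
    covered = sum(count for km, count in population_counts.items()
                  if km in antigen_kmers)
    return covered / total

# Toy example with k=3 so the arithmetic is easy to follow
toy_population = ["ABCDE", "ABCDF"]
toy_score = epitope_coverage("ABCDE", toy_population, k=3)
```

In the toy case the population holds six 3-mer occurrences, five of which are present in the candidate, so the score is 5/6.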
Graph-based optimization of epitope coverage for vaccine antigen design

DOE Office of Scientific and Technical Information (OSTI.GOV)

Theiler, James Patrick; Korber, Bette Tina Marie

Epigraph is a recently developed algorithm that enables the computationally efficient design of single or multi-antigen vaccines to maximize the potential epitope coverage for a diverse pathogen population. Potential epitopes are defined as short contiguous stretches of proteins, comparable in length to T-cell epitopes. This optimal coverage problem can be formulated in terms of a directed graph, with candidate antigens represented as paths that traverse this graph. Epigraph protein sequences can also be used as the basis for designing peptides for experimental evaluation of immune responses in natural infections to highly variable proteins. The epigraph tool suite also enables rapid characterization of populations of diverse sequences from an immunological perspective. Fundamental distance measures are based on immunologically relevant shared potential epitope frequencies, rather than simple Hamming or phylogenetic distances. Here, we provide a mathematical description of the epigraph algorithm, include a comparison of different heuristics that can be used when graphs are not acyclic, and we describe an additional tool we have added to the web-based epigraph tool suite that provides frequency summaries of all distinct potential epitopes in a population. Lastly, we also show examples of the graphical output and summary tables that can be generated using the epigraph tool suite and explain their content and applications.
Principles of protein folding--a perspective from simple exact models.

PubMed Central

Dill, K. A.; Bromberg, S.; Yue, K.; Fiebig, K. M.; Yee, D. P.; Thomas, P. D.; Chan, H. S.

1995-01-01

General principles of protein structure, stability, and folding kinetics have recently been explored in computer simulations of simple exact lattice models. These models represent protein chains at a rudimentary level, but they involve few parameters, approximations, or implicit biases, and they allow complete explorations of conformational and sequence spaces. Such simulations have resulted in testable predictions that are sometimes unanticipated: The folding code is mainly binary and delocalized throughout the amino acid sequence. The secondary and tertiary structures of a protein are specified mainly by the sequence of polar and nonpolar monomers. More specific interactions may refine the structure, rather than dominate the folding code. Simple exact models can account for the properties that characterize protein folding: two-state cooperativity, secondary and tertiary structures, and multistage folding kinetics--fast hydrophobic collapse followed by slower annealing. These studies suggest the possibility of creating "foldable" chain molecules other than proteins. The encoding of a unique compact chain conformation may not require amino acids; it may require only the ability to synthesize specific monomer sequences in which at least one monomer type is solvent-averse. PMID:7613459
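To make the idea of a "simple exact lattice model" concrete, the sketch below scores a chain of hydrophobic (H) and polar (P) monomers placed on a 2D square lattice, counting one unit of favorable energy for each pair of H monomers that are lattice neighbors but not chain neighbors. The two-letter alphabet, the 2D lattice, the energy of -1 per contact and all names here are assumptions chosen for illustration; the abstract does not specify these details.

```python
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def hp_energy(sequence, moves):
    """Energy of an HP chain on a 2D square lattice (-1 per non-bonded H-H contact).

    sequence: string of 'H' and 'P'; moves: string of U/D/L/R steps, one per bond.
    Raises ValueError if the walk has the wrong length or is not self-avoiding.
    """
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = [(0, 0)]
    for step in moves:
        dx, dy = MOVES[step]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    index_at = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighboring pair is counted once
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts

# A four-monomer chain folded into a square brings its two H ends into contact
folded = hp_energy("HPPH", "RUL")
extended = hp_energy("HPPH", "RRR")
```

Because chains this short have few conformations, all of them can be enumerated exhaustively, which is the sense in which such models allow complete exploration of conformational space.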
Identification of Simple Sequence Repeats in Chloroplast Genomes of Magnoliids Through Bioinformatics Approach.

PubMed

Srivastava, Deepika; Shanker, Asheesh

2016-12-01

Basal angiosperms, or Magnoliids, form an important clade of commercially important plants, mainly spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to clade Magnoliids were screened for the identification of chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats or microsatellites are short stretches of DNA built from repeat units 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, with an average density of 1 SSR/6.91 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21 %) followed by tetranucleotide repeats (130, 27.13 %). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
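A perfect-repeat SSR scan of the kind described above can be sketched with regular expressions. The minimum repeat counts below are assumptions chosen to match the shortest lengths reported in the abstract (12 bp for mono- to tetranucleotide, 15 bp for penta- and 18 bp for hexanucleotide repeats); the abstract does not name the tool or thresholds actually used, and `find_ssrs` is a hypothetical helper.

```python
import re

# Assumed minimum number of repeat units per motif length (1-6 bp)
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}

def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. 'AA', 'AGAG'),
            # since those tracts are reported under the shorter motif length
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

# Hypothetical test sequence: a 12-bp poly-A tract and an (AG)6 tract
example = "GGC" + "A" * 12 + "TTC" + "AG" * 6 + "CC"
ssrs = find_ssrs(example)
```

Dividing the total sequence length scanned by the number of hits gives a density figure comparable to the "1 SSR per 6.91 kb" quoted in the abstract.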
Comparison of the theoretical and real-world evolutionary potential of a genetic circuit

NASA Astrophysics Data System (ADS)

Razo-Mejia, M.; Boedicker, J. Q.; Jones, D.; DeLuna, A.; Kinney, J. B.; Phillips, R.

2014-04-01

With the development of next-generation sequencing technologies, many large scale experimental efforts aim to map genotypic variability among individuals. This natural variability in populations fuels many fundamental biological processes, ranging from evolutionary adaptation and speciation to the spread of genetic diseases and drug resistance. An interesting and important component of this variability is present within the regulatory regions of genes. As these regions evolve, accumulated mutations lead to modulation of gene expression, which may have consequences for the phenotype. A simple model system where the link between genetic variability, gene regulation and function can be studied in detail is missing. In this article we develop a model to explore how the sequence of the wild-type lac promoter dictates the fold-change in gene expression. The model combines single-base pair resolution maps of transcription factor and RNA polymerase binding energies with a comprehensive thermodynamic model of gene regulation. The model was validated by predicting and then measuring the variability of lac operon regulation in a collection of natural isolates. We then implement the model to analyze the sensitivity of the promoter sequence to the regulatory output, and predict the potential for regulation to evolve due to point mutations in the promoter region.
Development of EST Intron-Targeting SNP Markers for Panax ginseng and Their Application to Cultivar Authentication.

PubMed

Wang, Hongtao; Li, Guisheng; Kwon, Woo-Saeng; Yang, Deok-Chun

2016-06-04

Panax ginseng is one of the most valuable medicinal plants in the Orient. The low level of genetic variation has limited the application of molecular markers for cultivar authentication and marker-assisted selection in cultivated ginseng. To exploit DNA polymorphism within ginseng cultivars, ginseng expressed sequence tags (ESTs) were searched against the potential intron polymorphism (PIP) database to predict the positions of introns. Intron-flanking primers were then designed in conserved exon regions and used to amplify across the more variable introns. Sequencing results showed that single nucleotide polymorphisms (SNPs), as well as indels, were detected in four EST-derived introns, and SNP markers specific to "Gopoong" and "K-1" were first reported in this study. Based on cultivar-specific SNP sites, allele-specific polymerase chain reaction (PCR) was conducted and proved to be effective for the authentication of ginseng cultivars. Additionally, the combination of a simple NaOH-Tris DNA isolation method and real-time allele-specific PCR assay enabled the high throughput selection of cultivars from ginseng fields. The established real-time allele-specific PCR assay should be applied to molecular authentication and marker assisted selection of P. ginseng cultivars, and the EST intron-targeting strategy will provide a potential approach for marker development in species without whole genomic DNA sequence information.
Aircraft stress sequence development: A complex engineering process made simple

NASA Technical Reports Server (NTRS)

Schrader, K. H.; Butts, D. G.; Sparks, W. A.

1994-01-01

Development of stress sequences for critical aircraft structure requires flight measured usage data, known aircraft loads, and established relationships between aircraft flight loads and structural stresses. Resulting cycle-by-cycle stress sequences can be directly usable for crack growth analysis and coupon spectra tests. Often, an expert in loads and spectra development manipulates the usage data into a typical sequence of representative flight conditions for which loads and stresses are calculated. For a fighter/trainer type aircraft, this effort is repeated many times for each of the fatigue critical locations (FCL) resulting in expenditure of numerous engineering hours. The Aircraft Stress Sequence Computer Program (ACSTRSEQ), developed by Southwest Research Institute under contract to San Antonio Air Logistics Center, presents a unique approach for making complex technical computations in a simple, easy to use method. The program is written in Microsoft Visual Basic for the Microsoft Windows environment.
in silico Whole Genome Sequencer & Analyzer (iWGS): A Computational Pipeline to Guide the Design and Analysis of de novo Genome Sequencing Studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Xiaofan; Peris, David; Kominek, Jacek

The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding its potential for understanding the biology and evolution of the full spectrum of biodiversity. The increasing diversity of sequencing technologies, assays, and de novo assembly algorithms has augmented the complexity of de novo genome sequencing projects in nonmodel organisms. To reduce the costs and challenges in de novo genome sequencing projects and streamline their experimental design and analysis, we developed iWGS (in silico Whole Genome Sequencer and Analyzer), an automated pipeline for guiding the choice of appropriate sequencing strategy and assembly protocols. iWGS seamlessly integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control, de novo assembly, and assembly evaluation and validation. The last three steps can also be applied to the analysis of real data. iWGS is designed to enable the user to have great flexibility in testing the range of experimental designs available for genome sequencing projects, and supports all major sequencing technologies and popular assembly tools. Three case studies illustrate how iWGS can guide the design of de novo genome sequencing projects, and evaluate the performance of a wide variety of user-specified sequencing strategies and assembly protocols on genomes of differing architectures. iWGS, along with detailed documentation, is freely available at https://github.com/zhouxiaofan1983/iWGS.
in silico Whole Genome Sequencer & Analyzer (iWGS): A Computational Pipeline to Guide the Design and Analysis of de novo Genome Sequencing Studies

DOE PAGES

Zhou, Xiaofan; Peris, David; Kominek, Jacek; ...

2016-09-16

The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding its potential for understanding the biology and evolution of the full spectrum of biodiversity. The increasing diversity of sequencing technologies, assays, and de novo assembly algorithms has augmented the complexity of de novo genome sequencing projects in nonmodel organisms. To reduce the costs and challenges in de novo genome sequencing projects and streamline their experimental design and analysis, we developed iWGS (in silico Whole Genome Sequencer and Analyzer), an automated pipeline for guiding the choice of appropriate sequencing strategy and assembly protocols. iWGS seamlessly integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control, de novo assembly, and assembly evaluation and validation. The last three steps can also be applied to the analysis of real data. iWGS is designed to enable the user to have great flexibility in testing the range of experimental designs available for genome sequencing projects, and supports all major sequencing technologies and popular assembly tools. Three case studies illustrate how iWGS can guide the design of de novo genome sequencing projects, and evaluate the performance of a wide variety of user-specified sequencing strategies and assembly protocols on genomes of differing architectures. iWGS, along with detailed documentation, is freely available at https://github.com/zhouxiaofan1983/iWGS.
The transcriptome of Spodoptera exigua larvae exposed to different types of microbes.

PubMed

Pascual, Laura; Jakubowska, Agata K; Blanca, Jose M; Cañizares, Joaquin; Ferré, Juan; Gloeckner, Gernot; Vogel, Heiko; Herrero, Salvador

2012-08-01

We have obtained and characterized the transcriptome of Spodoptera exigua larvae with special emphasis on pathogen-induced genes. In order to obtain a highly representative transcriptome, we have pooled RNA from diverse insect colonies, conditions and tissues. Sequenced cDNA included samples from 3 geographically different colonies. Enrichment of RNA from pathogen-related genes was accomplished by exposing larvae to different pathogenic and non-pathogenic microbial agents such as the bacteria Bacillus thuringiensis, Micrococcus luteus, and Escherichia coli, the yeast Saccharomyces cerevisiae, and the S. exigua nucleopolyhedrovirus (SeMNPV). In addition, to avoid the loss of tissue-specific genes we included cDNA from the midgut, fat body, hemocytes and integument derived from pathogen exposed insects. RNA obtained from the different types of samples was pooled, normalized and sequenced. Analysis of the sequences obtained using the Roche 454 FLX and Sanger methods has allowed the generation of the largest public set of ESTs from S. exigua, including a large group of immune genes, and the identification of an important number of SSR (simple sequence repeats) and SNVs (single nucleotide variants: SNPs and INDELs) with potential use as genetic markers. Moreover, data mining has allowed the discovery of novel RNA viruses with potential influence in the insect population dynamics and the larval interactions with the microbial pesticides that are currently in use for the biological control of this pest. Copyright © 2012 Elsevier Ltd. All rights reserved.

De Novo Sequencing and Analysis of Lemongrass Transcriptome Provide First Insights into the Essential Oil Biosynthesis of Aromatic Grasses.

PubMed

Meena, Seema; Kumar, Sarma R; Venkata Rao, D K; Dwivedi, Varun; Shilpashree, H B; Rastogi, Shubhra; Shasany, Ajit K; Nagegowda, Dinesh A

2016-01-01

Aromatic grasses of the genus Cymbopogon (Poaceae family) represent a unique group of plants that produce diverse compositions of monoterpene-rich essential oils, which have great value in flavor, fragrance, cosmetic, and aromatherapy industries. Despite the commercial importance of these natural aromatic oils, their biosynthesis at the molecular level remains unexplored. As the first step toward understanding the essential oil biosynthesis, we performed de novo transcriptome assembly and analysis of C. flexuosus (lemongrass) by employing Illumina sequencing. Mining of transcriptome data and subsequent phylogenetic analysis led to the identification of terpene synthases, pyrophosphatases, alcohol dehydrogenases, aldo-keto reductases, carotenoid cleavage dioxygenases, alcohol acetyltransferases, and aldehyde dehydrogenases, which are potentially involved in essential oil biosynthesis. Comparative essential oil profiling and mRNA expression analysis in three Cymbopogon species (C. flexuosus, aldehyde type; C. martinii, alcohol type; and C. winterianus, intermediate type) with varying essential oil composition indicated the involvement of identified candidate genes in the formation of alcohols, aldehydes, and acetates. Molecular modeling and docking further supported the role of identified protein sequences in aroma formation in Cymbopogon. Also, simple sequence repeats were found in the transcriptome with many linked to terpene pathway genes including the genes potentially involved in aroma biosynthesis. This work provides the first insights into the essential oil biosynthesis of aromatic grasses, and the identified candidate genes and markers can be a great resource for biotechnological and molecular breeding approaches to modulate the essential oil composition.
De Novo Sequencing and Analysis of Lemongrass Transcriptome Provide First Insights into the Essential Oil Biosynthesis of Aromatic Grasses

PubMed Central

Meena, Seema; Kumar, Sarma R.; Venkata Rao, D. K.; Dwivedi, Varun; Shilpashree, H. B.; Rastogi, Shubhra; Shasany, Ajit K.; Nagegowda, Dinesh A.

2016-01-01

Aromatic grasses of the genus Cymbopogon (Poaceae family) represent a unique group of plants that produce diverse compositions of monoterpene-rich essential oils, which have great value in flavor, fragrance, cosmetic, and aromatherapy industries. Despite the commercial importance of these natural aromatic oils, their biosynthesis at the molecular level remains unexplored. As the first step toward understanding the essential oil biosynthesis, we performed de novo transcriptome assembly and analysis of C. flexuosus (lemongrass) by employing Illumina sequencing. Mining of transcriptome data and subsequent phylogenetic analysis led to the identification of terpene synthases, pyrophosphatases, alcohol dehydrogenases, aldo-keto reductases, carotenoid cleavage dioxygenases, alcohol acetyltransferases, and aldehyde dehydrogenases, which are potentially involved in essential oil biosynthesis. Comparative essential oil profiling and mRNA expression analysis in three Cymbopogon species (C. flexuosus, aldehyde type; C. martinii, alcohol type; and C. winterianus, intermediate type) with varying essential oil composition indicated the involvement of identified candidate genes in the formation of alcohols, aldehydes, and acetates. Molecular modeling and docking further supported the role of identified protein sequences in aroma formation in Cymbopogon. Also, simple sequence repeats were found in the transcriptome with many linked to terpene pathway genes including the genes potentially involved in aroma biosynthesis. This work provides the first insights into the essential oil biosynthesis of aromatic grasses, and the identified candidate genes and markers can be a great resource for biotechnological and molecular breeding approaches to modulate the essential oil composition. PMID:27516768
Evolutionary Influenced Interaction Pattern as Indicator for the Investigation of Natural Variants Causing Nephrogenic Diabetes Insipidus

PubMed Central

Grunert, Steffen; Labudde, Dirk

2015-01-01

The importance of short membrane sequence motifs has been shown in many works and emphasizes the related sequence motif analysis. Together with specific transmembrane helix-helix interactions, the analysis of interacting sequence parts is helpful for understanding the process during membrane protein folding and in retaining the three-dimensional fold. Here we present a simple high-throughput analysis method for deriving mutational information of interacting sequence parts. Applied on aquaporin water channel proteins, our approach supports the analysis of mutational variants within different interacting subsequences and finally the investigation of natural variants which cause diseases like, for example, nephrogenic diabetes insipidus. In this work we demonstrate a simple method for massive membrane protein data analysis. As shown, the presented in silico analyses provide information about interacting sequence parts which are constrained by protein evolution. We present a simple graphical visualization medium for the representation of evolutionary influenced interaction pattern pairs (EIPPs) adapted to mutagen investigations of aquaporin-2, a protein whose mutants are involved in the rare endocrine disorder known as nephrogenic diabetes insipidus, and membrane proteins in general. Furthermore, we present a new method to derive new evolutionary variations within EIPPs which can be used for further mutagen laboratory investigations. PMID:26180540
Evolutionary Influenced Interaction Pattern as Indicator for the Investigation of Natural Variants Causing Nephrogenic Diabetes Insipidus.

PubMed

Grunert, Steffen; Labudde, Dirk

2015-01-01

The importance of short membrane sequence motifs has been shown in many works and emphasizes the related sequence motif analysis. Together with specific transmembrane helix-helix interactions, the analysis of interacting sequence parts is helpful for understanding the process during membrane protein folding and in retaining the three-dimensional fold. Here we present a simple high-throughput analysis method for deriving mutational information of interacting sequence parts. Applied on aquaporin water channel proteins, our approach supports the analysis of mutational variants within different interacting subsequences and finally the investigation of natural variants which cause diseases like, for example, nephrogenic diabetes insipidus. In this work we demonstrate a simple method for massive membrane protein data analysis. As shown, the presented in silico analyses provide information about interacting sequence parts which are constrained by protein evolution. We present a simple graphical visualization medium for the representation of evolutionary influenced interaction pattern pairs (EIPPs) adapted to mutagen investigations of aquaporin-2, a protein whose mutants are involved in the rare endocrine disorder known as nephrogenic diabetes insipidus, and membrane proteins in general. Furthermore, we present a new method to derive new evolutionary variations within EIPPs which can be used for further mutagen laboratory investigations.
Analysis of SSR information in EST resources of sugarcane

USDA-ARS's Scientific Manuscript database

Expressed sequence tags (ESTs) offer the opportunity to exploit single, low-copy, conserved sequence motifs for the development of simple sequence repeats (SSRs). A total of 262,113 ESTs of sugarcane (Saccharum officinarum) in the NCBI database were downloaded and analyzed, which resulted in...
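To illustrate what mining SSRs from EST sequences involves, the following is a minimal Python sketch that scans a nucleotide string for perfect tandem repeats of short motifs. The function name `find_ssrs`, the motif-length range (2 to 6 bp) and the minimum repeat count are assumptions chosen for illustration; the excerpt above does not state the search criteria used in the study, and dedicated SSR-mining tools apply more elaborate rules.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Hypothetical helper: the thresholds are illustrative defaults, not the
    criteria used in any of the studies listed here.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A motif of length k followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AAAA" or "ATAT"), which shorter k values cover.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTC" + "AT" * 6 + "CCT" + "CAG" * 5 + "TT"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Start positions are 0-based. Because the scan runs left to right, a repeat can be reported under a rotated motif (for example GCA instead of CAG) when the flanking base happens to extend it.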
Sma3s: A universal tool for easy functional annotation of proteomes and transcriptomes.

PubMed

Casimiro-Soriguer, Carlos S; Muñoz-Mérida, Antonio; Pérez-Pulido, Antonio J

2017-06-01

The current cheapening of next-generation sequencing has led to an enormous growth in the number of sequenced genomes and transcriptomes, allowing wet labs to get the sequences from their organisms of study. To make the most of these data, one of the first things that should be done is the functional annotation of the protein-coding genes. But this has been a slow and tedious step that can involve the characterization of thousands of sequences. Sma3s is an accurate computational tool for annotating proteins in an unattended way. Now, we have developed a completely new version, which includes functionalities that will be of utility for fundamental and applied science. Currently, the results provide functional categories such as biological processes, which become useful for both characterizing particular sequence datasets and comparing results from different projects. But one of the most important implemented innovations is that it now has low computational requirements, and the complete annotation of a simple proteome or transcriptome usually takes around 24 hours on a personal computer. Sma3s has been tested with a large number of complete proteomes and transcriptomes, and it has demonstrated its potential in health science and other specific projects. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
tropiTree: An NGS-Based EST-SSR Resource for 24 Tropical Tree Species

PubMed Central

Russell, Joanne R.; Hedley, Peter E.; Cardle, Linda; Dancey, Siobhan; Morris, Jenny; Booth, Allan; Odee, David; Mwaura, Lucy; Omondi, William; Angaine, Peter; Machua, Joseph; Muchugi, Alice; Milne, Iain; Kindt, Roeland; Jamnadass, Ramni; Dawson, Ian K.

2014-01-01

The development of genetic tools for non-model organisms has been hampered by cost, but advances in next-generation sequencing (NGS) have created new opportunities. In ecological research, this raises the prospect for developing molecular markers to simultaneously study important genetic processes such as gene flow in multiple non-model plant species within complex natural and anthropogenic landscapes. Here, we report the use of bar-coded multiplexed paired-end Illumina NGS for the de novo development of expressed sequence tag-derived simple sequence repeat (EST-SSR) markers at low cost for a range of 24 tree species. Each chosen tree species is important in complex tropical agroforestry systems where little is currently known about many genetic processes. An average of more than 5,000 EST-SSRs was identified for each of the 24 sequenced species, whereas prior to analysis 20 of the species had fewer than 100 nucleotide sequence citations. To make results available to potential users in a suitable format, we have developed an open-access, interactive online database, tropiTree (http://bioinf.hutton.ac.uk/tropiTree), which has a range of visualisation and search facilities, and which is a model for the efficient presentation and application of NGS data. PMID:25025376
Query-seeded iterative sequence similarity searching improves selectivity 5–20-fold

PubMed Central

Li, Weizhong; Lopez, Rodrigo

2017-01-01

Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic of a protein family. But models are subject to contamination; once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity. PMID:27923999
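The query-seeding step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the psisearch implementation: the function name `query_seeded_row` is hypothetical, the input is assumed to be one gapped pairwise alignment (two equal-length strings with "-" for gaps), and columns where the query itself is gapped are dropped because a query-based MSA has one column per query residue.

```python
def query_seeded_row(aligned_query, aligned_subject, gap="-"):
    """Build one subject row of a query-based MSA, seeding gaps with query residues.

    Hypothetical helper for illustration only.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned strings must have equal length")
    row = []
    for q, s in zip(aligned_query, aligned_subject):
        if q == gap:
            # Subject insertion relative to the query: no MSA column exists for it.
            continue
        # Where the subject is gapped, fall back to the original query residue.
        row.append(q if s == gap else s)
    return "".join(row)


if __name__ == "__main__":
    print(query_seeded_row("MKT-LLV", "MR-ALIV"))
```

In this sketch every row fed to the scoring model is gap-free, and each former gap carries the query residue, which is one way to picture how the model stays anchored to the original query.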
PUF Proteins: Cellular Functions and Potential Applications.

PubMed

Kiani, Seyed Jalal; Taheri, Tahereh; Rafati, Sima; Samimi-Rad, Katayoun

2017-01-01

RNA-binding proteins play critical roles in the regulation of gene expression. Among several families of RNA-binding proteins, PUF (Pumilio and FBF) proteins have been the subject of extensive investigations, as they can bind RNA in a sequence-specific manner and they are evolutionarily conserved among a wide range of organisms. The outstanding feature of these proteins is a highly conserved RNA-binding domain, which is known as the Pumilio-homology domain (PUM-HD) that mostly consists of eight tandem repeats. Each repeat recognizes an RNA base with a simple three-letter code that can be programmed in order to change the sequence-specificity of the protein. Using this tailored architecture, researchers have been able to change the specificity of the PUM-HD and target desired transcripts in the cell, even in subcellular compartments. The potential applications of this versatile tool in molecular cell biology seem unbounded and the use of these factors in pharmaceutics might be an interesting field of study in the near future. Copyright © Bentham Science Publishers.
Modeling of prepregs during automated draping sequences

NASA Astrophysics Data System (ADS)

Krogh, Christian; Glud, Jens A.; Jakobsen, Johnny

2017-10-01

The behavior of woven prepreg fabric during automated draping sequences is investigated. A drape tool under development with an arrangement of grippers facilitates the placement of a woven prepreg fabric in a mold. It is essential that the draped configuration is free from wrinkles and other defects. The present study aims at setting up a virtual draping framework capable of modeling the draping process from the initial flat fabric to the final double curved shape and aims at assisting the development of an automated drape tool. The virtual draping framework consists of a kinematic mapping algorithm used to generate target points on the mold which are used as input to a draping sequence planner. The draping sequence planner prescribes the displacement history for each gripper in the drape tool and these displacements are then applied to each gripper in a transient model of the draping sequence. The model is based on a transient finite element analysis with the material's constitutive behavior currently being approximated as linear elastic orthotropic. In-plane tensile and bias-extension tests as well as bending tests are conducted and used as input for the model. The virtual draping framework shows good potential for obtaining a better understanding of the drape process and for guiding the development of the drape tool. However, results obtained from using the framework on a simple test case indicate that the generation of draping sequences is non-trivial.
Simple Sequence Repeat and S-locus Genotyping to Explore Genetic Variability in Polyploid Prunus spinosa and P. insititia.

PubMed

Halász, Júlia; Makovics-Zsohár, Noémi; Szőke, Ferenc; Ercisli, Sezai; Hegedűs, Attila

2017-02-01

Polyploid Prunus spinosa (2n = 4×) and P. insititia (2n = 6×) represent enormous genetic potential in Central Europe, which can be exploited in breeding programmes. In Hungary, 17 cultivar candidates were selected from wild-growing populations including 10 P. spinosa, 4 P. insititia and three P. spinosa × P. domestica hybrids (2n = 5×). Their taxonomic classification was based on their phenotypic characteristics. Six simple sequence repeats (SSRs) and the multiallelic S-locus genotyping were used to characterize genetic variability and reliable identification of the tested accessions. A total of 98 SSR alleles were identified across the six loci, corresponding to an average of 16.3 alleles per locus, and each of the 17 genotypes could be discriminated based on unique SSR fingerprints. A total of 23 S-RNase alleles were identified. The complete and partial S-genotype was determined for 8 and 9 accessions, respectively. The identification of a cross-incompatible pair of cultivar candidates and several semi-compatible combinations help maximize fruit set in commercial orchards. Our results indicate that the S-allele pools of wild-growing P. spinosa and P. insititia are overlapping in Hungary. A phylogenetic and principal component analysis confirmed the high level of diversity and genetic differentiation present within the analysed genotypes and helped clarify doubtful taxonomic identities. Our data confirm that S-locus genotyping is suitable for diversity studies in polyploid Prunus species. The analysed accessions represent huge genetic potential that can be exploited in commercial cultivation.
Genetic diversity analysis of cyanogenic potential (CNp) of root among improved genotypes of cassava using simple sequence repeat markers.

PubMed

Moyib, O K; Mkumbira, J; Odunola, O A; Dixon, A G

2012-12-01

Cyanogenic potential (CNp) of cassava constitutes a serious problem for over 500 million people who rely on the crop as their main source of calories. Genetic diversity is a key to successful crop improvement for breeding new improved variability for target traits. Forty-three improved genotypes of cassava developed by International Institute of Tropical Agriculture (IITA), Ibadan, were characterized for CNp trait using 35 Simple Sequence Repeat (SSR) markers. Essential colorimetry picric test was used for evaluation of CNp on a color scale of 1 to 14. The CNp scores obtained ranged from 3 to 9, with a mean score of 5.48 (+/- 0.09) based on Statistical Analysis System (SAS) package. TMS M98/0068 (4.0 +/- 0.25) was identified as the best genotype with low CNp while TMS M98/0028 (7.75 +/- 0.25) was the worst. The 43 genotypes were assigned into 7 phenotypic groups based on rank-sum analysis in SAS. Dissimilarity analysis representatives for windows generated a phylogenetic tree with 5 clusters which represented hybridizing groups. Each of the clusters (except 4) contained low CNp genotypes that could be used for improving the high CNp genotypes in the same or near cluster. The scatter plot of the genotypes showed that there was little or no demarcation for phenotypic CNp groupings in the molecular groupings. The result of this study demonstrated that SSR markers are powerful tools for the assessment of genetic variability, and proper identification and selection of parents for genetic improvement of low CNp trait among the IITA cassava collection.
How Does Sequence Structure Affect the Judgment of Time? Exploring a Weighted Sum of Segments Model

ERIC Educational Resources Information Center

Matthews, William J.

2013-01-01

This paper examines the judgment of segmented temporal intervals, using short tone sequences as a convenient test case. In four experiments, we investigate how the relative lengths, arrangement, and pitches of the tones in a sequence affect judgments of sequence duration, and ask whether the data can be described by a simple weighted sum of…
Discrete sequence prediction and its applications

NASA Technical Reports Server (NTRS)

Laird, Philip

1992-01-01

Learning from experience to predict sequences of discrete symbols is a fundamental problem in machine learning with many applications. We apply sequence prediction using a simple and practical sequence-prediction algorithm, called TDAG. The TDAG algorithm is first tested by comparing its performance with some common data compression algorithms. Then it is adapted to the detailed requirements of dynamic program optimization, with excellent results.
Next-generation sequencing library construction on a surface.

PubMed

Feng, Kuan; Costa, Justin; Edwards, Jeremy S

2018-05-30

Next-generation sequencing (NGS) has revolutionized almost all fields of biology, agriculture and medicine, and is widely utilized to analyse genetic variation. Over the past decade, the NGS pipeline has been steadily improved, and the entire process is currently relatively straightforward. However, NGS instrumentation still requires upfront library preparation, which can be a laborious process, requiring significant hands-on time. Herein, we present a simple but robust approach to streamline library preparation by utilizing surface bound transposases to construct DNA libraries directly on a flowcell surface. The surface bound transposases directly fragment genomic DNA while simultaneously attaching the library molecules to the flowcell. We sequenced and analysed a Drosophila genome library generated by this surface tagmentation approach, and we showed that our surface bound library quality was comparable to the quality of the library from a commercial kit. In addition to the time and cost savings, our approach does not require PCR amplification of the library, which eliminates potential problems associated with PCR duplicates. We described the first study to construct libraries directly on a flowcell. We believe our technique could be incorporated into the existing Illumina sequencing pipeline to simplify the workflow, reduce costs, and improve data quality.
Identification of the full-length β-actin sequence and expression profiles in the tree shrew (Tupaia belangeri).

PubMed

Zheng, Yu; Yun, Chenxia; Wang, Qihui; Smith, Wanli W; Leng, Jing

2015-02-01

The tree shrew (Tupaia belangeri) diverges from the primate order (Primates) and is classified as a separate taxonomic group of mammals - Scandentia. It has been suggested that the tree shrew can be used as an animal model for studying human diseases; however, the genomic sequence of the tree shrew is largely unidentified. In the present study, we reported the full-length cDNA sequence of the housekeeping gene, β-actin, in the tree shrew. The amino acid sequence of β-actin in the tree shrew was compared to that of humans and other species; a simple phylogenetic relationship was discovered. Quantitative polymerase chain reaction (qPCR) and western blot analysis further demonstrated that the expression profiles of β-actin, as a general conservative housekeeping gene, in the tree shrew were similar to those in humans, although the expression levels varied among different types of tissue in the tree shrew. Our data provide evidence that the tree shrew has a close phylogenetic association with humans. These findings further enhance the potential that the tree shrew, as a species, may be used as an animal model for studying human disorders.
How Large Asexual Populations Adapt

NASA Astrophysics Data System (ADS)

Desai, Michael

2007-03-01

We often think of beneficial mutations as being rare, and of adaptation as a sequence of selected substitutions: a beneficial mutation occurs, spreads through a population in a selective sweep, then later another beneficial mutation occurs, and so on. This simple picture is the basis for much of our intuition about adaptive evolution, and underlies a number of practical techniques for analyzing sequence data. Yet many large and mostly asexual populations -- including a wide variety of unicellular organisms and viruses -- live in a very different world. In these populations, beneficial mutations are common, and frequently interfere or cooperate with one another as they all attempt to sweep simultaneously. This radically changes the way these populations adapt: rather than an orderly sequence of selective sweeps, evolution is a constant swarm of competing and interfering mutations. I will describe some aspects of these dynamics, including why large asexual populations cannot evolve very quickly and the character of the diversity they maintain. I will explain how this changes our expectations of sequence data, how sex can help a population adapt, and the potential role of "mutator" phenotypes with abnormally high mutation rates. Finally, I will discuss comparisons of these predictions with evolution experiments in laboratory yeast populations.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes

Treesearch

A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt

2000-01-01

The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
RNA Sequencing Analysis of the Gametophyte Transcriptome from the Liverwort, Marchantia polymorpha

PubMed Central

Sharma, Niharika; Jung, Chol-Hee; Bhalla, Prem L.; Singh, Mohan B.

2014-01-01

The liverwort Marchantia polymorpha is a member of the most basal lineage of land plants (embryophytes) and likely retains many ancestral morphological, physiological and molecular characteristics. Despite its phylogenetic importance and the availability of previous EST studies, M. polymorpha’s lack of economic importance limits accessible genomic resources for this species. We employed Illumina RNA-Seq technology to sequence the gametophyte transcriptome of M. polymorpha. cDNA libraries from 6 different male and female developmental tissues were sequenced to delineate a global view of the M. polymorpha transcriptome. Approximately 80 million short reads were obtained and assembled into a non-redundant set of 46,533 transcripts (> = 200 bp) from 46,070 loci. The average length and the N50 length of the transcripts were 757 bp and 471 bp, respectively. Sequence comparison of assembled transcripts with non-redundant proteins from embryophytes resulted in the annotation of 43% of the transcripts. The transcripts were also compared with M. polymorpha expressed sequence tags (ESTs), and approximately 69.5% of the transcripts appeared to be novel. Twenty-one percent of the transcripts were assigned GO terms to improve annotation. In addition, 6,112 simple sequence repeats (SSRs) were identified as potential molecular markers, which may be useful in studies of genetic diversity. A comparative genomics approach revealed that a substantial proportion of the genes (35.5%) expressed in M. polymorpha were conserved across phylogenetically related species, such as Selaginella and Physcomitrella, and identified 580 genes that are potentially unique to liverworts. Our study presents an extensive amount of novel sequence information for M. polymorpha. 
This information will serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the isolation and characterization of functional genes that are involved in sex differentiation and sexual reproduction in this liverwort. PMID:24841988
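
The abstract above reports both a mean transcript length and an N50. As a minimal sketch of how N50 is commonly computed from a list of sequence lengths (the function name `n50` and the toy lengths are illustrative assumptions, not data from the study):

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly, not the M. polymorpha transcript set.
toy_lengths = [200, 300, 400, 500, 1000]
mean_length = sum(toy_lengths) / len(toy_lengths)
print("mean:", mean_length, "N50:", n50(toy_lengths))
```

Because N50 is weighted by bases rather than by sequence count, it can differ considerably from the mean for skewed length distributions.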
Colloidal polymers with controlled sequence and branching constructed from magnetic field assembled nanoparticles.

PubMed

Bannwarth, Markus B; Utech, Stefanie; Ebert, Sandro; Weitz, David A; Crespy, Daniel; Landfester, Katharina

2015-03-24

The assembly of nanoparticles into polymer-like architectures is challenging and usually requires highly defined colloidal building blocks. Here, we show that the broad size-distribution of a simple dispersion of magnetic nanocolloids can be exploited to obtain various polymer-like architectures. The particles are assembled under an external magnetic field and permanently linked by thermal sintering. The remarkable variety of polymer-analogue architectures that arises from this simple process ranges from statistical and block copolymer-like sequencing to branched chains and networks. This library of architectures can be realized by controlling the sequencing of the particles and the junction points via a size-dependent self-assembly of the single building blocks.

Sulfanilic acid-modified chitosan mini-spheres and their application for lysozyme purification from egg white.

PubMed

Hirsch, Daniela B; Baieli, María F; Urtasun, Nicolás; Lázaro-Martínez, Juan M; Glisoni, Romina J; Miranda, María V; Cascone, Osvaldo; Wolman, Federico J

2018-03-01

A cation exchange matrix with zwitterionic and multimodal properties was synthesized by a simple reaction sequence coupling sulfanilic acid to a chitosan-based support. The novel chromatographic matrix was physico-chemically characterized by ss-NMR and ζ potential, and its chromatographic performance was evaluated for lysozyme purification from diluted egg white. The maximum adsorption capacity, calculated according to the Langmuir adsorption isotherm, was 50.07 ± 1.47 mg g⁻¹, while the dissociation constant was 0.074 ± 0.012 mg mL⁻¹. The process for lysozyme purification from egg white was optimized, with 81.9% yield and a purity degree of 86.5%, according to RP-HPLC analysis. This work shows novel possible applications of chitosan-based materials. The simple synthesis reactions combined with the simple mode of use of the chitosan matrix represent a novel method to purify proteins from raw starting materials. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 34:387-396, 2018.
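
The two fitted parameters quoted above are enough to evaluate an adsorption curve. The sketch below assumes the standard single-site Langmuir form, q = q_max · C / (K_d + C); the abstract does not print the equation, so the functional form, the function name `langmuir_q` and the example concentrations are assumptions.

```python
Q_MAX = 50.07  # maximum adsorption capacity, mg lysozyme per g matrix (from the abstract)
K_D = 0.074    # dissociation constant, mg/mL (from the abstract)


def langmuir_q(conc, q_max=Q_MAX, k_d=K_D):
    """Adsorbed amount (mg/g) at free protein concentration conc (mg/mL),
    assuming a single-site Langmuir isotherm."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    return q_max * conc / (k_d + conc)


# Illustrative concentrations only; they are not measurements from the paper.
for c in (0.0, K_D, 1.0):
    print(f"C = {c:.3f} mg/mL -> q = {langmuir_q(c):.2f} mg/g")
```

Under this form, the matrix is half saturated when the free concentration equals the dissociation constant.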
Deciphering mRNA Sequence Determinants of Protein Production Rate

NASA Astrophysics Data System (ADS)

Szavits-Nossan, Juraj; Ciandrini, Luca; Romano, M. Carmen

2018-03-01

One of the greatest challenges in biophysical models of translation is to identify coding sequence features that affect the rate of translation and therefore the overall protein production in the cell. We propose an analytic method to solve a translation model based on the inhomogeneous totally asymmetric simple exclusion process, which allows us to unveil simple design principles of nucleotide sequences determining protein production rates. Our solution shows an excellent agreement when compared to numerical genome-wide simulations of S. cerevisiae transcript sequences and predicts that the first 10 codons, which is the ribosome footprint length on the mRNA, together with the value of the initiation rate, are the main determinants of protein production rate under physiological conditions. Finally, we interpret the obtained analytic results based on the evolutionary role of the codons' choice for regulating translation rates and ribosome densities.
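
The model class named in the abstract can be explored numerically. The following is a minimal stochastic (Gillespie-style) simulation of an inhomogeneous TASEP with extended particles, not the authors' analytic solution. The function name, the uniform elongation rates, the lattice length and the initiation rates are all hypothetical; only the footprint of 10 codons comes from the abstract.

```python
import random


def simulate_tasep(rates, alpha, ell=10, n_events=20000, seed=0):
    """Simulate ribosomes on an mRNA; return completed proteins per unit time.

    rates[x] is the hopping (or, at the last codon, termination) rate at codon x,
    alpha is the initiation rate, ell is the ribosome footprint in codons.
    """
    rng = random.Random(seed)
    n_sites = len(rates)
    ribosomes = []  # codon being decoded by each ribosome, leading ribosome first
    t = 0.0
    completed = 0
    for _ in range(n_events):
        moves = []  # (rate, index); index -1 denotes initiation
        if not ribosomes or ribosomes[-1] >= ell:
            moves.append((alpha, -1))
        for i, x in enumerate(ribosomes):
            ahead_free = (i == 0) or (ribosomes[i - 1] - x > ell)
            if x == n_sites - 1 or ahead_free:
                moves.append((rates[x], i))
        total = sum(rate for rate, _ in moves)
        t += rng.expovariate(total)       # waiting time to the next event
        pick = rng.uniform(0, total)      # choose an event proportionally to its rate
        for rate, i in moves:
            pick -= rate
            if pick <= 0:
                break
        if i == -1:
            ribosomes.append(0)           # a new ribosome starts at codon 0
        elif ribosomes[i] == n_sites - 1:
            ribosomes.pop(i)              # termination releases one protein
            completed += 1
        else:
            ribosomes[i] += 1             # elongation by one codon
    return completed / t


uniform_rates = [1.0] * 50  # hypothetical transcript with identical codons
print(simulate_tasep(uniform_rates, alpha=0.1))
```

Replacing `uniform_rates` with codon-specific values, for example slowing only the first `ell` codons, is one way to probe the abstract's claim that the start of the coding sequence and the initiation rate dominate the production rate.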
A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package

PubMed Central

Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M.

2013-01-01

Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs. PMID:24688703
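
The availability score described above is a ratio of observed to expected frequency. The sketch below is one plausible reading in which the expected frequency of a short constituent sequence is the product of its single-residue frequencies. That independence model, the function name and the toy sequences are assumptions; the SCS Package may define the expectation differently.

```python
from collections import Counter


def availability_scores(sequences, k=3):
    """Map each k-residue word to observed frequency / expected frequency,
    where the expectation assumes residues occur independently."""
    aa_counts = Counter()
    word_counts = Counter()
    for seq in sequences:
        aa_counts.update(seq)
        word_counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))
    n_aa = sum(aa_counts.values())
    n_words = sum(word_counts.values())
    scores = {}
    for word, count in word_counts.items():
        expected = 1.0
        for aa in word:
            expected *= aa_counts[aa] / n_aa
        scores[word] = (count / n_words) / expected
    return scores


# Toy input, not a real protein database.
toy = availability_scores(["ABAB"], k=2)
print(toy)
```

Scores above 1 mark words used more often than chance predicts under this model, and scores below 1 mark words that are avoided.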
A frequency-based linguistic approach to protein decoding and design: Simple concepts, diverse applications, and the SCS Package.

PubMed

Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M

2013-01-01

Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs.
Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

PubMed

Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

2015-01-01

Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.
Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae)

PubMed Central

Bonatelli, Isabel A. S.; Carstens, Bryan C.; Moraes, Evandro M.

2015-01-01

Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396
Statistical learning of movement.

PubMed

Ongchoco, Joan Danielle Khonghun; Uddenberg, Stefan; Chun, Marvin M

2016-12-01

The environment is dynamic, but objects move in predictable and characteristic ways, whether they are a dancer in motion, or a bee buzzing around in flight. Sequences of movement are comprised of simpler motion trajectory elements chained together. But how do we know where one trajectory element ends and another begins, much like we parse words from continuous streams of speech? As a novel test of statistical learning, we explored the ability to parse continuous movement sequences into simpler element trajectories. Across four experiments, we showed that people can robustly parse such sequences from a continuous stream of trajectories under increasingly stringent tests of segmentation ability and statistical learning. Observers viewed a single dot as it moved along simple sequences of paths, and were later able to discriminate these sequences from novel and partial ones shown at test. Observers demonstrated this ability when there were potentially helpful trajectory-segmentation cues such as a common origin for all movements (Experiment 1); when the dot's motions were entirely continuous and unconstrained (Experiment 2); when sequences were tested against partial sequences as a more stringent test of statistical learning (Experiment 3); and finally, even when the element trajectories were in fact pairs of trajectories, so that abrupt directional changes in the dot's motion could no longer signal inter-trajectory boundaries (Experiment 4). These results suggest that observers can automatically extract regularities in movement - an ability that may underpin our capacity to learn more complex biological motions, as in sport or dance.
Microcomputer-Assisted Mathematics: From Simple Interest to e.

ERIC Educational Resources Information Center

Kimberling, Clark

1985-01-01

The progression from simple interest to compound interest leads naturally and quickly to the number e, involving mathematical discovery learning through writing programs. Several programs are given, with suggestions for a teaching sequence. (MNS)
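
The step from compound interest to e can be illustrated in a few lines. This is a Python sketch of the idea, not one of the programs published in the article; the function names and the sample values are assumptions.

```python
import math


def simple_interest(principal, rate, years):
    """Final amount when interest is paid on the principal only."""
    return principal * (1 + rate * years)


def compound_interest(principal, rate, years, periods_per_year):
    """Final amount when interest is compounded periods_per_year times a year."""
    return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)


# One unit of money at 100% annual interest for one year:
# compounding ever more often approaches e.
for n in (1, 2, 12, 365, 1_000_000):
    print(n, compound_interest(1.0, 1.0, 1, n))
print("e =", math.e)
```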
Characterization of a species-specific repetitive DNA from a highly endangered wild animal, Rhinoceros unicornis, and assessment of genetic polymorphism by microsatellite associated sequence amplification (MASA).

PubMed

Ali, S; Azfer, M A; Bashamboo, A; Mathur, P K; Malik, P K; Mathur, V B; Raha, A K; Ansari, S

1999-03-04

We have cloned and sequenced a 906 bp EcoRI repeat DNA fraction from the Rhinoceros unicornis genome. The contig pSS(R)2 is AT rich with 340 A (37.53%), 187 C (20.64%), 173 G (19.09%) and 206 T (22.74%). The sequence contains MALT box, NF-E1, Poly-A signal, lariat consensus sequences, TATA box, translational initiation sequences and several stop codons. Translation of the contig showed seven different types of protein motifs, among which EGF-like domain cysteine pattern signatures and Bowman-Birk serine protease inhibitor family signatures were prominent. The presence of eukaryotic transcriptional elements, protein signatures and analysis of subset sequences in the 5' region from 1 to 165 nt indicating coding potential (test code value=0.97) suggest possible regulatory and/or functional role(s) of these sequences in the rhino genome. Translation of the complementary strand from 906 to 706 nt and 190 to 2 nt showed proteins of more than 7 kDa rich in non-polar residues. This suggests that pSS(R)2 is either a part of, or adjacent to, a functional gene. The contig contains mostly non-consecutive simple repeat units from 2 to 17 nt with varying frequencies, of which four-base motifs were found to be predominant. Zoo-blot hybridization revealed that pSS(R)2 sequences are unique to the R. unicornis genome because they do not cross-hybridize, even with the genomic DNA of the South African black rhino Diceros bicornis. Southern blot analysis of R. unicornis genomic DNA with pSS(R)2 and other synthetic oligo probes revealed a high level of genetic homogeneity, which was also substantiated by microsatellite associated sequence amplification (MASA). Owing to its uniqueness, the pSS(R)2 probe has a potential application in the area of conservation biology for unequivocal identification of horn or other body tissues of R. unicornis. The evolutionary aspect of this repeat fraction in the context of comparative genome analysis is discussed.
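
The base composition quoted above can be checked with simple arithmetic. The counts below are taken from the abstract; the variable names are illustrative.

```python
# Base counts for the 906 bp contig, as reported in the abstract.
counts = {"A": 340, "C": 187, "G": 173, "T": 206}

total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
at_content = round(100 * (counts["A"] + counts["T"]) / total, 2)

print("length:", total)
print("percent:", percent)
print("A+T content (%):", at_content)
```

The counts sum to the stated contig length, and the combined A+T share is what supports the description of the contig as AT rich.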
Typing of artiodactyl MHC-DRB genes with the help of intronic simple repeated DNA sequences.

PubMed

Schwaiger, F W; Buitkamp, J; Weyers, E; Epplen, J T

1993-02-01

An efficient oligonucleotide typing method for the highly polymorphic MHC-DRB genes is described for artiodactyls like cattle, sheep and goat. By means of the polymerase chain reaction, the second exon of MHC-DRB is amplified as well as part of the adjacent intron containing a mixed simple repeat sequence. Using this primer combination we were able to amplify the MHC-DRB exons 2 and adjacent introns from all of the investigated 10 species of the family of Bovidae and giraffes. Therefore, the DRB genes of novel artiodactyl species can also be readily studied. Oligonucleotide probes specific for the polymorphisms of ungulate DRB genes are used with which sequences differing in at least one single base can be distinguished. Exonic polymorphism was found to be correlated with the allele lengths and the patterns of the repeat structures. Hence oligonucleotide probes specific for different simple repeats and polymorphic positions serve also for typing across species barriers. The strict correlation of sequence length and exonic polymorphism permits a preselection of specific oligonucleotides for hybridization. Thus more than 20 alleles can already be differentiated from each of the three species.
On-line separation and characterization of hyaluronan oligosaccharides derived from radical depolymerization

PubMed Central

Zhao, Xue; Yang, Bo; Li, Lingyun; Zhang, Fuming; Linhardt, Robert J.

2013-01-01

Hydroxyl radicals are widely implicated in the oxidation of carbohydrates in biological and industrial processes and are often responsible for their structural modification resulting in functional damage. In this study, the radical depolymerization of the polysaccharide hyaluronan was studied in a reaction with hydroxyl radicals generated by Fenton chemistry. A simple method for isolation and identification of the resulting non-sulfated oligosaccharide products of oxidative depolymerization was established. Hyaluronan oligosaccharides were analyzed using ion-pairing reversed phase high performance liquid chromatography coupled with tandem electrospray mass spectrometry. The sequences of saturated hyaluronan oligosaccharides having even and odd numbers of saccharide units, afforded through oxidative depolymerization, were identified. This study represents a simple, effective ‘fingerprinting’ protocol for detecting the damage done to hyaluronan by oxidative radicals. This study should help reveal the potential biological outcome of reactive-oxygen radical-mediated depolymerization of hyaluronan. PMID:23768593
The Mathematics of Dispatchability Revisited

NASA Technical Reports Server (NTRS)

Morris, Paul

2016-01-01

Dispatchability is an important property for the efficient execution of temporal plans where the temporal constraints are represented as a Simple Temporal Network (STN). It has been shown that every STN may be reformulated as a dispatchable STN, and dispatchability ensures that the temporal constraints need only be satisfied locally during execution. Recently it has also been shown that Simple Temporal Networks with Uncertainty, augmented with wait edges, are Dynamically Controllable provided every projection is dispatchable. Thus, the dispatchability property has both theoretical and practical interest. One thing that hampers further work in this area is the underdeveloped theory. The existing definitions are expressed in terms of algorithms, and are less suitable for mathematical proofs. In this paper, we develop a new formal theory of dispatchability in terms of execution sequences. We exploit this to prove a characterization of dispatchability involving the structural properties of the STN graph. This facilitates the potential application of the theory to uncertainty reasoning.
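
As background for the abstract above, the sketch below shows the textbook treatment of a Simple Temporal Network as a distance graph: each constraint lo ≤ t_j − t_i ≤ hi becomes two weighted edges, and all-pairs shortest paths detect inconsistency through a negative cycle. This is the standard construction, not the new theory developed in the paper; the function name and the example constraints are hypothetical.

```python
import math


def stn_all_pairs(n_events, constraints):
    """constraints: iterable of (i, j, lo, hi) meaning lo <= t_j - t_i <= hi.

    Returns (consistent, d) where d[i][j] is the tightest implied upper
    bound on t_j - t_i, computed with Floyd-Warshall.
    """
    d = [[0 if i == j else math.inf for j in range(n_events)]
         for i in range(n_events)]
    for i, j, lo, hi in constraints:
        d[i][j] = min(d[i][j], hi)    # t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # t_i - t_j <= -lo
    for k in range(n_events):
        for i in range(n_events):
            for j in range(n_events):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # A negative entry on the diagonal signals a negative cycle.
    consistent = all(d[i][i] >= 0 for i in range(n_events))
    return consistent, d


# Hypothetical plan with three events.
ok, dist = stn_all_pairs(3, [(0, 1, 10, 20), (1, 2, 30, 40), (0, 2, 0, 50)])
print(ok, dist[0][2], -dist[2][0])  # upper and lower bound on t_2 - t_0
```

The fully connected shortest-path graph produced this way is generally larger than necessary, which is part of why a structural characterization of dispatchability is of interest.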
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

PubMed Central

Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

2011-01-01

Computational genomics is an important tool for understanding the distribution of simple sequence repeats (SSRs) across closely related genomes, which gives valuable information regarding genetic variation. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human-pathogenic Brucella species, which are involved in a range of pathological disorders. Computational analysis of the SSRs in Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that the overrepresented tri-nucleotide SSRs in genomic and coding regions might be responsible for generating functional variation in expressed proteins, which in turn may lead to differences in pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations: SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts

PubMed Central

Naito, Yuki; Bono, Hidemasa

2012-01-01

GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users. PMID:22641850
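
The search behaviour described above (short queries, up to three mismatches, degenerate codes) can be illustrated with a brute-force scan. This is only a sketch of the matching rule, not GGRNA's implementation, which must rely on indexing to be fast; the function name and the test sequences are assumptions, while the code table follows the standard IUPAC nucleotide codes.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_matches(query, target, max_mismatches=3):
    """Return (start, mismatches) for every window of target that matches
    the possibly degenerate query with at most max_mismatches mismatches."""
    query = query.upper()
    target = target.upper().replace("U", "T")
    hits = []
    for start in range(len(target) - len(query) + 1):
        mismatches = 0
        for q, t in zip(query, target[start:]):
            if t not in IUPAC[q]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # Loop finished without exceeding the mismatch budget.
            hits.append((start, mismatches))
    return hits


print(find_matches("ANGT", "TTACGTTT", max_mismatches=0))
```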
GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts.

PubMed

Naito, Yuki; Bono, Hidemasa

2012-07-01

GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users.
Comparison of simple sequence repeats in 19 Archaea.

PubMed

Trivedi, S

2006-12-05

All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
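
For readers who want to try a di- to penta-nucleotide repeat scan like the one described above, a minimal regular-expression sketch follows. It detects only perfect tandem repeats and is not SPUTNIK's scoring algorithm; the function name, the minimum of three repeats and the example sequence are assumptions.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=3):
    """Return sorted (start, unit, n_repeats) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter motif,
            # such as AAA or ATAT, so each repeat is reported once.
            if any(unit == unit[:d] * (unit_len // d)
                   for d in range(1, unit_len) if unit_len % d == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


example = "GGATATATATCCGCAGCAGCAGTT"  # hypothetical sequence
print(find_ssrs(example))
```

Counting hits separately for coding and non-coding intervals would then give per-region SSR densities of the kind compared in the abstract.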
Simple sequence repeat marker loci discovery using SSR primer.

PubMed

Robinson, Andrew J; Love, Christopher G; Batley, Jacqueline; Barker, Gary; Edwards, David

2004-06-12

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. With the increase in the availability of DNA sequence information, an automated process to identify and design PCR primers for amplification of SSR loci would be a useful tool in plant breeding programs. We report an application that integrates SPUTNIK, an SSR finder, with Primer3, a PCR primer design program, into one pipeline tool, SSR Primer. On submission of multiple FASTA formatted sequences, the script screens each sequence for SSRs using SPUTNIK. The results are passed to Primer3 for locus-specific primer design. The script makes use of a Web-based interface, enabling remote use. This program has been written in PERL and is freely available for non-commercial users by request from the authors. The Web-based version may be accessed at http://hornbill.cspp.latrobe.edu.au/
Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

PubMed Central

Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

2009-01-01

Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593
Development of Genomic Simple Sequence Repeats (SSR) by Enrichment Libraries in Date Palm.

PubMed

Al-Faifi, Sulieman A; Migdadi, Hussein M; Algamdi, Salem S; Khan, Mohammad Altaf; Al-Obeed, Rashid S; Ammar, Megahed H; Jakse, Jerenj

2017-01-01

Development of highly informative markers such as simple sequence repeats (SSR) for cultivar identification and germplasm characterization and management is essential for date palm genetic studies. The present study documents the development of SSR markers and assesses genetic relationships of commonly grown date palm (Phoenix dactylifera L.) cultivars in different geographical regions of Saudi Arabia. A total of 93 novel simple sequence repeat (SSR) markers were screened for their ability to detect polymorphism in date palm. Around 71% of genomic SSRs are dinucleotide, 25% trinucleotide, 3% tetranucleotide, and 1% pentanucleotide motifs and show 100% polymorphism. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis illustrates that cultivars tend to group according to their class of maturity, region of cultivation, and fruit color. Analysis of molecular variance (AMOVA) reveals genetic variation among and within cultivars of 27% and 73%, respectively, according to the geographical distribution of the cultivars. The developed microsatellite markers add value to date palm characterization and can be used by researchers in population genetics, cultivar identification, as well as genetic resource exploration and management. The cultivars tested exhibited a significant amount of genetic diversity and could be suitable for successful breeding programs. Genomic sequences generated from this study are available at the National Center for Biotechnology Information (NCBI), Sequence Read Archive (accession number LIBGSS_039019).
Fibre optic system for biochemical and microbiological sensing

NASA Astrophysics Data System (ADS)

Penwill, L. A.; Slater, J. H.; Hayes, N. W.; Tremlett, C. J.

2007-07-01

This poster will discuss state-of-the-art fibre optic sensors based on evanescent wave technology emphasising chemophotonic sensors for biochemical reactions and microbe detection. Devices based on antibody specificity and unique DNA sequences will be described. The development of simple sensor devices with disposable single use sensor probes will be illustrated with a view to providing cost effective field based or point of care analysis of major themes such as hospital acquired infections or bioterrorism events. This presentation will discuss the nature and detection thresholds required, the optical detection techniques investigated, results of sensor trials and the potential for wider commercial application.

Aggregate age-at-marriage patterns from individual mate-search heuristics.

PubMed

Todd, Peter M; Billari, Francesco C; Simão, Jorge

2005-08-01

The distribution of age at first marriage shows well-known strong regularities across many countries and recent historical periods. We accounted for these patterns by developing agent-based models that simulate the aggregate behavior of individuals who are searching for marriage partners. Past models assumed fully rational agents with complete knowledge of the marriage market; our simulated agents used psychologically plausible simple heuristic mate search rules that adjust aspiration levels on the basis of a sequence of encounters with potential partners. Substantial individual variation must be included in the models to account for the demographically observed age-at-marriage patterns.
Single-molecule Protein Unfolding in Solid State Nanopores

PubMed Central

Talaga, David S.; Li, Jiali

2009-01-01

We use single silicon nitride nanopores to study folded, partially folded and unfolded single proteins by measuring their excluded volumes. The DNA-calibrated translocation signals of β-lactoglobulin and histidine-containing phosphocarrier protein match quantitatively those predicted by a simple sum of the partial volumes of the amino acids in the polypeptide segment inside the pore when translocation stalls due to the primary charge sequence. Our analysis suggests that the majority of the protein molecules were linear or looped during translocation and that the electrical forces present under physiologically relevant potentials can unfold proteins. Our results show that the nanopore translocation signals are sensitive enough to distinguish the folding state of a protein and distinguish between proteins based on the excluded volume of a local segment of the polypeptide chain that transiently stalls in the nanopore due to the primary sequence of charges. PMID:19530678
Modulation of tissue repair by regeneration enhancer elements.

PubMed

Kang, Junsu; Hu, Jianxin; Karra, Ravi; Dickson, Amy L; Tornini, Valerie A; Nachtrab, Gregory; Gemberling, Matthew; Goldman, Joseph A; Black, Brian L; Poss, Kenneth D

2016-04-14

How tissue regeneration programs are triggered by injury has received limited research attention. Here we investigate the existence of enhancer regulatory elements that are activated in regenerating tissue. Transcriptomic analyses reveal that leptin b (lepb) is highly induced in regenerating hearts and fins of zebrafish. Epigenetic profiling identified a short DNA sequence element upstream and distal to lepb that acquires open chromatin marks during regeneration and enables injury-dependent expression from minimal promoters. This element could activate expression in injured neonatal mouse tissues and was divisible into tissue-specific modules sufficient for expression in regenerating zebrafish fins or hearts. Simple enhancer-effector transgenes employing lepb-linked sequences upstream of pro- or anti-regenerative factors controlled the efficacy of regeneration in zebrafish. Our findings provide evidence for 'tissue regeneration enhancer elements' (TREEs) that trigger gene expression in injury sites and can be engineered to modulate the regenerative potential of vertebrate organs.
A behavior analytic analogue of learning to use synonyms, syntax, and parts of speech.

PubMed

Chase, Philip N; Ellenwood, David W; Madden, Gregory

2008-01-01

Matching-to-sample and sequence training procedures were used to develop responding to stimulus classes that were considered analogous to 3 aspects of verbal behavior: identifying synonyms and parts of speech, and using syntax. Matching-to-sample procedures were used to train 12 paired associates from among 24 stimuli. These pairs were analogous to synonyms. Then, sequence characteristics were trained to 6 of the stimuli. The result was the formation of 3 classes of 4 stimuli, with the classes controlling a sequence response analogous to a simple ordering syntax: first, second, and third. Matching-to-sample procedures were then used to add 4 stimuli to each class. These stimuli, without explicit sequence training, also began to control the same sequence responding as the other members of their class. Thus, three 8-member functionally equivalent sequence classes were formed. These classes were considered to be analogous to parts of speech. Further testing revealed three 8-member equivalence classes and 512 different sequences of first, second, and third. The study indicated that behavior analytic procedures may be used to produce some generative aspects of verbal behavior related to simple syntax and semantics.
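The figure of 512 sequences follows from choosing one stimulus from each of the three 8-member classes in first, second, third order (8 × 8 × 8). A minimal sketch of that count, using made-up stimulus labels (A1..A8, B1..B8, C1..C8) that do not come from the study:

```python
from itertools import product

# Hypothetical labels: three ordinal classes with eight stimuli each.
classes = {
    "first": [f"A{i}" for i in range(1, 9)],
    "second": [f"B{i}" for i in range(1, 9)],
    "third": [f"C{i}" for i in range(1, 9)],
}

# One stimulus from each class, always in first-second-third order.
sequences = list(product(classes["first"], classes["second"], classes["third"]))

print(len(sequences))  # 8 * 8 * 8 = 512
```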
Molecular phylogenetic position of hexactinellid sponges in relation to the Protista and Demospongiae.

PubMed

West, L; Powers, D

1993-01-01

Although it is generally accepted that the first multicellular organisms arose from unicellular ancestors, the phylogenetic relationships linking these groups remain unclear. Anatomical, physiological, and molecular studies of current multicellular organisms with relatively simple body organization suggest key characteristics of the earliest multicellular lineages. Glass sponges, the Hexactinellida, possess cellular characteristics that resemble some unicellular protistan organisms. These unique sponges were abundant in shallow seas of the early Cambrian, but they are currently restricted to polar habitats or very deep regions of the world oceans. Due in part to their relative inaccessibility, their potential significance to the early phylogeny of the eukaryotic kingdoms has been largely overlooked. We used sequences of the 18S ribosomal RNA gene of Farrea occa, a representative of the deep-water hexactinellid sponges, and Coelocarteria singaporense, a representative of the more common demosponges, and compared them with selected ribosomal RNA gene sequences available within the Protista. Using four computational methods for phylogenetic analysis of ribosomal DNA sequences, we found that the hexactinellid sponge-demosponge cluster is most closely related to Volvox and Acanthamoeba.
Development of Novel SSR Markers for Flax (Linum usitatissimum L.) Using Reduced-Representation Genome Sequencing

PubMed Central

Wu, Jianzhong; Zhao, Qian; Wu, Guangwen; Zhang, Shuquan; Jiang, Tingbo

2017-01-01

Flax (Linum usitatissimum L.) is a major fiber and oil yielding crop grown in northeastern China. Identification of flax molecular markers is a key step toward improving flax yield and quality via marker-assisted breeding. Simple sequence repeat (SSR) markers, which are based on genomic structural variation, are considered the most valuable type of genetic marker for this purpose. In this study, we screened 1574 microsatellites from Linum usitatissimum L. obtained using reduced representation genome sequencing (RRGS) to systematically identify SSR markers. The resulting set of microsatellites consisted mainly of trinucleotide (56.10%) and dinucleotide (35.23%) repeats, with each motif consisting of 5–8 repeats. We then evaluated marker sensitivity and specificity based on samples of 48 flax isolates obtained from northeastern China. Using the new SSR panel, the results demonstrated that fiber flax and oilseed flax varieties clustered into two well separated groups. The novel SSR markers developed in this study show potential value for selection of varieties for use in flax breeding programs. PMID:28133461
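For readers unfamiliar with microsatellite screening, the sketch below shows one simple way to find perfect di- and trinucleotide tandem repeats in a DNA string with a regular expression. It is not the pipeline used in the study: the function name, the minimum of five repeats (chosen to echo the 5–8 repeats reported above) and the toy sequence are assumptions for illustration only.

```python
import re


def find_ssrs(seq, min_repeats=5, motif_lengths=(2, 3)):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-base motif followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence (invented): six AT repeats and five GAA repeats.
toy = "GGC" + "AT" * 6 + "CCG" + "GAA" * 5 + "TT"
print(find_ssrs(toy))
```

Dedicated SSR-mining tools also handle imperfect and compound repeats, which this sketch deliberately ignores.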
Sequence-related amplified polymorphism (SRAP) markers: A potential resource for studies in plant molecular biology.

PubMed

Robarts, Daniel W H; Wolfe, Andrea D

2014-07-01

In the past few decades, many investigations in the field of plant biology have employed selectively neutral, multilocus, dominant markers such as inter-simple sequence repeat (ISSR), random-amplified polymorphic DNA (RAPD), and amplified fragment length polymorphism (AFLP) to address hypotheses at lower taxonomic levels. More recently, sequence-related amplified polymorphism (SRAP) markers have been developed, which are used to amplify coding regions of DNA with primers targeting open reading frames. These markers have proven to be robust and highly variable, on par with AFLP, and are attained through a significantly less technically demanding process. SRAP markers have been used primarily for agronomic and horticultural purposes, developing quantitative trait loci in advanced hybrids and assessing genetic diversity of large germplasm collections. Here, we suggest that SRAP markers should be employed for research addressing hypotheses in plant systematics, biogeography, conservation, ecology, and beyond. We provide an overview of the SRAP literature to date, review descriptive statistics of SRAP markers in a subset of 171 publications, and present relevant case studies to demonstrate the applicability of SRAP markers to the diverse field of plant biology. Results of these selected works indicate that SRAP markers have the potential to enhance the current suite of molecular tools in a diversity of fields by providing an easy-to-use, highly variable marker with inherent biological significance.
Sequence-related amplified polymorphism (SRAP) markers: A potential resource for studies in plant molecular biology

PubMed Central

Robarts, Daniel W. H.; Wolfe, Andrea D.

2014-01-01

In the past few decades, many investigations in the field of plant biology have employed selectively neutral, multilocus, dominant markers such as inter-simple sequence repeat (ISSR), random-amplified polymorphic DNA (RAPD), and amplified fragment length polymorphism (AFLP) to address hypotheses at lower taxonomic levels. More recently, sequence-related amplified polymorphism (SRAP) markers have been developed, which are used to amplify coding regions of DNA with primers targeting open reading frames. These markers have proven to be robust and highly variable, on par with AFLP, and are attained through a significantly less technically demanding process. SRAP markers have been used primarily for agronomic and horticultural purposes, developing quantitative trait loci in advanced hybrids and assessing genetic diversity of large germplasm collections. Here, we suggest that SRAP markers should be employed for research addressing hypotheses in plant systematics, biogeography, conservation, ecology, and beyond. We provide an overview of the SRAP literature to date, review descriptive statistics of SRAP markers in a subset of 171 publications, and present relevant case studies to demonstrate the applicability of SRAP markers to the diverse field of plant biology. Results of these selected works indicate that SRAP markers have the potential to enhance the current suite of molecular tools in a diversity of fields by providing an easy-to-use, highly variable marker with inherent biological significance. PMID:25202637
Development of EST Intron-Targeting SNP Markers for Panax ginseng and Their Application to Cultivar Authentication

PubMed Central

Wang, Hongtao; Li, Guisheng; Kwon, Woo-Saeng; Yang, Deok-Chun

2016-01-01

Panax ginseng is one of the most valuable medicinal plants in the Orient. The low level of genetic variation has limited the application of molecular markers for cultivar authentication and marker-assisted selection in cultivated ginseng. To exploit DNA polymorphism within ginseng cultivars, ginseng expressed sequence tags (ESTs) were searched against the potential intron polymorphism (PIP) database to predict the positions of introns. Intron-flanking primers were then designed in conserved exon regions and used to amplify across the more variable introns. Sequencing results showed that single nucleotide polymorphisms (SNPs), as well as indels, were detected in four EST-derived introns, and SNP markers specific to “Gopoong” and “K-1” were first reported in this study. Based on cultivar-specific SNP sites, allele-specific polymerase chain reaction (PCR) was conducted and proved to be effective for the authentication of ginseng cultivars. Additionally, the combination of a simple NaOH-Tris DNA isolation method and real-time allele-specific PCR assay enabled the high throughput selection of cultivars from ginseng fields. The established real-time allele-specific PCR assay should be applied to molecular authentication and marker assisted selection of P. ginseng cultivars, and the EST intron-targeting strategy will provide a potential approach for marker development in species without whole genomic DNA sequence information. PMID:27271615
SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop.

PubMed

Schumacher, André; Pireddu, Luca; Niemenmaa, Matti; Kallio, Aleksi; Korpelainen, Eija; Zanetti, Gianluigi; Heljanko, Keijo

2014-01-01

Hadoop MapReduce-based approaches have become increasingly popular due to their scalability in processing large sequencing datasets. However, as these methods typically require in-depth expertise in Hadoop and Java, they are still out of reach of many bioinformaticians. To solve this problem, we have created SeqPig, a library and a collection of tools to manipulate, analyze and query sequencing datasets in a scalable and simple manner. SeqPig scripts use the Hadoop-based distributed scripting engine Apache Pig, which automatically parallelizes and distributes data processing tasks. We demonstrate SeqPig's scalability over many computing nodes and illustrate its use with example scripts. Available under the open source MIT license at http://sourceforge.net/projects/seqpig/
In Silico Comparative Transcriptome Analysis of Two Color Morphs of the Common Coral Trout (Plectropomus Leopardus)

PubMed Central

Wang, Le; Yu, Cuiping; Guo, Liang; Lin, Haoran; Meng, Zining

2015-01-01

The common coral trout is one species of major importance in commercial fisheries and aquaculture. Recently, two different color morphs of Plectropomus leopardus were discovered and the biological importance of the color difference is unknown. Since coral trout species are poorly characterized at the molecular level, we undertook the transcriptomic characterization of the two color morphs, one black and one red coral trout, using Illumina next generation sequencing technologies. The study produced 55162966 and 54588952 paired-end reads, for black and red trout, respectively. De novo transcriptome assembly generated 95367 and 99424 unique sequences in black and red trout, respectively, with 88813 sequences shared between them. Approximately 50% of both transcriptomes were functionally annotated by BLAST searches against protein databases. The two transcriptomes were enriched into 25 functional categories and showed similar profiles of Gene Ontology category compositions. 34110 unigenes were grouped into 259 KEGG pathways. Moreover, we identified 14649 simple sequence repeats (SSRs) and designed primers for potential application. We also discovered 130524 putative single nucleotide polymorphisms (SNPs) in the two transcriptomes, supplying potential genomic resources for the coral trout species. In addition, we identified 936 fast-evolving genes and 165 candidate genes under positive selection between the two color morphs. Finally, 38 candidate genes underlying the mechanism of color and pigmentation were also isolated. This study presents the first transcriptome resources for the common coral trout and provides basic information for the development of genomic tools for the identification, conservation, and understanding of the speciation and local adaptation of coral reef fish species. PMID:26713756
Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology.

PubMed

Tanase, Koji; Nishitani, Chikako; Hirakawa, Hideki; Isobe, Sachiko; Tabata, Satoshi; Ohmiya, Akemi; Onozaki, Takashi

2012-07-02

Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. We constructed a normalized cDNA library and a 3'-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant.
Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

PubMed Central

2012-01-01

Background: Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. Results: We constructed a normalized cDNA library and a 3’-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. Conclusions: We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant. PMID:22747974
Sequence Composition and Gene Content of the Short Arm of Rye (Secale cereale) Chromosome 1

PubMed Central

Fluch, Silvia; Kopecky, Dieter; Burg, Kornel; Šimková, Hana; Taudien, Stefan; Petzold, Andreas; Kubaláková, Marie; Platzer, Matthias; Berenyi, Maria; Krainer, Siegfried; Doležel, Jaroslav; Lelley, Tamas

2012-01-01

Background: The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundreds of bread wheat varieties worldwide. Methodology/Principal Findings: Multiple Displacement Amplification of 1RS DNA, obtained from flow sorted 1RS chromosomes, using 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites mostly located in gene related sequence reads were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice. Conclusions: The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye. PMID:22328922
Integrative taxonomy of Metrichia Ross (Trichoptera: Hydroptilidae: Ochrotrichiinae) microcaddisflies from Brazil: descriptions of twenty new species

PubMed Central

Takiya, Daniela M.; Nessimian, Jorge L.

2016-01-01

Metrichia is assigned to the Ochrotrichiinae, a group of almost exclusively Neotropical microcaddisflies. Metrichia comprises over 100 described species and, despite its diversity, only one species has been described from Brazil so far. In this paper, we provide descriptions for 20 new species from 8 Brazilian states: M. acuminata sp. nov., M. azul sp. nov., M. bonita sp. nov., M. bracui sp. nov., M. caraca sp. nov., M. circuliforme sp. nov., M. curta sp. nov., M. farofa sp. nov., M. forceps sp. nov., M. formosinha sp. nov., M. goiana sp. nov., M. itabaiana sp. nov., M. longissima sp. nov., M. peluda sp. nov., M. rafaeli sp. nov., M. simples sp. nov., M. talhada sp. nov., M. tere sp. nov., M. ubajara sp. nov., and M. vulgaris sp. nov. DNA barcode sequences (577 bp of the mitochondrial gene COI) were generated for 13 of the new species and two previously known species of Metrichia resulting in 64 sequences. In addition, COI sequences were obtained for other genera of Ochrotrichiinae (Angrisanoia, Nothotrichia, Ochrotrichia, Ragatrichia, and Rhyacopsyche). DNA sequences and morphological data were integrated to evaluate species delimitations. K2P pairwise distances were calculated to generate a neighbor-joining tree. COI sequences also were submitted to ABGD and GMYC methods to assess ‘potential species’ delimitation. Analyses showed a conspicuous barcoding gap among Metrichia sequences (highest intraspecific divergence: 4.8%; lowest interspecific divergence: 12.6%). Molecular analyses also allowed the association of larvae and adults of Metrichia bonita sp. nov. from Mato Grosso do Sul, representing the first record of microcaddisfly larvae occurring in calcareous tufa (or travertine). ABGD results agreed with the morphological delimitation of Metrichia species, while GMYC estimated a slightly higher number of species, suggesting the division of two morphological species, each one into two potential species. Because this could be due to unbalanced sampling and the lack of morphological diagnostic characters, we have maintained these two species as undivided. PMID:27169001
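The K2P (Kimura two-parameter) distance mentioned above has a closed form: with P the proportion of transitions and Q the proportion of transversions between two aligned sequences, d = -1/2 · ln((1 - 2P - Q) · sqrt(1 - 2Q)). The sketch below computes it for a pair of aligned strings. The function name and the choice to skip gaps and ambiguous bases are assumptions for the example and are not taken from the paper.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites (invented sequences).
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Applying such a function to every pair of barcode sequences yields the distance matrix that a neighbor-joining method takes as input.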
Rapid Identification and Differentiation of Trichophyton Species, Based on Sequence Polymorphisms of the Ribosomal Internal Transcribed Spacer Regions, by Rolling-Circle Amplification

PubMed Central

Kong, Fanrong; Tong, Zhongsheng; Chen, Xiaoyou; Sorrell, Tania; Wang, Bin; Wu, Qixuan; Ellis, David; Chen, Sharon

2008-01-01

DNA sequencing analyses have demonstrated relatively limited polymorphisms within the fungal internal transcribed spacer (ITS) regions among Trichophyton spp. We sequenced the ITS region (ITS1, 5.8S, and ITS2) for 42 dermatophytes belonging to seven species (Trichophyton rubrum, T. mentagrophytes, T. soudanense, T. tonsurans, Epidermophyton floccosum, Microsporum canis, and M. gypseum) and developed a novel padlock probe and rolling-circle amplification (RCA)-based method for identification of single nucleotide polymorphisms (SNPs) that could be exploited to differentiate between Trichophyton spp. Sequencing results demonstrated intraspecies genetic variation for T. tonsurans, T. mentagrophytes, and T. soudanense but not T. rubrum. Signature sets of SNPs between T. rubrum and T. soudanense (4-bp difference) and T. violaceum and T. soudanense (3-bp difference) were identified. The RCA assay correctly identified five Trichophyton species. Although the use of two “group-specific” probes targeting both the ITS1 and the ITS2 regions were required to identify T. soudanense, the other species were identified by single ITS1- or ITS2-targeted species-specific probes. There was good agreement between ITS sequencing and the RCA assay. Despite limited genetic variation between Trichophyton spp., the sensitive, specific RCA-based SNP detection assay showed potential as a simple, reproducible method for the rapid (2-h) identification of Trichophyton spp. PMID:18234865
Identification of (R)-selective ω-aminotransferases by exploring evolutionary sequence space.

PubMed

Kim, Eun-Mi; Park, Joon Ho; Kim, Byung-Gee; Seo, Joo-Hyun

2018-03-01

Several (R)-selective ω-aminotransferases (R-ωATs) have been reported. The existence of additional R-ωATs having different sequence characteristics from previous ones is highly expected. In addition, it is generally accepted that R-ωATs are variants of aminotransferase group III. Based on these backgrounds, sequences in RefSeq database were scored using family profiles of branched-chain amino acid aminotransferase (BCAT) and d-alanine aminotransferase (DAT) to predict and identify putative R-ωATs. Sequences with two profile analysis scores were plotted on two-dimensional score space. Candidates with relatively similar scores in both BCAT and DAT profiles (i.e., profile analysis score using BCAT profile was similar to profile analysis score using DAT profile) were selected. Experimental results for selected candidates showed that putative R-ωATs from Saccharopolyspora erythraea (R-ωAT_Sery), Bacillus cellulosilyticus (R-ωAT_Bcel), and Bacillus thuringiensis (R-ωAT_Bthu) had R-ωAT activity. Additional experiments revealed that R-ωAT_Sery also possessed DAT activity while R-ωAT_Bcel and R-ωAT_Bthu had BCAT activity. Selecting putative R-ωATs from regions with similar profile analysis scores identified potential R-ωATs. Therefore, R-ωATs could be efficiently identified by using simple family profile analysis and exploring evolutionary sequence space. Copyright © 2017 Elsevier Inc. All rights reserved.
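The selection step described above, keeping sequences whose BCAT and DAT profile scores are similar, can be pictured as a filter on a two-dimensional score table. The sketch below is only an illustration: the thresholds, the sequence identifiers and the scores are invented, and the abstract does not state how "relatively similar" was defined.

```python
def select_candidates(scores, max_gap=10.0, min_score=0.0):
    """Return IDs whose two profile scores are both high enough and close.

    scores maps a sequence ID to a (bcat_score, dat_score) pair.
    max_gap and min_score are hypothetical cut-offs, not values from the paper.
    """
    selected = []
    for seq_id, (bcat, dat) in scores.items():
        if min(bcat, dat) >= min_score and abs(bcat - dat) <= max_gap:
            selected.append(seq_id)
    return sorted(selected)


# Invented scores for three invented sequence IDs.
example_scores = {
    "seq_001": (120.0, 35.0),   # BCAT-like: scores far apart
    "seq_002": (40.0, 130.0),   # DAT-like: scores far apart
    "seq_003": (88.0, 92.0),    # similar scores: kept as a candidate
}
print(select_candidates(example_scores))
```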
Bacterial identification and subtyping using DNA microarray and DNA sequencing.

PubMed

Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

2012-01-01

The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.
GABI-Kat SimpleSearch: new features of the Arabidopsis thaliana T-DNA mutant database.

PubMed

Kleinboelting, Nils; Huep, Gunnar; Kloetgen, Andreas; Viehoever, Prisca; Weisshaar, Bernd

2012-01-01

T-DNA insertion mutants are very valuable for reverse genetics in Arabidopsis thaliana. Several projects have generated large sequence-indexed collections of T-DNA insertion lines, of which GABI-Kat is the second largest resource worldwide. User access to the collection and its Flanking Sequence Tags (FSTs) is provided by the front end SimpleSearch (http://www.GABI-Kat.de). Several significant improvements have been implemented recently. The database now relies on the TAIRv10 genome sequence and annotation dataset. All FSTs have been newly mapped using an optimized procedure that leads to improved accuracy of insertion site predictions. A fraction of the collection with weak FST yield was re-analysed by generating new FSTs. Along with newly found predictions for older sequences about 20,000 new FSTs were included in the database. Information about groups of FSTs pointing to the same insertion site that is found in several lines but is real only in a single line are included, and many problematic FST-to-line links have been corrected using new wet-lab data. SimpleSearch currently contains data from ~71,000 lines with predicted insertions covering 62.5% of the 27,206 nuclear protein coding genes, and offers insertion allele-specific data from 9545 confirmed lines that are available from the Nottingham Arabidopsis Stock Centre.
Prediction during statistical learning, and implications for the implicit/explicit divide

PubMed Central

Dale, Rick; Duran, Nicholas D.; Morehead, J. Ryan

2012-01-01

Accounts of statistical learning, both implicit and explicit, often invoke predictive processes as central to learning, yet practically all experiments employ non-predictive measures during training. We argue that the common theoretical assumption of anticipation and prediction needs clearer, more direct evidence for it during learning. We offer a novel experimental context to explore prediction, and report results from a simple sequential learning task designed to promote predictive behaviors in participants as they responded to a short sequence of simple stimulus events. Predictive tendencies in participants were measured using their computer mouse, the trajectories of which served as a means of tapping into predictive behavior while participants were exposed to very short and simple sequences of events. A total of 143 participants were randomly assigned to stimulus sequences along a continuum of regularity. Analysis of computer-mouse trajectories revealed that (a) participants almost always anticipate events in some manner, (b) participants exhibit two stable patterns of behavior, either reacting to vs. predicting future events, (c) the extent to which participants predict relates to performance on a recall test, and (d) explicit reports of perceiving patterns in the brief sequence correlates with extent of prediction. We end with a discussion of implicit and explicit statistical learning and of the role prediction may play in both kinds of learning. PMID:22723817

Multiplex Amplification Refractory Mutation System PCR (ARMS-PCR) provides sequencing independent typing of canine parvovirus.

PubMed

Chander, Vishal; Chakravarti, Soumendu; Gupta, Vikas; Nandi, Sukdeb; Singh, Mithilesh; Badasara, Surendra Kumar; Sharma, Chhavi; Mittal, Mitesh; Dandapat, S; Gupta, V K

2016-12-01

Canine parvovirus-2 antigenic variants (CPV-2a, CPV-2b and CPV-2c) are ubiquitously distributed worldwide in the canine population and cause severe, fatal gastroenteritis. Antigenic typing of CPV-2 remains a prime focus of research groups worldwide in understanding the disease epidemiology and virus evolution. The present study was thus envisioned to provide a simple, sequencing-independent, rapid, robust, specific, user-friendly technique for detecting and typing the presently circulating CPV-2 antigenic variants. An ARMS-PCR strategy was employed using specific primers for CPV-2a, CPV-2b and CPV-2c to differentiate these antigenic types. ARMS-PCR was initially optimized with reference positive controls in two steps: the first reaction was used to differentiate CPV-2a from CPV-2b/CPV-2c, and the second reaction was carried out with CPV-2c-specific primers to confirm the presence of CPV-2c. Initial validation of the ARMS-PCR was carried out with 24 sequenced samples and the results were matched with the sequencing results. The ARMS-PCR technique was further used to screen and type 90 suspected clinical samples. Fifteen randomly selected suspected clinical samples that were typed with this technique were sequenced. The results of ARMS-PCR and the sequencing matched exactly with each other. The developed technique has the potential to become a sequencing-independent method for simultaneous detection and typing of CPV-2 antigenic variants in veterinary disease diagnostic laboratories globally. Copyright © 2016 Elsevier B.V. All rights reserved.
Model-based quality assessment and base-calling for second-generation sequencing data.

PubMed

Bravo, Héctor Corrada; Irizarry, Rafael A

2010-09-01

Second-generation sequencing (sec-gen) technology can sequence millions of short fragments of DNA in parallel, making it capable of assembling complex genomes for a small fraction of the price and time of previous technologies. In fact, a recently formed international consortium, the 1000 Genomes Project, plans to fully sequence the genomes of approximately 1200 people. The prospect of comparative analysis at the sequence level of a large number of samples across multiple populations may be achieved within the next five years. These data present unprecedented challenges in statistical analysis. For instance, analysis operates on millions of short nucleotide sequences, or reads (strings of A, C, G, or T's, between 30 and 100 characters long), which are the result of complex processing of noisy continuous fluorescence intensity measurements known as base-calling. The complexity of the base-calling discretization process results in reads of widely varying quality within and across sequence samples. This variation in processing quality results in infrequent but systematic errors that we have found to mislead downstream analysis of the discretized sequence read data. For instance, a central goal of the 1000 Genomes Project is to quantify across-sample variation at the single nucleotide level. At this resolution, small error rates in sequencing prove significant, especially for rare variants. Sec-gen sequencing is a relatively new technology for which potential biases and sources of obscuring variation are not yet fully understood. Therefore, modeling and quantifying the uncertainty inherent in the generation of sequence reads is of utmost importance. In this article, we present a simple model to capture uncertainty arising in the base-calling procedure of the Illumina/Solexa GA platform. Model parameters have a straightforward interpretation in terms of the chemistry of base-calling allowing for informative and easily interpretable metrics that capture the variability in sequencing quality. Our model provides these informative estimates readily usable in quality assessment tools while significantly improving base-calling performance. © 2009, The International Biometric Society.
Characterization of Adelphocoris suturalis (Hemiptera: Miridae) Transcriptome from Different Developmental Stages

NASA Astrophysics Data System (ADS)

Tian, Caihong; Tek Tay, Wee; Feng, Hongqiang; Wang, Ying; Hu, Yongmin; Li, Guoping

2015-06-01

Adelphocoris suturalis is one of the most serious pest insects of Bt cotton in China; however, its molecular genetics, biochemistry and physiology are poorly understood. We used a high-throughput sequencing platform to perform de novo transcriptome assembly and gene expression analyses across different developmental stages (eggs, 2nd and 5th instar nymphs, female and male adults). We obtained 20 GB of clean data and revealed 88,614 unigenes, including 23,830 clusters and 64,784 singletons. These unigene sequences were annotated and classified by Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes databases. A large number of differentially expressed genes were discovered through pairwise comparisons between these developmental stages. Gene expression profiles were dramatically different between life stage transitions, with some of the most differentially expressed genes being associated with sex difference, metabolism and development. Quantitative real-time PCR results confirm deep-sequencing findings based on relative expression levels of nine randomly selected genes. Furthermore, over 791,390 single nucleotide polymorphisms and 2,682 potential simple sequence repeats were identified. Our study provided comprehensive transcriptional gene expression information for A. suturalis that will form the basis for a better understanding of development pathways, hormone biosynthesis, sex differences and wing formation in mirid bugs.
Characterization of Adelphocoris suturalis (Hemiptera: Miridae) Transcriptome from Different Developmental Stages

PubMed Central

Tian, Caihong; Tek Tay, Wee; Feng, Hongqiang; Wang, Ying; Hu, Yongmin; Li, Guoping

2015-01-01

Adelphocoris suturalis is one of the most serious pest insects of Bt cotton in China; however, its molecular genetics, biochemistry and physiology are poorly understood. We used a high-throughput sequencing platform to perform de novo transcriptome assembly and gene expression analyses across different developmental stages (eggs, 2nd and 5th instar nymphs, female and male adults). We obtained 20 GB of clean data and revealed 88,614 unigenes, including 23,830 clusters and 64,784 singletons. These unigene sequences were annotated and classified by Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes databases. A large number of differentially expressed genes were discovered through pairwise comparisons between these developmental stages. Gene expression profiles were dramatically different between life stage transitions, with some of the most differentially expressed genes being associated with sex difference, metabolism and development. Quantitative real-time PCR results confirm deep-sequencing findings based on relative expression levels of nine randomly selected genes. Furthermore, over 791,390 single nucleotide polymorphisms and 2,682 potential simple sequence repeats were identified. Our study provided comprehensive transcriptional gene expression information for A. suturalis that will form the basis for a better understanding of development pathways, hormone biosynthesis, sex differences and wing formation in mirid bugs. PMID:26047353
De novo transcriptome sequence assembly and identification of AP2/ERF transcription factor related to abiotic stress in parsley (Petroselinum crispum).

PubMed

Li, Meng-Yao; Tan, Hua-Wei; Wang, Feng; Jiang, Qian; Xu, Zhi-Sheng; Tian, Chang; Xiong, Ai-Sheng

2014-01-01

Parsley is an important biennial Apiaceae species that is widely cultivated as herb, spice, and vegetable. Previous studies on parsley principally focused on its physiological and biochemical properties, including phenolic compound and volatile oil contents. However, little is known about the molecular and genetic properties of parsley. In this study, 23,686,707 high-quality reads were obtained and assembled into 81,852 transcripts and 50,161 unigenes for the first time. Functional annotation showed that 30,516 unigenes had sequence similarity to known genes. In addition, 3,244 putative simple sequence repeats were detected in curly parsley. Finally, 1,569 of the identified unigenes belonged to 58 transcription factor families. Various abiotic stresses have a strong detrimental effect on the yield and quality of parsley. AP2/ERF transcription factors have important functions in plant development, hormonal regulation, and abiotic response. A total of 88 putative AP2/ERF factors were identified from the transcriptome sequence of parsley. Seven AP2/ERF transcription factors were selected in this study to analyze the expression profiles of parsley under different abiotic stresses. Our data provide a potentially valuable resource that can be used for intensive parsley research.
De Novo Transcriptome Sequence Assembly and Identification of AP2/ERF Transcription Factor Related to Abiotic Stress in Parsley (Petroselinum crispum)

PubMed Central

Wang, Feng; Jiang, Qian; Xu, Zhi-Sheng; Tian, Chang; Xiong, Ai-Sheng

2014-01-01

Parsley is an important biennial Apiaceae species that is widely cultivated as herb, spice, and vegetable. Previous studies on parsley principally focused on its physiological and biochemical properties, including phenolic compound and volatile oil contents. However, little is known about the molecular and genetic properties of parsley. In this study, 23,686,707 high-quality reads were obtained and assembled into 81,852 transcripts and 50,161 unigenes for the first time. Functional annotation showed that 30,516 unigenes had sequence similarity to known genes. In addition, 3,244 putative simple sequence repeats were detected in curly parsley. Finally, 1,569 of the identified unigenes belonged to 58 transcription factor families. Various abiotic stresses have a strong detrimental effect on the yield and quality of parsley. AP2/ERF transcription factors have important functions in plant development, hormonal regulation, and abiotic response. A total of 88 putative AP2/ERF factors were identified from the transcriptome sequence of parsley. Seven AP2/ERF transcription factors were selected in this study to analyze the expression profiles of parsley under different abiotic stresses. Our data provide a potentially valuable resource that can be used for intensive parsley research. PMID:25268141
Classification of viral zoonosis through receptor pattern analysis.

PubMed

Bae, Se-Eun; Son, Hyeon Seok

2011-04-13

Viral zoonosis, the transmission of a virus from its primary vertebrate reservoir species to humans, requires ubiquitous cellular proteins known as receptor proteins. Zoonosis can occur not only through direct transmission from vertebrates to humans, but also through intermediate reservoirs or other environmental factors. Viruses can be categorized according to genotype (ssDNA, dsDNA, ssRNA and dsRNA viruses). Among them, the RNA viruses exhibit particularly high mutation rates and are especially problematic for this reason. Most zoonotic viruses are RNA viruses that change their envelope proteins to facilitate binding to various receptors of host species. In this study, we sought to predict zoonotic propensity through the analysis of receptor characteristics. We hypothesized that the major barrier to interspecies virus transmission is that receptor sequences vary among species--in other words, that the specific amino acid sequence of the receptor determines the ability of the viral envelope protein to attach to the cell. We analysed host-cell receptor sequences for their hydrophobicity/hydrophilicity characteristics. We then analysed these properties for similarities among receptors of different species and used a statistical discriminant analysis to predict the likelihood of transmission among species. This study is an attempt to predict zoonosis through simple computational analysis of receptor sequence differences. Our method may be useful in predicting the zoonotic potential of newly discovered viral strains.
A simple method for MR elastography: a gradient-echo type multi-echo sequence.

PubMed

Numano, Tomokazu; Mizuhara, Kazuyuki; Hata, Junichi; Washio, Toshikatsu; Homma, Kazuhiro

2015-01-01

To demonstrate the feasibility of a novel MR elastography (MRE) technique based on a conventional gradient-echo type multi-echo MR sequence which does not need additional bipolar magnetic field gradients (motion encoding gradient: MEG), yet is sensitive to vibration. In a gradient-echo type multi-echo MR sequence, several images are produced from each echo of the train with different echo times (TEs). If these echoes are synchronized with the vibration, each readout's gradient lobes achieve a MEG-like effect, and the later generated echo causes a greater MEG-like effect. The sequence was tested for the tissue-mimicking agarose gel phantoms and the psoas major muscles of healthy volunteers. It was confirmed that the readout gradient lobes caused an MEG-like effect and the later TE images had higher sensitivity to vibrations. The magnitude image of later generated echo suffered the T2 decay and the susceptibility artifacts, but the wave image and elastogram of later generated echo were unaffected by these effects. In in vivo experiments, this method was able to measure the mean shear modulus of the psoas major muscle. From the results of phantom experiments and volunteer studies, it was shown that this method has clinical application potential. Copyright © 2014 Elsevier Inc. All rights reserved.
Genotyping and Molecular Identification of Date Palm Cultivars Using Inter-Simple Sequence Repeat (ISSR) Markers.

PubMed

Ayesh, Basim M

2017-01-01

Molecular markers are credible for the discrimination of genotypes and estimation of the extent of genetic diversity and relatedness in a set of genotypes. Inter-simple sequence repeat (ISSR) markers rapidly reveal highly polymorphic fingerprints and have been used frequently to determine the genetic diversity among date palm cultivars. This chapter describes the application of ISSR markers for genotyping of date palm cultivars. The application involves extraction of genomic DNA from the target cultivars with reliable quality and quantity. Subsequently the extracted DNA serves as a template for amplification of genomic regions flanked by inverted simple sequence repeats using a single primer. The similarity of each pair of samples is measured by calculating the number of mono- and polymorphic bands revealed by gel electrophoresis. Matrices constructed for similarity and genetic distance are used to build a phylogenetic tree and perform cluster analysis, to determine the molecular relatedness of cultivars. In the protocol described, 3 out of 9 tested primers consistently amplified 31 loci in 6 date palm cultivars, 28 of which were polymorphic.
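The band-scoring step described in this abstract can be illustrated with a minimal Python sketch. It assumes that each cultivar is scored as a presence/absence (1/0) vector over the amplified loci and that similarity is the Jaccard coefficient; the abstract does not name the coefficient actually used, and the function names and band profiles below are hypothetical.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient between two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x and y)   # bands present in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # bands present in at least one
    # Two profiles with no bands at all are treated as identical (assumption).
    return shared / either if either else 1.0


def distance_matrix(profiles):
    """Pairwise genetic distance (1 - Jaccard similarity) for named profiles."""
    names = list(profiles)
    return {
        (p, q): 1.0 - jaccard_similarity(profiles[p], profiles[q])
        for p in names
        for q in names
    }


# Hypothetical band profiles for three cultivars over four loci.
profiles = {
    "cv1": [1, 1, 0, 1],
    "cv2": [1, 0, 0, 1],
    "cv3": [0, 0, 1, 0],
}
distances = distance_matrix(profiles)
```

A matrix of this kind is what a clustering or tree-building method would take as input; other coefficients (for example Dice) could be swapped in without changing the structure.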
Analysis on the DNA Fingerprinting of Aspergillus Oryzae Mutant Induced by High Hydrostatic Pressure

NASA Astrophysics Data System (ADS)

Wang, Hua; Zhang, Jian; Yang, Fan; Wang, Kai; Shen, Si-Le; Liu, Bing-Bing; Zou, Bo; Zou, Guang-Tian

2011-01-01

The mutant strains of Aspergillus oryzae (HP300a) are screened under 300 MPa for 20 min. Compared with the control strains, the screened mutant strains have unique properties such as genetic stability, rapid growth, lots of spores, and high protease activity. Random amplified polymorphic DNA (RAPD) and inter simple sequence repeats (ISSR) are used to analyze the DNA fingerprinting of HP300a and the control strains. There are 67.9% and 51.3% polymorphic bands obtained by these two markers, respectively, indicating significant genetic variations between HP300a and the control strains. In addition, in a comparison of HP300a with the control strains, the genetic distances obtained from the random sequence (RAPD) and simple sequence repeat (ISSR) markers are 0.51 and 0.34, respectively.
Structurally Complex Organization of Repetitive DNAs in the Genome of Cobia (Rachycentron canadum).

PubMed

Costa, Gideão W W F; Cioffi, Marcelo de B; Bertollo, Luiz A C; Molina, Wagner F

2015-06-01

Repetitive DNAs comprise the largest fraction of the eukaryotic genome. They include microsatellites or simple sequence repeats (SSRs), which play an important role in the chromosome differentiation among fishes. Rachycentron canadum is the only representative of the family Rachycentridae. This species has been the focus of several multidisciplinary studies in view of its important potential for marine fish farming. In the present study, distinct classes of repetitive DNAs, with emphasis on SSRs, were mapped in the chromosomes of this species to improve the knowledge of its genome organization. Microsatellites exhibited a diversified distribution, both dispersed in euchromatin and clustered in the heterochromatin. The multilocus location of SSRs strengthened the heterochromatin heterogeneity in this species, as suggested by some previous studies. The colocalization of SSRs with retrotransposons and transposons pointed to a close evolutionary relationship between these repetitive sequences. A number of heterochromatic regions showed a more complex organization than previously supposed, harboring a diversity of repetitive elements. In this sense, there was also evidence of colocalization of active genetic regions and different classes of repetitive DNAs in a common heterochromatic region, which offers a potential opportunity for further research regarding the interaction of these distinct fractions in fish genomes.
Free-energy calculations reveal the subtle differences in the interactions of DNA bases with α-hemolysin.

PubMed

Manara, Richard M A; Guy, Andrew T; Wallace, E Jayne; Khalid, Syma

2015-02-10

Next generation DNA sequencing methods that utilize protein nanopores have the potential to revolutionize this area of biotechnology. While the technique is underpinned by simple physics, the wild-type protein pores do not have all of the desired properties for efficient and accurate DNA sequencing. Much of the research efforts have focused on protein nanopores, such as α-hemolysin from Staphylococcus aureus. However, the speed of DNA translocation has historically been an issue, hampered in part by incomplete knowledge of the energetics of translocation. Here we have utilized atomistic molecular dynamics simulations of nucleotide fragments in order to calculate the potential of mean force (PMF) through α-hemolysin. Our results reveal specific regions within the pore that play a key role in the interaction with DNA. In particular, charged residues such as D127 and K131 provide stabilizing interactions with the anionic DNA and therefore are likely to reduce the speed of translocation. These regions provide rational targets for pore optimization. Furthermore, we show that the energetic contributions to the protein-DNA interactions are a complex combination of electrostatics and short-range interactions, often mediated by water molecules.
In silico mining and characterization of simple sequence repeats from gilthead sea bream (Sparus aurata) expressed sequence tags (EST-SSRs); PCR amplification, polymorphism evaluation and multiplexing and cross-species assays.

PubMed

Vogiatzi, Emmanouella; Lagnel, Jacques; Pakaki, Victoria; Louro, Bruno; Canario, Adelino V M; Reinhardt, Richard; Kotoulas, Georgios; Magoulas, Antonios; Tsigenopoulos, Costas S

2011-06-01

We screened for simple sequence repeats (SSRs) found in ESTs derived from an EST-database development project ('Marine Genomics Europe' Network of Excellence). Different motifs of di-, tri-, tetra-, penta- and hexanucleotide SSRs were evaluated for variation in length and position in the expressed sequences, relative abundance and distribution in gilthead sea bream (Sparus aurata). We found 899 ESTs that harbor 997 SSRs (4.94%). On average, one SSR was found per 2.95 kb of EST sequence and the dinucleotide SSRs are the most abundant accounting for 47.6% of the total number. EST-SSRs were used as template for primer design. 664 primer pairs could be successfully identified and a subset of 206 pairs of primers was synthesized, PCR-tested and visualized on ethidium bromide stained agarose gels. The main objective was to further assess the potential of EST-SSRs as informative markers and investigate their cross-species amplification in sixteen teleost fish species: seven sparid species and nine other species from different families. Approximately 78% of the primer pairs gave PCR products of expected size in gilthead sea bream, and as expected, the rate of successful amplification of sea bream EST-SSRs was higher in sparids, lower in other perciforms and even lower in species of the Clupeiform and Gadiform orders. We finally determined the polymorphism and the heterozygosity of 63 markers in a wild gilthead sea bream population; fifty-eight loci were found to be polymorphic with the expected heterozygosity and the number of alleles ranging from 0.089 to 0.946 and from 2 to 27, respectively. These tools and markers are expected to enhance the available genetic linkage map in gilthead sea bream, to assist comparative mapping and genome analyses for this species and further with other model fish species and finally to help advance genetic analysis for cultivated and wild populations and accelerate breeding programs. Copyright © 2011 Elsevier B.V. All rights reserved.
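As an illustration of what in silico SSR screening of this kind involves, the following Python sketch scans a nucleotide sequence for perfect di- to hexanucleotide tandem repeats using regular expressions. The minimum repeat counts and the function name are assumptions chosen for the example; the abstract does not state the tool or the thresholds the authors used.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # The group captures a k-mer; the backreference demands at least
        # n - 1 further tandem copies, i.e. n or more copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy EST-like sequence containing an (AT)7 and a (CAG)5 repeat.
example = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "T"
ssrs = find_ssrs(example)
```

Each hit gives the 0-based start position, the repeat motif, and the number of tandem copies; flanking sequence around such hits is what primer design would then work from.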
Learning predictive statistics from temporal sequences: Dynamics and strategies

PubMed Central

Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E.; Kourtzi, Zoe

2017-01-01

Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics—that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments. PMID:28973111
Learning predictive statistics from temporal sequences: Dynamics and strategies.

PubMed

Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe

2017-10-01

Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics; that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.
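The distinction drawn in this abstract between frequency statistics and context-based statistics can be sketched in Python as two Markov transition tables. The symbols, probabilities, and function names below are illustrative assumptions, not the values used in the study.

```python
import random


def markov_sequence(transitions, start, length, seed=None):
    """Sample a symbol sequence from a first-order Markov transition table."""
    rng = random.Random(seed)
    seq = [start]
    while len(seq) < length:
        symbols, probs = zip(*transitions[seq[-1]].items())
        seq.append(rng.choices(symbols, weights=probs, k=1)[0])
    return seq


# Frequency statistics: the next symbol ignores context; "A" is simply more probable.
frequency = {s: {"A": 0.7, "B": 0.1, "C": 0.1, "D": 0.1} for s in "ABCD"}

# Context-based statistics: the next symbol depends on the current one.
context = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"C": 0.8, "D": 0.2},
    "C": {"D": 0.8, "A": 0.2},
    "D": {"A": 0.8, "B": 0.2},
}


def maximizing_prediction(transitions, current):
    """Always choose the most probable successor of the current symbol."""
    return max(transitions[current], key=transitions[current].get)


sequence = markov_sequence(context, "A", 50, seed=1)
```

Under these assumed tables, a maximizing strategy always answers "B" after "A", whereas a matching strategy would answer "B" on roughly 80% of such trials and "C" on the rest.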
A coarse-grained biophysical model of sequence evolution and the population size dependence of the speciation rate

PubMed Central

Khatri, Bhavin S.; Goldstein, Richard A.

2015-01-01

Speciation is fundamental to understanding the huge diversity of life on Earth. Although still controversial, empirical evidence suggests that the rate of speciation is larger for smaller populations. Here, we explore a biophysical model of speciation by developing a simple coarse-grained theory of transcription factor-DNA binding and how their co-evolution in two geographically isolated lineages leads to incompatibilities. To develop a tractable analytical theory, we derive a Smoluchowski equation for the dynamics of binding energy evolution that accounts for the fact that natural selection acts on phenotypes, but variation arises from mutations in sequences; the Smoluchowski equation includes selection due to both gradients in fitness and gradients in sequence entropy, which is the logarithm of the number of sequences that correspond to a particular binding energy. This simple consideration predicts that smaller populations develop incompatibilities more quickly in the weak mutation regime; this trend arises as sequence entropy poises smaller populations closer to incompatible regions of phenotype space. These results suggest a generic coarse-grained approach to evolutionary stochastic dynamics, allowing realistic modelling at the phenotypic level. PMID:25936759
Exploiting rice-sorghum synteny for targeted development of EST-SSRs to enrich the sorghum genetic linkage map.

PubMed

Ramu, P; Kassahun, B; Senthilvel, S; Ashok Kumar, C; Jayashree, B; Folkertsma, R T; Reddy, L Ananda; Kuruvinashetti, M S; Haussmann, B I G; Hash, C T

2009-11-01

The sequencing and detailed comparative functional analysis of genomes of a number of select botanical models open new doors into comparative genomics among the angiosperms, with potential benefits for improvement of many orphan crops that feed large populations. In this study, a set of simple sequence repeat (SSR) markers was developed by mining the expressed sequence tag (EST) database of sorghum. Among the SSR-containing sequences, only those sharing considerable homology with rice genomic sequences across the lengths of the 12 rice chromosomes were selected. Thus, 600 SSR-containing sorghum EST sequences (50 homologous sequences on each of the 12 rice chromosomes) were selected, with the intention of providing coverage for corresponding homologous regions of the sorghum genome. Primer pairs were designed and polymorphism detection ability was assessed using parental pairs of two existing sorghum mapping populations. About 28% of these new markers detected polymorphism in this 4-entry panel. A subset of 55 polymorphic EST-derived SSR markers were mapped onto the existing skeleton map of a recombinant inbred population derived from cross N13 x E 36-1, which is segregating for Striga resistance and the stay-green component of terminal drought tolerance. These new EST-derived SSR markers mapped across all 10 sorghum linkage groups, mostly to regions expected based on prior knowledge of rice-sorghum synteny. The ESTs from which these markers were derived were then mapped in silico onto the aligned sorghum genome sequence, and 88% of the best hits corresponded to linkage-based positions. This study demonstrates the utility of comparative genomic information in targeted development of markers to fill gaps in linkage maps of related crop species for which sufficient genomic tools are not available.
New concepts of fluorescent probes for specific detection of DNA sequences: bis-modified oligonucleotides in excimer and exciplex detection.

PubMed

Gbaj, A; Bichenkova, Ev; Walsh, L; Savage, He; Sardarian, Ar; Etchells, Ll; Gulati, A; Hawisa, S; Douglas, Kt

2009-12-01

The detection of single base mismatches in DNA is important for diagnostics, treatment of genetic diseases, and identification of single nucleotide polymorphisms. Highly sensitive, specific assays are needed to investigate genetic samples from patients. The use of a simple fluorescent nucleoside analogue in detection of DNA sequence and point mutations by hybridisation in solution is described in this study. The 5'-bispyrene and 3'-naphthalene oligonucleotide probes form an exciplex on hybridisation to target in water and the 5'-bispyrene oligonucleotide alone is an adequate probe to determine concentration of target present. It was also indicated that this system has a potential to identify mismatches and insertions. The aim of this work was to investigate experimental structures and conditions that permit strong exciplex emission for nucleic acid detectors, and show how such exciplexes can register the presence of mismatches as required in SNP analysis. This study revealed that the hybridisation of 5'-bispyrenyl fluorophore to a DNA target results in formation of a fluorescent probe with high signal intensity change and specificity for detecting a complementary target in a homogeneous system. Detection of SNP mutations using this split-probe system is a highly specific, simple, and accessible method to meet the rigorous requirements of pharmacogenomic studies. Thus, it is possible for the system to act as SNP detectors and it shows promise for future applications in genetic testing.
[Age structure and genetic diversity of Homatula pycnolepis in the Nujiang River basin].

PubMed

Yue, Xing-Jian; Liu, Shao-Ping; Liu, Ming-Dian; Duan, Xin-Bin; Wang, Deng-Qiang; Chen, Da-Qing

2013-08-01

This study examined the age structure of the loach Homatula pycnolepis through the otolith growth rings in 204 individual specimens collected from the Xiaomengtong River of the Nujiang River (Salween River) basin in April, 2008. There were only two different age classes, 1 and 2 years of age; no 3-year-olds were detected. The age structure of H. pycnolepis was simple. The complete mitochondrial DNA cytochrome b gene sequences (1140) of 80 individuals from 4 populations collected in the Nujiang River drainage were sequenced and a total of 44 variable sites were found among 4 different haplotypes. The global haplotype diversity (Hd) and nucleotide diversity (Pi) were calculated at 0.7595 and 0.0151, respectively, and both were 0 within each population, indicating a consistent lack of genetic diversity in each small population. There was obvious geographic structure in both the Nujiang River basin (NJB) group and the Nanding River (NDR) group. The genetic distance between NJB and NDR was calculated at 0.0356, suggesting that genetic divergence resulted from long-term isolation of individual populations. Such a simple age structure and a lack of genetic diversity in H. pycnolepis may potentially be due to small populations and local fishing pressures. Accordingly, the results of this study prompt us to recommend that the NJB, NDR and Lancang River populations should be protected as three different evolutionary significant units or separated management units.
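The two diversity statistics reported here can be computed as in the Python sketch below, which uses the standard estimators: haplotype diversity Hd = n/(n-1) × (1 − Σ p_i²) and nucleotide diversity as the mean proportion of differing sites over all sequence pairs. The sample composition in the usage example (20 individuals for each of 4 haplotypes) is an assumption made for illustration; the abstract gives only the totals of 80 individuals and 4 haplotypes.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Hd = n/(n-1) * (1 - sum(p_i^2)) over haplotype frequencies p_i."""
    n = len(haplotypes)
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Assumed composition: 4 haplotypes, 20 individuals each (80 in total).
pooled = ["h1"] * 20 + ["h2"] * 20 + ["h3"] * 20 + ["h4"] * 20
hd_global = haplotype_diversity(pooled)          # about 0.7595
hd_within = haplotype_diversity(["h1"] * 20)     # 0.0 when one haplotype is fixed
```

Under that assumed composition the pooled value comes out at about 0.7595, and a population fixed for a single haplotype gives 0, which is consistent with the pattern described in the abstract.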
Development of genome-wide SNP assays for rice

USDA-ARS's Scientific Manuscript database

With the introduction of new sequencing technologies, single nucleotide polymorphisms (SNPs) are rapidly replacing simple sequence repeats (SSRs) as the DNA marker of choice for applications in plant breeding and genetics because they are more abundant, stable, amenable to automation, efficient, and...

A Simple View of Writing in Chinese

ERIC Educational Resources Information Center

Yeung, Pui-sze; Ho, Connie Suk-han; Chan, David Wai-ock; Chung, Kevin Kien-hoa

2017-01-01

This study examined the Chinese written composition development of elementary-grade students in relation to the simple view of writing. Measures of nonverbal reasoning ability, component skills of transcription (stroke sequence knowledge, word spelling, and handwriting fluency), oral language (definitional skill, oral narrative skills, and…
Measuring DNA hybridization using fluorescent DNA-stabilized silver clusters to investigate mismatch effects on therapeutic oligonucleotides.

PubMed

de Bruin, Donny; Bossert, Nelli; Aartsma-Rus, Annemieke; Bouwmeester, Dirk

2018-04-06

Short nucleic acid oligomers have found a wide range of applications in experimental physics, biology and medicine, and show potential for the treatment of acquired and genetic diseases. These applications rely heavily on the predictability of hybridization through Watson-Crick base pairing to allow positioning on a nanometer scale, as well as binding to the target transcripts, but also off-target binding to transcripts with partial homology. These effects are of particular importance in the development of therapeutic oligonucleotides, where off-target effects caused by the binding of mismatched sequences need to be avoided. We employ a novel method of probing DNA hybridization using optically active DNA-stabilized silver clusters (Ag-DNA) to measure binding efficiencies through a change in fluorescence intensity. In this way we can determine their location-specific sensitivity to individual mismatches in the sequence. The results reveal a strong dependence of the hybridization on the location of the mismatch, whereby mismatches close to the edges and center show a relatively minor impact. In parallel, we propose a simple model for calculating the annealing ratios of mismatched DNA sequences, which supports our experimental results. The primary result shown in this work is a demonstration of a novel technique to measure DNA hybridization using fluorescent Ag-DNA. With this technique, we investigated the effect of mismatches on the hybridization efficiency, and found a significant dependence on the location of individual mismatches. These effects are strongly influenced by the length of the used oligonucleotides. The novel probe method based on fluorescent Ag-DNA functions as a reliable tool in measuring this behavior. As a secondary result, we formulated a simple model that is consistent with the experimental data.
Aging reduces experience-induced sensorimotor plasticity. A magnetoencephalographic study.

PubMed

Mary, Alison; Bourguignon, Mathieu; Wens, Vincent; Op de Beeck, Marc; Leproult, Rachel; De Tiège, Xavier; Peigneux, Philippe

2015-01-01

Modulation of the mu-alpha and mu-beta spontaneous rhythms reflects plastic neural changes within the primary sensorimotor cortex (SM1). Using magnetoencephalography (MEG), we investigated how aging modifies experience-induced plasticity after learning a motor sequence, looking at post- vs. pre-learning changes in the modulation of mu rhythms during the execution of simple hand movements. Fifteen young (18-30 years) and fourteen older (65-75 years) right-handed healthy participants performed auditory-cued key presses using all four left fingers simultaneously (Simple Movement task - SMT) during two separate sessions. Following both SMT sessions, they repeatedly practiced a 5-elements sequential finger-tapping task (FTT). Mu power calculated during SMT was averaged across 18 gradiometers covering the right sensorimotor region and compared before vs. after sequence learning in the alpha (9/10/11Hz) and the beta (18/20/22Hz) bands separately. Source power maps in the mu-alpha and mu-beta bands were localized using Dynamic Statistical Parametric Mapping (dSPM). The FTT sequence was performed faster at retest than at the end of the learning session, indicating an offline boost in performance. Analyses conducted on SMT sessions revealed enhanced rebound after learning in the right SM1, 3000-3500ms after the initiation of movement, in young as compared to older participants. Source reconstruction indicated that mu-beta is located in the precentral gyrus (motor processes) and mu-alpha is located in the postcentral gyrus (somatosensory processes) in both groups. The enhanced post-movement rebound in young subjects potentially reflects post-training plastic changes in SM1. Age-related decreases in post-training modulatory effects suggest reduced experience-dependent plasticity in the aging brain. Copyright © 2014 Elsevier Inc. All rights reserved.
Simple tools for assembling and searching high-density picolitre pyrophosphate sequence data.

PubMed

Parker, Nicolas J; Parker, Andrew G

2008-04-18

The advent of pyrophosphate sequencing makes large volumes of sequencing data available at a lower cost than previously possible. However, the short read lengths are difficult to assemble and the large dataset is difficult to handle. During the sequencing of a virus from the tsetse fly, Glossina pallidipes, we found the need for tools to search quickly a set of reads for near exact text matches. A set of tools is provided to search a large data set of pyrophosphate sequence reads under a "live" CD version of Linux on a standard PC that can be used by anyone without prior knowledge of Linux and without having to install a Linux setup on the computer. The tools permit short lengths of de novo assembly, checking of existing assembled sequences, selection and display of reads from the data set and gathering counts of sequences in the reads. Demonstrations are given of the use of the tools to help with checking an assembly against the fragment data set; investigating homopolymer lengths, repeat regions and polymorphisms; and resolving inserted bases caused by incomplete chain extension. The additional information contained in a pyrophosphate sequencing data set beyond a basic assembly is difficult to access due to a lack of tools. The set of simple tools presented here would allow anyone with basic computer skills and a standard PC to access this information.
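The kind of near-exact text search over reads described in this record can be sketched as follows. This is a minimal illustration, not the authors' tools: the function names `hamming_within` and `find_near_matches` are hypothetical, and "near exact" is assumed here to mean a bounded number of substitutions (no insertions or deletions).

```python
def hamming_within(a, b, max_mismatches):
    """Return True if equal-length strings a and b differ in at most max_mismatches positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def find_near_matches(reads, query, max_mismatches=1):
    """Return (read_index, offset) for every window of a read that nearly matches query.

    Assumption: only substitutions are tolerated; indels are not handled.
    """
    k = len(query)
    hits = []
    for i, read in enumerate(reads):
        # Slide a window of the query's length along the read.
        for off in range(len(read) - k + 1):
            if hamming_within(read[off:off + k], query, max_mismatches):
                hits.append((i, off))
    return hits


reads = ["ACGTACGT", "TTTTACGA", "GGGG"]
print(find_near_matches(reads, "ACGT", max_mismatches=1))
```

A linear scan like this is adequate for small read sets; a large pyrophosphate data set would more plausibly need an index, which is outside the scope of this sketch.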
Recursive sequences in first-year calculus

NASA Astrophysics Data System (ADS)

Krainer, Thomas

2016-02-01

This article provides ready-to-use supplementary material on recursive sequences for a second-semester calculus class. It equips first-year calculus students with a basic methodical procedure based on which they can conduct a rigorous convergence or divergence analysis of many simple recursive sequences on their own without the need to invoke inductive arguments as is typically required in calculus textbooks. The sequences that are accessible to this kind of analysis are predominantly (eventually) monotonic, but also certain recursive sequences that alternate around their limit point as they converge can be considered.
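A numerical check can accompany the kind of convergence analysis this record describes. The sketch below uses an example chosen here for illustration and not taken from the article: the recursion a_{n+1} = sqrt(2 + a_n) with a_0 = 0, which is increasing, bounded above by 2, and converges to the fixed point 2. The helper name `iterate` is hypothetical.

```python
import math


def iterate(f, a0, n):
    """Return the first n+1 terms a_0, ..., a_n of the recursion a_{k+1} = f(a_k)."""
    terms = [a0]
    for _ in range(n):
        terms.append(f(terms[-1]))
    return terms


# Illustrative recursion (an assumption of this sketch, not from the article).
terms = iterate(lambda a: math.sqrt(2 + a), 0.0, 30)

# Monotonicity and boundedness are the two ingredients of the usual convergence argument.
is_increasing = all(x <= y for x, y in zip(terms, terms[1:]))
is_bounded = all(t <= 2.0 for t in terms)

print(is_increasing, is_bounded, terms[-1])
```

The numerical evidence only suggests the limit; the rigorous step is solving L = sqrt(2 + L) for the candidate limit and proving monotonicity and boundedness.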
New solutions with accelerated expansion in string theory

DOE PAGES

Dodelson, Matthew; Dong, Xi; Silverstein, Eva; ...

2014-12-05

We present concrete solutions with accelerated expansion in string theory, requiring a small, tractable list of stress energy sources. We explain how this construction (and others in progress) evades previous no go theorems for simple accelerating solutions. Our solutions respect an approximate scaling symmetry and realize discrete sequences of values for the equation of state, including one with an accumulation point at w = –1 and another accumulating near w = –1/3 from below. In another class of models, a density of defects generates scaling solutions with accelerated expansion. Here, we briefly discuss potential applications to dark energy phenomenology, and to holography for cosmology.
SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

USGS Publications Warehouse

Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

2013-01-01

SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

PubMed

Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

2013-01-01

SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species.

PubMed

Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg

2005-12-01

In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats. The numbers of potential functional sites varied markedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
Genotyping variability of computationally categorized peach microsatellite markers

USDA-ARS's Scientific Manuscript database

Numerous expressed sequence tag (EST) simple sequence repeat (SSR) primers can be easily mined out. The obstacle to develop them into usable markers is how to optimally select downsized subsets of the primers for genotyping, which accordingly reduces amplification failure and monomorphism often occu...
A Fractal Excursion.

ERIC Educational Resources Information Center

Camp, Dane R.

1991-01-01

After introducing the two-dimensional Koch curve, which is generated by simple recursions on an equilateral triangle, the process is extended to three dimensions with simple recursions on a regular tetrahedron. Included, for both fractal sequences, are iterative formulae, illustrations of the first several iterations, and a sample PASCAL program.…
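The iterative formulae for the two-dimensional case can be illustrated with a short sketch. The article's own program is in PASCAL and is not reproduced here; the Python below is an independent illustration using the standard Koch snowflake relations, in which each iteration replaces every segment with four segments one third as long. The function name `koch_snowflake` is hypothetical, and the three-dimensional tetrahedron extension is not covered.

```python
from fractions import Fraction


def koch_snowflake(n, side=Fraction(1)):
    """Return (segment count, segment length, perimeter) after n iterations.

    Starts from an equilateral triangle with the given side length.
    """
    segments = 3 * 4 ** n          # every segment becomes four at each step
    seg_len = side / 3 ** n        # each new segment is one third as long
    perimeter = segments * seg_len  # equals 3 * side * (4/3) ** n
    return segments, seg_len, perimeter


for n in range(4):
    print(n, koch_snowflake(n))
```

Exact fractions are used so that the growth of the perimeter by a factor of 4/3 per iteration is visible without rounding.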
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats.

PubMed

Anwar, Tamanna; Khan, Asad U

2006-02-20

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found as longer tracts (> 6 repeating units). Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to determine the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search for repeats of any length and produce output with their exact positions on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com.
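Reporting the exact location and frequency of SSRs can be sketched with a backreference regular expression. This is an illustration only and does not reproduce SSRscanner (which is written in PERL); the function name `find_ssrs`, the motif length range of 1 to 3 bp, and the minimum of 4 repeats are all assumptions of this sketch.

```python
import re
from collections import Counter


def find_ssrs(seq, min_motif=1, max_motif=3, min_repeats=4):
    """Return (start, end, motif, repeat_count) for each SSR, using 1-based inclusive positions."""
    seq = seq.upper()
    # The lazy quantifier tries the shortest motif first, so "AAAA" is
    # reported as motif "A" rather than "AA". \1 requires exact copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        repeats = len(m.group(0)) // len(motif)
        hits.append((m.start() + 1, m.end(), motif, repeats))
    return hits


sequence = "GGT" + "AC" * 5 + "TTG" + "A" * 6 + "CC"
ssrs = find_ssrs(sequence)
print(ssrs)

# Frequency of each motif across the sequence.
print(Counter(motif for _, _, motif, _ in ssrs))
```

Only perfect repeats are found this way; interrupted or compound microsatellites would need a more permissive search.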
A hybrid swarm population of Pinus densiflora x P. sylvestris hybrids inferred from sequence analysis of chloroplast DNA and morphological characters

USDA-ARS's Scientific Manuscript database

To confirm a hybrid swarm population of Pinus densiflora × P. sylvestris in Jilin, China and to study whether shoot apex morphology of 4-year old seedlings can be correlated with the sequence of a chloroplast DNA simple sequence repeat marker (cpDNA SSR), needles and seeds from P. densiflora, P. syl...
Comparison of Computed Condon Loci with Franck-Condon Factors in Deslandres Tables of Molecular Band Systems

NASA Astrophysics Data System (ADS)

Hefferlin, R.; Clark, B.; Tatum, J.

2012-06-01

The literature often shows a Condon parabola not quite tracking the Franck-Condon factors for the strongest bands in the Deslandres table for a diatomic molecular band system; often the parabola appears to have been hand-drawn. We have calculated Condon loci, assuming originally that the lower and upper electronic potentials are simple harmonic potentials, and assuming now that they are Morse potentials. In the harmonic case the Condon loci are parabolas. These calculations are for small vibrational quantum numbers, where the Morse loci also begin as parabolas. We will present these loci for representative molecular band systems and discuss the extent to which the loci track the strongest Franck-Condon factors. In the event that neither does, calculations for arbitrary potentials are available. The importance of this study is that we have previously calculated the latera recta and the symmetry-axis angles of the harmonic oscillator parabolas in Deslandres tables appropriate to molecules in several isoelectronic sequences. We have found that the angle increases along the sequence until the species one proton-shift away from “rare-gas” molecules, such as LiNe, is reached. This phenomenon is a suggestion that diatomic molecules are periodic with respect to each of their two atoms. G. Herzberg, Molecular Spectra and Molecular Structure, 1950, pg. 197 D. J. Flynn, R. J. Spindler; S. Fifer; M. Kelly, J. Quant. Spectr. Radiat. Transfer 4, 271-282, (1964) R. W. Nicholls, J. Quant. Spectr. Radiat. Transfer 28, 481-492, (1982).
Assessment, validation and deployment strategy of a two-barcode protocol for facile genotyping of duckweed species.

PubMed

Borisjuk, N; Chu, P; Gutierrez, R; Zhang, H; Acosta, K; Friesen, N; Sree, K S; Garcia, C; Appenroth, K J; Lam, E

2015-01-01

Lemnaceae, commonly called duckweeds, comprise a diverse group of floating aquatic plants that have previously been classified into 37 species based on morphological and physiological criteria. In addition to their unique evolutionary position among angiosperms and their applications in biomonitoring, the potential of duckweeds as a novel sustainable crop for fuel and feed has recently increased interest in the study of their biodiversity and systematics. However, due to their small size and abbreviated structure, accurate typing of duckweeds based on morphology can be challenging. In the past decade, attempts to employ molecular barcoding techniques for species assignment have produced promising results; however, they have yet to be codified into a simple and quantitative protocol. A study that compiles and compares the barcode sequences within all known species of this family would help to establish the fidelity and limits of this DNA-based approach. In this work, we compared the level of conservation between over 100 strains of duckweed for two intergenic barcode sequences derived from the plastid genome. By using over 300 sequences publicly available in the NCBI database, we determined the utility of each of these two barcodes for duckweed species identification. Through sequencing of these barcodes from additional accessions, 30 of the 37 known species of duckweed could be identified with varying levels of confidence using this approach. From our analyses using this reference dataset, we also confirmed two instances where mis-assignment of species has likely occurred. Potential strategies for further improving the scope of this technology are discussed. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
Estimating the potential refolding yield of recombinant proteins expressed as inclusion bodies.

PubMed

Ho, Jason G S; Middelberg, Anton P J

2004-09-05

Recombinant protein production in bacteria is efficient except that insoluble inclusion bodies form when some gene sequences are expressed. Such proteins must undergo renaturation, which is an inefficient process due to protein aggregation on dilution from concentrated denaturant. In this study, the protein-protein interactions of eight distinct inclusion-body proteins are quantified, in different solution conditions, by measurement of protein second virial coefficients (SVCs). Protein solubility is shown to decrease as the SVC is reduced (i.e., as protein interactions become more attractive). Plots of SVC versus denaturant concentration demonstrate two clear groupings of proteins: a more aggregative group and a group having higher SVC and better solubility. A correlation of the measured SVC with protein molecular weight and hydropathicity, that is able to predict which group each of the eight proteins falls into, is presented. The inclusion of additives known to inhibit aggregation during renaturation improves solubility and increases the SVC of both protein groups. Furthermore, an estimate of maximum refolding yield (or solubility) using high-performance liquid chromatography was obtained for each protein tested, under different environmental conditions, enabling a relationship between "yield" and SVC to be demonstrated. Combined, the results enable an approximate estimation of the maximum refolding yield that is attainable for each of the eight proteins examined, under a selected chemical environment. Although the correlations must be tested with a far larger set of protein sequences, this work represents a significant move beyond empirical approaches for optimizing renaturation conditions. The approach moves toward the ideal of predicting maximum refolding yield using simple bioinformatic metrics that can be estimated from the gene sequence. Such a capability could potentially "screen," in silico, those sequences suitable for expression in bacteria from those that must be expressed in more complex hosts.
SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop

PubMed Central

Schumacher, André; Pireddu, Luca; Niemenmaa, Matti; Kallio, Aleksi; Korpelainen, Eija; Zanetti, Gianluigi; Heljanko, Keijo

2014-01-01

Summary: Hadoop MapReduce-based approaches have become increasingly popular due to their scalability in processing large sequencing datasets. However, as these methods typically require in-depth expertise in Hadoop and Java, they are still out of reach of many bioinformaticians. To solve this problem, we have created SeqPig, a library and a collection of tools to manipulate, analyze and query sequencing datasets in a scalable and simple manner. SeqPig scripts use the Hadoop-based distributed scripting engine Apache Pig, which automatically parallelizes and distributes data processing tasks. We demonstrate SeqPig’s scalability over many computing nodes and illustrate its use with example scripts. Availability and Implementation: Available under the open source MIT license at http://sourceforge.net/projects/seqpig/ Contact: andre.schumacher@yahoo.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24149054
A Practical Workshop for Generating Simple DNA Fingerprints of Plants

ERIC Educational Resources Information Center

Rouziere, A.-S.; Redman, J. E.

2011-01-01

Gel electrophoresis DNA fingerprints offer a graphical and visually appealing illumination of the similarities and differences between DNA sequences of different species and individuals. A polymerase chain reaction (PCR) and restriction digest protocol was designed to give high-school students the opportunity to generate simple fingerprints of…
Multiplexed microsatellite recovery using massively parallel sequencing

Treesearch

T.N. Jennings; B.J. Knaus; T.D. Mullins; S.M. Haig; R.C. Cronn

2011-01-01

Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of...
BeerDeCoded: the open beer metagenome project.

PubMed

Sobel, Jonathan; Henry, Luc; Rotman, Nicolas; Rando, Gianpaolo

2017-01-01

Next generation sequencing has radically changed research in the life sciences, in both academic and corporate laboratories. The potential impact is tremendous, yet a majority of citizens have little or no understanding of the technological and ethical aspects of this widespread adoption. We designed BeerDeCoded as a pretext to discuss the societal issues related to genomic and metagenomic data with fellow citizens, while advancing scientific knowledge of the most popular beverage of all. In the spirit of citizen science, sample collection and DNA extraction were carried out with the participation of non-scientists in the community laboratory of Hackuarium, a not-for-profit organisation that supports unconventional research and promotes the public understanding of science. The dataset presented herein contains the targeted metagenomic profile of 39 bottled beers from 5 countries, based on internal transcribed spacer (ITS) sequencing of fungal species. A preliminary analysis reveals the presence of a large diversity of wild yeast species in commercial brews. With this project, we demonstrate that coupling simple laboratory procedures that can be carried out in a non-professional environment with state-of-the-art sequencing technologies and targeted metagenomic analyses, can lead to the detection and identification of the microbial content in bottled beer.

Dynamic modeling of normal faults of the 2016 Central Italy earthquake sequence

NASA Astrophysics Data System (ADS)

Aochi, Hideo

2017-04-01

The 2016 Central Italy earthquake sequence is characterized mainly by the Mw6.0 24th August, Mw5.9 26th October and Mw6.4 30th October events, as well as two Mw5.4 earthquakes (24th August, 26th October) (INGV catalogue). They all show normal faulting mechanisms corresponding to the tectonics of the Apennines. They are aligned roughly along a NNW-SSE axis, and they may not lie on a single continuous fault plane. Therefore, dynamic rupture modeling of the sequence should be carried out assuming multiple co-planar normal fault segments. We apply a Boundary Domain Method (BDM; Goto and Bielak, GJI, 2008), which couples a boundary integral equation method with a domain-based method, namely a finite difference method in this study. The Mw6.0 24th August earthquake is modeled. We use the basic information on hypocenter position, focal mechanism and potential rupture dimension from the INGV catalogue and Tinti et al. (GRL, 2016), and begin with a simple condition (homogeneous boundary condition). Our preliminary simulations show that a uniformly extended rupture model does not fit the near-field ground motions and that localized heterogeneity would be required.
Mutation at a distance caused by homopolymeric guanine repeats in Saccharomyces cerevisiae

PubMed Central

McDonald, Michael J.; Yu, Yen-Hsin; Guo, Jheng-Fen; Chong, Shin Yen; Kao, Cheng-Fu; Leu, Jun-Yi

2016-01-01

Mutation provides the raw material from which natural selection shapes adaptations. The rate at which new mutations arise is therefore a key factor that determines the tempo and mode of evolution. However, an accurate assessment of the mutation rate of a given organism is difficult because mutation rate varies on a fine scale within a genome. A central challenge of evolutionary genetics is to determine the underlying causes of this variation. In earlier work, we had shown that repeat sequences not only are prone to a high rate of expansion and contraction but also can cause an increase in mutation rate (on the order of kilobases) of the sequence surrounding the repeat. We perform experiments that show that simple guanine repeats 13 bp (base pairs) in length or longer (G13+) increase the substitution rate 4- to 18-fold in the downstream DNA sequence, and this correlates with DNA replication timing (R = 0.89). We show that G13+ mutagenicity results from the interplay of both error-prone translesion synthesis and homologous recombination repair pathways. The mutagenic repeats that we study have the potential to be exploited for the artificial elevation of mutation rate in systems biology and synthetic biology applications. PMID:27386516
BeerDeCoded: the open beer metagenome project

PubMed Central

Sobel, Jonathan; Henry, Luc; Rotman, Nicolas; Rando, Gianpaolo

2017-01-01

Next generation sequencing has radically changed research in the life sciences, in both academic and corporate laboratories. The potential impact is tremendous, yet a majority of citizens have little or no understanding of the technological and ethical aspects of this widespread adoption. We designed BeerDeCoded as a pretext to discuss the societal issues related to genomic and metagenomic data with fellow citizens, while advancing scientific knowledge of the most popular beverage of all. In the spirit of citizen science, sample collection and DNA extraction were carried out with the participation of non-scientists in the community laboratory of Hackuarium, a not-for-profit organisation that supports unconventional research and promotes the public understanding of science. The dataset presented herein contains the targeted metagenomic profile of 39 bottled beers from 5 countries, based on internal transcribed spacer (ITS) sequencing of fungal species. A preliminary analysis reveals the presence of a large diversity of wild yeast species in commercial brews. With this project, we demonstrate that coupling simple laboratory procedures that can be carried out in a non-professional environment with state-of-the-art sequencing technologies and targeted metagenomic analyses, can lead to the detection and identification of the microbial content in bottled beer. PMID:29123645
NLSdb-major update for database of nuclear localization signals and nuclear export signals.

PubMed

Bernhofer, Michael; Goldberg, Tatyana; Wolf, Silvana; Ahmed, Mohamed; Zaugg, Julian; Boden, Mikael; Rost, Burkhard

2018-01-04

NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES. The potential sets of novel NES and NLS have been generated by a simple 'in silico mutagenesis' protocol. We started with motifs annotated by experiments. In step 1, we increased specificity such that no known non-nuclear protein matched the refined motif. In step 2, we increased the sensitivity trying to match several different families with a motif. We then iterated over steps 1 and 2. The final set of 2253 NLS motifs matched 35% of 8421 experimentally verified nuclear proteins (up from 21% for the previous version) and none of 18 278 non-nuclear proteins. We updated the web interface providing multiple options to search protein sequences for NES and NLS motifs, and to evaluate your own signal sequences. NLSdb can be accessed via Rostlab services at: https://rostlab.org/services/nlsdb/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
A note on the efficiencies of sampling strategies in two-stage Bayesian regional fine mapping of a quantitative trait.

PubMed

Chen, Zhijian; Craiu, Radu V; Bull, Shelley B

2014-11-01

In focused studies designed to follow up associations detected in a genome-wide association study (GWAS), investigators can proceed to fine-map a genomic region by targeted sequencing or dense genotyping of all variants in the region, aiming to identify a functional sequence variant. For the analysis of a quantitative trait, we consider a Bayesian approach to fine-mapping study design that incorporates stratification according to a promising GWAS tag SNP in the same region. Improved cost-efficiency can be achieved when the fine-mapping phase incorporates a two-stage design, with identification of a smaller set of more promising variants in a subsample taken in stage 1, followed by their evaluation in an independent stage 2 subsample. To avoid the potential negative impact of genetic model misspecification on inference we incorporate genetic model selection based on posterior probabilities for each competing model. Our simulation study shows that, compared to simple random sampling that ignores genetic information from GWAS, tag-SNP-based stratified sample allocation methods reduce the number of variants continuing to stage 2 and are more likely to promote the functional sequence variant into confirmation studies. © 2014 WILEY PERIODICALS, INC.
An Enumerative Combinatorics Model for Fragmentation Patterns in RNA Sequencing Provides Insights into Nonuniformity of the Expected Fragment Starting-Point and Coverage Profile.

PubMed

Prakash, Celine; Haeseler, Arndt Von

2017-03-01

RNA sequencing (RNA-seq) has emerged as the method of choice for measuring the expression of RNAs in a given cell population. In most RNA-seq technologies, sequencing the full length of RNA molecules requires fragmentation into smaller pieces. Unfortunately, the issue of nonuniform sequencing coverage across a genomic feature has been a concern in RNA-seq and is attributed to biases for certain fragments in RNA-seq library preparation and sequencing. To investigate the expected coverage obtained from fragmentation, we develop a simple fragmentation model that is independent of bias from the experimental method and is not specific to the transcript sequence. Essentially, we enumerate all configurations for maximal placement of a given fragment length, F, on transcript length, T, to represent every possible fragmentation pattern, from which we compute the expected coverage profile across a transcript. We extend this model to incorporate general empirical attributes such as read length, fragment length distribution, and number of molecules of the transcript. We further introduce the fragment starting-point, fragment coverage, and read coverage profiles. We find that the expected profiles are not uniform and that factors such as fragment length to transcript length ratio, read length to fragment length ratio, fragment length distribution, and number of molecules influence the variability of coverage across a transcript. Finally, we explore a potential application of the model where, with simulations, we show that it is possible to correctly estimate the transcript copy number for any transcript in the RNA-seq experiment.
An Enumerative Combinatorics Model for Fragmentation Patterns in RNA Sequencing Provides Insights into Nonuniformity of the Expected Fragment Starting-Point and Coverage Profile

PubMed Central

Haeseler, Arndt Von

2017-01-01

RNA sequencing (RNA-seq) has emerged as the method of choice for measuring the expression of RNAs in a given cell population. In most RNA-seq technologies, sequencing the full length of RNA molecules requires fragmentation into smaller pieces. Unfortunately, the issue of nonuniform sequencing coverage across a genomic feature has been a concern in RNA-seq and is attributed to biases for certain fragments in RNA-seq library preparation and sequencing. To investigate the expected coverage obtained from fragmentation, we develop a simple fragmentation model that is independent of bias from the experimental method and is not specific to the transcript sequence. Essentially, we enumerate all configurations for maximal placement of a given fragment length, F, on transcript length, T, to represent every possible fragmentation pattern, from which we compute the expected coverage profile across a transcript. We extend this model to incorporate general empirical attributes such as read length, fragment length distribution, and number of molecules of the transcript. We further introduce the fragment starting-point, fragment coverage, and read coverage profiles. We find that the expected profiles are not uniform and that factors such as fragment length to transcript length ratio, read length to fragment length ratio, fragment length distribution, and number of molecules influence the variability of coverage across a transcript. Finally, we explore a potential application of the model where, with simulations, we show that it is possible to correctly estimate the transcript copy number for any transcript in the RNA-seq experiment. PMID:27661099
Rapid microsatellite marker development for African mahogany (Khaya senegalensis, Meliaceae) using next-generation sequencing and assessment of its intra-specific genetic diversity.

PubMed

Karan, M; Evans, D S; Reilly, D; Schulte, K; Wright, C; Innes, D; Holton, T A; Nikles, D G; Dickinson, G R

2012-03-01

Khaya senegalensis (African mahogany or dry-zone mahogany) is a high-value hardwood timber species with great potential for forest plantations in northern Australia. The species is distributed across the sub-Saharan belt from Senegal to Sudan and Uganda. Because of heavy exploitation and constraints on natural regeneration and sustainable planting, it is now classified as a vulnerable species. Here, we describe the development of microsatellite markers for K. senegalensis using next-generation sequencing to assess its intra-specific diversity across its natural range, which is a key for successful breeding programs and effective conservation management of the species. Next-generation sequencing yielded 93,943 sequences with an average read length of 234 bp. The assembled sequences contained 1030 simple sequence repeats, with primers designed for 522 microsatellite loci. Twenty-one microsatellite loci were tested with 11 showing reliable amplification and polymorphism in K. senegalensis. The 11 novel microsatellites, together with one previously published, were used to assess 73 accessions belonging to the Australian K. senegalensis domestication program, sampled from across the natural range of the species. STRUCTURE analysis shows two major clusters, one comprising mainly accessions from west Africa (Senegal to Benin) and the second based in the far eastern limits of the range in Sudan and Uganda. Higher levels of genetic diversity were found in material from western Africa. This suggests that new seed collections from this region may yield more diverse genotypes than those originating from Sudan and Uganda in eastern Africa. © 2011 Blackwell Publishing Ltd.
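As a rough illustration of how simple sequence repeats can be located in assembled reads, the following sketch scans a DNA string for perfect microsatellites with a regular expression. The minimum repeat counts and the function name are illustrative assumptions, not the settings or software used in the study.

```python
import re

# Assumed, illustrative thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A motif of length k followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" is "AT" x 2).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "T"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Primer design for the flanking regions, as described in the abstract, would be a separate step and is not covered by this sketch.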
Genetic linkage map and QTL identification for adventitious rooting traits in red gum eucalypts.

PubMed

Sumathi, Murugan; Bachpai, Vijaya Kumar Waman; Mayavel, A; Dasgupta, Modhumita Ghosh; Nagarajan, Binai; Rajasugunasekar, D; Sivakumar, Veerasamy; Yasodha, Ramasamy

2018-05-01

The eucalypt species Eucalyptus tereticornis and Eucalyptus camaldulensis show tolerance to drought and salinity conditions, respectively, and are widely cultivated in arid and semiarid regions of tropical countries. In this study, a genetic linkage map was developed for the interspecific cross E. tereticornis × E. camaldulensis using a pseudo-testcross strategy with simple sequence repeat (SSR), inter-simple sequence repeat (ISSR), and sequence-related amplified polymorphism (SRAP) markers. The consensus genetic map comprised a total of 283 markers (84 SSRs, 94 ISSRs, and 105 SRAP markers) on 11 linkage groups spanning a genetic distance of 1163.4 cM. Blasting the SSR sequences against E. grandis sequences allowed an alignment of 64%, and the average ratio of genetic-to-physical distance was 1.7 Mbp/cM, which strengthens the evidence that a high degree of synteny and colinearity exists among eucalypt genomes. Blast searches also revealed that 37% of SSRs had homologies with genes, which could potentially be used in a variety of downstream applications, including candidate gene polymorphism. Quantitative trait loci (QTL) analysis for adventitious rooting traits revealed six QTL for rooting percent and root length on five chromosomes with interval and composite interval mapping. All the QTL explained 12.0-14.7% of the phenotypic variance, showing the involvement of major-effect QTL in adventitious rooting traits. Increasing the density of markers would facilitate the detection of more small-effect QTL and help pinpoint the genes involved in the rooting process.
Integrative workflows for metagenomic analysis

PubMed Central

Ladoukakis, Efthymios; Kolisis, Fragiskos N.; Chatziioannou, Aristotelis A.

2014-01-01

The rapid evolution of sequencing technologies, described by the term Next Generation Sequencing (NGS), has revolutionized metagenomic analysis. These technologies combine high-throughput analytical protocols with delicate measuring techniques in order to discover, properly assemble and map allelic sequences to the correct genomes, achieving particularly high yields for only a fraction of the cost of traditional processes (i.e., Sanger). From a bioinformatic perspective, this boils down to many GB of data being generated from each single sequencing experiment, rendering the management, or even the storage, of the data a critical bottleneck with respect to the overall analytical endeavor. The enormous complexity is further aggravated by the versatility of the processing steps available, represented by the numerous bioinformatic tools that are essential for each analytical task in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple, yet non-trivial, quality control of raw data to exceptionally complex protein annotation procedures, requiring a high level of expertise for their proper application or for the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of such scale requires large computational resources, making the use of cloud computing infrastructures the sole realistic solution. In this review article we discuss different integrative bioinformatic solutions that address the aforementioned issues, performing a critical assessment of the available automated pipelines for data management, quality control, and annotation of metagenomic data, embracing various major sequencing technologies and applications. PMID:25478562
De novo transcriptome analysis of an imminent biofuel crop, Camelina sativa L. using Illumina GAIIX sequencing platform and identification of SSR markers.

PubMed

Mudalkar, Shalini; Golla, Ramesh; Ghatty, Sreenivas; Reddy, Attipalli Ramachandra

2014-01-01

Camelina sativa L. is an emerging biofuel crop with potential applications in industry, medicine, cosmetics and human nutrition. The crop is unexploited owing to very limited availability of transcriptome and genomic data. In order to analyse the various metabolic pathways, we performed de novo assembly of the transcriptome on the Illumina GAIIX platform with paired-end sequencing for obtaining short reads. The sequencing output generated a FastQ file size of 2.97 GB with 10.83 million reads having a maximum read length of 101 nucleotides. The number of contigs generated was 53,854 with maximum and minimum lengths of 10,086 and 200 nucleotides, respectively. These transcripts were annotated using BLAST search against the Aracyc, Swiss-Prot, TrEMBL, gene ontology and clusters of orthologous groups (KOG) databases. The genes involved in lipid metabolism were studied and the transcription factors were identified. Sequence similarity studies of Camelina with other related organisms indicated the close relatedness of Camelina to Arabidopsis. In addition, bioinformatics analysis revealed the presence of a total of 19,379 simple sequence repeats. This is the first report on Camelina sativa L. in which the transcriptome of the entire plant, including seedlings, seed, root, leaves and stem, was analysed. Our data establish an excellent resource for gene discovery and provide useful information for functional and comparative genomic studies in this promising biofuel crop.
Transcriptome analysis in Concholepas concholepas (Gastropoda, Muricidae): mining and characterization of new genomic and molecular markers.

PubMed

Cárdenas, Leyla; Sánchez, Roland; Gomez, Daniela; Fuenzalida, Gonzalo; Gallardo-Escárate, Cristián; Tanguy, Arnaud

2011-09-01

The marine gastropod Concholepas concholepas, locally known as the "loco", is the main target species of the benthonic Chilean fisheries. Genetic and genomic tools are necessary to study the genome of this species in order to understand the molecular basis of its development, growth, and other key traits to improve the management strategies and to identify local adaptation to prevent loss of biodiversity. Here, we use pyrosequencing technologies to generate the first transcriptomic database from adult specimens of the loco. After trimming, a total of 140,756 Expressed Sequence Tag sequences were achieved. Clustering and assembly analysis identified 19,219 contigs and 105,435 singleton sequences. BlastN analysis showed a significant identity with Expressed Sequence Tags of different gastropod species available in public databases. Similarly, BlastX results showed that only 895 out of the total 124,654 had significant hits and may represent novel genes for marine gastropods. From this database, simple sequence repeat motifs were also identified and a total of 38 primer pairs were designed and tested to assess their potential as informative markers and to investigate their cross-species amplification in different related gastropod species. This dataset represents the first publicly available 454 data for a marine gastropod endemic to the southeastern Pacific coast, providing a valuable transcriptomic resource for future efforts of gene discovery and development of functional markers in other marine gastropods. Copyright © 2011 Elsevier B.V. All rights reserved.
Evaluation of next generation sequencing for the analysis of Eimeria communities in wildlife.

PubMed

Vermeulen, Elke T; Lott, Matthew J; Eldridge, Mark D B; Power, Michelle L

2016-05-01

Next-generation sequencing (NGS) techniques are well-established for studying bacterial communities but not yet for microbial eukaryotes. Parasite communities remain poorly studied, due in part to the lack of reliable and accessible molecular methods to analyse eukaryotic communities. We aimed to develop and evaluate a methodology to analyse communities of the protozoan parasite Eimeria from populations of the Australian marsupial Petrogale penicillata (brush-tailed rock-wallaby) using NGS. An oocyst purification method for small sample sizes and polymerase chain reaction (PCR) protocol for the 18S rRNA locus targeting Eimeria was developed and optimised prior to sequencing on the Illumina MiSeq platform. A data analysis approach was developed by modifying methods from bacterial metagenomics and utilising existing Eimeria sequences in GenBank. Operational taxonomic unit (OTU) assignment at a high similarity threshold (97%) was more accurate at assigning Eimeria contigs into Eimeria OTUs but at a lower threshold (95%) there was greater resolution between OTU consensus sequences. The assessment of two amplification PCR methods prior to Illumina MiSeq, single and nested PCR, determined that single PCR was more sensitive to Eimeria as more Eimeria OTUs were detected in single amplicons. We have developed a simple and cost-effective approach to a data analysis pipeline for community analysis of eukaryotic organisms using Eimeria communities as a model. The pipeline provides a basis for evaluation using other eukaryotic organisms and potential for diverse community analysis studies. Copyright © 2016 Elsevier B.V. All rights reserved.
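The effect of the OTU similarity threshold mentioned above can be sketched with a toy greedy clustering. This is a simplified stand-in, not the pipeline used in the study: it assumes equal-length, pre-aligned sequences, measures identity as the fraction of matching positions, and assigns each sequence to the first centroid that meets the threshold. All names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-length, pre-aligned sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold):
    """Greedy centroid clustering: each sequence joins the first OTU whose
    centroid it matches at >= threshold identity, else it founds a new OTU."""
    centroids = []
    members = []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                members[i].append(s)
                break
        else:
            centroids.append(s)
            members.append([s])
    return members


if __name__ == "__main__":
    a = "ACGT" * 25               # 100 bp toy sequence
    b = "TGCA" + a[4:]            # differs from a at 4 positions (96% identity)
    for t in (0.97, 0.95):
        print(t, len(greedy_otus([a, b], t)), "OTU(s)")
```

In this toy case the two sequences stay in separate OTUs at 97% and merge at 95%, which shows how the threshold alone changes the number of OTUs reported.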
De novo characterization of fall dormant and nondormant alfalfa (Medicago sativa L.) leaf transcriptome and identification of candidate genes related to fall dormancy.

PubMed

Zhang, Senhao; Shi, Yinghua; Cheng, Ningning; Du, Hongqi; Fan, Wenna; Wang, Chengzhang

2015-01-01

Alfalfa (Medicago sativa L.) is one of the most widely cultivated perennial forage legumes worldwide. Fall dormancy is an adaptive character related to the biomass production and winter survival in alfalfa. The physiological, biochemical and molecular mechanisms causing fall dormancy and the related genes have not been well studied. In this study, we sequenced two standard varieties of alfalfa (dormant and non-dormant) at two time points and generated approximately 160 million high quality paired-end sequence reads using sequencing by synthesis (SBS) technology. The de novo transcriptome assembly generated a set of 192,875 transcripts with an average length of 856 bp representing about 165.1 Mb of the alfalfa leaf transcriptome. After assembly, 111,062 (57.6%) transcripts were annotated against the NCBI non-redundant database. A total of 30,165 (15.6%) transcripts were mapped to 323 Kyoto Encyclopedia of Genes and Genomes pathways. We also identified 41,973 simple sequence repeats, which can be used to generate markers for alfalfa, and 1,541 transcription factors were identified across 1,350 transcripts. Gene expression was compared between dormant and non-dormant alfalfa at different time points, and we identified several differentially expressed genes potentially related to fall dormancy. Gene Ontology and pathway information was also obtained. We sequenced and assembled the leaf transcriptome of alfalfa related to fall dormancy, and also identified some genes of interest involved in the fall dormancy mechanism. Thus, our research focused on studying fall dormancy in alfalfa through transcriptome sequencing. The sequencing and gene expression data generated in this study may be used further to elucidate the complete mechanisms governing fall dormancy in alfalfa.
De Novo Characterization of Fall Dormant and Nondormant Alfalfa (Medicago sativa L.) Leaf Transcriptome and Identification of Candidate Genes Related to Fall Dormancy

PubMed Central

Cheng, Ningning; Du, Hongqi; Fan, Wenna; Wang, Chengzhang

2015-01-01

Alfalfa (Medicago sativa L.) is one of the most widely cultivated perennial forage legumes worldwide. Fall dormancy is an adaptive character related to the biomass production and winter survival in alfalfa. The physiological, biochemical and molecular mechanisms causing fall dormancy and the related genes have not been well studied. In this study, we sequenced two standard varieties of alfalfa (dormant and non-dormant) at two time points and generated approximately 160 million high quality paired-end sequence reads using sequencing by synthesis (SBS) technology. The de novo transcriptome assembly generated a set of 192,875 transcripts with an average length of 856 bp representing about 165.1 Mb of the alfalfa leaf transcriptome. After assembly, 111,062 (57.6%) transcripts were annotated against the NCBI non-redundant database. A total of 30,165 (15.6%) transcripts were mapped to 323 Kyoto Encyclopedia of Genes and Genomes pathways. We also identified 41,973 simple sequence repeats, which can be used to generate markers for alfalfa, and 1,541 transcription factors were identified across 1,350 transcripts. Gene expression was compared between dormant and non-dormant alfalfa at different time points, and we identified several differentially expressed genes potentially related to fall dormancy. Gene Ontology and pathway information was also obtained. We sequenced and assembled the leaf transcriptome of alfalfa related to fall dormancy, and also identified some genes of interest involved in the fall dormancy mechanism. Thus, our research focused on studying fall dormancy in alfalfa through transcriptome sequencing. The sequencing and gene expression data generated in this study may be used further to elucidate the complete mechanisms governing fall dormancy in alfalfa. PMID:25799491
De novo sequencing analysis of the Rosa roxburghii fruit transcriptome reveals putative ascorbate biosynthetic genes and EST-SSR markers.

PubMed

Yan, Xiuqin; Zhang, Xue; Lu, Min; He, Yong; An, Huaming

2015-04-25

Rosa roxburghii Tratt. is a well-known ornamental rose species native to China. In addition, the fruits of this species are valued for their nutritional and medicinal characteristics, especially their high ascorbic acid (AsA) levels. Nevertheless, AsA biosynthesis in R. roxburghii fruit has not been explored in detail because of a lack of genomic resources for this species. High-throughput transcriptomic sequencing generating large volumes of transcript sequence data can aid in gene discovery and molecular marker development. In this study, we generated more than 53 million clean reads using Illumina paired-end sequencing technology. De novo assembly yielded 106,590 unigenes, with an average length of 343 bp. On the basis of sequence similarity to known proteins, 9301 and 2393 unigenes were classified into Gene Ontology and Clusters of Orthologous Group categories, respectively. There were 7480 unigenes assigned to 124 pathways in the Kyoto Encyclopedia of Gene and Genome pathway database. BLASTx searches identified 498 unique putative transcripts encoding various transcription factors, some known to regulate fruit development. qRT-PCR validated the expressions of most of the genes encoding the main enzymes involved in ascorbate biosynthesis. In addition, 9131 potential simple sequence repeat (SSR) loci were identified among the unigenes. One hundred and two primer pairs were synthesized and 71 pairs produced an amplification product during initial screening. Among the amplified products, 30 were polymorphic in the 16 R. roxburghii germplasms tested. Our study was the first to produce a large volume of transcriptome data from R. roxburghii. The resulting sequence collection is a valuable resource for gene discovery and marker-assisted selective breeding in this rose species. Copyright © 2015 Elsevier B.V. All rights reserved.
Integer sequence discovery from small graphs

PubMed Central

Hoppe, Travis; Petrone, Anna

2015-01-01

We have exhaustively enumerated all simple, connected graphs of a finite order and have computed a selection of invariants over this set. Integer sequences were constructed from these invariants and checked against the Online Encyclopedia of Integer Sequences (OEIS). 141 new sequences were added and six sequences were extended. From the graph database, we were able to programmatically suggest relationships among the invariants. It will be shown that we can readily visualize any sequence of graphs that satisfies a given criterion. The code has been released as an open-source framework for further analysis and the database was constructed to be extensible to invariants not considered in this work. PMID:27034526
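A minimal brute-force sketch of this kind of enumeration, for very small orders only, is given here. It is not the authors' released framework: it simply generates every labelled graph on n vertices, keeps the connected ones, and collapses isomorphic copies by taking the smallest relabelled edge list over all vertex permutations. The simplest invariant, the number of graphs per order, already yields an integer sequence (the counts of connected graphs, OEIS A001349).

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search from vertex 0 over an undirected edge list."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def canonical_form(n, edges):
    """Smallest sorted edge tuple over all vertex relabellings (brute force)."""
    best = None
    for p in permutations(range(n)):
        key = tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
        if best is None or key < best:
            best = key
    return best


def count_connected_graphs(n):
    """Number of simple connected graphs on n vertices, up to isomorphism."""
    pairs = list(combinations(range(n), 2))
    classes = set()
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if (mask >> i) & 1]
        if is_connected(n, edges):
            classes.add(canonical_form(n, edges))
    return len(classes)


if __name__ == "__main__":
    print([count_connected_graphs(n) for n in range(1, 6)])  # [1, 1, 2, 6, 21]
```

The factorial cost of the canonical form makes this approach impractical beyond a handful of vertices; it is meant only to show how an invariant over an enumerated graph set becomes an integer sequence.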
A novel diagnostic method for malaria using loop-mediated isothermal amplification (LAMP) and MinION™ nanopore sequencer.

PubMed

Imai, Kazuo; Tarumoto, Norihito; Misawa, Kazuhisa; Runtuwene, Lucky Ronald; Sakai, Jun; Hayashida, Kyoko; Eshita, Yuki; Maeda, Ryuichiro; Tuda, Josef; Murakami, Takashi; Maesaki, Shigefumi; Suzuki, Yutaka; Yamagishi, Junya; Maeda, Takuya

2017-09-13

A simple and accurate molecular diagnostic method for malaria is urgently needed due to the limitations of conventional microscopic examination. In this study, we demonstrate a new diagnostic procedure for human malaria using loop mediated isothermal amplification (LAMP) and the MinION™ nanopore sequencer. We generated specific LAMP primers targeting the 18S-rRNA gene of all five human Plasmodium species including two P. ovale subspecies (P. falciparum, P. vivax, P. ovale wallikeri, P. ovale curtisi, P. knowlesi and P. malariae) and examined human blood samples collected from 63 malaria patients in Indonesia. Additionally, we performed amplicon sequencing of our LAMP products using MinION™ nanopore sequencer to identify each Plasmodium species. Our LAMP method allowed amplification of all targeted 18S-rRNA genes of the reference plasmids with detection limits of 10-100 copies per reaction. Among the 63 clinical samples, 54 and 55 samples were positive by nested PCR and our LAMP method, respectively. Identification of the Plasmodium species by LAMP amplicon sequencing analysis using the MinION™ was consistent with the reference plasmid sequences and the results of nested PCR. Our diagnostic method combined with LAMP and MinION™ could become a simple and accurate tool for the identification of human Plasmodium species, even in resource-limited situations.
Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.).

PubMed

Bushakra, Jill M; Lewers, Kim S; Staton, Margaret E; Zhebentyayeva, Tetyana; Saski, Christopher A

2015-10-26

Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed sequence tags (ESTs) are a source of SSRs that can be used to develop markers to facilitate plant breeding and for more basic research across genera and higher plant orders. Leaf and meristem tissue from 'Heritage' red raspberry (Rubus idaeus) and 'Bristol' black raspberry (R. occidentalis) were utilized for RNA extraction. After conversion to cDNA and library construction, ESTs were sequenced, quality verified, assembled and scanned for SSRs. Primers flanking the SSRs were designed and a subset tested for amplification, polymorphism and transferability across species. ESTs containing SSRs were functionally annotated using the GenBank non-redundant (nr) database and further classified using the gene ontology database. To accelerate development of EST-SSRs in the genus Rubus (Rosaceae), 1149 and 2358 cDNA sequences were generated from red raspberry and black raspberry, respectively. The cDNA sequences were screened using rigorous filtering criteria which resulted in the identification of 121 and 257 SSR loci for red and black raspberry, respectively. Primers were designed from the surrounding sequences resulting in 131 and 288 primer pairs, respectively, as some sequences contained more than one SSR locus. Sequence analysis revealed that the SSR-containing genes span a diversity of functions and share more sequence identity with strawberry genes than with other Rosaceous species. This resource of Rubus-specific, gene-derived markers will facilitate the construction of linkage maps composed of transferable markers for studying and manipulating important traits in this economically important genus.
GASP: Gapped Ancestral Sequence Prediction for proteins

PubMed Central

Edwards, Richard J; Shields, Denis C

2004-01-01

Background: The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results: Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions: GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199

Differential transferability of EST-SSR primers developed from diploid species Pseudoroegneria spicata, Thinopyrum bessarabicum, and Th. elongatum

USDA-ARS's Scientific Manuscript database

Simple sequence repeat technology based on expressed sequence tag (EST-SSR) is a useful genomic tool for genome mapping, characterizing plant species relationships, elucidating genome evolution, and tracing genes on alien chromosome segments. EST-SSR primers developed from three perennial diploid T...
Discrimination of epimeric glycans and glycopeptides using IM-MS and its potential for carbohydrate sequencing

NASA Astrophysics Data System (ADS)

Both, P.; Green, A. P.; Gray, C. J.; Šardzík, R.; Voglmeir, J.; Fontana, C.; Austeri, M.; Rejzek, M.; Richardson, D.; Field, R. A.; Widmalm, G.; Flitsch, S. L.; Eyers, C. E.

2014-01-01

Mass spectrometry is the primary analytical technique used to characterize the complex oligosaccharides that decorate cell surfaces. Monosaccharide building blocks are often simple epimers, which when combined produce diastereomeric glycoconjugates indistinguishable by mass spectrometry. Structure elucidation frequently relies on assumptions that biosynthetic pathways are highly conserved. Here, we show that biosynthetic enzymes can display unexpected promiscuity, with human glycosyltransferase pp-α-GanT2 able to utilize both uridine diphosphate N-acetylglucosamine and uridine diphosphate N-acetylgalactosamine, leading to the synthesis of epimeric glycopeptides in vitro. Ion-mobility mass spectrometry (IM-MS) was used to separate these structures and, significantly, enabled characterization of the attached glycan based on the drift times of the monosaccharide product ions generated following collision-induced dissociation. Finally, ion-mobility mass spectrometry following fragmentation was used to determine the nature of both the reducing and non-reducing glycans of a series of epimeric disaccharides and the branched pentasaccharide Man3 glycan, demonstrating that this technique may prove useful for the sequencing of complex oligosaccharides.
Context influences on TALE–DNA binding revealed by quantitative profiling

PubMed Central

Rogers, Julia M.; Barrera, Luis A.; Reyon, Deepak; Sander, Jeffry D.; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L.

2015-01-01

Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design. PMID:26067805
Context influences on TALE-DNA binding revealed by quantitative profiling.

PubMed

Rogers, Julia M; Barrera, Luis A; Reyon, Deepak; Sander, Jeffry D; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L

2015-06-11

Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE-DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000-20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE-DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.
Sexing the Sciuridae: a simple and accurate set of molecular methods to determine sex in tree squirrels, ground squirrels and marmots.

PubMed

Gorrell, Jamieson C; Boutin, Stan; Raveh, Shirley; Neuhaus, Peter; Côté, Steeve D; Coltman, David W

2012-09-01

We determined the sequence of the male-specific minor histocompatibility complex antigen (Smcy) from the Y chromosome of seven squirrel species (Sciuridae, Rodentia). Based on conserved regions inside the Smcy intron sequence, we designed PCR primers for sex determination in these species that can be co-amplified with nuclear loci as controls. PCR co-amplification yields two products for males and one for females that are easily visualized as bands by agarose gel electrophoresis. Our method provides simple and reliable sex determination across a wide range of squirrel species. © 2012 Blackwell Publishing Ltd.
Plant genotyping using fluorescently tagged inter-simple sequence repeats (ISSRs): basic principles and methodology.

PubMed

Prince, Linda M

2015-01-01

Inter-simple sequence repeat PCR (ISSR-PCR) is a fast, inexpensive genotyping technique based on length variation in the regions between microsatellites. The method requires no species-specific prior knowledge of microsatellite location or composition. Very small amounts of DNA are required, making this method ideal for organisms of conservation concern, or where the quantity of DNA is extremely limited due to organism size. ISSR-PCR can be highly reproducible but requires careful attention to detail. Optimization of DNA extraction, fragment amplification, and normalization of fragment peak heights during fluorescent detection are critical steps to minimizing the downstream time spent verifying and scoring the data.
Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

PubMed

Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

2016-05-23

Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
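The reported gene flow (Nm = 1.4996) and differentiation (GST = 0.2500) are consistent with the commonly used indirect estimator Nm = 0.5(1 - GST)/GST. The abstract does not say which estimator was applied, so the sketch below is an assumption offered only as a plausibility check; the function name is illustrative.

```python
def gene_flow_from_gst(gst: float) -> float:
    """Indirect gene-flow estimate Nm = 0.5 * (1 - GST) / GST (assumed estimator)."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("GST must lie in (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# GST as rounded in the abstract; the published Nm was presumably computed
# from an unrounded GST, which would explain the small difference.
nm = gene_flow_from_gst(0.2500)
print(round(nm, 4))  # 1.5, close to the reported 1.4996
```

Under this estimator, Nm above 1 is conventionally read as enough migration to counteract drift, which matches the abstract's interpretation of high gene flow.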
Characterization of the Kenaf (Hibiscus cannabinus) Global Transcriptome Using Illumina Paired-End Sequencing and Development of EST-SSR Markers

PubMed Central

Li, Hui; Li, Defang; Chen, Anguo; Tang, Huijuan; Li, Jianjun; Huang, Siqi

2016-01-01

Kenaf (Hibiscus cannabinus L.) is an economically important natural fiber crop grown worldwide. However, only 20 expressed sequence tags (ESTs) for kenaf are available in public databases. The aim of this study was to develop large-scale simple sequence repeat (SSR) markers to lay a solid foundation for the construction of genetic linkage maps and marker-assisted breeding in kenaf. We used Illumina paired-end sequencing technology to generate new EST-simple sequences and MISA software to mine SSR markers. We identified 71,318 unigenes with an average length of 1143 nt and annotated these unigenes using four different protein databases. Overall, 9324 complementary pairs were designated as EST-SSR markers, and their quality was validated using 100 randomly selected SSR markers. In total, 72 primer pairs reproducibly amplified target amplicons, and 61 of these primer pairs detected significant polymorphism among 28 kenaf accessions. Thus, in this study, we have developed large-scale SSR markers for kenaf, and this new resource will facilitate construction of genetic linkage maps, investigation of fiber growth and development in kenaf, and also be of value to novel gene discovery and functional genomic studies. PMID:26960153
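SSR mining of the kind described here amounts to scanning each unigene for a short motif repeated in tandem a minimum number of times. The following is a minimal sketch of that idea using regular expressions. It is not MISA itself, and the minimum repeat counts and the function name are illustrative assumptions rather than the settings used in the study.

```python
import re

# Minimum repeat counts per motif length (illustrative assumption,
# not the thresholds used in the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs (0-based start)."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


example = "TTGC" + "AG" * 7 + "CCT" + "AAG" * 5 + "GT"
print(find_ssrs(example))  # [(4, 'AG', 7), (21, 'AAG', 5)]
```

Primer design on the flanking sequence, which the study used to turn mined SSRs into markers, is a separate step not covered by this sketch.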
De novo transcriptomic analysis of cowpea (Vigna unguiculata L. Walp.) for genic SSR marker development.

PubMed

Chen, Honglin; Wang, Lixia; Liu, Xiaoyan; Hu, Liangliang; Wang, Suhua; Cheng, Xuzhen

2017-07-11

Cowpea [Vigna unguiculata (L.) Walp.] is one of the most important legumes in tropical and semi-arid regions. However, there is relatively little genomic information available for genetic research on and breeding of cowpea. The objectives of this study were to analyse the cowpea transcriptome and develop genic molecular markers for future genetic studies of this genus. Approximately 54 million high-quality cDNA sequence reads were obtained from cowpea based on Illumina paired-end sequencing technology and were de novo assembled to generate 47,899 unigenes with an N50 length of 1534 bp. Sequence similarity analysis revealed 36,289 unigenes (75.8%) with significant similarity to known proteins in the non-redundant (Nr) protein database, 23,471 unigenes (49.0%) with BLAST hits in the Swiss-Prot database, and 20,654 unigenes (43.1%) with high similarity in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Further analysis identified 5560 simple sequence repeats (SSRs) as potential genic molecular markers. Validating a random set of 500 SSR markers yielded 54 polymorphic markers among 32 cowpea accessions. This transcriptomic analysis of cowpea provided a valuable set of genomic data for characterizing genes with important agronomic traits in Vigna unguiculata and a new set of genic SSR markers for further genetic studies and breeding in cowpea and related Vigna species.
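The N50 statistic quoted for the assembly can be computed directly from the list of unigene lengths. The sketch below shows the standard definition on a toy list; the function name and example lengths are illustrative and are not data from the study.

```python
def n50(lengths):
    """Length at which the running total, taken from the longest sequence
    downwards, first reaches half of the total assembly size."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example: total = 54, so half = 27; 10 + 9 + 8 = 27 reaches it.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```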
De Novo Transcriptome of the Hemimetabolous German Cockroach (Blattella germanica)

PubMed Central

Zhou, Xiaojie; Qian, Kun; Tong, Ying; Zhu, Junwei Jerry; Qiu, Xinghui; Zeng, Xiaopeng

2014-01-01

Background: The German cockroach, Blattella germanica, is an important insect pest that transmits various pathogens mechanically and causes severe allergic diseases. This insect has long served as a model system for studies of insect biology, physiology and ecology. However, the lack of genome or transcriptome information heavily hinders our further understanding of the German cockroach in every aspect at a molecular level and on a genome-wide scale. To explore the transcriptome and identify unique sequences of interest, we subjected the B. germanica transcriptome to massively parallel pyrosequencing and generated the first reference transcriptome for B. germanica. Methodology/Principal Findings: A total of 1,365,609 raw reads with an average length of 529 bp were generated via pyrosequencing the mixed cDNA library from different life stages of German cockroach including maturing oothecae, nymphs, adult females and males. The raw reads were de novo assembled to 48,800 contigs and 3,961 singletons with high-quality unique sequences. These sequences were annotated and classified functionally in terms of BLAST, GO and KEGG, and the genes putatively coding detoxification enzyme systems, insecticide targets, key components in systematic RNA interference, immunity and chemoreception pathways were identified. A total of 3,601 simple sequence repeat (SSR) loci were also predicted. Conclusions/Significance: The whole transcriptome pyrosequencing data from this study provide a usable genetic resource for future identification of potential functional genes involved in various biological processes. PMID:25265537
A Writing Intervention to Teach Simple Sentences and Descriptive Paragraphs to Adolescents with Writing Difficulties

ERIC Educational Resources Information Center

Datchuk, Shawn M.; Kubina, Richard M., Jr.

2017-01-01

The present study used a multiple-baseline, single-case experimental design to investigate the effects of a multicomponent intervention on construction of simple sentences and word sequences. The intervention entailed sequential delivery of sentence instruction and frequency building to a performance criterion and paragraph instruction.…
Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance

PubMed Central

Plantegenet, Stephanie; Weber, Johann; Goldstein, Darlene R; Zeller, Georg; Nussbaumer, Cindy; Thomas, Jérôme; Weigel, Detlef; Harshman, Keith; Hardtke, Christian S

2009-01-01

In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5′ regulatory sequence variation in the corresponding genes is indeed increased. However, ∼42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL. PMID:19225455
A Simple Sequence Repeat- and Single-Nucleotide Polymorphism-Based Genetic Linkage Map of the Brown Planthopper, Nilaparvata lugens

PubMed Central

Jairin, Jirapong; Kobayashi, Tetsuya; Yamagata, Yoshiyuki; Sanada-Morimura, Sachiyo; Mori, Kazuki; Tashiro, Kosuke; Kuhara, Satoru; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Yamamoto, Kimiko; Matsumura, Masaya; Yasui, Hideshi

2013-01-01

In this study, we developed the first genetic linkage map for the major rice insect pest, the brown planthopper (BPH, Nilaparvata lugens). The linkage map was constructed by integrating linkage data from two backcross populations derived from three inbred BPH strains. The consensus map consists of 474 simple sequence repeats, 43 single-nucleotide polymorphisms, and 1 sequence-tagged site, for a total of 518 markers at 472 unique positions in 17 linkage groups. The linkage groups cover 1093.9 cM, with an average distance of 2.3 cM between loci. The average number of marker loci per linkage group was 27.8. The sex-linkage group was identified by exploiting X-linked and Y-specific markers. Our linkage map and the newly developed markers used to create it constitute an essential resource and a useful framework for future genetic analyses in BPH. PMID:23204257
A Potential Role for Drosophila Mucins in Development and Physiology

PubMed Central

Syed, Zulfeqhar A.; Härd, Torleif; Uv, Anne; van Dijk-Härd, Iris F.

2008-01-01

Vital vertebrate organs are protected from the external environment by a barrier that to a large extent consists of mucins. These proteins are characterized by poorly conserved repeated sequences that are rich in prolines and potentially glycosylated threonines and serines (PTS). We have now used the characteristics of the PTS repeat domain to identify Drosophila mucins in a simple bioinformatics approach. Searching the predicted protein database for proteins with at least 4 repeats and a high ST content, more than 30 mucin-like proteins were identified, ranging from 300–23000 amino acids in length. We find that Drosophila mucins are present at all stages of the fly life cycle, and that their transcripts localize to selective organs analogous to sites of vertebrate mucin expression. The results could allow for addressing basic questions about human mucin-related diseases in this model system. Additionally, many of the mucins are expressed in selective tissues during embryogenesis, thus revealing new potential functions for mucins as apical matrix components during organ morphogenesis. PMID:18725942
De novo Assembly of the Indo-Pacific Humpback Dolphin Leucocyte Transcriptome to Identify Putative Genes Involved in the Aquatic Adaptation and Immune Response

PubMed Central

Xia, Jia; Yang, Lili; Chen, Jialin; Wu, Yuping; Yi, Meisheng

2013-01-01

Background: The Indo-Pacific humpback dolphin (Sousa chinensis), a marine mammal species inhabiting the waters of Southeast Asia, South Africa and Australia, has attracted much attention because of the dramatic decline in population size in the past decades, which raises the concern of extinction. So far, this species is poorly characterized at the molecular level due to the little sequence information available in public databases. Recent advances in large-scale RNA sequencing provide an efficient approach to generate abundant sequences for functional genomic analyses in species with un-sequenced genomes. Principal Findings: We performed a de novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome by Illumina sequencing. 108,751 high quality sequences from 47,840,388 paired-end reads were generated, and 48,868 and 46,587 unigenes were functionally annotated by BLAST search against the NCBI non-redundant and Swiss-Prot protein databases (E-value < 10^-5), respectively. In total, 16,467 unigenes were clustered into 25 functional categories by searching against the COG database, and BLAST2GO search assigned 37,976 unigenes to 61 GO terms. In addition, 36,345 unigenes were grouped into 258 KEGG pathways. We also identified 9,906 simple sequence repeats and 3,681 putative single nucleotide polymorphisms as potential molecular markers in our assembled sequences. A large number of unigenes were predicted to be involved in immune response, and many genes were predicted to be relevant to adaptive evolution and cetacean-specific traits. Conclusion: This study represented the first transcriptome analysis of the Indo-Pacific humpback dolphin, an endangered species. The de novo transcriptome analysis of the unique transcripts will provide valuable sequence information for discovery of new genes, characterization of gene expression, investigation of various pathways and adaptive evolution, as well as identification of genetic markers. PMID:24015242
De novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome to identify putative genes involved in the aquatic adaptation and immune response.

PubMed

Gui, Duan; Jia, Kuntong; Xia, Jia; Yang, Lili; Chen, Jialin; Wu, Yuping; Yi, Meisheng

2013-01-01

The Indo-Pacific humpback dolphin (Sousa chinensis), a marine mammal species inhabiting the waters of Southeast Asia, South Africa and Australia, has attracted much attention because of the dramatic decline in population size in the past decades, which raises the concern of extinction. So far, this species is poorly characterized at the molecular level due to the little sequence information available in public databases. Recent advances in large-scale RNA sequencing provide an efficient approach to generate abundant sequences for functional genomic analyses in species with un-sequenced genomes. We performed a de novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome by Illumina sequencing. 108,751 high quality sequences from 47,840,388 paired-end reads were generated, and 48,868 and 46,587 unigenes were functionally annotated by BLAST search against the NCBI non-redundant and Swiss-Prot protein databases (E-value < 10^-5), respectively. In total, 16,467 unigenes were clustered into 25 functional categories by searching against the COG database, and BLAST2GO search assigned 37,976 unigenes to 61 GO terms. In addition, 36,345 unigenes were grouped into 258 KEGG pathways. We also identified 9,906 simple sequence repeats and 3,681 putative single nucleotide polymorphisms as potential molecular markers in our assembled sequences. A large number of unigenes were predicted to be involved in immune response, and many genes were predicted to be relevant to adaptive evolution and cetacean-specific traits. This study represented the first transcriptome analysis of the Indo-Pacific humpback dolphin, an endangered species. The de novo transcriptome analysis of the unique transcripts will provide valuable sequence information for discovery of new genes, characterization of gene expression, investigation of various pathways and adaptive evolution, as well as identification of genetic markers.
A simple method for semi-random DNA amplicon fragmentation using the methylation-dependent restriction enzyme MspJI.

PubMed

Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W

2015-04-11

Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments could be controlled through optimisation of the 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to the Covaris physical shearing procedure, due to efficiencies of cost and labour, the method may be more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
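Because the CNNR motif is so short and degenerate, candidate MspJI sites are dense in almost any amplicon, which is why the fragmentation is described as relatively sequence-independent. The sketch below lists candidate motif positions on both strands of a sequence. It only locates the motif as stated in the abstract; it does not model which cytosines actually carry the methyl group or where the enzyme cuts, and the function names are illustrative.

```python
import re

# CNNR with R = G or A; the lookahead lets overlapping sites be reported.
CNNR = re.compile(r"(?=C[ACGT]{2}[AG])")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def cnnr_sites(seq: str):
    """0-based positions of the C in every CNNR motif on the given strand."""
    return [m.start() for m in CNNR.finditer(seq.upper())]


def cnnr_sites_both_strands(seq: str):
    """Candidate sites on the given strand and on its reverse complement.

    Positions for the reverse strand are in reverse-complement coordinates.
    """
    revcomp = seq.upper().translate(COMPLEMENT)[::-1]
    return {"forward": cnnr_sites(seq), "reverse": cnnr_sites(revcomp)}


print(cnnr_sites_both_strands("ACCTGATCAAG"))
# {'forward': [1, 2, 7], 'reverse': [0, 6]}
```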
Adenine specific DNA chemical sequencing reaction.

PubMed Central

Iverson, B L; Dervan, P B

1987-01-01

Reaction of DNA with K2PdCl4 at pH 2.0 followed by a piperidine workup produces specific cleavage at adenine (A) residues. Product analysis revealed the K2PdCl4 reaction involves selective depurination at adenine, affording an excision reaction analogous to the other chemical DNA sequencing reactions. Adenine residues methylated at the exocyclic amine (N6) react with lower efficiency than unmethylated adenine in an identical sequence. This simple protocol specific for A may be a useful addition to current chemical sequencing reactions. PMID:3671067
The draft genome of the pest tephritid fruit fly Bactrocera tryoni: resources for the genomic analysis of hybridising species.

PubMed

Gilchrist, Anthony Stuart; Shearman, Deborah C A; Frommer, Marianne; Raphael, Kathryn A; Deshpande, Nandan P; Wilkins, Marc R; Sherwin, William B; Sved, John A

2014-12-20

The tephritid fruit flies include a number of economically important pests of horticulture, with a large accumulated body of research on their biology and control. Amongst the Tephritidae, the genus Bactrocera, containing over 400 species, presents various species groups of potential utility for genetic studies of speciation, behaviour or pest control. In Australia, there exists a triad of closely-related, sympatric Bactrocera species which do not mate in the wild but which, despite distinct morphologies and behaviours, can be force-mated in the laboratory to produce fertile hybrid offspring. To exploit the opportunities offered by genomics, such as the efficient identification of genetic loci central to pest behaviour and to the earliest stages of speciation, investigators require genomic resources for future investigations. We produced a draft de novo genome assembly of Australia's major tephritid pest species, Bactrocera tryoni. The male genome (650-700 Mbp) includes approximately 150 Mb of interspersed repetitive DNA sequences and 60 Mb of satellite DNA. Assessment using conserved core eukaryotic sequences indicated 98% completeness. Over 16,000 MAKER-derived gene models showed a large degree of overlap with other Dipteran reference genomes. The sequence of the ribosomal RNA transcribed unit was also determined. Unscaffolded assemblies of B. neohumeralis and B. jarvisi were then produced; comparison with B. tryoni showed that the species are more closely related than any Drosophila species pair. The similarity of the genomes was exploited to identify 4924 potentially diagnostic indels between the species, all of which occur in non-coding regions. This first draft B. tryoni genome resembles other dipteran genomes in terms of size and putative coding sequences. 
For all three species included in this study, we have identified a comprehensive set of non-redundant repetitive sequences, including the ribosomal RNA unit, and have quantified the major satellite DNA families. These genetic resources will facilitate the further investigations of genetic mechanisms responsible for the behavioural and morphological differences between these three species and other tephritids. We have also shown how whole genome sequence data can be used to generate simple diagnostic tests between very closely-related species where only one of the species is scaffolded.
Yeast and the AIDS Virus: The Odd Couple

PubMed Central

Andréola, Marie-Line; Litvak, Simon

2012-01-01

Despite being simple eukaryotic organisms, the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe have been widely used as a model to study human pathologies and the replication of human, animal, and plant viruses, as well as the function of individual viral proteins. The complete genome of S. cerevisiae was the first of eukaryotic origin to be sequenced and contains about 6,000 genes. More than 75% of the genes have an assigned function, while more than 40% share conserved sequences with known or predicted human genes. This strong homology has allowed the function of human orthologs to be unveiled starting from the data obtained in yeast. RNA plant viruses were the first to be studied in yeast. In this paper, we focus on the use of the yeast model to study the function of the proteins of human immunodeficiency virus type 1 (HIV-1) and the search for its cellular partners. This human retrovirus is the cause of AIDS. The WHO estimates that there are 33.4 million people worldwide living with HIV/AIDS, with 2.7 million new HIV infections per year and 2.0 million annual deaths due to AIDS. Current therapy is able to control the disease but there is no permanent cure or a vaccine. By using yeast, it is possible to dissect the function of some HIV-1 proteins and discover new cellular factors common to this simple cell and humans that may become potential therapeutic targets, leading to a long-lasting treatment for AIDS. PMID:22778552

High angular resolution diffusion imaging with stimulated echoes: compensation and correction in experiment design and analysis.

PubMed

Lundell, Henrik; Alexander, Daniel C; Dyrby, Tim B

2014-08-01

Stimulated echo acquisition mode (STEAM) diffusion MRI can be advantageous over pulsed-gradient spin-echo (PGSE) for diffusion times that are long compared with T2. It therefore has potential for biomedical diffusion imaging applications at 7T and above where T2 is short. However, gradient pulses other than the diffusion gradients in the STEAM sequence contribute much greater diffusion weighting than in PGSE and lead to a disrupted experimental design. Here, we introduce a simple compensation to the STEAM acquisition that avoids the orientational bias and disrupted experiment design that these gradient pulses can otherwise produce. The compensation is simple to implement by adjusting the gradient vectors in the diffusion pulses of the STEAM sequence, so that the net effective gradient vector including contributions from diffusion and other gradient pulses is as the experiment intends. High angular resolution diffusion imaging (HARDI) data were acquired with and without the proposed compensation. The data were processed to derive standard diffusion tensor imaging (DTI) maps, which highlight the need for the compensation. Ignoring the other gradient pulses, a bias in DTI parameters from STEAM acquisition is found, due both to confounds in the analysis and the experiment design. Retrospectively correcting the analysis with a calculation of the full B matrix can partly correct for these confounds, but an acquisition that is compensated as proposed is needed to remove the effect entirely. © 2014 The Authors. NMR in Biomedicine published by John Wiley & Sons, Ltd.
New Concepts of Fluorescent Probes for Specific Detection of DNA Sequences: Bis-Modified Oligonucleotides in Excimer and Exciplex Detection

PubMed Central

Gbaj, A; Bichenkova, EV; Walsh, L; Savage, HE; Sardarian, AR; Etchells, LL; Gulati, A; Hawisa, S; Douglas, KT

2009-01-01

The detection of single base mismatches in DNA is important for diagnostics, treatment of genetic diseases, and identification of single nucleotide polymorphisms. Highly sensitive, specific assays are needed to investigate genetic samples from patients. The use of a simple fluorescent nucleoside analogue in detection of DNA sequence and point mutations by hybridisation in solution is described in this study. The 5′-bispyrene and 3′-naphthalene oligonucleotide probes form an exciplex on hybridisation to target in water and the 5′-bispyrene oligonucleotide alone is an adequate probe to determine the concentration of target present. The results also indicated that this system has the potential to identify mismatches and insertions. The aim of this work was to investigate experimental structures and conditions that permit strong exciplex emission for nucleic acid detectors, and show how such exciplexes can register the presence of mismatches as required in SNP analysis. This study revealed that the hybridisation of 5′-bispyrenyl fluorophore to a DNA target results in formation of a fluorescent probe with high signal intensity change and specificity for detecting a complementary target in a homogeneous system. Detection of SNP mutations using this split-probe system is a highly specific, simple, and accessible method to meet the rigorous requirements of pharmacogenomic studies. Thus, the system can act as an SNP detector and shows promise for future applications in genetic testing. PMID:21483539
Ultrasensitive signal-on DNA biosensor based on nicking endonuclease assisted electrochemistry signal amplification.

PubMed

Liu, Zhongyuan; Zhang, Wei; Zhu, Shuyun; Zhang, Ling; Hu, Lianzhe; Parveen, Saima; Xu, Guobao

2011-11-15

Combining the advantages of signal-on strategy and nicking endonuclease assisted electrochemistry signal amplification (NEAESA), a new sensitive and signal-on electrochemical DNA biosensor for the sequence specific DNA detection based on NEAESA has been developed for the first time. A hairpin-shaped probe (HP), containing the target DNA recognition sequence, is thiol-modified at the 5' end and immobilized on a gold electrode via Au-S bonding. Subsequently, the HP modified electrode is hybridized with target DNA to form a duplex. Then the nicking endonuclease is added and nicks the HP strand in the duplex. After nicking, 3'-ferrocene (Fc)-labeled part complementary probe (Fc-PCP) is introduced on the electrode surface by hybridizing with the thiol-modified HP fragment, which results in the generation of an electrochemical signal. Hence, the DNA biosensor is constructed successfully. The present DNA biosensor shows a wide linear range of 5.0×10^-13 to 5.0×10^-8 M for detecting target DNA, with a low detection limit of 0.167 pM. The proposed strategy does not require any amplifying labels (enzymes, DNAzymes, nanoparticles, etc.) for biorecognition events, which avoids the frequent occurrence of false-positive results. Moreover, the strategy has the benefits of simple preparation, convenient operation, good selectivity, and high sensitivity. With the advantages mentioned above, this simple and sensitive strategy has the potential to be integrated in portable, low cost and simplified devices for diagnostic applications. Copyright © 2011 Elsevier B.V. All rights reserved.
A population study of the minicircles in Trypanosoma cruzi: predicting guide RNAs in the absence of empirical RNA editing.

PubMed

Thomas, Sean; Martinez, L L Isadora Trejo; Westenberger, Scott J; Sturm, Nancy R

2007-05-24

The structurally complex network of minicircles and maxicircles comprising the mitochondrial DNA of kinetoplastids mirrors the complexity of the RNA editing process that is required for faithful expression of encrypted maxicircle genes. Although a few of the guide RNAs that direct this editing process have been discovered on maxicircles, guide RNAs are mostly found on the minicircles. The nuclear and maxicircle genomes have been sequenced and assembled for Trypanosoma cruzi, the causative agent of Chagas disease, however the complement of 1.4-kb minicircles, carrying four guide RNA genes per molecule in this parasite, has been less thoroughly characterised. Fifty-four CL Brener and 53 Esmeraldo strain minicircle sequence reads were extracted from T. cruzi whole genome shotgun sequencing data. With these sequences and all published T. cruzi minicircle sequences, 108 unique guide RNAs from all known T. cruzi minicircle sequences and two guide RNAs from the CL Brener maxicircle were predicted using a local alignment algorithm and mapped onto predicted or experimentally determined sequences of edited maxicircle open reading frames. For half of the sequences no statistically significant guide RNA could be assigned. Likely positions of these unidentified gRNAs in T. cruzi minicircle sequences are estimated using a simple Hidden Markov Model. With the local alignment predictions as a standard, the HMM had an ~85% chance of correctly identifying at least 20 nucleotides of guide RNA from a given minicircle sequence. Inter-minicircle recombination was documented. Variable regions contain species-specific areas of distinct nucleotide preference. Two maxicircle guide RNA genes were found. The identification of new minicircle sequences and the further characterization of all published minicircles are presented, including the first observation of recombination between minicircles. 
Extrapolation suggests a level of 4% recombinants in the population, supporting a relatively high recombination rate that may serve to minimize the persistence of gRNA pseudogenes. Characteristic nucleotide preferences observed within variable regions provide potential clues regarding the transcription and maturation of T. cruzi guide RNAs. Based on these preferences, a method of predicting T. cruzi guide RNAs using only primary minicircle sequence data was created.
Development of polymorphic genic-SSR markers by cDNA library sequencing in boxwood, Buxus spp. (Buxaceae)

USDA-ARS's Scientific Manuscript database

Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...
Loblolly pine SSR markers for shortleaf pine genetics

Treesearch

C. Dana Nelson; Sedley Josserand; Craig S. Echt; Jeff Koppelman

2007-01-01

Simple sequence repeats (SSR) are highly informative DNA-based markers widely used in population genetic and linkage mapping studies. We have been developing PCR primer pairs for amplifying SSR markers for loblolly pine (Pinus taeda L.) using loblolly pine DNA and EST sequence data as starting materials. Fifty primer pairs known to reliably amplify...
Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

PubMed

Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that the 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity between cannabis from Northern China and the European group was higher than that between cannabis from Northern China and the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.
Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

PubMed Central

Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that the 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity between cannabis from Northern China and the European group was higher than that between cannabis from Northern China and the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV)

PubMed Central

Martin, Andrew C. R.

2014-01-01

The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and ’dotifying’ repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/. PMID:25653836
Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV).

PubMed

Martin, Andrew C R

2014-01-01

The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and 'dotifying' repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/.
Brassica ASTRA: an integrated database for Brassica genomic research.

PubMed

Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David

2005-01-01

Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
Discovery and mapping of a new expressed sequence tag-single nucleotide polymorphism and simple sequence repeat panel for large-scale genetic studies and breeding of Theobroma cacao L.

PubMed Central

Allegre, Mathilde; Argout, Xavier; Boccara, Michel; Fouet, Olivier; Roguet, Yolande; Bérard, Aurélie; Thévenin, Jean Marc; Chauveau, Aurélie; Rivallan, Ronan; Clement, Didier; Courtois, Brigitte; Gramacho, Karina; Boland-Augé, Anne; Tahi, Mathias; Umaharan, Pathmanathan; Brunel, Dominique; Lanaud, Claire

2012-01-01

Theobroma cacao is an economically important tree of several tropical countries. Its genetic improvement is essential to provide protection against major diseases and improve chocolate quality. We discovered and mapped new expressed sequence tag-single nucleotide polymorphism (EST-SNP) and simple sequence repeat (SSR) markers and constructed a high-density genetic map. By screening 149 650 ESTs, 5246 SNPs were detected in silico, of which 1536 corresponded to genes with a putative function, while 851 had a clear polymorphic pattern across a collection of genetic resources. In addition, 409 new SSR markers were detected on the Criollo genome. Lastly, 681 new EST-SNPs and 163 new SSRs were added to the pre-existing 418 co-dominant markers to construct a large consensus genetic map. This high-density map and the set of new genetic markers identified in this study are a milestone in cocoa genomics and for marker-assisted breeding. The data are available at http://tropgenedb.cirad.fr. PMID:22210604
The Flushtration Count Illusion: Attribute substitution tricks our interpretation of a simple visual event sequence.

PubMed

Thomas, Cyril; Didierjean, André; Kuhn, Gustav

2018-04-17

When faced with a difficult question, people sometimes work out an answer to a related, easier question without realizing that a substitution has taken place (e.g., Kahneman, 2011, Thinking, fast and slow. New York, Farrar, Straus and Giroux). In two experiments, we investigated whether this attribute substitution effect can also affect the interpretation of a simple visual event sequence. We used a magic trick called the 'Flushtration Count Illusion', which involves a technique used by magicians to give the illusion of having seen multiple cards with identical backs, when in fact only the back of one card (the bottom card) is repeatedly shown. In Experiment 1, we demonstrated that most participants are susceptible to the illusion, even if they have the visual and analytical reasoning capacity to correctly process the sequence. In Experiment 2, we demonstrated that participants construct a biased and simplified representation of the Flushtration Count by substituting some attributes of the event sequence. We discuss the psychological processes underlying this attribute substitution effect. © 2018 The British Psychological Society.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats

PubMed Central

Anwar, Tamanna; Khan, Asad U

2006-01-01

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found as longer tracts (> 6 repeating units). Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to determine the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search for repeats of any length and produce output with their exact positions on the chromosome and their frequency of occurrence in the sequence. Availability: This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com PMID:17597863
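The kind of scan described in this record can be illustrated with a short Python sketch. It is not SSRscanner's Perl code, which the abstract does not show; the function name `find_ssrs` and the thresholds `max_unit` and `min_repeats` are assumptions made for the example. It reports the 1-based start position, the motif and the repeat count of each perfect tandem repeat.

```python
import re

def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Return sorted (start, motif, repeats) tuples for perfect tandem repeats.

    start is 1-based. max_unit and min_repeats are illustrative defaults,
    not values taken from SSRscanner.
    """
    seq = seq.upper()
    hits = []
    for unit in range(1, max_unit + 1):
        # A motif of `unit` bases followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (e.g. "AGAG" is reported as the dinucleotide "AG" instead).
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((match.start() + 1, motif, len(match.group(0)) // unit))
    return sorted(hits)

example = "TTAGAGAGAGCCAAGAAGAAGT"
for start, motif, repeats in find_ssrs(example):
    print(start, motif, repeats)
```

For the example string this sketch finds an (AG)4 tract starting at position 3 and an (AAG)3 tract starting at position 13. Because matches are taken left to right without overlap, a real tool would need extra care for overlapping or imperfect repeats.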
A model for genesis of transcription systems.

PubMed

Burton, Zachary F; Opron, Kristopher; Wei, Guowei; Geiger, James H

2016-01-01

Repeating sequences generated from RNA gene fusions/ligations dominate ancient life, indicating central importance of building structural complexity in evolving biological systems. A simple and coherent story of life on earth is told from tracking repeating motifs that generate α/β proteins, 2-double-Ψ-β-barrel (DPBB) type RNA polymerases (RNAPs), general transcription factors (GTFs), and promoters. A general rule that emerges is that biological complexity that arises through generation of repeats is often bounded by solubility and closure (i.e., to form a pseudo-dimer or a barrel). Because the first DNA genomes were replicated by DNA template-dependent RNA synthesis followed by RNA template-dependent DNA synthesis via reverse transcriptase, the first DNA replication origins were initially 2-DPBB type RNAP promoters. A simplifying model for evolution of promoters/replication origins via repetition of core promoter elements is proposed. The model can explain why Pribnow boxes in bacterial transcription (i.e., (-12)TATAATG(-6)) so closely resemble TATA boxes (i.e., (-31)TATAAAAG(-24)) in archaeal/eukaryotic transcription. The evolution of anchor DNA sequences in bacterial (i.e., (-35)TTGACA(-30)) and archaeal (BRE(up); BRE for TFB recognition element) promoters is potentially explained. The evolution of BRE(down) elements of archaeal promoters is potentially explained.
Biological removal of NOx from flue gas.

PubMed

Kumaraswamy, R; Muyzer, G; Kuenen, J G; Loosdrecht, M C M

2004-01-01

BioDeNOx is a novel integrated physico-chemical and biological process for the removal of nitrogen oxides (NOx) from flue gas. Because flue gas is hot, the process is performed at a temperature of 50-55 degrees C. Flue gas containing CO2, O2, SO2 and NOx is purged through a liquid containing Fe(II)EDTA2-. The Fe(II)EDTA2- complex effectively binds the NOx; the bound NOx is converted into N2 in a complex reaction sequence. This paper gives an overview of the potential microbial reactions in the BioDeNOx process. Although the process looks simple, the large number of potential parallel reactions and serial microbial conversions makes it much more complex. There is a need for a detailed investigation in order to properly understand and optimise the process.
New milk protein-derived peptides with potential antimicrobial activity: an approach based on bioinformatic studies.

PubMed

Dziuba, Bartłomiej; Dziuba, Marta

2014-08-20

New peptides with potential antimicrobial activity, encrypted in milk protein sequences, were searched for with the use of bioinformatic tools. The major milk proteins were hydrolyzed in silico by 28 enzymes. The obtained peptides were characterized by the following parameters: molecular weight, isoelectric point, composition and number of amino acid residues, net charge at pH 7.0, aliphatic index, instability index, Boman index, and GRAVY index, and compared with those calculated for 416 known antimicrobial peptides including 59 antimicrobial peptides (AMPs) from milk proteins listed in the BIOPEP database. A simple analysis of physico-chemical properties and the values of biological activity indicators were insufficient to select potentially antimicrobial peptides released in silico from milk proteins by proteolytic enzymes. The final selection was made based on the results of multidimensional statistical analysis such as support vector machines (SVM), random forest (RF), artificial neural networks (ANN) and discriminant analysis (DA) available in the Collection of Anti-Microbial Peptides (CAMP database). Eleven new peptides with potential antimicrobial activity were selected from all peptides released during in silico proteolysis of milk proteins.
New Milk Protein-Derived Peptides with Potential Antimicrobial Activity: An Approach Based on Bioinformatic Studies

PubMed Central

Dziuba, Bartłomiej; Dziuba, Marta

2014-01-01

New peptides with potential antimicrobial activity, encrypted in milk protein sequences, were searched for with the use of bioinformatic tools. The major milk proteins were hydrolyzed in silico by 28 enzymes. The obtained peptides were characterized by the following parameters: molecular weight, isoelectric point, composition and number of amino acid residues, net charge at pH 7.0, aliphatic index, instability index, Boman index, and GRAVY index, and compared with those calculated for 416 known antimicrobial peptides including 59 antimicrobial peptides (AMPs) from milk proteins listed in the BIOPEP database. A simple analysis of physico-chemical properties and the values of biological activity indicators were insufficient to select potentially antimicrobial peptides released in silico from milk proteins by proteolytic enzymes. The final selection was made based on the results of multidimensional statistical analysis such as support vector machines (SVM), random forest (RF), artificial neural networks (ANN) and discriminant analysis (DA) available in the Collection of Anti-Microbial Peptides (CAMP database). Eleven new peptides with potential antimicrobial activity were selected from all peptides released during in silico proteolysis of milk proteins. PMID:25141106
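Two of the descriptors named in this record can be computed directly from a peptide sequence. The Python sketch below assumes the standard Kyte-Doolittle hydropathy scale for the GRAVY index and Ikai's formula for the aliphatic index. The abstract does not say how the authors' tools computed these values, so the function names and the example peptide are illustrative only.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(peptide):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)

def aliphatic_index(peptide):
    """Ikai's aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    n = len(peptide)

    def mole_pct(aa):
        return 100.0 * peptide.count(aa) / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))

peptide = "AVIL"  # hypothetical peptide, chosen only to make the arithmetic easy
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 1))
```

For the hypothetical peptide AVIL, the GRAVY index is (1.8 + 4.2 + 4.5 + 3.8) / 4 = 3.575 and the aliphatic index is 25 + 2.9 × 25 + 3.9 × 50 = 292.5.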
Probing the Boundaries of Orthology: The Unanticipated Rapid Evolution of Drosophila centrosomin

PubMed Central

Eisman, Robert C.; Kaufman, Thomas C.

2013-01-01

The rapid evolution of essential developmental genes and their protein products is both intriguing and problematic. The rapid evolution of gene products with simple protein folds and a lack of well-characterized functional domains typically result in a low discovery rate of orthologous genes. Additionally, in the absence of orthologs it is difficult to study the processes and mechanisms underlying rapid evolution. In this study, we have investigated the rapid evolution of centrosomin (cnn), an essential gene encoding centrosomal protein isoforms required during syncytial development in Drosophila melanogaster. Until recently the rapid divergence of cnn made identification of orthologs difficult and questionable because Cnn violates many of the assumptions underlying models for protein evolution. To overcome these limitations, we have identified a group of insect orthologs and present conserved features likely to be required for the functions attributed to cnn in D. melanogaster. We also show that the rapid divergence of Cnn isoforms is apparently due to frequent coding sequence indels and an accelerated rate of intronic additions and eliminations. These changes appear to be buffered by multi-exon and multi-reading frame maximum potential ORFs, simple protein folds, and the splicing machinery. These buffering features also occur in other genes in Drosophila and may help prevent potentially deleterious mutations due to indels in genes with large coding exons and exon-dense regions separated by small introns. This work promises to be useful for future investigations of cnn and potentially other rapidly evolving genes and proteins. PMID:23749319
Rapid, simple and direct detection of Meloidogyne hapla from infected root galls using loop-mediated isothermal amplification combined with FTA technology

PubMed Central

Peng, Huan; Long, Haibo; Huang, Wenkun; Liu, Jing; Cui, Jiangkuan; Kong, Lingan; Hu, Xianqi; Gu, Jianfeng; Peng, Deliang

2017-01-01

The northern root-knot nematode (Meloidogyne hapla) is a damaging nematode that has caused serious economic losses worldwide. In the present study, a sensitive, simple and rapid method was developed for detection of M. hapla in infested plant roots by combining a Flinders Technology Associates (FTA) card with loop-mediated isothermal amplification (LAMP). The specific primers of LAMP were designed based on the distinction of internal transcribed spacer (ITS) sequences between M. hapla and other Meloidogyne spp. The LAMP assay can detect nematode genomic DNA at concentrations as low as 1/200 000, which is 100 times more sensitive than conventional PCR. The LAMP assay distinguished M. hapla from other closely related nematode species with high specificity. Furthermore, the advantages of the FTA-LAMP assay to detect M. hapla were demonstrated by assaying infected root galls that were artificially inoculated. In addition, M. hapla was successfully detected from six of forty-two field samples using FTA-LAMP technology. This study was the first to provide a simple diagnostic assay for M. hapla using the LAMP assay combined with FTA technology. In conclusion, the new FTA-LAMP assay has the potential for diagnosing infestation in the field and managing the pathogen M. hapla. PMID:28368036

Rapid, simple and direct detection of Meloidogyne hapla from infected root galls using loop-mediated isothermal amplification combined with FTA technology.

PubMed

Peng, Huan; Long, Haibo; Huang, Wenkun; Liu, Jing; Cui, Jiangkuan; Kong, Lingan; Hu, Xianqi; Gu, Jianfeng; Peng, Deliang

2017-04-03

The northern root-knot nematode (Meloidogyne hapla) is a damaging nematode that has caused serious economic losses worldwide. In the present study, a sensitive, simple and rapid method was developed for detection of M. hapla in infested plant roots by combining a Flinders Technology Associates (FTA) card with loop-mediated isothermal amplification (LAMP). The specific primers of LAMP were designed based on the distinction of internal transcribed spacer (ITS) sequences between M. hapla and other Meloidogyne spp. The LAMP assay can detect nematode genomic DNA at concentrations as low as 1/200 000, which is 100 times more sensitive than conventional PCR. The LAMP assay distinguished M. hapla from other closely related nematode species with high specificity. Furthermore, the advantages of the FTA-LAMP assay to detect M. hapla were demonstrated by assaying infected root galls that were artificially inoculated. In addition, M. hapla was successfully detected from six of forty-two field samples using FTA-LAMP technology. This study was the first to provide a simple diagnostic assay for M. hapla using the LAMP assay combined with FTA technology. In conclusion, the new FTA-LAMP assay has the potential for diagnosing infestation in the field and managing the pathogen M. hapla.
Asymptotic convertibility of entanglement: An information-spectrum approach to entanglement concentration and dilution

NASA Astrophysics Data System (ADS)

Jiao, Yong; Wakakuwa, Eyuri; Ogawa, Tomohiro

2018-02-01

We consider asymptotic convertibility of an arbitrary sequence of bipartite pure states into another by local operations and classical communication (LOCC). We adopt an information-spectrum approach to address cases where each element of the sequences is not necessarily a tensor power of a bipartite pure state. We derive necessary and sufficient conditions for the LOCC convertibility of one sequence to another in terms of spectral entropy rates of entanglement of the sequences. Based on these results, we also provide simple proofs for previously known results on the optimal rates of entanglement concentration and dilution of general sequences of bipartite pure states.
BioWord: A sequence manipulation suite for Microsoft Word

PubMed Central

2012-01-01

Background: The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. Results: BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. Conclusions: BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms. PMID:22676326
BioWord: a sequence manipulation suite for Microsoft Word.

PubMed

Anzaldi, Laura J; Muñoz-Fernández, Daniel; Erill, Ivan

2012-06-07

The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms.
Modeling How, When, and What Is Learned in a Simple Fault-Finding Task

ERIC Educational Resources Information Center

Ritter, Frank E.; Bibby, Peter A.

2008-01-01

We have developed a process model that learns in multiple ways while finding faults in a simple control panel device. The model predicts human participants' learning through its own learning. The model's performance was systematically compared to human learning data, including the time course and specific sequence of learned behaviors. These…
One Stroke at a Time

ERIC Educational Resources Information Center

Hollibaugh, Molly

2012-01-01

At first glance, a Zentangle creation can seem intricate and complicated. But, when you learn how it is done, you realize how simple it is. Zentangles are patterns, or "tangles," that have been reduced to a simple sequence of elemental strokes. When you learn to focus on each stroke you find yourself capable of things that you may have once…
An annotated genetic map of loblolly pine based on microsatellite and cDNA markers

USDA-ARS's Scientific Manuscript database

Previous loblolly pine (Pinus taeda L.) genetic linkage maps have been based on a variety of DNA polymorphisms, such as AFLPs, RAPDs, RFLPs, and ESTPs, but only a few SSRs (simple sequence repeats), also known as simple tandem repeats or microsatellites, have been mapped in P. taeda. The objective o...
"Sequence space soup" of proteins and copolymers

NASA Astrophysics Data System (ADS)

Chan, Hue Sun; Dill, Ken A.

1991-09-01

To study the protein folding problem, we use exhaustive computer enumeration to explore "sequence space soup," an imaginary solution containing the "native" conformations (i.e., of lowest free energy) under folding conditions, of every possible copolymer sequence. The model is of short self-avoiding chains of hydrophobic (H) and polar (P) monomers configured on the two-dimensional square lattice. By exhaustive enumeration, we identify all native structures for every possible sequence. We find that random sequences of H/P copolymers will bear striking resemblance to known proteins: Most sequences under folding conditions will be approximately as compact as known proteins, will have considerable amounts of secondary structure, and it is most probable that an arbitrary sequence will fold to a number of lowest free energy conformations that is of order one. In these respects, this simple model shows that proteinlike behavior should arise simply in copolymers in which one monomer type is highly solvent averse. It suggests that the structures and uniquenesses of native proteins are not consequences of having 20 different monomer types, or of unique properties of amino acid monomers with regard to special packing or interactions, and thus that simple copolymers might be designable to collapse to proteinlike structures and properties. A good strategy for designing a sequence to have a minimum possible number of native states is to strategically insert many P monomers. Thus known proteins may be marginally stable due to a balance: More H residues stabilize the desired native state, but more P residues prevent simultaneous stabilization of undesired native states.
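The exhaustive enumeration described in this record can be sketched in Python for very short chains. This is not the authors' code. It assumes the common HP convention that energy is minus the number of H-H contacts between monomers that are lattice neighbours but not chain neighbours, and the function name `native_states` is hypothetical. Rotations and mirror images are removed by fixing the first step and the direction of the first turn.

```python
def native_states(seq):
    """Enumerate all self-avoiding conformations of an HP chain on the 2D
    square lattice; return (lowest_energy, list_of_native_conformations)."""
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two monomers")
    best_energy = 0
    natives = []

    def energy(path):
        index_at = {site: i for i, site in enumerate(path)}
        contacts = 0
        for i, (x, y) in enumerate(path):
            if seq[i] != "H":
                continue
            # Look only right and up so each neighbouring pair is counted once.
            for site in ((x + 1, y), (x, y + 1)):
                j = index_at.get(site)
                if j is not None and seq[j] == "H" and abs(i - j) > 1:
                    contacts += 1
        return -contacts

    def extend(path, turned):
        nonlocal best_energy, natives
        if len(path) == n:
            e = energy(path)
            if e < best_energy:
                best_energy, natives = e, [path]
            elif e == best_energy:
                natives.append(path)
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            if not turned and dy == -1:
                continue  # the first turn must go up: discards mirror images
            site = (x + dx, y + dy)
            if site not in path:
                extend(path + [site], turned or dy != 0)

    extend([(0, 0), (1, 0)], False)  # first step fixed: discards rotations
    return best_energy, natives

energy_min, conformations = native_states("HPPH")
print(energy_min, len(conformations))
```

For the four-monomer chain HPPH, this sketch finds a single lowest-energy conformation, the square in which the two terminal H monomers touch. The all-P chain PPPP has no contacts to gain, so all five symmetry-distinct conformations are tied at energy 0. The running time grows exponentially with chain length, so the sketch is practical only for short chains.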
Masking as an effective quality control method for next-generation sequencing data analysis.

PubMed

Yun, Sajung; Yun, Sijung

2014-12-13

Next generation sequencing produces base calls with low quality scores that can affect the accuracy of identifying simple nucleotide variation calls, including single nucleotide polymorphisms and small insertions and deletions. Here we compare the effectiveness of two data preprocessing methods, masking and trimming, and the accuracy of simple nucleotide variation calls on whole-genome sequence data from Caenorhabditis elegans. Masking substitutes low quality base calls with 'N's (undetermined bases), whereas trimming removes low quality bases, which results in shorter read lengths. We demonstrate that masking is more effective than trimming in reducing the false-positive rate in single nucleotide polymorphism (SNP) calling. However, neither preprocessing method affected the false-negative rate in SNP calling with statistical significance compared to the data analysis without preprocessing. False-positive and false-negative rates for small insertions and deletions did not differ between masking and trimming. We recommend masking over trimming as a more effective preprocessing method for next generation sequencing data analysis, since masking reduces the false-positive rate in SNP calling without sacrificing the false-negative rate, although trimming is currently more commonly used in the field. The Perl script for masking is available at http://code.google.com/p/subn/. The sequencing data used in the study were deposited in the Sequence Read Archive (SRX450968 and SRX451773).
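The difference between the two preprocessing methods can be shown on a single read. The Python sketch below is not the authors' Perl script. It assumes Phred+33 quality encoding, a hypothetical quality threshold of 20, and trimming from the 3' end only, which is one simple variant of trimming.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality]

def mask_read(bases, quality, min_q=20, offset=33):
    """Replace every base call below min_q with 'N'; read length is unchanged."""
    scores = phred_scores(quality, offset)
    return "".join(b if q >= min_q else "N" for b, q in zip(bases, scores))

def trim_read(bases, quality, min_q=20, offset=33):
    """Remove low-quality bases from the 3' end; the read gets shorter."""
    scores = phred_scores(quality, offset)
    end = len(bases)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return bases[:end]

bases = "ACGTACGT"
quality = "IIII#II#"   # 'I' decodes to Q40, '#' decodes to Q2
print(mask_read(bases, quality))
print(trim_read(bases, quality))
```

In this example masking gives ACGTNCGN, keeping the read length and hiding both unreliable calls, while 3' trimming gives ACGTACG, which shortens the read but leaves the low-quality internal base in place.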
Spectroscopic characterization of galaxy clusters in RCS-1: spectroscopic confirmation, redshift accuracy, and dynamical mass-richness relation

NASA Astrophysics Data System (ADS)

Gilbank, David G.; Barrientos, L. Felipe; Ellingson, Erica; Blindert, Kris; Yee, H. K. C.; Anguita, T.; Gladders, M. D.; Hall, P. B.; Hertling, G.; Infante, L.; Yan, R.; Carrasco, M.; Garcia-Vergara, Cristina; Dawson, K. S.; Lidman, C.; Morokuma, T.

2018-05-01

We present follow-up spectroscopic observations of galaxy clusters from the first Red-sequence Cluster Survey (RCS-1). This work focuses on two samples, a lower redshift sample of ~30 clusters ranging in redshift from z ~ 0.2-0.6 observed with multiobject spectroscopy (MOS) on 4-6.5-m class telescopes and a z ~ 1 sample of ~10 clusters with 8-m class telescope observations. We examine the detection efficiency and redshift accuracy of the now widely used red-sequence technique for selecting clusters via overdensities of red-sequence galaxies. Using both these data and extended samples including previously published RCS-1 spectroscopy and spectroscopic redshifts from SDSS, we find that the red-sequence redshift using simple two-filter cluster photometric redshifts is accurate to σz ≈ 0.035(1 + z) in RCS-1. This accuracy can potentially be improved with better survey photometric calibration. For the lower redshift sample, ~5 per cent of clusters show some (minor) contamination from secondary systems with the same red-sequence intruding into the measurement aperture of the original cluster. At z ~ 1, the rate rises to ~20 per cent. Approximately ten per cent of projections are expected to be serious, where the two components contribute significant numbers of their red-sequence galaxies to another cluster. Finally, we present a preliminary study of the mass-richness calibration using velocity dispersions to probe the dynamical masses of the clusters. We find a relation broadly consistent with that seen in the local universe from the WINGS sample at z ~ 0.05.
De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum

PubMed Central

Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

2013-01-01

Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single-end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689
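The RPKM measure named above can be sketched as follows, using the standard definition (reads mapped to a transcript, scaled per kilobase of transcript and per million mapped reads). The function name and the example read count are hypothetical, and the abstract's figure of 22,161,444 is the number of quality-filtered reads, used here only as a stand-in for the total mapped reads.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per Kilobase per Million mapped reads (standard definition)."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical transcript: 50 reads on a 221 bp transcript (the reported
# average length); total read count used as a proxy for mapped reads.
print(round(rpkm(50, 221, 22_161_444), 2))
```

The normalisation makes expression values comparable across transcripts of different lengths and across libraries of different sizes.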
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSR has been used as a molecular marker because it is easy to detect and is used in a range of applications, including genetic diversity, genome mapping, and marker assisted selection. It is also very mutable because of slippage of the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDELs) mutation frequency to a high ratio - more than other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using PERL script with a Tk graphical interface. This program is based on aligning sequences after first determining the repeated units and the first and last SSR nucleotide positions. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages had been observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
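The first step described above, determining the repeated unit and the first and last SSR nucleotide positions, can be illustrated with a minimal sketch. This is not the authors' PERL program; the function name, the regular expression, and the `min_repeats` threshold are assumptions made for illustration, and the alignment and shifting steps are not shown.

```python
import re


def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Return (start, end, unit, copies) for tandem repeats of short units.

    Units are 1..max_unit bases long; `end` is exclusive. The lazy
    quantifier makes the regex prefer the shortest repeating unit.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), m.end(), unit, len(m.group(0)) // len(unit)))
    return hits


# Toy sequence (assumption): an (AC)4 repeat flanked by unique bases.
print(find_ssrs("TTGACACACACAGGT"))
```

Knowing the unit and the boundaries of each repeat is what would let an aligner shift whole repeat units rather than single bases.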
De Novo transcriptome sequencing reveals important molecular networks and metabolic pathways of the plant, Chlorophytum borivilianum.

PubMed

Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

2013-01-01

Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single-end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum.
Transcriptome Sequencing and Characterization of Japanese Scallop Patinopecten yessoensis from Different Shell Color Lines

PubMed Central

Chang, Yaqing; Zhao, Wenming; Du, Zhenlin; Hao, Zhenlin

2015-01-01

Shell color is an important trait that is used in breeding the Japanese scallop Patinopecten yessoensis, the most economically important scallop species in China. We constructed four transcriptome libraries from different shell color lines of P. yessoensis: the left and right shell mantles of ordinary strains of P. yessoensis and the left shell mantles of the ‘Ivory’ and ‘Maple’ strains. These four libraries were paired-end sequenced using the Illumina HiSeq 2000 platform and contained 54,802,692 sequences, 40,798,962 sequences, 74,019,262 sequences, and 44,466,166 sequences, respectively. A total of 214,087,082 expressed sequence tags were assembled into 73,522 unigenes with an average size of 1,163 bp. When the data were compared against the public Nr and Swiss-Prot databases using BlastX, nearly 30.55% (22,458) of the unigenes were significantly matched to known unique proteins. Gene Ontology annotation and pathway mapping analysis using the Kyoto Encyclopedia of Genes and Genomes categorized unigenes according to their diverse biological functions and processes and identified candidate genes that were potentially involved in growth, pigmentation, metal transcription, and immunity. Expression profile analysis was performed on all four libraries and many differentially expressed genes were identified. In addition, 5,772 simple sequence repeats were obtained from the P. yessoensis transcriptomes, and 464,197, 395,646, and 310,649 single nucleotide polymorphisms were revealed in the ordinary strains, the ‘Ivory’ strain, and the ‘Maple’ strain, respectively. These results provide valuable information for future genomic studies on P. yessoensis and improve our understanding of the molecular mechanisms involved in the growth, immunity, shell coloring, and shell biomineralization of this species. These resources also may be used in a variety of applications, such as trait mapping, marker-assisted breeding, studies of population genetics and genomics, and work on functional genomics. PMID:25680107
A universal protocol to generate consensus level genome sequences for foot-and-mouth disease virus and other positive-sense polyadenylated RNA viruses using the Illumina MiSeq.

PubMed

Logan, Grace; Freimanis, Graham L; King, David J; Valdazo-González, Begoña; Bachanek-Bankowska, Katarzyna; Sanderson, Nicholas D; Knowles, Nick J; King, Donald P; Cottam, Eleanor M

2014-09-30

Next-Generation Sequencing (NGS) is revolutionizing molecular epidemiology by providing new approaches to undertake whole genome sequencing (WGS) in diagnostic settings for a variety of human and veterinary pathogens. Previous sequencing protocols have been subject to biases such as those encountered during PCR amplification and cell culture, or are restricted by the need for large quantities of starting material. We describe here a simple and robust methodology for the generation of whole genome sequences on the Illumina MiSeq. This protocol is specific for foot-and-mouth disease virus (FMDV) or other polyadenylated RNA viruses and circumvents both the use of PCR and the requirement for large amounts of initial template. The protocol was successfully validated using five FMDV positive clinical samples from the 2001 epidemic in the United Kingdom, as well as a panel of representative viruses from all seven serotypes. In addition, this protocol was successfully used to recover 94% of an FMDV genome that had previously been identified as cell culture negative. Genome sequences from three other non-FMDV polyadenylated RNA viruses (EMCV, ERAV, VESV) were also obtained with minor protocol amendments. We calculated that a minimum coverage depth of 22 reads was required to produce an accurate consensus sequence for FMDV O. This was achieved in 5 FMDV/O/UKG isolates and the type O FMDV from the serotype panel with the exception of the 5' genomic termini and area immediately flanking the poly(C) region. We have developed a universal WGS method for FMDV and other polyadenylated RNA viruses. This method works successfully from a limited quantity of starting material and eliminates the requirement for genome-specific PCR amplification. This protocol has the potential to generate consensus-level sequences within a routine high-throughput diagnostic environment.
Sequencing and Characterization of the Invasive Sycamore Lace Bug Corythucha ciliata (Hemiptera: Tingidae) Transcriptome

PubMed Central

Qu, Cheng; Fu, Ningning; Xu, Yihua

2016-01-01

The sycamore lace bug, Corythucha ciliata (Hemiptera: Tingidae), is an invasive forestry pest rapidly expanding in many countries. This pest poses a considerable threat to the urban forestry ecosystem, especially to Platanus spp. However, its molecular biology and biochemistry are poorly understood. This study reports the first C. ciliata transcriptome, encompassing three different life stages (nymphs, adult females (AF) and adult males (AM)). In total, 26.53 GB of clean data and 60,879 unigenes were obtained from three RNA-seq libraries. These unigenes were annotated and classified by Nr (NCBI non-redundant protein sequences), Nt (NCBI non-redundant nucleotide sequences), Pfam (Protein family), KOG/COG (Clusters of Orthologous Groups of proteins), Swiss-Prot (a manually annotated and reviewed protein sequence database), and KO (KEGG Ortholog database). After all pairwise comparisons between these three different samples, a large number of differentially expressed genes were revealed. Dramatic differences in global gene expression profiles were found between distinct life stages (nymphs and AF, nymphs and AM) and between sexes (AF and AM), with some of the significantly differentially expressed genes (DEGs) being related to metamorphosis, digestion, immunity and sex differences. The differential expression of unigenes was validated through quantitative real-time PCR (qRT-PCR) for 16 randomly selected unigenes. In addition, 17,462 potential simple sequence repeat molecular markers were identified in these transcriptome resources. This comprehensive C. ciliata transcriptomic information can be utilized to promote the development of environmentally friendly methodologies to disrupt the processes of metamorphosis, digestion, immunity and sex differences. PMID:27494615
Complete Chloroplast Genome of the Multifunctional Crop Globe Artichoke and Comparison with Other Asteraceae

PubMed Central

Curci, Pasquale L.; De Paola, Domenico; Danzi, Donatella; Vendramin, Giovanni G.; Sonnante, Gabriella

2015-01-01

With over 20,000 species, Asteraceae is the second largest plant family. High-throughput sequencing of nuclear and chloroplast genomes has allowed for a better understanding of the evolutionary relationships within large plant families. Here, the globe artichoke chloroplast (cp) genome was obtained by a combination of whole-genome and BAC clone high-throughput sequencing. The artichoke cp genome is 152,529 bp in length, consisting of two single-copy regions separated by a pair of inverted repeats (IRs) of 25,155 bp, representing the longest IRs found in the Asteraceae family so far. The large (LSC) and the small (SSC) single-copy regions span 83,578 bp and 18,641 bp, respectively. The artichoke cp sequence was compared to the other eight Asteraceae complete cp genomes available, revealing an IR expansion at the SSC/IR boundary. This expansion consists of 17 bp of the ndhF gene generating an overlap between the ndhF and ycf1 genes. A total of 127 cp simple sequence repeats (cpSSRs) were identified in the artichoke cp genome, potentially suitable for future population studies in the Cynara genus. Parsimony-informative regions were evaluated and allowed to place a Cynara species within the Asteraceae family tree. The eight most informative coding regions were also considered and tested for “specific barcode” purpose in the Asteraceae family. Our results highlight the usefulness of cp genome sequencing in exploring plant genome diversity and retrieving reliable molecular resources for phylogenetic and evolutionary studies, as well as for specific barcodes in plants. PMID:25774672
Complete chloroplast genome of the multifunctional crop globe artichoke and comparison with other Asteraceae.

PubMed

Curci, Pasquale L; De Paola, Domenico; Danzi, Donatella; Vendramin, Giovanni G; Sonnante, Gabriella

2015-01-01

With over 20,000 species, Asteraceae is the second largest plant family. High-throughput sequencing of nuclear and chloroplast genomes has allowed for a better understanding of the evolutionary relationships within large plant families. Here, the globe artichoke chloroplast (cp) genome was obtained by a combination of whole-genome and BAC clone high-throughput sequencing. The artichoke cp genome is 152,529 bp in length, consisting of two single-copy regions separated by a pair of inverted repeats (IRs) of 25,155 bp, representing the longest IRs found in the Asteraceae family so far. The large (LSC) and the small (SSC) single-copy regions span 83,578 bp and 18,641 bp, respectively. The artichoke cp sequence was compared to the other eight Asteraceae complete cp genomes available, revealing an IR expansion at the SSC/IR boundary. This expansion consists of 17 bp of the ndhF gene generating an overlap between the ndhF and ycf1 genes. A total of 127 cp simple sequence repeats (cpSSRs) were identified in the artichoke cp genome, potentially suitable for future population studies in the Cynara genus. Parsimony-informative regions were evaluated and allowed to place a Cynara species within the Asteraceae family tree. The eight most informative coding regions were also considered and tested for "specific barcode" purpose in the Asteraceae family. Our results highlight the usefulness of cp genome sequencing in exploring plant genome diversity and retrieving reliable molecular resources for phylogenetic and evolutionary studies, as well as for specific barcodes in plants.
SEQ-REVIEW: A tool for reviewing and checking spacecraft sequences

NASA Astrophysics Data System (ADS)

Maldague, Pierre F.; El-Boushi, Mekki; Starbird, Thomas J.; Zawacki, Steven J.

1994-11-01

A key component of JPL's strategy to make space missions faster, better and cheaper is the Advanced Multi-Mission Operations System (AMMOS), a ground software intensive system currently in use and in further development. AMMOS intends to eliminate the cost of re-engineering a ground system for each new JPL mission. This paper discusses SEQ-REVIEW, a component of AMMOS that was designed to facilitate and automate the task of reviewing and checking spacecraft sequences. SEQ-REVIEW is a smart browser for inspecting files created by other sequence generation tools in the AMMOS system. It can parse sequence-related files according to a computer-readable version of a 'Software Interface Specification' (SIS), which is a standard document for defining file formats. It lets users display one or several linked files and check simple constraints using a Basic-like 'Little Language'. SEQ-REVIEW represents the first application of the Quality Function Development (QFD) method to sequence software development at JPL. The paper will show how the requirements for SEQ-REVIEW were defined and converted into a design based on object-oriented principles. The process starts with interviews of potential users, a small but diverse group that spans multiple disciplines and 'cultures'. It continues with the development of QFD matrices that related product functions and characteristics to user-demanded qualities. These matrices are then turned into a formal Software Requirements Document (SRD). The process concludes with the design phase, in which the CRC (Class, Responsibility, Collaboration) approach was used to convert requirements into a blueprint for the final product.
Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.

PubMed

Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant

2017-11-28

Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.

SEQ-REVIEW: A tool for reviewing and checking spacecraft sequences

NASA Technical Reports Server (NTRS)

Maldague, Pierre F.; El-Boushi, Mekki; Starbird, Thomas J.; Zawacki, Steven J.

1994-01-01

A key component of JPL's strategy to make space missions faster, better and cheaper is the Advanced Multi-Mission Operations System (AMMOS), a ground software intensive system currently in use and in further development. AMMOS intends to eliminate the cost of re-engineering a ground system for each new JPL mission. This paper discusses SEQ-REVIEW, a component of AMMOS that was designed to facilitate and automate the task of reviewing and checking spacecraft sequences. SEQ-REVIEW is a smart browser for inspecting files created by other sequence generation tools in the AMMOS system. It can parse sequence-related files according to a computer-readable version of a 'Software Interface Specification' (SIS), which is a standard document for defining file formats. It lets users display one or several linked files and check simple constraints using a Basic-like 'Little Language'. SEQ-REVIEW represents the first application of the Quality Function Development (QFD) method to sequence software development at JPL. The paper will show how the requirements for SEQ-REVIEW were defined and converted into a design based on object-oriented principles. The process starts with interviews of potential users, a small but diverse group that spans multiple disciplines and 'cultures'. It continues with the development of QFD matrices that related product functions and characteristics to user-demanded qualities. These matrices are then turned into a formal Software Requirements Document (SRD). The process concludes with the design phase, in which the CRC (Class, Responsibility, Collaboration) approach was used to convert requirements into a blueprint for the final product.
Dog leukocyte antigen class II-associated genetic risk testing for immune disorders of dogs: simplified approaches using Pug dog necrotizing meningoencephalitis as a model.

PubMed

Pedersen, Niels; Liu, Hongwei; Millon, Lee; Greer, Kimberly

2011-01-01

A significantly increased risk for a number of autoimmune and infectious diseases in purebred and mixed-breed dogs has been associated with certain alleles or allele combinations of the dog leukocyte antigen (DLA) class II complex containing the DRB1, DQA1, and DQB1 genes. The exact level of risk depends on the specific disease, the alleles in question, and whether alleles exist in a homozygous or heterozygous state. The gold standard for identifying high-risk alleles and their zygosity has involved direct sequencing of the exon 2 regions of each of the 3 genes. However, sequencing and identification of specific alleles at each of the 3 loci are relatively expensive and sequencing techniques are not ideal for additional parentage or identity determination. However, it is often possible to get the same information from sequencing only 1 gene given the small number of possible alleles at each locus in purebred dogs, extensive homozygosity, and tendency for disease-causing alleles at each of the 3 loci to be strongly linked to each other into haplotypes. Therefore, genetic testing in purebred dogs with immune diseases can be often simplified by sequencing alleles at 1 rather than 3 loci. Further simplification of genetic tests for canine immune diseases can be achieved by the use of alternative genetic markers in the DLA class II region that are also strongly linked with the disease genotype. These markers consist of either simple tandem repeats or single nucleotide polymorphisms that are also in strong linkage with specific DLA class II genotypes and/or haplotypes. The current study uses necrotizing meningoencephalitis of Pug dogs as a paradigm to assess simple alternative genetic tests for disease risk. It was possible to attain identical necrotizing meningoencephalitis risk assessments to 3-locus DLA class II sequencing by sequencing only the DQB1 gene, using 3 DLA class II-linked simple tandem repeat markers, or with a small single nucleotide polymorphism array designed to identify breed-specific DQB1 alleles.
Finite-size effects in transcript sequencing count distribution: its power-law correction necessarily precedes downstream normalization and comparative analysis.

PubMed

Wong, Wing-Cheong; Ng, Hong-Kiat; Tantoso, Erwin; Soong, Richie; Eisenhaber, Frank

2018-02-12

Though earlier works on modelling transcript abundance from vertebrates to lower eukaryotes have specifically singled out Zipf's law, the observed distributions often deviate from a single power-law slope. In hindsight, while power-laws of critical phenomena are derived asymptotically under the conditions of infinite observations, real world observations are finite, and finite-size effects set in to force a power-law distribution into an exponential decay, which consequently manifests as a curvature (i.e., varying exponent values) in a log-log plot. If transcript abundance is truly power-law distributed, the varying exponent signifies changing mathematical moments (e.g., mean, variance) and creates heteroskedasticity, which compromises statistical rigor in analysis. The impact of this deviation from the asymptotic power-law on sequencing count data has never truly been examined and quantified. The anecdotal description of transcript abundance being almost Zipf's law-like distributed can be conceptualized as the imperfect mathematical rendition of the Pareto power-law distribution when subjected to finite-size effects in the real world. This holds regardless of advances in sequencing technology, since sampling is finite in practice. Our conceptualization agrees well with our empirical analysis of two modern day NGS (next-generation sequencing) datasets: an in-house generated dilution miRNA study of two gastric cancer cell lines (NUGC3 and AGS) and a publicly available spike-in miRNA dataset. Firstly, the finite-size effects cause the deviations of sequencing count data from Zipf's law and issues of reproducibility in sequencing experiments. Secondly, they manifest as heteroskedasticity among experimental replicates, bringing about statistical woes. Surprisingly, a straightforward power-law correction that restores the distorted distribution to a single exponent value can dramatically reduce data heteroskedasticity, yielding an instant increase in signal-to-noise ratio by 50% and in statistical/detection sensitivity by as much as 30%, regardless of the downstream mapping and normalization methods. Most importantly, the power-law correction improves concordance in significant calls among different normalization methods of a data series by 22% on average. When presented with a higher sequencing depth (4 times difference), the improvement in concordance is asymmetrical (32% for the higher sequencing depth instance versus 13% for the lower instance) and demonstrates that the simple power-law correction can increase significant detection with higher sequencing depths. Finally, the correction dramatically enhances the statistical conclusions and eludes the metastasis potential of the NUGC3 cell line against AGS of our dilution analysis. The finite-size effects due to undersampling generally plague transcript count data with reproducibility issues but can be minimized through a simple power-law correction of the count distribution. This distribution correction has direct implications for the biological interpretation of the study and the rigor of the scientific findings. This article was reviewed by Oliviero Carugo, Thomas Dandekar and Sandor Pongor.
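To make the log-log picture concrete, the sketch below estimates a single power-law exponent as the least-squares slope of log(count) against log(rank). It is only an illustration of how a Zipf-like slope can be measured; it is not the authors' correction procedure, and the function name and synthetic counts are assumptions.

```python
import math


def loglog_slope(counts):
    """Least-squares slope of log(count) vs log(rank), counts ranked descending."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, ideal Zipf-like counts (assumption): count proportional to 1/rank.
ideal = [1000.0 / rank for rank in range(1, 51)]
print(round(loglog_slope(ideal), 3))
```

On real count data, fitting the head and the tail of the ranked list separately and comparing the two slopes is one simple way to see the curvature that the abstract attributes to finite-size effects.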
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

PubMed Central

Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis.

PubMed

Zheng, Jin-Shuang; Sun, Cheng-Zhen; Zhang, Shu-Ning; Hou, Xi-Lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis.
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns.

PubMed

Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie

2011-09-12

Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
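The idea of searching and counting shared motifs in a pair of promoters can be sketched in a simplified form. The abstract defines SSM by length and similarity; the version below uses exact matches of fixed-length words (k-mers) only, which is a simplifying assumption and not the authors' algorithm. The function name and toy sequences are hypothetical.

```python
def shared_kmers(seq_a, seq_b, k=6):
    """Return the set of length-k words present in both sequences (exact match)."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b


# Toy promoter fragments (assumption), compared with 4-base words.
common = shared_kmers("ACGTACGT", "TTACGTAA", k=4)
print(sorted(common), len(common))
```

A pair of genes whose count is unusually high relative to a background of random pairs would then be a candidate for a common regulatory pattern, in the spirit of the "exceptional number of SSM" criterion above.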
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

PubMed Central

2011-01-01

Background: Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results: Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions: Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
Identification of apple cultivars on the basis of simple sequence repeat markers.

PubMed

Liu, G S; Zhang, Y G; Tao, R; Fang, J G; Dai, H Y

2014-09-12

DNA markers are useful tools that play an important role in plant cultivar identification. They are usually based on polymerase chain reaction (PCR) and include simple sequence repeats (SSRs), inter-simple sequence repeats, and random amplified polymorphic DNA. However, DNA markers were not used effectively in the complete identification of plant cultivars because of the lack of known DNA fingerprints. Recently, a novel approach called the cultivar identification diagram (CID) strategy was developed to facilitate the use of DNA markers to separate plant individuals. The CID was designed so that a polymorphic marker generated from each PCR directly allows cultivar samples to be separated at each step. Therefore, it could be used to identify cultivars and varieties easily with fewer primers. In this study, 60 apple cultivars, including a few main cultivars in fields and varieties from descendants (Fuji x Telamon), were examined. Of the 20 pairs of SSR primers screened, 8 pairs gave reproducible, polymorphic DNA amplification patterns. The banding patterns obtained from these 8 primers were used to construct a CID map. Each cultivar or variety in this study was distinguished from the others completely, indicating that this method can be used for efficient cultivar identification. The result contributed to studies on germplasm resources and the seedling industry in fruit trees.
The paradox of MHC-DRB exon/intron evolution: alpha-helix and beta-sheet encoding regions diverge while hypervariable intronic simple repeats coevolve with beta-sheet codons.

PubMed

Schwaiger, F W; Weyers, E; Epplen, C; Brün, J; Ruff, G; Crawford, A; Epplen, J T

1993-09-01

Twenty-one different caprine and 13 ovine MHC-DRB exon 2 sequences were determined including part of the adjacent introns containing simple repetitive (gt)n(ga)m elements. The positions for highly polymorphic DRB amino acids vary slightly among ungulates and other mammals. From man and mouse to ungulates the basic (gt)n(ga)m structure is fixed in evolution for 7 × 10^7 years whereas ample variations exist in the tandem (gt)n and (ga)m dinucleotides and especially their "degenerated" derivatives. Phylogenetic trees for the alpha-helices and beta-pleated sheets of the ungulate DRB sequences suggest different evolutionary histories. In hoofed animals as well as in humans DRB beta-sheet encoding sequences and adjacent intronic repeats can be assembled into virtually identical groups suggesting coevolution of noncoding as well as coding DNA. In contrast alpha-helices and C-terminal parts of the first DRB domain evolve distinctly. In the absence of a defined mechanism causing specific, site-directed mutations, double-recombination or gene-conversion-like events would readily explain this fact. The role of the intronic simple (gt)n(ga)m repeat is discussed with respect to these genetic exchange mechanisms during evolution.
40 CFR 86.230-94 - Test sequence: general requirements.

Code of Federal Regulations, 2010 CFR

2010-07-01

... testing. (2) The ambient temperature reported shall be a simple average of the test cell temperatures... cell temperature shall be 20 °F±3 °F (−7 °C±1.7 °C) when measured in accordance with paragraph (e)(2... approximately level during all phases of the test sequence to prevent abnormal fuel distribution. (e) Engine...
Cross-species transferability and mapping of genomic and cDNA SSRs in pines

Treesearch

D. Chagne; P. Chaumeil; A. Ramboer; C. Collada; A. Guevara; M. T. Cervera; G. G. Vendramin; V. Garcia; J-M. Frigerio; Craig Echt; T. Richardson; Christophe Plomion

2004-01-01

Two unigene datasets of Pinus taeda and Pinus pinaster were screened to detect di-, tri and tetranucleotide repeated motifs using the SSRIT script. A total of 419 simple sequence repeats (SSRs) were identified, from which only 12.8% overlapped between the two sets. The position of the SSRs within the coding sequence were predicted...
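Screening sequences for di-, tri- and tetranucleotide repeats can be sketched in a few lines of Python. This is a minimal illustration using regular-expression backreferences, not the SSRIT script itself; the function name `find_ssrs` and the minimum repeat counts are assumptions chosen for the example.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT = (AT)2)."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    min_repeats maps motif length to the minimum number of tandem copies;
    the defaults below are illustrative assumptions, not published thresholds.
    """
    if min_repeats is None:
        min_repeats = {2: 5, 3: 4, 4: 3}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


toy = "GGC" + "AT" * 6 + "CCG" + "AGC" * 4 + "T"  # made-up sequence
print(find_ssrs(toy))
```

The primitive-motif check keeps an (AT)6 run from being reported a second time as an (ATAT)3 tetranucleotide repeat.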
Genetic variation patterns of American chestnut populations at EST-SSRs

Treesearch

Oliver Gailing; C. Dana Nelson

2017-01-01

The objective of this study is to analyze patterns of genetic variation at genic expressed sequence tag - simple sequence repeats (EST-SSRs) and at chloroplast DNA markers in populations of American chestnut (Castanea dentata Borkh.) to assist in conservation and breeding efforts. Allelic diversity at EST-SSRs decreased significantly from southwest to northeast along...
Does Sleep Facilitate the Consolidation of Allocentric or Egocentric Representations of Implicitly Learned Visual-Motor Sequence Learning?

ERIC Educational Resources Information Center

Viczko, Jeremy; Sergeeva, Valya; Ray, Laura B.; Owen, Adrian M.; Fogel, Stuart M.

2018-01-01

Sleep facilitates the consolidation (i.e., enhancement) of simple, explicit (i.e., conscious) motor sequence learning (MSL). MSL can be dissociated into egocentric (i.e., motor) or allocentric (i.e., spatial) frames of reference. The consolidation of the allocentric memory representation is sleep-dependent, whereas the egocentric consolidation…
Learning of Monotonic and Nonmonotonic Sequences in Domesticated Horses ("Equus Callabus") and Chickens ("Gallus Domesticus")

ERIC Educational Resources Information Center

Kundey, Shannon M. A.; Strandell, Brittany; Mathis, Heather; Rowan, James D.

2010-01-01

(Hulse and Dorsky, 1977) and (Hulse and Dorsky, 1979) found that rats, like humans, learn sequences following a simple rule-based structure more quickly than those lacking a rule-based structure. Through two experiments, we explored whether two additional species--domesticated horses ("Equus callabus") and chickens ("Gallus domesticus")--would…
LISTA, a comprehensive compilation of nucleotide sequences encoding proteins from the yeast Saccharomyces.

PubMed Central

Linder, P; Dölz, R; Mossé, M O; Lazowska, J; Slonimski, P P

1993-01-01

The amount of nucleotide sequence data is increasing exponentially. We therefore made an effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. Each sequence has been attributed a single genetic name and in the case of allelic duplicated sequences, synonyms are given, if necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have sorted out a lot of confusion in the literature and databanks. Along with the genetic name, the mnemonic from the EMBL databank, the codon bias, reference of the publication of the sequence and the EMBL accession numbers are included in each entry. PMID:8332521
No evidence that sex and transposable elements drive genome size variation in evening primroses.

PubMed

Ågren, J Arvid; Greiner, Stephan; Johnson, Marc T J; Wright, Stephen I

2015-04-01

Genome size varies dramatically across species, but despite an abundance of attention there is little agreement on the relative contributions of selective and neutral processes in governing this variation. The rate of sex can potentially play an important role in genome size evolution because of its effect on the efficacy of selection and transmission of transposable elements (TEs). Here, we used a phylogenetic comparative approach and whole genome sequencing to investigate the contribution of sex and TE content to genome size variation in the evening primrose (Oenothera) genus. We determined genome size using flow cytometry for 30 species that vary in genetic system and find that variation in sexual/asexual reproduction cannot explain the almost twofold variation in genome size. Moreover, using whole genome sequences of three species of varying genome sizes and reproductive system, we found that genome size was not associated with TE abundance; instead the larger genomes had a higher abundance of simple sequence repeats. Although it has long been clear that sexual reproduction may affect various aspects of genome evolution in general and TE evolution in particular, it does not appear to have played a major role in genome size evolution in the evening primroses. © 2015 The Author(s).
Molecular analysis of RAPD DNA based markers: their potential use for the detection of genetic variability in jojoba (Simmondsia chinensis L Schneider).

PubMed

Amarger, V; Mercier, L

1995-01-01

We have applied the recently developed technique of random amplified polymorphic DNA (RAPD) for the discrimination between two jojoba clones at the genomic level. Among a set of 30 primers tested, a simple reproducible pattern with three distinct fragments for clone D and two distinct fragments for clone E was obtained with primer OPB08. Since RAPD products are the results of arbitrarily priming events and because a given primer can amplify a number of non-homologous sequences, we wondered whether or not RAPD bands, even those of similar size, were derived from different loci in the two clones. To answer this question, two complementary approaches were used: i) cloning and sequencing of the amplification products from clone E; and ii) complementary Southern analysis of RAPD gels using cloned or amplified fragments (directly recovered from agarose gels) as RFLP probes. The data reported here show that the RAPD reaction generates multiple amplified fragments. Some fragments, although resolved as a single band on agarose gels, contain different DNA species of the same size. Furthermore, it appears that the cloned RAPD products of known sequence that do not target repetitive DNA can be used as hybridization probes in RFLP to detect a polymorphism among individuals.
The Zagros hinterland fold-and-thrust belt in-sequence thrusting, Iran

NASA Astrophysics Data System (ADS)

Sarkarinejad, Khalil; Ghanbarian, Mohammad Ali

2014-05-01

The collision of the Iranian microcontinent with the Afro-Arabian continent resulted in the deformation of the Zagros orogenic belt. The foreland of this belt in the Persian Gulf and Arabian platform has been investigated for its petroleum and gas resource potentials, but the Zagros hinterland is poorly investigated and our knowledge about its deformation is much less than other parts of this orogen. Therefore, this work presents a new geological map, stratigraphic column and two detailed geological cross sections. This study indicates the presence of a hinterland fold-and-thrust belt on northeastern side of the Zagros orogenic core that consists of in-sequence thrusting and basement involvement in this important part of the Zagros hinterland. The in-sequence thrusting resulted in first- and second-order duplex systems, Mode I fault-bend folding, fault-propagation folding and asymmetric detachment folding which indicate close relationships between folding and thrusting. Study of fault-bend folds shows that layer-parallel simple shear has the same role in the southeastern and northwestern parts of the study area (αe = 23.4 ± 9.1°). A major lateral ramp in the basement beneath the Talaee plain with about one kilometer of vertical offset formed parallel to the SW movement direction and perpendicular to the major folding and thrusting.
Computer-Aided Design of RNA Origami Structures.

PubMed

Sparvath, Steffen L; Geary, Cody W; Andersen, Ebbe S

2017-01-01

RNA nanostructures can be used as scaffolds to organize, combine, and control molecular functionalities, with great potential for applications in nanomedicine and synthetic biology. The single-stranded RNA origami method allows RNA nanostructures to be folded as they are transcribed by the RNA polymerase. RNA origami structures provide a stable framework that can be decorated with functional RNA elements such as riboswitches, ribozymes, interaction sites, and aptamers for binding small molecules or protein targets. The rich library of RNA structural and functional elements combined with the possibility to attach proteins through aptamer-based binding creates virtually limitless possibilities for constructing advanced RNA-based nanodevices. In this chapter we provide a detailed protocol for the single-stranded RNA origami design method using a simple 2-helix tall structure as an example. The first step involves 3D modeling of a double-crossover between two RNA double helices, followed by decoration with tertiary motifs. The second step deals with the construction of a 2D blueprint describing the secondary structure and sequence constraints that serves as the input for computer programs. In the third step, computer programs are used to design RNA sequences that are compatible with the structure, and the resulting outputs are evaluated and converted into DNA sequences to order.
Fast parallel tandem mass spectral library searching using GPU hardware acceleration.

PubMed

Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K; Martin, Daniel B

2011-06-03

Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.
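The "multiplication of two vectors" at the core of spectral library searching can be illustrated with a small CPU-side Python sketch. It is not FastPaSS or SpectraST code: the normalization to a cosine score, the equal-length binned intensity vectors, and the names `spectral_dot` and `best_match` are assumptions made for the example.

```python
import math


def spectral_dot(query, reference):
    """Normalized dot product of two equal-length binned intensity vectors."""
    if len(query) != len(reference):
        raise ValueError("spectra must be binned to the same length")
    dot = sum(q * r for q, r in zip(query, reference))
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(
        sum(r * r for r in reference)
    )
    return dot / norm if norm else 0.0


def best_match(query, library):
    """Score the query against every library spectrum and return (name, score).

    Each score depends only on the query and one library entry, so the
    comparisons are independent of one another and can run in parallel.
    """
    scores = {name: spectral_dot(query, ref) for name, ref in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy library with made-up peptide names and intensities.
library = {"pepA": [0, 3, 4, 0], "pepB": [4, 0, 0, 3]}
print(best_match([0, 3, 4, 0], library))
```

On a GPU, the dictionary comprehension would be replaced by one thread or thread block per library entry; that mapping is not shown here.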

Quantum mechanical calculations related to ionization and charge transfer in DNA

NASA Astrophysics Data System (ADS)

Cauët, E.; Valiev, M.; Weare, J. H.; Liévin, J.

2012-07-01

Ionization and charge migration in DNA play crucial roles in mechanisms of DNA damage caused by ionizing radiation, oxidizing agents and photo-irradiation. Therefore, an evaluation of the ionization properties of the DNA bases is central to the full interpretation and understanding of the elementary reactive processes that occur at the molecular level during the initial exposure and afterwards. Ab initio quantum mechanical (QM) methods have been successful in providing highly accurate evaluations of key parameters, such as ionization energies (IE) of DNA bases. Hence, in this study, we performed high-level QM calculations to characterize the molecular energy levels and potential energy surfaces, which shed light on ionization and charge migration between DNA bases. In particular, we examined the IEs of guanine, the most easily oxidized base, isolated and embedded in base clusters, and investigated the mechanism of charge migration over two and three stacked guanines. The IE of guanine in the human telomere sequence has also been evaluated. We report a simple molecular orbital analysis to explain how modifications in the base sequence are expected to change the efficiency of the sequence as a hole trap. Finally, the application of a hybrid approach combining quantum mechanics with molecular mechanics brings an interesting discussion as to how the native aqueous DNA environment affects the IE threshold of nucleobases.
A Simple Method to Determine the "R" or "S" Configuration of Molecules with an Axis of Chirality

ERIC Educational Resources Information Center

Wang, Cunde; Wu, Weiming

2011-01-01

A simple method for the "R" or "S" designation of molecules with an axis of chirality is described. The method involves projection of the substituents along the chiral axis, utilizes the Cahn-Ingold-Prelog sequence rules in assigning priority to the substituents, is easy to use, and has broad applicability. (Contains 5 figures.)
Meta4: a web application for sharing and annotating metagenomic gene predictions using web services.

PubMed

Richardson, Emily J; Escalettes, Franck; Fotheringham, Ian; Wallace, Robert J; Watson, Mick

2013-01-01

Whole-genome shotgun metagenomics experiments produce DNA sequence data from entire ecosystems, and provide a huge amount of novel information. Gene discovery projects require up-to-date information about sequence homology and domain structure for millions of predicted proteins to be presented in a simple, easy-to-use system. There is a lack of simple, open, flexible tools that allow the rapid sharing of metagenomics datasets with collaborators in a format they can easily interrogate. We present Meta4, a flexible and extensible web application that can be used to share and annotate metagenomic gene predictions. Proteins and predicted domains are stored in a simple relational database, with a dynamic front-end which displays the results in an internet browser. Web services are used to provide up-to-date information about the proteins from homology searches against public databases. Information about Meta4 can be found on the project website, code is available on Github, a cloud image is available, and an example implementation can be seen at.
Fine-tuning gene networks using simple sequence repeats

PubMed Central

Egbert, Robert G.; Klavins, Eric

2012-01-01

The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382
Terminator oligo blocking efficiently eliminates rRNA from Drosophila small RNA sequencing libraries.

PubMed

Wickersheim, Michelle L; Blumenstiel, Justin P

2013-11-01

A large number of methods are available to deplete ribosomal RNA reads from high-throughput RNA sequencing experiments. Such methods are critical for sequencing Drosophila small RNAs between 20 and 30 nucleotides because size selection is not typically sufficient to exclude the highly abundant class of 30 nucleotide 2S rRNA. Here we demonstrate that pre-annealing terminator oligos complementary to Drosophila 2S rRNA prior to 5' adapter ligation and reverse transcription efficiently depletes 2S rRNA sequences from the sequencing reaction in a simple and inexpensive way. This depletion is highly specific and is achieved with minimal perturbation of miRNA and piRNA profiles.
Next-generation sequencing library preparation method for identification of RNA viruses on the Ion Torrent Sequencing Platform.

PubMed

Chen, Guiqian; Qiu, Yuan; Zhuang, Qingye; Wang, Suchun; Wang, Tong; Chen, Jiming; Wang, Kaicheng

2018-05-09

Next generation sequencing (NGS) is a powerful tool for the characterization, discovery, and molecular identification of RNA viruses. Multiple NGS library preparation methods have been published for strand-specific RNA-seq, but some methods are not suitable for identifying and characterizing RNA viruses. In this study, we report an NGS library preparation method to identify RNA viruses using the Ion Torrent PGM platform. The NGS sequencing adapters were directly inserted into the sequencing library through reverse transcription and polymerase chain reaction, without fragmentation and ligation of nucleic acids. The results show that this method is simple to perform and able to identify multiple species of RNA viruses in clinical samples.
Molecular beacon sequence design algorithm.

PubMed

Monroe, W Todd; Haselton, Frederick R

2003-01-01

A method based on Web-based tools is presented to design optimally functioning molecular beacons. Molecular beacons, fluorogenic hybridization probes, are a powerful tool for the rapid and specific detection of a particular nucleic acid sequence. However, their synthesis costs can be considerable. Since molecular beacon performance is based on its sequence, it is imperative to rationally design an optimal sequence before synthesis. The algorithm presented here uses simple Microsoft Excel formulas and macros to rank candidate sequences. This analysis is carried out using mfold structural predictions along with other free Web-based tools. For smaller laboratories where molecular beacons are not the focus of research, the public domain algorithm described here may be usefully employed to aid in molecular beacon design.
A Simple and Efficient Method for Assembling TALE Protein Based on Plasmid Library

PubMed Central

Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

2013-01-01

The DNA binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by one step of digestion/ligation reaction. 12 TALE constructs including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate. PMID:23840477
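The figure of 256 tetramers follows from 4 possible bases at each of 4 positions (4^4 = 256). A short Python sketch makes the count concrete; the helper `split_into_tetramers`, which cuts a target into consecutive 4-base blocks, is a hypothetical illustration and not a description of how the authors partition their targets.

```python
from itertools import product

BASES = "ACGT"


def tetramer_library():
    """Enumerate every 4-base combination: 4**4 = 256 entries."""
    return ["".join(p) for p in product(BASES, repeat=4)]


def split_into_tetramers(target):
    """Split a target whose length is a multiple of 4 into consecutive tetramers
    (assumed partitioning, for illustration only)."""
    if len(target) % 4:
        raise ValueError("target length must be a multiple of 4")
    return [target[i:i + 4] for i in range(0, len(target), 4)]


library = tetramer_library()
print(len(library))                          # 256
blocks = split_into_tetramers("ACGTTTGACCAT")  # made-up 12-bp target
print(blocks, all(b in library for b in blocks))
```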
A simple and efficient method for assembling TALE protein based on plasmid library.

PubMed

Zhang, Zhiqiang; Li, Duo; Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

2013-01-01

The DNA binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by one step of digestion/ligation reaction. 12 TALE constructs including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate.
New chloroplast microsatellite markers suitable for assessing genetic diversity of Lolium perenne and other related grass species

PubMed Central

Diekmann, Kerstin; Hodkinson, Trevor R.; Barth, Susanne

2012-01-01

Background and Aims: Lolium perenne (perennial ryegrass) is the most important forage grass species of temperate regions. We have previously released the chloroplast genome sequence of L. perenne ‘Cashel’. Here nine chloroplast microsatellite markers are published, which were designed based on knowledge about genetically variable regions within the L. perenne chloroplast genome. These markers were successfully used for characterizing the genetic diversity in Lolium and different grass species. Methods: Chloroplast genomes of 14 Poaceae taxa were screened for mononucleotide microsatellite repeat regions and primers designed for their amplification from nine loci. The potential of these markers to assess genetic diversity was evaluated on a set of 16 Irish and 15 European L. perenne ecotypes, nine L. perenne cultivars, other Lolium taxa and other grass species. Key Results: All analysed Poaceae chloroplast genomes contained more than 200 mononucleotide repeats (chloroplast simple sequence repeats, cpSSRs) of at least 7 bp in length, concentrated mainly in the large single copy region of the genome. Nucleotide composition varied considerably among subfamilies (with Pooideae biased towards poly A repeats). The nine new markers distinguish L. perenne from all non-Lolium taxa. TeaCpSSR28 was able to distinguish between all Lolium species and Lolium multiflorum due to an elongation of an A8 mononucleotide repeat in L. multiflorum. TeaCpSSR31 detected a considerable degree of microsatellite length variation and single nucleotide polymorphism. TeaCpSSR27 revealed variation within some L. perenne accessions due to a 44-bp indel and was hence readily detected by simple agarose gel electrophoresis. Smaller insertion/deletion events or single nucleotide polymorphisms detected by these new markers could be visualized by polyacrylamide gel electrophoresis or DNA sequencing, respectively.
Conclusions: The new markers are a valuable tool for plant breeding companies, seed testing agencies and the wider scientific community due to their ability to monitor genetic diversity within breeding pools, to trace maternal inheritance and to distinguish closely related species. PMID:22419761
New chloroplast microsatellite markers suitable for assessing genetic diversity of Lolium perenne and other related grass species.

PubMed

Diekmann, Kerstin; Hodkinson, Trevor R; Barth, Susanne

2012-11-01

Lolium perenne (perennial ryegrass) is the most important forage grass species of temperate regions. We have previously released the chloroplast genome sequence of L. perenne 'Cashel'. Here nine chloroplast microsatellite markers are published, which were designed based on knowledge about genetically variable regions within the L. perenne chloroplast genome. These markers were successfully used for characterizing the genetic diversity in Lolium and different grass species. Chloroplast genomes of 14 Poaceae taxa were screened for mononucleotide microsatellite repeat regions and primers designed for their amplification from nine loci. The potential of these markers to assess genetic diversity was evaluated on a set of 16 Irish and 15 European L. perenne ecotypes, nine L. perenne cultivars, other Lolium taxa and other grass species. All analysed Poaceae chloroplast genomes contained more than 200 mononucleotide repeats (chloroplast simple sequence repeats, cpSSRs) of at least 7 bp in length, concentrated mainly in the large single copy region of the genome. Nucleotide composition varied considerably among subfamilies (with Pooideae biased towards poly A repeats). The nine new markers distinguish L. perenne from all non-Lolium taxa. TeaCpSSR28 was able to distinguish between all Lolium species and Lolium multiflorum due to an elongation of an A(8) mononucleotide repeat in L. multiflorum. TeaCpSSR31 detected a considerable degree of microsatellite length variation and single nucleotide polymorphism. TeaCpSSR27 revealed variation within some L. perenne accessions due to a 44-bp indel and was hence readily detected by simple agarose gel electrophoresis. Smaller insertion/deletion events or single nucleotide polymorphisms detected by these new markers could be visualized by polyacrylamide gel electrophoresis or DNA sequencing, respectively. 
The new markers are a valuable tool for plant breeding companies, seed testing agencies and the wider scientific community due to their ability to monitor genetic diversity within breeding pools, to trace maternal inheritance and to distinguish closely related species.
SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.

PubMed

Johnson, Benjamin K; Scholz, Matthew B; Teal, Tracy K; Abramovitch, Robert B

2016-02-04

Many tools exist in the analysis of bacterial RNA sequencing (RNA-seq) transcriptional profiling experiments to identify differentially expressed genes between experimental conditions. Generally, the workflow includes quality control of reads, mapping to a reference, counting transcript abundance, and statistical tests for differentially expressed genes. In spite of the numerous tools developed for each component of an RNA-seq analysis workflow, easy-to-use bacterially oriented workflow applications to combine multiple tools and automate the process are lacking. With many tools to choose from for each step, the task of identifying a specific tool, adapting the input/output options to the specific use-case, and integrating the tools into a coherent analysis pipeline is not a trivial endeavor, particularly for microbiologists with limited bioinformatics experience. To make bacterial RNA-seq data analysis more accessible, we developed a Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis (SPARTA). SPARTA is a reference-based bacterial RNA-seq analysis workflow application for single-end Illumina reads. SPARTA is turnkey software that simplifies the process of analyzing RNA-seq data sets, making bacterial RNA-seq analysis a routine process that can be undertaken on a personal computer or in the classroom. The easy-to-install, complete workflow processes whole transcriptome shotgun sequencing data files by trimming reads and removing adapters, mapping reads to a reference, counting gene features, calculating differential gene expression, and, importantly, checking for potential batch effects within the data set. SPARTA outputs quality analysis reports, gene feature counts and differential gene expression tables and scatterplots. SPARTA provides an easy-to-use bacterial RNA-seq transcriptional profiling workflow to identify differentially expressed genes between experimental conditions. 
This software will enable microbiologists with limited bioinformatics experience to analyze their data and integrate next generation sequencing (NGS) technologies into the classroom. The SPARTA software and tutorial are available at sparta.readthedocs.org.
Video-tracker trajectory analysis: who meets whom, when and where

NASA Astrophysics Data System (ADS)

Jäger, U.; Willersinn, D.

2010-04-01

Unveiling unusual or hostile events by observing many moving persons in a crowd is a challenging task for human operators, especially when sitting in front of monitor walls for hours. Typically, hostile events are rare. Thus, due to tiredness and negligence the operator may miss important events. In such situations, an automatic alarming system is able to support the human operator. The system incorporates a processing chain consisting of (1) people tracking, (2) event detection, (3) data retrieval, and (4) display of the relevant video sequence overlaid by highlighted regions of interest. In this paper we focus on the event detection stage of the processing chain mentioned above. In our case, the selected event of interest is the encounter of people. Although based on a rather simple trajectory analysis, this kind of event is of great practical importance because it paves the way to answering the question "who meets whom, when and where". This, in turn, forms the basis for detecting potential situations where, for example, money, weapons or drugs are handed over from one person to another in crowded environments such as railway stations, airports or busy streets. The input to the trajectory analysis comes from a multi-object video-based tracking system developed at IOSB which is able to track multiple individuals within a crowd in real time [1]. From this we calculate the inter-distances between all persons on a frame-to-frame basis. We use a sequence of simple rules based on the individuals' kinematics to detect the event mentioned above and to output the frame number, the persons' IDs from the tracker and the pixel coordinates of the meeting position. Using this information, a data retrieval system may extract the corresponding part of the recorded video image sequence and finally replay the selected video clip with a highlighted region of interest to attract the operator's attention for further visual inspection.
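The frame-by-frame distance computation and rule-based encounter detection described above might look roughly like the following. The abstract does not spell out the kinematic rules, so the single rule used here (two persons staying within `max_dist` pixels for `min_frames` consecutive frames) and all names and thresholds are assumptions made purely for illustration.

```python
from itertools import combinations
from math import hypot

def detect_encounters(frames, max_dist=50.0, min_frames=3):
    """Detect pairs of tracked persons that stay close for several frames.

    frames: list with one dict per frame, mapping person ID -> (x, y) in pixels.
    Returns (frame_index, id_a, id_b, (mid_x, mid_y)) for each detected encounter.
    The proximity rule is a stand-in, not the rule set used in the paper.
    """
    streak = {}   # pair -> number of consecutive frames the pair has been close
    events = []
    for t, positions in enumerate(frames):
        close_now = set()
        for a, b in combinations(sorted(positions), 2):
            (xa, ya), (xb, yb) = positions[a], positions[b]
            if hypot(xa - xb, ya - yb) <= max_dist:
                close_now.add((a, b))
                streak[(a, b)] = streak.get((a, b), 0) + 1
                if streak[(a, b)] == min_frames:
                    # Report once, at the frame where the rule is first satisfied.
                    events.append((t, a, b, ((xa + xb) / 2, (ya + yb) / 2)))
        # Reset pairs that separated (or lost a track) in this frame.
        for pair in list(streak):
            if pair not in close_now:
                del streak[pair]
    return events

# Toy tracks (made up): person 2 walks up to person 1, person 3 stays far away.
xs_person2 = [200, 100, 40, 30, 20, 150]
frames = [{1: (0, 0), 2: (x, 0), 3: (500, 500)} for x in xs_person2]
events = detect_encounters(frames)
print(events)
```

Each reported tuple carries the frame number, the two tracker IDs and a pixel position, which is the kind of record a retrieval stage would need to cut out and highlight the matching video clip.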
Molecular Linkage Mapping and Marker-Trait Associations with NlRPT, a Downy Mildew Resistance Gene in Nicotiana langsdorffii

PubMed Central

Zhang, Shouan; Gao, Muqiang; Zaitlin, David

2012-01-01

Nicotiana langsdorffii is one of two species of Nicotiana known to express an incompatible interaction with the oomycete Peronospora tabacina, the causal agent of tobacco blue mold disease. We previously showed that incompatibility is due to the hypersensitive response (HR), and plants expressing the HR are resistant to P. tabacina at all stages of growth. Resistance is due to a single dominant gene in N. langsdorffii accession S-4-4 that we have named NlRPT. In further characterizing this unique host-pathogen interaction, NlRPT has been placed on a preliminary genetic map of the N. langsdorffii genome. Allelic scores for five classes of DNA markers were determined for 90 progeny of a “modified backcross” involving two N. langsdorffii inbred lines and the related species N. forgetiana. All markers had an expected segregation ratio of 1:1, and were scored in a common format. The map was constructed with JoinMap 3.0, and loci showing excessive transmission distortion were removed. The linkage map consists of 266 molecular marker loci defined by 217 amplified fragment length polymorphisms (AFLPs), 26 simple-sequence repeats (SSRs), 10 conserved orthologous sequence markers, nine inter-simple sequence repeat markers, and four target region amplification polymorphism markers arranged in 12 linkage groups with a combined length of 1062 cM. NlRPT is located on linkage group three, flanked by four AFLP markers and one SSR. Regions of skewed segregation were detected on LGs 1, 5, and 9. Markers developed for N. langsdorffii are potentially useful genetic tools for other species in Nicotiana section Alatae, as well as in N. benthamiana. We also investigated whether AFLPs could be used to infer genetic relationships within N. langsdorffii and related species from section Alatae. A phenetic analysis of the AFLP data showed that there are two main lineages within N. langsdorffii, and that both contain populations expressing dominant resistance to P. tabacina. PMID:22936937
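Checking a marker against the expected 1:1 segregation ratio is typically done with a chi-square goodness-of-fit test with one degree of freedom. The sketch below shows that calculation using only the standard library. The example counts are invented, and no cutoff is implied: the abstract does not say what threshold was used to define "excessive" distortion.

```python
from math import erfc, sqrt

def segregation_test(n_a, n_b):
    """Chi-square test of observed genotype counts against a 1:1 ratio.

    Returns (chi2, p_value). With one degree of freedom the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    expected = (n_a + n_b) / 2
    chi2 = (n_a - expected) ** 2 / expected + (n_b - expected) ** 2 / expected
    return chi2, erfc(sqrt(chi2 / 2))

# Invented counts for two markers scored in 90 progeny.
balanced = segregation_test(45, 45)
skewed = segregation_test(60, 30)
print("45:45 -> chi2 = %.2f, p = %.4f" % balanced)
print("60:30 -> chi2 = %.2f, p = %.4f" % skewed)
```

A marker such as the second one, whose counts depart strongly from 1:1, is the kind of locus that would be flagged as distorted before map construction.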
A deeper view into the significance of simple sequence repeats in pre-miRNAs provides clues for its possible roles in determining the function of microRNAs.

PubMed

Joy, Nisha; Maimoonath Beevi, Y P; Soniya, E V

2018-05-09

The central tenet of 'genome content' has been that the 'non-coding' parts are highly enriched with 'microsatellites' or 'Simple Sequence Repeats' (SSRs). We presume that the presence of SSRs in different genomic locations, and changes in their number of repeat units (n), may or may not be beneficial, depending on the position of the SSRs in a gene. Very few studies have looked into the existence of SSRs in the hairpin precursors of miRNAs (pre-miRNAs). The interplay between SSRs and miRNAs is not yet clearly understood. Considering the potential significance of SSRs in pre-miRNAs, we analysed the miRNA hairpin precursors of 171 organisms, which revealed a noticeable (29.8%) presence of SSRs in their pre-miRNAs. The maintenance of SSRs in pre-miRNAs even in complex, highly evolved phyla such as Chordata and Magnoliophyta sheds light on their diverse functions. Putative effects of SSRs in regulating either the biogenesis or the function of miRNAs were further underlined by computational and experimental analysis. A preliminary computational analysis to explore the relevance of such SSRs maintained in pre-miRNA sequences led to the detection of splicing regulatory elements (SREs) either in or near the SSRs. The absence of SSRs correspondingly decreased the detection of SREs. The present study provides the first indication that SSRs may be involved in shaping the SREs so that alternative splicing events produce miRNA isoforms in accordance with different stress environments. This work demonstrates the importance of studying such consistently maintained SSRs residing in pre-miRNAs and should encourage further research towards deciphering the exact function of SSRs.
Statistical physics of nucleosome positioning and chromatin structure

NASA Astrophysics Data System (ADS)

Morozov, Alexandre

2012-02-01

Genomic DNA is packaged into chromatin in eukaryotic cells. The fundamental building block of chromatin is the nucleosome, a 147 bp-long DNA molecule wrapped around the surface of a histone octamer. Arrays of nucleosomes are positioned along DNA according to their sequence preferences and folded into higher-order chromatin fibers whose structure is poorly understood. We have developed a framework for predicting sequence-specific histone-DNA interactions and the effective two-body potential responsible for ordering nucleosomes into regular higher-order structures. Our approach is based on the analogy between nucleosomal arrays and a one-dimensional fluid of finite-size particles with nearest-neighbor interactions. We derive simple rules which allow us to predict nucleosome occupancy solely from the dinucleotide content of the underlying DNA sequences. Dinucleotide content determines the degree of stiffness of the DNA polymer and thus defines its ability to bend into the nucleosomal superhelix. As expected, the nucleosome positioning rules are universal for chromatin assembled in vitro on genomic DNA from baker's yeast and from the nematode worm C. elegans, where nucleosome placement follows intrinsic sequence preferences and steric exclusion. However, the positioning rules inferred from in vivo C. elegans chromatin are affected by global nucleosome depletion from chromosome arms relative to central domains, likely caused by the attachment of the chromosome arms to the nuclear membrane. Furthermore, intrinsic nucleosome positioning rules are overwritten in transcribed regions, indicating that chromatin organization is actively managed by the transcriptional and splicing machinery.
Robust Data Detection for the Photon-Counting Free-Space Optical System With Implicit CSI Acquisition and Background Radiation Compensation

NASA Astrophysics Data System (ADS)

Song, Tianyu; Kam, Pooi-Yuen

2016-02-01

Since atmospheric turbulence and pointing errors cause signal intensity fluctuations and the background radiation surrounding the free-space optical (FSO) receiver contributes an undesired noisy component, the receiver requires accurate channel state information (CSI) and background information to adjust the detection threshold. Most previous studies employed pilot symbols for CSI acquisition, which reduces spectral and energy efficiency, and made the impractical assumption that the background radiation component is perfectly known. In this paper, we develop an efficient and robust sequence receiver, which acquires the CSI and the background information implicitly and requires no knowledge of the channel model. It is robust since it can automatically estimate the CSI and background component and detect the data sequence accordingly. Its decision metric has a simple form and involves no integrals, and thus can be easily evaluated. A Viterbi-type trellis-search algorithm is adopted to improve the search efficiency, and a selective-store strategy is adopted to overcome a potential error floor problem as well as to increase the memory efficiency. To further simplify the receiver, a decision-feedback symbol-by-symbol receiver is proposed as an approximation of the sequence receiver. By simulations and theoretical analysis, we show that the performance of both the sequence receiver and the symbol-by-symbol receiver approaches that of detection with perfect knowledge of the CSI and background radiation as the length of the window for forming the decision metric increases.
Comparative transcriptome sequencing and de novo analysis of Vaccinium corymbosum during fruit and color development.

PubMed

Li, Lingli; Zhang, Hehua; Liu, Zhongshuai; Cui, Xiaoyue; Zhang, Tong; Li, Yanfang; Zhang, Lingyun

2016-10-12

Blueberry is an economically important fruit crop in the Ericaceae family. The substantial quantities of flavonoids in blueberry have been implicated in a broad range of health benefits. However, transcriptome-level information on fruit development and flavonoid metabolites is still limited. In the present study, transcriptome and gene expression profiling over berry development, especially during color development, was initiated. A total of approximately 13.67 Gbp of data were obtained and assembled into 186,962 transcripts and 80,836 unigenes from three stages of blueberry fruit and color development. A large number of simple sequence repeats (SSRs) and candidate genes, which are potentially involved in plant development, metabolic and hormone pathways, were identified. A total of 6429 sequences containing 8796 SSRs were characterized from 15,457 unigenes, and 1763 unigenes contained more than one SSR. The expression profiles of key genes involved in anthocyanin biosynthesis were also studied. In addition, a comparison between our dataset and other published results was carried out. The high quality reads produced in this study are an important advancement and provide a new resource for the interpretation of high-throughput data for blueberry species, whether regarding sequencing data depth or species extension. This transcriptome data will serve as a valuable public information database for studies of the blueberry genome and would greatly boost research on fruit and color development, flavonoid metabolism and regulation, and the breeding of more healthful blueberries.
Development and characterization of novel EST-SSR markers and their application for genetic diversity analysis of Jerusalem artichoke (Helianthus tuberosus L.).

PubMed

Mornkham, T; Wangsomnuk, P P; Mo, X C; Francisco, F O; Gao, L Z; Kurzweil, H

2016-10-24

Jerusalem artichoke (Helianthus tuberosus L.) is a perennial tuberous plant and a traditional inulin-rich crop in Thailand. It has become the most important source of inulin and has great potential for use in chemical and food industries. In this study, expressed sequence tag (EST)-based simple sequence repeat (SSR) markers were developed from 40,362 Jerusalem artichoke ESTs retrieved from the NCBI database. Among 23,691 non-redundant identified ESTs, 1949 SSR motifs harboring 2 to 6 nucleotides with varied repeat motifs were discovered from 1676 assembled sequences. Seventy-nine primer pairs were generated from EST sequences harboring SSR motifs. Our results show that 43 primers are polymorphic for the six studied populations, while the remaining 36 were either monomorphic or failed to amplify. These 43 SSR loci exhibited a high level of genetic diversity among populations, with allele numbers varying from 2 to 7, with an average of 3.95 alleles per locus. Heterozygosity ranged from 0.096 to 0.774, with an average of 0.536; polymorphism information content (PIC) ranged from 0.096 to 0.854, with an average of 0.568. Principal component analysis and neighbor-joining analysis revealed that the six populations could be divided into six clusters. Our results indicate that these newly characterized EST-SSR markers may be useful in the exploration of genetic diversity and range expansion of the Jerusalem artichoke, and in cross-species application for the genus Helianthus.
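The per-locus diversity statistics reported above (heterozygosity and PIC) are usually computed from allele frequencies. The sketch below uses the standard formulas for expected heterozygosity, He = 1 - Σ p_i², and for PIC as defined by Botstein et al., PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The abstract does not say whether observed or expected heterozygosity was reported, so treating it as expected heterozygosity is an assumption, and the allele frequencies are invented.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)

def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    n = len(freqs)
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(n) for j in range(i + 1, n))
    return expected_heterozygosity(freqs) - cross

# Invented allele frequencies for a locus with three alleles.
freqs = [0.5, 0.3, 0.2]
he = expected_heterozygosity(freqs)
pic = polymorphism_information_content(freqs)
print(f"He = {he:.4f}, PIC = {pic:.4f}")
```

PIC is never larger than He for the same locus, because the extra term subtracted from He is non-negative.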
The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae

PubMed Central

Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

2016-01-01

Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes contained fewer potentially functional repeats, especially in Cupressaceae, compared with Pinaceae. These results suggest that the dynamics of cp genome rearrangement in conifers differed because the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965

Development of EST-SSR markers for Taxillus nigrans (Loranthaceae) in southwestern China using next-generation sequencing

PubMed Central

Miao, Ning; Zhang, Lei; Li, Maoping; Fan, Liqiang; Mao, Kangshan

2017-01-01

Premise of the study: We developed transcriptome microsatellite markers (simple sequence repeats) for Taxillus nigrans (Loranthaceae) to survey the genetic diversity and population structure of this species. Methods and Results: We used Illumina HiSeq data to reconstruct the transcriptome of T. nigrans by de novo assembly and used the transcriptome to develop a set of simple sequence repeat markers. Overall, 40 primer pairs were designed and tested; 19 of them amplified successfully and demonstrated polymorphisms. Two loci that detected null alleles were eliminated, and the remaining 17, which were subjected to further analyses, yielded two to 21 alleles per locus. Conclusions: The markers will serve as a basis for studies to assess the extent and pattern of distribution of genetic variation in T. nigrans, and they may also be useful in conservation genetic, ecological, and evolutionary studies of the genus Taxillus, a group of plant species of importance in Chinese traditional medicine. PMID:28924510
Inverted-U Function Relating Cortical Plasticity and Task Difficulty

PubMed Central

Engineer, Navzer D.; Engineer, Crystal T.; Reed, Amanda C.; Pandya, Pritesh K.; Jakkamsetti, Vikram; Moucha, Raluca; Kilgard, Michael P.

2012-01-01

Many psychological and physiological studies with simple stimuli have suggested that perceptual learning specifically enhances the response of primary sensory cortex to task-relevant stimuli. The aim of this study was to determine whether auditory discrimination training on complex tasks enhances primary auditory cortex responses to a target sequence relative to non-target and novel sequences. We collected responses from more than 2,000 sites in 31 rats trained on one of six discrimination tasks that differed primarily in the similarity of the target and distractor sequences. Unlike training with simple stimuli, long-term training with complex stimuli did not generate target specific enhancement in any of the groups. Instead, cortical receptive field size decreased, latency decreased, and paired pulse depression decreased in rats trained on the tasks of intermediate difficulty while tasks that were too easy or too difficult either did not alter or degraded cortical responses. These results suggest an inverted-U function relating neural plasticity and task difficulty. PMID:22249158
Investigation of microsatellite instability in Turkish breast cancer patients.

PubMed

Demokan, Semra; Muslumanoglu, Mahmut; Yazici, H; Igci, Abdullah; Dalay, Nejat

2002-01-01

Multiple somatic and inherited genetic changes that lead to loss of growth control may contribute to the development of breast cancer. Microsatellites are tandem repeats of simple sequences that occur abundantly and at random throughout most eucaryotic genomes. Microsatellite instability (MI), characterized by the presence of random contractions or expansions in the length of simple sequence repeats or microsatellites, is observed in a variety of tumors. The aim of this study was to compare tumor DNA fingerprints with constitutional DNA fingerprints to investigate changes specific to breast cancer and evaluate its correlation with clinical characteristics. Tumor and normal tissue samples of 38 patients with breast cancer were investigated by comparing PCR-amplified microsatellite sequences D2S443 and D21S1436. Microsatellite instability at D21S1436 and D2S443 was found in 5 (13%) and 7 (18%) patients, respectively. Two patients displayed instability at both marker loci. No association was found between MI and age, family history, lymph node involvement and other clinical parameters.
Simple diazonium chemistry to develop specific gene sensing platforms.

PubMed

Revenga-Parra, M; García-Mendiola, T; González-Costas, J; González-Romero, E; Marín, A García; Pau, J L; Pariente, F; Lorenzo, E

2014-02-27

A simple strategy for covalently immobilizing DNA sequences, based on the formation of stable diazonized conducting platforms, is described. The electrochemical reduction of 4-nitrobenzenediazonium salt onto screen-printed carbon electrodes (SPCE) in aqueous media gives rise to terminal grafted amino groups. The presence of primary aromatic amines allows the formation of diazonium cations capable of reacting with the amines present at the DNA capture probe. As a comparison, a second strategy based on the binding of aminated DNA capture probes to the developed diazonized conducting platforms through a crosslinking agent was also employed. The resulting DNA sensing platforms were characterized by cyclic voltammetry, electrochemical impedance spectroscopy and spectroscopic ellipsometry. The hybridization event with the complementary sequence was detected using hexaammineruthenium(III) chloride as electrochemical indicator. Finally, they were applied to the analysis of a 145-bp sequence from the human gene MRP3, reaching a detection limit of 210 pg μL⁻¹.
WebSat--a web software for microsatellite marker development.

PubMed

Martins, Wellington Santos; Lucas, Divino César Soares; Neves, Kelligton Fabricio de Souza; Bertioli, David John

2009-01-01

Simple sequence repeats (SSR), also known as microsatellites, have been extensively used as molecular markers due to their abundance and high degree of polymorphism. We have developed a simple to use web software, called WebSat, for microsatellite molecular marker prediction and development. WebSat is accessible through the Internet, requiring no program installation. Although a web solution, it makes use of Ajax techniques, providing a rich, responsive user interface. WebSat allows the submission of sequences, visualization of microsatellites and the design of primers suitable for their amplification. The program allows full control of parameters and the easy export of the resulting data, thus facilitating the development of microsatellite markers. The web tool may be accessed at http://purl.oclc.org/NET/websat/
Plant genome and transcriptome annotations: from misconceptions to simple solutions

PubMed Central

Bolger, Marie E; Arsova, Borjana; Usadel, Björn

2018-01-01

Next-generation sequencing has triggered an explosion of available genomic and transcriptomic resources in the plant sciences. Although genome and transcriptome sequencing has become orders of magnitude cheaper and more efficient, the functional annotation process often lags behind. This might be hampered by the lack of a comprehensive enumeration of simple-to-use tools available to the plant researcher. In this comprehensive review, we present (i) typical ontologies to be used in the plant sciences, (ii) useful databases and resources used for functional annotation, (iii) what to expect from an annotated plant genome, (iv) an automated annotation pipeline and (v) a recipe and reference chart outlining typical steps used to annotate plant genomes/transcriptomes using publicly available resources. PMID:28062412
GATA simple sequence repeats function as enhancer blocker boundaries.

PubMed

Kumar, Ram P; Krishnan, Jaya; Pratap Singh, Narendra; Singh, Lalji; Mishra, Rakesh K

2013-01-01

Simple sequence repeats (SSRs) account for ~3% of the human genome, but their functional significance still remains unclear. One of the prominent SSRs, the GATA tetranucleotide repeat, has preferentially accumulated in complex organisms. GATA repeats are particularly enriched on the human Y chromosome, and their non-random distribution and exclusive association with genes expressed during early development indicate their role in coordinated gene regulation. Here we show that GATA repeats have enhancer blocker activity in Drosophila and human cells. This enhancer blocker activity is seen in the transgenic as well as the native context of the enhancers at various developmental stages. These findings ascribe functional significance to SSRs and offer an explanation as to why SSRs, especially GATA, may have accumulated in complex organisms.
Assessment of clinical analytical sensitivity and specificity of next-generation sequencing for detection of simple and complex mutations.

PubMed

Chin, Ephrem L H; da Silva, Cristina; Hegde, Madhuri

2013-02-19

Detecting mutations in disease genes by full gene sequence analysis is common in clinical diagnostic laboratories. Sanger dideoxy terminator sequencing allows for rapid development and implementation of sequencing assays in the clinical laboratory, but it has limited throughput and, due to cost constraints, only allows analysis of one or at most a few genes in a patient. Next-generation sequencing (NGS), on the other hand, has evolved rapidly, although to date it has mainly been used for large-scale genome sequencing projects and is only beginning to be used in clinical diagnostic testing. One advantage of NGS is that many genes can be analyzed easily at the same time, allowing for mutation detection when there are many possible causative genes for a specific phenotype. In addition, mutations in regions of a gene not typically tested, such as deep intronic and promoter regions, can also be detected. Here we use 20 previously characterized Sanger-sequenced positive controls in disease-causing genes to demonstrate the utility of NGS in a clinical setting, using standard PCR-based amplification to assess the analytical sensitivity and specificity of the technology for detecting all previously characterized changes (mutations and benign SNPs). The positive controls chosen for validation range from simple substitution mutations to complex deletion and insertion mutations occurring in autosomal dominant and recessive disorders. The NGS data was 100% concordant with the Sanger sequencing data, identifying all 119 previously identified changes in the 20 samples. We have demonstrated that NGS technology is ready to be deployed in clinical laboratories. However, NGS and associated technologies are evolving, and clinical laboratories will need to invest significantly in staff and infrastructure to build the necessary foundation for success.
Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller.

PubMed

Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun

2017-01-03

Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
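The idea behind molecular barcoding can be shown with a toy example: reads that share a barcode derive from the same original DNA molecule, so a base seen in only a minority of those reads is more likely an amplification or sequencing error than a true variant. The sketch below uses a simple majority-vote consensus per barcode at a single position. It is explicitly not smCounter's Bayesian model, which the abstract does not detail; the function names, thresholds and reads are all invented for illustration.

```python
from collections import Counter, defaultdict

def barcode_consensus(reads, min_reads=2, min_agreement=0.8):
    """Collapse reads at one genomic position into one base call per barcode.

    reads: iterable of (barcode, base) pairs.
    A barcode is kept only if it has at least min_reads reads and its most
    common base reaches min_agreement. Majority voting is a simplification
    standing in for a proper statistical model.
    """
    by_barcode = defaultdict(list)
    for barcode, base in reads:
        by_barcode[barcode].append(base)
    consensus = {}
    for barcode, bases in by_barcode.items():
        if len(bases) < min_reads:
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus

def allele_fraction(consensus, alt):
    """Fraction of barcoded molecules whose consensus base is the alt allele."""
    if not consensus:
        return 0.0
    return sum(1 for base in consensus.values() if base == alt) / len(consensus)

# Invented reads: reference base A, candidate variant T.
reads = (
    [("bc1", "A")] * 3                       # clean reference molecule
    + [("bc2", "A"), ("bc2", "A"), ("bc2", "T")]  # inconsistent, dropped
    + [("bc3", "T")] * 2                     # clean variant molecule
    + [("bc4", "A")]                         # single read, dropped
    + [("bc5", "A")] * 5 + [("bc5", "T")]    # one erroneous read is outvoted
)
consensus = barcode_consensus(reads)
fraction = allele_fraction(consensus, "T")
print(consensus, round(fraction, 3))
```

Counting molecules rather than raw reads is what keeps an isolated erroneous read, like the stray T in the last barcode, from inflating the apparent allele fraction.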
Consistency of VDJ Rearrangement and Substitution Parameters Enables Accurate B Cell Receptor Sequence Annotation.

PubMed

Ralph, Duncan K; Matsen, Frederick A

2016-01-01

VDJ rearrangement and somatic hypermutation work together to produce antibody-coding B cell receptor (BCR) sequences for a remarkable diversity of antigens. It is now possible to sequence these BCRs in high throughput; analysis of these sequences is bringing new insight into how antibodies develop, in particular for broadly-neutralizing antibodies against HIV and influenza. A fundamental step in such sequence analysis is to annotate each base as coming from a specific one of the V, D, or J genes, or from an N-addition (a.k.a. non-templated insertion). Previous work has used simple parametric distributions to model transitions from state to state in a hidden Markov model (HMM) of VDJ recombination, and assumed that mutations occur via the same process across sites. However, codon frame and other effects have been observed to violate these parametric assumptions for such coding sequences, suggesting that a non-parametric approach to modeling the recombination process could be useful. In our paper, we find that indeed large modern data sets suggest a model using parameter-rich per-allele categorical distributions for HMM transition probabilities and per-allele-per-position mutation probabilities, and that using such a model for inference leads to significantly improved results. We present an accurate and efficient BCR sequence annotation software package using a novel HMM "factorization" strategy. This package, called partis (https://github.com/psathyrella/partis/), is built on a new general-purpose HMM compiler that can perform efficient inference given a simple text description of an HMM.
Changes in spinal reflex excitability associated with motor sequence learning.

PubMed

Lungu, Ovidiu; Frigon, Alain; Piché, Mathieu; Rainville, Pierre; Rossignol, Serge; Doyon, Julien

2010-05-01

There is ample evidence that motor sequence learning is mediated by changes in brain activity. Yet the question of whether this form of learning elicits changes detectable at the spinal cord level has not been addressed. To date, studies in humans have revealed that spinal reflex activity may be altered during the acquisition of various motor skills, but a link between motor sequence learning and changes in spinal excitability has not been demonstrated. To address this issue, we studied the modulation of H-reflex amplitude evoked in the flexor carpi radialis muscle of 14 healthy individuals between blocks of movements that involved the implicit acquisition of a sequence versus other movements that did not require learning. Each participant performed the task in three conditions: "sequence" (externally triggered, repeating and sequential movements), "random" (similar movements, but performed in an arbitrary order), and "simple" (alternating movements in a left-right or up-down direction only). When controlling for background muscular activity, H-reflex amplitude was significantly more reduced in the sequence (43.8 +/- 1.47%, mean +/- SE) compared with the random (38.2 +/- 1.60%) and simple (31.5 +/- 1.82%) conditions, while the M-response was not different across conditions. Furthermore, H-reflex changes were observed from the beginning of the learning process up to when subjects reached asymptotic performance on the motor task. Changes also persisted for >60 s after motor activity ceased. Such findings suggest that the excitability in some spinal reflex circuits is altered during the implicit learning process of a new motor sequence.
Draft Genome Sequence of the Cellulolytic Bacterium Clostridium papyrosolvens C7 (ATCC 700395).

PubMed

Zepeda, Veronica; Dassa, Bareket; Borovok, Ilya; Lamed, Raphael; Bayer, Edward A; Cate, Jamie H D

2013-09-12

We report the draft genome sequence of the cellulose-degrading bacterium Clostridium papyrosolvens C7, originally isolated from mud collected below a freshwater pond in Massachusetts. This Gram-positive bacterium grows in a mesophilic anaerobic environment with filter paper as the only carbon source, and it has a simple cellulosome system with multiple carbohydrate-degrading enzymes.
Draft Genome Sequence of the Cellulolytic Bacterium Clostridium papyrosolvens C7 (ATCC 700395)

PubMed Central

Zepeda, Veronica; Dassa, Bareket; Borovok, Ilya; Lamed, Raphael; Bayer, Edward A.

2013-01-01

We report the draft genome sequence of the cellulose-degrading bacterium Clostridium papyrosolvens C7, originally isolated from mud collected below a freshwater pond in Massachusetts. This Gram-positive bacterium grows in a mesophilic anaerobic environment with filter paper as the only carbon source, and it has a simple cellulosome system with multiple carbohydrate-degrading enzymes. PMID:24029755
Increasing Classroom Compliance: Using a High-Probability Command Sequence with Noncompliant Students

ERIC Educational Resources Information Center

Axelrod, Michael I.; Zank, Amber J.

2012-01-01

Noncompliance is one of the most problematic behaviors within the school setting. One strategy to increase compliance of noncompliant students is a high-probability command sequence (HPCS; i.e., a set of simple commands in which an individual is likely to comply immediately prior to the delivery of a command that has a lower probability of…
in silico Whole Genome Sequencer & Analyzer (iWGS): a computational pipeline to guide the design and analysis of de novo genome sequencing studies

USDA-ARS's Scientific Manuscript database

The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding it...
Simple sequence repeats in Escherichia coli: abundance, distribution, composition, and polymorphism.

PubMed

Gur-Arie, R; Cohen, C J; Eitan, Y; Shelef, L; Hallerman, E M; Kashi, Y

2000-01-01

Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.
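The kind of computer-based screening described above can be illustrated with a minimal Python sketch that scans a DNA string for tandem repeats with motifs of 1 to 6 nucleotides. The function name `find_ssrs` and the minimum copy-number thresholds are assumptions chosen for illustration; the abstract does not state the criteria used in the study.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for tandem repeats, motif length 1-6."""
    if min_repeats is None:
        # Hypothetical thresholds: minimum copy number for each motif length.
        min_repeats = {1: 6, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-nt motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence with one mononucleotide and one dinucleotide tract.
toy = "GGC" + "A" * 7 + "CGC" + "AT" * 5 + "GCC"
print(find_ssrs(toy))
```

Positions are 0-based. A real screen would also record whether each tract falls in an ORF or a noncoding region, which requires genome annotation not modelled here.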
Improved Spin-Echo-Edited NMR Diffusion Measurements

NASA Astrophysics Data System (ADS)

Otto, William H.; Larive, Cynthia K.

2001-12-01

The need for simple and robust schemes for the analysis of ligand-protein binding has resulted in the development of diffusion-based NMR techniques that can be used to assay binding in protein solutions containing a mixture of several ligands. As a means of gaining spectral selectivity in NMR diffusion measurements, a simple experiment, the gradient modified spin-echo (GOSE), has been developed to reject the resonances of coupled spins and detect only the singlets in the 1H NMR spectrum. This is accomplished by first using a spin echo to null the resonances of the coupled spins. Following the spin echo, the singlet magnetization is flipped out of the transverse plane and a dephasing gradient is applied to reduce the spectral artifacts resulting from incomplete cancellation of the J-coupled resonances. The resulting modular sequence is combined here with the BPPSTE pulse sequence; however, it could be easily incorporated into any pulse sequence where additional spectral selectivity is desired. Results obtained with the GOSE-BPPSTE pulse sequence are compared with those obtained with the BPPSTE and CPMG-BPPSTE experiments for a mixture containing the ligands resorcinol and tryptophan in a solution of human serum albumin.
Towards the development of multifunctional molecular indicators combining soil biogeochemical and microbiological variables to predict the ecological integrity of silvicultural practices.

PubMed

Peck, Vincent; Quiza, Liliana; Buffet, Jean-Philippe; Khdhiri, Mondher; Durand, Audrey-Anne; Paquette, Alain; Thiffault, Nelson; Messier, Christian; Beaulieu, Nadyre; Guertin, Claude; Constant, Philippe

2016-05-01

The impact of mechanical site preparation (MSP) on soil biogeochemical structure in young larch plantations was investigated. Soil samples were collected in replicated plots comprising simple trenching, double trenching, mounding and inverting site preparation. Unlogged natural mixed forest areas were used as a reference. Analysis of soil nutrients, abundance of bacteria and gas exchanges unveiled no significant difference among the plots. However, inverting site preparation resulted in higher variations of gas exchanges when compared with trenching, mounding and unlogged natural forest. A combination of the biological and physicochemical variables was used to define a multifunctional classification of the soil samples into four distinct groups categorized as a function of their deviation from baseline ecological conditions. According to this classification model, simple trenching was the approach that represented the lowest ecological risk potential at the microsite level. No relationship was observed between MSP method and soil bacterial community structure as assessed by high-throughput sequencing of bacterial 16S rRNA gene; however, indicator genotypes were identified for each multifunctional soil class. This is the first identification of multifunctional molecular indicators for baseline and disturbed ecological conditions in soil, demonstrating the potential of applied microbial ecology to guide silvicultural practices and ecological risk assessment. © 2016 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
DNA nanotechnology: new adventures for an old warhorse.

PubMed

Zakeri, Bijan; Lu, Timothy K

2015-10-01

As the blueprint of life, the natural exploits of DNA are admirable. However, DNA should not only be viewed within a biological context. It is an elegantly simple yet functionally complex chemical polymer with properties that make it an ideal platform for engineering new nanotechnologies. Rapidly advancing synthesis and sequencing technologies are enabling novel unnatural applications for DNA beyond the realm of genetics. Here we explore the chemical biology of DNA nanotechnology for emerging applications in communication and digital data storage. Early studies of DNA as an alternative to magnetic and optical storage mediums have not only been promising, but have demonstrated the potential of DNA to revolutionize the way we interact with digital data in the future. Copyright © 2015 Elsevier Ltd. All rights reserved.
The Role of Nonlinear Gradients in Parallel Imaging: A k-Space Based Analysis.

PubMed

Galiana, Gigi; Stockmann, Jason P; Tam, Leo; Peters, Dana; Tagare, Hemant; Constable, R Todd

2012-09-01

Sequences that encode the spatial information of an object using nonlinear gradient fields are a new frontier in MRI, with potential to provide lower peripheral nerve stimulation, windowed fields of view, tailored spatially-varying resolution, curved slices that mirror physiological geometry, and, most importantly, very fast parallel imaging with multichannel coils. The acceleration for multichannel images is generally explained by the fact that curvilinear gradient isocontours better complement the azimuthal spatial encoding provided by typical receiver arrays. However, the details of this complementarity have been more difficult to specify. We present a simple and intuitive framework for describing the mechanics of image formation with nonlinear gradients, and we use this framework to review some of the main classes of nonlinear encoding schemes.

DNA nanotechnology: new adventures for an old warhorse

PubMed Central

Zakeri, Bijan; Lu, Timothy K.

2016-01-01

As the blueprint of life, the natural exploits of DNA are admirable. However, DNA should not only be viewed within a biological context. It is an elegantly simple yet functionally complex chemical polymer with properties that make it an ideal platform for engineering new nanotechnologies. Rapidly advancing synthesis and sequencing technologies are enabling novel unnatural applications for DNA beyond the realm of genetics. Here we explore the chemical biology of DNA nanotechnology for emerging applications in communication and digital data storage. Early studies of DNA as an alternative to magnetic and optical storage mediums have not only been promising, but have demonstrated the potential of DNA to revolutionize the way we interact with digital data in the future. PMID:26056949
A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

PubMed Central

Lewers, Kim S; Saski, Chris A; Cuthbertson, Brandon J; Henry, David C; Staton, Meg E; Main, Dorrie S; Dhanaraj, Anik L; Rowland, Lisa J; Tomkins, Jeff P

2008-01-01

Background: The recent development of novel repeat-fruiting types of blackberry (Rubus L.) cultivars, combined with a long history of morphological marker-assisted selection for thornlessness by blackberry breeders, has given rise to increased interest in using molecular markers to facilitate blackberry breeding. Yet no genetic maps, molecular markers, or even sequences exist specifically for cultivated blackberry. The purpose of this study is to begin development of these tools by generating and annotating the first blackberry expressed sequence tag (EST) library, designing primers from the ESTs to amplify regions containing simple sequence repeats (SSR), and testing the usefulness of a subset of the EST-SSRs with two blackberry cultivars. Results: A cDNA library of 18,432 clones was generated from expanding leaf tissue of the cultivar Merton Thornless, a progenitor of many thornless commercial cultivars. Among the most abundantly expressed of the 3,000 genes annotated were those involved with energy, cell structure, and defense. From individual sequences containing SSRs, 673 primer pairs were designed. Of a randomly chosen set of 33 primer pairs tested with two blackberry cultivars, 10 detected an average of 1.9 polymorphic PCR products. Conclusion: This rate predicts that this library may yield as many as 940 SSR primer pairs detecting 1,786 polymorphisms. This may be sufficient to generate a genetic map that can be used to associate molecular markers with phenotypic traits, making possible molecular marker-assisted breeding to complement existing morphological marker-assisted breeding in blackberry. PMID:18570660
On the normalization of the minimum free energy of RNAs by sequence length.

PubMed

Trotta, Edoardo

2014-01-01

The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size.
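One way to see how a per-nucleotide index can become length dependent is the following Python sketch. It assumes, purely for illustration, that MFE is roughly affine in length, MFE(L) = a*L + b with a nonzero intercept b, so that MFE/L = a + b/L, a hyperbolic function of L. The function `toy_mfe` and its coefficients are made up; the corrected formula proposed in the paper is not given in the abstract and is not reproduced here.

```python
def mfe_per_nt(mfe, length):
    """Simple index: MFE divided by the number of nucleotides."""
    if length <= 0:
        raise ValueError("length must be positive")
    return mfe / length


def toy_mfe(length, slope=-0.3, intercept=8.0):
    """Hypothetical affine model of MFE (kcal/mol) versus length; coefficients are invented."""
    return slope * length + intercept


# The normalized value drifts with length even though the slope is constant.
for n in (40, 100, 400, 1600):
    print(n, round(mfe_per_nt(toy_mfe(n), n), 4))
```

Under this toy model the index only approaches the slope for long sequences, so short and long RNAs are not directly comparable by MFE/L alone.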
On the Normalization of the Minimum Free Energy of RNAs by Sequence Length

PubMed Central

Trotta, Edoardo

2014-01-01

The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size. PMID:25405875
Forensic Loci Allele Database (FLAD): Automatically generated, permanent identifiers for sequenced forensic alleles.

PubMed

Van Neste, Christophe; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

2016-01-01

It is difficult to predict if and when massively parallel sequencing of forensic STR loci will replace capillary electrophoresis as the new standard technology in forensic genetics. The main benefits of sequencing are increased multiplexing scales and SNP detection. There is not yet a consensus on how sequenced profiles should be reported. We present the Forensic Loci Allele Database (FLAD) service, made freely available at http://forensic.ugent.be/FLAD/. It offers permanent identifiers for sequenced forensic alleles (STR or SNP) and their microvariants for use in forensic allele nomenclature. Analogous to GenBank, its aim is to provide permanent identifiers for forensically relevant allele sequences. Researchers who are developing forensic sequencing kits or performing population studies can register on http://forensic.ugent.be/FLAD/ and add loci and allele sequences with a short and simple application interface (API). Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Mining and Development of Novel SSR Markers Using Next Generation Sequencing (NGS) Data in Plants.

PubMed

Taheri, Sima; Lee Abdullah, Thohirah; Yusop, Mohd Rafii; Hanafi, Mohamed Musa; Sahebi, Mahbod; Azizi, Parisa; Shamshiri, Redmond Ramin

2018-02-13

Microsatellites, or simple sequence repeats (SSRs), are one of the most informative and multi-purpose genetic markers exploited in plant functional genomics. However, the discovery of SSRs and development using traditional methods are laborious, time-consuming, and costly. Recently, the availability of high-throughput sequencing technologies has enabled researchers to identify a substantial number of microsatellites at less cost and effort than traditional approaches. Illumina is a noteworthy transcriptome sequencing technology that is currently used in SSR marker development. Although 454 pyrosequencing datasets can be used for SSR development, this type of sequencing is no longer supported. This review aims to present an overview of the next generation sequencing, with a focus on the efficient use of de novo transcriptome sequencing (RNA-Seq) and related tools for mining and development of microsatellites in plants.
SimulaTE: simulating complex landscapes of transposable elements of populations.

PubMed

Kofler, Robert

2018-04-15

Estimating the abundance of transposable elements (TEs) in populations (or tissues) promises to answer many open research questions. However, progress is hampered by the lack of concordance between different approaches for TE identification and thus potentially unreliable results. To address this problem, we developed SimulaTE, a tool that generates TE landscapes for populations using a newly developed domain-specific language (DSL). The simple syntax of our DSL allows for easily building even complex TE landscapes that have, for example, nested, truncated and highly diverged TE insertions. Reads may be simulated for the populations using different sequencing technologies (PacBio, Illumina paired-ends) and strategies (sequencing individuals and pooled populations). The comparison between the expected (i.e. simulated) and the observed results will guide researchers in finding the most suitable approach for a particular research question. SimulaTE is implemented in Python and available at https://sourceforge.net/projects/simulates/. Manual: https://sourceforge.net/p/simulates/wiki/Home/#manual; test data and tutorials: https://sourceforge.net/p/simulates/wiki/Home/#walkthrough; validation: https://sourceforge.net/p/simulates/wiki/Home/#validation. Contact: robert.kofler@vetmeduni.ac.at.
Dynamic control of nutrient-removal from industrial wastewater in a sequencing batch reactor, using common and low-cost online sensors.

PubMed

Dries, Jan

2016-01-01

On-line control of the biological treatment process is an innovative tool to cope with variable concentrations of chemical oxygen demand and nutrients in industrial wastewater. In the present study we implemented a simple dynamic control strategy for nutrient-removal in a sequencing batch reactor (SBR) treating variable tank truck cleaning wastewater. The control system was based on derived signals from two low-cost and robust sensors that are very common in activated sludge plants, i.e. oxidation reduction potential (ORP) and dissolved oxygen. The amount of wastewater fed during anoxic filling phases, and the number of filling phases in the SBR cycle, were determined by the appearance of the 'nitrate knee' in the profile of the ORP. The phase length of the subsequent aerobic phases was controlled by the oxygen uptake rate measured online in the reactor. As a result, the sludge loading rate (F/M ratio), the volume exchange rate and the SBR cycle length adapted dynamically to the activity of the activated sludge and the actual characteristics of the wastewater, without affecting the final effluent quality.
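As a rough illustration of how a 'nitrate knee' might be picked out of an ORP profile, the Python sketch below flags the first window in which ORP falls faster than a threshold rate. The function name, the sampling interval, the window length and the slope threshold are all hypothetical; the abstract does not describe the derived signals actually used in the study.

```python
def find_nitrate_knee(orp_mv, dt_min=1.0, slope_threshold=-5.0, window=3):
    """Return the start index of the first window whose mean ORP slope (mV/min)
    is steeper (more negative) than slope_threshold, or None if there is none."""
    if window < 1 or dt_min <= 0:
        raise ValueError("window must be >= 1 and dt_min must be positive")
    for i in range(window, len(orp_mv)):
        slope = (orp_mv[i] - orp_mv[i - window]) / (window * dt_min)
        if slope < slope_threshold:
            return i - window  # first sample of the steep segment
    return None


# Toy anoxic-phase profile: slow drift, then a sharp drop.
profile = [50, 48, 46, 44, 42, 40, 30, 15, 0, -15]
print(find_nitrate_knee(profile))
```

In a controller of the kind described, such a detection event would end the current anoxic filling phase; real sensor data would likely need smoothing before the slope test.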
Tracking flow of leukocytes in blood for drug analysis

NASA Astrophysics Data System (ADS)

Basharat, Arslan; Turner, Wesley; Stephens, Gillian; Badillo, Benjamin; Lumpkin, Rick; Andre, Patrick; Perera, Amitha

2011-03-01

Modern microscopy techniques allow imaging of circulating blood components under vascular flow conditions. The resulting video sequences provide unique insights into the behavior of blood cells within the vasculature and can be used as a method to monitor and quantitate the recruitment of inflammatory cells at sites of vascular injury/inflammation and potentially serve as a pharmacodynamic biomarker, helping screen new therapies and individualize dose and combinations of drugs. However, manual analysis of these video sequences is intractable, requiring hours per 400 second video clip. In this paper, we present an automated technique to analyze the behavior and recruitment of human leukocytes in whole blood under physiological conditions of shear through a simple multi-channel fluorescence microscope in real-time. This technique detects and tracks the recruitment of leukocytes to a bioactive surface coated on a flow chamber. Rolling cells (cells which partially bind to the bioactive matrix) are detected, counted, and have their velocity measured and graphed. The challenges here include: high cell density, appearance similarity, and low (1 Hz) frame rate. Our approach performs frame differencing based motion segmentation, track initialization and online tracking of individual leukocytes.
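The motion segmentation step can be sketched in plain Python as follows. This is a minimal illustration of frame differencing on small grayscale frames stored as nested lists, followed by connected-component centroids that could seed tracks. The function names and the intensity threshold are assumptions, and track initialization and online tracking are not covered.

```python
def motion_mask(prev, curr, threshold=20):
    """Binary mask: 1 where the absolute intensity change exceeds threshold."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(pr, cr)]
            for pr, cr in zip(prev, curr)]


def centroids(mask):
    """Centroid (row, col) of each 4-connected blob of 1s, found by flood fill."""
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = [[False] * cols for _ in range(rows)]
    out = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                stack, pixels = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                out.append((sum(p[0] for p in pixels) / len(pixels),
                            sum(p[1] for p in pixels) / len(pixels)))
    return out


# Two toy 4x4 frames: one two-pixel blob and one single pixel change.
prev = [[0] * 4 for _ in range(4)]
curr = [[0] * 4 for _ in range(4)]
curr[1][1] = curr[1][2] = 100
curr[3][3] = 50
print(centroids(motion_mask(prev, curr)))
```

With a low frame rate, a cell's old and new positions may both appear in the difference mask, which is one reason a separate tracking stage is needed.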
Complete genome sequence of Terriglobus saanensis type strain SP1PR4T, an Acidobacteria from tundra soil

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rawat, Suman R.; Mannisto, Minna; Starovoytov, Valentin

2012-01-01

Terriglobus saanensis SP1PR4T is a novel species of the genus Terriglobus. T. saanensis is of ecological interest because it is a representative of the phylum Acidobacteria, which are dominant members of bacterial soil microbiota in Arctic ecosystems. T. saanensis is a cold-adapted acidophile and a versatile heterotroph utilizing a suite of simple sugars and complex polysaccharides. The genome contained an abundance of genes assigned to metabolism and transport of carbohydrates including gene modules encoding for carbohydrate-active enzyme (CAZyme) family involved in breakdown, utilization and biosynthesis of diverse structural and storage polysaccharides. T. saanensis SP1PR4T represents the first member of genus Terriglobus with a completed genome sequence, consisting of a single replicon of 5,095,226 base pairs (bp), 54 RNA genes and 4,279 protein-coding genes. We infer that the physiology and metabolic potential of T. saanensis is adapted to allow for resilience to the nutrient-deficient conditions and fluctuating temperatures of Arctic tundra soils.
Rediscovering Medicinal Plants' Potential with OMICS: Microsatellite Survey in Expressed Sequence Tags of Eleven Traditional Plants with Potent Antidiabetic Properties

PubMed Central

Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar

2014-01-01

Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs has not been fully utilized in large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda brass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources.
The microsatellite markers reported here markedly add to the genetic stock for cross transferability in these plants and the literature on biomarkers and novel drug discovery for common chronic diseases such as diabetes. PMID:24802971
Rediscovering medicinal plants' potential with OMICS: microsatellite survey in expressed sequence tags of eleven traditional plants with potent antidiabetic properties.

PubMed

Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar; Talukdar, Anupam Das

2014-05-01

Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs has not been fully utilized in large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda brass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. 
The microsatellite markers reported here markedly add to the genetic stock for cross transferability in these plants and the literature on biomarkers and novel drug discovery for common chronic diseases such as diabetes.
SEXCMD: Development and validation of sex marker sequences for whole-exome/genome and RNA sequencing.

PubMed

Jeong, Seongmun; Kim, Jiwoong; Park, Won; Jeon, Hongmin; Kim, Namshin

2017-01-01

Over the last decade, a large number of nucleotide sequences have been generated by next-generation sequencing technologies and deposited to public databases. However, most of these datasets do not specify the sex of individuals sampled because researchers typically ignore or hide this information. Male and female genomes in many species have distinctive sex chromosomes, XX/XY and ZW/ZZ, and expression levels of many sex-related genes differ between the sexes. Herein, we describe how to develop sex marker sequences from syntenic regions of sex chromosomes and use them to quickly identify the sex of individuals being analyzed. Array-based technologies routinely use either known sex markers or the B-allele frequency of X or Z chromosomes to deduce the sex of an individual. The same strategy has been used with whole-exome/genome sequence data; however, all reads must be aligned onto a reference genome to determine the B-allele frequency of the X or Z chromosomes. SEXCMD is a pipeline that can extract sex marker sequences from reference sex chromosomes and rapidly identify the sex of individuals from whole-exome/genome and RNA sequencing after training with a known dataset through a simple machine learning approach. The pipeline counts total numbers of hits from sex-specific marker sequences and identifies the sex of the individuals sampled based on the fact that XX/ZZ samples do not have Y or W chromosome hits. We have successfully validated our pipeline with mammalian (Homo sapiens; XY) and avian (Gallus gallus; ZW) genomes. Typical calculation time when applying SEXCMD to human whole-exome or RNA sequencing datasets is a few minutes, and analyzing human whole-genome datasets takes about 10 minutes. Another important application of SEXCMD is as a quality control measure to avoid mixing samples before bioinformatics analysis. SEXCMD comprises simple Python and R scripts and is freely available at https://github.com/lovemun/SEXCMD.
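The hit-counting idea can be sketched in a few lines of Python. This is not the SEXCMD implementation: it uses exact substring matching in place of read alignment, and a fixed, hypothetical cutoff `min_y_fraction` in place of the trained model mentioned in the abstract. The marker sequences in the usage example are made-up toy strings.

```python
def count_marker_hits(reads, markers):
    """Count reads that contain at least one marker as an exact substring."""
    return sum(1 for read in reads if any(m in read for m in markers))


def call_sex(reads, x_markers, y_markers, min_y_fraction=0.05):
    """Call 'XY' when Y-marker hits make up at least min_y_fraction of all marker hits,
    'XX' when they do not, and 'undetermined' when no marker is hit at all."""
    x_hits = count_marker_hits(reads, x_markers)
    y_hits = count_marker_hits(reads, y_markers)
    total = x_hits + y_hits
    if total == 0:
        return "undetermined"
    return "XY" if y_hits / total >= min_y_fraction else "XX"


# Toy markers and reads (invented sequences, for illustration only).
x_markers = ["ACGTACGT"]
y_markers = ["TTGGCCAA"]
print(call_sex(["GGACGTACGTCC", "ACGTACGTAA", "CCCCCCCC"], x_markers, y_markers))
print(call_sex(["GGACGTACGTCC", "AATTGGCCAAGG"], x_markers, y_markers))
```

The same logic would apply to ZW/ZZ systems by swapping in Z and W markers, with the W-hit fraction identifying ZW individuals.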
An innovative diagnostic technology for the codon mutation C580Y in kelch13 of Plasmodium falciparum with MinION nanopore sequencer.

PubMed

Imai, Kazuo; Tarumoto, Norihito; Runtuwene, Lucky Ronald; Sakai, Jun; Hayashida, Kyoko; Eshita, Yuki; Maeda, Ryuichiro; Tuda, Josef; Ohno, Hideaki; Murakami, Takashi; Maesaki, Shigefumi; Suzuki, Yutaka; Yamagishi, Junya; Maeda, Takuya

2018-05-29

The recent spread of artemisinin (ART)-resistant Plasmodium falciparum represents an emerging global threat to public health. In Southeast Asia, the C580Y mutation of kelch13 (k13) is the dominant mutation of ART-resistant P. falciparum. Therefore, a simple method for the detection of C580Y mutation is urgently needed to enable widespread routine surveillance in the field. The aim of this study is to develop a new diagnostic procedure for the C580Y mutation using loop-mediated isothermal amplification (LAMP) combined with the MinION nanopore sequencer. A LAMP assay for the k13 gene of P. falciparum to detect the C580Y mutation was successfully developed. The detection limit of this procedure was 10 copies of the reference plasmid harboring the k13 gene within 60 min. Thereafter, amplicon sequencing of the LAMP products using the MinION nanopore sequencer was performed to clarify the nucleotide sequences of the gene. The C580Y mutation was identified based on the sequence data collected from MinION reads 30 min after the start of sequencing. Further, clinical evaluation of the LAMP assay in 34 human blood samples collected from patients with P. falciparum malaria in Indonesia revealed a positive detection rate of 100%. All LAMP amplicons of up to 12 specimens were simultaneously sequenced using MinION. The results of sequencing were consistent with those of the conventional PCR and Sanger sequencing protocol. All procedures from DNA extraction to variant calling were completed within 3 h. The C580Y mutation was not found among these 34 P. falciparum isolates in Indonesia. An innovative method combining LAMP and MinION will enable simple, rapid, and high-sensitivity detection of the C580Y mutation of P. falciparum, even in resource-limited situations in developing countries.
Insights into mutagenesis using Escherichia coli chromosomal lacZ strains that enable detection of a wide spectrum of mutational events.

PubMed

Seier, Tracey; Padgett, Dana R; Zilberberg, Gal; Sutera, Vincent A; Toha, Noor; Lovett, Susan T

2011-06-01

Strand misalignments at DNA repeats during replication are implicated in mutational hotspots. To study these events, we have generated strains carrying mutations in the Escherichia coli chromosomal lacZ gene that revert via deletion of a short duplicated sequence or by template switching within imperfect inverted repeat (quasipalindrome, QP) sequences. Using these strains, we demonstrate that mutation of the distal repeat of a quasipalindrome, with respect to replication fork movement, is about 10-fold higher than the proximal repeat, consistent with more common template switching on the leading strand. The leading strand bias was lost in the absence of exonucleases I and VII, suggesting that it results from more efficient suppression of template switching by 3' exonucleases targeted to the lagging strand. The loss of 3' exonucleases has no effect on strand misalignment at direct repeats to produce deletion. To compare these events to other mutations, we have reengineered reporters (designed by Cupples and Miller 1989) that detect specific base substitutions or frameshifts in lacZ with the reverting lacZ locus on the chromosome rather than an F' element. This set allows rapid screening of potential mutagens, environmental conditions, or genetic loci for effects on a broad set of mutational events. We found that hydroxyurea (HU), which depletes dNTP pools, slightly elevated templated mutations at inverted repeats but had no effect on deletions, simple frameshifts, or base substitutions. Mutations in nucleotide diphosphate kinase, ndk, significantly elevated simple mutations but had little effect on the templated class. Zebularine, a cytosine analog, elevated all classes.
Novel numerical and graphical representation of DNA sequences and proteins.

PubMed

Randić, M; Novic, M; Vikić-Topić, D; Plavsić, D

2006-12-01

We have introduced novel numerical and graphical representations of DNA, which offer a simple and unique characterization of DNA sequences. The numerical representation of a DNA sequence is given as a sequence of real numbers derived from a unique graphical representation of the standard genetic code. There is no loss of information on the primary structure of a DNA sequence associated with this numerical representation. The novel representations are illustrated with the coding sequences of the first exon of beta-globin gene of half a dozen species in addition to human. The method can be extended to proteins as is exemplified by humanin, a 24-aa peptide that has recently been identified as a specific inhibitor of neuronal cell death induced by familial Alzheimer's disease mutant genes.
Transposon fingerprinting using low coverage whole genome shotgun sequencing in cacao (Theobroma cacao L.) and related species.

PubMed

Sveinsson, Saemundur; Gill, Navdeep; Kane, Nolan C; Cronk, Quentin

2013-07-24

Transposable elements (TEs) and other repetitive elements are a large and dynamically evolving part of eukaryotic genomes, especially in plants where they can account for a significant proportion of genome size. Their dynamic nature gives them the potential for use in identifying and characterizing crop germplasm. However, their repetitive nature makes them challenging to study using conventional methods of molecular biology. Next generation sequencing and new computational tools have greatly facilitated the investigation of TE variation within species and among closely related species. (i) We generated low-coverage Illumina whole genome shotgun sequencing reads for multiple individuals of cacao (Theobroma cacao) and related species. These reads were analysed using both an alignment/mapping approach and a de novo (graph based clustering) approach. (ii) A standard set of ultra-conserved orthologous sequences (UCOS) standardized TE data between samples and provided phylogenetic information on the relatedness of samples. (iii) The mapping approach proved highly effective within the reference species but underestimated TE abundance in interspecific comparisons relative to the de novo methods. (iv) Individual T. cacao accessions have unique patterns of TE abundance indicating that the TE composition of the genome is evolving actively within this species. (v) LTR/Gypsy elements are the most abundant, comprising c.10% of the genome. (vi) Within T. cacao the retroelement families show an order of magnitude greater sequence variability than the DNA transposon families. (vii) Theobroma grandiflorum has a similar TE composition to T. cacao, but the related genus Herrania is rather different, with LTRs making up a lower proportion of the genome, perhaps because of a massive presence (c. 20%) of distinctive low complexity satellite-like repeats in this genome. 
(i) Short read alignment/mapping to reference TE contigs provides a simple and effective method of investigating intraspecific differences in TE composition. It is not appropriate for comparing repetitive elements across the species boundaries, for which de novo methods are more appropriate. (ii) Individual T. cacao accessions have unique spectra of TE composition indicating active evolution of TE abundance within this species. TE patterns could potentially be used as a "fingerprint" to identify and characterize cacao accessions.
High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs.

PubMed

Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus; Morling, Niels

2016-01-01

Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards with buccal swabs and compared the results with those obtained with DNA extracted using the EZ1 DNA Investigator Kit. Concordant profiles were obtained for all samples. Our protocol includes simple punch, wash, and PCR steps, reducing cost and hands-on time in the laboratory. Furthermore, it facilitates automation of DNA sequencing.
Arbitrarily accurate twin composite π-pulse sequences

NASA Astrophysics Data System (ADS)

Torosov, Boyan T.; Vitanov, Nikolay V.

2018-04-01

We present three classes of symmetric broadband composite pulse sequences. The composite phases are given by analytic formulas (rational fractions of π) valid for any number of constituent pulses. The transition probability is expressed by simple analytic formulas and the order of pulse area error compensation grows linearly with the number of pulses. Therefore, any desired compensation order can be produced by an appropriate composite sequence; in this sense, they are arbitrarily accurate. These composite pulses perform equally well as or better than previously published ones. Moreover, the current sequences are more flexible as they allow total pulse areas of arbitrary integer multiples of π.
Short communication: Development of a direct in vivo screening model to identify potential probiotic bacteria using Caenorhabditis elegans.

PubMed

Park, M R; Yun, H S; Son, S J; Oh, S; Kim, Y

2014-11-01

Caenorhabditis elegans is an accepted model host to study host-bacteria interactions in the gut, in addition to being a simple model with which to study conserved aspects of biological signaling pathways in intestinal environments, because these nematode worms have similar intestinal cells to those of humans. Here, we used C. elegans to develop a new in vivo screening system for potential probiotic lactic acid bacteria (LAB). Initially, critical colonization ability of LAB strains isolated from Korean infant feces was screened in the worm intestinal tract over a period of 5 d. Furthermore, we investigated host health-promoting activities, including longevity-extending effects and immune-enhancing activities against foodborne pathogen infection. We identified 4 LAB strains that were highly persistent in the nematode gut and that significantly prolonged the longevity of C. elegans and improved the survival of C. elegans in response to infection by Staphylococcus aureus. The 4 LAB strains we identified showed resistance to acid and bile conditions, assimilated cholesterol, and were able to attach to a mucus layer. The 4 LAB isolates were identified as Lactobacillus plantarum using 16S rRNA sequencing analysis. Taken together, we developed a direct in vivo screening system using C. elegans to study host health-promoting LAB. Our system is simple, rapid, cost-effective, and reliable, and we anticipate that this system will result in the discovery of many more potential probiotic bacteria for dairy foods.

Simple Sequence Repeats in Escherichia coli: Abundance, Distribution, Composition, and Polymorphism

PubMed Central

Gur-Arie, Riva; Cohen, Cyril J.; Eitan, Yuval; Shelef, Leora; Hallerman, Eric M.; Kashi, Yechezkel

2000-01-01

Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AF209020–209030 and AF209508–209518.] PMID:10645951
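A computer-based screen for tandem SSR tracts with motifs of 1 to 6 nucleotides can be sketched with a regular expression. This is an illustrative example only, not the authors' software; the function name and the `min_repeats` cutoff are assumptions, since the abstract does not state the thresholds used.

```python
import re


def find_ssrs(seq, max_motif=6, min_repeats=3):
    """Return (start, motif, copies) for each tandem repeat tract in seq.

    The lazy quantifier makes the regex try the shortest motif first, so a
    run such as ATATAT is reported with motif AT rather than ATAT.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        tracts.append((match.start(), motif, copies))
    return tracts


# Toy sequence: a dinucleotide tract (AT x3) and a mononucleotide tract (A x5).
print(find_ssrs("GGATATATCCAAAAAG"))
```

Matches are non-overlapping and scanned left to right, so a tract that starts inside an earlier match is not reported separately.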
A founder large deletion mutation in Xeroderma pigmentosum-Variant form in Tunisia: implication for molecular diagnosis and therapy.

PubMed

Ben Rekaya, Mariem; Laroussi, Nadia; Messaoud, Olfa; Jones, Mariem; Jerbi, Manel; Naouali, Chokri; Bouyacoub, Yosra; Chargui, Mariem; Kefi, Rym; Fazaa, Becima; Boubaker, Mohamed Samir; Boussen, Hamouda; Mokni, Mourad; Abdelhak, Sonia; Zghal, Mohamed; Khaled, Aida; Yacoub-Youssef, Houda

2014-01-01

Xeroderma pigmentosum Variant (XP-V) form is characterized by a late onset of skin symptoms. Our aim is the clinical and genetic investigations of XP-V Tunisian patients in order to develop a simple tool for early diagnosis. We investigated 16 suspected XP patients belonging to ten consanguineous families. Analysis of the POLH gene was performed by linkage analysis, long range PCR, and sequencing. Genetic analysis showed linkage to the POLH gene with a founder haplotype in all affected patients. Long range PCR of exon 9 to exon 11 showed a 3926 bp deletion compared to control individuals. Sequence analysis demonstrates that this deletion has occurred between two Alu-Sq2 repetitive sequences in the same orientation, respectively, in introns 9 and 10. We suggest that this mutation POLH NG_009252.1: g.36847_40771del3925 is caused by an equal crossover event that occurred between two homologous chromosomes at meiosis. These results allowed us to develop a simple test based on a simple PCR in order to screen suspected XP-V patients. In Tunisia, the prevalence of XP-V group seems to be underestimated and clinical diagnosis is usually later. Cascade screening of this founder mutation by PCR in regions with high frequency of XP provides a rapid and cost-effective tool for early diagnosis of XP-V in Tunisia and North Africa.
A Founder Large Deletion Mutation in Xeroderma Pigmentosum-Variant Form in Tunisia: Implication for Molecular Diagnosis and Therapy

PubMed Central

Ben Rekaya, Mariem; Laroussi, Nadia; Messaoud, Olfa; Jones, Mariem; Jerbi, Manel; Bouyacoub, Yosra; Chargui, Mariem; Kefi, Rym; Fazaa, Becima; Boubaker, Mohamed Samir; Boussen, Hamouda; Mokni, Mourad; Abdelhak, Sonia; Zghal, Mohamed; Khaled, Aida; Yacoub-Youssef, Houda

2014-01-01

Xeroderma pigmentosum Variant (XP-V) form is characterized by a late onset of skin symptoms. Our aim is the clinical and genetic investigations of XP-V Tunisian patients in order to develop a simple tool for early diagnosis. We investigated 16 suspected XP patients belonging to ten consanguineous families. Analysis of the POLH gene was performed by linkage analysis, long range PCR, and sequencing. Genetic analysis showed linkage to the POLH gene with a founder haplotype in all affected patients. Long range PCR of exon 9 to exon 11 showed a 3926 bp deletion compared to control individuals. Sequence analysis demonstrates that this deletion has occurred between two Alu-Sq2 repetitive sequences in the same orientation, respectively, in introns 9 and 10. We suggest that this mutation POLH NG_009252.1: g.36847_40771del3925 is caused by an equal crossover event that occurred between two homologous chromosomes at meiosis. These results allowed us to develop a simple test based on a simple PCR in order to screen suspected XP-V patients. In Tunisia, the prevalence of XP-V group seems to be underestimated and clinical diagnosis is usually later. Cascade screening of this founder mutation by PCR in regions with high frequency of XP provides a rapid and cost-effective tool for early diagnosis of XP-V in Tunisia and North Africa. PMID:24877075
Long-Term Predictive and Feedback Encoding of Motor Signals in the Simple Spike Discharge of Purkinje Cells

PubMed Central

Popa, Laurentiu S.; Streng, Martha L.

2017-01-01

Most hypotheses of cerebellar function emphasize a role in real-time control of movements. However, the cerebellum’s use of current information to adjust future movements and its involvement in sequencing, working memory, and attention argues for predicting and maintaining information over extended time windows. The present study examines the time course of Purkinje cell discharge modulation in the monkey (Macaca mulatta) during manual, pseudo-random tracking. Analysis of the simple spike firing from 183 Purkinje cells during tracking reveals modulation up to 2 s before and after kinematics and position error. Modulation significance was assessed against trial shuffled firing, which decoupled simple spike activity from behavior and abolished long-range encoding while preserving data statistics. Position, velocity, and position errors have the most frequent and strongest long-range feedforward and feedback modulations, with less common, weaker long-term correlations for speed and radial error. Position, velocity, and position errors can be decoded from the population simple spike firing with considerable accuracy for even the longest predictive (-2000 to -1500 ms) and feedback (1500 to 2000 ms) epochs. Separate analysis of the simple spike firing in the initial hold period preceding tracking shows similar long-range feedforward encoding of the upcoming movement and in the final hold period feedback encoding of the just completed movement, respectively. Complex spike analysis reveals little long-term modulation with behavior. We conclude that Purkinje cell simple spike discharge includes short- and long-range representations of both upcoming and preceding behavior that could underlie cerebellar involvement in error correction, working memory, and sequencing. PMID:28413823
A Simple and Efficient Methodology To Improve Geometric Accuracy in Gamma Knife Radiation Surgery: Implementation in Multiple Brain Metastases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karaiskos, Pantelis, E-mail: pkaraisk@med.uoa.gr; Gamma Knife Department, Hygeia Hospital, Athens; Moutsatsos, Argyris

Purpose: To propose, verify, and implement a simple and efficient methodology for the improvement of total geometric accuracy in multiple brain metastases gamma knife (GK) radiation surgery. Methods and Materials: The proposed methodology exploits the directional dependence of magnetic resonance imaging (MRI)-related spatial distortions stemming from background field inhomogeneities, also known as sequence-dependent distortions, with respect to the read-gradient polarity during MRI acquisition. First, an extra MRI pulse sequence is acquired with the same imaging parameters as those used for routine patient imaging, aside from a reversal in the read-gradient polarity. Then, “average” image data are compounded from data acquired from the 2 MRI sequences and are used for treatment planning purposes. The method was applied and verified in a polymer gel phantom irradiated with multiple shots in an extended region of the GK stereotactic space. Its clinical impact in dose delivery accuracy was assessed in 15 patients with a total of 96 relatively small (<2 cm) metastases treated with GK radiation surgery. Results: Phantom study results showed that use of average MR images eliminates the effect of sequence-dependent distortions, leading to a total spatial uncertainty of less than 0.3 mm, attributed mainly to gradient nonlinearities. In brain metastases patients, non-eliminated sequence-dependent distortions lead to target localization uncertainties of up to 1.3 mm (mean: 0.51 ± 0.37 mm) with respect to the corresponding target locations in the “average” MRI series. Due to these uncertainties, a considerable underdosage (5%-32% of the prescription dose) was found in 33% of the studied targets. Conclusions: The proposed methodology is simple and straightforward in its implementation. Regarding multiple brain metastases applications, the suggested approach may substantially improve total GK dose delivery accuracy in smaller, outlying targets.
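The reasoning behind the averaging step can be shown with a toy calculation. The sketch below works on point coordinates rather than image data and assumes that the sequence-dependent distortion shifts a point by equal and opposite amounts along the read direction when the read-gradient polarity is reversed; the function name and the numbers are hypothetical and do not come from the study.

```python
def average_position(forward_mm, reverse_mm):
    """Average the apparent positions from the two read-gradient polarities."""
    return tuple((f + r) / 2 for f, r in zip(forward_mm, reverse_mm))


# Hypothetical target with a 0.5 mm distortion along the read (x) direction.
true_position = (10.0, 20.0, 30.0)
shift = 0.5
forward = (true_position[0] + shift, true_position[1], true_position[2])
reverse = (true_position[0] - shift, true_position[1], true_position[2])

# Under the equal-and-opposite assumption, the shifts cancel in the average.
print(average_position(forward, reverse))
```

Distortions that do not change sign with polarity, such as the gradient nonlinearities mentioned in the abstract, would not cancel this way.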
Iteration with Spreadsheets.

ERIC Educational Resources Information Center

Smith, Michael

1990-01-01

Presents several examples of the iteration method using computer spreadsheets. Examples included are simple iterative sequences and the solution of equations using the Newton-Raphson formula, linear interpolation, and interval bisection. (YP)
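Two of the iterative methods named here, the Newton-Raphson formula and interval bisection, can be written out as a short sketch. The article uses spreadsheets; the Python below is only an equivalent illustration, with function names, tolerances, and the example equation x^2 - 2 = 0 chosen for this sketch.

```python
def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Iterate x <- x - f(x)/f'(x) until the step is smaller than tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


def bisection(f, lo, hi, tol=1e-10):
    """Halve an interval that brackets a root until it is narrower than tol."""
    if f(lo) * f(hi) > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid  # sign change (or exact root) lies in the left half
        else:
            lo = mid  # otherwise it lies in the right half
    return (lo + hi) / 2


def f(x):
    return x * x - 2  # root at sqrt(2)


def df(x):
    return 2 * x


print(newton_raphson(f, df, x0=1.0))
print(bisection(f, 1.0, 2.0))
```

Each pass through either loop corresponds to one spreadsheet row whose cells are computed from the row above.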
RUCS: rapid identification of PCR primers for unique core sequences.

PubMed

Thomsen, Martin Christen Frølund; Hasman, Henrik; Westh, Henrik; Kaya, Hülya; Lund, Ole

2017-12-15

Designing PCR primers to target a specific selection of whole genome sequenced strains can be a long, arduous and sometimes impractical task. Such tasks would benefit greatly from an automated tool to both identify unique targets, and to validate the vast number of potential primer pairs for the targets in silico. Here we present RUCS, a program that will find PCR primer pairs and probes for the unique core sequences of a positive genome dataset complement to a negative genome dataset. The resulting primer pairs and probes are in addition to simple selection also validated through a complex in silico PCR simulation. We compared our method, which identifies the unique core sequences, against an existing tool called ssGeneFinder, and found that our method was 6.5-20 times more sensitive. We used RUCS to design primer pairs that would target a set of genomes known to contain the mcr-1 colistin resistance gene. Three of the predicted pairs were chosen for experimental validation using PCR and gel electrophoresis. All three pairs successfully produced an amplicon with the target length for the samples containing mcr-1 and no amplification products were produced for the negative samples. The novel methods presented in this manuscript can reduce the time needed to identify target sequences, and provide a quick virtual PCR validation to eliminate time wasted on ambiguously binding primers. Source code is freely available on https://bitbucket.org/genomicepidemiology/rucs. Web service is freely available on https://cge.cbs.dtu.dk/services/RUCS.
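The notion of unique core sequences (present in every positive genome, absent from every negative genome) can be illustrated with k-mer sets. This is a simplified sketch and not the RUCS algorithm: the function names, the toy sequences, and the choice of k are assumptions, and reverse complements are ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_core_kmers(positives, negatives, k):
    """k-mers shared by all positive sequences and found in no negative one.

    Requires at least one positive sequence.
    """
    core = set.intersection(*(kmers(s, k) for s in positives))
    unwanted = set().union(*(kmers(s, k) for s in negatives))
    return core - unwanted


# Toy data invented for illustration.
positives = ["AAACGTTT", "CCACGTGG"]
negatives = ["AAATTTCC"]
print(unique_core_kmers(positives, negatives, k=4))
```

Candidate primers would then be drawn from such regions and still need the kind of in silico PCR validation the abstract describes.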
De Novo Assembly of the Japanese Flounder (Paralichthys olivaceus) Spleen Transcriptome to Identify Putative Genes Involved in Immunity

PubMed Central

Huang, Lin; Li, Guiyang; Mo, Zhaolan; Xiao, Peng; Li, Jie; Huang, Jie

2015-01-01

Background: Japanese flounder (Paralichthys olivaceus) is an economically important marine fish in Asia that has suffered from disease outbreaks caused by various pathogens, so more genome-level information on immune-relevant genes is needed. However, genomic and transcriptomic data for Japanese flounder remain scarce, which limits studies on the immune system of this species. In this study, we characterized the Japanese flounder spleen transcriptome using an Illumina paired-end sequencing platform to identify putative genes involved in immunity. Methodology/Principal Findings: A cDNA library from the spleen of P. olivaceus was constructed and randomly sequenced using an Illumina technique. The removal of low quality reads generated 12,196,968 trimmed reads, which assembled into 96,627 unigenes. A total of 21,391 unigenes (22.14%) were annotated in the NCBI Nr database, and only 1.1% of the BLASTx top-hits matched P. olivaceus protein sequences. Approximately 12,503 (58.45%) unigenes were categorized into three Gene Ontology groups, 19,547 (91.38%) were classified into 26 Cluster of Orthologous Groups, and 10,649 (49.78%) were assigned to six Kyoto Encyclopedia of Genes and Genomes pathways. Furthermore, 40,928 putative simple sequence repeats and 47,362 putative single nucleotide polymorphisms were identified. Importantly, we identified 1,563 putative immune-associated unigenes that mapped to 15 immune signaling pathways. Conclusions/Significance: The P. olivaceus transcriptome data provide a rich source to discover and identify new genes, and the immune-relevant sequences identified here will facilitate our understanding of the mechanisms involved in the immune response. Furthermore, the plentiful potential SSRs and SNPs found in this study are important resources with respect to future development of a linkage map or marker assisted breeding programs for the flounder. PMID:25723398
Isosteric And Non-Isosteric Base Pairs In RNA Motifs: Molecular Dynamics And Bioinformatics Study Of The Sarcin-Ricin Internal Loop

PubMed Central

Havrila, Marek; Réblová, Kamila; Zirbel, Craig L.; Leontis, Neocles B.; Šponer, Jiří

2013-01-01

The Sarcin-Ricin RNA motif (SR motif) is one of the most prominent recurrent RNA building blocks that occurs in many different RNA contexts and folds autonomously, i.e., in a context-independent manner. In this study, we combined bioinformatics analysis with explicit-solvent molecular dynamics (MD) simulations to better understand the relation between the RNA sequence and the evolutionary patterns of SR motif. SHAPE probing experiment was also performed to confirm fidelity of MD simulations. We identified 57 instances of the SR motif in a non-redundant subset of the RNA X-ray structure database and analyzed their basepairing, base-phosphate, and backbone-backbone interactions. We extracted sequences aligned to these instances from large ribosomal RNA alignments to determine frequency of occurrence for different sequence variants. We then used a simple scoring scheme based on isostericity to suggest 10 sequence variants with highly variable expected degree of compatibility with the SR motif 3D structure. We carried out MD simulations of SR motifs with these base substitutions. Non isosteric base substitutions led to unstable structures, but so did isosteric substitutions which were unable to make key base-phosphate interactions. MD technique explains why some potentially isosteric SR motifs are not realized during evolution. We also found that inability to form stable cWW geometry is an important factor in case of the first base pair of the flexible region of the SR motif. Comparison of structural, bioinformatics, SHAPE probing and MD simulation data reveals that explicit solvent MD simulations neatly reflect viability of different sequence variants of the SR motif. Thus, MD simulations can efficiently complement bioinformatics tools in studies of conservation patterns of RNA motifs and provide atomistic insight into the role of their different signature interactions. PMID:24144333
Characterization and Transferable Utility of Microsatellite Markers in the Wild and Cultivated Arachis Species.

PubMed

Huang, Li; Wu, Bei; Zhao, Jiaojiao; Li, Haitao; Chen, Weigang; Zheng, Yanli; Ren, Xiaoping; Chen, Yuning; Zhou, Xiaojing; Lei, Yong; Liao, Boshou; Jiang, Huifang

2016-01-01

Microsatellites, or simple sequence repeats (SSRs), are among the most widely distributed molecular markers and have been widely used to assess genetic diversity and for genetic mapping of important traits in plants. However, the understanding of microsatellite characteristics in Arachis species and the currently available amount of high-quality SSR markers remain limited. In this study, we identified 16,435 genome survey sequence SSRs (GSS-SSRs) and 40,199 expressed sequence tag SSRs (EST-SSRs) in Arachis hypogaea and its wild relative species using the publicly available sequence data. The GSS-SSRs had a density of 159.9-239.8 SSRs/Mb for wild Arachis and 1,015.8 SSRs/Mb for cultivated Arachis, whereas the EST-SSRs had a density of 173.5-384.4 SSRs/Mb and 250.9 SSRs/Mb for wild and cultivated Arachis, respectively. Trinucleotide SSRs were predominant across Arachis species, except in A. hypogaea GSSs, where dinucleotide SSRs were the most common. From Arachis GSS-SSR and EST-SSR sequences, we developed 2,589 novel SSR markers that showed a high polymorphism in six diverse A. hypogaea accessions. A genetic linkage map that contained 540 novel SSR loci and 105 anchor SSR loci was constructed using an F6 recombinant inbred line population. A subset of 82 randomly selected SSR markers was used to screen 39 wild and 22 cultivated Arachis accessions, which revealed a high transferability of the novel SSRs across Arachis species. Our results provided informative clues to investigate microsatellite patterns across A. hypogaea and its wild relative species and potentially facilitate the germplasm evaluation and gene mapping in Arachis species.
Transcriptome Analysis of the Chrysanthemum Foliar Nematode, Aphelenchoides ritzemabosi (Aphelenchida: Aphelenchoididae)

PubMed Central

Li, Jun-Yi; Xie, Hui; Xu, Chun-Ling; Li, Yu

2016-01-01

The chrysanthemum foliar nematode (CFN), Aphelenchoides ritzemabosi, is a plant parasitic nematode that attacks many plants. In this study, the transcriptome of a mixed-stage population of CFN was sequenced on the Illumina HiSeq 2000 platform. A total of 68.10 million high-quality Illumina paired-end reads were obtained, which generated 26,817 transcripts with a mean length of 1,032 bp and an N50 of 1,672 bp, of which 16,467 transcripts were annotated against six databases. In total, 20,311 coding region sequences (CDS), 495 simple sequence repeats (SSRs) and 8,353 single-nucleotide polymorphisms (SNPs) were predicted. The species sharing the most sequences with CFN was B. xylophilus, with 16,846 (62.82%) common transcripts, and 10,543 (39.31%) CFN transcripts matched sequences of all four plant parasitic nematodes compared. A total of 111 CFN transcripts were predicted as homologues of 7 types of carbohydrate-active enzymes (CAZymes) with plant/fungal cell wall-degrading activities; fewer transcripts were predicted as homologues of plant cell wall-degrading enzymes than of fungal cell wall-degrading enzymes. The phylogenetic analysis of GH5, GH16, GH43 and GH45 proteins between CFN and other organisms showed that CFN and other nematodes have a closer phylogenetic relationship. In the CFN transcriptome, sixteen types of gene orthologues belonging to seven classes of protein families involved in the RNAi pathway in C. elegans were predicted. This research provides comprehensive gene expression information at the transcriptional level, which will facilitate the elucidation of the molecular mechanisms of CFN and the distribution of gene functions at the macro level, potentially revealing improved methods for controlling CFN. PMID:27875578
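The N50 statistic quoted for the assembly is the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases. A minimal sketch of that calculation follows; the function name and the toy lengths are invented for illustration and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # reached at least half of all bases
            return length
    raise ValueError("no lengths given")


# Toy example: total is 20, and 6 + 5 = 11 first reaches half, so N50 is 5.
print(n50([2, 3, 4, 5, 6]))
```

Because a few long transcripts dominate the cumulative sum, N50 is typically larger than the mean length, as it is in this abstract (1,672 bp versus 1,032 bp).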
Non-Genomic Origins of Proteins and Metabolism

NASA Technical Reports Server (NTRS)

Pohorille, Andrew

2003-01-01

It is proposed that evolution of inanimate matter to cells endowed with a nucleic acid-based coding of genetic information was preceded by an evolutionary phase, in which peptides not coded by nucleic acids were able to self-organize into networks capable of evolution towards increasing metabolic complexity. Recent findings that truly different, simple peptides (Keefe and Szostak, 2001) can perform the same function (such as ATP binding) provide experimental support for this mechanism of early protobiological evolution. The central concept underlying this mechanism is that the reproduction of cellular functions alone was sufficient for self-maintenance of protocells, and that self-replication of macromolecules was not required at this stage of evolution. The precise transfer of information between successive generations of the earliest protocells was unnecessary and, possibly, undesirable. The key requirement in the initial stage of protocellular evolution was an ability to rapidly explore a large number of protein sequences in order to discover a set of molecules capable of supporting self-maintenance and growth of protocells. Undoubtedly, the essential protocellular functions were carried out by molecules not nearly as efficient or as specific as contemporary proteins. Many, potentially unrelated sequences could have performed each of these functions at an evolutionarily acceptable level. As evolution progressed, however, proteins must have performed their functions with increasing efficiency and specificity. This, in turn, put additional constraints on protein sequences and the fraction of proteins capable of performing their functions at the required level decreased. At some point, the likelihood of generating a sufficiently efficient set of proteins through a non-coded synthesis was so small that further evolution was not possible without storing information about the sequences of these proteins. 
Beyond this point, further evolution required coupling between proteins and informational polymers that is characteristic of all known forms of life. The emergence of such coupling must be postulated in any scenario of the origin of life, no matter whether it starts with RNA or proteins. To examine the evolutionary potential of non-genomic systems, a simple, computationally tractable model, which is still capable of capturing the essential features of the real system, has been studied. Both constructive and destructive processes have been introduced into the model in a stochastic manner. Instead of assuming random reaction sets, only a suite of protobiologically plausible reactions has been considered. Peptides have been explicitly considered as protoenzymes and their catalytic efficiencies have been assigned on the basis of biochemical principles and experimental estimates. Simulations have been carried out using a novel approach (The Next Reaction Method) that is appropriate even for very low concentrations of reactants. Studies have focused on global autocatalytic processes and their diversity.
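The stochastic treatment of constructive and destructive processes described in this abstract can be illustrated with a minimal sketch. It uses Gillespie's direct method, a simpler exact stochastic simulation algorithm from the same family as the Next Reaction Method, and is not the author's implementation. The two-reaction toy system, the rate constants and the function name are assumptions chosen only for illustration.

```python
import random

def simulate_direct(monomers, peptides, k_form, k_break, t_end, seed=0):
    """Gillespie direct method for a hypothetical two-reaction toy system.

    Constructive step: 2 monomers -> 1 peptide  (propensity k_form * m * (m - 1) / 2)
    Destructive step:  1 peptide  -> 2 monomers (propensity k_break * p)
    """
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, monomers, peptides)]
    while t < t_end:
        a_form = k_form * monomers * (monomers - 1) / 2.0
        a_break = k_break * peptides
        a_total = a_form + a_break
        if a_total == 0.0:
            break  # no reaction can fire
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_form:
            monomers -= 2
            peptides += 1
        else:
            monomers += 2
            peptides -= 1
        trajectory.append((t, monomers, peptides))
    return trajectory

# Illustrative (assumed) parameter values.
traj = simulate_direct(monomers=100, peptides=0, k_form=0.01, k_break=0.1, t_end=50.0)
```

Because molecule counts are integers and each event is sampled individually, this kind of simulation remains valid at very low reactant concentrations, which is the property the abstract highlights.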
Demographic stability metrics for conservation prioritization of isolated populations.

PubMed

Finn, Debra S; Bogan, Michael T; Lytle, David A

2009-10-01

Systems of geographically isolated habitat patches house species that occur naturally as small, disjunct populations. Many of these species are of conservation concern, particularly under the interacting influences of isolation and rapid global change. One potential conservation strategy is to prioritize the populations most likely to persist through change and act as sources for future recolonization of less stable localities. We propose an approach to classify long-term population stability (and, presumably, future persistence potential) with composite demographic metrics derived from standard population-genetic data. Stability metrics can be related to simple habitat measures for a straightforward method of classifying localities to inform conservation management. We tested these ideas in a system of isolated desert headwater streams with mitochondrial sequence data from 16 populations of a flightless aquatic insect. Populations exhibited a wide range of stability scores, which were significantly predicted by dry-season aquatic habitat size. This preliminary test suggests strong potential for our proposed method of classifying isolated populations according to persistence potential. The approach is complementary to existing methods for prioritizing local habitats according to diversity patterns and should be tested further in other systems and with additional loci to inform composite demographic stability scores.
Application of multi-criteria decision analysis in prediction of groundwater resources potential: A case of Oke-Ana, Ilesa Area Southwestern, Nigeria

NASA Astrophysics Data System (ADS)

Akinlalu, A. A.; Adegbuyiro, A.; Adiat, K. A. N.; Akeredolu, B. E.; Lateef, W. Y.

2017-06-01

The groundwater potential of the Oke-Ana area, southwestern Nigeria, has been evaluated by integrating the electrical resistivity method, remote sensing and geographic information systems. The effect of five hydrogeological indices, namely lineament density, drainage density, lithology, overburden thickness and aquifer layer resistivity, on groundwater occurrence was established. A multi-criteria decision analysis technique was employed to assign a weight to each index using the concept of the analytical hierarchy process. The assigned weights were normalized and the consistency ratio was established. In order to evaluate the groundwater potential of Oke-Ana, sixty-seven (67) vertical electrical sounding points were occupied. Ten curve types were delineated in the study area. The curve types vary from simple three-layer A and H-type curves to the more complex four-, five- and six-layer AA, HA, KH, QH, AKH, HKH, KHA and KHKH curves. Four subsurface geo-electric sequences of top soil, weathered layer, partially weathered/fractured basement and the fresh basement were delineated in the area. The analytical process assisted in classifying Oke-Ana into low, medium and high groundwater potential zones. Validation of the model from well information and two aborted boreholes suggests 70% agreement.
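The weighting step described in this abstract can be sketched as follows. This is a minimal illustration of the analytical hierarchy process using the common column-normalisation approximation. The pairwise judgements below are hypothetical and are not the values used in the study; the random index values are Saaty's standard ones.

```python
def ahp_weights(matrix):
    """Approximate AHP priority weights and consistency ratio.

    matrix: square reciprocal pairwise-comparison matrix (list of lists).
    Returns (weights, consistency_ratio).
    """
    n = len(matrix)
    # Normalise each column, then average across each row to estimate weights.
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    weights = [sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
               for i in range(n)]
    # Estimate the principal eigenvalue from the ratios (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    # Saaty's random index for matrix sizes 3 to 6.
    random_index = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24}
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index / random_index[n]

indices = ["lineament density", "drainage density", "lithology",
           "overburden thickness", "aquifer layer resistivity"]
# Hypothetical upper-triangle judgements on Saaty's 1-9 scale.
upper = {(0, 1): 2, (0, 2): 3, (0, 3): 4, (0, 4): 5,
         (1, 2): 2, (1, 3): 3, (1, 4): 4,
         (2, 3): 2, (2, 4): 3,
         (3, 4): 2}
A = [[1.0] * 5 for _ in range(5)]
for (i, j), value in upper.items():
    A[i][j] = float(value)
    A[j][i] = 1.0 / value  # reciprocal entry

weights, cr = ahp_weights(A)
```

A consistency ratio below 0.1 is the conventional threshold for accepting a set of pairwise judgements.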
Digital RNA sequencing minimizes sequence-dependent bias and amplification noise with optimized single-molecule barcodes

PubMed Central

Shiroguchi, Katsuyuki; Jia, Tony Z.; Sims, Peter A.; Xie, X. Sunney

2012-01-01

RNA sequencing (RNA-Seq) is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable, even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR-amplification noise, and is analogous to digital PCR but amenable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of Escherichia coli with more accurate and reproducible quantification than conventional RNA-Seq. PMID:22232676
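The counting principle described in this abstract, measuring abundance by unique barcodes rather than by reads, can be sketched as below. The read tuples, gene names and function name are hypothetical, and the sketch omits the error-tolerant barcode matching that the authors describe.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count unique barcode pairs observed for each cDNA sequence.

    reads: iterable of (cdna_id, barcode_5prime, barcode_3prime) tuples.
    """
    barcodes_seen = defaultdict(set)
    for cdna_id, bc5, bc3 in reads:
        # A set keeps each barcode pair once, however often PCR copied it.
        barcodes_seen[cdna_id].add((bc5, bc3))
    return {cdna: len(pairs) for cdna, pairs in barcodes_seen.items()}

# Hypothetical reads for illustration.
reads = [
    ("geneA", "ACGT", "TTAG"),
    ("geneA", "ACGT", "TTAG"),  # PCR duplicate of the first molecule
    ("geneA", "GGCA", "CATG"),
    ("geneB", "ACGT", "TTAG"),
]
counts = count_molecules(reads)
```

Here geneA has three reads but only two distinct barcode pairs, so it is counted as two molecules regardless of how unevenly PCR amplified them.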
WebSat ‐ A web software for microsatellite marker development

PubMed Central

Martins, Wellington Santos; Soares Lucas, Divino César; de Souza Neves, Kelligton Fabricio; Bertioli, David John

2009-01-01

Simple sequence repeats (SSR), also known as microsatellites, have been extensively used as molecular markers due to their abundance and high degree of polymorphism. We have developed a simple-to-use web software, called WebSat, for microsatellite molecular marker prediction and development. WebSat is accessible through the Internet, requiring no program installation. Although a web solution, it makes use of Ajax techniques, providing a rich, responsive user interface. WebSat allows the submission of sequences, visualization of microsatellites and the design of primers suitable for their amplification. The program allows full control of parameters and the easy export of the resulting data, thus facilitating the development of microsatellite markers. Availability: The web tool may be accessed at http://purl.oclc.org/NET/websat/ PMID:19255650
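As a rough illustration of what microsatellite prediction involves, the sketch below finds perfect tandem repeats with a regular expression. It is not WebSat's algorithm, which the abstract does not describe; the function name, default thresholds and example sequence are assumptions.

```python
import re

def find_ssrs(sequence, min_repeats=4, max_motif_len=3):
    """Return (start, motif, repeat_count) for each perfect tandem repeat."""
    # The lazy motif group makes the shortest repeating unit win, so a run
    # of A's is reported as motif "A" rather than "AA".
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(sequence.upper())
    ]

# Hypothetical sequence containing (AC)5.
hits = find_ssrs("TTGACACACACACGGT")
```

A marker development tool would go on to design primers in the regions flanking each hit, which is outside the scope of this sketch.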
Molecular Analysis of Date Palm Genetic Diversity Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeats (ISSRs).

PubMed

El Sharabasy, Sherif F; Soliman, Khaled A

2017-01-01

The date palm is an ancient domesticated plant with great diversity and has been cultivated in the Middle East and North Africa for at least 5000 years. Date palm cultivars are classified based on the fruit moisture content, as dry, semidry, and soft dates. There are a number of biochemical and molecular techniques available for characterization of the date palm variation. This chapter focuses on the DNA-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeats (ISSR) techniques, in addition to biochemical markers based on isozyme analysis. These techniques coupled with appropriate statistical tools proved useful for determining phylogenetic relationships among date palm cultivars and provide information resources for date palm gene banks.
Folding and Stabilization of Native-Sequence-Reversed Proteins

PubMed Central

Zhang, Yuanzhao; Weber, Jeffrey K; Zhou, Ruhong

2016-01-01

Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols. PMID:27113844
Folding and Stabilization of Native-Sequence-Reversed Proteins

NASA Astrophysics Data System (ADS)

Zhang, Yuanzhao; Weber, Jeffrey K.; Zhou, Ruhong

2016-04-01

Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols.
Simple sequence repeat markers useful for sorghum downy mildew (Peronosclerospora sorghi) and related species

PubMed Central

Perumal, Ramasamy; Nimmakayala, Padmavathi; Erattaimuthu, Saradha R; No, Eun-Gyu; Reddy, Umesh K; Prom, Louis K; Odvody, Gary N; Luster, Douglas G; Magill, Clint W

2008-01-01

Background: A recent outbreak of sorghum downy mildew in Texas has led to the discovery of both metalaxyl resistance and a new pathotype in the causal organism, Peronosclerospora sorghi. These observations and the difficulty in resolving among phylogenetically related downy mildew pathogens dramatically point out the need for simply scored markers in order to differentiate among isolates and species, and to study the population structure within these obligate oomycetes. Here we present the initial results from the use of a biotin capture method to discover, clone and develop PCR primers that permit the use of simple sequence repeats (microsatellites) to detect differences at the DNA level. Results: Among the 55 primer pairs designed from clones from pathotype 3 of P. sorghi, 36 flanked microsatellite loci containing simple repeats, including 28 (55%) with dinucleotide repeats and 6 (11%) with trinucleotide repeats. A total of 22 microsatellites with CA/AC or GT/TG repeats were the most abundant (40%) and GA/AG or CT/TC types contribute 15% in our collection. When used to amplify DNA from 19 isolates from P. sorghi, as well as from 5 related species that cause downy mildew on other hosts, the number of different bands detected for each SSR primer pair using a LI-COR DNA Analyzer ranged from two to eight. Successful cross-amplification for 12 primer pairs studied in detail using DNA from downy mildews that attack maize (P. maydis & P. philippinensis), sugar cane (P. sacchari), pearl millet (Sclerospora graminicola) and rose (Peronospora sparsa) indicates that the flanking regions are conserved in all these species. A total of 15 SSR amplicons unique to P. philippinensis (one of the potential threats to US maize production) were detected, and these have potential for development of diagnostic tests. 
A total of 260 alleles were obtained using 54 microsatellite primer combinations, with an average of 4.8 polymorphic markers per SSR across 34 Peronosclerospora, Peronospora and Sclerospora spp isolates studied. Cluster analysis by UPGMA as well as principal coordinate analysis (PCA) grouped the 34 isolates into three distinct groups (all 19 isolates of Peronosclerospora sorghi in cluster I, five isolates of P. maydis and three isolates of P. sacchari in cluster II and five isolates of Sclerospora graminicola in cluster III). Conclusion: To our knowledge, this is the first attempt to extensively develop SSR markers from Peronosclerospora genomic DNA. The newly developed SSR markers can be readily used to distinguish isolates within several species of the oomycetes that cause downy mildew diseases. Also, microsatellite fragments likely include retrotransposon regions of DNA and these sequences can serve as useful genetic markers for strain identification, due to their degree of variability and their widespread occurrence among sorghum, maize, sugarcane, pearl millet and rose downy mildew isolates. PMID:19040756

An annotated genetic map of loblolly pine based on microsatellite and cDNA markers

Treesearch

Craig S. Echt; Surya Saha; Konstantin V. Krutovsky; Kokulapalan Wimalanathan; John E. Erpelding; Chun Liang; C Dana Nelson

2011-01-01

Previous loblolly pine (Pinus taeda L.) genetic linkage maps have been based on a variety of DNA polymorphisms, such as AFLPs, RAPDs, RFLPs, and ESTPs, but only a few SSRs (simple sequence repeats), also known as simple tandem repeats or microsatellites, have been mapped in P. taeda. The objective of this study was to integrate a large set of SSR markers from a variety...
Strain rates, stress markers and earthquake clustering (Invited)

NASA Astrophysics Data System (ADS)

Fry, B.; Gerstenberger, M.; Abercrombie, R. E.; Reyners, M.; Eberhart-Phillips, D. M.

2013-12-01

The 2010-present Canterbury earthquakes comprise a well-recorded sequence in a relatively low strain-rate shallow crustal region. We present new scientific results to test the hypothesis that: Earthquake sequences in low-strain rate areas experience high stress drop events, low post-seismic relaxation, and accentuated seismic clustering. This hypothesis is based on a physical description of the aftershock process in which the spatial distribution of stress accumulation and stress transfer are controlled by fault strength and orientation. Following large crustal earthquakes, time dependent forecasts are often developed by fitting parameters defined by Omori's aftershock decay law. In high-strain rate areas, simple forecast models utilizing a single p-value fit observed aftershock sequences well. In low-strain rate areas such as Canterbury, assumptions of simple Omori decay may not be sufficient to capture the clustering (sub-sequence) nature exhibited by the punctuated rise in activity following significant child events. In Canterbury, the moment release is more clustered than in more typical Omori sequences. The individual earthquakes in these clusters also exhibit somewhat higher stress drops than in the average crustal sequence in high-strain rate regions, suggesting the earthquakes occur on strong Andersonian-oriented faults, possibly juvenile or well-healed. We use the spectral ratio procedure outlined in Viegas et al. (2010) to determine corner frequencies and Madariaga stress-drop values for over 800 events in the sequence. Furthermore, we will discuss the relevance of tomographic results of Reyners and Eberhart-Phillips (2013) documenting post-seismic stress-driven fluid processes following the three largest events in the sequence as well as anisotropic patterns in surface wave tomography (Fry et al., 2013). 
These tomographic studies are both compatible with the hypothesis, providing strong evidence for the presence of widespread and hydrated regional upper crustal cracking parallel to sub-parallel to the dominant transverse failure plane in the sequence. Joint interpretation of the three separate datasets provides a positive first attempt at testing our fundamental hypothesis.
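The contrast this abstract draws between a single Omori decay and a clustered sequence can be sketched numerically. The code uses the modified Omori law, n(t) = K / (t + c)^p, and models clustering as a plain sum of Omori terms, one per triggering event. The summation model, the function names and all parameter values are assumptions for illustration, not the authors' forecast model.

```python
import math

def omori_rate(t, k, c, p):
    """Aftershock rate n(t) = K / (t + c)**p at time t after the triggering event."""
    return k / (t + c) ** p

def expected_count(t1, t2, k, c, p):
    """Expected number of aftershocks between t1 and t2 (integral of the rate)."""
    if abs(p - 1.0) < 1e-12:
        return k * (math.log(t2 + c) - math.log(t1 + c))
    return k * ((t2 + c) ** (1.0 - p) - (t1 + c) ** (1.0 - p)) / (1.0 - p)

def clustered_rate(t, events, c, p):
    """Sum of Omori terms for every (time, K) event that occurred at or before t."""
    return sum(omori_rate(t - t0, k, c, p) for t0, k in events if t0 <= t)

# Hypothetical sequence: a mainshock at day 0 and a large child event at day 4.
events = [(0.0, 100.0), (4.0, 50.0)]
rate_single = omori_rate(5.0, 100.0, 0.1, 1.1)
rate_clustered = clustered_rate(5.0, events, 0.1, 1.1)
```

In this toy setup the child event at day 4 raises the rate at day 5 above what the mainshock term alone predicts, which is the kind of punctuated rise a single p-value fit cannot reproduce.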
Markers and mapping revisited: finding your gene.

PubMed

Jones, Neil; Ougham, Helen; Thomas, Howard; Pasakinskiene, Izolda

2009-01-01

This paper is an update of our earlier review (Jones et al., 1997, Markers and mapping: we are all geneticists now. New Phytologist 137: 165-177), which dealt with the genetics of mapping, in terms of recombination as the basis of the procedure, and covered some of the first generation of markers, including restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNA (RAPDs), simple sequence repeats (SSRs) and quantitative trait loci (QTLs). In the intervening decade there have been numerous developments in marker science with many new systems becoming available, which are herein described: cleavage amplification polymorphism (CAP), sequence-specific amplification polymorphism (S-SAP), inter-simple sequence repeat (ISSR), sequence tagged site (STS), sequence characterized amplification region (SCAR), selective amplification of microsatellite polymorphic loci (SAMPL), single nucleotide polymorphism (SNP), expressed sequence tag (EST), sequence-related amplified polymorphism (SRAP), target region amplification polymorphism (TRAP), microarrays, diversity arrays technology (DArT), single-strand conformation polymorphism (SSCP), denaturing gradient gel electrophoresis (DGGE), temperature gradient gel electrophoresis (TGGE) and methylation-sensitive PCR. In addition there has been an explosion of knowledge and databases in the area of genomics and bioinformatics. The number of flowering plant ESTs is c. 19 million and counting, with all the opportunity that this provides for gene-hunting, while the survey of bioinformatics and computer resources points to a rapid growth point for future activities in unravelling and applying the burst of new information on plant genomes. A case study is presented on tracking down a specific gene (stay-green (SGR), a post-transcriptional senescence regulator) using the full suite of mapping tools and comparative mapping resources. 
We end with a brief speculation on how genome analysis may progress into the future of this highly dynamic arena of plant science.
Evolution Analysis of Simple Sequence Repeats in Plant Genome.

PubMed

Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

2015-01-01

Simple sequence repeats (SSRs) are widespread units on genome sequences, and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genome sequences. First, in the twelve studied plant genomes, the main SSRs were those containing repeats of 1-3 nucleotide combinations. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii had no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to the AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T were in a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed their specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only followed a general pattern in the evolution of the plant kingdom, but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.
Simple sequence repeat marker development from bacterial artificial chromosome end sequences and expressed sequence tags of flax (Linum usitatissimum L.).

PubMed

Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja

2012-08-01

Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs), of which 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions with 673 (58 %) and 145 (42 %) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4 %) than the rest of the markers (42.7 %). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.
On Stellar Flash Echoes from Circular Rings

NASA Astrophysics Data System (ADS)

Nemiroff, Robert; Mukherjee, Oindabi

2018-01-01

A flash -- or any episode of variability -- that occurs in the vicinity of a circular ring might be seen several times later, simultaneously, as echoes on the ring. Effective images of the flash are created and annihilated in pairs, with as many as four flash images visible concurrently. Videos detailing sequences of image pair creation, tandem motion, and subsequent image annihilation are shown, given simple opacity and scattering assumptions. It is proven that, surprisingly, images from a second pair creation event always annihilate with images from the first. Caustic surfaces between flash locations yielding two and four images are computed. Although such ring echoes surely occur, their practical detection might be difficult as it could require dedicated observing programs involving sensitive photometry of extended objects. Potential flash sources include planetary and interstellar gas and dust rings near and around variable stars, flare stars, novae, supernovae, and GRBs. Potentially recoverable information includes size, distance, temporal history, and angular isotropy of both the ring and flash.
Recent Advances in CRISPR-Cas9 Genome Editing Technology for Biological and Biomedical Investigations.

PubMed

Singh, Vijai; Gohil, Nisarg; Ramírez García, Robert; Braddick, Darren; Fofié, Christian Kuete

2018-01-01

The Type II CRISPR-Cas9 system is a simple, efficient, and versatile tool for targeted genome editing in a wide range of organisms and cell types. It continues to gain more scientific interest and has established itself as an extremely powerful technology within our synthetic biology toolkit. It acts on a targeted site and generates a double-strand break that is repaired by either the NHEJ or the HDR pathway, modifying or permanently replacing the genomic target sequences of interest. These can include viral targets, single-mutation genetic diseases, and multiple-site corrections for wide scale disease states, offering the potential to manage and cure some of mankind's most persistent biomedical menaces. Here, we present the developing progress and future potential of CRISPR-Cas9 in biological and biomedical investigations, toward numerous therapeutic, biomedical, and biotechnological applications, as well as some of the challenges within. J. Cell. Biochem. 119: 81-94, 2018. © 2017 Wiley Periodicals, Inc.
Neuronal chronometry of target detection: fusion of hemodynamic and event-related potential data.

PubMed

Calhoun, V D; Adali, T; Pearlson, G D; Kiehl, K A

2006-04-01

Event-related potential (ERP) studies of the brain's response to infrequent, target (oddball) stimuli elicit a sequence of physiological events, the most prominent and well studied being a complex, the P300 (or P3) peaking approximately 300 ms post-stimulus for simple stimuli and slightly later for more complex stimuli. Localization of the neural generators of the human oddball response remains challenging due to the lack of a single imaging technique with good spatial and temporal resolution. Here, we use independent component analyses to fuse ERP and fMRI modalities in order to examine the dynamics of the auditory oddball response with high spatiotemporal resolution across the entire brain. Initial activations in auditory and motor planning regions are followed by auditory association cortex and motor execution regions. The P3 response is associated with brainstem, temporal lobe, and medial frontal activity and finally a late temporal lobe "evaluative" response. We show that fusing imaging modalities with different advantages can provide new information about the brain.
Robust optical flow using adaptive Lorentzian filter for image reconstruction under noisy condition

NASA Astrophysics Data System (ADS)

Kesrarat, Darun; Patanavijit, Vorapoj

2017-02-01

In optical flow for motion allocation, obtaining an accurate Motion Vector (MV) is an important issue, and noisy conditions can make the results of optical flow algorithms unreliable. We find that many classical optical flow algorithms perform better under noisy conditions when combined with a modern optimized model. This paper introduces robust optical flow models that apply the adaptive Lorentzian norm influence function, as used in robust high-reliability spatial-based optical flow algorithms, to simple spatio-temporal optical flow algorithms. Experiments on the proposed models confirm better noise tolerance in the optical flow MV under noisy conditions when they are applied as a filtering model over simple spatio-temporal optical flow algorithms in a simple frame-to-frame correlation technique. We illustrate the performance of our models with experiments on several typical sequences that differ in the movement speed of foreground and background, where the test sequences are contaminated by additive white Gaussian noise (AWGN) at different noise levels (dB). The noise-tolerant models show very high effectiveness, as indicated by peak signal-to-noise ratio (PSNR).
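The evaluation metric named in this abstract, PSNR, follows the standard definition PSNR = 10 log10(MAX^2 / MSE). The sketch below computes it for two small grayscale images given as nested lists. The function name, the 8-bit peak value of 255 and the sample pixel values are assumptions, not details from the paper.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_errors = [
        (r - t) ** 2
        for row_r, row_t in zip(reference, test)
        for r, t in zip(row_r, row_t)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 frames: a reference and a noisy reconstruction.
reference = [[100, 110], [120, 130]]
reconstructed = [[101, 109], [121, 129]]
score = psnr(reference, reconstructed)
```

A higher PSNR means the frame reconstructed from the estimated motion vectors is closer to the reference frame.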
Preschool-aged children have difficulty constructing and interpreting simple utterances composed of graphic symbols.

PubMed

Sutton, Ann; Trudeau, Natacha; Morford, Jill; Rios, Monica; Poirier, Marie-Andrée

2010-01-01

Children who require augmentative and alternative communication (AAC) systems while they are in the process of acquiring language face unique challenges because they use graphic symbols for communication. In contrast to the situation of typically developing children, they use different modalities for comprehension (auditory) and expression (visual). This study explored the ability of three- and four-year-old children without disabilities to perform tasks involving sequences of graphic symbols. Thirty participants were asked to transpose spoken simple sentences into graphic symbols by selecting individual symbols corresponding to the spoken words, and to interpret graphic symbol utterances by selecting one of four photographs corresponding to a sequence of three graphic symbols. The results showed that these were not simple tasks for the participants, and few of them performed in the expected manner - only one in transposition, and only one-third of participants in interpretation. Individual response strategies in some cases led to contrasting response patterns. Children at this age level have not yet developed the skills required to deal with graphic symbols even though they have mastered the corresponding spoken language structures.
Bifunctional nanoparticles for surface-enhanced Raman spectroscopy-based leukemia biomarker detection

NASA Astrophysics Data System (ADS)

Mehn, Dora; Morasso, Carlo; Vanna, Renzo; Schiumarini, Domitilla; Bedoni, Marzia; Ciceri, Fabio; Gramatica, Furio

2014-03-01

The Wilms tumor gene (WT1) is a biomarker overexpressed in more than 90% of acute myeloid leukemia patients. Fast and sensitive detection of WT1 in blood samples would allow monitoring of minimal residual disease during clinical remission and would permit early detection of a potential relapse in acute myeloid leukemia. In this work, Surface Enhanced Raman Spectroscopy (SERS) based detection of the WT1 sequence using bifunctional, magnetic core - gold shell nanoparticles is presented. The classical co-precipitation method was applied to generate magnetic nanoparticles, which were coated with a gold shell after modification with aminopropyltriethoxy silane and subsequent deposition of gold nanoparticle seeds. A simple hydroquinone-based reduction procedure was applied to grow the shell in a water-based reaction mixture at room temperature. Thiolated ssDNA probes of the WT1 sequence were immobilized as capture oligonucleotides on the gold surface. Malachite green was applied both for testing the amplification performance of the core-shell colloidal SERS substrate and as the label dye of the target DNA sequence. The SERS enhancer efficacy of the core-shell nanomaterial was compared with the efficacy of classical spherical gold particles produced using the conventional citrate reduction method. The core-shell particles were found not only to provide an opportunity for facile separation in a heterogeneous reaction system but also to be superior regarding robustness as SERS enhancers.
Subcellular localization of transiently expressed fluorescent fusion proteins.

PubMed

Collings, David A

2013-01-01

The recent and massive expansion in plant genomics data has generated a large number of gene sequences for which two seemingly simple questions need to be answered: where do the proteins encoded by these genes localize in cells, and what do they do? One widespread approach to answering the localization question has been to use particle bombardment to transiently express unknown proteins tagged with green fluorescent protein (GFP) or its numerous derivatives. Confocal fluorescence microscopy is then used to monitor the localization of the fluorescent protein as it hitches a ride through the cell. The subcellular localization of the fusion protein, if not immediately apparent, can then be determined by comparison to localizations generated by fluorescent protein fusions to known signalling sequences and proteins, or by direct comparison with fluorescent dyes. This review aims to be a tour guide for researchers wanting to travel this hitch-hiker's path, and for reviewers and readers who wish to understand their travel reports. It will describe some of the technology available for visualizing protein localizations, and some of the experimental approaches for optimizing and confirming localizations generated by particle bombardment in onion epidermal cells, the most commonly used experimental system. As the non-conservation of signal sequences in heterologous expression systems such as onion, and consequent mis-targeting of fusion proteins, is always a potential problem, the epidermal cells of the Argenteum mutant of pea are proposed as a model system.
Evolutionary Origins of a Bioactive Peptide Buried within Preproalbumin

PubMed Central

Elliott, Alysha G.; Delay, Christina; Liu, Huanle; Phua, Zaiyang; Rosengren, K. Johan; Benfield, Aurélie H.; Panero, Jose L.; Colgrave, Michelle L.; Jayasena, Achala S.; Dunse, Kerry M.; Anderson, Marilyn A.; Schilling, Edward E.; Ortiz-Barrientos, Daniel; Craik, David J.; Mylne, Joshua S.

2014-01-01

The de novo evolution of proteins is now considered a frequented route for biological innovation, but the genetic and biochemical processes that lead to each newly created protein are often poorly documented. The common sunflower (Helianthus annuus) contains the unusual gene PawS1 (Preproalbumin with SFTI-1) that encodes a precursor for seed storage albumin; however, in a region usually discarded during albumin maturation, its sequence is matured into SFTI-1, a protease-inhibiting cyclic peptide with a motif homologous to unrelated inhibitors from legumes, cereals, and frogs. To understand how PawS1 acquired this additional peptide with novel biochemical functionality, we cloned PawS1 genes and showed that this dual destiny is over 18 million years old. This new family of mostly backbone-cyclic peptides is structurally diverse, but the protease-inhibitory motif was restricted to peptides from sunflower and close relatives from its subtribe. We describe a widely distributed, potential evolutionary intermediate PawS-Like1 (PawL1), which is matured into storage albumin, but makes no stable peptide despite possessing residues essential for processing and cyclization from within PawS1. Using sequences we cloned, we retrodict the likely stepwise creation of PawS1’s additional destiny within a simple albumin precursor. We propose that relaxed selection enabled SFTI-1 to evolve its inhibitor function by converging upon a successful sequence and structure. PMID:24681618
Listen up! Processing of intensity change differs for vocal and nonvocal sounds.

PubMed

Schirmer, Annett; Simpson, Elizabeth; Escoffier, Nicolas

2007-10-24

Changes in the intensity of both vocal and nonvocal sounds can be emotionally relevant. However, as only vocal sounds directly reflect communicative intent, intensity change of vocal but not nonvocal sounds is socially relevant. Here we investigated whether a change in sound intensity is processed differently depending on its social relevance. To this end, participants listened passively to a sequence of vocal or nonvocal sounds that contained rare deviants which differed from standards in sound intensity. Concurrently recorded event-related potentials (ERPs) revealed a mismatch negativity (MMN) and P300 effect for intensity change. Direction of intensity change was of little importance for vocal stimulus sequences, which recruited enhanced sensory and attentional resources for both loud and soft deviants. In contrast, intensity change in nonvocal sequences recruited more sensory and attentional resources for loud as compared to soft deviants. This was reflected in markedly larger MMN/P300 amplitudes and shorter P300 latencies for the loud as compared to soft nonvocal deviants. Furthermore, while the processing pattern observed for nonvocal sounds was largely comparable between men and women, sex differences for vocal sounds suggest that women were more sensitive to their social relevance. These findings extend previous evidence of sex differences in vocal processing and add to reports of voice specific processing mechanisms by demonstrating that simple acoustic change recruits more processing resources if it is socially relevant.
ITSoneDB: a comprehensive collection of eukaryotic ribosomal RNA Internal Transcribed Spacer 1 (ITS1) sequences

PubMed Central

Santamaria, Monica; Fosso, Bruno; Licciulli, Flavio; Balech, Bachir; Larini, Ilaria; Grillo, Giorgio; De Caro, Giorgio; Liuni, Sabino

2018-01-01

A holistic understanding of environmental communities is the new challenge of metagenomics. Accordingly, the amplicon-based or metabarcoding approach, largely applied to investigate bacterial microbiomes, is moving to the eukaryotic world too. Indeed, the analysis of metabarcoding data may provide a comprehensive assessment of both bacterial and eukaryotic composition in a variety of environments, including the human body. In this respect, whereas hypervariable regions of the 16S rRNA are the de facto standard barcode for bacteria, the Internal Transcribed Spacer 1 (ITS1) of the ribosomal RNA gene cluster has shown a high potential in discriminating eukaryotes at deep taxonomic levels. As metabarcoding data analysis relies on the availability of a well-curated barcode reference resource, a comprehensive collection of ITS1 sequences supplied with robust taxonomies is highly needed. To address this issue, we created ITSoneDB (available at http://itsonedb.cloud.ba.infn.it/), which in its current version hosts 985 240 ITS1 sequences spanning over 134 000 eukaryotic species. Each ITS1 is mapped on the NCBI reference taxonomy with its start and end positions precisely annotated. ITSoneDB has been developed in agreement with the FAIR guidelines by enabling users to query and download its content through a simple web interface and to access relevant metadata by cross-linking to the European Nucleotide Archive. PMID:29036529
Fast parallel tandem mass spectral library searching using GPU hardware acceleration

PubMed Central

Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K.; Martin, Daniel B.

2011-01-01

Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching) is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment. PMID:21545112
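The "multiplication of two vectors" at the core of spectral library searching can be illustrated with a minimal Python sketch. This is not the FastPaSS or SpectraST code: the function names, the fixed-length binned intensity vectors and the toy library entries are all assumptions made for illustration, and the loop runs serially on the CPU rather than in parallel on a GPU.

```python
import math

def normalize(spectrum):
    """Scale a binned intensity vector to unit length (a zero vector stays zero)."""
    norm = math.sqrt(sum(x * x for x in spectrum))
    return [x / norm for x in spectrum] if norm > 0 else list(spectrum)

def dot_score(query, candidate):
    """Dot product of two unit-length binned spectra; 1.0 means identical shape."""
    return sum(q * c for q, c in zip(normalize(query), normalize(candidate)))

def best_match(query, library):
    """Return (name, score) of the library entry with the highest dot score."""
    scored = ((name, dot_score(query, spec)) for name, spec in library.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical library: entry names and intensities are invented for the example.
library = {
    "PEPTIDE_A": [0.0, 5.0, 1.0, 0.0],
    "PEPTIDE_B": [3.0, 0.0, 0.0, 4.0],
}
name, score = best_match([0.0, 10.0, 2.0, 0.0], library)
```

Each query-versus-candidate score is independent of every other, which is why the comparison step lends itself to being distributed over many GPU threads.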
Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology.

PubMed

Cock, Peter J A; Grüning, Björn A; Paszkiewicz, Konrad; Pritchard, Leighton

2013-01-01

The Galaxy Project offers the popular web browser-based platform Galaxy for running bioinformatics tools and constructing simple workflows. Here, we present a broad collection of additional Galaxy tools for large scale analysis of gene and protein sequences. The motivating research theme is the identification of specific genes of interest in a range of non-model organisms, and our central example is the identification and prediction of "effector" proteins produced by plant pathogens in order to manipulate their host plant. This functional annotation of a pathogen's predicted capacity for virulence is a key step in translating sequence data into potential applications in plant pathology. This collection includes novel tools, and widely-used third-party tools such as NCBI BLAST+ wrapped for use within Galaxy. Individual bioinformatics software tools are typically available separately as standalone packages, or in online browser-based form. The Galaxy framework enables the user to combine these and other tools to automate organism scale analyses as workflows, without demanding familiarity with command line tools and scripting. Workflows created using Galaxy can be saved and are reusable, so may be distributed within and between research groups, facilitating the construction of a set of standardised, reusable bioinformatic protocols. The Galaxy tools and workflows described in this manuscript are open source and freely available from the Galaxy Tool Shed (http://usegalaxy.org/toolshed or http://toolshed.g2.bx.psu.edu).
Draft Genome Sequence of Thermoanaerobacter sp. Strain A7A, Reconstructed from a Metagenome Obtained from a High-Temperature Hydrocarbon Reservoir in the Bass Strait, Australia

PubMed Central

Li, Dongmei; Greenfield, Paul; Rosewarne, Carly P.

2013-01-01

The draft genome sequence of Thermoanaerobacter sp. strain A7A was reconstructed from a metagenome of a microbial consortium obtained from the Tuna oil field in the Gippsland Basin, Australia. The organism is a strict anaerobe that is predicted to ferment a range of simple sugars and undertake sulfur reduction. PMID:24029756
Comparison between rpoB and 16S rRNA Gene Sequencing for Molecular Identification of 168 Clinical Isolates of Corynebacterium

PubMed Central

Khamis, Atieh; Raoult, Didier; La Scola, Bernard

2005-01-01

Higher proportions (91%) of 168 corynebacterial isolates were positively identified by partial rpoB gene determination than by that based on 16S rRNA gene sequences. This method is thus a simple, molecular-analysis-based method for identification of corynebacteria, but it should be used in conjunction with other tests for definitive identification. PMID:15815024
Long-term Recurrent Convolutional Networks for Visual Recognition and Description

DTIC Science & Technology

2014-11-17

...models which are also recurrent, or “temporally deep”, are effective for tasks involving sequences, visual and otherwise. We develop a novel recurrent convolutional architecture suitable for large...limitation of simple RNN models which strictly integrate state information over time is known as the “vanishing gradient” effect: the ability to

Detection of genomic rearrangements in cucumber using genomecmp software

NASA Astrophysics Data System (ADS)

Kulawik, Maciej; Pawełkowicz, Magdalena Ewa; Wojcieszek, Michał; Pląder, Wojciech; Nowak, Robert M.

2017-08-01

Comparative genomics is a rapidly evolving science, driven by the increasing amount of genome sequence information available in databases. A simple comparison of the general features of genomes, such as genome size, number of genes, and chromosome number, presents an entry point into comparative genomic analysis. Here we present the utility of the new tool genomecmp for finding rearrangements across the compared sequences and its applications in plant comparative genomics.
Molecular Identification of Sex in Phoenix dactylifera Using Inter Simple Sequence Repeat Markers.

PubMed

Al-Ameri, Abdulhafed A; Al-Qurainy, Fahad; Gaafar, Abdel-Rhman Z; Khan, Salim; Nadeem, M

2016-01-01

Early sex identification of Date Palm (Phoenix dactylifera L.) at the seedling stage is an economically desirable objective, which will significantly increase the profits of seed-based cultivation. The utilization of molecular markers at this stage for early and rapid identification of sex is important due to the lack of morphological markers. In this study, a total of two hundred Inter Simple Sequence Repeat (ISSR) primers were screened among male and female Date Palm plants to identify putative sex-specific markers, out of which only two primers (IS_A02 and IS_A71) were found to be associated with sex. The primer IS_A02 produced a unique band of size 390 bp that was clearly found in all female plants, while it was absent in all male plants. In contrast, the primer IS_A71 produced a unique band of size 380 bp that was clearly found in all male plants, whereas it was absent in all the female plants. Subsequently, these specific fragments were excised, purified, and sequenced for the future development of sequence-specific markers for sex determination in dioecious Date Palm. These markers are efficient, highly reliable, and reproducible for sex identification at the early seedling stage.
Plasmonic SERS nanochips and nanoprobes for medical diagnostics and bio-energy applications

NASA Astrophysics Data System (ADS)

Ngo, Hoan T.; Wang, Hsin-Neng; Crawford, Bridget M.; Fales, Andrew M.; Vo-Dinh, Tuan

2017-02-01

The development of rapid, easy-to-use, cost-effective, highly accurate, and highly sensitive DNA detection methods for molecular diagnostics has been receiving increasing interest. Over the last five years, our laboratory has developed several chip-based DNA detection techniques including the molecular sentinel-on-chip (MSC), the multiplex MSC, and the inverse molecular sentinel-on-chip (iMS-on-Chip). In these techniques, plasmonic surface-enhanced Raman scattering (SERS) Nanowave chips were functionalized with DNA probes for single-step DNA detection. Sensing mechanisms were based on hybridization of target sequences and DNA probes, resulting in a distance change between SERS reporters and the Nanowave chip's gold surface. This distance change resulted in a change in SERS intensity, thus indicating the presence and capture of the target sequences. Target sequences were detected by simple delivery of sample solutions onto DNA probe-functionalized Nanowave chips, and SERS signals were measured after 1-2 h of incubation. Target sequence labeling or washing to remove unreacted components was not required, making the techniques simple, easy-to-use, and cost-effective. The usefulness of the techniques for medical diagnostics was illustrated by the detection of genetic biomarkers for respiratory viral infection and of dengue virus 4 DNA.
Hermes Transposon Distribution and Structure in Musca domestica

PubMed Central

Subramanian, Ramanand A.; Cathcart, Laura A.; Krafsur, Elliot S.; Atkinson, Peter W.

2009-01-01

Hermes are hAT transposons from Musca domestica that are very closely related to the hobo transposons from Drosophila melanogaster and are useful as gene vectors in a wide variety of organisms including insects, planaria, and yeast. hobo elements show distinct length variations in a rapidly evolving region of the transposase-coding region as a result of expansions and contractions of a simple repeat sequence encoding 3 amino acids threonine, proline, and glutamic acid (TPE). These variations in length may influence the function of the protein and the movement of hobo transposons in natural populations. Here, we determine the distribution of Hermes in populations of M. domestica as well as whether Hermes transposase has undergone similar sequence expansions and contractions during its evolution in this species. Hermes transposons were found in all M. domestica individuals sampled from 14 populations collected from 4 continents. All individuals with Hermes transposons had evidence for the presence of intact transposase open reading frames, and little sequence variation was observed among Hermes elements. A systematic analysis of the TPE-homologous region of the Hermes transposase-coding region revealed no evidence for length variation. The simple sequence repeat found in hobo elements is a feature of this transposon that evolved since the divergence of hobo and Hermes. PMID:19366812
Learning Orthographic Structure With Sequential Generative Neural Networks.

PubMed

Testolin, Alberto; Stoianov, Ivilin; Sperduti, Alessandro; Zorzi, Marco

2016-04-01

Learning the structure of event sequences is a ubiquitous problem in cognition and particularly in language. One possible solution is to learn a probabilistic generative model of sequences that allows making predictions about upcoming events. Though appealing from a neurobiological standpoint, this approach is typically not pursued in connectionist modeling. Here, we investigated a sequential version of the restricted Boltzmann machine (RBM), a stochastic recurrent neural network that extracts high-order structure from sensory data through unsupervised generative learning and can encode contextual information in the form of internal, distributed representations. We assessed whether this type of network can extract the orthographic structure of English monosyllables by learning a generative model of the letter sequences forming a word training corpus. We show that the network learned an accurate probabilistic model of English graphotactics, which can be used to make predictions about the letter following a given context as well as to autonomously generate high-quality pseudowords. The model was compared to an extended version of simple recurrent networks, augmented with a stochastic process that allows autonomous generation of sequences, and to non-connectionist probabilistic models (n-grams and hidden Markov models). We conclude that sequential RBMs and stochastic simple recurrent networks are promising candidates for modeling cognition in the temporal domain. Copyright © 2015 Cognitive Science Society, Inc.
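The n-gram baseline mentioned in this abstract can be illustrated with a minimal Python sketch of a letter bigram model. This is not the authors' model or corpus: the `^` and `$` boundary markers, the function names and the four-word training list are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

def train_bigrams(words):
    """Count letter-to-letter transitions, with ^ and $ marking word boundaries."""
    counts = defaultdict(Counter)
    for word in words:
        padded = "^" + word + "$"
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1
    return counts

def next_letter_probs(counts, prev):
    """Relative frequency of each symbol observed after `prev` (empty if unseen)."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return {ch: n / total for ch, n in following.items()} if total else {}

# Toy corpus, invented for the example.
model = train_bigrams(["cat", "cab", "can", "cot"])
probs = next_letter_probs(model, "c")
```

A bigram model conditions only on the single preceding letter, so it cannot capture the longer-range context that the sequential RBM and recurrent networks in the abstract are designed to encode.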
A Simple Method for the Extraction, PCR-amplification, Cloning, and Sequencing of Pasteuria 16S rDNA from Small Numbers of Endospores.

PubMed

Atibalentja, N; Noel, G R; Ciancio, A

2004-03-01

For many years the taxonomy of the genus Pasteuria has been marred with confusion because the bacterium could not be cultured in vitro and, therefore, descriptions were based solely on morphological, developmental, and pathological characteristics. The current study sought to devise a simple method for PCR-amplification, cloning, and sequencing of Pasteuria 16S rDNA from small numbers of endospores, with no need for prior DNA purification. Results show that DNA extracts from plain glass bead-beating of crude suspensions containing 10,000 endospores at 0.2 x 10 endospores ml(-1) were sufficient for PCR-amplification of Pasteuria 16S rDNA, when used in conjunction with specific primers. These results imply that for P. penetrans and P. nishizawae only one parasitized female of Meloidogyne spp. and Heterodera glycines, respectively, should be sufficient, and as few as eight cadavers of Belonolaimus longicaudatus with an average number of 1,250 endospores of "Candidatus Pasteuria usgae" are needed for PCR-amplification of Pasteuria 16S rDNA. The method described in this paper should facilitate the sequencing of the 16S rDNA of the many Pasteuria isolates that have been reported on nematodes and, consequently, expedite the classification of those isolates through comparative sequence analysis.
GIGA: a simple, efficient algorithm for gene tree inference in the genomic age

PubMed Central

2010-01-01

Background: Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. Results: We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. Conclusions: GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. 
We compared trees produced by GIGA to those in the TreeFam database, and they were very similar in general, with most differences likely due to poor alignment quality. However, some remaining differences are algorithmic, and can be explained by the fact that GIGA tends to put a larger emphasis on minimizing gene duplication and deletion events. PMID:20534164
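The "very simple distance metric (pairwise sequence differences)" named in this abstract can be illustrated with a short Python sketch. This shows only the distance matrix and the selection of the closest pair, the usual first step of an agglomerative method. It is not the GIGA algorithm, which additionally uses the species tree and the inferred evolutionary events. The function names and the toy pre-aligned sequences are assumptions made for illustration.

```python
def pairwise_difference(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)

def distance_matrix(seqs):
    """Map every ordered pair of sequence names to its pairwise difference."""
    names = sorted(seqs)
    return {(p, q): pairwise_difference(seqs[p], seqs[q]) for p in names for q in names}

def closest_pair(seqs):
    """Return the two distinct names with the smallest distance."""
    matrix = distance_matrix(seqs)
    return min((pair for pair in matrix if pair[0] < pair[1]), key=matrix.get)

# Toy aligned sequences, invented for the example.
aligned = {"human": "ACGTACGT", "mouse": "ACGTACGA", "fly": "TCGAACTA"}
matrix = distance_matrix(aligned)
pair = closest_pair(aligned)
```

An agglomerative method would merge the closest pair into a single node and repeat; how GIGA decides which merges are permitted is described in the paper itself and is not reproduced here.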
GIGA: a simple, efficient algorithm for gene tree inference in the genomic age.

PubMed

Thomas, Paul D

2010-06-09

Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. 
We compared trees produced by GIGA to those in the TreeFam database, and they were very similar in general, with most differences likely due to poor alignment quality. However, some remaining differences are algorithmic, and can be explained by the fact that GIGA tends to put a larger emphasis on minimizing gene duplication and deletion events.
Application of denaturing gradient gel electrophoresis (DGGE) to the analysis of microbial communities of subgingival plaque.

PubMed

Fujimoto, C; Maeda, H; Kokeguchi, S; Takashiba, S; Nishimura, F; Arai, H; Fukui, K; Murayama, Y

2003-08-01

Denaturing gradient gel electrophoresis (DGGE) was applied to the microbiologic examination of subgingival plaque. The PCR primers were designed from conserved nucleotide sequences on 16S ribosomal RNA gene (16SrDNA) with GC rich clamp at the 5'-end. Polymerase chain reaction (PCR) was performed using the primers and genomic DNAs of typical periodontal bacteria. The generated 16SrDNA fragments were separated by denaturing gel. Although the sizes of the amplified DNA fragments were almost the same among the species, 16SrDNAs of the periodontal bacteria were distinguished according to their specific sequences. The microflora of clinical plaque samples were profiled by the PCR-DGGE method, and the dominant 16SrDNA bands were cloned and sequenced. Simultaneously, Actinobacillus actinomycetemcomitans, Porphyromonas gingivalis and Prevotella intermedia were detected by an ordinary PCR method. In the deep periodontal pockets, the bacterial community structures were complicated and P. gingivalis was the most dominant species, whereas the DGGE profiles were simple and Streptococcus or Neisseria species were dominant in the shallow pockets. The species-specific PCR method revealed the presence of A. actinomycetemcomitans, P. gingivalis and P. intermedia in the clinical samples. However, corresponding bands were not always observed in the DGGE profiles, indicating a lower sensitivity of the DGGE method. Although the DGGE method may have a lower sensitivity than the ordinary PCR methods, it could visualize the bacterial qualitative compositions and reveal the major species of the plaque. The DGGE analysis and following sequencing may have the potential to be a promising bacterial examination procedure in periodontal diseases.
Handling the data management needs of high-throughput sequencing data: SpeedGene, a compression algorithm for the efficient storage of genetic data

PubMed Central

2012-01-01

Background: As Next-Generation Sequencing data becomes available, existing hardware environments do not provide sufficient storage space and computational power to store and process the data due to its enormous size. This is and will be a frequent problem that is encountered every day by researchers who are working on genetic data. There are some options available for compressing and storing such data, such as general-purpose compression software, PBAT/PLINK binary format, etc. However, these currently available methods either do not offer sufficient compression rates, or require a great amount of CPU time for decompression and loading every time the data is accessed. Results: Here, we propose a novel and simple algorithm for storing such sequencing data. We show that the compression factor of the algorithm ranges from 16 to several hundreds, which potentially allows SNP data of hundreds of Gigabytes to be stored in hundreds of Megabytes. We provide a C++ implementation of the algorithm, which supports direct loading and parallel loading of the compressed format without requiring extra time for decompression. By applying the algorithm to simulated and real datasets, we show that the algorithm gives a greater compression rate than the commonly used compression methods, and the data-loading process takes less time. Also, the C++ library provides direct-data-retrieving functions, which allow the compressed information to be easily accessed by other C++ programs. Conclusions: The SpeedGene algorithm enables the storage and the analysis of next generation sequencing data in the current hardware environment, making system upgrades unnecessary. PMID:22591016
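The general idea of storing genotypes compactly while still reading single values without a decompression pass can be illustrated with a minimal Python sketch. This is not the SpeedGene algorithm, whose encoding is not described in the abstract. It is a generic 2-bit packing, and the genotype coding (0, 1, 2 for the allele count and 3 for missing) and the function names are assumptions made for illustration.

```python
def pack_genotypes(genotypes):
    """Pack genotype codes 0..3 into bytes, four codes per byte (2 bits each)."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if not 0 <= g <= 3:
            raise ValueError("genotype codes must be in the range 0..3")
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)

def get_genotype(packed, index):
    """Read one genotype code directly from the packed bytes."""
    return (packed[index // 4] >> (2 * (index % 4))) & 0b11

# Hypothetical genotype calls for six individuals at one SNP.
genotypes = [0, 1, 2, 3, 2, 1]
packed = pack_genotypes(genotypes)
restored = [get_genotype(packed, i) for i in range(len(genotypes))]
```

Six genotypes occupy two bytes here instead of six one-character codes. Fixed-width packing of this kind yields at most a fourfold reduction over one byte per genotype, so the larger factors reported in the abstract must come from the authors' own encoding.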
QTL mapping for flowering-time and photoperiod insensitivity of cotton Gossypium darwinii Watt.

PubMed

Kushanov, Fakhriddin N; Buriev, Zabardast T; Shermatov, Shukhrat E; Turaev, Ozod S; Norov, Tokhir M; Pepper, Alan E; Saha, Sukumar; Ulloa, Mauricio; Yu, John Z; Jenkins, Johnie N; Abdukarimov, Abdusattor; Abdurakhmonov, Ibrokhim Y

2017-01-01

Most wild and semi-wild species of the genus Gossypium exhibit photoperiod-sensitive flowering. Wild cotton germplasm is a valuable source of genes for genetic improvement of modern cotton cultivars. A bi-parental cotton population segregating for photoperiodic flowering was developed by crossing a photoperiod-insensitive irradiation mutant line with its pre-mutagenesis photoperiodic wild-type G. darwinii Watt genotype. Individuals from the F2 and F3 generations were grown with their parental lines and F1 hybrid progeny in the long-day and short-night summer condition (natural day-length) of Uzbekistan to evaluate photoperiod sensitivity, i.e., flowering-time, during the seasons 2008-2009. Through genotyping the individuals of this bi-parental population segregating for flowering-time, linkage maps were constructed using 212 simple-sequence repeat (SSR) and three cleaved amplified polymorphic sequence (CAPS) markers. Six QTLs directly associated with flowering-time and photoperiodic flowering were discovered in the F2 population, whereas eight QTLs were identified in the F3 population. Two QTLs controlling photoperiodic flowering and duration of flowering were common to both populations. In silico annotations of the flanking DNA sequences of mapped SSRs from the sequenced cotton (G. hirsutum L.) genome database identified several potential 'candidate' genes that are known to be associated with regulation of flowering characteristics of plants. The outcome of this research will expand our understanding of the genetic and molecular mechanisms of photoperiodic flowering. Identified markers should be useful for marker-assisted selection in cotton breeding to improve early flowering characteristics.
Genome and Transcriptome sequence of Finger millet (Eleusine coracana (L.) Gaertn.) provides insights into drought tolerance and nutraceutical properties.

PubMed

Hittalmani, Shailaja; Mahesh, H B; Shirke, Meghana Deepak; Biradar, Hanamareddy; Uday, Govindareddy; Aruna, Y R; Lohithaswa, H C; Mohanrao, A

2017-06-15

Finger millet (Eleusine coracana (L.) Gaertn.) is an important staple food crop widely grown in Africa and South Asia. Among the millets, finger millet has a high content of calcium, methionine, tryptophan, fiber, and sulphur-containing amino acids. In addition, it has the C4 photosynthetic carbon assimilation mechanism, which helps it to utilize water and nitrogen efficiently under hot and arid conditions without severely affecting yield. Therefore, development and utilization of genomic resources for genetic improvement of this crop is immensely useful. Whole genome sequencing and assembly of the ML-365 finger millet cultivar yielded 1196 Mb, covering approximately 82% of the total estimated genome size. Genome analysis showed the presence of 85,243 genes, and one half of the genome is repetitive in nature. The finger millet genome was found to have higher colinearity with foxtail millet and rice as compared to other Poaceae species. Mining of simple sequence repeats (SSRs) revealed an abundance of SSRs within the finger millet genome. Functional annotation and mining of transcription factors revealed that the finger millet genome harbors a large number of drought tolerance related genes. Transcriptome analysis of low moisture stress and non-stress samples identified several drought-induced candidate genes, which could be used in drought tolerance breeding. This genome sequencing effort will support plant breeders in allele discovery, genetic mapping, and identification of candidate genes for agronomically important traits. Availability of genomic resources for finger millet will enhance novel breeding possibilities to address potential challenges of finger millet improvement.
RNA-seq analysis and de novo transcriptome assembly of Jerusalem artichoke (Helianthus tuberosus Linne).

PubMed

Jung, Won Yong; Lee, Sang Sook; Kim, Chul Wook; Kim, Hyun-Soon; Min, Sung Ran; Moon, Jae Sun; Kwon, Suk-Yoon; Jeon, Jae-Heung; Cho, Hye Sun

2014-01-01

Jerusalem artichoke (Helianthus tuberosus L.) has long been cultivated as a vegetable and as a source of fructans (inulin) for pharmaceutical applications in diabetes and obesity prevention. However, transcriptomic and genomic data for Jerusalem artichoke remain scarce. In this study, Illumina RNA sequencing (RNA-Seq) was performed on samples from Jerusalem artichoke leaves, roots, stems and two different tuber tissues (early and late tuber development). Data were used for de novo assembly and characterization of the transcriptome. In total 206,215,632 paired-end reads were generated. These were assembled into 66,322 loci with 272,548 transcripts. Loci were annotated by querying against the NCBI non-redundant, Phytozome and UniProt databases, and 40,215 loci were homologous to existing database sequences. Gene Ontology terms were assigned to 19,848 loci, 15,434 loci were matched to 25 Clusters of Eukaryotic Orthologous Groups classifications, and 11,844 loci were classified into 142 Kyoto Encyclopedia of Genes and Genomes pathways. The assembled loci also contained 10,778 potential simple sequence repeats. The newly assembled transcriptome was used to identify loci with tissue-specific differential expression patterns. In total, 670 loci exhibited tissue-specific expression, and a subset of these were confirmed using RT-PCR and qRT-PCR. Gene expression related to inulin biosynthesis in tuber tissue was also investigated. Existing genetic and genomic data for H. tuberosus are scarce. The sequence resources developed in this study will enable the analysis of thousands of transcripts and will thus accelerate marker-assisted breeding studies and studies of inulin biosynthesis in Jerusalem artichoke.
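The abstract above reports potential simple sequence repeats in the assembled loci but does not say how they were detected. As a rough illustration only, the sketch below finds perfect di-, tri- and tetranucleotide repeats with regular expressions. The function name `find_ssrs` and the minimum repeat counts are assumptions chosen for the example, not the study's settings.

```python
import re

# Minimum repeat counts per motif length; illustrative assumptions,
# not the thresholds used in the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_n in sorted(min_repeats.items()):
        # One motif of `size` bases followed by at least min_n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AAA", "ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)
```

For example, `find_ssrs("GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC")` reports one (AT)7 repeat and one (GAA)5 repeat. Dedicated SSR-mining tools also handle imperfect and compound repeats, which this sketch ignores.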
A Simple Exact Error Rate Analysis for DS-CDMA with Arbitrary Pulse Shape in Flat Nakagami Fading

NASA Astrophysics Data System (ADS)

Rahman, Mohammad Azizur; Sasaki, Shigenobu; Kikuchi, Hisakazu; Harada, Hiroshi; Kato, Shuzo

A simple exact error rate analysis is presented for random binary direct sequence code division multiple access (DS-CDMA) considering a general pulse shape and flat Nakagami fading channel. First of all, a simple model is developed for the multiple access interference (MAI). Based on this, a simple exact expression of the characteristic function (CF) of MAI is developed in a straightforward manner. Finally, an exact expression of error rate is obtained following the CF method of error rate analysis. The exact error rate so obtained can be evaluated much more easily than the only reliable approximate error rate expression currently available, which is based on the Improved Gaussian Approximation (IGA).
Easy design of colorimetric logic gates based on nonnatural base pairing and controlled assembly of gold nanoparticles.

PubMed

Zhang, Li; Wang, Zhong-Xia; Liang, Ru-Ping; Qiu, Jian-Ding

2013-07-16

Utilizing the principles of metal-ion-mediated base pairs (C-Ag-C and T-Hg-T), the pH-sensitive conformational transition of C-rich DNA strand, and the ligand-exchange process triggered by DL-dithiothreitol (DTT), a system of colorimetric logic gates (YES, AND, INHIBIT, and XOR) can be rationally constructed based on the aggregation of the DNA-modified Au NPs. The proposed logic operation system is simple, which consists of only T-/C-rich DNA-modified Au NPs, and it is unnecessary to exquisitely design and alter the DNA sequence for different multiple molecular logic operations. The nonnatural base pairing combined with unique optical properties of Au NPs promises great potential in multiplexed ion sensing, molecular-scale computers, and other computational logic devices.
Sequence-specific label-free nucleic acid biosensor for the detection of the hepatitis C virus genotype 1a using a disposable pencil graphite electrode.

PubMed

Donmez, Soner; Arslan, Fatma; Arslan, Halit

2016-05-01

In this paper, we demonstrate a simple, sensitive, inexpensive, disposable and label-free electrochemical nucleic acid biosensor for the detection of the hepatitis C virus genotype 1a (HCV1a). The nucleic acid biosensor was designed with the amino-linked inosine-substituted 20-mer probes, which were immobilized onto a disposable pencil graphite electrode (PGE) by covalent linking. The response of the proposed nucleic acid biosensor was linear in the range of 0.05 to 0.75 μM, with a limit of detection of 54.9 nM. The single-stranded synthetic PCR product analogs of HCV1a were also detected with satisfactory results under optimal conditions, showing the potential application of this biosensor.
Utilization of FEP energetics

NASA Technical Reports Server (NTRS)

Frederking, T. H. K.; Abbassi, P.; Afifi, F.; Khandhar, P. K.; Ono, D. Y.; Chen, W. E. W.

1987-01-01

The research and development work on Fountain Effect Pump Systems (FEP systems) has been of interest in the competition between mechanical pumps for He II and FEP units. The latter do not have moving parts. In the course of the work, the energetics have been addressed using one part of a simple four-changes-of-state cycle. One option is the FEP ideal change of state at constant chemical potential (mu). The other option is the two-state sequence mu-P with a d mu=0 state change followed by an isobar. Questions of pump behavior, of flow rate response to temperature difference at the hot end, and related questions of thermodynamic cycle completion and heat transfer have been addressed. Porous media data obtained elucidate differences between vapor-liquid phase separation (VLPS) and Zero Net Mass Transfer (ZNMF).
LISTA, LISTA-HOP and LISTA-HON: a comprehensive compilation of protein encoding sequences and its associated homology databases from the yeast Saccharomyces.

PubMed Central

Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P

1994-01-01

We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. In this database each sequence has been attributed a single genetic name. In the case of duplicated sequences a simple method has been applied to distinguish between sequences of one and the same gene from non-allelic sequences of duplicated genes. If necessary, synonyms are given in the case of allelic duplicated sequences. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, the reference of the publication of the sequence, the chromosomal location as far as known, and the Swissprot and EMBL accession numbers. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections, resulting in LISTA-HON and LISTA-HOP. The LISTA database can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). PMID:7937046
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies

PubMed Central

2010-01-01

Background: All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. Results: The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. Conclusions: This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study is limited, and is consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general. PMID:20144194
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies.

PubMed

David, Maria Pamela C; Concepcion, Gisela P; Padlan, Eduardo A

2010-02-08

All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study is limited, and is consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general.
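To make the evaluation procedure in the abstract above concrete, here is a minimal sketch of a naive Bayes classifier scored by leave-one-out cross validation. The abstract does not specify the features used, so overall residue composition with add-one smoothing is an assumption, and the function names (`train`, `predict`, `loo_accuracy`) are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train(samples):
    """samples: list of (sequence, label). Returns {label: (log_prior, log_likelihoods)}."""
    counts, n_per_class = {}, Counter()
    for seq, label in samples:
        n_per_class[label] += 1
        counts.setdefault(label, Counter()).update(c for c in seq if c in AMINO_ACIDS)
    model = {}
    for label, c in counts.items():
        total = sum(c.values()) + len(AMINO_ACIDS)  # add-one (Laplace) smoothing
        log_like = {a: math.log((c[a] + 1) / total) for a in AMINO_ACIDS}
        model[label] = (math.log(n_per_class[label] / len(samples)), log_like)
    return model


def predict(model, seq):
    """Return the label with the highest posterior log-score for seq."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[c] for c in seq if c in AMINO_ACIDS)
    return max(sorted(model), key=score)


def loo_accuracy(samples):
    """Leave-one-out: train on all samples but one, test on the held-out one."""
    correct = 0
    for i, (seq, label) in enumerate(samples):
        model = train(samples[:i] + samples[i + 1:])
        correct += predict(model, seq) == label
    return correct / len(samples)
```

Each sequence is held out once, so the reported accuracy is the fraction of held-out sequences assigned their true label.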

High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

PubMed

Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

2015-01-01

Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
Directionality analysis on functional magnetic resonance imaging during motor task using Granger causality.

PubMed

Anwar, A R; Muthalib, M; Perrey, S; Galka, A; Granert, O; Wolff, S; Deuschl, G; Raethjen, J; Heute, U; Muthuraman, M

2012-01-01

Directionality analysis of signals originating from different parts of the brain during motor tasks has gained a lot of interest. Since brain activity can be recorded over time, methods of time series analysis can be applied to medical time series as well. Granger Causality is a method to find a causal relationship between time series. Such causality can be referred to as a directional connection and is not necessarily bidirectional. The aim of this study is to differentiate between different motor tasks on the basis of activation maps and also to understand the nature of connections present between different parts of the brain. In this paper, three different motor tasks (finger tapping, simple finger sequencing, and complex finger sequencing) are analyzed. Time series for each task were extracted from functional magnetic resonance imaging (fMRI) data, which have a very good spatial resolution and can look into the sub-cortical regions of the brain. Activation maps based on fMRI images show that, in the case of complex finger sequencing, most parts of the brain are active, unlike finger tapping, during which only limited regions show activity. Directionality analysis on time series extracted from contralateral motor cortex (CMC), supplementary motor area (SMA), and cerebellum (CER) shows bidirectional connections between these parts of the brain. In the case of simple finger sequencing and complex finger sequencing, the strongest connections originate from SMA and CMC, while connections originating from CER in either direction are the weakest ones in magnitude during all paradigms.
Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity.

PubMed

Mulligan, M E; Hawley, D K; Entriken, R; McClure, W R

1984-01-11

We describe a simple algorithm for computing a homology score for Escherichia coli promoters based on DNA sequence alone. The homology score was related to 31 values, measured in vitro, of RNA polymerase selectivity, which we define as the product KBk2, the apparent second order rate constant for open complex formation. We found that promoter strength could be predicted to within a factor of ±4.1 in KBk2 over a range of 10^4 in the same parameter. The quantitative evaluation was linked to an automated (Apple II) procedure for searching and evaluating possible promoters in DNA sequence files.
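The abstract above does not give the scoring scheme, so the following is only a simplified stand-in for the general idea of scoring candidate promoters from sequence alone. It counts matches to the widely cited -35 (TTGACA) and -10 (TATAAT) consensus hexamers and subtracts a penalty for spacers that differ from 17 bp. The penalty, the spacer range, and the name `best_promoter_hit` are assumptions, not the authors' weights.

```python
MINUS_35 = "TTGACA"  # widely cited sigma-70 consensus hexamers
MINUS_10 = "TATAAT"


def matches(a, b):
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def best_promoter_hit(seq, spacers=range(15, 20)):
    """Return (score, start, spacer) of the best window, or None if seq is too short."""
    seq = seq.upper()
    best = None
    for spacer in spacers:
        span = 6 + spacer + 6
        for start in range(len(seq) - span + 1):
            box35 = seq[start:start + 6]
            box10 = seq[start + 6 + spacer:start + span]
            # Hypothetical scoring: matched bases minus a penalty for non-17 bp spacing.
            score = matches(box35, MINUS_35) + matches(box10, MINUS_10) - abs(spacer - 17)
            if best is None or score > best[0]:
                best = (score, start, spacer)
    return best
```

A frequency-weighted matrix built from aligned known promoters would be closer in spirit to a homology score than this plain match count.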
Simple data-smoothing and noise-suppression technique

NASA Technical Reports Server (NTRS)

Duty, R. L.

1970-01-01

Algorithm, based on the Borel method of summing divergent sequences, is used for smoothing noisy data where knowledge of frequency content is not required. Technique's effectiveness is demonstrated by a series of graphs.
Building Mathematical Models of Simple Harmonic and Damped Motion.

ERIC Educational Resources Information Center

Edwards, Thomas

1995-01-01

By developing a sequence of mathematical models of harmonic motion, shows that mathematical models are not right or wrong, but instead are better or poorer representations of the problem situation. (MKR)
Genetic variation assessment of acid lime accessions collected from south of Iran using SSR and ISSR molecular markers.

PubMed

Sharafi, Ata Allah; Abkenar, Asad Asadi; Sharafi, Ali; Masaeli, Mohammad

2016-01-01

Iran has a long history of acid lime cultivation and propagation. In this study, genetic variation in 28 acid lime accessions from five regions of southern Iran, and their relatedness to 19 other citrus cultivars, were analyzed using Simple Sequence Repeat (SSR) and Inter-Simple Sequence Repeat (ISSR) molecular markers. Nine SSR primers and nine ISSR primers were used for allele scoring. In total, 49 SSR and 131 ISSR polymorphic alleles were detected. Cluster analysis of SSR and ISSR data showed that most of the acid lime accessions (19 genotypes) have a hybrid origin and are genetically distant from nucellar Mexican lime (9 genotypes). As nucellar Mexican lime is susceptible to phytoplasma, these acid lime genotypes can be evaluated for tolerance to biotic constraints such as lime "witches' broom disease".
A set of tetra-nucleotide core motif SSR markers for efficient identification of potato (Solanum tuberosum) cultivars.

PubMed

Kishine, Masahiro; Tsutsumi, Katsuji; Kitta, Kazumi

2017-12-01

Simple sequence repeat (SSR) is a popular tool for individual fingerprinting. The long-core motif (e.g. tetra-, penta-, and hexa-nucleotide) simple sequence repeats (SSRs) are preferred because they make it easier to separate and distinguish neighbor alleles. In the present study, a new set of 8 tetra-nucleotide SSRs in potato (Solanum tuberosum) is reported. By using these 8 markers, 72 out of 76 cultivars obtained from Japan and the United States were clearly discriminated, while two pairs, both of which arose from natural variation, showed identical profiles. The combined probability of identity between two random cultivars for the set of 8 SSR markers was estimated to be 1.10 × 10^-8, confirming the usefulness of the proposed SSR markers for fingerprinting analyses of potato.
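A combined probability of identity is typically the product of per-locus values. The sketch below uses the textbook diploid, random-mating formula PI = sum(p_i^4) + sum over i<j of (2 p_i p_j)^2. Cultivated potato is tetraploid, and the abstract does not state how its estimate was computed, so treat the formula choice and the function names as assumptions for illustration.

```python
from itertools import combinations
from math import prod


def probability_of_identity(freqs):
    """Per-locus PI under a diploid random-mating model (an assumption here)."""
    homozygous = sum(p ** 4 for p in freqs)
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygous + heterozygous


def combined_pi(loci):
    """Multiply per-locus PI values, assuming the loci are independent."""
    return prod(probability_of_identity(f) for f in loci)
```

With two equally frequent alleles at a locus, `probability_of_identity([0.5, 0.5])` is 0.375; adding independent loci shrinks the combined value multiplicatively, which is why a small marker set can reach very low values.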
PERMutation Using Transposase Engineering (PERMUTE): A Simple Approach for Constructing Circularly Permuted Protein Libraries.

PubMed

Jones, Alicia M; Atkinson, Joshua T; Silberg, Jonathan J

2017-01-01

Rearrangements that alter the order of a protein's sequence are used in the lab to study protein folding, improve activity, and build molecular switches. One of the simplest ways to rearrange a protein sequence is through random circular permutation, where native protein termini are linked together and new termini are created elsewhere through random backbone fission. Transposase mutagenesis has emerged as a simple way to generate libraries encoding different circularly permuted variants of proteins. With this approach, a synthetic transposon (called a permuteposon) is randomly inserted throughout a circularized gene to generate vectors that express different permuted variants of a protein. In this chapter, we outline the protocol for constructing combinatorial libraries of circularly permuted proteins using transposase mutagenesis, and we describe the different permuteposons that have been developed to facilitate library construction.
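At the sequence level, the circular permutation described above amounts to joining the native termini and opening the chain at a new position. The sketch below shows that rearrangement on a string. The optional `linker` argument and the function names are illustrative assumptions, and the code models only the resulting sequences, not the transposon-based library construction.

```python
def circular_permutation(protein, new_start, linker=""):
    """Return the permuted sequence whose first residue is protein[new_start] (0-based)."""
    if not 0 <= new_start < len(protein):
        raise ValueError("new_start must index a residue of the protein")
    # Joining the native C-terminus to the native N-terminus (optionally through
    # a linker) and cutting before new_start gives: tail + linker + head.
    return protein[new_start:] + linker + protein[:new_start]


def permutation_library(protein, linker=""):
    """All permuted variants with new termini at every internal position."""
    return [circular_permutation(protein, i, linker) for i in range(1, len(protein))]
```

For a protein of length n this yields n - 1 permuted variants, one per possible backbone fission site.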
Object-oriented parsing of biological databases with Python.

PubMed

Ramu, C; Gemünd, C; Gibson, T J

2000-07-01

While database activities in the biological area are increasing rapidly, rather little is done in the area of parsing them in a simple and object-oriented way. We present here an elegant, simple yet powerful way of parsing biological flat-file databases. We have taken EMBL, SWISSPROT and GENBANK as examples. EMBL and SWISS-PROT do not differ much in the format structure. GENBANK has a very different format structure than EMBL and SWISS-PROT. Extracting the desired fields in an entry (for example a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community: this is illustrated with tools to make new splice-site databases. The interface to the parser is abstract in the sense that the access to all the databases is independent from their different formats, since parsing instructions are hidden.
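As a rough illustration of the object-oriented parsing idea in the abstract above, the sketch below reads EMBL/SWISS-PROT-style entries, where each line starts with a two-letter code, sequence lines follow the `SQ` line, and `//` ends an entry. The `Entry` class and `parse_entries` function are hypothetical names for this example and are not the interface of the published parser.

```python
class Entry:
    """One flat-file entry: line code -> list of values, plus the sequence."""

    def __init__(self):
        self.fields = {}
        self.sequence = ""

    def get(self, code):
        """Join all values recorded under a two-letter line code."""
        return " ".join(self.fields.get(code, []))


def parse_entries(lines):
    """Yield Entry objects from EMBL/SWISS-PROT-style lines ('//' ends an entry)."""
    entry, in_seq = Entry(), False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("//"):
            yield entry
            entry, in_seq = Entry(), False
        elif in_seq:
            # Sequence lines hold residues in blocks, with position numbers at the end.
            entry.sequence += "".join(c for c in line if c.isalpha())
        elif line[:2] == "SQ":
            in_seq = True
        elif len(line) > 5 and not line.startswith("XX"):
            entry.fields.setdefault(line[:2], []).append(line[5:].strip())
```

Callers ask for fields by code (for example `entry.get("DE")`) without handling the line layout themselves, which is the kind of format independence the abstract describes. GenBank files use a different layout and would need their own reader behind the same interface.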
Short Communication: Genetic linkage map of Cucurbita maxima with molecular and morphological markers.

PubMed

Ge, Y; Li, X; Yang, X X; Cui, C S; Qu, S P

2015-05-22

Cucurbita maxima is one of the most widely cultivated vegetables in China and exhibits distinct morphological characteristics. In this study, genetic linkage analysis with 57 simple-sequence repeats, 21 amplified fragment length polymorphisms, 3 random-amplified polymorphic DNA, and one morphological marker revealed 20 genetic linkage groups of C. maxima covering a genetic distance of 991.5 cM with an average of 12.1 cM between adjacent markers. Genetic linkage analysis identified the simple-sequence repeat marker 'PU078072' 5.9 cM away from the locus 'Rc', which controls rind color. The genetic map in the present study will be useful for better mapping, tagging, and cloning of quantitative trait loci/gene(s) affecting economically important traits and for breeding new varieties of C. maxima through marker-assisted selection.
Genetic diversity of Pinus nigra Arn. populations in Southern Spain and Northern Morocco revealed by inter-simple sequence repeat profiles.

PubMed

Rubio-Moraga, Angela; Candel-Perez, David; Lucas-Borja, Manuel E; Tiscar, Pedro A; Viñegla, Benjamin; Linares, Juan C; Gómez-Gómez, Lourdes; Ahrazem, Oussama

2012-01-01

Eight Pinus nigra Arn. populations from Southern Spain and Northern Morocco were examined using inter-simple sequence repeat markers to characterize the genetic variability amongst populations. Pair-wise population genetic distance ranged from 0.031 to 0.283, with a mean of 0.150 between populations. The highest inter-population average distance was between PaCU from Cuenca and YeCA from Cazorla, while the lowest distance was between TaMO from Morocco and MA Sierra Mágina populations. Analysis of molecular variance (AMOVA) and Nei's genetic diversity analyses revealed higher genetic variation within the same population than among different populations. Genetic differentiation (Gst) was 0.233. Cuenca showed the highest Nei's genetic diversity followed by the Moroccan region, Sierra Mágina, and Cazorla region. However, clustering of populations was not in accordance with their geographical locations. Principal component analysis showed the presence of two major groups (Group 1 contained all populations from Cuenca, while Group 2 contained populations from Cazorla, Sierra Mágina and Morocco), while Bayesian analysis revealed the presence of three clusters. The low genetic diversity observed in PaCU and YeCA is probably a consequence of inappropriate management since no estimation of genetic variability was performed before the silvicultural treatments. Data indicates that the inter-simple sequence repeat (ISSR) method is sufficiently informative and powerful to assess genetic variability among populations of P. nigra.
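The Gst statistic quoted above compares within-population diversity (Hs) with total diversity (Ht): Gst = (Ht - Hs) / Ht, with Nei's gene diversity H = 1 - sum(p_i^2). The sketch below computes it for one locus from allele frequencies that are assumed to be known, with populations weighted equally. ISSR markers are dominant, so real analyses first estimate frequencies from band presence or absence, a step this example leaves out.

```python
def gene_diversity(freqs):
    """Nei's gene diversity for one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def gst(populations):
    """populations: list of allele-frequency lists for one locus (same allele order)."""
    n = len(populations)
    hs = sum(gene_diversity(p) for p in populations) / n  # mean within-population diversity
    mean_freqs = [sum(col) / n for col in zip(*populations)]
    ht = gene_diversity(mean_freqs)  # diversity of the pooled populations
    return (ht - hs) / ht if ht > 0 else 0.0
```

A value near 0 means most variation lies within populations, which is the pattern the abstract reports, while a value of 1 means populations are fixed for different alleles.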
Genetic Diversity of Pinus nigra Arn. Populations in Southern Spain and Northern Morocco Revealed By Inter-Simple Sequence Repeat Profiles

PubMed Central

Rubio-Moraga, Angela; Candel-Perez, David; Lucas-Borja, Manuel E.; Tiscar, Pedro A.; Viñegla, Benjamin; Linares, Juan C.; Gómez-Gómez, Lourdes; Ahrazem, Oussama

2012-01-01

Eight Pinus nigra Arn. populations from Southern Spain and Northern Morocco were examined using inter-simple sequence repeat markers to characterize the genetic variability amongst populations. Pair-wise population genetic distance ranged from 0.031 to 0.283, with a mean of 0.150 between populations. The highest inter-population average distance was between PaCU from Cuenca and YeCA from Cazorla, while the lowest distance was between TaMO from Morocco and MA Sierra Mágina populations. Analysis of molecular variance (AMOVA) and Nei’s genetic diversity analyses revealed higher genetic variation within the same population than among different populations. Genetic differentiation (Gst) was 0.233. Cuenca showed the highest Nei’s genetic diversity followed by the Moroccan region, Sierra Mágina, and Cazorla region. However, clustering of populations was not in accordance with their geographical locations. Principal component analysis showed the presence of two major groups—Group 1 contained all populations from Cuenca while Group 2 contained populations from Cazorla, Sierra Mágina and Morocco—while Bayesian analysis revealed the presence of three clusters. The low genetic diversity observed in PaCU and YeCA is probably a consequence of inappropriate management since no estimation of genetic variability was performed before the silvicultural treatments. Data indicates that the inter-simple sequence repeat (ISSR) method is sufficiently informative and powerful to assess genetic variability among populations of P. nigra. PMID:22754321
Genetic fidelity and variability of micropropagated cassava plants (Manihot esculenta Crantz) evaluated using ISSR markers.

PubMed

Vidal, Á M; Vieira, L J; Ferreira, C F; Souza, F V D; Souza, A S; Ledo, C A S

2015-07-14

Molecular markers are efficient for assessing the genetic fidelity of various species of plants after in vitro culture. In this study, we evaluated the genetic fidelity and variability of micropropagated cassava plants (Manihot esculenta Crantz) using inter-simple sequence repeat markers. Twenty-two cassava accessions from the Embrapa Cassava & Fruits Germplasm Bank were used. For each accession, DNA was extracted from a plant maintained in the field and from 3 plants grown in vitro. For DNA amplification, 27 inter-simple sequence repeat primers were used, of which 24 generated 175 bands; 100 of those bands were polymorphic and were used to study genetic variability among accessions of cassava plants maintained in the field. Based on the genetic distance matrix calculated using the arithmetic complement of the Jaccard's index, genotypes were clustered using the unweighted pair group method using arithmetic averages. The number of bands per primer was 2-13, with an average of 7.3. For most micropropagated accessions, the fidelity study showed no genetic variation between plants of the same accessions maintained in the field and those maintained in vitro, confirming the high genetic fidelity of the micropropagated plants. However, genetic variability was observed among different accessions grown in the field, and clustering based on the dissimilarity matrix revealed 7 groups. Inter-simple sequence repeat markers were efficient for detecting the genetic homogeneity of cassava plants derived from meristem culture, demonstrating the reliability of this propagation system.
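The distance measure named in the abstract above, the arithmetic complement of the Jaccard index, can be computed directly from band presence/absence profiles. The sketch below assumes each accession is scored as a list of 0/1 values over the same set of bands. The function names are illustrative, and the UPGMA clustering step is not shown.

```python
def jaccard_dissimilarity(a, b):
    """a, b: equal-length 0/1 band profiles. Returns 1 - J; shared absences are ignored."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0


def distance_matrix(profiles):
    """profiles: {name: band profile}. Returns {(name1, name2): dissimilarity}."""
    names = sorted(profiles)
    return {(p, q): jaccard_dissimilarity(profiles[p], profiles[q])
            for p in names for q in names}
```

Two plants with identical band profiles have a dissimilarity of 0, which is the outcome the fidelity study reports for most field and in vitro plants of the same accession.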
Simple Sequence Repeats Provide a Substrate for Phenotypic Variation in the Neurospora crassa Circadian Clock

PubMed Central

Michael, Todd P.; Park, Sohyun; Kim, Tae-Sung; Booth, Jim; Byer, Amanda; Sun, Qi; Chory, Joanne; Lee, Kwangwon

2007-01-01

Background: WHITE COLLAR-1 (WC-1) mediates interactions between the circadian clock and the environment by acting as both a core clock component and as a blue light photoreceptor in Neurospora crassa. Loss of the amino-terminal polyglutamine (NpolyQ) domain in WC-1 results in an arrhythmic circadian clock; this data is consistent with this simple sequence repeat (SSR) being essential for clock function. Methodology/Principal Findings: Since SSRs are often polymorphic in length across natural populations, we reasoned that investigating natural variation of the WC-1 NpolyQ may provide insight into its role in the circadian clock. We observed significant phenotypic variation in the period, phase and temperature compensation of circadian regulated asexual conidiation across 143 N. crassa accessions. In addition to the NpolyQ, we identified two other simple sequence repeats in WC-1. The sizes of all three WC-1 SSRs correlated with polymorphisms in other clock genes, latitude and circadian period length. Furthermore, in a cross between two N. crassa accessions, the WC-1 NpolyQ co-segregated with period length. Conclusions/Significance: Natural variation of the WC-1 NpolyQ suggests a mechanism by which period length can be varied and selected for by the local environment that does not deleteriously affect WC-1 activity. Understanding natural variation in the N. crassa circadian clock will facilitate an understanding of how fungi exploit their environments. PMID:17726525
Fusion primer and nested integrated PCR (FPNI-PCR): a new high-efficiency strategy for rapid chromosome walking or flanking sequence cloning

PubMed Central

2011-01-01

Background: The advent of genomics-based technologies has revolutionized many fields of biological enquiry. However, chromosome walking or flanking sequence cloning is still a necessary and important procedure for determining gene structure. Such methods are used to identify T-DNA insertion sites and so are especially relevant for organisms where large T-DNA insertion libraries have been created, such as rice and Arabidopsis. The currently available methods for flanking sequence cloning, including the popular TAIL-PCR technique, are relatively laborious and slow. Results: Here, we report a simple and effective fusion primer and nested integrated PCR method (FPNI-PCR) for the identification and cloning of unknown genomic regions flanking known sequences. In brief, a set of universal primers was designed that consisted of various 15-16 base arbitrary degenerate oligonucleotides. These arbitrary degenerate primers were fused to the 3' end of an adaptor oligonucleotide which provided a known sequence without degenerate nucleotides, thereby forming the fusion primers (FPs). These fusion primers are employed in the first step of an integrated nested PCR strategy which defines the overall FPNI-PCR protocol. In order to demonstrate the efficacy of this novel strategy, we have successfully used it to isolate multiple genomic sequences, namely 21 orthologs of genes in various species of Rosaceae, 4 MYB genes of Rosa rugosa, 3 promoters of transcription factors of Petunia hybrida, 4 flanking sequences of T-DNA insertion sites in transgenic tobacco lines, and 6 specific genes from the sequenced genomes of rice and Arabidopsis. Conclusions: The successful amplification of target products through FPNI-PCR verified that this novel strategy is an effective, low cost and simple procedure. Furthermore, FPNI-PCR represents a more sensitive, rapid and accurate technique than the established TAIL-PCR and hiTAIL-PCR procedures. PMID:22093809
Comparison of double-locus sequence typing (DLST) and multilocus sequence typing (MLST) for the investigation of Pseudomonas aeruginosa populations.

PubMed

Cholley, Pascal; Stojanov, Milos; Hocquet, Didier; Thouverez, Michelle; Bertrand, Xavier; Blanc, Dominique S

2015-08-01

Reliable molecular typing methods are necessary to investigate the epidemiology of bacterial pathogens. Reference methods such as multilocus sequence typing (MLST) and pulsed-field gel electrophoresis (PFGE) are costly and time consuming. Here, we compared our newly developed double-locus sequence typing (DLST) method for Pseudomonas aeruginosa to MLST and PFGE on a collection of 281 isolates. DLST was as discriminatory as MLST and was able to recognize "high-risk" epidemic clones. Both methods were highly congruent. Not surprisingly, a higher discriminatory power was observed with PFGE. In conclusion, being a simple method (single-strand sequencing of only 2 loci), DLST is valuable as a first-line typing tool for epidemiological investigations of P. aeruginosa. Coupled to a more discriminant method like PFGE or whole genome sequencing, it might represent an efficient typing strategy to investigate or prevent outbreaks.
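The abstract above compares the discriminatory power of typing methods without naming the index used. A common choice is Simpson's index of diversity as applied to typing schemes by Hunter and Gaston, D = 1 - sum(n_j (n_j - 1)) / (N (N - 1)). The sketch below assumes that index and takes a list with one type label per isolate. The function name is hypothetical.

```python
from collections import Counter


def discriminatory_index(types):
    """Probability that two isolates drawn without replacement have different types."""
    n = len(types)
    if n < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered pairs of isolates that share a type.
    same = sum(c * (c - 1) for c in Counter(types).values())
    return 1.0 - same / (n * (n - 1))
```

Applying the function to the type assignments produced by two methods on the same isolates gives directly comparable values, with 1.0 meaning every isolate has a unique type.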
SOBA: sequence ontology bioinformatics analysis.

PubMed

Moore, Barry; Fan, Guozhen; Eilbeck, Karen

2010-07-01

The advent of cheaper, faster sequencing technologies has pushed the task of sequence annotation from the exclusive domain of large-scale multi-national sequencing projects to that of research laboratories and small consortia. The bioinformatics burden placed on these laboratories, some with very little programming experience, can be daunting. Fortunately, there exist software libraries and pipelines designed with these groups in mind, to ease the transition from an assembled genome to an annotated and accessible genome resource. We have developed the Sequence Ontology Bioinformatics Analysis (SOBA) tool to provide a simple statistical and graphical summary of an annotated genome. We envisage its use during annotation jamborees and genome comparison, and by developers for rapid feedback during annotation software development and testing. SOBA also provides annotation consistency feedback to ensure correct use of terminology within annotations, and guides users to add new terms to the Sequence Ontology when required. SOBA is available at http://www.sequenceontology.org/cgi-bin/soba.cgi.
Effective preparation of magnetic superhydrophobic Fe3O4/PU sponge for oil-water separation

NASA Astrophysics Data System (ADS)

Li, Zeng-Tian; Lin, Bo; Jiang, Li-Wang; Lin, En-Chao; Chen, Jian; Zhang, Shi-Jie; Tang, Yi-Wen; He, Fu-An; Li, De-Hao

2018-01-01

Fe3O4 nanoparticles were modified by tetraethoxysilane and different amounts of trimethoxy (1H,1H,2H,2H-heptadecafluorodecyl) silane in sequence to obtain the magnetic nanoparticles with low surface energy, which could be used to construct the superhydrophobic surfaces for PU sponge, cotton fabric, and filter paper by a simple drop-coating method. Particularly, all the resultant Fe3O4/PU sponges containing different fluoroalkylsilane-modified Fe3O4 nanoparticles possessed both high water repellency with contact angle in the range of 150.2-154.7° and good oil affinity, which could not only effectively remove oil from water followed by convenient magnetic recovery but also easily realize the oil-water separation as a filter only driven by gravity. The Fe3O4/PU sponges showed high absorption capability of peanut oil, pump oil, and silicone oil with the maximum absorptive capacities of 40.3, 39.3, and 46.3 g/g, respectively. Such novel sponges might be a potential candidate for oil-water separation as well as oil absorption and transportation accompanied by the advantages of simple process, remote control by magnetic field, and low energy consumption.
Dynamics of prebiotic RNA reproduction illuminated by chemical game theory

PubMed Central

Yeates, Jessica A. M.; Hilbe, Christian; Zwick, Martin; Nowak, Martin A.; Lehman, Niles

2016-01-01

Many origins-of-life scenarios depict a situation in which there are common and potentially scarce resources needed by molecules that compete for survival and reproduction. The dynamics of RNA assembly in a complex mixture of sequences is a frequency-dependent process and mimics such scenarios. By synthesizing Azoarcus ribozyme genotypes that differ in their single-nucleotide interactions with other genotypes, we can create molecules that interact among each other to reproduce. Pairwise interplays between RNAs involve both cooperation and selfishness, quantifiable in a 2 × 2 payoff matrix. We show that a simple model of differential equations based on chemical kinetics accurately predicts the outcomes of these molecular competitions using simple rate inputs into these matrices. In some cases, we find that mixtures of different RNAs reproduce much better than each RNA type alone, reflecting a molecular form of reciprocal cooperation. We also demonstrate that three RNA genotypes can stably coexist in a rock–paper–scissors analog. Our experiments suggest a new type of evolutionary game dynamics, called prelife game dynamics or chemical game dynamics. These operate without template-directed replication, illustrating how small networks of RNAs could have developed and evolved in an RNA world. PMID:27091972
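The idea of feeding a payoff matrix of rates into a kinetic model can be illustrated with a minimal sketch. The linear form dx_i/dt = Σ_j a_ij x_j used here, the function name, and the example rates are all assumptions chosen for illustration; they are not the equations or measurements reported in the paper.

```python
def simulate_frequencies(payoff, x0, dt=0.01, steps=2000):
    """Forward-Euler integration of dx_i/dt = sum_j payoff[i][j] * x[j].

    payoff[i][j] is a hypothetical rate at which genotype j helps
    assemble genotype i. Returns the final genotype frequencies.
    """
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        growth = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [xi + dt * gi for xi, gi in zip(x, growth)]
        # The dynamics are linear, so rescaling to frequencies at every
        # step does not change the trajectory of the frequencies.
        total = sum(x)
        x = [xi / total for xi in x]
    return x


# Hypothetical 2 x 2 matrix in which each genotype is assembled faster
# by the other genotype (off-diagonal) than by itself (diagonal).
cooperative = [[1.0, 2.0],
               [2.0, 1.0]]
final = simulate_frequencies(cooperative, [0.9, 0.1])
```

With this symmetric, made-up matrix the two genotypes settle at equal frequencies regardless of the starting mixture, which is the kind of frequency-dependent outcome the abstract describes.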
Morphology, stratigraphy, and surface roughness properties of Venusian lava flow fields

NASA Astrophysics Data System (ADS)

Byrnes, Jeffrey M.; Crown, David A.

2002-10-01

Morphologic characteristics, flow stratigraphy, and radar backscatter properties of five lava flow fields on Venus (Turgmam Fluctus, Zipaltonal Fluctus, Tuli Mons/Uilata Fluctus, Var Mons, and Mylitta Fluctus) were examined to understand flow field emplacement mechanisms and relationships to other surface processes. These analyses indicate that the flow fields studied developed through emplacement of numerous, thin flow units, presumably over extended periods of time. Although the Venusian fields display flow morphologies similar to those observed within terrestrial flow fields, the Venusian flow units are significantly larger and have a larger range of radar backscatter coefficients. Both simple and compound flow emplacement appear to have occurred within the flow fields. A potential correlation between flow rheology and radar brightness is suggested by differences in planform morphology, apparent flow thickness, and apparent sensitivity to topography between bright and dark flows. Distributary flow morphologies may result from tube-fed flows, and postemplacement modification by processes such as flow inflation and crustal foundering is consistent with discrete zones of increased radar brightness within individual flow lobes. Mapping of these flow fields does not indicate any simple evolutionary trend in eruptive/resurfacing style within the flow fields, or any consistent temporal sequence relative to other tectonic and volcanic features.

Dynamics of prebiotic RNA reproduction illuminated by chemical game theory.

PubMed

Yeates, Jessica A M; Hilbe, Christian; Zwick, Martin; Nowak, Martin A; Lehman, Niles

2016-05-03

Many origins-of-life scenarios depict a situation in which there are common and potentially scarce resources needed by molecules that compete for survival and reproduction. The dynamics of RNA assembly in a complex mixture of sequences is a frequency-dependent process and mimics such scenarios. By synthesizing Azoarcus ribozyme genotypes that differ in their single-nucleotide interactions with other genotypes, we can create molecules that interact among each other to reproduce. Pairwise interplays between RNAs involve both cooperation and selfishness, quantifiable in a 2 × 2 payoff matrix. We show that a simple model of differential equations based on chemical kinetics accurately predicts the outcomes of these molecular competitions using simple rate inputs into these matrices. In some cases, we find that mixtures of different RNAs reproduce much better than each RNA type alone, reflecting a molecular form of reciprocal cooperation. We also demonstrate that three RNA genotypes can stably coexist in a rock-paper-scissors analog. Our experiments suggest a new type of evolutionary game dynamics, called prelife game dynamics or chemical game dynamics. These operate without template-directed replication, illustrating how small networks of RNAs could have developed and evolved in an RNA world.
An ultrasensitive label-free biosensor for assaying of sequence-specific DNA-binding protein based on amplifying fluorescent conjugated polymer.

PubMed

Liu, Xingfen; Ouyang, Lan; Cai, Xiaohui; Huang, Yanqin; Feng, Xiaomiao; Fan, Quli; Huang, Wei

2013-03-15

Sensitive, reliable, and simple detection of sequence-specific DNA-binding proteins (DBP) is of paramount importance in the areas of proteomics, genomics, and biomedicine. We describe herein a novel fluorescent-amplified strategy for ultrasensitive, visual, quantitative, and "turn-on" detection of DBP. A Förster resonance energy transfer (FRET) assay utilizing a cationic conjugated polymer (CCP) and an intercalating dye was designed to detect a key transcription factor, nuclear factor-kappa B (NF-κB), the model target. A series of label-free DNA probes bearing one or two protein-binding sites (PBS) were used to identify the target protein specifically. The binding DBP protects the probe from digestion by exonuclease III, resulting in highly efficient FRET due to the high affinity between the intercalating dye and duplex DNA, as well as strong electrostatic interactions between the CCP and DNA probe. By using label-free hairpin DNA or double-stranded DNA containing two PBS as probe, we could detect as low as 1 pg/μL of NF-κB in HeLa nuclear extracts, which is 10000-fold more sensitive than the previously reported methods. The approach also allows naked-eye detection by observing fluorescent color of solutions with the assistance of a hand-held UV lamp. Additionally, a less than 10% relative standard deviation was obtained, which offers a new platform for superior precision, low-cost, and simple detection of DBP. The features of our optical biosensor show promising potential for early diagnosis of many diseases and high-throughput screening of new drugs targeted to DNA-binding proteins. Copyright © 2012 Elsevier B.V. All rights reserved.
Genetic Variation and Association Mapping of Seed-Related Traits in Cultivated Peanut (Arachis hypogaea L.) Using Single-Locus Simple Sequence Repeat Markers.

PubMed

Zhao, Jiaojiao; Huang, Li; Ren, Xiaoping; Pandey, Manish K; Wu, Bei; Chen, Yuning; Zhou, Xiaojing; Chen, Weigang; Xia, Youlin; Li, Zeqing; Luo, Huaiyong; Lei, Yong; Varshney, Rajeev K; Liao, Boshou; Jiang, Huifang

2017-01-01

Cultivated peanut (Arachis hypogaea L.) is an allotetraploid (AABB, 2n = 4x = 40), valued for its edible oil and digestible protein. Seed size and weight are important agronomic traits that significantly influence the yield and nutritional composition of peanut. However, the genetic basis of seed-related traits remains ambiguous. Association mapping is a powerful approach for quickly and efficiently exploring the genetic basis of important traits in plants. In this study, a total of 104 peanut accessions were used to identify molecular markers associated with seed-related traits using 554 single-locus simple sequence repeat (SSR) markers. Most of the accessions had no or weak relationship in the peanut panel. The linkage disequilibrium (LD) decayed with the genetic distance of 1 cM at the genome level and the LD of the B subgenome decayed faster than that of the A subgenome. Large phenotypic variation was observed for four seed-related traits in the association panel. Using a mixed linear model with population structure and kinship, a total of 30 significant SSR markers were detected to be associated with four seed-related traits (P < 1.81 × 10⁻³) in different environments, which explained 11.22-32.30% of the phenotypic variation for each trait. The marker AHGA44686 was simultaneously and repeatedly associated with seed length and hundred-seed weight in multiple environments with large phenotypic variance (26.23-32.30%). The favorable alleles of associated markers for each seed-related trait and the optimal combination of favorable alleles of associated markers were identified to significantly enhance trait performance, revealing the potential of these associated markers for use in peanut breeding programs.
Analysis of Salmonella enterica Serovar Typhimurium Variable-Number Tandem-Repeat Data for Public Health Investigation Based on Measured Mutation Rates and Whole-Genome Sequence Comparisons

PubMed Central

Dimovski, Karolina; Cao, Hanwei; Wijburg, Odilia L. C.; Strugnell, Richard A.; Mantena, Radha K.; Whipp, Margaret; Hogg, Geoff

2014-01-01

Variable-number tandem repeats (VNTRs) mutate rapidly and can be useful markers for genotyping. While multilocus VNTR analysis (MLVA) is increasingly used in the detection and investigation of food-borne outbreaks caused by Salmonella enterica serovar Typhimurium (S. Typhimurium) and other bacterial pathogens, MLVA data analysis usually relies on simple clustering approaches that may lead to incorrect interpretations. Here, we estimated the rates of copy number change at each of the five loci commonly used for S. Typhimurium MLVA, during in vitro and in vivo passage. We found that loci STTR5, STTR6, and STTR10 changed during passage but STTR3 and STTR9 did not. Relative rates of change were consistent across in vitro and in vivo growth and could be accurately estimated from diversity measures of natural variation observed during large outbreaks. Using a set of 203 isolates from a series of linked outbreaks and whole-genome sequencing of 12 representative isolates, we assessed the accuracy and utility of several alternative methods for analyzing and interpreting S. Typhimurium MLVA data. We show that eBURST analysis was accurate and informative. For construction of MLVA-based trees, a novel distance metric, based on the geometric model of VNTR evolution coupled with locus-specific weights, performed better than the commonly used simple or categorical distance metrics. The data suggest that, for the purpose of identifying potential transmission clusters for further investigation, isolates whose profiles differ at one of the rapidly changing STTR5, STTR6, and STTR10 loci should be collapsed into the same cluster. PMID:24957617
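The clustering rule in the final sentence can be sketched in a few lines. This is a minimal illustration only: it assumes that "differ at one" means exactly one locus, it applies the rule transitively (single linkage), and the isolate names and copy numbers are invented. It does not implement the geometric distance metric or locus-specific weights mentioned in the abstract.

```python
RAPID_LOCI = {"STTR5", "STTR6", "STTR10"}


def differing_loci(a, b):
    """Return the set of loci at which two MLVA profiles differ."""
    return {locus for locus in a if a[locus] != b[locus]}


def same_cluster(a, b):
    """True if profiles are identical or differ at one rapidly changing locus."""
    diff = differing_loci(a, b)
    return not diff or (len(diff) == 1 and diff <= RAPID_LOCI)


def cluster_profiles(profiles):
    """Collapse isolates into clusters using union-find (single linkage).

    profiles maps an isolate name to a dict of locus -> repeat copy number.
    """
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if same_cluster(profiles[a], profiles[b]):
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


# Invented profiles for illustration only.
demo = {
    "iso1": {"STTR9": 2, "STTR5": 13, "STTR6": 10, "STTR10": 8, "STTR3": 211},
    "iso2": {"STTR9": 2, "STTR5": 14, "STTR6": 10, "STTR10": 8, "STTR3": 211},
    "iso3": {"STTR9": 2, "STTR5": 13, "STTR6": 10, "STTR10": 8, "STTR3": 212},
}
print(cluster_profiles(demo))  # [['iso1', 'iso2'], ['iso3']]
```

Here iso2 joins iso1 because they differ only at STTR5, while iso3 stays separate because its difference is at the stable STTR3 locus.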
Coherent direct sequence optical code multiple access encoding-decoding efficiency versus wavelength detuning.

PubMed

Pastor, D; Amaya, W; García-Olcina, R; Sales, S

2007-07-01

We present a simple theoretical model of, and experimental verification for, the vanishing of the autocorrelation peak due to wavelength detuning in the coding-decoding process of coherent direct sequence optical code multiple access systems based on a superstructured fiber Bragg grating. Moreover, this detuning-induced vanishing effect has been explored as a way to provide an additional degree of multiplexing and/or optical code tuning.
An Activation-Based Model of Routine Sequence Errors

DTIC Science & Technology

2015-04-01

part of the ACT-R framework (e.g., Anderson, 1983), we adopt a newer, richer notion of priming as part of our approach (Harrison & Trafton, 2010...2014). Other models of routine sequence errors, such as the interactive activation network (IAN) model (Cooper & Shallice, 2006) and the simple...error patterns that results from an interface layout shift. The ideas behind our expanded priming approach, however, could apply to IAN, which uses
Assignment of the SLA alleles and reproductive potential of selective breeding Duroc pig lines.

PubMed

Soe, Ok Kar; Ohba, Yasunori; Imaeda, Noriaki; Nishii, Naohito; Takasu, Masaki; Yoshioka, Gou; Kawata, Hisako; Shigenari, Atsuko; Uenishi, Hirohide; Inoko, Hidetoshi; Ando, Asako; Kitagawa, Hitoshi

2008-01-01

Pigs with defined swine leukocyte antigen (SLA) haplotypes and their detailed information are useful for transplantation and immunological studies. We developed two herds of SLA homozygous Duroc pigs with novel SLA haplotypes and characterized their reproductive potential. For selective inbreeding, a pair of Duroc pigs was chosen as initial breeders, and substantial breeding within progenies was carried out for eight generations. In the selective breeding Duroc pigs, SLA haplotypes were assigned by nucleotide sequence determination of reverse transcription polymerase chain reaction (RT-PCR) products of three SLA classical class I genes and two class II genes. Based on this sequence information, we developed a rapid and simple SLA class II DNA typing method by polymerase chain reaction-sequence specific primer (PCR-SSP) technique. As a complementary method for the characterization of the SLA haplotypes, genetic polymorphisms of 36 microsatellite (MS) markers within the SLA region were also analyzed in the selective breeding pigs with SLA homozygous/heterozygous haplotypes. Among the selective breeding pigs from the third to fifth generations, only two SLA haplotypes were identified by the RT-PCR based SLA typing method: Hp-27.30 (SLA-1*08an03, SLA-1*06an04, SLA-2*0102, SLA-3*0101, DRB1*1101 and DQB1*0503) and Hp-60.13 (SLA-1*an02, SLA-2*1002, SLA-3*0502, DRB1*0403 and DQB1*0303). In these two SLA haplotypes, two class I haplotypes, Hp-27.0 and Hp-60.0, are novel. Furthermore, two class II haplotypes, Hp-0.30 and Hp-0.13, which were previously reported in Korean native pigs and pigs of Hanford breed, respectively, were also assigned by a simple assay using a PCR-SSP technique in the entire selective breeding stock. Moreover, two haplotype specific MS patterns were observed across the entire SLA region in the selective breeding (homozygous/heterozygous) pigs. No morphological abnormalities were observed in selective breeding pigs. The theoretical inbreeding coefficient at the eighth generation was 78.5%. In all generations of selective breeding pigs, litter sizes were comparable, but progenies from the fifth to eighth generations had significantly lighter weaning weights (P < 0.01) than those of the non-selective breeding pigs. We established and characterized SLA homozygous Duroc herds with two kinds of haplotypes that can be used as a new resource for transplantation and other biomedical studies.
A simple derivation for amplitude and time period of charged particles in an electrostatic bathtub potential

NASA Astrophysics Data System (ADS)

Prathap Reddy, K.

2016-11-01

An ‘electrostatic bathtub potential’ is defined and analytical expressions for the time period and amplitude of charged particles in this potential are obtained and compared with simulations. These kinds of potentials are encountered in linear electrostatic ion traps, where the potential along the axis appears like a bathtub. Ion traps are used in basic physics research and mass spectrometry to store ions; these stored ions undergo oscillatory motion within the confined volume of the trap. Usually these traps are designed and studied using ion optical software, but in this work the bathtub potential is reproduced by making two simple modifications to the harmonic oscillator potential. The addition of a linear ‘k₁|x|’ potential makes the simple harmonic potential curve steeper with a sharper turn at the origin, while the introduction of a finite-length zero potential region at the centre reproduces the flat region of the bathtub curve. This whole exercise of modelling a practical experimental situation in terms of a well-known simple physics problem may generate interest among readers.
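The two modifications can be made concrete with a short sketch. The piecewise form below is an assumption, not the paper's definition: the potential is zero for |x| ≤ L/2 and equals ½·k·u² + k₁·u beyond it, with u = |x| − L/2. The closed-form amplitude and period follow from energy conservation and from solving the shifted harmonic motion in the wall region; they are derived here for this assumed form and may differ from the expressions in the paper. The parameter names (`m`, `k`, `k1`, `flat_length`, `v0`) are hypothetical.

```python
import math


def bathtub_potential(x, k, k1, flat_length):
    """Assumed bathtub shape: flat centre, harmonic-plus-linear walls."""
    u = abs(x) - flat_length / 2.0
    if u <= 0:
        return 0.0
    return 0.5 * k * u * u + k1 * u


def amplitude_and_period(m, k, k1, flat_length, v0):
    """Amplitude and period for a particle crossing the centre at speed v0.

    Requires m > 0, k > 0, k1 >= 0 and v0 > 0.
    """
    omega = math.sqrt(k / m)
    # Turning point in the wall: 0.5*m*v0**2 = 0.5*k*u**2 + k1*u.
    u_max = (-k1 + math.sqrt(k1 * k1 + k * m * v0 * v0)) / k
    amplitude = flat_length / 2.0 + u_max
    # Time to cross half of the flat region at constant speed.
    t_flat = (flat_length / 2.0) / v0
    # In the wall, w = u + k1/k obeys w'' = -omega**2 * w, so the velocity
    # vanishes when tan(omega * t) = v0 / (omega * k1 / k).
    t_wall = math.atan2(v0, omega * k1 / k) / omega
    return amplitude, 4.0 * (t_flat + t_wall)


# With k1 = 0 and no flat region this reduces to the harmonic oscillator.
amp, period = amplitude_and_period(m=1.0, k=1.0, k1=0.0, flat_length=0.0, v0=1.0)
```

Setting `k1 = 0` and `flat_length = 0` recovers the familiar period 2π/ω, which is a useful sanity check before exploring how the linear term and the flat region change the motion.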
CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites

PubMed Central

Naito, Yuki; Hino, Kimihiro; Bono, Hidemasa; Ui-Tei, Kumiko

2015-01-01

Summary: CRISPRdirect is a simple and functional web server for selecting rational CRISPR/Cas targets from an input sequence. The CRISPR/Cas system is a promising technique for genome engineering which allows target-specific cleavage of genomic DNA guided by Cas9 nuclease in complex with a guide RNA (gRNA), that complementarily binds to a ∼20 nt targeted sequence. The target sequence requirements are twofold. First, the 5′-NGG protospacer adjacent motif (PAM) sequence must be located adjacent to the target sequence. Second, the target sequence should be specific within the entire genome in order to avoid off-target editing. CRISPRdirect enables users to easily select rational target sequences with minimized off-target sites by performing exhaustive searches against genomic sequences. The server currently incorporates the genomic sequences of human, mouse, rat, marmoset, pig, chicken, frog, zebrafish, Ciona, fruit fly, silkworm, Caenorhabditis elegans, Arabidopsis, rice, Sorghum and budding yeast. Availability: Freely available at http://crispr.dbcls.jp/. Contact: y-naito@dbcls.rois.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25414360
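The first requirement, a target followed by an NGG PAM, can be illustrated with a small scan. This is a simplified sketch, not CRISPRdirect's implementation: the function names are hypothetical, a fixed 20 nt target length is assumed, and the second requirement (genome-wide specificity checking) is not attempted.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ngg_targets(seq, target_len=20):
    """List candidate targets immediately followed by an NGG PAM.

    Returns (strand, start, target, pam) tuples. For the '-' strand,
    start is an offset into the reverse-complemented sequence.
    """
    seq = seq.upper()
    # A lookahead lets overlapping candidates be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % target_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


print(find_ngg_targets("A" * 20 + "TGG"))
# [('+', 0, 'AAAAAAAAAAAAAAAAAAAA', 'TGG')]
```

Each candidate returned this way would still need to be screened against the whole genome for near-identical sites, which is the exhaustive search the server performs.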
TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees.

PubMed

Sauvage, Thomas; Plouviez, Sophie; Schmidt, William E; Fredericq, Suzanne

2018-03-05

The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/or molecular diversity characterization (e.g. crypticism) often requires the building of exploratory phylogenetic trees with reference taxa. The subsequent step of segregating DNA sequences of interest based on observed topological relationships can represent a challenging task, especially for large datasets. We have written TREE2FASTA, a Perl script that enables and expedites the sorting of FASTA-formatted sequence data from exploratory phylogenetic trees. TREE2FASTA takes advantage of the interactive, rapid point-and-click color selection and/or annotations of tree leaves in the popular Java tree-viewer FigTree to segregate groups of FASTA sequences of interest to separate files. TREE2FASTA allows for both simple and nested segregation designs to facilitate the simultaneous preparation of multiple data sets that may overlap in sequence content.
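The core step, routing FASTA records into groups chosen on a tree, can be sketched as follows. This is an illustration of the idea only and not TREE2FASTA itself, which is a Perl script that reads FigTree colour selections and annotations: here the leaf-to-group mapping is supplied by hand, and the function names and the "unassigned" label are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def segregate(records, leaf_groups):
    """Group records by label and return FASTA text per group.

    leaf_groups maps a sequence header to a group label, for example
    a colour assigned to that leaf in a tree viewer.
    """
    grouped = {}
    for header, seq in records:
        label = leaf_groups.get(header, "unassigned")
        grouped.setdefault(label, []).append(">%s\n%s\n" % (header, seq))
    return {label: "".join(parts) for label, parts in grouped.items()}


fasta_text = ">a\nACGT\n>b\nGG\nCC\n>c\nTT\n"
groups = segregate(parse_fasta(fasta_text), {"a": "red", "b": "red"})
```

Each value in `groups` could then be written to its own file, giving one FASTA file per selected group.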
Enhanced sequencing coverage with digital droplet multiple displacement amplification

PubMed Central

Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.

2016-01-01

Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978
Techniques for automatic large scale change analysis of temporal multispectral imagery

NASA Astrophysics Data System (ADS)

Mercovich, Ryan A.

Change detection in remotely sensed imagery is a multi-faceted problem with a wide variety of desired solutions. Automatic change detection and analysis to assist in the coverage of large areas at high resolution is a popular area of research in the remote sensing community. Beyond basic change detection, the analysis of change is essential to provide results that positively impact an image analyst's job when examining potentially changed areas. Present change detection algorithms are geared toward low resolution imagery, and require analyst input to provide anything more than a simple pixel level map of the magnitude of change that has occurred. One major problem with this approach is that change occurs in such large volume at small spatial scales that a simple change map is no longer useful. This research strives to create an algorithm based on a set of metrics that performs a large area search for change in high resolution multispectral image sequences and utilizes a variety of methods to identify different types of change. Rather than simply mapping the magnitude of any change in the scene, the goal of this research is to create a useful display of the different types of change in the image. The techniques presented in this dissertation are used to interpret large area images and provide useful information to an analyst about small regions that have undergone specific types of change while retaining image context to make further manual interpretation easier. This analyst cueing to reduce information overload in a large area search environment will have an impact in the areas of disaster recovery, search and rescue situations, and land use surveys among others. 
By utilizing a feature based approach founded on applying existing statistical methods and new and existing topological methods to high resolution temporal multispectral imagery, a novel change detection methodology is produced that can automatically provide useful information about the change occurring in large area and high resolution image sequences. The change detection and analysis algorithm developed could be adapted to many potential image change scenarios to perform automatic large scale analysis of change.
Flow cytometry for enrichment and titration in massively parallel DNA sequencing

PubMed Central

Sandberg, Julia; Ståhl, Patrik L.; Ahmadian, Afshin; Bjursell, Magnus K.; Lundeberg, Joakim

2009-01-01

Massively parallel DNA sequencing is revolutionizing genomics research throughout the life sciences. However, the reagent costs and labor requirements in current sequencing protocols are still substantial, although improvements are continuously being made. Here, we demonstrate an effective alternative to existing sample titration protocols for the Roche/454 system using Fluorescence Activated Cell Sorting (FACS) technology to determine the optimal DNA-to-bead ratio prior to large-scale sequencing. Our method, which eliminates the need for the costly pilot sequencing of samples during titration, is capable of rapidly providing accurate DNA-to-bead ratios that are not biased by the quantification and sedimentation steps included in current protocols. Moreover, we demonstrate that FACS sorting can be readily used to highly enrich fractions of beads carrying template DNA, with near total elimination of empty beads and no downstream sacrifice of DNA sequencing quality. Automated enrichment by FACS is a simple approach to obtain pure samples for bead-based sequencing systems, and offers an efficient, low-cost alternative to current enrichment protocols. PMID:19304748
Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

PubMed

Mackey, Aaron J; Pearson, William R

2004-10-01

Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
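A toy example may help show what a sequence library subset and stored search results look like in a relational database. The sketch uses Python's built-in `sqlite3` module with an in-memory database; the table names, columns, accessions, and E-values are all invented for illustration and do not reflect the actual seqdb_demo schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up proteins and search hits."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE protein (
            id INTEGER PRIMARY KEY,
            accession TEXT,
            taxon TEXT,
            sequence TEXT
        );
        CREATE TABLE hit (
            query_id INTEGER REFERENCES protein(id),
            subject_id INTEGER REFERENCES protein(id),
            evalue REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO protein (accession, taxon, sequence) VALUES (?, ?, ?)",
        [
            ("DEMO1", "Escherichia coli", "MKT"),
            ("DEMO2", "Homo sapiens", "MAL"),
            ("DEMO3", "Escherichia coli", "MSE"),
        ],
    )
    conn.executemany(
        "INSERT INTO hit (query_id, subject_id, evalue) VALUES (?, ?, ?)",
        [(1, 3, 1e-20), (1, 2, 0.5)],
    )
    return conn


def library_subset_fasta(conn, taxon):
    """Return a FASTA library restricted to one taxon."""
    rows = conn.execute(
        "SELECT accession, sequence FROM protein WHERE taxon = ? ORDER BY accession",
        (taxon,),
    )
    return "".join(">%s\n%s\n" % (acc, seq) for acc, seq in rows)


def significant_hits(conn, max_evalue):
    """Return (query, subject, evalue) for stored hits at or below a cutoff."""
    return conn.execute(
        """
        SELECT q.accession, s.accession, h.evalue
        FROM hit AS h
        JOIN protein AS q ON q.id = h.query_id
        JOIN protein AS s ON s.id = h.subject_id
        WHERE h.evalue <= ?
        ORDER BY h.evalue
        """,
        (max_evalue,),
    ).fetchall()


conn = build_demo_db()
subset = library_subset_fasta(conn, "Escherichia coli")
```

Searching a smaller, targeted library such as `subset` is the mechanism by which the abstract says statistical significance can improve, since fewer unrelated sequences are compared.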
Detecting and Analyzing Genetic Recombination Using RDP4.

PubMed

Martin, Darren P; Murrell, Ben; Khoosal, Arjun; Muhire, Brejnev

2017-01-01

Recombination between nucleotide sequences is a major process influencing the evolution of most species on Earth. The evolutionary value of recombination has been widely debated and so too has its influence on evolutionary analysis methods that assume nucleotide sequences replicate without recombining. When nucleic acids recombine, the evolution of the daughter or recombinant molecule cannot be accurately described by a single phylogeny. This simple fact can seriously undermine the accuracy of any phylogenetics-based analytical approach which assumes that the evolutionary history of a set of recombining sequences can be adequately described by a single phylogenetic tree. There are presently a large number of available methods and associated computer programs for analyzing and characterizing recombination in various classes of nucleotide sequence datasets. Here we examine the use of some of these methods to derive and test recombination hypotheses using multiple sequence alignments.
Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics.

PubMed

Timmermans, M J T N; Dodsworth, S; Culverwell, C L; Bocak, L; Ahrens, D; Littlewood, D T J; Pons, J; Vogler, A P

2010-11-01

Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.
Characterisation of the bacterial community structures in the intestine of Lampetra morii.

PubMed

Li, Yingying; Xie, Wenfang; Li, Qingwei

2016-07-01

The metagenomic analysis and 16S rDNA sequencing method were used to investigate the bacterial community in the intestines of Lampetra morii. The bacterial community structure in L. morii intestine was relatively simple. Eight different operational taxonomic units were observed. Chitinophagaceae_unclassified (26.5 %) and Aeromonas spp. (69.6 %) were detected as dominant members at the genus level. The non-dominant genera were as follows: Acinetobacter spp. (1.4 %), Candidatus Bacilloplasma (2.5 %), Enterobacteria spp. (1.5 %), Shewanella spp. (0.04 %), Vibrio spp. (0.09 %), and Yersinia spp. (1.8 %). The Shannon-Wiener (H) and Simpson (1-D) indexes were 0.782339 and 0.5546, respectively. The rarefaction curve representing the bacterial community richness and the Shannon-Wiener curve representing the bacterial community diversity each reached an asymptote, which indicated that the sequencing depth was sufficient to represent the majority of species richness and bacterial community diversity. The number of Aeromonas in lamprey intestine was two times higher after stimulation by lipopolysaccharide than after PBS. This study provides data for understanding the bacterial community harboured in lamprey intestines and exploring potential key intestinal symbiotic bacteria essential for the L. morii immune response.
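For readers unfamiliar with the two diversity indexes named above, a minimal sketch of their textbook definitions follows. It assumes the natural logarithm for the Shannon-Wiener index and the 1 − Σp² form for the Simpson index; the abstract does not state which conventions were used, so this is not claimed to reproduce the reported values. The function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H = -sum(p * ln p) over non-empty taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Simpson diversity 1 - D, where D = sum(p ** 2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Two equally abundant taxa: H = ln 2 and 1 - D = 0.5.
even_community = [50, 50]
h = shannon_index(even_community)
d = simpson_index(even_community)
```

Both functions accept raw read counts or relative abundances, since each normalises by the total before computing proportions.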
Scan for Motifs: a webserver for the analysis of post-transcriptional regulatory elements in the 3' untranslated regions (3' UTRs) of mRNAs.

PubMed

Biswas, Ambarish; Brown, Chris M

2014-06-08

Gene expression in vertebrate cells may be controlled post-transcriptionally through regulatory elements in mRNAs. These are usually located in the untranslated regions (UTRs) of mRNA sequences, particularly the 3'UTRs. Scan for Motifs (SFM) simplifies the process of identifying a wide range of regulatory elements on alignments of vertebrate 3'UTRs. SFM includes identification of both RNA Binding Protein (RBP) sites and targets of miRNAs. In addition to searching pre-computed alignments, the tool provides users the flexibility to search their own sequences or alignments. The regulatory elements may be filtered by expected value cutoffs and are cross-referenced back to their respective sources and literature. The output is an interactive graphical representation, highlighting potential regulatory elements and overlaps between them. The output also provides simple statistics and links to related resources for complementary analyses. The overall process is intuitive and fast. As SFM is a free web-application, the user does not need to install any software or databases. Visualisation of the binding sites of different classes of effectors that bind to 3'UTRs will facilitate the study of regulatory elements in 3' UTRs.
De novo selection of oncogenes.

PubMed

Chacón, Kelly M; Petti, Lisa M; Scheideman, Elizabeth H; Pirazzoli, Valentina; Politi, Katerina; DiMaio, Daniel

2014-01-07

All cellular proteins are derived from preexisting ones by natural selection. Because of the random nature of this process, many potentially useful protein structures never arose or were discarded during evolution. Here, we used a single round of genetic selection in mouse cells to isolate chemically simple, biologically active transmembrane proteins that do not contain any amino acid sequences from preexisting proteins. We screened a retroviral library expressing hundreds of thousands of proteins consisting of hydrophobic amino acids in random order to isolate four 29-aa proteins that induced focus formation in mouse and human fibroblasts and tumors in mice. These proteins share no amino acid sequences with known cellular or viral proteins, and the simplest of them contains only seven different amino acids. They transformed cells by forming a stable complex with the platelet-derived growth factor β receptor transmembrane domain and causing ligand-independent receptor activation. We term this approach de novo selection and suggest that it can be used to generate structures and activities not observed in nature, create prototypes for novel research reagents and therapeutics, and provide insight into cell biology, transmembrane protein-protein interactions, and possibly virus evolution and the origin of life.
Research Techniques Made Simple: High-Throughput Sequencing of the T-Cell Receptor.

PubMed

Matos, Tiago R; de Rie, Menno A; Teunissen, Marcel B M

2017-06-01

High-throughput sequencing (HTS) of the T-cell receptor (TCR) is a rapidly advancing technique that allows sensitive and accurate identification and quantification of every distinct T-cell clone present within any biological sample. The relative frequency of each individual clone within the full T-cell repertoire can also be studied. HTS is essential to expand our knowledge on the diversity of the TCR repertoire in homeostasis or under pathologic conditions, as well as to understand the kinetics of antigen-specific T-cell responses that lead to protective immunity (i.e., vaccination) or immune-related disorders (i.e., autoimmunity and cancer). HTS can be tailored for personalized medicine, having the potential to monitor individual responses to therapeutic interventions and show prognostic and diagnostic biomarkers. In this article, we briefly review the methodology, advances, and limitations of HTS of the TCR and describe emerging applications of this technique in the field of investigative dermatology. We highlight studying the pathogenesis of T cells in allergic dermatitis and the application of HTS of the TCR in diagnosing, detecting recurrence early, and monitoring responses to therapy in cutaneous T-cell lymphoma.

On the potential of using peculiarities of the protein intrinsic disorder distribution in mitochondrial cytochrome b to identify the source of animal meats

PubMed Central

Yacoub, Haitham A.; Sadek, Mahmoud A.; Uversky, Vladimir N.

2017-01-01

This study was conducted to identify the source of animal meat based on the peculiarities of protein intrinsic disorder distribution in mitochondrial cytochrome b (mtCyt-b). The analysis revealed that animal and avian species can be discriminated based on the proportions of the two groups of residues, Leu+Ile, and Ser+Pro+Ala, in the amino acid sequences of their mtCyt-b. Although the overall level of intrinsic disorder in mtCyt-b is not very high, the peculiarities of disorder distribution within the sequences of mtCyt-b from different species vary in a rather specific way. In fact, positions and intensities of disorder/flexibility “signals” in the corresponding disorder profiles are relatively unique for avian and animal species. Therefore, it is possible to devise a set of simple rules based on the peculiarities of disorder profiles of their mtCyt-b proteins to discriminate among species. This intrinsic disorder-based analysis represents a new technique that could be used to provide a promising solution for identification of the source of meats. PMID:28331777
An intracellular analysis of the visual responses of neurones in cat visual cortex.

PubMed Central

Douglas, R J; Martin, K A; Whitteridge, D

1991-01-01

1. Extracellular and intracellular recordings were made from neurones in the visual cortex of the cat in order to compare the subthreshold membrane potentials, reflecting the input to the neurone, with the output from the neurone seen as action potentials. 2. Moving bars and edges, generated under computer control, were used to stimulate the neurones. The membrane potential was digitized and averaged for a number of trials after stripping the action potentials. Comparison of extracellular and intracellular discharge patterns indicated that the intracellular impalement did not alter the neurones' properties. Input resistance of the neurone altered little during stable intracellular recordings (30 min-2 h 50 min). 3. Intracellular recordings showed two distinct patterns of membrane potential changes during optimal visual stimulation. The patterns corresponded closely to the division of S-type (simple) and C-type (complex) receptive fields. Simple cells had a complex pattern of membrane potential fluctuations, involving depolarizations alternating with hyperpolarizations. Complex cells had a simple single sustained plateau of depolarization that was often followed but not preceded by a hyperpolarization. In both simple and complex cells the depolarizations led to action potential discharges. The hyperpolarizations were associated with inhibition of action potential discharge. 4. Stimulating simple cells with non-optimal directions of motion produced little or no hyperpolarization of the membrane in most cases, despite a lack of action potential output. Directional complex cells always produced a single plateau of depolarization leading to action potential discharge in both the optimal and non-optimal directions of motion. The directionality could not be predicted on the basis of the position of the hyperpolarizing inhibitory potentials found in the optimal direction. 5. Stimulation of simple cells with non-optimal orientations occasionally produced slight hyperpolarizations and inhibition of action potential discharge. Complex cells, which had broader orientation tuning than simple cells, could show marked hyperpolarization for non-optimal orientations, but this was not generally the case. 6. The data do not support models of directionality and orientation that rely solely on strong inhibitory mechanisms to produce stimulus selectivity. PMID:1804981
Differences in Early Stages of Tactile ERP Temporal Sequence (P100) in Cortical Organization during Passive Tactile Stimulation in Children with Blindness and Controls.

PubMed

Ortiz Alonso, Tomás; Santos, Juan Matías; Ortiz Terán, Laura; Borrego Hernández, Mayelin; Poch Broto, Joaquín; de Erausquin, Gabriel Alejandro

2015-01-01

Compared to their seeing counterparts, people with blindness have a greater tactile capacity. Differences in the physiology of object recognition between people with blindness and seeing people have been well documented, but not when tactile stimuli require semantic processing. We used a passive vibrotactile device to focus on the differences in spatial brain processing evaluated with event related potentials (ERP) in children with blindness (n = 12) vs. normally seeing children (n = 12), when learning a simple spatial task (lines with different orientations) or a task involving recognition of letters, to describe the early stages of its temporal sequence (from 80 to 220 msec) and to search for evidence of multi-modal cortical organization. We analysed the P100 of the ERP. Children with blindness showed earlier latencies for cognitive (perceptual) event related potentials, shorter reaction times, and (paradoxically) worse ability to identify the spatial direction of the stimulus. On the other hand, they are equally proficient in recognizing stimuli with semantic content (letters). The last observation is consistent with the role of P100 on somatosensory-based recognition of complex forms. The cortical differences between seeing control and blind groups, during spatial tactile discrimination, are associated with activation in visual pathway (occipital) and task-related association (temporal and frontal) areas. The present results show that early processing of tactile stimulation conveying cross modal information differs in children with blindness or with normal vision.
Differences in Early Stages of Tactile ERP Temporal Sequence (P100) in Cortical Organization during Passive Tactile Stimulation in Children with Blindness and Controls

PubMed Central

Ortiz Alonso, Tomás; Santos, Juan Matías; Ortiz Terán, Laura; Borrego Hernández, Mayelin; Poch Broto, Joaquín; de Erausquin, Gabriel Alejandro

2015-01-01

Compared to their seeing counterparts, people with blindness have a greater tactile capacity. Differences in the physiology of object recognition between people with blindness and seeing people have been well documented, but not when tactile stimuli require semantic processing. We used a passive vibrotactile device to focus on the differences in spatial brain processing evaluated with event related potentials (ERP) in children with blindness (n = 12) vs. normally seeing children (n = 12), when learning a simple spatial task (lines with different orientations) or a task involving recognition of letters, to describe the early stages of its temporal sequence (from 80 to 220 msec) and to search for evidence of multi-modal cortical organization. We analysed the P100 of the ERP. Children with blindness showed earlier latencies for cognitive (perceptual) event related potentials, shorter reaction times, and (paradoxically) worse ability to identify the spatial direction of the stimulus. On the other hand, they are equally proficient in recognizing stimuli with semantic content (letters). The last observation is consistent with the role of P100 on somatosensory-based recognition of complex forms. The cortical differences between seeing control and blind groups, during spatial tactile discrimination, are associated with activation in visual pathway (occipital) and task-related association (temporal and frontal) areas. The present results show that early processing of tactile stimulation conveying cross modal information differs in children with blindness or with normal vision. PMID:26225827
Simple synthesis of PbSe nanocrystals and their self-assembly into 2D ‘flakes’ and 1D ‘ribbons’ structures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Díaz-Torres, E., E-mail: ediaz@cinvestav.mx; Ortega-López, M.; Matsumoto, Y.

2016-08-15

Highlights: • PbSe is obtained in a simple way by the co-precipitation method at low temperature. • The structural, morphological and optical properties of PbSe were studied. • Adding NH4OH to the precursor solutions influences the morphology. • 2D- and 1D-PbSe structures assemble by oriented attachment. • PbSe can be a potential candidate for thermoelectric applications. - Abstract: This work presents a simple and low-temperature method to prepare a variety of lead selenide (PbSe) nanostructures, using aqueous solutions of Pb(NO3)2 and NaHSe. Nanostructures with different morphology were obtained by varying the Pb:Se molar ratio, as well as the mixing sequence of NH4OH with either Pb(NO3)2 or NaHSe. Nanoparticles with different shapes (spherical and octahedral), and self-assembled structures (flakes and ribbons) were observed by Transmission Electron Microscopy. X-ray results confirmed that the PbSe rock-salt crystalline structure was obtained for all of the prepared samples. The crystal size is in the range of 7.3 to 8.9 nm for single nanocrystals. The absorption spectra of the samples show exciton absorption bands at 1395 nm and 1660 nm. This material could be used to develop more advanced structures for thermoelectric generators.
Habitable zone lifetimes of exoplanets around main sequence stars.

PubMed

Rushby, Andrew J; Claire, Mark W; Osborn, Hugh; Watson, Andrew J

2013-09-01

The potential habitability of newly discovered exoplanets is initially assessed by determining whether their orbits fall within the circumstellar habitable zone of their star. However, the habitable zone (HZ) is not static in time or space, and its boundaries migrate outward at a rate proportional to the increase in luminosity of a star undergoing stellar evolution, possibly including or excluding planets over the course of the star's main sequence lifetime. We describe the time that a planet spends within the HZ as its "habitable zone lifetime." The HZ lifetime of a planet has strong astrobiological implications and is especially important when considering the evolution of complex life, which is likely to require a longer residence time within the HZ. Here, we present results from a simple model built to investigate the evolution of the "classic" HZ over time, while also providing estimates for the evolution of stellar luminosity over time in order to develop a "hybrid" HZ model. These models return estimates for the HZ lifetimes of Earth and 7 confirmed HZ exoplanets and 27 unconfirmed Kepler candidates. The HZ lifetime for Earth ranges between 6.29 and 7.79×10⁹ years (Gyr). The 7 exoplanets fall in a range between ∼1 and 54.72 Gyr, while the 27 Kepler candidate planets' HZ lifetimes range between 0.43 and 18.8 Gyr. Our results show that exoplanet HD 85512b is no longer within the HZ, assuming it has an Earth analog atmosphere. The HZ lifetime should be considered in future models of planetary habitability as setting an upper limit on the lifetime of any potential exoplanetary biosphere, and also for identifying planets of high astrobiological potential for continued observational or modeling campaigns.
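
To make the dependence of the habitable zone on stellar luminosity concrete, a minimal sketch follows. It is not the authors' model: it only applies the inverse-square flux relation d = sqrt(L / S_eff), with L in solar units and d in AU, and the two flux thresholds are illustrative placeholders chosen for this example.

```python
import math

# Illustrative placeholder flux thresholds, in units of the present solar
# constant. They are assumptions for this sketch, not values from the study.
S_EFF_INNER = 1.1
S_EFF_OUTER = 0.5

def hz_bounds(luminosity):
    """Return (inner, outer) HZ distances in AU for a luminosity in solar units."""
    inner = math.sqrt(luminosity / S_EFF_INNER)
    outer = math.sqrt(luminosity / S_EFF_OUTER)
    return inner, outer

def in_hz(distance_au, luminosity):
    """True if a planet at distance_au lies between the two boundaries."""
    inner, outer = hz_bounds(luminosity)
    return inner <= distance_au <= outer

# As luminosity rises, both boundaries move outward and a planet fixed at
# 1 AU eventually falls inside the inner edge.
for lum in (0.7, 1.0, 1.5):
    inner, outer = hz_bounds(lum)
    print(f"L = {lum:.1f} Lsun: HZ from {inner:.2f} to {outer:.2f} AU, "
          f"1 AU inside: {in_hz(1.0, lum)}")
```

A habitable zone lifetime in this picture would be the time span during which `in_hz` stays true for a given orbit, which additionally requires a luminosity evolution track that the sketch does not attempt to model.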
SEQUENCING of TSUNAMI WAVES: Why the first wave is not always the largest?

NASA Astrophysics Data System (ADS)

Synolakis, C.; Okal, E.

2016-12-01

We discuss what contributes to the `sequencing' of tsunami waves in the far field, that is, to the distribution of the maximum sea surface amplitude inside the dominant wave packet constituting the primary arrival at a distant harbour. Based on simple models of sources for which analytical solutions are available, we show that, as range is increased, the wave pattern evolves from a regime of maximum amplitude in the first oscillation to one of delayed maximum, where the largest amplitude takes place during a subsequent oscillation. In the case of the simple, instantaneous uplift of a circular disk at the surface of an ocean of constant depth, the critical distance for transition between those patterns scales as r_0^3 / h^2 where r0 is the radius of the disk and h the depth of the ocean. This behaviour is explained from simple arguments based on a model where sequencing results from frequency dispersion in the primary wave packet, as the width of its spectrum around its dominant period T0 becomes dispersed in time in an amount comparable to T0, the latter being controlled by a combination of source size and ocean depth. The general concepts in this model are confirmed in the case of more realistic sources for tsunami excitation by a finite-time deformation of the ocean floor, as well as in real-life simulations of tsunamis excited by large subduction events, for which we find that the influence of fault width on the distribution of sequencing is more important than that of fault length. Finally, simulation of the major events of Chile (2010) and Japan (2011) at large arrays of virtual gauges in the Pacific Basin correctly predicts the majority of the sequencing patterns observed on DART buoys during these events. 
By providing insight into the evolution with time of wave amplitudes inside primary wave packets for far field tsunamis generated by large earthquakes, our results stress the importance, for civil defense authorities, of issuing warning and evacuation orders of sufficient duration to avoid the hazard
Sequencing of tsunami waves: why the first wave is not always the largest

NASA Astrophysics Data System (ADS)

Okal, Emile A.; Synolakis, Costas E.

2016-02-01

This paper examines the factors contributing to the `sequencing' of tsunami waves in the far field, that is, to the distribution of the maximum sea surface amplitude inside the dominant wave packet constituting the primary arrival at a distant harbour. Based on simple models of sources for which analytical solutions are available, we show that, as range is increased, the wave pattern evolves from a regime of maximum amplitude in the first oscillation to one of delayed maximum, where the largest amplitude takes place during a subsequent oscillation. In the case of the simple, instantaneous uplift of a circular disk at the surface of an ocean of constant depth, the critical distance for transition between those patterns scales as r_0^3 / h^2 where r0 is the radius of the disk and h the depth of the ocean. This behaviour is explained from simple arguments based on a model where sequencing results from frequency dispersion in the primary wave packet, as the width of its spectrum around its dominant period T0 becomes dispersed in time in an amount comparable to T0, the latter being controlled by a combination of source size and ocean depth. The general concepts in this model are confirmed in the case of more realistic sources for tsunami excitation by a finite-time deformation of the ocean floor, as well as in real-life simulations of tsunamis excited by large subduction events, for which we find that the influence of fault width on the distribution of sequencing is more important than that of fault length. Finally, simulation of the major events of Chile (2010) and Japan (2011) at large arrays of virtual gauges in the Pacific Basin correctly predicts the majority of the sequencing patterns observed on DART buoys during these events. 
By providing insight into the evolution with time of wave amplitudes inside primary wave packets for far field tsunamis generated by large earthquakes, our results stress the importance, for civil defense authorities, of issuing warning and evacuation orders of sufficient duration to avoid the hazard inherent in premature calls for all-clear.
Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.).

PubMed

Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen

2013-04-01

Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multi-allelic simple sequence repeat (SSR) markers are particularly informative and superior for genetic linkage mapping and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to develop SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. A total of 774 SSRs were identified in 716 SSR-containing ESTs. On average, one SSR was found per 7.7 kb of EST sequences. Tri-nucleotide repeats (48.8 %) were the most abundant motif type, followed by di- (26.1 %), tetra- (11.5 %), penta- (9.7 %), and hexanucleotide (3.9 %) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Based on the 29 EST-SSR markers, an assessment of genetic diversity was conducted, which found that Medicago sativa ssp. sativa was clearly different from the other subspecies. These EST-SSR markers also showed high transferability to related species.
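
The SSR mining step described above can be sketched as a simple motif scan. This is only an illustration, not the tool used in the study: the minimum repeat counts in `MIN_REPEATS` are hypothetical thresholds, and the example sequence is invented.

```python
import re

# Hypothetical thresholds: minimum number of repeats per motif length
# (di- to hexanucleotide). The abstract does not state the criteria used.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT)."""
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Invented example sequence containing an (AT)6 and a (CAG)5 repeat.
example = "GG" + "AT" * 6 + "CC" + "CAG" * 5 + "TT"
for start, motif, reps in find_ssrs(example):
    print(start, motif, reps)
```

Filtering out non-primitive motifs prevents a dinucleotide repeat from being counted again as a tetra- or hexanucleotide repeat.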
Development of novel simple sequence repeat markers in bitter gourd (Momordica charantia L.) through enriched genomic libraries and their utilization in analysis of genetic diversity and cross-species transferability.

PubMed

Saxena, Swati; Singh, Archana; Archak, Sunil; Behera, Tushar K; John, Joseph K; Meshram, Sudhir U; Gaikwad, Ambika B

2015-01-01

Microsatellite or simple sequence repeat (SSR) markers are the preferred markers for genetic analyses of crop plants. The availability of a limited number of such markers in bitter gourd (Momordica charantia L.) necessitates the development and characterization of more SSR markers. These were developed from genomic libraries enriched for three dinucleotide, five trinucleotide, and two tetranucleotide core repeat motifs. Employing the strategy of polymerase chain reaction-based screening, the number of clones to be sequenced was reduced by 81 %, and 93.7 % of the sequenced clones contained microsatellite repeats. Unique primer-pairs were designed for 160 microsatellite loci, and amplicons of expected length were obtained for 151 loci (94.4 %). Evaluation of diversity in 54 bitter gourd accessions at 51 loci indicated that 20 % of the loci were polymorphic with the polymorphic information content values ranging from 0.13 to 0.77. Fifteen Indian varieties were clearly distinguished, indicating the usefulness of the developed markers. Markers at 40 loci (78.4 %) were transferable to six species, viz. Momordica cymbalaria, Momordica subangulata subsp. renigera, Momordica balsamina, Momordica dioca, Momordica cochinchinesis, and Momordica sahyadrica. The microsatellite markers reported will be useful in various genetic and molecular genetic studies in bitter gourd, a cucurbit of immense nutritive, medicinal, and economic importance.
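
For readers unfamiliar with polymorphic information content (PIC), a minimal sketch of one widely used definition is given here, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are allele frequencies at a locus. The abstract does not state which formula the authors applied, so this is an assumption, and the allele frequencies below are invented.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphic information content for allele frequencies summing to 1."""
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross

# Invented allele frequencies at three hypothetical loci.
for freqs in ([1.0], [0.5, 0.5], [0.4, 0.3, 0.2, 0.1]):
    print(freqs, round(pic(freqs), 3))
```

A monomorphic locus gives a PIC of zero, and the value rises toward 1 as the number of alleles grows and their frequencies even out.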
A simple, rapid, high-fidelity and cost-effective PCR-based two-step DNA synthesis method for long gene sequences.

PubMed

Xiong, Ai-Sheng; Yao, Quan-Hong; Peng, Ri-He; Li, Xian; Fan, Hui-Qin; Cheng, Zong-Ming; Li, Yi

2004-07-07

Chemical synthesis of DNA sequences provides a powerful tool for modifying genes and for studying gene function, structure and expression. Here, we report a simple, high-fidelity and cost-effective PCR-based two-step DNA synthesis (PTDS) method for synthesis of long segments of DNA. The method involves two steps. (i) Synthesis of individual fragments of the DNA of interest: ten to twelve 60mer oligonucleotides with 20 bp overlap are mixed and a PCR reaction is carried out with high-fidelity DNA polymerase Pfu to produce DNA fragments that are approximately 500 bp in length. (ii) Synthesis of the entire sequence of the DNA of interest: five to ten PCR products from the first step are combined and used as the template for a second PCR reaction using high-fidelity DNA polymerase pyrobest, with the two outermost oligonucleotides as primers. Compared with the previously published methods, the PTDS method is rapid (5-7 days) and suitable for synthesizing long segments of DNA (5-6 kb) with high G + C contents, repetitive sequences or complex secondary structures. Thus, the PTDS method provides an alternative tool for synthesizing and assembling long genes with complex structures. Using the newly developed PTDS method, we have successfully obtained several genes of interest with sizes ranging from 1.0 to 5.4 kb.
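
The arithmetic behind the first PTDS step can be checked with a short sketch: n oligonucleotides of length 60 that overlap their neighbours by 20 bp span n × 60 - (n - 1) × 20 bp, which gives 420 bp for ten and 500 bp for twelve oligos. The tiling function below is a hypothetical illustration of such a design with alternating strands; it is not the authors' design procedure, and the target sequence is invented.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))

def assembled_length(n_oligos, oligo_len=60, overlap=20):
    """Length spanned by n overlapping oligos laid end to end."""
    return n_oligos * oligo_len - (n_oligos - 1) * overlap

def tile_oligos(target, oligo_len=60, overlap=20):
    """Split target into overlapping oligos, alternating sense and antisense.

    The last oligo may be shorter if the target length does not fit exactly.
    """
    step = oligo_len - overlap
    oligos = []
    for i, start in enumerate(range(0, len(target) - overlap, step)):
        piece = target[start:start + oligo_len]
        # Odd-numbered pieces are written as the complementary strand so
        # that neighbouring oligos can anneal across the overlap.
        oligos.append(piece if i % 2 == 0 else revcomp(piece))
    return oligos

# Invented 500 bp target, used only to exercise the functions.
target = "ACGT" * 125
oligos = tile_oligos(target)
print(len(oligos), assembled_length(len(oligos)))
```

In the second step described in the abstract, fragments of this size are combined and amplified with the two outermost oligonucleotides as primers.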
A Simple Method for Amplifying RNA Targets (SMART)

PubMed Central

McCalla, Stephanie E.; Ong, Carmichael; Sarma, Aartik; Opal, Steven M.; Artenstein, Andrew W.; Tripathi, Anubhav

2012-01-01

We present a novel and simple method for amplifying RNA targets (named by its acronym, SMART), and for detection, using engineered amplification probes that overcome existing limitations of current RNA-based technologies. This system amplifies and detects optimal engineered ssDNA probes that hybridize to target RNA. The amplifiable probe-target RNA complex is captured on magnetic beads using a sequence-specific capture probe and is separated from unbound probe using a novel microfluidic technique. Hybridization sequences are not constrained as they are in conventional target-amplification reactions such as nucleic acid sequence-based amplification (NASBA). Our engineered ssDNA probe was amplified both off-chip and in a microchip reservoir at the end of the separation microchannel using isothermal NASBA. Optimal solution conditions for ssDNA amplification were investigated. Although KCl and MgCl2 are typically found in NASBA reactions, replacing 70 mmol/L of the 82 mmol/L total chloride ions with acetate resulted in optimal reaction conditions, particularly for low but clinically relevant probe concentrations (≤100 fmol/L). With the optimal probe design and solution conditions, we also successfully removed the initial heating step of NASBA, thus achieving a true isothermal reaction. The SMART assay using a synthetic model influenza DNA target sequence served as a fundamental demonstration of the efficacy of the capture and microfluidic separation system, thus bridging our system to a clinically relevant detection problem. PMID:22691910
Construction of an Integrated High Density Simple Sequence Repeat Linkage Map in Cultivated Strawberry (Fragaria × ananassa) and its Applicability

PubMed Central

Isobe, Sachiko N.; Hirakawa, Hideki; Sato, Shusei; Maeda, Fumi; Ishikawa, Masami; Mori, Toshiki; Yamamoto, Yuko; Shirasawa, Kenta; Kimura, Mitsuhiro; Fukami, Masanobu; Hashizume, Fujio; Tsuji, Tomoko; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Tsuruoka, Hisano; Minami, Chiharu; Takahashi, Chika; Wada, Tsuyuko; Ono, Akiko; Kawashima, Kumiko; Nakazaki, Naomi; Kishida, Yoshie; Kohara, Mitsuyo; Nakayama, Shinobu; Yamada, Manabu; Fujishiro, Tsunakazu; Watanabe, Akiko; Tabata, Satoshi

2013-01-01

The cultivated strawberry (Fragaria × ananassa) is an octoploid (2n = 8x = 56) of the Rosaceae family whose genomic architecture is still controversial. Several recent studies support the AAA′A′BBB′B′ model, but its complexity has hindered genetic and genomic analysis of this important crop. To overcome this difficulty and to assist genome-wide analysis of F. × ananassa, we constructed an integrated linkage map by organizing a total of 4474 simple sequence repeat (SSR) markers collected from published Fragaria sequences, including 3746 SSR markers [Fragaria vesca expressed sequence tag (EST)-derived SSR markers] derived from F. vesca ESTs, 603 markers (F. × ananassa EST-derived SSR markers) from F. × ananassa ESTs, and 125 markers (F. × ananassa transcriptome-derived SSR markers) from F. × ananassa transcripts. Along with the previously published SSR markers, these markers were mapped onto five parent-specific linkage maps derived from three mapping populations, which were then assembled into an integrated linkage map. The constructed map consists of 1856 loci in 28 linkage groups (LGs) that total 2364.1 cM in length. Macrosynteny at the chromosome level was observed between the LGs of F. × ananassa and the genome of F. vesca. Variety distinction on 129 F. × ananassa lines was demonstrated using 45 selected SSR markers. PMID:23248204
Quantification of the methylation status of the PWS/AS imprinted region: comparison of two approaches based on bisulfite sequencing and methylation-sensitive MLPA.

PubMed

Dikow, Nicola; Nygren, Anders Oh; Schouten, Jan P; Hartmann, Carolin; Krämer, Nikola; Janssen, Bart; Zschocke, Johannes

2007-06-01

Standard methods used for genomic methylation analysis allow the detection of complete absence of either methylated or non-methylated alleles but are usually unable to detect changes in the proportion of methylated and unmethylated alleles. We compare two methods for quantitative methylation analysis, using the chromosome 15q11-q13 imprinted region as model. Absence of the non-methylated paternal allele in this region leads to Prader-Willi syndrome (PWS) whilst absence of the methylated maternal allele results in Angelman syndrome (AS). A proportion of AS is caused by mosaic imprinting defects which may be missed with standard methods and require quantitative analysis for their detection. Sequence-based quantitative methylation analysis (SeQMA) involves quantitative comparison of peaks generated through sequencing reactions after bisulfite treatment. It is simple, cost-effective and can be easily established for a large number of genes. However, our results support previous suggestions that methods based on bisulfite treatment may be problematic for exact quantification of methylation status. Methylation-specific multiplex ligation-dependent probe amplification (MS-MLPA) avoids bisulfite treatment. It detects changes in both CpG methylation as well as copy number of up to 40 chromosomal sequences in one simple reaction. Once established in a laboratory setting, the method is more accurate, reliable and less time consuming.
Biological and molecular characterization of cellular differentiation in Tetrahymena vorax: a potential biocontrol protozoan.

PubMed

Green, M M; LeBoeuf, R D; Churchill, P F

2000-01-01

Tetrahymena vorax (T. vorax) is an indigenous fresh water protozoan with the natural biological potential to maintain a specific aquatic microbial flora by ingesting and eliminating specific microorganisms. To investigate the molecular mechanisms controlling T. vorax cellular differentiation from a small-mouth vegetative cell to a voracious large-mouth carnivore capable of ingesting prey ciliates and bacteria from aquatic environments, we used DNA subtraction and gene discovery techniques to identify and isolate T. vorax differentiation-specific genes. The physiological necessity for one newly discovered gene, SUBII-TG, was determined in vivo using an antisense oligonucleotide directed against the 5' SUBII-TG DNA sequence. The barriers to delivering antisense oligonucleotides to the cytoplasm of T. vorax were circumvented by employing a new but simple procedure of processing the oligonucleotide with the differentiation stimulus, stomatin. In these studies, the antisense oligonucleotide down-regulated SUBII-TG mRNA expression, and blocked differentiation and ingestion of prey ciliates. The ability to down-regulate SUBII-TG expression with the antisense oligonucleotide suggests that the molecular mechanisms controlling the natural biological activities of T. vorax can be manipulated to further study its cellular differentiation and potential as a biocontrol microorganism.
Next generation DNA sequencing technology delivers valuable genetic markers for the genomic orphan legume species, Bituminaria bituminosa

PubMed Central

2011-01-01

Background Bituminaria bituminosa is a perennial legume species from the Canary Islands and Mediterranean region that has potential as a drought-tolerant pasture species and as a source of pharmaceutical compounds. Three botanical varieties have previously been identified in this species: albomarginata, bituminosa and crassiuscula. B. bituminosa can be considered a genomic 'orphan' species with very few genomic resources available. New DNA sequencing technologies provide an opportunity to develop high quality molecular markers for such orphan species. Results 432,306 mRNA molecules were sampled from a leaf transcriptome of a single B. bituminosa plant using Roche 454 pyrosequencing, resulting in an average read length of 345 bp (149.1 Mbp in total). Sequences were assembled into 3,838 isotigs/contigs representing putatively unique gene transcripts. Gene ontology descriptors were identified for 3,419 sequences. Raw sequence reads containing simple sequence repeat (SSR) motifs were identified, and 240 primer pairs flanking these motifs were designed. Of 87 primer pairs developed this way, 75 (86.2%) successfully amplified primarily single fragments by PCR. Fragment analysis using 20 primer pairs in 79 accessions of B. bituminosa detected 130 alleles at 21 SSR loci. Genetic diversity analyses confirmed that variation at these SSR loci accurately reflected known taxonomic relationships in original collections of B. bituminosa and provided additional evidence that a division of the botanical variety bituminosa into two according to geographical origin (Mediterranean region and Canary Islands) may be appropriate. Evidence of cross-pollination was also found between botanical varieties within a B. bituminosa breeding programme. Conclusions B. bituminosa can no longer be considered a genomic orphan species, having now a large (albeit incomplete) repertoire of expressed gene sequences that can serve as a resource for future genetic studies. 
This experimental approach was effective in developing codominant and polymorphic SSR markers for application in diverse genetic studies. These markers have already given new insight into genetic variation in B. bituminosa, providing evidence that a division of the botanical variety bituminosa may be appropriate. This approach is commended to those seeking to develop useful markers for genomic orphan species. PMID:22171578
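The step of finding reads that contain SSR motifs can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names and the minimum repeat thresholds (6 for di-, 5 for tri-, 4 for tetranucleotide units) are hypothetical, and only perfect repeats are reported.

```python
import re


def _is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT)."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif
                   for k in range(1, n))


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        # Hypothetical thresholds: unit length -> minimum number of repeats.
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A unit followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not _is_primitive(motif):
                continue  # skip homopolymers and composite units
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


# Hypothetical read containing an (AG)7 and a (CTT)5 repeat.
read = "GGC" + "AG" * 7 + "TTC" + "CTT" * 5 + "A"
print(find_ssrs(read))
```

Primers would then be designed in the sequence flanking each reported position; that step is not shown here.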
Single-Genome Sequencing of Hepatitis C Virus in Donor-Recipient Pairs Distinguishes Modes and Models of Virus Transmission and Early Diversification.

PubMed

Li, Hui; Stoddard, Mark B; Wang, Shuyi; Giorgi, Elena E; Blair, Lily M; Learn, Gerald H; Hahn, Beatrice H; Alter, Harvey J; Busch, Michael P; Fierer, Daniel S; Ribeiro, Ruy M; Perelson, Alan S; Bhattacharya, Tanmoy; Shaw, George M

2016-01-01

Despite the recent development of highly effective anti-hepatitis C virus (HCV) drugs, the global burden of this pathogen remains immense. Control or eradication of HCV will likely require the broad application of antiviral drugs and development of an effective vaccine. A precise molecular identification of transmitted/founder (T/F) HCV genomes that lead to productive clinical infection could play a critical role in vaccine research, as it has for HIV-1. However, the replication schema of these two RNA viruses differ substantially, as do viral responses to innate and adaptive host defenses. These differences raise questions as to the certainty of T/F HCV genome inferences, particularly in cases where multiple closely related sequence lineages have been observed. To clarify these issues and distinguish between competing models of early HCV diversification, we examined seven cases of acute HCV infection in humans and chimpanzees, including three examples of virus transmission between linked donors and recipients. Using single-genome sequencing (SGS) of plasma vRNA, we found that inferred T/F sequences in recipients were identical to viral sequences in their respective donors. Early in infection, HCV genomes generally evolved according to a simple model of random evolution where the coalescent corresponded to the T/F sequence. Closely related sequence lineages could be explained by high multiplicity infection from a donor whose viral sequences had undergone a pretransmission bottleneck due to treatment, immune selection, or recent infection. These findings validate SGS, together with mathematical modeling and phylogenetic analysis, as a novel strategy to infer T/F HCV genome sequences.
Despite the recent development of highly effective, interferon-sparing anti-hepatitis C virus (HCV) drugs, the global burden of this pathogen remains immense. Control or eradication of HCV will likely require the broad application of antiviral drugs and the development of an effective vaccine, which could be facilitated by a precise molecular identification of transmitted/founder (T/F) viral genomes and their progeny. We used single-genome sequencing to show that inferred HCV T/F sequences in recipients were identical to viral sequences in their respective donors and that viral genomes generally evolved early in infection according to a simple model of random sequence evolution. Altogether, the findings validate T/F genome inferences and illustrate how T/F sequence identification can illuminate studies of HCV transmission, immunopathogenesis, drug resistance development, and vaccine protection, including sieving effects on breakthrough virus strains. Copyright © 2015 Li et al.
Platinum(II)-Oligonucleotide Coordination Based Aptasensor for Simple and Selective Detection of Platinum Compounds.

PubMed

Cai, Sheng; Tian, Xueke; Sun, Lianli; Hu, Haihong; Zheng, Shirui; Jiang, Huidi; Yu, Lushan; Zeng, Su

2015-10-20

Wide use of platinum-based chemotherapeutic regimens for the treatment of carcinoma calls for a simple and selective method for detecting platinum compounds in biological samples. On the basis of platinum(II)-base pair coordination, a novel type of aptameric platform for platinum detection has been introduced. This chemiluminescence (CL) aptasensor consists of a designed streptavidin (SA) aptamer sequence in which several base pairs were replaced by G-G mismatches. Only in the presence of platinum does coordination occur between the platinum and G-G base pairs, as opposed to the hydrogen-bonded G-C base pairs, which leads to SA aptamer sequence activation and results in binding to SA-coated magnetic beads. These Pt-DNA coordination events were monitored by a simple and direct luminol-peroxide CL reaction through horseradish peroxidase (HRP) catalysis with a strong chemiluminescence emission. The validated range of quantification was 0.12-240 μM, with a limit of detection of 60 nM and selectivity over other metal ions. This assay was also successfully used in urine sample determination. It will be a promising candidate for the detection of platinum in biomedical and environmental samples.
The Origin and Early Evolution of Membrane Proteins

NASA Technical Reports Server (NTRS)

Pohorille, Andrew; Schweighofer, Karl; Wilson, Michael A.

2006-01-01

The origin and early evolution of membrane proteins, and in particular ion channels, are considered from the point of view that the transmembrane segments of membrane proteins are structurally quite simple and do not require specific sequences to fold. We argue that the transport of solute species, especially ions, required an early evolution of efficient transport mechanisms, and that the emergence of simple ion channels was protobiologically plausible. We also argue that, despite their simple structure, such channels could possess properties that, at first sight, appear to require markedly greater complexity. These properties can be subtly modulated by local modifications to the sequence rather than global changes in molecular architecture. In order to address the evolution and development of ion channels, we focus on identifying those protein domains that are commonly associated with ion channel proteins and are conserved throughout the three main domains of life (Eukarya, Prokarya, and Archaea). We discuss the potassium-sodium-calcium superfamily of voltage-gated ion channels, mechanosensitive channels, porins, and ABC-transporters and argue that these families of membrane channels have sufficiently universal architectures that they can readily adapt to the diverse functional demands arising during evolution.
Transposon fingerprinting using low coverage whole genome shotgun sequencing in Cacao (Theobroma cacao L.) and related species

PubMed Central

2013-01-01

Background Transposable elements (TEs) and other repetitive elements are a large and dynamically evolving part of eukaryotic genomes, especially in plants where they can account for a significant proportion of genome size. Their dynamic nature gives them the potential for use in identifying and characterizing crop germplasm. However, their repetitive nature makes them challenging to study using conventional methods of molecular biology. Next generation sequencing and new computational tools have greatly facilitated the investigation of TE variation within species and among closely related species. Results (i) We generated low-coverage Illumina whole genome shotgun sequencing reads for multiple individuals of cacao (Theobroma cacao) and related species. These reads were analysed using both an alignment/mapping approach and a de novo (graph based clustering) approach. (ii) A standard set of ultra-conserved orthologous sequences (UCOS) standardized TE data between samples and provided phylogenetic information on the relatedness of samples. (iii) The mapping approach proved highly effective within the reference species but underestimated TE abundance in interspecific comparisons relative to the de novo methods. (iv) Individual T. cacao accessions have unique patterns of TE abundance indicating that the TE composition of the genome is evolving actively within this species. (v) LTR/Gypsy elements are the most abundant, comprising c.10% of the genome. (vi) Within T. cacao the retroelement families show an order of magnitude greater sequence variability than the DNA transposon families. (vii) Theobroma grandiflorum has a similar TE composition to T. cacao, but the related genus Herrania is rather different, with LTRs making up a lower proportion of the genome, perhaps because of a massive presence (c. 20%) of distinctive low complexity satellite-like repeats in this genome. 
Conclusions (i) Short read alignment/mapping to reference TE contigs provides a simple and effective method of investigating intraspecific differences in TE composition. It is not appropriate for comparing repetitive elements across the species boundaries, for which de novo methods are more appropriate. (ii) Individual T. cacao accessions have unique spectra of TE composition indicating active evolution of TE abundance within this species. TE patterns could potentially be used as a “fingerprint” to identify and characterize cacao accessions. PMID:23883295
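One simple reading of how UCOS could standardize TE data between samples is to express TE read counts relative to the number of reads mapping to the UCOS set, so that samples sequenced to different depths become comparable. The sketch below assumes that interpretation; the abstract does not describe the actual normalization, and the function name, family labels and read counts are hypothetical.

```python
def normalise_te_counts(te_reads, ucos_reads):
    """Express per-family TE read counts relative to UCOS read counts.

    Assumption: reads mapping to conserved single-copy orthologues scale
    with sequencing depth, so the ratio is comparable across samples.
    """
    if ucos_reads <= 0:
        raise ValueError("ucos_reads must be positive")
    return {family: count / ucos_reads for family, count in te_reads.items()}


# Hypothetical read counts for two accessions sequenced to different depths.
sample_a = normalise_te_counts({"LTR/Gypsy": 12000, "LTR/Copia": 4000},
                               ucos_reads=2000)
sample_b = normalise_te_counts({"LTR/Gypsy": 9000, "LTR/Copia": 4500},
                               ucos_reads=1500)
print(sample_a, sample_b)
```

In this made-up example the two accessions have the same relative LTR/Gypsy abundance but differ in LTR/Copia, which is the kind of per-accession pattern the abstract proposes as a fingerprint.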
