USDA-ARS's Scientific Manuscript database
Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...
A SSR-based genetic linkage map of cultivated peanut (Arachis hypogaea L.)
USDA-ARS's Scientific Manuscript database
The objective of this study was to construct a molecular linkage map of cultivated tetraploid peanut using simple sequence repeat (SSR) markers derived primarily from peanut genomic sequences, expressed sequence tags (ESTs), and by "data mining" sequences released in GenBank. Three recombinant inbre...
Grunert, Steffen; Labudde, Dirk
2015-01-01
The importance of short membrane sequence motifs has been shown in many works and emphasizes the related sequence motif analysis. Together with specific transmembrane helix-helix interactions, the analysis of interacting sequence parts is helpful for understanding the process during membrane protein folding and in retaining the three-dimensional fold. Here we present a simple high-throughput analysis method for deriving mutational information of interacting sequence parts. Applied on aquaporin water channel proteins, our approach supports the analysis of mutational variants within different interacting subsequences and finally the investigation of natural variants which cause diseases like, for example, nephrogenic diabetes insipidus. In this work we demonstrate a simple method for massive membrane protein data analysis. As shown, the presented in silico analyses provide information about interacting sequence parts which are constrained by protein evolution. We present a simple graphical visualization medium for the representation of evolutionary influenced interaction pattern pairs (EIPPs) adapted to mutagen investigations of aquaporin-2, a protein whose mutants are involved in the rare endocrine disorder known as nephrogenic diabetes insipidus, and membrane proteins in general. Furthermore, we present a new method to derive new evolutionary variations within EIPPs which can be used for further mutagen laboratory investigations. PMID:26180540
Isobe, Sachiko N.; Hirakawa, Hideki; Sato, Shusei; Maeda, Fumi; Ishikawa, Masami; Mori, Toshiki; Yamamoto, Yuko; Shirasawa, Kenta; Kimura, Mitsuhiro; Fukami, Masanobu; Hashizume, Fujio; Tsuji, Tomoko; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Tsuruoka, Hisano; Minami, Chiharu; Takahashi, Chika; Wada, Tsuyuko; Ono, Akiko; Kawashima, Kumiko; Nakazaki, Naomi; Kishida, Yoshie; Kohara, Mitsuyo; Nakayama, Shinobu; Yamada, Manabu; Fujishiro, Tsunakazu; Watanabe, Akiko; Tabata, Satoshi
2013-01-01
The cultivated strawberry (Fragaria × ananassa) is an octoploid (2n = 8x = 56) of the Rosaceae family whose genomic architecture is still controversial. Several recent studies support the AAA′A′BBB′B′ model, but its complexity has hindered genetic and genomic analysis of this important crop. To overcome this difficulty and to assist genome-wide analysis of F. × ananassa, we constructed an integrated linkage map by organizing a total of 4474 simple sequence repeat (SSR) markers collected from published Fragaria sequences, including 3746 SSR markers [Fragaria vesca expressed sequence tag (EST)-derived SSR markers] derived from F. vesca ESTs, 603 markers (F. × ananassa EST-derived SSR markers) from F. × ananassa ESTs, and 125 markers (F. × ananassa transcriptome-derived SSR markers) from F. × ananassa transcripts. Along with the previously published SSR markers, these markers were mapped onto five parent-specific linkage maps derived from three mapping populations, which were then assembled into an integrated linkage map. The constructed map consists of 1856 loci in 28 linkage groups (LGs) that total 2364.1 cM in length. Macrosynteny at the chromosome level was observed between the LGs of F. × ananassa and the genome of F. vesca. Variety distinction on 129 F. × ananassa lines was demonstrated using 45 selected SSR markers. PMID:23248204
Brassica ASTRA: an integrated database for Brassica genomic research.
Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David
2005-01-01
Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
Model compilation: An approach to automated model derivation
NASA Technical Reports Server (NTRS)
Keller, Richard M.; Baudin, Catherine; Iwasaki, Yumi; Nayak, Pandurang; Tanaka, Kazuo
1990-01-01
An approach to automated model derivation for knowledge-based systems is introduced. The approach, model compilation, involves procedurally generating the set of domain models used by a knowledge-based system. An implemented example illustrates how this approach can be used to derive models of different precision and abstraction, tailored to different tasks, from a given set of base domain models. In particular, two implemented model compilers are described, each of which takes as input a base model that describes the structure and behavior of a simple electromechanical device, the Reaction Wheel Assembly of NASA's Hubble Space Telescope. The compilers transform this relatively general base model into simple task-specific models for troubleshooting and redesign, respectively, by applying a sequence of model transformations. Each transformation in this sequence produces an increasingly specialized model. The compilation approach lessens the burden of updating and maintaining consistency among models by enabling their automatic regeneration.
Kayesh, E; Bilkish, N; Liu, G S; Chen, W; Leng, X P; Fang, J G
2014-03-31
Among different classes of molecular markers, expressed sequence tags (ESTs) are a new resource for developing simple sequence repeat (SSR) functional markers for genotyping and genetic mapping in F1 hybrid populations of Vitis vinifera L. Recently, because of the availability of an enormous amount of data for ESTs in the public domain, the emphasis has shifted from genomic SSRs to EST-SSRs, which belong to transcribed regions of the genome and may have a role in gene expression or function. The objective of this study was to assess the polymorphisms among 94 F1 hybrids from "Early Rose" and "Red Globe" using 25 EST-derived and 25 non-EST SSR markers. A total of 362,375 grape ESTs were retrieved from the National Center for Biotechnology Information (NCBI), and 2522 EST-SSR sequences were identified. From them, 205 primer pairs were randomly selected, including 176 pairs that were EST-derived and 29 non-EST SSR primer pairs, for polymerase chain reaction amplification. A total of 131 alleles were amplified using 50 pairs of primers; 78 alleles were amplified using EST-derived SSR primers and 53 were from non-EST SSR primers. At most, 6 and 5 alleles were amplified by EST-derived and non-EST SSR primers, respectively. The EST-derived SSR markers showed a maximum polymorphic information content (PIC) value of 1 and a minimum of 0.33 while non-EST SSR markers had maximum and minimum PIC values of 1 and 0.25, respectively. The average PIC value was 0.56 for EST-derived SSR markers and 0.45 for non-EST SSR markers.
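For readers unfamiliar with PIC, the following is a minimal sketch of how a polymorphic information content value can be computed from allele frequencies. It assumes the commonly used Botstein-style formula, PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j²; the abstract does not state which formula the authors used, and the function name `pic` is hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphic information content for one marker.

    Assumption: the Botstein-style formula
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    where freqs are the allele frequencies at the locus (summing to 1).
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Two equally frequent alleles, then four equally frequent alleles.
print(pic([0.5, 0.5]))
print(pic([0.25, 0.25, 0.25, 0.25]))
```

Under this formula a monomorphic marker scores 0, and the value rises as alleles become more numerous and more evenly distributed.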
NASA Astrophysics Data System (ADS)
Jiao, Yong; Wakakuwa, Eyuri; Ogawa, Tomohiro
2018-02-01
We consider asymptotic convertibility of an arbitrary sequence of bipartite pure states into another by local operations and classical communication (LOCC). We adopt an information-spectrum approach to address cases where each element of the sequences is not necessarily a tensor power of a bipartite pure state. We derive necessary and sufficient conditions for the LOCC convertibility of one sequence to another in terms of spectral entropy rates of entanglement of the sequences. Based on these results, we also provide simple proofs for previously known results on the optimal rates of entanglement concentration and dilution of general sequences of bipartite pure states.
Linkage mapping in a watermelon population segregating for fusarium wilt resistance
Leigh K. Hawkins; Fenny Dane; Thomas L. Kubisiak; Billy B. Rhodes; Robert L. Jarret
2001-01-01
Isozyme, randomly amplified polymorphic DNA (RAPD), and simple sequence repeat (SSR) markers were used to generate a linkage map in an F2 and F3 watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai) population derived from a cross between the fusarium wilt (Fusarium oxysporum f....
Khatri, Bhavin S.; Goldstein, Richard A.
2015-01-01
Speciation is fundamental to understanding the huge diversity of life on Earth. Although still controversial, empirical evidence suggests that the rate of speciation is larger for smaller populations. Here, we explore a biophysical model of speciation by developing a simple coarse-grained theory of transcription factor-DNA binding and how their co-evolution in two geographically isolated lineages leads to incompatibilities. To develop a tractable analytical theory, we derive a Smoluchowski equation for the dynamics of binding energy evolution that accounts for the fact that natural selection acts on phenotypes, but variation arises from mutations in sequences; the Smoluchowski equation includes selection due to both gradients in fitness and gradients in sequence entropy, which is the logarithm of the number of sequences that correspond to a particular binding energy. This simple consideration predicts that smaller populations develop incompatibilities more quickly in the weak mutation regime; this trend arises as sequence entropy poises smaller populations closer to incompatible regions of phenotype space. These results suggest a generic coarse-grained approach to evolutionary stochastic dynamics, allowing realistic modelling at the phenotypic level. PMID:25936759
Jairin, Jirapong; Kobayashi, Tetsuya; Yamagata, Yoshiyuki; Sanada-Morimura, Sachiyo; Mori, Kazuki; Tashiro, Kosuke; Kuhara, Satoru; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Yamamoto, Kimiko; Matsumura, Masaya; Yasui, Hideshi
2013-01-01
In this study, we developed the first genetic linkage map for the major rice insect pest, the brown planthopper (BPH, Nilaparvata lugens). The linkage map was constructed by integrating linkage data from two backcross populations derived from three inbred BPH strains. The consensus map consists of 474 simple sequence repeats, 43 single-nucleotide polymorphisms, and 1 sequence-tagged site, for a total of 518 markers at 472 unique positions in 17 linkage groups. The linkage groups cover 1093.9 cM, with an average distance of 2.3 cM between loci. The average number of marker loci per linkage group was 27.8. The sex-linkage group was identified by exploiting X-linked and Y-specific markers. Our linkage map and the newly developed markers used to create it constitute an essential resource and a useful framework for future genetic analyses in BPH. PMID:23204257
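The summary statistics quoted in this abstract can be reproduced with simple arithmetic. The sketch below assumes that the reported average distance is the total map length divided by the number of unique positions, and that the loci-per-group figure is unique positions divided by linkage groups; the abstract does not spell out either definition, and the function names are hypothetical.

```python
def mean_spacing_cm(total_length_cm, n_positions):
    """Average map distance per unique position (assumed definition)."""
    return total_length_cm / n_positions


def mean_loci_per_group(n_positions, n_groups):
    """Average number of unique positions per linkage group (assumed definition)."""
    return n_positions / n_groups


# Figures taken from the abstract above.
total_length_cm = 1093.9
unique_positions = 472
linkage_groups = 17

print(round(mean_spacing_cm(total_length_cm, unique_positions), 1))
print(round(mean_loci_per_group(unique_positions, linkage_groups), 1))
```

Both values agree with the abstract (2.3 cM and 27.8) under these assumed definitions.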
Zill, Oliver A.; Sebisanovic, Dragan; Lopez, Rene; Blau, Sibel; Collisson, Eric A.; Divers, Stephen G.; Hoon, Dave S. B.; Kopetz, E. Scott; Lee, Jeeyun; Nikolinakos, Petros G.; Baca, Arthur M.; Kermani, Bahram G.; Eltoukhy, Helmy; Talasaz, AmirAli
2015-01-01
Next-generation sequencing of cell-free circulating solid tumor DNA addresses two challenges in contemporary cancer care. First, this method of massively parallel and deep sequencing enables assessment of a comprehensive panel of genomic targets from a single sample, and second, it obviates the need for repeat invasive tissue biopsies. Digital Sequencing™ is a novel method for high-quality sequencing of circulating tumor DNA simultaneously across a comprehensive panel of over 50 cancer-related genes with a simple blood test. Here we report the analytic and clinical validation of the gene panel. Analytic sensitivity down to 0.1% mutant allele fraction is demonstrated via serial dilution studies of known samples. Near-perfect analytic specificity (> 99.9999%) enables complete coverage of many genes without the false positives typically seen with traditional sequencing assays at mutant allele frequencies or fractions below 5%. We compared digital sequencing of plasma-derived cell-free DNA to tissue-based sequencing on 165 consecutive matched samples from five outside centers in patients with stage III-IV solid tumor cancers. Clinical sensitivity of plasma-derived NGS was 85.0%, comparable to 80.7% sensitivity for tissue. The assay success rate on 1,000 consecutive samples in clinical practice was 99.8%. Digital sequencing of plasma-derived DNA is indicated in advanced cancer patients to prevent repeated invasive biopsies when the initial biopsy is inadequate, unobtainable for genomic testing, or uninformative, or when the patient’s cancer has progressed despite treatment. Its clinical utility is derived from reduction in the costs, complications and delays associated with invasive tissue biopsies for genomic testing. PMID:26474073
Ramu, P; Kassahun, B; Senthilvel, S; Ashok Kumar, C; Jayashree, B; Folkertsma, R T; Reddy, L Ananda; Kuruvinashetti, M S; Haussmann, B I G; Hash, C T
2009-11-01
The sequencing and detailed comparative functional analysis of genomes of a number of select botanical models open new doors into comparative genomics among the angiosperms, with potential benefits for improvement of many orphan crops that feed large populations. In this study, a set of simple sequence repeat (SSR) markers was developed by mining the expressed sequence tag (EST) database of sorghum. Among the SSR-containing sequences, only those sharing considerable homology with rice genomic sequences across the lengths of the 12 rice chromosomes were selected. Thus, 600 SSR-containing sorghum EST sequences (50 homologous sequences on each of the 12 rice chromosomes) were selected, with the intention of providing coverage for corresponding homologous regions of the sorghum genome. Primer pairs were designed and polymorphism detection ability was assessed using parental pairs of two existing sorghum mapping populations. About 28% of these new markers detected polymorphism in this 4-entry panel. A subset of 55 polymorphic EST-derived SSR markers were mapped onto the existing skeleton map of a recombinant inbred population derived from cross N13 x E 36-1, which is segregating for Striga resistance and the stay-green component of terminal drought tolerance. These new EST-derived SSR markers mapped across all 10 sorghum linkage groups, mostly to regions expected based on prior knowledge of rice-sorghum synteny. The ESTs from which these markers were derived were then mapped in silico onto the aligned sorghum genome sequence, and 88% of the best hits corresponded to linkage-based positions. This study demonstrates the utility of comparative genomic information in targeted development of markers to fill gaps in linkage maps of related crop species for which sufficient genomic tools are not available.
Novel numerical and graphical representation of DNA sequences and proteins.
Randić, M; Novic, M; Vikić-Topić, D; Plavsić, D
2006-12-01
We have introduced novel numerical and graphical representations of DNA, which offer a simple and unique characterization of DNA sequences. The numerical representation of a DNA sequence is given as a sequence of real numbers derived from a unique graphical representation of the standard genetic code. There is no loss of information on the primary structure of a DNA sequence associated with this numerical representation. The novel representations are illustrated with the coding sequences of the first exon of beta-globin gene of half a dozen species in addition to human. The method can be extended to proteins as is exemplified by humanin, a 24-aa peptide that has recently been identified as a specific inhibitor of neuronal cell death induced by familial Alzheimer's disease mutant genes.
A Simple, Scalable Synthetic Route to (+)- and (−)-Pseudoephenamine
Mellem, Kevin T.
2013-01-01
A three-step synthesis of pseudoephenamine suitable for preparing multigram amounts of both enantiomers of the auxiliary from the inexpensive starting material benzil is described. The sequence involves synthesis of the crystalline mono-methylimine derivative of benzil, reduction of that substance with lithium aluminum hydride, and resolution of pseudoephenamine with mandelic acid. PMID:24138164
Georgi, Laura; Johnson-Cicalese, Jennifer; Honig, Josh; Das, Sushma Parankush; Rajah, Veeran D; Bhattacharya, Debashish; Bassil, Nahla; Rowland, Lisa J; Polashock, James; Vorsa, Nicholi
2013-03-01
The first genetic map of cranberry (Vaccinium macrocarpon) has been constructed, comprising 14 linkage groups totaling 879.9 cM with an estimated coverage of 82.2 %. This map, based on four mapping populations segregating for field fruit-rot resistance, contains 136 distinct loci. Mapped markers include blueberry-derived simple sequence repeat (SSR) and cranberry-derived sequence-characterized amplified region markers previously used for fingerprinting cranberry cultivars. In addition, SSR markers were developed near cranberry sequences resembling genes involved in flavonoid biosynthesis or defense against necrotrophic pathogens, or conserved orthologous set (COS) sequences. The cranberry SSRs were developed from next-generation cranberry genomic sequence assemblies; thus, the positions of these SSRs on the genomic map provide information about the genomic location of the sequence scaffold from which they were derived. The use of SSR markers near COS and other functional sequences, plus 33 SSR markers from blueberry, facilitates comparisons of this map with maps of other plant species. Regions of the cranberry map were identified that showed conservation of synteny with Vitis vinifera and Arabidopsis thaliana. Positioned on this map are quantitative trait loci (QTL) for field fruit-rot resistance (FFRR), fruit weight, titratable acidity, and sound fruit yield (SFY). The SFY QTL is adjacent to one of the fruit weight QTL and may reflect pleiotropy. Two of the FFRR QTL are in regions of conserved synteny with grape and span defense gene markers, and the third FFRR QTL spans a flavonoid biosynthetic gene.
Array-Based Rational Design of Short Peptide Probe-Derived from an Anti-TNT Monoclonal Antibody.
Okochi, Mina; Muto, Masaki; Yanai, Kentaro; Tanaka, Masayoshi; Onodera, Takeshi; Wang, Jin; Ueda, Hiroshi; Toko, Kiyoshi
2017-10-09
Complementarity-determining regions (CDRs) are sites on the variable chains of antibodies responsible for binding to specific antigens. In this study, a short peptide probe for recognition of 2,4,6-trinitrotoluene (TNT) was identified by testing sequences derived from the CDRs of an anti-TNT monoclonal antibody. The major TNT-binding site in this antibody was identified in the heavy chain CDR3 by antigen docking simulation and confirmed by an immunoassay using a spot-synthesis based peptide array comprising amino acid sequences of six CDRs in the variable region. A peptide derived from heavy chain CDR3 (RGYSSFIYWF) bound to TNT with a dissociation constant of 1.3 μM measured by surface plasmon resonance. Substitution of selected amino acids with basic residues increased TNT binding while substitution with acidic amino acids decreased affinity; an isoleucine-to-arginine change showed the greatest improvement of 1.8-fold. The ability to create simple peptide binders of volatile organic compounds from sequence information provided by the immune system in the creation of an immune response will be beneficial for sensor developments in the future.
A New SNP Haplotype associated with blue disease resistance gene in cotton (Gossypium hirsutum L.)
USDA-ARS's Scientific Manuscript database
Resistance to cotton blue disease (CBD) was evaluated in 364 F2.3 families of 3 populations derived from resistant variety ‘Delta Opal’. The CBD resistance in ‘Delta Opal’ was controlled by one single dominant gene designated Cbd. Two simple sequence repeat (SSR) markers were identified as linked t...
Detecting and Analyzing Genetic Recombination Using RDP4.
Martin, Darren P; Murrell, Ben; Khoosal, Arjun; Muhire, Brejnev
2017-01-01
Recombination between nucleotide sequences is a major process influencing the evolution of most species on Earth. The evolutionary value of recombination has been widely debated and so too has its influence on evolutionary analysis methods that assume nucleotide sequences replicate without recombining. When nucleic acids recombine, the evolution of the daughter or recombinant molecule cannot be accurately described by a single phylogeny. This simple fact can seriously undermine the accuracy of any phylogenetics-based analytical approach which assumes that the evolutionary history of a set of recombining sequences can be adequately described by a single phylogenetic tree. There are presently a large number of available methods and associated computer programs for analyzing and characterizing recombination in various classes of nucleotide sequence datasets. Here we examine the use of some of these methods to derive and test recombination hypotheses using multiple sequence alignments.
NASA Astrophysics Data System (ADS)
Jiang, Qun; Li, Qi; Yu, Hong; Kong, Lingfeng
2011-06-01
The sea cucumber Apostichopus japonicus is a commercially and ecologically important species in China. A total of 3056 potential unigenes were generated after assembling 7597 A. japonicus expressed sequence tags (ESTs) downloaded from GenBank. Two hundred and fifty microsatellite-containing ESTs (8.18%) and 299 simple sequence repeats (SSRs) were detected. The average density of SSRs was 1 per 7.403 kb of EST after redundancy elimination. Di-nucleotide repeat motifs appeared to be the most abundant type with a percentage of 69.90%. Of the 126 primer pairs designed, 90 amplified the expected products and 43 showed polymorphism in 30 individuals tested. The number of alleles per locus ranged from 2 to 26 with an average of 7.0 alleles, and the observed and expected heterozygosities varied from 0.067 to 1.000 and from 0.066 to 0.959, respectively. These new EST-derived microsatellite markers would provide sufficient polymorphism for population genetic studies and genome mapping of this sea cucumber species.
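As an illustration of the heterozygosity statistics reported here, the sketch below computes observed and expected heterozygosity for one locus from diploid genotypes. It assumes the plain estimator He = 1 − Σ p_i², without a small-sample correction; the abstract does not say which estimator or software the authors used, and the function name and genotype encoding are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Assumption: expected heterozygosity is 1 - sum(p_i^2), uncorrected for
    sample size.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    n = len(genotypes)
    # Observed: fraction of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Expected: derived from allele frequencies pooled over all individuals.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Four individuals scored at a two-allele microsatellite (allele sizes in bp).
ho, he = heterozygosity([(180, 184), (180, 180), (184, 184), (180, 184)])
print(ho, he)
```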
Schwaiger, F W; Weyers, E; Epplen, C; Brün, J; Ruff, G; Crawford, A; Epplen, J T
1993-09-01
Twenty-one different caprine and 13 ovine MHC-DRB exon 2 sequences were determined including part of the adjacent introns containing simple repetitive (gt)n(ga)m elements. The positions for highly polymorphic DRB amino acids vary slightly among ungulates and other mammals. From man and mouse to ungulates the basic (gt)n(ga)m structure is fixed in evolution for 7 × 10^7 years whereas ample variations exist in the tandem (gt)n and (ga)m dinucleotides and especially their "degenerated" derivatives. Phylogenetic trees for the alpha-helices and beta-pleated sheets of the ungulate DRB sequences suggest different evolutionary histories. In hoofed animals as well as in humans DRB beta-sheet encoding sequences and adjacent intronic repeats can be assembled into virtually identical groups suggesting coevolution of noncoding as well as coding DNA. In contrast alpha-helices and C-terminal parts of the first DRB domain evolve distinctly. In the absence of a defined mechanism causing specific, site-directed mutations, double-recombination or gene-conversion-like events would readily explain this fact. The role of the intronic simple (gt)n(ga)m repeat is discussed with respect to these genetic exchange mechanisms during evolution.
Etard, Christelle; Joshi, Swarnima; Stegmaier, Johannes; Mikut, Ralf; Strähle, Uwe
2017-12-01
A bottleneck in CRISPR/Cas9 genome editing is variable efficiencies of in silico-designed gRNAs. We evaluated the sensitivity of the TIDE method (Tracking of Indels by DEcomposition) introduced by Brinkman et al. in 2014 for assessing the cutting efficiencies of gRNAs in zebrafish. We show that this simple method, which involves bulk polymerase chain reaction amplification and Sanger sequencing, is highly effective in tracking well-performing gRNAs in pools of genomic DNA derived from injected embryos. The method is equally effective for tracing INDELs in heterozygotes.
Bushakra, Jill M; Lewers, Kim S; Staton, Margaret E; Zhebentyayeva, Tetyana; Saski, Christopher A
2015-10-26
Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed sequence tags (ESTs) are a source of SSRs that can be used to develop markers to facilitate plant breeding and for more basic research across genera and higher plant orders. Leaf and meristem tissue from 'Heritage' red raspberry (Rubus idaeus) and 'Bristol' black raspberry (R. occidentalis) were utilized for RNA extraction. After conversion to cDNA and library construction, ESTs were sequenced, quality verified, assembled and scanned for SSRs. Primers flanking the SSRs were designed and a subset tested for amplification, polymorphism and transferability across species. ESTs containing SSRs were functionally annotated using the GenBank non-redundant (nr) database and further classified using the gene ontology database. To accelerate development of EST-SSRs in the genus Rubus (Rosaceae), 1149 and 2358 cDNA sequences were generated from red raspberry and black raspberry, respectively. The cDNA sequences were screened using rigorous filtering criteria which resulted in the identification of 121 and 257 SSR loci for red and black raspberry, respectively. Primers were designed from the surrounding sequences resulting in 131 and 288 primer pairs, respectively, as some sequences contained more than one SSR locus. Sequence analysis revealed that the SSR-containing genes span a diversity of functions and share more sequence identity with strawberry genes than with other Rosaceous species. This resource of Rubus-specific, gene-derived markers will facilitate the construction of linkage maps composed of transferable markers for studying and manipulating important traits in this economically important genus.
Blair, Matthew W; Hurtado, Natalia; Chavarro, Carolina M; Muñoz-Torres, Monica C; Giraldo, Martha C; Pedraza, Fabio; Tomkins, Jeff; Wing, Rod
2011-03-22
Sequencing of cDNA libraries for the development of expressed sequence tags (ESTs) as well as for the discovery of simple sequence repeats (SSRs) has been a common method of developing microsatellites or SSR-based markers. In this research, our objective was to further sequence and develop common bean microsatellites from leaf and root cDNA libraries derived from the Andean gene pool accession G19833 and the Mesoamerican gene pool accession DOR364, mapping parents of a commonly used reference map. The root libraries were made from high and low phosphorus treated plants. A total of 3,123 EST sequences from leaf and root cDNA libraries were screened and used for direct simple sequence repeat discovery. From these EST sequences we found 184 microsatellites; the majority containing tri-nucleotide motifs, many of which were GC rich (ACC, AGC and AGG in particular). Di-nucleotide motif microsatellites were about half as common as the tri-nucleotide motif microsatellites but most of these were AGn microsatellites with a moderate number of ATn microsatellites in root ESTs followed by few ACn and no GCn microsatellites. Out of the 184 new SSR loci, 120 new microsatellite markers were developed in the BMc (Bean Microsatellites from cDNAs) series and these were evaluated for their capacity to distinguish bean diversity in a germplasm panel of 18 genotypes. We developed a database with images of the microsatellites and their polymorphism information content (PIC), which averaged 0.310 for polymorphic markers. The present study produced information about microsatellite frequency in root and leaf tissues of two important genotypes for common bean genomics: namely G19833, the Andean genotype selected for whole genome shotgun sequencing from race Peru, and DOR364 a race Mesoamerica subgroup 2 genotype that is a small-red seeded, released variety in Central America. 
Both race Peru and Mesoamerica subgroup 2 (small red beans) have been understudied in comparison to race Nueva Granada and Mesoamerica subgroup 1 (black beans) both with regards to gene expression and as sources of markers. However, we found few differences between SSR type and frequency between the G19833 leaf and DOR364 root tissue-derived ESTs. Overall, our work adds to the analysis of microsatellite frequency evaluation for common bean and provides a new set of 120 BMc markers which combined with the 248 previously developed BMc markers brings the total in this series to 368 markers. Once we include BMd markers, which are derived from GenBank sequences, the current total of gene-based markers from our laboratory surpasses 500 markers. These markers are basic for studies of the transcriptome of common bean and can form anchor points for genetic mapping studies in the future.
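The microsatellite screening described in this abstract can be sketched with a regular-expression scan for perfect tandem repeats. The minimum repeat counts used here (6 for di-, 5 for tri-, 4 for tetra-nucleotide motifs) are illustrative assumptions, not the authors' criteria, and `find_ssrs` is a hypothetical helper rather than the tool used in the study.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect SSRs; returns a list of (start, motif, repeat_count).

    min_repeats maps motif length to the minimum number of tandem copies.
    The defaults are illustrative thresholds only.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (homopolymers, or ATAT reported as a tetranucleotide).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


# Toy EST fragment with an (AG)8 and an (ACC)5 repeat.
est = "GGC" + "AG" * 8 + "TTG" + "ACC" * 5 + "GA"
print(find_ssrs(est))
```

Note that a regex scan reports the leftmost frame of a repeat, so the same tract may be labeled ACC, CCA, or CAC depending on the flanking bases; motif classes are usually normalized before counting.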
MM Algorithms for Geometric and Signomial Programming
Lange, Kenneth; Zhou, Hua
2013-01-01
This paper derives new algorithms for signomial programming, a generalization of geometric programming. The algorithms are based on a generic principle for optimization called the MM algorithm. In this setting, one can apply the geometric-arithmetic mean inequality and a supporting hyperplane inequality to create a surrogate function with parameters separated. Thus, unconstrained signomial programming reduces to a sequence of one-dimensional minimization problems. Simple examples demonstrate that the MM algorithm derived can converge to a boundary point or to one point of a continuum of minimum points. Conditions under which the minimum point is unique or occurs in the interior of parameter space are proved for geometric programming. Convergence to an interior point occurs at a linear rate. Finally, the MM framework easily accommodates equality and inequality constraints of signomial type. For the most important special case, constrained quadratic programming, the MM algorithm involves very simple updates. PMID:24634545
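To make the separation-of-parameters idea concrete, here is a minimal sketch on a toy posynomial chosen for this illustration (it is not claimed to be an example from the paper): minimize f(x, y) = xy + 1/x + 1/y over x, y > 0. The arithmetic-geometric mean inequality gives xy ≤ (x_n y_n / 2) [(x/x_n)² + (y/y_n)²], with equality at the current iterate (x_n, y_n), so the surrogate splits into two one-dimensional problems of the form a·t² + 1/t, each minimized in closed form at t = (2a)^(-1/3). All function names are hypothetical.

```python
def f(x, y):
    """Toy objective: a posynomial with minimum value 3 at x = y = 1."""
    return x * y + 1.0 / x + 1.0 / y


def mm_step(x, y):
    """One MM update using the AM-GM surrogate for the coupling term x*y."""
    a = y / (2.0 * x)  # coefficient of x^2 in the surrogate
    b = x / (2.0 * y)  # coefficient of y^2 in the surrogate
    # argmin over t > 0 of a*t^2 + 1/t is t = (2a)^(-1/3)
    new_x = (1.0 / (2.0 * a)) ** (1.0 / 3.0)
    new_y = (1.0 / (2.0 * b)) ** (1.0 / 3.0)
    return new_x, new_y


def mm_minimize(x, y, iters=60):
    """Run MM iterations and record the objective after each step."""
    history = [f(x, y)]
    for _ in range(iters):
        x, y = mm_step(x, y)
        history.append(f(x, y))
    return x, y, history


x, y, history = mm_minimize(2.0, 0.5)
print(x, y, history[-1])
```

In this toy problem the objective never increases from one iterate to the next, and after the first step log x shrinks by a factor of 2/3 per iteration, which is an instance of the linear convergence to an interior minimum mentioned in the abstract.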
Javier, David J.; Castellanos-Gonzalez, Alejandro; Weigum, Shannon E.; White, A. Clinton; Richards-Kortum, Rebecca
2009-01-01
We report on a novel strategy for the detection of mRNA targets derived from Cryptosporidium parvum oocysts by the use of oligonucleotide-gold nanoparticles. Gold nanoparticles are functionalized with oligonucleotides which are complementary to unique sequences present on the heat shock protein 70 (HSP70) DNA/RNA target. The results indicate that the presence of HSP70 targets of increasing complexity causes the formation of oligonucleotide-gold nanoparticle networks which can be visually monitored via a simple colorimetric readout measured by a total internal reflection imaging setup. Furthermore, the induced expression of HSP70 mRNA in Cryptosporidium parvum oocysts via a simple heat shock process provides nonenzymatic amplification such that the HSP70 mRNA derived from as few as 5 × 10³ purified C. parvum oocysts was successfully detected. Taken together, these results support the use of oligonucleotide-gold nanoparticles for the molecular diagnosis of cryptosporidiosis, offering new opportunities for the further development of point-of-care diagnostic assays with low-cost, robust reagents and simple colorimetric detection. PMID:19828740
Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining
2014-01-01
Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
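Identifying SSRs in EST sequences, as several of the abstracts in this listing describe, amounts to scanning for short motifs repeated in tandem. The sketch below is one way this could be done with regular expressions; it is not the pipeline used in any of these studies. The function name and the minimum-repeat thresholds are assumptions chosen for illustration, and overlapping hits across motif lengths are not merged.

```python
import re

# Hypothetical minimum repeat counts per motif length (not taken from the study).
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(sequence, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    seq = sequence.upper()
    hits = []
    for k, n in thresholds.items():
        # One motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "CC" + "AAG" * 6 + "TTC" + "AG" * 7 + "C"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Counting the motifs returned by such a scan, grouped by motif length, would give the kind of di-, tri-, and tetranucleotide proportions reported above.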
Kumar, Vikash; Chatterjee, Amrita; Kumar, Nupur; Ganguly, Anasuya; Chakraborty, Indranil; Banerjee, Mainak
2014-10-09
Four new D-glucose derived m-s-m type gemini surfactants with variable spacer and tail length have been synthesized by a simple and efficient synthetic methodology utilizing the free C-3 hydroxy group of diisopropylidene glucose. The synthetic route to these gemini surfactants with a quaternary ammonium group as polar head group involves a sequence of simple reactions including alkylation, imine formation, quaternization of amine etc. The surface properties of the new geminis were evaluated by surface tension and conductivity measurements. These gemini surfactants showed low cytotoxicity by MTT assay on HeLa cell line. The DNA binding capabilities of these surfactants were determined by agarose gel electrophoresis, fluorescence titration, and DLS experiments. The preliminary studies by agarose gel electrophoresis indicated chain length dependent DNA binding abilities, further supported by ethidium bromide exclusion experiments. Two of the D-glucose derived gemini surfactants showed effective binding with pET-28a plasmid DNA (pDNA) at relatively low N/P ratio (i.e., cationic nitrogen/DNA phosphate molar ratio). Copyright © 2014 Elsevier Ltd. All rights reserved.
Episodic sequence memory is supported by a theta-gamma phase code.
Heusser, Andrew C; Poeppel, David; Ezzyat, Youssef; Davachi, Lila
2016-10-01
The meaning we derive from our experiences is not a simple static extraction of the elements but is largely based on the order in which those elements occur. Models propose that sequence encoding is supported by interactions between high- and low-frequency oscillations, such that elements within an experience are represented by neural cell assemblies firing at higher frequencies (gamma) and sequential order is encoded by the specific timing of firing with respect to a lower frequency oscillation (theta). During episodic sequence memory formation in humans, we provide evidence that items in different sequence positions exhibit greater gamma power along distinct phases of a theta oscillation. Furthermore, this segregation is related to successful temporal order memory. Our results provide compelling evidence that memory for order, a core component of an episodic memory, capitalizes on the ubiquitous physiological mechanism of theta-gamma phase-amplitude coupling.
RNA circularization reveals terminal sequence heterogeneity in a double-stranded RNA virus.
Widmer, G
1993-03-01
Double-stranded RNA viruses (dsRNA), termed LRV1, have been found in several strains of the protozoan parasite Leishmania. With the aim of constructing a full-length cDNA copy of the viral genome, including its terminal sequences, a protocol based on PCR amplification across the 3'-5' junction of circularized RNA was developed. This method proved to be applicable to dsRNA. It provided a relatively simple alternative to one-sided PCR, without loss of specificity inherent in the use of generic primers. LRV1 terminal nucleotide sequences obtained by this method showed a considerable variation in length, particularly at the 5' end of the positive strand, as well as the potential for forming 3' overhangs. The opposite genomic end terminates in 0, 1, or 2 TCA trinucleotide repeats. These results are compared with terminal sequences derived from one-sided PCR experiments.
Massive contribution of transposable elements to mammalian regulatory sequences.
Rayan, Nirmala Arul; Del Rosario, Ricardo C H; Prabhakar, Shyam
2016-09-01
Barbara McClintock discovered the existence of transposable elements (TEs) in the late 1940s and initially proposed that they contributed to the gene regulatory program of higher organisms. This controversial idea gained acceptance only much later in the 1990s, when the first examples of TE-derived promoter sequences were uncovered. It is now known that half of the human genome is recognizably derived from TEs. It is thus important to understand the scope and nature of their contribution to gene regulation. Here, we provide a timeline of major discoveries in this area and discuss how transposons have revolutionized our understanding of mammalian genomes, with a special emphasis on the massive contribution of TEs to primate evolution. Our analysis of primate-specific functional elements supports a simple model for the rate at which new functional elements arise in unique and TE-derived DNA. Finally, we discuss some of the challenges and unresolved questions in the field, which need to be addressed in order to fully characterize the impact of TEs on gene regulation, evolution and disease processes. Copyright © 2016 Elsevier Ltd. All rights reserved.
Probabilistic Evaluation of Competing Climate Models
NASA Astrophysics Data System (ADS)
Braverman, A. J.; Chatterjee, S.; Heyman, M.; Cressie, N.
2017-12-01
A standard paradigm for assessing the quality of climate model simulations is to compare what these models produce for past and present time periods, to observations of the past and present. Many of these comparisons are based on simple summary statistics called metrics. Here, we propose an alternative: evaluation of competing climate models through probabilities derived from tests of the hypothesis that climate-model-simulated and observed time sequences share common climate-scale signals. The probabilities are based on the behavior of summary statistics of climate model output and observational data, over ensembles of pseudo-realizations. These are obtained by partitioning the original time sequences into signal and noise components, and using a parametric bootstrap to create pseudo-realizations of the noise sequences. The statistics we choose come from working in the space of decorrelated and dimension-reduced wavelet coefficients. We compare monthly sequences of CMIP5 model output of average global near-surface temperature anomalies to similar sequences obtained from the well-known HadCRUT4 data set, as an illustration.
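The pseudo-realization step can be sketched in a few lines. This is a simplified illustration and not the authors' procedure: the abstract works with decorrelated, dimension-reduced wavelet coefficients, whereas the sketch below substitutes a centered moving average for the signal and an AR(1) model for the noise. Both substitutions, and all function names and defaults, are assumptions.

```python
import random


def moving_average(series, half_window):
    """Centered moving average; the window shrinks near the ends."""
    n = len(series)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def fit_ar1(noise):
    """Least-squares AR(1) coefficient and innovation standard deviation."""
    num = sum(noise[t] * noise[t - 1] for t in range(1, len(noise)))
    den = sum(v * v for v in noise[:-1])
    phi = num / den if den > 0 else 0.0
    resid = [noise[t] - phi * noise[t - 1] for t in range(1, len(noise))]
    sigma = (sum(r * r for r in resid) / len(resid)) ** 0.5
    return phi, sigma


def pseudo_realizations(series, half_window=6, n_boot=100, seed=0):
    """Parametric bootstrap: keep the signal, resimulate the noise."""
    rng = random.Random(seed)
    signal = moving_average(series, half_window)
    noise = [x - s for x, s in zip(series, signal)]
    phi, sigma = fit_ar1(noise)
    realizations = []
    for _ in range(n_boot):
        e, sim = 0.0, []
        for s in signal:
            e = phi * e + rng.gauss(0.0, sigma)  # simulated AR(1) noise
            sim.append(s + e)
        realizations.append(sim)
    return signal, realizations
```

A summary statistic computed on each pseudo-realization then yields an empirical distribution against which the same statistic from another sequence can be compared.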
Hatae, Ryusuke; Yoshimoto, Koji; Kuga, Daisuke; Akagi, Yojiro; Murata, Hideki; Suzuki, Satoshi O.; Mizoguchi, Masahiro; Iihara, Koji
2016-01-01
High resolution melting (HRM) is a simple and rapid method for screening mutations. It offers various advantages for clinical diagnostic applications. Conventional HRM analysis often yields equivocal results, especially for surgically obtained tissues. We attempted to improve HRM analyses for more effective applications to clinical diagnostics. HRM analyses were performed for IDH1 R132 and IDH2 R172 mutations in 192 clinical glioma samples in duplicate and these results were compared with sequencing results. BRAF V600E mutations were analyzed in 52 additional brain tumor samples. The melting profiles were used for differential calculus analyses. Negative second derivative plots revealed additional peaks derived from heteroduplexes in PCR products that contained mutations; this enabled unequivocal visual discrimination of the mutations. We further developed a numerical expression, the HRM-mutation index (MI), to quantify the heteroduplex-derived peak of the mutational curves. Using this expression, all IDH1 mutation statuses matched those ascertained by sequencing, with the exception of three samples. These discordant results were all derived from the misinterpretation of sequencing data. The effectiveness of our approach was further validated by analyses of IDH2 R172 and BRAF V600E mutations. The present analytical method enabled an unequivocal and objective HRM analysis and is suitable for reliable mutation scanning in surgically obtained glioma tissues. This approach could facilitate molecular diagnostics in clinical environments. PMID:27529619
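A negative second derivative plot of a melting profile can be approximated with central finite differences. The sketch below assumes a uniformly spaced temperature grid and uses an idealized two-component sigmoidal curve as input; the curve model, its parameters, and the function names are hypothetical, and the HRM-mutation index itself is not reproduced because the abstract does not define it.

```python
import math


def negative_second_derivative(temps, fluor):
    """Return (temperature, -d2F/dT2) pairs on a uniformly spaced grid."""
    h = temps[1] - temps[0]
    result = []
    for i in range(1, len(temps) - 1):
        d2 = (fluor[i + 1] - 2.0 * fluor[i] + fluor[i - 1]) / (h * h)
        result.append((temps[i], -d2))
    return result


def melt_curve(t, tm, width=0.5):
    """Idealized sigmoidal melting transition centered at tm (hypothetical model)."""
    return 1.0 / (1.0 + math.exp((t - tm) / width))


# Hypothetical mixture: 30% heteroduplex melting near 80 C, 70% homoduplex near 84 C.
temps = [75.0 + 0.1 * i for i in range(151)]
fluor = [0.3 * melt_curve(t, 80.0) + 0.7 * melt_curve(t, 84.0) for t in temps]
curve = negative_second_derivative(temps, fluor)
```

With real instrument data the raw fluorescence would usually need smoothing first, since second differences amplify measurement noise.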
Vidal, Á M; Vieira, L J; Ferreira, C F; Souza, F V D; Souza, A S; Ledo, C A S
2015-07-14
Molecular markers are efficient for assessing the genetic fidelity of various species of plants after in vitro culture. In this study, we evaluated the genetic fidelity and variability of micropropagated cassava plants (Manihot esculenta Crantz) using inter-simple sequence repeat markers. Twenty-two cassava accessions from the Embrapa Cassava & Fruits Germplasm Bank were used. For each accession, DNA was extracted from a plant maintained in the field and from 3 plants grown in vitro. For DNA amplification, 27 inter-simple sequence repeat primers were used, of which 24 generated 175 bands; 100 of those bands were polymorphic and were used to study genetic variability among accessions of cassava plants maintained in the field. Based on the genetic distance matrix calculated using the arithmetic complement of Jaccard's index, genotypes were clustered using the unweighted pair group method with arithmetic averages. The number of bands per primer was 2-13, with an average of 7.3. For most micropropagated accessions, the fidelity study showed no genetic variation between plants of the same accession maintained in the field and those maintained in vitro, confirming the high genetic fidelity of the micropropagated plants. However, genetic variability was observed among different accessions grown in the field, and clustering based on the dissimilarity matrix revealed 7 groups. Inter-simple sequence repeat markers were efficient for detecting the genetic homogeneity of cassava plants derived from meristem culture, demonstrating the reliability of this propagation system.
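The distance used here, the arithmetic complement of Jaccard's index, is easy to compute from presence/absence band scores. The following is a minimal sketch with made-up band profiles; the accession names and data are hypothetical, and the clustering step (UPGMA) is not shown.

```python
def jaccard_dissimilarity(bands_a, bands_b):
    """1 - |A and B| / |A or B| for two equal-length 0/1 band profiles."""
    both = sum(1 for a, b in zip(bands_a, bands_b) if a and b)
    either = sum(1 for a, b in zip(bands_a, bands_b) if a or b)
    # Two profiles with no bands at all are treated as identical.
    return 1.0 - both / either if either else 0.0


def distance_matrix(profiles):
    """Pairwise dissimilarities keyed by (name, name)."""
    names = sorted(profiles)
    return {(p, q): jaccard_dissimilarity(profiles[p], profiles[q])
            for p in names for q in names}


if __name__ == "__main__":
    # Hypothetical band scores for three accessions across four bands.
    profiles = {
        "acc1_field": [1, 1, 0, 1],
        "acc1_in_vitro": [1, 1, 0, 1],
        "acc2_field": [1, 0, 0, 1],
    }
    for pair, d in sorted(distance_matrix(profiles).items()):
        print(pair, round(d, 3))
```

A dissimilarity of zero between a field plant and its in vitro counterparts corresponds to the "no genetic variation" outcome the abstract describes for most accessions.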
Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja
2012-08-01
Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs), of which 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions, with 673 (58%) and 145 (42%) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4%) than the rest of the markers (42.7%). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.
E-Learning for Rare Diseases: An Example Using Fabry Disease.
Cimmaruta, Chiara; Liguori, Ludovica; Monticelli, Maria; Andreotti, Giuseppina; Citro, Valentina
2017-09-24
Rare diseases represent a challenge for physicians because patients are rarely seen, and they can manifest with symptoms similar to those of common diseases. In this work, genetic confirmation of diagnosis is derived from DNA sequencing. We present a tutorial for the molecular analysis of a rare disease using Fabry disease as an example. An exonic sequence derived from a hypothetical male patient was matched against human reference data using a genome browser. The missense mutation was identified by running BlastX, and information on the affected protein was retrieved from the database UniProt. The pathogenic nature of the mutation was assessed with PolyPhen-2. Disease-specific databases were used to assess whether the missense mutation led to a severe phenotype, and whether pharmacological therapy was an option. An inexpensive bioinformatics approach is presented to get the reader acquainted with the diagnosis of Fabry disease. The reader is introduced to the field of pharmacological chaperones, a therapeutic approach that can be applied only to certain Fabry genotypes. The principle underlying the analysis of exome sequencing can be explained in simple terms using web applications and databases which facilitate diagnosis and therapeutic choices.
Fueling Future with Algal Genomics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grigoriev, Igor
Algae constitute a major component of fundamental eukaryotic diversity, play profound roles in the carbon cycle, and are prominent candidates for biofuel production. The US Department of Energy Joint Genome Institute (JGI) is leading the world in algal genome sequencing (http://jgi.doe.gov/Algae) and contributes of the algal genome projects worldwide (GOLD database, 2012). The sequenced algal genomes offer catalogs of genes, networks, and pathways. The sequenced first-of-their-kind genomes of a haptophyte E. huxleyi, chlorarachniophyte B. natans, and cryptophyte G. theta fill the gaps in the eukaryotic tree of life and carry unique genes and pathways as well as molecular fossils of secondary endosymbiosis. Natural adaptation to conditions critical for industrial production is encoded in algal genomes, for example, growth of A. anophagefferens at very high cell densities during harmful algal blooms or the global distribution across diverse environments of E. huxleyi, which is able to live on sparse nutrients due to its expanded pan-genome. Communications and signaling pathways can be derived from simple symbiotic systems like lichens or complex marine algae metagenomes. Collectively these datasets derived from algal genomics contribute to building a comprehensive parts list essential for algal biofuel development.
Multiple Use One-Sided Hypotheses Testing in Univariate Linear Calibration
NASA Technical Reports Server (NTRS)
Krishnamoorthy, K.; Kulkarni, Pandurang M.; Mathew, Thomas
1996-01-01
Consider a normally distributed response variable, related to an explanatory variable through the simple linear regression model. Data obtained on the response variable, corresponding to known values of the explanatory variable (i.e., calibration data), are to be used for testing hypotheses concerning unknown values of the explanatory variable. We consider the problem of testing an unlimited sequence of one sided hypotheses concerning the explanatory variable, using the corresponding sequence of values of the response variable and the same set of calibration data. This is the situation of multiple use of the calibration data. The tests derived in this context are characterized by two types of uncertainties: one uncertainty associated with the sequence of values of the response variable, and a second uncertainty associated with the calibration data. We derive tests based on a condition that incorporates both of these uncertainties. The solution has practical applications in the decision limit problem. We illustrate our results using an example dealing with the estimation of blood alcohol concentration based on breath estimates of the alcohol concentration. In the example, the problem is to test if the unknown blood alcohol concentration of an individual exceeds a threshold that is safe for driving.
Genome Wide Characterization of Simple Sequence Repeats in Cucumber
USDA-ARS's Scientific Manuscript database
The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
Fatty Acid Profile and Unigene-Derived Simple Sequence Repeat Markers in Tung Tree (Vernicia fordii)
Zhang, Lin; Jia, Baoguang; Tan, Xiaofeng; Thammina, Chandra S.; Long, Hongxu; Liu, Min; Wen, Shanna; Song, Xianliang; Cao, Heping
2014-01-01
Tung tree (Vernicia fordii) provides the sole source of tung oil widely used in industry. The lack of fatty acid composition data and molecular markers hinders biochemical, genetic and breeding research. The objectives of this study were to determine fatty acid profiles and develop unigene-derived simple sequence repeat (SSR) markers in tung tree. Fatty acid profiles of 41 accessions showed that the proportion of α-eleostearic acid increased continuously, in parallel with tung oil accumulation, while the proportions of other fatty acids decreased across seed developmental stages, and that α-eleostearic acid (18:3) constituted 77% of the total fatty acids in tung oil. Transcriptome sequencing identified 81,805 unigenes from a tung cDNA library constructed using seed mRNA and discovered 6,366 SSRs in 5,404 unigenes. The di- and tri-nucleotide microsatellites accounted for 92% of the SSRs, with AG/CT and AAG/CTT being the most abundant SSR motifs. Fifteen polymorphic genic-SSR markers were developed from 98 unigene loci tested in 41 cultivated tung accessions by agarose gel and capillary electrophoresis. A GenBank database search identified 10 of them as putatively coding for functional proteins. Quantitative PCR demonstrated that all 15 polymorphic SSR-associated unigenes were expressed in tung seeds and some of them were highly correlated with oil composition in the seeds. A dendrogram revealed that most of the 41 accessions were clustered according to geographic region. These new polymorphic genic-SSR markers will facilitate future studies on genetic diversity, molecular fingerprinting, comparative genomics and genetic mapping in tung tree. The lipid profiles in the seeds of 41 tung accessions will be valuable for biochemical and breeding studies. PMID:25167054
Autonomous manipulation on a robot: Summary of manipulator software functions
NASA Technical Reports Server (NTRS)
Lewis, R. A.
1974-01-01
A six degree-of-freedom computer-controlled manipulator is examined, and the relationships between the arm's joint variables and 3-space are derived. Arm trajectories using sequences of third-degree polynomials to describe the time history of each joint variable are presented and two approaches to the avoidance of obstacles are given. The equations of motion for the arm are derived and then decomposed into time-dependent factors and time-independent coefficients. Several new and simplifying relationships among the coefficients are proven. Two sample trajectories are analyzed in detail for purposes of determining the most important contributions to total force in order that relatively simple approximations to the equations of motion can be used.
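A third-degree polynomial segment for one joint variable is fixed by four boundary conditions. The sketch below assumes these are the joint position and velocity at the start and end of the segment, which is a common choice but is not stated in the abstract; the function names are illustrative. Chaining segments so that the end velocity of one equals the start velocity of the next gives a sequence of cubics of the kind the report describes.

```python
def cubic_coefficients(q0, q1, v0, v1, duration):
    """Coefficients of q(t) = a0 + a1*t + a2*t**2 + a3*t**3 on [0, duration].

    Boundary conditions (an assumption of this sketch):
    q(0) = q0, q'(0) = v0, q(duration) = q1, q'(duration) = v1.
    """
    d = q1 - q0
    T = float(duration)
    a0 = q0
    a1 = v0
    a2 = (3.0 * d - (2.0 * v0 + v1) * T) / T ** 2
    a3 = (-2.0 * d + (v0 + v1) * T) / T ** 3
    return a0, a1, a2, a3


def evaluate(coeffs, t):
    """Return (position, velocity) of the joint variable at time t."""
    a0, a1, a2, a3 = coeffs
    position = a0 + a1 * t + a2 * t ** 2 + a3 * t ** 3
    velocity = a1 + 2.0 * a2 * t + 3.0 * a3 * t ** 2
    return position, velocity


if __name__ == "__main__":
    # Rest-to-rest move of one joint from 0 to 1 rad in 2 s.
    coeffs = cubic_coefficients(0.0, 1.0, 0.0, 0.0, 2.0)
    for t in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(t, evaluate(coeffs, t))
```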
The correlation structure of several popular pseudorandom number generators
NASA Technical Reports Server (NTRS)
Neuman, F.; Merrick, R.; Martin, C. F.
1973-01-01
One of the desirable properties of a pseudorandom number generator is that the sequence of numbers it generates should have very low autocorrelation for all shifts except for zero shift and those that are multiples of its cycle length. Due to the simple methods of constructing random numbers, the ideal is often not quite fulfilled. A simple method of examining any random generator for previously unsuspected regularities is discussed. Once they are discovered, it is often easy to derive the mathematical relationships which describe the regular behavior. As examples, it is shown that high correlation exists in mixed and multiplicative congruential random number generators and prime moduli Lehmer generators for shifts a fraction of their cycle lengths.
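The effect can be reproduced with a very small prime-modulus Lehmer generator. The sketch below is an illustration and not the method of the report: the parameters a = 3 and m = 31 are chosen here because 3 is a primitive root modulo 31, so the cycle has length 30. Since 3^15 is congruent to -1 modulo 31, every value x is followed 15 steps later by 31 - x, and the autocorrelation at a shift of half the cycle length is -1.

```python
def lehmer_cycle(a, m, seed=1):
    """One full cycle of x -> (a * x) mod m, starting from seed."""
    values, x = [], seed
    while True:
        values.append(x)
        x = (a * x) % m
        if x == seed:
            return values


def circular_autocorrelation(values, lag):
    """Sample autocorrelation at the given lag, wrapping around the cycle."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    cov = sum((values[i] - mean) * (values[(i + lag) % n] - mean)
              for i in range(n)) / n
    return cov / var


if __name__ == "__main__":
    cycle = lehmer_cycle(3, 31)
    for lag in (1, 5, 10, 15):
        print(lag, round(circular_autocorrelation(cycle, lag), 4))
```

The same check applied to generators with realistic moduli would require sampling rather than enumerating the whole cycle.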
Five- and six-membered ring opening of pyroglutamic diketopiperazine.
Parrish, Dennis A; Mathias, Lon J
2002-03-22
A variety of ring-opening reactions of pyroglutamic diketopiperazine at both the five-membered and six-membered rings is described. Mild, basic conditions facilitate nucleophilic attack by amines at the diketopiperazine carbonyls, giving pyroglutamides in excellent yield. Reaction with nucleophiles under acidic conditions gives bis-glutamate derivatives of 2,5-diketopiperazine (DKP). These reactions provide simple, two-step sequences to pyroglutamides and symmetrical diketopiperazines from commercial pyroglutamic acid, with control of product dictated by reaction conditions, catalyst, and nucleophile.
One-Pot Isomerization–Cross Metathesis–Reduction (ICMR) Synthesis of Lipophilic Tetrapeptides
2015-01-01
An efficient, versatile and rapid method toward homologue series of lipophilic tetrapeptide derivatives (herein, the opioid peptides H-TIPP-OH and H-DIPP-OH) is reported. High atom economy and a minimal number of synthetic steps resulted from a one-pot tandem isomerization-cross metathesis-reduction sequence (ICMR), applicable both in solution and solid phase methodology. The broadly applicable synthesis proceeds with short reaction times and simple work-up, as illustrated in this work for alkylated opioid tetrapeptides. PMID:24906051
Tori, Motoo
2016-01-01
A chemical analysis of 30 samples of Ligularia virgaurea (Asteraceae) collected in Sichuan province and its adjacent territories in China was reviewed. These samples afforded 146 compounds, 73 of which were novel, and the chemical constituents were classified into 8 categories: (1) simple eremophilanes (without ring C) and eudesmanes including nor-derivatives, (2) furanoeremophilanes and lactones with a 1(10)-saturated bond, (3) furanoeremophilanes and lactones with a 1(10)-unsaturated bond, 1,10-epoxide, or 10-ol, (4) furanoeremophilanes and lactones with 1(10)-en-2-one, 1(10)-en-2-ol, or 1-en-3-one, (5) furanoeremophilanes and lactones with 1(10)-en-9-one, 1(10)-en-9-ol, or 1,10-epoxy-9-one, (6) cacalol and their derivatives, (7) bakkanes and their derivatives, and (8) others, as shown in Tables 1-7. In these studies, five chemotypes were identified in addition to three clades from the DNA sequences of L. virgaurea. The structural determination of some compounds was also discussed and a comment on how to express the real structure was proposed, particularly for spiro compounds.
Multilayer material characterization using thermographic signal reconstruction
NASA Astrophysics Data System (ADS)
Shepard, Steven M.; Beemer, Maria Frendberg
2016-02-01
Active-thermography has become a well-established Nondestructive Testing (NDT) method for detection of subsurface flaws. In its simplest form, flaw detection is based on visual identification of contrast between a flaw and local intact regions in an IR image sequence of the surface temperature as the sample responds to thermal stimulation. However, additional information and insight can be obtained from the sequence, even in the absence of a flaw, through analysis of the logarithmic derivatives of individual pixel time histories using the Thermographic Signal Reconstruction (TSR) method. For example, the response of a flaw-free multilayer sample to thermal stimulation can be viewed as a simple transition between the responses of infinitely thick samples of the individual constituent layers over the lifetime of the thermal diffusion process. The transition is represented compactly and uniquely by the logarithmic derivatives, based on the ratio of thermal effusivities of the layers. A spectrum of derivative responses relative to thermal effusivity ratios allows prediction of the time scale and detectability of the interface, and measurement of the thermophysical properties of one layer if the properties of the other are known. A similar transition between steady diffusion states occurs for flat bottom holes, based on the hole aspect ratio.
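The logarithmic derivative at the heart of this analysis can be illustrated with finite differences in the log-log domain. This is a simplification: TSR as usually described fits a low-order polynomial to ln(T) versus ln(t) for each pixel and differentiates the fit, whereas the sketch below differentiates the data directly. The function names are illustrative. For an ideal semi-infinite sample after an instantaneous heat pulse, the surface temperature falls as t^(-1/2), so the first logarithmic derivative is -0.5; departures from that value over time are what signal a layer interface.

```python
import math


def log_derivative(times, temps):
    """d ln(T) / d ln(t) at interior samples, by central differences."""
    log_t = [math.log(t) for t in times]
    log_temp = [math.log(v) for v in temps]
    return [(log_temp[i + 1] - log_temp[i - 1]) / (log_t[i + 1] - log_t[i - 1])
            for i in range(1, len(times) - 1)]


def semi_infinite_response(t, energy=1.0, effusivity=1.0):
    """Surface temperature rise of a semi-infinite solid after a flash pulse."""
    return energy / (effusivity * math.sqrt(math.pi * t))


if __name__ == "__main__":
    times = [0.01 * 1.2 ** k for k in range(40)]
    temps = [semi_infinite_response(t) for t in times]
    print(log_derivative(times, temps)[:3])
```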
Li, Fagen; Zhou, Changpin; Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming
2015-01-01
Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling at 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalizaiton was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.
Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming
2015-01-01
Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of the E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430
Wang, Xinwang; Wadl, Phillip A; Wood-Jones, Alicia; Windham, Gary; Trigiano, Robert N; Scruggs, Mary; Pilgrim, Candace; Baird, Richard
2012-12-01
Simple sequence repeat (SSR) markers were developed from the Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers were designed from 362 tri-nucleotide EST-SSR sequences. Eighteen polymorphic loci were used to genotype 96 Aspergillus species isolates. The number of alleles detected per locus ranged from 2 to 24 with a mean of 8.2 alleles. Haploid diversity ranged from 0.28 to 0.91. A genetic distance matrix was used to perform principal coordinates analysis (PCA) and to generate dendrograms using the unweighted pair group method with arithmetic mean (UPGMA). Two principal coordinates explained more than 75% of the total variation among the isolates. One clade was identified for A. flavus isolates (n = 87) with the other Aspergillus species (n = 7) using PCA, but five distinct clusters were present when the other taxa were excluded from the analysis. Six groups were noted when the EST-SSR data were compared using UPGMA. However, the latter PCA or UPGMA comparison resulted in no direct associations with host species, geographical region or aflatoxin production. Furthermore, there was no direct correlation to visible morphological features such as sclerotial types. The isolates from the Mississippi Delta region, which contained the largest percentage of isolates, did not show any unusual clustering except for isolates K32, K55, and 199. Further studies of these three isolates are warranted to evaluate their pathogenicity, aflatoxin production potential, additional gene sequences (e.g., RPB2), and morphological comparisons.
A genetic linkage map of grape, utilizing Vitis rupestris and Vitis arizonica.
Doucleff, M; Jin, Y; Gao, F; Riaz, S; Krivanek, A F; Walker, M A
2004-10-01
A genetic linkage map of grape was constructed, utilizing 116 progeny derived from a cross of two Vitis rupestris x V. arizonica interspecific hybrids, using the pseudo-testcross strategy. A total of 475 DNA markers (410 amplified fragment length polymorphism, 24 inter-simple sequence repeat, 32 random amplified polymorphic DNA, and nine simple sequence repeat markers) were used to construct the parental maps. Markers segregating 1:1 were used to construct parental framework maps with confidence levels >90% with the Plant Genome Research Initiative mapping program. In the maternal (D8909-15) map, 105 framework markers and 55 accessory markers were ordered in 17 linkage groups (756 cM). The paternal (F8909-17) map had 111 framework markers and 33 accessory markers ordered in 19 linkage groups (1,082 cM). One hundred eighty-one markers segregating 3:1 were used to connect the two parental maps. This moderately dense map will be useful for the initial mapping of genes and/or QTL for resistance to the dagger nematode, Xiphinema index, and Xylella fastidiosa, the bacterial causal agent of Pierce's disease.
Protein Structure Prediction by Protein Threading
NASA Astrophysics Data System (ADS)
Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong
The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.
Hi-Plex for Simple, Accurate, and Cost-Effective Amplicon-based Targeted DNA Sequencing.
Pope, Bernard J; Hammet, Fleur; Nguyen-Dumont, Tu; Park, Daniel J
2018-01-01
Hi-Plex is a suite of methods to enable simple, accurate, and cost-effective highly multiplex PCR-based targeted sequencing (Nguyen-Dumont et al., Biotechniques 58:33-36, 2015). At its core is the principle of using gene-specific primers (GSPs) to "seed" (or target) the reaction and universal primers to "drive" the majority of the reaction. In this manner, effects on amplification efficiencies across the target amplicons can, to a large extent, be restricted to early seeding cycles. Product sizes are defined within a relatively narrow range to enable high-specificity size selection, replication uniformity across target sites (including in the context of fragmented input DNA such as that derived from fixed tumor specimens (Nguyen-Dumont et al., Biotechniques 55:69-74, 2013; Nguyen-Dumont et al., Anal Biochem 470:48-51, 2015)), and application of high-specificity genetic variant calling algorithms (Pope et al., Source Code Biol Med 9:3, 2014; Park et al., BMC Bioinformatics 17:165, 2016). Hi-Plex offers a streamlined workflow that is suitable for testing large numbers of specimens without the need for automation.
BiDiBlast: comparative genomics pipeline for the PC.
de Almeida, João M G C F
2010-06-01
Bi-directional BLAST is a simple approach to detect, annotate, and analyze candidate orthologous or paralogous sequences in a single go. This procedure is usually confined to the realm of customized Perl scripts, usually tuned for UNIX-like environments. Porting those scripts to other operating systems involves refactoring them, and also the installation of the Perl programming environment with the required libraries. To overcome these limitations, a data pipeline was implemented in Java. This application submits two batches of sequences to local versions of the NCBI BLAST tool, manages result lists, and refines both bi-directional and simple hits. GO Slim terms are attached to hits, several statistics are derived, and molecular evolution rates are estimated through PAML. The results are written to a set of delimited text tables intended for further analysis. The provided graphic user interface allows a friendly interaction with this application, which is documented and available to download at http://moodle.fct.unl.pt/course/view.php?id=2079 or https://sourceforge.net/projects/bidiblast/ under the GNU GPL license. Copyright 2010 Beijing Genomics Institute. Published by Elsevier Ltd. All rights reserved.
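The bi-directional step described above can be sketched as a reciprocal-best-hit filter. This is a minimal illustration in Python, not the Java pipeline itself; it assumes the two BLAST runs have already been parsed into (query, subject, bit score) tuples, and all identifiers and scores below are made up.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query_id, subject_id, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)

# Hypothetical parsed results for batch A searched against B, and B against A.
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0),
          ("a2", "b2", 180.0), ("a3", "b2", 60.0)]
b_vs_a = [("b1", "a1", 245.0), ("b2", "a2", 175.0), ("b2", "a3", 58.0)]
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

Here a3 finds b2 as its best hit, but b2 prefers a2, so a3 remains a one-directional ("simple") hit rather than a bi-directional one.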
Genotype imputation in a coalescent model with infinitely-many-sites mutation
Huang, Lucy; Buzbas, Erkan O.; Rosenberg, Noah A.
2012-01-01
Empirical studies have identified population-genetic factors as important determinants of the properties of genotype-imputation accuracy in imputation-based disease association studies. Here, we develop a simple coalescent model of three sequences that we use to explore the theoretical basis for the influence of these factors on genotype-imputation accuracy, under the assumption of infinitely-many-sites mutation. Employing a demographic model in which two populations diverged at a given time in the past, we derive the approximate expectation and variance of imputation accuracy in a study sequence sampled from one of the two populations, choosing between two reference sequences, one sampled from the same population as the study sequence and the other sampled from the other population. We show that under this model, imputation accuracy—as measured by the proportion of polymorphic sites that are imputed correctly in the study sequence—increases in expectation with the mutation rate, the proportion of the markers in a chromosomal region that are genotyped, and the time to divergence between the study and reference populations. Each of these effects derives largely from an increase in information available for determining the reference sequence that is genetically most similar to the sequence targeted for imputation. We analyze as a function of divergence time the expected gain in imputation accuracy in the target using a reference sequence from the same population as the target rather than from the other population. Together with a growing body of empirical investigations of genotype imputation in diverse human populations, our modeling framework lays a foundation for extending imputation techniques to novel populations that have not yet been extensively examined. PMID:23079542
Tar'an, B; Warkentin, T D; Tullu, A; Vandenberg, A
2007-01-01
Ascochyta blight, caused by the fungus Ascochyta rabiei (Pass.) Lab., is one of the most devastating diseases of chickpea (Cicer arietinum L.) worldwide. Research was conducted to map genetic factors for resistance to ascochyta blight using a linkage map constructed with 144 simple sequence repeat markers and 1 morphological marker (fc, flower colour). Stem cutting was used to vegetatively propagate 186 F2 plants derived from a cross between Cicer arietinum L. 'ICCV96029' and 'CDC Frontier'. A total of 556 cutting-derived plants were evaluated for their reaction to ascochyta blight under controlled conditions. Disease reaction of the F1 and F2 plants demonstrated that the resistance was dominantly inherited. A Fain's test based on the means and variances of the ascochyta blight reaction of the F3 families showed that a few genes were segregating in the population. Composite interval mapping identified 3 genomic regions that were associated with the reaction to ascochyta blight. One quantitative trait locus (QTL) on each of LG3, LG4, and LG6 accounted for 13%, 29%, and 12%, respectively, of the total estimated phenotypic variation for the reaction to ascochyta blight. Together, these loci controlled 56% of the total estimated phenotypic variation. The QTL on LG4 and LG6 were in common with the previously reported QTL for ascochyta blight resistance, whereas the QTL on LG3 was unique to the current population.
Xie, Guosen; Mo, Zhongxi
2011-01-21
In this article, we introduce three 3D graphical representations of DNA primary sequences, which we call RY-curve, MK-curve and SW-curve, based on three classifications of the DNA bases. The advantages of our representations are that (i) these 3D curves are strictly non-degenerate and there is no loss of information when transferring a DNA sequence to its mathematical representation and (ii) the coordinates of every node on these 3D curves have clear biological implication. Two applications of these 3D curves are presented: (a) a simple formula is derived to calculate the content of the four bases (A, G, C and T) from the coordinates of nodes on the curves; and (b) a 12-component characteristic vector is constructed to compare similarity among DNA sequences from different species based on the geometrical centers of the 3D curves. As examples, we examine similarity among the coding sequences of the first exon of beta-globin gene from eleven species and validate similarity of cDNA sequences of beta-globin gene from eight species. Copyright © 2010 Elsevier Ltd. All rights reserved.
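The abstract does not give the curve definitions, so the following is only a sketch of the general idea under stated assumptions: each base takes a +1 or -1 step along three axes according to the purine/pyrimidine (R/Y), amino/keto (M/K), and weak/strong (W/S) classifications, in the style of a Z-curve. It is not the authors' RY-, MK-, or SW-curve. Under that assumption, the base content of any prefix follows from the node coordinates by simple arithmetic.

```python
PURINES = set("AG")  # R; the pyrimidines (Y) are C and T
AMINO = set("AC")    # M; the keto bases (K) are G and T
WEAK = set("AT")     # W; the strong bases (S) are G and C

def walk(sequence):
    """Return one (x, y, z) node per base; input must contain only A, C, G, T."""
    x = y = z = 0
    nodes = []
    for base in sequence.upper():
        x += 1 if base in PURINES else -1
        y += 1 if base in AMINO else -1
        z += 1 if base in WEAK else -1
        nodes.append((x, y, z))
    return nodes

def base_counts(node, n):
    """Recover base counts of the first n bases from the n-th node."""
    x, y, z = node
    return {
        "A": (n + x + y + z) // 4,
        "G": (n + x - y - z) // 4,
        "C": (n - x + y - z) // 4,
        "T": (n - x - y + z) // 4,
    }

seq = "ATGGTGCACCTGACTCCTGAG"  # arbitrary example string
nodes = walk(seq)
counts = base_counts(nodes[-1], len(seq))
```

Because the four counts are recoverable at every node, no information is lost in this particular mapping, which mirrors the non-degeneracy property the abstract claims for its own curves.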
Cavusoglu, M; Ciloglu, T; Serinagaoglu, Y; Kamasak, M; Erogul, O; Akcam, T
2008-08-01
In this paper, 'snore regularity' is studied in terms of the variations of snoring sound episode durations, separations and average powers in simple snorers and in obstructive sleep apnoea (OSA) patients. The goal was to explore the possibility of distinguishing among simple snorers and OSA patients using only sleep sound recordings of individuals and to ultimately eliminate the need for spending a whole night in the clinic for polysomnographic recording. Sequences that contain snoring episode durations (SED), snoring episode separations (SES) and average snoring episode powers (SEP) were constructed from snoring sound recordings of 30 individuals (18 simple snorers and 12 OSA patients) who were also under polysomnographic recording in Gülhane Military Medical Academy Sleep Studies Laboratory (GMMA-SSL), Ankara, Turkey. Snore regularity is quantified in terms of mean, standard deviation and coefficient of variation values for the SED, SES and SEP sequences. In all three of these sequences, OSA patients' data displayed a higher variation than those of simple snorers. To exclude the effects of slow variations in the base-line of these sequences, new sequences that contain the coefficient of variation of the sample values in a 'short' signal frame, i.e., short time coefficient of variation (STCV) sequences, were defined. The mean, the standard deviation and the coefficient of variation values calculated from the STCV sequences displayed a stronger potential to distinguish among simple snorers and OSA patients than those obtained from the SED, SES and SEP sequences themselves. Spider charts were used to jointly visualize the three parameters, i.e., the mean, the standard deviation and the coefficient of variation values of the SED, SES and SEP sequences, and the corresponding STCV sequences as two-dimensional plots. 
Our observations showed that the statistical parameters obtained from the SED and SES sequences, and the corresponding STCV sequences, possessed a strong potential to distinguish among simple snorers and OSA patients, both marginally, i.e., when the parameters are examined individually, and jointly. The parameters obtained from the SEP sequences and the corresponding STCV sequences, on the other hand, did not have a strong discrimination capability. However, the joint behaviour of these parameters showed some potential to distinguish among simple snorers and OSA patients.
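The regularity measures named above can be sketched as follows. The frame length, hop size, use of the population standard deviation, and the two example sequences are assumptions made for illustration; the abstract does not specify them.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form assumed)."""
    return statistics.pstdev(values) / statistics.mean(values)

def stcv(values, frame_len, hop=1):
    """Short-time coefficient of variation over sliding frames."""
    return [
        coefficient_of_variation(values[i:i + frame_len])
        for i in range(0, len(values) - frame_len + 1, hop)
    ]

# Made-up snoring episode durations in seconds, not patient data.
regular = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 1.1]
irregular = [1.0, 2.5, 0.4, 1.8, 0.6, 3.0, 0.5, 2.2]

regular_stcv = stcv(regular, frame_len=4)
irregular_stcv = stcv(irregular, frame_len=4)
```

Computing the coefficient of variation within short frames makes the measure insensitive to slow baseline drift, which is the motivation the abstract gives for the STCV sequences.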
2011-01-01
Background Sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches composed of non-polar residues. Simple, quantitative criteria are desirable for identifying transmembrane helices (TMs) that must be included into or should be excluded from start sequence segments in similarity searches aimed at finding distant homologues. Results We found that there are two types of TMs in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as membrane-anchoring device. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring such as intra-membrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information. Conclusion For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs (and a sufficient criterion for complex TMs) in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments. Reviewers This article was reviewed by Shamil Sunyaev, L. Aravind and Arcady Mushegian. PMID:22024092
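The two quantities the criterion is said to rest on, hydrophobicity and sequence complexity, can be illustrated with a rough sketch. It assumes the Kyte-Doolittle hydropathy scale and Shannon entropy as stand-ins; the abstract does not state which scale, complexity measure, or thresholds the authors use, and both example segments are invented.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydrophobicity(segment):
    """Average hydropathy over the residues of a segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)

def shannon_entropy(segment):
    """Residue-composition entropy in bits; low values mean low complexity."""
    counts = Counter(segment)
    n = len(segment)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

anchor_like = "LLLLLVLLLLALLLLLILLLL"      # invented leucine-rich segment
functional_like = "GSTLWYNAFGHVTCQMSPIAE"  # invented mixed-composition segment
```

In this toy comparison the leucine-rich segment scores as more hydrophobic and less complex, the combination the abstract associates with simple, anchor-only TMs.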
SA-Search: a web tool for protein structure mining based on a Structural Alphabet
Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre
2004-01-01
SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search. PMID:15215446
SA-Search: a web tool for protein structure mining based on a Structural Alphabet.
Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre
2004-07-01
SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.
The Evolution of the Observed Hubble Sequence over the past 6Gyr
NASA Astrophysics Data System (ADS)
Delgado-Serrano, R.; Hammer, F.; Yang, Y. B.; Puech, M.; Flores, H.; Rodrigues, M.
2011-10-01
During the past years we have confronted serious problems of methodology concerning the morphological and kinematic classification of distant galaxies. This has forced us to create a new simple and effective morphological classification methodology, in order to guarantee a morpho-kinematic correlation, make the reproducibility easier and restrict the classification subjectivity. Given the characteristics of our morphological classification, we have thus been able to apply the same methodology, using equivalent observations, to representative samples of local and distant galaxies. It has allowed us to derive, for the first time, the distant Hubble sequence (~6 Gyr ago), and determine a morphological evolution of galaxies over the past 6 Gyr. Our results strongly suggest that more than half of the present-day spirals had peculiar morphologies 6 Gyr ago.
Iehisa, Julio Cesar Masaru; Ohno, Ryoko; Kimura, Tatsuro; Enoki, Hiroyuki; Nishimura, Satoru; Okamoto, Yuki; Nasuda, Shuhei; Takumi, Shigeo
2014-01-01
The large genome and allohexaploidy of common wheat have complicated construction of a high-density genetic map. Although improvements in the throughput of next-generation sequencing (NGS) technologies have made it possible to obtain a large amount of genotyping data for an entire mapping population by direct sequencing, including hexaploid wheat, a significant number of missing data points are often apparent due to the low coverage of sequencing. In the present study, a microarray-based polymorphism detection system was developed using NGS data obtained from complexity-reduced genomic DNA of two common wheat cultivars, Chinese Spring (CS) and Mironovskaya 808. After design and selection of polymorphic probes, 13,056 new markers were added to the linkage map of a recombinant inbred mapping population between CS and Mironovskaya 808. On average, 2.49 missing data points per marker were observed in the 201 recombinant inbred lines, with a maximum of 42. Around 40% of the new markers were derived from genic regions and 11% from repetitive regions. The low number of retroelements indicated that the new polymorphic markers were mainly derived from the less repetitive region of the wheat genome. Around 25% of the mapped sequences were useful for alignment with the physical map of barley. Quantitative trait locus (QTL) analyses of 14 agronomically important traits related to flowering, spikes, and seeds demonstrated that the new high-density map showed improved QTL detection, resolution, and accuracy over the original simple sequence repeat map. PMID:24972598
Iehisa, Julio Cesar Masaru; Ohno, Ryoko; Kimura, Tatsuro; Enoki, Hiroyuki; Nishimura, Satoru; Okamoto, Yuki; Nasuda, Shuhei; Takumi, Shigeo
2014-10-01
The large genome and allohexaploidy of common wheat have complicated construction of a high-density genetic map. Although improvements in the throughput of next-generation sequencing (NGS) technologies have made it possible to obtain a large amount of genotyping data for an entire mapping population by direct sequencing, including hexaploid wheat, a significant number of missing data points are often apparent due to the low coverage of sequencing. In the present study, a microarray-based polymorphism detection system was developed using NGS data obtained from complexity-reduced genomic DNA of two common wheat cultivars, Chinese Spring (CS) and Mironovskaya 808. After design and selection of polymorphic probes, 13,056 new markers were added to the linkage map of a recombinant inbred mapping population between CS and Mironovskaya 808. On average, 2.49 missing data points per marker were observed in the 201 recombinant inbred lines, with a maximum of 42. Around 40% of the new markers were derived from genic regions and 11% from repetitive regions. The low number of retroelements indicated that the new polymorphic markers were mainly derived from the less repetitive region of the wheat genome. Around 25% of the mapped sequences were useful for alignment with the physical map of barley. Quantitative trait locus (QTL) analyses of 14 agronomically important traits related to flowering, spikes, and seeds demonstrated that the new high-density map showed improved QTL detection, resolution, and accuracy over the original simple sequence repeat map. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
PCR-based approach to SINE isolation: simple and complex SINEs.
Borodulina, Olga R; Kramerov, Dmitri A
2005-04-11
Highly repeated copies of short interspersed elements (SINEs) occur in eukaryotic genomes. The distribution of each SINE family is usually restricted to some genera, families, or orders. SINEs have an RNA polymerase III internal promoter, which is composed of boxes A and B. Here we propose a method for isolation of novel SINE families based on genomic DNA PCR with oligonucleotide identical to box A as a primer. Cloning of the size-heterogeneous PCR-products and sequencing of their terminal regions allow determination of SINE structure. Using this approach, two novel SINE families, Rhin-1 and Das-1, from the genomes of great horseshoe bat (Rhinolophus ferrumequinum) and nine-banded armadillo (Dasypus novemcinctus), respectively, were isolated and studied. The distribution of Rhin-1 is restricted to two of six bat families tested. Copies of this SINE are characterized by frequent internal insertions and significant length (200-270 bp). Das-1 being only 90 bp in length is one of the shortest SINEs known. Most of Das-1 nucleotide sequences demonstrate significant similarity to alanine tRNA which appears to be an evolutionary progenitor of this SINE. Together with three other known SINEs (ID, Vic-1, and CYN), Das-1 constitutes a group of simple SINEs. Interestingly, three SINE families of this group are alanine tRNA-derived. Most probably, this tRNA gave rise to short and simple but successful SINEs several times during mammalian evolution.
Liu, Wangta; Shiue, Yow-Ling; Lin, Yi-Reng; Lin, Hugo You-Hsien; Liang, Shih-Shin
2015-01-01
In this study, we demonstrated an oxidative method with free radical to generate 3,5,4′-trihydroxy-trans-stilbene (trans-resveratrol) metabolites and detect them sequentially by an autosampler coupled with a liquid chromatography electrospray ionization tandem mass spectrometer (LC-ESI–MS/MS). In this oxidative method, the free radical initiator, ammonium persulfate (APS), was placed in a sample bottle containing resveratrol to produce oxidative derivatives, and the reaction progress was tracked by autosampler sequencing. Resveratrol, a natural product with purported cancer preventative qualities, produces metabolites including dihydroresveratrol, 3,4′-dihydroxy-trans-stilbene, lunularin, resveratrol monosulfate, and dihydroresveratrol monosulfate by free radical oxidation. Using APS free radical, the concentrations of resveratrol derivatives differ as a function of time. Besides being simple, convenient, and time- and labor-saving, the free radical oxidative method, with its in situ generation of oxidative derivatives followed by LC-ESI–MS/MS, can be utilized to evaluate different metabolites in various conditions. PMID:27594817
Liu, Wangta; Shiue, Yow-Ling; Lin, Yi-Reng; Lin, Hugo You-Hsien; Liang, Shih-Shin
2015-10-01
In this study, we demonstrated an oxidative method with free radical to generate 3,5,4'-trihydroxy-trans-stilbene (trans-resveratrol) metabolites and detect them sequentially by an autosampler coupled with a liquid chromatography electrospray ionization tandem mass spectrometer (LC-ESI-MS/MS). In this oxidative method, the free radical initiator, ammonium persulfate (APS), was placed in a sample bottle containing resveratrol to produce oxidative derivatives, and the reaction progress was tracked by autosampler sequencing. Resveratrol, a natural product with purported cancer preventative qualities, produces metabolites including dihydroresveratrol, 3,4'-dihydroxy-trans-stilbene, lunularin, resveratrol monosulfate, and dihydroresveratrol monosulfate by free radical oxidation. Using APS free radical, the concentrations of resveratrol derivatives differ as a function of time. Besides being simple, convenient, and time- and labor-saving, the free radical oxidative method, with its in situ generation of oxidative derivatives followed by LC-ESI-MS/MS, can be utilized to evaluate different metabolites in various conditions.
A fast algorithm for computer aided collimation gamma camera (CACAO)
NASA Astrophysics Data System (ADS)
Jeanguillaume, C.; Begot, S.; Quartuccio, M.; Douiri, A.; Franck, D.; Pihet, P.; Ballongue, P.
2000-08-01
The computer aided collimation gamma camera is aimed at breaking down the resolution-sensitivity trade-off of the conventional parallel hole collimator. It uses larger and longer holes, having an added linear movement at the acquisition sequence. A dedicated algorithm including shift and sum, deconvolution, parabolic filtering and rotation is described. Examples of reconstruction are given. This work shows that a simple and fast algorithm, based on a diagonally dominant approximation of the problem, can be derived. It gives a practical solution to the CACAO reconstruction problem.
Development of Scoring Functions for Antibody Sequence Assessment and Optimization
Seeliger, Daniel
2013-01-01
Antibody development is still associated with substantial risks and difficulties as single mutations can radically change molecule properties like thermodynamic stability, solubility or viscosity. Since antibody generation methodologies cannot select and optimize for molecule properties which are important for biotechnological applications, careful sequence analysis and optimization is necessary to develop antibodies that fulfil the ambitious requirements of future drugs. While efforts to grasp the physical principles of undesired molecule properties from the very bottom are becoming increasingly powerful, the wealth of publicly available antibody sequences provides an alternative way to develop early assessment strategies for antibodies using a statistical approach, which is the objective of this paper. Here, publicly available sequences were used to develop heuristic potentials for the framework regions of heavy and light chains of antibodies of human and murine origin. The potentials take into account position dependent probabilities of individual amino acids but also conditional probabilities which are inevitable for sequence assessment and optimization. It is shown that the potentials derived from human sequences clearly distinguish between human sequences and sequences from mice and, hence, can be used as a measure of humanness which compares a given sequence with the phenotypic pool of human sequences instead of comparing sequence identities to germline genes. Following this line, it is demonstrated that, using the developed potentials, humanization of an antibody can be described as a simple mathematical optimization problem and that the in-silico generated framework variants closely resemble native sequences in terms of predicted immunogenicity. PMID:24204701
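A position-dependent statistical potential of the kind described above can be sketched in a few lines. This is a simplified illustration, not the published scoring functions: it uses only per-position (marginal) amino acid frequencies with an assumed pseudocount, omits the conditional probabilities the abstract mentions, and the aligned reference fragments are invented.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def position_log_probs(aligned, pseudocount=1.0):
    """Per-position log-probabilities from equal-length aligned sequences."""
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    table = []
    for pos in range(len(aligned[0])):
        counts = Counter(seq[pos] for seq in aligned)
        table.append({
            aa: math.log((counts[aa] + pseudocount) / total)
            for aa in AMINO_ACIDS
        })
    return table

def score(sequence, table):
    """Sum of position-specific log-probabilities; higher means more typical."""
    return sum(table[pos][aa] for pos, aa in enumerate(sequence))

# Invented, gap-free framework fragments standing in for a reference pool.
reference = ["EVQLV", "QVQLV", "EVQLL", "EVQLV"]
table = position_log_probs(reference)
typical = score("EVQLV", table)
atypical = score("DIKWP", table)
```

With such a score in hand, choosing substitutions that raise it is one way to read the abstract's framing of humanization as an optimization problem.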
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.
Gupta, P D
2016-10-01
In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time-consuming and cumbersome, and hence more expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. At the beginning of this century efforts were made, and nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the futuristic technology.
Song, Xuhao; Shen, Fujun; Huang, Jie; Huang, Yan; Du, Lianming; Wang, Chengdong; Fan, Zhenxin; Hou, Rong; Yue, Bisong; Zhang, Xiuyue
2016-09-01
Recently, an increasing number of microsatellites or simple sequence repeats (SSRs) have been found and characterized from transcriptomes. Such SSRs can be employed as putative functional markers to easily tag corresponding genes, which play an important role in biomedical studies and genetic analysis. However, the transcriptome-derived SSRs for giant panda (Ailuropoda melanoleuca) are not yet available. In this work, we identified and characterized 20 tetranucleotide microsatellite loci from a transcript database generated from the blood of giant panda. Furthermore, we assigned their predicted transcriptome locations: 16 loci were assigned to untranslated regions (UTRs) and 4 loci were assigned to coding regions (CDSs). Gene identities of 14 transcripts containing corresponding microsatellites were determined, which provide useful information to study the potential contribution of SSRs to gene regulation in giant panda. The polymorphic information content (PIC) values ranged from 0.293 to 0.789 with an average of 0.603 for the 16 UTR-derived SSRs. Interestingly, 4 CDS-derived microsatellites developed in our study were also polymorphic, and the instability of these 4 CDS-derived SSRs was further validated by re-genotyping and sequencing. The genes containing these 4 CDS-derived SSRs were embedded with various types of repeat motifs. The interaction of all the length-changing SSRs might provide a way against coding region frameshift caused by microsatellite instability. We hope these new gene-associated biomarkers will pave the way for genetic and biomedical studies for giant panda in the future. In sum, this set of transcriptome-derived markers complements the genetic resources available for giant panda. © The American Genetic Association. 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
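For readers unfamiliar with PIC, a minimal sketch follows. It assumes the commonly used definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are allele frequencies at a locus; the abstract does not state which formula was used, and the example frequencies are made up.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphic information content from allele frequencies summing to 1."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term

two_equal_alleles = pic([0.5, 0.5])          # 1 - 0.5 - 0.125 = 0.375
four_alleles = pic([0.4, 0.3, 0.2, 0.1])     # more alleles, more information
monomorphic = pic([1.0])                     # a single allele carries none
```

Under this definition, values above roughly 0.5 are conventionally read as highly informative, which puts the reported average of 0.603 in context.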
Two EST-derived marker systems for cultivar identification in tree peony.
Zhang, J J; Shu, Q Y; Liu, Z A; Ren, H X; Wang, L S; De Keyser, E
2012-02-01
Tree peony (Paeonia suffruticosa Andrews), a woody deciduous shrub, belongs to the section Moutan DC. in the genus of Paeonia of the Paeoniaceae family. To increase the efficiency of breeding, two EST-derived marker systems were developed based on a tree peony expressed sequence tag (EST) database. Using target region amplification polymorphism (TRAP), 19 of 39 primer pairs showed good amplification for 56 accessions with amplicons ranging from 120 to 3,000 bp long, among which 99.3% were polymorphic. In contrast, 7 of 21 primer pairs demonstrated adequate amplification with clear bands for simple sequence repeats (SSRs) developed from ESTs, and a total of 33 alleles were found in 56 accessions. The similarity matrices generated by TRAP and EST-SSR markers were compared, and the Mantel test (r = 0.57778, P = 0.0020) showed a moderate correlation between the two types of molecular markers. TRAP markers were suitable for DNA fingerprinting and EST-SSR markers were more appropriate for discriminating synonyms (the same cultivars with different names due to limited information exchanged among different geographic areas). The two sets of EST-derived markers will be used further for genetic linkage map construction and quantitative trait locus detection in tree peony.
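The Mantel test mentioned above compares two similarity matrices by correlating their off-diagonal entries and assessing significance by permuting the accession labels of one matrix. The sketch below is a minimal one-sided permutation version in plain Python; it is not the software used in the study, and the two 4 x 4 matrices are made-up values for illustration.

```python
import math
import random


def upper_triangle(matrix, order):
    """Flatten the entries above the diagonal, reading rows/columns in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel(a, b, permutations=999, seed=0):
    """Return (r, p) where p is the one-sided permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    x = upper_triangle(a, identity)
    r_obs = pearson(x, upper_triangle(b, identity))
    at_least_as_large = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the accessions of matrix b
        if pearson(x, upper_triangle(b, order)) >= r_obs:
            at_least_as_large += 1
    return r_obs, at_least_as_large / (permutations + 1)


# Hypothetical similarity matrices for four accessions.
sim_trap = [[1.0, 0.8, 0.3, 0.2], [0.8, 1.0, 0.4, 0.3],
            [0.3, 0.4, 1.0, 0.7], [0.2, 0.3, 0.7, 1.0]]
sim_ssr = [[1.0, 0.7, 0.4, 0.1], [0.7, 1.0, 0.5, 0.2],
           [0.4, 0.5, 1.0, 0.6], [0.1, 0.2, 0.6, 1.0]]
r, p = mantel(sim_trap, sim_ssr)
print(round(r, 3), p)
```

With only four accessions there are just 24 distinct labelings, so the p-value here is coarse; a data set of 56 accessions, as in the study, gives the permutation test far more resolution.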
NASA Astrophysics Data System (ADS)
Li, Qi; Shu, Jing; Zhao, Cui; Liu, Shikai; Kong, Lingfeng; Zheng, Xiaodong
2010-01-01
Simple sequence repeat (SSR) markers were developed from the expressed sequence tags (ESTs) of Pacific abalone (Haliotis discus hannai). Repeat motifs were found in 4.95% of the ESTs at a frequency of one repeat every 10.04 kb of EST sequences, after redundancy elimination. Seventeen polymorphic EST-SSRs were developed. The number of alleles per locus varied from 2 to 17, with an average of 6.8 alleles per locus. The expected and observed heterozygosities ranged from 0.159 to 0.928 and from 0.132 to 0.922, respectively. Twelve of the 17 loci (70.6%) were successfully amplified in H. diversicolor. Seventeen loci segregated in three families, with three showing the presence of null alleles (17.6%). The adequate level of variability and low frequency of null alleles observed in H. discus hannai, together with the high rate of transferability across Haliotis species, make this set of EST-SSR markers an important tool for comparative mapping, marker-assisted selection, and evolutionary studies, not only in the Pacific abalone, but also in related species.
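Mining ESTs for repeat motifs, as described above, is usually done with dedicated tools; the abstract does not name the one used. As a rough sketch of the idea, the regular expression below finds perfect repeats of 2 to 6 bp motifs occurring at least five times in tandem. The thresholds, the function names, and the test sequence are assumptions chosen for illustration.

```python
import re

# A 2-6 bp motif (shortest first) followed by at least four more copies of itself.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each perfect SSR in the sequence."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if is_primitive(motif):  # skip homopolymer runs reported as 'AA' etc.
            hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical EST fragment containing an (AT)6 and an (AGAT)5 repeat.
est = "GGC" + "AT" * 6 + "CCG" + "AGAT" * 5 + "TT"
for start, motif, count in find_ssrs(est):
    print(start, motif, count)
```

Dividing the total length of the screened sequences by the number of hits gives a density figure of the same kind as the "one repeat every 10.04 kb" reported in the abstract.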
USDA-ARS's Scientific Manuscript database
Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...
Helix-packing motifs in membrane proteins.
Walters, R F S; DeGrado, W F
2006-09-12
The fold of a helical membrane protein is largely determined by interactions between membrane-imbedded helices. To elucidate recurring helix-helix interaction motifs, we dissected the crystallographic structures of membrane proteins into a library of interacting helical pairs. The pairs were clustered according to their three-dimensional similarity (rmsd = 1.5 Å), allowing 90% of the library to be assigned to clusters consisting of at least five members. Surprisingly, three quarters of the helical pairs belong to one of five tightly clustered motifs whose structural features can be understood in terms of simple principles of helix-helix packing. Thus, the universe of common transmembrane helix-pairing motifs is relatively simple. The largest cluster, which comprises 29% of the library members, consists of an antiparallel motif with left-handed packing angles, and it is frequently stabilized by packing of small side chains occurring every seven residues in the sequence. Right-handed parallel and antiparallel structures show a similar tendency to segregate small residues to the helix-helix interface but spaced at four-residue intervals. Position-specific sequence propensities were derived for the most populated motifs. These structural and sequential motifs should be quite useful for the design and structural prediction of membrane proteins.
USDA-ARS's Scientific Manuscript database
Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...
Sequence and Analysis of the Tomato JOINTLESS Locus
Mao, Long; Begum, Dilara; Goff, Stephen A.; Wing, Rod A.
2001-01-01
A 119-kb bacterial artificial chromosome from the JOINTLESS locus on the tomato (Lycopersicon esculentum) chromosome 11 contained 15 putative genes. Repetitive sequences in this region include one copia-like LTR retrotransposon, 13 simple sequence repeats, three copies of a novel type III foldback transposon, and four putative short DNA repeats. Database searches showed that the foldback transposon and the short DNA repeats seemed to be associated preferably with genes. The predicted tomato genes were compared with the complete Arabidopsis genome. Eleven out of 15 tomato open reading frames were found to be colinear with segments on five Arabidopsis bacterial artificial chromosome/P1-derived artificial chromosome clones. The synteny patterns, however, did not reveal duplicated segments in Arabidopsis, where over half of the genome is duplicated. Our analysis indicated that the microsynteny between the tomato and Arabidopsis genomes was still conserved at a very small scale but was complicated by the large number of gene families in the Arabidopsis genome. PMID:11457984
Rapid screening method for male DNA by using the loop-mediated isothermal amplification assay.
Kitamura, Masashi; Kubo, Seiji; Tanaka, Jin; Adachi, Tatsushi
2017-08-12
Screening for male-derived biological material from collected samples plays an important role in criminal investigations, especially those involving sexual assaults. We have developed a loop-mediated isothermal amplification (LAMP) assay targeting multi-repeat sequences of the Y chromosome for detecting male DNA. Successful amplification occurred with 0.5 ng of male DNA under isothermal conditions of 61 to 67 °C, but no amplification occurred with up to 10 ng of female DNA. Under the optimized conditions, the LAMP reaction initiated amplification within 10 min and amplified for 20 min. The LAMP reaction was sensitive at levels as low as 1-pg male DNA, and a quantitative LAMP assay could be developed because of the strong correlation between the reaction time and the amount of template DNA in the range of 10 pg to 10 ng. Furthermore, to apply the LAMP assay to on-site screening for male-derived samples, we evaluated a protocol using a simple DNA extraction method and a colorimetric intercalating dye that allows detection of the LAMP reaction by evaluating the change in color of the solution. Using this protocol, samples of male-derived blood and saliva stains were processed in approximately 30 min from DNA extraction to detection. Because our protocol does not require much hands-on time or special equipment, this LAMP assay promises to become a rapid and simple screening method for male-derived samples in forensic investigations.
Wang, Chun Guo; Chen, Xiao Qiang; Li, Hui; Zhao, Qian Cheng; Sun, De Ling; Song, Wen Qin
2008-02-01
Analysis of ISSR (Inter-Simple Sequence Repeat) and DDRT-PCR (Differential Display Reverse Transcriptase Polymerase Chain Reaction) was performed between the cytoplasmic male sterility cauliflower line ogura-A and its corresponding maintainer line ogura-B. In total, 306 detectable bands were obtained by ISSR using thirty oligonucleotide primers. Commonly, six to twelve bands were produced per primer. Among all these primers, only the amplification of primer ISSR3 was polymorphic: an 1100 bp specific band, named ISSR3(1100), was detected only in the maintainer line. Analysis of this sequence indicated that ISSR3(1100) was highly homologous with the corresponding sequences of the mitochondrial genome in Brassica napus and Arabidopsis thaliana, which suggested that ISSR3(1100) may derive from the mitochondrial genome in cauliflower. To carry out DDRT-PCR analysis, three anchor primers and fifteen random primers were combined. In total, 1122 bands from 1000 bp to 50 bp were detected. However, only four bands, named ogura-A205, ogura-A383, ogura-B307 and ogura-B352, were confirmed to be differentially displayed between the two lines. This result was further confirmed by reverse Northern dot blotting analysis. Among these four bands, ogura-A205 and ogura-A383 were expressed only in the cytoplasmic male sterility line, while ogura-B307 and ogura-B352 were detected only in the maintainer line. Analysis of these sequences indicated that this was the first time that these four sequences were reported in cauliflower. Interestingly, ogura-A205 and ogura-B307 did not exhibit any similarity to reported sequences in other species; more investigation is required to obtain further information. ogura-A383 and ogura-B352 were also new sequences, and they showed high similarity to the corresponding chloroplast sequences of Arabidopsis thaliana and Brassica rapa subsp. pekinensis. We therefore speculate that these two sequences may derive from the chloroplast genome. All these results offer new and significant information for investigating the molecular mechanism of cytoplasmic male sterility and fertility maintenance in cauliflower.
USDA-ARS's Scientific Manuscript database
Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...
USDA-ARS's Scientific Manuscript database
Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...
NASA Astrophysics Data System (ADS)
Porta, Alberto; Marchi, Andrea; Bari, Vlasta; De Maria, Beatrice; Esler, Murray; Lambert, Elisabeth; Baumert, Mathias
2017-05-01
The study assesses the strength of the causal relation along baroreflex (BR) in humans during an incremental postural challenge soliciting the BR. Both cardiac BR (cBR) and sympathetic BR (sBR) were characterized via BR sequence approaches from spontaneous fluctuations of heart period (HP), systolic arterial pressure (SAP), diastolic arterial pressure (DAP) and muscle sympathetic nerve activity (MSNA). A model-based transfer entropy method was applied to quantify the strength of the coupling from SAP to HP and from DAP to MSNA. The confounding influences of respiration were accounted for. Twelve young healthy subjects (20-36 years, nine females) were sequentially tilted at 0°, 20°, 30° and 40°. We found that (i) the strength of the causal relation along the cBR increases with tilt table inclination, while that along the sBR is unrelated to it; (ii) the strength of the causal coupling is unrelated to the gain of the relation; (iii) transfer entropy indexes are significantly and positively associated with simplified causality indexes derived from BR sequence analysis. The study proves that causality indexes are complementary to traditional characterization of the BR and suggests that simple markers derived from BR sequence analysis might be fruitfully exploited to estimate causality along the BR. This article is part of the themed issue `Mathematical methods in medicine: neuroscience, cardiology and pathology'.
Hatta, Tomoko; Fujinaga, Yasunari; Kadoya, Masumi; Ueda, Hitoshi; Murayama, Hiroaki; Kurozumi, Masahiro; Ueda, Kazuhiko; Komatsu, Michiharu; Nagaya, Tadanobu; Joshita, Satoru; Kodama, Ryo; Tanaka, Eiji; Uehara, Tsuyoshi; Sano, Kenji; Tanaka, Naoki
2010-12-01
To assess the degree of hepatic fat content, simple and noninvasive methods with high objectivity and reproducibility are required. Magnetic resonance imaging (MRI) is one such candidate, although its accuracy remains unclear. We aimed to validate an MRI method for quantifying hepatic fat content by calibrating MRI reading with a phantom and comparing MRI measurements in human subjects with estimates of liver fat content in liver biopsy specimens. The MRI method was performed by a combination of MRI calibration using a phantom and double-echo chemical shift gradient-echo sequence (double-echo fast low-angle shot sequence) that has been widely used on a 1.5-T scanner. Liver fat content in patients with nonalcoholic fatty liver disease (NAFLD, n = 26) was derived from a calibration curve generated by scanning the phantom. Liver fat was also estimated by optical image analysis. The correlation between the MRI measurements and liver histology findings was examined prospectively. Magnetic resonance imaging measurements showed a strong correlation with liver fat content estimated from the results of light microscopic examination (correlation coefficient 0.91, P < 0.001) regardless of the degree of hepatic steatosis. Moreover, the severity of lobular inflammation or fibrosis did not influence the MRI measurements. This MRI method is simple and noninvasive, has excellent ability to quantify hepatic fat content even in NAFLD patients with mild steatosis or advanced fibrosis, and can be performed easily without special devices.
Tachometer Derived From Brushless Shaft-Angle Resolver
NASA Technical Reports Server (NTRS)
Howard, David E.; Smith, Dennis A.
1995-01-01
Tachometer circuit operates in conjunction with brushless shaft-angle resolver. By performing sequence of straightforward mathematical operations on resolver signals and utilizing simple trigonometric identity, generates voltage proportional to rate of rotation of shaft. One advantage is use of brushless shaft-angle resolver as main source of rate signal: no brushes to wear out, no brush noise, and brushless resolvers have proven robustness. No switching of signals to generate noise. Another advantage, shaft-angle resolver used as shaft-angle sensor, tachometer input obtained without adding another sensor. Present circuit reduces overall size, weight, and cost of tachometer.
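The brief does not say which trigonometric identity the circuit uses. One identity that yields a rate signal from resolver outputs is sin²θ + cos²θ = 1: if s = sin θ and c = cos θ, then c·(ds/dt) − s·(dc/dt) = (cos²θ + sin²θ)·(dθ/dt) = dθ/dt. The sketch below checks this numerically under the assumption that demodulated sine and cosine signals are available; the sample rate, shaft speed, and function name are hypothetical, and a real circuit would do this in analog hardware.

```python
import math


def rate_from_resolver(sin_samples, cos_samples, dt):
    """Estimate shaft rate (rad/s) from sampled sin(theta) and cos(theta) signals."""
    rates = []
    for k in range(1, len(sin_samples) - 1):
        # Central-difference derivatives of the two resolver signals.
        ds = (sin_samples[k + 1] - sin_samples[k - 1]) / (2 * dt)
        dc = (cos_samples[k + 1] - cos_samples[k - 1]) / (2 * dt)
        # cos*d(sin)/dt - sin*d(cos)/dt = (cos^2 + sin^2) * d(theta)/dt
        rates.append(cos_samples[k] * ds - sin_samples[k] * dc)
    return rates


# Hypothetical shaft turning at a constant 12 rad/s, sampled every 0.1 ms.
omega, dt = 12.0, 1e-4
theta = [omega * k * dt for k in range(200)]
rates = rate_from_resolver(
    [math.sin(t) for t in theta], [math.cos(t) for t in theta], dt
)
print(min(rates), max(rates))
```

Because the combination depends only on the two signals already produced by the resolver, no additional sensor is needed, which matches the advantage claimed in the brief.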
NASA Astrophysics Data System (ADS)
Cunningham, D.
2017-12-01
This talk will review the Permian-Recent tectonic history of the Gobi Corridor region which includes the actively deforming Gobi Altai-Altai, Eastern Tien Shan, Beishan and North Tibetan foreland. Since terrane amalgamation in the Permian, Gobi Corridor crust has been repeatedly reactivated by Triassic-Jurassic contraction/transpression, Late Cretaceous extension and Late Cenozoic transpression. The tectonic history of the region suggests the following basic principle for intraplate continental regions: non-cratonized continental interior terrane collages are susceptible to repeated intraplate reactivation events, driven by either post-orogenic collapse and/or compressional stresses derived from distant plate boundary convergence. Thus, important related questions are: 1) what lithospheric pre-conditions favor intraplate crustal reactivation in the Gobi Corridor (simple answer: crustal thinning, thermal weakening, strong buttressing cratons), 2) what are the controls on the kinematics of deformation and style of mountain building in the Gobi-Altai-Altai, Beishan and North Tibetan margin (simple answer: many factors, but especially angular relationship between SHmax and `crustal grain'), 3) how does knowledge of the array of Quaternary faults and the historical earthquake record influence our understanding of modern earthquake hazards in continental intraplate regions (answer: extrapolation of derived fault slip rates and recurrence interval determinations are problematic), 4) what important lessons can we learn from the Mesozoic-Cenozoic tectonic history of Central Asia that is applicable to the tectonic evolution of all intraplate continental regions (simple answer: ancient intraplate deformation events may be subtly expressed in the rock record and only revealed by low-temperature thermochronometers, preserved orogen-derived sedimentary sequences, fault zone evidence for younger brittle reactivation, and recognition of a younger class of cross-cutting tectonic structures).
Leakey, Tatiana I; Zielinski, Jerzy; Siegfried, Rachel N; Siegel, Eric R; Fan, Chun-Yang; Cooney, Craig A
2008-06-01
DNA methylation at cytosines is a widely studied epigenetic modification. Methylation is commonly detected using bisulfite modification of DNA followed by PCR and additional techniques such as restriction digestion or sequencing. These additional techniques are either laborious, require specialized equipment, or are not quantitative. Here we describe a simple algorithm that yields quantitative results from analysis of conventional four-dye-trace sequencing. We call this method Mquant and we compare it with the established laboratory method of combined bisulfite restriction assay (COBRA). This analysis of sequencing electropherograms provides a simple, easily applied method to quantify DNA methylation at specific CpG sites.
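The abstract does not spell out the Mquant algorithm. A common way to estimate methylation from a bisulfite sequencing trace is to compare the cytosine and thymine peak heights at each CpG, since unmethylated cytosines read as T after conversion and PCR while methylated cytosines still read as C. The sketch below shows only that basic ratio, as an assumption about the general approach; the peak heights and function name are hypothetical, and the published method may include normalization steps not shown here.

```python
def methylation_fraction(c_peak, t_peak):
    """Estimate the methylated fraction at one CpG as C / (C + T) peak height."""
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("no C or T signal at this position")
    return c_peak / total


# Hypothetical peak heights (arbitrary fluorescence units) at three CpG sites.
peaks = {"CpG_1": (300, 100), "CpG_2": (50, 450), "CpG_3": (0, 500)}
for site, (c_height, t_height) in peaks.items():
    print(site, round(100 * methylation_fraction(c_height, t_height), 1))
```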
Quantification of the cerebrospinal fluid from a new whole body MRI sequence
NASA Astrophysics Data System (ADS)
Lebret, Alain; Petit, Eric; Durning, Bruno; Hodel, Jérôme; Rahmouni, Alain; Decq, Philippe
2012-03-01
Our work aims to develop a biomechanical model of hydrocephalus intended both to support clinical research and to assist the neurosurgeon in diagnostic decisions. Recently, we have defined a new MR imaging sequence based on SPACE (Sampling Perfection with Application optimized Contrast using different flip-angle Evolution). On these images, the cerebrospinal fluid (CSF) appears as a homogeneous hypersignal. Therefore such images are suitable for segmentation and for volume assessment of the CSF. In this paper we present a fully automatic 3D segmentation of such SPACE MRI sequences. We choose a topological approach considering that CSF can be modeled as a simply connected object (i.e. a filled sphere). First, an initial object, which must be strictly included in the CSF and homotopic to a filled sphere, is determined by using a moment-preserving thresholding. Then a priority function based on a Euclidean distance map is computed in order to control the thickening process that adds "simple points" to the initial thresholded object. A point is called simple if adding or removing it changes the topology of neither the object nor the background. The method is validated by measuring fluid volume of brain phantoms and by comparing our volume assessments on clinical data to those derived from a segmentation controlled by expert physicians. Then we show that a distinction between pathological cases and healthy adults can be achieved by a linear discriminant analysis on volumes of the ventricular and intracranial subarachnoid spaces.
Korber, B T; Kunstman, K J; Patterson, B K; Furtado, M; McEvilly, M M; Levy, R; Wolinsky, S M
1994-01-01
Human immunodeficiency virus type 1 (HIV-1) sequences were generated from blood and from brain tissue obtained by stereotactic biopsy from six patients undergoing a diagnostic neurosurgical procedure. Proviral DNA was directly amplified by nested PCR, and 8 to 36 clones from each sample were sequenced. Phylogenetic analysis of intrapatient envelope V3-V5 region HIV-1 DNA sequence sets revealed that brain viral sequences were clustered relative to the blood viral sequences, suggestive of tissue-specific compartmentalization of the virus in four of the six cases. In the other two cases, the blood and brain virus sequences were intermingled in the phylogenetic analyses, suggesting trafficking of virus between the two tissues. Slide-based PCR-driven in situ hybridization of two of the patients' brain biopsy samples confirmed our interpretation of the intrapatient phylogenetic analyses. Interpatient V3 region brain-derived sequence distances were significantly less than blood-derived sequence distances. Relative to the tip of the loop, the set of brain-derived viral sequences had a tendency towards negative or neutral charge compared with the set of blood-derived viral sequences. Entropy calculations were used as a measure of the variability at each position in alignments of blood and brain viral sequences. A relatively conserved set of positions was found, with a significantly lower entropy in the brain- than in the blood-derived viral sequences. These sites constitute a brain "signature pattern," or a noncontiguous set of amino acids in the V3 region conserved in viral sequences derived from brain tissue. This brain-derived signature pattern was also well preserved among isolates previously characterized in vitro as macrophage tropic. Macrophage-monocyte tropism may be the biological constraint that results in the conservation of the viral brain signature pattern. PMID:7933130
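The per-position entropy used above as a variability measure can be illustrated with Shannon entropy computed column by column over an alignment. The sketch below assumes base-2 logarithms and treats every character, including gaps, as a symbol; the abstract does not state these details, and the short peptide fragments are invented for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the symbols observed in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def alignment_entropy(sequences):
    """Entropy at each position of a list of equal-length aligned sequences."""
    return [column_entropy(column) for column in zip(*sequences)]


# Hypothetical aligned fragments from two tissues of one patient.
brain = ["CTRPNN", "CTRPNN", "CTRPSN", "CTRPNN"]
blood = ["CTRPNN", "CIRPGN", "CTRPSK", "CVRPNY"]
print([round(h, 2) for h in alignment_entropy(brain)])
print([round(h, 2) for h in alignment_entropy(blood)])
```

Positions where the brain-derived set scores near zero while the blood-derived set scores higher are the kind of sites that would contribute to a tissue "signature pattern".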
Mulder, Willem H; Crawford, Forrest W
2015-01-07
Efforts to reconstruct phylogenetic trees and understand evolutionary processes depend fundamentally on stochastic models of speciation and mutation. The simplest continuous-time model for speciation in phylogenetic trees is the Yule process, in which new species are "born" from existing lineages at a constant rate. Recent work has illuminated some of the structural properties of Yule trees, but it remains mostly unknown how these properties affect sequence and trait patterns observed at the tips of the phylogenetic tree. Understanding the interplay between speciation and mutation under simple models of evolution is essential for deriving valid phylogenetic inference methods and gives insight into the optimal design of phylogenetic studies. In this work, we derive the probability distribution of interspecies covariance under Brownian motion and Ornstein-Uhlenbeck models of phenotypic change on a Yule tree. We compute the probability distribution of the number of mutations shared between two randomly chosen taxa in a Yule tree under discrete Markov mutation models. Our results suggest summary measures of phylogenetic information content, illuminate the correlation between site patterns in sequences or traits of related organisms, and provide heuristics for experimental design and reconstruction of phylogenetic trees. Copyright © 2014 Elsevier Ltd. All rights reserved.
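The Yule process described above is simple to simulate: with n lineages each speciating at constant rate λ, the waiting time to the next speciation is exponential with rate nλ. The sketch below counts the tips present after a fixed time and compares the average with the known expectation e^(λt) for a tree started from one lineage. The parameter values and function name are arbitrary choices for illustration, and the sketch does not reproduce the covariance results derived in the paper.

```python
import math
import random


def yule_tip_count(birth_rate, duration, rng):
    """Simulate a Yule (pure-birth) process from one lineage; return the tip count."""
    lineages, elapsed = 1, 0.0
    while True:
        # Time to the next speciation among `lineages` independent lineages.
        elapsed += rng.expovariate(lineages * birth_rate)
        if elapsed > duration:
            return lineages
        lineages += 1


rng = random.Random(42)
birth_rate, duration = 1.0, 1.0
counts = [yule_tip_count(birth_rate, duration, rng) for _ in range(2000)]
print(sum(counts) / len(counts), math.exp(birth_rate * duration))
```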
De novo selection of oncogenes.
Chacón, Kelly M; Petti, Lisa M; Scheideman, Elizabeth H; Pirazzoli, Valentina; Politi, Katerina; DiMaio, Daniel
2014-01-07
All cellular proteins are derived from preexisting ones by natural selection. Because of the random nature of this process, many potentially useful protein structures never arose or were discarded during evolution. Here, we used a single round of genetic selection in mouse cells to isolate chemically simple, biologically active transmembrane proteins that do not contain any amino acid sequences from preexisting proteins. We screened a retroviral library expressing hundreds of thousands of proteins consisting of hydrophobic amino acids in random order to isolate four 29-aa proteins that induced focus formation in mouse and human fibroblasts and tumors in mice. These proteins share no amino acid sequences with known cellular or viral proteins, and the simplest of them contains only seven different amino acids. They transformed cells by forming a stable complex with the platelet-derived growth factor β receptor transmembrane domain and causing ligand-independent receptor activation. We term this approach de novo selection and suggest that it can be used to generate structures and activities not observed in nature, create prototypes for novel research reagents and therapeutics, and provide insight into cell biology, transmembrane protein-protein interactions, and possibly virus evolution and the origin of life.
Zhao, Xue; Yang, Bo; Li, Lingyun; Zhang, Fuming; Linhardt, Robert J.
2013-01-01
Hydroxyl radicals are widely implicated in the oxidation of carbohydrates in biological and industrial processes and are often responsible for their structural modification resulting in functional damage. In this study, the radical depolymerization of the polysaccharide hyaluronan was studied in a reaction with hydroxyl radicals generated by Fenton chemistry. A simple method for isolation and identification of the resulting non-sulfated oligosaccharide products of oxidative depolymerization was established. Hyaluronan oligosaccharides were analyzed using ion-pairing reversed phase high performance liquid chromatography coupled with tandem electrospray mass spectrometry. The sequences of saturated hyaluronan oligosaccharides having even and odd numbers of saccharide units, afforded through oxidative depolymerization, were identified. This study represents a simple, effective ‘fingerprinting’ protocol for detecting the damage done to hyaluronan by oxidative radicals. This study should help reveal the potential biological outcome of reactive-oxygen radical-mediated depolymerization of hyaluronan. PMID:23768593
tropiTree: An NGS-Based EST-SSR Resource for 24 Tropical Tree Species
Russell, Joanne R.; Hedley, Peter E.; Cardle, Linda; Dancey, Siobhan; Morris, Jenny; Booth, Allan; Odee, David; Mwaura, Lucy; Omondi, William; Angaine, Peter; Machua, Joseph; Muchugi, Alice; Milne, Iain; Kindt, Roeland; Jamnadass, Ramni; Dawson, Ian K.
2014-01-01
The development of genetic tools for non-model organisms has been hampered by cost, but advances in next-generation sequencing (NGS) have created new opportunities. In ecological research, this raises the prospect for developing molecular markers to simultaneously study important genetic processes such as gene flow in multiple non-model plant species within complex natural and anthropogenic landscapes. Here, we report the use of bar-coded multiplexed paired-end Illumina NGS for the de novo development of expressed sequence tag-derived simple sequence repeat (EST-SSR) markers at low cost for a range of 24 tree species. Each chosen tree species is important in complex tropical agroforestry systems where little is currently known about many genetic processes. An average of more than 5,000 EST-SSRs was identified for each of the 24 sequenced species, whereas prior to analysis 20 of the species had fewer than 100 nucleotide sequence citations. To make results available to potential users in a suitable format, we have developed an open-access, interactive online database, tropiTree (http://bioinf.hutton.ac.uk/tropiTree), which has a range of visualisation and search facilities, and which is a model for the efficient presentation and application of NGS data. PMID:25025376
Fluctuation sensitivity of a transcriptional signaling cascade
NASA Astrophysics Data System (ADS)
Pilkiewicz, Kevin R.; Mayo, Michael L.
2016-09-01
The internal biochemical state of a cell is regulated by a vast transcriptional network that kinetically correlates the concentrations of numerous proteins. Fluctuations in protein concentration that encode crucial information about this changing state must compete with fluctuations caused by the noisy cellular environment in order to successfully transmit information across the network. Oftentimes, one protein must regulate another through a sequence of intermediaries, and conventional wisdom, derived from the data processing inequality of information theory, leads us to expect that longer sequences should lose more information to noise. Using the metric of mutual information to characterize the fluctuation sensitivity of transcriptional signaling cascades, we find, counter to this expectation, that longer chains of regulatory interactions can instead lead to enhanced informational efficiency. We derive an analytic expression for the mutual information from a generalized chemical kinetics model that we reduce to simple, mass-action kinetics by linearizing for small fluctuations about the basal biological steady state, and we find that at long times this expression depends only on a simple ratio of protein production to destruction rates and the length of the cascade. We place bounds on the values of these parameters by requiring that the mutual information be at least one bit—otherwise, any received signal would be indistinguishable from noise—and we find not only that nature has devised a way to circumvent the data processing inequality, but that it must be circumvented to attain this one-bit threshold. We demonstrate how this result places informational and biochemical efficiency at odds with one another by correlating high transcription factor binding affinities with low informational output, and we conclude with an analysis of the validity of our assumptions and propose how they might be tested experimentally.
Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut
USDA-ARS's Scientific Manuscript database
Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...
Ramu, Chenna
2003-07-01
SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.
Bishoyi, Ashok Kumar; Sharma, Anjali; Kavane, Aarti; Geetha, K A
2016-06-01
Cymbopogon is an important genus of family Poaceae, cultivated mainly for its essential oils, which possess high medicinal and economic value. Several cultivars of Cymbopogon species are available for commercial cultivation in India, and identification of these cultivars has relied on morphological markers and essential oil composition. Since these parameters are highly influenced by environmental factors, in most cases it is difficult to identify Cymbopogon cultivars. In the present study, random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers were employed to discriminate nine leading varieties of Cymbopogon, since prior genomic information for the genus is scarce or lacking. Ninety RAPD and 70 ISSR primers were used, which generated 63% and 69% polymorphic amplicons, respectively. Similarity in the pattern of the UPGMA-derived dendrograms of the RAPD and ISSR analyses revealed the reliability of the markers chosen for the study. Varietal/cultivar-specific markers generated from the study could be utilised for varietal/cultivar authentication, thus monitoring the quality of essential oil production in Cymbopogon. These markers can also be utilised for the IPR protection of the cultivars. Moreover, the study provides a molecular marker toolkit of both random and simple sequence repeat markers for diverse molecular research in the same or related genera.
Jiang, Rui; Yang, Hua; Zhou, Linqi; Kuo, C.-C. Jay; Sun, Fengzhu; Chen, Ting
2007-01-01
The increasing demand for the identification of genetic variation responsible for common diseases has translated into a need for sophisticated methods for effectively prioritizing mutations occurring in disease-associated genetic regions. In this article, we prioritize candidate nonsynonymous single-nucleotide polymorphisms (nsSNPs) through a bioinformatics approach that takes advantage of a set of improved numeric features derived from protein-sequence information and a new statistical learning model called “multiple selection rule voting” (MSRV). The sequence-based features can maximize the scope of applications of our approach, and the MSRV model can capture subtle characteristics of individual mutations. Systematic validation of the approach demonstrates that it is capable of prioritizing causal mutations for both simple monogenic diseases and complex polygenic diseases. Further studies of familial Alzheimer disease and diabetes show that the approach can enrich mutations underlying these polygenic diseases among the top candidate mutations. Application of this approach to unclassified mutations suggests that there are 10 suspicious mutations likely to cause diseases, and there is strong support for this in the literature. PMID:17668383
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster
Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.
1993-01-01
Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry
2017-01-01
Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Because of the beneficial characteristics they confer upon their grass hosts, the identification of these endophyte species has been of great interest agronomically and scientifically. Simple sequence repeat loci and the variation in their repeat elements have been used to rapidly identify endophyte species and strains; however, little is known of how the structure of repeat elements changes between species and strains, and where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
Du, Qingzhang; Gong, Chenrui; Pan, Wei; Zhang, Deqiang
2013-02-01
Gene-derived simple sequence repeats (genic SSRs), also known as functional markers, are often preferred over random genomic markers because they represent variation in gene coding and/or regulatory regions. We characterized 544 genic SSR loci derived from 138 candidate genes involved in wood formation, distributed throughout the genome of Populus tomentosa, a key ecological and cultivated wood production species. Of these SSRs, three-quarters were located in the promoter or intron regions, and dinucleotide (59.7%) and trinucleotide repeat motifs (26.5%) predominated. By screening 15 wild P. tomentosa ecotypes, we identified 188 polymorphic genic SSRs with 861 alleles, 2-7 alleles for each marker. Transferability analysis of 30 random genic SSRs, testing whether these SSRs work in 26 genotypes of five genus Populus sections (outgroup, Salix matsudana), showed that 72% of the SSRs could be amplified in Turanga and 100% could be amplified in Leuce. Based on genotyping of these 26 genotypes, a neighbour-joining analysis showed the expected six phylogenetic groupings. In silico analysis of SSR variation in 220 sequences that are homologous between P. tomentosa and Populus trichocarpa suggested that genic SSR variations between relatives were predominantly affected by repeat motif variations or flanking sequence mutations. Inheritance tests and single-marker associations demonstrated the power of genic SSRs in family-based linkage mapping and candidate gene-based association studies, as well as marker-assisted selection and comparative genomic studies of P. tomentosa and related species.
Robust efficient estimation of heart rate pulse from video.
Xu, Shuchang; Sun, Lingyun; Rohde, Gustavo Kunde
2014-04-01
We describe a simple but robust algorithm for estimating the heart rate pulse from video sequences containing human skin in real time. Based on a model of light interaction with human skin, we define the change of blood concentration due to arterial pulsation as a pixel quotient in log space, and successfully use the derived signal for computing the pulse heart rate. Various experiments with different cameras, different illumination condition, and different skin locations were conducted to demonstrate the effectiveness and robustness of the proposed algorithm. Examples computed with normal illumination show the algorithm is comparable with pulse oximeter devices both in accuracy and sensitivity. PMID:24761294
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
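The Bland-Altman comparison named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the use of 1.96 standard deviations for the limits of agreement are assumptions chosen for illustration. The mean difference estimates constant error, and the slope of the differences against the pair means gives a rough indication of proportional error.

```python
from statistics import mean, stdev


def bland_altman(reference, test):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (test - reference); the limits of agreement are
    bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(reference) != len(test) or len(reference) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [t - r for r, t in zip(reference, test)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def proportional_slope(reference, test):
    """Least-squares slope of the differences regressed on the pair means.

    A slope near zero suggests no proportional error; a clearly nonzero
    slope suggests the disagreement grows with the measured value.
    """
    diffs = [t - r for r, t in zip(reference, test)]
    means = [(t + r) / 2 for r, t in zip(reference, test)]
    mx, my = mean(means), mean(diffs)
    sxx = sum((x - mx) ** 2 for x in means)
    sxy = sum((x - mx) * (y - my) for x, y in zip(means, diffs))
    return sxy / sxx
```

With hypothetical values, a test assay that reads a constant 2 units above the reference gives a bias of 2 and a slope of 0, whereas an assay that reads 10% high gives a positive slope.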
Resolution in forensic microbial genotyping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Velsko, S P
2005-08-30
Resolution is a key parameter for differentiating among the large number of strain typing methods that could be applied to pathogens involved in bioterror events or biocrimes. In this report we develop a first-principles analysis of strain typing resolution using a simple mathematical model to provide a basis for the rational design of microbial typing systems for forensic applications. We derive two figures of merit that describe the resolving power and phylogenetic depth of a strain typing system. Rough estimates of these figures of merit for MLVA, MLST, IS element, AFLP, hybridization microarrays, and other bacterial typing methods are derived from mutation rate data reported in the literature. We also discuss the general problem of how to construct a "universal" practical typing system that has the highest possible resolution short of whole-genome sequencing, and that is applicable with minimal modification to a wide range of pathogens.
Izuchi, Yukari; Takashima, Tsuneo; Hatano, Naoya
2016-01-01
The demand for leather goods has grown globally in recent years. Industry revenue is forecast to reach $91.2 billion by 2018. There is an ongoing labelling problem in the leather items market, in that it is currently impossible to identify the species that a given piece of leather is derived from. To address this issue, we developed a rapid and simple method for the specific identification of leather derived from cattle, horses, pigs, sheep, goats, and deer by analysing peptides produced by the trypsin-digestion of proteins contained in leather goods using liquid chromatography/mass spectrometry. We determined species-specific amino acid sequences by liquid chromatography/tandem mass spectrometry analysis using the Mascot software program and demonstrated that collagen α-1(I), collagen α-2(I), and collagen α-1(III) from the dermal layer of the skin are particularly useful in species identification. PMID:27313979
Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A
2011-01-01
PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
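The definition of an SSR given above (tandem repeats of a 1-6 bp motif) can be made concrete with a small regular-expression scan. This is a sketch only, not the method used to build PSSRdb: the function name and the minimum repeat counts per motif length are illustrative assumptions.

```python
import re

# Minimum number of tandem copies required per motif length (bp).
# These thresholds are assumptions for illustration, not PSSRdb settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs."""
    seq = sequence.upper()
    hits = []
    for size in sorted(min_repeats):
        # A motif of `size` bases followed by at least (min - 1) more copies.
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (size, min_repeats[size] - 1)
        )
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # so each repeat is reported only under its smallest motif.
            if any(
                motif == motif[:k] * (size // k)
                for k in range(1, size)
                if size % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)
```

Running the same scan on aligned regions from several strains and comparing the copy counts would be one simple way to flag length-polymorphic repeats, although the database itself may use different criteria.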
Carbonell, Alberto; Fahlgren, Noah; Mitchell, Skyler; ...
2015-05-20
Artificial microRNAs (amiRNAs) are used for selective gene silencing in plants. However, current methods to produce amiRNA constructs for silencing transcripts in monocot species are not suitable for simple, cost-effective and large-scale synthesis. Here, a series of expression vectors based on the Oryza sativa MIR390 (OsMIR390) precursor was developed for high-throughput cloning and high expression of amiRNAs in monocots. Four different amiRNA sequences designed to specifically target endogenous genes and expressed from OsMIR390-based vectors were validated in transgenic Brachypodium distachyon plants. Surprisingly, amiRNAs accumulated to higher levels and were processed more accurately when expressed from chimeric OsMIR390-based precursors that include distal stem-loop sequences from Arabidopsis thaliana MIR390a (AtMIR390a). In all cases, transgenic plants displayed the predicted phenotypes induced by target gene repression, and accumulated high levels of amiRNAs and low levels of the corresponding target transcripts. Genome-wide transcriptome profiling combined with 5'-RLM-RACE analysis in transgenic plants confirmed that amiRNAs were highly specific. Significance statement: A series of amiRNA vectors based on the Oryza sativa MIR390 (OsMIR390) precursor was developed for simple, cost-effective and large-scale synthesis of amiRNA constructs to silence genes in monocots. Unexpectedly, chimeric OsMIR390-based precursors including Arabidopsis thaliana MIR390a distal stem-loop sequences produced elevated levels of highly effective and specific amiRNAs in transgenic Brachypodium distachyon plants.
RAD tag sequencing as a source of SNP markers in Cynara cardunculus L
2012-01-01
Background: The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results: RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion: The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach made it possible to generate a large and robust SNP dataset through the adoption of optimized filtering criteria. PMID:22214349
Identifying and reducing error in cluster-expansion approximations of protein energies.
Hahn, Seungsoo; Ashenberg, Orr; Grigoryan, Gevorg; Keating, Amy E
2010-12-01
Protein design involves searching a vast space for sequences that are compatible with a defined structure. This can pose significant computational challenges. Cluster expansion is a technique that can accelerate the evaluation of protein energies by generating a simple functional relationship between sequence and energy. The method consists of several steps. First, for a given protein structure, a training set of sequences with known energies is generated. Next, this training set is used to expand energy as a function of clusters consisting of single residues, residue pairs, and higher order terms, if required. The accuracy of the sequence-based expansion is monitored and improved using cross-validation testing and iterative inclusion of additional clusters. As a trade-off for evaluation speed, the cluster-expansion approximation causes prediction errors, which can be reduced by including more training sequences, including higher order terms in the expansion, and/or reducing the sequence space described by the cluster expansion. This article analyzes the sources of error and introduces a method whereby accuracy can be improved by judiciously reducing the described sequence space. The method is applied to describe the sequence-stability relationship for several protein structures: coiled-coil dimers and trimers, a PDZ domain, and T4 lysozyme as examples with computationally derived energies, and SH3 domains in amphiphysin-1 and endophilin-1 as examples where the expanded pseudo-energies are obtained from experiments. Our open-source software package Cluster Expansion Version 1.0 allows users to expand their own energy function of interest and thereby apply cluster expansion to custom problems in protein design. © 2010 Wiley Periodicals, Inc.
Simple sequence repeat markers that identify Claviceps species and strains
USDA-ARS?s Scientific Manuscript database
Claviceps purpurea is a pathogen that infects most members of the Pooideae subfamily and causes ergot, a floral disease in which the ovary is replaced with a sclerotium. This study was initiated to develop Simple Sequence Repeat (SSRs) markers for rapid identification of C. purpurea. SSRs were desi...
Zhang, Shu; Sui, Zhenghong; Chang, Lianpeng; Kang, Kyoungho; Ma, Jinhua; Kong, Fanna; Zhou, Wei; Wang, Jinguo; Guo, Liliang; Geng, Huili; Zhong, Jie; Ma, Qingxia
2014-03-10
In this article, high-throughput de novo transcriptomic sequencing was performed in Alexandrium catenella, which provided the first view of the gene repertoire in this dinoflagellate based on next-generation sequencing (NGS) technologies. A total of 118,304 unigenes were identified with an average length of 673 bp (base pairs). Of these unigenes, 77,936 (65.9%) were annotated with known proteins based on sequence similarities, among which 24,149 and 22,956 unigenes were assigned to gene ontology (GO) categories and clusters of orthologous groups (COGs), respectively. Furthermore, 16,467 unigenes were mapped onto 322 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database. We also detected 1143 simple sequence repeats (SSRs), in which the tri-nucleotide repeat motif (69.3%) was the most abundant. The genetic facts and significance derived from the transcriptome dataset were suggested and discussed. All four core nucleosomal histones and linker histones were detected, in addition to the unigenes involved in histone modifications. A total of 190 unigenes were identified as being involved in the endocytosis pathway, and clathrin-dependent endocytosis was suggested to play a role in the heterotrophy of A. catenella. A conserved 22-nt spliced leader (SL) was identified in 21 unigenes, which suggested the existence of trans-splicing processing of mRNA in A. catenella. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
Germline TRAV5D-4 T-Cell Receptor Sequence Targets a Primary Insulin Peptide of NOD Mice
Nakayama, Maki; Castoe, Todd; Sosinowski, Tomasz; He, XiangLing; Johnson, Kelly; Haskins, Kathryn; Vignali, Dario A.A.; Gapin, Laurent; Pollock, David; Eisenbarth, George S.
2012-01-01
There is accumulating evidence that autoimmunity to insulin B chain peptide, amino acids 9–23 (insulin B:9–23), is central to development of autoimmune diabetes of the NOD mouse model. We hypothesized that enhanced susceptibility to autoimmune diabetes is the result of targeting of insulin by a T-cell receptor (TCR) sequence commonly encoded in the germline. In this study, we aimed to demonstrate that a particular Vα gene TRAV5D-4 with multiple junction sequences is sufficient to induce anti-islet autoimmunity by studying retrogenic mouse lines expressing α-chains with different Vα TRAV genes. Retrogenic NOD strains expressing Vα TRAV5D-4 α-chains with many different complementarity determining region (CDR) 3 sequences, even those derived from TCRs recognizing islet-irrelevant molecules, developed anti-insulin autoimmunity. Induction of insulin autoantibodies by TRAV5D-4 α-chains was abrogated by the mutation of insulin peptide B:9–23 or that of two amino acid residues in CDR1 and 2 of the TRAV5D-4. TRAV13–1, the human ortholog of murine TRAV5D-4, was also capable of inducing in vivo anti-insulin autoimmunity when combined with different murine CDR3 sequences. Targeting primary autoantigenic peptides by simple germline-encoded TCR motifs may underlie enhanced susceptibility to the development of autoimmune diabetes. PMID:22315318
Urasaki, Naoya; Goeku, Satoko; Kaneshima, Risa; Takamine, Tomonori; Tarora, Kazuhiko; Takeuchi, Makoto; Moromizato, Chie; Yonamine, Kaname; Hosaka, Fumiko; Terakami, Shingo; Matsumura, Hideo; Yamamoto, Toshiya; Shoda, Moriyuki
2015-01-01
To explore genome-wide DNA polymorphisms and identify DNA markers for leaf margin phenotypes, a restriction-site-associated DNA sequencing analysis was employed to analyze three bulked DNAs of F1 progeny from a cross between a ‘piping-leaf-type’ cultivar, ‘Yugafu’, and a ‘spiny-tip-leaf-type’ variety, ‘Yonekura’. The parents were both Ananas comosus var. comosus. From the analysis, piping-leaf and spiny-tip-leaf gene-specific restriction-site-associated DNA sequencing tags were obtained and designated as PLSTs and STLSTs, respectively. The five PLSTs and two STLSTs were successfully converted to cleaved amplified polymorphic sequence (CAPS) or simple sequence repeat (SSR) markers using the sequence differences between alleles. Based on the genotyping of the F1 with two SSR and three CAPS markers, the five PLST markers were mapped in the vicinity of the P locus, with the closest marker, PLST1_SSR, being located 1.5 cM from the P locus. The two CAPS markers from STLST1 and STLST3 perfectly assessed the ‘spiny-leaf type’ as homozygotes of the recessive s allele of the S gene. The recombination value between the S locus and STLST loci was 2.4, and STLSTs were located 2.2 cM from the S locus. SSR and CAPS markers are applicable to marker-assisted selection of leaf margin phenotypes in pineapple breeding. PMID:26175625
Choi, Hong-Il; Kim, Nam Hoon; Kim, Jun Ha; Choi, Beom Soon; Ahn, In-Ok; Lee, Joon-Soo; Yang, Tae-Jin
2011-01-01
Little is known about the genetics or genomics of Panax ginseng. In this study, we developed 70 expressed sequence tag-derived polymorphic simple sequence repeat markers by trials of 140 primer pairs. All of the 70 markers showed reproducible polymorphism among four Panax species and 19 of them were polymorphic in six P. ginseng cultivars. These markers segregated in the 1:2:1 manner of Mendelian inheritance in an F2 population of a cross between two P. ginseng cultivars, ‘Yunpoong’ and ‘Chunpoong’, indicating that these are reproducible and inheritable mappable markers. A phylogenetic analysis using the genotype data showed three distinctive groups: a P. ginseng-P. japonicus clade, P. notoginseng and P. quinquefolius, with similarity coefficients of 0.70. P. japonicus was intermingled with P. ginseng cultivars, indicating that both species have similar genetic backgrounds. P. ginseng cultivars were subdivided into three minor groups: an independent cultivar ‘Chunpoong’, a subgroup with three accessions including two cultivars, ‘Gumpoong’ and ‘Yunpoong’ and one landrace ‘Hwangsook’ and another subgroup with two accessions including one cultivar, ‘Gopoong’ and one landrace ‘Jakyung’. Each primer pair produced 1 to 4 bands, indicating that the ginseng genome has a highly replicated paleopolyploid genome structure. PMID:23717085
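A 1:2:1 segregation claim of the kind made in this abstract is usually checked with a chi-square goodness-of-fit test. The sketch below shows that calculation; the function names are hypothetical, the abstract does not state which test the authors used, and the critical value 5.991 is the standard chi-square threshold for 2 degrees of freedom at a significance level of 0.05.

```python
# Chi-square critical value for 2 degrees of freedom at alpha = 0.05.
CRITICAL_DF2_P05 = 5.991


def chi_square_121(n_aa, n_ab, n_bb):
    """Chi-square statistic for observed F2 genotype counts against 1:2:1."""
    total = n_aa + n_ab + n_bb
    if total == 0:
        raise ValueError("at least one genotyped individual is required")
    expected = (total / 4, total / 2, total / 4)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_121(n_aa, n_ab, n_bb, critical=CRITICAL_DF2_P05):
    """True when the counts do not deviate significantly from 1:2:1."""
    return chi_square_121(n_aa, n_ab, n_bb) < critical
```

For hypothetical counts of 25, 50 and 25 the statistic is 0, a perfect fit; counts of 40, 40 and 20 give a statistic of 12, which exceeds the threshold and would indicate distorted segregation at that marker.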
One-pot multienzyme (OPME) systems for chemoenzymatic synthesis of carbohydrates.
Yu, Hai; Chen, Xi
2016-03-14
Glycosyltransferase-catalyzed enzymatic and chemoenzymatic syntheses are powerful approaches for the production of oligosaccharides, polysaccharides, glycoconjugates, and their derivatives. Enzymes involved in the biosynthesis of sugar nucleotide donors can be combined with glycosyltransferases in one pot for efficient production of the target glycans from simple monosaccharides and acceptors. The identification of enzymes involved in the salvage pathway of sugar nucleotide generation has greatly facilitated the development of simplified and efficient one-pot multienzyme (OPME) systems for synthesizing major glycan epitopes in mammalian glycomes. The applications of OPME methods are steadily gaining popularity mainly due to the increasing availability of wild-type and engineered enzymes. Substrate promiscuity of these enzymes and their mutants allows OPME synthesis of carbohydrates with naturally occurring post-glycosylational modifications (PGMs) and their non-natural derivatives using modified monosaccharides as precursors. The OPME systems can be applied in sequence for synthesizing complex carbohydrates. The sequence of the sequential OPME processes, the glycosyltransferase used, and the substrate specificities of the glycosyltransferases define the structures of the products. The OPME and sequential OPME strategies can be extended to diverse glycans in other glycomes when suitable enzymes with substrate promiscuity become available. This Perspective summarizes the work of the authors and collaborators on the development of glycosyltransferase-based OPME systems for carbohydrate synthesis. Future directions are also discussed.
Dong, Hongjuan; Marchetti-Deschmann, Martina; Allmaier, Günter
2014-01-01
Traditionally, characterization of microbial proteins is performed by a complex sequence of steps, with the final step being either Edman sequencing or mass spectrometry, which generally takes several weeks or months to complete. In this work, we proposed a strategy for the characterization of tryptic peptides derived from Giberella zeae (anamorph: Fusarium graminearum) proteins in parallel to intact cell mass spectrometry (ICMS) in which no complicated and time-consuming steps were needed. Experimentally, after a simple washing treatment of the spores, aliquots of the intact G. zeae macroconidia spore solution were deposited twice onto one MALDI (matrix-assisted laser desorption ionization) mass spectrometry (MS) target (two spots). One spot was used for ICMS and the second spot was subjected to a brief on-target digestion with bead-immobilized or non-immobilized trypsin. Subsequently, one spot was analyzed immediately by MALDI MS in the linear mode (ICMS), whereas the second spot containing the digested material was investigated by MALDI MS in the reflectron mode ("peptide mass fingerprint") followed by protonated peptide selection for MS/MS (post source decay (PSD) fragment ion) analysis. Based on the fragment ions formed from selected tryptic peptides, a complete or partial amino acid sequence was generated by manual de novo sequencing. These sequence data were used in a homology search for protein identification. Finally, four different peptides of varying abundances were identified successfully, allowing verification that our desorbed/ionized surface compounds were indeed derived from proteins. The presence of three different proteins could be established unambiguously. Interestingly, one of these proteins belongs to the ribosomal superfamily, which indicates that not only surface-associated proteins were digested. This strategy minimized the amount of time and labor required for obtaining deeper information on spore preparations within the now widely used ICMS approach. Copyright © 2013 Elsevier Ltd. All rights reserved.
Alverson, Andrew J.; Wei, XiaoXin; Rice, Danny W.; Stern, David B.; Barry, Kerrie; Palmer, Jeffrey D.
2010-01-01
The mitochondrial genomes of seed plants are unusually large and vary in size by at least an order of magnitude. Much of this variation occurs within a single family, the Cucurbitaceae, whose genomes range from an estimated 390 to 2,900 kb in size. We sequenced the mitochondrial genomes of Citrullus lanatus (watermelon: 379,236 nt) and Cucurbita pepo (zucchini: 982,833 nt)—the two smallest characterized cucurbit mitochondrial genomes—and determined their RNA editing content. The relatively compact Citrullus mitochondrial genome actually contains more and longer genes and introns, longer segmental duplications, and more discernibly nuclear-derived DNA. The large size of the Cucurbita mitochondrial genome reflects the accumulation of unprecedented amounts of both chloroplast sequences (>113 kb) and short repeated sequences (>370 kb). A low mutation rate has been hypothesized to underlie increases in both genome size and RNA editing frequency in plant mitochondria. However, despite its much larger genome, Cucurbita has a significantly higher synonymous substitution rate (and presumably mutation rate) than Citrullus but comparable levels of RNA editing. The evolution of mutation rate, genome size, and RNA editing are apparently decoupled in Cucurbitaceae, reflecting either simple stochastic variation or governance by different factors. PMID:20118192
NOTE: A method for controlling image acquisition in electronic portal imaging devices
NASA Astrophysics Data System (ADS)
Glendinning, A. G.; Hunt, S. G.; Bonnett, D. E.
2001-02-01
Certain types of camera-based electronic portal imaging devices (EPIDs) which initiate image acquisition based on sensing a change in video level have been observed to trigger unreliably at the beginning of dynamic multileaf collimation sequences. A simple, novel means of controlling image acquisition with an Elekta linear accelerator (Elekta Oncology Systems, Crawley, UK) is proposed which is based on illumination of a photodetector (ORP-12, Silonex Inc., Plattsburgh, NY, USA) by the electron gun of the accelerator. By incorporating a simple trigger circuit it is possible to derive a beam on/off status signal which changes at least 100 ms before any dose is measured by the accelerator. The status signal does not return to the beam-off state until all dose has been delivered and is suitable for accelerator pulse repetition frequencies of 50-400 Hz. The status signal is thus a reliable means of indicating the initiation and termination of radiation exposure, and thus controlling image acquisition of such EPIDs for this application.
Yang, S; Chen, S; Geng, X X; Yan, G; Li, Z Y; Meng, J L; Cowling, W A; Zhou, W J
2016-04-01
We present the first genetic map of an allohexaploid Brassica species, based on segregating microsatellite markers in a doubled haploid mapping population generated from a hybrid between two hexaploid parents. This study reports the first genetic map of trigenomic Brassica. A doubled haploid mapping population consisting of 189 lines was obtained via microspore culture from a hybrid H16-1 derived from a cross between two allohexaploid Brassica lines (7H170-1 and Y54-2). Simple sequence repeat primer pairs specific to the A genome (107), B genome (44) and C genome (109) were used to construct a genetic linkage map of the population. Twenty-seven linkage groups were resolved from 274 polymorphic loci on the A genome (109), B genome (49) and C genome (116) covering a total genetic distance of 3178.8 cM with an average distance between markers of 11.60 cM. This is the first genetic framework map for the artificially synthesized Brassica allohexaploids. The linkage groups represent the expected complement of chromosomes in the A, B and C genomes from the original diploid and tetraploid parents. This framework linkage map will be valuable for QTL analysis and future genetic improvement of a new allohexaploid Brassica species, and in improving our understanding of the genetic control of meiosis in new polyploids.
Verifying Digital Components of Physical Systems: Experimental Evaluation of Test Quality
NASA Astrophysics Data System (ADS)
Laputenko, A. V.; López, J. E.; Yevtushenko, N. V.
2018-03-01
This paper continues the study of high-quality test derivation for verifying digital components used in various physical systems, such as sensors and data transfer components. For the experimental evaluation we used logic circuits b01-b010 of the ITC'99 benchmark package (Second Release), which, as stated before, describe digital components of physical systems designed for various applications. Test sequences are derived for detecting the best-known faults of the reference logic circuit using three different approaches to test derivation. Three widely used fault types are considered as possible faults of the reference behavior: stuck-at faults, bridges, and faults which slightly modify the behavior of one gate. The most interesting test sequences are short ones that can provide appropriate guarantees after testing; thus, we experimentally study various approaches to deriving so-called complete test suites, which detect all faults of the considered types. In the first series of experiments, we compare two approaches for deriving complete test suites. In the first approach, a shortest test sequence is derived for testing each fault. In the second approach, a test sequence is pseudo-randomly generated using appropriate software for logic synthesis and verification (the ABC system in our study) and thus can be longer. However, after deleting sequences that detect the same set of faults, the test suite returned by the second approach is shorter. This underlines the fact that in many cases it is useless to spend 'time and effort' deriving a shortest distinguishing sequence; it is better to apply test minimization afterwards. The experiments also show that using only randomly generated test sequences is not very efficient, since such sequences do not detect all the faults of any type. After the fault coverage reaches around 70%, saturation is observed, and the fault coverage cannot be increased further. For deriving high-quality short test suites, combining randomly generated sequences with sequences aimed at detecting the faults missed by random tests makes it possible to reach good fault coverage using the shortest test sequences.
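The minimization step mentioned in this abstract (deleting sequences that detect the same set of faults) can be illustrated with a small sketch. The function name, the input layout (test name mapped to the set of fault identifiers it detects), and the greedy covering pass are assumptions made for illustration; the abstract does not say which minimization procedure the authors applied.

```python
def minimize_suite(detects):
    """Return a reduced list of test names covering every detectable fault.

    `detects` maps a test name to the set of fault ids it detects
    (hypothetical input layout, not taken from the paper).
    """
    # Step 1: keep one test per distinct detected-fault set.
    representative = {}
    for test, faults in detects.items():
        key = frozenset(faults)
        if key and key not in representative:
            representative[key] = test

    # Step 2 (assumed extra step): greedy cover over the remaining tests.
    remaining = set().union(*representative) if representative else set()
    chosen = []
    while remaining:
        best = max(representative, key=lambda k: len(k & remaining))
        chosen.append(representative[best])
        remaining -= best
    return chosen
```

Greedy covering does not guarantee a minimum-size suite; it is used here only because it is short and easy to check by hand.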
USDA-ARS's Scientific Manuscript database
The genetic relationships and pedigree inferences among peach (Prunus persica (L.) Batsch) accessions and breeding lines used in genetic improvement were evaluated using 15 simple sequence repeat (SSR) markers. A total of 80 alleles were detected among the 37 peach accessions with an average of 5.53...
We are attempting to identify specific root fragments from soil cores with individual trees. We successfully used Inter Simple Sequence Repeats (ISSR) to distinguish neighboring old-growth Douglas-fir trees from one another, while maintaining identity among each tree's parts. W...
USDA-ARS's Scientific Manuscript database
Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...
Photogrammetric 3d Building Reconstruction from Thermal Images
NASA Astrophysics Data System (ADS)
Maset, E.; Fusiello, A.; Crosilla, F.; Toldo, R.; Zorzetto, D.
2017-08-01
This paper addresses the problem of 3D building reconstruction from thermal infrared (TIR) images. We show that a commercial Computer Vision software can be used to automatically orient sequences of TIR images taken from an Unmanned Aerial Vehicle (UAV) and to generate 3D point clouds, without requiring any GNSS/INS data about position and attitude of the images nor camera calibration parameters. Moreover, we propose a procedure based on Iterative Closest Point (ICP) algorithm to create a model that combines high resolution and geometric accuracy of RGB images with the thermal information deriving from TIR images. The process can be carried out entirely by the aforesaid software in a simple and efficient way.
A Rigorous Framework for Optimization of Expensive Functions by Surrogates
NASA Technical Reports Server (NTRS)
Booker, Andrew J.; Dennis, J. E., Jr.; Frank, Paul D.; Serafini, David B.; Torczon, Virginia; Trosset, Michael W.
1998-01-01
The goal of the research reported here is to develop rigorous optimization algorithms to apply to some engineering design problems for which design application of traditional optimization approaches is not practical. This paper presents and analyzes a framework for generating a sequence of approximations to the objective function and managing the use of these approximations as surrogates for optimization. The result is to obtain convergence to a minimizer of an expensive objective function subject to simple constraints. The approach is widely applicable because it does not require, or even explicitly approximate, derivatives of the objective. Numerical results are presented for a 31-variable helicopter rotor blade design example and for a standard optimization test example.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
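As a rough illustration of the two methods named in this abstract, the sketch below computes Bland-Altman bias with 95% limits of agreement and a Deming regression line for paired measurements. The function names and the `delta` parameter (the assumed ratio of the error variance in y to that in x) are illustrative choices, not taken from the paper, and the paper's own workflow may differ.

```python
import math
import statistics

def bland_altman(reference, test):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [t - r for r, t in zip(reference, test)]
    bias = statistics.mean(diffs)          # estimates constant error
    sd = statistics.stdev(diffs)           # spread of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def deming(x, y, delta=1.0):
    """Return (slope, intercept) of a Deming regression of y on x.

    delta is the assumed ratio of error variances (y errors over x errors);
    delta = 1 gives orthogonal regression. Requires a non-zero covariance.
    """
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    root = math.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)
    slope = (syy - delta * sxx + root) / (2 * sxy)
    return slope, my - slope * mx
```

In this reading, a slope away from 1 suggests proportional error and a non-zero intercept or bias suggests constant error, which is the distinction the abstract says R² alone cannot make.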
Kwon, Hyuk-Sang; Yang, Eun-Hee; Yeon, Seung-Woo; Kang, Byoung-Hwa; Kim, Tae-Yong
2004-10-15
This study aimed to develop a novel multiplex polymerase chain reaction (PCR) primer set for the identification of seven probiotic Lactobacillus species: Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus casei, Lactobacillus gasseri, Lactobacillus plantarum, Lactobacillus reuteri and Lactobacillus rhamnosus. The primer set, comprising seven specific and two conserved primers, was derived from the integrated sequences of 16S and 23S rRNA genes and their rRNA intergenic spacer region of each species. It was able to identify the seven target species with 93.6% accuracy, which exceeds that of the general biochemical methods. The phylogenetic analyses, using 16S rDNA sequences of the probiotic isolates, also provided further support that the results from the multiplex PCR assay were trustworthy. Taken together, we suggest that the multiplex primer set is an efficient tool for simple, rapid and reliable identification of seven Lactobacillus species.
Wu, Zhigang; Wu, Jinwei; Wang, Yalin; Hou, Hongwei
2017-01-01
Premise of the study: Microsatellite or simple sequence repeat (SSR) markers were developed to investigate the influence of ecological factors on gene flow and spatial genetic structuring of the submerged plant Ranunculus bungei (Ranunculaceae), which is regarded as an important species for understanding how plants adapt to an aquatic environment. Methods and Results: Twenty-two microsatellite loci were identified from an expressed sequence tag (EST) library. The number of alleles per locus ranged from one to five, and the expected heterozygosity varied from 0.0 to 0.5 in four Chinese populations of R. bungei. Fourteen loci were polymorphic and significantly deviated from Hardy–Weinberg equilibrium. All of the loci were found to be amplifiable in two other species of Ranunculus section Batrachium, and cross-amplification in six riparian and aquatic species of Ranunculaceae was also partially successful. Conclusions: These novel EST-SSR markers will be useful for ecological and evolutionary studies of R. bungei as well as related species. PMID:28791205
NASA Astrophysics Data System (ADS)
Soltanian-Zadeh, Hamid; Windham, Joe P.
1992-04-01
Maximizing the minimum absolute contrast-to-noise ratios (CNRs) between a desired feature and multiple interfering processes, by linear combination of images in a magnetic resonance imaging (MRI) scene sequence, is attractive for MRI analysis and interpretation. A general formulation of the problem is presented, along with a novel solution utilizing the simple and numerically stable method of Gram-Schmidt orthogonalization. We derive explicit solutions for the case of two interfering features first, then for three interfering features, and, finally, using a typical example, for an arbitrary number of interfering features. For the case of two interfering features, we also provide simplified analytical expressions for the signal-to-noise ratios (SNRs) and CNRs of the filtered images. The technique is demonstrated through its applications to simulated and acquired MRI scene sequences of a human brain with a cerebral infarction. For these applications, a 50 to 100% improvement for the smallest absolute CNR is obtained.
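The Gram-Schmidt step named in this abstract can be sketched as follows. The example builds a weight vector for linearly combining images so that the chosen interfering signatures are cancelled. This only illustrates orthogonalization-based nulling, under the assumption that each feature is described by a signature vector with one value per image in the sequence; it is not the authors' max-min CNR solution, and the function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)                      # component along q
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > tol:                         # skip linearly dependent vectors
            basis.append([wi / norm for wi in w])
    return basis

def nulling_weights(desired, interferers):
    """Part of `desired` that is orthogonal to every interfering signature."""
    w = list(desired)
    for q in gram_schmidt(interferers):
        c = dot(w, q)
        w = [wi - c * qi for wi, qi in zip(w, q)]
    return w
```

Combining the images with these weights gives zero response to each interfering signature while keeping whatever part of the desired signature lies outside their span.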
NASA Astrophysics Data System (ADS)
Mansour, Ahmed; Mohamed, Omar; Tahoun, Sameh S.; Elewa, Ashraf M. T.
2018-03-01
The current paper provides a high resolution sequence stratigraphic study of the Raha Formation from the productive Bakr Oil Field, central Gulf of Suez, Egypt. Sixty cutting rock samples spanning the Cenomanian from three wells (Bakr-114, B-115 and B-109) in the Bakr Basin, were palynologically investigated. The documented palynomorphs assemblage of either terrestrially-derived sporomorphs or marine inhabited dinocysts, allowed two palynological zones as well as their encompassing depositional palaeoenvironment to be recognized. These zones are Afropollis jardinus-Crybelosporites pannuceus Assemblage Zone (early-middle Cenomanian) and Classopollis brasiliensis-Tricolpites sagax Assemblage Zone (late Cenomanian). Detailed analysis of the particulate organic matter compositions suggested that the depositional palaeoenvironment of the Raha Formation was fluctuating between supratidal and distal-inner neritic conditions, due to successive oscillations of the Neo-Tethyan Ocean during the Cenomanian. The pronounced peaks of particulate organic matter versus gamma ray are markedly used in delineating the depositional sequences of the Raha Formation and their bounding surfaces. The Raha Formation probably corresponds to a second-order depositional sequence, which can be further subdivided into eight third-order depositional sequences, of which six are complete and two are incomplete ones. These depositional sequences are significantly synchronized based on a simple 2-D correlation model between the three wells. According to the hierarchical duration system, the Cenomanian herein was approximately attributed to 6 Myr, each of which has lower order depositional sequences that took approximately 0.9 Myr. Based on the sequence stratigraphic approach together with palynofacies analysis and gamma ray data, a condensed section was defined in the B-115.
Molecular dynamics studies of the 3D structure and planar ligand binding of a quadruplex dimer.
Li, Ming-Hui; Luo, Quan; Xue, Xiang-Gui; Li, Ze-Sheng
2011-03-01
G-rich sequences can fold into a four-stranded structure called a G-quadruplex, and sequences with short loops are able to aggregate to form stable quadruplex multimers. Few studies have characterized the properties of this variety of quadruplex multimers. Using molecular modeling and molecular dynamics simulations, the present study investigated a dimeric G-quadruplex structure formed from a simple sequence of d(GGGTGGGTGGGTGGGT) (G1), and its interactions with a planar ligand of a perylene derivative (Tel03). A series of analytical methods, including free energy calculations and principal components analysis (PCA), was used. The results show that a dimer structure with stacked parallel monomer structures is maintained well during the entire simulation. Tel03 can bind to the dimer efficiently through end stacking, and the binding mode of the ligand stacked with the 3'-terminal thymine base is most favorable. PCA showed that the dominant motions in the free dimer occur on the loop regions, and the presence of the ligand reduces the flexibility of the loops. Our investigation will assist in understanding the geometric structure of stacked G-quadruplex multimers and may be helpful as a platform for rational drug design.
Test equality in binary data for a 4 × 4 crossover trial under a Latin-square design.
Lui, Kung-Jong; Chang, Kuang-Chao
2016-10-15
When there are four or more treatments under comparison, the use of a crossover design with a complete set of treatment-receipt sequences in binary data is of limited use because of too many treatment-receipt sequences. Thus, we may consider use of a 4 × 4 Latin square to reduce the number of treatment-receipt sequences when comparing three experimental treatments with a control treatment. Under a distribution-free random effects logistic regression model, we develop simple procedures for testing non-equality between any of the three experimental treatments and the control treatment in a crossover trial with dichotomous responses. We further derive interval estimators in closed forms for the relative effect between treatments. To evaluate the performance of these test procedures and interval estimators, we employ Monte Carlo simulation. We use the data taken from a crossover trial using a 4 × 4 Latin-square design for studying four treatments to illustrate the use of test procedures and interval estimators developed here. Copyright © 2016 John Wiley & Sons, Ltd.
A Gauge-generalized Solution for Non-Keplerian Motion in the Frenet-Serret Frame
NASA Astrophysics Data System (ADS)
Garber, Darren D.
2009-05-01
The customary modeling of perturbed planetary and spacecraft motion as a continuous sequence of unperturbed two-body orbits (instantaneous ellipses) is conveniently assigned a physical interpretation through the Keplerian and Delaunay elements and complemented mathematically by the Lagrange-type equations which describe the evolution of these variables. If however the actual motion is very non-Keplerian (i.e. the perturbed orbit varies greatly from a two-body orbit), then its modeling by a sequence of conics is not necessarily optimal in terms of its mathematical description and its resulting physical interpretation. Since, in principle a curve of any type can be represented as a sequence of points from a family of curves of any other type (Efroimsky 2005), alternate non-conic curves can be utilized to better describe the perturbed non-Keplerian motion of the body both mathematically and with a physically relevant interpretation. Non-Keplerian motion exists in both celestial mechanics and astrodynamics as evident by the complex interactions within star clusters and also as the result of a spacecraft accelerating via ion propulsion, solar sails and electro-dynamic tethers. For these cases, the sequence of simple orbits to describe the motion is not based on conics, but instead a family of spirals. The selection of spirals as the underlying simple motion is supported by the fact that it is unnecessary to describe the motion in terms of instantaneous orbits tangent to the actual trajectory (Efroimsky 2002, Newman & Efroimsky 2003) and at times there is an advantage to deviate from osculation, in order to greatly simplify the resulting mathematics via gauge freedom (Efroimsky & Goldreich 2003, Slabinski 2003, Gurfil 2004). 
From these two principles, (1) spirals as instantaneous orbits, and (2) controlled deviation from osculation, new planetary equations are derived for new non-osculating elements in the Frenet-Serret frame with the gauge function as a measure of non-osculation.
A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA
USDA-ARS's Scientific Manuscript database
A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...
Nafissi, Nafiseh; Slavcev, Roderick
2012-12-06
While safer than their viral counterparts, conventional non-viral gene delivery DNA vectors offer a limited safety profile. They often result in the delivery of unwanted prokaryotic sequences, antibiotic resistance genes, and the bacterial origins of replication to the target, which may lead to the stimulation of unwanted immunological responses due to their chimeric DNA composition. Such vectors may also impart the potential for chromosomal integration, thus potentiating oncogenesis. We sought to engineer an in vivo system for the quick and simple production of safer DNA vector alternatives that were devoid of non-transgene bacterial sequences and would lethally disrupt the host chromosome in the event of an unwanted vector integration event. We constructed a parent eukaryotic expression vector possessing a specialized manufactured multi-target site called "Super Sequence", and engineered E. coli cells (R-cell) that conditionally produce phage-derived recombinase Tel (PY54), TelN (N15), or Cre (P1). Passage of the parent plasmid vector through R-cells under optimized conditions resulted in rapid, efficient, one-step in vivo generation of mini lcc (linear covalently closed; Tel/TelN-cell) or mini ccc (circular covalently closed; Cre-cell) DNA constructs, separated from the backbone plasmid DNA. Site-specific integration of lcc plasmids into the host chromosome resulted in chromosomal disruption and 10⁵-fold lower viability than that seen with the ccc counterpart. We offer a high efficiency mini DNA vector production system that confers simple, rapid and scalable in vivo production of mini lcc DNA vectors that possess all the benefits of "minicircle" DNA vectors and virtually eliminate the potential for undesirable vector integration events.
CrossQuery: a web tool for easy associative querying of transcriptome data.
Wagner, Toni U; Fischer, Andreas; Thoma, Eva C; Schartl, Manfred
2011-01-01
Enormous amounts of data are being generated by modern methods such as transcriptome or exome sequencing and microarray profiling. Primary analyses such as quality control, normalization, statistics and mapping are highly complex and need to be performed by specialists. Thereafter, results are handed back to biomedical researchers, who are then confronted with complicated data lists. For rather simple tasks like data filtering, sorting and cross-association there is a need for new tools which can be used by non-specialists. Here, we describe CrossQuery, a web tool that enables straightforward, simple syntax queries to be executed on transcriptome sequencing and microarray datasets. We provide deep-sequencing data sets of stem cell lines derived from the model fish Medaka and microarray data of human endothelial cells. In the example datasets provided, mRNA expression levels, gene, transcript and sample identification numbers, GO-terms and gene descriptions can be freely correlated, filtered and sorted. Queries can be saved for later reuse and results can be exported to standard formats that allow copy-and-paste to all widespread data visualization tools such as Microsoft Excel. CrossQuery enables researchers to quickly and freely work with transcriptome and microarray data sets requiring only minimal computer skills. Furthermore, CrossQuery allows growing association of multiple datasets as long as at least one common point of correlated information, such as transcript identification numbers or GO-terms, is shared between samples. For advanced users, the object-oriented plug-in and event-driven code design of both server-side and client-side scripts allow easy addition of new features, data sources and data types.
Zhang, Lihua; Chen, Xianzhong; Chen, Zhen; Wang, Zezheng; Jiang, Shan; Li, Li; Pötter, Markus; Shen, Wei; Fan, You
2016-11-01
The diploid yeast Candida tropicalis, which can utilize n-alkane as a carbon and energy source, is an attractive strain for both physiological studies and practical applications. However, it presents some characteristics, such as rare codon usage, difficulty in sequential gene disruption, and inefficiency in foreign gene expression, that hamper strain improvement through genetic engineering. In this work, we present a simple and effective method for sequential gene disruption in C. tropicalis based on the use of an auxotrophic mutant host defective in orotidine monophosphate decarboxylase (URA3). The disruption cassette, which consists of a functional yeast URA3 gene flanked by a 0.3 kb gene disruption auxiliary sequence (gda) direct repeat derived from downstream or upstream of the URA3 gene and of homologous arms of the target gene, was constructed and introduced into the yeast genome by integrative transformation. Stable integrants were isolated by selection for Ura+ and identified by PCR and sequencing. The important feature of this construct, which makes it very attractive, is that recombination between the flanking direct gda repeats occurs at a high frequency (10⁻⁸) during mitosis. After excision of the URA3 marker, only one copy of the gda sequence remains at the recombinant locus. Thus, the resulting ura3 strain can be used again to disrupt a second allelic gene in a similar manner. In addition to this effective sequential gene disruption method, a codon-optimized green fluorescent protein-encoding gene (GFP) was functionally expressed in C. tropicalis. Thus, we propose a simple and reliable method to improve C. tropicalis by genetic manipulation.
NASA Technical Reports Server (NTRS)
Kaljurand, M.; Valentin, J. R.; Shao, M.
1996-01-01
Two alternative input sequences are commonly employed in correlation chromatography (CC). They are sequences derived according to the algorithm of the feedback shift register (i.e., pseudo random binary sequences (PRBS)) and sequences derived by using the uniform random binary sequences (URBS). These two sequences are compared. By applying the "cleaning" data processing technique to the correlograms that result from these sequences, we show that when the PRBS is used the S/N of the correlogram is much higher than the one resulting from using URBS.
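The two input sequence types compared in this abstract can be generated with a short sketch. The register length (7 bits), the tap positions, and the seeds are illustrative assumptions, not values from the paper. The circular autocorrelation helper shows a well-known property of maximal-length shift-register sequences: after mapping bits to +1/-1, every non-zero shift gives the same value, which a uniformly random sequence does not guarantee.

```python
import random

def prbs(nbits=7, taps=(7, 6), seed=1):
    """One full period (2**nbits - 1 bits) from a Fibonacci feedback shift register."""
    state = seed & ((1 << nbits) - 1)
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range((1 << nbits) - 1):
        out.append(state & 1)                  # output the oldest bit
        fb = 0
        for t in taps:                         # XOR of the tapped bits
            fb ^= (state >> (nbits - t)) & 1
        state = (state >> 1) | (fb << (nbits - 1))
    return out

def urbs(length, seed=0):
    """Uniform random binary sequence of the given length."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(length)]

def circular_autocorrelation(bits, shift):
    """Autocorrelation of the +1/-1 mapped sequence at a circular shift."""
    s = [1 - 2 * b for b in bits]
    n = len(s)
    return sum(s[i] * s[(i + shift) % n] for i in range(n))
```

Comparing `circular_autocorrelation(prbs(), k)` with the same call on `urbs(127)` for several shifts `k` shows the flat off-peak behaviour of the shift-register sequence against the fluctuating values of the random one.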
Onyśk, Agnieszka; Boczkowska, Maja
2017-01-01
Simple Sequence Repeat (SSR) markers are one of the most frequently used molecular markers in studies of crop diversity and population structure. This is due to their uniform distribution in the genome, the high polymorphism, reproducibility, and codominant character. Additional advantages are the possibility of automatic analysis and simple interpretation of the results. The M13 tagged PCR reaction significantly reduces the costs of analysis by the automatic genetic analyzers. Here, we also disclose a short protocol of SSR data analysis.
Demberg, Lilian M; Winkler, Jana; Wilde, Caroline; Simon, Kay-Uwe; Schön, Julia; Rothemund, Sven; Schöneberg, Torsten; Prömel, Simone; Liebscher, Ines
2017-03-17
Members of the adhesion G protein-coupled receptor (aGPCR) family carry an agonistic sequence within their large ectodomains. Peptides derived from this region, called the Stachel sequence, can activate the respective receptor. As the conserved core region of the Stachel sequence is highly similar between aGPCRs, the agonist specificity of Stachel sequence-derived peptides was tested between family members using cell culture-based second messenger assays. Stachel peptides derived from aGPCRs of subfamily VI (GPR110/ADGRF1, GPR116/ADGRF5) and subfamily VIII (GPR64/ADGRG2, GPR126/ADGRG6) are able to activate more than one member of the respective subfamily supporting their evolutionary relationship and defining them as pharmacological receptor subtypes. Extended functional analyses of the Stachel sequences and derived peptides revealed agonist promiscuity, not only within, but also between aGPCR subfamilies. For example, the Stachel-derived peptide of GPR110 (subfamily VI) can activate GPR64 and GPR126 (both subfamily VIII). Our results indicate that key residues in the Stachel sequence are very similar between aGPCRs allowing for agonist promiscuity of several Stachel-derived peptides. Therefore, aGPCRs appear to be pharmacologically more closely related than previously thought. Our findings have direct implications for many aGPCR studies, as potential functional overlap has to be considered for in vitro and in vivo studies. However, it also offers the possibility of a broader use of more potent peptides when the original Stachel sequence is less effective. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Determination of the sequences of protein-derived peptides and peptide mixtures by mass spectrometry
Morris, Howard R.; Williams, Dudley H.; Ambler, Richard P.
1971-01-01
Micro-quantities of protein-derived peptides have been converted into N-acetylated permethyl derivatives, and their sequences determined by low-resolution mass spectrometry without prior knowledge of their amino acid compositions or lengths. A new strategy is suggested for the mass spectrometric sequencing of oligopeptides or proteins, involving gel filtration of protein hydrolysates and subsequent sequence analysis of peptide mixtures. Finally, results are given that demonstrate for the first time the use of mass spectrometry for the analysis of a protein-derived peptide mixture, again without prior knowledge of the protein or components within the mixture. PMID:5158904
Vogiatzi, Emmanouella; Lagnel, Jacques; Pakaki, Victoria; Louro, Bruno; Canario, Adelino V M; Reinhardt, Richard; Kotoulas, Georgios; Magoulas, Antonios; Tsigenopoulos, Costas S
2011-06-01
We screened for simple sequence repeats (SSRs) found in ESTs derived from an EST-database development project ('Marine Genomics Europe' Network of Excellence). Different motifs of di-, tri-, tetra-, penta- and hexanucleotide SSRs were evaluated for variation in length and position in the expressed sequences, relative abundance and distribution in gilthead sea bream (Sparus aurata). We found 899 ESTs that harbor 997 SSRs (4.94%). On average, one SSR was found per 2.95 kb of EST sequence and the dinucleotide SSRs are the most abundant accounting for 47.6% of the total number. EST-SSRs were used as template for primer design. 664 primer pairs could be successfully identified and a subset of 206 pairs of primers was synthesized, PCR-tested and visualized on ethidium bromide stained agarose gels. The main objective was to further assess the potential of EST-SSRs as informative markers and investigate their cross-species amplification in sixteen teleost fish species: seven sparid species and nine other species from different families. Approximately 78% of the primer pairs gave PCR products of expected size in gilthead sea bream, and as expected, the rate of successful amplification of sea bream EST-SSRs was higher in sparids, lower in other perciforms and even lower in species of the Clupeiform and Gadiform orders. We finally determined the polymorphism and the heterozygosity of 63 markers in a wild gilthead sea bream population; fifty-eight loci were found to be polymorphic with the expected heterozygosity and the number of alleles ranging from 0.089 to 0.946 and from 2 to 27, respectively. These tools and markers are expected to enhance the available genetic linkage map in gilthead sea bream, to assist comparative mapping and genome analyses for this species and further with other model fish species and finally to help advance genetic analysis for cultivated and wild populations and accelerate breeding programs. Copyright © 2011 Elsevier B.V. All rights reserved.
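A screen for di- to hexanucleotide SSRs of the kind described in this abstract can be sketched with regular expressions. The minimum repeat counts below are hypothetical thresholds chosen for illustration; the abstract does not state the criteria or software the authors used.

```python
import re

# Hypothetical minimum repeat counts per motif length (not from the paper).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return (motif + motif).find(motif, 1) == len(motif)

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for SSRs in `seq`."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):            # e.g. skip ACAC, report AC instead
                hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)
```

Dividing the total screened sequence length by the number of hits gives a density figure comparable in form to the "one SSR per 2.95 kb" reported above, although the value depends entirely on the thresholds chosen.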
Kangaroo – A pattern-matching program for biological sequences
2002-01-01
Background Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats. PMID:12150718
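The kind of query described in this abstract, finding mononucleotide repeats in coding sequences, can be approximated with a plain regular expression. This is a generic sketch and does not reproduce Kangaroo's own query syntax, which the abstract does not show; the function name and the default minimum run length of 8 are assumptions.

```python
import re

def mononucleotide_runs(seq, min_len=8):
    """Return (start, base, length) for each single-base run of at least min_len."""
    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), m.end() - m.start())
            for m in pattern.finditer(seq.upper())]
```

Runs found this way in a coding region would be candidate sites for the frameshift mutations associated with mismatch repair deficiency, which is the use case the abstract describes.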
Evaluating the protein coding potential of exonized transposable element sequences
Piriyapongsa, Jittima; Rutledge, Mark T; Patel, Sanil; Borodovsky, Mark; Jordan, I King
2007-01-01
Background Transposable element (TE) sequences, once thought to be merely selfish or parasitic members of the genomic community, have been shown to contribute a wide variety of functional sequences to their host genomes. Analysis of complete genome sequences has turned up numerous cases where TE sequences have been incorporated as exons into mRNAs, and it is widely assumed that such 'exonized' TEs encode protein sequences. However, the extent to which TE-derived sequences actually encode proteins is unknown and a matter of some controversy. We have tried to address this outstanding issue from two perspectives: i-by evaluating ascertainment biases related to the search methods used to uncover TE-derived protein coding sequences (CDS) and ii-through a probabilistic codon-frequency based analysis of the protein coding potential of TE-derived exons. Results We compared the ability of three classes of sequence similarity search methods to detect TE-derived sequences among data sets of experimentally characterized proteins: 1-a profile-based hidden Markov model (HMM) approach, 2-BLAST methods and 3-RepeatMasker. Profile based methods are more sensitive and more selective than the other methods evaluated. However, the application of profile-based search methods to the detection of TE-derived sequences among well-curated experimentally characterized protein data sets did not turn up many more cases than had been previously detected and nowhere near as many cases as recent genome-wide searches have. We observed that the different search methods used were complementary in the sense that they yielded largely non-overlapping sets of hits and differed in their ability to recover known cases of TE-derived CDS. The probabilistic analysis of TE-derived exon sequences indicates that these sequences have low protein coding potential on average.
In particular, non-autonomous TEs that do not encode protein sequences, such as Alu elements, are frequently exonized but unlikely to encode protein sequences. Conclusion The exaptation of the numerous TE sequences found in exons as bona fide protein coding sequences may prove to be far less common than has been suggested by the analysis of complete genomes. We hypothesize that many exonized TE sequences actually function as post-transcriptional regulators of gene expression, rather than coding sequences, which may act through a variety of double stranded RNA related regulatory pathways. Indeed, their relatively high copy numbers and similarity to sequences dispersed throughout the genome suggest that exonized TE sequences could serve as master regulators with a wide scope of regulatory influence. Reviewers: This article was reviewed by Itai Yanai, Kateryna D. Makova, Melissa Wilson (nominated by Kateryna D. Makova) and Cedric Feschotte (nominated by John M. Logsdon Jr.). PMID:18036258
FARME DB: a functional antibiotic resistance element database
Wallace, James C.; Port, Jesse A.; Smith, Marissa N.; Faustman, Elaine M.
2017-01-01
Antibiotic resistance (AR) is a major global public health threat but few resources exist that catalog AR genes outside of a clinical context. Current AR sequence databases are assembled almost exclusively from genomic sequences derived from clinical bacterial isolates and thus do not include many microbial sequences derived from environmental samples that confer resistance in functional metagenomic studies. These environmental metagenomic sequences often show little or no similarity to AR sequences from clinical isolates using standard classification criteria. In addition, existing AR databases provide no information about flanking sequences containing regulatory or mobile genetic elements. To help address this issue, we created an annotated database of DNA and protein sequences derived exclusively from environmental metagenomic sequences showing AR in laboratory experiments. Our Functional Antibiotic Resistant Metagenomic Element (FARME) database is a compilation of publicly available DNA sequences and predicted protein sequences conferring AR as well as regulatory elements, mobile genetic elements and predicted proteins flanking antibiotic resistant genes. FARME is the first database to focus on functional metagenomic AR gene elements and provides a resource to better understand AR in the 99% of bacteria which cannot be cultured and the relationship between environmental AR sequences and antibiotic resistant genes derived from cultured isolates. Database URL: http://staff.washington.edu/jwallace/farme PMID:28077567
High-Throughput Sequencing and De Novo Assembly of the Isatis indigotica Transcriptome
Tang, Xiaoqing; Xiao, Yunhua; Lv, Tingting; Wang, Fangquan; Zhu, QianHao; Zheng, Tianqing; Yang, Jie
2014-01-01
Background Isatis indigotica, the source of the traditional Chinese medicine Radix isatidis (Ban-Lan-Gen), is an extremely important economic crop in China. To facilitate biological, biochemical and molecular research on the medicinal chemicals in I. indigotica, here we report the first I. indigotica transcriptome generated by RNA sequencing (RNA-seq). Results An RNA-seq library was created using RNA extracted from a mixed sample including leaf and root. A total of 33,238 unigenes were assembled from more than 28 million high-quality short reads. The quality of the assembly was experimentally examined by cDNA sequencing of seven randomly selected unigenes. Based on BLAST searches, 28,184 unigenes had a hit in at least one of the protein and nucleotide databases used in this study, and 8 unigenes were found to be associated with biosynthesis of indole and its derivatives. According to Gene Ontology classification, 22,365 unigenes were categorized into 48 functional groups. Furthermore, Clusters of Orthologous Groups and Swiss-Prot annotations were assigned for 7,707 and 18,679 unigenes, respectively. Analysis of repeat motifs identified 6,400 simple sequence repeat markers in 4,509 unigenes. Conclusion Our data provide a comprehensive sequence resource for molecular study of I. indigotica. Our results will facilitate studies on the functions of genes involved in the indole alkaloid biosynthesis pathway and on metabolism of nitrogen and indole alkaloids in I. indigotica and its related species. PMID:25259890
Gupta, Radhey S.
1998-01-01
The presence of shared conserved insertion or deletions (indels) in protein sequences is a special type of signature sequence that shows considerable promise for phylogenetic inference. An alternative model of microbial evolution based on the use of indels of conserved proteins and the morphological features of prokaryotic organisms is proposed. In this model, extant archaebacteria and gram-positive bacteria, which have a simple, single-layered cell wall structure, are termed monoderm prokaryotes. They are believed to be descended from the most primitive organisms. Evidence from indels supports the view that the archaebacteria probably evolved from gram-positive bacteria, and I suggest that this evolution occurred in response to antibiotic selection pressures. Evidence is presented that diderm prokaryotes (i.e., gram-negative bacteria), which have a bilayered cell wall, are derived from monoderm prokaryotes. Signature sequences in different proteins provide a means to define a number of different taxa within prokaryotes (namely, low G+C and high G+C gram-positive, Deinococcus-Thermus, cyanobacteria, chlamydia-cytophaga related, and two different groups of Proteobacteria) and to indicate how they evolved from a common ancestor. Based on phylogenetic information from indels in different protein sequences, it is hypothesized that all eukaryotes, including amitochondriate and aplastidic organisms, received major gene contributions from both an archaebacterium and a gram-negative eubacterium. In this model, the ancestral eukaryotic cell is a chimera that resulted from a unique fusion event between the two separate groups of prokaryotes followed by integration of their genomes. PMID:9841678
Alu sequence involvement in transcriptional insulation of the keratin 18 gene in transgenic mice.
Thorey, I S; Ceceña, G; Reynolds, W; Oshima, R G
1993-01-01
The human keratin 18 (K18) gene is expressed in a variety of adult simple epithelial tissues, including liver, intestine, lung, and kidney, but is not normally found in skin, muscle, heart, spleen, or most of the brain. Transgenic animals derived from the cloned K18 gene express the transgene in appropriate tissues at levels directly proportional to the copy number and independently of the sites of integration. We have investigated in transgenic mice the dependence of K18 gene expression on the distal 5' and 3' flanking sequences and upon the RNA polymerase III promoter of an Alu repetitive DNA transcription unit immediately upstream of the K18 promoter. Integration site-independent expression of tandemly duplicated K18 transgenes requires the presence of either an 825-bp fragment of the 5' flanking sequence or the 3.5-kb 3' flanking sequence. Mutation of the RNA polymerase III promoter of the Alu element within the 825-bp fragment abolishes copy number-dependent expression in kidney but does not abolish integration site-independent expression when assayed in the absence of the 3' flanking sequence of the K18 gene. The characteristics of integration site-independent expression and copy number-dependent expression are separable. In addition, the formation of the chromatin state of the K18 gene, which likely restricts the tissue-specific expression of this gene, is not dependent upon the distal flanking sequences of the 10-kb K18 gene but rather may depend on internal regulatory regions of the gene. PMID:7692231
A simple derivation of Lorentz self-force
NASA Astrophysics Data System (ADS)
Haque, Asrarul
2014-09-01
We derive the Lorentz self-force for a charged particle in arbitrary non-relativistic motion by averaging the retarded fields. The derivation is simple and at the same time pedagogically accessible. We obtain the radiation reaction for a charged particle moving in a circle. We pin down the underlying concept of mass renormalization.
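For orientation, the result this abstract refers to is usually quoted in the following standard textbook form (SI units). The abstract itself does not state the formula, so the notation here is an assumption: q is the charge, a the acceleration, v the velocity, and ω the angular frequency of the circular motion.

```latex
\mathbf{F}_{\mathrm{rad}} = \frac{q^{2}}{6\pi\varepsilon_{0}c^{3}}\,\dot{\mathbf{a}},
\qquad
\text{uniform circular motion: } \dot{\mathbf{a}} = -\omega^{2}\mathbf{v}
\;\Rightarrow\;
\mathbf{F}_{\mathrm{rad}} = -\frac{q^{2}\omega^{2}}{6\pi\varepsilon_{0}c^{3}}\,\mathbf{v}.
```

In the circular case the force is antiparallel to the velocity, so it acts as a drag that removes energy at the rate at which the charge radiates.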
Onozawa, Masahiro; Zhang, Zhenhua; Kim, Yoo Jung; Goldberg, Liat; Varga, Tamas; Bergsagel, P Leif; Kuehl, W Michael; Aplan, Peter D
2014-05-27
We used the I-SceI endonuclease to produce DNA double-strand breaks (DSBs) and observed that a fraction of these DSBs were repaired by insertion of sequences, which we termed "templated sequence insertions" (TSIs), derived from distant regions of the genome. These TSIs were derived from genic, retrotransposon, or telomere sequences and were not deleted from the donor site in the genome, leading to the hypothesis that they were derived from reverse-transcribed RNA. Cotransfection of RNA and an I-SceI expression vector demonstrated insertion of RNA-derived sequences at the DNA-DSB site, and TSIs were suppressed by reverse-transcriptase inhibitors. Both observations support the hypothesis that TSIs were derived from RNA templates. In addition, similar insertions were detected at sites of DNA DSBs induced by transcription activator-like effector nuclease proteins. Whole-genome sequencing of myeloma cell lines revealed additional TSIs, demonstrating that repair of DNA DSBs via insertion was not restricted to experimentally produced DNA DSBs. Analysis of publicly available databases revealed that many of these TSIs are polymorphic in the human genome. Taken together, these results indicate that insertional events should be considered as alternatives to gross chromosomal rearrangements in the interpretation of whole-genome sequence data and that this mutagenic form of DNA repair may play a role in genetic disease, exon shuffling, and mammalian evolution.
Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François
2015-01-01
Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486
Jones, David T; Kandathil, Shaun M
2018-04-26
In addition to substitution frequency data from protein sequence alignments, many state-of-the-art methods for contact prediction rely on additional sources of information, or features, of protein sequences in order to predict residue-residue contacts, such as solvent accessibility, predicted secondary structure, and scores from other contact prediction methods. It is unclear how much of this information is needed to achieve state-of-the-art results. Here, we show that using deep neural network models, simple alignment statistics contain sufficient information to achieve state-of-the-art precision. Our prediction method, DeepCov, uses fully convolutional neural networks operating on amino-acid pair frequency or covariance data derived directly from sequence alignments, without using global statistical methods such as sparse inverse covariance or pseudolikelihood estimation. Comparisons against CCMpred and MetaPSICOV2 show that using pairwise covariance data calculated from raw alignments as input allows us to match or exceed the performance of both of these methods. Almost all of the achieved precision is obtained when considering relatively local windows (around 15 residues) around any member of a given residue pairing; larger window sizes have comparable performance. Assessment on a set of shallow sequence alignments (fewer than 160 effective sequences) indicates that the new method is substantially more precise than CCMpred and MetaPSICOV2 in this regime, suggesting that improved precision is attainable on smaller sequence families. Overall, the performance of DeepCov is competitive with the state of the art, and our results demonstrate that global models, which employ features from all parts of the input alignment when predicting individual contacts, are not strictly needed in order to attain precise contact predictions. DeepCov is freely available at https://github.com/psipred/DeepCov.
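The pairwise covariance statistic mentioned above can be sketched for a single pair of alignment columns. This is not the DeepCov code: the function name is hypothetical, and the sketch assumes unweighted sequences, no pseudocounts, and the plain definition cov(a, b) = f_ij(a, b) - f_i(a) f_j(b).

```python
from collections import Counter

def pair_covariance(alignment, i, j):
    """Covariance of residue pairs (a, b) observed at alignment columns i and j."""
    n = len(alignment)
    fi = Counter(seq[i] for seq in alignment)             # single-column counts at i
    fj = Counter(seq[j] for seq in alignment)             # single-column counts at j
    fij = Counter((seq[i], seq[j]) for seq in alignment)  # joint counts for (i, j)
    # Joint frequency minus the product of the two marginal frequencies.
    return {(a, b): fij[(a, b)] / n - (fi[a] / n) * (fj[b] / n)
            for a in fi for b in fj}

toy_alignment = ["AC", "AC", "GT", "GT"]  # two perfectly co-varying columns
for pair, value in sorted(pair_covariance(toy_alignment, 0, 1).items()):
    print(pair, value)
```

Repeating this for every column pair and every residue pair yields the kind of raw covariance input the abstract describes, with no global model fitted across the alignment.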
Marques, Isabel; Montgomery, Sean A; Barker, Michael S; Macfarlane, Terry D; Conran, John G; Catalán, Pilar; Rieseberg, Loren H; Rudall, Paula J; Graham, Sean W
2016-04-01
Relatively little is known about species-level genetic diversity in flowering plants outside the eudicots and monocots, and it is often unclear how to interpret genetic patterns in lineages with whole-genome duplications. We addressed these issues in a polyploid representative of Hydatellaceae, part of the water-lily order Nymphaeales. We examined a transcriptome of Trithuria submersa for evidence of recent whole-genome duplication, and applied transcriptome-derived microsatellite (expressed-sequence tag simple-sequence repeat (EST-SSR)) primers to survey genetic variation in populations across its range in mainland Australia. A transcriptome-based Ks plot revealed at least one recent polyploidization event, consistent with fixed heterozygous genotypes representing underlying sets of homeologous loci. A strong genetic division coincides with a trans-Nullarbor biogeographic boundary. Patterns of 'allelic' variation (no more than two variants per EST-SSR genotype) and recently published chromosomal evidence are consistent with the predicted polyploidization event and substantial homozygosity underlying fixed heterozygote SSR genotypes, which in turn reflect a selfing mating system. The Nullarbor Plain is a barrier to gene flow between two deep lineages of T. submersa that may represent cryptic species. The markers developed here should also be useful for further disentangling species relationships, and provide a first step towards future genomic studies in Trithuria. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Insights into natural products biosynthesis from analysis of 490 polyketide synthases from Fusarium.
Brown, Daren W; Proctor, Robert H
2016-04-01
Species of the fungus Fusarium collectively cause disease on almost all crop plants and produce numerous natural products (NPs), including some of the mycotoxins of greatest concern to agriculture. Many Fusarium NPs are derived from polyketide synthases (PKSs), large multi-domain enzymes that catalyze sequential condensation of simple carboxylic acids to form polyketides. To gain insight into the biosynthesis of polyketide-derived NPs in Fusarium, we retrieved 488 PKS gene sequences from genome sequences of 31 species of the fungus. In addition to these apparently functional PKS genes, the genomes collectively included 81 pseudogenized PKS genes. Phylogenetic analysis resolved the PKS genes into 67 clades, and based on multiple lines of evidence, we propose that homologs in each clade are responsible for synthesis of a polyketide that is distinct from those synthesized by PKSs in other clades. The presence and absence of PKS genes among the species examined indicated marked differences in distribution of PKS homologs. Comparisons of Fusarium PKS genes and genes flanking them to those from other Ascomycetes provided evidence that Fusarium has the genetic potential to synthesize multiple NPs that are the same or similar to those reported in other fungi, but that have not yet been reported in Fusarium. The results also highlight ways in which such analyses can help guide identification of novel Fusarium NPs and differences in NP biosynthetic capabilities that exist among fungi. Published by Elsevier Inc.
Highly Efficient Targeted Mutagenesis in Mice Using TALENs
Panda, Sudeepta Kumar; Wefers, Benedikt; Ortiz, Oskar; Floss, Thomas; Schmid, Bettina; Haass, Christian; Wurst, Wolfgang; Kühn, Ralf
2013-01-01
Targeted mouse mutants are instrumental for the analysis of gene function in health and disease. We recently provided proof-of-principle for the fast-track mutagenesis of the mouse genome, using transcription activator-like effector nucleases (TALENs) in one-cell embryos. Here we report a routine procedure for the efficient production of disease-related knockin and knockout mutants, using improved TALEN mRNAs that include a plasmid-coded poly(A) tail (TALEN-95A), circumventing the problematic in vitro polyadenylation step. To knock out the C9orf72 gene as a model of frontotemporal lobar degeneration, TALEN-95A mutagenesis induced sequence deletions in 41% of pups derived from microinjected embryos. Using TALENs together with mutagenic oligodeoxynucleotides, we introduced amyotrophic lateral sclerosis patient-derived missense mutations in the fused in sarcoma (Fus) gene at a rate of 6.8%. For the simple identification of TALEN-induced mutants and their progeny we validate high-resolution melt analysis (HRMA) of PCR products as a sensitive and universal genotyping tool. Furthermore, HRMA of off-target sites in mutant founder mice revealed no evidence for undesired TALEN-mediated processing of related genomic sequences. The combination of TALEN-95A mRNAs for enhanced mutagenesis and of HRMA for simplified genotyping enables the accelerated, routine production of new mouse models for the study of genetic disease mechanisms. PMID:23979585
Ramchiary, Nirala; Nguyen, Van Dan; Li, Xiaonan; Hong, Chang Pyo; Dhandapani, Vignesh; Choi, Su Ryun; Yu, Ge; Piao, Zhong Yun; Lim, Yong Pyo
2011-01-01
Genic microsatellite markers, also known as functional markers, are preferred over anonymous markers as they reveal the variation in transcribed genes among individuals. In this study, we developed a total of 707 expressed sequence tag-derived simple sequence repeat markers (EST-SSRs) and used them to develop a high-density integrated map from four individual mapping populations of B. rapa. This map contains a total of 1426 markers, consisting of 306 EST-SSRs, 153 intron polymorphic markers, 395 bacterial artificial chromosome-derived SSRs (BAC-SSRs), and 572 public SSRs and other markers covering a total distance of 1245.9 cM of the B. rapa genome. Analysis of allelic diversity in 24 B. rapa germplasm accessions using 234 mapped EST-SSR markers showed that the majority of EST-SSRs amplified 2 alleles, although the number of alleles amplified ranged from 2 to 8. Transferability analysis of 167 EST-SSRs in 35 species belonging to cultivated and wild brassica relatives showed 42.51% (Sysimprium leteum) to 100% (B. carinata, B. juncea, and B. napus) amplification. Our newly developed EST-SSRs and high-density linkage map based on highly transferable genic markers would facilitate the molecular mapping of quantitative trait loci and the positional cloning of specific genes, in addition to marker-assisted selection and comparative genomic studies of B. rapa with other related species. PMID:21768136
Simple chained guide trees give high-quality protein multiple sequence alignments
Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.
2014-01-01
Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random. PMID:25002495
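A chained guide tree adds one sequence at a time to a growing alignment. The sketch below writes such a tree in Newick notation; the function name is hypothetical, and the choice of Newick as the output format is an assumption for illustration rather than something stated in the abstract.

```python
def chained_guide_tree(names):
    """Return a Newick string for a fully chained (caterpillar) tree over names."""
    if len(names) < 2:
        raise ValueError("need at least two sequences")
    tree = "(%s,%s)" % (names[0], names[1])
    # Each further sequence is joined to the whole subtree built so far.
    for name in names[2:]:
        tree = "(%s,%s)" % (tree, name)
    return tree + ";"

print(chained_guide_tree(["seqA", "seqB", "seqC", "seqD"]))
```

Because the tree depends only on the order of the input list, shuffling that list gives the random-order chained trees mentioned in the abstract.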
Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine
Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson
2011-01-01
Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...
Simulations Using Random-Generated DNA and RNA Sequences
ERIC Educational Resources Information Center
Bryce, C. F. A.
1977-01-01
Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
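The original program was written in BASIC and is not reproduced in this record. A rough Python equivalent of the exercises listed above might look like the following; all function names and the seed parameter are illustrative assumptions.

```python
import random
from collections import Counter

def random_sequence(length, alphabet="ACGT", seed=None):
    """Random sequence over the alphabet; pass 'ACGU' for RNA."""
    rng = random.Random(seed)  # a seeded generator makes classroom runs repeatable
    return "".join(rng.choice(alphabet) for _ in range(length))

def complement(dna):
    """Base-by-base complement of a DNA strand (not reversed)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))

def base_composition(seq):
    """Fraction of each base present in seq."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in sorted(counts)}

def codon_counts(seq):
    """Counts of non-overlapping triplets read from position 0; a trailing partial codon is dropped."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))

dna = random_sequence(30, seed=1)
print(dna)
print(complement(dna))
print(base_composition(dna))
print(codon_counts(dna).most_common(3))
```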
USDA-ARS's Scientific Manuscript database
The advent of next-generation sequencing technologies has been a boon to the cost-effective development of molecular markers, particularly in non-model species. Here, we demonstrate the efficiency of microsatellite or simple sequence repeat (SSR) marker development from short-read sequences using th...
Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism
USDA-ARS's Scientific Manuscript database
Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...
Miller, K.G.; Mountain, Gregory S.; Browning, J.V.; Kominz, M.; Sugarman, P.J.; Christie-Blick, N.; Katz, M.E.; Wright, J.D.
1998-01-01
The New Jersey Sea Level Transect was designed to evaluate the relationships among global sea level (eustatic) change, unconformity-bounded sequences, and variations in subsidence, sediment supply, and climate on a passive continental margin. By sampling and dating Cenozoic strata from coastal plain and continental slope locations, we show that sequence boundaries correlate (within ±0.5 myr) regionally (onshore-offshore) and interregionally (New Jersey-Alabama-Bahamas), implicating a global cause. Sequence boundaries correlate with δ18O increases for at least the past 42 myr, consistent with an ice volume (glacioeustatic) control, although a causal relationship is not required because of uncertainties in ages and correlations. Evidence for a causal connection is provided by preliminary Miocene data from slope Site 904 that directly link δ18O increases with sequence boundaries. We conclude that variation in the size of ice sheets has been a primary control on the formation of sequence boundaries since ~42 Ma. We speculate that prior to this, the growth and decay of small ice sheets caused small-amplitude sea level changes (<20 m) in this supposedly ice-free world because Eocene sequence boundaries also appear to correlate with minor δ18O increases. Subsidence estimates (backstripping) indicate amplitudes of short-term (million-year scale) lowerings that are consistent with estimates derived from δ18O studies (25-50 m in the Oligocene-middle Miocene and 10-20 m in the Eocene) and a long-term lowering of 150-200 m over the past 65 myr, consistent with estimates derived from volume changes on mid-ocean ridges. Although our results are consistent with the general number and timing of Paleocene to middle Miocene sequences published by workers at Exxon Production Research Company, our estimates of sea level amplitudes are substantially lower than theirs.
Lithofacies patterns within sequences follow repetitive, predictable patterns: (1) coastal plain sequences consist of basal transgressive sands overlain by regressive highstand silts and quartz sands; and (2) although slope lithofacies variations are subdued, reworked sediments constitute lowstand deposits, causing the strongest, most extensive seismic reflections. Despite a primary eustatic control on sequence boundaries, New Jersey sequences were also influenced by changes in tectonics, sediment supply, and climate. During the early to middle Eocene, low siliciclastic and high pelagic input associated with warm climates resulted in widespread carbonate deposition and thin sequences. Late middle Eocene and earliest Oligocene cooling events curtailed carbonate deposition in the coastal plain and slope, respectively, resulting in a switch to siliciclastic sedimentation. In onshore areas, Oligocene sequences are thin owing to low siliciclastic and pelagic input, and their distribution is patchy, reflecting migration or progradation of depocenters; in contrast, Miocene onshore sequences are thicker, reflecting increased sediment supply, and they are more complete downdip owing to simple tectonics. We conclude that the New Jersey margin provides a natural laboratory for unraveling complex interactions of eustasy, tectonics, changes in sediment supply, and climate change.
Haider, Nadia
2017-01-01
Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.
Characterization and Amplification of Gene-Based Simple Sequence Repeat (SSR) Markers in Date Palm.
Zhao, Yongli; Keremane, Manjunath; Prakash, Channapatna S; He, Guohao
2017-01-01
The paucity of molecular markers limits the application of genetic and genomic research in date palm (Phoenix dactylifera L.). Availability of expressed sequence tag (EST) sequences in date palm may provide a good resource for developing gene-based markers. This study characterizes a substantial fraction of transcriptome sequences containing simple sequence repeats (SSRs) from the EST sequences in date palm. The EST sequences studied are mainly homologous to those of Elaeis guineensis and Musa acuminata. A total of 911 gene-based SSR markers, characterized with functional annotations, have provided a useful basis not only for discovering candidate genes and understanding genetic basis of traits of interest but also for developing genetic and genomic tools for molecular research in date palm, such as diversity study, quantitative trait locus (QTL) mapping, and molecular breeding. The procedures of DNA extraction, polymerase chain reaction (PCR) amplification of these gene-based SSR markers, and gel electrophoresis of PCR products are described in this chapter.
2011-01-01
Background Over recent years, a growing effort has been made to develop microsatellite markers for the genomic analysis of the common bean (Phaseolus vulgaris) to broaden the knowledge of the molecular genetic basis of this species. The availability of large sets of expressed sequence tags (ESTs) in public databases has given rise to an expedient approach for the identification of SSRs (Simple Sequence Repeats), specifically EST-derived SSRs. In the present work, a battery of new microsatellite markers was obtained from a search of the Phaseolus vulgaris EST database. The diversity, degree of transferability and polymorphism of these markers were tested. Results From 9,583 valid ESTs, 4,764 had microsatellite motifs, from which 377 were used to design primers, and 302 (80.11%) showed good amplification quality. To analyze transferability, a group of 167 SSRs were tested, and the results showed that they were 82% transferable across at least one species. The highest amplification rates were observed between the species from the Phaseolus (63.7%), Vigna (25.9%), Glycine (19.8%), Medicago (10.2%), Dipterix (6%) and Arachis (1.8%) genera. The average PIC (Polymorphism Information Content) varied from 0.53 for genomic SSRs to 0.47 for EST-SSRs, and the average number of alleles per locus was 4 and 3, respectively. Among the 315 newly tested SSRs in the BJ (BAT93 X Jalo EEP558) population, 24% (76) were polymorphic. The integration of these segregant loci into a framework map composed of 123 previously obtained SSR markers yielded a total of 199 segregant loci, of which 182 (91.5%) were mapped to 14 linkage groups, resulting in a map length of 1,157 cM. Conclusions A total of 302 newly developed EST-SSR markers, showing good amplification quality, are available for the genetic analysis of Phaseolus vulgaris. These markers showed satisfactory rates of transferability, especially between species that have great economic and genomic values. 
Their diversity was comparable to genomic SSRs, and they were incorporated in the common bean reference genetic map, which constitutes an important contribution to and advance in Phaseolus vulgaris genomic research. PMID:21554695
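The polymorphism information content (PIC) reported above is commonly computed from allele frequencies as PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The sketch below assumes that standard definition; the abstract does not state which formula the authors used, and the allele frequencies are invented for illustration.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies that sum to 1 (hypothetical values below).
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term for matings that are uninformative despite heterozygosity.
    cross = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross

print(pic([0.5, 0.5]))                # 0.375
print(pic([0.25, 0.25, 0.25, 0.25]))  # 0.703125
```

A locus with more, evenly distributed alleles gives a higher PIC, which is why the value is used alongside the number of alleles per locus when comparing marker classes.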
NASA Astrophysics Data System (ADS)
Zarifi, Keyvan; Gershman, Alex B.
2006-12-01
We analyze the performance of two popular blind subspace-based signature waveform estimation techniques proposed by Wang and Poor and Buzzi and Poor for direct-sequence code division multiple-access (DS-CDMA) systems with unknown correlated noise. Using the first-order perturbation theory, analytical expressions for the mean-square error (MSE) of these algorithms are derived. We also obtain simple high SNR approximations of the MSE expressions which explicitly clarify how the performance of these techniques depends on the environmental parameters and how it is related to that of the conventional techniques that are based on the standard white noise assumption. Numerical examples further verify the consistency of the obtained analytical results with simulation results.
NASA Astrophysics Data System (ADS)
Liu, Xiaomei; Li, Shengtao; Zhang, Kanjian
2017-08-01
In this paper, we solve an optimal control problem for a class of time-invariant switched stochastic systems with multi-switching times, where the objective is to minimise a cost functional with different costs defined on the states. In particular, we focus on problems in which a pre-specified sequence of active subsystems is given and the switching times are the only control variables. Based on the calculus of variation, we derive the gradient of the cost functional with respect to the switching times on an especially simple form, which can be directly used in gradient descent algorithms to locate the optimal switching instants. Finally, a numerical example is given, highlighting the validity of the proposed methodology.
Lundell, Henrik; Alexander, Daniel C; Dyrby, Tim B
2014-08-01
Stimulated echo acquisition mode (STEAM) diffusion MRI can be advantageous over pulsed-gradient spin-echo (PGSE) for diffusion times that are long compared with T2. It therefore has potential for biomedical diffusion imaging applications at 7T and above where T2 is short. However, gradient pulses other than the diffusion gradients in the STEAM sequence contribute much greater diffusion weighting than in PGSE and lead to a disrupted experimental design. Here, we introduce a simple compensation to the STEAM acquisition that avoids the orientational bias and disrupted experiment design that these gradient pulses can otherwise produce. The compensation is simple to implement by adjusting the gradient vectors in the diffusion pulses of the STEAM sequence, so that the net effective gradient vector including contributions from diffusion and other gradient pulses is as the experiment intends. High angular resolution diffusion imaging (HARDI) data were acquired with and without the proposed compensation. The data were processed to derive standard diffusion tensor imaging (DTI) maps, which highlight the need for the compensation. When the other gradient pulses are ignored, a bias in DTI parameters from STEAM acquisition is found, due both to confounds in the analysis and the experiment design. Retrospectively correcting the analysis with a calculation of the full B matrix can partly correct for these confounds, but an acquisition that is compensated as proposed is needed to remove the effect entirely. © 2014 The Authors. NMR in Biomedicine published by John Wiley & Sons, Ltd.
Rao, R Nishanth; Mm, Balamurali; Maiti, Barnali; Thakuria, Ranjit; Chanda, Kaushik
2018-03-12
An expeditious catalyst-free heteroannulation reaction for imidazo[1,2-a]pyridines/pyrimidines/pyrazines was developed in green solvent under microwave irradiation. Using H2O-IPA as the reaction medium, various substituted 2-aminopyridines/pyrazines/pyrimidines underwent annulation reaction with α-bromoketones under microwave irradiation to provide the corresponding imidazo[1,2-a]pyridines/pyrimidines/pyrazines in excellent yields. The synthetic methodology appears to be very simple and superior to the already reported procedures with the high abundance of commercial reagents and great ability in expanding the molecular diversity. The present synthetic sequence is visualized as an environmentally benign process which allows the introduction of three points of structural diversity to expand chemical space with excellent purity and yields. The anti-inflammatory and antimicrobial activities of the derivatives were evaluated. Screening results uncovered three derivatives with strong inhibition of albumin denaturation and two derivatives were active on Proteus and Klebsiella bacteria. These positive bioassay results implied that the library of potential anti-inflammatory agents could be rapidly prepared in an ecofriendly manner, and provided new insights into drug discovery for medicinal chemists.
Isolation of a novel mutant gene for soil-surface rooting in rice (Oryza sativa L.)
2013-01-01
Background Root system architecture is an important trait affecting the uptake of nutrients and water by crops. Shallower root systems preferentially take up nutrients from the topsoil and help avoid unfavorable environments in deeper soil layers. We have found a soil-surface rooting mutant from an M2 population that was regenerated from seed calli of a japonica rice cultivar, Nipponbare. In this study, we examined the genetic and physiological characteristics of this mutant. Results The primary roots of the mutant showed no gravitropic response from the seedling stage on, whereas the gravitropic response of the shoots was normal. Segregation analyses by using an F2 population derived from a cross between the soil-surface rooting mutant and wild-type Nipponbare indicated that the trait was controlled by a single recessive gene, designated as sor1. Fine mapping by using an F2 population derived from a cross between the mutant and an indica rice cultivar, Kasalath, revealed that sor1 was located within a 136-kb region between the simple sequence repeat markers RM16254 and 2935-6 on the terminal region of the short arm of chromosome 4, where 13 putative open reading frames (ORFs) were found. We sequenced these ORFs and detected a 33-bp deletion in one of them, Os04g0101800. Transgenic plants of the mutant transformed with the genomic fragment carrying the Os04g0101800 sequence from Nipponbare showed normal gravitropic responses and no soil-surface rooting. Conclusion These results suggest that sor1, a rice mutant causing soil-surface rooting and altered root gravitropic response, is allelic to Os04g0101800, and that a 33-bp deletion in the coding region of this gene causes the mutant phenotypes. PMID:24280269
Isolation of a novel mutant gene for soil-surface rooting in rice (Oryza sativa L.).
Hanzawa, Eiko; Sasaki, Kazuhiro; Nagai, Shinsei; Obara, Mitsuhiro; Fukuta, Yoshimichi; Uga, Yusaku; Miyao, Akio; Hirochika, Hirohiko; Higashitani, Atsushi; Maekawa, Masahiko; Sato, Tadashi
2013-11-20
Root system architecture is an important trait affecting the uptake of nutrients and water by crops. Shallower root systems preferentially take up nutrients from the topsoil and help avoid unfavorable environments in deeper soil layers. We have found a soil-surface rooting mutant from an M2 population that was regenerated from seed calli of a japonica rice cultivar, Nipponbare. In this study, we examined the genetic and physiological characteristics of this mutant. The primary roots of the mutant showed no gravitropic response from the seedling stage on, whereas the gravitropic response of the shoots was normal. Segregation analyses by using an F2 population derived from a cross between the soil-surface rooting mutant and wild-type Nipponbare indicated that the trait was controlled by a single recessive gene, designated as sor1. Fine mapping by using an F2 population derived from a cross between the mutant and an indica rice cultivar, Kasalath, revealed that sor1 was located within a 136-kb region between the simple sequence repeat markers RM16254 and 2935-6 on the terminal region of the short arm of chromosome 4, where 13 putative open reading frames (ORFs) were found. We sequenced these ORFs and detected a 33-bp deletion in one of them, Os04g0101800. Transgenic plants of the mutant transformed with the genomic fragment carrying the Os04g0101800 sequence from Nipponbare showed normal gravitropic responses and no soil-surface rooting. These results suggest that sor1, a rice mutant causing soil-surface rooting and altered root gravitropic response, is allelic to Os04g0101800, and that a 33-bp deletion in the coding region of this gene causes the mutant phenotypes.
Construction of new EST-SSRs for Fusarium resistant wheat breeding.
Yumurtaci, Aysen; Sipahi, Hulya; Al-Abdallat, Ayed; Jighly, Abdulqader; Baum, Michael
2017-06-01
Surveying Fusarium resistance in wheat with easily applicable molecular markers such as simple sequence repeats (SSRs) is a prerequisite for molecular breeding. Expressed sequence tags (ESTs) are one of the main sources for development of new SSR candidates. Therefore, 18,292 publicly available wheat ESTs were mined, and genotyping with 55 newly developed EST-SSR-derived primer pairs produced clear fragments in ten wheat cultivars carrying different levels of Fusarium resistance. Among the proved markers, 23 polymorphic EST-SSRs were obtained and related alleles were mostly found on the B and D genomes. Based on the fragment profiling and similarity analysis, a 327bp amplicon, which was a product of contig 1207 (chromosome 5BL), was detected only in Fusarium head blight (FHB) resistant cultivars (CM82036 and Sumai) and the amino acid sequences showed a similarity to pathogen related proteins. Another FHB resistance related EST-SSR, Contig 556 (chromosome 1BL), produced a 151bp fragment in Sumai and was associated with a wax2-like protein. A polymorphic 204bp fragment, derived from Contig 578 (chromosome 1DL), was generated from root rot (FRR) resistant cultivars (2-49, Altay2000 and Sunco). A total of 98 alleles were displayed with an average of 1.8 alleles per locus and the polymorphic information content (PIC) ranged from 0.11 to 0.78. A dendrogram with two main groups and five sub-groups displayed the highest genetic relationship between FRR resistant cultivars (2-49 and Altay2000), FRR sensitive cultivars (Seri82 and Scout66) and FHB resistant cultivars (CM82036 and Sumai). Thus, exploitation of these candidate EST-SSRs may help to genotype other wheat sources for Fusarium resistance. Copyright © 2017 Elsevier Ltd. All rights reserved.
Nagano, Soichiro; Shirasawa, Kenta; Hirakawa, Hideki; Maeda, Fumi; Ishikawa, Masami; Isobe, Sachiko N
2017-05-12
The strawberry, Fragaria × ananassa, is an allo-octoploid (2n = 8x = 56) and outcrossing species. Although it is the most widely consumed berry crop in the world, its complex genome structure has hindered its genetic and genomic analysis, and thus discrimination of subgenome-specific loci among the homoeologous chromosomes is needed. In the present study, we identified candidate subgenome-specific single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) loci, and constructed a linkage map using an S1 mapping population of the cultivar 'Reikou' with an IStraw90 Axiom® SNP array and previously published SSR markers. The 'Reikou' linkage map consisted of 11,574 loci (11,002 SNPs and 572 SSR loci) spanning 2816.5 cM of 31 linkage groups. The 11,574 loci were located on 4738 unique positions (bins) on the linkage map. Of the mapped loci, 8999 (8588 SNPs and 411 SSR loci) showed a 1:2:1 segregation ratio of AA:AB:BB allele, which suggested the possibility of deriving loci from candidate subgenome-specific sequences. In addition, 2575 loci (2414 SNPs and 161 SSR loci) showed a 3:1 segregation of AB:BB allele, indicating they were derived from homoeologous genomic sequences. Comparative analysis of the homoeologous linkage groups revealed differences in genome structure among the subgenomes. Our results suggest that candidate subgenome-specific loci are randomly located across the genomes, and that there are small- to large-scale structural variations among the subgenomes. The mapped SNPs and SSR loci on the linkage map are expected to be seed points for the construction of pseudomolecules in the octoploid strawberry.
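Classifying a locus by its 1:2:1 or 3:1 segregation ratio is usually done with a chi-square goodness-of-fit test. The following is a minimal sketch under that assumption; the abstract does not describe the authors' actual test or thresholds, and the genotype counts are hypothetical. The critical values are the standard 5% chi-square cut-offs for 1 and 2 degrees of freedom.

```python
# Standard chi-square critical values at alpha = 0.05, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991}

def chi_square(observed, ratio):
    """Goodness-of-fit statistic of observed genotype counts against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def fits_ratio(observed, ratio):
    """True if the counts do not deviate significantly (5% level) from the ratio."""
    df = len(ratio) - 1
    return chi_square(observed, ratio) <= CRITICAL_05[df]

# Hypothetical counts for one locus in a 100-plant population.
print(round(chi_square([24, 52, 24], (1, 2, 1)), 2))  # 0.16
print(fits_ratio([24, 52, 24], (1, 2, 1)))            # True
print(fits_ratio([50, 50], (3, 1)))                   # False
```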
Simkovic, Felix; Thomas, Jens M H; Keegan, Ronan M; Winn, Martyn D; Mayans, Olga; Rigden, Daniel J
2016-07-01
For many protein families, the deluge of new sequence information together with new statistical protocols now allow the accurate prediction of contacting residues from sequence information alone. This offers the possibility of more accurate ab initio (non-homology-based) structure prediction. Such models can be used in structure solution by molecular replacement (MR) where the target fold is novel or is only distantly related to known structures. Here, AMPLE, an MR pipeline that assembles search-model ensembles from ab initio structure predictions ('decoys'), is employed to assess the value of contact-assisted ab initio models to the crystallographer. It is demonstrated that evolutionary covariance-derived residue-residue contact predictions improve the quality of ab initio models and, consequently, the success rate of MR using search models derived from them. For targets containing β-structure, decoy quality and MR performance were further improved by the use of a β-strand contact-filtering protocol. Such contact-guided decoys achieved 14 structure solutions from 21 attempted protein targets, compared with nine for simple Rosetta decoys. Previously encountered limitations were superseded in two key respects. Firstly, much larger targets of up to 221 residues in length were solved, which is far larger than the previously benchmarked threshold of 120 residues. Secondly, contact-guided decoys significantly improved success with β-sheet-rich proteins. Overall, the improved performance of contact-guided decoys suggests that MR is now applicable to a significantly wider range of protein targets than were previously tractable, and points to a direct benefit to structural biology from the recent remarkable advances in sequencing.
Simkovic, Felix; Thomas, Jens M. H.; Keegan, Ronan M.; Winn, Martyn D.; Mayans, Olga; Rigden, Daniel J.
2016-01-01
For many protein families, the deluge of new sequence information together with new statistical protocols now allow the accurate prediction of contacting residues from sequence information alone. This offers the possibility of more accurate ab initio (non-homology-based) structure prediction. Such models can be used in structure solution by molecular replacement (MR) where the target fold is novel or is only distantly related to known structures. Here, AMPLE, an MR pipeline that assembles search-model ensembles from ab initio structure predictions (‘decoys’), is employed to assess the value of contact-assisted ab initio models to the crystallographer. It is demonstrated that evolutionary covariance-derived residue–residue contact predictions improve the quality of ab initio models and, consequently, the success rate of MR using search models derived from them. For targets containing β-structure, decoy quality and MR performance were further improved by the use of a β-strand contact-filtering protocol. Such contact-guided decoys achieved 14 structure solutions from 21 attempted protein targets, compared with nine for simple Rosetta decoys. Previously encountered limitations were superseded in two key respects. Firstly, much larger targets of up to 221 residues in length were solved, which is far larger than the previously benchmarked threshold of 120 residues. Secondly, contact-guided decoys significantly improved success with β-sheet-rich proteins. Overall, the improved performance of contact-guided decoys suggests that MR is now applicable to a significantly wider range of protein targets than were previously tractable, and points to a direct benefit to structural biology from the recent remarkable advances in sequencing. PMID:27437113
Stöcher, Markus; Leb, Victoria; Hölzl, Gabriele; Berg, Jörg
2002-12-01
The real-time PCR technology allows convenient detection and quantification of virus derived DNA. This approach is used in many PCR based assays in clinical laboratories. Detection and quantification of virus derived DNA is usually performed against external controls or external standards. Thus, adequacy within a clinical sample is not monitored. This can be achieved using internal controls that are co-amplified with the specific target within the same reaction vessel. We describe a convenient way to prepare heterologous internal controls as competitors for real-time PCR based assays. The internal controls were devised as competitors in real-time PCR, e.g. LightCycler-PCR. The bacterial neomycin phosphotransferase gene (neo) was used as source for heterologous DNA. Within the neo gene a box was chosen containing sequences for four differently spaced forward primers, one reverse primer, and a pair of neo specific hybridization probes. Pairs of primers were constructed to be composed of virus-specific primer sequences and neo box specific primer sequences. Using those composite primers in conventional preparative PCR, four types of internal controls were amplified from the neo box and subsequently cloned. A panel of the four differently sized internal controls was generated and tested by LightCycler PCR using their virus-specific primers. All four different PCR products were detected with the single pair of neo specific FRET-hybridization probes. The presented approach to generate competitive internal controls for use in LightCycler PCR assays proved convenient and rapid. The obtained internal controls match most PCR product sizes used in clinical routine molecular assays and will help to discriminate true from false negative results.
Dörries, Hans-Henno; Remus, Ivonne; Grönewald, Astrid; Grönewald, Cordt; Berghof-Jäger, Kornelia
2010-03-01
The number of commercially available genetically modified organisms (GMOs) and therefore the diversity of possible target sequences for molecular detection techniques are constantly increasing. As a result, GMO laboratories and the food production industry currently are forced to apply many different methods to reliably test raw material and complex processed food products. Screening methods have become more and more relevant to minimize the analytical effort and to make a preselection for further analysis (e.g., specific identification or quantification of the GMO). A multiplex real-time PCR kit was developed to detect the 35S promoter of the cauliflower mosaic virus, the terminator of the nopaline synthase gene of Agrobacterium tumefaciens, the 35S promoter from the figwort mosaic virus, and the bar gene of the soil bacterium Streptomyces hygroscopicus as the most widely used sequences in GMOs. The kit contains a second assay for the detection of plant-derived DNA to control the quality of the often processed and refined sample material. Additionally, the plant-specific assay comprises a homologous internal amplification control for inhibition control. The determined limits of detection for the five assays were 10 target copies/reaction. No amplification products were observed with DNAs of 26 bacterial species, 25 yeasts, 13 molds, and 41 not genetically modified plants. The specificity of the assays was further demonstrated to be 100% by the specific amplification of DNA derived from reference material from 22 genetically modified crops. The applicability of the kit in routine laboratory use was verified by testing of 50 spiked and unspiked food products. The herein described kit represents a simple and sensitive GMO screening method for the reliable detection of multiple GMO-specific target sequences in a multiplex real-time PCR reaction.
Microsatellites for Lindera species
Craig S. Echt; D. Deemer; T.L. Kubisiak; C.D. Nelson
2006-01-01
Microsatellite markers were developed for conservation genetic studies of Lindera melissifolia (pondberry), a federally endangered shrub of southern bottomland ecosystems. Microsatellite sequences were obtained from DNA libraries that were enriched for the (AC)n simple sequence repeat motif. From 35 clone sequences, 20 primer...
Genetic consequences of cladogenetic vs. anagenetic speciation in endemic plants of oceanic islands
Takayama, Koji; López-Sepúlveda, Patricio; Greimler, Josef; Crawford, Daniel J.; Peñailillo, Patricio; Baeza, Marcelo; Ruiz, Eduardo; Kohl, Gudrun; Tremetsberger, Karin; Gatica, Alejandro; Letelier, Luis; Novoa, Patricio; Novak, Johannes; Stuessy, Tod F.
2015-01-01
Adaptive radiation is a common mode of speciation among plants endemic to oceanic islands. This pattern is one of cladogenesis, or splitting of the founder population, into diverse lineages in divergent habitats. In contrast, endemic species have also evolved primarily by simple transformations from progenitors in source regions. This is anagenesis, whereby the founding population changes genetically and morphologically over time primarily through mutation and recombination. Gene flow among populations is maintained in a homogeneous environment with no splitting events. Genetic consequences of these modes of speciation have been examined in the Juan Fernández Archipelago, which contains two principal islands of differing geological ages. This article summarizes population genetic results (nearly 4000 analyses) from examination of 15 endemic species, involving 1716 and 1870 individuals in 162 and 163 populations (with amplified fragment length polymorphisms and simple sequence repeats, respectively) in the following genera: Drimys (Winteraceae), Myrceugenia (Myrtaceae), Rhaphithamnus (Verbenaceae), Robinsonia (Asteraceae, Senecioneae) and Erigeron (Asteraceae, Astereae). The results indicate that species originating anagenetically show high levels of genetic variation within the island population and no geographic genetic partitioning. This contrasts with cladogenetic species that show less genetic diversity within and among populations. Species that have been derived anagenetically on the younger island (1–2 Ma) contain less genetic variation than those that have anagenetically speciated on the older island (4 Ma). Genetic distinctness among cladogenetically derived species on the older island is greater than among similarly derived species on the younger island. An important point is that the total genetic variation within each genus analysed is comparable, regardless of whether adaptive divergence occurs. PMID:26311732
Remnants of an Ancient Deltaretrovirus in the Genomes of Horseshoe Bats (Rhinolophidae).
Hron, Tomáš; Farkašová, Helena; Gifford, Robert J; Benda, Petr; Hulva, Pavel; Görföl, Tamás; Pačes, Jan; Elleder, Daniel
2018-04-10
Endogenous retrovirus (ERV) sequences provide a rich source of information about the long-term interactions between retroviruses and their hosts. However, most ERVs are derived from a subset of retrovirus groups, while ERVs derived from certain other groups remain extremely rare. In particular, only a single ERV sequence has been identified that shows evidence of being related to an ancient Deltaretrovirus, despite the large number of vertebrate genome sequences now available. In this report, we identify a second example of an ERV sequence putatively derived from a past deltaretroviral infection, in the genomes of several species of horseshoe bats (Rhinolophidae). This sequence represents a fragment of viral genome derived from a single integration. The time of the integration was estimated to be 11-19 million years ago. This finding, together with the previously identified endogenous Deltaretrovirus in long-fingered bats (Miniopteridae), suggests a close association of bats with ancient deltaretroviruses.
Oldach, Klaus H; Peck, David M; Nair, Ramakrishnan M; Sokolova, Maria; Harris, John; Bogacki, Paul; Ballard, Ross
2014-04-17
The nematode Pratylenchus neglectus has a wide host range and is able to feed on the root systems of cereals, oilseeds, grain and pasture legumes. Under the Mediterranean low rainfall environments of Australia, annual Medicago pasture legumes are used in rotation with cereals to fix atmospheric nitrogen and improve soil parameters. Considerable efforts are being made in breeding programs to improve resistance and tolerance to Pratylenchus neglectus in the major crops wheat and barley, which makes it vital to develop appropriate selection tools in medics. A strong source of tolerance to root damage by the root lesion nematode (RLN) Pratylenchus neglectus had previously been identified in line RH-1 (strand medic, M. littoralis). Using RH-1, we have developed a single seed descent (SSD) population of 138 lines by crossing it to the intolerant cultivar Herald. After inoculation, RLN-associated root damage clearly segregated in the population. Genetic analysis was performed by constructing a genetic map using simple sequence repeat (SSR) and gene-based SNP markers. A highly significant quantitative trait locus (QTL), QPnTolMl.1, was identified explaining 49% of the phenotypic variation in the SSD population. All SSRs and gene-based markers in the QTL region were derived from chromosome 1 of the sequenced genome of the closely related species M. truncatula. Gene-based markers were validated in advanced breeding lines derived from the RH-1 parent and also a second RLN tolerance source, RH-2 (M. truncatula ssp. tricycla). Comparative analysis to sequenced legume genomes showed that the physical QTL interval exists as a synteny block in Lotus japonicus, common bean, soybean and chickpea. Furthermore, using the sequenced genome information of M. truncatula, the QTL interval contains 55 genes out of which five are discussed as potential candidate genes responsible for the mapped tolerance. 
The closely linked set of SNP-based PCR markers is directly applicable to select for two different sources of RLN tolerance in breeding programs. Moreover, genome sequence information has allowed proposing candidate genes for further functional analysis and nominates QPnTolMl.1 as a target locus for RLN tolerance in economically important grain legumes, e.g. chickpea.
FOUNTAIN: A JAVA open-source package to assist large sequencing projects
Buerstedde, Jean-Marie; Prill, Florian
2001-01-01
Background Better automation, lower cost per reaction and a heightened interest in comparative genomics have led to a dramatic increase in DNA sequencing activities. Although the large sequencing projects of specialized centers are supported by in-house bioinformatics groups, many smaller laboratories face difficulties managing the appropriate processing and storage of their sequencing output. The challenges include documentation of clones, templates and sequencing reactions, and the storage, annotation and analysis of the large number of generated sequences. Results We describe here a new program, named FOUNTAIN, for the management of large sequencing projects. FOUNTAIN uses the JAVA computer language and data storage in a relational database. Starting with a collection of sequencing objects (clones), the program generates and stores information related to the different stages of the sequencing project using a web browser interface for user input. The generated sequences are subsequently imported and annotated based on BLAST searches against the public databases. In addition, simple algorithms to cluster sequences and determine putative polymorphic positions are implemented. Conclusions A simple, but flexible and scalable software package is presented to facilitate data generation and storage for large sequencing projects. Open source and largely platform and database independent, we wish FOUNTAIN to be improved and extended in a community effort. PMID:11591214
2013-01-01
Background With high quantity and quality data production and low cost, next generation sequencing has the potential to provide new opportunities for plant phylogeographic studies on single and multiple species. Here we present an approach for in silico chloroplast DNA assembly and single nucleotide polymorphism detection from short-read shotgun sequencing. The approach is simple and effective and can be implemented using standard bioinformatic tools. Results The chloroplast genome of Toona ciliata (Meliaceae), 159,514 base pairs long, was assembled from shotgun sequencing on the Illumina platform using de novo assembly of contigs. To evaluate its practicality, value and quality, we compared the short read assembly with an assembly completed using 454 data obtained after chloroplast DNA isolation. Sanger sequence verifications indicated that the Illumina dataset outperformed the longer read 454 data. Pooling of several individuals during preparation of the shotgun library enabled detection of informative chloroplast SNP markers. Following validation, we used the identified SNPs for a preliminary phylogeographic study of T. ciliata in Australia and to confirm low diversity across the distribution. Conclusions Our approach provides a simple method for construction of whole chloroplast genomes from shotgun sequencing of whole genomic DNA using short-read data and no available closely related reference genome (e.g. from the same species or genus). The high coverage of Illumina sequence data also renders this method appropriate for multiplexing and SNP discovery and therefore a useful approach for landscape level studies of evolutionary ecology. PMID:23497206
Amirhaeri, S; Wohlrab, F; Wells, R D
1995-02-17
The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.
Simple-MSSM: a simple and efficient method for simultaneous multi-site saturation mutagenesis.
Cheng, Feng; Xu, Jian-Miao; Xiang, Chao; Liu, Zhi-Qiang; Zhao, Li-Qing; Zheng, Yu-Guo
2017-04-01
The aim was to develop a practically simple and robust multi-site saturation mutagenesis (MSSM) method that enables simultaneous recombination of amino acid positions for focused mutant library generation. A general restriction enzyme-free and ligase-free MSSM method (Simple-MSSM) was developed based on prolonged overlap extension PCR (POE-PCR) and Simple Cloning techniques. As a proof of principle of Simple-MSSM, the gene of eGFP (enhanced green fluorescent protein) was used as a template gene for simultaneous mutagenesis of five codons. Forty-eight randomly selected clones were sequenced. Sequencing revealed that all the 48 clones showed at least one mutant codon (mutation efficiency = 100%), and 46 out of the 48 clones had mutations at all the five codons. The obtained diversities at these five codons are 27, 24, 26, 26 and 22, respectively, which correspond to 84, 75, 81, 81, 69% of the theoretical diversity offered by NNK-degeneration (32 codons; NNK, K = T or G). The enzyme-free Simple-MSSM method can simultaneously and efficiently saturate five codons within one day, and therefore avoid missing interactions between residues in interacting amino acid networks.
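The diversity percentages quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are ours, and the observed counts are copied from the abstract rather than recomputed from sequencing data.

```python
from itertools import product

# NNK degeneracy: N = any of A/C/G/T, K = G or T (as stated in the abstract).
N = "ACGT"
K = "GT"
nnk_codons = ["".join(c) for c in product(N, N, K)]
theoretical = len(nnk_codons)  # 4 * 4 * 2 codons

# Observed distinct codons at the five target positions (from the abstract).
observed = [27, 24, 26, 26, 22]
coverage = [100 * n / theoretical for n in observed]

for position, (n, pct) in enumerate(zip(observed, coverage), start=1):
    print(f"codon {position}: {n}/{theoretical} = {pct:.1f}%")
```

Rounded to whole percentages, the computed coverages agree with the values reported in the abstract.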
NASA Technical Reports Server (NTRS)
Rai, Man Mohan (Inventor); Madavan, Nateri K. (Inventor)
2007-01-01
A method and system for data modeling that incorporates the advantages of both traditional response surface methodology (RSM) and neural networks is disclosed. The invention partitions the parameters into a first set of s simple parameters, where observable data are expressible as low order polynomials, and c complex parameters that reflect more complicated variation of the observed data. Variation of the data with the simple parameters is modeled using polynomials; and variation of the data with the complex parameters at each vertex is analyzed using a neural network. Variations with the simple parameters and with the complex parameters are expressed using a first sequence of shape functions and a second sequence of neural network functions. The first and second sequences are multiplicatively combined to form a composite response surface, dependent upon the parameter values, that can be used to identify an accurate mode
Brouilette, Scott; Kuersten, Scott; Mein, Charles; Bozek, Monika; Terry, Anna; Dias, Kerith-Rae; Bhaw-Rosun, Leena; Shintani, Yasunori; Coppen, Steven; Ikebe, Chiho; Sawhney, Vinit; Campbell, Niall; Kaneko, Masahiro; Tano, Nobuko; Ishida, Hidekazu; Suzuki, Ken; Yashiro, Kenta
2012-10-01
Deep sequencing of single cell-derived cDNAs offers novel insights into oncogenesis and embryogenesis. However, traditional library preparation for RNA-seq analysis requires multiple steps with consequent sample loss and stochastic variation at each step significantly affecting output. Thus, a simpler and better protocol is desirable. The recently developed hyperactive Tn5-mediated library preparation, which brings high quality libraries, is likely one of the solutions. Here, we tested the applicability of hyperactive Tn5-mediated library preparation to deep sequencing of single cell cDNA, optimized the protocol, and compared it with the conventional method based on sonication. This new technique does not require any expensive or special equipment, which secures wider availability. A library was constructed from only 100 ng of cDNA, which enables the saving of precious specimens. Only a few steps of robust enzymatic reaction resulted in saved time, enabling more specimens to be prepared at once, and with a more reproducible size distribution among the different specimens. The obtained RNA-seq results were comparable to the conventional method. Thus, this Tn5-mediated preparation is applicable for anyone who aims to carry out deep sequencing for single cell cDNAs. Copyright © 2012 Wiley Periodicals, Inc.
Strods, Arnis; Ose, Velta; Bogans, Janis; Cielens, Indulis; Kalnins, Gints; Radovica, Ilze; Kazaks, Andris; Pumpens, Paul; Renhofa, Regina
2015-06-26
Hepatitis B virus (HBV) core (HBc) virus-like particles (VLPs) are one of the most powerful protein engineering tools utilised to expose immunological epitopes and/or cell-targeting signals and for the packaging of genetic material and immune stimulatory sequences. Although HBc VLPs and their numerous derivatives are produced in highly efficient bacterial and yeast expression systems, the existing purification and packaging protocols are not sufficiently optimised and standardised. Here, a simple alkaline treatment method was employed for the complete removal of internal RNA from bacteria- and yeast-produced HBc VLPs and for the conversion of these VLPs into empty particles, without any damage to the VLP structure. The empty HBc VLPs were able to effectively package the added DNA and RNA sequences. Furthermore, the alkaline hydrolysis technology appeared efficient for the purification and packaging of four different HBc variants carrying lysine residues on the HBc VLP spikes. Utilising the introduced lysine residues and the intrinsic aspartic and glutamic acid residues exposed on the tips of the HBc spikes for chemical coupling of the chosen peptide and/or nucleic acid sequences ensured a standard and easy protocol for the further development of versatile HBc VLP-based vaccine and gene therapy applications.
a Simple Symmetric Algorithm Using a Likeness with Introns Behavior in RNA Sequences
NASA Astrophysics Data System (ADS)
Regoli, Massimo
2009-02-01
The RNA-Crypto System (RCS for short) is a symmetric-key algorithm for enciphering data. The idea for this new algorithm comes from the observation of nature, in particular of RNA behavior and some of its properties. RNA sequences have some sections called introns. Introns, a term derived from "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and role of introns in pre-mRNA are not clear and are the subject of extensive research by biologists; in our case, we use the presence of introns in the RNA-Crypto System output as a strong method to add chaotic non-coding information and an unnecessary behaviour in the access to the secret key used to code the messages. In the RNA-Crypto System algorithm, the introns are sections of the ciphered message that carry non-coding information, just as in the precursor mRNA.
The mathematical relationship between Zipf’s law and the hierarchical scaling law
NASA Astrophysics Data System (ADS)
Chen, Yanguang
2012-06-01
The empirical studies of city-size distribution show that Zipf's law and the hierarchical scaling law are linked in many ways. The rank-size scaling and hierarchical scaling seem to be two different sides of the same coin, but their relationship has never been revealed by strict mathematical proof. In this paper, the Zipf's distribution of cities is abstracted as a q-sequence. Based on this sequence, a self-similar hierarchy consisting of many levels is defined and the numbers of cities in different levels form a geometric sequence. An exponential distribution of the average size of cities is derived from the hierarchy. Thus we have two exponential functions, from which follows a hierarchical scaling equation. The results can be statistically verified by simple mathematical experiments and observational data of cities. A theoretical foundation is then laid for the conversion from Zipf's law to the hierarchical scaling law, and the latter can show more information about city development than the former. Moreover, the self-similar hierarchy provides a new perspective for studying networks of cities as complex systems. A series of mathematical rules applied to cities such as the allometric growth law, the 2^n principle and Pareto's law can be associated with one another by the hierarchical organization.
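A "simple mathematical experiment" of the kind mentioned above might look like the following sketch. It is not the author's procedure: the Zipf exponent of 1, the grouping of ranks into levels whose size doubles each time, and the names P1, LEVELS and level_stats are all assumptions made for illustration.

```python
# Zipf's law with exponent q = 1: the city of rank k has size P1 / k.
P1 = 1_000_000.0   # hypothetical size of the largest city
LEVELS = 10        # hypothetical number of hierarchy levels

def level_stats(m):
    """Return (number of cities, average size) for level m (1-based).

    Level m is assumed to hold ranks 2**(m-1) .. 2**m - 1, so the
    number of cities doubles from one level to the next.
    """
    first, last = 2 ** (m - 1), 2 ** m - 1
    sizes = [P1 / k for k in range(first, last + 1)]
    return len(sizes), sum(sizes) / len(sizes)

stats = [level_stats(m) for m in range(1, LEVELS + 1)]
number_ratios = [stats[i + 1][0] / stats[i][0] for i in range(LEVELS - 1)]
size_ratios = [stats[i][1] / stats[i + 1][1] for i in range(LEVELS - 1)]

for m, (r_f, r_s) in enumerate(zip(number_ratios, size_ratios), start=1):
    print(f"levels {m}->{m + 1}: number ratio {r_f:.3f}, size ratio {r_s:.3f}")
```

Under these assumptions the number of cities grows geometrically by construction, and the ratio of average sizes between adjacent levels settles near 2 at the lower levels, which is the pair of exponential relations the abstract refers to.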
Wang, Hongtao; Li, Guisheng; Kwon, Woo-Saeng; Yang, Deok-Chun
2016-06-04
Panax ginseng is one of the most valuable medicinal plants in the Orient. The low level of genetic variation has limited the application of molecular markers for cultivar authentication and marker-assisted selection in cultivated ginseng. To exploit DNA polymorphism within ginseng cultivars, ginseng expressed sequence tags (ESTs) were searched against the potential intron polymorphism (PIP) database to predict the positions of introns. Intron-flanking primers were then designed in conserved exon regions and used to amplify across the more variable introns. Sequencing results showed that single nucleotide polymorphisms (SNPs), as well as indels, were detected in four EST-derived introns, and SNP markers specific to "Gopoong" and "K-1" were first reported in this study. Based on cultivar-specific SNP sites, allele-specific polymerase chain reaction (PCR) was conducted and proved to be effective for the authentication of ginseng cultivars. Additionally, the combination of a simple NaOH-Tris DNA isolation method and real-time allele-specific PCR assay enabled the high throughput selection of cultivars from ginseng fields. The established real-time allele-specific PCR assay should be applied to molecular authentication and marker assisted selection of P. ginseng cultivars, and the EST intron-targeting strategy will provide a potential approach for marker development in species without whole genomic DNA sequence information.
Amarger, V; Mercier, L
1995-01-01
We have applied the recently developed technique of random amplified polymorphic DNA (RAPD) for the discrimination between two jojoba clones at the genomic level. Among a set of 30 primers tested, a simple reproducible pattern with three distinct fragments for clone D and two distinct fragments for clone E was obtained with primer OPB08. Since RAPD products are the results of arbitrarily priming events and because a given primer can amplify a number of non-homologous sequences, we wondered whether or not RAPD bands, even those of similar size, were derived from different loci in the two clones. To answer this question, two complementary approaches were used: i) cloning and sequencing of the amplification products from clone E; and ii) complementary Southern analysis of RAPD gels using cloned or amplified fragments (directly recovered from agarose gels) as RFLP probes. The data reported here show that the RAPD reaction generates multiple amplified fragments. Some fragments, although resolved as a single band on agarose gels, contain different DNA species of the same size. Furthermore, it appears that the cloned RAPD products of known sequence that do not target repetitive DNA can be used as hybridization probes in RFLP to detect a polymorphism among individuals.
On the Split Personality of Penultimate Proline
Glover, Matthew S.; Shi, Liuqing; Fuller, Daniel R.; Arnold, Randy J.; Radivojac, Predrag; Clemmer, David E.
2014-01-01
The influence of the position of the amino acid proline in polypeptide sequences is examined by a combination of ion mobility spectrometry-mass spectrometry (IMS-MS), amino acid substitutions, and molecular modeling. The results suggest that when proline exists as the second residue from the N-terminus (i.e., penultimate proline), two families of conformers are formed. We demonstrate the existence of these families by a study of a series of truncated and mutated peptides derived from the 11-residue peptide Ser1-Pro2-Glu3-Leu4-Pro5-Ser6-Pro7-Gln8-Ala9-Glu10-Lys11. We find that every peptide from this sequence with a penultimate proline residue has multiple conformations. Substitution of Ala for Pro residues indicates that multiple conformers arise from the cis-trans isomerization of Xaa1–Pro2 peptide bonds as Xaa–Ala peptide bonds are unlikely to adopt the cis isomer, and examination of spectra from a library of 58 peptides indicates that ~80% of sequences show this effect. A simple mechanism suggesting that the barrier between the cis- and trans-proline forms is lowered because of low steric impedance is proposed. This observation may have interesting biological implications as well, and we note that a number of biologically active peptides have penultimate proline residues. PMID:25503299
Strods, Arnis; Ose, Velta; Bogans, Janis; Cielens, Indulis; Kalnins, Gints; Radovica, Ilze; Kazaks, Andris; Pumpens, Paul; Renhofa, Regina
2015-01-01
Hepatitis B virus (HBV) core (HBc) virus-like particles (VLPs) are one of the most powerful protein engineering tools utilised to expose immunological epitopes and/or cell-targeting signals and for the packaging of genetic material and immune stimulatory sequences. Although HBc VLPs and their numerous derivatives are produced in highly efficient bacterial and yeast expression systems, the existing purification and packaging protocols are not sufficiently optimised and standardised. Here, a simple alkaline treatment method was employed for the complete removal of internal RNA from bacteria- and yeast-produced HBc VLPs and for the conversion of these VLPs into empty particles, without any damage to the VLP structure. The empty HBc VLPs were able to effectively package the added DNA and RNA sequences. Furthermore, the alkaline hydrolysis technology appeared efficient for the purification and packaging of four different HBc variants carrying lysine residues on the HBc VLP spikes. Utilising the introduced lysine residues and the intrinsic aspartic and glutamic acid residues exposed on the tips of the HBc spikes for chemical coupling of the chosen peptide and/or nucleic acid sequences ensured a standard and easy protocol for the further development of versatile HBc VLP-based vaccine and gene therapy applications. PMID:26113394
Principles of protein folding--a perspective from simple exact models.
Dill, K. A.; Bromberg, S.; Yue, K.; Fiebig, K. M.; Yee, D. P.; Thomas, P. D.; Chan, H. S.
1995-01-01
General principles of protein structure, stability, and folding kinetics have recently been explored in computer simulations of simple exact lattice models. These models represent protein chains at a rudimentary level, but they involve few parameters, approximations, or implicit biases, and they allow complete explorations of conformational and sequence spaces. Such simulations have resulted in testable predictions that are sometimes unanticipated: The folding code is mainly binary and delocalized throughout the amino acid sequence. The secondary and tertiary structures of a protein are specified mainly by the sequence of polar and nonpolar monomers. More specific interactions may refine the structure, rather than dominate the folding code. Simple exact models can account for the properties that characterize protein folding: two-state cooperativity, secondary and tertiary structures, and multistage folding kinetics--fast hydrophobic collapse followed by slower annealing. These studies suggest the possibility of creating "foldable" chain molecules other than proteins. The encoding of a unique compact chain conformation may not require amino acids; it may require only the ability to synthesize specific monomer sequences in which at least one monomer type is solvent-averse. PMID:7613459
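To make "complete explorations of conformational space" concrete, the sketch below enumerates every self-avoiding walk of a short chain on a two-dimensional square lattice and counts contacts between nonpolar (H) monomers. It is a generic HP-style toy under our own assumptions (2D square lattice, one energy unit per H-H contact, the function names shown), not the specific models or code used by the authors.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def hh_contacts(seq, path):
    """Count H-H pairs that are lattice neighbours but not chain neighbours."""
    index_at = {pos: i for i, pos in enumerate(path)}
    count = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 counts each pair once and skips bonded neighbours.
            if j is not None and j > i + 1 and seq[j] == "H":
                count += 1
    return count

def enumerate_walks(n):
    """Yield every self-avoiding walk of n monomers starting at the origin."""
    def extend(path, occupied):
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                occupied.remove(nxt)
                path.pop()
    yield from extend([(0, 0)], {(0, 0)})

def best_contacts(seq):
    """Return the maximum number of H-H contacts over all conformations."""
    return max(hh_contacts(seq, walk) for walk in enumerate_walks(len(seq)))

if __name__ == "__main__":
    # Hypothetical 10-mer of polar (P) and nonpolar (H) monomers.
    print(best_contacts("HPHPPHHPHH"))
```

Exhaustive enumeration is only feasible for short chains, which is exactly the regime in which such models are described as "exact"; symmetry-related walks are not removed here, to keep the sketch short.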
Srivastava, Deepika; Shanker, Asheesh
2016-12-01
Basal angiosperms, or Magnoliids, are an important clade of commercially important plants, which mainly include spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to the clade Magnoliids were screened for the identification of chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats, or microsatellites, are tandem repeats of short DNA motifs 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, showing an average density of 1 SSR/6.91 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21 %) followed by tetranucleotide repeats (130, 27.13 %). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
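A screen for perfect SSRs of this kind can be sketched with regular-expression backreferences. The abstract does not name the software or thresholds used, so the minimum repeat counts below are assumptions chosen only so that the shortest possible hit matches the minimum lengths quoted above; find_ssrs and is_primitive are hypothetical helper names.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical thresholds,
# chosen so that the shortest hit matches the minimum lengths in the abstract:
# 12 bp for mono- to tetra-, 15 bp for penta-, 18 bp for hexanucleotides).
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}

def is_primitive(motif):
    """True when the motif is not itself a repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )

def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. "AA" or "ATAT" motifs
                hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)

if __name__ == "__main__":
    demo = "GGC" + "A" * 12 + "CGC" + "AT" * 6 + "GCC"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

The sketch reports only perfect repeats and ignores compound or interrupted SSRs, which dedicated tools typically handle.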
NASA Astrophysics Data System (ADS)
Allstadt, K.; Moretti, L.; Mangeney, A.; Stutzmann, E.; Capdeville, Y.
2014-12-01
The time series of forces exerted on the Earth by a large and rapid landslide derived remotely from the inversion of seismic records can be used to tie post-slide evidence to what actually occurred during the event and can be used to tune numerical models and test theoretical methods. This strategy is applied to the 48.5 Mm^3 August 2010 Mount Meager rockslide-debris flow in British Columbia, Canada. By inverting data from just five broadband seismic stations less than 300 km from the source, we reconstruct the time series of forces that the landslide exerted on the Earth as it occurred. The result illuminates a complex retrogressive initiation sequence and features attributable to flow over a complicated path including several curves and runup against a valley wall. The seismically derived force history also allows for the estimation of the horizontal acceleration (0.39 m/s^2) and average apparent coefficient of basal friction (0.38) of the rockslide, and the speed of the center of mass of the debris flow (peak of 92 m/s). To extend beyond these simple calculations and to test the interpretation, we also use the seismically derived force history to guide numerical modeling of the event - seeking to simulate the landslide in a way that best fits both the seismic and field constraints. This allows for a finer reconstruction of the volume, timing, and sequence of events, estimates of friction, and spatiotemporal variations in speed and flow thickness. The modeling allowed us to analyze the sensitivity of the force to the different parameters involved in the landslide modeling to better understand what can and cannot be constrained from seismic source inversions of landslide signals.
P and S wave Coda Calibration in Central Asia and South Korea
NASA Astrophysics Data System (ADS)
Kim, D.; Mayeda, K.; Gok, R.; Barno, J.; Roman-Nieves, J. I.
2017-12-01
Empirically derived coda source spectra provide unbiased, absolute moment magnitude (Mw) estimates for events that are normally too small for accurate long-period waveform modeling. In this study, we obtain coda-derived source spectra using data from Central Asia (Kyrgyzstan networks - KN and KR, and Tajikistan - TJ) and South Korea (Korea Meteorological Administration, KMA). We used a recently developed coda calibration module of Seismic WaveForm Tool (SWFT). Seismic activities during this recording period include the recent Gyeongju earthquake of Mw=5.3 and its aftershocks, two nuclear explosions from 2009 and 2013 in North Korea, and a small number of construction and mining-related explosions. For calibration, we calculated synthetic coda envelopes for both P and S waves based on a simple analytic expression that fits the observed narrowband filtered envelopes using the method outlined in Mayeda et al. (2003). To provide an absolute scale of the resulting source spectra, path and site corrections are applied using independent spectral constraints (e.g., Mw and stress drop) from three Kyrgyzstan events and the largest events of the Gyeongju sequence in Central Asia and South Korea, respectively. In spite of major tectonic differences, stable source spectra were obtained in both regions. We validated the resulting spectra by comparing the ratio of raw envelopes and source spectra from calibrated envelopes. Spectral shapes of earthquakes and explosions show different patterns in both regions. We also find (1) the source spectra derived from S-coda are more robust than those from the P-coda at low frequencies; (2) unlike earthquake events, the source spectra of explosions have a large disagreement between P and S waves; and (3) similarity is observed between the 2016 Gyeongju and 2011 Virginia earthquake sequences, the latter in the eastern U.S.
Defining and predicting structurally conserved regions in protein superfamilies
Huang, Ivan K.; Grishin, Nick V.
2013-01-01
Motivation: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment. Results: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions. Availability: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR. 
Contact: 91huangi@gmail.com or grishin@chop.swmed.edu Supplementary information: Supplementary data are available at Bioinformatics Online PMID:23193223
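For readers unfamiliar with the two quality measures quoted above, the sketch below computes accuracy and the Matthews correlation coefficient from a binary confusion matrix using their standard definitions. The function names and the example counts are hypothetical and are not taken from the paper.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of positions classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a 2x2 confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

if __name__ == "__main__":
    # Hypothetical counts: SCR positions are the positive class.
    tp, tn, fp, fn = 40, 45, 5, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, the coefficient stays near zero for a predictor that simply favours the majority class, which is why the two figures are usually reported together.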
Aircraft stress sequence development: A complex engineering process made simple
NASA Technical Reports Server (NTRS)
Schrader, K. H.; Butts, D. G.; Sparks, W. A.
1994-01-01
Development of stress sequences for critical aircraft structure requires flight measured usage data, known aircraft loads, and established relationships between aircraft flight loads and structural stresses. Resulting cycle-by-cycle stress sequences can be directly usable for crack growth analysis and coupon spectra tests. Often, an expert in loads and spectra development manipulates the usage data into a typical sequence of representative flight conditions for which loads and stresses are calculated. For a fighter/trainer type aircraft, this effort is repeated many times for each of the fatigue critical locations (FCL) resulting in expenditure of numerous engineering hours. The Aircraft Stress Sequence Computer Program (ACSTRSEQ), developed by Southwest Research Institute under contract to San Antonio Air Logistics Center, presents a unique approach for making complex technical computations in a simple, easy to use method. The program is written in Microsoft Visual Basic for the Microsoft Windows environment.
Characterization and analysis of a transcriptome from the boreal spider crab Hyas araneus.
Harms, Lars; Frickenhaus, Stephan; Schiffer, Melanie; Mark, Felix C; Storch, Daniela; Pörtner, Hans-Otto; Held, Christoph; Lucassen, Magnus
2013-12-01
Research investigating the genetic basis of physiological responses has significantly broadened our understanding of the mechanisms underlying organismic response to environmental change. However, genomic data are currently available for few taxa only, thus excluding physiological model species from this approach. In this study we report the transcriptome of the model organism Hyas araneus from Spitsbergen (Arctic). We generated 20,479 transcripts, using the 454 GS FLX sequencing technology in combination with an Illumina HiSeq sequencing approach. Annotation by Blastx revealed 7159 blast hits in the NCBI non-redundant protein database. The comparison between the spider crab H. araneus transcriptome and EST libraries of the American lobster Homarus americanus and the porcelain crab Petrolisthes cinctipes yielded 3229/2581 sequences with a significant hit, respectively. The clustering by the Markov Clustering Algorithm (MCL) revealed a common core of 1710 clusters present in all three species and 5903 unique clusters for H. araneus. The combined sequencing approaches generated transcripts that will greatly expand the limited genomic data available for crustaceans. We introduce the MCL clustering for transcriptome comparisons as a simple approach to estimate similarities between transcriptomic libraries of different size and quality and to analyze homologies within the selected group of species. In particular, we identified a large variety of reverse transcriptase (RT) sequences not only in the H. araneus transcriptome and other decapod crustaceans, but also sea urchin, supporting the hypothesis of a heritable, anti-viral immunity and the proposed viral fragment integration by host-derived RTs in marine invertebrates. © 2013.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martinez, Antonio D.; Berka, Randy; Henrissat, Bernard
2008-05-01
A major thrust of the white biotechnology movement involves the development of enzyme systems which depolymerize biomass to simple sugars which are subsequently converted to sustainable biofuels (e.g., ethanol) and chemical intermediates. The fungus Trichoderma reesei (syn. Hypocrea jecorina) represents a paradigm for the industrial production of highly efficient cellulases and hemicellulases needed for hydrolysis of biomass polysaccharides. Herein we describe intriguing attributes of the T. reesei genome in relation to the future of fuel biotechnology. The T. reesei genome sequence was derived using a whole genome shotgun approach combined with finishing work to generate an assembly comprising 89 scaffolds totaling 34 Mbp with few gaps. In total, 9,130 gene models were predicted using a combination of ab initio and sequence similarity-based methods and EST data. Considering the industrial utility and effectiveness of its enzymes, the T. reesei genome surprisingly encodes the fewest cellulases and hemicellulases of any fungus having the ability to hydrolyze plant cell wall polysaccharides and whose genome has been sequenced. Many genes encoding carbohydrate active enzymes are distributed non-randomly in groups or clusters that interestingly lie between regions of synteny with other Sordariomycetes. Additionally, the T. reesei genome contains a multitude of genes encoding biosynthetic pathways for secondary metabolites (possible antibacterial and antifungal compounds) which may promote successful competition and survival in the crowded and competitive soil habitat occupied by T. reesei. Our analysis coupled with the availability of genome sequence data provides a roadmap for construction of enhanced T. reesei strains for industrial applications.
Wang, Zunde; Engler, Peter; Longacre, Angelika; Storb, Ursula
2001-01-01
Large-scale genomic sequencing projects have provided DNA sequence information for many genes, but the biological functions for most of them will only be known through functional studies. Bacterial artificial chromosomes (BACs) and P1-derived artificial chromosomes (PACs) are large genomic clones stably maintained in bacteria and are very important in functional studies through transfection because of their large size and stability. Because most BAC or PAC vectors do not have a mammalian selection marker, transfecting mammalian cells with genes cloned in BACs or PACs requires the insertion into the BAC/PAC of a mammalian selectable marker. However, currently available procedures are not satisfactory in efficiency and fidelity. We describe a very simple and efficient procedure that allows one to retrofit dozens of BACs in a day with no detectable deletions or unwanted recombination. We use a BAC/PAC retrofitting vector that, on transformation into competent BAC or PAC strains, will catalyze the specific insertion of itself into BAC/PAC vectors through in vivo cre/loxP site-specific recombination. PMID:11156622
Dries, Jan
2016-01-01
On-line control of the biological treatment process is an innovative tool to cope with variable concentrations of chemical oxygen demand and nutrients in industrial wastewater. In the present study we implemented a simple dynamic control strategy for nutrient-removal in a sequencing batch reactor (SBR) treating variable tank truck cleaning wastewater. The control system was based on derived signals from two low-cost and robust sensors that are very common in activated sludge plants, i.e. oxidation reduction potential (ORP) and dissolved oxygen. The amount of wastewater fed during anoxic filling phases, and the number of filling phases in the SBR cycle, were determined by the appearance of the 'nitrate knee' in the profile of the ORP. The phase length of the subsequent aerobic phases was controlled by the oxygen uptake rate measured online in the reactor. As a result, the sludge loading rate (F/M ratio), the volume exchange rate and the SBR cycle length adapted dynamically to the activity of the activated sludge and the actual characteristics of the wastewater, without affecting the final effluent quality.
Hippocampal Replay is Not a Simple Function of Experience
Gupta, Anoopum S.; van der Meer, Matthijs A. A.; Touretzky, David S.; Redish, A. David
2015-01-01
Summary Replay of behavioral sequences in the hippocampus during sharp-wave-ripple-complexes (SWRs) provides a potential mechanism for memory consolidation and the learning of knowledge structures. Current hypotheses imply that replay should straightforwardly reflect recent experience. However, we find these hypotheses to be incompatible with the content of replay on a task with two distinct behavioral sequences (A&B). We observed forward and backward replay of B even when rats had been performing A for >10 minutes. Furthermore, replay of non-local sequence B occurred more often when B was infrequently experienced. Neither forward nor backward sequences preferentially represented highly-experienced trajectories within a session. Additionally, we observed the construction of never-experienced novel-path sequences. These observations challenge the idea that sequence activation during SWRs is a simple replay of recent experience. Instead, replay reflected all physically available trajectories within the environment, suggesting a potential role in active learning and maintenance of the cognitive map. PMID:20223204
Kim, Kwang-Hwan; Hwang, Ji-Hyun; Han, Dong-Yeup; Park, Minkyu; Kim, Seungill; Choi, Doil; Kim, Yongjae; Lee, Gung Pyo; Kim, Sun-Tae; Park, Young-Hoon
2015-01-01
An intraspecific genetic map for watermelon was constructed using an F2 population derived from ‘Arka Manik’ × ‘TS34’ and transcript sequence variants and quantitative trait loci (QTL) for resistance to powdery mildew (PMR), seed size (SS), and fruit shape (FS) were analyzed. The map consists of 14 linkage groups (LGs) defined by 174 cleaved amplified polymorphic sequences (CAPS), 2 derived-cleaved amplified polymorphic sequence markers, 20 sequence-characterized amplified regions, and 8 expressed sequence tag-simple sequence repeat markers spanning 1,404.3 cM, with a mean marker interval of 6.9 cM and an average of 14.6 markers per LG. Genetic inheritance and QTL analyses indicated that each of the PMR, SS, and FS traits is controlled by an incompletely dominant effect of major QTLs designated as pmr2.1, ss2.1, and fsi3.1, respectively. The pmr2.1, detected on chromosome 2 (Chr02), explained 80.0% of the phenotypic variation (LOD = 30.76). This QTL was flanked by two CAPS markers, wsb2-24 (4.00 cM) and wsb2-39 (13.97 cM). The ss2.1, located close to pmr2.1 and CAPS marker wsb2-13 (1.00 cM) on Chr02, explained 92.3% of the phenotypic variation (LOD = 68.78). The fsi3.1, detected on Chr03, explained 79.7% of the phenotypic variation (LOD = 31.37) and was flanked by two CAPS, wsb3-24 (1.91 cM) and wsb3-9 (7.00 cM). Candidate gene-based CAPS markers were developed from the disease resistance and fruit shape gene homologs located on Chr.02 and Chr03 and were mapped on the intraspecific map. Colocalization of these markers with the major QTLs indicated that watermelon orthologs of a nucleotide-binding site-leucine-rich repeat class gene containing an RPW8 domain and a member of SUN containing the IQ67 domain are candidate genes for pmr2.1 and fsi3.1, respectively. The results presented herein provide useful information for marker-assisted breeding and gene cloning for PMR and fruit-related traits. PMID:26700647
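The summary statistics quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the mean marker interval was obtained by dividing the total map length by the total number of markers; the abstract does not state the formula, so that choice is an assumption, and the variable names are illustrative only.

```python
# Marker counts by type, taken from the abstract above.
marker_counts = {"CAPS": 174, "dCAPS": 2, "SCAR": 20, "EST-SSR": 8}

total_markers = sum(marker_counts.values())
linkage_groups = 14
map_length_cm = 1404.3

# Average number of markers per linkage group.
markers_per_lg = total_markers / linkage_groups

# Assumption: mean interval = map length / number of markers.
mean_interval_cm = map_length_cm / total_markers

print(total_markers, round(markers_per_lg, 1), round(mean_interval_cm, 1))
# prints: 204 14.6 6.9
```

Under that assumption the computed values match the reported 14.6 markers per LG and 6.9 cM mean interval.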
NGA-West 2 GMPE average site coefficients for use in earthquake-resistant design
Borcherdt, Roger D.
2015-01-01
Site coefficients corresponding to those in tables 11.4–1 and 11.4–2 of Minimum Design Loads for Buildings and Other Structures published by the American Society of Civil Engineers (Standard ASCE/SEI 7-10) are derived from four of the Next Generation Attenuation West2 (NGA-W2) Ground-Motion Prediction Equations (GMPEs). The resulting coefficients are compared with those derived by other researchers and those derived from the NGA-West1 database. The derivation of the NGA-W2 average site coefficients provides a simple procedure to update site coefficients with each update in the Maximum Considered Earthquake Response MCER maps. The simple procedure yields average site coefficients consistent with those derived for site-specific design purposes. The NGA-W2 GMPEs provide simple scale factors to reduce conservatism in current simplified design procedures.
Analysis of SSR information in EST resources of sugarcane
USDA-ARS's Scientific Manuscript database
Expressed sequence tags (ESTs) offer the opportunity to exploit single, low-copy, conserved sequence motifs for the development of simple sequence repeats (SSRs). A total of 262,113 ESTs of sugarcane (Saccharum officinarum) in the NCBI database were downloaded and analyzed, which resulted in...
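As an illustration of what mining ESTs for SSRs can look like, here is a minimal sketch that scans a sequence for perfect di-, tri- and tetranucleotide repeats with a regular expression. The abstract is truncated and does not name the tool or thresholds used, so the function name `find_ssrs` and the minimum repeat counts below are hypothetical.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    # Hypothetical thresholds: motif length -> minimum number of repeats.
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (AA, AGAG, ...).
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

example = "TTT" + "AG" * 7 + "CCGA" + "GAT" * 5 + "A"
print(find_ssrs(example))
# prints: [(3, 'AG', 7), (21, 'GAT', 5)]
```

Only perfect repeats are reported; compound or interrupted SSRs would need a more permissive pattern.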
Role of Mitochondrial Inheritance on Prostate Cancer Outcome in African American Men. Addendum
2016-11-01
DNA sequencing technique developed by our collaborator using single amplicon long-range PCR that permits deep coverage (10,000-20,000X on average) of...the mitochondrial genome. We have sequenced 652 samples derived from frozen fully using this technology. The additional DNA samples derived from...paraffin embedded (FFPE) tissue were more challenging, but have now been sequenced. Mapping of DNA variants in our sequenced genomes to mitochondrial
Statistical physics of nucleosome positioning and chromatin structure
NASA Astrophysics Data System (ADS)
Morozov, Alexandre
2012-02-01
Genomic DNA is packaged into chromatin in eukaryotic cells. The fundamental building block of chromatin is the nucleosome, a 147 bp-long DNA molecule wrapped around the surface of a histone octamer. Arrays of nucleosomes are positioned along DNA according to their sequence preferences and folded into higher-order chromatin fibers whose structure is poorly understood. We have developed a framework for predicting sequence-specific histone-DNA interactions and the effective two-body potential responsible for ordering nucleosomes into regular higher-order structures. Our approach is based on the analogy between nucleosomal arrays and a one-dimensional fluid of finite-size particles with nearest-neighbor interactions. We derive simple rules which allow us to predict nucleosome occupancy solely from the dinucleotide content of the underlying DNA sequences. Dinucleotide content determines the degree of stiffness of the DNA polymer and thus defines its ability to bend into the nucleosomal superhelix. As expected, the nucleosome positioning rules are universal for chromatin assembled in vitro on genomic DNA from baker's yeast and from the nematode worm C. elegans, where nucleosome placement follows intrinsic sequence preferences and steric exclusion. However, the positioning rules inferred from in vivo C. elegans chromatin are affected by global nucleosome depletion from chromosome arms relative to central domains, likely caused by the attachment of the chromosome arms to the nuclear membrane. Furthermore, intrinsic nucleosome positioning rules are overwritten in transcribed regions, indicating that chromatin organization is actively managed by the transcriptional and splicing machinery.
Allen, Alexandra M; Barker, Gary L A; Berry, Simon T; Coghill, Jane A; Gwilliam, Rhian; Kirby, Susan; Robinson, Phil; Brenchley, Rachel C; D'Amore, Rosalinda; McKenzie, Neil; Waite, Darren; Hall, Anthony; Bevan, Michael; Hall, Neil; Edwards, Keith J
2011-12-01
Food security is a global concern and substantial yield increases in cereal crops are required to feed the growing world population. Wheat is one of the three most important crops for human and livestock feed. However, the complexity of the genome coupled with a decline in genetic diversity within modern elite cultivars has hindered the application of marker-assisted selection (MAS) in breeding programmes. A crucial step in the successful application of MAS in breeding programmes is the development of cheap and easy to use molecular markers, such as single-nucleotide polymorphisms. To mine selected elite wheat germplasm for intervarietal single-nucleotide polymorphisms, we have used expressed sequence tags derived from public sequencing programmes and next-generation sequencing of normalized wheat complementary DNA libraries, in combination with a novel sequence alignment and assembly approach. Here, we describe the development and validation of a panel of 1114 single-nucleotide polymorphisms in hexaploid bread wheat using competitive allele-specific polymerase chain reaction genotyping technology. We report the genotyping results of these markers on 23 wheat varieties, selected to represent a broad cross-section of wheat germplasm including a number of elite UK varieties. Finally, we show that, using relatively simple technology, it is possible to rapidly generate a linkage map containing several hundred single-nucleotide polymorphism markers in the doubled haploid mapping population of Avalon × Cadenza. © 2011 The Authors. Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
A comprehensive evaluation of assembly scaffolding tools
2014-01-01
Background Genome assembly is typically a two-stage process: contig assembly followed by the use of paired sequencing reads to join contigs into scaffolds. Scaffolds are usually the focus of reported assembly statistics; longer scaffolds greatly facilitate the use of genome sequences in downstream analyses, and it is appealing to present larger numbers as metrics of assembly performance. However, scaffolds are highly prone to errors, especially when generated using short reads, which can directly result in inflated assembly statistics. Results Here we provide the first independent evaluation of scaffolding tools for second-generation sequencing data. We find large variations in the quality of results depending on the tool and dataset used. Even extremely simple test cases of perfect input, constructed to elucidate the behaviour of each algorithm, produced some surprising results. We further dissect the performance of the scaffolders using real and simulated sequencing data derived from the genomes of Staphylococcus aureus, Rhodobacter sphaeroides, Plasmodium falciparum and Homo sapiens. The results from simulated data are of high quality, with several of the tools producing perfect output. However, at least 10% of joins remain unidentified when using real data. Conclusions The scaffolders vary in their usability, speed and number of correct and missed joins made between contigs. Results from real data highlight opportunities for further improvements of the tools. Overall, SGA, SOPRA and SSPACE generally outperform the other tools on our datasets. However, the quality of the results is highly dependent on the read mapper and genome complexity. PMID:24581555
Diversity of Secondary Structure in Catalytic Peptides with β-Turn-Biased Sequences
2016-01-01
X-ray crystallography has been applied to the structural analysis of a series of tetrapeptides that were previously assessed for catalytic activity in an atroposelective bromination reaction. Common to the series is a central Pro-Xaa sequence, where Pro is either l- or d-proline, which was chosen to favor nucleation of canonical β-turn secondary structures. Crystallographic analysis of 35 different peptide sequences revealed a range of conformational states. The observed differences appear not only in cases where the Pro-Xaa loop-region is altered, but also when seemingly subtle alterations to the flanking residues are introduced. In many instances, distinct conformers of the same sequence were observed, either as symmetry-independent molecules within the same unit cell or as polymorphs. Computational studies using DFT provided additional insight into the analysis of solid-state structural features. Select X-ray crystal structures were compared to the corresponding solution structures derived from measured proton chemical shifts, ³J-values, and ¹H–¹H NOESY contacts. These findings imply that the conformational space available to simple peptide-based catalysts is more diverse than precedent might suggest. The direct observation of multiple ground state conformations for peptides of this family, as well as the dynamic processes associated with conformational equilibria, underscore not only the challenge of designing peptide-based catalysts, but also the difficulty in predicting their accessible transition states. These findings implicate the advantages of low-barrier interconversions between conformations of peptide-based catalysts for multistep, enantioselective reactions. PMID:28029251
The transcriptome of Spodoptera exigua larvae exposed to different types of microbes.
Pascual, Laura; Jakubowska, Agata K; Blanca, Jose M; Cañizares, Joaquin; Ferré, Juan; Gloeckner, Gernot; Vogel, Heiko; Herrero, Salvador
2012-08-01
We have obtained and characterized the transcriptome of Spodoptera exigua larvae with special emphasis on pathogen-induced genes. In order to obtain a highly representative transcriptome, we have pooled RNA from diverse insect colonies, conditions and tissues. Sequenced cDNA included samples from 3 geographically different colonies. Enrichment of RNA from pathogen-related genes was accomplished by exposing larvae to different pathogenic and non-pathogenic microbial agents such as the bacteria Bacillus thuringiensis, Micrococcus luteus, and Escherichia coli, the yeast Saccharomyces cerevisiae, and the S. exigua nucleopolyhedrovirus (SeMNPV). In addition, to avoid the loss of tissue-specific genes we included cDNA from the midgut, fat body, hemocytes and integument derived from pathogen exposed insects. RNA obtained from the different types of samples was pooled, normalized and sequenced. Analysis of the sequences obtained using the Roche 454 FLX and Sanger methods has allowed the generation of the largest public set of ESTs from S. exigua, including a large group of immune genes, and the identification of an important number of SSR (simple sequence repeats) and SNVs (single nucleotide variants: SNPs and INDELs) with potential use as genetic markers. Moreover, data mining has allowed the discovery of novel RNA viruses with potential influence in the insect population dynamics and the larval interactions with the microbial pesticides that are currently in use for the biological control of this pest. Copyright © 2012 Elsevier Ltd. All rights reserved.
Di, Yanming; Schafer, Daniel W.; Wilhelm, Larry J.; Fox, Samuel E.; Sullivan, Christopher M.; Curzon, Aron D.; Carrington, James C.; Mockler, Todd C.; Chang, Jeff H.
2011-01-01
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts. PMID:21998647
Query-seeded iterative sequence similarity searching improves selectivity 5–20-fold
Li, Weizhong; Lopez, Rodrigo
2017-01-01
Abstract Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic to a protein family. But models are subject to contamination; once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity. PMID:27923999
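The query-seeding step described above can be sketched on a single pairwise alignment: wherever the aligned subject row has a gap, the residue from the original query is copied into that position before the row is added to the MSA that builds the PSSM. This is only an illustration of the idea, not the authors' implementation; the function name `seed_with_query` and the toy sequences are hypothetical.

```python
def seed_with_query(aligned_query, aligned_subject, gap="-"):
    """Fill gaps in the aligned subject row with the query residue
    from the same alignment column."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned rows must have equal length")
    return "".join(
        q if s == gap else s
        for q, s in zip(aligned_query, aligned_subject)
    )

query_row = "MKTAYIAK"
subject_row = "MK--YLAK"   # subject has a two-residue deletion
print(seed_with_query(query_row, subject_row))
# prints: MKTAYLAK
```

Columns where the subject carries a residue are left untouched, so the seeded row differs from the subject only at the gapped positions, which is what anchors the resulting model to the query.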
Wang, Gui-Xiang; Lv, Jing; Zhang, Jie; Han, Shuo; Zong, Mei; Guo, Ning; Zeng, Xing-Ying; Zhang, Yue-Yun; Wang, You-Ping; Liu, Fan
2016-01-01
Broad phenotypic variations were obtained previously in derivatives from the asymmetric somatic hybridization of cauliflower "Korso" (Brassica oleracea var. botrytis, 2n = 18, CC genome) and black mustard "G1/1" (Brassica nigra, 2n = 16, BB genome). However, the mechanisms underlying these variations were unknown. In this study, 28 putative introgression lines (ILs) were pre-selected according to a series of morphological (leaf shape and color, plant height and branching, curd features, and flower traits) and physiological (black rot/club root resistance) characters. Multi-color fluorescence in situ hybridization revealed that these plants contained 18 chromosomes derived from "Korso." Molecular marker (65 simple sequence repeats and 77 amplified fragment length polymorphisms) analysis identified the presence of "G1/1" DNA segments (average 7.5%). Additionally, DNA profiling revealed many genetic and epigenetic differences among the ILs, including sequence alterations, deletions, and variation in patterns of cytosine methylation. The frequency of lost fragments (5.1%) was higher than that of novel bands (1.4%), and fragments specific to Brassica carinata (BBCC 2n = 34) were common (average 15.5%). Methylation-sensitive amplified polymorphism analysis indicated that methylation changes were common and that hypermethylation (12.4%) was more frequent than hypomethylation (4.8%). Our results suggested that asymmetric somatic hybridization and alien DNA introgression induced genetic and epigenetic alterations. Thus, these ILs represent an important, novel germplasm resource for cauliflower improvement that can be mined for diverse traits of interest to breeders and researchers.
Amino Acid Properties Conserved in Molecular Evolution
Rudnicki, Witold R.; Mroczek, Teresa; Cudek, Paweł
2014-01-01
That amino acid properties are responsible for the way protein molecules evolve is natural and is also reasonably well supported both by the structure of the genetic code and, to a large extent, by the experimental measures of the amino acid similarity. Nevertheless, there remains a significant gap between observed similarity matrices and their reconstructions from amino acid properties. Therefore, we introduce a simple theoretical model of amino acid similarity matrices, which allows splitting the matrix into two parts – one that depends only on mutabilities of amino acids and another that depends on pairwise similarities between them. Then the new synthetic amino acid properties are derived from the pairwise similarities and used to reconstruct similarity matrices covering a wide range of information entropies. Our model allows us to explain up to 94% of the variability in the BLOSUM family of the amino acids similarity matrices in terms of amino acid properties. The new properties derived from amino acid similarity matrices correlate highly with properties known to be important for molecular evolution such as hydrophobicity, size, shape and charge of amino acids. This result closes the gap in our understanding of the influence of amino acids on evolution at the molecular level. The method was applied to a single family of similarity matrices often used in general sequence homology searches, but it is general and can also be used for more specific matrices. The new synthetic properties can be used in analyses of protein sequences in various biological applications. PMID:24967708
Modeling the dynamics of choice.
Baum, William M; Davison, Michael
2009-06-01
A simple linear-operator model both describes and predicts the dynamics of choice that may underlie the matching relation. We measured inter-food choice within components of a schedule that presented seven different pairs of concurrent variable-interval schedules for 12 food deliveries each with no signals indicating which pair was in force. This measure of local choice was accurately described and predicted as obtained reinforcer sequences shifted it to favor one alternative or the other. The effect of a changeover delay was reflected in one parameter, the asymptote, whereas the effect of a difference in overall rate of food delivery was reflected in the other parameter, rate of approach to the asymptote. The model takes choice as a primary dependent variable, not derived by comparison between alternatives, an approach that agrees with the molar view of behaviour.
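A generic linear-operator update with the two parameters named in the abstract (an asymptote and a rate of approach to it) can be sketched as follows. The exact form of the authors' model is not given here, so the update rule, the symmetric target `1 - asymptote` for the other alternative, and all names and default values are assumptions made for illustration.

```python
def update_choice(p, left_reinforced, asymptote=0.9, rate=0.2):
    """One linear-operator step: move choice proportion p a fixed
    fraction (rate) of the way toward the current target."""
    # Assumption: a reinforcer on the other alternative pulls p toward
    # the mirror-image asymptote.
    target = asymptote if left_reinforced else 1.0 - asymptote
    return p + rate * (target - p)

def run(reinforcer_sequence, p0=0.5, **params):
    """Return the trajectory of p across a sequence of reinforcers."""
    trajectory = [p0]
    for left_reinforced in reinforcer_sequence:
        trajectory.append(update_choice(trajectory[-1], left_reinforced, **params))
    return trajectory

# Three reinforcers on the left alternative, then two on the right.
for p in run([True, True, True, False, False]):
    print(round(p, 3))
```

In this form the asymptote sets where choice levels off after a long run of reinforcers from one alternative, while the rate sets how quickly each reinforcer moves choice toward it.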
Electrophilic activation and cycloisomerization of enynes: a new route to functional cyclopropanes.
Bruneau, Christian
2005-04-15
Transformations of enynes in the presence of transition-metal catalysts have played an important role in the preparation of a variety of cyclic compounds. Recent developments in the activation of triple carbon-carbon bonds by electrophilic metal centers have provided a new entry to the selective synthesis of cyclopropane derivatives from enynes. The mechanisms of these reactions involve catalytic species with both ionic and cyclopropylcarbenoid character. This type of activation will undoubtedly be further developed for application to other unsaturated hydrocarbons and inspire new catalytic cascade reaction sequences. This Minireview discusses the recent developments in electrophilic activation of enynes and shows that simple catalysts such as [Ru3(CO)12], PtCl2, and cationic gold complexes are efficient precursors to promote the formation of functional polycyclic compounds.
Metal-free trifluoromethylation of aromatic and heteroaromatic aldehydes and ketones.
Qiao, Yupu; Si, Tuda; Yang, Ming-Hsiu; Altman, Ryan A
2014-08-01
The ability to convert simple and common substrates into fluoroalkyl derivatives under mild conditions remains an important goal for medicinal and agricultural chemists. One representative example of a desirable transformation involves the conversion of aromatic and heteroaromatic ketones and aldehydes into aryl and heteroaryl β,β,β-trifluoroethylarenes and -heteroarenes. The traditional approach for this net transformation involves stoichiometric metals and/or multistep reaction sequences that consume excessive time, material, and labor resources while providing low yields of products. To complement these traditional strategies, we report a one-pot metal-free decarboxylative procedure for accessing β,β,β-trifluoroethylarenes and -heteroarenes from readily available ketones and aldehydes. This method features several benefits, including ease of operation, readily available reagents, mild reaction conditions, high functional-group compatibility, and scalability.
Metal-Free Trifluoromethylation of Aromatic and Heteroaromatic Aldehydes and Ketones
2015-01-01
The ability to convert simple and common substrates into fluoroalkyl derivatives under mild conditions remains an important goal for medicinal and agricultural chemists. One representative example of a desirable transformation involves the conversion of aromatic and heteroaromatic ketones and aldehydes into aryl and heteroaryl β,β,β-trifluoroethylarenes and -heteroarenes. The traditional approach for this net transformation involves stoichiometric metals and/or multistep reaction sequences that consume excessive time, material, and labor resources while providing low yields of products. To complement these traditional strategies, we report a one-pot metal-free decarboxylative procedure for accessing β,β,β-trifluoroethylarenes and -heteroarenes from readily available ketones and aldehydes. This method features several benefits, including ease of operation, readily available reagents, mild reaction conditions, high functional-group compatibility, and scalability. PMID:25001876
How Does Sequence Structure Affect the Judgment of Time? Exploring a Weighted Sum of Segments Model
ERIC Educational Resources Information Center
Matthews, William J.
2013-01-01
This paper examines the judgment of segmented temporal intervals, using short tone sequences as a convenient test case. In four experiments, we investigate how the relative lengths, arrangement, and pitches of the tones in a sequence affect judgments of sequence duration, and ask whether the data can be described by a simple weighted sum of…
Discrete sequence prediction and its applications
NASA Technical Reports Server (NTRS)
Laird, Philip
1992-01-01
Learning from experience to predict sequences of discrete symbols is a fundamental problem in machine learning with many applications. We apply sequence prediction using a simple and practical sequence-prediction algorithm, called TDAG. The TDAG algorithm is first tested by comparing its performance with some common data compression algorithms. Then it is adapted to the detailed requirements of dynamic program optimization, with excellent results.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes
A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt
2000-01-01
Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
ERIC Educational Resources Information Center
Cepic, Mojca
2008-01-01
Light beams in wavy unclear water, also called underwater rays, and caustic networks of light formed at the bottom of shallow water are two faces of a single phenomenon. Derivation of the caustic using only simple geometry, Snell's law and simple derivatives accounts for observations such as the existence of the caustic network on vertical walls,…
Simple Derivation of the Maxwell Stress Tensor and Electrostrictive Effects in Crystals
ERIC Educational Resources Information Center
Juretschke, H. J.
1977-01-01
Shows that local equilibrium and energy considerations in an elastic dielectric crystal lead to a simple derivation of the Maxwell stress tensor in anisotropic dielectric solids. The resulting equilibrium stress-strain relations are applied to determine the deformations of a charged parallel plate capacitor. (MLH)
Bannwarth, Markus B; Utech, Stefanie; Ebert, Sandro; Weitz, David A; Crespy, Daniel; Landfester, Katharina
2015-03-24
The assembly of nanoparticles into polymer-like architectures is challenging and usually requires highly defined colloidal building blocks. Here, we show that the broad size-distribution of a simple dispersion of magnetic nanocolloids can be exploited to obtain various polymer-like architectures. The particles are assembled under an external magnetic field and permanently linked by thermal sintering. The remarkable variety of polymer-analogue architectures that arises from this simple process ranges from statistical and block copolymer-like sequencing to branched chains and networks. This library of architectures can be realized by controlling the sequencing of the particles and the junction points via a size-dependent self-assembly of the single building blocks.
Deciphering mRNA Sequence Determinants of Protein Production Rate
NASA Astrophysics Data System (ADS)
Szavits-Nossan, Juraj; Ciandrini, Luca; Romano, M. Carmen
2018-03-01
One of the greatest challenges in biophysical models of translation is to identify coding sequence features that affect the rate of translation and therefore the overall protein production in the cell. We propose an analytic method to solve a translation model based on the inhomogeneous totally asymmetric simple exclusion process, which allows us to unveil simple design principles of nucleotide sequences determining protein production rates. Our solution shows an excellent agreement when compared to numerical genome-wide simulations of S. cerevisiae transcript sequences and predicts that the first 10 codons, which is the ribosome footprint length on the mRNA, together with the value of the initiation rate, are the main determinants of protein production rate under physiological conditions. Finally, we interpret the obtained analytic results based on the evolutionary role of the codons' choice for regulating translation rates and ribosome densities.
Microcomputer-Assisted Mathematics: From Simple Interest to e.
ERIC Educational Resources Information Center
Kimberling, Clark
1985-01-01
The progression from simple interest to compound interest leads naturally and quickly to the number e, involving mathematical discovery learning through writing programs. Several programs are given, with suggestions for a teaching sequence. (MNS)
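The progression from simple interest to e can be reproduced with a very short program. The sketch below is written in Python rather than the microcomputer language of the original article, and the function names are illustrative: it compounds a principal of 1 at a 100% annual rate over ever more periods per year, so the year-end balance (1 + 1/n)^n approaches e.

```python
import math

def simple_interest(principal, rate, years=1):
    """Balance after simple interest: P(1 + rt)."""
    return principal * (1 + rate * years)

def compound_interest(principal, rate, periods_per_year, years=1):
    """Balance after compounding n times per year: P(1 + r/n)^(nt)."""
    return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)

print("simple:", simple_interest(1.0, 1.0))
for n in (1, 2, 12, 365, 1_000_000):
    print(n, compound_interest(1.0, 1.0, n))
print("e:", math.e)
```

With one compounding period the result equals simple interest; as the number of periods grows, the balance increases toward, but never exceeds, e.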
A 48 SNP set for grapevine cultivar identification
2011-01-01
Background Rapid and consistent genotyping is an important requirement for cultivar identification in many crop species. Among them grapevine cultivars have been the subject of multiple studies given the large number of synonyms and homonyms generated during many centuries of vegetative multiplication and exchange. Simple sequence repeat (SSR) markers have been preferred until now because of their high level of polymorphism, their codominant nature and their high profile repeatability. However, the rapid application of partial or complete genome sequencing approaches is identifying thousands of single nucleotide polymorphisms (SNP) that can be very useful for such purposes. Although SNP markers are bi-allelic, and therefore not as polymorphic as microsatellites, the high number of loci that can be multiplexed and the possibilities of automation as well as their highly repeatable results under any analytical procedure make them the future markers of choice for any type of genetic identification. Results We analyzed over 300 SNP in the genome of grapevine using a re-sequencing strategy in a selection of 11 genotypes. Among the identified polymorphisms, we selected 48 SNP spread across all grapevine chromosomes with allele frequencies balanced enough as to provide sufficient information content for genetic identification in grapevine allowing for good genotyping success rate. Marker stability was tested in repeated analyses of a selected group of cultivars obtained worldwide to demonstrate their usefulness in genetic identification. Conclusions We have selected a set of 48 stable SNP markers with a high discrimination power and a uniform genome distribution (2-3 markers/chromosome), which is proposed as a standard set for grapevine (Vitis vinifera L.) genotyping. Any previous problems derived from microsatellite allele confusion between labs or the need to run reference cultivars to identify allele sizes disappear using this type of marker. 
Furthermore, because SNP markers are bi-allelic, allele identification and genotype naming are extremely simple, and genotypes obtained with different equipment and by different laboratories are always fully comparable. PMID:22060012
Nakayama, Manabu; Oda, Hirotsugu; Nakagawa, Kenji; Yasumi, Takahiro; Kawai, Tomoki; Izawa, Kazushi; Nishikomori, Ryuta; Heike, Toshio; Ohara, Osamu
2017-03-01
Autoinflammatory diseases are a group of primary immunodeficiency diseases that are generally thought to be caused by mutation of genes responsible for innate immunity, rather than by acquired immunity. Mutations related to autoinflammatory diseases occur in 12 genes. For example, low-level somatic mosaic NLRP3 mutations underlie chronic infantile neurologic, cutaneous, articular syndrome (CINCA), also known as neonatal-onset multisystem inflammatory disease (NOMID). In current clinical practice, clinical genetic testing plays an important role in providing patients with quick, definite diagnoses. To increase the availability of such testing, low-cost high-throughput gene-analysis systems are required, ones that not only have the sensitivity to detect even low-level somatic mosaic mutations, but also can operate simply in a clinical setting. To this end, we developed a simple method that employs two-step tailed PCR and an NGS system, the MiSeq platform, to detect mutations in all coding exons of the 12 genes responsible for autoinflammatory diseases. Using this amplicon sequencing system, we amplified a total of 234 amplicons derived from the 12 genes with multiplex PCR. This was done simultaneously and in one test tube. Each sample was distinguished by an index sequence of second PCR primers following PCR amplification. With our procedure and tips for reducing PCR amplification bias, we were able to analyze 12 genes from 25 clinical samples in one MiSeq run. Moreover, with the certified primers designed by our short program, which detects and avoids common SNPs in gene-specific PCR primers, we used this system for routine genetic testing. Our optimized procedure uses a simple protocol, which can easily be followed by virtually any office medical staff. Because of the small PCR amplification bias, we can simultaneously analyze several clinical DNA samples at low cost and obtain sufficient read numbers to detect low-level somatic mosaic mutations.
Jiménez-Moreno, Ester; Montalvillo-Jiménez, Laura; Santana, Andrés G; Gómez, Ana M; Jiménez-Osés, Gonzalo; Corzana, Francisco; Bastida, Agatha; Jiménez-Barbero, Jesús; Cañada, Francisco Javier; Gómez-Pinto, Irene; González, Carlos; Asensio, Juan Luis
2016-05-25
Development of strong and selective binders from promiscuous lead compounds represents one of the most expensive and time-consuming tasks in drug discovery. We herein present a novel fragment-based combinatorial strategy for the optimization of multivalent polyamine scaffolds as DNA/RNA ligands. Our protocol provides a quick access to a large variety of regioisomer libraries that can be tested for selective recognition by combining microdialysis assays with simple isotope labeling and NMR experiments. To illustrate our approach, 20 small libraries comprising 100 novel kanamycin-B derivatives have been prepared and evaluated for selective binding to the ribosomal decoding A-Site sequence. Contrary to the common view of NMR as a low-throughput technique, we demonstrate that our NMR methodology represents a valuable alternative for the detection and quantification of complex mixtures, even integrated by highly similar or structurally related derivatives, a common situation in the context of a lead optimization process. Furthermore, this study provides valuable clues about the structural requirements for selective A-site recognition.
Engineering Rhodosporidium toruloides for increased lipid production.
Zhang, Shuyan; Skerker, Jeffrey M; Rutter, Charles D; Maurer, Matthew J; Arkin, Adam P; Rao, Christopher V
2016-05-01
Oleaginous yeast are promising organisms for the production of lipid-based chemicals and fuels from simple sugars. In this work, we explored Rhodosporidium toruloides for the production of lipid-based products. This oleaginous yeast natively produces lipids at high titers and can grow on glucose and xylose. As a first step, we sequenced the genomes of two strains, IFO0880, and IFO0559, and generated draft assemblies and annotations. We then used this information to engineer two R. toruloides strains for increased lipid production by over-expressing the native acetyl-CoA carboxylase and diacylglycerol acyltransferase genes using Agrobacterium tumefaciens mediated transformation. Our best strain, derived from IFO0880, was able to produce 16.4 ± 1.1 g/L lipid from 70 g/L glucose and 9.5 ± 1.3 g/L lipid from 70 g/L xylose in shake-flask experiments. This work represents one of the first examples of metabolic engineering in R. toruloides and establishes this yeast as a new platform for production of fatty-acid derived products. © 2015 Wiley Periodicals, Inc.
Improved alternating gradient transport and focusing of neutral molecules
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalnins, Juris; Lambertson, Glen; Gould, Harvey
2001-12-02
Polar molecules, in strong-field seeking states, can be transported and focused by an alternating sequence of electric field gradients that focus in one transverse direction while defocusing in the other. We show by calculation and numerical simulation, how one may greatly improve the alternating gradient transport and focusing of molecules. We use a new optimized multipole lens design, a FODO lattice beam transport line, and lenses to match the beam transport line to the beam source and the final focus. We derive analytic expressions for the potentials, fields, and gradients that may be used to design these lenses. We describe a simple lens optimization procedure and derive the equations of motion for tracking molecules through a beam transport line. As an example, we model a straight beamline that transports a 560 m/s jet-source beam of methyl fluoride molecules 15 m from its source and focuses it to 2 mm diameter. We calculate the beam transport line acceptance and transmission, for a beam with velocity spread, and estimate the transmitted intensity for specified source conditions. Possible applications are discussed.
Cosmic Star Formation: A Simple Model of the SFRD(z)
NASA Astrophysics Data System (ADS)
Chiosi, Cesare; Sciarratta, Mauro; D’Onofrio, Mauro; Chiosi, Emanuela; Brotto, Francesca; De Michele, Rosaria; Politino, Valeria
2017-12-01
We investigate the evolution of the cosmic star formation rate density (SFRD) from redshift z = 20 to z = 0 and compare it with the observational one by Madau and Dickinson derived from recent compilations of ultraviolet (UV) and infrared (IR) data. The theoretical SFRD(z) and its evolution are obtained using a simple model that folds together the star formation histories of prototype galaxies that are designed to represent real objects of different morphological type along the Hubble sequence and the hierarchical growing of structures under the action of gravity from small perturbations to large-scale objects in Λ-CDM cosmogony, i.e., the number density of dark matter halos N(M,z). Although the overall model is very simple and easy to set up, it provides results that mimic results obtained from highly complex large-scale N-body simulations well. The simplicity of our approach allows us to test different assumptions for the star formation law in galaxies, the effects of energy feedback from stars to interstellar gas, the efficiency of galactic winds, and also the effect of N(M,z). The result of our analysis is that in the framework of the hierarchical assembly of galaxies, the so-called time-delayed star formation under plain assumptions mainly for the energy feedback and galactic winds can reproduce the observational SFRD(z).
Typing of artiodactyl MHC-DRB genes with the help of intronic simple repeated DNA sequences.
Schwaiger, F W; Buitkamp, J; Weyers, E; Epplen, J T
1993-02-01
An efficient oligonucleotide typing method for the highly polymorphic MHC-DRB genes is described for artiodactyls like cattle, sheep and goat. By means of the polymerase chain reaction, the second exon of MHC-DRB is amplified as well as part of the adjacent intron containing a mixed simple repeat sequence. Using this primer combination we were able to amplify the MHC-DRB exons 2 and adjacent introns from all of the investigated 10 species of the family of Bovidae and giraffes. Therefore, the DRB genes of novel artiodactyl species can also be readily studied. Oligonucleotide probes specific for the polymorphisms of ungulate DRB genes are used with which sequences differing in at least one single base can be distinguished. Exonic polymorphism was found to be correlated with the allele lengths and the patterns of the repeat structures. Hence oligonucleotide probes specific for different simple repeats and polymorphic positions serve also for typing across species barriers. The strict correlation of sequence length and exonic polymorphism permits a preselection of specific oligonucleotides for hybridization. Thus more than 20 alleles can already be differentiated from each of the three species.
USDA-ARS's Scientific Manuscript database
To discover resistance (R) and/or pathogen-induced (PR) genes involved in disease response, 12 bacterial artificial chromosome (BAC) clones from cv. Acala Maxxa (G. hirsutum) were sequenced at the Clemson University, Genomics Institute, Clemson, SC. These BACs derived MUSB single sequence repeat (SS...
Geometric Representations of Condition Queries on Three-Dimensional Vector Fields
NASA Technical Reports Server (NTRS)
Henze, Chris
1999-01-01
Condition queries on distributed data ask where particular conditions are satisfied. It is possible to represent condition queries as geometric objects by plotting field data in various spaces derived from the data, and by selecting loci within these derived spaces which signify the desired conditions. Rather simple geometric partitions of derived spaces can represent complex condition queries because much complexity can be encapsulated in the derived space mapping itself. A geometric view of condition queries provides a useful conceptual unification, allowing one to intuitively understand many existing vector field feature detection algorithms -- and to design new ones -- as variations on a common theme. A geometric representation of condition queries also provides a simple and coherent basis for computer implementation, reducing a wide variety of existing and potential vector field feature detection techniques to a few simple geometric operations.
Ochirkhuu, Nyamsuren; Konnai, Satoru; Odbileg, Raadan; Murata, Shiro; Ohashi, Kazuhiko
2017-08-01
Anaplasma species are obligate intracellular rickettsial pathogens that cause great economic loss to the animal industry. Few studies on Anaplasma infections in Mongolian livestock have been conducted. This study examined the prevalence of Anaplasma marginale, Anaplasma ovis, Anaplasma phagocytophilum, and Anaplasma bovis by polymerase chain reaction assay in 928 blood samples collected from native cattle and dairy cattle (Bos taurus), yaks (Bos grunniens), sheep (Ovis aries), and goats (Capra aegagrus hircus) in four provinces of Ulaanbaatar city in Mongolia. We genetically characterized positive samples through sequencing analysis based on the heat-shock protein groEL, major surface protein 4 (msp4), and 16S rRNA genes. Only A. ovis was detected in Mongolian livestock (cattle, yaks, sheep, and goats), with 413 animals (44.5%) positive for groEL and 308 animals (33.2%) positive for msp4 genes. In the phylogenetic tree, we separated A. ovis sequences into two distinct clusters based on the groEL gene. One cluster comprised sequences derived mainly from sheep and goats, which was similar to that in A. ovis isolates from other countries. The other divergent cluster comprised sequences derived from cattle and yaks and appeared to be newly branched from that in previously published single isolates in Mongolian cattle. In addition, the msp4 gene of A. ovis using same and different samples with groEL gene of the pathogen demonstrated that all sequences derived from all animal species, except for three sequences derived from cattle and yak, were clustered together, and were identical or similar to those in isolates from other countries. We used 16S rRNA gene sequences to investigate the genetically divergent A. ovis and identified high homology of 99.3-100%. However, the sequences derived from cattle did not match those derived from sheep and goats. The results of this study on the prevalence and molecular characterization of A. ovis in Mongolian livestock can facilitate the control of infectious diseases in livestock.
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species
Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha
2011-01-01
Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in the Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that over expressed tri-nucleotide SSRs in genomic and coding regions might be responsible in the generation of functional variation of proteins expressed which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
Singh, A K; Rai, V P; Chand, R; Singh, R P; Singh, M N
2013-01-01
Genetic diversity and identification of simple sequence repeat markers correlated with Fusarium wilt resistance was performed in a set of 36 elite cultivated pigeonpea genotypes differing in levels of resistance to Fusarium wilt. Twenty-four polymorphic sequence repeat markers were screened across these genotypes, and amplified a total of 59 alleles with an average high polymorphic information content value of 0.52. Cluster analysis, done by UPGMA and PCA, grouped the 36 pigeonpea genotypes into two main clusters according to their Fusarium wilt reaction. Based on the Kruskal-Wallis ANOVA and simple regression analysis, six simple sequence repeat markers were found to be significantly associated with Fusarium wilt resistance. The phenotypic variation explained by these markers ranged from 23.7 to 56.4%. The present study helps in finding out feasibility of prescreened SSR markers to be used in genetic diversity analysis and their potential association with disease resistance.
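The abstract above reports an average polymorphic information content (PIC) of 0.52 but does not say which PIC formula was applied. As a minimal sketch, assuming the widely used Botstein-style definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, the calculation for one marker could look like this. The function name `pic` and the example allele frequencies are illustrative assumptions, not values from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content of one marker.

    Assumes the Botstein-style formula; the study's exact formula is not
    stated in the abstract, so treat this as an illustration only.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical markers: two equally frequent alleles, then three.
    print(pic([0.5, 0.5]))
    print(pic([0.5, 0.3, 0.2]))
```

Under this definition a marker with two equally frequent alleles gives 0.375, so an average of 0.52 implies that many markers carried more than two alleles at appreciable frequency.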
Genetic diversity in Trypanosoma theileri from Sri Lankan cattle and water buffaloes.
Yokoyama, Naoaki; Sivakumar, Thillaiampalam; Fukushi, Shintaro; Tattiyapong, Muncharee; Tuvshintulga, Bumduuren; Kothalawala, Hemal; Silva, Seekkuge Susil Priyantha; Igarashi, Ikuo; Inoue, Noboru
2015-01-30
Trypanosoma theileri is a hemoprotozoan parasite that infects various ruminant species. We investigated the epidemiology of this parasite among cattle and water buffalo populations bred in Sri Lanka, using a diagnostic PCR assay based on the cathepsin L-like protein (CATL) gene. Blood DNA samples sourced from cattle (n=316) and water buffaloes (n=320) bred in different geographical areas of Sri Lanka were PCR screened for T. theileri. Parasite DNA was detected in cattle and water buffaloes alike in all the sampling locations. The overall T. theileri-positive rate was higher in water buffaloes (15.9%) than in cattle (7.6%). Subsequently, PCR amplicons were sequenced and the partial CATL sequences were phylogenetically analyzed. The identity values for the CATL gene were 89.6-99.7% among the cattle-derived sequences, compared with values of 90.7-100% for the buffalo-derived sequences. However, the cattle-derived sequences shared 88.2-100% identity values with those from buffaloes. In the phylogenetic tree, the Sri Lankan CATL gene sequences fell into two major clades (TthI and TthII), both of which contain CATL sequences from several other countries. Although most of the CATL sequences from Sri Lankan cattle and buffaloes clustered independently, two buffalo-derived sequences were observed to be closely related to those of the Sri Lankan cattle. Furthermore, a Sri Lankan buffalo sequence clustered with CATL gene sequences from Brazilian buffalo and Thai cattle. In addition to reporting the first PCR-based survey of T. theileri among Sri Lankan-bred cattle and water buffaloes, the present study found that some of the CATL gene fragments sourced from water buffaloes shared similarity with those determined from cattle in this country. Copyright © 2014 Elsevier B.V. All rights reserved.
Nagasaka, Kei; Mizuno, Koji; Thomson, Robert
2018-03-26
For occupant protection, it is important to understand how a car's deceleration time history in crashes can be designed through efficient energy absorption by the car body's structure. In a previous paper, the authors proposed an energy derivative method to determine each structural component's contribution to the longitudinal deceleration of a car passenger compartment in crashes. In this study, this method was extended to 2 dimensions in order to analyze various crash test conditions. The contribution of each structure estimated from the energy derivative method was compared to that from a conventional finite element (FE) analysis method using cross-sectional forces. A 2-dimensional energy derivative method was established. A simple FE model with a structural column connected to a rigid body was used to confirm the validity of this method and to compare with the result of cross-sectional forces determined using conventional analysis. Applying this method to a full-width frontal impact simulation of a car FE model, the contribution and the cross-sectional forces of the front rails were compared. In addition, this method was applied to a pedestrian headform FE simulation in order to determine the influence of the structural and inertia forces of the hood structures on the deceleration of the headform undergoing planar motion. In an oblique impact of the simple column and rigid body model, the sum of the contributions of each part agrees with the rigid body deceleration, which indicates the validity of the 2-dimensional energy derivative method. Using the energy derivative method, it was observed that each part of the column contributes to the deceleration of the rigid body by collapsing in the sequence from front to rear, whereas the cross-sectional force at the rear of the column cannot detect the continuous collapse. In the full-width impact of a car, the contributions of the front rails estimated by the energy derivative method were smaller than those obtained from the cross-sectional forces at the rear end of the front rails, due to the deformation of the passenger compartment. For a pedestrian headform impact, the inertial and structural forces of the hood contributed to peaks of the headform deceleration in the initial and latter phases, respectively. Using the 2-dimensional energy derivative method, it is possible to analyze an oblique impact or a pedestrian headform impact with large rotations. This method has advantages compared to the conventional approach using cross-sectional forces because the contribution of each component to system deceleration can be determined.
Wada, Takuya; Oku, Koichiro; Nagano, Soichiro; Isobe, Sachiko; Suzuki, Hideyuki; Mori, Miyuki; Takata, Kinuko; Hirata, Chiharu; Shimomura, Katsumi; Tsubone, Masao; Katayama, Takao; Hirashima, Keita; Uchimura, Yosuke; Ikegami, Hidetoshi; Sueyoshi, Takayuki; Obu, Ko-ichi; Hayashida, Tatsuya; Shibato, Yasushi
2017-01-01
A strawberry Multi-parent Advanced Generation Intercrosses (MAGIC) population, derived from crosses using six strawberry cultivars was successfully developed. The population was composed of 338 individuals; genome conformation was evaluated by expressed sequence tag-derived simple short repeat (EST-SSR) markers. Cluster analysis and principal component analysis (PCA) based on EST-SSR marker polymorphisms revealed that the MAGIC population was a mosaic of the six founder cultivars and covered the genomic regions of the six founders evenly. Fruit quality related traits, including days to flowering (DTF), fruit weight (FW), fruit firmness (FF), fruit color (FC), soluble solid content (SC), and titratable acidity (TA), of the MAGIC population were evaluated over two years. All traits showed normal transgressive segregation beyond the founder cultivars and most traits, except for DTF, distributed normally. FC exhibited the highest correlation coefficient overall and was distributed normally regardless of differences in DTF, FW, FF, SC, and TA. These facts were supported by PCA using fruit quality related values as explanatory variables, suggesting that major genetic factors, which are not influenced by fluctuations in other fruit traits, could control the distribution of FC. This MAGIC population is a promising resource for genome-wide association studies and genomic selection for efficient strawberry breeding. PMID:29085247
GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts
Naito, Yuki; Bono, Hidemasa
2012-01-01
GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users. PMID:22641850
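The matching behaviour described above (short nucleotide queries, up to three mismatches, degenerate codes such as N, R, Y, S) can be illustrated with a brute-force scan. This is only a sketch of the matching rule, not of how GGRNA indexes or searches its database; the function name `find_matches` and the example sequences are assumptions made for illustration.

```python
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_matches(query, target, max_mismatches=3):
    """Return (start, mismatches) for every window of target matching query.

    The query may contain degenerate codes; the target is assumed to hold
    plain A/C/G/T. A position counts as a mismatch when the target base is
    not among the bases allowed by the query code.
    """
    query, target = query.upper(), target.upper()
    hits = []
    for start in range(len(target) - len(query) + 1):
        mismatches = 0
        for q, t in zip(query, target[start:start + len(query)]):
            if t not in IUPAC[q]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # Loop finished without exceeding the mismatch budget.
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    print(find_matches("ACGT", "TTACGTTT", max_mismatches=0))  # [(2, 0)]
    print(find_matches("AYGT", "AAGT", max_mismatches=1))      # [(0, 1)]
```

A scan like this costs time proportional to query length times target length, which is why a service answering within seconds over a transcript database would need a precomputed index rather than this loop.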
GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts.
Naito, Yuki; Bono, Hidemasa
2012-07-01
GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users.
Comparison of simple sequence repeats in 19 Archaea.
Trivedi, S
2006-12-05
All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
Simple sequence repeat marker loci discovery using SSR primer.
Robinson, Andrew J; Love, Christopher G; Batley, Jacqueline; Barker, Gary; Edwards, David
2004-06-12
Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. With the increase in the availability of DNA sequence information, an automated process to identify and design PCR primers for amplification of SSR loci would be a useful tool in plant breeding programs. We report an application that integrates SPUTNIK, an SSR repeat finder, with Primer3, a PCR primer design program, into one pipeline tool, SSR Primer. On submission of multiple FASTA formatted sequences, the script screens each sequence for SSRs using SPUTNIK. The results are parsed to Primer3 for locus-specific primer design. The script makes use of a Web-based interface, enabling remote use. This program has been written in PERL and is freely available for non-commercial users by request from the authors. The Web-based version may be accessed at http://hornbill.cspp.latrobe.edu.au/
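The first stage of the pipeline described above, screening sequences for SSRs, can be sketched with a regular expression that finds perfect tandem repeats. This is an illustrative stand-in only: it does not reproduce SPUTNIK's search, and the Primer3 design step is omitted. The function name `find_ssrs` and the thresholds (units of 2 to 5 nt, at least 4 copies) are assumptions.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=4):
    """Find perfect tandem repeats; returns (start, unit, copies) tuples."""
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One unit captured, then the same unit at least min_repeats - 1 more times.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are repeats of a shorter motif (e.g. "ATAT", "AAA"),
            # so each locus is reported once under its shortest unit.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    print(find_ssrs("GGG" + "AT" * 6 + "CCC"))  # [(3, 'AT', 6)]
    print(find_ssrs("TT" + "CAG" * 5 + "TT"))   # [(2, 'CAG', 5)]
```

In a pipeline of the kind the abstract describes, the start position and repeat length of each hit would be handed to a primer design program so that primers flank the repeat.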
Development of Genomic Simple Sequence Repeats (SSR) by Enrichment Libraries in Date Palm.
Al-Faifi, Sulieman A; Migdadi, Hussein M; Algamdi, Salem S; Khan, Mohammad Altaf; Al-Obeed, Rashid S; Ammar, Megahed H; Jakse, Jerenj
2017-01-01
Development of highly informative markers such as simple sequence repeats (SSR) for cultivar identification and germplasm characterization and management is essential for date palm genetic studies. The present study documents the development of SSR markers and assesses genetic relationships of commonly grown date palm (Phoenix dactylifera L.) cultivars in different geographical regions of Saudi Arabia. A total of 93 novel simple sequence repeat (SSR) markers were screened for their ability to detect polymorphism in date palm. Around 71% of genomic SSRs are dinucleotide, 25% trinucleotide, 3% tetranucleotide, and 1% pentanucleotide motifs and show 100% polymorphism. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis illustrates that cultivars tend to group according to their class of maturity, region of cultivation, and fruit color. Analysis of molecular variations (AMOVA) reveals genetic variation among and within cultivars of 27% and 73%, respectively, according to the geographical distribution of the cultivars. Developed microsatellite markers are of additional value to date palm characterization, tools which can be used by researchers in population genetics, cultivar identification, as well as genetic resource exploration and management. The cultivars tested exhibited a significant amount of genetic diversity and could be suitable for successful breeding programs. Genomic sequences generated from this study are available at the National Center for Biotechnology Information (NCBI), Sequence Read Archive (accession number LIBGSS_039019).
Cuartas, Viviana; Insuasty, Braulio; Cobo, Justo; Glidewell, Christopher
2017-10-01
The reaction of 5-chloro-3-methyl-1-phenyl-1H-pyrazole-4-carbaldehyde and N-benzylmethylamine under microwave irradiation gives 5-[benzyl(methyl)amino]-3-methyl-1-phenyl-1H-pyrazole-4-carbaldehyde, C 19 H 19 N 3 O, (I). Subsequent reactions under basic conditions, between (I) and a range of acetophenones, yield the corresponding chalcones. These undergo cyclocondensation reactions with hydrazine to produce reduced bipyrazoles which can be N-formylated with formic acid or N-acetylated with acetic anhydride. The structures of (I) and of representative examples from this reaction sequence are reported, namely the chalcone (E)-3-{5-[benzyl(methyl)amino]-3-methyl-1-phenyl-1H-pyrazol-4-yl}-1-(4-bromophenyl)prop-2-en-1-one, C 27 H 24 BrN 3 O, (II), the N-formyl derivative (3RS)-5'-[benzyl(methyl)amino]-3'-methyl-1',5-diphenyl-3,4-dihydro-1'H,2H-[3,4'-bipyrazole]-2-carbaldehyde, C 28 H 27 N 5 O, (III), and the N-acetyl derivative (3RS)-2-acetyl-5'-[benzyl(methyl)amino]-5-(4-methoxyphenyl)-3'-methyl-1'-phenyl-3,4-dihydro-1'H,2H-[3,4'-bipyrazole], which crystallizes as the ethanol 0.945-solvate, C 30 H 31 N 5 O 2 ·0.945C 2 H 6 O, (IV). There is significant delocalization of charge from the benzyl(methyl)amino substituent onto the carbonyl group in (I), but not in (II). In each of (III) and (IV), the reduced pyrazole ring is modestly puckered into an envelope conformation. The molecules of (I) are linked by a combination of C-H...N and C-H...π(arene) hydrogen bonds to form a simple chain of rings; those of (III) are linked by a combination of C-H...O and C-H...N hydrogen bonds to form sheets of R 2 2 (8) and R 6 6 (42) rings, and those of (IV) are linked by a combination of O-H...N and C-H...O hydrogen bonds to form a ribbon of edge-fused R 2 4 (16) and R 4 4 (24) rings.
Wong, Wing-Cheong; Ng, Hong-Kiat; Tantoso, Erwin; Soong, Richie; Eisenhaber, Frank
2018-02-12
Though earlier works on modelling transcript abundance from vertebrates to lower eukaroytes have specifically singled out the Zip's law, the observed distributions often deviate from a single power-law slope. In hindsight, while power-laws of critical phenomena are derived asymptotically under the conditions of infinite observations, real world observations are finite where the finite-size effects will set in to force a power-law distribution into an exponential decay and consequently, manifests as a curvature (i.e., varying exponent values) in a log-log plot. If transcript abundance is truly power-law distributed, the varying exponent signifies changing mathematical moments (e.g., mean, variance) and creates heteroskedasticity which compromises statistical rigor in analysis. The impact of this deviation from the asymptotic power-law on sequencing count data has never truly been examined and quantified. The anecdotal description of transcript abundance being almost Zipf's law-like distributed can be conceptualized as the imperfect mathematical rendition of the Pareto power-law distribution when subjected to the finite-size effects in the real world; This is regardless of the advancement in sequencing technology since sampling is finite in practice. Our conceptualization agrees well with our empirical analysis of two modern day NGS (Next-generation sequencing) datasets: an in-house generated dilution miRNA study of two gastric cancer cell lines (NUGC3 and AGS) and a publicly available spike-in miRNA data; Firstly, the finite-size effects causes the deviations of sequencing count data from Zipf's law and issues of reproducibility in sequencing experiments. Secondly, it manifests as heteroskedasticity among experimental replicates to bring about statistical woes. 
Surprisingly, a straightforward power-law correction that restores the distorted distribution to a single exponent value can dramatically reduce data heteroskedasticity, giving an immediate increase in signal-to-noise ratio of 50% and in statistical/detection sensitivity of as much as 30%, regardless of the downstream mapping and normalization methods. Most importantly, the power-law correction improves concordance in significant calls among different normalization methods of a data series by 22% on average. When presented with a higher sequencing depth (a 4-fold difference), the improvement in concordance is asymmetrical (32% for the higher sequencing depth instance versus 13% for the lower instance) and demonstrates that the simple power-law correction can increase significant detection with higher sequencing depths. Finally, the correction dramatically enhances the statistical conclusions and eludes the metastasis potential of the NUGC3 cell line against AGS of our dilution analysis. The finite-size effects due to undersampling generally plague transcript count data with reproducibility issues but can be minimized through a simple power-law correction of the count distribution. This distribution correction has direct implications for the biological interpretation of the study and the rigor of the scientific findings. This article was reviewed by Oliviero Carugo, Thomas Dandekar and Sandor Pongor.
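The curvature described in this abstract can be illustrated with a small sketch. It is not the authors' correction procedure, which the abstract does not specify; it only fits a log-log slope to rank-abundance data. The function name `fit_exponent`, the synthetic counts, and the exponential cutoff scale of 20 are all assumptions made for illustration.

```python
import math

def fit_exponent(ranks, counts):
    """Least-squares slope of log(count) against log(rank), sign-flipped
    so that an ideal Zipf distribution (count ~ 1/rank) returns 1."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

ranks = list(range(1, 51))
# Synthetic data (assumed): an ideal power law and one with an exponential cutoff.
ideal = [1000.0 / r for r in ranks]
truncated = [1000.0 / r * math.exp(-r / 20.0) for r in ranks]

ideal_exponent = fit_exponent(ranks, ideal)
head_exponent = fit_exponent(ranks[:25], truncated[:25])
tail_exponent = fit_exponent(ranks[25:], truncated[25:])

print("ideal:", round(ideal_exponent, 3))
print("truncated, head vs tail:", round(head_exponent, 3), round(tail_exponent, 3))
```

With the cutoff applied, the fitted exponent differs between the head and the tail of the ranking, which is the "varying exponent" the abstract refers to; the ideal series gives a single value.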
A behavior analytic analogue of learning to use synonyms, syntax, and parts of speech.
Chase, Philip N; Ellenwood, David W; Madden, Gregory
2008-01-01
Matching-to-sample and sequence training procedures were used to develop responding to stimulus classes that were considered analogous to 3 aspects of verbal behavior: identifying synonyms and parts of speech, and using syntax. Matching-to-sample procedures were used to train 12 paired associates from among 24 stimuli. These pairs were analogous to synonyms. Then, sequence characteristics were trained to 6 of the stimuli. The result was the formation of 3 classes of 4 stimuli, with the classes controlling a sequence response analogous to a simple ordering syntax: first, second, and third. Matching-to-sample procedures were then used to add 4 stimuli to each class. These stimuli, without explicit sequence training, also began to control the same sequence responding as the other members of their class. Thus, three 8-member functionally equivalent sequence classes were formed. These classes were considered to be analogous to parts of speech. Further testing revealed three 8-member equivalence classes and 512 different sequences of first, second, and third. The study indicated that behavior analytic procedures may be used to produce some generative aspects of verbal behavior related to simple syntax and semantics.
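The figure of 512 sequences follows from choosing one of 8 stimuli for each of the three ordinal positions (8 × 8 × 8). A minimal sketch of that count, with hypothetical stimulus labels (the abstract does not name the stimuli):

```python
from itertools import product

# Hypothetical labels: three classes ("first", "second", "third"), 8 stimuli each.
classes = {
    "first": [f"A{i}" for i in range(1, 9)],
    "second": [f"B{i}" for i in range(1, 9)],
    "third": [f"C{i}" for i in range(1, 9)],
}

# Every ordered triple takes one stimulus from each class, in class order.
sequences = list(product(classes["first"], classes["second"], classes["third"]))
print(len(sequences))  # 8 * 8 * 8 = 512
```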
Morgan, Andrew P.; Didion, John P.; Doran, Anthony G.; Holt, James M.; McMillan, Leonard; Keane, Thomas M.; de Villena, Fernando Pardo-Manuel
2016-01-01
Wild-derived mouse inbred strains are becoming increasingly popular for complex traits analysis, evolutionary studies, and systems genetics. Here, we report the whole-genome sequencing of two wild-derived mouse inbred strains, LEWES/EiJ and ZALENDE/EiJ, of Mus musculus domesticus origin. These two inbred strains were selected based on their geographic origin, karyotype, and use in ongoing research. We generated 14× and 18× coverage sequence, respectively, and discovered over 1.1 million novel variants, most of which are private to one of these strains. This report expands the number of wild-derived inbred genomes in the Mus genus from six to eight. The sequence variation can be accessed via an online query tool; variant calls (VCF format) and alignments (BAM format) are available for download from a dedicated ftp site. Finally, the sequencing data have also been stored in a lossless, compressed, and indexed format using the multi-string Burrows-Wheeler transform. All data can be used without restriction. PMID:27765810
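For readers unfamiliar with the Burrows-Wheeler transform mentioned above, the following is a naive single-string sketch showing why the representation is lossless. It is not the multi-string, indexed implementation used for the sequencing data; the function names and the `$` sentinel are assumptions, and the quadratic-time construction is for illustration only.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    assert sentinel not in text, "sentinel must not occur in the input"
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def inverse_bwt(last_column, sentinel="$"):
    """Rebuild the original text by repeatedly prepending and re-sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]

read = "GATTACA"
transformed = bwt(read)
print(transformed, inverse_bwt(transformed))
```

The round trip recovers the input exactly, which is the sense in which the storage format is lossless; compressibility comes from the runs of identical characters the transform tends to produce.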
NASA Astrophysics Data System (ADS)
Shekhar, Karthik; Ruberman, Claire F.; Ferguson, Andrew L.; Barton, John P.; Kardar, Mehran; Chakraborty, Arup K.
2013-12-01
Mutational escape from vaccine-induced immune responses has thwarted the development of a successful vaccine against AIDS, whose causative agent is HIV, a highly mutable virus. Knowing the virus' fitness as a function of its proteomic sequence can enable rational design of potent vaccines, as this information can focus vaccine-induced immune responses to target mutational vulnerabilities of the virus. Spin models have been proposed as a means to infer intrinsic fitness landscapes of HIV proteins from patient-derived viral protein sequences. These sequences are the product of nonequilibrium viral evolution driven by patient-specific immune responses and are subject to phylogenetic constraints. How can such sequence data allow inference of intrinsic fitness landscapes? We combined computer simulations and variational theory à la Feynman to show that, in most circumstances, spin models inferred from patient-derived viral sequences reflect the correct rank order of the fitness of mutant viral strains. Our findings are relevant for diverse viruses.
Henry, Kevin A
2018-01-01
Immunogenetic analyses of expressed antibody repertoires are becoming increasingly common experimental investigations and are critical to furthering our understanding of autoimmunity, infectious disease, and cancer. Next-generation DNA sequencing (NGS) technologies have now made it possible to interrogate antibody repertoires to unprecedented depths, typically by sequencing of cDNAs encoding immunoglobulin variable domains. In this chapter, we describe simple, fast, and reliable methods for producing and sequencing multiplex PCR amplicons derived from the variable regions (VH, VHH or VL) of rearranged immunoglobulin heavy and light chain genes using the Illumina MiSeq platform. We include complete protocols and primer sets for amplicon sequencing of VH/VHH/VL repertoires directly from human, mouse, and llama lymphocytes as well as from phage-displayed VH/VHH/VL libraries; these can easily be adapted to other types of amplicons with little modification. The resulting amplicons are diverse and representative, even using as few as 10^3 input B cells, and their generation is relatively inexpensive, requiring no special equipment and only a limited set of primers. In the absence of heavy-light chain pairing, single-domain antibodies are uniquely amenable to NGS analyses. We present a number of applications of NGS technology useful in discovery of single-domain antibodies from phage display libraries, including: (i) assessment of library functionality; (ii) confirmation of desired library randomization; (iii) estimation of library diversity; and (iv) monitoring the progress of panning experiments. While the case studies presented here are of phage-displayed single-domain antibody libraries, the principles extend to other types of in vitro display libraries.
Hu, Zhuang; Zhang, Tian; Gao, Xiao-Xiao; Wang, Yang; Zhang, Qiang; Zhou, Hui-Juan; Zhao, Gui-Fang; Wang, Ma-Li; Woeste, Keith E; Zhao, Peng
2016-04-01
Manchurian walnut (Juglans mandshurica Maxim.) is a vulnerable, temperate deciduous tree valued for its wood and nut, but transcriptomic and genomic data for the species are very limited. Next generation sequencing (NGS) has made it possible to develop molecular markers for this species rapidly and efficiently. Our goal is to use transcriptome information from RNA-Seq to understand development in J. mandshurica and develop polymorphic simple sequence repeats (SSRs, microsatellites) to understand the species' population genetics. In this study, more than 47.7 million clean reads were generated using Illumina sequencing technology. De novo assembly yielded 99,869 unigenes with an average length of 747 bp. Based on sequence similarity search with known proteins, a total of 39,708 (42.32 %) genes were identified. Searching against the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG) identified 15,903 (16.9 %) unigenes. Further, we identified and characterized 63 new transcriptome-derived microsatellite markers. By testing the markers on 4 to 14 individuals from four populations, we found that 20 were polymorphic and easily amplified. The number of alleles per locus ranged from 2 to 8. The observed and expected heterozygosity per locus ranged from 0.209 to 0.813 and 0.335 to 0.842, respectively. These twenty microsatellite markers will be useful for studies of population genetics, diversity, and genetic structure, and they will undoubtedly benefit future breeding studies of this walnut species. Moreover, the information uncovered in this research will also serve as a useful genetic resource for understanding the transcriptome and development of J. mandshurica and other Juglans species.
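The per-locus statistics quoted above (number of alleles, observed and expected heterozygosity) can be computed as in the sketch below. It assumes diploid genotypes and the basic estimator He = 1 − Σ p_i²; the abstract does not say which estimator or software the authors used, and the allele sizes in the example are invented.

```python
from collections import Counter

def locus_summary(genotypes):
    """Return (observed heterozygosity, expected heterozygosity, allele count)
    for one locus, given diploid genotypes as (allele, allele) pairs."""
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(allele_counts.values())
    # Expected heterozygosity under random mating: 1 minus the sum of
    # squared allele frequencies (no small-sample correction applied).
    expected = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    return observed, expected, len(allele_counts)

# Invented fragment sizes (bp) for four individuals at one SSR locus.
example = [(150, 150), (150, 154), (154, 158), (150, 158)]
ho, he, n_alleles = locus_summary(example)
print(ho, he, n_alleles)
```

In this toy locus three of four individuals are heterozygous, and the allele frequencies are 0.5, 0.25 and 0.25.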
Molecular analysis of the glucocerebrosidase gene locus
DOE Office of Scientific and Technical Information (OSTI.GOV)
Winfield, S.L.; Martin, B.M.; Fandino, A.
1994-09-01
Gaucher disease is due to a deficiency in the activity of the lysosomal enzyme glucocerebrosidase. Both the functional gene for this enzyme and a pseudogene are located in close proximity on chromosome 1q21. Analysis of the mutations present in patient samples has suggested interaction between the functional gene and the pseudogene in the origin of mutant genotypes. To investigate the involvement of regions flanking the functional gene and pseudogene in the origin of mutations found in Gaucher disease, a YAC clone containing DNA from this locus has been subcloned and characterized. The original YAC containing approximately 360 kb was truncated with the use of fragmentation plasmids to about 85 kb. A lambda library derived from this YAC was screened to obtain clones containing glucocerebrosidase sequences. PCR amplification was used to identify subclones containing 5′, central, or 3′ sequences of the functional gene or of the pseudogene. Clones spanning the entire distance from the last exon of the functional gene to intron 1 of the pseudogene, the 5′ end of the functional gene and 16 kb of 5′ flanking region and approximately 15 kb of 3′ flanking region of the pseudogene were sequenced. Sequence data from 48 kb of intergenic and flanking regions of the glucocerebrosidase gene and its pseudogene has been generated. A large number of Alu sequences and several simple repeats have been found. Two of these repeats exhibit fragment length polymorphism. There is almost 100% homology between the 3′ flanking regions of the functional gene and the pseudogene, extending to about 4 kb past the termination codons. A much lower degree of homology is observed in the 5′ flanking region. Patient samples are currently being screened for polymorphisms in these flanking regions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Chengyuan; De Grijs, Richard; Deng, Licai, E-mail: joshuali@pku.edu.cn, E-mail: grijs@pku.edu.cn
2014-04-01
Using a combination of high-resolution Hubble Space Telescope/Wide-Field and Planetary Camera-2 observations, we explore the physical properties of the stellar populations in two intermediate-age star clusters, NGC 1831 and NGC 1868, in the Large Magellanic Cloud based on their color-magnitude diagrams. We show that both clusters exhibit extended main-sequence turn-offs. To explain the observations, we consider variations in helium abundance, binarity, age dispersions, and the fast rotation of the clusters' member stars. The observed narrow main sequence excludes significant variations in helium abundance in both clusters. We first establish the clusters' main-sequence binary fractions using the bulk of the clusters' main-sequence stellar populations ≳ 1 mag below their turn-offs. The extent of the turn-off regions in color-magnitude space, corrected for the effects of binarity, implies that age spreads of order 300 Myr may be inferred for both clusters if the stellar distributions in color-magnitude space were entirely due to the presence of multiple populations characterized by an age range. Invoking rapid rotation of the population of cluster members characterized by a single age also allows us to match the observed data in detail. However, when taking into account the extent of the red clump in color-magnitude space, we encounter an apparent conflict for NGC 1831 between the age dispersion derived from the extent of the main-sequence turn-off and that implied by the compact red clump. We therefore conclude that, for this cluster, variations in stellar rotation rate are preferred over an age dispersion. For NGC 1868, both models perform equally well.
Leichty, Aaron R; Brisson, Dustin
2014-10-01
Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost-effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA, which are ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes as well as a natural host-microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes with <6.7-fold amplification of the background. SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2-9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs. Copyright © 2014 by the Genetics Society of America.
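The core idea of SWGA primer choice, motifs that are frequent in the target genome and rare in the background, can be sketched as a k-mer frequency ratio. This is an illustration under assumptions, not the published selection procedure: the function names, the add-one pseudocount, and the toy sequences are invented, and a real design would also have to consider both strands, primer melting temperature, and how evenly binding sites are spaced.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def rank_motifs(target, background, k, top=3):
    """Rank k-mers by (frequency in target) / (frequency in background)."""
    target_counts = kmer_counts(target, k)
    background_counts = kmer_counts(background, k)
    target_windows = max(len(target) - k + 1, 1)
    background_windows = max(len(background) - k + 1, 1)
    scored = []
    for motif, count in target_counts.items():
        target_freq = count / target_windows
        # Add-one pseudocount so motifs absent from the background do not divide by zero.
        background_freq = (background_counts.get(motif, 0) + 1) / (background_windows + 1)
        scored.append((target_freq / background_freq, motif))
    scored.sort(reverse=True)
    return [motif for _, motif in scored[:top]]

# Toy sequences (invented): an AT-rich "target" and a GC-rich "background".
toy_target = "AATTAATTAATTGGCC"
toy_background = "GGCCGGCCGGCCAATT"
best = rank_motifs(toy_target, toy_background, k=4)
print(best)
```

In the toy data the highest-ranked 4-mers are those repeated in the target and absent from the background, while `GGCC`, abundant in the background, ranks last.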
Pyne, Robert; Honig, Josh; Vaiciunas, Jennifer; Koroch, Adolfina; Wyenandt, Christian; Bonos, Stacy; Simon, James
2017-01-01
Limited understanding of sweet basil (Ocimum basilicum L.) genetics and genome structure has reduced efficiency of breeding strategies. This is evidenced by the rapid, worldwide dissemination of basil downy mildew (Peronospora belbahrii) in the absence of resistant cultivars. In an effort to improve available genetic resources, expressed sequence tag simple sequence repeat (EST-SSR) and single nucleotide polymorphism (SNP) markers were developed and used to genotype the MRI x SB22 F2 mapping population, which segregates for response to downy mildew. SNP markers were generated from genomic sequences derived from double digestion restriction site associated DNA sequencing (ddRADseq). Disomic segregation was observed in both SNP and EST-SSR markers providing evidence of an O. basilicum allotetraploid genome structure and allowing for subsequent analysis of the mapping population as a diploid intercross. A dense linkage map was constructed using 42 EST-SSR and 1,847 SNP markers spanning 3,030.9 cM. Multiple quantitative trait loci (QTL) model (MQM) analysis identified three QTL that explained 37–55% of phenotypic variance associated with downy mildew response across three environments. A single major QTL, dm11.1 explained 21–28% of phenotypic variance and demonstrated dominant gene action. Two minor QTL dm9.1 and dm14.1 explained 5–16% and 4–18% of phenotypic variance, respectively. Evidence is provided for an additive effect between the two minor QTL and the major QTL dm11.1 increasing downy mildew susceptibility. Results indicate that ddRADseq-facilitated SNP and SSR marker genotyping is an effective approach for mapping the sweet basil genome. PMID:28922359
Reinprecht, Yarmilla; Yadegari, Zeinab; Perry, Gregory E.; Siddiqua, Mahbuba; Wright, Lori C.; McClean, Phillip E.; Pauls, K. Peter
2013-01-01
Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. 
Indels identified in the comparison of Andean and Mesoamerican common bean 4CL gene sequences might be used to develop inter-pool phenylpropanoid pathway gene-based markers. We anticipate that the information obtained by this study will simplify and accelerate selections of common bean with specific phenylpropanoid pathway alleles to increase the contents of beneficial phenylpropanoids in common bean and other legumes. PMID:24046770
Subcellular localization of transiently expressed fluorescent fusion proteins.
Collings, David A
2013-01-01
The recent and massive expansion in plant genomics data has generated a large number of gene sequences for which two seemingly simple questions need to be answered: where do the proteins encoded by these genes localize in cells, and what do they do? One widespread approach to answering the localization question has been to use particle bombardment to transiently express unknown proteins tagged with green fluorescent protein (GFP) or its numerous derivatives. Confocal fluorescence microscopy is then used to monitor the localization of the fluorescent protein as it hitches a ride through the cell. The subcellular localization of the fusion protein, if not immediately apparent, can then be determined by comparison to localizations generated by fluorescent protein fusions to known signalling sequences and proteins, or by direct comparison with fluorescent dyes. This review aims to be a tour guide for researchers wanting to travel this hitch-hiker's path, and for reviewers and readers who wish to understand their travel reports. It will describe some of the technology available for visualizing protein localizations, and some of the experimental approaches for optimizing and confirming localizations generated by particle bombardment in onion epidermal cells, the most commonly used experimental system. As the non-conservation of signal sequences in heterologous expression systems such as onion, and consequent mis-targeting of fusion proteins, is always a potential problem, the epidermal cells of the Argenteum mutant of pea are proposed as a model system.
Wang, Hongtao; Li, Guisheng; Kwon, Woo-Saeng; Yang, Deok-Chun
2016-01-01
Panax ginseng is one of the most valuable medicinal plants in the Orient. The low level of genetic variation has limited the application of molecular markers for cultivar authentication and marker-assisted selection in cultivated ginseng. To exploit DNA polymorphism within ginseng cultivars, ginseng expressed sequence tags (ESTs) were searched against the potential intron polymorphism (PIP) database to predict the positions of introns. Intron-flanking primers were then designed in conserved exon regions and used to amplify across the more variable introns. Sequencing results showed that single nucleotide polymorphisms (SNPs), as well as indels, were detected in four EST-derived introns, and SNP markers specific to “Gopoong” and “K-1” were first reported in this study. Based on cultivar-specific SNP sites, allele-specific polymerase chain reaction (PCR) was conducted and proved to be effective for the authentication of ginseng cultivars. Additionally, the combination of a simple NaOH-Tris DNA isolation method and real-time allele-specific PCR assay enabled the high throughput selection of cultivars from ginseng fields. The established real-time allele-specific PCR assay should be applied to molecular authentication and marker assisted selection of P. ginseng cultivars, and the EST intron-targeting strategy will provide a potential approach for marker development in species without whole genomic DNA sequence information. PMID:27271615
Gherghe, Cristina; Lombo, Tania; Leonard, Christopher W.; Datta, Siddhartha A. K.; Bess, Julian W.; Gorelick, Robert J.; Rein, Alan; Weeks, Kevin M.
2010-01-01
All retroviral genomic RNAs contain a cis-acting packaging signal by which dimeric genomes are selectively packaged into nascent virions. However, it is not understood how Gag (the viral structural protein) interacts with these signals to package the genome with high selectivity. We probed the structure of murine leukemia virus RNA inside virus particles using SHAPE, a high-throughput RNA structure analysis technology. These experiments showed that NC (the nucleic acid binding domain derived from Gag) binds within the virus to the sequence UCUG-UR-UCUG. Recombinant Gag and NC proteins bound to this same RNA sequence in dimeric RNA in vitro; in all cases, interactions were strongest with the first U and final G in each UCUG element. The RNA structural context is critical: High-affinity binding requires base-paired regions flanking this motif, and two UCUG-UR-UCUG motifs are specifically exposed in the viral RNA dimer. Mutating the guanosine residues in these two motifs—only four nucleotides per genomic RNA—reduced packaging 100-fold, comparable to the level of nonspecific packaging. These results thus explain the selective packaging of dimeric RNA. This paradigm has implications for RNA recognition in general, illustrating how local context and RNA structure can create information-rich recognition signals from simple single-stranded sequence elements in large RNAs. PMID:20974908
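Locating candidate UCUG-UR-UCUG elements in an RNA sequence can be sketched with a regular expression. The sketch assumes that the hyphens in the motif are only visual separators and that R is the IUPAC code for a purine (A or G); the function name and the example sequence are invented. As the abstract stresses, sequence alone is not sufficient, since high-affinity binding also depends on the flanking base-paired structure.

```python
import re

# Lookahead so that overlapping occurrences would also be reported.
MOTIF = re.compile(r"(?=(UCUGU[AG]UCUG))")

def find_motifs(rna):
    """Return (0-based start, matched text) for each UCUG-UR-UCUG occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(rna.upper())]

# Invented example containing one motif with R = A and one with R = G.
example_rna = "GGUCUGUAUCUGCCUCUGUGUCUGAA"
hits = find_motifs(example_rna)
print(hits)
```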
Generating Models of Surgical Procedures using UMLS Concepts and Multiple Sequence Alignment
Meng, Frank; D’Avolio, Leonard W.; Chen, Andrew A.; Taira, Ricky K.; Kangarloo, Hooshang
2005-01-01
Surgical procedures can be viewed as a process composed of a sequence of steps performed on, by, or with the patient’s anatomy. This sequence is typically the pattern followed by surgeons when generating surgical report narratives for documenting surgical procedures. This paper describes a methodology for semi-automatically deriving a model of conducted surgeries, utilizing a sequence of derived Unified Medical Language System (UMLS) concepts for representing surgical procedures. A multiple sequence alignment was computed from a collection of such sequences and was used for generating the model. These models have the potential of being useful in a variety of informatics applications such as information retrieval and automatic document generation. PMID:16779094
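As a building block for the kind of alignment described above, the sketch below aligns two sequences of concept labels with a standard Needleman-Wunsch global alignment. It is not the authors' pipeline: a multiple alignment would combine many such sequences, the scoring values are arbitrary, and the step labels are invented placeholders rather than real UMLS concept identifiers.

```python
GAP = "-"

def align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of two token lists; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            if score[i][j] == diag:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(GAP)
            i -= 1
        else:
            out_a.append(GAP)
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], out_a[::-1], out_b[::-1]

# Invented step labels standing in for concept sequences from two reports.
report_1 = ["incision", "dissection", "ligation", "closure"]
report_2 = ["incision", "ligation", "irrigation", "closure"]
total, row_1, row_2 = align(report_1, report_2)
print(total)
print(row_1)
print(row_2)
```

Columns where both rows carry the same label mark steps shared by the two reports; gap columns mark steps present in only one of them.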
SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop.
Schumacher, André; Pireddu, Luca; Niemenmaa, Matti; Kallio, Aleksi; Korpelainen, Eija; Zanetti, Gianluigi; Heljanko, Keijo
2014-01-01
Hadoop MapReduce-based approaches have become increasingly popular due to their scalability in processing large sequencing datasets. However, as these methods typically require in-depth expertise in Hadoop and Java, they are still out of reach of many bioinformaticians. To solve this problem, we have created SeqPig, a library and a collection of tools to manipulate, analyze and query sequencing datasets in a scalable and simple manner. SeqPig scripts use the Hadoop-based distributed scripting engine Apache Pig, which automatically parallelizes and distributes data processing tasks. We demonstrate SeqPig's scalability over many computing nodes and illustrate its use with example scripts. Available under the open source MIT license at http://sourceforge.net/projects/seqpig/
NASA Technical Reports Server (NTRS)
Tanimoto, T.
1983-01-01
A simple modification of Gilbert's formula to account for slight lateral heterogeneity of the Earth leads to a convenient formula to calculate synthetic long period seismograms. Partial derivatives are easily calculated, thus the formula is suitable for direct inversion of seismograms for lateral heterogeneity of the Earth.
Nakanishi, Satoshi; Kuramoto, Takashi; Kashiwazaki, Naomi; Yokoi, Norihide
2016-01-01
The Zucker fatty (ZF) rat is an outbred rat and a well-known model of obesity without diabetes, harboring a missense mutation (fatty, abbreviated as fa) in the leptin receptor gene (Lepr). Slc:Zucker (Slc:ZF) outbred rats exhibit obesity while Hos:ZFDM-Leprfa (Hos:ZFDM) outbred rats exhibit obesity and type 2 diabetes. Both outbred rats have been derived from an outbred ZF rat colony maintained at Tokyo Medical University. So far, genetic profiles of these outbred rats remain unknown. Here, we applied a simple genotyping method using Ampdirect reagents and FTA cards (Amp-FTA) in combination with simple sequence length polymorphism (SSLP) markers to determine genetic profiles of Slc:ZF and Hos:ZFDM rats. Among 27 SSLP marker loci, 24 loci (89%) were fixed for a specific allele at each locus in Slc:ZF rats and 26 loci (96%) were fixed in Hos:ZFDM rats. This indicates low genetic heterogeneity in both colonies of outbred rats. Nine loci (33%) showed different alleles between the two outbred rats, suggesting considerably different genetic profiles between the two outbred rats in spite of the same origin. Additional analysis using 72 SSLP markers further supported these results and clarified the profiles in detail. This study revealed that genetic profiles of the Slc:ZF and Hos:ZFDM outbred rats are different for about 30% of the SSLP marker loci, which is the underlying basis for the phenotypic difference between the two outbred rats. PMID:27795491
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy
Phylogenetic analyses were done for the Shewanella strains isolated from the Baltic Sea (38 strains), the US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of the 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of the Shewanella genome selected from a shared gene list of the Shewanella strains with whole genomes sequenced, based on their average nucleotide identity (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at the species and strain level within the Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WCGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and that it is superior to phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.
2014-01-01
Background: The nematode Pratylenchus neglectus has a wide host range and is able to feed on the root systems of cereals, oilseeds, grain and pasture legumes. Under the Mediterranean low rainfall environments of Australia, annual Medicago pasture legumes are used in rotation with cereals to fix atmospheric nitrogen and improve soil parameters. Considerable efforts are being made in breeding programs to improve resistance and tolerance to Pratylenchus neglectus in the major crops wheat and barley, which makes it vital to develop appropriate selection tools in medics. Results: A strong source of tolerance to root damage by the root lesion nematode (RLN) Pratylenchus neglectus had previously been identified in line RH-1 (strand medic, M. littoralis). Using RH-1, we have developed a single seed descent (SSD) population of 138 lines by crossing it to the intolerant cultivar Herald. After inoculation, RLN-associated root damage clearly segregated in the population. Genetic analysis was performed by constructing a genetic map using simple sequence repeat (SSR) and gene-based SNP markers. A highly significant quantitative trait locus (QTL), QPnTolMl.1, was identified explaining 49% of the phenotypic variation in the SSD population. All SSRs and gene-based markers in the QTL region were derived from chromosome 1 of the sequenced genome of the closely related species M. truncatula. Gene-based markers were validated in advanced breeding lines derived from the RH-1 parent and also a second RLN tolerance source, RH-2 (M. truncatula ssp. tricycla). Comparative analysis to sequenced legume genomes showed that the physical QTL interval exists as a synteny block in Lotus japonicus, common bean, soybean and chickpea. Furthermore, using the sequenced genome information of M. truncatula, the QTL interval contains 55 genes out of which five are discussed as potential candidate genes responsible for the mapped tolerance.
Conclusion: The closely linked set of SNP-based PCR markers is directly applicable to select for two different sources of RLN tolerance in breeding programs. Moreover, genome sequence information has allowed proposing candidate genes for further functional analysis and nominates QPnTolMl.1 as a target locus for RLN tolerance in economically important grain legumes, e.g. chickpea. PMID:24742262
Physical characterization of the cloned protease III gene from Escherichia coli K-12.
Dykstra, C C; Kushner, S R
1985-09-01
Analysis of the cloned protease III gene (ptr) from Escherichia coli K-12 has demonstrated that in addition to the previously characterized 110,000-Mr protease III protein, a second 50,000-Mr polypeptide (p50) is derived from the amino-terminal end of the coding sequence. The p50 polypeptide is found predominantly in the periplasmic space along with protease III, but does not proteolytically degrade insulin, a substrate for protease III. p50 does not appear to originate from autolysis of the larger protein. Protease III is not essential for normal cell growth since deletion of the structural gene causes no observed alterations in the phenotypic properties of the bacteria. A 30-fold overproduction of protease III does not affect cell viability. A simple new purification method for protease III is described.
Li, Zhi-Zhong; Lu, Meng-Xue; Saina, Josphat K; Gichira, Andrew W; Wang, Qing-Feng; Chen, Jin-Ming
2017-11-01
Simple sequence repeat (SSR) markers were derived from transcriptomic data for Ottelia acuminata (Hydrocharitaceae), a species comprising five endemic and highly endangered varieties in China. Sixteen novel SSR markers were developed for O. acuminata var. jingxiensis. One to eight alleles per locus were found, with a mean of 2.896. The observed and expected heterozygosity ranged from 0.000 to 1.000 and 0.000 to 0.793, respectively. Interestingly, in cross-varietal amplification, 13 out of the 16 loci were successfully amplified in O. acuminata var. acuminata, and 12 amplified in each of the other three varieties of O. acuminata. These newly developed SSR markers will facilitate further study of genetic variation and provide important genetic data needed for appropriate conservation of natural populations of all varieties of O. acuminata.
Einstein’s sigh: hidden symmetry in Einstein’s derivation of the Lorentz transformation
NASA Astrophysics Data System (ADS)
Chao, Sheng D.
2017-03-01
‘Das hätte ich einfacher sagen können (I could have said that more simply).’ was Einstein’s sigh when he had a chance to remark on his own derivation of the Lorentz transformation (LT) in the 1905 seminal paper. In fact, in a popular science exposition of the theory of relativity Einstein did provide such a simple derivation of the LT. It is a curious historical fact that the latter derivation was presented in 1916, while Einstein’s remark was made in 1943. Was the 1916 derivation simple enough to relieve his sigh? Had he expected an even simpler derivation beyond his thoughts? In this paper, Einstein’s simple derivation of the LT is revisited and analysed. We show that the LT can be obtained from a symmetry principle hidden in Einstein’s logical reasoning. First, the relativity principle can be restated as a mirror principle based on the space-time exchange-inversion operation. Second, the assumed constancy of the speed of light (Einstein’s second postulate) can be derived by using the velocity reciprocity property, which is a deductive result of the space-time homogeneity and the space isotropy. Therefore, Einstein could have presented his derivation of the LT more simply, thus turning Einstein’s sigh of regret into a sigh of relief.
A high-density intraspecific SNP linkage map of pigeonpea (Cajanas cajan L. Millsp.)
Mandal, Paritra; Bhutani, Shefali; Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram Pratap; Chaudhary, A. K.; Yadav, Rekha; Gaikwad, K.; Sevanthi, Amitha Mithra; Datta, Subhojit; Raje, Ranjeet S.; Sharma, Tilak R.; Singh, Nagendra Kumar
2017-01-01
Pigeonpea (Cajanus cajan (L.) Millsp.) is a major food legume cultivated in semi-arid tropical regions including the Indian subcontinent, Africa, and Southeast Asia. It is an important source of protein, minerals, and vitamins for nearly 20% of the world population. Due to high carbon sequestration and drought tolerance, pigeonpea is an important crop for the development of climate resilient agriculture and nutritional security. However, pigeonpea productivity has remained low for decades because of limited genetic and genomic resources, and sparse utilization of landraces and wild pigeonpea germplasm. Here, we present a dense intraspecific linkage map of pigeonpea comprising 932 markers that span a total adjusted map length of 1,411.83 cM. The consensus map is based on three different linkage maps that incorporate a large number of single nucleotide polymorphism (SNP) markers derived from next generation sequencing data, using Illumina GoldenGate bead arrays, and genotyping with restriction site associated DNA (RAD) sequencing. The genotyping-by-sequencing enhanced the marker density but was met with limited success due to lack of common markers across the genotypes of mapping population. The integrated map has 547 bead-array SNP, 319 RAD-SNP, and 65 simple sequence repeat (SSR) marker loci. We also show here correspondence between our linkage map and published genome pseudomolecules of pigeonpea. The availability of a high-density linkage map will help improve the anchoring of the pigeonpea genome to its chromosomes and the mapping of genes and quantitative trait loci associated with useful agronomic traits. PMID:28654689
GABI-Kat SimpleSearch: new features of the Arabidopsis thaliana T-DNA mutant database.
Kleinboelting, Nils; Huep, Gunnar; Kloetgen, Andreas; Viehoever, Prisca; Weisshaar, Bernd
2012-01-01
T-DNA insertion mutants are very valuable for reverse genetics in Arabidopsis thaliana. Several projects have generated large sequence-indexed collections of T-DNA insertion lines, of which GABI-Kat is the second largest resource worldwide. User access to the collection and its Flanking Sequence Tags (FSTs) is provided by the front end SimpleSearch (http://www.GABI-Kat.de). Several significant improvements have been implemented recently. The database now relies on the TAIRv10 genome sequence and annotation dataset. All FSTs have been newly mapped using an optimized procedure that leads to improved accuracy of insertion site predictions. A fraction of the collection with weak FST yield was re-analysed by generating new FSTs. Along with newly found predictions for older sequences about 20,000 new FSTs were included in the database. Information about groups of FSTs pointing to the same insertion site that is found in several lines but is real only in a single line are included, and many problematic FST-to-line links have been corrected using new wet-lab data. SimpleSearch currently contains data from ~71,000 lines with predicted insertions covering 62.5% of the 27,206 nuclear protein coding genes, and offers insertion allele-specific data from 9545 confirmed lines that are available from the Nottingham Arabidopsis Stock Centre.
Prediction during statistical learning, and implications for the implicit/explicit divide
Dale, Rick; Duran, Nicholas D.; Morehead, J. Ryan
2012-01-01
Accounts of statistical learning, both implicit and explicit, often invoke predictive processes as central to learning, yet practically all experiments employ non-predictive measures during training. We argue that the common theoretical assumption of anticipation and prediction needs clearer, more direct evidence for it during learning. We offer a novel experimental context to explore prediction, and report results from a simple sequential learning task designed to promote predictive behaviors in participants as they responded to a short sequence of simple stimulus events. Predictive tendencies in participants were measured using their computer mouse, the trajectories of which served as a means of tapping into predictive behavior while participants were exposed to very short and simple sequences of events. A total of 143 participants were randomly assigned to stimulus sequences along a continuum of regularity. Analysis of computer-mouse trajectories revealed that (a) participants almost always anticipate events in some manner, (b) participants exhibit two stable patterns of behavior, either reacting to vs. predicting future events, (c) the extent to which participants predict relates to performance on a recall test, and (d) explicit reports of perceiving patterns in the brief sequence correlates with extent of prediction. We end with a discussion of implicit and explicit statistical learning and of the role prediction may play in both kinds of learning. PMID:22723817
Perczel, András; Jákli, Imre; McAllister, Michael A; Csizmadia, Imre G
2003-06-06
Folding properties of small globular proteins are determined by their amino acid sequence (primary structure). This holds both for local (secondary structure) and for global conformational features of linear polypeptides and proteins composed from natural amino acid derivatives. It thus provides the rational basis of structure prediction algorithms. The shortest secondary structure element, the beta-turn, most typically adopts either a type I or a type II form, depending on the amino acid composition. Herein we investigate the sequence-dependent folding stability of both major types of beta-turns using simple dipeptide models (-Xxx-Yyy-). Gas-phase ab initio properties of 16 carefully selected and suitably protected dipeptide models (for example Val-Ser, Ala-Gly, Ser-Ser) were studied. For each backbone fold the most probable side-chain conformers were considered. Fully optimized RHF/3-21G molecular structures were employed in medium level [B3LYP/6-311++G(d,p)//RHF/3-21G] energy calculations to estimate relative populations of the different backbone conformers. Our results show that the preference for beta-turn forms as calculated by quantum mechanics and as observed in X-ray determined proteins correlates significantly.
Wang, Baohua; Liu, Limei; Zhang, Dong; Zhuang, Zhimin; Guo, Hui; Qiao, Xin; Wei, Lijuan; Rong, Junkang; May, O. Lloyd; Paterson, Andrew H.; Chee, Peng W.
2016-01-01
Among the seven tetraploid cotton species, little is known about transmission genetics and genome organization in Gossypium mustelinum, the species most distant from the source of most cultivated cotton, G. hirsutum. In this research, an F2 population was developed from an interspecific cross between G. hirsutum and G. mustelinum (HM). A genetic linkage map was constructed mainly using simple sequence repeat (SSRs) and restriction fragment length polymorphism (RFLP) DNA markers. The arrangements of most genetic loci along the HM chromosomes were identical to those of other tetraploid cotton species. However, both major and minor structural rearrangements were also observed, for which we propose a parsimony-based model for structural divergence of tetraploid cottons from common ancestors. Sequences of mapped markers were used for alignment with the 26 scaffolds of the G. hirsutum draft genome, and showed high consistency. Quantitative trait locus (QTL) mapping of fiber elongation in advanced backcross populations derived from the same parents demonstrated the value of the HM map. The HM map will serve as a valuable resource for QTL mapping and introgression of G. mustelinum alleles into G. hirsutum, and help clarify evolutionary relationships between the tetraploid cotton genomes. PMID:27172208
KungFQ: a simple and powerful approach to compress fastq files.
Grassi, Elena; Di Gregorio, Federico; Molineris, Ivan
2012-01-01
Storing data derived from deep sequencing experiments has become pivotal, and standard compression algorithms do not exploit the structure of these data satisfactorily. A number of reference-based compression algorithms have been developed, but they are less adequate when approaching new species without fully sequenced genomes or nongenomic data. We developed a tool that takes advantage of fastq characteristics and encodes them in a binary format optimized to be further compressed with standard tools (such as gzip or lzma). The algorithm is straightforward and does not need any external reference file; it scans the fastq only once and has a constant memory requirement. Moreover, we added the possibility to perform lossy compression, losing some of the original information (IDs and/or qualities) but resulting in smaller files; it is also possible to define a quality cutoff under which corresponding base calls are converted to N. We achieve compression ratios of 2.82 to 7.77 on various fastq files without losing information and 5.37 to 8.77 when losing IDs, which are often not used in common analysis pipelines. In this paper, we compare the algorithm performance with known tools, usually obtaining higher compression levels.
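The quality-cutoff step mentioned in the abstract above can be illustrated with a minimal sketch. This is not KungFQ's actual implementation: the function name `mask_low_quality`, the default Phred+33 offset, and the example read are assumptions made only for illustration.

```python
def mask_low_quality(seq: str, qual: str, cutoff: int, offset: int = 33) -> str:
    """Replace base calls whose Phred score is below `cutoff` with 'N'.

    Assumes (hypothetically) Phred+33 encoded quality strings; the function
    name and signature are illustrative, not taken from KungFQ.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")
    return "".join(
        base if ord(q) - offset >= cutoff else "N"
        for base, q in zip(seq, qual)
    )


# 'I' encodes Phred 40 and '#' encodes Phred 2 under the Phred+33 convention.
masked = mask_low_quality("ACGTACGT", "IIII##II", cutoff=20)
print(masked)  # ACGTNNGT
```

Masking low-confidence calls in this way makes the sequence and quality streams more repetitive, which is presumably why a lossy mode of this kind yields smaller files.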
Graph cuts for curvature based image denoising.
Bae, Egil; Shi, Juan; Tai, Xue-Cheng
2011-05-01
Minimization of total variation (TV) is a well-known method for image denoising. Recently, the relationship between TV minimization problems and binary MRF models has been much explored. This has resulted in some very efficient combinatorial optimization algorithms for the TV minimization problem in the discrete setting via graph cuts. To overcome limitations, such as staircasing effects, of the relatively simple TV model, variational models based upon higher order derivatives have been proposed. The Euler's elastica model is one such higher order model of central importance, which minimizes the curvature of all level lines in the image. Traditional numerical methods for minimizing the energy in such higher order models are complicated and computationally complex. In this paper, we will present an efficient minimization algorithm based upon graph cuts for minimizing the energy in the Euler's elastica model, by simplifying the problem to that of solving a sequence of easy graph representable problems. This sequence has connections to the gradient flow of the energy function, and converges to a minimum point. The numerical experiments show that our new approach is more effective in maintaining smooth visual results while preserving sharp features better than TV models.
Rapid detection of the CYP2A6*12 hybrid allele by Pyrosequencing technology.
Koontz, Deborah A; Huckins, Jacqueline J; Spencer, Antonina; Gallagher, Margaret L
2009-08-24
Identification of CYP2A6 alleles associated with reduced enzyme activity is important in the study of inter-individual differences in drug metabolism. CYP2A6*12 is a hybrid allele that results from unequal crossover between CYP2A6 and CYP2A7 genes. The 5' regulatory region and exons 1-2 are derived from CYP2A7, and exons 3-9 are derived from CYP2A6. Conventional methods for detection of CYP2A6*12 consist of two-step PCR protocols that are laborious and unsuitable for high-throughput genotyping. We developed a rapid and accurate method to detect the CYP2A6*12 allele by Pyrosequencing technology. A single set of PCR primers was designed to specifically amplify both the CYP2A6*1 wild-type allele and the CYP2A6*12 hybrid allele. An internal Pyrosequencing primer was used to generate allele-specific sequence information, which detected homozygous wild-type, heterozygous hybrid, and homozygous hybrid alleles. We first validated the assay on 104 DNA samples that were also genotyped by conventional two-step PCR and by cycle sequencing. CYP2A6*12 allele frequencies were then determined using the Pyrosequencing assay on 181 multi-ethnic DNA samples from subjects of African American, European Caucasian, Pacific Rim, and Hispanic descent. Finally, we streamlined the Pyrosequencing assay by integrating liquid handling robotics into the workflow. Pyrosequencing results demonstrated 100% concordance with conventional two-step PCR and cycle sequencing methods. Allele frequency data showed slightly higher prevalence of the CYP2A6*12 allele in European Caucasians and Hispanics. This Pyrosequencing assay proved to be a simple, rapid, and accurate alternative to conventional methods, which can be easily adapted to the needs of higher-throughput studies.
Wang, Gui-xiang; Lv, Jing; Zhang, Jie; Han, Shuo; Zong, Mei; Guo, Ning; Zeng, Xing-ying; Zhang, Yue-yun; Wang, You-ping; Liu, Fan
2016-01-01
Broad phenotypic variations were obtained previously in derivatives from the asymmetric somatic hybridization of cauliflower “Korso” (Brassica oleracea var. botrytis, 2n = 18, CC genome) and black mustard “G1/1” (Brassica nigra, 2n = 16, BB genome). However, the mechanisms underlying these variations were unknown. In this study, 28 putative introgression lines (ILs) were pre-selected according to a series of morphological (leaf shape and color, plant height and branching, curd features, and flower traits) and physiological (black rot/club root resistance) characters. Multi-color fluorescence in situ hybridization revealed that these plants contained 18 chromosomes derived from “Korso.” Molecular marker (65 simple sequence repeats and 77 amplified fragment length polymorphisms) analysis identified the presence of “G1/1” DNA segments (average 7.5%). Additionally, DNA profiling revealed many genetic and epigenetic differences among the ILs, including sequence alterations, deletions, and variation in patterns of cytosine methylation. The frequency of fragments lost (5.1%) was higher than presence of novel bands (1.4%), and the presence of fragments specific to Brassica carinata (BBCC 2n = 34) were common (average 15.5%). Methylation-sensitive amplified polymorphism analysis indicated that methylation changes were common and that hypermethylation (12.4%) was more frequent than hypomethylation (4.8%). Our results suggested that asymmetric somatic hybridization and alien DNA introgression induced genetic and epigenetic alterations. Thus, these ILs represent an important, novel germplasm resource for cauliflower improvement that can be mined for diverse traits of interest to breeders and researchers. PMID:27625659
Chusreeaeom, Katarut; Ariizumi, Tohru; Asamizu, Erika; Okabe, Yoshihiro; Shirasawa, Kenta; Ezura, Hiroshi
2014-06-01
Genes controlling fruit morphology offer important insights into patterns and mechanisms determining organ shape and size. In cultivated tomato (Solanum lycopersicum L.), a variety of fruit shapes are displayed, including round-, bell pepper-, pear-, and elongate-shaped forms. In this study, we characterized a tomato mutant possessing elongated fruit morphology by histologically analyzing its fruit structure and genetically analyzing and mapping the genetic locus. The mutant line, Solanum lycopersicum elongated fruit 1 (Slelf1), was selected in a previous study from an ethylmethane sulfonate-mutagenized population generated in the background of Micro-Tom, a dwarf and rapid-growth variety. Histological analysis of the Slelf1 mutant revealed dramatically increased elongation of ovary and fruit. Until 6 days before flowering, ovaries were round; they began to elongate afterward. We also determined pericarp thickness and the number of cell layers in three designated fruit regions. We found that mesocarp thickness, as well as the number of cell layers, was increased in the proximal region of immature green fruits, making this the key sector of fruit elongation. Using 262 F2 individuals derived from a cross between Slelf1 and the cultivar Ailsa Craig, we constructed a genetic map using simple sequence repeat (SSR), cleaved amplified polymorphism sequence (CAPS), and derived CAPS (dCAPS) markers mapped to the 12 tomato chromosomes. Genetic mapping placed the candidate gene locus within a 0.2 Mbp interval on the long arm of chromosome 8; the locus was likely different from previously known loci affecting fruit shape.
Levy, Nitzan; Tatomer, Dierdre; Herber, Candice B.; Zhao, Xiaoyue; Tang, Hui; Sargeant, Toby; Ball, Lonnele J.; Summers, Jonathan; Speed, Terence P.; Leitman, Dale C.
2008-01-01
Estrogen receptors (ERs) regulate gene transcription by interacting with regulatory elements. Most information regarding how ER activates genes has come from studies using a small set of target genes or simple consensus sequences such as estrogen response element, activator protein 1, and Sp1 elements. However, these elements cannot explain the differences in gene regulation patterns and clinical effects observed with estradiol (E2) and selective estrogen receptor modulators. To obtain a greater understanding of how E2 and selective estrogen receptor modulators differentially regulate genes, it is necessary to investigate their action on a more comprehensive set of native regulatory elements derived from ER target genes. Here we used chromatin immunoprecipitation-cloning and sequencing to isolate 173 regulatory elements associated with ERα. Most elements were found in the introns (38%) and regions greater than 10 kb upstream of the transcription initiation site (38%); 24% of the elements were found in the proximal promoter region (<10 kb). Only 11% of the elements contained a classical estrogen response element; 23% of the elements did not have any known response elements, including one derived from the naked cuticle homolog gene, which was associated with the recruitment of p160 coactivators. Transfection studies found that 80% of the 173 elements were regulated by E2, raloxifene, or tamoxifen with ERα or ERβ. Tamoxifen was more effective than raloxifene at activating the elements with ERα, whereas raloxifene was superior with ERβ. Our findings demonstrate that E2, tamoxifen, and raloxifene differentially regulate native ER-regulatory elements isolated by chromatin immunoprecipitation with ERα and ERβ. PMID:17962382
Wallace, A. C.; Borkakoti, N.; Thornton, J. M.
1997-01-01
It is well established that sequence templates such as those in the PROSITE and PRINTS databases are powerful tools for predicting the biological function and tertiary structure for newly derived protein sequences. The number of X-ray and NMR protein structures is increasing rapidly and it is apparent that a 3D equivalent of the sequence templates is needed. Here, we describe an algorithm called TESS that automatically derives 3D templates from structures deposited in the Brookhaven Protein Data Bank. While a new sequence can be searched for sequence patterns, a new structure can be scanned against these 3D templates to identify functional sites. As examples, 3D templates are derived for enzymes with an O-His-O "catalytic triad" and for the ribonucleases and lysozymes. When these 3D templates are applied to a large data set of nonidentical proteins, several interesting hits are located. This suggests that the development of a 3D template database may help to identify the function of new protein structures, if unknown, as well as to design proteins with specific functions. PMID:9385633
Tang, Khanh G; Kent, Greggory T; Erden, Ihsan; Wu, Weiming
2017-10-04
cis-β-Bromostyrene derivatives were synthesized stereospecifically from cinnamic acids through β-lactone intermediates. The synthetic sequence did not require the purification of the β-lactone intermediates although they were found to be stable and readily purified in most cases.
Characterization of circulating transfer RNA-Derived RNA fragments in cattle
USDA-ARS's Scientific Manuscript database
The objective was to characterize naturally occurring circulating transfer RNA-derived RNA Fragments (tRFs) in cattle. Serum from eight clinically normal adult dairy cows was collected, and small non-coding RNAs were extracted immediately after collection and sequenced by Illumina MiSeq. Sequences a...
Ayesh, Basim M
2017-01-01
Molecular markers are credible for the discrimination of genotypes and estimation of the extent of genetic diversity and relatedness in a set of genotypes. Inter-simple sequence repeat (ISSR) markers rapidly reveal highly polymorphic fingerprints and have been used frequently to determine the genetic diversity among date palm cultivars. This chapter describes the application of ISSR markers for genotyping of date palm cultivars. The application involves extraction of genomic DNA of reliable quality and quantity from the target cultivars. Subsequently the extracted DNA serves as a template for amplification of genomic regions flanked by inverted simple sequence repeats using a single primer. The similarity of each pair of samples is measured by counting the mono- and polymorphic bands revealed by gel electrophoresis. Matrices constructed for similarity and genetic distance are used to build a phylogenetic tree and perform cluster analysis, to determine the molecular relatedness of cultivars. The protocol describes how 3 out of 9 tested primers consistently amplified 31 loci in 6 date palm cultivars, 28 of which were polymorphic.
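The band-scoring step described in the abstract above can be sketched as follows. The abstract does not name a similarity coefficient, so the Jaccard coefficient used here is an assumption, as are the cultivar names and the presence/absence band data, which are invented purely for illustration.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Shared bands divided by bands present in at least one of two profiles."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 1.0


def similarity_matrix(bands):
    """Pairwise similarity for a dict mapping sample name -> 0/1 band profile."""
    names = list(bands)
    sim = {(n, n): 1.0 for n in names}
    for p, q in combinations(names, 2):
        sim[(p, q)] = sim[(q, p)] = jaccard_similarity(bands[p], bands[q])
    return sim


# Hypothetical gel scores: 1 = band present at a locus, 0 = band absent.
bands = {
    "cultivar_A": [1, 1, 0, 1, 0],
    "cultivar_B": [1, 0, 0, 1, 1],
    "cultivar_C": [0, 1, 1, 0, 0],
}
sim = similarity_matrix(bands)
# A simple genetic distance derived from similarity (distance = 1 - similarity).
distance = {pair: 1.0 - s for pair, s in sim.items()}
print(sim[("cultivar_A", "cultivar_B")], distance[("cultivar_A", "cultivar_C")])
```

A distance matrix of this form is the usual input to a clustering routine such as UPGMA when building the tree; the choice of coefficient and clustering method would follow the protocol actually used.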
Analysis on the DNA Fingerprinting of Aspergillus Oryzae Mutant Induced by High Hydrostatic Pressure
NASA Astrophysics Data System (ADS)
Wang, Hua; Zhang, Jian; Yang, Fan; Wang, Kai; Shen, Si-Le; Liu, Bing-Bing; Zou, Bo; Zou, Guang-Tian
2011-01-01
The mutant strains of Aspergillus oryzae (HP300a) are screened under 300 MPa for 20 min. Compared with the control strains, the screened mutant strains have unique properties such as genetic stability, rapid growth, abundant spores, and high protease activity. Random amplified polymorphic DNA (RAPD) and inter simple sequence repeats (ISSR) are used to analyze the DNA fingerprinting of HP300a and the control strains. These two markers yield 67.9% and 51.3% polymorphic bands, respectively, indicating significant genetic variations between HP300a and the control strains. In addition, in a comparison of HP300a with the control strains, the genetic distances based on random sequences and simple sequence repeats of DNA are 0.51 and 0.34, respectively.
Generation of a Maize B Centromere Minimal Map Containing the Central Core Domain.
Ellis, Nathanael A; Douglas, Ryan N; Jackson, Caroline E; Birchler, James A; Dawe, R Kelly
2015-10-28
The maize B centromere has been used as a model for centromere epigenetics and as the basis for building artificial chromosomes. However, there are no sequence resources for this important centromere. Here we used transposon display for the centromere-specific retroelement CRM2 to identify a collection of 40 sequence tags that flank CRM2 insertion points on the B chromosome. These were confirmed to lie within the centromere by assaying deletion breakpoints from centromere misdivision derivatives (intracentromere breakages caused by centromere fission). Markers were grouped together on the basis of their association with other markers in the misdivision series and assembled into a pseudocontig containing 10.1 kb of sequence. To identify sequences that interact directly with centromere proteins, we carried out chromatin immunoprecipitation using antibodies to centromeric histone H3 (CENH3), a defining feature of functional centromeric sequences. The CENH3 chromatin immunoprecipitation map was interpreted relative to the known transmission rates of centromere misdivision derivatives to identify a centromere core domain spanning 33 markers. A subset of seven markers was mapped in additional B centromere misdivision derivatives with the use of unique primer pairs. A derivative previously shown to have no canonical centromere sequences (Telo3-3) lacks these core markers. Our results provide a molecular map of the B chromosome centromere and identify key sequences within the map that interact directly with centromeric histone H3. Copyright © 2015 Ellis et al.
Li, Na; Mao, Wenjun; Liu, Xue; Wang, Shuyao; Xia, Zheng; Cao, Sujian; Li, Lin; Zhang, Qi; Liu, Shan
2016-10-04
Five sulfated oligosaccharide fragments, F1-F5, were prepared from a pyruvylated galactan sulfate from the green alga Codium divaricatum, by partial depolymerization using mild acid hydrolysis and purification with gel-permeation chromatography. Negative-ion electrospray tandem mass spectrometry with collision-induced dissociation (ES-CID-MS/MS) is attempted for sequence determination of the sulfated oligosaccharides. The sequence of F1 with homogeneous disaccharide composition was first characterized to be Galp-(4SO4)-(1 → 3)-Galp by detailed nuclear magnetic resonance spectroscopic analyses. The fragmentation pattern of F1 in the product ion spectra was established on the basis of negative-ion ES-CID MS/MS, which was then applied to sequence analysis of other sulfated oligosaccharides. The sequences of F2 and F3 were deduced to be Galp-(4SO4)-(1 → 3)-Galp-(1 → 3)-Galp-(1 → 3)-Galp and 3,4-O-(1-carboxyethylidene)-Galp-(6SO4)-(1 → 3)-Galp, respectively. The sequences of major fragments in F4 and F5 were also deduced. The investigation demonstrated that negative-ion ES-CID-MS/MS was an efficient method for the sequence analysis of the pyruvylated galactan sulfate-derived oligosaccharides which revealed the patterns of substitution and glycosidic linkages. The pyruvylated galactan sulfate-derived oligosaccharides were novel sulfated oligosaccharides different from other algal polysaccharide-derived oligosaccharides. Copyright © 2016 Elsevier Ltd. All rights reserved.
Learning predictive statistics from temporal sequences: Dynamics and strategies
Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E.; Kourtzi, Zoe
2017-01-01
Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics—that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments. PMID:28973111
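The contrast between frequency statistics and context-based statistics in the abstract above can be sketched with a first-order Markov chain. The symbols, transition probabilities, and the function name `generate_sequence` are assumptions chosen for illustration and are not the stimuli or parameters used in the study.

```python
import random


def generate_sequence(transitions, start, length, rng):
    """Sample a symbol sequence from a first-order Markov chain.

    `transitions[s]` maps each possible next symbol to its probability
    given that the current symbol is `s`.
    """
    seq = [start]
    while len(seq) < length:
        probs = transitions[seq[-1]]
        symbols = list(probs)
        weights = [probs[s] for s in symbols]
        seq.append(rng.choices(symbols, weights=weights, k=1)[0])
    return seq


# Frequency statistics: every row is identical, so the next symbol does not
# depend on the previous one; some symbols are simply more probable.
frequency_only = {s: {"A": 0.7, "B": 0.2, "C": 0.1} for s in "ABC"}

# Context-based statistics: symbol probability depends on the preceding symbol.
context_based = {
    "A": {"A": 0.1, "B": 0.8, "C": 0.1},
    "B": {"A": 0.1, "B": 0.1, "C": 0.8},
    "C": {"A": 0.8, "B": 0.1, "C": 0.1},
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
print("".join(generate_sequence(frequency_only, "A", 20, rng)))
print("".join(generate_sequence(context_based, "A", 20, rng)))
```

Under the context-based table, always choosing the most probable successor corresponds to the maximizing strategy mentioned in the abstract, whereas sampling responses in proportion to the row probabilities corresponds to matching.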
Learning predictive statistics from temporal sequences: Dynamics and strategies.
Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe
2017-10-01
Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics, that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.
Development of genome-wide SNP assays for rice
USDA-ARS's Scientific Manuscript database
With the introduction of new sequencing technologies, single nucleotide polymorphisms (SNPs) are rapidly replacing simple sequence repeats (SSRs) as the DNA marker of choice for applications in plant breeding and genetics because they are more abundant, stable, amenable to automation, efficient, and...
2012-01-01
Background While safer than their viral counterparts, conventional non-viral gene delivery DNA vectors offer a limited safety profile. They often result in the delivery of unwanted prokaryotic sequences, antibiotic resistance genes, and the bacterial origins of replication to the target, which may lead to the stimulation of unwanted immunological responses due to their chimeric DNA composition. Such vectors may also impart the potential for chromosomal integration, thus potentiating oncogenesis. We sought to engineer an in vivo system for the quick and simple production of safer DNA vector alternatives that were devoid of non-transgene bacterial sequences and would lethally disrupt the host chromosome in the event of an unwanted vector integration event. Results We constructed a parent eukaryotic expression vector possessing a specialized manufactured multi-target site called “Super Sequence”, and engineered E. coli cells (R-cell) that conditionally produce phage-derived recombinase Tel (PY54), TelN (N15), or Cre (P1). Passage of the parent plasmid vector through R-cells under optimized conditions resulted in rapid, efficient, one-step in vivo generation of mini lcc (linear covalently closed; Tel/TelN-cell) or mini ccc (circular covalently closed; Cre-cell) DNA constructs, separated from the backbone plasmid DNA. Site-specific integration of lcc plasmids into the host chromosome resulted in chromosomal disruption and 10^5-fold lower viability than that seen with the ccc counterpart. Conclusion We offer a high efficiency mini DNA vector production system that confers simple, rapid and scalable in vivo production of mini lcc DNA vectors that possess all the benefits of “minicircle” DNA vectors and virtually eliminate the potential for undesirable vector integration events. PMID:23216697
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies
2010-01-01
Background All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. Results The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. Conclusions This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study are limited, and are consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. 
The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general. PMID:20144194
Barrett, Angela N; Xiong, Li; Tan, Tuan Z; Advani, Henna V; Hua, Rui; Laureano-Asibal, Cecille; Soong, Richie; Biswas, Arijit; Nagarajan, Niranjan; Choolani, Mahesh
2017-01-01
Cell-free DNA from maternal plasma can be used for non-invasive prenatal testing for aneuploidies and single gene disorders, and also has applications as a biomarker for monitoring high-risk pregnancies, such as those at risk of pre-eclampsia. On average, the fractional cell-free fetal DNA concentration in plasma is approximately 15%, but can vary from less than 4% to greater than 30%. Although quantification of cell-free fetal DNA is straightforward in the case of a male fetus, there is no universal fetal marker; in a female fetus measurement is more challenging. We have developed a panel of multiplexed insertion/deletion polymorphisms that can measure fetal fraction in all pregnancies in a simple, targeted sequencing reaction. A multiplex panel of primers was designed for 35 indels plus a ZFX/ZFY amplicon. cfDNA was extracted from plasma from 157 pregnant women, and maternal genomic DNA was extracted for 20 of these samples for panel validation. Sixty-one samples from pregnancies with a male fetus were subjected to whole genome sequencing on the Ion Proton sequencing platform, and fetal fraction derived from Y chromosome counts was compared to fetal fraction measured using the indel panel. A total of 157 cell-free DNA samples were sequenced using the indel panel, and informativity was assessed, along with the proportion of fetal DNA. Using gDNA we optimised the indel panel, removing amplicons giving rise to PCR bias. Good correlation was found between fetal fraction using indels and using whole genome sequencing of the Y chromosome (Spearmans r = 0.69). A median of 12 indels were informative per sample. The indel panel was informative in 157/157 cases (mean fetal fraction 14.4% (±0.58%)). Using our targeted next generation sequencing panel we can readily assess the fetal DNA percentage in male and female pregnancies.
Xiong, Li; Tan, Tuan Z.; Advani, Henna V.; Hua, Rui; Laureano-Asibal, Cecille; Soong, Richie; Biswas, Arijit; Nagarajan, Niranjan; Choolani, Mahesh
2017-01-01
Objective Cell-free DNA from maternal plasma can be used for non-invasive prenatal testing for aneuploidies and single gene disorders, and also has applications as a biomarker for monitoring high-risk pregnancies, such as those at risk of pre-eclampsia. On average, the fractional cell-free fetal DNA concentration in plasma is approximately 15%, but can vary from less than 4% to greater than 30%. Although quantification of cell-free fetal DNA is straightforward in the case of a male fetus, there is no universal fetal marker; in a female fetus measurement is more challenging. We have developed a panel of multiplexed insertion/deletion polymorphisms that can measure fetal fraction in all pregnancies in a simple, targeted sequencing reaction. Methods A multiplex panel of primers was designed for 35 indels plus a ZFX/ZFY amplicon. cfDNA was extracted from plasma from 157 pregnant women, and maternal genomic DNA was extracted for 20 of these samples for panel validation. Sixty-one samples from pregnancies with a male fetus were subjected to whole genome sequencing on the Ion Proton sequencing platform, and fetal fraction derived from Y chromosome counts was compared to fetal fraction measured using the indel panel. A total of 157 cell-free DNA samples were sequenced using the indel panel, and informativity was assessed, along with the proportion of fetal DNA. Results Using gDNA we optimised the indel panel, removing amplicons giving rise to PCR bias. Good correlation was found between fetal fraction using indels and using whole genome sequencing of the Y chromosome (Spearmans r = 0.69). A median of 12 indels were informative per sample. The indel panel was informative in 157/157 cases (mean fetal fraction 14.4% (±0.58%)). Conclusions Using our targeted next generation sequencing panel we can readily assess the fetal DNA percentage in male and female pregnancies. PMID:29084245
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies.
David, Maria Pamela C; Concepcion, Gisela P; Padlan, Eduardo A
2010-02-08
All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study are limited, and are consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. 
The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general.
A Simple View of Writing in Chinese
ERIC Educational Resources Information Center
Yeung, Pui-sze; Ho, Connie Suk-han; Chan, David Wai-ock; Chung, Kevin Kien-hoa
2017-01-01
This study examined the Chinese written composition development of elementary-grade students in relation to the simple view of writing. Measures of nonverbal reasoning ability, component skills of transcription (stroke sequence knowledge, word spelling, and handwriting fluency), oral language (definitional skill, oral narrative skills, and…
Epstein, F H; Mugler, J P; Brookeman, J R
1994-02-01
A number of pulse sequence techniques, including magnetization-prepared gradient echo (MP-GRE), segmented GRE, and hybrid RARE, employ a relatively large number of variable pulse sequence parameters and acquire the image data during a transient signal evolution. These sequences have recently been proposed and/or used for clinical applications in the brain, spine, liver, and coronary arteries. Thus, the need for a method of deriving optimal pulse sequence parameter values for this class of sequences now exists. Due to the complexity of these sequences, conventional optimization approaches, such as applying differential calculus to signal difference equations, are inadequate. We have developed a general framework for adapting the simulated annealing algorithm to pulse sequence parameter value optimization, and applied this framework to the specific case of optimizing the white matter-gray matter signal difference for a T1-weighted variable flip angle 3D MP-RAGE sequence. Using our algorithm, the values of 35 sequence parameters, including the magnetization-preparation RF pulse flip angle and delay time, 32 flip angles in the variable flip angle gradient-echo acquisition sequence, and the magnetization recovery time, were derived. Optimized 3D MP-RAGE achieved up to a 130% increase in white matter-gray matter signal difference compared with optimized 3D RF-spoiled FLASH with the same total acquisition time. The simulated annealing approach was effective at deriving optimal parameter values for a specific 3D MP-RAGE imaging objective, and may be useful for other imaging objectives and sequences in this general class.
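The general idea of simulated annealing for parameter-value optimization can be sketched as follows. This is a generic illustration, not the authors' framework: the objective here is a toy two-parameter function standing in for a signal-difference model, and the cooling schedule, step size, and function names are assumptions made for the example.

```python
import math
import random

def simulated_annealing(objective, start, step=0.5, t_start=1.0, t_end=1e-3,
                        cooling=0.95, iters_per_temp=50, seed=0):
    """Maximize `objective` over a list of real-valued parameters."""
    rng = random.Random(seed)
    current = list(start)
    current_val = objective(current)
    best, best_val = list(current), current_val
    t = t_start
    while t > t_end:
        for _ in range(iters_per_temp):
            # Perturb one randomly chosen parameter.
            cand = list(current)
            i = rng.randrange(len(cand))
            cand[i] += rng.uniform(-step, step)
            val = objective(cand)
            # Always accept improvements; accept worse moves with
            # probability exp(delta / T), which shrinks as T falls.
            if val >= current_val or rng.random() < math.exp((val - current_val) / t):
                current, current_val = cand, val
                if val > best_val:
                    best, best_val = list(cand), val
        t *= cooling  # geometric cooling schedule (assumed)
    return best, best_val

def toy_objective(p):
    """Hypothetical stand-in for a signal-difference model; maximum 0 at (1, -2)."""
    return -((p[0] - 1.0) ** 2) - ((p[1] + 2.0) ** 2)

best_params, best_value = simulated_annealing(toy_objective, [0.0, 0.0])
print(best_params, best_value)
```

Accepting occasional worse moves at high temperature is what lets the search leave local optima, which matters when many parameters interact during a transient signal evolution and calculus-based approaches become impractical.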
The testes transcriptome derived from the New World Screwworm, Cochliomyia hominivorax SRA
USDA-ARS's Scientific Manuscript database
In a collaboration with National Center for Genome Resources researchers, we sequenced and assembled the testes transcriptome derived from the Pacora, Panama, production plant strain J06 of the New World Screwworm, Cochliomyia hominivorax. This sequencing project produced 72,750,822 raw reads and th...
Simple tools for assembling and searching high-density picolitre pyrophosphate sequence data.
Parker, Nicolas J; Parker, Andrew G
2008-04-18
The advent of pyrophosphate sequencing makes large volumes of sequencing data available at a lower cost than previously possible. However, the short read lengths are difficult to assemble and the large dataset is difficult to handle. During the sequencing of a virus from the tsetse fly, Glossina pallidipes, we found the need for tools to search quickly a set of reads for near exact text matches. A set of tools is provided to search a large data set of pyrophosphate sequence reads under a "live" CD version of Linux on a standard PC that can be used by anyone without prior knowledge of Linux and without having to install a Linux setup on the computer. The tools permit short lengths of de novo assembly, checking of existing assembled sequences, selection and display of reads from the data set and gathering counts of sequences in the reads. Demonstrations are given of the use of the tools to help with checking an assembly against the fragment data set; investigating homopolymer lengths, repeat regions and polymorphisms; and resolving inserted bases caused by incomplete chain extension. The additional information contained in a pyrophosphate sequencing data set beyond a basic assembly is difficult to access due to a lack of tools. The set of simple tools presented here would allow anyone with basic computer skills and a standard PC to access this information.
Recursive sequences in first-year calculus
NASA Astrophysics Data System (ADS)
Krainer, Thomas
2016-02-01
This article provides ready-to-use supplementary material on recursive sequences for a second-semester calculus class. It equips first-year calculus students with a basic methodical procedure based on which they can conduct a rigorous convergence or divergence analysis of many simple recursive sequences on their own without the need to invoke inductive arguments as is typically required in calculus textbooks. The sequences that are accessible to this kind of analysis are predominantly (eventually) monotonic, but also certain recursive sequences that alternate around their limit point as they converge can be considered.
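As a small numerical companion to the kind of analysis described, the sketch below iterates one simple recursive sequence, a_{n+1} = sqrt(2 + a_n) with a_0 = 0. The choice of this particular sequence and the helper name `iterate` are assumptions for illustration and are not taken from the article. The sequence is increasing and bounded above by 2, and its limit L satisfies L = sqrt(2 + L), so L = 2.

```python
import math

def iterate(f, a0, n):
    """Return [a_0, a_1, ..., a_n] for the recursion a_{k+1} = f(a_k)."""
    terms = [a0]
    for _ in range(n):
        terms.append(f(terms[-1]))
    return terms

# Example recursion (assumed for illustration): a_{n+1} = sqrt(2 + a_n), a_0 = 0.
terms = iterate(lambda a: math.sqrt(2.0 + a), 0.0, 10)

for n, a in enumerate(terms):
    print(n, a)

# Numerical check of monotonicity and of the bound a_n < 2.
print(all(x < y for x, y in zip(terms, terms[1:])), max(terms) < 2.0)
```

A numerical table like this only suggests the behavior; the convergence argument itself still rests on monotonicity and boundedness.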
2010-01-01
Background Genetic markers and linkage mapping are basic prerequisites for marker-assisted selection and map-based cloning. In the case of the key grassland species Lolium spp., numerous mapping populations have been developed and characterised for various traits. Although some genetic linkage maps of these populations have been aligned with each other using publicly available DNA markers, the number of common markers among genetic maps is still low, limiting the ability to compare candidate gene and QTL locations across germplasm. Results A set of 204 expressed sequence tag (EST)-derived simple sequence repeat (SSR) markers has been assigned to map positions using eight different ryegrass mapping populations. Marker properties of a subset of 64 EST-SSRs were assessed in six to eight individuals of each mapping population and revealed 83% of the markers to be polymorphic in at least one population and an average number of alleles of 4.88. EST-SSR markers polymorphic in multiple populations served as anchor markers and allowed the construction of the first comprehensive consensus map for ryegrass. The integrated map was complemented with 97 SSRs from previously published linkage maps and finally contained 284 EST-derived and genomic SSR markers. The total map length was 742 centiMorgan (cM), ranging for individual chromosomes from 70 cM of linkage group (LG) 6 to 171 cM of LG 2. Conclusions The consensus linkage map for ryegrass based on eight mapping populations and constructed using a large set of publicly available Lolium EST-SSRs mapped for the first time together with previously mapped SSR markers will allow for consolidating existing mapping and QTL information in ryegrass. Map and markers presented here will prove to be an asset in the development for both molecular breeding of ryegrass as well as comparative genetics and genomics within grass species. PMID:20712870
NASA Technical Reports Server (NTRS)
Tanimoto, T.
1984-01-01
A simple modification of Gilbert's formula to account for slight lateral heterogeneity of the earth leads to a convenient formula to calculate synthetic long period seismograms. Partial derivatives are easily calculated, thus the formula is suitable for direct inversion of seismograms for lateral heterogeneity of the earth. Previously announced in STAR as N83-29893
Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.
2013-01-01
SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M
2013-01-01
SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
Genotyping variability of computationally categorized peach microsatellite markers
USDA-ARS's Scientific Manuscript database
Numerous expressed sequence tag (EST) simple sequence repeat (SSR) primers can be easily mined out. The obstacle to develop them into usable markers is how to optimally select downsized subsets of the primers for genotyping, which accordingly reduces amplification failure and monomorphism often occu...
ERIC Educational Resources Information Center
Camp, Dane R.
1991-01-01
After introducing the two-dimensional Koch curve, which is generated by simple recursions on an equilateral triangle, the process is extended to three dimensions with simple recursions on a regular tetrahedron. Included, for both fractal sequences, are iterative formulae, illustrations of the first several iterations, and a sample PASCAL program.…
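The iterative formulae for the two-dimensional case can be illustrated with the standard Koch construction on an equilateral triangle: each iteration replaces every segment with four segments one third as long. The sketch below is written in Python rather than the article's PASCAL, and the function name `koch_snowflake` is hypothetical; it covers only the planar curve, not the tetrahedral extension.

```python
from fractions import Fraction

def koch_snowflake(n, side=1):
    """Return (segment count, segment length, perimeter) after n iterations.

    Starts from an equilateral triangle with the given side length.
    """
    side = Fraction(side)
    segments = 3 * 4 ** n          # every segment becomes four
    seg_len = side / 3 ** n        # each new segment is one third as long
    return segments, seg_len, segments * seg_len

for n in range(4):
    print(n, koch_snowflake(n))
```

The perimeter grows by a factor of 4/3 per iteration, so it increases without bound even though the curve stays inside a bounded region.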
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats.
Anwar, Tamanna; Khan, Asad U
2006-02-20
Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found as longer tracts (> 6 repeating units). Most computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to report the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search for repeats of any length and produce output with their exact position on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com.
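A minimal sketch of reporting SSR positions, assuming a regular-expression search with a back-reference. This is not the SSRscanner implementation (which is written in PERL); the function name `find_ssrs`, its parameters, and the test sequence are hypothetical.

```python
import re

def find_ssrs(seq, min_motif=1, max_motif=3, min_repeats=4):
    """Return (start, end, motif, repeats) for each SSR, using 1-based inclusive positions."""
    # The lazy quantifier prefers the shortest motif, so AAAAAA is
    # reported as motif A rather than motif AA.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        repeats = len(m.group(0)) // len(motif)
        hits.append((m.start() + 1, m.end(), motif, repeats))
    return hits

# Assumed test sequence: an (AT)5 tract and a (CAG)4 tract with flanking bases.
sequence = "GGC" + "AT" * 5 + "GGC" + "CAG" * 4 + "TT"
for start, end, motif, repeats in find_ssrs(sequence):
    print(start, end, motif, repeats)
```

Counting the returned tuples per motif gives a simple frequency table, and the start and end values give the exact location of each tract in the input sequence.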
USDA-ARS's Scientific Manuscript database
To confirm a hybrid swarm population of Pinus densiflora × P. sylvestris in Jilin, China and to study whether shoot apex morphology of 4-year old seedlings can be correlated with the sequence of a chloroplast DNA simple sequence repeat marker (cpDNA SSR), needles and seeds from P. densiflora, P. syl...
Coal-oil coprocessing at HTI - development and improvement of the technology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stalzer, R.H.; Lee, L.K.; Hu, J.
1995-12-31
Co-Processing refers to the combined processing of coal and petroleum-derived heavy oil feedstocks. The coal feedstocks used are those typically utilized in direct coal liquefaction: bituminous, subbituminous, and lignites. Petroleum-derived oil is typically a petroleum residuum, containing at least 70 W% material boiling above 525°C. The combined coal and oil feedstocks are processed simultaneously with the dual objective of liquefying the coal and upgrading the petroleum-derived residuum to lower boiling (<525°C) premium products. HTI's investigation of the Co-Processing technology has included work performed in laboratory, bench and PDU scale operations. The concept of co-processing technology is quite simple and a natural outgrowth of the work done with direct coal liquefaction. A 36 month program to evaluate new process concepts in coal-oil coprocessing at the bench-scale was begun in September 1994 and runs until September 1997. Included in this continuous bench-scale program are provisions to examine new improvements in areas such as: interstage product separation, feedstock concentrations (coal/oil), improved supported/dispersed catalysts, optimization of reactor temperature sequencing, and in-line hydrotreating. This does not preclude other ideas from DOE contracts and other sources that can lead to improved product quality and economics. This research work has led to important findings which significantly increased liquid yields, improved product quality, and improved process economics.
Production of SV40-derived vectors.
Strayer, David S; Mitchell, Christine; Maier, Dawn A; Nichols, Carmen N
2010-06-01
Recombinant simian virus 40 (rSV40)-derived vectors are particularly useful for gene delivery to bone marrow progenitor cells and their differentiated derivatives, certain types of epithelial cells (e.g., hepatocytes), and central nervous system neurons and microglia. They integrate rapidly into cellular DNA to provide long-term gene expression in vitro and in vivo in both resting and dividing cells. Here we describe a protocol for production and purification of these vectors. These procedures require only packaging cells (e.g., COS-7) and circular vector genome DNA. Amplification involves repeated infection of packaging cells with vector produced by transfection. Cotransfection is not required in any step. Viruses are purified by centrifugation using discontinuous sucrose or cesium chloride (CsCl) gradients and resulting vectors are replication-incompetent and contain no detectable wild-type SV40 revertants. These approaches are simple, give reproducible results, and may be used to generate vectors that are deleted only for large T antigen (Tag), or for all SV40-coding sequences capable of carrying up to 5 kb of foreign DNA. These vectors are best applied to long-term expression of proteins normally encoded by mammalian cells or by viruses that infect mammalian cells, or of untranslated RNAs (e.g., RNA interference). The preparative approaches described facilitate application of these vectors and allow almost any laboratory to exploit their strengths for diverse gene delivery applications.
A proposed technique for vehicle tracking, direction, and speed determination
NASA Astrophysics Data System (ADS)
Fisher, Paul S.; Angaye, Cleopas O.; Fisher, Howard P.
2004-12-01
A technique for recognition of vehicles in terms of direction, distance, and rate of change is presented. This represents very early work on this problem with significant hurdles still to be addressed. These are discussed in the paper. However, preliminary results also show promise for this technique for use in security and defense environments where the penetration of a perimeter is of concern. The material described herein indicates a process whereby the protection of a barrier could be augmented by computers and installed cameras assisting the individuals charged with this responsibility. The technique we employ is called Finite Inductive Sequences (FI) and is proposed as a means for eliminating data requiring storage and recognition where conventional mathematical models don't eliminate enough and statistical models eliminate too much. FI is a simple idea and is based upon a symbol push-out technique that allows the order (inductive base) of the model to be set to an a priori value for all derived rules. The rules are obtained from exemplar data sets, and are derived by a technique called Factoring, yielding a table of rules called a Ruling. These rules can then be used in pattern recognition applications such as described in this paper.
Structural alphabets derived from attractors in conformational space
2010-01-01
Background The hierarchical and partially redundant nature of protein structures justifies the definition of frequently occurring conformations of short fragments as 'states'. Collections of selected representatives for these states define Structural Alphabets, describing the most typical local conformations within protein structures. These alphabets form a bridge between the string-oriented methods of sequence analysis and the coordinate-oriented methods of protein structure analysis. Results A Structural Alphabet has been derived by clustering all four-residue fragments of a high-resolution subset of the protein data bank and extracting the high-density states as representative conformational states. Each fragment is uniquely defined by a set of three independent angles corresponding to its degrees of freedom, capturing in simple and intuitive terms the properties of the conformational space. The fragments of the Structural Alphabet are equivalent to the conformational attractors and therefore yield a most informative encoding of proteins. Proteins can be reconstructed within the experimental uncertainty in structure determination and ensembles of structures can be encoded with accuracy and robustness. Conclusions The density-based Structural Alphabet provides a novel tool to describe local conformations and it is specifically suitable for application in studies of protein dynamics. PMID:20170534
Clément, Nathalie; Avalosse, Bernard; El Bakkouri, Karim; Velu, Thierry; Brandenburger, Annick
2001-01-01
The production of wild-type-free stocks of recombinant parvovirus minute virus of mice [MVM(p)] is difficult due to the presence of homologous sequences in vector and helper genomes that cannot easily be eliminated from the overlapping coding sequences. We have therefore cloned and sequenced spontaneously occurring defective particles of MVM(p) with very small genomes to identify the minimal cis-acting sequences required for DNA amplification and virus production. One of them has lost all capsid-coding sequences but is still able to replicate in permissive cells when nonstructural proteins are provided in trans by a helper plasmid. Vectors derived from this particle produce stocks with no detectable wild-type MVM after cotransfection with new, matched, helper plasmids that present no homology downstream from the transgene. PMID:11152501
Simple formula for the surface area of the body and a simple model for anthropometry.
Reading, Bruce D; Freeman, Brian
2005-03-01
The body surface area (BSA) of any adult, when derived from the arithmetic mean of the different values calculated from four independent accepted formulae, can be expressed accurately in Systeme International d'Unites (SI) units by the simple equation BSA = (1/6)(WH)^0.5, where W is body weight in kg, H is body height in m, and BSA is in m^2. This formula, which is derived in part by modeling the body as a simple solid of revolution or a prolate spheroid (i.e., a stretched ellipsoid of revolution), gives students, teachers, and clinicians a simple rule for the rapid estimation of surface area using rational units. The formula was tested independently for human subjects by using it to predict body volume and then comparing this prediction against the actual volume measured by Archimedes' principle. Copyright 2005 Wiley-Liss, Inc.
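The formula above, read as BSA equal to one sixth of the square root of weight times height, can be applied directly. The sketch below is a straightforward transcription; the function name `body_surface_area` and the example values (70 kg, 1.75 m) are assumptions for illustration.

```python
import math

def body_surface_area(weight_kg, height_m):
    """Estimate adult body surface area in m^2 as sqrt(W * H) / 6.

    weight_kg: body weight in kilograms; height_m: body height in metres.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return math.sqrt(weight_kg * height_m) / 6.0

# Example (assumed values): a 70 kg adult who is 1.75 m tall.
print(round(body_surface_area(70.0, 1.75), 2))  # 1.84
```

Note that height is in metres, not centimetres; passing 175 instead of 1.75 would inflate the estimate by a factor of ten.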
Nagle, Padraic S; McKeever, Caitriona; Rodriguez, Fernando; Nguyen, Binh; Wilson, W David; Rozas, Isabel
2014-09-25
In this paper we report the design and biophysical evaluation of novel rigid-core symmetric and asymmetric dicationic DNA binders containing 9H-fluorene and 9,10-dihydroanthracene cores as well as the synthesis of one of these fluorene derivatives. First, the affinity of these compounds and of flexible core derivatives toward particular DNA sequences was evaluated by means of surface plasmon resonance and thermal denaturation experiments, finding that the position of the cations significantly influences the binding strength. Then their affinity and mode of binding were further studied by performing circular dichroism and UV studies, and the results obtained were rationalized by means of DFT calculations. We found that the fluorene derivatives prepared have the ability to bind to the minor groove of certain DNA sequences and intercalate into others, whereas the dihydroanthracene compounds bind via intercalation to all the DNA sequences studied here.
SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop
Schumacher, André; Pireddu, Luca; Niemenmaa, Matti; Kallio, Aleksi; Korpelainen, Eija; Zanetti, Gianluigi; Heljanko, Keijo
2014-01-01
Summary: Hadoop MapReduce-based approaches have become increasingly popular due to their scalability in processing large sequencing datasets. However, as these methods typically require in-depth expertise in Hadoop and Java, they are still out of reach of many bioinformaticians. To solve this problem, we have created SeqPig, a library and a collection of tools to manipulate, analyze and query sequencing datasets in a scalable and simple manner. SeqPig scripts use the Hadoop-based distributed scripting engine Apache Pig, which automatically parallelizes and distributes data processing tasks. We demonstrate SeqPig’s scalability over many computing nodes and illustrate its use with example scripts. Availability and Implementation: Available under the open source MIT license at http://sourceforge.net/projects/seqpig/ Contact: andre.schumacher@yahoo.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24149054
Stability-Derivative Determination from Flight Data
NASA Technical Reports Server (NTRS)
Holowicz, Chester H.; Holleman, Euclid C.
1958-01-01
A comprehensive discussion of the various factors affecting the determination of stability and control derivatives from flight data is presented based on the experience of the NASA High-Speed Flight Station. Factors relating to test techniques, determination of mass characteristics, instrumentation, and methods of analysis are discussed. For most longitudinal-stability-derivative analyses, simple equations utilizing period and damping have been found to be as satisfactory as more comprehensive methods. The graphical time-vector method has been the basis of lateral-derivative analysis, although simple approximate methods can be useful if applied with caution. Control effectiveness has generally been obtained by relating the peak acceleration to the rapid control input, and consideration must be given to aerodynamic contributions if reasonable accuracy is to be realized. Because of the many factors involved in the determination of stability derivatives, it is believed that the primary stability and control derivatives are probably accurate to within 10 to 25 percent, depending upon the specific derivative. Static-stability derivatives at low angle of attack show the greatest accuracy.
A Practical Workshop for Generating Simple DNA Fingerprints of Plants
ERIC Educational Resources Information Center
Rouziere, A.-S.; Redman, J. E.
2011-01-01
Gel electrophoresis DNA fingerprints offer a graphical and visually appealing illumination of the similarities and differences between DNA sequences of different species and individuals. A polymerase chain reaction (PCR) and restriction digest protocol was designed to give high-school students the opportunity to generate simple fingerprints of…
Multiplexed microsatellite recovery using massively parallel sequencing
T.N. Jennings; B.J. Knaus; T.D. Mullins; S.M. Haig; R.C. Cronn
2011-01-01
Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of...
Assessing Date Palm Genetic Diversity Using Different Molecular Markers.
Atia, Mohamed A M; Sakr, Mahmoud M; Adawy, Sami S
2017-01-01
Molecular marker technologies which rely on DNA analysis provide powerful tools to assess biodiversity at different levels, i.e., among and within species. A range of different molecular marker techniques have been developed and extensively applied for detecting variability in date palm at the DNA level. Recently, the employment of gene-targeting molecular marker approaches to study biodiversity and genetic variations in many plant species has increased the attention of researchers interested in date palm to carry out phylogenetic studies using these novel marker systems. Molecular markers are good indicators of genetic distances among accessions, because DNA-based markers are neutral in the face of selection. Here we describe the employment of multidisciplinary molecular marker approaches: amplified fragment length polymorphism (AFLP), start codon targeted (SCoT) polymorphism, conserved DNA-derived polymorphism (CDDP), intron-targeted amplified polymorphism (ITAP), simple sequence repeats (SSR), and random amplified polymorphic DNA (RAPD) to assess genetic diversity in date palm.
Telomerase Mechanism of Telomere Synthesis
Wu, R. Alex; Upton, Heather E.; Vogan, Jacob M.; Collins, Kathleen
2017-01-01
Telomerase is the essential reverse transcriptase required for linear chromosome maintenance in most eukaryotes. Telomerase supplements the tandem array of simple-sequence repeats at chromosome ends to compensate for the DNA erosion inherent in genome replication. The template for telomerase reverse transcriptase is within the RNA subunit of the ribonucleoprotein complex, which in cells contains additional telomerase holoenzyme proteins that assemble the active ribonucleoprotein and promote its function at telomeres. Telomerase is distinct among polymerases in its reiterative reuse of an internal template. The template is precisely defined, processively copied, and regenerated by release of single-stranded product DNA. New specificities of nucleic acid handling that underlie the catalytic cycle of repeat synthesis derive from both active site specialization and new motif elaborations in protein and RNA subunits. Studies of telomerase provide unique insights into cellular requirements for genome stability, tissue renewal, and tumorigenesis as well as new perspectives on dynamic ribonucleoprotein machines. PMID:28141967
Estimating Position of Mobile Robots From Omnidirectional Vision Using an Adaptive Algorithm.
Li, Luyang; Liu, Yun-Hui; Wang, Kai; Fang, Mu
2015-08-01
This paper presents a novel and simple adaptive algorithm for estimating the position of a mobile robot with high accuracy in an unknown and unstructured environment by fusing images of an omnidirectional vision system with measurements of odometry and inertial sensors. Based on a new derivation where the omnidirectional projection can be linearly parameterized by the positions of the robot and natural feature points, we propose a novel adaptive algorithm, which is similar to the Slotine-Li algorithm in model-based adaptive control, to estimate the robot's position by using the tracked feature points in image sequence, the robot's velocity, and orientation angles measured by odometry and inertial sensors. It is proved that the adaptive algorithm leads to global exponential convergence of the position estimation errors to zero. Simulations and real-world experiments are performed to demonstrate the performance of the proposed algorithm.
Modification of Antibody Function by Mutagenesis.
Dasch, James R; Dasch, Amy L
2017-09-01
The ability to "fine-tune" recombinant antibodies by mutagenesis separates recombinant antibodies from hybridoma-derived antibodies because the latter are locked with respect to their properties. Recombinant antibodies can be modified to suit the application: Changes in isotype, format (e.g., scFv, Fab, bispecific antibodies), and specificity can be made once the heavy- and light-chain sequences are available. After immunoglobulin heavy and light chains for a particular antibody have been cloned, the binding site, namely the complementarity determining regions (CDR), can be manipulated by mutagenesis to obtain antibody variants with improved properties. The method described here is relatively simple, uses commercially available reagents, and is effective. Using the pComb3H vector, a commercial mutagenesis kit, PfuTurbo polymerase (Agilent), and two mutagenic primers, a library of phage with mutagenized heavy and light CDR3 can be obtained. © 2017 Cold Spring Harbor Laboratory Press.
Pre-main-sequence isochrones - II. Revising star and planet formation time-scales
NASA Astrophysics Data System (ADS)
Bell, Cameron P. M.; Naylor, Tim; Mayne, N. J.; Jeffries, R. D.; Littlefair, S. P.
2013-09-01
We have derived ages for 13 young (<30 Myr) star-forming regions and find that they are up to a factor of 2 older than the ages typically adopted in the literature. This result has wide-ranging implications, including that circumstellar discs survive longer (≃10-12 Myr) and that the average Class I lifetime is greater (≃1 Myr) than currently believed. For each star-forming region, we derived two ages from colour-magnitude diagrams. First, we fitted models of the evolution between the zero-age main sequence and terminal-age main sequence to derive a homogeneous set of main-sequence ages, distances and reddenings with statistically meaningful uncertainties. Our second age for each star-forming region was derived by fitting pre-main-sequence stars to new semi-empirical model isochrones. For the first time (for a set of clusters younger than 50 Myr), we find broad agreement between these two ages, and since these are derived from two distinct mass regimes that rely on different aspects of stellar physics, it gives us confidence in the new age scale. This agreement is largely due to our adoption of empirical colour-Teff relations and bolometric corrections for pre-main-sequence stars cooler than 4000 K. The revised ages for the star-forming regions in our sample are: ~2 Myr for NGC 6611 (Eagle Nebula; M 16), IC 5146 (Cocoon Nebula), NGC 6530 (Lagoon Nebula; M 8) and NGC 2244 (Rosette Nebula); ~6 Myr for σ Ori, Cep OB3b and IC 348; ≃10 Myr for λ Ori (Collinder 69); ≃11 Myr for NGC 2169; ≃12 Myr for NGC 2362; ≃13 Myr for NGC 7160; ≃14 Myr for χ Per (NGC 884); and ≃20 Myr for NGC 1960 (M 36).
Fakoli, Lawrence S.; Bolay, Kpehe; Bolay, Fatorma K.; Diclaro, Joseph W.; Brackney, Doug E.; Stenglein, Mark D.; Ebel, Gregory D.
2018-01-01
Background Novel surveillance strategies are needed to detect the rapid and continuous emergence of infectious disease agents. Ideally, new sampling strategies should be simple to implement, technologically uncomplicated, and applicable to areas where emergence events are known to occur. To this end, xenosurveillance is a technique that makes use of blood collected by hematophagous arthropods to monitor and identify vertebrate pathogens. Mosquitoes are largely ubiquitous animals that often exist in sizable populations. As well, many domestic or peridomestic species of mosquitoes will preferentially take blood-meals from humans, making them a unique and largely untapped reservoir to collect human blood. Methodology/Principal findings We sought to take advantage of this phenomenon by systematically collecting blood-fed mosquitoes during a field trial in Northern Liberia to determine whether pathogen sequences from blood-engorged mosquitoes accurately mirror those obtained directly from humans. Specifically, blood was collected from humans via finger-stick and by aspirating blood-fed mosquitoes from the inside of houses. Shotgun metagenomic sequencing of RNA and DNA derived from these specimens was performed to detect pathogen sequences. Samples obtained from xenosurveillance and from finger-stick blood collection produced a similar number and quality of reads aligning to two human viruses, GB virus C and hepatitis B virus. Conclusions/Significance This study represents the first systematic comparison between xenosurveillance and more traditional sampling methodologies, while also demonstrating the viability of xenosurveillance as a tool to sample human blood for circulating pathogens. PMID:29561834
Fauver, Joseph R; Weger-Lucarelli, James; Fakoli, Lawrence S; Bolay, Kpehe; Bolay, Fatorma K; Diclaro, Joseph W; Brackney, Doug E; Foy, Brian D; Stenglein, Mark D; Ebel, Gregory D
2018-03-01
Novel surveillance strategies are needed to detect the rapid and continuous emergence of infectious disease agents. Ideally, new sampling strategies should be simple to implement, technologically uncomplicated, and applicable to areas where emergence events are known to occur. To this end, xenosurveillance is a technique that makes use of blood collected by hematophagous arthropods to monitor and identify vertebrate pathogens. Mosquitoes are largely ubiquitous animals that often exist in sizable populations. As well, many domestic or peridomestic species of mosquitoes will preferentially take blood-meals from humans, making them a unique and largely untapped reservoir to collect human blood. We sought to take advantage of this phenomenon by systematically collecting blood-fed mosquitoes during a field trial in Northern Liberia to determine whether pathogen sequences from blood-engorged mosquitoes accurately mirror those obtained directly from humans. Specifically, blood was collected from humans via finger-stick and by aspirating blood-fed mosquitoes from the inside of houses. Shotgun metagenomic sequencing of RNA and DNA derived from these specimens was performed to detect pathogen sequences. Samples obtained from xenosurveillance and from finger-stick blood collection produced a similar number and quality of reads aligning to two human viruses, GB virus C and hepatitis B virus. This study represents the first systematic comparison between xenosurveillance and more traditional sampling methodologies, while also demonstrating the viability of xenosurveillance as a tool to sample human blood for circulating pathogens.
ERIC Educational Resources Information Center
Smyth, Sinead; Barnes-Holmes, Dermot; Forsyth, John P.
2006-01-01
Two experiments investigated the derived transfer of functions through equivalence relations established using a stimulus pairing observation procedure. In Experiment 1, participants were trained on a simple discrimination (A1+/A2-) and then a stimulus pairing observation procedure was used to establish 4 stimulus pairings (A1-B1, A2-B2, B1-C1,…
Series expansions of rotating two and three dimensional sound fields.
Poletti, M A
2010-12-01
The cylindrical and spherical harmonic expansions of oscillating sound fields rotating at a constant rate are derived. These expansions are a generalized form of the stationary sound field expansions. The derivations are based on the representation of interior and exterior sound fields using the simple source approach and determination of the simple source solutions with uniform rotation. Numerical simulations of rotating sound fields are presented to verify the theory.
HOMFLYPT polynomial is the best quantifier for topological cascades of vortex knots
NASA Astrophysics Data System (ADS)
Ricca, Renzo L.; Liu, Xin
2018-02-01
In this paper we derive and compare numerical sequences obtained by adapted polynomials such as HOMFLYPT, Jones and Alexander-Conway for the topological cascade of vortex torus knots and links that progressively untie by a single reconnection event at a time. Two cases are considered: the alternate sequence of knots and co-oriented links (with positive crossings) and the sequence of two-component links with oppositely oriented components (negative crossings). New recurrence equations are derived and sequences of numerical values are computed. In all cases the adapted HOMFLYPT polynomial proves to be the best quantifier for the topological cascade of torus knots and links.
From genomics to functional markers in the era of next-generation sequencing.
Salgotra, R K; Gupta, B B; Stewart, C N
2014-03-01
The availability of complete genome sequences, along with other genomic resources for Arabidopsis, rice, pigeon pea, soybean and other crops, has revolutionized our understanding of the genetic make-up of plants. Next-generation DNA sequencing (NGS) has facilitated single nucleotide polymorphism discovery in plants. Functionally characterized sequences can be identified and functional markers (FMs) for important traits can be developed with ever-increasing ease. FMs are derived from sequence polymorphisms found in allelic variants of a functional gene. Linkage disequilibrium-based association mapping and homologous recombinants have been developed for identification of "perfect" markers for use in crop improvement practices. Compared with many other molecular markers, FMs derived from functionally characterized gene sequences using NGS techniques provide opportunities to develop, at a fast pace, high-yielding plant genotypes resistant to various stresses.
Integer sequence discovery from small graphs
Hoppe, Travis; Petrone, Anna
2015-01-01
We have exhaustively enumerated all simple, connected graphs of a finite order and have computed a selection of invariants over this set. Integer sequences were constructed from these invariants and checked against the Online Encyclopedia of Integer Sequences (OEIS). 141 new sequences were added and six sequences were extended. From the graph database, we were able to programmatically suggest relationships among the invariants. It will be shown that we can readily visualize any sequence of graphs meeting given criteria. The code has been released as an open-source framework for further analysis and the database was constructed to be extensible to invariants not considered in this work. PMID:27034526
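The idea of turning an exhaustive enumeration into an integer sequence can be illustrated at very small scale. The sketch below is not the authors' framework: it is a brute-force illustration that counts simple connected graphs up to isomorphism by trying every vertex relabeling, which is only practical for about five vertices. All function names are hypothetical.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search from vertex 0; True if every vertex is reached."""
    adjacency = {v: set() for v in range(n)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adjacency[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def canonical_form(n, edges):
    """Smallest sorted edge list over all vertex relabelings (brute force)."""
    return min(
        tuple(sorted(tuple(sorted((p[a], p[b]))) for a, b in edges))
        for p in permutations(range(n))
    )


def count_connected_graphs(n):
    """Number of simple connected graphs on n vertices, up to isomorphism."""
    pairs = list(combinations(range(n), 2))
    classes = set()
    for k in range(len(pairs) + 1):
        for edges in combinations(pairs, k):
            if is_connected(n, edges):
                classes.add(canonical_form(n, edges))
    return len(classes)


if __name__ == "__main__":
    # The resulting sequence could then be looked up in the OEIS.
    print([count_connected_graphs(n) for n in range(1, 6)])  # [1, 1, 2, 6, 21]
```

Replacing the final count with any other invariant (for example, the number of graphs with a given chromatic number) would produce a different candidate sequence in the same way; a practical implementation would use a proper canonical-labeling tool instead of trying all permutations.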
Imai, Kazuo; Tarumoto, Norihito; Misawa, Kazuhisa; Runtuwene, Lucky Ronald; Sakai, Jun; Hayashida, Kyoko; Eshita, Yuki; Maeda, Ryuichiro; Tuda, Josef; Murakami, Takashi; Maesaki, Shigefumi; Suzuki, Yutaka; Yamagishi, Junya; Maeda, Takuya
2017-09-13
A simple and accurate molecular diagnostic method for malaria is urgently needed due to the limitations of conventional microscopic examination. In this study, we demonstrate a new diagnostic procedure for human malaria using loop mediated isothermal amplification (LAMP) and the MinION™ nanopore sequencer. We generated specific LAMP primers targeting the 18S-rRNA gene of all five human Plasmodium species including two P. ovale subspecies (P. falciparum, P. vivax, P. ovale wallikeri, P. ovale curtisi, P. knowlesi and P. malariae) and examined human blood samples collected from 63 malaria patients in Indonesia. Additionally, we performed amplicon sequencing of our LAMP products using MinION™ nanopore sequencer to identify each Plasmodium species. Our LAMP method allowed amplification of all targeted 18S-rRNA genes of the reference plasmids with detection limits of 10-100 copies per reaction. Among the 63 clinical samples, 54 and 55 samples were positive by nested PCR and our LAMP method, respectively. Identification of the Plasmodium species by LAMP amplicon sequencing analysis using the MinION™ was consistent with the reference plasmid sequences and the results of nested PCR. Our diagnostic method combined with LAMP and MinION™ could become a simple and accurate tool for the identification of human Plasmodium species, even in resource-limited situations.
Advanced data assimilation in strongly nonlinear dynamical systems
NASA Technical Reports Server (NTRS)
Miller, Robert N.; Ghil, Michael; Gauthiez, Francois
1994-01-01
Advanced data assimilation methods are applied to simple but highly nonlinear problems. The dynamical systems studied here are the stochastically forced double well and the Lorenz model. In both systems, linear approximation of the dynamics about the critical points near which regime transitions occur is not always sufficient to track their occurrence or nonoccurrence. Straightforward application of the extended Kalman filter yields mixed results. The ability of the extended Kalman filter to track transitions of the double-well system from one stable critical point to the other depends on the frequency and accuracy of the observations relative to the mean-square amplitude of the stochastic forcing. The ability of the filter to track the chaotic trajectories of the Lorenz model is limited to short times, as is the ability of strong-constraint variational methods. Examples are given to illustrate the difficulties involved, and qualitative explanations for these difficulties are provided. Three generalizations of the extended Kalman filter are described. The first is based on inspection of the innovation sequence, that is, the successive differences between observations and forecasts; it works very well for the double-well problem. The second, an extension to fourth-order moments, yields excellent results for the Lorenz model but will be unwieldy when applied to models with high-dimensional state spaces. A third, more practical method--based on an empirical statistical model derived from a Monte Carlo simulation--is formulated, and shown to work very well. Weak-constraint methods can be made to perform satisfactorily in the context of these simple models, but such methods do not seem to generalize easily to practical models of the atmosphere and ocean. In particular, it is shown that the equations derived in the weak variational formulation are difficult to solve conveniently for large systems.
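The innovation sequence mentioned above (observation minus forecast) falls out of a single extended Kalman filter cycle. The following is a minimal scalar sketch under loudly stated assumptions: the double-well drift f(x) = 4x - 4x^3, the forward-Euler forecast step, direct observation of the state, and all parameter values are illustrative choices and are not taken from the abstract.

```python
def drift(x):
    # Assumed double-well drift with stable points at x = -1 and x = +1.
    return 4.0 * x - 4.0 * x ** 3


def drift_derivative(x):
    # Derivative of the assumed drift, used to linearize the dynamics.
    return 4.0 - 12.0 * x ** 2


def ekf_step(x, p, y, dt, q, r):
    """One forecast/analysis cycle of a scalar extended Kalman filter.

    x, p : previous state estimate and its error variance
    y    : new observation of the state (observation operator is identity)
    dt   : time step of the forward-Euler forecast
    q    : variance rate of the stochastic forcing
    r    : observation error variance
    Returns the analysis state, analysis variance, and the innovation.
    """
    # Forecast: propagate the state and the linearized error variance.
    x_forecast = x + dt * drift(x)
    f = 1.0 + dt * drift_derivative(x)
    p_forecast = f * p * f + q * dt

    # Innovation: difference between the observation and the forecast.
    innovation = y - x_forecast

    # Analysis: blend forecast and observation with the Kalman gain.
    gain = p_forecast / (p_forecast + r)
    x_analysis = x_forecast + gain * innovation
    p_analysis = (1.0 - gain) * p_forecast
    return x_analysis, p_analysis, innovation


if __name__ == "__main__":
    x, p = 1.0, 0.1
    for y in [1.1, 0.4, -0.6, -0.9]:  # made-up observations crossing wells
        x, p, d = ekf_step(x, p, y, dt=0.01, q=0.1, r=0.05)
        print(f"innovation={d:+.3f}  analysis={x:+.3f}  variance={p:.4f}")
```

A run of unusually large innovations of one sign is the kind of signal that an innovation-based generalization could monitor to detect a missed transition between wells.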
GASP: Gapped Ancestral Sequence Prediction for proteins
Edwards, Richard J; Shields, Denis C
2004-01-01
Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199
A Fast Method of Deriving the Kirchhoff Formula for Moving Surfaces
NASA Technical Reports Server (NTRS)
Farassat, F.; Posey, Joe W.
2007-01-01
The Kirchhoff formula for a moving surface is very useful in many wave propagation problems, particularly in the prediction of noise from rotating machinery. Several publications in the last two decades have presented derivations of the Kirchhoff formula for moving surfaces in both time and frequency domains. Here we present a method originally developed by Farassat and Myers in time domain that is both simple and direct. It is based on generalized function theory and the useful concept of imbedding the problem in the unbounded three-dimensional space. We derive an inhomogeneous wave equation with the source terms that involve Dirac delta functions with their supports on the moving data surface. This wave equation is then solved using the simple free space Green's function of the wave equation resulting in the Kirchhoff formula. The algebraic manipulations are minimal and simple. We do not need the Green's theorem in four dimensions and there is no ambiguity in the interpretation of any terms in the final formulas. Furthermore, this method also gives the simplest derivation of the classical Kirchhoff formula which has a fairly lengthy derivation in physics and applied mathematics books. The Farassat-Myers method can be used easily in frequency domain.
Borisjuk, N; Chu, P; Gutierrez, R; Zhang, H; Acosta, K; Friesen, N; Sree, K S; Garcia, C; Appenroth, K J; Lam, E
2015-01-01
Lemnaceae, commonly called duckweeds, comprise a diverse group of floating aquatic plants that have previously been classified into 37 species based on morphological and physiological criteria. In addition to their unique evolutionary position among angiosperms and their applications in biomonitoring, the potential of duckweeds as a novel sustainable crop for fuel and feed has recently increased interest in the study of their biodiversity and systematics. However, due to their small size and abbreviated structure, accurate typing of duckweeds based on morphology can be challenging. In the past decade, attempts to employ molecular barcoding techniques for species assignment have produced promising results; however, they have yet to be codified into a simple and quantitative protocol. A study that compiles and compares the barcode sequences within all known species of this family would help to establish the fidelity and limits of this DNA-based approach. In this work, we compared the level of conservation between over 100 strains of duckweed for two intergenic barcode sequences derived from the plastid genome. By using over 300 sequences publicly available in the NCBI database, we determined the utility of each of these two barcodes for duckweed species identification. Through sequencing of these barcodes from additional accessions, 30 of the 37 known species of duckweed could be identified with varying levels of confidence using this approach. From our analyses using this reference dataset, we also confirmed two instances where mis-assignment of species has likely occurred. Potential strategies for further improving the scope of this technology are discussed. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
USDA-ARS?s Scientific Manuscript database
Simple sequence repeat technology based on expressed sequence tag (EST-SSR) is a useful genomic tool for genome mapping, characterizing plant species relationships, elucidating genome evolution, and tracing genes on alien chromosome segments. EST-SSR primers developed from three perennial diploid T...
Gorrell, Jamieson C; Boutin, Stan; Raveh, Shirley; Neuhaus, Peter; Côté, Steeve D; Coltman, David W
2012-09-01
We determined the sequence of the male-specific minor histocompatibility complex antigen (Smcy) from the Y chromosome of seven squirrel species (Sciuridae, Rodentia). Based on conserved regions inside the Smcy intron sequence, we designed PCR primers for sex determination in these species that can be co-amplified with nuclear loci as controls. PCR co-amplification yields two products for males and one for females that are easily visualized as bands by agarose gel electrophoresis. Our method provides simple and reliable sex determination across a wide range of squirrel species. © 2012 Blackwell Publishing Ltd.
Prince, Linda M
2015-01-01
Inter-simple sequence repeat PCR (ISSR-PCR) is a fast, inexpensive genotyping technique based on length variation in the regions between microsatellites. The method requires no species-specific prior knowledge of microsatellite location or composition. Very small amounts of DNA are required, making this method ideal for organisms of conservation concern, or where the quantity of DNA is extremely limited due to organism size. ISSR-PCR can be highly reproducible but requires careful attention to detail. Optimization of DNA extraction, fragment amplification, and normalization of fragment peak heights during fluorescent detection are critical steps to minimizing the downstream time spent verifying and scoring the data.
Wang, Q Z; Huang, M; Downie, S R; Chen, Z X
2016-05-23
Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
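The percentages and the gene-flow value reported above can be cross-checked with a few lines of arithmetic. This is a sketch only: the abstract does not state which estimator was used for Nm, so the commonly used relation Nm = 0.5(1 - GST)/GST is an assumption here, and the function names are hypothetical.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def gene_flow_from_gst(gst):
    """Assumed estimator: Nm = 0.5 * (1 - GST) / GST."""
    return 0.5 * (1.0 - gst) / gst


if __name__ == "__main__":
    print(percent(133, 5548))  # SSR-containing ESTs among all ESTs
    print(percent(56, 133))    # hexanucleotide motifs
    print(percent(50, 133))    # trinucleotide motifs
    print(percent(7, 24))      # polymorphic primer pairs
    print(gene_flow_from_gst(0.25))
```

With the rounded GST = 0.2500 the assumed estimator gives 1.5, close to the reported Nm = 1.4996; the small gap would be consistent with rounding of GST, though that is an inference and not something the abstract states.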
Li, Hui; Li, Defang; Chen, Anguo; Tang, Huijuan; Li, Jianjun; Huang, Siqi
2016-01-01
Kenaf (Hibiscus cannabinus L.) is an economically important natural fiber crop grown worldwide. However, only 20 expressed sequence tags (ESTs) for kenaf are available in public databases. The aim of this study was to develop large-scale simple sequence repeat (SSR) markers to lay a solid foundation for the construction of genetic linkage maps and marker-assisted breeding in kenaf. We used Illumina paired-end sequencing technology to generate new EST-simple sequences and MISA software to mine SSR markers. We identified 71,318 unigenes with an average length of 1143 nt and annotated these unigenes using four different protein databases. Overall, 9324 complementary pairs were designated as EST-SSR markers, and their quality was validated using 100 randomly selected SSR markers. In total, 72 primer pairs reproducibly amplified target amplicons, and 61 of these primer pairs detected significant polymorphism among 28 kenaf accessions. Thus, in this study, we have developed large-scale SSR markers for kenaf, and this new resource will facilitate construction of genetic linkage maps and investigation of fiber growth and development in kenaf, and will also be of value for novel gene discovery and functional genomic studies. PMID:26960153
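Mining SSRs from assembled sequences amounts to scanning for short motifs repeated in tandem. The sketch below is a simplified regular-expression illustration of that idea and is not the MISA software; the function name and the minimum repeat counts are illustrative assumptions, and compound or interrupted repeats are not handled.

```python
import re

# Illustrative thresholds (motif length -> minimum number of tandem copies).
# These values are assumptions for the example, not settings from the study.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=None):
    """Return (start, motif, copies) for perfect tandem repeats in a DNA string."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    sequence = sequence.upper()
    hits = []
    for unit_len, minimum in sorted(thresholds.items()):
        # A captured motif followed by at least (minimum - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, minimum - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "ATAT" is really the dinucleotide "AT".
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCG" + "AGC" * 5 + "TT"
    print(find_ssrs(example))  # [(3, 'AT', 7), (20, 'AGC', 5)]
```

Primer design around each hit, which the study then validated experimentally, is a separate step not covered by this sketch.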
Rohs, Remo; Sklenar, Heinz
2004-04-01
The results presented in this paper on methylene blue (MB) binding to DNA with AT alternating base sequence complement the data obtained in two former modeling studies of MB binding to GC alternating DNA. In the light of the large amount of experimental data for both systems, this theoretical study is focused on a detailed energetic analysis and comparison in order to understand their different behavior. Since experimental high-resolution structures of the complexes are not available, the analysis is based on energy minimized structural models of the complexes in different binding modes. For both sequences, four different intercalation structures and two models for MB binding in the minor and major groove have been proposed. Solvent electrostatic effects were included in the energetic analysis by using electrostatic continuum theory, and the dependence of MB binding on salt concentration was investigated by solving the non-linear Poisson-Boltzmann equation. We find that the relative stability of the different complexes is similar for the two sequences, in agreement with the interpretation of spectroscopic data. Subtle differences, however, are seen in energy decompositions and can be attributed to the change from symmetric 5'-YpR-3' intercalation to minor groove binding with increasing salt concentration, which is experimentally observed for the AT sequence at lower salt concentration than for the GC sequence. According to our results, this difference is due to the significantly lower non-electrostatic energy for the minor groove complex with AT alternating DNA, whereas the slightly lower binding energy to this sequence is caused by a higher deformation energy of DNA. The energetic data are in agreement with the conclusions derived from different spectroscopic studies and can also be structurally interpreted on the basis of the modeled complexes. 
The simple static modeling technique and the neglect of entropy terms and of non-electrostatic solute-solvent interactions, which are assumed to be nearly constant for the compared complexes of MB with DNA, seem to be justified by the results.
Buried and accessible surface area control intrinsic protein flexibility.
Marsh, Joseph A
2013-09-09
Proteins experience a wide variety of conformational dynamics that can be crucial for facilitating their diverse functions. How is the intrinsic flexibility required for these motions encoded in their three-dimensional structures? Here, the overall flexibility of a protein is demonstrated to be tightly coupled to the total amount of surface area buried within its fold. A simple proxy for this, the relative solvent-accessible surface area (Arel), therefore shows excellent agreement with independent measures of global protein flexibility derived from various experimental and computational methods. Application of Arel on a large scale demonstrates its utility by revealing unique sequence and structural properties associated with intrinsic flexibility. In particular, flexibility as measured by Arel shows little correspondence with intrinsic disorder, but instead tends to be associated with multiple domains and increased α-helical structure. Furthermore, the apparent flexibility of monomeric proteins is found to be useful for identifying quaternary-structure errors in published crystal structures. There is also a strong tendency for the crystal structures of more flexible proteins to be solved to lower resolutions. Finally, local solvent accessibility is shown to be a primary determinant of local residue flexibility. Overall, this work provides both fundamental mechanistic insight into the origin of protein flexibility and a simple, practical method for predicting flexibility from protein structures. © 2013 Elsevier Ltd. All rights reserved.
Artificial lunar impact craters: Four new identifications, part I
NASA Technical Reports Server (NTRS)
Whitaker, E. A.
1972-01-01
The Apollo 16 panoramic camera photographed the impact locations of the Ranger 7 and 9 spacecraft and the S-4B stage of the Apollo 14 Saturn launch vehicle. Identification of the Ranger craters was very simple because each photographed its target point before impact. Identification of the S-4B impact crater proved to be a simple matter because the impact location, as derived from earth-based tracking, displayed a prominent and unique system of mixed light and dark rays. By using the criterion of a dark ray pattern, a reexamination of the Apollo 14 500 mm Hasselblad sequence taken of the Apollo 13 S-4B impact area was made. This examination quickly led to the discovery of the ray system and the impact crater. The study of artificial lunar impact craters, ejecta blankets, and ray systems provides the long-needed link between the various experimental terrestrial impact and explosion craters, and the naturally occurring impact craters on the moon. This elementary study shows that lunar impact crater diameters are closely predictable from a knowledge of the energies involved, at least in the size range considered, and suggests that parameters, such as velocity, may have a profound effect on crater morphology and ejecta blanket albedo.
When push comes to shove: Exclusion processes with nonlocal consequences
NASA Astrophysics Data System (ADS)
Almet, Axel A.; Pan, Michael; Hughes, Barry D.; Landman, Kerry A.
2015-11-01
Stochastic agent-based models are useful for modelling collective movement of biological cells. Lattice-based random walk models of interacting agents where each site can be occupied by at most one agent are called simple exclusion processes. An alternative motility mechanism to simple exclusion is formulated, in which agents are granted more freedom to move under the compromise that interactions are no longer necessarily local. This mechanism is termed shoving. A nonlinear diffusion equation is derived for a single population of shoving agents using mean-field continuum approximations. A continuum model is also derived for a multispecies problem with interacting subpopulations, which either obey the shoving rules or the simple exclusion rules. Numerical solutions of the derived partial differential equations compare well with averaged simulation results for both the single species and multispecies processes in two dimensions, while some issues arise in one dimension for the multispecies case.
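The simple exclusion rule described above (at most one agent per site, moves into occupied sites are aborted) can be sketched as follows. This is a one-dimensional illustration with a random sequential update and closed boundaries, both of which are assumptions; the shoving mechanism is not implemented because the abstract does not specify its rules. The function name is hypothetical.

```python
import random


def exclusion_sweep(lattice, rng):
    """Perform one sweep of a 1-D simple exclusion process in place.

    `lattice` is a list of booleans (True = occupied). In each sweep, as many
    move attempts are made as there are agents; an attempt picks a random
    agent and a random neighbouring site, and succeeds only if that site
    exists and is empty.
    """
    n = len(lattice)
    agents = [i for i, occupied in enumerate(lattice) if occupied]
    for _ in range(len(agents)):
        idx = rng.randrange(len(agents))
        i = agents[idx]
        j = i + rng.choice((-1, 1))
        if 0 <= j < n and not lattice[j]:
            lattice[i], lattice[j] = False, True
            agents[idx] = j
    return lattice


if __name__ == "__main__":
    rng = random.Random(0)
    lattice = [True] * 10 + [False] * 40
    for _ in range(100):
        exclusion_sweep(lattice, rng)
    print("".join("#" if s else "." for s in lattice))
```

Averaging the occupancy over many such runs is what would be compared with a continuum (mean-field) description.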
Bashir, Ali; Bansal, Vikas; Bafna, Vineet
2010-06-18
Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with their unique characteristics, pose a number of design challenges, regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance. For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200 bp and 2 Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 Million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect low expressed genes with greater than 95% probability. Together, our results form a generic framework for many design considerations related to high-throughput sequencing. 
We provide software tools http://bix.ucsd.edu/projects/NGS-DesignTools to derive platform independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next generation sequencing.
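As a rough illustration of the depth question for rare transcripts, the sketch below uses a deliberately simplified model: each read is assumed to come from the transcript of interest independently with probability p, so the chance of seeing it at least once in N reads is 1 - (1 - p)^N. This is not the authors' derivation or their correction procedure, and the function name is hypothetical.

```python
import math


def reads_needed(p: float, detect_prob: float = 0.95) -> int:
    """Smallest N such that 1 - (1 - p)**N >= detect_prob.

    Assumes independent reads, each hitting the transcript with probability p.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0.0 < detect_prob < 1.0:
        raise ValueError("detect_prob must lie strictly between 0 and 1")
    # Solve (1 - p)**N <= 1 - detect_prob for N.
    return math.ceil(math.log(1.0 - detect_prob) / math.log1p(-p))


if __name__ == "__main__":
    for p in (1e-4, 1e-5, 1e-6):
        print(f"p = {p:g}: about {reads_needed(p):,} reads")
```

Because ln(0.05) is roughly -3, the required depth under this model is close to 3/p for small p.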
A Simple Acronym for Doing Calculus: CAL
ERIC Educational Resources Information Center
Hathaway, Richard J.
2008-01-01
An acronym is presented that provides students a potentially useful, unifying view of the major topics covered in an elementary calculus sequence. The acronym (CAL) is based on viewing the calculus procedure for solving a calculus problem P* in three steps: (1) recognizing that the problem cannot be solved using simple (non-calculus) techniques;…
ERIC Educational Resources Information Center
Datchuk, Shawn M.; Kubina, Richard M., Jr.
2017-01-01
The present study used a multiple-baseline, single-case experimental design to investigate the effects of a multicomponent intervention on construction of simple sentences and word sequences. The intervention entailed sequential delivery of sentence instruction and frequency building to a performance criterion and paragraph instruction.…
Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance
Plantegenet, Stephanie; Weber, Johann; Goldstein, Darlene R; Zeller, Georg; Nussbaumer, Cindy; Thomas, Jérôme; Weigel, Detlef; Harshman, Keith; Hardtke, Christian S
2009-01-01
In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5′ regulatory sequence variation in the corresponding genes is indeed increased. However, ∼42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL. PMID:19225455
Intestinal microbiota composition in fishes is influenced by host ecology and environment.
Wong, Sandi; Rawls, John F
2012-07-01
The digestive tracts of vertebrates are colonized by complex assemblages of micro-organisms, collectively called the gut microbiota. Recent studies have revealed important contributions of gut microbiota to vertebrate health and disease, stimulating intense interest in understanding how gut microbial communities are assembled and how they impact host fitness (Sekirov et al. 2010). Although all vertebrates harbour a gut microbiota, current information on microbiota composition and function has been derived primarily from mammals. Comparisons of different mammalian species have revealed intriguing associations between gut microbiota composition and host diet, anatomy and phylogeny (Ley et al. 2008b). However, mammals constitute <10% of all vertebrate species, and it remains unclear whether similar associations exist in more diverse and ancient vertebrate lineages such as fish. In this issue, Sullam et al. (2012) make an important contribution toward identifying factors determining gut microbiota composition in fishes. The authors conducted a detailed meta-analysis of 25 bacterial 16S rRNA gene sequence libraries derived from the intestines of different fish species. To provide a broader context for their analysis, they compared these data sets to a large collection of 16S rRNA gene sequence data sets from diverse free-living and host-associated bacterial communities. Their results suggest that variation in gut microbiota composition in fishes is strongly correlated with species habitat salinity, trophic level and possibly taxonomy. Comparison of data sets from fish intestines and other environments revealed that fish gut microbiota compositions are often similar to those of other animals and contain relatively few free-living environmental bacteria. 
These results suggest that the gut microbiota composition of fishes is not a simple reflection of the micro-organisms in their local habitat but may result from host-specific selective pressures within the gut (Bevins & Salzman 2011).
Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana
2016-07-01
The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed.
Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W
2015-04-11
Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments was capable of control through optimisation of 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to the Covaris physical shearing procedure, due to efficiencies of cost and labour, the method may be more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
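The degeneracy of the CNNR motif is what makes the digestion only weakly sequence-dependent. The sketch below lists candidate CNNR positions on one strand of a sequence; it assumes every such cytosine could be methylated and says nothing about where the enzyme actually cuts. The function name is hypothetical.

```python
import re

# C, any two bases, then a purine (G or A). The lookahead lets
# overlapping motifs be reported as well.
CNNR = re.compile(r"(?=(C[ACGT]{2}[AG]))")


def cnnr_sites(seq: str):
    """Return 0-based start positions of CNNR motifs on the given strand."""
    return [m.start() for m in CNNR.finditer(seq.upper())]


if __name__ == "__main__":
    print(cnnr_sites("CCAGTTCTTA"))
```

With equal base frequencies, about one position in eight starts a CNNR motif (1/4 for the C times 1/2 for the purine), so fragment size is governed mainly by how often 5-methylcytosine is incorporated.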
Adenine specific DNA chemical sequencing reaction.
Iverson, B L; Dervan, P B
1987-01-01
Reaction of DNA with K2PdCl4 at pH 2.0 followed by a piperidine workup produces specific cleavage at adenine (A) residues. Product analysis revealed the K2PdCl4 reaction involves selective depurination at adenine, affording an excision reaction analogous to the other chemical DNA sequencing reactions. Adenine residues methylated at the exocyclic amine (N6) react with lower efficiency than unmethylated adenine in an identical sequence. This simple protocol specific for A may be a useful addition to current chemical sequencing reactions. PMID:3671067
Min, Xiang Jia
2013-01-01
Expressed sequence tags (ESTs) are a rich resource for identifying alternatively spliced (AS) genes. The ASFinder webserver is designed to identify AS isoforms from EST-derived sequences. Two approaches are implemented in ASFinder. If no genomic sequences are provided, the server performs a local BLASTN to identify AS isoforms from ESTs having both ends aligned but an internal segment unaligned. Otherwise, ASFinder uses SIM4 to map ESTs to the genome; overlapping ESTs that map to the same genomic locus and have internal variable exon/intron boundaries are then identified as AS isoforms. The tool is available at http://proteomics.ysu.edu/tools/ASFinder.html.
Kim, Bo-Bae; Kim, Minji; Park, Yun-Hee; Ko, Youngkyung; Park, Jun-Beom
2017-06-01
Objective: Next-generation sequencing was performed to evaluate the effects of short-term application of dexamethasone on human gingiva-derived mesenchymal stem cells. Methods: Human gingiva-derived stem cells were treated with a final concentration of 10^-7 M dexamethasone and the same concentration of vehicle control. This was followed by mRNA sequencing and data analysis, gene ontology and pathway analysis, quantitative real-time polymerase chain reaction of mRNA, and western blot analysis of RUNX2 and β-catenin. Results: In total, 26,364 mRNAs were differentially expressed. Comparison of the results of dexamethasone versus control at 2 hours revealed that 7 mRNAs were upregulated and 25 mRNAs were downregulated. The application of dexamethasone reduced the expression of RUNX2 and β-catenin in human gingiva-derived mesenchymal stem cells. Conclusion: The effects of dexamethasone on stem cells were evaluated with mRNA sequencing, and validation of the expression was performed with quantitative real-time polymerase chain reaction and western blot analysis. The results of this study can provide new insights into the role of mRNA sequencing in maxillofacial areas.
NASA Astrophysics Data System (ADS)
Seresangtakul, Pusadee; Takara, Tomio
In this paper, the distinctive tones of Thai in running speech are studied. We present rules to synthesize F0 contours of Thai tones in running speech by using the generative model of F0 contours. Using our method, the pitch contours of Thai polysyllabic words, both disyllabic and trisyllabic, were analyzed. Coarticulation effects of Thai tones in running speech were found. Based on the analysis of the polysyllabic words using this model, rules are derived and applied to synthesize Thai polysyllabic tone sequences. We performed listening tests to evaluate the intelligibility of the rules for Thai tone generation. The average intelligibility scores were 98.8% and 96.6% for disyllabic and trisyllabic words, respectively. From these results, the tone generation rules were shown to be effective. Furthermore, we constructed connecting rules to synthesize suprasegmental F0 contours using the parameters of the trisyllable training rules. The parameters of the first, the third, and the second syllables were selected and assigned to the initial, the ending, and the remaining syllables in a sentence, respectively. Even with such a simple rule, the synthesized phrases/sentences were completely identified in listening tests. The MOS (mean opinion score) was 3.50, while the original and analysis/synthesis samples scored 4.82 and 3.59, respectively.
Back in time: a new systematic proposal for the Bilateria.
Baguñà, Jaume; Martinez, Pere; Paps, Jordi; Riutort, Marta
2008-04-27
Conventional wisdom suggests that bilateral organisms arose from ancestors that were radially, rather than bilaterally, symmetrical and, therefore, had a single body axis and no mesoderm. The two main hypotheses on how this transformation took place consider either a simple organism akin to the planula larva of extant cnidarians or the acoel Platyhelminthes (planuloid-acoeloid theory), or a rather complex organism bearing several or most features of advanced coelomate bilaterians (archicoelomate theory). We report phylogenetic analyses of bilaterian metazoans using quantitative (ribosomal, nuclear and expressed sequence tag sequences) and qualitative (HOX cluster genes and microRNA sets) markers. The phylogenetic trees obtained corroborate the position of acoel and nemertodermatid flatworms as the earliest branching extant members of the Bilateria. Moreover, some acoelomate and pseudocoelomate clades appear as early branching lophotrochozoans and deuterostomes. These results strengthen the view that stem bilaterians were small, acoelomate/pseudocoelomate, benthic organisms derived from planuloid-like organisms. Because morphological and recent gene expression data suggest that cnidarians are actually bilateral, the origin of the last common bilaterian ancestor has to be put back in time earlier than the cnidarian-bilaterian split in the form of a planuloid animal. A new systematic scheme for the Bilateria that includes the Cnidaria is suggested and its main implications discussed.
Formation of Bipolar Lobes by Jets
NASA Astrophysics Data System (ADS)
Soker, Noam
2002-04-01
I conduct an analytical study of the interaction of jets, or a collimated fast wind (CFW), with a previously blown asymptotic giant branch (AGB) slow wind. Such jets (or CFWs) are supposedly formed when a compact companion, a main-sequence star, or a white dwarf accretes mass from the AGB star, forms an accretion disk, and blows two jets. This type of flow, which I think shapes bipolar planetary nebulae (PNs), requires three-dimensional gasdynamical simulations, which are limited in the parameter space they can cover. By imposing several simplifying assumptions, I derive simple expressions which reproduce some basic properties of lobes in bipolar PNs and which can be used to guide future numerical simulations. I quantitatively apply the results to two proto-PNs. I show that the jet interaction with the slow wind can form lobes which are narrow close to, and far away from, the central binary system, and which are wider somewhere in between. Jets that are recollimated and have constant cross section can form cylindrical lobes with constant diameter, as observed in several bipolar PNs. Close to their source, jets blown by main-sequence companions are radiative; only further out they become adiabatic, i.e., they form high-temperature, low-density bubbles that inflate the lobes.
Detection of Food Intake from Swallowing Sequences by Supervised and Unsupervised Methods
Lopez-Meyer, Paulo; Makeyev, Oleksandr; Schuckers, Stephanie; Melanson, Edward L.; Neuman, Michael R.; Sazonov, Edward
2010-01-01
Studies of food intake and ingestive behavior in free-living conditions most often rely on self-reporting-based methods that can be highly inaccurate. Methods of Monitoring of Ingestive Behavior (MIB) rely on objective measures derived from chewing and swallowing sequences and thus can be used for unbiased study of food intake with free-living conditions. Our previous study demonstrated accurate detection of food intake in simple models relying on observation of both chewing and swallowing. This article investigates methods that achieve comparable accuracy of food intake detection using only the time series of swallows and thus eliminating the need for the chewing sensor. The classification is performed for each individual swallow rather than for previously used time slices and thus will lead to higher accuracy in mass prediction models relying on counts of swallows. Performance of a group model based on a supervised method (SVM) is compared to performance of individual models based on an unsupervised method (K-means) with results indicating better performance of the unsupervised, self-adapting method. Overall, the results demonstrate that highly accurate detection of intake of foods with substantially different physical properties is possible by an unsupervised system that relies on the information provided by the swallowing alone. PMID:20352335
Development and Characterization of 1,906 EST-SSR Markers from Unigenes in Jute (Corchorus spp.)
Zhang, Liwu; Li, Yanru; Tao, Aifen; Fang, Pingping; Qi, Jianmin
2015-01-01
Jute, comprising white and dark jute, is the second most important natural fiber crop worldwide after cotton. However, the lack of expressed sequence tag-derived simple sequence repeat (EST-SSR) markers has resulted in a large gap in the improvement of jute. Previously, 48,914 unigenes from white jute were assembled de novo. In this study, 1,906 EST-SSRs were identified from these assembled unigenes. Among these markers, di-, tri- and tetra-nucleotide repeat types were the most abundant (12.0%, 56.9% and 21.6%, respectively). AG-rich or GA-rich nucleotide repeats were predominant. Subsequently, a sample of 116 SSRs, located in genes encoding transcription factors and cellulose synthases, was selected to survey polymorphisms among 12 diverse jute accessions. Of these, 83.6% successfully amplified at least one fragment and detected polymorphism among the 12 diverse genotypes, indicating that the newly developed SSRs are of good quality. Furthermore, the genetic similarity coefficients of all 12 accessions were evaluated using 97 polymorphic SSRs. The cluster analysis divided the jute accessions into two main groups at a genetic similarity coefficient of 0.61. These EST-SSR markers not only enrich the molecular markers available for the jute genome, but also facilitate genetic and genomic research in jute. PMID:26512891
Pliaka, V; Dedepsidis, E; Kyriakopoulou, Z; Mpirli, K; Tsakogiannis, D; Pratti, A; Levidiotou-Stefanou, S; Markoulatos, P
2010-06-01
In the post-eradication era of wild polioviruses, the only remaining sources of poliovirus infection worldwide would be the vaccine-derived polioviruses (VDPVs). As the preponderance of countries certified to be polio-free has switched from OPV (oral poliovirus vaccine) to IPV (inactivated poliovirus vaccine), importation of recombinant evolved derivatives of vaccine strains would have serious implications for public health. To test the robustness of the proposed RT-PCR screening analysis, eleven recombinant vaccine-derived polioviruses that were characterized previously by sequencing by our group, in addition to three recently identified recombinant environmental isolates, were assayed. Although the most definitive characterization of VDPVs is by genomic sequencing, in this study we describe a new, inexpensive and broadly applicable RT-PCR assay for the identification of the predominant recombination types of VDPVs, S3/Sx in the 2C and S2/Sx in the 3D genomic regions, respectively. The assay can be readily implemented in laboratories lacking sequencing facilities as a first approach for the early detection of VDPVs.
Udatha, D B R K Gupta; Kouskoumvekaki, Irene; Olsson, Lisbeth; Panagiotou, Gianni
2011-01-01
One of the most intriguing groups of enzymes, the feruloyl esterases (FAEs), is ubiquitous in both simple and complex organisms. FAEs have gained importance in biofuel, medicine and food industries due to their capability of acting on a large range of substrates for cleaving ester bonds and synthesizing high added-value molecules through esterification and transesterification reactions. During the past two decades extensive studies have been carried out on the production and partial characterization of FAEs from fungi, while much less is known about FAEs of bacterial or plant origin. Initial classification studies on FAEs were restricted to sequence similarity and substrate specificity on just four model substrates and considered only a handful of FAEs belonging to the fungal kingdom. This study centers on the descriptor-based classification and structural analysis of experimentally verified and putative FAEs; nevertheless, the framework presented here is applicable to every poorly characterized enzyme family. A total of 365 FAE-related sequences of fungal, bacterial and plant origin were collected and clustered using Self Organizing Maps followed by k-means clustering into distinct groups based on amino acid composition and physico-chemical composition descriptors derived from the respective amino acid sequence. A Support Vector Machine model was subsequently constructed for the classification of new FAEs into the pre-assigned clusters. The model successfully recognized 98.2% of the training sequences and all the sequences of the blind test. The underlying functionality of the 12 proposed FAE families was validated against a combination of prediction tools and published experimental data. Another important aspect of the present work involves the development of pharmacophore models for the new FAE families, for which sufficient information on known substrates existed. 
Knowing the pharmacophoric features of a small molecule that are essential for binding to the members of a certain family opens a window of opportunities for tailored applications of FAEs. Copyright © 2010 Elsevier Inc. All rights reserved.
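One of the descriptor types mentioned in the abstract, amino acid composition, is simple enough to sketch. The code below turns a protein sequence into a 20-element vector of residue frequencies that could serve as input to a clustering or classification step. The function name and the choice to ignore non-standard residue codes are assumptions; the study's exact descriptor set is not given in the abstract.

```python
# The 20 standard amino acids in a fixed order, so that every sequence
# maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq: str):
    """Return the fraction of each standard amino acid in `seq`.

    Characters outside the 20 standard one-letter codes (e.g. X, B, gaps)
    are ignored when counting.
    """
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    vector = aa_composition("MKTAYIAKQRQISFVKSHFSRQ")
    for aa, frac in zip(AMINO_ACIDS, vector):
        if frac > 0:
            print(aa, round(frac, 3))
```

Fixed-length vectors of this kind let sequences of different lengths be compared without an alignment, which is the usual motivation for descriptor-based approaches.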
USDA-ARS's Scientific Manuscript database
Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...
Loblolly pine SSR markers for shortleaf pine genetics
C. Dana Nelson; Sedley Josserand; Craig S. Echt; Jeff Koppelman
2007-01-01
Simple sequence repeats (SSR) are highly informative DNA-based markers widely used in population genetic and linkage mapping studies. We have been developing PCR primer pairs for amplifying SSR markers for loblolly pine (Pinus taeda L.) using loblolly pine DNA and EST sequence data as starting materials. Fifty primer pairs known to reliably amplify...
Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV)
Martin, Andrew C. R.
2014-01-01
The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and ’dotifying’ repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/. PMID:25653836
Allegre, Mathilde; Argout, Xavier; Boccara, Michel; Fouet, Olivier; Roguet, Yolande; Bérard, Aurélie; Thévenin, Jean Marc; Chauveau, Aurélie; Rivallan, Ronan; Clement, Didier; Courtois, Brigitte; Gramacho, Karina; Boland-Augé, Anne; Tahi, Mathias; Umaharan, Pathmanathan; Brunel, Dominique; Lanaud, Claire
2012-01-01
Theobroma cacao is an economically important tree of several tropical countries. Its genetic improvement is essential to provide protection against major diseases and improve chocolate quality. We discovered and mapped new expressed sequence tag-single nucleotide polymorphism (EST-SNP) and simple sequence repeat (SSR) markers and constructed a high-density genetic map. By screening 149 650 ESTs, 5246 SNPs were detected in silico, of which 1536 corresponded to genes with a putative function, while 851 had a clear polymorphic pattern across a collection of genetic resources. In addition, 409 new SSR markers were detected on the Criollo genome. Lastly, 681 new EST-SNPs and 163 new SSRs were added to the pre-existing 418 co-dominant markers to construct a large consensus genetic map. This high-density map and the set of new genetic markers identified in this study are a milestone in cocoa genomics and for marker-assisted breeding. The data are available at http://tropgenedb.cirad.fr. PMID:22210604
Thomas, Cyril; Didierjean, André; Kuhn, Gustav
2018-04-17
When faced with a difficult question, people sometimes work out an answer to a related, easier question without realizing that a substitution has taken place (e.g., Kahneman, 2011, Thinking, fast and slow. New York, Farrar, Straus and Giroux). In two experiments, we investigated whether this attribute substitution effect can also affect the interpretation of a simple visual event sequence. We used a magic trick called the 'Flushtration Count Illusion', which involves a technique used by magicians to give the illusion of having seen multiple cards with identical backs, when in fact only the back of one card (the bottom card) is repeatedly shown. In Experiment 1, we demonstrated that most participants are susceptible to the illusion, even if they have the visual and analytical reasoning capacity to correctly process the sequence. In Experiment 2, we demonstrated that participants construct a biased and simplified representation of the Flushtration Count by substituting some attributes of the event sequence. We discuss the psychological processes underlying this attribute substitution effect. © 2018 The British Psychological Society.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats
Anwar, Tamanna; Khan, Asad U
2006-01-01
Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found as longer tracts (> 6 repeating units). Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to find the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. Availability: This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com PMID:17597863
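The kind of search described above, reporting each repeat together with its exact location, might be sketched as follows. This is not SSRscanner's code (which is written in PERL and distributed by its authors); the function name find_ssrs, the regular-expression approach, and the default thresholds are all assumptions for illustration.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=3, min_repeats=6):
    """Return (start, end, unit, repeats) for each perfect tandem repeat.

    Positions are 1-based and inclusive. Thresholds are illustrative
    defaults, not values taken from SSRscanner.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A unit of k bases followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif,
            # e.g. "AA" when k = 2; those are reported at the shorter k.
            if any(unit == unit[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            repeats = (m.end() - m.start()) // k
            hits.append((m.start() + 1, m.end(), unit, repeats))
    return sorted(hits)
```

Only perfect repeats are detected in this sketch; interrupted or compound repeats would need a more tolerant matching strategy.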
Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y
2014-04-08
The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.
2012-01-01
Background Cultivated peanut or groundnut (Arachis hypogaea L.) is an important oilseed crop with an allotetraploid genome (AABB, 2n = 4x = 40). Both the low level of genetic variation within the cultivated gene pool and its polyploid nature limit the utilization of molecular markers to explore genome structure and facilitate genetic improvement. Nevertheless, a wealth of genetic diversity exists in diploid Arachis species (2n = 2x = 20), which represent a valuable gene pool for cultivated peanut improvement. Interspecific populations have been used widely for genetic mapping in diploid species of Arachis. However, an intraspecific mapping strategy was essential to detect chromosomal rearrangements among species that could be obscured by mapping in interspecific populations. To develop intraspecific reference linkage maps and gain insights into karyotypic evolution within the genus, we comparatively mapped the A- and B-genome diploid species using intraspecific F2 populations. Exploring genome organization among diploid peanut species by comparative mapping will enhance our understanding of the cultivated tetraploid peanut genome. Moreover, new sources of molecular markers that are highly transferable between species and developed from expressed genes will be required to construct saturated genetic maps for peanut. Results A total of 2,138 EST-SSR (expressed sequence tag-simple sequence repeat) markers were developed by mining a tetraploid peanut EST assembly including 101,132 unigenes (37,916 contigs and 63,216 singletons) derived from 70,771 long-read (Sanger) and 270,957 short-read (454) sequences. A set of 97 SSR markers were also developed by mining 9,517 genomic survey sequences of Arachis. An SSR-based intraspecific linkage map was constructed using an F2 population derived from a cross between K 9484 (PI 298639) and GKBSPSc 30081 (PI 468327) in the B-genome species A. batizocoi. 
A high degree of macrosynteny was observed when comparing the homoeologous linkage groups between A (A. duranensis) and B (A. batizocoi) genomes. Comparison of the A- and B-genome genetic linkage maps also showed a total of five inversions and one major reciprocal translocation between two pairs of chromosomes under our current mapping resolution. Conclusions Our findings will contribute to understanding tetraploid peanut genome origin and evolution and eventually promote its genetic improvement. The newly developed EST-SSR markers will enrich current molecular marker resources in peanut. PMID:23140574
Robust temporal alignment of multimodal cardiac sequences
NASA Astrophysics Data System (ADS)
Perissinotto, Andrea; Queirós, Sandro; Morais, Pedro; Baptista, Maria J.; Monaghan, Mark; Rodrigues, Nuno F.; D'hooge, Jan; Vilaça, João. L.; Barbosa, Daniel
2015-03-01
Given the dynamic nature of cardiac function, correct temporal alignment of pre-operative models and intraoperative images is crucial for augmented reality in cardiac image-guided interventions. As such, the current study focuses on the development of an image-based strategy for temporal alignment of multimodal cardiac imaging sequences, such as cine Magnetic Resonance Imaging (MRI) or 3D Ultrasound (US). First, we derive a robust, modality-independent signal from the image sequences, estimated by computing the normalized cross-correlation between each frame in the temporal sequence and the end-diastolic frame. This signal is a surrogate for the left-ventricle (LV) volume curve over time, whose variation indicates different temporal landmarks of the cardiac cycle. We then perform the temporal alignment of these surrogate signals derived from MRI and US sequences of the same patient through Dynamic Time Warping (DTW), allowing both sequences to be synchronized. The proposed framework was evaluated in 98 patients who had undergone both 3D+t MRI and US scans. The end-systolic frame could be accurately estimated as the minimum of the image-derived surrogate signal, presenting a relative error of 1.6 +/- 1.9% and 4.0 +/- 4.2% for the MRI and US sequences, respectively, thus supporting its association with key temporal instants of the cardiac cycle. The use of DTW reduces the desynchronization of the cardiac events in MRI and US sequences, allowing multimodal cardiac imaging sequences to be temporally aligned. Overall, a generic, fast and accurate method for temporal synchronization of MRI and US sequences of the same patient was introduced. This approach could be straightforwardly used for the correct temporal alignment of pre-operative MRI information and intra-operative US images.
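The two steps described above (a normalized cross-correlation surrogate signal, then Dynamic Time Warping between two such signals) can be sketched in a few lines. This is a textbook DTW with an absolute-difference local cost, not the authors' implementation; the function names ncc and dtw_path and the representation of frames as flat lists of intensities are assumptions.

```python
def ncc(a, b):
    """Normalized cross-correlation of two equally sized, flattened frames."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den if den else 0.0


def dtw_path(s, t):
    """Classic DTW between 1-D signals s and t.

    Returns (total_cost, path), where path is a list of 0-based
    (index_in_s, index_in_t) pairs from the first to the last frame.
    """
    n, m = len(s), len(t)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path
```

In this sketch a surrogate signal would be built as [ncc(frame, end_diastolic_frame) for frame in sequence], and the index of its minimum would serve as the end-systolic estimate, mirroring the description in the abstract.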
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-02-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators.
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-01-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators. PMID:3257578
A Benchmark Study on Error Assessment and Quality Control of CCS Reads Derived from the PacBio RS
Jiao, Xiaoli; Zheng, Xin; Ma, Liang; Kutty, Geetha; Gogineni, Emile; Sun, Qiang; Sherman, Brad T.; Hu, Xiaojun; Jones, Kristine; Raley, Castle; Tran, Bao; Munroe, David J.; Stephens, Robert; Liang, Dun; Imamichi, Tomozumi; Kovacs, Joseph A.; Lempicki, Richard A.; Huang, Da Wei
2013-01-01
PacBio RS, a newly emerging third-generation DNA sequencing platform, is based on a real-time, single-molecule, nano-nitch sequencing technology that can generate very long reads (up to 20-kb) in contrast to the shorter reads produced by the first and second generation sequencing technologies. As a new platform, it is important to assess the sequencing error rate, as well as the quality control (QC) parameters associated with the PacBio sequence data. In this study, a mixture of 10 prior known, closely related DNA amplicons were sequenced using the PacBio RS sequencing platform. After aligning Circular Consensus Sequence (CCS) reads derived from the above sequencing experiment to the known reference sequences, we found that the median error rate was 2.5% without read QC, and improved to 1.3% with an SVM based multi-parameter QC method. In addition, a De Novo assembly was used as a downstream application to evaluate the effects of different QC approaches. This benchmark study indicates that even though CCS reads are post error-corrected it is still necessary to perform appropriate QC on CCS reads in order to produce successful downstream bioinformatics analytical results. PMID:24179701
A Benchmark Study on Error Assessment and Quality Control of CCS Reads Derived from the PacBio RS.
Jiao, Xiaoli; Zheng, Xin; Ma, Liang; Kutty, Geetha; Gogineni, Emile; Sun, Qiang; Sherman, Brad T; Hu, Xiaojun; Jones, Kristine; Raley, Castle; Tran, Bao; Munroe, David J; Stephens, Robert; Liang, Dun; Imamichi, Tomozumi; Kovacs, Joseph A; Lempicki, Richard A; Huang, Da Wei
2013-07-31
PacBio RS, a newly emerging third-generation DNA sequencing platform, is based on a real-time, single-molecule, nano-nitch sequencing technology that can generate very long reads (up to 20-kb) in contrast to the shorter reads produced by the first and second generation sequencing technologies. As a new platform, it is important to assess the sequencing error rate, as well as the quality control (QC) parameters associated with the PacBio sequence data. In this study, a mixture of 10 prior known, closely related DNA amplicons were sequenced using the PacBio RS sequencing platform. After aligning Circular Consensus Sequence (CCS) reads derived from the above sequencing experiment to the known reference sequences, we found that the median error rate was 2.5% without read QC, and improved to 1.3% with an SVM based multi-parameter QC method. In addition, a De Novo assembly was used as a downstream application to evaluate the effects of different QC approaches. This benchmark study indicates that even though CCS reads are post error-corrected it is still necessary to perform appropriate QC on CCS reads in order to produce successful downstream bioinformatics analytical results.
Kweon, Chang-Hee; Nguyen, Lien Thi Kim; Yoo, Mi-Sun; Kang, Seung-Won
2015-09-15
Porcine circovirus type 2 (PCV2) is the causative agent of post-weaning multisystemic wasting syndrome (PMWS) in swine. Here, a phylogenetic tree was constructed using PCV2 nucleotide sequences derived from the bone marrow of Korean boar and previously reported PCV2 sequences isolated from various countries. PCV2 from Korean boar bone marrow (KC188796) was classified into the group containing PCV2a-Canada and other PCV2 strains from Korea. While the ORF1 region of the PCV2 genome was highly conserved, ORF2 (the capsid protein coding region) was relatively variable. The nucleotide sequences for bone marrow-derived PCV2 were 93.4-99.0% homologous to the other reference sequences. The deduced amino acid sequences for the ORF1 and ORF2 coding regions were 97.4-99.3% and 84.5-97.4% homologous with the other reference strains, respectively, indicating that KC188796 did not differ markedly from the other PCV2 strains. Phylogenetic analysis demonstrated that bone marrow-derived PCV2 was highly similar to PCV2a from Canada and may be related to persistent PCV2 infections in swine. Copyright © 2015 Elsevier B.V. All rights reserved.
Sitt, Tatjana; Pelle, Roger; Chepkwony, Maurine; Morrison, W Ivan; Toye, Philip
2018-05-06
The extent of sequence diversity among the genes encoding 10 antigens (Tp1-10) known to be recognized by CD8+ T lymphocytes from cattle immune to Theileria parva was analysed. The sequences were derived from parasites in 23 buffalo-derived cell lines, three cattle-derived isolates and one cloned cell line obtained from a buffalo-derived stabilate. The results revealed substantial variation among the antigens through sequence diversity. The greatest nucleotide and amino acid diversity were observed in Tp1, Tp2 and Tp9. Tp5 and Tp7 showed the least amount of allelic diversity, and Tp5, Tp6 and Tp7 had the lowest levels of protein diversity. Tp6 was the most conserved protein; only a single non-synonymous substitution was found in all obtained sequences. The ratio of non-synonymous: synonymous substitutions varied from 0.84 (Tp1) to 0.04 (Tp6). Apart from Tp2 and Tp9, we observed no variation in the other defined CD8+ T cell epitopes (Tp4, 5, 7 and 8), indicating that epitope variation is not a universal feature of T. parva antigens. In addition to providing markers that can be used to examine the diversity in T. parva populations, the results highlight the potential for using conserved antigens to develop vaccines that provide broad protection against T. parva.
BioWord: A sequence manipulation suite for Microsoft Word
2012-01-01
Background The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. Results BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. Conclusions BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms. PMID:22676326
BioWord: a sequence manipulation suite for Microsoft Word.
Anzaldi, Laura J; Muñoz-Fernández, Daniel; Erill, Ivan
2012-06-07
The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms.
NASA Technical Reports Server (NTRS)
Gatlin, L. L.
1974-01-01
Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
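The two randomness measures described above, unequal residue frequencies and unequal residue-pair frequencies, correspond to standard entropy calculations. The sketch below computes a divergence from equiprobability and a divergence from independence of adjacent residues; the names d1 and d2, the function names, and the use of adjacent pairs only are assumptions for illustration, and no Monte Carlo baseline is included.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits) of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def redundancy_measures(seq, alphabet_size=20):
    """Return (d1, d2) for a sequence.

    d1: log2(alphabet_size) - H1, the shortfall of single-residue entropy
        from its maximum (some residues are used more than others).
    d2: H1 - H(next | current), the information adjacent residues carry
        about each other (some pairs are used more than others).
    """
    h1 = entropy(Counter(seq))
    h_pair = entropy(Counter(zip(seq, seq[1:])))
    h_first = entropy(Counter(seq[:-1]))
    h_cond = h_pair - h_first  # H(X2 | X1) = H(X1, X2) - H(X1)
    return math.log2(alphabet_size) - h1, h1 - h_cond
```

For short chains these estimates are biased, which is consistent with the abstract's remark about constraints due to short chain length; comparison against shuffled sequences of the same length would be the natural control.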
Modeling How, When, and What Is Learned in a Simple Fault-Finding Task
ERIC Educational Resources Information Center
Ritter, Frank E.; Bibby, Peter A.
2008-01-01
We have developed a process model that learns in multiple ways while finding faults in a simple control panel device. The model predicts human participants' learning through its own learning. The model's performance was systematically compared to human learning data, including the time course and specific sequence of learned behaviors. These…
ERIC Educational Resources Information Center
Hollibaugh, Molly
2012-01-01
At first glance, a Zentangle creation can seem intricate and complicated. But, when you learn how it is done, you realize how simple it is. Zentangles are patterns, or "tangles," that have been reduced to a simple sequence of elemental strokes. When you learn to focus on each stroke you find yourself capable of things that you may have once…
An annotated genetic map of loblolly pine based on microsatellite and cDNA markers
USDA-ARS's Scientific Manuscript database
Previous loblolly pine (Pinus taeda L.) genetic linkage maps have been based on a variety of DNA polymorphisms, such as AFLPs, RAPDs, RFLPs, and ESTPs, but only a few SSRs (simple sequence repeats), also known as simple tandem repeats or microsatellites, have been mapped in P. taeda. The objective o...
Simple model for deriving sdg interacting boson model Hamiltonians: 150Nd example
NASA Astrophysics Data System (ADS)
Devi, Y. D.; Kota, V. K. B.
1993-07-01
A simple and yet useful model for deriving sdg interacting boson model (IBM) Hamiltonians is to assume that single-boson energies derive from identical particle (pp and nn) interactions and proton, neutron single-particle energies, and that the two-body matrix elements for bosons derive from pn interaction, with an IBM-2 to IBM-1 projection of the resulting p-n sdg IBM Hamiltonian. The applicability of this model in generating sdg IBM Hamiltonians is demonstrated, using a single-j-shell Otsuka-Arima-Iachello mapping of the quadrupole and hexadecupole operators in proton and neutron spaces separately and constructing a quadrupole-quadrupole plus hexadecupole-hexadecupole Hamiltonian in the analysis of the spectra, B(E2)'s, and E4 strength distribution in the example of 150Nd.
Detection of possible restriction sites for type II restriction enzymes in DNA sequences.
Gagniuc, P; Cimponeriu, D; Ionescu-Tîrgovişte, C; Mihai, Andrada; Stavarachi, Monica; Mihai, T; Gavrilă, L
2011-01-01
In order to make a step forward in the knowledge of the mechanism operating in complex polygenic disorders such as diabetes and obesity, this paper proposes a new algorithm (PRSD, possible restriction site detection) and its implementation in Applied Genetics software. This software can be used for in silico detection of potential (hidden) recognition sites for endonucleases and for identification of nucleotide repeats. The recognition sites for endonucleases may result from hidden sequences through deletion or insertion of a specific number of nucleotides. Tests were conducted on DNA sequences downloaded from NCBI servers using specific recognition sites for common type II restriction enzymes introduced in the software database (n = 126). Each possible recognition site indicated by the PRSD algorithm implemented in Applied Genetics was checked and confirmed by NEBcutter V2.0 and Webcutter 2.0 software. In the sequence NG_008724.1 (which includes 63632 nucleotides) we found a high number of potential restriction sites for EcoRI that may be produced by deletion (n = 43 sites) or insertion (n = 591 sites) of one nucleotide. The second module of Applied Genetics has been designed to find simple repeats of various sizes, which may help in understanding the role of SNPs (Single Nucleotide Polymorphisms) in the pathogenesis of complex metabolic disorders. We have tested the presence of simple repetitive sequences in five DNA sequences. The software indicated the exact position of each repeat detected in the tested sequences. Future development of Applied Genetics can provide an alternative to powerful tools used to search for restriction sites or repetitive sequences or to improve genotyping methods.
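The idea of a hidden recognition site, one that appears only after a single nucleotide is deleted or inserted, can be sketched with a simple window scan. This is not the PRSD algorithm as implemented in Applied Genetics; the function names and the decision to skip windows that already contain the intact site are assumptions for illustration.

```python
def sites_by_deletion(seq, site):
    """0-based starts of windows of len(site)+1 bases that become `site`
    when exactly one base is deleted. Windows that already contain the
    intact site are skipped, since nothing is hidden there."""
    k = len(site)
    hits = []
    for i in range(len(seq) - k):
        window = seq[i:i + k + 1]
        if site in window:
            continue
        if any(window[:j] + window[j + 1:] == site for j in range(k + 1)):
            hits.append(i)
    return hits


def sites_by_insertion(seq, site):
    """0-based starts of windows of len(site)-1 bases that become `site`
    when exactly one base is inserted somewhere in or next to the window."""
    k = len(site)
    # Every string obtainable by removing one base from the site.
    one_short = {site[:j] + site[j + 1:] for j in range(k)}
    return [i for i in range(len(seq) - k + 2) if seq[i:i + k - 1] in one_short]
```

With site set to "GAATTC" (the EcoRI recognition sequence), the two functions list candidate positions analogous to the deletion and insertion counts reported in the abstract, although the exact counting rules used there are not specified and may differ.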
"Sequence space soup" of proteins and copolymers
NASA Astrophysics Data System (ADS)
Chan, Hue Sun; Dill, Ken A.
1991-09-01
To study the protein folding problem, we use exhaustive computer enumeration to explore "sequence space soup," an imaginary solution containing the "native" conformations (i.e., of lowest free energy) under folding conditions, of every possible copolymer sequence. The model is of short self-avoiding chains of hydrophobic (H) and polar (P) monomers configured on the two-dimensional square lattice. By exhaustive enumeration, we identify all native structures for every possible sequence. We find that random sequences of H/P copolymers will bear striking resemblance to known proteins: Most sequences under folding conditions will be approximately as compact as known proteins, will have considerable amounts of secondary structure, and it is most probable that an arbitrary sequence will fold to a number of lowest free energy conformations that is of order one. In these respects, this simple model shows that proteinlike behavior should arise simply in copolymers in which one monomer type is highly solvent averse. It suggests that the structures and uniquenesses of native proteins are not consequences of having 20 different monomer types, or of unique properties of amino acid monomers with regard to special packing or interactions, and thus that simple copolymers might be designable to collapse to proteinlike structures and properties. A good strategy for designing a sequence to have a minimum possible number of native states is to strategically insert many P monomers. Thus known proteins may be marginally stable due to a balance: More H residues stabilize the desired native state, but more P residues prevent simultaneous stabilization of undesired native states.
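Exhaustive enumeration on the square lattice, as described above, is small enough to sketch directly for short chains. The code below assumes the usual HP convention that the energy is minus the number of non-bonded H-H nearest-neighbor contacts, so native conformations are those with the most contacts; that energy function, the function names, and the symmetry handling are assumptions rather than details taken from the abstract.

```python
def conformations(n):
    """Yield self-avoiding walks of n monomers (n >= 2) on the square
    lattice. The first bond is fixed along +x to remove rotations;
    mirror images are still generated separately."""
    def extend(path, occupied):
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))


def hh_contacts(seq, path):
    """Count H-H pairs that are lattice neighbors but not chain neighbors."""
    pos = {p: i for i, p in enumerate(path)}
    count = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Look only right and up so each contact is counted once.
        for nb in ((x + 1, y), (x, y + 1)):
            j = pos.get(nb)
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                count += 1
    return count


def native_states(seq):
    """Return (max_contacts, number_of_conformations_achieving_it)."""
    best, natives = -1, 0
    for path in conformations(len(seq)):
        c = hh_contacts(seq, path)
        if c > best:
            best, natives = c, 1
        elif c == best:
            natives += 1
    return best, natives
```

Because mirror images are counted separately here, the reported degeneracy is typically twice the number of symmetry-distinct native structures. The number of conformations grows exponentially with chain length, so this approach is only practical for short chains, as in the study.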
Masking as an effective quality control method for next-generation sequencing data analysis.
Yun, Sajung; Yun, Sijung
2014-12-13
Next generation sequencing produces base calls with low quality scores that can affect the accuracy of identifying simple nucleotide variation calls, including single nucleotide polymorphisms and small insertions and deletions. Here we compare the effectiveness of two data preprocessing methods, masking and trimming, and the accuracy of simple nucleotide variation calls on whole-genome sequence data from Caenorhabditis elegans. Masking substitutes low quality base calls with 'N's (undetermined bases), whereas trimming removes low quality bases, which results in shorter read lengths. We demonstrate that masking is more effective than trimming in reducing the false-positive rate in single nucleotide polymorphism (SNP) calling. However, neither preprocessing method affected the false-negative rate in SNP calling with statistical significance compared to the data analysis without preprocessing. False-positive rate and false-negative rate for small insertions and deletions did not show differences between masking and trimming. We recommend masking over trimming as a more effective preprocessing method for next generation sequencing data analysis since masking reduces the false-positive rate in SNP calling without sacrificing the false-negative rate, although trimming is currently more commonly used in the field. The perl script for masking is available at http://code.google.com/p/subn/. The sequencing data used in the study were deposited in the Sequence Read Archive (SRX450968 and SRX451773).
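The difference between the two preprocessing methods can be shown on a single read. The sketch below is not the authors' perl script; the function names, the Phred+33 quality encoding, the threshold of 20, and the restriction of trimming to the 3' end are all assumptions chosen for illustration.

```python
def mask_read(seq, qual, threshold=20, offset=33):
    """Replace every base whose Phred score is below threshold with 'N'.
    Read length and base positions are preserved."""
    return "".join(b if ord(q) - offset >= threshold else "N"
                   for b, q in zip(seq, qual))


def trim_read(seq, qual, threshold=20, offset=33):
    """Remove low-quality bases from the 3' end only (one common trimming
    strategy). The read becomes shorter; internal low-quality bases stay."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1
    return seq[:end], qual[:end]
```

For a read "ACGTAC" with qualities "II#II#", masking hides both low-quality calls while keeping the read length, whereas 3' trimming drops only the final base and leaves the internal low-quality call in place.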
Rana, Niki; Cultrara, Christopher; Phillips, Mariana; Sabatino, David
2017-09-01
In the search for more potent peptide-based anti-cancer conjugates the generation of new, functionally diverse nucleolipid derived D-(KLAKLAK)2-AK sequences has enabled a structure and anti-cancer activity relationship study. A reductive amination approach was key for the synthesis of alkylamine, diamine and polyamine derived nucleolipids as well as those incorporating heterocyclic functionality. The carboxy-derived nucleolipids were then coupled to the C-terminus of the D-(KLAKLAK)2-AK killer peptide sequence and produced with and without the FITC fluorophore for investigating biological activity in cancer cells. The amphiphilic, α-helical peptide-nucleolipid bioconjugates were found to exhibit variable effects on the viability of MM.1S cells, with the histamine derived nucleolipid peptide bioconjugate displaying the most significant anti-cancer effects. Thus, functionally diverse nucleolipids have been developed to fine-tune the structure and anti-cancer properties of killer peptide sequences, such as D-(KLAKLAK)2-AK. Copyright © 2017 Elsevier Ltd. All rights reserved.
Modahl, Cassandra M.; Mackessy, Stephen P.
2016-01-01
Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. 
This approach, requiring only venom, provides access to cDNA sequences in the absence of living specimens, even from commercial venom sources, to evaluate important regional differences in venom composition and to study snake venom protein evolution. PMID:27280639
Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir
2013-01-01
Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and kyoto encyclopedia of genes and genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689
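The RPKM measure mentioned in this abstract normalizes a transcript's read count by both transcript length and sequencing depth. A minimal sketch of the standard definition follows; the function name and the example numbers are hypothetical and are not taken from the study.

```python
def rpkm(reads_on_transcript, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 10^9 / (total mapped reads * transcript length in bp),
    which is the read count divided by length in kilobases and by
    library size in millions.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return reads_on_transcript * 1e9 / (total_mapped_reads * transcript_length_bp)
```

For example, 500 reads on a 2,000 bp transcript in a library of 10 million mapped reads gives `rpkm(500, 2000, 10_000_000) == 25.0`.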
Optimization of sequence alignment for simple sequence repeat regions.
Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C
2011-07-20
Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, consisting of tandem copies of specific sequences no longer than six bases, that are distributed throughout the genome. SSRs have been used as molecular markers because they are easy to detect and serve a range of applications, including genetic diversity studies, genome mapping, and marker-assisted selection. They are also highly mutable because of DNA polymerase slippage during replication. This distinctive mutation mechanism raises the insertion/deletion (INDEL) mutation frequency to a high level, higher than that of other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, existing sequence alignment algorithms are designed to fit the vast majority of the genomic sequence and do not treat microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. To overcome this limitation of alignment algorithms when dealing with SSR loci, a new algorithm was developed using a PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotide positions. This results in a shifting process according to the inserted repeated unit type. When the phylogenetic relations were studied before and after applying the new algorithm, many differences in the trees were obtained as SSR length and complexity increased. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
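Locating the repeated unit and the extent of an SSR, the step this abstract places before alignment, can be sketched with a regular expression. This is an illustrative sketch only, not the authors' PERL program: the function name `find_ssrs` and the minimum of four tandem copies are assumptions, while the limit of six bases per unit comes from the abstract.

```python
import re


def find_ssrs(sequence, min_copies=4, max_unit=6):
    """Return (start, unit, copies) for each tandem repeat found in the sequence.

    The lazy quantifier makes the regex try the shortest unit first, so a run
    such as ACACACAC is reported with unit AC rather than ACAC.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Hypothetical test sequence with one di- and one tri-nucleotide repeat.
example = "TTG" + "AC" * 5 + "GGT" + "CAG" * 4 + "T"
print(find_ssrs(example))  # [(3, 'AC', 5), (16, 'CAG', 4)]
```

The start position and copy count together give the first and last nucleotide of each repeat tract, which is the kind of information an SSR-aware aligner would need before shifting sequences by whole repeat units.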
2010-01-01
Background Previous reports have shown that peptides derived from the apolipoprotein E receptor binding region and the amphipathic α-helical domains of apolipoprotein AI have broad anti-infective activity and antiviral activity respectively. Lipoproteins and viruses share a similar cell biological niche, being of overlapping size and displaying similar interactions with mammalian cells and receptors, which may have led to other antiviral sequences arising within apolipoproteins, in addition to those previously reported. We therefore designed a series of peptides based around either apolipoprotein receptor binding regions, or amphipathic α-helical domains, and tested these for antiviral and antibacterial activity. Results Of the nineteen new peptides tested, seven showed some anti-infective activity, with two of these being derived from two apolipoproteins not previously used to derive anti-infective sequences. Apolipoprotein J (151-170) - based on a predicted amphipathic alpha-helical domain from apolipoprotein J - had measurable anti-HSV1 activity, as did apolipoprotein B (3359-3367) dp (apoBdp), the latter being derived from the LDL receptor binding domain B of apolipoprotein B. The more active peptide - apoBdp - showed similarity to the previously reported apoE derived anti-infective peptide, and further modification of the apoBdp sequence to align the charge distribution more closely to that of apoEdp or to introduce aromatic residues resulted in increased breadth and potency of activity. The most active peptide of this type showed similar potent anti-HIV activity, comparable to that we previously reported for the apoE derived peptide apoEdpL-W. Conclusions These data suggest that further antimicrobial peptides may be obtained using human apolipoprotein sequences, selecting regions with either amphipathic α-helical structure, or those linked to receptor-binding regions. 
The finding that an amphipathic α-helical region of apolipoprotein J has antiviral activity comparable with that for the previously reported apolipoprotein AI derived peptide 18A, suggests that full-length apolipoprotein J may also have such activity, as has been reported for full-length apolipoprotein AI. Although the strength of the anti-infective activity of the sequences identified was limited, this could be increased substantially by developing related mutant peptides. Indeed the apolipoprotein B-derived peptide mutants uncovered by the present study may have utility as HIV therapeutics or microbicides. PMID:20298574
Bell's Theorem and Einstein's "Spooky Actions" from a Simple Thought Experiment
ERIC Educational Resources Information Center
Kuttner, Fred; Rosenblum, Bruce
2010-01-01
In 1964 John Bell proved a theorem allowing the experimental test of whether what Einstein derided as "spooky actions at a distance" actually exist. We will see that they "do". Bell's theorem can be displayed with a simple, nonmathematical thought experiment suitable for a physics course at "any" level. And a simple, semi-classical derivation of…
Hosseinkhani, Hossein; Tabata, Yasuhiko
2003-01-09
The objective of this study is to investigate the efficiency of a non-viral gene carrier with RGD sequences, Pronectin F(+) for gene transfection. The Pronectin F(+) was cationized by introducing ethylenediamine (Ed), spermidine (Sd), and spermine (Sm) to the hydroxyl groups while the corresponding gelatin derivative was prepared similarly because gelatin also has one RGD sequence per molecule. The zeta potential and molecular size of Pronectin F(+) and gelatin derivatives were examined before and after polyion complexation with a plasmid DNA of luciferase. When complexed with the plasmid DNA at the Pronectin F(+)/plasmid DNA mixing ratio of 50, the complex exhibited a zeta potential of about 10 mV, which is similar to that of the gelatin derivative-plasmid DNA complex. Irrespective of the type of Pronectin F(+) and gelatin derivatives, their complexation enabled the apparent molecular size of plasmid DNA to reduce to about 200 nm, the size decreasing with the increased derivative/plasmid DNA weight mixing ratio. The rat gastric mucosal (RGM)-1 cells treated with both complexes exhibited significantly stronger luciferase activities than free plasmid DNA although the enhanced extent was significant for the Sm derivative compared with the corresponding Ed and Sd derivatives. Cell attachment was enhanced by the Pronectin F(+) derivative to a significant high extent compared with the gelatin derivative. The amount of plasmid DNA internalized into the cells was enhanced by the complexation with every Pronectin F(+) derivative compared with the gelatin derivative. For both of Pronectin F(+) and gelatin carriers, the buffering capacity of Sm derivatives was higher than that of Ed and Sd derivatives and comparable to that of polyethyleneimine. It is likely that the high efficiency of gene transfection for the Sm derivative is due to the superior buffering effect. We conclude that the Sm derivative of Pronectin F(+) is promising as a non-viral vector of gene transfection.
A high-speed on-chip pseudo-random binary sequence generator for multi-tone phase calibration
NASA Astrophysics Data System (ADS)
Gommé, Liesbeth; Vandersteen, Gerd; Rolain, Yves
2011-07-01
An on-chip reference generator is conceived by adopting the technique of decimating a pseudo-random binary sequence (PRBS) signal into parallel sequences. This is of great benefit when high-speed generation of PRBS and PRBS-derived signals is the objective. The design is implemented in standard CMOS logic, available in commercial libraries, which provides the logic functions for the generator. The design allows the user to select the periodicity of the PRBS and the PRBS-derived signals. The characterization of the on-chip generator marks its performance and reveals promising specifications.
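The decimation principle behind this abstract can be sketched in software. The example below is a generic illustration, not the chip's architecture: it assumes the common 7-bit PRBS with period 127 and splits it into two half-rate lanes. For a maximal-length sequence, each lane obtained by decimating by two is a shifted copy of the original sequence, so slow parallel generators can be interleaved to rebuild the full-rate stream. All function names are hypothetical.

```python
def prbs7(seed=0x7F, length=127):
    """Generate PRBS7 bits with a 7-bit linear feedback shift register."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(length):
        # Feedback is the XOR of the two most significant register bits.
        newbit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | newbit) & 0x7F
        bits.append(newbit)
    return bits


def decimate_into_lanes(seq, n_lanes):
    """Split one period of a periodic sequence into n_lanes decimated lanes.

    Lane p holds seq[p], seq[p + n_lanes], ... with indices taken modulo the period.
    """
    period = len(seq)
    return [
        [seq[(n_lanes * k + p) % period] for k in range(period)]
        for p in range(n_lanes)
    ]


def is_cyclic_shift(a, b):
    """True if list a equals list b rotated by some offset."""
    return len(a) == len(b) and any(a == b[s:] + b[:s] for s in range(len(b)))


seq = prbs7()
lane0, lane1 = decimate_into_lanes(seq, 2)
# Interleaving the two half-rate lanes reproduces two periods of the original.
interleaved = [bit for pair in zip(lane0, lane1) for bit in pair]
print(is_cyclic_shift(lane0, seq), interleaved == seq + seq)  # True True
```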
Foltz, T M; Welsh, B M
1999-01-01
This paper uses the fact that the discrete Fourier transform diagonalizes a circulant matrix to provide an alternate derivation of the symmetric convolution-multiplication property for discrete trigonometric transforms. Derived in this manner, the symmetric convolution-multiplication property extends easily to multiple dimensions using the notion of block circulant matrices and generalizes to multidimensional asymmetric sequences. The symmetric convolution of multidimensional asymmetric sequences can then be accomplished by taking the product of the trigonometric transforms of the sequences and then applying an inverse trigonometric transform to the result. An example is given of how this theory can be used for applying a two-dimensional (2-D) finite impulse response (FIR) filter with nonlinear phase which models atmospheric turbulence.
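The fact this derivation starts from, that the discrete Fourier transform diagonalizes a circulant matrix, can be checked numerically. The sketch below covers only the ordinary circular case for the DFT, not the symmetric convolution for discrete trigonometric transforms that the paper derives; the function names and the small test vectors are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a list of numbers."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
        for k in range(n)
    ]


def circulant(c):
    """Circulant matrix whose first column is c."""
    n = len(c)
    return [[c[(i - j) % n] for j in range(n)] for i in range(n)]


def circular_convolve(c, x):
    """Circular convolution written as a circulant matrix times a vector."""
    matrix = circulant(c)
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]


c = [1, 2, 0, -1]
x = [3, 0, 1, 2]
direct = circular_convolve(c, x)
# Because the DFT diagonalizes the circulant matrix, the same result is
# obtained by multiplying the two spectra and transforming back.
via_dft = idft([a * b for a, b in zip(dft(c), dft(x))])
print(direct)  # [7, 5, -1, 1]
print(all(abs(d - v) < 1e-9 for d, v in zip(direct, via_dft)))  # True
```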
Franke, Werner W; Heid, Hans; Zimbelmann, Ralf; Kuhn, Caecilia; Winter-Simanowski, Stefanie; Dörflinger, Yvette; Grund, Christine; Rickelt, Steffen
2013-07-01
Protein PERP (p53 apoptosis effector related to PMP-22) is a small (21.4 kDa) transmembrane polypeptide with an amino acid sequence indicative of a tetraspanin character. It is enriched in the plasma membrane and apparently contributes to cell-cell contacts. Hitherto, it has been reported to be exclusively a component of desmosomes of some stratified epithelia. However, by using a series of newly generated mono- and polyclonal antibodies, we show that protein PERP is not only present in all kinds of stratified epithelia but also occurs in simple, columnar, complex and transitional epithelia, in various types of squamous metaplasia and epithelium-derived tumors, in diverse epithelium-derived cell cultures and in myocardial tissue. Immunofluorescence and immunoelectron microscopy allow us to localize PERP predominantly in small intradesmosomal locations and in variously sized, junction-like peri- and interdesmosomal regions ("tessellate junctions"), mostly in mosaic or amalgamated combinations with other molecules believed, to date, to be exclusive components of tight and adherens junctions. In the heart, PERP is a major component of the composite junctions of the intercalated disks connecting cardiomyocytes. Finally, protein PERP is a cobblestone-like general component of special plasma membrane regions such as the bile canaliculi of liver and subapical-to-lateral zones of diverse columnar epithelia and upper urothelial cell layers. We discuss possible organizational and architectonic functions of protein PERP and its potential value as an immunohistochemical diagnostic marker.
2012-01-01
Background Most modern citrus cultivars have an interspecific origin. As a foundational step towards deciphering the interspecific genome structures, a reference whole genome sequence was produced by the International Citrus Genome Consortium from a haploid derived from Clementine mandarin. The availability of a saturated genetic map of Clementine was identified as an essential prerequisite to assist the whole genome sequence assembly. Clementine is believed to be a ‘Mediterranean’ mandarin × sweet orange hybrid, and sweet orange likely arose from interspecific hybridizations between mandarin and pummelo gene pools. The primary goals of the present study were to establish a Clementine reference map using codominant markers, and to perform comparative mapping of pummelo, sweet orange, and Clementine. Results Five parental genetic maps were established from three segregating populations, which were genotyped with Single Nucleotide Polymorphism (SNP), Simple Sequence Repeats (SSR) and Insertion-Deletion (Indel) markers. An initial medium density reference map (961 markers for 1084.1 cM) of the Clementine was established by combining male and female Clementine segregation data. This Clementine map was compared with two pummelo maps and a sweet orange map. The linear order of markers was highly conserved in the different species. However, significant differences in map size were observed, which suggests a variation in the recombination rates. Skewed segregations were much higher in the male than female Clementine mapping data. The mapping data confirmed that Clementine arose from hybridization between ‘Mediterranean’ mandarin and sweet orange. The results identified nine recombination break points for the sweet orange gamete that contributed to the Clementine genome. Conclusions A reference genetic map of citrus, used to facilitate the chromosome assembly of the first citrus reference genome sequence, was established. 
The high conservation of marker order observed at the interspecific level should allow reasonable inferences of most citrus genome sequences by mapping next-generation sequencing (NGS) data in the reference genome sequence. The genome of the haploid Clementine used to establish the citrus reference genome sequence appears to have been inherited primarily from the ‘Mediterranean’ mandarin. The high frequency of skewed allelic segregations in the male Clementine data underline the probable extent of deviation from Mendelian segregation for characters controlled by heterozygous loci in male parents. PMID:23126659
2011-01-01
Background Alfalfa, [Medicago sativa (L.) sativa], a widely-grown perennial forage has potential for development as a cellulosic ethanol feedstock. However, the genomics of alfalfa, a non-model species, is still in its infancy. The recent advent of RNA-Seq, a massively parallel sequencing method for transcriptome analysis, provides an opportunity to expand the identification of alfalfa genes and polymorphisms, and conduct in-depth transcript profiling. Results Cell walls in stems of alfalfa genotype 708 have higher cellulose and lower lignin concentrations compared to cell walls in stems of genotype 773. Using the Illumina GA-II platform, a total of 198,861,304 expression sequence tags (ESTs, 76 bp in length) were generated from cDNA libraries derived from elongating stem (ES) and post-elongation stem (PES) internodes of 708 and 773. In addition, 341,984 ESTs were generated from ES and PES internodes of genotype 773 using the GS FLX Titanium platform. The first alfalfa (Medicago sativa) gene index (MSGI 1.0) was assembled using the Sanger ESTs available from GenBank, the GS FLX Titanium EST sequences, and the de novo assembled Illumina sequences. MSGI 1.0 contains 124,025 unique sequences including 22,729 tentative consensus sequences (TCs), 22,315 singletons and 78,981 pseudo-singletons. We identified a total of 1,294 simple sequence repeats (SSR) among the sequences in MSGI 1.0. In addition, a total of 10,826 single nucleotide polymorphisms (SNPs) were predicted between the two genotypes. Out of 55 SNPs randomly selected for experimental validation, 47 (85%) were polymorphic between the two genotypes. We also identified numerous allelic variations within each genotype. Digital gene expression analysis identified numerous candidate genes that may play a role in stem development as well as candidate genes that may contribute to the differences in cell wall composition in stems of the two genotypes. 
Conclusions Our results demonstrate that RNA-Seq can be successfully used for gene identification, polymorphism detection and transcript profiling in alfalfa, a non-model, allogamous, autotetraploid species. The alfalfa gene index assembled in this study, and the SNPs, SSRs and candidate genes identified can be used to improve alfalfa as a forage crop and cellulosic feedstock. PMID:21504589
Colombo, M M; Swanton, M T; Donini, P; Prescott, D M
1984-01-01
Oxytricha nova is a hypotrichous ciliate with micronuclei and macronuclei. Micronuclei, which contain large, chromosomal-sized DNA, are genetically inert but undergo meiosis and exchange during cell mating. Macronuclei, which contain only small, gene-sized DNA molecules, provide all of the nuclear RNA needed to run the cell. After cell mating the macronucleus is derived from a micronucleus, a derivation that includes excision of the genes from chromosomes and elimination of the remaining DNA. The eliminated DNA includes all of the repetitious sequences and approximately 95% of the unique sequences. We cloned large restriction fragments from the micronucleus that confer replication ability on a replication-deficient plasmid in Saccharomyces cerevisiae. Sequences that confer replication ability are called autonomously replicating sequences. The frequency and effectiveness of autonomously replicating sequences in micronuclear DNA are similar to those reported for DNAs of other organisms introduced into yeast cells. Of the 12 micronuclear fragments with autonomously replicating sequence activity, 9 also showed homology to macronuclear DNA, indicating that they contain a macronuclear gene sequence. We conclude from this that autonomously replicating sequence activity is nonrandomly distributed throughout micronuclear DNA and is preferentially associated with those regions of micronuclear DNA that contain genes. PMID:6092934
Pedersen, Niels; Liu, Hongwei; Millon, Lee; Greer, Kimberly
2011-01-01
A significantly increased risk for a number of autoimmune and infectious diseases in purebred and mixed-breed dogs has been associated with certain alleles or allele combinations of the dog leukocyte antigen (DLA) class II complex containing the DRB1, DQA1, and DQB1 genes. The exact level of risk depends on the specific disease, the alleles in question, and whether alleles exist in a homozygous or heterozygous state. The gold standard for identifying high-risk alleles and their zygosity has involved direct sequencing of the exon 2 regions of each of the 3 genes. However, sequencing and identification of specific alleles at each of the 3 loci are relatively expensive and sequencing techniques are not ideal for additional parentage or identity determination. However, it is often possible to get the same information from sequencing only 1 gene given the small number of possible alleles at each locus in purebred dogs, extensive homozygosity, and tendency for disease-causing alleles at each of the 3 loci to be strongly linked to each other into haplotypes. Therefore, genetic testing in purebred dogs with immune diseases can be often simplified by sequencing alleles at 1 rather than 3 loci. Further simplification of genetic tests for canine immune diseases can be achieved by the use of alternative genetic markers in the DLA class II region that are also strongly linked with the disease genotype. These markers consist of either simple tandem repeats or single nucleotide polymorphisms that are also in strong linkage with specific DLA class II genotypes and/or haplotypes. The current study uses necrotizing meningoencephalitis of Pug dogs as a paradigm to assess simple alternative genetic tests for disease risk. 
It was possible to attain identical necrotizing meningoencephalitis risk assessments to 3-locus DLA class II sequencing by sequencing only the DQB1 gene, using 3 DLA class II-linked simple tandem repeat markers, or with a small single nucleotide polymorphism array designed to identify breed-specific DQB1 alleles.
Revising Star and Planet Formation Timescales
NASA Astrophysics Data System (ADS)
Bell, Cameron P. M.; Naylor, Tim; Mayne, N. J.; Jeffries, R. D.; Littlefair, S. P.
2013-07-01
We have derived ages for 13 young (<30 Myr) star-forming regions and find that they are up to a factor of 2 older than the ages typically adopted in the literature. This result has wide-ranging implications, including that circumstellar discs survive longer (≃ 10-12 Myr) and that the average Class I lifetime is greater (≃1 Myr) than currently believed. For each star-forming region, we derived two ages from colour-magnitude diagrams. First, we fitted models of the evolution between the zero-age main sequence and terminal-age main sequence to derive a homogeneous set of main-sequence ages, distances and reddenings with statistically meaningful uncertainties. Our second age for each star-forming region was derived by fitting pre-main-sequence stars to new semi-empirical model isochrones. For the first time (for a set of clusters younger than 50 Myr), we find broad agreement between these two ages, and since these are derived from two distinct mass regimes that rely on different aspects of stellar physics, it gives us confidence in the new age scale. This agreement is largely due to our adoption of empirical colour-Teff relations and bolometric corrections for pre-main-sequence stars cooler than 4000 K. The revised ages for the star-forming regions in our sample are: 2 Myr for NGC 6611 (Eagle Nebula; M 16), IC 5146 (Cocoon Nebula), NGC 6530 (Lagoon Nebula; M 8) and NGC 2244 (Rosette Nebula); 6 Myr for σ Ori, Cep OB3b and IC 348; ≃10 Myr for λ Ori (Collinder 69); ≃11 Myr for NGC 2169; ≃12 Myr for NGC 2362; ≃13 Myr for NGC 7160; ≃14 Myr for χ Per (NGC 884); and ≃20 Myr for NGC 1960 (M 36).
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis
Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje
2016-01-01
A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we characterize the distribution of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974
Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie
2011-09-12
Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
2011-01-01
Background: Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results: Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions: Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
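The abstract describes searching and counting shared motifs in pairs of genes but does not give the algorithm itself. As a rough illustration only, the sketch below counts fixed-length motifs of one promoter that recur in another within a mismatch tolerance. The function names, the motif length `k`, and the Hamming-distance notion of "similarity" are all assumptions made for this sketch, not the published SSM method.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_shared_motifs(promoter_a, promoter_b, k=6, max_mismatch=0):
    """Count distinct k-mers of promoter_a that occur in promoter_b
    with at most max_mismatch substitutions (hypothetical similarity rule)."""
    b_kmers = kmers(promoter_b, k)
    return sum(
        any(hamming(m, n) <= max_mismatch for n in b_kmers)
        for m in kmers(promoter_a, k)
    )
```

A pair of genes with an unusually high count relative to a background of random pairs would then be a candidate for shared regulation; how "exceptional" is judged is left open here, as it is in the abstract.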
Identification of apple cultivars on the basis of simple sequence repeat markers.
Liu, G S; Zhang, Y G; Tao, R; Fang, J G; Dai, H Y
2014-09-12
DNA markers are useful tools that play an important role in plant cultivar identification. They are usually based on polymerase chain reaction (PCR) and include simple sequence repeats (SSRs), inter-simple sequence repeats, and random amplified polymorphic DNA. However, DNA markers were not used effectively in the complete identification of plant cultivars because of the lack of known DNA fingerprints. Recently, a novel approach called the cultivar identification diagram (CID) strategy was developed to facilitate the use of DNA markers to separate individual plants. The CID was designed so that a polymorphic marker generated from each PCR directly allows cultivar samples to be separated at each step. Therefore, it could be used to identify cultivars and varieties easily with fewer primers. In this study, 60 apple cultivars, including a few main cultivars in fields and varieties from descendants (Fuji x Telamon), were examined. Of the 20 pairs of SSR primers screened, 8 pairs gave reproducible, polymorphic DNA amplification patterns. The banding patterns obtained from these 8 primers were used to construct a CID map. Each cultivar or variety in this study was distinguished from the others completely, indicating that this method can be used for efficient cultivar identification. The result contributed to studies on germplasm resources and the seedling industry in fruit trees.
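The stepwise separation idea behind a CID can be pictured as choosing primers one at a time until every cultivar has a unique combined banding pattern. The sketch below is a minimal greedy version under that assumption; the function name, the data layout, and the greedy selection rule are illustrative and are not taken from the study.

```python
def distinguishing_primers(patterns):
    """patterns maps cultivar -> {primer: banding pattern}.
    Greedily pick primers until every cultivar has a unique fingerprint."""
    cultivars = list(patterns)
    primers = sorted({p for bands in patterns.values() for p in bands})
    chosen = []

    def n_groups(selected):
        # Number of distinct combined fingerprints under the selected primers.
        return len({tuple(patterns[c][p] for p in selected) for c in cultivars})

    while n_groups(chosen) < len(cultivars):
        remaining = [p for p in primers if p not in chosen]
        best = max(remaining, key=lambda p: n_groups(chosen + [p]), default=None)
        if best is None or n_groups(chosen + [best]) == n_groups(chosen):
            raise ValueError("remaining primers cannot separate all cultivars")
        chosen.append(best)
    return chosen


# Hypothetical banding patterns for four cultivars and three primer pairs.
example = {
    "A": {"P1": "10", "P2": "1", "P3": "0"},
    "B": {"P1": "10", "P2": "0", "P3": "0"},
    "C": {"P1": "01", "P2": "1", "P3": "0"},
    "D": {"P1": "01", "P2": "0", "P3": "0"},
}
selected = distinguishing_primers(example)
```

In this toy data the uninformative primer `P3` is never selected, which mirrors the stated aim of identifying cultivars with fewer primers.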
40 CFR 86.230-94 - Test sequence: general requirements.
Code of Federal Regulations, 2010 CFR
2010-07-01
... testing. (2) The ambient temperature reported shall be a simple average of the test cell temperatures... cell temperature shall be 20 °F±3 °F (−7 °C±1.7 °C) when measured in accordance with paragraph (e)(2... approximately level during all phases of the test sequence to prevent abnormal fuel distribution. (e) Engine...
Cross-species transferability and mapping of genomic and cDNA SSRs in pines
D. Chagne; P. Chaumeil; A. Ramboer; C. Collada; A. Guevara; M. T. Cervera; G. G. Vendramin; V. Garcia; J-M. Frigerio; Craig Echt; T. Richardson; Christophe Plomion
2004-01-01
Two unigene datasets of Pinus taeda and Pinus pinaster were screened to detect di-, tri- and tetranucleotide repeated motifs using the SSRIT script. A total of 419 simple sequence repeats (SSRs) were identified, of which only 12.8% overlapped between the two sets. The positions of the SSRs within the coding sequence were predicted...
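Screening sequences for di-, tri- and tetranucleotide repeats can be sketched with regular-expression backreferences. The code below is not the SSRIT script; the minimum repeat counts in `MIN_REPEATS` and the function name are assumptions chosen for illustration.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(unit):
    """True if unit is not itself a repetition of a shorter motif."""
    return (unit + unit).find(unit, 1) == len(unit)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # Group 2 captures the motif; \2{n,} demands further tandem copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(2)
            if is_primitive(unit):  # skip AAAA, ACAC and similar
                hits.append((m.start(), unit, len(m.group(1)) // unit_len))
    return sorted(hits)
```

The `is_primitive` check prevents a dinucleotide run such as (AC)n from also being reported as a tetranucleotide repeat of ACAC.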
Genetic variation patterns of American chestnut populations at EST-SSRs
Oliver Gailing; C. Dana Nelson
2017-01-01
The objective of this study is to analyze patterns of genetic variation at genic expressed sequence tag - simple sequence repeats (EST-SSRs) and at chloroplast DNA markers in populations of American chestnut (Castanea dentata Borkh.) to assist in conservation and breeding efforts. Allelic diversity at EST-SSRs decreased significantly from southwest to northeast along...
ERIC Educational Resources Information Center
Viczko, Jeremy; Sergeeva, Valya; Ray, Laura B.; Owen, Adrian M.; Fogel, Stuart M.
2018-01-01
Sleep facilitates the consolidation (i.e., enhancement) of simple, explicit (i.e., conscious) motor sequence learning (MSL). MSL can be dissociated into egocentric (i.e., motor) or allocentric (i.e., spatial) frames of reference. The consolidation of the allocentric memory representation is sleep-dependent, whereas the egocentric consolidation…
ERIC Educational Resources Information Center
Kundey, Shannon M. A.; Strandell, Brittany; Mathis, Heather; Rowan, James D.
2010-01-01
Hulse and Dorsky (1977, 1979) found that rats, like humans, learn sequences following a simple rule-based structure more quickly than those lacking a rule-based structure. Through two experiments, we explored whether two additional species--domesticated horses ("Equus caballus") and chickens ("Gallus domesticus")--would…
The determination of high-resolution spatio-temporal glacier motion fields from time-lapse sequences
NASA Astrophysics Data System (ADS)
Schwalbe, Ellen; Maas, Hans-Gerd
2017-12-01
This paper presents a comprehensive method for the determination of glacier surface motion vector fields at high spatial and temporal resolution. These vector fields can be derived from monocular terrestrial camera image sequences and are a valuable data source for glaciological analysis of the motion behaviour of glaciers. The measurement concepts for the acquisition of image sequences are presented, and an automated monoscopic image sequence processing chain is developed. Motion vector fields can be derived with high precision by applying automatic subpixel-accuracy image matching techniques on grey value patterns in the image sequences. Well-established matching techniques have been adapted to the special characteristics of the glacier data in order to achieve high reliability in automatic image sequence processing, including the handling of moving shadows as well as motion effects induced by small instabilities in the camera set-up. Suitable geo-referencing techniques were developed to transform image measurements into a reference coordinate system. The result of monoscopic image sequence analysis is a dense raster of glacier surface point trajectories for each image sequence. Each translation vector component in these trajectories can be determined with an accuracy of a few centimetres for points at a distance of several kilometres from the camera. Extensive practical validation experiments have shown that motion vector and trajectory fields derived from monocular image sequences can be used for the determination of high-resolution velocity fields of glaciers, including the analysis of tidal effects on glacier movement, the investigation of a glacier's motion behaviour during calving events, the determination of the position and migration of the grounding line and the detection of subglacial channels during glacier lake outburst floods.
A better sequence-read simulator program for metagenomics.
Johnson, Stephen; Trost, Brett; Long, Jeffrey R; Pittet, Vanessa; Kusalik, Anthony
2014-01-01
There are many programs available for generating simulated whole-genome shotgun sequence reads. The data generated by many of these programs follow predefined models, which limits their use to the authors' original intentions. For example, many models assume that read lengths follow a uniform or normal distribution. Other programs generate models from actual sequencing data, but are limited to reads from single-genome studies. To our knowledge, there are no programs that allow a user to generate simulated data following non-parametric read-length distributions and quality profiles based on empirically-derived information from metagenomics sequencing data. We present BEAR (Better Emulation for Artificial Reads), a program that uses a machine-learning approach to generate reads with lengths and quality values that closely match empirically-derived distributions. BEAR can emulate reads from various sequencing platforms, including Illumina, 454, and Ion Torrent. BEAR requires minimal user input, as it automatically determines appropriate parameter settings from user-supplied data. BEAR also uses a unique method for deriving run-specific error rates, and extracts useful statistics from the metagenomic data itself, such as quality-error models. Many existing simulators are specific to a particular sequencing technology; however, BEAR is not restricted in this way. Because of its flexibility, BEAR is particularly useful for emulating the behaviour of technologies like Ion Torrent, for which no dedicated sequencing simulators are currently available. BEAR is also the first metagenomic sequencing simulator program that automates the process of generating abundances, which can be an arduous task. BEAR is useful for evaluating data processing tools in genomics. It has many advantages over existing comparable software, such as generating more realistic reads and being independent of sequencing technology, and has features particularly useful for metagenomics work.
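To make the contrast with uniform or normal read-length models concrete, the sketch below draws simulated read lengths directly from an observed (empirical) distribution. It is a plain weighted resampling, not BEAR's machine-learning approach, and the function name and arguments are assumptions for illustration.

```python
import random
from collections import Counter


def empirical_length_sampler(observed_lengths, seed=None):
    """Build a sampler that reproduces the observed read-length frequencies."""
    counts = Counter(observed_lengths)
    lengths = sorted(counts)
    weights = [counts[n] for n in lengths]
    rng = random.Random(seed)

    def sample(n_reads):
        # Weighted draw with replacement; no parametric shape is imposed.
        return rng.choices(lengths, weights=weights, k=n_reads)

    return sample


# Hypothetical observed lengths from a sequencing run.
sample = empirical_length_sampler([100, 100, 150, 250], seed=1)
simulated = sample(1000)
```

Because only observed lengths can be drawn, multimodal or skewed distributions are preserved without any model fitting.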
Characterization of circulating transfer RNA-derived RNA fragments in cattle
Casas, Eduardo; Cai, Guohong; Neill, John D.
2015-01-01
The objective was to characterize naturally occurring circulating transfer RNA-derived RNA fragments (tRFs) in cattle. Serum from eight clinically normal adult dairy cows was collected, and small non-coding RNAs were extracted immediately after collection and sequenced by Illumina MiSeq. Sequences aligned to transfer RNA (tRNA) genes or their flanking sequences were characterized. Sequences aligned to the beginning of the 5′ end of the mature tRNA were classified as tRF5; those aligned to the 3′ end of mature tRNA were classified as tRF3; and those aligned to the beginning of the 3′ end flanking sequences were classified as tRF1. There were 3,190,962 sequences that mapped to transfer RNA and small non-coding RNAs in the bovine genome. Of these, 2,323,520 were identified as tRF5s, 562 were tRF3s, and 81 were tRF1s. There were 866,799 sequences identified as other small non-coding RNAs (microRNA, rRNA, snoRNA, etc.) and were excluded from the study. The tRF5s ranged from 28 to 40 nucleotides; and 98.7% ranged from 30 to 34 nucleotides in length. The tRFs with the greatest number of sequences were derived from tRNA of histidine, glutamic acid, lysine, glycine, and valine. There was no association between number of codons for each amino acid and number of tRFs in the samples. The reason for tRF5s being the most abundant can only be explained if these sequences are associated with function within the animal. PMID:26379699
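The read counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
# Counts as quoted in the abstract.
counts = {
    "tRF5": 2_323_520,
    "tRF3": 562,
    "tRF1": 81,
    "other small ncRNA": 866_799,
}

total_mapped = sum(counts.values())
trf_total = counts["tRF5"] + counts["tRF3"] + counts["tRF1"]
trf5_share = counts["tRF5"] / trf_total

print(total_mapped)          # matches the stated 3,190,962 mapped sequences
print(f"{trf5_share:.4%}")   # tRF5 share of all tRFs
```

The four categories sum exactly to the stated total, and tRF5s account for more than 99.9% of all tRFs.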
Demetriou, Eleni; Tachrount, Mohamed; Zaiss, Moritz; Shmueli, Karin; Golay, Xavier
2018-03-05
To develop a new MRI technique to rapidly measure exchange rates in CEST MRI. A novel pulse sequence for measuring chemical exchange rates through a progressive saturation recovery process, called PRO-QUEST (progressive saturation for quantifying exchange rates using saturation times), has been developed. Using this method, the water magnetization is sampled under non-steady-state conditions, and off-resonance saturation is interleaved with the acquisition of images obtained through a Look-Locker type of acquisition. A complete theoretical framework has been set up, and simple equations to obtain the exchange rates have been derived. A reduction of scan time from 58 to 16 minutes has been obtained using PRO-QUEST versus the standard QUEST. Maps of both T1 of water and B1 can simply be obtained by repetition of the sequence without off-resonance saturation pulses. Simulations and calculated exchange rates from experimental data using amino acids such as glutamate, glutamine, taurine, and alanine were compared and found to be in good agreement. The PRO-QUEST sequence was also applied on healthy and infarcted rats after 24 hours, and revealed that imaging specificity to ischemic acidification during stroke was substantially increased relative to standard amide proton transfer-weighted imaging. Because of the reduced scan time and insensitivity to nonchemical exchange factors such as direct water saturation, PRO-QUEST can serve as an excellent alternative for researchers and clinicians interested in mapping pH changes in vivo. © 2018 International Society for Magnetic Resonance in Medicine.
Huh, T L; Ryu, J H; Huh, J W; Sung, H C; Oh, I U; Song, B J; Veech, R L
1993-01-01
Mitochondrial NADP(+)-specific isocitrate dehydrogenase (IDP) was co-purified with the pyruvate dehydrogenase complex from bovine kidney mitochondria. The determination of its N-terminal 16-amino-acid sequence revealed that it is highly similar to the IDP from yeast. A cDNA clone (1.8 kb long) encoding this protein was isolated from a bovine kidney lambda gt11 cDNA library using a synthetic oligodeoxynucleotide. The deduced protein sequence of this cDNA clone rendered a precursor protein of 452 amino-acid residues (50,830 Da) and a mature protein of 413 amino-acid residues (46,519 Da). It is 100% identical to the internal tryptic peptide sequences of the autologous form from pig heart and 62% similar to that from yeast. However, it shares little similarity with the mitochondrial NAD(+)-specific isoenzyme from yeast. Structural analyses of the deduced proteins of IDP isoenzymes from different species indicated that similarity exists in certain regions, which may represent the common domains for the active sites or coenzyme-binding sites. In Northern-blot analysis, one species of mRNA (about 2.2 kb for both bovine and human) was hybridized with a 32P-labelled cDNA probe. Southern-blot analysis of genomic DNAs verified simple patterns of hybridization with this cDNA. These results strongly indicate that the mitochondrial IDP may be derived from a single gene family which does not appear to be closely related to that of the NAD(+)-specific isoenzyme. PMID:8318002
Construction of a reference genetic linkage map for carnation (Dianthus caryophyllus L.)
2013-01-01
Background: Genetic linkage maps are important tools for many genetic applications including mapping of quantitative trait loci (QTLs), identifying DNA markers for fingerprinting, and map-based gene cloning. Carnation (Dianthus caryophyllus L.) is an important ornamental flower worldwide. We previously reported a random amplified polymorphic DNA (RAPD)-based genetic linkage map derived from Dianthus capitatus ssp. andrezejowskianus and a simple sequence repeat (SSR)-based genetic linkage map constructed using data from intraspecific F2 populations; however, the number of markers was insufficient, and so the number of linkage groups (LGs) did not coincide with the number of chromosomes (x = 15). Therefore, we aimed to produce a high-density genetic map to improve its usefulness for breeding purposes and genetic research. Results: We improved the SSR-based genetic linkage map using SSR markers derived from a genomic library, expression sequence tags, and RNA-seq data. Linkage analysis revealed that 412 SSR loci (including 234 newly developed SSR loci) could be mapped to 17 linkage groups (LGs) covering 969.6 cM. Comparison of five minor LGs covering less than 50 cM with LGs in our previous RAPD-based genetic map suggested that four LGs could be integrated into two LGs by anchoring common SSR loci. Consequently, the number of LGs corresponded to the number of chromosomes (x = 15). We added 192 new SSRs, eight RAPD, and two sequence-tagged site loci to refine the RAPD-based genetic linkage map, which comprised 15 LGs consisting of 348 loci covering 978.3 cM. The two maps had 125 SSR loci in common, and most of the positions of markers were conserved between them. We identified 635 loci in carnation using the two linkage maps. We also mapped QTLs for two traits (bacterial wilt resistance and anthocyanin pigmentation in the flower) and a phenotypic locus for flower-type by analyzing previously reported genotype and phenotype data.
Conclusions: The improved genetic linkage maps and SSR markers developed in this study will serve as reference genetic linkage maps for members of the genus Dianthus, including carnation, and will be useful for mapping QTLs associated with various traits, and for improving carnation breeding programs. PMID:24160306
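The locus and linkage-group totals in this abstract follow from simple inclusion-exclusion, assuming the 125 shared SSR loci are the only loci present on both maps. The sketch below reproduces the two totals from the quoted figures; variable names are illustrative.

```python
# Loci on each map and loci common to both, as quoted in the abstract.
ssr_map_loci = 412
rapd_map_loci = 348
shared_ssr_loci = 125

# Inclusion-exclusion: count shared loci once.
unique_loci = ssr_map_loci + rapd_map_loci - shared_ssr_loci

# Four minor linkage groups are integrated into two, removing two groups.
initial_lgs = 17
final_lgs = initial_lgs - (4 - 2)

print(unique_loci, final_lgs)
```

Both results agree with the abstract: 635 loci across the two maps, and 15 linkage groups matching the chromosome number (x = 15).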
Construction of a reference genetic linkage map for carnation (Dianthus caryophyllus L.).
Yagi, Masafumi; Yamamoto, Toshiya; Isobe, Sachiko; Hirakawa, Hideki; Tabata, Satoshi; Tanase, Koji; Yamaguchi, Hiroyasu; Onozaki, Takashi
2013-10-26
Genetic linkage maps are important tools for many genetic applications including mapping of quantitative trait loci (QTLs), identifying DNA markers for fingerprinting, and map-based gene cloning. Carnation (Dianthus caryophyllus L.) is an important ornamental flower worldwide. We previously reported a random amplified polymorphic DNA (RAPD)-based genetic linkage map derived from Dianthus capitatus ssp. andrezejowskianus and a simple sequence repeat (SSR)-based genetic linkage map constructed using data from intraspecific F2 populations; however, the number of markers was insufficient, and so the number of linkage groups (LGs) did not coincide with the number of chromosomes (x = 15). Therefore, we aimed to produce a high-density genetic map to improve its usefulness for breeding purposes and genetic research. We improved the SSR-based genetic linkage map using SSR markers derived from a genomic library, expression sequence tags, and RNA-seq data. Linkage analysis revealed that 412 SSR loci (including 234 newly developed SSR loci) could be mapped to 17 linkage groups (LGs) covering 969.6 cM. Comparison of five minor LGs covering less than 50 cM with LGs in our previous RAPD-based genetic map suggested that four LGs could be integrated into two LGs by anchoring common SSR loci. Consequently, the number of LGs corresponded to the number of chromosomes (x = 15). We added 192 new SSRs, eight RAPD, and two sequence-tagged site loci to refine the RAPD-based genetic linkage map, which comprised 15 LGs consisting of 348 loci covering 978.3 cM. The two maps had 125 SSR loci in common, and most of the positions of markers were conserved between them. We identified 635 loci in carnation using the two linkage maps. We also mapped QTLs for two traits (bacterial wilt resistance and anthocyanin pigmentation in the flower) and a phenotypic locus for flower-type by analyzing previously reported genotype and phenotype data. 
The improved genetic linkage maps and SSR markers developed in this study will serve as reference genetic linkage maps for members of the genus Dianthus, including carnation, and will be useful for mapping QTLs associated with various traits, and for improving carnation breeding programs.
Transcriptome-enabled marker discovery and mapping of plastochron-related genes in Petunia spp.
Guo, Yufang; Wiegert-Rininger, Krystle E; Vallejo, Veronica A; Barry, Cornelius S; Warner, Ryan M
2015-09-24
Petunia (Petunia × hybrida), derived from a hybrid between P. axillaris and P. integrifolia, is one of the most economically important bedding plant crops and Petunia spp. serve as model systems for investigating the mechanisms underlying diverse mating systems and pollination syndromes. In addition, we have previously described genetic variation and quantitative trait loci (QTL) related to petunia development rate and morphology, which represent important breeding targets for the floriculture industry to improve crop production and performance. Despite the importance of petunia as a crop, the floriculture industry has been slow to adopt marker assisted selection to facilitate breeding strategies and there remains a limited availability of sequences and molecular markers from the genus compared to other economically important members of the Solanaceae family such as tomato, potato and pepper. Here we report the de novo assembly, annotation and characterization of transcriptomes from P. axillaris, P. exserta and P. integrifolia. Each transcriptome assembly was derived from five tissue libraries (callus, 3-week old seedlings, shoot apices, flowers of mixed developmental stages, and trichomes). A total of 74,573, 54,913, and 104,739 assembled transcripts were recovered from P. axillaris, P. exserta and P. integrifolia, respectively and following removal of multiple isoforms, 32,994 P. axillaris, 30,225 P. exserta, and 33,540 P. integrifolia high quality representative transcripts were extracted for annotation and expression analysis. The transcriptome data was mined for single nucleotide polymorphisms (SNP) and simple sequence repeat (SSR) markers, yielding 89,007 high quality SNPs and 2949 SSRs, respectively. 15,701 SNPs were computationally converted into user-friendly cleaved amplified polymorphic sequence (CAPS) markers and a subset of SNP and CAPS markers were experimentally verified. CAPS markers developed from plastochron-related homologous transcripts from P. axillaris were mapped in an interspecific Petunia population and evaluated for co-localization with QTL for development rate. The high quality of the three Petunia spp. transcriptomes coupled with the utility of the SNP data will serve as a resource for further exploration of genetic diversity within the genus and will facilitate efforts to develop genetic and physical maps to aid the identification of QTL associated with traits of interest.
Abraham, A D; Menzel, W; Varrelmann, M; Vetten, H Josef
2009-01-01
Chickpea chlorotic stunt virus (CpCSV), a proposed new member of the genus Polerovirus (family Luteoviridae), has been reported only from Ethiopia. In attempts to determine the geographical distribution and variability of CpCSV, a pair of degenerate primers derived from conserved domains of the luteovirus coat protein (CP) gene was used for RT-PCR analysis of various legume samples originating from five countries and containing unidentified luteoviruses. Sequencing of the amplicons provided evidence for the occurrence of CpCSV also in Egypt, Morocco, Sudan, and Syria. Phylogenetic analysis of the CP nucleotide sequences of 18 samples from the five countries revealed the existence of two geographic groups of CpCSV isolates differing in CP sequences by 8-10%. Group I included isolates from Ethiopia and Sudan, while group II comprised those from Egypt, Morocco and Syria. For distinguishing these two groups, a simple RFLP test using HindIII and/or PvuII for cleavage of CP-gene-derived PCR products was developed. In ELISA and immunoelectron microscopy, however, isolates from these two groups could not be distinguished with rabbit antisera raised against a group-I isolate from Ethiopia (CpCSV-Eth) and a group-II isolate from Syria (CpCSV-Sy). Since none of the ten monoclonal antibodies (MAbs) that had been produced earlier against CpCSV-Eth reacted with group-II isolates, further MAbs were produced. Of the seven MAbs raised against CpCSV-Sy, two reacted only with CpCSV-Sy and two others with both CpCSV-Sy and -Eth. This indicated that there are group I- and II-specific and common (species-specific) epitopes on the CpCSV CP and that the corresponding MAbs are suitable for specific detection and discrimination of CpCSV isolates. Moreover, CpCSV-Sy (group II) caused more severe stunting and yellowing in faba bean than CpCSV-Eth (group I). 
In conclusion, our data indicate the existence of a geographically associated variation in the molecular, serological and presumably biological properties of CpCSV.
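An RFLP test of this kind compares the fragment sizes produced when a PCR product is cut at restriction sites. The sketch below performs an in-silico digest using the standard recognition sequences of HindIII (A^AGCTT) and PvuII (CAG^CTG); the function name and the example amplicon are hypothetical, and the abstract does not state which enzyme cuts which isolate group.

```python
# Recognition sequence and cut offset within the site (top strand).
SITES = {
    "HindIII": ("AAGCTT", 1),  # cuts after the first A
    "PvuII": ("CAGCTG", 3),    # blunt cut between G and C
}


def digest(seq, enzyme):
    """Return fragment lengths after cutting a linear sequence with enzyme."""
    site, cut = SITES[enzyme]
    seq = seq.upper()
    lengths, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        lengths.append(pos + cut - start)
        start = pos + cut
        pos = seq.find(site, pos + 1)
    lengths.append(len(seq) - start)
    return lengths


# Hypothetical 36-nt amplicon containing one HindIII site and no PvuII site.
amplicon = "G" * 10 + "AAGCTT" + "C" * 20
hindiii_fragments = digest(amplicon, "HindIII")
pvuii_fragments = digest(amplicon, "PvuII")
```

Two isolates whose amplicons differ in the presence of a site would give different fragment-length lists, which is what a gel-based RFLP test reads out.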
Linder, P; Dölz, R; Mossé, M O; Lazowska, J; Slonimski, P P
1993-01-01
The amount of nucleotide sequence data is increasing exponentially. We therefore made an effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. Each sequence has been attributed a single genetic name and in the case of allelic duplicated sequences, synonyms are given, if necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have sorted out a lot of confusion in the literature and databanks. Along with the genetic name, the mnemonic from the EMBL databank, the codon bias, reference of the publication of the sequence and the EMBL accession numbers are included in each entry. PMID:8332521
Sequential de novo centromere formation and inactivation on a chromosomal fragment in maize.
Liu, Yalin; Su, Handong; Pang, Junling; Gao, Zhi; Wang, Xiu-Jie; Birchler, James A; Han, Fangpu
2015-03-17
The ability of centromeres to alternate between active and inactive states indicates significant epigenetic aspects controlling centromere assembly and function. In maize (Zea mays), misdivision of the B chromosome centromere on a translocation with the short arm of chromosome 9 (TB-9Sb) can produce many variants with varying centromere sizes and centromeric DNA sequences. In such derivatives of TB-9Sb, we found a de novo centromere on chromosome derivative 3-3, which has no canonical centromeric repeat sequences. This centromere is derived from a 288-kb region on the short arm of chromosome 9, and is 19 megabases (Mb) removed from the translocation breakpoint of chromosome 9 in TB-9Sb. The functional B centromere in progenitor telo2-2 is deleted from derivative 3-3, but some B-repeat sequences remain. The de novo centromere of derivative 3-3 becomes inactive in three further derivatives with new centromeres being formed elsewhere on each chromosome. Our results suggest that de novo centromere initiation is quite common and can persist on chromosomal fragments without a canonical centromere. However, we hypothesize that when de novo centromeres are initiated in opposition to a larger normal centromere, they are cleared from the chromosome by inactivation, thus maintaining karyotype integrity.
Sequential de novo centromere formation and inactivation on a chromosomal fragment in maize
Liu, Yalin; Su, Handong; Pang, Junling; Gao, Zhi; Wang, Xiu-Jie; Birchler, James A.; Han, Fangpu
2015-01-01
The ability of centromeres to alternate between active and inactive states indicates significant epigenetic aspects controlling centromere assembly and function. In maize (Zea mays), misdivision of the B chromosome centromere on a translocation with the short arm of chromosome 9 (TB-9Sb) can produce many variants with varying centromere sizes and centromeric DNA sequences. In such derivatives of TB-9Sb, we found a de novo centromere on chromosome derivative 3-3, which has no canonical centromeric repeat sequences. This centromere is derived from a 288-kb region on the short arm of chromosome 9, and is 19 megabases (Mb) removed from the translocation breakpoint of chromosome 9 in TB-9Sb. The functional B centromere in progenitor telo2-2 is deleted from derivative 3-3, but some B-repeat sequences remain. The de novo centromere of derivative 3-3 becomes inactive in three further derivatives with new centromeres being formed elsewhere on each chromosome. Our results suggest that de novo centromere initiation is quite common and can persist on chromosomal fragments without a canonical centromere. However, we hypothesize that when de novo centromeres are initiated in opposition to a larger normal centromere, they are cleared from the chromosome by inactivation, thus maintaining karyotype integrity. PMID:25733907
Solis, Armando D
2014-01-01
The most informative probability distribution functions (PDFs) describing the Ramachandran phi-psi dihedral angle pair, a fundamental descriptor of backbone conformation of protein molecules, are derived from high-resolution X-ray crystal structures using an information-theoretic approach. The Information Maximization Device (IMD) is established, based on fundamental information-theoretic concepts, and then applied specifically to derive highly resolved phi-psi maps for all 20 single amino acid and all 8000 triplet sequences at an optimal resolution determined by the volume of current data. The paper shows that utilizing the latent information contained in all viable high-resolution crystal structures found in the Protein Data Bank (PDB), totaling more than 77,000 chains, permits the derivation of a large number of optimized sequence-dependent PDFs. This work demonstrates the effectiveness of the IMD and the superiority of the resulting PDFs by extensive fold recognition experiments and rigorous comparisons with previously published triplet PDFs. Because it automatically optimizes PDFs, IMD results in improved performance of knowledge-based potentials, which rely on such PDFs. Furthermore, it provides an easy computational recipe for empirically deriving other kinds of sequence-dependent structural PDFs with greater detail and precision. The high-resolution phi-psi maps derived in this work are available for download.
Torrent, C; Gabus, C; Darlix, J L
1994-02-01
Retroviral genomes consist of two identical RNA molecules associated at their 5' ends by the dimer linkage structure located in the packaging element (Psi or E) necessary for RNA dimerization in vitro and packaging in vivo. In murine leukemia virus (MLV)-derived vectors designed for gene transfer, the Psi + sequence of 600 nucleotides directs the packaging of recombinant RNAs into MLV virions produced by helper cells. By using in vitro RNA dimerization as a screening system, a sequence of rat VL30 RNA located next to the 5' end of the Harvey mouse sarcoma virus genome and as small as 67 nucleotides was found to form stable dimeric RNA. In addition, a purine-rich sequence located at the 5' end of this VL30 RNA seems to be critical for RNA dimerization. When this VL30 element was extended by 107 nucleotides at its 3' end and inserted into an MLV-derived vector lacking MLV Psi +, it directed the efficient encapsidation of recombinant RNAs into MLV virions. Because this VL30 packaging signal is smaller and more efficient in packaging recombinant RNAs than the MLV Psi + and does not contain gag or glyco-gag coding sequences, its use in MLV-derived vectors should render even more unlikely recombinations which could generate replication-competent viruses. Therefore, utilization of the rat VL30 packaging sequence should improve the biological safety of MLV vectors for human gene transfer.
Gilchrist, Anthony Stuart; Shearman, Deborah C A; Frommer, Marianne; Raphael, Kathryn A; Deshpande, Nandan P; Wilkins, Marc R; Sherwin, William B; Sved, John A
2014-12-20
The tephritid fruit flies include a number of economically important pests of horticulture, with a large accumulated body of research on their biology and control. Amongst the Tephritidae, the genus Bactrocera, containing over 400 species, presents various species groups of potential utility for genetic studies of speciation, behaviour or pest control. In Australia, there exists a triad of closely-related, sympatric Bactrocera species which do not mate in the wild but which, despite distinct morphologies and behaviours, can be force-mated in the laboratory to produce fertile hybrid offspring. To exploit the opportunities offered by genomics, such as the efficient identification of genetic loci central to pest behaviour and to the earliest stages of speciation, investigators require genomic resources for future investigations. We produced a draft de novo genome assembly of Australia's major tephritid pest species, Bactrocera tryoni. The male genome (650-700 Mbp) includes approximately 150 Mb of interspersed repetitive DNA sequences and 60 Mb of satellite DNA. Assessment using conserved core eukaryotic sequences indicated 98% completeness. Over 16,000 MAKER-derived gene models showed a large degree of overlap with other Dipteran reference genomes. The sequence of the ribosomal RNA transcribed unit was also determined. Unscaffolded assemblies of B. neohumeralis and B. jarvisi were then produced; comparison with B. tryoni showed that the species are more closely related than any Drosophila species pair. The similarity of the genomes was exploited to identify 4924 potentially diagnostic indels between the species, all of which occur in non-coding regions. This first draft B. tryoni genome resembles other dipteran genomes in terms of size and putative coding sequences. 
For all three species included in this study, we have identified a comprehensive set of non-redundant repetitive sequences, including the ribosomal RNA unit, and have quantified the major satellite DNA families. These genetic resources will facilitate the further investigations of genetic mechanisms responsible for the behavioural and morphological differences between these three species and other tephritids. We have also shown how whole genome sequence data can be used to generate simple diagnostic tests between very closely-related species where only one of the species is scaffolded.
NASA Technical Reports Server (NTRS)
Deshpande, M. D.
1997-01-01
The dyadic Green's function for an electric current source placed in a rectangular waveguide is derived using a magnetic vector potential approach. A complete solution for the electric and magnetic fields including the source location is obtained by simple differentiation of the vector potential around the source location. The simple differentiation approach which gives electric and magnetic fields identical to an earlier derivation is overlooked by the earlier workers in the derivation of the dyadic Green's function particularly around the source location. Numerical results obtained using the Green's function approach are compared with the results obtained using the Finite Element Method (FEM).
NASA Technical Reports Server (NTRS)
Campbell, John P; Mckinney, Marion O
1952-01-01
A summary of methods for making dynamic lateral stability and response calculations and for estimating the aerodynamic stability derivatives required for use in these calculations is presented. The processes of performing calculations of the time histories of lateral motions, of the period and damping of these motions, and of the lateral stability boundaries are presented as a series of simple straightforward steps. Existing methods for estimating the stability derivatives are summarized and, in some cases, simple new empirical formulas are presented. Detailed estimation methods are presented for low-subsonic-speed conditions but only a brief discussion and a list of references are given for transonic and supersonic speed conditions.
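The period and damping of an oscillatory lateral mode follow directly from the complex root pair of the characteristic equation. A minimal sketch, assuming the root is already known as sigma ± i·omega (the function and argument names are illustrative and do not come from the report):

```python
import math


def period_and_time_to_half(sigma: float, omega: float) -> tuple[float, float]:
    """Return (period, time to half or double amplitude) for a root sigma +/- i*omega.

    sigma: real part of the root in 1/s (negative for a damped motion).
    omega: imaginary part of the root in rad/s (must be positive).
    The second value is the time to half amplitude when sigma < 0 and
    the time to double amplitude when sigma > 0.
    """
    if omega <= 0:
        raise ValueError("omega must be positive for an oscillatory mode")
    if sigma == 0:
        raise ValueError("sigma is zero: the amplitude neither grows nor decays")
    period = 2.0 * math.pi / omega
    time_to_half_or_double = math.log(2.0) / abs(sigma)
    return period, time_to_half_or_double


if __name__ == "__main__":
    # Hypothetical lightly damped lateral oscillation.
    print(period_and_time_to_half(sigma=-0.5, omega=2.0))
```

The sign of sigma tells a caller whether the second value is a halving or a doubling time; a stability boundary corresponds to sigma crossing zero.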
Clevenger, Tracy N.; Hinman, Cassidy R.; Ashley Rubin, Rebekah K.; Smither, Kate; Burke, Daniel J.; Hawker, Craig J.; Messina, Darin; Van Epps, Dennis
2016-01-01
Soft tissue defects are relatively common, yet currently used reconstructive treatments have varying success rates, and serious potential complications such as unpredictable volume loss and reabsorption. Human adipose-derived stem cells (ASCs), isolated from liposuction aspirate have great potential for use in soft tissue regeneration, especially when combined with a supportive scaffold. To design scaffolds that promote differentiation of these cells down an adipogenic lineage, we characterized changes in the surrounding extracellular environment during adipogenic differentiation. We found expression changes in both extracellular matrix proteins, including increases in expression of collagen-IV and vitronectin, as well as changes in the integrin expression profile, with an increase in expression of integrins such as αVβ5 and α1β1. These integrins are known to specifically interact with vitronectin and collagen-IV, respectively, through binding to an Arg-Gly-Asp (RGD) sequence. When three different short RGD-containing peptides were incorporated into three-dimensional (3D) hydrogel cultures, it was found that an RGD-containing peptide derived from vitronectin provided strong initial attachment, maintained the desired morphology, and created optimal conditions for in vitro 3D adipogenic differentiation of ASCs. These results describe a simple, nontoxic encapsulating scaffold, capable of supporting the survival and desired differentiation of ASCs for the treatment of soft tissue defects. PMID:26956095
Does the cost function matter in Bayes decision rule?
Schlüter, Ralf; Nussbaum-Thom, Markus; Ney, Hermann
2012-02-01
In many tasks in pattern recognition, such as automatic speech recognition (ASR), optical character recognition (OCR), part-of-speech (POS) tagging, and other string recognition tasks, we are faced with a well-known inconsistency: The Bayes decision rule is usually used to minimize string (symbol sequence) error, whereas, in practice, we want to minimize symbol (word, character, tag, etc.) error. When comparing different recognition systems, we do indeed use symbol error rate as an evaluation measure. The topic of this work is to analyze the relation between string (i.e., 0-1) and symbol error (i.e., metric, integer valued) cost functions in the Bayes decision rule, for which fundamental analytic results are derived. Simple conditions are derived for which the Bayes decision rule with integer-valued metric cost function and with 0-1 cost gives the same decisions or leads to classes with limited cost. The corresponding conditions can be tested with complexity linear in the number of classes. The results obtained do not make any assumption w.r.t. the structure of the underlying distributions or the classification problem. Nevertheless, the general analytic results are analyzed via simulations of string recognition problems with Levenshtein (edit) distance cost function. The results support earlier findings that considerable improvements are to be expected when initial error rates are high.
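The difference between the two cost functions can be seen on a toy posterior. The sketch below is an illustration only, assuming a hypothetical three-string posterior and restricting the candidate decisions to those three strings; it is not the analysis from the paper. It compares the 0-1 (maximum a posteriori) decision with the decision that minimizes expected Levenshtein cost.

```python
from typing import Dict


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, row by row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    return prev[-1]


def map_decision(posterior: Dict[str, float]) -> str:
    """Bayes decision under 0-1 (string error) cost: pick the most probable string."""
    return max(posterior, key=posterior.get)


def min_expected_cost_decision(posterior: Dict[str, float]) -> str:
    """Bayes decision under Levenshtein (symbol error) cost.

    Candidates are limited to the strings in the posterior (a simplification).
    """
    def expected_cost(candidate: str) -> float:
        return sum(p * levenshtein(candidate, w) for w, p in posterior.items())
    return min(posterior, key=expected_cost)


# Hypothetical posterior over three candidate strings.
posterior = {"aaa": 0.4, "bbb": 0.3, "bba": 0.3}

if __name__ == "__main__":
    print(map_decision(posterior), min_expected_cost_decision(posterior))
```

Here the most probable string is "aaa", but "bba" has the lowest expected edit distance (1.1 versus 1.5 for the other two), so the two rules disagree. The posterior is deliberately flat; with a sharply peaked posterior the two decisions coincide.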
Upadhyay, Richa; Kashyap, Sarvesh Pratap; Singh, Chandra Shekhar; Tiwari, Kavindra Nath; Singh, Karuna; Singh, Major
2014-11-01
Germplasm storage of Phyllanthus fraternus by using synseed technology has been optimized. Synseeds were prepared from nodal segments taken from in vitro-grown plantlets. An encapsulation matrix of 3 % sodium alginate and 100 mM calcium chloride with polymerization duration up to 15 min was found most suitable for synseed formation. Maximum plantlet conversion (92.5 ± 2.5 %) was obtained on a growth regulator-free ½-strength solid Murashige and Skoog (MS) medium. Multiple shoot proliferation was optimum on a ½ MS medium containing 0.5 mg/l 6-benzylaminopurine (BAP). Shoots were subjected to rooting on MS media containing 1 mg/l α-naphthaleneacetic acid (NAA) and acclimatized successfully. Encapsulated nodal segments can be stored for up to 90 days with a survival frequency of 47.33 %. The clonal fidelity of synseed-derived plantlets was also assessed and compared with that of the mother plant using random amplified polymorphic DNA and inter-simple sequence repeat analysis. No changes in molecular profiles were observed among the synseed-derived plantlets and mother plant, which confirms the genetic stability of the regenerants. This synseed production protocol could be useful for in vitro multiplication, short-term storage, and exchange of germplasm of this important antiviral and hepatoprotective plant.
One-pot multienzyme (OPME) systems for chemoenzymatic synthesis of carbohydrates
Yu, Hai; Chen, Xi
2016-01-01
Glycosyltransferase-catalyzed enzymatic and chemoenzymatic syntheses are powerful approaches for the production of oligosaccharides, polysaccharides, glycoconjugates, and their derivatives. Enzymes involved in the biosynthesis of sugar nucleotide donors can be combined with the glycosyltransferases in one pot for efficient production of target glycans from simple monosaccharides and acceptors. The identification of enzymes involved in the salvage pathway of sugar nucleotide generation has greatly facilitated the development of simplified and efficient one-pot multienzyme (OPME) systems for synthesizing major glycan epitopes in mammalian glycomes. The applications of OPME methods are steadily gaining popularity mainly due to the increasing availability of wild-type and engineered enzymes. Substrate promiscuity of these enzymes and their mutants allows OPME synthesis of carbohydrates with naturally occurring post-glycosylational modifications (PGMs) and their non-natural derivatives using modified monosaccharides as precursors. The OPME systems can be applied sequentially for synthesizing complex carbohydrates. The sequence of the sequential OPME processes, the glycosyltransferases used, and the substrate specificities of the glycosyltransferases define the structures of the products. The OPME and sequential OPME strategies can be extended to diverse glycans in other glycomes when suitable enzymes with substrate promiscuity become available. The Perspective summarizes the work of the authors and collaborators on the development of glycosyltransferase-based OPME systems for carbohydrate synthesis. Future directions are also discussed. PMID:26881499
Biotransformation of explosives by Reticulitermes flavipes--associated termite Endosymbionts.
Indest, Karl J; Eaton, Hillary L; Jung, Carina M; Lounds, Caly B
2014-01-01
Termites have an important role in the carbon and nitrogen cycles despite their reputation as destructive pests. With the assistance of microbial endosymbionts, termites are responsible for the conversion of complex biopolymers into simple carbon substrates. Termites also rely on endosymbionts for fixing and recycling nitrogen. As a result, we hypothesize that termite bacterial endosymbionts are a novel source of metabolic pathways for the transformation of nitrogen-rich compounds like explosives. Explosives transformation capability of termite (Reticulitermes flavipes)-derived endosymbionts was determined in media containing the chemical constituents nitrotriazolone (NTO) and hexahydro-1,3,5-trinitro-1,3,5-triazine (RDX) that comprise new insensitive explosive formulations. Media dosed with 40 µg/ml of explosive were inoculated with surface-sterilized, macerated termites. Bacterial isolates capable of explosives transformation were characterized by 16S rRNA sequencing. Termite-derived enrichment cultures demonstrated degradation activity towards the explosives NTO and RDX, as well as the legacy explosive 2,4,6-trinitrotoluene (TNT). Three isolates with high similarity to the Enterobacteriaceae (Enterobacter, Klebsiella) were able to transform TNT and NTO within 2 days, while isolates with high similarity to Serratia marcescens and Lactococcus lactis were able to transform RDX. Termite endosymbionts harbor a range of metabolic activities and possess unique abilities to transform nitrogen-rich explosives. © 2014 S. Karger AG, Basel.
Cluster-Based Multipolling Sequencing Algorithm for Collecting RFID Data in Wireless LANs
NASA Astrophysics Data System (ADS)
Choi, Woo-Yong; Chatterjee, Mainak
2015-03-01
With the growing use of RFID (Radio Frequency Identification), it is becoming important to devise ways to read RFID tags in real time. Access points (APs) of IEEE 802.11-based wireless Local Area Networks (LANs) are being integrated with RFID networks that can efficiently collect real-time RFID data. Several schemes, such as multipolling methods based on the dynamic search algorithm and random sequencing, have been proposed. However, as the number of RFID readers associated with an AP increases, it becomes difficult for the dynamic search algorithm to derive the multipolling sequence in real time. Though multipolling methods can eliminate the polling overhead, we still need to enhance the performance of the multipolling methods based on random sequencing. To that end, we propose a real-time cluster-based multipolling sequencing algorithm that eliminates more than 90% of the polling overhead, particularly when the dynamic search algorithm fails to derive the multipolling sequence in real time.
Paleovirology of bornaviruses: What can be learned from molecular fossils of bornaviruses.
Horie, Masayuki; Tomonaga, Keizo
2018-04-06
Endogenous viral elements (EVEs) are virus-derived sequences embedded in eukaryotic genomes formed by germline integration of viral sequences. As many EVEs were integrated into eukaryotic genomes millions of years ago, EVEs are considered molecular fossils of viruses. EVEs can be valuable informational sources about ancient viruses, including their time scale, geographical distribution, genetic information, and hosts. Although integration of viral sequences is not required for replications of viruses other than retroviruses, many non-retroviral EVEs have been reported to exist in eukaryotes. Investigation of these EVEs has expanded our knowledge regarding virus-host interactions, as well as provided information on ancient viruses. Among them, EVEs derived from bornaviruses, non-retroviral RNA viruses, have been relatively well studied. Bornavirus-derived EVEs are widely distributed in animal genomes, including the human genome, and the history of bornaviruses can be dated back to more than 65 million years. Although there are several reports focusing on the biological significance of bornavirus-derived sequences in mammals, paleovirology of bornaviruses has not yet been well described and summarized. In this paper, we describe what can be learned about bornaviruses from endogenous bornavirus-like elements from the view of paleovirology using published results and our novel data. Copyright © 2018 Elsevier B.V. All rights reserved.
New fundamental parameters for attitude representation
NASA Astrophysics Data System (ADS)
Patera, Russell P.
2017-08-01
A new attitude parameter set is developed to clarify the geometry of combining finite rotations in a rotational sequence and in combining infinitesimal angular increments generated by angular rate. The resulting parameter set of six Pivot Parameters represents a rotation as a great circle arc on a unit sphere that can be located at any clocking location in the rotation plane. Two rotations are combined by linking their arcs at either of the two intersection points of the respective rotation planes. In a similar fashion, linking rotational increments produced by angular rate is used to derive the associated kinematical equations, which are linear and have no singularities. Included in this paper is the derivation of twelve Pivot Parameter elements that represent all twelve Euler Angle sequences, which enables efficient conversions between Pivot Parameters and any Euler Angle sequence. Applications of this new parameter set include the derivation of quaternions and the quaternion composition rule, as well as, the derivation of the analytical solution to time dependent coning motion. The relationships between Pivot Parameters and traditional parameter sets are included in this work. Pivot Parameters are well suited for a variety of aerospace applications due to their effective composition rule, singularity free kinematic equations, efficient conversion to and from Euler Angle sequences and clarity of their geometrical foundation.
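The abstract mentions the quaternion composition rule as one application. The sketch below shows only the standard Hamilton product for unit quaternions; it does not implement Pivot Parameters, whose definitions are not given here. The (w, x, y, z) component order and the helper names are assumptions made for illustration.

```python
import math
from typing import Tuple

Quaternion = Tuple[float, float, float, float]  # (w, x, y, z), scalar first


def quat_multiply(q1: Quaternion, q2: Quaternion) -> Quaternion:
    """Hamilton product q1 * q2, which composes the two rotations."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )


def quat_about_z(angle_rad: float) -> Quaternion:
    """Unit quaternion for a rotation by angle_rad about the z axis."""
    half = angle_rad / 2.0
    return (math.cos(half), 0.0, 0.0, math.sin(half))


if __name__ == "__main__":
    q90 = quat_about_z(math.pi / 2.0)
    # Two successive 90 degree rotations about z give a 180 degree rotation about z.
    print(quat_multiply(q90, q90))
```

Because both rotations share the same axis, the order of the product does not matter in this example; for rotations about different axes it does.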
Cervical Vertebral Body's Volume as a New Parameter for Predicting the Skeletal Maturation Stages
Choi, Youn-Kyung; Kim, Jinmi; Yamaguchi, Tetsutaro; Maki, Koutaro; Ko, Ching-Chang; Kim, Yong-Il
2016-01-01
This study aimed to determine the correlation between the volumetric parameters derived from the images of the second, third, and fourth cervical vertebrae by using cone beam computed tomography with skeletal maturation stages and to propose a new formula for predicting skeletal maturation by using regression analysis. We obtained the estimation of skeletal maturation levels from hand-wrist radiographs and volume parameters derived from the second, third, and fourth cervical vertebrae bodies from 102 Japanese patients (54 women and 48 men, 5–18 years of age). We performed Pearson's correlation coefficient analysis and simple regression analysis. All volume parameters derived from the second, third, and fourth cervical vertebrae exhibited statistically significant correlations (P < 0.05). The simple regression model with the greatest R-square indicated the fourth-cervical-vertebra volume as an independent variable with a variance inflation factor less than ten. The explanation power was 81.76%. Volumetric parameters of cervical vertebrae using cone beam computed tomography are useful in regression models. The derived regression model has the potential for clinical application as it enables a simple and quantitative analysis to evaluate skeletal maturation level. PMID:27340668
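A simple (single-predictor) regression and its R-square can be computed with ordinary least squares. The sketch below is a generic illustration on made-up numbers; the variable names and the data are hypothetical and are not the patients' measurements or the formula proposed in the study.

```python
from typing import Sequence, Tuple


def simple_linear_regression(x: Sequence[float],
                             y: Sequence[float]) -> Tuple[float, float, float]:
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical vertebral body volumes (arbitrary units) and maturation scores.
    volume = [1.0, 1.5, 2.1, 2.8, 3.4, 4.0]
    maturation = [2.0, 3.1, 3.9, 5.2, 6.1, 6.8]
    print(simple_linear_regression(volume, maturation))
```

R-square is the fraction of variance in the outcome explained by the predictor, which is how an "explanation power" such as 81.76% is normally read.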
A Simple Method to Determine the "R" or "S" Configuration of Molecules with an Axis of Chirality
ERIC Educational Resources Information Center
Wang, Cunde; Wu, Weiming
2011-01-01
A simple method for the "R" or "S" designation of molecules with an axis of chirality is described. The method involves projection of the substituents along the chiral axis, utilizes the Cahn-Ingold-Prelog sequence rules in assigning priority to the substituents, is easy to use, and has broad applicability. (Contains 5 figures.)
Feng, Feiling; Cheng, Qingbao; Yang, Liang; Zhang, Dadong; Ji, Shunlong; Zhang, Qiangzu; Lin, Yihui; Li, Fugen; Xiong, Lei; Liu, Chen; Jiang, Xiaoqing
2017-01-17
Gallbladder sarcomatoid carcinoma is a rare cancer with no clinical standard treatment. With the rapid development of next generation sequencing, it has become possible to provide reasonable treatment options for patients based on genetic variations. However, most cancer drugs are not approved for gallbladder sarcomatoid carcinoma indications. The correlation between drug response and a genetic variation needs to be further elucidated. Three patient-derived cells (JXQ-3D-001, JXQ-3D-002, and JXQ-3D-003) were derived from biopsy samples of one gallbladder sarcomatoid carcinoma patient with progression and have been characterized. In order to study the relationship between drug sensitivity and gene alteration, genetic mutations of the three patient-derived cells were discovered by whole exome sequencing, and drug screening was performed based on the gene alterations and related signaling pathways that are associated with drug targets. It was found that there are differences in biological characteristics such as morphology, cell proliferation, cell migration and colony formation activity among these three patient-derived cells although they are derived from the same patient. Their sensitivities to the chemotherapy drugs fluorouracil, doxorubicin, and cisplatin are distinct. Moreover, none of the common chemotherapy drugs could inhibit the proliferation of all three patient-derived cells. Comprehensive analysis of their whole exome sequencing demonstrated that the tumor-associated genes TP53, AKT2, FGFR3, FGF10, SDHA, and PIK3CA were mutated or amplified. Some of these alterations are actionable. By screening a set of compounds that are associated with the genetic alterations, it was found that the PI3K-AKT-mTOR pathway inhibitors GDC-0941 and PF-04691502 could dramatically decrease the proliferation of the three patient-derived cells. 
Importantly, expression of phosphorylated AKT and phosphorylated S6 was markedly decreased after treatment with the PI3K-AKT-mTOR pathway inhibitors GDC-0941 (0.5 μM) and PF-04691502 (0.1 μM) in all three patient-derived cells. These data suggested that inhibition of the PI3K-AKT-mTOR pathway, which was activated by PIK3CA amplification in all three patient-derived cells, could reduce cell proliferation. A patient-derived cell model combined with whole exome sequencing is a powerful tool to elucidate the relationship between drug sensitivities and genetic alterations. In these gallbladder sarcomatoid carcinoma patient-derived cells, it was found that PIK3CA amplification could be used as a biomarker to indicate PI3K-AKT-mTOR pathway activation. Blocking the pathway may, hypothetically, benefit gallbladder sarcomatoid carcinoma patients with this alteration. The real efficacy needs to be confirmed in vivo or in a clinical trial.
Functionally conserved enhancers with divergent sequences in distant vertebrates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Song; Oksenberg, Nir; Takayama, Sachiko; ...
2015-10-30
To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.
Meta4: a web application for sharing and annotating metagenomic gene predictions using web services.
Richardson, Emily J; Escalettes, Franck; Fotheringham, Ian; Wallace, Robert J; Watson, Mick
2013-01-01
Whole-genome shotgun metagenomics experiments produce DNA sequence data from entire ecosystems, and provide a huge amount of novel information. Gene discovery projects require up-to-date information about sequence homology and domain structure for millions of predicted proteins to be presented in a simple, easy-to-use system. There is a lack of simple, open, flexible tools that allow the rapid sharing of metagenomics datasets with collaborators in a format they can easily interrogate. We present Meta4, a flexible and extensible web application that can be used to share and annotate metagenomic gene predictions. Proteins and predicted domains are stored in a simple relational database, with a dynamic front-end which displays the results in an internet browser. Web services are used to provide up-to-date information about the proteins from homology searches against public databases. Information about Meta4 can be found on the project website, code is available on Github, a cloud image is available, and an example implementation can be seen at.
Fine-tuning gene networks using simple sequence repeats
Egbert, Robert G.; Klavins, Eric
2012-01-01
The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382
Zhang, Gaihua; Su, Zhen
2012-01-01
Work on protein structure prediction is very useful in biological research. To evaluate their accuracy, experimental protein structures or their derived data are used as the 'gold standard'. However, as proteins are dynamic molecular machines with structural flexibility such a standard may be unreliable. To investigate the influence of the structure flexibility, we analysed 3,652 protein structures of 137 unique sequences from 24 protein families. The results showed that (1) the three-dimensional (3D) protein structures were not rigid: the root-mean-square deviation (RMSD) of the backbone Cα of structures with identical sequences was relatively large, with the average of the maximum RMSD from each of the 137 sequences being 1.06 Å; (2) the derived data of the 3D structure was not constant, e.g. the highest ratio of the secondary structure wobble site was 60.69%, with the sequence alignments from structural comparisons of two proteins in the same family sometimes being completely different. Proteins may have several stable conformations and the data derived from resolved structures as a 'gold standard' should be optimized before being utilized as criteria to evaluate the prediction methods, e.g. sequence alignment from structural comparison. Helix/β-sheet transition exists in normal free proteins. The coil ratio of the 3D structure could affect its resolution as determined by X-ray crystallography.
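The RMSD figures quoted above compare Cα coordinates of structures that share a sequence. A minimal sketch of that calculation, assuming the structures are already superimposed and have matching atom order (the abstract does not state how superposition was done); the function names and coordinates are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two pre-superimposed coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def max_pairwise_rmsd(structures: List[Sequence[Coord]]) -> float:
    """Largest RMSD over all pairs of structures of the same sequence."""
    return max(rmsd(a, b) for a, b in combinations(structures, 2))


if __name__ == "__main__":
    # Three hypothetical conformers of a three-residue backbone (coordinates in angstroms).
    s1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    s2 = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
    s3 = [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0), (7.6, 2.0, 0.0)]
    print(max_pairwise_rmsd([s1, s2, s3]))
```

Averaging the per-sequence maxima over all sequences would give a summary statistic of the kind reported in the abstract (1.06 Å).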
Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai
2013-08-01
To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattering across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded that of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from sampling process or the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to human and from human to humans in Thailand. Copyright © 2013 Elsevier B.V. All rights reserved.
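Haplotype diversity and nucleotide diversity are standard summary statistics for aligned sequences. The sketch below uses the usual textbook definitions (Nei's estimators) on a made-up alignment; it assumes equal-length, gap-free sequences and is not the authors' pipeline.

```python
from collections import Counter
from itertools import combinations
from typing import Sequence


def haplotype_diversity(seqs: Sequence[str]) -> float:
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """pi = mean number of pairwise differences per site."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diff = sum(sum(a != b for a, b in zip(s, t))
                     for s, t in combinations(seqs, 2))
    n_pairs = n * (n - 1) / 2
    return total_diff / n_pairs / length


if __name__ == "__main__":
    # Hypothetical four-sequence alignment.
    alignment = ["ACGT", "ACGT", "ACGA", "TCGA"]
    print(haplotype_diversity(alignment), nucleotide_diversity(alignment))
```

Computing both statistics separately for the human-derived and the monkey-derived sequences would allow the kind of comparison described in the abstract.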
Gradients and Non-Adiabatic Derivative Coupling Terms for Spin-Orbit Wavefunctions
2011-06-01
derivative, symmetric to the first time derivative. Solutions to the Dirac equation simultaneously satisfy the simple relativistic wave equation, the...For Pooki vi Acknowledgments I would like to thank the members of my committee for their time and...Theorem..............................................................................191 Appendix J. The Symmetric Group
Wickersheim, Michelle L; Blumenstiel, Justin P
2013-11-01
A large number of methods are available to deplete ribosomal RNA reads from high-throughput RNA sequencing experiments. Such methods are critical for sequencing Drosophila small RNAs between 20 and 30 nucleotides because size selection is not typically sufficient to exclude the highly abundant class of 30 nucleotide 2S rRNA. Here we demonstrate that pre-annealing terminator oligos complementary to Drosophila 2S rRNA prior to 5' adapter ligation and reverse transcription efficiently depletes 2S rRNA sequences from the sequencing reaction in a simple and inexpensive way. This depletion is highly specific and is achieved with minimal perturbation of miRNA and piRNA profiles.
Chen, Guiqian; Qiu, Yuan; Zhuang, Qingye; Wang, Suchun; Wang, Tong; Chen, Jiming; Wang, Kaicheng
2018-05-09
Next generation sequencing (NGS) is a powerful tool for the characterization, discovery, and molecular identification of RNA viruses. Multiple NGS library preparation methods have been published for strand-specific RNA-seq, but some methods are not suitable for identifying and characterizing RNA viruses. In this study, we report an NGS library preparation method to identify RNA viruses using the Ion Torrent PGM platform. The NGS sequencing adapters were directly inserted into the sequencing library through reverse transcription and polymerase chain reaction, without fragmentation and ligation of nucleic acids. The results show that this method is simple to perform and able to identify multiple species of RNA viruses in clinical samples.
Molecular beacon sequence design algorithm.
Monroe, W Todd; Haselton, Frederick R
2003-01-01
A method based on Web-based tools is presented to design optimally functioning molecular beacons. Molecular beacons, fluorogenic hybridization probes, are a powerful tool for the rapid and specific detection of a particular nucleic acid sequence. However, their synthesis costs can be considerable. Since molecular beacon performance is based on its sequence, it is imperative to rationally design an optimal sequence before synthesis. The algorithm presented here uses simple Microsoft Excel formulas and macros to rank candidate sequences. This analysis is carried out using mfold structural predictions along with other free Web-based tools. For smaller laboratories where molecular beacons are not the focus of research, the public domain algorithm described here may be usefully employed to aid in molecular beacon design.
A Simple and Efficient Method for Assembling TALE Protein Based on Plasmid Library
Zhang, Zhiqiang; Li, Duo; Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying
2013-01-01
The DNA-binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled in a one-step digestion/ligation reaction. Twelve TALE constructs, including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences, were generated by using our method. The construction routines took 3 days, and constructs could be built in parallel. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate. PMID:23840477
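The library size of 256 follows from 4 bases at each of 4 positions (4^4). A small sketch of that enumeration, and of splitting a target site into consecutive tetramers, is given below; the target sequence is hypothetical and the requirement that its length be a multiple of four is a simplifying assumption, not a statement about the published protocol.

```python
from itertools import product
from typing import List

BASES = "ACGT"


def tetramer_library() -> List[str]:
    """All 4**4 = 256 possible 4-base combinations."""
    return ["".join(p) for p in product(BASES, repeat=4)]


def split_into_tetramers(target: str) -> List[str]:
    """Split a target DNA sequence into consecutive 4-base blocks."""
    target = target.upper()
    if len(target) % 4 != 0:
        raise ValueError("this sketch only handles lengths that are multiples of 4")
    if set(target) - set(BASES):
        raise ValueError("target may contain only A, C, G and T")
    return [target[i:i + 4] for i in range(0, len(target), 4)]


if __name__ == "__main__":
    library = set(tetramer_library())
    blocks = split_into_tetramers("ACGTTTGACCAA")  # hypothetical 12-bp target
    print(len(library), blocks, all(b in library for b in blocks))
```

Every block of any target is guaranteed to be present in the library, which is what lets a fixed set of 256 plasmids cover arbitrary target sequences.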
A simple and efficient method for assembling TALE protein based on plasmid library.
Zhang, Zhiqiang; Li, Duo; Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying
2013-01-01
The DNA-binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a major challenge that precludes wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have recently been reported, all of them require either sophisticated facilities or skilled technicians. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled in a one-step digestion/ligation reaction. Twelve TALE constructs, including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences, were generated using our method. Construction took 3 days, and constructs could be built in parallel. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate.
Labeled Graph Kernel for Behavior Analysis.
Zhao, Ruiqi; Martinez, Aleix M
2016-08-01
Automatic behavior analysis from video is a major topic in many areas of research, including computer vision, multimedia, robotics, biology, cognitive science, social psychology, psychiatry, and linguistics. Two major problems are of interest when analyzing behavior. First, we wish to automatically categorize observed behaviors into a discrete set of classes (i.e., classification). For example, to determine word production from video sequences in sign language. Second, we wish to understand the relevance of each behavioral feature in achieving this classification (i.e., decoding). For instance, to know which behavior variables are used to discriminate between the words apple and onion in American Sign Language (ASL). The present paper proposes to model behavior using a labeled graph, where the nodes define behavioral features and the edges are labels specifying their order (e.g., before, overlaps, start). In this approach, classification reduces to a simple labeled graph matching. Unfortunately, the complexity of labeled graph matching grows exponentially with the number of categories we wish to represent. Here, we derive a graph kernel to quickly and accurately compute this graph similarity. This approach is very general and can be plugged into any kernel-based classifier. Specifically, we derive a Labeled Graph Support Vector Machine (LGSVM) and a Labeled Graph Logistic Regressor (LGLR) that can be readily employed to discriminate between many actions (e.g., sign language concepts). The derived approach can be readily used for decoding too, yielding invaluable information for the understanding of a problem (e.g., to know how to teach a sign language). The derived algorithms allow us to achieve higher accuracy results than those of state-of-the-art algorithms in a fraction of the time. We show experimental results on a variety of problems and datasets, including multimodal data.
On the sum of generalized Fibonacci sequence
NASA Astrophysics Data System (ADS)
Chong, Chin-Yoon; Ho, C. K.
2014-06-01
We consider the generalized Fibonacci sequence {U_n} defined by U_0 = 0, U_1 = 1, and U_{n+2} = pU_{n+1} + qU_n for all n ∈ Z_0^+ and p, q ∈ Z^+. In this paper, we derive various sums of the generalized Fibonacci sequence from its recurrence relation.
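As an illustration of how a sum can be obtained from the recurrence alone, summing U_{k+2} = pU_{k+1} + qU_k over k = 0, ..., n-1 gives (p + q - 1)(U_0 + ... + U_n) = U_{n+1} + qU_n - 1. This is a standard identity and is not necessarily one of the sums derived in the paper. The sketch below checks it numerically; the function names are hypothetical.

```python
def generalized_fibonacci(p, q, n_terms):
    """Return [U_0, ..., U_{n_terms-1}] with U_0 = 0, U_1 = 1,
    and U_{n+2} = p*U_{n+1} + q*U_n."""
    terms = [0, 1]
    while len(terms) < n_terms:
        terms.append(p * terms[-1] + q * terms[-2])
    return terms[:n_terms]


def partial_sum_closed_form(p, q, n):
    """Closed form for U_0 + ... + U_n obtained by summing the recurrence.

    For positive integers p and q the divisor p + q - 1 is at least 1,
    and the division is exact.
    """
    u = generalized_fibonacci(p, q, n + 2)
    return (u[n + 1] + q * u[n] - 1) // (p + q - 1)


if __name__ == "__main__":
    p, q, n = 2, 3, 4
    direct = sum(generalized_fibonacci(p, q, n + 1))
    print(direct, partial_sum_closed_form(p, q, n))
```

For p = q = 1 the identity reduces to the familiar Fibonacci sum F_0 + ... + F_n = F_{n+2} - 1.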
Miao, Ning; Zhang, Lei; Li, Maoping; Fan, Liqiang; Mao, Kangshan
2017-01-01
Premise of the study: We developed transcriptome microsatellite markers (simple sequence repeats) for Taxillus nigrans (Loranthaceae) to survey the genetic diversity and population structure of this species. Methods and Results: We used Illumina HiSeq data to reconstruct the transcriptome of T. nigrans by de novo assembly and used the transcriptome to develop a set of simple sequence repeat markers. Overall, 40 primer pairs were designed and tested; 19 of them amplified successfully and demonstrated polymorphisms. Two loci that detected null alleles were eliminated, and the remaining 17, which were subjected to further analyses, yielded two to 21 alleles per locus. Conclusions: The markers will serve as a basis for studies to assess the extent and pattern of distribution of genetic variation in T. nigrans, and they may also be useful in conservation genetic, ecological, and evolutionary studies of the genus Taxillus, a group of plant species of importance in Chinese traditional medicine. PMID:28924510
Inverted-U Function Relating Cortical Plasticity and Task Difficulty
Engineer, Navzer D.; Engineer, Crystal T.; Reed, Amanda C.; Pandya, Pritesh K.; Jakkamsetti, Vikram; Moucha, Raluca; Kilgard, Michael P.
2012-01-01
Many psychological and physiological studies with simple stimuli have suggested that perceptual learning specifically enhances the response of primary sensory cortex to task-relevant stimuli. The aim of this study was to determine whether auditory discrimination training on complex tasks enhances primary auditory cortex responses to a target sequence relative to non-target and novel sequences. We collected responses from more than 2,000 sites in 31 rats trained on one of six discrimination tasks that differed primarily in the similarity of the target and distractor sequences. Unlike training with simple stimuli, long-term training with complex stimuli did not generate target specific enhancement in any of the groups. Instead, cortical receptive field size decreased, latency decreased, and paired pulse depression decreased in rats trained on the tasks of intermediate difficulty while tasks that were too easy or too difficult either did not alter or degraded cortical responses. These results suggest an inverted-U function relating neural plasticity and task difficulty. PMID:22249158
[Analysis of MAT1A gene mutations in a child affected with simple hypermethioninemia].
Sun, Yun; Ma, Dingyuan; Wang, Yanyun; Yang, Bin; Jiang, Tao
2017-02-10
To detect potential mutations of the MAT1A gene in a child suspected of having simple hypermethioninemia on MS/MS neonatal screening. Clinical data of the child were collected. Genomic DNA was extracted by a standard method and subjected to targeted sequencing using an Ion AmpliSeq Inherited Disease Panel. Detected mutations were verified by Sanger sequencing. The child showed no clinical features except elevated methionine. A novel compound mutation of the MAT1A gene, i.e., c.345delA and c.529C>T, was identified in the child. His father and mother were found to be heterozygous for the c.345delA mutation and the c.529C>T mutation, respectively. The compound mutation c.345delA and c.529C>T of the MAT1A gene probably underlies the disease in the child. Semiconductor sequencing has provided an important means for the diagnosis of hereditary diseases.
Investigation of microsatellite instability in Turkish breast cancer patients.
Demokan, Semra; Muslumanoglu, Mahmut; Yazici, H; Igci, Abdullah; Dalay, Nejat
2002-01-01
Multiple somatic and inherited genetic changes that lead to loss of growth control may contribute to the development of breast cancer. Microsatellites are tandem repeats of simple sequences that occur abundantly and at random throughout most eucaryotic genomes. Microsatellite instability (MI), characterized by the presence of random contractions or expansions in the length of simple sequence repeats or microsatellites, is observed in a variety of tumors. The aim of this study was to compare tumor DNA fingerprints with constitutional DNA fingerprints to investigate changes specific to breast cancer and evaluate its correlation with clinical characteristics. Tumor and normal tissue samples of 38 patients with breast cancer were investigated by comparing PCR-amplified microsatellite sequences D2S443 and D21S1436. Microsatellite instability at D21S1436 and D2S443 was found in 5 (13%) and 7 (18%) patients, respectively. Two patients displayed instability at both marker loci. No association was found between MI and age, family history, lymph node involvement and other clinical parameters.
Simple diazonium chemistry to develop specific gene sensing platforms.
Revenga-Parra, M; García-Mendiola, T; González-Costas, J; González-Romero, E; Marín, A García; Pau, J L; Pariente, F; Lorenzo, E
2014-02-27
A simple strategy for covalently immobilizing DNA sequences, based on the formation of stable diazonized conducting platforms, is described. The electrochemical reduction of 4-nitrobenzenediazonium salt onto screen-printed carbon electrodes (SPCE) in aqueous media gives rise to terminal grafted amino groups. The presence of primary aromatic amines allows the formation of diazonium cations capable of reacting with the amines present on the DNA capture probe. For comparison, a second strategy based on the binding of aminated DNA capture probes to the developed diazonized conducting platforms through a crosslinking agent was also employed. The resulting DNA sensing platforms were characterized by cyclic voltammetry, electrochemical impedance spectroscopy and spectroscopic ellipsometry. The hybridization event with the complementary sequence was detected using hexaammineruthenium(III) chloride as the electrochemical indicator. Finally, the platforms were applied to the analysis of a 145-bp sequence from the human gene MRP3, reaching a detection limit of 210 pg μL⁻¹.
WebSat--a web software for microsatellite marker development.
Martins, Wellington Santos; Lucas, Divino César Soares; Neves, Kelligton Fabricio de Souza; Bertioli, David John
2009-01-01
Simple sequence repeats (SSR), also known as microsatellites, have been extensively used as molecular markers due to their abundance and high degree of polymorphism. We have developed a simple to use web software, called WebSat, for microsatellite molecular marker prediction and development. WebSat is accessible through the Internet, requiring no program installation. Although a web solution, it makes use of Ajax techniques, providing a rich, responsive user interface. WebSat allows the submission of sequences, visualization of microsatellites and the design of primers suitable for their amplification. The program allows full control of parameters and the easy export of the resulting data, thus facilitating the development of microsatellite markers. The web tool may be accessed at http://purl.oclc.org/NET/websat/
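To make the notion of microsatellite prediction concrete, the sketch below scans a sequence for perfect tandem repeats with a regular expression. It is not WebSat's algorithm, which the abstract does not describe; the function names and the thresholds (motifs of 1 to 6 bases, at least 3 copies) are assumptions chosen for illustration.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. not 'AA')."""
    n = len(motif)
    return all(motif != motif[:d] * (n // d) for d in range(1, n) if n % d == 0)


def find_ssrs(seq, max_motif=6, min_repeats=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Sketch only: the scan is non-overlapping per motif length, so a repeat
    may be reported with a shifted motif (e.g. 'CA' instead of 'AC').
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-base motif followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "TT" + "GATA" * 4 + "CC" + "A" * 8 + "GG"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Primer design around each reported locus, which WebSat also offers, is outside the scope of this sketch.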
Plant genome and transcriptome annotations: from misconceptions to simple solutions
Bolger, Marie E; Arsova, Borjana; Usadel, Björn
2018-01-01
Next-generation sequencing has triggered an explosion of available genomic and transcriptomic resources in the plant sciences. Although genome and transcriptome sequencing has become orders of magnitude cheaper and more efficient, the functional annotation process often lags behind. This might be hampered by the lack of a comprehensive enumeration of simple-to-use tools available to the plant researcher. In this comprehensive review, we present (i) typical ontologies to be used in the plant sciences, (ii) useful databases and resources used for functional annotation, (iii) what to expect from an annotated plant genome, (iv) an automated annotation pipeline and (v) a recipe and reference chart outlining typical steps used to annotate plant genomes/transcriptomes using publicly available resources. PMID:28062412
GATA simple sequence repeats function as enhancer blocker boundaries.
Kumar, Ram P; Krishnan, Jaya; Pratap Singh, Narendra; Singh, Lalji; Mishra, Rakesh K
2013-01-01
Simple sequence repeats (SSRs) account for ~3% of the human genome, but their functional significance remains unclear. One of the prominent SSRs, the GATA tetranucleotide repeat, has preferentially accumulated in complex organisms. GATA repeats are particularly enriched on the human Y chromosome, and their non-random distribution and exclusive association with genes expressed during early development indicate a role in coordinated gene regulation. Here we show that GATA repeats have enhancer blocker activity in Drosophila and human cells. This enhancer blocker activity is seen in both the transgenic and the native context of the enhancers at various developmental stages. These findings ascribe functional significance to SSRs and offer an explanation as to why SSRs, especially GATA, may have accumulated in complex organisms.
Thermodynamics-based models of transcriptional regulation with gene sequence.
Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing
2015-12-01
Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous-time, differential-equation description of transcriptional dynamics. The sequence features of the promoter are used to derive the binding affinity on the basis of statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields more biologically meaningful results.
Andersson, P; Klein, M; Lilliebridge, R A; Giffard, P M
2013-09-01
Ultra-deep Illumina sequencing was performed on whole genome amplified DNA derived from a Chlamydia trachomatis-positive vaginal swab. Alignment of reads with reference genomes allowed robust SNP identification from the C. trachomatis chromosome and plasmid. This revealed that the C. trachomatis in the specimen was very closely related to the sequenced urogenital, serovar F, clade T1 isolate F-SW4. In addition, high genome-wide coverage was obtained for Prevotella melaninogenica, Gardnerella vaginalis, Clostridiales genomosp. BVAB3 and Mycoplasma hominis. This illustrates the potential of metagenome data to provide high resolution bacterial typing data from multiple taxa in a diagnostic specimen.
Jenista, Elizabeth R; Stokes, Ashley M; Branca, Rosa Tamara; Warren, Warren S
2009-11-28
A recent quantum computing paper (G. S. Uhrig, Phys. Rev. Lett. 98, 100504 (2007)) analytically derived optimal pulse spacings for a multiple spin echo sequence designed to remove decoherence in a two-level system coupled to a bath. The spacings in what has been called a "Uhrig dynamic decoupling (UDD) sequence" differ dramatically from the conventional, equal pulse spacing of a Carr-Purcell-Meiboom-Gill (CPMG) multiple spin echo sequence. The UDD sequence was derived for a model that is unrelated to magnetic resonance, but was recently shown theoretically to be more general. Here we show that the UDD sequence has theoretical advantages for magnetic resonance imaging of structured materials such as tissue, where diffusion in compartmentalized and microstructured environments leads to fluctuating fields on a range of different time scales. We also show experimentally, both in excised tissue and in a live mouse tumor model, that optimal UDD sequences produce different T2-weighted contrast than do CPMG sequences with the same number of pulses and total delay, with substantial enhancements in most regions. This permits improved characterization of low-frequency spectral density functions in a wide range of applications.
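The difference in pulse spacing can be made concrete. In Uhrig's 2007 result the j-th of n refocusing pulses falls at t_j = T sin^2(πj / (2n + 2)) for total delay T, whereas CPMG places pulses at t_j = T(2j - 1) / (2n). The sketch below computes both sets of timings; the function names are hypothetical and pulses are treated as instantaneous.

```python
import math


def udd_times(n_pulses, total_delay):
    """Pulse times for an n-pulse UDD sequence: T * sin^2(pi*j / (2n + 2))."""
    return [
        total_delay * math.sin(math.pi * j / (2 * n_pulses + 2)) ** 2
        for j in range(1, n_pulses + 1)
    ]


def cpmg_times(n_pulses, total_delay):
    """Pulse times for an n-pulse CPMG sequence: equally spaced, T*(2j - 1)/(2n)."""
    return [
        total_delay * (2 * j - 1) / (2 * n_pulses)
        for j in range(1, n_pulses + 1)
    ]


if __name__ == "__main__":
    for n in (2, 3, 8):
        print(n, [round(t, 4) for t in udd_times(n, 1.0)])
        print(n, [round(t, 4) for t in cpmg_times(n, 1.0)])
```

The two schemes coincide for one and two pulses and diverge from three pulses onward, with UDD pulses crowding toward the start and end of the delay.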
Chin, Ephrem L H; da Silva, Cristina; Hegde, Madhuri
2013-02-19
Detecting mutations in disease genes by full gene sequence analysis is common in clinical diagnostic laboratories. Sanger dideoxy terminator sequencing allows for rapid development and implementation of sequencing assays in the clinical laboratory, but it has limited throughput and, owing to cost constraints, allows analysis of only one or at most a few genes in a patient. Next-generation sequencing (NGS), on the other hand, has evolved rapidly, although to date it has mainly been used for large-scale genome sequencing projects and is only beginning to be used in clinical diagnostic testing. One advantage of NGS is that many genes can be analyzed easily at the same time, allowing for mutation detection when there are many possible causative genes for a specific phenotype. In addition, mutations in regions of a gene not typically tested, such as deep intronic and promoter regions, can also be detected. Here we use 20 previously characterized Sanger-sequenced positive controls in disease-causing genes to demonstrate the utility of NGS in a clinical setting, using standard PCR-based amplification to assess the analytical sensitivity and specificity of the technology for detecting all previously characterized changes (mutations and benign SNPs). The positive controls chosen for validation range from simple substitution mutations to complex deletion and insertion mutations occurring in autosomal dominant and recessive disorders. The NGS data were 100% concordant with the Sanger sequencing data, identifying all 119 previously identified changes in the 20 samples. We have demonstrated that NGS technology is ready to be deployed in clinical laboratories. However, NGS and associated technologies are evolving, and clinical laboratories will need to invest significantly in staff and infrastructure to build the necessary foundation for success.
Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun
2017-01-03
Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
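The idea behind barcode-aware calling can be sketched with a simple majority vote: reads sharing a molecular barcode derive from one original molecule, so a base seen in only a minority of those reads is more likely an amplification or sequencing error. The code below is not smCounter, which uses a Bayesian probabilistic model; the function names and the thresholds are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=2, min_agreement=0.8):
    """Collapse (barcode, base) observations at one site to one base per barcode.

    Barcodes with fewer than min_reads reads, or whose most common base
    falls below min_agreement, are dropped (hypothetical thresholds).
    """
    by_barcode = defaultdict(list)
    for barcode, base in reads:
        by_barcode[barcode].append(base)

    consensus = {}
    for barcode, bases in by_barcode.items():
        if len(bases) < min_reads:
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus


def allele_fraction(consensus, alt_base):
    """Fraction of barcode consensus bases that equal alt_base."""
    if not consensus:
        return 0.0
    return sum(1 for base in consensus.values() if base == alt_base) / len(consensus)


if __name__ == "__main__":
    reads = [("AAA", "C"), ("AAA", "C"), ("AAA", "T"),
             ("CCC", "T"), ("CCC", "T"), ("GGG", "C")]
    calls = consensus_by_barcode(reads, min_agreement=0.6)
    print(calls, allele_fraction(calls, "T"))
```

Counting molecules rather than reads is what allows allele fractions near 1% to be separated from error noise, at the cost of needing several reads per barcode.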
Ralph, Duncan K; Matsen, Frederick A
2016-01-01
VDJ rearrangement and somatic hypermutation work together to produce antibody-coding B cell receptor (BCR) sequences for a remarkable diversity of antigens. It is now possible to sequence these BCRs in high throughput; analysis of these sequences is bringing new insight into how antibodies develop, in particular for broadly-neutralizing antibodies against HIV and influenza. A fundamental step in such sequence analysis is to annotate each base as coming from a specific one of the V, D, or J genes, or from an N-addition (a.k.a. non-templated insertion). Previous work has used simple parametric distributions to model transitions from state to state in a hidden Markov model (HMM) of VDJ recombination, and assumed that mutations occur via the same process across sites. However, codon frame and other effects have been observed to violate these parametric assumptions for such coding sequences, suggesting that a non-parametric approach to modeling the recombination process could be useful. In our paper, we find that indeed large modern data sets suggest a model using parameter-rich per-allele categorical distributions for HMM transition probabilities and per-allele-per-position mutation probabilities, and that using such a model for inference leads to significantly improved results. We present an accurate and efficient BCR sequence annotation software package using a novel HMM "factorization" strategy. This package, called partis (https://github.com/psathyrella/partis/), is built on a new general-purpose HMM compiler that can perform efficient inference given a simple text description of an HMM.
Changes in spinal reflex excitability associated with motor sequence learning.
Lungu, Ovidiu; Frigon, Alain; Piché, Mathieu; Rainville, Pierre; Rossignol, Serge; Doyon, Julien
2010-05-01
There is ample evidence that motor sequence learning is mediated by changes in brain activity. Yet the question of whether this form of learning elicits changes detectable at the spinal cord level has not been addressed. To date, studies in humans have revealed that spinal reflex activity may be altered during the acquisition of various motor skills, but a link between motor sequence learning and changes in spinal excitability has not been demonstrated. To address this issue, we studied the modulation of H-reflex amplitude evoked in the flexor carpi radialis muscle of 14 healthy individuals between blocks of movements that involved the implicit acquisition of a sequence versus other movements that did not require learning. Each participant performed the task in three conditions: "sequence" (externally triggered, repeating and sequential movements), "random" (similar movements, but performed in an arbitrary order), and "simple" (alternating movements in a left-right or up-down direction only). When controlling for background muscular activity, H-reflex amplitude was significantly more reduced in the sequence (43.8 ± 1.47%; mean ± SE) compared with the random (38.2 ± 1.60%) and simple (31.5 ± 1.82%) conditions, while the M-response was not different across conditions. Furthermore, H-reflex changes were observed from the beginning of the learning process up to when subjects reached asymptotic performance on the motor task. Changes also persisted for >60 s after motor activity ceased. Such findings suggest that the excitability in some spinal reflex circuits is altered during the implicit learning process of a new motor sequence.
Service, Elisabet; Maury, Sini
2015-01-01
Working memory (WM) has been described as an interface between cognition and action, or a system for access to a limited amount of information needed in complex cognition. Access to morphological information is needed for comprehending and producing sentences. The present study probed WM for morphologically complex word forms in Finnish, a morphologically rich language. We studied monomorphemic (boy), inflected (boy+’s), and derived (boy+hood) words in three tasks. Simple span, immediate serial recall of words, in Experiment 1, is assumed to mainly rely on information in the focus of attention. Sentence span, a dual task combining sentence reading with recall of the last word (Experiment 2) or of a word not included in the sentence (Experiment 3) is assumed to involve establishment of a search set in long-term memory for fast activation into the focus of attention. Recall was best for monomorphemic and worst for inflected word forms with performance on derived words in between. However, there was an interaction between word type and experiment, suggesting that complex span is more sensitive to morphological complexity in derivations than simple span. This was explored in a within-subjects Experiment 4 combining all three tasks. An interaction between morphological complexity and task was replicated. Both inflected and derived forms increased load in WM. In simple span, recall of inflectional forms resulted in form errors. Complex span tasks were more sensitive to morphological load in derived words, possibly resulting from interference from morphological neighbors in the mental lexicon. The results are best understood as involving competition among inflectional forms when binding words from input into an output structure, and competition from morphological neighbors in secondary memory during cumulative retrieval-encoding cycles. Models of verbal recall need to be able to represent morphological as well as phonological and semantic information. PMID:25642181
A SIMPLE CELLULAR AUTOMATON MODEL FOR HIGH-LEVEL VEGETATION DYNAMICS
We have produced a simple two-dimensional (ground-plan) cellular automata model of vegetation dynamics specifically to investigate high-level community processes. The model is probabilistic, with individual plant behavior determined by physiologically-based rules derived from a w...
Draft Genome Sequence of the Cellulolytic Bacterium Clostridium papyrosolvens C7 (ATCC 700395).
Zepeda, Veronica; Dassa, Bareket; Borovok, Ilya; Lamed, Raphael; Bayer, Edward A; Cate, Jamie H D
2013-09-12
We report the draft genome sequence of the cellulose-degrading bacterium Clostridium papyrosolvens C7, originally isolated from mud collected below a freshwater pond in Massachusetts. This Gram-positive bacterium grows in a mesophilic anaerobic environment with filter paper as the only carbon source, and it has a simple cellulosome system with multiple carbohydrate-degrading enzymes.
Draft Genome Sequence of the Cellulolytic Bacterium Clostridium papyrosolvens C7 (ATCC 700395)
Zepeda, Veronica; Dassa, Bareket; Borovok, Ilya; Lamed, Raphael; Bayer, Edward A.
2013-01-01
We report the draft genome sequence of the cellulose-degrading bacterium Clostridium papyrosolvens C7, originally isolated from mud collected below a freshwater pond in Massachusetts. This Gram-positive bacterium grows in a mesophilic anaerobic environment with filter paper as the only carbon source, and it has a simple cellulosome system with multiple carbohydrate-degrading enzymes. PMID:24029755
ERIC Educational Resources Information Center
Axelrod, Michael I.; Zank, Amber J.
2012-01-01
Noncompliance is one of the most problematic behaviors within the school setting. One strategy to increase compliance of noncompliant students is a high-probability command sequence (HPCS; i.e., a set of simple commands in which an individual is likely to comply immediately prior to the delivery of a command that has a lower probability of…
USDA-ARS's Scientific Manuscript database
The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding it...
Simple sequence repeats in Escherichia coli: abundance, distribution, composition, and polymorphism.
Gur-Arie, R; Cohen, C J; Eitan, Y; Shelef, L; Hallerman, E M; Kashi, Y
2000-01-01
Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.
Improved Spin-Echo-Edited NMR Diffusion Measurements
NASA Astrophysics Data System (ADS)
Otto, William H.; Larive, Cynthia K.
2001-12-01
The need for simple and robust schemes for the analysis of ligand-protein binding has resulted in the development of diffusion-based NMR techniques that can be used to assay binding in protein solutions containing a mixture of several ligands. As a means of gaining spectral selectivity in NMR diffusion measurements, a simple experiment, the gradient modified spin-echo (GOSE), has been developed to reject the resonances of coupled spins and detect only the singlets in the 1H NMR spectrum. This is accomplished by first using a spin echo to null the resonances of the coupled spins. Following the spin echo, the singlet magnetization is flipped out of the transverse plane and a dephasing gradient is applied to reduce the spectral artifacts resulting from incomplete cancellation of the J-coupled resonances. The resulting modular sequence is combined here with the BPPSTE pulse sequence; however, it could be easily incorporated into any pulse sequence where additional spectral selectivity is desired. Results obtained with the GOSE-BPPSTE pulse sequence are compared with those obtained with the BPPSTE and CPMG-BPPSTE experiments for a mixture containing the ligands resorcinol and tryptophan in a solution of human serum albumin.
Yang, Ji Young; Pak, Jae-Hong; Kim, Seung-Chul
2018-08-20
Previous phylogenetic studies have suggested that Rubus takesimensis (Rosaceae), which is endemic to Ulleung Island, Korea, is closely related to R. crataegifolius, which is broadly distributed across East Asia. A recent phylogeographic study also suggested the possible polyphyletic origins of R. takesimensis from multiple source populations of its continental progenitor R. crataegifolius in China, Japan, Korea, and the Russian Far East. However, even though the progenitor-derivative relationship between R. crataegifolius and R. takesimensis has been established, little is known about the chloroplast genome (i.e., plastome) evolution of anagenetically derived species on oceanic islands and their continental progenitor species. In the present study, we characterized the complete plastome of R. takesimensis and compared it to those of R. crataegifolius and four other Rubus species. The R. takesimensis plastome was 155,760 base pairs (bp) long, a total of 46 bp longer than the plastome of R. crataegifolius (28 from LSC and 18 from SSC). No structural or content rearrangements were found between the species pairs. Four highly variable intergenic regions (rpl32/trnL, rps4/trnT, trnT/trnL, and psbZ/trnG) were identified between R. takesimensis and R. crataegifolius. Compared to the plastomes of other congeneric species (R. corchorifolius, R. fockeanus, and R. niveus), six highly variable intergenic regions (ndhC/psaC, rps16/trnQ, trnK/rps16, trnL/trnF, trnM/atpE, and trnQ/psbK) were also identified. A total of 116 simple sequence repeats (SSRs), including 48 mononucleotide, 64 dinucleotide, and four trinucleotide repeat motifs were characterized in R. takesimensis. The plastome resources generated by the present study will help to elucidate plastome evolution within the genus and to resolve phylogenetic relationships within highly complex and reticulated lineages. Phylogenetic analysis supported both the monophyly of Rubus and the sister relationship between R. crataegifolius and R. takesimensis.
Bounds on the cross-correlation functions of state m-sequences
NASA Astrophysics Data System (ADS)
Woodcock, C. F.; Davies, Phillip A.; Shaar, Ahmed A.
1987-03-01
Lower and upper bounds on the peaks of the periodic Hamming cross-correlation function for state m-sequences, which are often used in frequency-hopped spread-spectrum systems, are derived. The state position mapped (SPM) sequences of the state m-sequences are described. The use of SPM sequences for OR-channel code division multiplexing is studied. The relation between the Hamming cross-correlation function and the correlation function of SPM sequences is examined. Numerical results which support the theoretical data are presented.
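For orientation, the periodic Hamming cross-correlation named in the abstract above can be sketched as follows, assuming the usual definition from the frequency-hopping literature: for two sequences x and y of the same period N, H(tau) counts the positions t at which x[t] equals y[(t + tau) mod N]. The function name and the toy sequences are illustrative; the sketch does not construct state m-sequences or SPM sequences and does not reproduce the paper's bounds.

```python
def hamming_cross_correlation(x, y):
    """Periodic Hamming cross-correlation H(tau) for tau = 0 .. N-1.

    H(tau) is the number of positions t where x[t] == y[(t + tau) % N],
    i.e. the number of frequency coincidences ("hits") at cyclic shift tau.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same period")
    n = len(x)
    return [
        sum(1 for t in range(n) if x[t] == y[(t + tau) % n])
        for tau in range(n)
    ]


# Toy hopping patterns over the frequency slots {0, 1, 2}.
x = [0, 1, 2, 0, 1]
y = [1, 0, 2, 1, 0]
h = hamming_cross_correlation(x, y)
peak = max(h)  # the quantity that bounds of this kind constrain
```

The peak value matters in practice because it is the worst-case number of collisions between two users' hopping patterns over one period.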
Kimura, Tomohiro; Nakano, Toshiki; Yamaguchi, Toshiyasu; Sato, Minoru; Ogawa, Tomohisa; Muramoto, Koji; Yokoyama, Takehiko; Kan-No, Nobuhiro; Nagahisa, Eizou; Janssen, Frank; Grieshaber, Manfred K
2004-01-01
The complete complementary DNA sequences of genes presumably coding for opine dehydrogenases from Arabella iricolor (sandworm), Haliotis discus hannai (abalone), and Patinopecten yessoensis (scallop) were determined, and partial cDNA sequences were derived for Meretrix lusoria (Japanese hard clam) and Spisula sachalinensis (Sakhalin surf clam). The primers ODH-9F and ODH-11R proved useful for amplifying the sequences for opine dehydrogenases from the 4 mollusk species investigated in this study. The sequence of the sandworm was obtained using primers constructed from the amino acid sequence of tauropine dehydrogenase, the main opine dehydrogenase in A. iricolor. The complete cDNA sequences of A. iricolor, H. discus hannai, and P. yessoensis encode 397, 400, and 405 amino acids, respectively. All sequences were aligned and compared with published databank sequences of Loligo opalescens, Loligo vulgaris (squid), Sepia officinalis (cuttlefish), and Pecten maximus (scallop). As expected, a high level of homology was observed for the cDNA from closely related species, such as for cephalopods or scallops, whereas cDNA from the other species showed lower-level homologies. A similar trend was observed when the deduced amino acid sequences were compared. Furthermore, alignment of these sequences revealed some structural motifs that are possibly related to the binding sites of the substrates. The phylogenetic trees derived from the nucleotide and amino acid sequences were consistent with the classification of species resulting from classical taxonomic analyses.
Lewers, Kim S; Saski, Chris A; Cuthbertson, Brandon J; Henry, David C; Staton, Meg E; Main, Dorrie S; Dhanaraj, Anik L; Rowland, Lisa J; Tomkins, Jeff P
2008-01-01
Background: The recent development of novel repeat-fruiting types of blackberry (Rubus L.) cultivars, combined with a long history of morphological marker-assisted selection for thornlessness by blackberry breeders, has given rise to increased interest in using molecular markers to facilitate blackberry breeding. Yet no genetic maps, molecular markers, or even sequences exist specifically for cultivated blackberry. The purpose of this study is to begin development of these tools by generating and annotating the first blackberry expressed sequence tag (EST) library, designing primers from the ESTs to amplify regions containing simple sequence repeats (SSR), and testing the usefulness of a subset of the EST-SSRs with two blackberry cultivars. Results: A cDNA library of 18,432 clones was generated from expanding leaf tissue of the cultivar Merton Thornless, a progenitor of many thornless commercial cultivars. Among the most abundantly expressed of the 3,000 genes annotated were those involved with energy, cell structure, and defense. From individual sequences containing SSRs, 673 primer pairs were designed. Of a randomly chosen set of 33 primer pairs tested with two blackberry cultivars, 10 detected an average of 1.9 polymorphic PCR products. Conclusion: This rate predicts that this library may yield as many as 940 SSR primer pairs detecting 1,786 polymorphisms. This may be sufficient to generate a genetic map that can be used to associate molecular markers with phenotypic traits, making possible molecular marker-assisted breeding to complement existing morphological marker-assisted breeding in blackberry. PMID:18570660
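The extrapolation at the end of the abstract above can be checked with a few lines of arithmetic. The sketch below only reproduces the multiplication the abstract implies (940 primer pairs at an average of 1.9 polymorphic products each). How the figure of 940 was derived is not stated in the abstract, so it is taken as given, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
tested_pairs = 33                    # randomly chosen primer pairs tested
polymorphic_pairs = 10               # pairs that detected polymorphic products
products_per_polymorphic_pair = 1.9  # average polymorphic PCR products per such pair
projected_pairs = 940                # projected yield, taken as given from the abstract

# Share of tested primer pairs that were polymorphic (about 30%).
fraction_polymorphic = polymorphic_pairs / tested_pairs

# Projected number of polymorphisms across the whole library.
projected_polymorphisms = round(projected_pairs * products_per_polymorphic_pair)
```

The product matches the 1,786 polymorphisms quoted in the abstract.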
On the Normalization of the Minimum Free Energy of RNAs by Sequence Length
Trotta, Edoardo
2014-01-01
The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias, enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families, showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size. PMID:25405875
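One simple way a length bias of this kind can arise is shown in the toy model below. It assumes, purely for illustration, that MFE follows a straight line with a nonzero length offset, MFE = SLOPE * (L - L0). The values of `SLOPE` and `L0` are hypothetical and are not taken from the paper, and the abstract does not give the authors' corrected formula. Under this assumption the simple index MFE / L equals SLOPE * (1 - L0 / L), which varies hyperbolically with L, whereas dividing by (L - L0) returns the constant slope.

```python
SLOPE = -0.3  # hypothetical MFE change per nucleotide (kcal/mol per nt)
L0 = 10       # hypothetical length offset (nt)


def toy_mfe(length):
    """Toy model: MFE is linear in length with a nonzero offset L0."""
    return SLOPE * (length - L0)


def simple_index(length):
    """Naive normalization: MFE divided by the number of nucleotides."""
    return toy_mfe(length) / length


def offset_index(length):
    """Normalization that accounts for the offset (valid for length > L0)."""
    return toy_mfe(length) / (length - L0)


# Print the two indices side by side for a few lengths.
for length in (20, 50, 100, 1000):
    print(length, round(simple_index(length), 3), round(offset_index(length), 3))
```

In this toy model the naive index becomes steadily more negative as length grows, even though the per-nucleotide contribution is constant by construction, which is why comparing short and long sequences with MFE / L can mislead.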
Palaeosymbiosis Revealed by Genomic Fossils of Wolbachia in a Strongyloidean Nematode
Koutsovoulos, Georgios; Makepeace, Benjamin; Tanya, Vincent N.; Blaxter, Mark
2014-01-01
Wolbachia are common endosymbionts of terrestrial arthropods, and are also found in nematodes: the animal-parasitic filaria, and the plant-parasite Radopholus similis. Lateral transfer of Wolbachia DNA to the host genome is common. We generated a draft genome sequence for the strongyloidean nematode parasite Dictyocaulus viviparus, the cattle lungworm. In the assembly, we identified nearly 1 Mb of sequence with similarity to Wolbachia. The fragments were unlikely to derive from a live Wolbachia infection: most were short, and the genes were disabled through inactivating mutations. Many fragments were co-assembled with definitively nematode-derived sequence. We found limited evidence of expression of the Wolbachia-derived genes. The D. viviparus Wolbachia genes were most similar to filarial strains and strains from the host-promiscuous clade F. We conclude that D. viviparus was infected by Wolbachia in the past, and that clade F-like symbionts may have been the source of filarial Wolbachia infections. PMID:24901418