Science.gov

Sample records for sequence tags analysis

  1. Analysis of expressed sequence tags from Plasmodium falciparum.

    PubMed

    Chakrabarti, D; Reddy, G R; Dame, J B; Almira, E C; Laipis, P J; Ferl, R J; Yang, T P; Rowe, T C; Schuster, S M

    1994-07-01

    An initiative was undertaken to sequence all genes of the human malaria parasite Plasmodium falciparum in an effort to gain a better understanding at the molecular level of the parasite that inflicts much suffering in the developing world. As one approach to analyzing the transcribed sequences of its genome, 550 random complementary DNA clones from the intraerythrocytic form of the parasite were partially sequenced. The sequences, after editing, generated 389 expressed sequence tag sites and over 105 kb of DNA sequence. About 32% of these clones showed significant homology with other genes in the database. These clones represent 340 new Plasmodium falciparum expressed sequence tags.

  2. Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)

    NASA Astrophysics Data System (ADS)

    Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song

    2010-01-01

    In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera (Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10 072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3 519 nonredundant gene groups, including 1 446 clusters and 2 073 singletons. After annotation with the nr database, a large number of genes were found to be related to chloroplast and ribosomal proteins. GO functional classification showed that 1 418 ESTs participated in photosynthesis and 1 359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. Together, the evidence indicates that U. prolifera has the material and energy foundation for intense photosynthesis and rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated to Pseudendoclonium akinetum (Ulvophyceae).

  3. Expressed sequence tag analysis in tef (Eragrostis tef (Zucc) Trotter).

    PubMed

    Yu, Ju-Kyung; Sun, Qi; Rota, Mauricio La; Edwards, Hugh; Tefera, Hailu; Sorrells, Mark E

    2006-04-01

    Tef (Eragrostis tef (Zucc.) Trotter) is the most important cereal crop in Ethiopia; however, there is very little DNA sequence information available for this species. Expressed sequence tags (ESTs) were generated from 4 cDNA libraries: seedling leaf, seedling root, and inflorescence of E. tef and seedling leaf of Eragrostis pilosa, a wild relative of E. tef. Clustering of 3603 sequences produced 530 clusters and 1890 singletons, resulting in 2420 tef unigenes. Approximately 3/4 of tef unigenes matched protein or nucleotide sequences in public databases. Annotation of unigenes associated 68% of the putative tef genes with gene ontology categories. Identification of the translated unigenes for conserved protein domains revealed 389 protein family domains (Pfam), the most frequent of which was protein kinase. A total of 170 ESTs containing simple sequence repeats (EST-SSRs) were identified and 80 EST-SSR markers were developed. In addition, 19 single-nucleotide polymorphism (SNP) and (or) insertion-deletion (indel) and 34 intron fragment length polymorphism (IFLP) markers were developed. The EST database and molecular markers generated in this study will be valuable resources for further tef genetic research.
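
    The clustering figures in this record can be checked with simple arithmetic. The sketch below is illustrative only: the function name and the redundancy definition (one minus unigenes divided by input sequences) are assumptions, not taken from the paper, and it assumes every input sequence is either a singleton or a member of exactly one cluster.

```python
def summarise_clustering(n_ests, n_clusters, n_singletons):
    """Summarise an EST clustering result (hypothetical helper).

    Assumes each input sequence is either a singleton or belongs to
    exactly one cluster, so clustered ESTs = n_ests - n_singletons.
    """
    if n_clusters <= 0 or n_ests <= 0:
        raise ValueError("n_ests and n_clusters must be positive")
    n_unigenes = n_clusters + n_singletons
    ests_in_clusters = n_ests - n_singletons
    return {
        "unigenes": n_unigenes,
        "ests_in_clusters": ests_in_clusters,
        "mean_cluster_size": ests_in_clusters / n_clusters,
        # One common, informal definition of redundancy (an assumption here).
        "redundancy": 1 - n_unigenes / n_ests,
    }


# Figures reported in the tef abstract above.
tef = summarise_clustering(n_ests=3603, n_clusters=530, n_singletons=1890)
print(tef)
```

    With the reported inputs, the unigene count reproduces the 2420 stated in the abstract; the other values are derived quantities that the abstract does not report.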

  4. Computational methods for the analysis of tag sequences in metagenomics studies.

    PubMed

    Chang, Qin; Luan, Yihui; Chen, Ting; Fuhrman, Jed A; Sun, Fengzhu

    2012-06-01

    Metagenomics commonly refers to the study of genetic materials directly derived from environments without culturing. Several ongoing large-scale metagenomics projects related to human and marine life, as well as pedology studies, have generated enormous amounts of data, posing a key challenge for efficient analysis, as we try to 1) understand microbial organism assemblage under different conditions, 2) compare different communities, and 3) understand how microbial organisms associate with each other and the environment. To address such questions, investigators are using new sequencing technologies, including Sanger, Illumina Solexa, and Roche 454, to sequence either particular genes, called tag sequences, mostly 16S or 18S ribosomal RNA sequences or other conserved genes, or whole metagenome shotgun sequences of all the genetic materials in a given community. In this paper, we review computational methods used for the analysis of tag sequences.

  5. ChIA-PET tool for comprehensive chromatin interaction analysis with paired-end tag sequencing.

    PubMed

    Li, Guoliang; Fullwood, Melissa J; Xu, Han; Mulawadi, Fabianus Hendriyan; Velkov, Stoyan; Vega, Vinsensius; Ariyaratne, Pramila Nuwantha; Mohamed, Yusoff Bin; Ooi, Hong-Sain; Tennakoon, Chandana; Wei, Chia-Lin; Ruan, Yijun; Sung, Wing-Kin

    2010-01-01

    Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) is a new technology to study genome-wide long-range chromatin interactions bound by protein factors. Here we present ChIA-PET Tool, a software package for automatic processing of ChIA-PET sequence data, including linker filtering, mapping tags to reference genomes, identifying protein binding sites and chromatin interactions, and displaying the results on a graphical genome browser. ChIA-PET Tool is fast, accurate, comprehensive, user-friendly, and open source (available at http://chiapet.gis.a-star.edu.sg). PMID:20181287

  6. Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

    PubMed Central

    Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

    2003-01-01

    To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembly. This indicated that possibly 33,620 unique genes had been identified and that >90% of the sugarcane expressed genes were tagged. PMID:14613979
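
    The relationship between the assembled transcripts, the reported redundancy, and the estimated number of unique genes can be worked through as below. This is a hedged sketch: both function names are hypothetical, and it assumes the unique-gene estimate is simply the transcript count scaled by one minus the redundancy, which the abstract does not state explicitly.

```python
def unique_gene_estimate(n_transcripts, redundancy):
    """Transcripts remaining after removing a redundant fraction."""
    return n_transcripts * (1 - redundancy)


def implied_redundancy(n_transcripts, n_unique):
    """Redundant fraction implied by a reported unique-gene count."""
    return 1 - n_unique / n_transcripts


# Figures reported in the sugarcane abstract above.
estimate = unique_gene_estimate(43141, 0.22)
implied = implied_redundancy(43141, 33620)
print(f"estimate at 22% redundancy: {estimate:.0f}")
print(f"redundancy implied by 33,620 unique genes: {implied:.2%}")
```

    Under that assumption, a flat 22% gives a figure slightly above the reported 33,620, which would be consistent with the 22% in the abstract being a rounded value.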

  7. Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

    PubMed Central

    Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

    2010-01-01

    Despite its economic, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs, suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. The accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools, with the possibility to add sequences from the public domain. We anticipate that our results will provide a home base for genomic studies of camel and other comparative studies, enabling a starting point for whole genome sequencing of the organism. PMID:20502665

  8. Massively parallel sequencing and analysis of expressed sequence tags in a successful invasive plant

    PubMed Central

    Prentis, Peter J.; Woolfit, Megan; Thomas-Hall, Skye R.; Ortiz-Barrientos, Daniel; Pavasovic, Ana; Lowe, Andrew J.; Schenk, Peer M.

    2010-01-01

    Background: Invasive species pose a significant threat to global economies, agriculture and biodiversity. Despite progress towards understanding the ecological factors associated with plant invasions, limited genomic resources have made it difficult to elucidate the evolutionary and genetic factors responsible for invasiveness. This study presents the first expressed sequence tag (EST) collection for Senecio madagascariensis, a globally invasive plant species. Methods: We used pyrosequencing of one normalized and two subtractive libraries, derived from one native and one invasive population, to generate an EST collection. ESTs were assembled into contigs, annotated by BLAST comparison with the NCBI non-redundant protein database and assigned gene ontology (GO) terms from the Plant GO Slim ontologies. Key Results: Assembly of the 221 746 sequence reads resulted in 12 442 contigs. Over 50 % (6183) of 12 442 contigs showed significant homology to proteins in the NCBI database, representing approx. 4800 independent transcripts. The molecular transducer GO term was significantly over-represented in the native (South African) subtractive library compared with the invasive (Australian) library. Based on NCBI BLAST hits and literature searches, 40 % of the molecular transducer genes identified in the South African subtractive library are likely to be involved in response to biotic stimuli, such as fungal, bacterial and viral pathogens. Conclusions: This EST collection is the first representation of the S. madagascariensis transcriptome and provides an important resource for the discovery of candidate genes associated with plant invasiveness. The over-representation of molecular transducer genes associated with defence responses in the native subtractive library provides preliminary support for aspects of the enemy release and evolution of increased competitive ability hypotheses in this successful invasive. This study highlights the contribution of next-generation sequencing

  9. ANALYSIS OF EXPRESSED SEQUENCE TAGS FROM THE GREEN ALGA DUNALIELLA SALINA (CHLOROPHYTA).

    PubMed

    Zhao, Rui; Cao, Yu; Xu, Hui; Lv, Linfeng; Qiao, Dairong; Cao, Yi

    2011-12-01

    The unicellular green alga Dunaliella salina (Dunal) Teodor. is a novel model photosynthetic eukaryote for studying photosystems, high salinity acclimation, and carotenoid accumulation. In spite of such significance, there have been limited studies on the Dunaliella genome, transcriptome, and proteome. To further investigate D. salina, a cDNA library was constructed and sequenced. Here, we present the analysis of the 2,282 expressed sequence tags (ESTs) generated together with 3,990 ESTs from dbEST. A total of 4,148 unique sequences (UniSeqs) were identified, of which 56.1% had sequence similarity with Uniprot entries, suggesting that a large number of unique genes may be harbored by Dunaliella. Additionally, protein family domains were identified to further characterize these sequences. We also compared EST sequences with different complete eukaryotic genomes from several animals, plants, and fungi. We observed notable differences between D. salina and other organisms. This EST collection and its annotation provided a significant resource for basic and applied research on D. salina and laid the foundation for a systematic analysis of the transcriptome basis of green algae development and diversification.

  10. Analysis of expressed sequence tags from a naked foraminiferan Reticulomyxa filosa.

    PubMed

    Burki, Fabien; Nikolaev, Sergey I; Bolivar, Ignacio; Guiard, Jackie; Pawlowski, Jan

    2006-08-01

    Foraminifers are a major component of modern marine ecosystems and one of the most important oceanic producers of calcium carbonate. They are a key phylogenetic group among amoeboid protists, but our knowledge of their genome is still mostly limited to a few conserved genes. Here, we report the first study of expressed genes by means of expressed sequence tags (ESTs) from the freshwater naked foraminiferan Reticulomyxa filosa. Cluster analysis of 1630 valid ESTs enabled the identification of 178 groups of related sequences and 871 singlets. Approximately 50% of the putative unique 1059 ESTs could be annotated using Blast searches against the protein database SwissProt + TrEMBL. The EST database described here is the first step towards gene discovery in Foraminifera and should provide the basis for new insights into the genomic and transcriptomic characteristics of these interesting but poorly understood protists.

  11. Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

    PubMed Central

    Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

    2007-01-01

    Background: Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results: A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable in a high-performance, web-based, user-friendly relational database called EST model database or ESTMD version 2. Conclusion: The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730

  12. Analysis of expressed sequence tags from the red alga Griffithsia okiensis.

    PubMed

    Lee, Hyoungseok; Lee, Hong Kum; An, Gynheung; Lee, Yoo Kyung

    2007-12-01

    Red algae are distributed globally, and the group contains several commercially important species. Griffithsia okiensis is one of the most extensively studied red algal species. In this study, we conducted expressed sequence tag (EST) analysis and synonymous codon usage analysis using cultured G. okiensis samples. A total of 1,104 cDNA clones were sequenced using a cDNA library made from samples collected from Dolsan Island, on the southern coast of Korea. The clustering analysis of these sequences allowed for the identification of 1,048 unigene clusters consisting of 36 consensus and 1,012 singleton sequences. BLASTX searches generated 532 significant hits (E-value < 10^-4) and, via further Gene Ontology analysis, we constructed a functional classification of 434 unigenes. Our codon usage analysis showed that unigene clusters with more than three ESTs had higher GC contents (76.5%) at the third position of the codons than the singletons. Also, the majority of the optimal codons of G. okiensis and Chondrus crispus belonging to Bangiophycidae were C-ending, whereas those of Porphyra yezoensis belonging to Florideophycidae were G-ending. An orthologous gene search for the P. yezoensis EST database resulted in the identification of 39 unigenes commonly expressed in two rhodophytes, which have putative functions for structural proteins, protein degradation, signal transduction, stress response, and physiological processes. Although experiments have been conducted on a limited scale, this study provides a material basis for the development of microarrays useful for gene expression studies, as well as useful information for the comparative genomic analysis of red algae.
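
    The GC content at the third codon position (often called GC3) mentioned in this record can be computed as sketched below. This is a minimal illustration, not the authors' method: the function name is hypothetical, and it assumes the input is an in-frame coding sequence that starts at the first base of a codon; trailing bases that do not complete a codon are ignored.

```python
def gc3_content(cds):
    """Fraction of complete codons whose third base is G or C.

    Assumes `cds` is an in-frame coding sequence (hypothetical helper).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop an incomplete trailing codon
    third_bases = cds[2:usable:3]      # bases at codon position 3
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Toy sequences, not data from the study.
print(gc3_content("ATGGCCAAC"))   # codons ATG, GCC, AAC
print(gc3_content("ATGAAATTTG"))  # trailing G is ignored
```

    Comparing this value between groups of unigenes (for example, clusters with many ESTs versus singletons) is the kind of comparison the abstract describes; the grouping itself is outside this sketch.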

  13. Analysis and functional annotation of expressed sequence tags of water buffalo.

    PubMed

    Bajetha, Garima; Bhati, Jyotika; Sarika; Iquebal, M A; Rai, Anil; Arora, Vasu; Kumar, Dinesh

    2013-01-01

    An elucidated genome of the domestic river buffalo would contribute enormously to the economy and to a better understanding of genome evolution. Here, an attempt is made to obtain genomic information on buffalo from the Expressed Sequence Tags (ESTs) of Bubalus bubalis available in the public domain. These ESTs were annotated and classified into 15 different functional categories based on their homology to known proteins. Interestingly, 41.79% of the contigs were found to be buffalo-specific novel ESTs with respect to the other species used in the analysis, which warrants further study. Also, 224 pSNPs (putative Single Nucleotide Polymorphisms) were detected. This study will provide a home base for further genomic studies of buffalo and comparative studies, enabling a starting point for the genome annotation of the organism. Supplementary materials are available for this article online.

  14. Identification of Differential Gene Expression in Brassica rapa Nectaries through Expressed Sequence Tag Analysis

    PubMed Central

    Hampton, Marshall; Xu, Wayne W.; Kram, Brian W.; Chambers, Emily M.; Ehrnriter, Jerad S.; Gralewski, Jonathan H.; Joyal, Teresa; Carter, Clay J.

    2010-01-01

    Background: Nectaries are the floral organs responsible for the synthesis and secretion of nectar. Despite their central roles in pollination biology, very little is understood about the molecular mechanisms underlying nectar production. This project was undertaken to identify genes potentially involved in mediating nectary form and function in Brassica rapa. Methodology and Principal Findings: Four cDNA libraries were created using RNA isolated from the median and lateral nectaries of B. rapa flowers, with one normalized and one non-normalized library being generated from each tissue. Approximately 3,000 clones from each library were randomly sequenced from the 5′ end to generate a total of 11,101 high quality expressed sequence tags (ESTs). Sequence assembly of all ESTs together allowed the identification of 1,453 contigs and 4,403 singleton sequences, with the Basic Localized Alignment Search Tool (BLAST) being used to identify 4,138 presumptive orthologs to Arabidopsis thaliana genes. Several genes differentially expressed between median and lateral nectaries were initially identified based upon the number of BLAST hits represented by independent ESTs, and later confirmed via reverse transcription polymerase chain reaction (RT PCR). RT PCR was also used to verify the expression patterns of eight putative orthologs to known Arabidopsis nectary-enriched genes. Conclusions/Significance: This work provided a snapshot of gene expression in actively secreting B. rapa nectaries, and also allowed the identification of differential gene expression between median and lateral nectaries. Moreover, 207 orthologs to known nectary-enriched genes from Arabidopsis were identified through this analysis. The results suggest that genes involved in nectar production are conserved amongst the Brassicaceae, and also supply clones and sequence information that can be used to probe nectary function in B. rapa. PMID:20098697

  15. Expressed sequence tag analysis of functional genes associated with adventitious rooting in Liriodendron hybrids.

    PubMed

    Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X

    2016-06-24

    Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting.
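
    Simple sequence repeats of the kind counted in this record are typically found by scanning each EST for short motifs repeated in tandem. The sketch below shows one way to do that with a regular expression. It is an assumption-laden illustration: the function name and the minimum repeat counts (six for dinucleotides, five for trinucleotides) are hypothetical choices, since the abstract does not state the detection criteria used.

```python
import re

# Minimum tandem copies per motif length; illustrative assumptions only.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_copies in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least
        # (min_copies - 1) further copies of itself.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_copies - 1)
        )
        for match in pattern.finditer(seq):
            motif = match.group(2)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            copies = len(match.group(1)) // motif_len
            hits.append((match.start(), motif, copies))
    return sorted(hits)


# Toy sequence, not data from the study.
example = "TTC" + "AG" * 7 + "CCATG" + "CTT" * 5 + "GA"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

    Running such a scan over every EST and tallying hits by motif length would give the kind of summary reported above (share of dinucleotide and trinucleotide repeats), though thresholds strongly affect the counts.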

  16. Alternative splicing and expression profile analysis of expressed sequence tags in domestic pig.

    PubMed

    Zhang, Liang; Tao, Lin; Ye, Lin; He, Ling; Zhu, Yuan-Zhong; Zhu, Yue-Dong; Zhou, Yan

    2007-02-01

    Domestic pig (Sus scrofa domestica) is one of the most important mammals to humans. Alternative splicing is a cellular mechanism in eukaryotes that greatly increases the diversity of gene products. Expressed sequence tags (ESTs) have been widely used for gene discovery, expression profile analysis, and alternative splicing detection. In this study, a total of 712,905 ESTs extracted from 101 different non-normalized EST libraries of the domestic pig were analyzed. These EST libraries cover the nervous system, digestive system, immune system, and meat production related tissues from embryo, newborn, and adult pigs, contributing to the analysis of alternative splicing variants as well as expression profiles of tissues at various stages. A modified approach was designed to cluster and assemble large EST datasets, aiming to detect alternative splicing together with the EST abundance of each splicing variant. Considerable effort was made to classify alternative splicing into different types and to apply different filters to each type to get more reliable results. Finally, a total of 1,223 genes with an average of 2.8 splicing variants were detected among 16,540 unique genes. The overview of expression profiles would change when we take alternative splicing into account. PMID:17572361

  18. Transcriptome analysis of the Amazonian viper Bothrops atrox venom gland using expressed sequence tags (ESTs).

    PubMed

    Neiva, Márcia; Arraes, Fabricio B M; de Souza, Jonso Vieira; Rádis-Baptista, Gandhi; Prieto da Silva, Alvaro R B; Walter, Maria Emilia M T; Brigido, Marcelo de Macedo; Yamane, Tetsuo; López-Lozano, Jorge Luiz; Astolfi-Filho, Spartaco

    2009-03-15

    Bothrops atrox is a highly dangerous pit viper in the Brazilian Amazon region. We produced a global catalogue of gene transcripts to identify the main toxin and other protein families present in the B. atrox venom gland. We prepared a directional cDNA library, from which a set of 610 high quality expressed sequence tags (ESTs) were generated by bioinformatics processing. Our data indicated a predominance of transcripts encoding mainly metalloproteinases (59% of the toxins). The expression pattern of the B. atrox venom was similar to Bothrops insularis, Bothrops jararaca and Bothrops jararacussu in terms of toxin type, although some differences were observed. B. atrox showed a higher amount of the PIII class of metalloproteinases which correlates well with the observed intense hemorrhagic action of its toxin. Also, the PLA2 content was the second highest in this sample compared to the other three Bothrops transcriptomes. To our knowledge, this work is the first transcriptome analysis of an Amazonian rain forest pit viper and it will contribute to the body of knowledge regarding the gene diversity of the venom gland of members of the Bothrops genus. Moreover, our results can be used for future studies with other snake species from the Amazon region to investigate differences in gene patterns or phylogenetic relationships. PMID:19708221

  19. Expressed sequence tag analysis in Cycas, the most primitive living seed plant

    PubMed Central

    Brenner, Eric D; Stevenson, Dennis W; McCombie, Richard W; Katari, Manpreet S; Rudd, Stephen A; Mayer, Klaus FX; Palenchar, Peter M; Runko, Suzan J; Twigg, Richard W; Dai, Guangwei; Martienssen, Rob A; Benfey, Phillip N; Coruzzi, Gloria M

    2003-01-01

    Background: Cycads are ancient seed plants (living fossils) with origins in the Paleozoic. Cycads are sometimes considered a 'missing link' as they exhibit characteristics intermediate between vascular non-seed plants and the more derived seed plants. Cycads have also been implicated as the source of 'Guam's dementia', possibly due to the production of S(+)-beta-methyl-alpha, beta-diaminopropionic acid (BMAA), which is an agonist of animal glutamate receptors. Results: A total of 4,200 expressed sequence tags (ESTs) were created from Cycas rumphii and clustered into 2,458 contigs, of which 1,764 had low-stringency BLAST similarity to other plant genes. Among those cycad contigs with similarity to plant genes, 1,718 cycad 'hits' are to angiosperms, 1,310 match genes in gymnosperms and 734 match lower (non-seed) plants. Forty-six contigs were found that matched only genes in lower plants and gymnosperms. Upon obtaining the complete sequence from the clones of 37/46 contigs, 14 still matched only gymnosperms. Among those cycad contigs common to higher plants, ESTs were discovered that correspond to those involved in development and signaling in present-day flowering plants. We purified a cycad EST for a glutamate receptor (GLR)-like gene, as well as ESTs potentially involved in the synthesis of the GLR agonist BMAA. Conclusions: Analysis of cycad ESTs has uncovered conserved and potentially novel genes. Furthermore, the presence of a glutamate receptor agonist, as well as a glutamate receptor-like gene in cycads, supports the hypothesis that such neuroactive plant products are not merely herbivore deterrents but may also serve a role in plant signaling. PMID:14659015

  20. Analysis of expressed sequence tags from Brassica rapa L. ssp. pekinensis.

    PubMed

    Lim, J Y; Shin, C S; Chung, E J; Kim, J S; Kim, H U; Oh, S J; Choi, W B; Ryou, C S; Kim, J B; Kwon, M S; Chung, T Y; Song, S I; Kim, J K; Nahm, B H; Hwang, Y S; Eun, M Y; Lee, J S; Cheong, J J; Choi, Y D

    2000-08-31

    Non-redundant expressed sequence tags (ESTs) were generated from six different organs at various developmental stages of Chinese cabbage, Brassica rapa L. ssp. pekinensis. Of the 1,295 ESTs, 915 (71%) showed significantly high homology in nucleotide or deduced amino acid sequences with other sequences deposited in databases, while 380 did not show similarity to any sequences. Briefly, 598 ESTs matched with proteins of identified biological function, 177 with hypothetical proteins or non-annotated Arabidopsis genome sequences, and 140 with other ESTs. About 82% of the top-scored matching sequences were from Arabidopsis or Brassica, but overall 558 (43%) ESTs matched with Arabidopsis ESTs at the nucleotide sequence level. This observation strongly supports the idea that gene-expression profiles of Chinese cabbage differ from those of Arabidopsis, despite their genome structures being similar to each other. Moreover, sequence analyses of 21 Brassica ESTs revealed that their primary structure is different from those of corresponding annotated sequences of Arabidopsis genes. Our data suggest that direct prediction of Brassica gene expression patterns based on the information from Arabidopsis genome research has some limitations. Thus, information obtained from the Brassica EST study is useful not only for understanding the unique developmental processes of the plant, but also for the study of Arabidopsis genome structure.

  1. Preparation and analysis of an expressed sequence tag library from the toxic dinoflagellate Alexandrium catenella.

    PubMed

    Uribe, Paulina; Fuentes, Daniela; Valdés, Jorge; Shmaryahu, Amir; Zúñiga, Alicia; Holmes, David; Valenzuela, Pablo D T

    2008-01-01

    Dinoflagellates of the genus Alexandrium are photosynthetic microalgae that are extremely important because of the impact of some toxic species on the shellfish aquaculture industry. Alexandrium catenella is the species responsible for the production of paralytic shellfish poisoning in Chile and other geographical areas. We have constructed a cDNA library from midexponential cells of A. catenella grown in culture free of associated bacteria and sequenced 10,850 expressed sequence tags (ESTs) that were assembled into 1,021 contigs and 5,475 singletons for a total of 6,496 unigenes. Approximately 41.6% of the unigenes showed similarity to genes with predicted function. A significant number of unigenes showed similarity with genes from other dinoflagellates, plants, and other protists. Among the identified genes, the most highly expressed correspond to those coding for proteins of luminescence, carbohydrate metabolism, and photosynthesis. The sequences of 9,847 ESTs have been deposited in GenBank (accession numbers EX 454357-464203). PMID:18478293

  2. Myocardial tagging by cardiovascular magnetic resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications.

    PubMed

    Ibrahim, El-Sayed H

    2011-01-01

    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, where the technique allowed for the first time for visualizing transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention, which makes this

  3. Myocardial tagging by Cardiovascular Magnetic Resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications

    PubMed Central

    2011-01-01

    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, where the technique allowed for the first time for visualizing transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention, which makes this

  4. Desiccation survival in an Antarctic nematode: molecular analysis using expressed sequenced tags

    PubMed Central

    Adhikari, Bishwo N; Wall, Diana H; Adams, Byron J

    2009-01-01

    Background Nematodes are the dominant soil animals in Antarctic Dry Valleys and are capable of surviving desiccation and freezing in an anhydrobiotic state. Genes induced by desiccation stress have been successfully enumerated in nematodes; however we have little knowledge of gene regulation by Antarctic nematodes which can survive multiple environmental stresses. To address this problem we investigated the genetic responses of a nematode species, Plectus murrayi, that is capable of tolerating Antarctic environmental extremes, in particular desiccation and freezing. In this study, we provide the first insight into the desiccation induced transcriptome of an Antarctic nematode through cDNA library construction and suppressive subtractive hybridization. Results We obtained 2,486 expressed sequence tags (ESTs) from 2,586 clones derived from the cDNA library of desiccated P. murrayi. The 2,486 ESTs formed 1,387 putative unique transcripts of which 523 (38%) had matches in the model-nematode Caenorhabditis elegans, 107 (7%) in nematodes other than C. elegans, 153 (11%) in non-nematode organisms and 605 (44%) had no significant match to any sequences in the current databases. The 1,387 unique transcripts were functionally classified by using Gene Ontology (GO) hierarchy and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The results indicate that the transcriptome contains a group of transcripts from diverse functional areas. The subtractive library of desiccated nematodes showed 80 transcripts differentially expressed during desiccation stress, of which 28% were metabolism related, 19% were involved in environmental information processing, 28% involved in genetic information processing and 21% were novel transcripts. Expression profiling of 14 selected genes by quantitative Real-time PCR showed 9 genes significantly up-regulated, 3 down-regulated and 2 continuously expressed in response to desiccation. Conclusion The establishment of a desiccation EST

  5. Generation and Analysis of End Sequence Database for T-DNA Tagging Lines in Rice

    PubMed Central

    An, Suyoung; Park, Sunhee; Jeong, Dong-Hoon; Lee, Dong-Yeon; Kang, Hong-Gyu; Yu, Jung-Hwa; Hur, Junghe; Kim, Sung-Ryul; Kim, Young-Hea; Lee, Miok; Han, Soonki; Kim, Soo-Jin; Yang, Jungwon; Kim, Eunjoo; Wi, Soo Jin; Chung, Hoo Sun; Hong, Jong-Pil; Choe, Vitnary; Lee, Hak-Kyung; Choi, Jung-Hee; Nam, Jongmin; Kim, Seong-Ryong; Park, Phun-Bum; Park, Ky Young; Kim, Woo Taek; Choe, Sunghwa; Lee, Chin-Bum; An, Gynheung

    2003-01-01

    We analyzed 6,749 lines tagged by the gene trap vector pGA2707. This resulted in the isolation of 3,793 genomic sequences flanking the T-DNA. Among the insertions, 1,846 T-DNAs were integrated into genic regions, and 1,864 were located in intergenic regions. Frequencies were also higher at the beginning and end of the coding regions and upstream near the ATG start codon. The overall GC content at the insertion sites was close to that measured from the entire rice (Oryza sativa) genome. Functional classification of these 1,846 tagged genes showed a distribution similar to that observed for all the genes in the rice chromosomes. This indicates that T-DNA insertion is not biased toward a particular class of genes. There were 764, 327, and 346 T-DNA insertions in chromosomes 1, 4 and 10, respectively. Insertions were not evenly distributed; frequencies were higher at the ends of the chromosomes and lower near the centromere. At certain sites, the frequency was higher than in the surrounding regions. This sequence database will be valuable in identifying knockout mutants for elucidating gene function in rice. This resource is available to the scientific community at http://www.postech.ac.kr/life/pfg/risd. PMID:14630961

  6. Pyrosequence analysis of expressed sequence tags for Manduca sexta hemolymph proteins involved in immune responses.

    PubMed

    Zou, Zhen; Najar, Fares; Wang, Yang; Roe, Bruce; Jiang, Haobo

    2008-06-01

    The tobacco hornworm Manduca sexta is widely used as a model organism to investigate the biochemical basis of insect physiological processes but little transcriptome information is available. To get a broad view of the larval hemolymph proteins, particularly those related to immunity, we synthesized and sequenced cDNA fragments from a mixture of eight total RNA samples: fat body and hemocytes from larvae injected with killed bacteria, fat body, hemocytes, integument and trachea from naïve larvae, and fat body and hemocytes from wandering larvae. Using massively parallel pyrosequencing, we obtained 95,458 M. sexta expressed sequence tags (ESTs) at an average size of 185bp per read. A majority of the sequences (69,429 reads) could be assembled into 7231 contigs with an average size of 300bp, 1178 of which had significant similarity with Drosophila genes from various functional groups. Only approximately 8% (606) of the contigs matched known M. sexta cDNA sequences, representing 186 of the 375 unique NCBI entries. The remaining 6625 contigs represented newly discovered cDNA segments from this well studied biochemical model insect. A search of the 7231 contigs using Tribolium castaneum, Drosophila melanogaster, and Bombyx mori immunity-related sequences revealed 424 cDNA contigs with significant similarity (E-value <1 x 10(-5)). These included 218 previously unknown M. sexta sequences coding for putative defense molecules such as pattern recognition receptors, serine proteinases, serpins, Spätzle, Toll-like receptors, intracellular signaling molecules, and antimicrobial peptides. PMID:18510979

  8. Functional annotation of an expressed sequence tag library from Haliotis diversicolor and analysis of its plant-like sequences.

    PubMed

    Jiang, Jing-Zhe; Zhang, Wei; Guo, Zhi-Xun; Cai, Chen-Chen; Su, You-Lu; Wang, Rui-Xuan; Wang, Jiang-Yong

    2011-09-01

    The small abalone, Haliotis diversicolor, is a widely distributed and cultured species in the subtropical coastal area of China. To identify and classify functional genes of this important species, a normalized expressed sequence tag (EST) library, including 7069 high quality ESTs from the total body of H. diversicolor, was analyzed. A total of 4781 unigenes were assembled and 2991 novel abalone genes were identified. The GC content, codon and amino acid usage of the transcriptome were analyzed. For the accurate annotation of the abalone library, different influencing factors were evaluated. The gene ontology (GO) database provided a higher annotation rate (69.6%), and sequences longer than 800bp were easily subjected to a BLAST search. The taxonomy of the BLAST results showed that lancelet and invertebrates are most closely related to abalone. Sixty-seven identified plant-like genes were further examined by reverse transcription-polymerase chain reaction (RT-PCR) and sequencing; only seven of these were real transcripts in abalone. Phylogenetic trees were also constructed to illustrate the positions of two Cystatin sequences and one Calmodulin protein sequence identified in abalone. To perform functional classification, three different databases (GO, KEGG and COG) were used and 60 immune or disease-related unigenes were determined. This work has greatly enlarged the known gene pool of H. diversicolor and will have important implications for future molecular and genetic analyses in this organism.

  9. High-Throughput Tag-Sequencing Analysis of Early Events Induced by Ochratoxin A in HepG-2 Cells.

    PubMed

    Zhang, Yu; Qi, Xiaozhe; Zheng, Juanjuan; Luo, YunBo; Huang, Kunlun; Xu, Wentao

    2016-01-01

    Ochratoxin A (OTA) is produced by fungi of the species Aspergillus and Penicillium. OTA has displayed hepatotoxicity in mammals. Although recent studies have indicated that OTA influences liver function, little is known regarding its impact on differential early liver toxicity. In this study, we report high-throughput tag-sequencing (Tag-seq) analysis of the transcriptome using the Solexa Analyzer platform after 4 h of OTA treatment on HepG-2 cells. The analyses of differentially expressed genes revealed substantial changes. A total of 21,449 genes were identified and quantified, with 2726 displaying significantly altered expression levels. Expression level data were then integrated with a network of gene-gene interactions and biological pathways to obtain a systems-level view of changes in the transcriptome that occur with OTA resistance. Our data suggest that OTA exposure leads to an imbalance in zinc finger expression and shed light on splicing factor and mitochondrial-based mechanisms. PMID:26377828

  11. Analysis of expressed sequence tag loci on wheat chromosome group 4.

    PubMed

    Miftahudin; Ross, K; Ma, X-F; Mahmoud, A A; Layton, J; Milla, M A Rodriguez; Chikmawati, T; Ramalingam, J; Feril, O; Pathan, M S; Momirovic, G Surlan; Kim, S; Chema, K; Fang, P; Haule, L; Struxness, H; Birkes, J; Yaghoubian, C; Skinner, R; McAllister, J; Nguyen, V; Qi, L L; Echalier, B; Gill, B S; Linkiewicz, A M; Dubcovsky, J; Akhunov, E D; Dvorák, J; Dilbirligi, M; Gill, K S; Peng, J H; Lapitan, N L V; Bermudez-Kandianis, C E; Sorrells, M E; Hossain, K G; Kalavacharla, V; Kianian, S F; Lazo, G R; Chao, S; Anderson, O D; Gonzalez-Hernandez, J; Conley, E J; Anderson, J A; Choi, D-W; Fenton, R D; Close, T J; McGuire, P E; Qualset, C O; Nguyen, H T; Gustafson, J P

    2004-10-01

    A total of 1918 loci, detected by the hybridization of 938 expressed sequence tag unigenes (ESTs) from 26 Triticeae cDNA libraries, were mapped to wheat (Triticum aestivum L.) homoeologous group 4 chromosomes using a set of deletion, ditelosomic, and nulli-tetrasomic lines. The 1918 EST loci were not distributed uniformly among the three group 4 chromosomes; 41, 28, and 31% mapped to chromosomes 4A, 4B, and 4D, respectively. This pattern is in contrast to the cumulative results of EST mapping in all homoeologous groups, as reported elsewhere, that found the highest proportion of loci mapped to the B genome. Sixty-five percent of these 1918 loci mapped to the long arms of homoeologous group 4 chromosomes, while 35% mapped to the short arms. The distal regions of chromosome arms showed higher numbers of loci than the proximal regions, with the exception of 4DL. This study confirmed the complex structure of chromosome 4A that contains two reciprocal translocations and two inversions, previously identified. An additional inversion in the centromeric region of 4A was revealed. A consensus map for homoeologous group 4 was developed from 119 ESTs unique to group 4. Forty-nine percent of these ESTs were found to be homoeologous to sequences on rice chromosome 3, 12% had matches with sequences on other rice chromosomes, and 39% had no matches with rice sequences at all. Limited homology (only 26 of the 119 consensus ESTs) was found between wheat ESTs on homoeologous group 4 and the Arabidopsis genome. Forty-two percent of the homoeologous group 4 ESTs could be classified into functional categories on the basis of blastX searches against all protein databases. PMID:15514042

  12. Analysis of Expressed Sequence Tag Loci on Wheat Chromosome Group 4

    PubMed Central

    Miftahudin; Ross, K.; Ma, X.-F.; Mahmoud, A. A.; Layton, J.; Milla, M. A. Rodriguez; Chikmawati, T.; Ramalingam, J.; Feril, O.; Pathan, M. S.; Momirovic, G. Surlan; Kim, S.; Chema, K.; Fang, P.; Haule, L.; Struxness, H.; Birkes, J.; Yaghoubian, C.; Skinner, R.; McAllister, J.; Nguyen, V.; Qi, L. L.; Echalier, B.; Gill, B. S.; Linkiewicz, A. M.; Dubcovsky, J.; Akhunov, E. D.; Dvořák, J.; Dilbirligi, M.; Gill, K. S.; Peng, J. H.; Lapitan, N. L. V.; Bermudez-Kandianis, C. E.; Sorrells, M. E.; Hossain, K. G.; Kalavacharla, V.; Kianian, S. F.; Lazo, G. R.; Chao, S.; Anderson, O. D.; Gonzalez-Hernandez, J.; Conley, E. J.; Anderson, J. A.; Choi, D.-W.; Fenton, R. D.; Close, T. J.; McGuire, P. E.; Qualset, C. O.; Nguyen, H. T.; Gustafson, J. P.

    2004-01-01

    A total of 1918 loci, detected by the hybridization of 938 expressed sequence tag unigenes (ESTs) from 26 Triticeae cDNA libraries, were mapped to wheat (Triticum aestivum L.) homoeologous group 4 chromosomes using a set of deletion, ditelosomic, and nulli-tetrasomic lines. The 1918 EST loci were not distributed uniformly among the three group 4 chromosomes; 41, 28, and 31% mapped to chromosomes 4A, 4B, and 4D, respectively. This pattern is in contrast to the cumulative results of EST mapping in all homoeologous groups, as reported elsewhere, that found the highest proportion of loci mapped to the B genome. Sixty-five percent of these 1918 loci mapped to the long arms of homoeologous group 4 chromosomes, while 35% mapped to the short arms. The distal regions of chromosome arms showed higher numbers of loci than the proximal regions, with the exception of 4DL. This study confirmed the complex structure of chromosome 4A that contains two reciprocal translocations and two inversions, previously identified. An additional inversion in the centromeric region of 4A was revealed. A consensus map for homoeologous group 4 was developed from 119 ESTs unique to group 4. Forty-nine percent of these ESTs were found to be homoeologous to sequences on rice chromosome 3, 12% had matches with sequences on other rice chromosomes, and 39% had no matches with rice sequences at all. Limited homology (only 26 of the 119 consensus ESTs) was found between wheat ESTs on homoeologous group 4 and the Arabidopsis genome. Forty-two percent of the homoeologous group 4 ESTs could be classified into functional categories on the basis of blastX searches against all protein databases. PMID:15514042

  13. Identification, analysis, and linkage mapping of expressed sequence tags from the Australian sheep blowfly

    PubMed Central

    2011-01-01

    Background The Australian sheep blowfly Lucilia cuprina (Wiedemann) (Diptera: Calliphoridae) is a destructive pest of the sheep, a model organism for insecticide resistance research, and a valuable tool for medical and forensic professionals. However, genomic information on L. cuprina is still sparse. Results We report here the construction of an embryonic and 2 larval cDNA libraries for L. cuprina. A total of 29,816 expressed sequence tags (ESTs) were obtained and assembled into 7,464 unique clusters. The sequence collection captures a great diversity of genes, including those related to insecticide resistance (e.g., 12 cytochrome P450s, 2 glutathione S transferases, and 6 esterases). Compared to Drosophila melanogaster, codon preference is different in 13 of the 18 amino acids encoded by redundant codons, reflecting the lower overall GC content in L. cuprina. In addition, we demonstrated that the ESTs could be converted into informative gene markers by capitalizing on the known gene structures in the model organism D. melanogaster. We successfully assigned 41 genes to their respective chromosomes in L. cuprina. The relative locations of these loci revealed high but incomplete chromosomal synteny between L. cuprina and D. melanogaster. Conclusions Our results represent the first major transcriptomic undertaking in L. cuprina. These new genetic resources could be useful for the blowfly and insect research community. PMID:21827708

  14. Assembly of a gene sequence tag microarray by reversible biotin-streptavidin capture for transcript analysis of Arabidopsis thaliana

    PubMed Central

    Wirta, Valtteri; Holmberg, Anders; Lukacs, Morten; Nilsson, Peter; Hilson, Pierre; Uhlén, Mathias; Bhalerao, Rishikesh P; Lundeberg, Joakim

    2005-01-01

    Background Transcriptional profiling using microarrays has developed into a key molecular tool for the elucidation of gene function and gene regulation. Microarray platforms based on either oligonucleotides or purified amplification products have been utilised in parallel to produce large amounts of data. Irrespective of platform examined, the availability of genome sequence or a large number of representative expressed sequence tags (ESTs) is, however, a pre-requisite for the design and selection of specific and high-quality microarray probes. This is of great importance for organisms, such as Arabidopsis thaliana, with a high number of duplicated genes, as cross-hybridisation signals between evolutionary related genes cannot be distinguished from true signals unless the probes are carefully designed to be specific. Results We present an alternative solid-phase purification strategy suitable for efficient preparation of short, biotinylated and highly specific probes suitable for large-scale expression profiling. Twenty-one thousand Arabidopsis thaliana gene sequence tags were amplified and subsequently purified using the described technology. The use of the arrays is exemplified by analysis of gene expression changes caused by a four-hour indole-3-acetic (auxin) treatment. A total of 270 genes were identified as differentially expressed (120 up-regulated and 150 down-regulated), including several previously known auxin-affected genes, but also several previously uncharacterised genes. Conclusions The described solid-phase procedure can be used to prepare gene sequence tag microarrays based on short and specific amplified probes, facilitating the analysis of more than 21 000 Arabidopsis transcripts. PMID:15689241

  15. Analysis of expressed sequence tags from cDNA library of Fusarium culmorum infected barley (Hordeum vulgare L.) roots

    PubMed Central

    Tufan, Feyza; Uçarlı, Cüneyt; Gürel, Filiz

    2015-01-01

    Fusarium culmorum is one of the most common and globally important causal agents of root and crown rot diseases of cereals. These diseases cause grain yield loss and reduced grain quality in barley. In this study, we have analyzed an expressed sequence tag (EST) database derived from F. culmorum infected barley root tissues available at the National Center for Biotechnology Information (NCBI). The 2294 sequences were assembled into 1619 non-redundant sequences consisting of 359 contigs and 1260 singletons using the program CAP3. BLASTX analysis for these sequences was conducted in order to find similar sequences in all databases. Gene Ontology search, enzyme search, KEGG mapping and InterProScan search were done using the Blast2GO 3.0.7 tool. By BLASTX analysis, 41.7%, 7.7%, 3.2% and 47.4% of ESTs were categorized as annotated, unannotated, not mapping and without blast hits, respectively. BLASTX analysis revealed that the majority of top hits were barley proteins (43.5%). Based on Gene Ontology classification, 38.3%, 31.3%, and 16% of ESTs were assigned to molecular function, biological process, and cellular component GO terms, respectively. The most abundant GO terms were as follows: 157 sequences were related to response to stress (biological process), 207 sequences were related to ion binding (molecular function), and 160 sequences were related to plastid (cellular component). Furthermore, based on KEGG mapping, 369 sequences could be assigned to 264 enzymes and 83 different KEGG pathways. According to the Enzyme Commission (EC) distribution, 94 sequences were transferases (EC2) while 70 sequences were hydrolases (EC3). PMID:25780278

  16. Immune gene discovery by expressed sequence tag (EST) analysis of hemocytes in the ridgetail white prawn Exopalaemon carinicauda

    PubMed Central

    Duan, Yafei; Liu, Ping; Li, Jitao; Li, Jian; Chen, Ping

    2013-01-01

    The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. The cluster and assembly analysis yielded 1053 unique sequences including 329 contigs and 724 singletons. Blast analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). Based on the COG and Gene Ontology (GO), 593 unique sequences were classified. Through comparison with previous studies, 153 genes assembled from 367 ESTs have been identified as possibly involved in defense or immune functions. These genes are grouped into seven categories according to their putative functions in the shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information on the immune system and lay the groundwork for development of molecular markers related to disease resistance in prawn species. PMID:23092732
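
    The counts reported in this record can be cross-checked with simple arithmetic: unique sequences are the sum of contigs and singletons, and each percentage is a count divided by the relevant total. The short Python sketch below reproduces the figures quoted in the abstract; the helper names `unigene_count` and `percent` are illustrative and do not come from the cited study.

```python
def unigene_count(contigs: int, singletons: int) -> int:
    """Unique sequences (unigenes) = assembled contigs + unassembled singletons."""
    return contigs + singletons


def percent(part: int, total: int, digits: int = 2) -> float:
    """Share of `part` in `total`, as a percentage rounded to `digits` places."""
    return round(100 * part / total, digits)


# Figures quoted in the E. carinicauda abstract above.
unique_sequences = unigene_count(contigs=329, singletons=724)
print(unique_sequences)                   # 1053 unique sequences
print(percent(593, unique_sequences, 1))  # 56.3 (% with BLAST matches)
print(percent(141, 2853))                 # 4.94 (% of ESTs that are thioredoxin)
print(percent(14, 2853))                  # 0.49 (% of ESTs that are calmodulin)
```

    The same contigs-plus-singletons check applies to other records in this list that report both numbers.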

  17. ESTPiper – a web-based analysis pipeline for expressed sequence tags

    PubMed Central

    Tang, Zuojian; Choi, Jeong-Hyeon; Hemmerich, Chris; Sarangi, Ankita; Colbourne, John K; Dong, Qunfeng

    2009-01-01

    Background EST sequencing projects are increasing in scale and scope as the genome sequencing technologies migrate from core sequencing centers to individual research laboratories. Effectively, generating EST data is no longer a bottleneck for investigators. However, processing large amounts of EST data remains a non-trivial challenge for many. Web-based EST analysis tools are proving to be the most convenient option for biologists when performing their analysis, so these tools must continuously improve on their utility to keep in step with the growing needs of research communities. We have developed a web-based EST analysis pipeline called ESTPiper, which streamlines typical large-scale EST analysis components. Results The intuitive web interface guides users through each step of base calling, data cleaning, assembly, genome alignment, annotation, analysis of gene ontology (GO), and microarray oligonucleotide probe design. Each step is modularized. Therefore, a user can execute them separately or together in batch mode. In addition, the user has control over the parameters used by the underlying programs. Extensive documentation of ESTPiper's functionality is embedded throughout the web site to facilitate understanding of the required input and interpretation of the computational results. The user can also download intermediate results and port files to separate programs for further analysis. In addition, our server provides a time-stamped description of the run history for reproducibility. The pipeline can also be installed locally, allowing researchers to modify ESTPiper to suit their own needs. Conclusion ESTPiper streamlines the typical process of EST analysis. The pipeline was initially designed in part to support the Daphnia pulex cDNA sequencing project. A web server hosting ESTPiper is now provided to support projects of all sizes. The software is also freely available from the authors for local installations. PMID:19383159

  18. Gene expression analysis of volatile-rich male flowers of dioecious Pandanus fascicularis using expressed sequence tags.

    PubMed

    Vinod, M S; Sankararamasubramanian, H M; Priyanka, R; Ganesan, G; Parida, Ajay

    2010-07-15

    Pandanus fascicularis is dioecious with the female plant producing a non-scented fruit while the male produces a flower rich in volatiles. The essential oil extracted from the flowers is economically exploited as a natural flavouring agent as well as for its therapeutic properties. Molecular dissection of this distinct flower for identifying the genes responsible for its aroma by way of expressed sequence tags (ESTs) has not been initiated in spite of its economic viability. A male flower-specific cDNA library was constructed and 977 ESTs were generated. CAP3 analysis performed on the dataset revealed 83 contigs (549 ESTs) and 428 singlets, thereby yielding a total of 511 unigenes. Functional annotation using the BLAST2GO software resulted in 1952 Gene ontology (GO) functional classification terms for 621 sequences. Unknown proteins were further analysed with InterProScan to determine their functional motifs. RNA gel blot analysis of 26 functionally distinct transcripts potentially involved in flowering and volatile generation, using vegetative and reproductive tissues of both the sexes, revealed differential expression profiles. In addition to an overview of genes expressed, candidate genes with expression that are modulated predominantly in the male inflorescence were also identified. This is the first report on generation of ESTs to determine the subset of genes that can be used as potential candidates for future attempts aimed towards its genetic and genome analysis including metabolic engineering of floral volatiles in this economically important plant.

  19. Analysis of expressed sequence tags (ESTs) from a normalized cDNA library and isolation of EST simple sequence repeats from the invasive cotton mealybug Phenacoccus solenopsis.

    PubMed

    Li, Hui; Lang, Kun-Ling; Fu, Hai-Bin; Shen, Chang-Peng; Wan, Fang-Hao; Chu, Dong

    2015-12-01

    The cotton mealybug, Phenacoccus solenopsis Tinsley, is a serious and invasive pest. At present, genetic resources for studying P. solenopsis are limited, and this negatively affects genetic research on the organism and, consequently, translational work to improve management of this pest. In the present study, expressed sequence tags (ESTs) were analyzed from a normalized complementary DNA library of P. solenopsis. In addition, EST-derived microsatellite loci (also known as simple sequence repeats or SSRs) were isolated and characterized. A total of 1107 high-quality ESTs were acquired from the library. Clustering and assembly analysis resulted in 785 unigenes, which were classified functionally into 23 categories according to the Gene Ontology database. Seven EST-based SSR markers were developed in this study and are expected to be useful in characterizing how this invasive species was introduced, as well as providing insights into its genetic microevolution.
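
    Isolating EST-derived SSRs amounts to scanning each sequence for short motifs repeated in tandem. The following is a minimal sketch of that idea using a regular expression; the function name `find_ssrs` and the minimum repeat thresholds are hypothetical choices for illustration, and the abstract does not say which tool or thresholds the authors used.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    if min_repeats is None:
        # Hypothetical thresholds: motif length -> minimum number of repeats.
        min_repeats = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
    seq = seq.upper()
    hits = []
    for unit, min_n in sorted(min_repeats.items()):
        # One motif of length `unit`, followed by at least min_n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return hits

# Hypothetical sequence: seven AG repeats flanked by unrelated bases.
print(find_ssrs("GGC" + "AG" * 7 + "TTC"))
```

    Primer design on the flanking sequence would follow as a separate step and is not covered by this sketch.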

  20. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.).

    PubMed

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches, and the unigenes were annotated using Gene Ontology (GO). A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens, which will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant.

  1. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.)

    PubMed Central

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches, and the unigenes were annotated using Gene Ontology (GO). A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens, which will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant. PMID:26664999

  2. Gene Discovery and Expression Profile Analysis through Sequencing of Expressed Sequence Tags from Different Developmental Stages of the Chytridiomycete Blastocladiella emersonii

    PubMed Central

    Ribichich, Karina F.; Salem-Izacc, Silvia M.; Georg, Raphaela C.; Vêncio, Ricardo Z. N.; Navarro, Luci D.; Gomes, Suely L.

    2005-01-01

    Blastocladiella emersonii is an aquatic fungus of the chytridiomycete class which diverged early from the fungal lineage and is notable for the morphogenetic processes which occur during its life cycle. Its particular taxonomic position makes this fungus an interesting system to be considered when investigating phylogenetic relationships and studying the biology of lower fungi. To contribute to the understanding of the complexity of the B. emersonii genome, we present here a survey of expressed sequence tags (ESTs) from various stages of the fungal development. Nearly 20,000 cDNA clones from 10 different libraries were partially sequenced from their 5′ end, yielding 16,984 high-quality ESTs. These ESTs were assembled into 4,873 putative transcripts, of which 48% presented no matches with existing sequences in public databases. As a result of Gene Ontology (GO) project annotation, 1,680 ESTs (35%) were classified into biological processes of the GO structure, with transcription and RNA processing, protein biosynthesis, and transport as prevalent processes. We also report full-length sequences, useful for construction of molecular phylogenies, and several ESTs that showed high similarity with known proteins, some of which were not previously described in fungi. Furthermore, we analyzed the expression profile (digital Northern analysis) of each transcript throughout the life cycle of the fungus using Bayesian statistics. The in silico approach was validated by Northern blot analysis with good agreement between the two methodologies. PMID:15701807

  3. Generation and Analysis of Expressed Sequence Tags from Olea europaea L.

    PubMed Central

    Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal

    2010-01-01

    Olive (Olea europaea L.) is an important source of edible oil that originated in the Near East. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, discover novel genes, and investigate the function of unknown olive genes. A total of 3840 randomly selected colonies from both libraries were sequenced for EST collection. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were assigned by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs show no homology to the database, 2024 ESTs have homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI-GenBank. In addition, 635 unique EST sequences were identified by over 80% homology to genes of known function in other species that had not previously been described in the Olea family. Only 3.1% of the total ESTs showed similarity to the olive sequences already in NCBI. The EST data and consensus sequences generated here were submitted to NCBI as a valuable resource for functional genomic studies of olive. PMID:21197085

  4. Analysis and functional annotation of expressed sequence tags (ESTs) from multiple tissues of oil palm (Elaeis guineensis Jacq.)

    PubMed Central

    Ho, Chai-Ling; Kwan, Yen-Yen; Choi, Mei-Chooi; Tee, Sue-Sean; Ng, Wai-Har; Lim, Kok-Ang; Lee, Yang-Ping; Ooi, Siew-Eng; Lee, Weng-Wah; Tee, Jin-Ming; Tan, Siang-Hee; Kulaveerasingam, Harikrishna; Alwee, Sharifah Shahrul Rabiah Syed; Abdullah, Meilina Ong

    2007-01-01

    Background Oil palm is the second largest source of edible oil, contributing approximately 20% of the world's production of oils and fats. In order to understand the molecular biology involved in in vitro propagation, flowering, efficient utilization of nitrogen sources and root diseases, we have initiated an expressed sequence tag (EST) analysis on oil palm. Results In this study, six cDNA libraries from oil palm zygotic embryos, suspension cells, shoot apical meristems, young flowers, mature flowers and roots were constructed. We have generated a total of 14537 expressed sequence tags (ESTs) from these libraries, from which 6464 tentative unique contigs (TUCs) and 2129 singletons were obtained. Approximately 6008 of these tentative unique genes (TUGs) have significant matches to the non-redundant protein database, from which 2361 were assigned to one or more Gene Ontology categories. Predominant transcripts and differentially expressed genes were identified in multiple oil palm tissues. Homologues of genes involved in many aspects of flower development were also identified among the EST collection, such as CONSTANS-like, AGAMOUS-like (AGL)2, AGL20, LFY-like, SQUAMOSA, SQUAMOSA binding protein (SBP) etc. The majority of them are the first representatives in oil palm, providing opportunities to explore the cause of epigenetic homeotic flowering abnormality in oil palm, given the importance of flowering in fruit production. The transcript levels of two flowering-related genes, EgSBP and EgSEP, were analysed in the flower tissues of various developmental stages. Gene homologues for enzymes involved in oil biosynthesis, utilization of nitrogen sources, and scavenging of oxygen radicals were also uncovered among the oil palm ESTs. Conclusion The EST sequences generated will allow comparative genomic studies between oil palm and other monocotyledonous and dicotyledonous plants, development of gene-targeted markers for the reference genetic map, design and

  5. TagRecon: High-Throughput Mutation Identification through Sequence Tagging

    PubMed Central

    Dasari, Surendra; Chambers, Matthew C.; Slebos, Robbert J.; Zimmerman, Lisa J.; Ham, Amy-Joan L.; Tabb, David L.

    2010-01-01

    Shotgun proteomics produces collections of tandem mass spectra that contain all the data needed to identify mutated peptides from clinical samples. Identifying these sequence variations, however, has not been feasible with conventional database search strategies, which require exact matches between observed and expected sequences. Searching for mutations as mass shifts on specified residues through database search can incur significant performance penalties and generate substantial false positive rates. Here we describe TagRecon, an algorithm that leverages inferred sequence tags to identify unanticipated mutations in clinical proteomic data sets. TagRecon identifies unmodified peptides as sensitively as the related MyriMatch database search engine. In both LTQ and Orbitrap data sets, TagRecon outperformed state of the art software in recognizing sequence mismatches from data sets with known variants. We developed guidelines for filtering putative mutations from clinical samples, and we applied them in an analysis of cancer cell lines and an examination of colon tissue. Mutations were found in up to 6% of identified peptides, and only a small fraction corresponded to dbSNP entries. The RKO cell line, which is DNA mismatch repair deficient, yielded more mutant peptides than the mismatch repair proficient SW480 line. Analysis of colon cancer tumor and adjacent tissue revealed hydroxyproline modifications associated with extracellular matrix degradation. These results demonstrate the value of using sequence tagging algorithms to fully interrogate clinical proteomic data sets. PMID:20131910

  6. Comprehensive analysis of expressed sequence tags from the pulp of the red mutant 'Cara Cara' navel orange (Citrus sinensis Osbeck).

    PubMed

    Ye, Jun-Li; Zhu, An-Dan; Tao, Neng-Guo; Xu, Qiang; Xu, Juan; Deng, Xiu-Xin

    2010-10-01

    Expressed sequence tag (EST) analysis of the pulp of the red-fleshed mutant 'Cara Cara' navel orange provided a starting point for gene discovery and transcriptome survey during citrus fruit maturation. Interpretation of the EST datasets revealed that the mutant pulp transcriptome held a high proportion of stress-response-related genes, such as the type III metallothionein-like gene (6.0%), heat shock protein (2.8%), Cu/Zn superoxide dismutase (0.8%), late embryogenesis abundant protein 5 (0.8%), etc. Digital expression analysis detected 133 transcripts that were differentially expressed between the red mutant and its orange-colored wild genotype 'Washington'. Among them, genes involved in metabolism, defense/stress and signal transduction were statistically overrepresented. Fifteen transcription factors, comprising NAM, ATAF, and CUC transcription factor (NAC); myeloblastosis (MYB); myelocytomatosis (MYC); basic helix-loop-helix (bHLH); and basic leucine zipper (bZIP) domain members, were also included. The data reflected the distinct expression profile and the unique regulatory module associated with these two genotypes. Eight differentially expressed genes identified by digital expression analysis were validated by quantitative real-time polymerase chain reaction. For structural polymorphism, both simple sequence repeat and single nucleotide polymorphism (SNP) loci were surveyed; dinucleotide presentation revealed a bias toward AG/GA/TC/CT repeats (52.5%) and against GC/CG repeats (0%). SNP analysis found that transitions (73%) outnumbered transversions (27%). Seventeen potential cultivar-specific and 387 heterozygous SNP loci were detected from the 'Cara Cara' and 'Washington' EST pool.
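
    The transition/transversion split reported above rests on a simple rule: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine swap is a transversion. A minimal sketch of that classification follows; the function names and the example SNP list are hypothetical and are not data from the study.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")

def classify_snp(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    pair = {ref, alt}
    # Transition: both bases are purines, or both are pyrimidines.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"

def transition_fraction(snps):
    """Fraction of (ref, alt) pairs that are transitions."""
    calls = [classify_snp(ref, alt) for ref, alt in snps]
    return calls.count("transition") / len(calls)

# Hypothetical SNP calls for illustration only.
example = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
print(transition_fraction(example))
```

    With four possible transitions and eight possible transversions among the twelve single-base substitutions, an excess of transitions such as the one reported is a bias rather than what uniform substitution would produce.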

  7. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat.

    PubMed

    Goswami, Suneha; Kumar, Ranjeet R; Dubey, Kavita; Singh, Jyoti P; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C; Kala, Yugal K; Singh, Gyanendra P; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger sequencing was performed on 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. The identified ESTs were searched with BLAST against the Ensembl, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in the Pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs, including 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helices (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative real-time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was greater in wheat cv. HD2985 (thermotolerant) than in HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or in targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of "climate-smart" wheat.

  8. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat

    PubMed Central

    Goswami, Suneha; Kumar, Ranjeet R.; Dubey, Kavita; Singh, Jyoti P.; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C.; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C.; Kala, Yugal K.; Singh, Gyanendra P.; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D.

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger sequencing was performed on 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. The identified ESTs were searched with BLAST against the Ensembl, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in the Pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs, including 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helices (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative real-time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was greater in wheat cv. HD2985 (thermotolerant) than in HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or in targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of

  10. An expressed sequence tag database of T-cell-enriched activated chicken splenocytes: sequence analysis of 5251 clones.

    PubMed

    Tirunagaru, V G; Sofer, L; Cui, J; Burnside, J

    2000-06-01

    The cDNA and gene sequences of many mammalian cytokines and their receptors are known. However, corresponding information on avian cytokines is limited due to the lack of cross-species activity at the functional level or strong homology at the molecular level. To improve the efficiency of identifying cytokines and novel chicken genes, a directionally cloned cDNA library from T-cell-enriched activated chicken splenocytes was constructed, and the partial sequence of 5251 clones was obtained. Sequence clustering indicates that 2357 (42%) of the clones are present as a single copy, and 2961 are distinct clones, demonstrating the high level of complexity of this library. Comparisons of the sequence data with known DNA sequences in GenBank indicate that approximately 25% of the clones match known chicken genes, 39% have similarity to known genes in other species, and 11% had no match to any sequence in the database. Several previously uncharacterized chicken cytokines and their receptors were present in our library. This collection provides a useful database for cataloging genes expressed in T cells and a valuable resource for future investigations of gene expression in avian immunology. A chicken EST Web site (http://udgenome.ags.udel.edu/chickest/chick.htm) has been created to provide access to the data, and a set of unique sequences has been deposited with GenBank (Accession Nos. AI979741-AI982511). Our new Web site (http://www.chickest.udel.edu) will be active as of March 3, 2000, and will also provide keyword-searching capabilities for BLASTX and BLASTN hits of all our clones. PMID:10860659

  11. Deciphering Noncoding RNA and Chromatin Interactions: Multiplex Chromatin Interaction Analysis by Paired-End Tag Sequencing (mChIA-PET).

    PubMed

    Choy, Jocelyn; Fullwood, Melissa J

    2017-01-01

    Genomic DNA is dynamically associated with protein factors and folded to form chromatin fibers. The 3-dimensional (3D) configuration of the chromatin will enable the distal genetic elements to come into close proximity, allowing transcriptional regulation. Noncoding RNA can mediate the 3D structure of chromatin. Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) is a valuable and powerful technique in molecular biology which allows the study of unbiased, genome-wide de novo chromatin interactions with paired-end tags. Here, we describe the standard version of ChIA-PET and a Multiplex ChIA-PET version. PMID:27662871

  12. Comparative analysis of the Acyrthosiphon pisum genome and expressed sequence tag-based gene sets from other aphid species.

    PubMed

    Ollivier, M; Legeai, F; Rispe, C

    2010-03-01

    To study gene repertoires and their evolution within aphids, we compared the complete genome sequence of Acyrthosiphon pisum (reference gene set) and expressed sequence tag (EST) data from three other species: Myzus persicae, Aphis gossypii and Toxoptera citricida. We assembled ESTs, predicted coding sequences, and identified potential pairs of orthologues (reciprocal best hits) with A. pisum. Pairwise comparisons show that a fraction of the genes evolve fast (high ratio of non-synonymous to synonymous rates), including many genes shared by aphids but with no hit in UniProt. A detailed phylogenetic study for four fast-evolving genes (C002, JHAMT, Apo and GH) shows that rate accelerations are often associated with duplication events. We also compare compositional patterns between the two tribes of aphids, Aphidini and Macrosiphini.

  13. Molecular diversification based on analysis of expressed sequence tags from the venom glands of the Chinese bird spider Ornithoctonus huwena.

    PubMed

    Jiang, Liping; Peng, Li; Chen, Jinjun; Zhang, Yongqun; Xiong, Xia; Liang, Songping

    2008-06-15

    The bird spider Ornithoctonus huwena is one of the most venomous spiders in China. Its venom has been investigated, but usually only the most abundant components have been analyzed. To characterize the primary structure of O. huwena toxins, a list of transcripts within the venom gland was made using the expressed sequence tag (EST) strategy. We generated 468 ESTs from a directional cDNA library of O. huwena venom glands. All ESTs were grouped into 24 clusters and 65 singletons; 68.00% of the total ESTs belong to toxin-like sequences, 13.00% are similar to body peptide transcripts and 19.00% have no significant similarity to any known sequences. Precursors of all toxin-like sequences except HWTX-XI and HWTX-XIII can be classified into eight different superfamilies (HWTX-I superfamily, HWTX-II superfamily, HWTX-X superfamily, HWTX-XIV superfamily, HWTX-XV superfamily, HWTX-XVI superfamily, HWTX-XVII superfamily, HWTX-XVIII superfamily), according to the identity of their precursor sequences. The results have predictive value for the discovery of various groups of pharmacologically distinct toxins in complex venoms, and for understanding the relationship of spider toxin evolution based on the diversification of cDNA sequences, primary structure of precursor peptides, three-dimensional structure motifs and biological functions.

  14. Expressed sequence tag analysis of khat (Catha edulis) provides a putative molecular biochemical basis for the biosynthesis of phenylpropylamino alkaloids

    PubMed Central

    Hagel, Jillian M.; Krizevski, Raz; Kilpatrick, Korey; Sitrit, Yaron; Marsolais, Frédéric; Lewinsohn, Efraim; Facchini, Peter J.

    2011-01-01

    Khat (Catha edulis Forsk.) is a flowering perennial shrub cultivated for its neurostimulant properties resulting mainly from the occurrence of (S)-cathinone in young leaves. The biosynthesis of (S)-cathinone and the related phenylpropylamino alkaloids (1S,2S)-cathine and (1R,2S)-norephedrine is not well characterized in plants. We prepared a cDNA library from young khat leaves and sequenced 4,896 random clones, generating an expressed sequence tag (EST) library of 3,293 unigenes. Putative functions were assigned to > 98% of the ESTs, providing a key resource for gene discovery. Candidates potentially involved at various stages of phenylpropylamino alkaloid biosynthesis from L-phenylalanine to (1S,2S)-cathine were identified. PMID:22215969

  15. Analysis of expressed sequence tags from the anamorphic basidiomycetous yeast, Pseudozyma antarctica, which produces glycolipid biosurfactants, mannosylerythritol lipids.

    PubMed

    Morita, Tomotake; Konishi, Masaaki; Fukuoka, Tokuma; Imura, Tomohiro; Kitamoto, Dai

    2006-07-15

    Pseudozyma antarctica T-34 secretes a large amount of biosurfactants (BS), mannosylerythritol lipids (MEL), from different carbon sources such as hydrocarbons and vegetable oils. The detailed biosynthetic pathway of MEL remained unknown due to a lack of genetic information on the anamorphic basidiomycetous yeasts, including the genus Pseudozyma. Here, in order to obtain genetic information on P. antarctica T-34, we constructed a cDNA library from yeast cells producing MEL from soybean oil and identified the genes expressed through the creation of an expressed sequence tag (EST) library. We generated 398 ESTs, assembled into 146 contiguous sequences. Based upon a BLAST search similarity cut-off of E [...] sequences in the protein database; 60.3% of all contiguous sequences shared significant identities to hypothetical proteins of Ustilago maydis, which is a smut fungus and BS producer. Based on the gene expression study using real-time reverse transcriptase-PCR, the predicted genes, such as mannosyltransferase and acyltransferase, were demonstrated to be highly involved in MEL biosynthesis in soybean oil-grown cells. PMID:16845679

  16. Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

    PubMed

    Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

    2016-01-01

    Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family. PMID:27323082
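
    The reported GST and Nm values are consistent with one widely used estimator of gene flow, Nm = 0.5(1 - GST)/GST. The abstract does not state which formula the authors used, so applying it here is an assumption, and the function name below is hypothetical.

```python
def gene_flow_from_gst(gst):
    """Estimate Nm from GST using Nm = 0.5 * (1 - GST) / GST (assumed estimator)."""
    if not 0 < gst <= 1:
        raise ValueError("GST must be in the interval (0, 1]")
    return 0.5 * (1 - gst) / gst

# GST as reported in the abstract, rounded to four decimals.
print(gene_flow_from_gst(0.2500))
```

    The small difference from the reported Nm = 1.4996 would be expected if the authors computed Nm from an unrounded GST. Under this estimator, Nm above 1 is conventionally read as enough gene flow to counteract drift, which fits the abstract's conclusion that most variation lies within populations.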

  17. Expressed Sequence Tags from Developing Castor Seeds.

    PubMed Central

    Van De Loo, F. J.; Turner, S.; Somerville, C.

    1995-01-01

    To expand the availability of genes encoding enzymes and structural proteins associated with storage lipid synthesis and deposition, partial nucleotide sequences, or expressed sequence tags (ESTs), were obtained for 743 cDNA clones derived from developing seeds of castor (Ricinus communis L.). Enrichment for seed-specific cDNA clones was obtained by selecting clones that did not detectably hybridize to first-strand cDNA from leaf mRNA. Similarly, clones that hybridized to storage proteins or other highly abundant mRNA species from developing seeds were selected against. To enrich for endomembrane-associated proteins, some clones were selected for sequencing by immunological screening with antibodies prepared against partially purified endoplasmic reticulum membranes. Comparison of the deduced amino acid sequences of the ESTs with the public data bases resulted in the assignment of putative identities of 49% of the clones selected by differential hybridization and 71% of the clones selected by immunological screening. Open reading frames in 100 of the ESTs exhibited higher homology to 78 different nonplant gene products than to any previously known plant gene product. PMID:12228533

  18. Flavonoid biosynthesis genes putatively identified in the aromatic plant Polygonum minus via Expressed Sequences Tag (EST) analysis.

    PubMed

    Roslan, Nur Diyana; Yusop, Jastina Mat; Baharum, Syarul Nataqain; Othman, Roohaida; Mohamed-Hussein, Zeti-Azura; Ismail, Ismanizan; Noor, Normah Mohd; Zainal, Zamri

    2012-01-01

    P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites that act as active ingredients such as flavonoid. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We have constructed a standard cDNA library from P. minus leaves, and two normalized full-length enriched cDNA libraries were constructed from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoid biosynthesis. Thus, large-scale sequencing of P. minus cDNA libraries identified 4196 expressed sequences tags (ESTs) which were deposited in dbEST in the National Center of Biotechnology Information (NCBI). From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs chalcone synthase, CHS (JG745304), flavonol synthase, FLS (JG705819) and leucoanthocyanidin dioxygenase, LDOX (JG745247) were selected for further examination by quantitative RT-PCR (qRT-PCR) in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes.

  19. Flavonoid Biosynthesis Genes Putatively Identified in the Aromatic Plant Polygonum minus via Expressed Sequences Tag (EST) Analysis

    PubMed Central

    Roslan, Nur Diyana; Yusop, Jastina Mat; Baharum, Syarul Nataqain; Othman, Roohaida; Mohamed-Hussein, Zeti-Azura; Ismail, Ismanizan; Noor, Normah Mohd; Zainal, Zamri

    2012-01-01

    P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites that act as active ingredients, such as flavonoids. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We have constructed a standard cDNA library from P. minus leaves, and two normalized full-length enriched cDNA libraries were constructed from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoid biosynthesis. Thus, large-scale sequencing of P. minus cDNA libraries identified 4196 expressed sequence tags (ESTs), which were deposited in dbEST at the National Center for Biotechnology Information (NCBI). From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs, chalcone synthase, CHS (JG745304), flavonol synthase, FLS (JG705819) and leucoanthocyanidin dioxygenase, LDOX (JG745247), were selected for further examination by quantitative RT-PCR (qRT-PCR) in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes. PMID:22489118

  20. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-01-01

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility. PMID:21732283

  1. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-06-21

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility.

  2. Exploring the Host Parasitism of the Migratory Plant-Parasitic Nematode Ditylenchus destuctor by Expressed Sequence Tags Analysis

    PubMed Central

    Peng, Huan; Gao, Bing-li; Kong, Ling-an; Yu, Qing; Huang, Wen-kun; He, Xu-feng; Long, Hai-bo; Peng, De-liang

    2013-01-01

    The potato rot nematode, Ditylenchus destructor, is a very destructive nematode pest on many agriculturally important crops worldwide, but the molecular characterization of its parasitism of plants has been limited. The effectors involved in nematode parasitism of plants for several sedentary endo-parasitic nematodes such as Heterodera glycines, Globodera rostochiensis and Meloidogyne incognita have been identified and extensively studied over the past two decades. Ditylenchus destructor, as a migratory plant-parasitic nematode, has different feeding behavior, life cycle and host response. Comparing the transcriptome and parasitome among different types of plant-parasitic nematodes is the way to understand more fully the parasitic mechanism of plant nematodes. We undertook the approach of sequencing expressed sequence tags (ESTs) derived from a mixed stage cDNA library of D. destructor. This is the first study of D. destructor ESTs. A total of 9800 ESTs were grouped into 5008 clusters including 3606 singletons and 1402 multi-member contigs, representing a catalog of D. destructor genes. Implementing a bioinformatics workflow, we found 1391 clusters have no match in the available gene database; 31 clusters only have similarities to genes identified from D. africanus, the most closely related species to D. destructor; 1991 clusters were annotated using Gene Ontology (GO); 1550 clusters were assigned enzyme commission (EC) numbers; and 1211 clusters were mapped to 181 KEGG biochemical pathways. 22 ESTs had similarities to reported nematode effectors. Interestingly, most of the effectors identified in this study are involved in host cell wall degradation or modification, such as 1,4-beta-glucanase, 1,3-beta-glucanase, pectate lyase, chitinases and expansin, or host defense suppression such as calreticulin, annexin and venom allergen-like protein. This result implies that the migratory plant-parasitic nematode D. destructor secretes similar effectors to those of sedentary

  3. Comparative analysis of expressed sequence tag (EST) libraries in the seagrass Zostera marina subjected to temperature stress.

    PubMed

    Reusch, Thorsten B H; Veron, Amelie S; Preuss, Christoph; Weiner, January; Wissler, Lothar; Beck, Alfred; Klages, Sven; Kube, Michael; Reinhardt, Richard; Bornberg-Bauer, Erich

    2008-01-01

    Global warming is associated with increasing stress and mortality on temperate seagrass beds, in particular during periods of high sea surface temperatures during summer months, adding to existing anthropogenic impacts, such as eutrophication and habitat destruction. We compare several expressed sequence tag (EST) libraries in the ecologically important seagrass Zostera marina (eelgrass) to elucidate the molecular genetic basis of adaptation to environmental extremes. We compared the tentative unigene (TUG) frequencies of libraries derived from leaf and meristematic tissue from a control situation with two experimentally imposed temperature stress conditions and found that TUG composition is markedly different among these conditions (all P < 0.0001). Under heat stress, we find that 63 TUGs are differentially expressed (d.e.) at 25 degrees C compared with lower, no-stress condition temperatures (4 degrees C and 17 degrees C). Approximately one-third of d.e. eelgrass genes were characteristic for the stress response of the terrestrial plant model Arabidopsis thaliana. The changes in gene expression suggest complex photosynthetic adjustments among light-harvesting complexes, reaction center subunits of photosystem I and II, and components of the dark reaction. Heat shock encoding proteins and reactive oxygen scavengers also were identified, but their overall frequency was too low to perform statistical tests. In all conditions, the most abundant transcript (3-15%) was a putative metallothionein gene with unknown function. We also find evidence that heat stress may translate to enhanced infection by protists. A total of 210 TUGs contain one or more microsatellites as potential candidates for gene-linked genetic markers. Data are publicly available in a user-friendly database at http://www.uni-muenster.de/Evolution/ebb/Services/zostera .

  4. Adult midgut expressed sequence tags from the tsetse fly Glossina morsitans morsitans and expression analysis of putative immune response genes

    PubMed Central

    Lehane, M J; Aksoy, S; Gibson, W; Kerhornou, A; Berriman, M; Hamilton, J; Soares, M B; Bonaldo, M F; Lehane, S; Hall, N

    2003-01-01

    Background Tsetse flies transmit African trypanosomiasis leading to half a million cases annually. Trypanosomiasis in animals (nagana) remains a massive brake on African agricultural development. While trypanosome biology is widely studied, knowledge of tsetse flies is very limited, particularly at the molecular level. This is a serious impediment to investigations of tsetse-trypanosome interactions. We have undertaken an expressed sequence tag (EST) project on the adult tsetse midgut, the major organ system for establishment and early development of trypanosomes. Results A total of 21,427 ESTs were produced from the midgut of adult Glossina morsitans morsitans and grouped into 8,876 clusters or singletons potentially representing unique genes. Putative functions were ascribed to 4,035 of these by homology. Of these, a remarkable 3,884 had their most significant matches in the Drosophila protein database. We selected 68 genes with putative immune-related functions, macroarrayed them and determined their expression profiles following bacterial or trypanosome challenge. In both infections many genes are downregulated, suggesting a malaise response in the midgut. Trypanosome and bacterial challenge result in upregulation of different genes, suggesting that different recognition pathways are involved in the two responses. The most notable block of genes upregulated in response to trypanosome challenge are a series of Toll and Imd genes and a series of genes involved in oxidative stress responses. Conclusions The project increases the number of known Glossina genes by two orders of magnitude. Identification of putative immunity genes and their preliminary characterization provides a resource for the experimental dissection of tsetse-trypanosome interactions. PMID:14519198

  5. Expressed sequence-tag analysis of ovaries of Brachiaria brizantha reveals genes associated with the early steps of embryo sac differentiation of apomictic plants.

    PubMed

    Silveira, Erica Duarte; Guimarães, Larissa Arrais; de Alencar Dusi, Diva Maria; da Silva, Felipe Rodrigues; Martins, Natália Florencio; do Carmo Costa, Marcos Mota; Alves-Ferreira, Márcio; de Campos Carneiro, Vera Tavares

    2012-02-01

    In apomixis, an asexual mode of plant reproduction through seeds, an unreduced megagametophyte is formed due to circumvented or altered meiosis. The embryo develops autonomously from the unreduced egg cell, independently of fertilization. Brachiaria is a genus of tropical forage grasses that reproduces sexually or by apomixis. A limited number of studies have reported the sequencing of apomixis-related genes and a few Brachiaria sequences have been deposited at genebank databases. This work shows sequencing and expression analyses of expressed sequence-tags (ESTs) of Brachiaria genus and points to transcripts from ovaries with preferential expression at megasporogenesis in apomictic plants. From the 11 differentially expressed sequences from immature ovaries of sexual and apomictic Brachiaria brizantha obtained from macroarray analysis, 9 were preferentially detected in ovaries of apomicts, as confirmed by RT-qPCR. A putative involvement in early steps of Panicum-type embryo sac differentiation of four sequences from B. brizantha ovaries, BbrizHelic, BbrizRan, BbrizSec13 and BbrizSti1, is suggested. Two of these, BbrizSti1 and BbrizHelic, with similarity to genes coding for a stress-induced protein and a helicase, respectively, are preferentially expressed in the early stages of apomictic ovary development, especially in the nucellus, in a stage previous to the differentiation of aposporous initials, as verified by in situ hybridization.

  6. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    PubMed Central

    Lu, Chaofu; Wallis, James G; Browse, John

    2007-01-01

    Background Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12) gene that is responsible for ricinoleate biosynthesis. The role(s) of the lipases in developing castor seeds are not clear, and co-expression of a lipase and the FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2) gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion Our results suggest that transcriptional regulation of FAD2 and FAH12 genes may be one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, whose whole sequence is being generated by shotgun sequencing at the Institute for Genome

  7. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    PubMed Central

    2011-01-01

    Background Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Result We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot

  8. Identification of stress-induced genes from the drought-tolerant plant Prosopis juliflora (Swartz) DC. through analysis of expressed sequence tags.

    PubMed

    George, Suja; Venkataraman, Gayatri; Parida, Ajay

    2007-05-01

    Abiotic stresses such as cold, salinity, drought, wounding, and heavy metal contamination adversely affect crop productivity throughout the world. Prosopis juliflora is a phreatophyte that can tolerate severe adverse environmental conditions such as drought, salinity, and heavy metal contamination. As a first step towards the characterization of genes that contribute to combating abiotic stress, construction and analysis of a cDNA library of P. juliflora genes is reported here. Random expressed sequence tag (EST) sequencing of 1750 clones produced 1467 high-quality reads. These clones were classified into functional categories, and BLAST comparisons revealed that 114 clones were homologous to genes implicated in stress response(s) and included heat shock proteins, metallothioneins, lipid transfer proteins, and late embryogenesis abundant proteins. Of the ESTs analyzed, 26% showed homology to previously uncharacterized genes in the databases. Fifty-two clones from this category were selected for reverse Northern analysis: 21 were shown to be upregulated and 16 downregulated. The results obtained by reverse Northern analysis were confirmed by Northern analysis. Clustering of the 1467 ESTs produced a total of 295 contigs encompassing 790 ESTs, resulting in a 54.2% redundancy. Two of the abundant genes coding for a nonspecific lipid transfer protein and late embryogenesis abundant protein were sequenced completely. Northern analysis (after polyethylene glycol stress) of the 2 genes was carried out. The implications of the analyzed genes in abiotic stress tolerance are also discussed.

  9. Application of the High Resolution Melting analysis for genetic mapping of Sequence Tagged Site markers in narrow-leafed lupin (Lupinus angustifolius L.).

    PubMed

    Kamel, Katarzyna A; Kroc, Magdalena; Święcicki, Wojciech

    2015-01-01

    Sequence tagged site (STS) markers are valuable tools for genetic and physical mapping that can be successfully used in comparative analyses among related species. Current challenges for molecular marker genotyping in plants include the lack of fast, sensitive and inexpensive methods suitable for sequence variant detection. In contrast, high resolution melting (HRM) is a simple and high-throughput assay, which has been widely applied in sequence polymorphism identification as well as in studies of genetic variability and genotyping. The present study is the first attempt to use HRM analysis to genotype STS markers in narrow-leafed lupin (Lupinus angustifolius L.). The sensitivity and utility of this method were confirmed by sequence polymorphism detection based on melting curve profiles in the parental genotypes and progeny of the narrow-leafed lupin mapping population. Application of different approaches, including amplicon size and a simulated heterozygote analysis, has allowed for successful genetic mapping of 16 new STS markers in the narrow-leafed lupin genome.

  11. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
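
    The record above reports SSR counts broken down by motif length (di- to hexanucleotide). As a rough illustration of how such a tally can be produced from EST sequences, the following minimal Python sketch scans a sequence for perfect tandem repeats and summarizes the share of each motif class. It is not the method used in the study: the minimum repeat thresholds in MIN_REPEATS and the names find_ssrs and motif_class_shares are assumptions made for this example, and the abstract does not state the search criteria that were actually applied.

```python
import re
from collections import Counter

# Assumed minimum repeat counts per motif length (hypothetical thresholds;
# the abstract does not say which criteria the authors used).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

CLASS_NAMES = {
    2: "dinucleotide",
    3: "trinucleotide",
    4: "tetranucleotide",
    5: "pentanucleotide",
    6: "hexanucleotide",
}


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for perfect SSRs, sorted by start."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "ATAT" is really the dinucleotide "AT").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def motif_class_shares(hits):
    """Percentage of SSRs in each motif-length class, rounded to 2 decimals."""
    counts = Counter(CLASS_NAMES[len(motif)] for _, motif, _ in hits)
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


if __name__ == "__main__":
    # Toy sequence: one (AAG)6 repeat and one (AT)7 repeat.
    example = "CCGT" + "AAG" * 6 + "CTGA" + "AT" * 7 + "GC"
    ssrs = find_ssrs(example)
    print(ssrs)
    print(motif_class_shares(ssrs))
```

    Applied to a whole EST collection, the per-sequence hits would be pooled before computing the class shares. Dedicated SSR search tools also handle imperfect and compound repeats, which this sketch ignores.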

  12. Analysis of expressed sequence tags from Maize mosaic rhabdovirus-infected gut tissues of Peregrinus maidis reveals the presence of key components of insect innate immunity.

    PubMed

    Whitfield, A E; Rotenberg, D; Aritua, V; Hogenhout, S A

    2011-04-01

    The corn planthopper, Peregrinus maidis, causes direct feeding damage to plants and transmits Maize mosaic rhabdovirus (MMV) in a persistent-propagative manner. MMV must cross several insect tissue layers for successful transmission to occur, and the gut serves as an important barrier for rhabdovirus transmission. In order to facilitate the identification of proteins that may interact with MMV either by facilitating acquisition or responding to virus infection, we generated and analysed the gut transcriptome of P. maidis. From two normalized cDNA libraries, we generated a P. maidis gut transcriptome composed of 20,771 expressed sequence tags (ESTs). Assembly of the sequences yielded 1860 contigs and 14,032 singletons, and biological roles were assigned to 5793 (36%). Comparison of P. maidis ESTs with other insect amino acid sequences revealed that P. maidis shares greatest sequence similarity with another hemipteran, the brown planthopper Nilaparvata lugens. We identified 202 P. maidis transcripts with putative homology to proteins associated with insect innate immunity, including those implicated in the Toll, Imd, JAK/STAT, Jnk and the small-interfering RNA-mediated pathways. Sequence comparisons between our P. maidis gut EST collection and the currently available National Center for Biotechnology Information EST database collection for Ni. lugens revealed that a pathogen recognition receptor in the Imd pathway, peptidoglycan recognition protein-long class (PGRP-LC), is present in these two members of the family Delphacidae; however, these recognition receptors are lacking in the model hemipteran Acyrthosiphon pisum. In addition, we identified sequences in the P. maidis gut transcriptome that share significant amino acid sequence similarities with the rhabdovirus receptor molecule, acetylcholine receptor (AChR), found in other hosts. This EST analysis sheds new light on immune response pathways in hemipteran guts that will be useful for further dissecting innate

  13. Expressed sequence tag analysis and development of gene associated markers in a near-isogenic plant system of Eragrostis curvula.

    PubMed

    Cervigni, Gerardo D L; Paniego, Norma; Díaz, Marina; Selva, Juan P; Zappacosta, Diego; Zanazzi, Darío; Landerreche, Iñaki; Martelotto, Luciano; Felitti, Silvina; Pessino, Silvina; Spangenberg, Germán; Echenique, Viviana

    2008-05-01

    Eragrostis curvula (Schrad.) Nees is a forage grass native to the semiarid regions of Southern Africa, which reproduces mainly by pseudogamous diplosporous apomixis. A collection of ESTs was generated from four cDNA libraries, three of them obtained from panicles of near-isogenic lines with different ploidy levels and reproductive modes, and one obtained from 12-day-old plant leaves. A total of 12,295 high-quality ESTs were clustered and assembled, rendering 8,864 unigenes, including 1,490 contigs and 7,394 singletons, with a genome coverage of 22%. A total of 7,029 (79.11%) unigenes were functionally categorized by BLASTX analysis against sequences deposited in public databases, but only 37.80% could be classified according to Gene Ontology. Sequence comparison against the cereals genes indexes (GI) revealed 50% significant hits. A total of 254 EST-SSRs were detected from 219 singletons and 35 from contigs. Di- and tri- motifs were similarly represented with percentages of 38.95 and 40.16%, respectively. In addition, 190 SNPs and Indels were detected in 18 contigs generated from 3 to 4 libraries. The ESTs and the molecular markers obtained in this study will provide valuable resources for a wide range of applications including gene identification, genetic mapping, cultivar identification, analysis of genetic diversity, phenotype mapping and marker assisted selection.

  14. Accumulation, functional annotation, and comparative analysis of expressed sequence tags in eggplant (Solanum melongena L.), the third pole of the genus Solanum species after tomato and potato.

    PubMed

    Fukuoka, Hiroyuki; Yamaguchi, Hirotaka; Nunome, Tsukasa; Negoro, Satomi; Miyatake, Koji; Ohyama, Akio

    2010-01-15

    Eggplant (Solanum melongena L.) is a widely grown vegetable crop that belongs to the genus Solanum, which is comprised of more than 1000 species of wide genetic and phenotypic variation. Unlike tomato and potato, Solanum crops that belong to subgenus Potatoe and have been targets for comprehensive genomic studies, eggplant is endemic to the Old World and belongs to a different subgenus, Leptostemonum, and therefore, would be a unique member for comparative molecular biology in Solanum. In this study, more than 60,000 eggplant cDNA clones from various tissues and treatments were sequenced from both the 5'- and 3'-ends, and a unigene set consisting of 16,245 unique sequences was constructed. Functional annotations based on sequence similarity to known plant reference datasets revealed a distribution of functional categories almost similar to that of tomato, while 1316 unigenes were suggested to be eggplant-specific. Sequence-based comparative analysis using putative orthologous gene groups setup by reciprocal sequence comparison among six solanaceous species suggested that eggplant and its wild ally Solanum torvum were clustered separately from subgenus Potatoe species, and then, all Solanum species were clustered separately from the genus Capsicum. Microsatellite motif distribution was different among species and likely to be coincident with the phylogenetic relationships. Furthermore, the eggplant unigene dataset exhibited its utility in transcriptome analysis by the SAGE strategy where a considerable number of short tag sequences of interest were successfully assigned to unigenes and their functional annotations. The eggplant ESTs and 16k unigene set developed in this study would be a useful resource not only for molecular genetics and breeding in eggplant itself, but for expanding the scope of comparative biology in Solanum species.

  15. Construction of cDNA library and preliminary analysis of expressed sequence tags from tea plant [Camellia sinensis (L) O. Kuntze].

    PubMed

    Phukon, Munmi; Namdev, Richa; Deka, Diganta; Modi, Mahendra K; Sen, Priyabrata

    2012-09-10

    Tea is the most popular non-alcoholic and healthy beverage across the world. The understanding of the genetic organization and molecular biology of tea plant, which is very poorly understood at present, is required for a quantum increase in productivity and efficient use of germplasm for either cultivation or breeding programs. Single-pass sequencing of randomly selected cDNA clones is the most widely accepted technique for gene identification and cloning. In the present study, a good quality cDNA library was constructed and preliminary analysis of ESTs was carried out. The titers of unamplified and amplified libraries were 1.4 × 10^6 pfu/ml and 5.27 × 10^8 pfu/ml, respectively. A total of 210 cDNA clones from the constructed cDNA library were sequenced and analyzed. A total of 84 high quality Expressed Sequence Tags (ESTs) were generated, among which 71 ESTs had significant homology with sequences in the NCBI non-redundant protein database by BLASTX analysis. About 80% of the ESTs had a poly(A) tail at the 3' end, indicating that the cDNAs were full length. The database-matched ESTs were classified into putative cellular roles, viz. energy-related category (corresponding to 20% of total BLASTX-matched ESTs), transcription (14.2%), protein synthesis (14.2%), cell growth and division (8.6%), cell structure (5.7%), signal transduction (5.7%), transporters (2.9%), disease and defenses (2.9%), secondary metabolism (2.9%) and gene regulation (2.9%). This study provides an overview of the mRNA expression profile and first-hand information on gene sequences expressed in tender leaves and apical buds of tea plant.

  16. Sequencing Degraded RNA Addressed by 3' Tag Counting

    PubMed Central

    Sigurgeirsson, Benjamín; Emanuelsson, Olof; Lundeberg, Joakim

    2014-01-01

    RNA sequencing has become widely used in gene expression profiling experiments. Prior to any RNA sequencing experiment the quality of the RNA must be measured to assess whether or not it can be used for further downstream analysis. The RNA integrity number (RIN) is a scale used to measure the quality of RNA that runs from 1 (completely degraded) to 10 (intact). Ideally, samples with high RIN (8) are used in RNA sequencing experiments. RNA, however, is a fragile molecule which is susceptible to degradation and obtaining high quality RNA is often hard, or even impossible when extracting RNA from certain clinical tissues. Thus, occasionally, working with low quality RNA is the only option the researcher has. Here we investigate the effects of RIN on RNA sequencing and suggest a computational method to handle data from samples with low quality RNA which also enables reanalysis of published datasets. Using RNA from a human cell line we generated and sequenced samples with varying RINs and illustrate what effect the RIN has on the basic procedure of RNA sequencing; both quality aspects and differential expression. We show that the RIN has systematic effects on gene coverage, false positives in differential expression and the quantification of duplicate reads. We introduce 3' tag counting (3TC) as a computational approach to reliably estimate differential expression for samples with low RIN. We show that using the 3TC method in differential expression analysis significantly reduces false positives when comparing samples with different RIN, while retaining reasonable sensitivity. PMID:24632678
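
    The idea behind 3' tag counting can be sketched in a few lines. The abstract does not state how the counted region is defined, so the fixed `window` length, the read representation and the function name below are all illustrative assumptions rather than the authors' implementation: a read contributes to a transcript's count only if it starts within the last `window` bases of that transcript.

```python
from collections import Counter

def three_prime_tag_counts(reads, transcript_lengths, window=300):
    """Count reads that start within `window` bases of each transcript's 3' end.

    reads: iterable of (transcript_id, start) pairs, where start is a 0-based
           coordinate measured from the 5' end of the transcript.
    transcript_lengths: dict mapping transcript_id -> length in bases.
    window: hypothetical size of the 3' region; not taken from the abstract.
    """
    counts = Counter()
    for transcript_id, start in reads:
        length = transcript_lengths[transcript_id]
        # Transcripts shorter than the window are counted over their full length.
        region_start = max(0, length - window)
        if start >= region_start:
            counts[transcript_id] += 1
    return counts

lengths = {"geneA": 1000, "geneB": 200}
reads = [("geneA", 950), ("geneA", 100), ("geneA", 700), ("geneB", 10)]
counts = three_prime_tag_counts(reads, lengths, window=300)
```

    Restricting the count to the 3' end is meant to make samples with different degrees of degradation more comparable, since coverage away from the 3' end is what the abstract reports as systematically affected by RIN.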

  17. Sequence-tagged microsatellite profiling (STMP): a rapid technique for developing SSR markers.

    PubMed

    Hayden, M J; Sharp, P J

    2001-04-15

    We describe a technique, sequence-tagged microsatellite profiling (STMP), to rapidly generate large numbers of simple sequence repeat (SSR) markers from genomic or cDNA. This technique eliminates the need for library screening to identify SSR-containing clones and provides an approximately 25-fold increase in sequencing throughput compared to traditional methods. STMP generates short but characteristic nucleotide sequence tags for fragments that are present within a pool of SSR amplicons. These tags are then ligated together to form concatemers for cloning and sequencing. The analysis of thousands of tags gives rise to a representational profile of the abundance and frequency of SSRs within the DNA pool, from which low copy sequences can be identified. As each tag contains sufficient nucleotide sequence for primer design, their conversion into PCR primers allows the amplification of corresponding full-length fragments from the pool of SSR amplicons. These fragments permit the full characterisation of a SSR locus and provide flanking sequence for the development of a microsatellite marker. Alternatively, sequence tag primers can be used to directly amplify corresponding SSR loci from genomic DNA, thereby reducing the cost of developing a microsatellite marker to the synthesis of just one sequence-specific primer. We demonstrate the utility of STMP by the development of SSR markers in bread wheat. PMID:11292857

  18. Sequence-tagged microsatellite profiling (STMP): a rapid technique for developing SSR markers

    PubMed Central

    Hayden, M. J.; Sharp, P. J.

    2001-01-01

    We describe a technique, sequence-tagged microsatellite profiling (STMP), to rapidly generate large numbers of simple sequence repeat (SSR) markers from genomic or cDNA. This technique eliminates the need for library screening to identify SSR-containing clones and provides an ∼25-fold increase in sequencing throughput compared to traditional methods. STMP generates short but characteristic nucleotide sequence tags for fragments that are present within a pool of SSR amplicons. These tags are then ligated together to form concatemers for cloning and sequencing. The analysis of thousands of tags gives rise to a representational profile of the abundance and frequency of SSRs within the DNA pool, from which low copy sequences can be identified. As each tag contains sufficient nucleotide sequence for primer design, their conversion into PCR primers allows the amplification of corresponding full-length fragments from the pool of SSR amplicons. These fragments permit the full characterisation of a SSR locus and provide flanking sequence for the development of a microsatellite marker. Alternatively, sequence tag primers can be used to directly amplify corresponding SSR loci from genomic DNA, thereby reducing the cost of developing a microsatellite marker to the synthesis of just one sequence-specific primer. We demonstrate the utility of STMP by the development of SSR markers in bread wheat. PMID:11292857

  19. Spatiotemporal analysis of bacterial diversity in sediments of Sundarbans using parallel 16S rRNA gene tag sequencing.

    PubMed

    Basak, Pijush; Majumder, Niladri Shekhar; Nag, Sudip; Bhattacharyya, Anish; Roy, Debojyoti; Chakraborty, Arpita; SenGupta, Sohan; Roy, Arunava; Mukherjee, Arghya; Pattanayak, Rudradip; Ghosh, Abhrajyoti; Chattopadhyay, Dhrubajyoti; Bhattacharyya, Maitree

    2015-04-01

    The influence of temporal and spatial variations on the microbial community composition was assessed in the unique coastal mangrove of Sundarbans using parallel 16S rRNA gene pyrosequencing. The total sediment DNA was extracted and subjected to the 16S rRNA gene pyrosequencing, which resulted in 117 Mbp of data from three experimental stations. The taxonomic analysis of the pyrosequencing data was grouped into 24 different phyla. In general, Proteobacteria were the most dominant phyla with predominance of Deltaproteobacteria, Alphaproteobacteria, and Gammaproteobacteria within the sediments. Besides Proteobacteria, there are a number of sequences affiliated to the following major phyla detected in all three stations in both the sampling seasons: Actinobacteria, Bacteroidetes, Planctomycetes, Acidobacteria, Chloroflexi, Cyanobacteria, Nitrospira, and Firmicutes. Further taxonomic analysis revealed abundance of micro-aerophilic and anaerobic microbial population in the surface layers, suggesting anaerobic nature of the sediments in Sundarbans. The results of this study add valuable information about the composition of microbial communities in Sundarbans mangrove and shed light on possible transformations promoted by bacterial communities in the sediments. PMID:25256302

  20. Comparative analysis of secreted protein evolution using expressed sequence tags from four poplar leaf rusts (Melampsora spp.)

    PubMed Central

    2010-01-01

    Background Obligate biotrophs such as rust fungi are believed to establish long-term relationships by modulating plant defenses through a plethora of effector proteins, whose most recognizable feature is the presence of a signal peptide for secretion. Since the phenotypes of these effectors extend to host cells, their genes are expected to be under accelerated evolution stimulated by host-pathogen coevolutionary arms races. Recently, whole genome sequence data has allowed the prediction of secretomes, facilitating the identification of putative effectors. Results We generated cDNA libraries from four poplar leaf rust pathogens (Melampsora spp.) and used computational approaches to identify and annotate putative secreted proteins with the aim of uncovering new knowledge about the nature and evolution of the rust secretome. While more than half of the predicted secretome members encoded lineage-specific proteins, similarities with experimentally characterized fungal effectors were also identified. A SAGE analysis indicated a strong stage-specific regulation of transcripts encoding secreted proteins. The average sequence identity of putative secreted proteins to their closest orthologs in the wheat stem rust Puccinia graminis f. sp. tritici was dramatically reduced compared with non-secreted ones. A comparative genomics approach based on homologous gene groups unravelled positive selection in putative members of the secretome. Conclusion We uncovered robust evidence that different evolutionary constraints are acting on the rust secretome when compared to the rest of the genome. These results are consistent with the view that these genes are more likely to exhibit an effector activity and be involved in coevolutionary arms races with host factors. PMID:20615251

  1. Expressed sequence tags from the halophyte Limonium sinense.

    PubMed

    Chen, Shi-Hua; Guo, Shan Li; Wang, Zeng Lan; Zhao, Ji Qiang; Zhao, Yan Xiu; Zhang, Hui

    2007-02-01

    Halophytes can grow under high-salinity conditions. As in glycophytes, their salt tolerance is genetically complex. There are many morphological and physiological studies on halophytes, but very little molecular-level information on why they are salt-tolerant. Limonium sinense is a salt-secreting halophyte that excretes salts through multicellular glands. Here, we report the construction and sequence analysis of a cDNA library made from leaf tissue of L. sinense. Among the 1082 expressed sequence tags (ESTs) obtained, 684 unique genes were identified: 429 showed homology to previously identified genes, and 255 matched uncharacterized genes. Compared with other EST databases, some characteristic features were observed, such as an abundance of genes related to the cytoskeleton, intracellular traffic, and membrane transport, which may be specific to halophytes. PMID:17364815

  2. SPIDER: software for protein identification from sequence tags with de novo sequencing error.

    PubMed

    Han, Yonghua; Ma, Bin; Zhang, Kaizhong

    2005-06-01

    For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then used to match, accounting for amino acid mutations, the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software. PMID:16108090
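
    The error type described above, one segment of residues replaced by another of nearly equal mass, can be illustrated with a small check. This is not SPIDER's matching algorithm, only a sketch of the mass-equivalence test such an algorithm has to tolerate; the function names and the 0.02 Da tolerance are assumptions, while the values are standard monoisotopic residue masses.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def segment_mass(segment):
    """Sum of residue masses for a peptide segment given as one-letter codes."""
    return sum(RESIDUE_MASS[aa] for aa in segment)

def mass_equivalent(seg_a, seg_b, tol=0.02):
    """True if two segments are indistinguishable by mass within `tol` Da.

    `tol` is a hypothetical instrument tolerance chosen for illustration.
    """
    return abs(segment_mass(seg_a) - segment_mass(seg_b)) <= tol

# "GG" and "N" differ by about 0.00001 Da, so a de novo tag may report
# either one; likewise "AG" and "Q".
swap_gg_n = mass_equivalent("GG", "N")
swap_ag_q = mass_equivalent("AG", "Q")
swap_gg_q = mass_equivalent("GG", "Q")
```

    With this tolerance, "Q" and "K" (about 0.036 Da apart) are still distinguished; loosening `tol` would merge them, which is one reason the tolerance has to reflect the instrument actually used.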

  3. SPIDER: software for protein identification from sequence tags with de novo sequencing error.

    PubMed

    Han, Yonghua; Ma, Bin; Zhang, Kaizhong

    2004-01-01

    For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then used to match, accounting for amino acid mutations, the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software. PMID:16448014

  4. Identification of Genes with Potential Roles in Apple Fruit Development and Biochemistry through Large-Scale Statistical Analysis of Expressed Sequence Tags

    PubMed Central

    Park, Sunchung; Sugimoto, Nobuko; Larson, Matthew D.; Beaudry, Randy; van Nocker, Steven

    2006-01-01

    Advanced studies of apple (Malus domestica Borkh) development, physiology, and biochemistry have been hampered by the lack of appropriate genomics tools. One exception is the recent acquisition of extensive expressed sequence tag (EST) data. The entire available EST dataset for apple resulted from the efforts of at least 20 contributors and was derived from more than 70 cDNA libraries representing diverse transcriptional profiles from a variety of organs, fruit parts, developmental stages, biotic and abiotic stresses, and from at least nine cultivars. We analyzed apple EST sequences available in public databanks using statistical algorithms to identify those apple genes that are likely to be highly expressed in fruit, expressed uniquely or preferentially in fruit, and/or temporally or spatially regulated during fruit growth and development. We applied these results to the analysis of biochemical pathways involved in biosynthesis of precursors for volatile esters and identified a subset of apple genes that may participate in generating flavor and aroma components found in mature fruit. PMID:16825339

  5. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

    PubMed Central

    de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.

    2000-01-01

    Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by GENSCAN (http://genes.mit.edu/GENSCAN.html). PMID:11070084

  6. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  7. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  8. A molecular analysis of desiccation tolerance mechanisms in the anhydrobiotic nematode Panagrolaimus superbus using expressed sequenced tags

    PubMed Central

    2012-01-01

    Background Some organisms can survive extreme desiccation by entering into a state of suspended animation known as anhydrobiosis. Panagrolaimus superbus is a free-living anhydrobiotic nematode that can survive rapid environmental desiccation. The mechanisms that P. superbus uses to combat the potentially lethal effects of cellular dehydration may include the constitutive and inducible expression of protective molecules, along with behavioural and/or morphological adaptations that slow the rate of cellular water loss. In addition, inducible repair and revival programmes may also be required for successful rehydration and recovery from anhydrobiosis. Results To identify constitutively expressed candidate anhydrobiotic genes we obtained 9,216 ESTs from an unstressed mixed stage population of P. superbus. We derived 4,009 unigenes from these ESTs. These unigene annotations and sequences can be accessed at http://www.nematodes.org/nembase4/species_info.php?species=PSC. We manually annotated a set of 187 constitutively expressed candidate anhydrobiotic genes from P. superbus. Notable among those is a putative lineage expansion of the lea (late embryogenesis abundant) gene family. The most abundantly expressed sequence was a member of the nematode specific sxp/ral-2 family that is highly expressed in parasitic nematodes and secreted onto the surface of the nematodes' cuticles. There were 2,059 novel unigenes (51.7% of the total), 149 of which are predicted to encode intrinsically disordered proteins lacking a fixed tertiary structure. One unigene may encode an exo-β-1,3-glucanase (GHF5 family), most similar to a sequence from Phytophthora infestans. GHF5 enzymes have been reported from several species of plant parasitic nematodes, with horizontal gene transfer (HGT) from bacteria proposed to explain their evolutionary origin. This P. superbus sequence represents another possible HGT event within the Nematoda. The expression of five of the 19 putative stress response

  9. Unraveling new genes associated with seed development and metabolism in Bixa orellana L. by expressed sequence tag (EST) analysis.

    PubMed

    Soares, Virgínia L F; Rodrigues, Simone M; de Oliveira, Tahise M; de Queiroz, Talisson O; Lima, Lívia S; Hora-Júnior, Braz T; Gramacho, Karina P; Micheli, Fabienne; Cascardo, Júlio C M; Otoni, Wagner C; Gesteira, Abelmon S; Costa, Marcio G C

    2011-02-01

    The tropical tree Bixa orellana L. produces a range of secondary metabolites whose biochemical and molecular biosynthetic basis is not well understood. In this work we have characterized a set of ESTs from a non-normalized cDNA library of B. orellana seeds to obtain information about the main developmental and metabolic processes taking place in developing seeds and their associated genes. After sequencing a set of randomly selected clones, most of the sequences were assigned putative functions based on similarity, GO annotations and protein domains. The most abundant transcripts encoded proteins associated with cell wall (prolyl 4-hydroxylase), fatty acid (acyl carrier protein), and hormone/flavonoid (2OG-Fe oxygenase) synthesis, germination (MADS FLC-like protein) and embryo development (AP2/ERF transcription factor) regulation, photosynthesis (chlorophyll a-b binding protein), cell elongation (MAP65-1a), and stress responses (metallothionein- and thaumatin-like proteins). Enzymes were assigned to 16 different metabolic pathways related to both primary and secondary metabolism. Characterization of two candidate genes of the bixin biosynthetic pathway, BoCCD and BoOMT, showed that they belong, respectively, to the carotenoid-cleavage dioxygenase 4 (CCD4) and caffeic acid O-methyltransferase (COMT) families, and are up-regulated during seed development. This indicates their involvement in the synthesis of this commercially important carotenoid pigment in seeds of B. orellana. Most of the genes identified here are the first representatives of their gene families in B. orellana. PMID:20563648

  10. Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

    PubMed Central

    Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A

    2008-01-01

    Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731

  11. Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

    PubMed

    Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

    2015-01-01

    Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes linked to physiological processes were identified, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to be SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasm accessions. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species. PMID:26505424
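
    Screening unigenes for di- and trinucleotide SSRs, as reported above, is commonly done by searching for tandem repeats of a short motif. The sketch below uses a regular expression for that purpose; it is not the tool used in the study, and the function name and the minimum of six repeats are assumptions chosen for illustration.

```python
import re

def find_ssrs(seq, min_repeats=6):
    """Return (start, motif, repeat_count) for di- and trinucleotide SSRs.

    min_repeats is a hypothetical threshold; real pipelines usually set a
    separate minimum for each motif length.
    """
    seq = seq.upper()
    hits = []
    for unit_len in (2, 3):
        # A motif of unit_len bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA...
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return sorted(hits)

example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 6 + "TTC"
ssrs = find_ssrs(example)
# ssrs == [(3, "AT", 7), (20, "GAA", 6)]
```

    Tallying the motif lengths of the returned hits gives the kind of breakdown quoted in the abstract, such as the share of dinucleotide versus trinucleotide repeats.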

  12. Global Transcriptome Analysis of the Tentacle of the Jellyfish Cyanea capillata Using Deep Sequencing and Expressed Sequence Tags: Insight into the Toxin- and Degenerative Disease-Related Transcripts

    PubMed Central

    Liu, Dan; Wang, Qianqian; Ruan, Zengliang; He, Qian; Zhang, Liming

    2015-01-01

    Background Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. Results We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington’s, Alzheimer’s and Parkinson’s diseases. This is the first description of degenerative disease-associated genes in jellyfish. Conclusion We obtained a well-categorized and annotated transcriptome of C. capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular

  13. Development of expressed sequence tag-based microsatellite markers for the critically endangered Isoëtes sinensis (Isoetaceae) based on transcriptome analysis.

    PubMed

    Gichira, A W; Long, Z C; Wang, Q F; Chen, J M; Liao, K

    2016-01-01

    Isoëtes sinensis is a critically endangered quillwort. To facilitate studies on the conservation genetics of this species, we developed expressed sequence tag-simple sequence repeat (EST-SSR) markers. A total of 50,063 unigenes were predicted by transcriptome sequencing, 5294 (10.6%) of which significantly matched 3011 Gene Ontology annotations and 2363 were assigned to Kyoto Encyclopedia of Genes and Genomes metabolic pathways. Most of these (2297) were involved in metabolism. A total of 1982 SSR motifs were identified, with trinucleotides being the dominant repeat motif, and 1438 (72.6%) SSR primers were designed. Eighteen randomly selected primer pairs were used to genotype 24 I. sinensis accessions, which confirmed the suitability of these novel markers for molecular studies of I. sinensis. The heterozygosity index value ranged between 0.0799 and 0.9106, while the Shannon-Wiener diversity index value ranged between 0.1732 and 2.5589. The EST-SSRs reported in this study are linked to genic sequences, and are therefore ideal for investigating the evolutionary history of I. sinensis. These markers, together with the large EST dataset generated in this study, will greatly facilitate conservation genetic studies of I. sinensis. PMID:27525847
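
    The two per-locus summary statistics quoted above can be computed from allele frequencies. The abstract does not say which estimators were used, so the sketch below assumes expected heterozygosity (one minus the sum of squared allele frequencies) and a Shannon index with natural logarithms; the function names are illustrative.

```python
import math

def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def shannon_index(freqs):
    """Shannon-Wiener index at one locus: -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

# A locus with two equally frequent alleles.
two_alleles = [0.5, 0.5]
he = expected_heterozygosity(two_alleles)   # 0.5
shannon = shannon_index(two_alleles)        # ln 2, about 0.6931
```

    Both values grow with the number of alleles and with how evenly they are distributed, which is why a locus with many alleles can reach a Shannon index well above 1.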

  14. Analysis of expressed sequence tags from a significant livestock pest, the stable fly (Stomoxys calcitrans), identifies transcripts with a putative role in chemosensation and sex determination.

    PubMed

    Olafson, Pia Untalan; Lohmeyer, Kimberly H; Dowd, Scot E

    2010-07-01

    The stable fly, Stomoxys calcitrans L. (Diptera: Muscidae), is one of the most significant pests of livestock in the United States. The identification of targets for the development of novel control for this pest species, focusing on those molecules that play a role in successful feeding and reproduction, is critical to mitigating its impact on confined and rangeland livestock. A database was developed representing genes expressed at the immature and adult life stages of the stable fly, comprising data obtained from pyrosequencing both immature and adult stages and from small-scale sequencing of an antennal/maxillary palp-expressed sequence tag library. The full-length sequence and expression of 21 transcripts that may have a role in chemosensation is presented, including 13 odorant-binding proteins, 6 chemosensory proteins, and 2 odorant receptors. Transcripts with potential roles in sex determination and reproductive behaviors are identified, including evidence for the sex-specific expression of stable fly doublesex- and transformer-like transcripts. The current database will be a valuable tool for target identification and for comparative studies with other Diptera. PMID:20572127

  16. Massively parallel tag sequencing reveals the complexity of anaerobic marine protistan communities

    PubMed Central

    Stoeck, Thorsten; Behnke, Anke; Christen, Richard; Amaral-Zettler, Linda; Rodriguez-Mora, Maria J; Chistoserdov, Andrei; Orsi, William; Edgcomb, Virginia P

    2009-01-01

    Background Recent advances in sequencing strategies make possible unprecedented depth and scale of sampling for molecular detection of microbial diversity. Two major paradigm-shifting discoveries include the detection of bacterial diversity that is one to two orders of magnitude greater than previous estimates, and the discovery of an exciting 'rare biosphere' of molecular signatures ('species') of poorly understood ecological significance. We applied a high-throughput parallel tag sequencing (454 sequencing) protocol adopted for eukaryotes to investigate protistan community complexity in two contrasting anoxic marine ecosystems (Framvaren Fjord, Norway; Cariaco deep-sea basin, Venezuela). Both sampling sites have previously been scrutinized for protistan diversity by traditional clone library construction and Sanger sequencing. By comparing these clone library data with 454 amplicon library data, we assess the efficiency of high-throughput tag sequencing strategies. We here present a novel, highly conservative bioinformatic analysis pipeline for the processing of large tag sequence data sets. Results The analyses of ca. 250,000 sequence reads revealed that the number of detected Operational Taxonomic Units (OTUs) far exceeded previous richness estimates from the same sites based on clone libraries and Sanger sequencing. More than 90% of this diversity was represented by OTUs with less than 10 sequence tags. We detected a substantial number of taxonomic groups like Apusozoa, Chrysomerophytes, Centroheliozoa, Eustigmatophytes, hyphochytriomycetes, Ichthyosporea, Oikomonads, Phaeothamniophytes, and rhodophytes which remained undetected by previous clone library-based diversity surveys of the sampling sites. 
    The most important innovations in our newly developed bioinformatics pipeline employ (i) BLASTN with query parameters adjusted for highly variable domains and a complete database of public ribosomal RNA (rRNA) gene sequences for taxonomic assignments of tags; (ii

  17. Generation and analysis of a 29,745 unique Expressed Sequence Tags from the Pacific oyster (Crassostrea gigas) assembled into a publicly accessible database: the GigasDatabase

    PubMed Central

    2009-01-01

    Background Although bivalves are among the most-studied marine organisms because of their ecological role and economic importance, very little information is available on the genome sequences of oyster species. This report documents three large-scale cDNA sequencing projects for the Pacific oyster Crassostrea gigas initiated to provide a large number of expressed sequence tags that were subsequently compiled in a publicly accessible database. This resource allowed for the identification of a large number of transcripts and provides valuable information for ongoing investigations of tissue-specific and stimulus-dependent gene expression patterns. These data are crucial for constructing comprehensive DNA microarrays, identifying single nucleotide polymorphisms and microsatellites in coding regions, and for identifying genes when the entire genome sequence of C. gigas becomes available. Description In the present paper, we report the production of 40,845 high-quality ESTs that identify 29,745 unique transcribed sequences consisting of 7,940 contigs and 21,805 singletons. All of these new sequences, together with existing public sequence data, have been compiled into a publicly available website, http://public-contigbrowser.sigenae.org:9090/Crassostrea_gigas/index.html. Approximately 43% of the unique ESTs had significant matches against the SwissProt database and 27% were annotated using Gene Ontology terms. In addition, we identified a total of 208 in silico microsatellites from the ESTs, with 173 having sufficient flanking sequence for primer design. We also identified a total of 7,530 putative in silico single-nucleotide polymorphisms using existing and newly generated EST resources for the Pacific oyster. Conclusion A publicly available database has been populated with 29,745 unique sequences for the Pacific oyster Crassostrea gigas. The database provides many tools to search cleaned and assembled ESTs. The user may input and submit several filters, such as

  18. Single nucleotide polymorphisms from Theobroma cacao expressed sequence tags associated with witches' broom disease in cacao.

    PubMed

    Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F

    2009-07-14

    In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.

  19. A sequence-tagged linkage map of Brassica rapa.

    PubMed

    Kim, Jung Sun; Chung, Tae Young; King, Graham J; Jin, Mina; Yang, Tae-Jin; Jin, Yong-Moon; Kim, Ho-Il; Park, Beom-Seok

    2006-09-01

    A detailed genetic linkage map of Brassica rapa has been constructed containing 545 sequence-tagged loci covering 1287 cM, with an average mapping interval of 2.4 cM. The loci were identified using a combination of 520 RFLP and 25 PCR-based markers. RFLP probes were derived from 359 B. rapa EST clones and amplification products of 11 B. rapa and 26 Arabidopsis. The inclusion of 21 SSR markers provided anchors to previously published linkage maps for B. rapa and B. napus, so that the linkage groups follow the reference designations R1-R10. The sequence-tagged markers allowed interpretation of the pattern of chromosome duplications within the B. rapa genome and comparison with Arabidopsis. A total of 62 EST markers showing a single RFLP band were mapped across the 10 linkage groups, indicating that these can be valuable anchoring markers for chromosome-based genome sequencing of B. rapa. Other RFLP probes gave rise to 2-5 loci, implying that genome duplication in B. rapa is a general phenomenon across all 10 chromosomes. The map includes five loci of FLC paralogues, which represent the previously reported BrFLC-1, -2, -3, and -5 and additionally identified BrFLC3 paralogues derived from local segmental duplication on R3.

  20. Highly sensitive targeted methylome sequencing by post-bisulfite adaptor tagging

    PubMed Central

    Miura, Fumihito; Ito, Takashi

    2015-01-01

    The current gold standard method for methylome analysis is whole-genome bisulfite sequencing (WGBS), but its cost is substantial, especially for the purpose of multi-sample comparison of large methylomes. Shotgun bisulfite sequencing of target-enriched DNA, or targeted methylome sequencing (TMS), can be a flexible, cost-effective alternative to WGBS. However, the current TMS protocol requires a considerable amount of input DNA and hence is hardly applicable to samples of limited quantity. Here we report a method to overcome this limitation by using post-bisulfite adaptor tagging (PBAT), in which adaptor tagging is conducted after bisulfite treatment to circumvent bisulfite-induced loss of intact sequencing templates, thereby enabling TMS of a 100-fold smaller amount of input DNA with far fewer cycles of polymerase chain reaction than in the current protocol. We thus expect that the PBAT-mediated TMS will serve as an invaluable method in epigenomics. PMID:25324297

  1. An SNR improvement of passive SAW tags with 5-bit Barker code sequence

    NASA Astrophysics Data System (ADS)

    Bae, Hyunchul; Kim, Jaekwon; Burm, Jinwook

    2012-07-01

    Passive surface acoustic wave (SAW) tags require a large signal-to-noise ratio (SNR) in order to increase the interrogation range. For the purpose of achieving high SNR for radio frequency identification (RFID) communication systems, Barker codes, a binary phase shift keying (BPSK) modulation technique, have been adopted in this study. Passive SAW RFID tags were designed with 5-bit Barker code sequences to generate BPSK modulated signals. Through the SNR analysis, the improvements in SNR were about 11 dB using Barker codes along with a correlator, which can be further improved by optimisation in the correlator.
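
    The benefit of pairing a Barker code with a correlator comes from the code's autocorrelation: one sharp peak and very low sidelobes. The sketch below illustrates that property for the standard 5-bit Barker sequence, written as +1/-1 chips to mirror the two BPSK phases. It is an idealized, noise-free illustration, not a model of the SAW tags or of the roughly 11 dB figure reported above.

```python
# Standard 5-bit Barker sequence; +1 and -1 stand for the two BPSK phases.
BARKER_5 = [1, 1, 1, -1, 1]


def correlate(received, code):
    """Full sliding correlation: sum of products of `received` and `code` at every lag."""
    n, m = len(received), len(code)
    out = []
    for lag in range(-(m - 1), n):
        total = 0
        for j, chip in enumerate(code):
            i = lag + j
            if 0 <= i < n:
                total += received[i] * chip
        out.append(total)
    return out


acf = correlate(BARKER_5, BARKER_5)
peak = max(acf)
# Largest off-peak magnitude (the peak itself sits at the centre of the output).
sidelobe = max(abs(v) for k, v in enumerate(acf) if k != len(acf) // 2)
print(acf, peak, sidelobe)
```

    The peak equals the code length (5) while no sidelobe exceeds 1 in magnitude, which is what lets a correlator pick the tag response out of noise.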

  2. Mining and survey of simple sequence repeats in expressed sequence tags of dicotyledonous species.

    PubMed

    Kumpatla, Siva P; Mukhopadhyay, Snehasis

    2005-12-01

    Simple sequence repeat (SSR) markers are widely used in many plant and animal genomes due to their abundance, hypervariability, and suitability for high-throughput analysis. Development of SSR markers using molecular methods is time consuming, laborious, and expensive. Use of computational approaches to mine ever-increasing sequences such as expressed sequence tags (ESTs) in public databases permits rapid and economical discovery of SSRs. Most such efforts to date have focused on mining SSRs from monocotyledonous ESTs. In this study, we have computationally mined and examined the abundance of SSRs in more than 1.54 million ESTs belonging to 55 dicotyledonous species. The frequency of ESTs containing SSRs among species ranged from 2.65% to 16.82%. Dinucleotide repeats were found to be the most abundant, followed by tri- or mono-nucleotide repeats. The motifs A/T, AG/GA/CT/TC, and AAG/AGA/GAA/CTT/TTC/TCT were the predominant mono-, di-, and tri-nucleotide SSRs, respectively. Most of the mononucleotide SSRs contained 15-25 repeats, whereas the majority of the di- and tri-nucleotide SSRs contained 5-10 repeats. The comprehensive SSR survey data presented here demonstrate the potential of in silico mining of ESTs for rapid development of SSR markers for genetic analysis and applications in dicotyledonous crops.
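
    In silico SSR mining of the kind described above can be sketched with a regular-expression scan. This is an illustration only, not the tool used in the study: the minimum repeat counts in MIN_REPEATS are hypothetical thresholds loosely inspired by the repeat ranges quoted in the abstract, and the example sequence is made up.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {1: 15, 2: 5, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for mono-, di- and tri-nucleotide SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip di-/tri-nucleotide "motifs" that are just a single repeated base.
            if motif_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Made-up EST fragment containing an (AG)7 and an (AAG)6 repeat.
est = "TTGC" + "AG" * 7 + "CCGT" + "AAG" * 6 + "TCA"
print(find_ssrs(est))
```

    Running the same scan over every EST in a collection and dividing the number of SSR-containing ESTs by the total would give a per-species frequency comparable in spirit to the 2.65% to 16.82% range reported above.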

  3. Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

    PubMed

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of all kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover and chaperones, 11.3% to transport, 9.9% to signal transducer/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle; only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene encoding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, encoding the TAT-Ferritin fusion protein with two 6× His-tags at the N and C termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers. PMID:23708105

  4. Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

    PubMed

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-05-24

    In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of all kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover and chaperones, 11.3% to transport, 9.9% to signal transducer/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle; only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene encoding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, encoding the TAT-Ferritin fusion protein with two 6× His-tags at the N and C termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers.

  5. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    PubMed Central

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of all kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover and chaperones, 11.3% to transport, 9.9% to signal transducer/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle; only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene encoding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, encoding the TAT-Ferritin fusion protein with two 6× His-tags at the N and C termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers. PMID:23708105

  6. Sequence tagged microsatellite profiling (STMP): improved isolation of DNA sequence flanking target SSRs

    PubMed Central

    Hayden, M. J.; Good, G.; Sharp, P. J.

    2002-01-01

    Sequence tagged microsatellite profiling (STMP) enables the rapid development of large numbers of co-dominant DNA markers, known as sequence tagged microsatellites (STMs). Each STM is amplified by PCR using a single primer specific to the conserved DNA sequence flanking the microsatellite repeat in combination with a universal primer that anchors to the 5′-ends of the microsatellites. It is also possible to convert STMs into conventional microsatellite, or simple sequence repeat (SSR), markers that are amplified using a pair of primers flanking the repeat sequence. Here, we describe a modification of the STMP procedure to significantly improve the capacity to convert STMs into conventional SSRs and, therefore, facilitate the development of highly specific DNA markers for purposes such as marker-assisted breeding. The usefulness of this technique was demonstrated in bread wheat. PMID:12466561

  7. Sequence tagged microsatellite profiling (STMP): improved isolation of DNA sequence flanking target SSRs.

    PubMed

    Hayden, M J; Good, G; Sharp, P J

    2002-12-01

    Sequence tagged microsatellite profiling (STMP) enables the rapid development of large numbers of co-dominant DNA markers, known as sequence tagged microsatellites (STMs). Each STM is amplified by PCR using a single primer specific to the conserved DNA sequence flanking the microsatellite repeat in combination with a universal primer that anchors to the 5'-ends of the microsatellites. It is also possible to convert STMs into conventional microsatellite, or simple sequence repeat (SSR), markers that are amplified using a pair of primers flanking the repeat sequence. Here, we describe a modification of the STMP procedure to significantly improve the capacity to convert STMs into conventional SSRs and, therefore, facilitate the development of highly specific DNA markers for purposes such as marker-assisted breeding. The usefulness of this technique was demonstrated in bread wheat. PMID:12466561

  8. CREST--classification resources for environmental sequence tags.

    PubMed

    Lanzén, Anders; Jørgensen, Steffen L; Huson, Daniel H; Gorfer, Markus; Grindhaug, Svenn Helge; Jonassen, Inge; Øvreås, Lise; Urich, Tim

    2012-01-01

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com. PMID:23145153
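
    The lowest common ancestor (LCA) idea mentioned above can be sketched as follows. This is a generic illustration, not CREST's implementation: the score_window parameter, the hit format, and the example lineages are assumptions, and the rank-specific similarity criteria the abstract describes are not modeled.

```python
def lowest_common_ancestor(lineages):
    """Return the longest shared root-to-leaf prefix of a list of lineages."""
    if not lineages:
        return []
    lca = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            lca.append(ranks[0])
        else:
            break
    return lca


def classify(hits, score_window=0.02):
    """Assign a read to the LCA of all hits scoring within score_window of the best hit.

    hits: list of (bit_score, lineage) pairs; score_window is a hypothetical parameter.
    """
    if not hits:
        return []
    best = max(score for score, _ in hits)
    kept = [lineage for score, lineage in hits if score >= best * (1 - score_window)]
    return lowest_common_ancestor(kept)


# Made-up alignment hits for one environmental SSU rRNA read.
hits = [
    (500.0, ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Vibrionales"]),
    (495.0, ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Alteromonadales"]),
    (300.0, ["Bacteria", "Firmicutes", "Bacilli", "Bacillales"]),
]
print(classify(hits))
```

    Because the two best hits disagree at the order level, the read is assigned only as far as the class they share; the distant third hit falls outside the score window and does not affect the result.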

  9. CREST – Classification Resources for Environmental Sequence Tags

    PubMed Central

    Lanzén, Anders; Jørgensen, Steffen L.; Huson, Daniel H.; Gorfer, Markus; Grindhaug, Svenn Helge; Jonassen, Inge; Øvreås, Lise; Urich, Tim

    2012-01-01

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com. PMID:23145153

  11. Automated SNP detection in expressed sequence tags: statistical considerations and application to maritime pine sequences.

    PubMed

    Dantec, Loïck Le; Chagné, David; Pot, David; Cantin, Olivier; Garnier-Géré, Pauline; Bedon, Frank; Frigerio, Jean-Marc; Chaumeil, Philippe; Léger, Patrick; Garcia, Virginie; Laigret, Frédéric; De Daruvar, Antoine; Plomion, Christophe

    2004-02-01

    We developed an automated pipeline for the detection of single nucleotide polymorphisms (SNPs) in expressed sequence tag (EST) data sets, by combining three DNA sequence analysis programs: Phred, Phrap and PolyBayes. This application requires access to the individual electrophoregram traces. First, a reference set of 65 SNPs was obtained from the sequencing of 30 gametes in 13 maritime pine (Pinus pinaster Ait.) gene fragments (6671 bp), resulting in a frequency of 1 SNP every 102.6 bp. Second, parameters of the three programs were optimized in order to retrieve as many true SNPs as possible while keeping the rate of false positives as low as possible. Overall, the efficiency of detection of true SNPs was 83.1%. However, this rate varied widely as a function of the frequency of the rarer SNP allele: from 41% for rare SNP alleles (frequency < 10%) up to 98% for allele frequencies above 10%. Third, the detection method was applied to the 18498 assembled maritime pine (Pinus pinaster Ait.) ESTs, which identified a total of 1400 candidate SNPs in contigs containing between 4 and 20 sequence reads. These genetic resources, described for the first time in a forest tree species, were made available at http://www.pierroton.inra/genetics/Pinesnps. We also derived an analytical expression for the SNP detection probability as a function of the SNP allele frequency, the number of haploid genomes used to generate the EST sequence database, and the sample size of the contigs considered for SNP detection. The frequency of the SNP allele was shown to be the main factor influencing the probability of SNP detection.
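
    The dependence of detection on allele frequency and contig depth can be illustrated with a deliberately simplified binomial sketch. It is not the analytical expression derived in the paper (which is not given in the abstract): it assumes reads are independent draws from an effectively infinite pool, ignores the finite number of haploid genomes and sequencing error, and treats a SNP as detectable when each allele is seen at least min_count times.

```python
from math import comb


def detection_probability(p, n, min_count=1):
    """P(min_count <= X <= n - min_count) for X ~ Binomial(n, p).

    p: frequency of the rarer allele; n: number of reads in the contig;
    min_count: hypothetical minimum number of reads required for each allele.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(min_count, n - min_count + 1)
    )


# Contig depths spanning the 4 to 20 reads mentioned above.
for n in (4, 10, 20):
    print(n, round(detection_probability(0.05, n), 3), round(detection_probability(0.30, n), 3))
```

    Even under these simplifying assumptions, rare alleles are much less likely to be sampled in shallow contigs, which is qualitatively consistent with the lower detection efficiency reported for alleles below 10% frequency.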

  12. Analysis of expressed sequence tags (ESTs) and gene expression changes under different growth conditions for the ciliate Anophryoides haemophila, the causative agent of bumper car disease in the American lobster (Homarus americanus).

    PubMed

    Acorn, Adam R; Clark, K Fraser; Jones, Sarah; Després, Béatrice M; Munro, Sarah; Cawthorn, Richard J; Greenwood, Spencer J

    2011-06-01

    The scuticociliate Anophryoides haemophila causes bumper car disease in American lobster (Homarus americanus) in commercial holding facilities in Atlantic Canada. While the parasite has been recognized since the 1970s and much has been learned about its biology, minimal molecular characterization exists. With genome consortiums turning to model organisms like the ciliates Tetrahymena and Paramecium, the amount of relevant sequence data available has made sequence surveys more attractive for gene discovery in related ciliates. We sequenced 9984 expressed sequence tags (ESTs) from a non-normalized A. haemophila cDNA library to characterize gene expression patterns and functional gene distribution, and to discover novel genes related to the parasitic life history. The A. haemophila ESTs were grouped into 843 clusters and singletons with 658 EST clusters having identifiable homologs, while 159 ESTs were unique and had no similarity to any sequences in the public databases. Not unexpectedly, about 67% of the A. haemophila ESTs have similarity to annotated and hypothetical genes from the related oligohymenophorean ciliate, Tetrahymena. Numerous cysteine proteases, hypothetical proteins and novel sequences possess putative secretory signal peptides, suggesting that they may contribute to the pathogenesis of bumper car disease in lobster. Real-time RT-qPCR analysis of cathepsin L and two homologs of cathepsin B did not show any changes in gene expression under varying in vitro growth conditions or during a modified in vivo infection, which may be suggestive of the opportunistic life history strategy of this ciliate.

  13. Identification of molecular motors in the Woods Hole squid, Loligo pealei: an expressed sequence tag approach.

    PubMed

    DeGiorgis, Joseph A; Cavaliere, Kimberly R; Burbach, J Peter H

    2011-10-01

    The squid giant axon and synapse are unique systems for studying neuronal function. While a few nucleotide and amino acid sequences have been obtained from squid, large scale genetic and proteomic information is lacking. We have been particularly interested in motors present in axons and their roles in transport processes. Here, to obtain genetic data and to identify motors expressed in squid, we initiated an expressed sequence tag project by single-pass sequencing mRNAs isolated from the stellate ganglia of the Woods Hole Squid, Loligo pealei. A total of 22,689 high quality expressed sequence tag (EST) sequences were obtained and subjected to basic local alignment search tool analysis. Seventy six percent of these sequences matched genes in the National Center for Bioinformatics databases. By CAP3 analysis this library contained 2459 contigs and 7568 singletons. Mining for motors successfully identified six kinesins, six myosins, a single dynein heavy chain, as well as components of the dynactin complex, and motor light chains and accessory proteins. This initiative demonstrates that EST projects represent an effective approach to obtain sequences of interest.
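
    The assembly figures quoted above can be turned into a few summary statistics. The following is a minimal sketch, assuming that every one of the 22,689 ESTs ended up either in a contig or as a singleton (the abstract does not say whether any reads were discarded) and using one common convention for redundancy (1 minus unigenes per EST). The function name est_summary is hypothetical.

```python
def est_summary(total_ests: int, contigs: int, singletons: int) -> dict:
    """Summarize an EST assembly from its headline counts.

    Assumption: every EST is either a singleton or a member of a contig.
    """
    unigenes = contigs + singletons           # non-redundant sequences
    ests_in_contigs = total_ests - singletons # reads merged into contigs
    return {
        "unigenes": unigenes,
        "ests_in_contigs": ests_in_contigs,
        # Fraction of reads that did not add a new unigene.
        "redundancy": 1 - unigenes / total_ests,
        "mean_ests_per_contig": ests_in_contigs / contigs,
    }


# Counts taken from the abstract above (Loligo pealei stellate ganglia).
summary = est_summary(total_ests=22689, contigs=2459, singletons=7568)
for key, value in summary.items():
    print(f"{key}: {value}")
```

    Under these assumptions the library yields 10,027 unigenes, so a little over half of the reads were redundant with another read in the set.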

  14. Initiation of a Sarcocystis neurona expressed sequence tag (EST) sequencing project: a preliminary report.

    PubMed

    Howe, D K

    2001-02-26

    To accelerate genetic and molecular characterization of Sarcocystis neurona, the primary causative agent of equine protozoal myeloencephalitis (EPM), a sequencing project has been initiated that will generate approximately 7000-8000 expressed sequence tags (ESTs) from this apicomplexan parasite. Poly(A)(+) RNA was isolated from culture-derived S. neurona merozoites, and a cDNA library was constructed in a unidirectional lambda phage cloning vector. Sixty phage clones were randomly picked from the library, and the cDNA inserts were amplified from these clones using the T3 and T7 primers that flank the multi-cloning site of the lambda vector. This analysis demonstrated that 100% (60/60) of the clones selected from this library contained recombinant cDNA inserts ranging in size from 0.4 to 4.0 kilobases (kb) with an average size of 1.23kb. Single-pass sequencing from the 5' end of the 60 amplified cDNAs produced high-quality nucleotide sequence from 53 of the clones. Comparison of these ESTs to the current gene databases revealed significant matches for 10 of the ESTs, six of which are similar to sequences from other Apicomplexa (i.e., Toxoplasma gondii). Importantly, none of the ESTs were of obvious mammalian origin, thus indicating that the cDNAs in this library were derived primarily from parasite mRNA and not from mRNA of the bovine turbinate host cells. Collectively, these data indicate that the described cDNA library will provide an excellent substrate for generating a portion of the ESTs that are planned from S. neurona. This sequencing project will greatly hasten gene discovery for this protozoan pathogen thereby enhancing efforts towards the development of improved diagnostics, treatments, and preventatives for EPM. In addition, the S. neurona ESTs will represent a significant contribution to the extensive database of sequences from the Apicomplexa. Comparative analyses of these apicomplexan sequences will likely offer a multitude of important information

  15. Identification of grapevine rootstock cultivars using expressed sequence tag-simple sequence repeats.

    PubMed

    Fan, X C; Chu, J Q; Liu, C H; Sun, X; Fang, J G

    2014-09-26

    Grapevine (Vitis) rootstock varieties or cultivars are used to confer resistance and tolerance to insect and disease pests, unfavorable soil conditions, and other environmental conditions to cultivars that are susceptible to these conditions but otherwise have desired properties. The need to genotype and thoroughly identify grapevine rootstock varieties in the grape industry has become increasingly critical as more and more varieties are bred or selected. Although DNA markers have advantageous applications in plant identification, markers developed from classic DNA fingerprint analysis methods are not practical for plant cultivar identification. The manual cultivar identification diagram (MCID), which was previously developed in our research group, has been shown to select DNA markers that are relatively more exploitable in identifications of genotyped plant individuals. Using this MCID strategy and expressed sequence tag-simple sequence repeat (EST-SSR) markers, we identified 22 grapevine rootstock cultivars of diverse origin. All cultivars were clearly separated by fingerprints of seven pairs of EST-SSR primers and the grapevine rootstock CID (V-R-CID) generated is both practical and referable for the identification of any grapevine rootstock cultivars studied here. Furthermore, fewer primers can be used to distinguish all cultivars using this approach since the fingerprint from each primer pair could be used several times once it is generated. This initial version of V-R-CID can be made more informative with the identification and incorporation of more cultivars, thus providing better service to the grape industry.

  16. Hierarchical molecular tagging to resolve long continuous sequences by massively parallel sequencing

    PubMed Central

    Lundin, Sverker; Gruselius, Joel; Nystedt, Björn; Lexow, Preben; Käller, Max; Lundeberg, Joakim

    2013-01-01

    Here we demonstrate the use of short-read massive sequencing systems to in effect achieve longer read lengths through hierarchical molecular tagging. We show how indexed and PCR-amplified targeted libraries are degraded, sub-sampled and arrested at timed intervals to achieve pools of differing average length, each of which is indexed with a new tag. By this process, indices of sample origin, molecular origin, and degree of degradation are incorporated in order to achieve a nested hierarchical structure, later to be utilized in the data processing to order the reads over a longer distance than the sequencing system originally allows. With this protocol we show how continuous regions beyond 3000 bp can be decoded by an Illumina sequencing system, and we illustrate the potential applications by calling variants of the lambda genome, analysing TP53 in cancer cell lines, and targeting a variable canine mitochondrial region. PMID:23470464

  17. Expressed sequence tags: normalization and subtraction of cDNA libraries.

    PubMed

    Soares, Marcelo Bento; de Fatima Bonaldo, Maria; Hackett, Jeremiah D; Bhattacharya, Debashish

    2009-01-01

    Expressed Sequence Tags (ESTs) provide a rapid and efficient approach for gene discovery and analysis of gene expression in eukaryotes. ESTs have also become particularly important with recent expanded efforts in complete genome sequencing of understudied, nonmodel eukaryotes such as protists and algae. For these projects, ESTs provide an invaluable source of data for gene identification and prediction of exon-intron boundaries. The generation of EST data, although straightforward in concept, nonetheless requires great care to ensure the highest efficiency and return for the investment in time and funds. To this end, key steps in the process include generation of a normalized cDNA library to facilitate a high gene discovery rate followed by serial subtraction of normalized libraries to maintain the discovery rate. Here we describe, in detail, protocols for normalization and subtraction of cDNA libraries followed by an example using the toxic dinoflagellate Alexandrium tamarense.

  18. Perceptual learning of contrast discrimination under roving: the role of semantic sequence in stimulus tagging.

    PubMed

    Cong, Lin-Juan; Zhang, Jun-Yun

    2014-11-03

    Perceptual learning may occur when multiple contrasts are practiced in a fixed, but not in a roving (random), temporal sequence. However, learning may escape roving disruption when each contrast is assigned a letter tag (i.e., A, B, C, D). Because these letter tags carry not only stimulus identity information, but also semantic sequence information, here we investigated whether the semantic sequence information is necessary for learning of tagged contrasts under the roving condition. We found that assigning number tags (i.e., 1, 2, 3, 4), which also contained both identity and semantic sequence information, to four roving contrasts enabled significant learning of discrimination of each contrast, confirming previous data. However, learning became insignificant when the contrast tags were replaced with Greek letters that were familiar to our Chinese observers except their sequence or Chinese characters that carried no sequence information. In addition, assigning orientation tags, which carried no sequence information either, to roving contrasts was ineffective as well because learning occurred only with sequenced but not roving contrasts. These results suggest that semantic sequence information is necessary for stimulus tagging to effectively enable perceptual learning of multiple contrast discrimination under roving.

  19. Comprehensive Functional Analyses of Expressed Sequence Tags in Common Wheat (Triticum aestivum)

    PubMed Central

    Manickavelu, Alagu; Kawaura, Kanako; Oishi, Kazuko; Shin-I, Tadasu; Kohara, Yuji; Yahiaoui, Nabila; Keller, Beat; Abe, Reina; Suzuki, Ayako; Nagayama, Taishi; Yano, Kentaro; Ogihara, Yasunari

    2012-01-01

    About 1 million expressed sequence tag (EST) sequences comprising 125.3 Mb nucleotides were accreted from 51 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including abiotic stresses and pathogen challenges in common wheat (Triticum aestivum). Expressed sequence tags were assembled with stringent parameters after processing with in-house scripts, resulting in 37 138 contigs and 215 199 singlets. In the assembled sequences, 10.6% presented no matches with existing sequences in public databases. Functional characterization of wheat unigenes by gene ontology annotation, mining transcription factors, full-length cDNA, and miRNA targeting sites was carried out. A bioinformatics strategy was developed to discover single-nucleotide polymorphisms (SNPs) within our large EST resource, and the SNPs between and within (homoeologous) cultivars were reported. Digital gene expression was performed to find the tissue-specific gene expression, and correspondence analysis was executed to identify common and specific gene expression by selecting four biotic stress-related libraries. The assembly and associated information provide a framework for future investigation in functional genomics. PMID:22334568

  20. Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

    PubMed Central

    Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

    2010-01-01

    Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087

  1. Shotgun sequencing of the human transcriptome with ORF expressed sequence tags

    PubMed Central

    Dias Neto, Emmanuel; Garcia Correa, Ricardo; Verjovski-Almeida, Sergio; Briones, Marcelo R. S.; Nagai, Maria Aparecida; da Silva, Wilson; Zago, Marco Antonio; Bordin, Silvana; Costa, Fernando Ferreira; Goldman, Gustavo Henrique; Carvalho, Alex F.; Matsukuma, Adriana; Baia, Gilson S.; Simpson, David H.; Brunstein, Adriana; de Oliveira, Paulo S. L.; Bucher, Philipp; Jongeneel, C. Victor; O'Hare, Michael J.; Soares, Fernando; Brentani, Ricardo R.; Reis, Luis F. L.; de Souza, Sandro J.; Simpson, Andrew J. G.

    2000-01-01

    Theoretical considerations predict that amplification of expressed gene transcripts by reverse transcription–PCR using arbitrarily chosen primers will result in the preferential amplification of the central portion of the transcript. Systematic, high-throughput sequencing of such products would result in an expressed sequence tag (EST) database consisting of central, generally coding regions of expressed genes. Such a database would add significant value to existing public EST databases, which consist mostly of sequences derived from the extremities of cDNAs, and facilitate the construction of contigs of transcript sequences. We tested our predictions, creating a database of 10,000 sequences from human breast tumors. The data confirmed the central distribution of the sequences, the significant normalization of the sequence population, the frequent extension of contigs composed of existing human ESTs, and the identification of a series of potentially important homologues of known genes. This approach should make a significant contribution to the early identification of important human genes, the deciphering of the draft human genome sequence currently being compiled, and the shotgun sequencing of the human transcriptome. PMID:10737800

  2. Immunological responses of turbot (Psetta maxima) to nodavirus infection or polyriboinosinic polyribocytidylic acid (pIC) stimulation, using expressed sequence tags (ESTs) analysis and cDNA microarrays.

    PubMed

    Park, Kyoung C; Osborne, Jane A; Montes, Ariana; Dios, Sonia; Nerland, Audun H; Novoa, Beatriz; Figueras, Antonio; Brown, Laura L; Johnson, Stewart C

    2009-01-01

    To investigate the immunological responses of turbot to nodavirus infection or pIC stimulation, we constructed cDNA libraries from liver, kidney and gill tissues of nodavirus-infected fish and examined the differential gene expression within turbot kidney in response to nodavirus infection or pIC stimulation using a turbot cDNA microarray. Turbot were experimentally infected with nodavirus and samples of each tissue were collected at selected time points post-infection. Using equal amounts of total RNA at each sampling time, we made three tissue-specific cDNA libraries. After sequencing 3230 clones we obtained 3173 (98.2%) high quality sequences from our liver, kidney and gill libraries. Of these 2568 (80.9%) were identified as known genes and 605 (19.1%) as unknown genes. A total of 768 unique genes were identified. The two largest groups resulting from the classification of ESTs according to function were the cell/organism defense genes (71 uni-genes) and apoptosis-related process (23 uni-genes). Using these clones, a 1920 element cDNA microarray was constructed and used to investigate the differential gene expression within turbot in response to experimental nodavirus infection or pIC stimulation. Kidney tissue was collected at selected times post-infection (HPI) or stimulation (HPS), and total RNA was isolated for microarray analysis. Of the 1920 genes studied on the microarray, we identified a total of 121 differentially expressed genes in the kidney: 94 genes from nodavirus-infected animals and 79 genes from those stimulated with pIC. Within the nodavirus-infected fish we observed the highest number of differentially expressed genes at 24 HPI. Our results indicate that certain genes in turbot have important roles in immune responses to nodavirus infection and dsRNA stimulation.

  3. Grouping and identification of sequence tags (GRIST): bioinformatics tools for the NEIBank database.

    PubMed

    Wistow, Graeme; Bernstein, Steven L; Touchman, Jeffrey W; Bouffard, Gerald; Wyatt, M Keith; Peterson, Katherine; Behal, Amita; Gao, James; Buchoff, Patee; Smith, Don

    2002-06-15

    NEIBank is a project to develop and organize genomics and bioinformatics resources for the eye. As part of this effort, tools have been developed for bioinformatics analysis and web based display of data from expressed sequence tag (EST) analyses. EST sequences are identified and formed into groups or clusters representing related transcripts from the same gene. This is carried out by a rules-based procedure called GRIST (GRouping and Identification of Sequence Tags) that uses sequence match parameters derived from BLAST programs. Linked procedures are used to eliminate non-mRNA contaminants. All data are stored in a relational database and assembled for display as web pages with annotations and links to other informatics resources. Genome projects generate huge amounts of data that need to be classified and organized to become easily accessible to the research community. GRIST provides a useful tool for assembling and displaying the results of EST analyses. The NEIBank web site contains a growing set of pages cataloging the known transcriptional repertoire of eye tissues, derived from new NEIBank cDNA libraries and from eye-related data deposited in the dbEST section of GenBank. PMID:12107414
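
    The grouping step described above, in which ESTs are merged into clusters on the basis of pairwise sequence matches, can be illustrated with a small union-find sketch. This is not the GRIST rule set, which the abstract does not spell out: the function name cluster_by_hits, the hit tuple layout (query, subject, percent identity, alignment length) and both thresholds are assumptions chosen for illustration only.

```python
def cluster_by_hits(ids, hits, min_identity=95.0, min_length=100):
    """Group sequence IDs into clusters using pairwise match records.

    ids  : iterable of sequence identifiers
    hits : iterable of (id_a, id_b, percent_identity, alignment_length)
    The thresholds are hypothetical, not GRIST's actual parameters.
    """
    ids = list(ids)
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, length in hits:
        # Only matches passing both rules link two ESTs together.
        if identity >= min_identity and length >= min_length:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    # Largest clusters first; ties broken by member names.
    return sorted(clusters.values(), key=lambda c: (-len(c), c))


example_ids = ["est1", "est2", "est3", "est4"]
example_hits = [
    ("est1", "est2", 98.0, 250),  # passes both rules
    ("est2", "est3", 96.5, 120),  # passes both rules
    ("est3", "est4", 80.0, 300),  # identity too low, no link
]
example_clusters = cluster_by_hits(example_ids, example_hits)
print(example_clusters)
```

    Linking is transitive here, so est1 and est3 share a cluster even though they have no direct match; whether a production pipeline wants that behaviour is a design choice.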

  4. Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)

    DOE Data Explorer

    Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.

    Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]
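
    The idea of reading fixed-length tags at fixed positions relative to restriction sites can be mimicked in silico. The sketch below is a simplification under stated assumptions: the recognition site "CATG" is a placeholder for whichever fragmenting enzyme is used, the tag is taken as the 21 bases immediately downstream of each site on one strand only, and the real MmeI cut geometry and adaptor chemistry are ignored. The function name extract_tags is hypothetical.

```python
from collections import Counter


def extract_tags(genome: str, site: str = "CATG", tag_len: int = 21) -> Counter:
    """Count fixed-length tags found directly downstream of each site.

    Assumptions: single strand, tag starts right after the site,
    and the default site is only an illustrative placeholder.
    """
    genome = genome.upper()
    tags = Counter()
    start = genome.find(site)
    while start != -1:
        tag_start = start + len(site)
        tag = genome[tag_start:tag_start + tag_len]
        # Skip truncated tags at the end of the sequence.
        if len(tag) == tag_len:
            tags[tag] += 1
        # Advance by one so overlapping sites are also found.
        start = genome.find(site, start + 1)
    return tags


# Toy sequence: two complete tags and one site too close to the end.
toy_genome = "AACATG" + "A" * 21 + "CATG" + "C" * 21 + "CATG" + "GG"
toy_profile = extract_tags(toy_genome)
for tag, count in toy_profile.most_common():
    print(tag, count)
```

    The resulting counter plays the role of the tag profile: each key is a tag sequence and each value its abundance.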

  5. Paired-end genomic signature tags: a method for the functional analysis of genomes and epigenomes.

    PubMed

    Dunn, John J; McCorkle, Sean R; Everett, Logan; Anderson, Carl W

    2007-01-01

    Because paired-end genomic signature tags are sequence-based, they have the potential to become an alternate tool to tiled microarray hybridization as a method for genome-wide localization of transcription factors and other sequence-specific DNA binding proteins. As outlined here the method also can be used for global analysis of DNA methylation. One advantage of this approach is the ability to easily switch between different genome types without having to fabricate a new microarray for each and every DNA type. However, the method does have some disadvantages. Among the most rate-limiting steps of our PE-GST protocol are the need to concatemerize the diTAGs, size fractionate them and then clone them prior to sequencing. This is usually followed by additional steps to amplify and size select for long (≥500) concatemer inserts prior to sequencing. These time-consuming steps are important for standard DNA sequencing as they increase efficiency approximately 20-30-fold since each amplified concatemer can now provide information on multiple tags; the limitation on data acquisition is read length during sequencing. However, the development of new sequencing methods such as Life Sciences' 454 new nanotechnology-based sequencing instrument (41) could increase tag sequencing efficiency by several orders of magnitude (≥100,000 diTAG reads/run), which is sufficient to provide in-depth global analysis of all ChIP PE-GSTs in a single run. This is because the lengths of our paired-end diTAGs (approximately 60 bp) fall well within the region of high accuracy for read lengths on this instrument. In principle, sequence analysis of diTAGs could begin as soon as they are generated, thereby completely bypassing the need for the concatemerization, sizing, downstream cloning steps and sequencing template purification. In addition, our protocol places any one of several unique four-base long nucleotide sequences, such as GATC, between each and every diTAG pair, which could

  6. From expressed sequence tags to 'epigenomics': an understanding of disease processes.

    PubMed

    Zweiger, G; Scott, R W

    1997-12-01

    Expressed sequence tags (ESTs) are at the forefront of technological change that is sweeping the biomedical research community. ESTs provide a high throughput means for identifying gene transcripts and monitoring complex gene expression patterns. EST-based technologies coupled with sophisticated computer analysis tools enable the informational content and output of the genome to be accessed and evaluated on a scale immensely larger than previously possible. EST-based technologies are being used to understand disease processes and to find better disease treatments, and will allow biology to move from single gene to multigene, or even more complex epigenetic, explanations for disease.

  7. Expressed sequence tags from a NaCl-treated Suaeda salsa cDNA library.

    PubMed

    Zhang, L; Ma, X L; Zhang, Q; Ma, C L; Wang, P P; Sun, Y F; Zhao, Y X; Zhang, H

    2001-04-18

    Past efforts to improve plant tolerance to osmotic stress have had limited success owing to the genetic complexity of stress responses. The first step towards cataloging and categorizing genetically complex abiotic stress responses is the rapid discovery of genes by the large-scale partial sequencing of randomly selected cDNA clones or expressed sequence tags (ESTs). Suaeda salsa, which can survive seawater-level salinity, is a favorite halophytic model for salt-tolerance research. We constructed a NaCl-treated cDNA library of Suaeda salsa and sequenced 1048 randomly selected clones, out of which 1016 clones produced readable sequences (773 showed homology to previously identified genes, 227 matched unknown protein coding regions, 16 anomalous sequences or sequences of bacterial origin were excluded from further analysis). By sequence analysis we identified 492 unique clones: 315 showed homology to previously identified genes, 177 matched unknown protein coding regions (101 of which have been found before in other organisms and 76 are completely novel). All our EST data are available on the Internet. We believe that our dbEST and the associated DNA materials will be a useful source to scientists engaged in stress-tolerance studies. PMID:11313146

  8. In silico analysis of three different tag polypeptides with dual roles in scFv antibodies.

    PubMed

    Mohammadi, Mozafar; Nejatollahi, Foroogh; Sakhteman, Amirhossein; Zarei, Neda

    2016-08-01

    Single chain fragment variable (scFv) antibodies are composed of variable heavy (VH) and variable light (VL) domains that are joined by a polypeptide linker. Typically, a [(Gly4Ser) n] sequence is used as a linker to retain the integrity of the antigen-binding domain. Due to its low immunogenicity, this sequence cannot be used as a tag for scFv detection and purification. Several lines of evidence have shown that the addition of an N- or C-terminal tag for scFv detection and purification will result in decreased expression and binding capacity of this antibody fragment. In this study, we substituted the traditional linker (GGGGS) with His-tag, C-myc or E-tag sequences through molecular modeling. Stability and integrity of all models were assessed by molecular dynamic (MD) simulation. Based on MD simulation analysis, the model containing the E-tag sequence as a linker indicated more stability compared to other molecules. The results suggest that E-tag not only can be substituted for the traditional linker, but also eliminates the necessity of using an additional tag for scFv detection and purification. PMID:27113782

  9. A method to introduce an internal tag sequence into a Salmonella chromosomal gene.

    PubMed

    Zhao, Weidong; Méresse, Stéphane

    2015-01-01

    Epitope tags are short peptide sequences that are particularly useful for the characterization of proteins against which no antibody has been developed. The influenza hemagglutinin (HA) tag is one of the most widely used epitope tags, as several valuable monoclonal and polyclonal antibodies that can be used in various techniques are commercially available. Therefore, adding an HA tag to a protein of interest is quite helpful for obtaining rapid and inexpensive information regarding its localization, its expression or its biological function. In this chapter, we describe a process, derived from the Datsenko and Wanner procedure, which allows the introduction of an internal 2HA tag sequence into a chromosomal gene of the bacterial pathogen Salmonella.

  10. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGESBeta

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  11. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  13. Generation and analysis of a large-scale expressed sequence tags from a full-length enriched cDNA library of Siberian tiger (Panthera tigris altaica).

    PubMed

    Guo, Yu; Liu, Changqing; Lu, Taofeng; Liu, Dan; Bai, Chunyu; Li, Xiangchen; Ma, Yuehui; Guan, Weijun

    2014-05-15

    In this study, a full-length enriched cDNA library was successfully constructed from Siberian tiger, the world's most endangered species. The titers of primary and amplified libraries were 1.28×10(6)pfu/mL and 1.59×10(10)pfu/mL respectively. The proportion of recombinants from unamplified library was 91.3% and the average length of exogenous inserts was 1.06kb. A total of 279 individual ESTs with sizes ranging from 316 to 1258bps were then analyzed. Furthermore, 204 unigenes were successfully annotated and involved in 49 functions of the GO classification; cell (175, 85.5%), cellular process (165, 80.9%), and binding (152, 74.5%) are the dominant terms. 198 unigenes were assigned to 156 KEGG pathways, and the pathways with the most representation are metabolic pathways (18, 9.1%). The proportion pattern of each COG subcategory was similar among Panthera tigris altaica, P. tigris tigris and Homo sapiens, and general function prediction only cluster (44, 15.8%) represents the largest group, followed by translation, ribosomal structure and biogenesis (33, 11.8%), replication, recombination and repair (24, 8.6%), and only 7.2% ESTs classified as novel genes. Moreover, the recombinant plasmid pET32a-TAT-COL6A2 was constructed, encoding the Trx-TAT-COL6A2 fusion protein with two 6× His-tags at the N and C termini. After BCA assay, the concentration of soluble Trx-TAT-COL6A2 recombinant protein was 2.64±0.18mg/mL. This library will provide a useful platform for functional genome and transcriptome research on P. tigris and other felid animals in the future. PMID:24630959

  14. Rapid in silico cloning of genes using expressed sequence tags (ESTs).

    PubMed

    Gill, R W; Sanseau, P

    2000-01-01

    Expressed sequence tags (ESTs) are short single-pass DNA sequences obtained from either end of cDNA clones. These ESTs are derived from a vast number of cDNA libraries obtained from different species. Human ESTs make up the bulk of the data and have been widely used to identify new members of gene families, as markers on the human chromosomes, to discover polymorphism sites and to compare expression patterns in different tissues or pathological states. Information strategies have been devised to query EST databases. Since most of the analysis is performed with a computer, the term "in silico" strategy has been coined. In this chapter we will review the current status of EST databases, the pros and cons of EST-type data and describe possible strategies to retrieve meaningful information. PMID:10874996

  15. Phylogeny of Saccharina and Laminaria (Laminariaceae, Laminariales, Phaeophyta) in sequence-tagged-site markers

    NASA Astrophysics Data System (ADS)

    Qu, Jieqiong; Zhang, Jing; Wang, Xumin; Chi, Shan; Liu, Cui; Liu, Tao

    2014-01-01

    Laminaria and Saccharina have recently been recognized as two independent clades from the former genus Laminaria. Traditional morphological taxonomy is being challenged by molecular evidence from both nucleus and plastid. Intensive work is in great demand from the perspective of genome colinearity. In this study, 118 sequence-tagged site (STS) markers were screened for phylogenetic analyses, 29 based on genome sequences, while 89 were based on expressed sequence tag (EST) sequences. EST-based STS marker development (29.37%) had an efficiency twice as high as genome-sequence-based development (9.48%) as a result of high conservation of gene transcripts among the related species. S. ochotensis, S. religiosa, S. japonica, and L. hyperborea showed great homogeneity in all 118 STS markers. Our result supports the view that the diversification between the genera Saccharina and Laminaria was a more recent event and that Saccharina and Laminaria shared high phylogenetic affinity. However, when it came to the single nucleotide polymorphism (SNP) level among the 41 SNPs, L. hyperborea owned 29 unique SNPs against 12 within the remaining three Saccharina species and 12 of the 13 indels were supposedly unique for L. hyperborea, indicated by its high variability. Originating from homologous ancestors, species between the recently diverged genera Laminaria and Saccharina may have taken in enough mutations at the SNP level only, in spite of different evolutionary strategies for better adaptation to the environment. Our study lays a solid foundation from a new perspective, although more accurate phylogenetic analysis is still needed to clarify the evolutionary traces between the genera Saccharina and Laminaria.

  16. Construction of a chromosome-assigned, sequence-tagged linkage map for the radish, Raphanus sativus L. and QTL analysis of morphological traits.

    PubMed

    Hashida, Tomoko; Nakatsuji, Ryoichi; Budahn, Holger; Schrader, Otto; Peterka, Herbert; Fujimura, Tatsuhito; Kubo, Nakao; Hirai, Masashi

    2013-06-01

    The radish displays great morphological variation but the genetic factors underlying this variability are mostly unknown. To identify quantitative trait loci (QTLs) controlling radish morphological traits, we cultivated 94 F4 and F5 recombinant inbred lines derived from a cross between the rat-tail radish and the Japanese radish cultivar 'Harufuku' inbred lines. Eight morphological traits (ovule and seed numbers per silique, plant shape, pubescence and root formation) were measured for investigation. We constructed a map composed of 322 markers with a total length of 673.6 cM. The linkage groups were assigned to the radish chromosomes using disomic rape-radish chromosome-addition lines. On the map, eight and 10 QTLs were identified in 2008 and 2009, respectively. The chromosome-linkage group correspondence, the sequence-specific markers and the QTLs detected here will provide useful information for further genetic studies and for selection during radish breeding programs.

  17. Development of an Expressed Sequence Tag (EST) Resource for Wheat (Triticum aestivum L.)

    PubMed Central

    Lazo, G. R.; Chao, S.; Hummel, D. D.; Edwards, H.; Crossman, C. C.; Lui, N.; Matthews, D. E.; Carollo, V. L.; Hane, D. L.; You, F. M.; Butler, G. E.; Miller, R. E.; Close, T. J.; Peng, J. H.; Lapitan, N. L. V.; Gustafson, J. P.; Qi, L. L.; Echalier, B.; Gill, B. S.; Dilbirligi, M.; Randhawa, H. S.; Gill, K. S.; Greene, R. A.; Sorrells, M. E.; Akhunov, E. D.; Dvořák, J.; Linkiewicz, A. M.; Dubcovsky, J.; Hossain, K. G.; Kalavacharla, V.; Kianian, S. F.; Mahmoud, A. A.; Miftahudin; Ma, X.-F.; Conley, E. J.; Anderson, J. A.; Pathan, M. S.; Nguyen, H. T.; McGuire, P. E.; Qualset, C. O.; Anderson, O. D.

    2004-01-01

    This report describes the rationale, approaches, organization, and resource development leading to a large-scale deletion bin map of the hexaploid (2n = 6x = 42) wheat genome (Triticum aestivum L.). Accompanying reports in this issue detail results from chromosome bin-mapping of expressed sequence tags (ESTs) representing genes onto the seven homoeologous chromosome groups and a global analysis of the entire mapped wheat EST data set. Among the resources developed were the first extensive public wheat EST collection (113,220 ESTs). Described are protocols for sequencing, sequence processing, EST nomenclature, and the assembly of ESTs into contigs. These contigs plus singletons (unassembled ESTs) were used for selection of distinct sequence motif unigenes. Selected ESTs were rearrayed, validated by 5′ and 3′ sequencing, and amplified for probing a series of wheat aneuploid and deletion stocks. Images and data for all Southern hybridizations were deposited in databases and were used by the coordinators for each of the seven homoeologous chromosome groups to validate the mapping results. Results from this project have established the foundation for future developments in wheat genomics. PMID:15514037
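
    The idea of grouping ESTs into contigs and leaving the rest as singletons can be illustrated with a toy sketch. This is not the assembler used by the wheat EST project (the abstract does not name one); the function name `cluster_ests`, the shared k-mer rule, and the value of `k` are all assumptions made for illustration, and a real pipeline would also handle reverse complements, sequencing errors, and consensus building.

```python
def cluster_ests(ests, k=8):
    """Group ESTs that share at least one exact k-mer (toy stand-in for assembly).

    Returns (clusters, singletons): clusters is a list of lists of EST indices
    with two or more members; singletons is a list of indices left ungrouped.
    """
    parent = list(range(len(ests)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first EST that contained it
    for idx, seq in enumerate(ests):
        seq = seq.upper()
        for j in range(len(seq) - k + 1):
            kmer = seq[j:j + k]
            if kmer in first_seen:
                # Merge this EST's group with the group that first held the k-mer.
                parent[find(idx)] = find(first_seen[kmer])
            else:
                first_seen[kmer] = idx

    groups = {}
    for idx in range(len(ests)):
        groups.setdefault(find(idx), []).append(idx)

    clusters = [members for members in groups.values() if len(members) > 1]
    singletons = [members[0] for members in groups.values() if len(members) == 1]
    return clusters, singletons


# Hypothetical reads: the first two overlap, the third does not.
reads = ["ACGTACGTTGCA", "GTACGTTGCAAT", "TTTTGGGGCCCC"]
clusters, singletons = cluster_ests(reads, k=8)
unigene_count = len(clusters) + len(singletons)
```

    In this sketch the unigene count is simply the number of multi-member groups plus the number of singletons.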

  18. Refined annotation of the Arabidopsis genome by complete expressed sequence tag mapping.

    PubMed

    Zhu, Wei; Schlueter, Shannon D; Brendel, Volker

    2003-06-01

    Expressed sequence tags (ESTs) currently encompass more entries in the public databases than any other form of sequence data. Thus, EST data sets provide a vast resource for gene identification and expression profiling. We have mapped the complete set of 176,915 publicly available Arabidopsis EST sequences onto the Arabidopsis genome using GeneSeqer, a spliced alignment program incorporating sequence similarity and splice site scoring. About 96% of the available ESTs could be properly aligned with a genomic locus, with the remaining ESTs deriving from organelle genomes and non-Arabidopsis sources or displaying insufficient sequence quality for alignment. The mapping provides verified sets of EST clusters for evaluation of EST clustering programs. Analysis of the spliced alignments suggests corrections to current gene structure annotation and provides examples of alternative and non-canonical pre-mRNA splicing. All results of this study were parsed into a database and are accessible via a flexible Web interface at http://www.plantgdb.org/AtGDB/.

  19. Development of peanut expressed sequence tag-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  20. Development of peanut EST (expressed sequence tag)-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  1. Genomic analysis of cultivated barley (Hordeum vulgare) using sequence-tagged molecular markers. Estimates of divergence based on RFLP and PCR markers derived from stress-responsive genes, and simple-sequence repeats (SSRs).

    PubMed

    Maestri, E; Malcevschi, A; Massari, A; Marmiroli, N

    2002-04-01

    Three types of molecular markers have been compared for their utility in evaluating genetic diversity among cultivars of Hordeum vulgare. Restriction fragment length polymorphisms at 71 sites were scored with the aid of probes corresponding to stress-responsive genes from barley and wheat, coding for a low-molecular-weight heat shock protein, a dehydrin, an aldose reductase homolog, and an 18.9-kDa drought-induced protein of unknown function. Indexes of genetic diversity computed in the total sample and within groups of cultivars (two-rowed and six-rowed, winter and spring varieties) indicated high values of genetic differentiation (F(ST) > 15%). A second assessment of genetic diversity was performed by PCR amplification of genomic DNA using as primers 13 arbitrary oligonucleotides derived from sequences of the same stress-responsive genes. A high degree of polymorphism was uncovered using these markers also, but they yielded low values for F(ST) (< 7%) among groups of cultivars. Finally, 15 different simple-sequence repeats (AC or AG) were amplified with primers based on unique flanking sequences. Levels of polymorphism and differentiation between groups of cultivars revealed by these markers were quite high. Ordination techniques applied to measures of genetic distance among cultivars demonstrated a remarkable ability of the RFLPs associated with stress-responsive genes to discriminate on the basis of growth habit. The correlation with production data for the cultivars in different environments was also significant. This "functional genomics" strategy was therefore as informative as the "structural genomics" (SSR-based) approach, but requires the analysis of fewer probes. PMID:11976962

  3. Gene ontology based characterization of expressed sequence tags (ESTs) of Brassica rapa cv. Osome.

    PubMed

    Arasan, Senthil Kumar Thamil; Park, Jong-In; Ahmed, Nasar Uddin; Jung, Hee-Jeong; Lee, In-Ho; Cho, Yong-Gu; Lim, Yong-Pyo; Kang, Kwon-Kyoo; Nou, Ill-Sup

    2013-07-01

    Chinese cabbage (Brassica rapa) is widely recognized for its economic importance and contribution to human nutrition, but abiotic and biotic stresses are the main obstacles to its quality, nutritional status and production. In this study, 3,429 Expressed Sequence Tag (EST) sequences were generated from a B. rapa cv. Osome cDNA library and the unique transcripts were classified functionally using a gene ontology (GO) hierarchy and the Kyoto encyclopedia of genes and genomes (KEGG). KEGG orthology and the structural domain data were obtained from the biological database for stress related genes (SRG). EST datasets provided a wide outlook of functional characterization of B. rapa cv. Osome. In silico analysis revealed 83% of ESTs to be well annotated towards reeds one dimensional concept. Clustering of ESTs returned 333 contigs and 2,446 singlets, giving a total of 3,284 putative unigene sequences. This dataset contained 1,017 EST sequences functionally annotated to stress responses, from which the expression of randomly selected SRGs was analyzed against cold, salt, drought, ABA, water and PEG stresses. Most of the SRGs showed differential expression against these stresses. Thus, the EST dataset is very important for discovering the potential genes related to stress resistance in Chinese cabbage, and can be a useful resource for genetic engineering of Brassica sp. PMID:23898551

  4. Mining expressed sequence tag (EST) libraries for cancer-associated genes.

    PubMed

    Schmitt, Armin O

    2010-01-01

    Originally established in the beginning of the 1990s as a direct route to gene finding, expressed sequence tags (ESTs) still lend themselves as a means to analyze gene expression in almost all human tissues. The type of questions that can be addressed using public EST libraries ranges from tissue-specific gene profiling to the comparison between tissues in diseased and healthy states. Thanks to a multitude of web-based online bioinformatics resources, mining in EST libraries is not restricted to experts in the field of data analysis, but can readily be performed by the medical or life scientist. In this chapter, a couple of case studies are presented that guide the scientist to the most useful online resources so that they can conduct their own research.

  5. Peanut (Arachis hypogaea) Expressed Sequence Tag Project: Progress and Application

    PubMed Central

    Feng, Suping; Wang, Xingjun; Zhang, Xinyou; Dang, Phat M.; Holbrook, C. Corley; Culbreath, Albert K.; Wu, Yaoting; Guo, Baozhu

    2012-01-01

    Many plant ESTs have been sequenced as an alternative to whole genome sequences, including peanut because of the genome size and complexity. The US peanut research community had the historic 2004 Atlanta Genomics Workshop and named the EST project as a main priority. As of August 2011, the peanut research community had deposited 252,832 ESTs in the public NCBI EST database, and this resource has been providing the community valuable tools and core foundations for various genome-scale experiments before the whole genome sequencing project. These EST resources have been used for marker development, gene cloning, microarray gene expression and genetic map construction. Certainly, the peanut EST sequence resources have been shown to have a wide range of applications and accomplished its essential role at the time of need. Then the EST project contributes to the second historic event, the Peanut Genome Project 2010 Inaugural Meeting also held in Atlanta where it was decided to sequence the entire peanut genome. After the completion of peanut whole genome sequencing, ESTs or transcriptome will continue to play an important role to fill in knowledge gaps, to identify particular genes and to explore gene function. PMID:22745594

  6. AGIA Tag System Based on a High Affinity Rabbit Monoclonal Antibody against Human Dopamine Receptor D1 for Protein Analysis

    PubMed Central

    Yano, Tomoya; Takeda, Hiroyuki; Uematsu, Atsushi; Yamanaka, Satoshi; Nomura, Shunsuke; Nemoto, Keiichirou; Iwasaki, Takahiro; Takahashi, Hirotaka; Sawasaki, Tatsuya

    2016-01-01

    Polypeptide tag technology is widely used for protein detection and affinity purification. It consists of two fundamental elements: a peptide sequence and a binder which specifically binds to the peptide tag. In many tag systems, antibodies have been used as binder due to their high affinity and specificity. Recently, we obtained clone Ra48, a high-affinity rabbit monoclonal antibody (mAb) against dopamine receptor D1 (DRD1). Here, we report a novel tag system composed of Ra48 antibody and its epitope sequence. Using a deletion assay, we identified EEAAGIARP in the C-terminal region of DRD1 as the minimal epitope of Ra48 mAb, and we named this sequence the “AGIA” tag, based on its central sequence. The tag sequence does not include the four amino acids, Ser, Thr, Tyr, or Lys, which are susceptible to post-translational modification. We demonstrated performance of this new tag system in biochemical and cell biology applications. SPR analysis demonstrated that the affinity of the Ra48 mAb to the AGIA tag was 4.90 × 10^−9 M. AGIA tag showed remarkably high sensitivity and specificity in immunoblotting. A number of AGIA-fused proteins overexpressed in animal and plant cells were detected by anti-AGIA antibody in immunoblotting and immunostaining with low background, and were immunoprecipitated efficiently. Furthermore, a single amino acid substitution of the second Glu to Asp (AGIA/E2D) enabled competitive dissociation of AGIA/E2D-tagged protein by adding wild-type AGIA peptide. It enabled one-step purification of AGIA/E2D-tagged recombinant proteins by peptide competition under physiological conditions. The sensitivity and specificity of the AGIA system makes it suitable for use in multiple methods for protein analysis. PMID:27271343

  7. Comparative mapping of expressed sequence tags containing microsatellites in rainbow trout (Oncorhynchus mykiss)

    PubMed Central

    Rexroad, Caird E; Rodriguez, Maria F; Coulibaly, Issa; Gharbi, Karim; Danzmann, Roy G; DeKoning, Jenefer; Phillips, Ruth; Palti, Yniv

    2005-01-01

    Background: Comparative genomics, through the integration of genetic maps from species of interest with whole genome sequences of other species, will facilitate the identification of genes affecting phenotypes of interest. The development of microsatellite markers from expressed sequence tags will serve to increase marker densities on current salmonid genetic maps and initiate in silico comparative maps with species whose genomes have been fully sequenced. Results: Eighty-nine polymorphic microsatellite markers were generated for rainbow trout of which at least 74 amplify in other salmonids. Fifty-five have been associated with functional annotation and 30 were mapped on existing genetic maps. Homologous sequences were identified for 20 of the EST containing microsatellites to identify comparative assignments within the tetraodon, mouse, and/or human genomes. Conclusion: The addition of microsatellite markers constructed from expressed sequence tag data will facilitate the development of high-density genetic maps for rainbow trout and comparative maps with other salmonids and better studied species. PMID:15836796

  8. De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins.

    PubMed

    Shen, Yufeng; Tolić, Nikola; Hixson, Kim K; Purvine, Samuel O; Anderson, Gordon A; Smith, Richard D

    2008-10-15

    De novo sequencing is a spectrum analysis approach for mass spectrometry data to discover post-translational modifications in proteins; however, such an approach is still in its infancy and is still not widely applied to proteomic practices due to its limited reliability. In this work, we describe a de novo sequencing approach for the discovery of protein modifications based on identification of the proteome UStags (Shen, Y.; Tolić, N.; Hixson, K. K.; Purvine, S. O.; Pasa-Tolić, L.; Qian, W. J.; Adkins, J. N.; Moore, R. J.; Smith, R. D. Anal. Chem. 2008, 80, 1871-1882). The de novo information was obtained from Fourier-transform tandem mass spectrometry data for peptides and polypeptides from a yeast lysate, and the de novo sequences obtained were selected based on filter levels designed to provide a limited yet high quality subset of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or in the UStags (i.e., the UStags' prefix and suffix sequences and the UStags themselves) were used to infer possible sequence modifications. With this de novo-UStag approach, we uncovered some unexpected variances within several yeast protein sequences due to amino acid mutations and/or multiple modifications to the predicted protein sequences. To determine false discovery rates, two random (false) databases were independently used for sequence matching, and ~3% false discovery rates were estimated for the de novo-UStag approach. The factors affecting the reliability (e.g., existence of de novo sequencing noise residues and redundant sequences) and the sensitivity of the approach were investigated and described. The combined de novo-UStag approach complements the UStag method previously reported by enabling the discovery of new protein modifications. PMID:18783246
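
    The decoy-database estimate mentioned above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function name `estimate_fdr`, the choice to average the two random databases, and the match counts in the usage lines are assumptions made for the example, since the abstract reports only the resulting rate of about 3%.

```python
def estimate_fdr(target_matches, decoy_matches_per_db):
    """Estimate a false discovery rate from matches against random (false) databases.

    target_matches: number of sequence matches against the real database.
    decoy_matches_per_db: match counts from each independently searched random
    database; averaging them is an assumption of this sketch.
    """
    if target_matches <= 0:
        raise ValueError("target_matches must be positive")
    if not decoy_matches_per_db:
        raise ValueError("at least one decoy database count is required")
    mean_decoy = sum(decoy_matches_per_db) / len(decoy_matches_per_db)
    return mean_decoy / target_matches


# Hypothetical counts chosen only to show the calculation.
fdr = estimate_fdr(target_matches=1000, decoy_matches_per_db=[28, 32])
```

    Searching more than one random database, as the abstract describes, gives a rough check on how stable the estimate is from one decoy set to the next.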

  9. Comprehensive Genetic Database of Expressed Sequence Tags for Coccolithophorids

    NASA Astrophysics Data System (ADS)

    Ranji, Mohammad; Hadaegh, Ahmad R.

    Coccolithophorids are unicellular, marine, golden-brown algae (Haptophyta) commonly found in near-surface waters in patchy distributions. They belong to the phytoplankton, which are known to be responsible for much of the earth reproduction. Phytoplankton, just like plants, live on the energy obtained by photosynthesis, which produces oxygen. A substantial amount of the oxygen in the earth's atmosphere is produced by phytoplankton through photosynthesis. The single-celled Emiliania huxleyi is the most commonly known species of Coccolithophorids and is known for extracting bicarbonate (HCO3) from its environment and producing calcium carbonate to form coccoliths. Coccolithophorids are one of the world's primary producers, contributing about 15% of the average oceanic phytoplankton biomass to the oceans. They produce elaborate, minute calcite platelets (coccoliths), covering the cell to form a coccosphere and supplying up to 60% of the bulk pelagic calcite deposited on the sea floors. In order to understand the genetics of Coccolithophorids and the complexities of their biochemical reactions, we decided to build a database to store a complete profile of these organisms' genomes. Although a variety of such databases currently exist (http://www.geneservice.co.uk/home/), none have yet been developed to comprehensively address the sequencing efforts underway by the Coccolithophorid research community. This database is called CocooExpress and is available to the public (http://bioinfo.csusm.edu) for both data queries and sequence contribution.

  10. Motion analysis of both ventricles using tagged MRI

    NASA Astrophysics Data System (ADS)

    Ozturk, Cengizhan; McVeigh, Elliot R.

    2000-04-01

    Although several methods exist for the analysis of tagged MRI images of the left ventricle (LV), analysis of the right ventricle (RV) remains challenging due to its complex anatomy and significant through-plane motion. We present here preliminary results of our new motion analysis method, both for RV and LV, in healthy human volunteers. In this method, following standard myocardial and tag segmentation of cardiac gated cine tagged MR images, a 4D B-spline based parametric motion field was computed for a volume of interest encompassing both ventricles. Using this motion field, 3D displacements and strains were calculated on the RV and LV. We observed that for both chambers the circumferential strain (Ecc) decreased at a constant rate throughout systole. The systolic strain rate displayed spatial similarity not only for the LV but also for the RV. For the RV free wall, mean systolic Ecc was -0.19 +/- 0.05 with an average coefficient of variability of 20%. The 4D B-spline based motion analysis technique for tagged MRI yields compatible results for the LV and gives consistent circumferential strain measures for the RV free wall. Tagged MRI based RV mechanical analysis can be used along with LV results for a more complete cardiac evaluation.

  11. Simple sequence repeat marker development from bacterial artificial chromosome end sequences and expressed sequence tags of flax (Linum usitatissimum L.).

    PubMed

    Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja

    2012-08-01

    Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs) of which, 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions with 673 (58 %) and 145 (42 %) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4 %) than the rest of the markers (42.7 %). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.
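
    The kind of motif search that underlies SSR discovery can be sketched with a regular expression. This is an illustrative sketch only, not the pipeline used in the flax study: the function names, the minimum repeat thresholds (6 for dinucleotides, 5 for trinucleotides), and the example sequence are assumptions. Rotations of a motif (for example AG/GA or AAT/ATA/TAA) are collapsed into one class, mirroring how the abstract groups motifs; reverse complements are not merged here.

```python
import re


def canonical_motif(motif):
    """Return the alphabetically smallest rotation, so AG and GA share one class."""
    rotations = [motif[i:] + motif[:i] for i in range(len(motif))]
    return min(rotations)


def find_ssrs(seq, min_repeats=((2, 6), (3, 5))):
    """Find perfect di- and trinucleotide repeats in a DNA sequence.

    min_repeats: pairs of (motif length, minimum number of tandem copies).
    Returns a sorted list of (start position, motif as found, repeat count).
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats:
        # One motif of unit_len bases followed by at least min_rep - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            repeats = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


# Hypothetical sequence with one (AG)8 and one (AAT)6 repeat.
example = "GGC" + "AG" * 8 + "TTCC" + "AAT" * 6 + "CG"
for start, motif, repeats in find_ssrs(example):
    print(start, canonical_motif(motif), repeats)
```

    Primer design around each hit and polymorphism screening on a panel of accessions, as described in the abstract, would be separate downstream steps.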

  12. Expressed sequence tags reveal genetic diversity and putative virulence factors of the pathogenic oomycete Pythium insidiosum.

    PubMed

    Krajaejun, Theerapong; Khositnithikul, Rommanee; Lerksuthirat, Tassanee; Lowhnoo, Tassanee; Rujirawat, Thidarat; Petchthong, Thanom; Yingyong, Wanta; Suriyaphol, Prapat; Smittipat, Nat; Juthayothin, Tada; Phuntumart, Vipaporn; Sullivan, Thomas D

    2011-07-01

    Oomycetes are unique eukaryotic microorganisms that share a mycelial morphology with fungi. Many oomycetes are pathogenic to plants, and a more limited number are pathogenic to animals. Pythium insidiosum is the only oomycete that is capable of infecting both humans and animals, and causes a life-threatening infectious disease, called "pythiosis". In the majority of pythiosis patients life-long handicaps result from the inevitable radical excision of infected organs, and many die from advanced infection. Better understanding P. insidiosum pathogenesis at molecular levels could lead to new forms of treatment. Genetic and genomic information is lacking for P. insidiosum, so we have undertaken an expressed sequence tag (EST) study, and report on the first dataset of 486 ESTs, assembled into 217 unigenes. Of these, 144 had significant sequence similarity with known genes, including 47 with ribosomal protein homology. Potential virulence factors included genes involved in antioxidation, thermal adaptation, immunomodulation, and iron and sterol binding. Effectors resembling pathogenicity factors of plant-pathogenic oomycetes were also discovered, such as, a CBEL-like protein (possible involvement in host cell adhesion and hemagglutination), a putative RXLR effector (possibly involved in host cell modulation) and elicitin-like (ELL) proteins. Phylogenetic analysis mapped P. insidiosum ELLs to several novel clades of oomycete elicitins (ELIs), and homology modeling predicted that P. insidiosum ELLs should bind sterols. Most of the P. insidiosum ESTs showed homology to sequences in the genome or EST databases of other oomycetes, but one putative gene, with unknown function, was found to be unique to P. insidiosum. The EST dataset reported here represents the first steps in identifying genes of P. insidiosum and beginning transcriptome analysis. This genetic information will facilitate understanding of pathogenic mechanisms of this devastating pathogen. PMID:21724174

  13. Expressed sequence tags from Atta laevigata and identification of candidate genes for the control of pest leaf-cutting ants

    PubMed Central

    2011-01-01

    Background: Leafcutters are the highest evolved within Neotropical ants in the tribe Attini and model systems for studying caste formation, labor division and symbiosis with microorganisms. Some species of leafcutters are agricultural pests controlled by chemicals which affect other animals and accumulate in the environment. Aiming to provide a genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for the control of pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with broad geographic distribution in South America. Results: The analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for high level of gene expression, being responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; regulation of gene expression and metabolism. Based on the leafcutters' lifestyle and reports of genes involved in key processes of other social insects, we identified 146 sequences as potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. Conclusion: The generation and analysis of expressed sequence tags from Atta laevigata have provided an important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for the control of agricultural pest leafcutters. PMID:21682882

  14. Velocity measurement of clay intrusion through a sudden contraction step using a tagging pulse sequence.

    PubMed

    Tsushima, Shohji; Hasegawa, Atsushi; Suekane, Tetsuya; Hirai, Shuichiro; Tanaka, Yoshihiro; Nakasuji, Yoshizumi

    2003-07-01

    Magnetic resonance imaging (MRI) with a spatial tagging sequence was used to measure the velocity distribution of clay that was forced past a sudden contraction. A spatial tagging sequence provided magnetic resonance images of clay that allowed measurement of the velocity distribution in the clay, which can provide profound insights into the deformation process of clay during the intrusion process. The experiments were conducted using a specially designed vessel that could operate at up to 30 MPa. The vessel offers a rectangular test section with a sudden contraction step that had a contraction ratio of 2:1. The vessel was installed in a commercial magnetic resonance imaging system, and the fluid motion of clay flowing into the narrow contracted channel was quantitatively investigated to examine the behavior of flowing clay as a non-Newtonian fluid. MRI results are compared with those obtained by computational fluid dynamics (CFD) calculation. Velocity distributions obtained from each tag displacement did not agree well with those predicted by CFD near the contraction step, where the fluid accelerated rapidly. However, post-processing of the calculation results, in which a virtual tag displacement is calculated, gave better agreement with experiment and enabled us to compare MRI results with CFD results. PMID:12915199

  15. Multiplexed metagenome mining using short DNA sequence tags facilitates targeted discovery of epoxyketone proteasome inhibitors

    PubMed Central

    Owen, Jeremy G.; Charlop-Powers, Zachary; Smith, Alexandra G.; Ternei, Melinda A.; Calle, Paula Y.; Reddy, Boojala Vijay B.; Montiel, Daniel; Brady, Sean F.

    2015-01-01

    In molecular evolutionary analyses, short DNA sequences are used to infer phylogenetic relationships among species. Here we apply this principle to the study of bacterial biosynthesis, enabling the targeted isolation of previously unidentified natural products directly from complex metagenomes. Our approach uses short natural product sequence tags derived from conserved biosynthetic motifs to profile biosynthetic diversity in the environment and then guide the recovery of gene clusters from metagenomic libraries. The methodology is conceptually simple, requires only a small investment in sequencing, and is not computationally demanding. To demonstrate the power of this approach to natural product discovery we conducted a computational search for epoxyketone proteasome inhibitors within 185 globally distributed soil metagenomes. This led to the identification of 99 unique epoxyketone sequence tags, falling into 6 phylogenetically distinct clades. Complete gene clusters associated with nine unique tags were recovered from four saturating soil metagenomic libraries. Using heterologous expression methodologies, seven potent epoxyketone proteasome inhibitors (clarepoxcins A–E and landepoxcins A and B) were produced from these pathways, including compounds with different warhead structures and a naturally occurring halohydrin prodrug. This study provides a template for the targeted expansion of bacterially derived natural products using the global metagenome. PMID:25831524

  16. Multiplexed metagenome mining using short DNA sequence tags facilitates targeted discovery of epoxyketone proteasome inhibitors.

    PubMed

    Owen, Jeremy G; Charlop-Powers, Zachary; Smith, Alexandra G; Ternei, Melinda A; Calle, Paula Y; Reddy, Boojala Vijay B; Montiel, Daniel; Brady, Sean F

    2015-04-01

    In molecular evolutionary analyses, short DNA sequences are used to infer phylogenetic relationships among species. Here we apply this principle to the study of bacterial biosynthesis, enabling the targeted isolation of previously unidentified natural products directly from complex metagenomes. Our approach uses short natural product sequence tags derived from conserved biosynthetic motifs to profile biosynthetic diversity in the environment and then guide the recovery of gene clusters from metagenomic libraries. The methodology is conceptually simple, requires only a small investment in sequencing, and is not computationally demanding. To demonstrate the power of this approach to natural product discovery we conducted a computational search for epoxyketone proteasome inhibitors within 185 globally distributed soil metagenomes. This led to the identification of 99 unique epoxyketone sequence tags, falling into 6 phylogenetically distinct clades. Complete gene clusters associated with nine unique tags were recovered from four saturating soil metagenomic libraries. Using heterologous expression methodologies, seven potent epoxyketone proteasome inhibitors (clarepoxcins A-E and landepoxcins A and B) were produced from these pathways, including compounds with different warhead structures and a naturally occurring halohydrin prodrug. This study provides a template for the targeted expansion of bacterially derived natural products using the global metagenome.

  17. Sub-wavelength plasmonic readout for direct linear analysis of optically tagged DNA

    NASA Astrophysics Data System (ADS)

    Varsanik, Jonathan; Teynor, William; LeBlanc, John; Clark, Heather; Krogmeier, Jeffrey; Yang, Tian; Crozier, Kenneth; Bernstein, Jonathan

    2010-02-01

    This work describes the development and fabrication of a novel nanofluidic flow-through sensing chip that utilizes a plasmonic resonator to excite fluorescent tags with sub-wavelength resolution. We cover the design of the microfluidic chip and simulation of the plasmonic resonator using Finite Difference Time Domain (FDTD) software. The fabrication methods are presented, with testing procedures and preliminary results. This research is aimed at improving the resolution limits of the Direct Linear Analysis (DLA) technique developed by US Genomics [1]. In DLA, intercalating dyes which tag a specific 8 base-pair sequence are inserted in a DNA sample. This sample is pumped through a nano-fluidic channel, where it is stretched into a linear geometry and interrogated with light which excites the fluorescent tags. The resulting sequence of optical pulses produces a characteristic "fingerprint" of the sample which uniquely identifies any sample of DNA. Plasmonic confinement of light to a 100 nm wide metallic nano-stripe enables resolution of a higher tag density compared to free space optics. Prototype devices have been fabricated and are being tested with fluorophore solutions and tagged DNA. Preliminary results show evanescent coupling to the plasmonic resonator is occurring with 0.1 micron resolution; however, light scattering limits the S/N of the detector. Two methods to reduce scattered light are presented: index matching and curved waveguides.

  18. Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...

  19. Confirming single nucleotide polymorphisms from expressed sequence tag datasets derived from three cattle cDNA libraries.

    PubMed

    Lee, Seung-Hwan; Park, Eung-Woo; Cho, Yong-Min; Lee, Ji-Woong; Kim, Hyoung-Yong; Lee, Jun-Heon; Oh, Sung-Jong; Cheong, Il-Cheong; Yoon, Du-Hak

    2006-03-31

    Using the Phred/Phrap/Polyphred/Consed pipeline established in the National Livestock Research Institute of Korea, we predicted candidate coding single nucleotide polymorphisms (cSNPs) from 7,600 expressed sequence tags (ESTs) derived from three cDNA libraries (liver, M. longissimus dorsi, and intermuscular fat) of Hanwoo (Korean native cattle) steers. From the 7,600 ESTs, 829 contigs comprising more than two EST reads were assembled using the Phrap assembler. Based on the contig analysis, 201 candidate cSNPs were identified in 129 contigs, in which transitions (69%) outnumbered transversions (31%). To verify whether the predicted cSNPs are real, 17 SNPs involved in lipid and energy metabolism were selected from the ESTs. Twelve of these were confirmed to be real while five were identified as artifacts, possibly due to expressed sequence tag sequence error. Further analysis of the 12 verified cSNPs was performed using the program BLASTX. Five were identified as nonsynonymous cSNPs, five were synonymous cSNPs, and two SNPs were located in 3'-UTRs. Our data indicated that a relatively high SNP prediction rate (71%) from a large EST database could produce abundant cSNPs rapidly, which can be used as valuable genetic markers in cattle.
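    The transition/transversion split and the 71% confirmation rate reported above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the Phred/Phrap/Polyphred/Consed pipeline itself; the function names classify_snp and validation_rate are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; not part of the pipeline described in the abstract.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for a two-allele SNP."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref == alt or ref not in valid or alt not in valid:
        raise ValueError(f"not a valid SNP: {ref!r} -> {alt!r}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def validation_rate(confirmed: int, tested: int) -> float:
    """Fraction of tested candidate SNPs that were confirmed."""
    return confirmed / tested


if __name__ == "__main__":
    print(classify_snp("A", "G"))                       # purine to purine
    print(classify_snp("A", "C"))                       # purine to pyrimidine
    print(f"{validation_rate(12, 17):.0%} confirmed")   # 12 of 17 tested SNPs
```

    With the figures from the abstract, 12 confirmed out of 17 tested candidates gives the stated prediction rate of about 71%.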

  20. Development, characterization and cross species amplification of polymorphic microsatellite markers from expressed sequence tags of turmeric (Curcuma longa L.).

    PubMed

    Siju, S; Dhanya, K; Syamkumar, S; Sasikumar, B; Sheeja, T E; Bhat, A I; Parthasarathy, V A

    2010-02-01

    Expressed sequence tags (ESTs) from turmeric (Curcuma longa L.) were used for the screening of type and frequency of Class I (hypervariable) simple sequence repeats (SSRs). A total of 231 microsatellite repeats were detected from 12,593 EST sequences of turmeric after redundancy elimination. The average density of Class I SSRs amounts to one SSR per 17.96 kb of EST. Mononucleotides were the most abundant class of microsatellite repeat in turmeric ESTs, followed by trinucleotides. A robust set of 17 polymorphic EST-SSRs was developed and used for evaluating 20 turmeric accessions. The number of alleles detected ranged from 3 to 8 per locus. The developed markers were also evaluated in 13 related species of C. longa, confirming a high rate (100%) of cross-species transferability. The polymorphic microsatellite markers generated from this study could be used for genetic diversity analysis and resolving the taxonomic confusion prevailing in the genus.

  1. RAD tag sequencing as a source of SNP markers in Cynara cardunculus L

    PubMed Central

    2012-01-01

    Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach generated a large and robust SNP dataset through the adoption of optimized filtering criteria. PMID:22214349

  2. Sequencing degraded DNA from non-destructively sampled museum specimens for RAD-tagging and low-coverage shotgun phylogenetics.

    PubMed

    Tin, Mandy Man-Ying; Economo, Evan Philip; Mikheyev, Alexander Sergeyevich

    2014-01-01

    Ancient and archival DNA samples are valuable resources for the study of diverse historical processes. In particular, museum specimens provide access to biotas distant in time and space, and can provide insights into ecological and evolutionary changes over time. However, archival specimens are difficult to handle; they are often fragile and irreplaceable, and typically contain only short segments of denatured DNA. Here we present a set of tools for processing such samples for state-of-the-art genetic analysis. First, we report a protocol for minimally destructive DNA extraction of insect museum specimens, which produced sequenceable DNA from all of the samples assayed. The 11 specimens analyzed had fragmented DNA, rarely exceeding 100 bp in length, and could not be amplified by conventional PCR targeting the mitochondrial cytochrome oxidase I gene. Our approach made these samples amenable to analysis with commonly used next-generation sequencing-based molecular analytic tools, including RAD-tagging and shotgun genome re-sequencing. First, we used museum ant specimens from three species, each with its own reference genome, for RAD-tag mapping. We were able to use the degraded DNA sequences, which were sequenced in full, to identify duplicate reads and filter them prior to base calling. Second, we re-sequenced six Hawaiian Drosophila species, with millions of years of divergence, but with only a single available reference genome. Despite a shallow coverage of 0.37 ± 0.42 per base, we could recover a sufficient number of overlapping SNPs to fully resolve the species tree, which was consistent with earlier karyotypic studies, and previous molecular studies, at least in the regions of the tree that these studies could resolve. Although developed for use with degraded DNA, all of these techniques are readily applicable to more recent tissue, and are suitable for liquid handling automation.

  3. Development of expressed sequence tag and expressed sequence tag–simple sequence repeat marker resources for Musa acuminata

    PubMed Central

    Passos, Marco A. N.; de Oliveira Cruz, Viviane; Emediato, Flavia L.; de Camargo Teixeira, Cristiane; Souza, Manoel T.; Matsumoto, Takashi; Rennó Azevedo, Vânia C.; Ferreira, Claudia F.; Amorim, Edson P.; de Alencar Figueiredo, Lucio Flavio; Martins, Natalia F.; de Jesus Barbosa Cavalcante, Maria; Baurens, Franc-Christophe; da Silva, Orzenil Bonfim; Pappas, Georgios J.; Pignolet, Luc; Abadie, Catherine; Ciampi, Ana Y.; Piffanelli, Pietro; Miller, Robert N. G.

    2012-01-01

    Background and aims Banana (Musa acuminata) is a crop contributing to global food security. Many varieties lack resistance to biotic stresses, due to sterility and narrow genetic background. The objective of this study was to develop an expressed sequence tag (EST) database of transcripts expressed during compatible and incompatible banana–Mycosphaerella fijiensis (Mf) interactions. Black leaf streak disease (BLSD), caused by Mf, is a destructive disease of banana. Microsatellite markers were developed as a resource for crop improvement. Methodology cDNA libraries were constructed from in vitro-infected leaves from BLSD-resistant M. acuminata ssp. burmaniccoides Calcutta 4 (MAC4) and susceptible M. acuminata cv. Cavendish Grande Naine (MACV). Clones were 5′-end Sanger sequenced, ESTs assembled with TGICL and unigenes annotated using BLAST, Blast2GO and InterProScan. Mreps was used to screen for simple sequence repeats (SSRs), with markers evaluated for polymorphism using 20 diploid (AA) M. acuminata accessions contrasting in resistance to Mycosphaerella leaf spot diseases. Principal results A total of 9333 high-quality ESTs were obtained for MAC4 and 3964 for MACV, which assembled into 3995 unigenes. Of these, 2592 displayed homology to genes encoding proteins with known or putative function, and 266 to genes encoding proteins with unknown function. Gene ontology (GO) classification identified 543 GO terms, 2300 unigenes were assigned to EuKaryotic orthologous group categories and 312 mapped to Kyoto Encyclopedia of Genes and Genomes pathways. A total of 624 SSR loci were identified, with trinucleotide repeat motifs the most abundant in MAC4 (54.1 %) and MACV (57.6 %). Polymorphism across M. acuminata accessions was observed with 75 markers. Alleles per polymorphic locus ranged from 2 to 8, totalling 289. The polymorphism information content ranged from 0.08 to 0.81. Conclusions This EST collection offers a resource for studying functional genes, including

  4. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs widely exist in the Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers of sequence data. PMID:26214460

  5. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-07-14

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs widely exist in the Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers of sequence data.

  6. Gene expression profile in the anterior regeneration of the earthworm using expressed sequence tags.

    PubMed

    Cho, Sung-Jin; Lee, Myung Sik; Tak, Eun Sik; Lee, Eun; Koh, Ki Seok; Ahn, Chi Hyun; Park, Soon Cheol

    2009-01-01

    In order to gain insight into the gene expression profiles associated with anterior regeneration of the earthworm, Perionyx excavatus, we analyzed 1,159 expressed sequence tags (ESTs) derived from a cDNA library of early anterior regenerated tissue. Among the 1,159 ESTs analyzed, 622 (53.7%) ESTs showed significant similarity to known genes and represented 338 genes, of which 233 ESTs were singletons and 105 ESTs manifested as two or more ESTs. While 663 ESTs (57.2%) were sequenced only once, 308 ESTs (26.6%) appeared 2 to 5 times, and 188 ESTs (16.2%) were sequenced more than 5 times. A total of 803 genes were categorized into 15 groups according to their biological functions. Among the 1,159 ESTs sequenced, we found several genes encoding signaling molecules, such as Notch and Distal-less. The ESTs used in this study should provide a resource for future research in earthworm regeneration. PMID:19129665

  7. A physical map of the X chromosome of Drosophila melanogaster: Cosmid contigs and sequence tagged sites

    SciTech Connect

    Madueno, E.; Modolell, J.; Papagiannakis, G.

    1995-04-01

    A physical map of the euchromatic X chromosome of Drosophila melanogaster has been constructed by assembling contiguous arrays of cosmids that were selected by screening a library with DNA isolated from microamplified chromosomal divisions. This map, consisting of 893 cosmids, covers ~64% of the euchromatic part of the chromosome. In addition, 568 sequence tagged sites (STS), in aggregate representing 120 kb of sequenced DNA, were derived from selected cosmids. Most of these STSs, spaced at an average distance of ~35 kb along the euchromatic region of the chromosome, represent DNA tags that can be used as entry points to the fruitfly genome. Furthermore, 42 genes have been placed on the physical map, either through the hybridization of specific probes to the cosmids or through the fact that they were represented among the STSs. These provide a link between the physical and the genetic maps of D. melanogaster. Nine novel genes have been tentatively identified in Drosophila on the basis of matches between STS sequences and sequences from other species. 32 refs., 3 figs., 4 tabs.

  8. Identification of expressed resistance gene analogs from peanut (Arachis hypogaea L.) expressed sequence tags.

    PubMed

    Liu, Zhanji; Feng, Suping; Pandey, Manish K; Chen, Xiaoping; Culbreath, Albert K; Varshney, Rajeev K; Guo, Baozhu

    2013-05-01

    Low genetic diversity makes peanut (Arachis hypogaea L.) very vulnerable to plant pathogens, causing severe yield loss and reduced seed quality. Several hundred partial genomic DNA sequences have been identified as nucleotide-binding-site leucine-rich repeat (NBS-LRR) resistance (R) genes, but only a small portion has been found to have expressed transcripts. We aimed to identify resistance gene analogs (RGAs) from peanut expressed sequence tags (ESTs) and to develop polymorphic markers. The protein sequences of 54 known R genes were used to identify homologs from peanut ESTs from public databases. A total of 1,053 ESTs corresponding to six different classes of known R genes were recovered and assembled into 156 contigs and 229 singletons as peanut-expressed RGAs. There were 69 that encoded NBS-LRR proteins, 191 that encoded protein kinases, 82 that encoded LRR-PK/transmembrane proteins, 28 that encoded Toxin reductases, 11 that encoded LRR-domain containing proteins and four that encoded TM-domain containing proteins. Twenty-eight simple sequence repeats (SSRs) were identified from 25 peanut expressed RGAs. One SSR polymorphic marker (RGA121) was identified. Two polymerase chain reaction-based markers (Ahsw-1 and Ahsw-2) developed from RGA013 were homologous to the Tomato Spotted Wilt Virus (TSWV) resistance gene. All three markers were mapped on the same linkage group AhIV. These expressed RGAs are the source for RGA-tagged marker development and identification of peanut resistance genes.

  9. De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins

    SciTech Connect

    Shen, Yufeng; Tolic, Nikola; Hixson, Kim K.; Purvine, Samuel O.; Anderson, Gordon A.; Smith, Richard D.

    2008-10-15

    De novo sequencing has the promise to discover protein post-translational modifications; however, such approaches are still in their infancy and are not widely applied in proteomics practice because of their limited reliability. In this work, we describe a de novo sequencing approach for discovery of protein modifications through identification of the UStags (Anal. Chem. 2008, 80, 1871-1882). The de novo information was obtained from Fourier-transform tandem mass spectrometry for peptides and polypeptides in a yeast lysate, and the de novo sequences obtained were filtered to define a more limited set of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or in the UStags (i.e., the UStags’ prefix and suffix sequences and the UStags themselves) were used to infer the possible sequence modifications. With this de novo-UStag approach, we uncovered some unexpected variances of yeast protein sequences due to amino acid mutations and/or multiple modifications to the predicted protein sequences. Random matching of the de novo sequences to the predicted sequences was examined with the use of two random (false) databases, and ~3% false discovery rates were estimated for the de novo-UStag approach. The factors affecting the reliability (e.g., existence of de novo sequencing noise residues and redundant sequences) and the sensitivity are described. The de novo-UStag approach complements the UStag method previously reported by enabling discovery of new protein modifications.
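    The abstract does not spell out how the false discovery rate was computed from the two random databases. One common way to turn decoy searches into an estimate, shown here only as a hedged sketch, is to divide the average number of matches against the random databases by the number of matches against the real database. The function name estimate_fdr and the match counts in the usage lines are hypothetical and are not taken from the paper.

```python
# Illustrative sketch; the estimator below is an assumption, not the authors' stated method.
def estimate_fdr(target_matches: int, decoy_matches: list[int]) -> float:
    """Estimate a false discovery rate as
    (mean matches against random databases) / (matches against the real database)."""
    if target_matches <= 0:
        raise ValueError("target_matches must be positive")
    if not decoy_matches:
        raise ValueError("at least one decoy search is required")
    mean_decoy = sum(decoy_matches) / len(decoy_matches)
    return mean_decoy / target_matches


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    fdr = estimate_fdr(target_matches=1000, decoy_matches=[28, 32])
    print(f"estimated FDR: {fdr:.1%}")
```

    Averaging over more than one random database, as in the two-database design mentioned above, reduces the sensitivity of the estimate to the chance composition of any single decoy set.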

  10. Mining and gene ontology based annotation of SSR markers from expressed sequence tags of Humulus lupulus.

    PubMed

    Singh, Swati; Gupta, Sanchita; Mani, Ashutosh; Chaturvedi, Anoop

    2012-01-01

    Humulus lupulus is commonly known as hops, a member of the family Moraceae. Currently many projects are underway leading to the accumulation of voluminous genomic and expressed sequence tag sequences in public databases. The genetically characterized domains in these databases are limited due to non-availability of reliable molecular markers. A large amount of EST sequence data is available for hops. In the present study, the simple sequence repeat markers extracted from EST data are used as molecular markers for genetic characterization. 25,495 EST sequences were examined and assembled to get full-length sequences. Maximum frequency distribution was shown by mononucleotide SSR motifs, i.e. 60.44% in contigs and 62.16% in singletons, whereas minimum frequencies were observed for hexanucleotide SSRs in contigs (0.09%) and pentanucleotide SSRs in singletons (0.12%). Most trinucleotide motifs code for glutamic acid (GAA), while AT/TA were the most frequent repeats among dinucleotide SSRs. Flanking primer pairs were designed in silico for the SSR-containing sequences. Functional categorization of SSR-containing sequences was done through gene ontology terms such as biological process, cellular component and molecular function.

  11. Development of expressed sequence tag-simple sequence repeat markers for Chrysanthemum morifolium and closely related species.

    PubMed

    Liu, H; Zhang, Q X; Sun, M; Pan, H T; Kong, Z X

    2015-01-01

    With the development of chrysanthemum breeding in recent years, an increasing number of wild species in genera related to Chrysanthemum were introduced to extend the genetic resources and facilitate the genetic improvement of chrysanthemums via hybridization. However, few simple sequence repeat (SSR) markers are available for marker-assisted breeding and population genetic studies of chrysanthemum and closely related species. Expressed sequence tags (ESTs) in public databases and cross-species transferable markers are considered to be a cost-effective means for developing sequence-based markers. In this study, 25 EST-SSRs were successfully developed from Chrysanthemum EST sequences for Chrysanthemum morifolium and closely related species. In total, 4164 unigene sequences were assembled from 7180 ESTs of chrysanthemum in GenBank, which were subsequently used to screen for the presence of microsatellites with the SSRIT software. The screening criteria were 8, 5, 4, and 3 repeating units for di-, tri-, tetra-, and penta- and higher-order nucleotides, respectively. Moreover, 310 SSR loci from 296 sequences were identified, and 198 primer pairs for SSR amplification were designed with the Primer Premier 5.0 software, of which 25 SSR loci showed polymorphic amplification in 52 species and varieties belonging to Chrysanthemum, Ajania, and Opisthopappus. The application of EST-SSR markers to the identification of intergeneric hybrids between Chrysanthemum and Ajania was demonstrated. Therefore, EST-SSRs can be developed for species that lack gene sequences or ESTs by utilizing ESTs of closely related species. PMID:26214436
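    The screening criteria quoted above (at least 8, 5, 4 and 3 repeat units for di-, tri-, tetra-, and penta- or higher-order motifs) can be expressed as a small regular-expression scan. This is a minimal sketch of the idea and not a reimplementation of SSRIT; treating "higher-order" as hexanucleotides only, and the names MIN_REPEATS, is_primitive and find_ssrs, are assumptions made for illustration.

```python
import re

# Minimum repeat units per motif length, following the criteria in the abstract.
# Treating "penta- and higher-order" as motif lengths 5 and 6 is an assumption.
MIN_REPEATS = {2: 8, 3: 5, 4: 4, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq: str) -> list[tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each perfect SSR meeting the thresholds."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One copy of a k-mer followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "AT" * 8 + "CC" + "GAA" * 5 + "T"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

    The primitive-motif check keeps a run such as (AT)8 from being reported a second time as (ATAT)4. Imperfect and compound repeats are deliberately ignored in this sketch.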

  12. Regulatory sequence analysis tools.

    PubMed

    van Helden, Jacques

    2003-07-01

    The web resource Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat) offers a collection of software tools dedicated to the prediction of regulatory sites in non-coding DNA sequences. These tools include sequence retrieval, pattern discovery, pattern matching, genome-scale pattern matching, feature-map drawing, random sequence generation and other utilities. Alternative formats are supported for the representation of regulatory motifs (strings or position-specific scoring matrices) and several algorithms are proposed for pattern discovery. RSAT currently holds >100 fully sequenced genomes and these data are regularly updated from GenBank.

  13. SV40 Tag DNA sequences, present in a small proportion of human hepatocellular carcinomas, are associated with reduced survival

    PubMed Central

    Wong, N A C S; Rae, F; Herriot, M M; Mayer, N J; Brewster, D H; Harrison, D J

    2003-01-01

    Aims: To study the association between simian virus 40 (SV40) and human hepatocarcinogenesis. Methods: Polymerase chain reaction (PCR) to detect SV40 large T antigen (Tag) DNA was performed on: 50 human hepatocellular carcinoma (HCCs) diagnosed between 1978 and 1989 (cohort A); 20 cases of alcoholic liver cirrhosis from the same period; and 20 HCCs diagnosed after 1997 (cohort B). PCR to detect SV40 regulatory sequence and SV40 Tag immunohistochemistry were performed on selected cases from cohorts A and B. Amplified products were directly sequenced. Immunohistochemistry for p53 and pRb and clinicopathological analyses were performed on selected cases from cohorts A and B. Complete survival data were collected for cohort A. Result: SV40 Tag DNA was found in five cohort A HCCs but not in alcoholic liver cirrhosis cases or cohort B HCCs. Neither SV40 regulatory sequence nor SV40 Tag protein were demonstrated in Tag DNA positive HCCs. No clinicopathological differences existed between Tag DNA positive and negative HCCs, but the presence of Tag DNA was associated with reduced disease specific survival. Relatively fewer Tag DNA positive than negative HCCs expressed p53, but loss of pRb expression was similar in the two groups. Patients with Tag DNA positive HCCs were unlikely to have received SV40 contaminated poliovirus vaccine. Conclusions: SV40 Tag DNA is present in a small proportion of historical HCCs and may contribute to their pathogenesis and influence their outcome. The source of the virus is uncertain and more recent HCCs show no evidence of SV40. PMID:14645347

  14. Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

    PubMed Central

    Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

    2013-01-01

    Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation

  15. New sequence-tagged site molecular markers for identification of sex in Distichlis spicata.

    PubMed

    Eppley, Sarah M; O'Quinn, Robin; Brown, Anna L

    2009-09-01

    Sex-linked molecular markers have become valuable tools for understanding sex ratio evolution and sex-specific physiology in pre-reproductive plants. To develop new accurate methods for sexing Distichlis spicata juveniles and nonflowering individuals, we converted a random amplified polymorphic DNA-polymerase chain reaction marker that co-segregated with the female phenotype into a set of sequence-tagged site markers. We tested the marker pair on known males and females from populations in Oregon and California. A single band was obtained for all female samples but never for males.

  16. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    PubMed

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  17. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

    PubMed Central

    Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

    2016-01-01

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  18. Analyses of an Expressed Sequence Tag Library from Taenia solium, Cysticerca

    PubMed Central

    Lundström, Jonas; Salazar-Anton, Fernando; Sherwood, Ellen; Andersson, Björn; Lindh, Johan

    2010-01-01

    Background Neurocysticercosis is a disease caused by the oral ingestion of eggs from the human parasitic worm Taenia solium. Although drugs are available they are controversial because of the side effects and poor efficiency. An expressed sequence tag (EST) library is a method used to describe the gene expression profile and sequence of mRNA from a specific organism and stage. Such information can be used in order to find new targets for the development of drugs and to get a better understanding of the parasite biology. Methods and Findings Here an EST library consisting of 5760 sequences from the pig cysticerca stage has been constructed. In the library 1650 unique sequences were found and of these, 845 sequences (52%) were novel to T. solium and not identified within other EST libraries. Furthermore, 918 sequences (55%) were of unknown function. Amongst the 25 most frequently expressed sequences 6 had no relevant similarity to other sequences found in the Genbank NR DNA database. A prediction of putative signal peptides was also performed and 4 among the 25 were found to be predicted with a signal peptide. Proposed vaccine and diagnostic targets T24, Tsol18/HP6 and Tso31d could also be identified among the 25 most frequently expressed. Conclusions An EST library has been produced from pig cysticerca and analyzed. More than half of the different ESTs sequenced contained a sequence with no suggested function and 845 novel EST sequences have been identified. The library increases the knowledge about what genes are expressed and to what level. It can also be used to study different areas of research such as drug and diagnostic development together with parasite fitness via e.g. immune modulation. PMID:21200421

  19. Linguistic Preprocessing and Tagging for Problem Report Trend Analysis

    NASA Technical Reports Server (NTRS)

    Beil, Robert J.; Malin, Jane T.

    2012-01-01

    Mr. Robert Beil, Systems Engineer at Kennedy Space Center (KSC), requested the NASA Engineering and Safety Center (NESC) develop a prototype tool suite that combines complementary software technology used at Johnson Space Center (JSC) and KSC for problem report preprocessing and semantic tag extraction, to improve input to data mining and trend analysis. This document contains the outcome of the assessment and the Findings, Observations and NESC Recommendations.

  20. TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)

    PubMed Central

    O'Brien, Emmet A.; Koski, Liisa B.; Zhang, Yue; Yang, LiuSong; Wang, Eric; Gray, Michael W.; Burger, Gertraud; Lang, B. Franz

    2007-01-01

    The TBestDB database contains ∼370 000 clustered expressed sequence tag (EST) sequences from 49 organisms, covering a taxonomically broad range of poorly studied, mainly unicellular eukaryotes, and includes experimental information, consensus sequences, gene annotations and metabolic pathway predictions. Most of these ESTs have been generated by the Protist EST Program, a collaboration among six Canadian research groups. EST sequences are read from trace files up to a minimum quality cut-off, vector and linker sequence is masked, and the ESTs are clustered using phrap. The resulting consensus sequences are automatically annotated by using the AutoFACT program. The datasets are automatically checked for clustering errors due to chimerism and potential cross-contamination between organisms, and suspect data are flagged in or removed from the database. Access to data deposited in TBestDB by individual users can be restricted to those users for a limited period. With this first report on TBestDB, we open the database to the research community for free processing, annotation, interspecies comparisons and GenBank submission of EST data generated in individual laboratories. For instructions on submission to TBestDB, contact tbestdb@bch.umontreal.ca. The database can be queried at . PMID:17202165

  1. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

    PubMed Central

    Camargo, Anamaria A.; Samaia, Helena P. B.; Dias-Neto, Emmanuel; Simão, Daniel F.; Migotto, Italo A.; Briones, Marcelo R. S.; Costa, Fernando F.; Aparecida Nagai, Maria; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; Sonati, Maria de Fátima; Tajara, Eloiza H.; Valentini, Sandro R.; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Arnaldi, Liliane A. T.; de Assis, Angela M.; Bengtson, Mário Henrique; Bergamo, Nadia Aparecida; Bombonato, Vanessa; de Camargo, Maria E. R.; Canevari, Renata A.; Carraro, Dirce M.; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Corrêa, Rosana F. R.; Costa, Maria Cristina R.; Curcio, Cyntia; Hokama, Paula O. M.; Ferreira, Ari J. S.; Furuzawa, Gilberto K.; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Krieger, José E.; Leite, Luciana C. C.; Majumder, Paromita; Marins, Mozart; Marques, Everaldo R.; Melo, Analy S. A.; Melo, Monica; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana G.; Prevedel, Aline C.; Rahal, Paula; Rainho, Claudia A.; Reis, Eduardo M. R.; Ribeiro, Marcelo L.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; Sant'anna, Simone Cristina; dos Santos, Mariana L.; da Silva, Aline M.; da Silva, Neusa P.; Silva, Wilson A.; da Silveira, Rosana A.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Soares, Fernando; Moreira, Eloisa S.; Nunes, Diana N.; Correa, Ricardo G.; Zalcberg, Heloisa; Carvalho, Alex F.; Reis, Luis F. L.; Brentani, Ricardo R.; Simpson, Andrew J. G.; de Souza, Sandro J.

    2001-01-01

    Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022

  2. Identification of sequence tagged sites in the Asian and African elephant.

    PubMed

    Burk, N E; Messer, L A; Ernst, C W; Rothschild, M F

    1998-01-01

    To date, gene identification in elephants has essentially related to evolutionary studies. Further identification of genes in elephants could provide additional information for evolutionary studies and for evaluating genetic diversity in existing elephant populations. The objective of this project was to identify sequence tagged sites (STSs) in the Asian and the African elephant for the following genes: melatonin receptor 1a (MTNR1A), retinoic acid receptor beta (RARB), and leptin receptor (LEPR). These genes are highly conserved among mammals, and all may play a role in reproduction. Heterologous primers for PCR were designed from sequences available in other species. Fragments of size 141 base pairs (bp) for RARB and 327 bp for LEPR were obtained by amplifying genomic Asian and African elephant DNA. The LEPR fragment included an intron of 164 bp. Also, a 417 bp fragment for MTNR1A was obtained in the Asian elephant only. All PCR products were sequenced and comparison computations were made at the nucleotide and amino acid levels to sequence available in the GenBank database. Nucleotide sequence for RARB was identical for both Asian and African elephants and differed by only 3 bp for LEPR. Deduced amino acid sequence was identical for both STSs in both species. Elephants were relatively similar in comparison to other mammals and less similar to chickens.

  3. Development of polymorphic microsatellite markers based on expressed sequence tags in Populus cathayana (Salicaceae).

    PubMed

    Tian, Z Z; Zhang, F Q; Cai, Z Y; Chen, S L

    2016-01-01

    Populus cathayana occupies a large area within the northern, central, and southwestern regions of China, and is considered to be an important reforestation species in western China. In order to investigate the population genetic structure of this species, 10 polymorphic microsatellite loci were identified based on expressed sequence tags from de novo sequencing on the Illumina HiSeq 2000 platform. All microsatellite primers were tested on 48 P. cathayana individuals from four locations on the Qinghai-Tibet Plateau. The observed heterozygosity ranged from 0.000 to 1.000, and the null-allele frequency ranged from 0.000 to 0.904. These microsatellite markers may be a useful tool in genetic studies on P. cathayana and closely related species.

  4. Development of polymorphic microsatellite markers based on expressed sequence tags in Populus cathayana (Salicaceae).

    PubMed

    Tian, Z Z; Zhang, F Q; Cai, Z Y; Chen, S L

    2016-01-01

    Populus cathayana occupies a large area within the northern, central, and southwestern regions of China, and is considered to be an important reforestation species in western China. In order to investigate the population genetic structure of this species, 10 polymorphic microsatellite loci were identified based on expressed sequence tags from de novo sequencing on the Illumina HiSeq 2000 platform. All microsatellite primers were tested on 48 P. cathayana individuals from four locations on the Qinghai-Tibet Plateau. The observed heterozygosity ranged from 0.000 to 1.000, and the null-allele frequency ranged from 0.000 to 0.904. These microsatellite markers may be a useful tool in genetic studies on P. cathayana and closely related species. PMID:27525845

  5. OSIRIS-REx Touch-And-Go (TAG) Mission Design and Analysis

    NASA Technical Reports Server (NTRS)

    Berry, Kevin; Sutter, Brian; May, Alex; Williams, Ken; Barbee, Brent W.; Beckman, Mark; Williams, Bobby

    2013-01-01

    The Origins Spectral Interpretation Resource Identification Security Regolith Explorer (OSIRIS-REx) mission is a NASA New Frontiers mission launching in 2016 to rendezvous with the near-Earth asteroid (101955) 1999 RQ36 in late 2018. After several months in formation with and orbit about the asteroid, OSIRIS-REx will fly a Touch-And-Go (TAG) trajectory to the asteroid's surface to obtain a regolith sample. This paper describes the mission design of the TAG sequence and the propulsive maneuvers required to achieve the trajectory. This paper also shows preliminary results of orbit covariance analysis and Monte-Carlo analysis that demonstrate the ability to arrive at a targeted location on the surface of RQ36 within a 25 meter radius with 98.3% confidence.
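
    The kind of Monte-Carlo statement made in this abstract (a fraction of trials landing within a given radius) can be illustrated with a toy sketch. It assumes landing errors are independent, zero-mean Gaussians in two surface axes with a hypothetical standard deviation; the actual mission analysis propagates full trajectories and maneuver errors, and none of the numbers below come from the paper except the 25 m radius.

```python
import math
import random


def fraction_within_radius(sigma_m, radius_m, n_trials, seed=0):
    """Monte-Carlo estimate of the fraction of landings within radius_m.

    sigma_m is a HYPOTHETICAL per-axis 1-sigma landing error in meters.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        dx = rng.gauss(0.0, sigma_m)  # error along first surface axis
        dy = rng.gauss(0.0, sigma_m)  # error along second surface axis
        if math.hypot(dx, dy) <= radius_m:
            hits += 1
    return hits / n_trials


def analytic_fraction(sigma_m, radius_m):
    """Closed form for the same circular-Gaussian model (Rayleigh CDF)."""
    return 1.0 - math.exp(-radius_m ** 2 / (2.0 * sigma_m ** 2))


if __name__ == "__main__":
    sigma = 10.0  # hypothetical value, chosen only for illustration
    print(fraction_within_radius(sigma, 25.0, 20000, seed=1))
    print(analytic_fraction(sigma, 25.0))
```

    Under this simplified model the sampled fraction should approach the closed-form value as the number of trials grows, which gives a quick sanity check on the sampler.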

  6. Satellite-tagged transcribing sequences in Bubalus bubalis genome undergo programmed modulation in meiocytes: possible implications for transcriptional inactivation.

    PubMed

    Chattopadhyay, M; Gangadharan, S; Kapur, V; Azfer, M A; Prakash, B; Ali, S

    2001-09-01

    We cloned and sequenced a 1378 bp BamHI satellite DNA fraction from the water buffalo Bubalus bubalis and have studied its expression in different tissues. The GC-rich sequences of the resultant contig pDS5 crosshybridize only with bovid DNA and are not conserved evolutionarily. Typing of buffalo genomic DNA using pDS5 with several restriction enzymes revealed multilocus monomorphic bands. Similar typing of cattle, buffalo, goat, sheep, and gaur genomic DNA revealed variations in copy number and allele length giving rise to species-specific band patterns. Expression study of pDS5 in bubaline samples by RNA slot-blot, Northern blot, and RT-PCR showed various levels of signal in all the somatic tissues and germline cells except heart. A GenBank database search revealed homology of pDS5 sequences in the 5' region from nt 1-1261 with collagen gene. An AluI typing analysis of DNA from bubaline semen samples showed consistent loss of two bands. The presence of corresponding bands in somatic tissues suggests a sequence modulation within the pDS5 array in meiocytes during spermatogenesis, which is restored in the somatic cells after fertilization. Modulation of the satellite-tagged transcribing sequence in the meiocytes may be a mechanism of its inactivation.

  7. Mining of SSR markers from Expressed Sequence Tags of bamboo species

    PubMed Central

    Ramalakshmi, Oviya Iyappan; Piramanayagam, Shanmughavel

    2010-01-01

    With the ever-increasing number of Expressed Sequence Tags (ESTs) from various sequencing projects, ESTs have become a valuable, first-hand source for in-silico mining of simple sequence repeat (SSR) markers. We examined a total of 3419 EST sequences from three bamboo species, namely Phyllostachys edulis, Bambusa oldhamii and Dendrocalamus sinicus, for the presence of di- to hexa-nucleotide microsatellites. The frequency of SSR-containing ESTs varied from 5.36% in B. oldhamii to 13.05% in P. edulis. No SSRs were found in D. sinicus. Tri-nucleotide repeats (49.34%) were most frequent in P. edulis, while no marked difference among repeat types was found in B. oldhamii. Flanking primer pairs were also designed in silico for the sequences containing SSRs, and their position on the genome was hypothesized using similarity searching. SSRs located in open reading frames (ORFs) were given functional annotation using Gene Ontology. Polymorphic SSRs were also detected using a new pipeline, polySSR. The polymorphism level was very low (2.43%), and the position of the polymorphic SSRs was determined. The development of SSRs and the study of polymorphism will help in the further study of intra- and inter- gene flow, genetic structure, variability, linkage mapping and evolutionary relationships in bamboo. PMID:21364824
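
    A minimal sketch of di- to hexa-nucleotide SSR detection in an EST sequence follows. The minimum repeat counts are illustrative assumptions and are not the thresholds used in the study, and the function names are hypothetical; dedicated SSR-mining tools handle compound and imperfect repeats that this sketch ignores.

```python
import re

# Motif length -> minimum number of tandem copies. ASSUMED thresholds,
# chosen only for illustration.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # already reported under the shorter motif length
            copies = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    est = "GGC" + "AT" * 7 + "CCGTA" + "AAG" * 5 + "TTGC"
    for start, motif, copies in find_ssrs(est):
        print(start, motif, copies)
```

    The frequency of SSR-containing ESTs reported above would then be the number of sequences with at least one hit divided by the number of sequences examined.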

  8. Behavior Analysis Based on Coordinates of Body Tags

    NASA Astrophysics Data System (ADS)

    Luštrek, Mitja; Kaluža, Boštjan; Dovgan, Erik; Pogorelc, Bogdan; Gams, Matjaž

    This paper describes fall detection, activity recognition and the detection of anomalous gait in the Confidence project. The project aims to prolong the independence of the elderly by detecting falls and other types of behavior indicating a health problem. The behavior will be analyzed based on the coordinates of tags worn on the body. The coordinates will be detected with radio sensors. We describe two Confidence modules. The first one classifies the user's activity into one of six classes, including falling. The second one detects walking anomalies, such as limping, dizziness and hemiplegia. The walking analysis can automatically adapt to each person by using only the examples of normal walking of that person. Both modules employ machine learning: the paper focuses on the features they use and the effect of tag placement and sensor noise on the classification accuracy. Four tags were enough for activity recognition accuracy of over 93% at moderate sensor noise, while six were needed to detect walking anomalies with the accuracy of over 90%.

  9. Development of expressed sequence tag resources for Vanda Mimi Palmer and data mining for EST-SSR.

    PubMed

    Teh, Seow-Ling; Chan, Wai-Sun; Abdullah, Janna Ong; Namasivayam, Parameswari

    2011-08-01

    Vanda Mimi Palmer (VMP) is a highly sought-after fragrant orchid hybrid in Malaysia. It is economically important in the cosmetic and beauty industries and is also a popular potted ornamental plant. To date, no work on fragrance-related genes of vandaceous orchids has been reported by other research groups, although floral fragrance and volatiles have been extensively studied. An expressed sequence tag (EST) resource was developed for VMP principally to mine any potential fragrance-related expressed sequence tag-simple sequence repeat (EST-SSR) for future development as markers in the identification of fragrant vandaceous orchids endemic to Malaysia. Clustering, annotation and assembly of the ESTs identified 1,196 unigenes, comprising 966 singletons and 230 contigs. The VMP dbEST was functionally classified by gene ontology (GO) into three groups: molecular functions (51.2%), cellular components (16.4%) and biological processes (24.6%), while the remaining 7.8% showed no hits with a GO identifier. A total of 112 EST-SSRs (9.4%) were mined, in which at least five units of di-, tri-, tetra-, penta-, or hexa-nucleotide repeats were predicted. Di-nucleotide motif repeats appeared to be the most frequent among the detected SSRs, with the AT/TA types the most abundant among the dimerics, while the AAG/TTC and AGA/TCT types were the most frequent trimerics. The mined EST-SSRs are believed to be useful in the development of EST-SSR markers that are applicable in the screening and characterization of fragrance-related transcripts in closely related species.

  10. Development of polymorphic expressed sequence tag-single sequence repeat markers in the common Chinese cuttlefish, Sepiella maindroni.

    PubMed

    Li, R H; Lu, S K; Zhang, C L; Song, W W; Mu, C K; Wang, C L

    2014-01-01

    The common Chinese cuttlefish (Sepiella maindroni) is one of the popular edible cephalopods consumed across Asia. To facilitate the population genetic investigation of this species, we developed fourteen polymorphic microsatellite markers from expressed sequence tags of S. maindroni. The number of alleles at each locus ranged from 6 to 10, with an average of 7.9 alleles per locus. The observed and expected heterozygosity ranged from 0.615 to 0.962 and from 0.685 to 0.888, respectively. Four loci were found to deviate significantly from Hardy-Weinberg equilibrium. The polymorphism information content ranged from 0.638 to 0.833. These polymorphic microsatellite loci will be helpful for population genetic, genetic linkage map, and other genetic studies of S. maindroni. PMID:25117305
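
    The per-locus statistics quoted in this record (observed heterozygosity, expected heterozygosity, polymorphism information content) can be computed from diploid genotypes as sketched below. This uses the simple uncorrected formulas, He = 1 - sum(p_i^2) and the standard PIC definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; the software used in the study may apply sample-size corrections, and the function names and input format here are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """genotypes: list of (allele_a, allele_b) tuples for one locus."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """Uncorrected expected heterozygosity under Hardy-Weinberg proportions."""
    return 1.0 - sum(p * p for p in freqs.values())


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    homozygous = sum(x * x for x in p)
    cross = sum(2.0 * x * x * y * y for x, y in combinations(p, 2))
    return 1.0 - homozygous - cross


if __name__ == "__main__":
    # Toy locus with two alleles scored in four individuals.
    locus = [("A", "B"), ("A", "A"), ("B", "B"), ("A", "B")]
    freqs = allele_frequencies(locus)
    print(observed_heterozygosity(locus))
    print(expected_heterozygosity(freqs))
    print(polymorphism_information_content(freqs))
```

    A large gap between observed and expected heterozygosity at a locus is one signal that motivates the Hardy-Weinberg tests mentioned in the abstract.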

  11. Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...

  12. Identification of immunological expressed sequence tags in the mealworm beetle Tenebrio molitor.

    PubMed

    Dobson, Adam J; Johnston, Paul R; Vilcinskas, Andreas; Rolff, Jens

    2012-12-01

    Understanding the evolutionary ecology of immune responses to persistent infection could provide fundamental insight into temporal dynamics or interactive mechanisms that could be co-opted for antibiotic treatment regimes. Additionally, identification of novel molecules involved in these processes could provide novel compounds for biotechnological development. The beetle Tenebrio molitor displays a high level of induced antimicrobial activity coincident with persistent immuno-resistant Staphylococcus aureus, and is the first invertebrate model for persistent infection. Here we present expressed sequence tags (ESTs) detected by suppression-subtraction hybridization of Tenebrio larvae after infection with S. aureus. Amongst others, we identified mRNAs coding for various oxidative enzymes and two antimicrobial peptides. These ESTs provide a foundation for mechanistic study of Tenebrio's immune system. PMID:23041376

  13. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus

    PubMed Central

    2013-01-01

    Background Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. Results In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most (213,830) ESTs, generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed from all the individually mining results, a total of 25,417 quality SNPs were discovered – 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2 haplotype contigs varied widely in these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos to the Clementine reference genome scaffolds revealed 2,947 SNPs had “no hits found”, 19,943 had 1 unique hit / alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, it served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. Conclusions High-quality EST-SNPs from different

  14. Characterization of genic microsatellite markers derived from expressed sequence tags in Pacific abalone ( Haliotis discus hannai)

    NASA Astrophysics Data System (ADS)

    Li, Qi; Shu, Jing; Zhao, Cui; Liu, Shikai; Kong, Lingfeng; Zheng, Xiaodong

    2010-01-01

    Simple sequence repeat (SSR) markers were developed from the expressed sequence tags (ESTs) of Pacific abalone ( Haliotis discus hannai). Repeat motifs were found in 4.95% of the ESTs at a frequency of one repeat every 10.04 kb of EST sequences, after redundancy elimination. Seventeen polymorphic EST-SSRs were developed. The number of alleles per locus varied from 2 to 17, with an average of 6.8 alleles per locus. The expected and observed heterozygosities ranged from 0.159 to 0.928 and from 0.132 to 0.922, respectively. Twelve of the 17 loci (70.6%) were successfully amplified in H. diversicolor. Seventeen loci segregated in three families, with three showing the presence of null alleles (17.6%). The adequate level of variability and low frequency of null alleles observed in H. discus hannai, together with the high rate of transferability across Haliotis species, make this set of EST-SSR markers an important tool for comparative mapping, marker-assisted selection, and evolutionary studies, not only in the Pacific abalone, but also in related species.

  15. Large scale in silico identification of MYB family genes from wheat expressed sequence tags.

    PubMed

    Cai, Hongsheng; Tian, Shan; Dong, Hansong

    2012-10-01

    The MYB proteins constitute one of the largest transcription factor families in plants. Much research has been performed to determine their structures, functions, and evolution, especially in the model plants, Arabidopsis, and rice. However, this transcription factor family has been much less studied in wheat (Triticum aestivum), for which no genome sequence is yet available. Despite this, expressed sequence tags are an important resource that permits opportunities for large scale gene identification. In this study, a total of 218 sequences from wheat were identified and confirmed to be putative MYB proteins, including 1RMYB, R2R3-type MYB, 3RMYB, and 4RMYB types. A total of 36 R2R3-type MYB genes with complete open reading frames were obtained. The putative orthologs were assigned in rice and Arabidopsis based on the phylogenetic tree. Tissue-specific expression pattern analyses confirmed the predicted orthologs, and this meant that gene information could be inferred from the Arabidopsis genes. Moreover, the motifs flanking the MYB domain were analyzed using the MEME web server. The distribution of motifs among wheat MYB proteins was investigated and this facilitated subfamily classification.

  16. Construction and evaluation of cDNA libraries for large-scale expressed sequence tag sequencing in wheat (Triticum aestivum L.).

    PubMed

    Zhang, D; Choi, D W; Wanamaker, S; Fenton, R D; Chin, A; Malatrasi, M; Turuspekov, Y; Walia, H; Akhunov, E D; Kianian, P; Otto, C; Simons, K; Deal, K R; Echenique, V; Stamova, B; Ross, K; Butler, G E; Strader, L; Verhey, S D; Johnson, R; Altenbach, S; Kothari, K; Tanaka, C; Shah, M M; Laudencia-Chingcuanco, D; Han, P; Miller, R E; Crossman, C C; Chao, S; Lazo, G R; Klueva, N; Gustafson, J P; Kianian, S F; Dubcovsky, J; Walker-Simmons, M K; Gill, K S; Dvorák, J; Anderson, O D; Sorrells, M E; McGuire, P E; Qualset, C O; Nguyen, H T; Close, T J

    2004-10-01

    A total of 37 original cDNA libraries and 9 derivative libraries enriched for rare sequences were produced from Chinese Spring wheat (Triticum aestivum L.), five other hexaploid wheat genotypes (Cheyenne, Brevor, TAM W101, BH1146, Butte 86), tetraploid durum wheat (T. turgidum L.), diploid wheat (T. monococcum L.), and two other diploid members of the grass tribe Triticeae (Aegilops speltoides Tausch and Secale cereale L.). The emphasis in the choice of plant materials for library construction was reproductive development subjected to environmental factors that ultimately affect grain quality and yield, but roots and other tissues were also included. Partial cDNA expressed sequence tags (ESTs) were examined by various measures to assess the quality of these libraries. All ESTs were processed to remove cloning system sequences and contaminants and then assembled using CAP3. Following these processing steps, this assembly yielded 101,107 sequences derived from 89,043 clones, which defined 16,740 contigs and 33,213 singletons, a total of 49,953 "unigenes." Analysis of the distribution of these unigenes among the libraries led to the conclusion that the enrichment methods were effective in reducing the most abundant unigenes and to the observation that the most diverse libraries were from tissues exposed to environmental stresses including heat, drought, salinity, or low temperature. PMID:15514038

  17. Construction and Evaluation of cDNA Libraries for Large-Scale Expressed Sequence Tag Sequencing in Wheat (Triticum aestivum L.)

    PubMed Central

    Zhang, D.; Choi, D. W.; Wanamaker, S.; Fenton, R. D.; Chin, A.; Malatrasi, M.; Turuspekov, Y.; Walia, H.; Akhunov, E. D.; Kianian, P.; Otto, C.; Simons, K.; Deal, K. R.; Echenique, V.; Stamova, B.; Ross, K.; Butler, G. E.; Strader, L.; Verhey, S. D.; Johnson, R.; Altenbach, S.; Kothari, K.; Tanaka, C.; Shah, M. M.; Laudencia-Chingcuanco, D.; Han, P.; Miller, R. E.; Crossman, C. C.; Chao, S.; Lazo, G. R.; Klueva, N.; Gustafson, J. P.; Kianian, S. F.; Dubcovsky, J.; Walker-Simmons, M. K.; Gill, K. S.; Dvořák, J.; Anderson, O. D.; Sorrells, M. E.; McGuire, P. E.; Qualset, C. O.; Nguyen, H. T.; Close, T. J.

    2004-01-01

    A total of 37 original cDNA libraries and 9 derivative libraries enriched for rare sequences were produced from Chinese Spring wheat (Triticum aestivum L.), five other hexaploid wheat genotypes (Cheyenne, Brevor, TAM W101, BH1146, Butte 86), tetraploid durum wheat (T. turgidum L.), diploid wheat (T. monococcum L.), and two other diploid members of the grass tribe Triticeae (Aegilops speltoides Tausch and Secale cereale L.). The emphasis in the choice of plant materials for library construction was reproductive development subjected to environmental factors that ultimately affect grain quality and yield, but roots and other tissues were also included. Partial cDNA expressed sequence tags (ESTs) were examined by various measures to assess the quality of these libraries. All ESTs were processed to remove cloning system sequences and contaminants and then assembled using CAP3. Following these processing steps, this assembly yielded 101,107 sequences derived from 89,043 clones, which defined 16,740 contigs and 33,213 singletons, a total of 49,953 “unigenes.” Analysis of the distribution of these unigenes among the libraries led to the conclusion that the enrichment methods were effective in reducing the most abundant unigenes and to the observation that the most diverse libraries were from tissues exposed to environmental stresses including heat, drought, salinity, or low temperature. PMID:15514038

  18. An expressed sequence tag survey of gene expression in the pond snail Lymnaea stagnalis, an intermediate vector of trematodes [corrected].

    PubMed

    Davison, A; Blaxter, M L

    2005-05-01

    The pond snail Lymnaea stagnalis is an intermediate vector for the liver fluke Fasciola hepatica, a common parasite of ruminants and humans. Yet, despite being a disease of medical and economic importance, as well as a potentially useful comparative tool, the genetics of the relationship between Lymnaea and Fasciola has barely been investigated. As a complement to forthcoming F. hepatica expressed sequence tags (ESTs), we generated 1320 ESTs from L. stagnalis central nervous system (CNS) libraries. We estimate that these sequences derive from 771 different genes, of which 374 showed significant similarity to proteins in public databases, and 169 were similar to ESTs from the snail vector Biomphalaria glabrata. These L. stagnalis ESTs will provide insight into the function of the snail CNS, as well as the molecular components of behaviour and response to parasitism. In the future, the comparative analysis of Lymnaea/Fasciola with Biomphalaria/Schistosoma will help to understand both conserved and divergent aspects of the host-parasite relationship. The L. stagnalis ESTs will also assist gene prediction in the forthcoming B. glabrata genome sequence. The dataset is available for searching on the world-wide web at http://zeldia.cap.ed.ac.uk/mollusca.html.

  19. Generation of 10,154 expressed sequence tags from a leafy gametophyte of a marine red alga, Porphyra yezoensis.

    PubMed

    Nikaido, I; Asamizu, E; Nakajima, M; Nakamura, Y; Saga, N; Tabata, S

    2000-06-30

    A total of 10,154 5'-end expressed sequence tags (ESTs) were established from the normalized and size-selected cDNA libraries of a marine red alga, Porphyra yezoensis. Among the ESTs, 2140 were unique species, and the remaining 8014 were grouped into 1127 species. Database search of the 3267 non-redundant ESTs by the BLAST algorithm showed that the sequences of 1080 species (33.1%) have similarity to those of registered genes from various organisms including higher plants, mammals, yeasts, and cyanobacteria, while 2187 (66.9%) are novel. Codon usage analysis in the coding regions of 101 non-redundant EST groups showing significant similarity to known genes indicated a higher GC content at the third codon position (79.4%) than at the first (62.2%) and second (45.0%) positions, suggesting that the genome has been exposed to high GC pressure during evolution. The sequence data of individual ESTs are available at the web site http://www.kazusa.or.jp/en/plant/porphyra/EST/.
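
The position-specific GC figures quoted above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' pipeline: it assumes an in-frame coding sequence is already available, and the function name `gc_by_codon_position` and the example sequence are made up for illustration.

```python
def gc_by_codon_position(cds: str) -> list[float]:
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (hypothetical input);
    a trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any incomplete final codon
    fractions = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return fractions


# Made-up example: codons ATG GCC GAC GGC
print(gc_by_codon_position("ATGGCCGACGGC"))
```

Pooling the counts over many coding regions, rather than averaging per-sequence fractions, would probably be closer to a genome-wide estimate, but the abstract does not say which was used.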

  20. Development and characterization of novel expressed sequence tag-derived simple sequence repeat markers in Hevea brasiliensis (rubber tree).

    PubMed

    An, Z W; Li, Y C; Zhai, Q L; Xie, L L; Zhao, Y H; Huang, H S

    2013-11-22

    Cultivated clones of Hevea brasiliensis have a narrow genetic base. In order to broaden the genetic base, it is first necessary to investigate the genetic diversity of wild populations. Expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed to investigate the genetic diversity of Hevea populations. Four hundred and thirty microsatellites were identified and 148 primers were designed to amplify the loci. Twenty-nine primer pairs were synthesized and evaluated for their ability to detect genetic polymorphisms among 40 wild accessions of H. brasiliensis. Twenty-one of the 29 loci were polymorphic. The number of alleles per locus in the 40 accessions ranged from 2 to 7. H(O) and H(E) at each locus ranged from 0.0000 to 0.9000 and from 0.0000 to 0.8704, respectively. All 21 loci could amplify in H. brasiliensis, H. pauciflora, H. nitida, H. spruceana, and H. camargoana. The EST-SSR primers developed herein can be used in genetic diversity and structure studies in H. brasiliensis.
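
The H(O) and H(E) values reported above are per-locus observed and expected heterozygosity. The sketch below shows one common way to compute them, assuming diploid genotypes and the simple estimator H(E) = 1 - sum(p_i^2); the abstract does not state which estimator was used, and the function name and sample genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (H_O, H_E) for one locus.

    `genotypes` is a list of (allele_a, allele_b) pairs, one per diploid
    individual. H_E uses 1 - sum(p_i^2), which is an assumption here;
    an unbiased small-sample correction is not applied.
    """
    n = len(genotypes)
    h_obs = sum(a != b for a, b in genotypes) / n  # share of heterozygotes
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n  # two alleles per individual
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return h_obs, h_exp


# Made-up allele sizes (bp) for four accessions at one locus
sample = [(150, 150), (150, 154), (154, 154), (150, 154)]
print(heterozygosity(sample))
```

A monomorphic locus gives 0.0 for both values, which matches the lower bound of the ranges quoted in the abstract.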

  1. Identification of odorant-binding protein genes from antennal expressed sequence tags of the onion fly, Delia antiqua.

    PubMed

    Mitaka, Hayato; Matsuo, Takashi; Miura, Nami; Ishikawa, Yukio

    2011-03-01

    Insect odorant-binding proteins (OBPs) are thought to play a crucial role in the chemosensation of hydrophobic molecules such as pheromones and host chemicals. The onion fly, Delia antiqua, is a specialist feeder of Allium plants, and utilizes a host odorant n-dipropyl disulfide as a cue for its oviposition. Because n-dipropyl disulfide is a highly hydrophobic compound, some OBPs might be indispensable for perception of it. However, no OBP gene has been identified in D. antiqua. Here, to obtain the DNA sequences of D. antiqua OBPs, we performed an analysis of antennal expressed sequence tags (ESTs). Among 288 EST clones, eight D. antiqua OBP genes were identified for the first time. Phylogenetic analysis revealed that each D. antiqua OBP gene is more closely related to its Drosophila orthologs than to the other D. antiqua OBP genes, suggesting that these OBP genes had emerged before the divergence of Delia and Drosophila species. All of the eight D. antiqua OBPs are expressed not only in the antennae but also in the legs, suggesting additional roles in the taste perception of non-volatile compounds. These findings serve as an important basis for understanding the molecular mechanisms underlying the host adaptations of D. antiqua. PMID:20848218

  2. Proteomic analysis of Trypanosoma cruzi developmental stages using isotope-coded affinity tag reagents.

    PubMed

    Paba, Jaime; Ricart, Carlos A O; Fontes, Wagner; Santana, Jaime M; Teixeira, Antonio R L; Marchese, Jason; Williamson, Brian; Hunt, Tony; Karger, Barry L; Sousa, Marcelo V

    2004-01-01

    Comparative proteome analysis of developmental stages of the human pathogen Trypanosoma cruzi was carried out by isotope-coded affinity tag technology (ICAT) associated with liquid chromatography-mass spectrometry peptide sequencing (LC-MS/MS). Protein extracts of the protozoan trypomastigote and amastigote stages were labeled with heavy (D8) and light (D0) ICAT reagents and subjected to cation exchange and avidin affinity chromatographies followed by LC-MS/MS analysis. High confidence sequence information and expression levels for 41 T. cruzi polypeptides, including metabolic enzymes, paraflagellar rod components, tubulins, and heat-shock proteins were reported. Twenty-nine proteins displayed similar levels of expression in both forms of the parasite, nine proteins presented higher levels in trypomastigotes, whereas three were more expressed in amastigotes.

  3. Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags

    PubMed Central

    Etter, Paul D.; Stiffler, Nicholas; Johnson, Eric A.; Cresko, William A.

    2010-01-01

    Next-generation sequencing technology provides novel opportunities for gathering genome-scale sequence data in natural populations, laying the empirical foundation for the evolving field of population genomics. Here we conducted a genome scan of nucleotide diversity and differentiation in natural populations of threespine stickleback (Gasterosteus aculeatus). We used Illumina-sequenced RAD tags to identify and type over 45,000 single nucleotide polymorphisms (SNPs) in each of 100 individuals from two oceanic and three freshwater populations. Overall estimates of genetic diversity and differentiation among populations confirm the biogeographic hypothesis that large panmictic oceanic populations have repeatedly given rise to phenotypically divergent freshwater populations. Genomic regions exhibiting signatures of both balancing and divergent selection were remarkably consistent across multiple, independently derived populations, indicating that replicate parallel phenotypic evolution in stickleback may be occurring through extensive, parallel genetic evolution at a genome-wide scale. Some of these genomic regions co-localize with previously identified QTL for stickleback phenotypic variation identified using laboratory mapping crosses. In addition, we have identified several novel regions showing parallel differentiation across independent populations. Annotation of these regions revealed numerous genes that are candidates for stickleback phenotypic evolution and will form the basis of future genetic analyses in this and other organisms. This study represents the first high-density SNP–based genome scan of genetic diversity and differentiation for populations of threespine stickleback in the wild. These data illustrate the complementary nature of laboratory crosses and population genomic scans by confirming the adaptive significance of previously identified genomic regions, elucidating the particular evolutionary and demographic history of such regions in natural

  4. A wing expressed sequence tag resource for Bicyclus anynana butterflies, an evo-devo model

    PubMed Central

    Beldade, Patrícia; Rudd, Stephen; Gruber, Jonathan D; Long, Anthony D

    2006-01-01

    Background: Butterfly wing color patterns are a key model for integrating evolutionary developmental biology and the study of adaptive morphological evolution. Yet, despite the biological, economic and educational value of butterflies, they are still relatively under-represented in terms of available genomic resources. Here, we describe an Expressed Sequence Tag (EST) project for Bicyclus anynana that has identified the largest available collection to date of expressed genes for any butterfly. Results: By targeting cDNAs from developing wings at the stages when pattern is specified, we biased gene discovery towards genes potentially involved in pattern formation. Assembly of 9,903 ESTs from a subtracted library allowed us to identify 4,251 genes of which 2,461 were annotated based on BLAST analyses against relevant gene collections. Gene prediction software identified 2,202 peptides, of which 215 longer than 100 amino acids had no homology to any known proteins and, thus, potentially represent novel or highly diverged butterfly genes. We combined gene and Single Nucleotide Polymorphism (SNP) identification by constructing cDNA libraries from pools of outbred individuals, and by sequencing clones from the 3' end to maximize alignment depth. Alignments of multi-member contigs allowed us to identify over 14,000 putative SNPs, with 316 genes having at least one high confidence double-hit SNP. We furthermore identified 320 microsatellites in transcribed genes that can potentially be used as genetic markers. Conclusion: Our project was designed to combine gene and sequence polymorphism discovery and has generated the largest gene collection available for any butterfly and many potential markers in expressed genes. These resources will be invaluable for exploring the potential of B. anynana in particular, and butterflies in general, as models in ecological, evolutionary, and developmental genetics. PMID:16737530

  5. The use of archived tags in retrospective genetic analysis of fish.

    PubMed

    Bonanomi, Sara; Therkildsen, Nina Overgaard; Hedeholm, Rasmus Berg; Hemmer-Hansen, Jakob; Nielsen, Einar E

    2014-05-01

    Collections of historical tissue samples from fish (e.g. scales and otoliths) stored in museums and fisheries institutions are precious sources of DNA for conducting retrospective genetic analysis. However, in some cases, only external tags used for documentation of spatial dynamics of fish populations have been preserved. Here, we test the usefulness of fish tags as a source of DNA for genetic analysis. We extract DNA from historical tags from cod collected in Greenlandic waters between 1950 and 1968. We show that the quantity and quality of DNA recovered from tags is comparable to DNA from archived otoliths from the same individuals. Surprisingly, levels of cross-contamination do not seem to be significantly higher in DNA from external (tag) than internal (otolith) sources. Our study therefore demonstrates that historical tags can be a highly valuable source of DNA for retrospective genetic analysis of fish.

  6. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform

    PubMed Central

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-01-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length on gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled in equimolar amounts and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus. PMID:24571307

  7. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform.

    PubMed

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-09-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length on gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled in equimolar amounts and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus.

  8. A comprehensive nonredundant expressed sequence tag collection for the developing Rattus norvegicus heart.

    PubMed

    Laffin, Jennifer J S; Scheetz, Todd E; Bonaldo, Maria de Fatima; Reiter, Rebecca S; Chang, Shereen; Eyestone, Mari; Abdulkawy, Hakeem; Brown, Bartley; Roberts, Chad; Tack, Dylan; Kucaba, Tamara; Lin, Jim Jung-Ching; Sheffield, Val C; Casavant, Thomas L; Soares, M Bento

    2004-04-13

    Congenital heart defects affect approximately 1,000,000 people in the United States, with 40,000 new births contributing to that number every year. A large percentage of these defects can be attributed to septal defects. We assembled a nonredundant collection of over 12,000 expressed sequence tags (ESTs) from a total of 30,000 ESTs, with the ultimate goal of identifying spatially and/or temporally regulated genes during heart septation. These ESTs were compiled from nonnormalized, normalized, and serially subtracted cDNA libraries derived from two sets of tissue samples. The first includes microdissected rat hearts from embryonic (E) days E13, E15, and E16.5-E18.5 and adult heart. The second includes hearts from embryonic days E17, E19, and E21 and postnatal (P) days P1, P12, P74, and P200. Over 6,000 novel ESTs were identified in the libraries derived from these two sets of tissues, all of which have been contributed to the NCBI rat UniGene collection. It is anticipated that such EST and cDNA clone resources will prove invaluable to gene expression studies aimed at the understanding of the molecular mechanisms underlying heart septation defects.

  9. The non-coding RNA composition of the mitotic chromosome by 5′-tag sequencing

    PubMed Central

    Meng, Yicong; Yi, Xianfu; Li, Xinhui; Hu, Chuansheng; Wang, Ju; Bai, Ling; Czajkowsky, Daniel M.; Shao, Zhifeng

    2016-01-01

    Mitotic chromosomes are one of the most commonly recognized sub-cellular structures in eukaryotic cells. Yet basic information necessary to understand their structure and assembly, such as their composition, is still lacking. Recent proteomic studies have begun to fill this void, identifying hundreds of RNA-binding proteins bound to mitotic chromosomes. However, by contrast, there are only two RNA species (U3 snRNA and rRNA) that are known to be associated with the mitotic chromosome, suggesting that there are many mitotic chromosome-associated RNAs (mCARs) not yet identified. Here, using a targeted protocol based on 5′-tag sequencing to profile the mammalian mCAR population, we report the identification of 1279 mCARs, the majority of which are ncRNAs, including lncRNAs that exhibit greater conservation across 60 vertebrate species than the entire population of lncRNAs. There is also a significant enrichment of snoRNAs and specific SINE RNAs. Finally, ∼40% of the mCARs are presently unannotated, many of which are as abundant as the annotated mCARs, suggesting that there are also many novel ncRNAs in the mCARs. Overall, the mCARs identified here, together with the previous proteomic and genomic data, constitute the first comprehensive catalogue of the molecular composition of the eukaryotic mitotic chromosomes. PMID:27016738

  10. Isolation of expressed sequence tags of Agaricus bisporus and their assignment to chromosomes.

    PubMed Central

    Sonnenberg, A S; de Groot, P W; Schaap, P J; Baars, J J; Visser, J; Van Griensven, L J

    1996-01-01

    The genome of the cultivated basidiomycete Agaricus bisporus Horst U1 and of its homokaryotic parents has been characterized by using an optimized method of pulsed-field gel electrophoresis. Expressed sequence tags obtained as expressed cDNAs from a primordial tissue-derived cDNA library and a number of previously isolated genes were used to identify the individual chromosomes of the parental lines of Horst U1. The genome consists of 13 chromosomes, and its total size is 31 Mb. For those chromosomes that could not be resolved by contour-clamped homogeneous electric field electrophoresis, the segregation of marker genes was studied in a set of 86 homokaryotic offspring of Horst U1. At least two markers were assigned to each individual chromosome. In this way all individual chromosomes were unequivocally identified. The large size difference observed between the homologous chromosomes IX, harboring the rDNA repeat, was shown to be largely due to a higher copy number of rDNA in parental strain H97 than in parental strain H39. PMID:8953726

  11. Paired-end diTagging for transcriptome and genome analysis.

    PubMed

    Ng, Patrick; Wei, Chia-Lin; Ruan, Yijun

    2007-07-01

    The Paired-End diTagging (PET) procedure enables one to obtain sequence information from both termini of any contiguous DNA fragment. This is achieved by a series of enzymatic manipulations that introduce MmeI sites directly flanking each DNA insert during the construction of a plasmid library. Subsequent MmeI digestion and self-ligation results in the production of covalently-linked paired-end ditags (PETs) that can be extracted and then concatenated for efficient sequencing. By mapping the PET sequences to assembled genomes, the original DNA fragments from which the PETs were derived can be precisely localized. This unit details two applications of PET technology. In GIS-PET, ditagging of mRNA converted to full-length cDNA enables whole-transcriptome analysis, including novel gene identification, gene prediction validation, and gene expression studies. In ChIP-PET, ditagging of chromatin immunoprecipitation-enriched genomic DNA fragments enables the global mapping of transcription factor binding sites. A recent innovation (Multiplex Sequencing of Paired-End ditags; MS-PET) enables PETs to be sequenced using high-throughput 454 sequencing, greatly increasing the amount of data that can be collected in each run.

  12. Gene discovery within the planctomycete division of the domain Bacteria using sequence tags from genomic DNA libraries

    PubMed Central

    Jenkins, Cheryl; Kedar, Vishram; Fuerst, John A

    2002-01-01

    Background: The planctomycetes comprise a distinct group of the domain Bacteria, forming a separate division by phylogenetic analysis. The organization of their cells into membrane-defined compartments including membrane-bounded nucleoids, their budding reproduction and complete absence of peptidoglycan distinguish them from most other Bacteria. A random sequencing approach was applied to the genomes of two planctomycete species, Gemmata obscuriglobus and Pirellula marina, to discover genes relevant to their cell biology and physiology. Results: Genes with a wide variety of functions were identified in G. obscuriglobus and Pi. marina, including those of metabolism and biosynthesis, transport, regulation, translation and DNA replication, consistent with established phenotypic characters for these species. The genes sequenced were predominantly homologous to those in members of other divisions of the Bacteria, but there were also matches with nuclear genomic genes of the domain Eukarya, genes that may have appeared in the planctomycetes via horizontal gene transfer events. Significant among these matches are those with two genes atypical for Bacteria and with significant cell-biology implications - integrin alpha-V and inter-alpha-trypsin inhibitor protein - with homologs in G. obscuriglobus and Pi. marina respectively. Conclusions: The random-sequence-tag approach applied here to G. obscuriglobus and Pi. marina is the first report of gene recovery and analysis from members of the planctomycetes using genome-based methods. Gene homologs identified were predominantly similar to genes of Bacteria, but some significant best matches to genes from Eukarya suggest that lateral gene transfer events between domains may have involved this division at some time during its evolution. PMID:12093378

  13. Improved statistical analysis of budding yeast TAG microarrays revealed by defined spike-in pools.

    PubMed

    Peyser, Brian D; Irizarry, Rafael A; Tiffany, Carol W; Chen, Ou; Yuan, Daniel S; Boeke, Jef D; Spencer, Forrest A

    2005-09-15

    Saccharomyces cerevisiae knockout collection TAG microarrays are an emergent platform for rapid, genome-wide functional characterization of yeast genes. TAG arrays report abundance of unique oligonucleotide 'TAG' sequences incorporated into each deletion mutation of the yeast knockout collection, allowing measurement of relative strain representation across experimental conditions for all knockout mutants simultaneously. One application of TAG arrays is to perform genome-wide synthetic lethality screens, known as synthetic lethality analyzed by microarray (SLAM). We designed a fully defined spike-in pool to resemble typical SLAM experiments and performed TAG microarray hybridizations. We describe a method for analyzing two-color array data to efficiently measure the differential knockout strain representation across two experimental conditions, and use the spike-in pool to show that the sensitivity and specificity of this method exceed typical current approaches.

  14. Rediscovering medicinal plants' potential with OMICS: microsatellite survey in expressed sequence tags of eleven traditional plants with potent antidiabetic properties.

    PubMed

    Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar; Talukdar, Anupam Das

    2014-05-01

    Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs has not been fully utilized in large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to the genetic
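
The microsatellite survey described in this record can be illustrated with a small regular-expression scan. This is only a sketch under stated assumptions: the study's actual tool and thresholds are not given in the abstract, so the function name `find_ssrs`, the restriction to di- and trinucleotide motifs, and the default of five repeats are all hypothetical choices.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Return sorted (start, motif, repeat_count) tuples for di- and
    trinucleotide repeats in `seq`.

    The motif lengths and `min_repeats` threshold are illustrative
    assumptions, not the parameters used in the study.
    """
    seq = seq.upper()
    hits = []
    for motif_len in (2, 3):
        # One motif copy followed by at least (min_repeats - 1) more copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # homopolymer run, i.e. a mononucleotide repeat
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Made-up EST fragment containing an (AG)6 and an (AAG)5 repeat
est = "TTGC" + "AG" * 6 + "CCT" + "AAG" * 5 + "GTA"
print(find_ssrs(est))
```

Counting AG together with CT, as the abstract does, would additionally require collapsing each motif with its reverse complement, which this sketch leaves out.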

  16. Functional dissection of the cis-acting sequences of the Arabidopsis transposable element Tag1 reveals dissimilar subterminal sequence and minimal spacing requirements for transposition.

    PubMed Central

    Liu, D; Mack, A; Wang, R; Galli, M; Belk, J; Ketpura, N I; Crawford, N M

    2001-01-01

    The Arabidopsis transposon Tag1 has an unusual subterminal structure containing four sets of dissimilar repeats: one set near the 5' end and three near the 3' end. To determine sequence requirements for efficient and regulated transposition, deletion derivatives of Tag1 were tested in Arabidopsis plants. These tests showed that a 98-bp 5' fragment containing the 22-bp inverted repeat and four copies of the AAACCX (X = C, A, G) 5' subterminal repeat is sufficient for transposition while a 52-bp 5' fragment containing only one copy of the subterminal repeat is not. At the 3' end, a 109-bp fragment containing four copies of the most 3' repeat TGACCC, but not a 55-bp fragment, which has no copies of the subterminal repeats, is sufficient for transposition. The 5' and 3' end fragments are not functionally interchangeable and require an internal spacer DNA of minimal length between 238 and 325 bp to be active. Elements with these minimal requirements show transposition rates and developmental control of excision that are comparable to the autonomous Tag1 element. Last, a DNA-binding activity that interacts with the 3' 109-bp fragment but not the 5' 98-bp fragment of Tag1 was found in nuclear extracts of Arabidopsis plants devoid of Tag1. PMID:11156999

  17. A Hybrid Distance Measure for Clustering Expressed Sequence Tags Originating from the Same Gene Family

    PubMed Central

    Ng, Keng-Hoong; Ho, Chin-Kuan; Phon-Amnuaisuk, Somnuk

    2012-01-01

    Background Clustering is a key step in the processing of Expressed Sequence Tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt the alignment-free distance measures, where they tend to yield acceptable clustering accuracies with reasonable computational time. Despite the fact that these clustering methods work satisfactorily on a majority of the EST datasets, they have a common weakness. They are prone to deliver unsatisfactory clustering results when dealing with ESTs from the genes derived from the same family. The root cause is the distance measures applied on them are not sensitive enough to separate these closely related genes. Methodology/Principal Findings We propose a hybrid distance measure that combines the global and local features extracted from ESTs, with the aim to address the clustering problem faced by ESTs derived from the same gene family. The clustering process is implemented using the DBSCAN algorithm. We test the hybrid distance measure on the ten EST datasets, and the clustering results are compared with the two alignment-free EST clustering tools, i.e. wcd and PEACE. The clustering results indicate that the proposed hybrid distance measure performs relatively better (in terms of clustering accuracy) than both EST clustering tools. Conclusions/Significance The clustering results provide support for the effectiveness of the proposed hybrid distance measure in solving the clustering problem for ESTs that originate from the same gene family. The improvement of clustering accuracies on the experimental datasets has supported the claim that the sensitivity of the hybrid distance measure is sufficient to solve the clustering problem. PMID:23071763
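
    The abstract above does not give the formula for its hybrid distance, so the following is only a minimal sketch of the general idea behind an alignment-free distance: two sequences are compared through their k-mer counts rather than through an alignment. The function names, the choice of k = 3, and the squared-difference score are illustrative assumptions, not the authors' measure.

```python
from collections import Counter

def kmer_counts(seq, k=3):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_distance(a, b, k=3):
    """Sum of squared differences between the k-mer counts of two sequences.

    Illustrative alignment-free distance only; it is NOT the hybrid
    measure proposed in the abstract above.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers that are absent from one sequence.
    return sum((ca[w] - cb[w]) ** 2 for w in set(ca) | set(cb))

# Hypothetical toy sequences
same = kmer_distance("ACGTACGT", "ACGTACGT")
different = kmer_distance("AAAA", "CCCC")
```

    Identical sequences score 0, and the score grows as the k-mer profiles diverge. A pairwise matrix of such scores is the kind of input a density-based clustering method such as DBSCAN can consume.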

  18. Chromosome-specific physical localisation of expressed sequence tag loci in Corchorus olitorius L.

    PubMed

    Joshi, A; Das, S K; Samanta, P; Paria, P; Sen, S K; Basu, A

    2014-11-01

    Jute (Corchorus spp.), as a natural fibre-producing species, ranks next only to cotton. Inadequate understanding of its genetic architecture is a major lacuna for genetic improvement of this crop in terms of yield and quality. Establishment of a physical map provides a genomic tool that helps in positional cloning of valuable genes. In this report, an attempt was initiated to study association and localisation of single copy expressed sequence tag (EST) loci in the genome of Corchorus olitorius. The chromosome-specific association of EST was determined based on the appearance of an extra signal for a single copy cDNA probe in mitotic interphase nuclei of specific trisomic(s) for fluorescence in situ hybridisation, and validated using a cDNA fragment of the 26S rRNA gene (600 bp) as molecular probe. The probe exhibited three signals in meiotic interphase nuclei of trisomic 5, instead of two as observed in diploids and other trisomics, indicating its association with chromosome 5. Subsequent hybridisation of the same probe on the pachytene chromosomes of diploids confirmed that 26S rRNA occupies the terminal end of the short arm of chromosome 5 in C. olitorius. Subsequently, chromosome-specific association of 63 single copy EST and their physical localisation were determined on chromosomes 2, 4, 5 and 7. The study describes chromosome-specific physical localisation of genes in jute. The approach used here could be a step towards construction of genome-wide physical maps for any recalcitrant plant species like jute. PMID:24628982

  20. Barcoded DNA-Tag Reporters for Multiplex Cis-Regulatory Analysis

    PubMed Central

    Nam, Jongmin; Davidson, Eric H.

    2012-01-01

    Cis-regulatory DNA sequences causally mediate patterns of gene expression, but efficient experimental analysis of these control systems has remained challenging. Here we develop a new version of “barcoded” DNA-tag reporters, “Nanotags”, that permit simultaneous quantitative analysis of up to 130 distinct cis-regulatory modules (CRMs). The activities of these reporters are measured in single experiments by the NanoString RNA counting method and other quantitative procedures. We demonstrate the efficiency of the Nanotag method by simultaneously measuring hourly temporal activities of 126 CRMs from 46 genes in the developing sea urchin embryo, otherwise a virtually impossible task. Nanotags are also used in gene perturbation experiments to reveal cis-regulatory responses of many CRMs at once. Nanotag methodology can be applied to many research areas, ranging from gene regulatory networks to functional and evolutionary genomics. PMID:22563420

  1. ISHAN: sequence homology analysis package.

    PubMed

    Shil, Pratip; Dudani, Niraj; Vidyasagar, Pandit B

    2006-01-01

    Sequence-based homology studies play an important role in evolutionary tracing and classification of proteins. Various methods are available to analyze biological sequence information. However, with the advent of the proteomics era, there is a growing demand for analysis of huge amounts of biological sequence information, and it has become necessary to have programs that provide speedy analysis. ISHAN has been developed as a homology analysis package, built on various sequence analysis tools, viz. FASTA, ALIGN, CLUSTALW, PHYLIP and CODONW (for DNA sequences). This JAVA application offers the user a choice of analysis tools. For testing, ISHAN was applied to perform phylogenetic analysis for sets of Caspase 3 DNA sequences and NF-kappaB p105 amino acid sequences. By integrating several tools it has made analysis much faster and reduced manual intervention. PMID:17274766

  2. Theoretical estimation of drag tag lengths for direct quantitative analysis of multiple miRNAs (DQAMmiR).

    PubMed

    Cherney, Leonid T; Krylov, Sergey N

    2013-01-21

    To better understand the regulatory roles of miRNA in biological functions and to use miRNA as molecular markers of diseases, we need to accurately measure amounts of multiple miRNAs in biological samples. Direct quantitative analysis of multiple miRNAs (DQAMmiR) has been recently developed by using a classical hybridization approach where miRNAs are hybridized with fluorescently labeled complementary DNA probes taken in excess, and the amounts of the hybrids and the unreacted probes are measured to calculate the amount of miRNAs. Capillary electrophoresis was used as an instrumental platform for analysis. The problem of separating the unreacted probes from the hybrids was solved by adding SSB to the run buffer. A more difficult problem of separating hybrids from each other was solved by attaching different drag tags to the probes. Biotin and a hairpin-forming extension on the probe were used as two drag tags in the proof-of-principle work. Making DQAMmiR a generic approach requires a generic solution for drag tags. Peptides have been suggested as drag tags for long oligonucleotides in DNA sequencing by electrophoresis. Here we theoretically consider short peptides of different lengths as drag tags for DQAMmiR. We find analytical equations that allow us to estimate mobilities of RNA-DNA hybrids with peptide drag tags of different lengths. Our calculations suggest that the mobility shifts required for DQAMmiR can be achieved with the length of peptide chains in the ranges of 5-20 residues for five miRNAs and 2-47 residues for nine miRNAs. Peptides of these lengths can be feasibly synthesized with good yield and purity. The results of this theoretical study will guide the design and production of hybridization probes for DQAMmiR.

  3. Comparative DNA Sequence Analysis of Wheat and Rice Genomes

    PubMed Central

    Sorrells, Mark E.; La Rota, Mauricio; Bermudez-Kandianis, Catherine E.; Greene, Robert A.; Kantety, Ramesh; Munkvold, Jesse D.; Miftahudin; Mahmoud, Ahmed; Ma, Xuefeng; Gustafson, Perry J.; Qi, Lili L.; Echalier, Benjamin; Gill, Bikram S.; Matthews, David E.; Lazo, Gerard R.; Chao, Shiaoman; Anderson, Olin D.; Edwards, Hugh; Linkiewicz, Anna M.; Dubcovsky, Jorge; Akhunov, Eduard D.; Dvorak, Jan; Zhang, Deshui; Nguyen, Henry T.; Peng, Junhua; Lapitan, Nora L.V.; Gonzalez-Hernandez, Jose L.; Anderson, James A.; Hossain, Khwaja; Kalavacharla, Venu; Kianian, Shahryar F.; Choi, Dong-Woog; Close, Timothy J.; Dilbirligi, Muharrem; Gill, Kulvinder S.; Steber, Camille; Walker-Simmons, Mary K.; McGuire, Patrick E.; Qualset, Calvin O.

    2003-01-01

    The use of DNA sequence-based comparative genomics for evolutionary studies and for transferring information from model species to crop species has revolutionized molecular genetics and crop improvement strategies. This study compared 4485 expressed sequence tags (ESTs) that were physically mapped in wheat chromosome bins, to the public rice genome sequence data from 2251 ordered BAC/PAC clones using BLAST. A rice genome view of homologous wheat genome locations based on comparative sequence analysis revealed numerous chromosomal rearrangements that will significantly complicate the use of rice as a model for cross-species transfer of information in nonconserved regions. PMID:12902377

  4. A direct method for regiospecific analysis of TAG using alpha-MAG.

    PubMed

    Turon, F; Bachain, P; Caro, Y; Pina, M; Graille, J

    2002-08-01

    An analytical procedure was developed for regiodistribution analysis of TAG using alpha-MAG prepared by an ethyl magnesium bromide deacylation. In the present communication, the deacylation procedure is shown to lead to representative alpha-MAG, allowing the composition of the native TAG in the alpha-position to be determined directly. The composition in the beta-position can then be estimated from the composition of the alpha-MAG and TAG according to the formula 3 x TAG - 2 x alpha-MAG. The estimates are superior to those obtained using the alpha,beta-DAG and Brockerhoff calculations as they come closer to the theoretical value and have smaller SD. The present procedure, first demonstrated on a synthetic TAG, was then successfully applied to the analysis of borage oil, milkfat, and tuna oil. PMID:12371754
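
    The formula quoted above follows from a TAG having two alpha positions and one beta position, so that TAG = (2 x alpha + beta) / 3 for each fatty acid. A minimal sketch of the calculation is given below; the function name and the mol% values are hypothetical and are not data from the paper.

```python
def beta_composition(tag_pct, alpha_mag_pct):
    """Estimate the beta-position composition as 3 x TAG - 2 x alpha-MAG.

    Both arguments map a fatty acid label to its mol% in the whole TAG
    and in the alpha-MAG fraction, respectively.
    """
    return {fa: 3 * tag_pct[fa] - 2 * alpha_mag_pct[fa] for fa in tag_pct}

# Hypothetical mol% values, chosen only to illustrate the arithmetic
tag = {"16:0": 30.0, "18:1": 50.0, "18:2": 20.0}
alpha_mag = {"16:0": 40.0, "18:1": 45.0, "18:2": 15.0}

beta = beta_composition(tag, alpha_mag)
for fatty_acid, pct in beta.items():
    print(f"{fatty_acid}: {pct:.1f} mol% in the beta-position")
```

    With these made-up inputs the estimated beta-position values still sum to 100 mol%, which is a quick sanity check on any real data set.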

  5. Identification, Characterization, and Mapping of Expressed Sequence Tags from an Embryonic Zebrafish Heart cDNA Library

    PubMed Central

    Ton, Christopher; Hwang, David M.; Dempsey, Adam A.; Tang, Hong-Chang; Yoon, Jennifer; Lim, Mindy; Mably, John D.; Fishman, Mark C.; Liew, Choong-Chin

    2000-01-01

    The generation of expressed sequence tags (ESTs) has proven to be a rapid and economical approach by which to identify and characterize expressed genes. We generated 5102 ESTs from a 3-d-old embryonic zebrafish heart cDNA library. Of these, 57.6% matched to known genes, 14.2% matched only to other ESTs, and 27.8% showed no match to any ESTs or known genes. Clustering of all ESTs identified 359 unique clusters comprising 1771 ESTs, whereas the remaining 3331 ESTs did not cluster. This estimates the number of unique genes identified in the data set to be approximately 3690. A total of 1242 unique known genes were used to analyze the gene expression patterns in the zebrafish embryonic heart. These were categorized into seven categories on the basis of gene function. The largest class of genes represented those involved in gene/protein expression (25.9% of known transcripts). This class was followed by genes involved in metabolism (18.7%), cell structure/motility (16.4%), cell signaling and communication (9.6%), cell/organism defense (7.1%), and cell division (4.4%). Unclassified genes constituted the remaining 17.91%. Radiation hybrid mapping was performed for 102 ESTs and comparison of map positions between zebrafish and human identified new synteny groups. Continued comparative analysis will be useful in defining the boundaries of conserved chromosome segments between zebrafish and humans, which will facilitate the transfer of genetic information between the two organisms and improve our understanding of vertebrate evolution. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. BE693120–BE693210 and BE704450.] PMID:11116087
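
    The estimate of about 3690 unique genes can be reproduced from the counts in the abstract, assuming that each cluster and each unclustered EST (singleton) is counted as one gene. The function name below is illustrative.

```python
def estimate_unique_genes(total_ests, clustered_ests, clusters):
    """Return (singletons, unique gene estimate) from EST clustering counts."""
    singletons = total_ests - clustered_ests
    return singletons, clusters + singletons

# Counts taken from the abstract above
singletons, unique_genes = estimate_unique_genes(
    total_ests=5102, clustered_ests=1771, clusters=359
)
print(singletons, unique_genes)
```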

  6. Exploratory analysis of the spatio-temporal deformation of the myocardium during systole from tagged MRI.

    PubMed

    Clarysse, Patrick; Han, Meimei; Croisille, Pierre; Magnin, Isabelle E

    2002-11-01

    Myocardial contractile function is, with perfusion, one of the main affected factors in ischemic heart diseases. In this paper, we propose an original framework based on functional data analysis for the quantitative study of spatio-temporal parameters related to the myocardial contraction mechanics. The mechanical strains in the left-ventricular (LV) myocardium are computed from tagged magnetic resonance imaging cardiac sequences. A statistical functional model of the normal contractile function of the LV is built from the study of eight examinations on healthy subjects. We show that it is possible to detect abnormal strain patterns relative to this model, by generating distance maps at rest and under pharmacological stress. We demonstrate the consistency of the results for the circumferential deformation parameter on healthy and pathological data sets. PMID:12450363

  7. Identification of Disulfide Bonds in Protein Proteolytic Degradation Products Using de Novo-Protein Unique Sequence Tags Approach

    SciTech Connect

    Shen, Yufeng; Tolic, Nikola; Purvine, Samuel O.; Smith, Richard D.

    2010-08-01

    Disulfide bonds are a form of posttranslational modification that often determines protein structure(s) and function(s). In this work, we report a mass spectrometry method for identification of disulfides in degradation products of proteins, and specifically endogenous peptides in the human blood plasma peptidome. LC-Fourier transform tandem mass spectrometry (FT MS/MS) was used for acquiring mass spectra that were de novo sequenced and then searched against the IPI human protein database. Through the use of unique sequence tags (UStags) we unambiguously correlated the spectra to specific database proteins. Examination of the UStags’ prefix and/or suffix sequences that contain cysteine(s) in conjunction with sequences of the UStags-specified database proteins is shown to enable the unambiguous determination of disulfide bonds. Using this method, we identified the intermolecular and intramolecular disulfides in human blood plasma peptidome peptides that have molecular weights of up to ~10 kDa.

  8. Identification of disulfide bonds in protein proteolytic degradation products using de novo-protein unique sequence tags approach.

    PubMed

    Shen, Yufeng; Tolić, Nikola; Purvine, Samuel O; Smith, Richard D

    2010-08-01

    Disulfide bonds are a form of post-translational modification that often determines protein structure(s) and function(s). In this work, we report a mass spectrometry method for identification of disulfides in degradation products of proteins, specifically endogenous peptides in the human blood plasma peptidome. LC-Fourier transform tandem mass spectrometry (FT MS/MS) was used for acquiring mass spectra that were de novo sequenced and then searched against the IPI human protein database. Through the use of unique sequence tags (UStags), we unambiguously correlated the spectra to specific database proteins. Examination of the UStags' prefix and/or suffix sequences that contain cysteine(s) in conjunction with sequences of the UStags-specified database proteins is shown to enable the unambiguous determination of disulfide bonds. Using this method, we identified the intermolecular and intramolecular disulfides in human blood plasma peptidome peptides that have molecular weights of up to approximately 10 kDa. PMID:20590115

  9. Twin Mitochondrial Sequence Analysis.

    PubMed

    Bouhlal, Yosr; Martinez, Selena; Gong, Henry; Dumas, Kevin; Shieh, Joseph T C

    2013-09-01

    When applying genome-wide sequencing technologies to disease investigation, it is increasingly important to resolve sequence variation in regions of the genome that may have homologous sequences. The human mitochondrial genome challenges interpretation given the potential for heteroplasmy, somatic variation, and homologous nuclear mitochondrial sequences (numts). Identical twins share the same mitochondrial DNA (mtDNA) from early life, but whether the mitochondrial sequence remains similar is unclear. We compared an adult monozygotic twin pair using high-throughput sequencing and evaluated variants with primer extension and mitochondrial pre-enrichment. Thirty-seven variants were shared between the twin individuals, and the variants were verified on the original genomic DNA. These studies support highly identical genetic sequence in this case. Certain low-level variant calls were of high quality and homology to the mitochondrial DNA, and they were further evaluated. When we assessed calls in pre-enriched mitochondrial DNA templates, we found that these may represent numts, which can be differentiated from mtDNA variation. We conclude that twin identity extends to mitochondrial DNA, and it is critical to differentiate between numts and mtDNA in genome sequencing, particularly since significant heteroplasmy could influence genome interpretation. Further studies on mtDNA and numts will aid in understanding how variation occurs and persists. PMID:24040623

  10. Development and characterization of 1,827 expressed sequence tag-derived simple sequence repeat markers for ramie (Boehmeria nivea L. Gaud).

    PubMed

    Liu, Touming; Zhu, Siyuan; Fu, Lili; Tang, Qingming; Yu, Yongting; Chen, Ping; Luan, Mingbao; Wang, Changbiao; Tang, Shouwei

    2013-01-01

    Ramie (Boehmeria nivea L. Gaud) is one of the most important natural fiber crops, and improvement of fiber yield and quality is the main goal in efforts to breed superior cultivars. However, efforts aimed at enhancing the understanding of ramie genetics and developing more effective breeding strategies have been hampered by the shortage of simple sequence repeat (SSR) markers. In our previous study, we had assembled de novo 43,990 expressed sequence tags (ESTs). In the present study, we searched these previously assembled ESTs for SSRs and identified 1,685 ESTs (3.83%) containing 1,878 SSRs. Next, we designed 1,827 primer pairs complementary to regions flanking these SSRs, and these regions were designated as SSR markers. Among these markers, dinucleotide and trinucleotide repeat motifs were the most abundant types (36.4% and 36.3%, respectively), whereas tetranucleotide, pentanucleotide, and hexanucleotide motifs represented <10% of the markers. The motif AG/CT was the most abundant, accounting for 28.74% of the markers. One hundred EST-SSR markers (97 SSRs located in genes encoding transcription factors and 3 SSRs in genes encoding cellulose synthases) were amplified using polymerase chain reaction for detecting 24 ramie varieties. Of these 100 markers, 98 markers were successfully amplified and 81 markers were polymorphic, with 2-6 alleles among the 24 varieties. Analysis of the genetic diversity of all 24 varieties revealed similarity coefficients that ranged from 0.51 to 0.80. The EST-SSRs developed in this study represent the first large-scale development of SSR markers for ramie. These SSR markers could be used for development of genetic and physical maps, quantitative trait loci mapping, genetic diversity studies, association mapping, and cultivar fingerprinting.

  11. Existence of microsatellites in expressed sequence tags of common carp ( Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Hu, Jingjie; Wang, Xiaolong; Hu, Xiaoli; Bao, Zhenmin

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.

  12. Existence of microsatellites in expressed sequence tags of common carp ( Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Jingjie, Hu; Xiaolong, Wang; Xiaoli, Hu; Zhenmin, Bao

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.
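
    A search for 2-6 nucleotide microsatellites of the kind described above can be sketched with a regular expression that looks for a short motif repeated in tandem. The abstract does not state the minimum repeat number or the software used, so the threshold of five repeats, the function name, and the toy sequence below are all assumptions made for illustration.

```python
import re

def find_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for tandem repeats of 2-6 nt motifs.

    min_repeats is an assumed threshold, not one taken from the abstract.
    """
    # A lazily matched 2-6 nt motif followed by at least min_repeats - 1 copies.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip mononucleotide runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

# Hypothetical EST fragment containing an (AC)6 and an (AAG)5 repeat
est = "GGATCG" + "AC" * 6 + "TTC" + "AAG" * 5 + "CC"
ssrs = find_ssrs(est)
print(ssrs)
```

    Reverse-complement motifs such as CA/TG are normally pooled into one class when repeat types are tallied; that grouping step is left out of this sketch.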

  13. The first set of expressed sequence tags (EST) from the medicinal mushroom Agaricus subrufescens delivers resource for gene discovery and marker development.

    PubMed

    Foulongne-Oriol, Marie; Lapalu, Nicolas; Férandon, Cyril; Spataro, Cathy; Ferrer, Nathalie; Amselem, Joelle; Savoie, Jean-Michel

    2014-09-01

    Agaricus subrufescens is one of the most important culinary-medicinal cultivable mushrooms with potentially high-added-value products and extended agronomical valorization. The development of A. subrufescens-related technologies is hampered by, among others, the lack of suitable molecular tools. Thus, this mushroom is considered as a genomic orphan species with a very limited number of available molecular markers or sequences. To fill this gap, this study reports the generation and analysis of the first set of expressed sequence tags (EST) for A. subrufescens. cDNA fragments obtained from young sporophores (SP) and vegetative mycelium in liquid culture (CL) were sequenced using 454 pyrosequencing technology. After assembly process, 4,989 and 5,125 sequences were obtained in SP and CL libraries, respectively. About 87% of the EST had significant similarity with Agaricus bisporus-predicted proteins, and 79% correspond to known proteins. Functional categorization according to Gene Ontology could be assigned to 49% of the sequences. Some gene families potentially involved in bioactive compound biosynthesis could be identified. A total of 232 simple sequence repeats (SSRs) were identified, and a set of 40 EST-SSR polymorphic markers were successfully developed. This EST dataset provides a new resource for gene discovery and molecular marker development. It constitutes a solid basis for further genetic and genomic studies in A. subrufescens.

  14. Gene expression profiling of coelomic cells and discovery of immune-related genes in the earthworm, Eisenia andrei, using expressed sequence tags.

    PubMed

    Tak, Eun Sik; Cho, Sung-Jin; Park, Soon Cheol

    2015-01-01

    The coelomic cells of the earthworm consist of leukocytes, chloragocytes, and coelomocytes, which play an important role in innate immunity reactions. To gain insight into the expression profiles of coelomic cells of the earthworm, Eisenia andrei, we analyzed 1151 expressed sequence tags (ESTs) derived from the cDNA library of the coelomic cells. Among the 1151 ESTs analyzed, 493 ESTs (42.8%) showed a significant similarity to known genes and represented 164 unique genes, of which 93 ESTs were singletons and 71 ESTs manifested as two or more ESTs. From the 164 unique genes sequenced, we found 24 immune-related and cell defense genes. Furthermore, real-time PCR analysis showed that levels of lysenin-related protein mRNA in coelomic cells of E. andrei were upregulated after the injection of Bacillus subtilis bacteria. This EST data set provides a valuable resource for future research on the earthworm immune system.

  16. Miniaturised wireless smart tag for optical chemical analysis applications.

    PubMed

    Steinberg, Matthew D; Kassal, Petar; Tkalčec, Biserka; Murković Steinberg, Ivana

    2014-01-01

    A novel miniaturised photometer has been developed as an ultra-portable and mobile analytical chemical instrument. The low-cost photometer presents a paradigm shift in mobile chemical sensor instrumentation because it is built around a contactless smart card format. The photometer tag is based on the radio-frequency identification (RFID) smart card system, which provides short-range wireless data and power transfer between the photometer and a proximal reader, and which allows the reader to also energise the photometer by near field electromagnetic induction. RFID is set to become a key enabling technology of the Internet-of-Things (IoT), hence devices such as the photometer described here will enable numerous mobile, wearable and vanguard chemical sensing applications in the emerging connected world. In the work presented here, we demonstrate the characterisation of a low-power RFID wireless sensor tag with an LED/photodiode-based photometric input. The performance of the wireless photometer has been tested through two different model analytical applications. The first is photometry in solution, where colour intensity as a function of dye concentration was measured. The second is an ion-selective optode system in which potassium ion concentrations were determined by using previously well characterised bulk optode membranes. The analytical performance of the wireless photometer smart tag is clearly demonstrated by these optical absorption-based analytical experiments, with excellent data agreement to a reference laboratory instrument. PMID:24274311

  17. Analytic signal phase-based myocardial motion estimation in tagged MRI sequences by a bilinear model and motion compensation.

    PubMed

    Wang, Liang; Basarab, Adrian; Girard, Patrick R; Croisille, Pierre; Clarysse, Patrick; Delachartre, Philippe

    2015-08-01

    Different mathematical tools, such as multidimensional analytic signals, allow for the calculation of 2D spatial phases of real-value images. The motion estimation method proposed in this paper is based on two spatial phases of the 2D analytic signal applied to cardiac sequences. By combining the information of these phases issued from analytic signals of two successive frames, we propose an analytical estimator for 2D local displacements. To improve the accuracy of the motion estimation, a local bilinear deformation model is used within an iterative estimation scheme. The main advantages of our method are: (1) The phase-based method allows the displacement to be estimated with subpixel accuracy and is robust to image intensity variation in time; (2) Preliminary filtering is not required due to the bilinear model. The proposed algorithm, integrating phase-based optical flow motion estimation and the combination of global motion compensation with local bilinear transform, allows spatio-temporal cardiac motion analysis, e.g. strain and dense trajectory estimation over the cardiac cycle. Results from 7 realistic simulated tagged magnetic resonance imaging (MRI) sequences show that our method is more accurate than a state-of-the-art method for cardiac motion analysis and another differential approach from the literature. The motion estimation errors (end point error) of the proposed method are reduced by about 33% compared with those of the two methods. In our work, the frame-to-frame displacements are further accumulated in time, to allow for the calculation of myocardial Lagrangian cardiac strains and point trajectories. Indeed, from the estimated trajectories in time on 11 in vivo data sets (9 patients and 2 healthy volunteers), the myocardial point trajectories belonging to pathological regions are clearly reduced in magnitude compared with the ones from normal regions. Myocardial point trajectories, estimated from our phase-based analytic

  19. DSAP: deep-sequencing small RNA analysis pipeline.

    PubMed

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
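
The first two DSAP steps (cleanup and clustering of a tab-delimited tag/count file) can be illustrated with a minimal Python sketch. This is not DSAP's own code: the adapter sequence, the 8-nt adapter prefix match, the minimum tag length and the function names are all assumptions made for the example.

```python
from collections import Counter

# Assumed 3' adapter, for illustration only; use the adapter from your own library prep.
ADAPTER = "TCGTATGCCGTCTTCTGCTTG"


def clean_tag(seq, adapter=ADAPTER, min_len=15):
    """Trim the adapter if present; reject homopolymer (A/T/C/G/N) or too-short tags."""
    seq = seq.upper()
    pos = seq.find(adapter[:8])  # hypothetical rule: match on an 8-nt adapter prefix
    if pos != -1:
        seq = seq[:pos]
    if len(seq) < min_len or len(set(seq)) == 1:
        return None
    return seq


def cluster_tags(lines):
    """Group cleaned tags into unique sequences and sum their copy numbers."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq, copies = line.split("\t")
        cleaned = clean_tag(seq)
        if cleaned is not None:
            counts[cleaned] += int(copies)
    return counts
```

Two reads that differ only by a trailing adapter collapse into one cluster whose count is the sum of both, which is the behaviour the clustering step describes. The matching steps against Rfam and miRBase are not sketched here.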

  20. Exploring the Structure of Library and Information Science Web Space Based on Multivariate Analysis of Social Tags

    ERIC Educational Resources Information Center

    Joo, Soohyung; Kipp, Margaret E. I.

    2015-01-01

    Introduction: This study examines the structure of Web space in the field of library and information science using multivariate analysis of social tags from the Website, Delicious.com. A few studies have examined mathematical modelling of tags, mainly examining tagging in terms of tripartite graphs, pattern tracing and descriptive statistics. This…

  1. Sequence analysis of diacylglycerol acyltransferases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Diacylglycerol acyltransferases (DGATs) catalyze the final step of triacylglycerol (TAG) biosynthesis in eukaryotes. DGATs esterify sn-1,2-diacylglycerol with a long-chain fatty acyl-CoA. Plants and animals deficient in DGATs accumulate less TAG and over-expression of DGATs increases TAG. DGAT knock...

  2. Expressed sequence tags and molecular cloning and characterization of gene encoding pinoresinol/lariciresinol reductase from Podophyllum hexandrum.

    PubMed

    Wankhede, Dhammaprakash Pandhari; Biswas, Dipul Kumar; Rajkumar, Subramani; Sinha, Alok Krishna

    2013-12-01

    Podophyllotoxin, an aryltetralin lignan, is the source of the important anticancer drugs etoposide, teniposide, and etopophos. Roots/rhizome of Podophyllum hexandrum form one of the most important sources of podophyllotoxin. In order to understand genes involved in podophyllotoxin biosynthesis, two suppression subtractive hybridization libraries were synthesized, one each from root/rhizome and leaves using high and low podophyllotoxin-producing plants of P. hexandrum. Sequencing of clones identified a total of 1,141 Expressed Sequence Tags (ESTs) resulting in 354 unique ESTs. Several unique ESTs showed sequence similarity to the genes involved in metabolism, stress/defense responses, and signalling pathways. A few ESTs also showed high sequence similarity with genes which were shown to be involved in podophyllotoxin biosynthesis in other plant species such as pinoresinol/lariciresinol reductase. A full length coding sequence of pinoresinol/lariciresinol reductase (PLR) has been cloned from P. hexandrum which was found to encode a protein of 311 amino acids and to show sequence similarity with PLR from Forsythia intermedia and Linum spp. Spatial and stress-inducible expression patterns of PhPLR and other known genes of podophyllotoxin biosynthesis, secoisolariciresinol dehydrogenase (PhSDH), and dirigent protein oxidase (PhDPO) have been studied. All three genes showed wounding- and methyl jasmonate-inducible expression patterns. The present work would form a basis for further studies to understand genomics of podophyllotoxin biosynthesis in P. hexandrum.

  3. SMTAG: A code for the sequential analysis of multiple tag gas releases

    SciTech Connect

    Schmittroth, F.A.

    1989-01-01

    The code SMTAG (Sequential and Multiple TAG Analysis) is used to identify breached reactor components that have released tag gas to the reactor cover gas. Gas tags have been used (Figg et al. 1980 and Lambert 1978) to locate failed fuel pins in both the Fast Flux Test Facility (FFTF) and in the Experimental Breeder Reactor (EBR-2). In the FFTF, other reactor components have been tagged as well, including control assemblies and materials test capsules. The SMTAG code has been used extensively in gas tag analysis. This has resulted in several code enhancements and has been beneficial in learning to use the code effectively. Supporting information for each analysis is provided that is valuable in ensuring that a correct identification is obtained. The relative amounts of various components in a mixed sample are obtained, including the amount of residual gas from previous leakers, fission-product release-to-birth factors, and xenon-hangup. Statistical tests and other comparisons can flag bad or inconsistent measurements or problems in the supporting nuclear data base. The formalism for the code is reviewed here in Section 2.0. Details of the code (including descriptions of the main subroutines) are given in Section 3.0. The use of the code is documented in Section 4.0, along with a discussion of a realistic example. The SMTAG code requires a data base that includes the isotopic amounts of each tag properly corrected for burnup, depletion, and production.

  4. Modified PCR methods for 3' end amplification from serial analysis of gene expression (SAGE) tags.

    PubMed

    Xu, Wang-Jie; Wang, Zhao-Xia; Qiao, Zhong-Dong

    2009-05-01

    Serial analysis of gene expression (SAGE) is a powerful technique to study gene expression at the genome level. However, the shortness of SAGE tags hinders further study of SAGE library data, thus limiting extensive application of the SAGE method in gene expression studies. This problem can be solved by extension of the SAGE tags to 3' cDNAs. Therefore, several methods based on PCR have been developed to generate a longer 3' cDNA fragment corresponding to a SAGE tag. The list of modified methods is extensive, and includes rapid RT-PCR analysis of unknown SAGE tags (RAST-PCR), generation of longer cDNA fragments from SAGE tags for gene identification (GLGI), a high-throughput GLGI procedure, reverse SAGE (rSAGE), two-step analysis of unknown SAGE tags (TSAT-PCR), etc. These procedures are constantly being updated because they have characteristics and advantages that can be shared. Development of these methods has promoted the widespread use of the SAGE technique, and has accelerated the speed of studies of large-scale gene expression.

  5. Micro- and minisatellite-expressed sequence tag (EST) markers discriminate between populations of Rhipicephalus appendiculatus.

    PubMed

    Kanduma, Esther G; Mwacharo, Joram M; Sunter, Jack D; Nzuki, Inosters; Mwaura, Stephen; Kinyanjui, Peter W; Kibe, Michael; Heyne, Heloise; Hanotte, Olivier; Skilton, Robert A; Bishop, Richard P

    2012-06-01

    Biological differences, including vector competence for the protozoan parasite Theileria parva have been reported among populations of Rhipicephalus appendiculatus (Acari: Ixodidae) from different geographic regions. However, the genetic diversity and population structure of this important tick vector remain unknown due to the absence of appropriate genetic markers. Here, we describe the development and evaluation of a panel of EST micro- and minisatellite markers to characterize the genetic diversity within and between populations of R. appendiculatus and other rhipicephaline species. Sixty-six micro- and minisatellite markers were identified through analysis of the R. appendiculatus Gene Index (RaGI) EST database and selected bacterial artificial chromosome (BAC) sequences. These were used to genotype 979 individual ticks from 10 field populations, 10 laboratory-bred stocks, and 5 additional Rhipicephalus species. Twenty-nine markers were polymorphic and therefore informative for genetic studies while 6 were monomorphic. Primers designed from the remaining 31 loci did not reliably generate amplicons. The 29 polymorphic markers discriminated populations of R. appendiculatus and also 4 other Rhipicephalus species, but not R. zambeziensis. The percentage Principal Component Analysis (PCA) implemented using Multiple Co-inertia Analysis (MCoA) clustered populations of R. appendiculatus into 2 groups. Individual markers however differed in their ability to generate the reference typology using the MCoA approach. This indicates that different panels of markers may be required for different applications. The 29 informative polymorphic micro- and minisatellite markers are the first available tools for the analysis of the phylogeography and population genetics of R. appendiculatus. PMID:22789728

  6. SNP discovery using Paired-End RAD-tag sequencing on pooled genomic DNA of Sisymbrium austriacum (Brassicaceae).

    PubMed

    Vandepitte, K; Honnay, O; Mergeay, J; Breyne, P; Roldán-Ruiz, I; De Meyer, T

    2013-03-01

    Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non-model organisms is hampered by the scarcity of cost-effective approaches to uncover genome-wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD-PE (Restriction site Associated DNA Paired-End sequencing) approach. RAD tags were generated from the PstI-digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired-end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD-PE as an inexpensive genome-wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.
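
The assembly statistic quoted in this abstract, N50, is the contig length at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly. A minimal Python sketch of that standard definition follows; the function name is hypothetical and the contig lengths in the usage note are made up, not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the cumulative length first reaches half the total.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")
```

For five toy contigs of 100, 200, 300, 400 and 500 bp the total is 1500 bp, and the cumulative length passes 750 bp at the 400 bp contig, so `n50([100, 200, 300, 400, 500])` returns 400.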

  7. Precipitation recycling in West Africa - regional modeling, evaporation tagging and atmospheric water budget analysis

    NASA Astrophysics Data System (ADS)

    Arnault, Joel; Kunstmann, Harald; Knoche, Hans-Richard

    2015-04-01

    Many numerical studies have shown that the West African monsoon is highly sensitive to the state of the land surface. It is however questionable to what extent a local change of land surface properties would affect the local climate, especially with respect to precipitation. This issue is traditionally addressed with the concept of precipitation recycling, defined as the contribution of local surface evaporation to local precipitation. For this study the West African monsoon has been simulated with the Weather Research and Forecasting (WRF) model using explicit convection, for the domain (1°S-21°N, 18°W-14°E) at a spatial resolution of 10 km, for the period January-October 2013, and using ERA-Interim reanalyses as driving data. This WRF configuration has been selected for its ability to simulate monthly precipitation amounts and daily histograms close to TRMM (Tropical Rainfall Measuring Mission) data. In order to investigate precipitation recycling in this WRF simulation, surface evaporation tagging has been implemented in the WRF source code as well as the budget of total and tagged atmospheric water. Surface evaporation tagging consists in duplicating all water species and the respective prognostic equations in the source code. Then, tagged water species are set to zero at the lateral boundaries of the simulated domain (no inflow of tagged water vapor), and tagged surface evaporation is considered only in a specified region. All the source terms of the prognostic equations of total and tagged water species are finally saved in the outputs for the budget analysis. This allows quantifying the respective contribution of total and tagged atmospheric water to atmospheric precipitation processes. The WRF simulation with surface evaporation tagging and budgets has been conducted twice, first with a 100 km2 tagged region (11-12°N, 1-2°W), and second with a 1000 km2 tagged region (7-16°N, 6°W-3°E). In this presentation we will investigate hydro

  8. Application of Cydia pomonella expressed sequence tags: Identification and expression of three general odorant binding proteins in codling moth

    PubMed Central

    Garczynski, Stephen F.; Coates, Brad S.; Unruh, Thomas R.; Schaeffer, Scott; Jiwan, Derick; Koepke, Tyson; Dhingra, Amit

    2014-01-01

    The codling moth, Cydia pomonella, is one of the most important pests of pome fruits in the world, yet the molecular genetics and the physiology of this insect remain poorly understood. A combined assembly of 8 341 expressed sequence tags was generated from Roche 454 GS-FLX sequencing of eight tissue-specific cDNA libraries. Putative chemosensory proteins (12) and odorant binding proteins (OBPs) (18) were annotated, which included three putative general OBP (GOBP), one more than typically reported for other Lepidoptera. To further characterize CpomGOBPs, we cloned cDNA copies of their transcripts and determined their expression patterns in various tissues. Cloning and sequencing of the 698 nt transcript for CpomGOBP1 resulted in the prediction of a 163 amino acid coding region, and subsequent RT-PCR indicated that the transcripts were mainly expressed in antennae and mouthparts. The 1 289 nt (160 amino acid) CpomGOBP2 and the novel 702 nt (169 amino acid) CpomGOBP3 transcripts are mainly expressed in antennae, mouthparts, and female abdomen tips. These results indicate that next generation sequencing is useful for the identification of novel transcripts of interest, and that codling moth expresses a transcript encoding for a new member of the GOBP subfamily. PMID:23956229

  9. Probing essential oil biosynthesis and secretion by functional evaluation of expressed sequence tags from mint glandular trichomes.

    PubMed

    Lange, B M; Wildung, M R; Stauber, E J; Sanchez, C; Pouchnik, D; Croteau, R

    2000-03-14

    Functional genomics approaches, which use combined computational and expression-based analyses of large amounts of sequence information, are emerging as powerful tools to accelerate the comprehensive understanding of cellular metabolism in specialized tissues and whole organisms. As part of an ongoing effort to identify genes of essential oil (monoterpene) biosynthesis, we have obtained sequence information from 1,316 randomly selected cDNA clones, or expressed sequence tags (ESTs), from a peppermint (Mentha x piperita) oil gland secretory cell cDNA library. After bioinformatic selection, candidate genes putatively involved in essential oil biosynthesis and secretion have been subcloned into suitable expression vectors for functional evaluation in Escherichia coli. On the basis of published and preliminary data on the functional properties of these clones, it is estimated that the ESTs involved in essential oil metabolism represent about 25% of the described sequences. An additional 7% of the recognized genes code for proteins involved in transport processes, and a subset of these is likely involved in the secretion of essential oil terpenes from the site of synthesis to the storage cavity of the oil glands. The integrated approaches reported here represent an essential step toward the development of a metabolic map of oil glands and provide a valuable resource for defining molecular targets for the genetic engineering of essential oil formation. PMID:10717007

  10. Image analysis for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Palaniappan, Kannappan; Huang, Thomas S.

    1991-07-01

    There is a great deal of interest in automating the process of DNA (deoxyribonucleic acid) sequencing to support the analysis of genomic DNA such as the Human and Mouse Genome projects. In one class of gel-based sequencing protocols, autoradiograph images are generated in the final step and usually require manual interpretation to reconstruct the DNA sequence represented by the image. The need to handle a large volume of sequence information necessitates automation of the manual autoradiograph reading step through image analysis in order to reduce the length of time required to obtain sequence data and reduce transcription errors. Various adaptive image enhancement, segmentation and alignment methods were applied to autoradiograph images. The methods are adaptive to the local characteristics of the image such as noise, background signal, or presence of edges. Once the two-dimensional data is converted to a set of aligned one-dimensional profiles, waveform analysis is used to determine the location of each band, which represents one nucleotide in the sequence. Different classification strategies including a rule-based approach are investigated to map the profile signals, augmented with the original two-dimensional image data as necessary, to textual DNA sequence information.
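
As a rough illustration of the band-location step on a one-dimensional lane profile, the Python sketch below marks local maxima above a fixed height. It is a deliberately simple stand-in, not the adaptive waveform analysis of the paper; the function name, the fixed threshold and the plateau rule are assumptions.

```python
def find_bands(profile, min_height):
    """Return indices of local maxima in a 1D intensity profile.

    A sample counts as a band when it reaches `min_height`, is strictly
    higher than its left neighbour and at least as high as its right
    neighbour, so a flat-topped peak is reported once, at its left edge.
    """
    bands = []
    for i in range(1, len(profile) - 1):
        value = profile[i]
        if value >= min_height and value > profile[i - 1] and value >= profile[i + 1]:
            bands.append(i)
    return bands
```

With the toy profile `[0, 1, 5, 1, 0, 2, 8, 8, 1, 0]` and `min_height=3`, the function reports bands at indices 2 and 6. Real autoradiograph profiles would need the noise and background handling the abstract mentions before a rule this simple is usable.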

  11. Transient Analysis Generator /TAG/ simulates behavior of large class of electrical networks

    NASA Technical Reports Server (NTRS)

    Thomas, W. J.

    1967-01-01

    Transient Analysis Generator program simulates both transient and dc steady-state behavior of a large class of electrical networks. It generates a special analysis program for each circuit described in an easily understood and manipulated programming language. A generator or preprocessor and a simulation system make up the TAG system.

  12. In silico identification of miRNAs and their targets from the expressed sequence tags of Raphanus sativus

    PubMed Central

    Muvva, Charuvaka; Tewari, Lata; Aruna, Kasoju; Ranjit, Pabbati; MD, Zahoorullah S; MD, K A Matheen; Veeramachaneni, Hemanth

    2012-01-01

    MicroRNAs (miRNAs) are a novel growing family of endogenous, small, non-coding, single-stranded RNA molecules directly involved in regulating gene expression at the posttranscriptional level. High conservation of miRNAs in plants provides the foundation for identification of new miRNAs in other plant species through homology alignment. Here, previously known plant miRNAs were BLASTed against the Expressed Sequence Tag (EST) database of Raphanus sativus, and according to a series of filtering criteria, a total of 48 miRNAs belonging to 9 miRNA families were identified, and 16 potential target genes of them were subsequently predicted, most of which seemed to encode transcription factors or enzymes participating in regulation of development, growth and other physiological processes. Overall, our findings lay the foundation for further research on miRNA function in R. sativus. PMID:22359443

  13. Toward a physical map of Drosophila buzzatii. Use of randomly amplified polymorphic DNA polymorphisms and sequence-tagged site landmarks.

    PubMed Central

    Laayouni, H; Santos, M; Fontdevila, A

    2000-01-01

    We present a physical map based on RAPD polymorphic fragments and sequence-tagged sites (STSs) for the repleta group species Drosophila buzzatii. One hundred forty-four RAPD markers have been used as probes for in situ hybridization to the polytene chromosomes, and positive results allowing the precise localization of 108 RAPDs were obtained. Of these, 73 behave as effectively unique markers for physical map construction, and in 9 additional cases the probes gave two hybridization signals, each on a different chromosome. Most markers (68%) are located on chromosomes 2 and 4, which partially agrees with previous estimates on the distribution of genetic variation over chromosomes. One RAPD maps close to the proximal breakpoint of inversion 2z(3) but is not included within the inverted fragment. However, it was possible to conclude from this RAPD that the distal breakpoint of 2z(3) had previously been wrongly assigned. A total of 39 cytologically mapped RAPDs were converted to STSs and yielded an aggregate sequence of 28,431 bp. Thirty-six RAPDs (25%) did not produce any detectable hybridization signal, and we obtained the DNA sequence from three of them. Further prospects toward obtaining a more developed genetic map than the one currently available for D. buzzatii are discussed. PMID:11102375

  14. Chromosome Bin Map of Expressed Sequence Tags in Homoeologous Group 1 of Hexaploid Wheat and Homoeology With Rice and Arabidopsis

    PubMed Central

    Peng, J. H.; Zadeh, H.; Lazo, G. R.; Gustafson, J. P.; Chao, S.; Anderson, O. D.; Qi, L. L.; Echalier, B.; Gill, B. S.; Dilbirligi, M.; Sandhu, D.; Gill, K. S.; Greene, R. A.; Sorrells, M. E.; Akhunov, E. D.; Dvořák, J.; Linkiewicz, A. M.; Dubcovsky, J.; Hossain, K. G.; Kalavacharla, V.; Kianian, S. F.; Mahmoud, A. A.; Miftahudin; Conley, E. J.; Anderson, J. A.; Pathan, M. S.; Nguyen, H. T.; McGuire, P. E.; Qualset, C. O.; Lapitan, N. L. V.

    2004-01-01

    A total of 944 expressed sequence tags (ESTs) generated 2212 EST loci mapped to homoeologous group 1 chromosomes in hexaploid wheat (Triticum aestivum L.). EST deletion maps and the consensus map of group 1 chromosomes were constructed to show EST distribution. EST loci were unevenly distributed among chromosomes 1A, 1B, and 1D with 660, 826, and 726, respectively. The number of EST loci was greater on the long arms than on the short arms for all three chromosomes. The distribution of ESTs along chromosome arms was nonrandom with EST clusters occurring in the distal regions of short arms and middle regions of long arms. Duplications of group 1 ESTs in other homoeologous groups occurred at a rate of 35.5%. Seventy-five percent of wheat chromosome 1 ESTs had significant matches with rice sequences (E ≤ e−10), where large regions of conservation occurred between wheat consensus chromosome 1 and rice chromosome 5 and between the proximal portion of the long arm of wheat consensus chromosome 1 and rice chromosome 10. Only 9.5% of group 1 ESTs showed significant matches to Arabidopsis genome sequences. The results presented are useful for gene mapping and evolutionary and comparative genomics of grasses. PMID:15514039

  15. Improved measurement of brain deformation during mild head acceleration using a novel tagged MRI sequence.

    PubMed

    Knutsen, Andrew K; Magrath, Elizabeth; McEntee, Julie E; Xing, Fangxu; Prince, Jerry L; Bayly, Philip V; Butman, John A; Pham, Dzung L

    2014-11-01

    In vivo measurements of human brain deformation during mild acceleration are needed to help validate computational models of traumatic brain injury and to understand the factors that govern the mechanical response of the brain. Tagged magnetic resonance imaging is a powerful, noninvasive technique to track tissue motion in vivo which has been used to quantify brain deformation in live human subjects. However, these prior studies required from 72 to 144 head rotations to generate deformation data for a single image slice, precluding its use to investigate the entire brain in a single subject. Here, a novel method is introduced that significantly reduces temporal variability in the acquisition and improves the accuracy of displacement estimates. Optimization of the acquisition parameters in a gelatin phantom and three human subjects leads to a reduction in the number of rotations from 72 to 144 to as few as 8 for a single image slice. The ability to estimate accurate, well-resolved, fields of displacement and strain in far fewer repetitions will enable comprehensive studies of acceleration-induced deformation throughout the human brain in vivo.

  16. Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...

  17. Exploiting expressed sequence tag databases for the development and characterization of gene-derived simple sequence repeat markers in the opium poppy (Papaver somniferum L.) for forensic applications.

    PubMed

    Lee, Eun Jung; Jin, Gang Nam; Lee, Kyung Lyong; Han, Myun Soo; Lee, Yang Han; Yang, Moon Sik

    2011-09-01

    Simple sequence repeat (SSR) markers in the opium poppy (Papaver somniferum L.) were identified from an expressed sequence tag (EST) database comprising 20,340 sequences. In total, 2780 SSR-containing sequences were identified. The most frequent microsatellite had an AT/TA motif (37%). Twenty-two opium poppy EST-SSR markers were developed and polymorphisms of six markers (psom 2, 4, 12, 13, 17, and 22) were utilized in 135 individuals under narcotic control investigation. An average of three alleles per locus (range: 2-5 alleles) with a mean heterozygosity of 0.167 was detected. Six loci identified 29 unique profiles in 135 individuals. The EST-SSR markers exhibited small degrees of genetic differentiation (fixation index = 0.727, p < 0.001). Other variable markers will be needed to facilitate the forensic identification of the opium poppy for future cases. To determine the potential for cross-species amplification, six markers were tested in five species of the genus Papaver and two of the genus Eschscholzia. The psom 4 and psom 17 primer pairs were transferable. This is the first study to report SSR markers of the opium poppy.
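
Mining SSRs from EST sequences comes down to scanning each sequence for short motifs repeated in tandem. The Python sketch below shows one common way to do this with a back-referencing regular expression. The abstract does not state the search criteria used in the study, so the motif lengths (2 to 6 nt), the minimum of five repeats and the function names are assumptions.

```python
import re


def _is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT or AA)."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq, min_repeats=5, motif_len=(2, 6)):
    """Return (start, motif, repeat_count) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for k in range(motif_len[0], motif_len[1] + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if _is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return hits
```

For the toy sequence `"GGC" + "AT" * 6 + "CCG"` the function returns `[(3, "AT", 6)]`. The primitive-motif check prevents the same AT run from being reported again as an ATAT repeat when the run is long enough.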

  18. FAST: FAST Analysis of Sequences Toolbox

    PubMed Central

    Lawrence, Travis J.; Kauffman, Kyle T.; Amrine, Katherine C. H.; Carper, Dana L.; Lee, Raymond S.; Becich, Peter J.; Canales, Claudia J.; Ardell, David H.

    2015-01-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought. PMID:26042145
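
The grep-style selection of sequence records described here can be pictured with a short Python analogue. This sketch is not FAST itself and does not reproduce the interface or options of fasgrep; the function names are hypothetical and only the general idea (filter Multi-FastA records by a regular expression on the description line) comes from the abstract.

```python
import re


def read_fasta(text):
    """Parse Multi-FastA text into a list of (description, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def grep_records(records, pattern):
    """Keep the records whose description line matches the regular expression."""
    regex = re.compile(pattern)
    return [(desc, seq) for desc, seq in records if regex.search(desc)]
```

Because each step takes and returns plain records, the filters compose in the same pipeline style that the Textutils-inspired design aims for.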

  19. FAST: FAST Analysis of Sequences Toolbox.

    PubMed

    Lawrence, Travis J; Kauffman, Kyle T; Amrine, Katherine C H; Carper, Dana L; Lee, Raymond S; Becich, Peter J; Canales, Claudia J; Ardell, David H

    2015-01-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.
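
    The grep-style record filtering described in this abstract can be illustrated with a minimal Python sketch. It is not FAST's implementation and does not reproduce the syntax of fasgrep or any other FAST command; the function names `parse_fasta` and `grep_records` and the example records are hypothetical, chosen only to show the idea of selecting Multi-FastA records by a regular expression on the description or the sequence.

```python
import re
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (description line without '>', sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (description, sequence) pairs from Multi-FastA text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def grep_records(records: Iterable[Record], pattern: str,
                 on_sequence: bool = False) -> Iterator[Record]:
    """Keep records whose description (or sequence) matches the regex."""
    regex = re.compile(pattern)
    for desc, seq in records:
        target = seq if on_sequence else desc
        if regex.search(target):
            yield desc, seq


# Hypothetical example data, for illustration only.
EXAMPLE = """>seq1 Homo sapiens actin
ATGGATGATGATATCGCC
>seq2 Mus musculus tubulin
ATGCGTGAGTGC
ATCTCC
>seq3 Homo sapiens tubulin
ATGAGGGAAATC
"""

human = list(grep_records(parse_fasta(EXAMPLE), r"Homo sapiens"))
for desc, seq in human:
    print(f">{desc}\n{seq}")
```

    Because both functions are generators, filters can be chained one record at a time, which loosely mirrors how pipeline-oriented command-line tools compose.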

  20. Random Tagging Genotyping by Sequencing (rtGBS), an Unbiased Approach to Locate Restriction Enzyme Sites across the Target Genome

    PubMed Central

    Hilario, Elena; Barron, Lorna; Deng, Cecilia H.; Datson, Paul M.; Davy, Marcus W.; Storey, Roy D.

    2015-01-01

    Genotyping by sequencing (GBS) is a restriction enzyme based targeted approach developed to reduce the genome complexity and discover genetic markers when a priori sequence information is unavailable. Sufficient coverage at each locus is essential to distinguish heterozygous from homozygous sites accurately. The number of GBS samples able to be pooled in one sequencing lane is limited by the number of restriction sites present in the genome and the read depth required at each site per sample for accurate calling of single-nucleotide polymorphisms. Loci bias was observed using a slight modification of the Elshire et al. method: some restriction enzyme sites were represented in higher proportions while others were poorly represented or absent. This bias could be due to the quality of genomic DNA, the endonuclease and ligase reaction efficiency, the distance between restriction sites, the preferential amplification of small library restriction fragments, or bias towards cluster formation of small amplicons during the sequencing process. To overcome these issues, we have developed a GBS method based on randomly tagging genomic DNA (rtGBS). By randomly landing on the genome, we can, with less bias, find restriction sites that are far apart, and undetected by the standard GBS (stdGBS) method. The study comprises two types of biological replicates: six different kiwifruit plants and two independent DNA extractions per plant; and three types of technical replicates: four samples of each DNA extraction, stdGBS vs. rtGBS methods, and two independent library amplifications, each sequenced in separate lanes. A statistically significant unbiased distribution of restriction fragment size by rtGBS showed that this method targeted 49% (39,145) of BamH I sites shared with the reference genome, compared to only 14% (11,513) by stdGBS. PMID:26633193
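
    To make the notion of restriction sites and fragment sizes concrete, here is a small Python sketch of an in-silico BamHI digest. It is not the rtGBS or stdGBS protocol or the authors' analysis pipeline. It assumes a linear molecule and complete digestion, uses the BamHI recognition sequence GGATCC (cut after the first G), and the toy sequence and function names are hypothetical.

```python
from typing import List

BAMHI_SITE = "GGATCC"   # BamHI recognition sequence (palindromic)
BAMHI_CUT_OFFSET = 1    # BamHI cuts after the first G: G^GATCC


def find_sites(seq: str, site: str = BAMHI_SITE) -> List[int]:
    """Return 0-based start positions of every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq: str, site: str = BAMHI_SITE,
                     cut_offset: int = BAMHI_CUT_OFFSET) -> List[int]:
    """Lengths of fragments from a complete digest of a linear sequence."""
    cuts = [p + cut_offset for p in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence with two BamHI sites.
toy = "AAAAGGATCCTTTTTTTTGGATCCAA"
print(find_sites(toy))        # site start positions
print(fragment_lengths(toy))  # fragment sizes; they sum to len(toy)
```

    A histogram of such fragment lengths across a reference genome is one way to see which sites lie far apart, the class of sites the abstract reports as under-represented by the standard method.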

  1. Accurate mass tag retention time database for urine proteome analysis by chromatography--mass spectrometry.

    PubMed

    Agron, I A; Avtonomov, D M; Kononikhin, A S; Popov, I A; Moshkovskii, S A; Nikolaev, E N

    2010-05-01

    Information about peptides and proteins in urine can be used to search for biomarkers of early stages of various diseases. The main technology currently used for identification of peptides and proteins is tandem mass spectrometry, in which peptides are identified by mass spectra of their fragmentation products. However, the presence of the fragmentation stage decreases sensitivity of analysis and increases its duration. We have developed a method for identification of human urinary proteins and peptides. This method, based on the accurate mass and time (AMT) tag approach, does not use tandem mass spectrometry. A database containing more than 1381 AMT tags of peptides has been constructed. Software has been developed for filling the database with AMT tags, normalizing the chromatograms, applying the database to identify proteins and peptides, and estimating their quantities. New procedures for peptide identification by tandem mass spectra and the AMT tag database are proposed. The paper also lists novel proteins that have been identified in human urine for the first time. PMID:20632944

  2. Genome-wide search of the genes tagged with the consensus of 33.6 repeat loci in buffalo Bubalus bubalis employing minisatellite-associated sequence amplification.

    PubMed

    Pathak, Deepali; Srivastava, Jyoti; Samad, Rana; Parwez, Iqbal; Kumar, Sudhir; Ali, Sher

    2010-06-01

    Minisatellites have been implicated in chromatin organization and gene regulation, but mRNA transcripts tagged with these elements have not been systematically characterized. The aim of the present study was to gain an insight into the transcribing genes associated with the consensus of 33.6 repeat loci across the tissues in water buffalo, Bubalus bubalis. Using cDNA from spermatozoa and eight different somatic tissues and an oligo primer based on two units of the consensus of 33.6 repeat loci (5' CCTCCAGCCCTCCTCCAGCCCT 3'), we conducted minisatellite-associated sequence amplification (MASA) and identified 29 mRNA transcripts. These transcripts were cloned and sequenced. BLAST searches of the individual mRNA transcripts revealed sequence homologies with various transcribing genes and contigs in the database. Using real-time PCR, we detected the highest expression of nine mRNA transcripts in spermatozoa and one each in liver and lung. Further, 21 transcripts were found to be conserved across the species; seven were specific to bovids, whereas one was exclusive to the buffalo genome. The present work demonstrates the inherent potential of MASA for accessing several functional genes simultaneously without screening the cDNA library. This approach may be exploited for the development of tissue-specific mRNA fingerprints in the context of genome analysis and functional and comparative genomics.

  3. Tag Questions across Irish English and British English: A Corpus Analysis of Form and Function

    ERIC Educational Resources Information Center

    Barron, Anne; Pandarova, Irina; Muderack, Karoline

    2015-01-01

    The present study, situated in the area of variational pragmatics, contrasts tag question (TQ) use in Ireland and Great Britain using spoken data from the Irish and British components of the International Corpus of English (ICE). Analysis is on the formal and functional level and also investigates form-functional relationships. Findings reveal…

  4. [Multilocus sequence typing (MLST) analysis].

    PubMed

    Matsumura, Yasufumi

    2013-12-01

    Multilocus sequence typing (MLST) analysis has been emerging as a powerful tool for genotyping specific bacterial species. MLST utilizes internal fragments of multiple housekeeping genes, and the combination of alleles across these genes defines the sequence type for each isolate. MLST databases contain reference data and are freely accessible via internet websites. The standard method for investigating short-term hospital outbreaks is still pulsed-field gel electrophoresis, and MLST analysis is not a substitute for it. However, analysis of sequence types and clonal complexes (closely related sequence types) enables identification and understanding of a specific clone that is widely spreading among drug-resistant organisms, or a key clone that is important for evolution of the organism. In the case of Escherichia coli, the CTX-M-15 or CTX-M-14 extended-spectrum beta-lactamase-producing ST131 clone has emerged and spread globally in the last 10 years. MLST analysis is an unambiguous procedure and is becoming a common typing method to characterize isolates. PMID:24605545
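
    The rule that a combination of alleles defines a sequence type can be sketched as a simple table lookup. The Python example below is illustrative only: the three locus names, the allele numbers, and the sequence type numbers are hypothetical placeholders, not entries from any real MLST database or scheme.

```python
from typing import Dict, Optional, Tuple

# Hypothetical scheme: three loci (real schemes typically use more).
LOCI: Tuple[str, ...] = ("locus_a", "locus_b", "locus_c")

# Hypothetical reference table: allele profile -> sequence type (ST).
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 4, 5): 3,
}

# Reverse lookup: ST -> allele profile.
ST_TO_PROFILE = {st: profile for profile, st in PROFILES.items()}


def sequence_type(alleles: Dict[str, int]) -> Optional[int]:
    """Return the ST for an isolate's allele numbers, or None if unknown."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(profile)


def loci_differing(st_a: int, st_b: int) -> int:
    """Count loci at which two sequence types carry different alleles."""
    pairs = zip(ST_TO_PROFILE[st_a], ST_TO_PROFILE[st_b])
    return sum(1 for a, b in pairs if a != b)


isolate = {"locus_a": 1, "locus_b": 2, "locus_c": 1}
print(sequence_type(isolate))   # ST of the hypothetical isolate
print(loci_differing(1, 2))     # STs 1 and 2 differ at a single locus
```

    Counting differing loci is one simple way to express "closely related sequence types"; how a clonal complex is actually delimited depends on the scheme and is not specified in the abstract.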

  5. Identification of Anhydrobiosis-related Genes from an Expressed Sequence Tag Database in the Cryptobiotic Midge Polypedilum vanderplanki (Diptera; Chironomidae)*

    PubMed Central

    Cornette, Richard; Kanamori, Yasushi; Watanabe, Masahiko; Nakahara, Yuichi; Gusev, Oleg; Mitsumasu, Kanako; Kadono-Okuda, Keiko; Shimomura, Michihiko; Mita, Kazuei; Kikawada, Takahiro; Okuda, Takashi

    2010-01-01

    Some organisms are able to survive the loss of almost all their body water content, entering a latent state known as anhydrobiosis. The sleeping chironomid (Polypedilum vanderplanki) lives in the semi-arid regions of Africa, and its larvae can survive desiccation in an anhydrobiotic form during the dry season. To unveil the molecular mechanisms of this resistance to desiccation, an anhydrobiosis-related Expressed Sequence Tag (EST) database was obtained from the sequences of three cDNA libraries constructed from P. vanderplanki larvae after 0, 12, and 36 h of desiccation. The database contained 15,056 ESTs distributed into 4,807 UniGene clusters. ESTs were classified according to gene ontology categories, and putative expression patterns were deduced for all clusters on the basis of the number of clones in each library; expression patterns were confirmed by real-time PCR for selected genes. Among up-regulated genes, antioxidants, late embryogenesis abundant (LEA) proteins, and heat shock proteins (Hsps) were identified as important groups for anhydrobiosis. Genes related to trehalose metabolism and various transporters were also strongly induced by desiccation. Those results suggest that the oxidative stress response plays a central role in successful anhydrobiosis. Similarly, protein denaturation and aggregation may be prevented by marked up-regulation of Hsps and the anhydrobiosis-specific LEA proteins. A third major feature is the predicted increase in trehalose synthesis and in the expression of various transporter proteins allowing the distribution of trehalose and other solutes to all tissues. PMID:20833722

  6. Identification and characterization of 43 microsatellite markers derived from expressed sequence tags of the sea cucumber ( Apostichopus japonicus)

    NASA Astrophysics Data System (ADS)

    Jiang, Qun; Li, Qi; Yu, Hong; Kong, Lingfeng

    2011-06-01

    The sea cucumber Apostichopus japonicus is a commercially and ecologically important species in China. A total of 3056 potential unigenes were generated after assembling 7597 A. japonicus expressed sequence tags (ESTs) downloaded from GenBank. Two hundred and fifty microsatellite-containing ESTs (8.18%) and 299 simple sequence repeats (SSRs) were detected. The average density of SSRs was 1 per 7.403 kb of EST after redundancy elimination. Di-nucleotide repeat motifs appeared to be the most abundant type with a percentage of 69.90%. Of the 126 primer pairs designed, 90 amplified the expected products and 43 showed polymorphism in 30 individuals tested. The number of alleles per locus ranged from 2 to 26 with an average of 7.0 alleles, and the observed and expected heterozygosities varied from 0.067 to 1.000 and from 0.066 to 0.959, respectively. These new EST-derived microsatellite markers would provide sufficient polymorphism for population genetic studies and genome mapping of this sea cucumber species.
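
    Detecting simple sequence repeats in EST sequences can be sketched with a regular expression that uses a backreference. The abstract does not state the search criteria or the software used, so the minimum repeat counts below (6 for di-nucleotide and 5 for tri-nucleotide motifs), the function name `find_ssrs`, and the test sequence are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# Assumed thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS: Dict[int, int] = {2: 6, 3: 5}


def find_ssrs(seq: str,
              min_repeats: Dict[int, int] = MIN_REPEATS
              ) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each SSR found in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical sequence containing (AG)8 and (GAT)5.
est = "TTGC" + "AG" * 8 + "CCA" + "GAT" * 5 + "TA"
print(find_ssrs(est))

# Consistency check on the reported figure: 250 SSR-containing
# sequences out of 3056 unigenes.
print(round(100 * 250 / 3056, 2))
```

    The last line reproduces the reported 8.18% when the 250 microsatellite-containing sequences are taken as a share of the 3056 unigenes.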

  7. Integration of Expressed Sequence Tag Data Flanking Predicted RNA Secondary Structures Facilitates Novel Non-Coding RNA Discovery

    PubMed Central

    Krzyzanowski, Paul M.; Price, Feodor D.; Muro, Enrique M.; Rudnicki, Michael A.; Andrade-Navarro, Miguel A.

    2011-01-01

    Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, have explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking EST data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation. PMID:21698286

  8. Development of high-density linkage map and tagging leaf spot resistance in pearl millet using genotyping-by-sequencing markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pearl millet is an important forage and grain crop in many parts of the world. Genome mapping studies are a prerequisite for tagging agronomically important traits. Genotyping-by-Sequencing (GBS) markers can be used to build high density linkage maps even in species lacking a reference genome. A re...

  9. RSAT: regulatory sequence analysis tools.

    PubMed

    Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques

    2008-07-01

    The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published.

  10. Plant genotyping using fluorescently tagged inter-simple sequence repeats (ISSRs): basic principles and methodology.

    PubMed

    Prince, Linda M

    2015-01-01

    Inter-simple sequence repeat PCR (ISSR-PCR) is a fast, inexpensive genotyping technique based on length variation in the regions between microsatellites. The method requires no species-specific prior knowledge of microsatellite location or composition. Very small amounts of DNA are required, making this method ideal for organisms of conservation concern, or where the quantity of DNA is extremely limited due to organism size. ISSR-PCR can be highly reproducible but requires careful attention to detail. Optimization of DNA extraction, fragment amplification, and normalization of fragment peak heights during fluorescent detection are critical steps to minimizing the downstream time spent verifying and scoring the data.

  11. Harmonic phase interference for the detection of tag line crossings and beyond in homogeneous strain analysis of cardiac tagged MRI data.

    PubMed

    Bilgen, Mehmet

    2010-12-01

    Homogeneous strain analysis (HSA) was developed to evaluate regional cardiac function using tagged cine magnetic resonance images of the heart. Current cardiac applications of HSA are, however, limited in accurately detecting tag intersections within the myocardial wall, producing consistent triangulation of tag cells throughout the image series, and achieving optimal spatial resolution due to the large size of the triangles. To address these issues, this article introduces a harmonic phase (HARP) interference method. In principle, as in the standard HARP analysis, the method uses harmonic phases associated with two of the four fundamental peaks in the spectrum of a tagged image. However, the phase associated with each peak is wrapped when estimated digitally. This article shows that a special combination of wrapped phases results in an image with a unique intensity pattern that can be exploited to automatically detect tag intersections and to produce reliable triangulation with regularly organized partitioning of the mesh for HSA. In addition, the method offers new opportunities and freedom for evaluating myocardial function when the power and angle of the complex filtered spectra are mathematically modified prior to computing the phase. For example, the triangular elements can be shifted spatially by changing the angle and/or their sizes can be reduced by changing the power. Interference patterns obtained under a variety of power and angle conditions are presented, and specific features observed in the results are explained. Together, the advanced processing capabilities increase the power of HSA by making the analysis less prone to errors from human interactions. They also allow strain measurements at higher spatial resolution and at multiple scales, thereby improving the display methods for better interpretation of the analysis results. PMID:21110236

  12. Parasites as biological tags of fish stocks: a meta-analysis of their discriminatory power.

    PubMed

    Poulin, Robert; Kamiya, Tsukushi

    2015-01-01

    The use of parasites as biological tags to discriminate among marine fish stocks has become a widely accepted method in fisheries management. Here, we first link this approach to its unstated ecological foundation, the decay in the similarity of the species composition of assemblages as a function of increasing distance between them, a phenomenon almost universal in nature. We explain how distance decay of similarity can influence the use of parasites as biological tags. Then, we perform a meta-analysis of 61 uses of parasites as tags of marine fish populations in multivariate discriminant analyses, obtained from 29 articles. Our main finding is that across all studies, the observed overall probability of correct classification of fish based on parasite data was about 71%. This corresponds to a two-fold improvement over the rate of correct classification expected by chance alone, and the average effect size (Zr = 0·463) computed from the original values was also indicative of a medium-to-large effect. However, none of the moderator variables included in the meta-analysis had a significant effect on the proportion of correct classification; these moderators included the total number of fish sampled, the number of parasite species used in the discriminant analysis, the number of localities from which fish were sampled, the minimum and maximum distance between any pair of sampling localities, etc. Therefore, there are no clear-cut situations in which the use of parasites as tags is more useful than others. Finally, we provide recommendations for the future usage of parasites as tags for stock discrimination, to ensure that future applications of the method achieve statistical rigour and a high discriminatory power.
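
    The reported effect size can be put on a more familiar scale with a short calculation. The sketch below assumes that Zr denotes Fisher's z-transformed correlation coefficient, the usual meaning of that symbol in meta-analysis; the abstract itself does not define it. Under that assumption, the back-transformation to a correlation r is the hyperbolic tangent.

```python
import math


def z_to_r(z: float) -> float:
    """Back-transform Fisher's z to a correlation coefficient r."""
    return math.tanh(z)


def r_to_z(r: float) -> float:
    """Fisher's z-transformation of a correlation coefficient r."""
    return math.atanh(r)


zr = 0.463  # average effect size reported in the abstract
r = z_to_r(zr)
print(f"r = {r:.3f}")
```

    Under this assumption the average effect corresponds to a correlation of roughly 0.43, which is in line with the abstract's description of a medium-to-large effect.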

  13. A new view of insect-crustacean relationships II. Inferences from expressed sequence tags and comparisons with neural cladistics.

    PubMed

    Andrew, David R

    2011-05-01

    The enormous diversity of Arthropoda has complicated attempts by systematists to deduce the history of this group in terms of phylogenetic relationships and phenotypic change. Traditional hypotheses regarding the relationships of the major arthropod groups (Chelicerata, Myriapoda, Crustacea, and Hexapoda) focus on suites of morphological characters, whereas phylogenomics relies on large amounts of molecular sequence data to infer evolutionary relationships. The present discussion is based on expressed sequence tags (ESTs) that provide large numbers of short molecular sequences and so provide an abundant source of sequence data for phylogenetic inference. This study presents well-supported phylogenies of diverse arthropod and metazoan outgroup taxa obtained from publicly-available databases. An in-house bioinformatics pipeline has been used to compile and align conserved orthologs from each taxon for maximum likelihood inferences. This approach resolves many currently accepted hypotheses regarding internal relationships between the major groups of Arthropoda, including monophyletic Hexapoda, Tetraconata (Crustacea + Hexapoda), Myriapoda, and Chelicerata sensu lato (Pycnogonida + Euchelicerata). "Crustacea" is a paraphyletic group with some taxa more closely related to the monophyletic Hexapoda. These results support studies that have utilized more restricted EST data for phylogenetic inference, yet they differ in important regards from recently published phylogenies employing nuclear protein-coding sequences. The present results do not, however, depart from other phylogenies that resolve Branchiopoda as the crustacean sister group of Hexapoda. Like other molecular phylogenies, EST-derived phylogenies alone are unable to resolve morphological convergences or evolved reversals and thus omit what may be crucial events in the history of life. For example, molecular data are unable to resolve whether a Hexapod-Branchiopod sister relationship infers a branchiopod

  15. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs)

    PubMed Central

    Liu, Xiao; Li, Zesong; Su, Zheng; Zhang, Junjie; Li, Honggang; Xie, Jun; Xu, Hanshi; Jiang, Tao; Luo, Liya; Zhang, Ruifang; Zeng, Xiaojing; Xu, Huaiqian; Huang, Yi; Mou, Lisha; Hu, Jingchu; Qian, Weiping; Zeng, Yong; Zhang, Xiuqing; Xiong, Chengliang; Yang, Huanming; Kristiansen, Karsten; Cai, Zhiming; Wang, Jun; Gui, Yaoting

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control group, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome. PMID:26907467

  16. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs).

    PubMed

    Liu, Xiao; Li, Zesong; Su, Zheng; Zhang, Junjie; Li, Honggang; Xie, Jun; Xu, Hanshi; Jiang, Tao; Luo, Liya; Zhang, Ruifang; Zeng, Xiaojing; Xu, Huaiqian; Huang, Yi; Mou, Lisha; Hu, Jingchu; Qian, Weiping; Zeng, Yong; Zhang, Xiuqing; Xiong, Chengliang; Yang, Huanming; Kristiansen, Karsten; Cai, Zhiming; Wang, Jun; Gui, Yaoting

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control group, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome. PMID:26907467

  17. AB039. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs)

    PubMed Central

    Li, Zesong

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control group, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome.

  18. Identification and Validation of Expressed Sequence Tags from Pigeonpea (Cajanus cajan L.) Root.

    PubMed

    Kumar, Ravi Ranjan; Yadav, Shailesh; Joshi, Shourabh; Bhandare, Prithviraj P; Patil, Vinod Kumar; Kulkarni, Pramod B; Sonkawade, Swati; Naik, G R

    2014-01-01

    Pigeonpea (Cajanus cajan (L.) Millsp.) is an important food legume crop of rainfed agriculture in the arid and semiarid tropics of the world. It has a deep and extensive root system which serves a number of important physiological and metabolic functions in plant development and growth. In order to identify genes associated with pigeonpea root, ESTs were generated from the root tissues of pigeonpea (GRG-295 genotype) using a normalized cDNA library. A total of 105 high-quality ESTs were generated by sequencing 250 random clones, which resulted in 72 unigenes comprising 25 contigs and 47 singlets. The ESTs were assigned to 9 functional categories on the basis of their putative function. In order to validate the possible expression of transcripts, four genes, namely, S-adenosylmethionine synthetase, phosphoglycerate kinase, serine carboxypeptidase, and methionine aminopeptidase, were further analyzed by reverse transcriptase PCR. The identified transcripts and their possible root-associated functions will also be a valuable resource for functional genomics studies in legume crops. PMID:24895494

  19. Gene cataloging and expression profiling in human gastric cancer cells by expressed sequence tags.

    PubMed

    Kim, Nam-Soon; Hahn, Yoonsoo; Oh, Jung-Hwa; Lee, Ju-Yeon; Oh, Kyung-Jin; Kim, Jeong-Min; Park, Hong-Seog; Kim, Sangsoo; Song, Kyu-Sang; Rho, Seung-Moo; Yoo, Hyang-Sook; Kim, Yong Sung

    2004-06-01

    To understand the molecular mechanism associated with gastric carcinogenesis, we identified genes expressed in gastric cancer cell lines and tissues. Of 97,609 high-quality ESTs sequenced from 36 cDNA libraries, 92,545 were coalesced into 10,418 human Unigene clusters (Build 151). The gene expression profile was produced by counting the cluster frequencies in each library. Although the profiles of highly expressed genes varied greatly from library to library, those genes related to cell structure formation, heat shock proteins, the glycolysis pathway, and the signaling pathway were highly represented in human gastric cancer cell lines and in primary tumors. Conversely, the genes encoding immunoglobulins, ribosomal proteins, and digestive proteins were down-regulated in gastric cancer cell lines and tissues compared to normal tissues. The transcription levels of some of these genes were confirmed by RT-PCR. We found that genes related to cell adhesion, apoptosis, and cytoskeleton formation were particularly up-regulated in the gastric cancer cell lines established from malignant ascites compared to those from primary tumors. This comprehensive molecular profiling of human gastric cancer should be useful for elucidating the genetic events associated with human gastric cancer. PMID:15177556
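
    Producing an expression profile by counting cluster frequencies in each library can be sketched in a few lines of Python. This is an illustration of the counting idea only, not the authors' pipeline; the library names, cluster identifiers, and function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def expression_profile(assignments: Iterable[Tuple[str, str]]
                       ) -> Dict[str, Counter]:
    """Count ESTs per cluster within each library.

    assignments: (library_name, cluster_id) pairs, one per EST.
    """
    profile: Dict[str, Counter] = defaultdict(Counter)
    for library, cluster in assignments:
        profile[library][cluster] += 1
    return profile


def relative_frequency(profile: Dict[str, Counter],
                       library: str, cluster: str) -> float:
    """Fraction of a library's ESTs that fall in the given cluster."""
    total = sum(profile[library].values())
    return profile[library][cluster] / total if total else 0.0


# Hypothetical EST-to-cluster assignments.
ests = [
    ("cell_line_lib", "cluster_A"),
    ("cell_line_lib", "cluster_A"),
    ("cell_line_lib", "cluster_B"),
    ("normal_lib", "cluster_B"),
    ("normal_lib", "cluster_C"),
]

profile = expression_profile(ests)
print(profile["cell_line_lib"].most_common(1))
print(relative_frequency(profile, "cell_line_lib", "cluster_A"))
```

    Normalizing by library size, as `relative_frequency` does, matters when libraries differ in the number of ESTs sequenced; whether and how the original study normalized is not stated in the abstract.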

  20. Comparison of direct boiling method with commercial kits for extracting fecal microbiome DNA by Illumina sequencing of 16S rRNA tags.

    PubMed

    Peng, Xin; Yu, Ke-Qiang; Deng, Guan-Hua; Jiang, Yun-Xia; Wang, Yu; Zhang, Guo-Xia; Zhou, Hong-Wei

    2013-12-01

    Low cost and high throughput capacity are major advantages of using next generation sequencing (NGS) techniques to determine metagenomic 16S rRNA tag sequences. These methods have significantly changed our view of microorganisms in the fields of human health and environmental science. However, DNA extraction using commercial kits has the shortcomings of high cost and time constraints. In the present study, we evaluated the determination of fecal microbiomes using a direct boiling method compared with 5 different commercial extraction methods, e.g., Qiagen and MO BIO kits. Principal coordinate analysis (PCoA) using UniFrac distances and clustering showed that direct boiling of a wide range of feces concentrations gave a pattern of bacterial communities similar to those obtained from most of the commercial kits, with the exception of the MO BIO method. Fecal concentration in the boiling method affected the estimation of α-diversity indices; otherwise, results were generally comparable between boiling and commercial methods. The operational taxonomic units (OTUs) determined through direct boiling showed highly consistent frequencies with those determined through most of the commercial methods. Even those for the MO BIO kit were also obtained by the direct boiling method with high confidence. The present study suggested that direct boiling could be used to determine the fecal microbiome and that using this method would significantly reduce the cost and improve the efficiency of sample preparation for studying gut microbiome diversity.
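    The abstract does not say which α-diversity indices were used. As an illustration only, the sketch below computes the Shannon index, one commonly used α-diversity measure, from a list of per-OTU read counts; the function name and inputs are assumptions.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    return -sum((n / total) * math.log(n / total) for n in otu_counts if n > 0)


# Four equally abundant OTUs give the maximum value for four OTUs, ln(4).
even_sample = shannon_index([10, 10, 10, 10])
# A sample dominated by one OTU scores lower.
skewed_sample = shannon_index([37, 1, 1, 1])
```

    Because the index depends on relative abundances, anything that shifts the recovered proportions, such as the amount of input material, can change the estimate.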

  1. Expressed sequence tags from organ-specific cDNA libraries of tea (Camellia sinensis) and polymorphisms and transferability of EST-SSRs across Camellia species.

    PubMed

    Taniguchi, Fumiya; Fukuoka, Hiroyuki; Tanaka, Junichi

    2012-06-01

    Tea is one of the most popular beverages in the world and the tea plant, Camellia sinensis (L.) O. Kuntze, is an important crop in many countries. To increase the amount of genomic information available for C. sinensis, we constructed seven cDNA libraries from various organs and used these to generate expressed sequence tags (ESTs). A total of 17,458 ESTs were generated and assembled into 5,262 unigenes. About 50% of the unigenes were assigned annotations by Gene Ontology. Some were homologous to genes involved in important biological processes, such as nitrogen assimilation, aluminum response, and biosynthesis of caffeine and catechins. Digital northern analysis showed that 67 unigenes were expressed differentially among the seven organs. Simple sequence repeat (SSR) motif searches among the unigenes identified 1,835 unigenes (34.9%) harboring SSR motifs of more than six repeat units. A subset of 100 EST-SSR primer sets was tested for amplification and polymorphism in 16 tea accessions. Seventy-one primer sets successfully amplified EST-SSRs and 70 EST-SSR loci were polymorphic. Furthermore, these 70 EST-SSR markers were transferable to 14 other Camellia species. The ESTs and EST-SSR markers will enhance the study of important traits and the molecular genetics of tea plants and other Camellia species.
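    The SSR motif search mentioned above can be sketched with a regular expression. This is an assumption-laden illustration rather than the tool the authors used: motif lengths of 2 to 6 nucleotides are assumed, "more than six repeat units" is read as at least seven, and the function name is hypothetical.

```python
import re

# A motif of 2-6 bases followed by at least six more copies (seven or more in total).
# The lazy quantifier makes the shortest repeating motif win.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{6,}")


def find_ssrs(sequence):
    """Return (motif, repeat_count, start) for each SSR found in a DNA sequence."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAA...
        repeats = len(match.group(0)) // len(motif)
        hits.append((motif, repeats, match.start()))
    return hits


toy_unigene = "TTC" + "AG" * 8 + "CCT"  # made-up sequence with one (AG)8 repeat
ssrs = find_ssrs(toy_unigene)
```

    Running this over every unigene and counting the sequences with at least one hit would give the kind of SSR-containing fraction reported in the abstract.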

  2. Linking yeast genetics to mammalian genomes: identification and mapping of the human homolog of CDC27 via the expressed sequence tag (EST) data base.

    PubMed Central

    Tugendreich, S; Boguski, M S; Seldin, M S; Hieter, P

    1993-01-01

    We describe a strategy for quickly identifying and positionally mapping human homologs of yeast genes to cross-reference the biological and genetic information known about yeast genes to mammalian chromosomal maps. Optimized computer search methods have been developed to scan the rapidly expanding expressed sequence tag (EST) data base to find human open reading frames related to yeast protein sequence queries. These methods take advantage of the newly developed BLOSUM scoring matrices and the query masking function SEG. The corresponding human cDNA is then used to obtain a high-resolution map position on human and mouse chromosomes, providing the links between yeast genetic analysis and mapped mammalian loci. By using these methods, a human homolog of Saccharomyces cerevisiae CDC27 has been identified and mapped to human chromosome 17 and mouse chromosome 11 between the Pkca and Erbb-2 genes. Human CDC27 encodes an 823-aa protein with global similarity to its fungal homologs CDC27, nuc2+, and BimA. Comprehensive cross-referencing of genes and mutant phenotypes described in humans, mice, and yeast should accelerate the study of normal eukaryotic biology and human disease states. PMID:8234252

  3. Moving Away from the Reference Genome: Evaluating a Peptide Sequencing Tagging Approach for Single Amino Acid Polymorphism Identifications in the Genus Populus

    SciTech Connect

    Abraham, Paul E; Adams, Rachel M; Tuskan, Gerald A; Hettich, Robert L

    2013-01-01

    The genetic diversity across natural populations of the model organism, Populus, is extensive, containing a single nucleotide polymorphism roughly every 200 base pairs. When deviations from the reference genome occur in coding regions, they can impact protein sequences. Rather than relying on a static reference database to profile protein expression, we employed a peptide sequence tagging (PST) approach capable of decoding the plasticity of the Populus proteome. Using shotgun proteomics data from two genotypes of P. trichocarpa, a tag-based approach enabled the detection of 6,653 unexpected sequence variants. Through manual validation, our study investigated how the most abundant chemical modification (methionine oxidation) could masquerade as a sequence variant (Ala→Ser) when few site-determining ions existed. In fact, precise localization of an oxidation site for peptides with more than one potential placement was indeterminate for 70% of the MS/MS spectra. We demonstrate that additional fragment ions made available by high energy collisional dissociation enhance the robustness of the peptide sequence tagging approach (81% of oxidation events could be exclusively localized to a methionine). We are confident that augmenting fragmentation processes for a PST approach will further improve the identification of single amino acid polymorphisms in Populus and potentially other species as well.

  4. The Arabidopsis root transcriptome by serial analysis of gene expression. Gene identification using the genome sequence.

    PubMed

    Fizames, Cécile; Muños, Stéphane; Cazettes, Céline; Nacry, Philippe; Boucherez, Jossia; Gaymard, Frédéric; Piquemal, David; Delorme, Valérie; Commes, Thérèse; Doumas, Patrick; Cooke, Richard; Marti, Jacques; Sentenac, Hervé; Gojon, Alain

    2004-01-01

    Large-scale identification of genes expressed in roots of the model plant Arabidopsis was performed by serial analysis of gene expression (SAGE), on a total of 144,083 sequenced tags, representing at least 15,964 different mRNAs. For tag to gene assignment, we developed a computational approach based on 26,620 genes annotated from the complete sequence of the genome. The procedure selected warrants the identification of the genes corresponding to the majority of the tags found experimentally, with a high level of reliability, and provides a reference database for SAGE studies in Arabidopsis. This new resource allowed us to characterize the expression of more than 3,000 genes, for which there is no expressed sequence tag (EST) or cDNA in the databases. Moreover, 85% of the tags were specific for one gene. To illustrate this advantage of SAGE for functional genomics, we show that our data allow an unambiguous analysis of most of the individual genes belonging to 12 different ion transporter multigene families. These results indicate that, compared with EST-based tag to gene assignment, the use of the annotated genome sequence greatly improves gene identification in SAGE studies. However, more than 6,000 different tags remained with no gene match, suggesting that a significant proportion of transcripts present in the roots originate from yet unknown or wrongly annotated genes. The root transcriptome characterized in this study markedly differs from those obtained in other organs, and provides a unique resource for investigating the functional specificities of the root system. As an example of the use of SAGE for transcript profiling in Arabidopsis, we report here the identification of 270 genes differentially expressed between roots of plants grown either with NO3- or NH4NO3 as N source.
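    The tag-to-gene assignment idea can be sketched as below. The abstract does not give the details of the authors' procedure, so this follows the conventional SAGE layout as an assumption: the tag is taken as the 10 bases immediately downstream of the 3'-most CATG (NlaIII) site of each annotated transcript. All names and toy sequences are hypothetical.

```python
from collections import defaultdict

ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme site (NlaIII); not stated in the abstract
TAG_LENGTH = 10       # assumed tag length in bases; not stated in the abstract


def virtual_tag(transcript):
    """Return the tag following the 3'-most anchor site, or None if unavailable."""
    pos = transcript.rfind(ANCHOR_SITE)
    if pos == -1:
        return None
    start = pos + len(ANCHOR_SITE)
    tag = transcript[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


def build_tag_index(transcripts):
    """Map each virtual tag to the set of annotated genes that would produce it."""
    index = defaultdict(set)
    for gene_id, sequence in transcripts.items():
        tag = virtual_tag(sequence.upper())
        if tag is not None:
            index[tag].add(gene_id)
    return index


def classify_tag(tag, index):
    """Label an observed tag as 'unique', 'ambiguous', or 'no_match'."""
    genes = index.get(tag, set())
    if not genes:
        return "no_match"
    return "unique" if len(genes) == 1 else "ambiguous"


# Made-up transcripts: geneA and geneB share a tag, geneC has its own.
toy_transcripts = {
    "geneA": "AAACATGTTTTTCCCCCGGG",
    "geneB": "GGCATGAAAACATGTTTTTCCCCCAA",
    "geneC": "CATGACGTACGTAC",
}
toy_index = build_tag_index(toy_transcripts)
```

    The share of observed tags classified as unique corresponds to the kind of figure quoted above (tags specific for one gene), and tags labeled no_match are the candidates for unknown or wrongly annotated genes.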

  5. Universal drag tag for direct quantitative analysis of multiple microRNAs.

    PubMed

    Wegman, David W; Cherney, Leonid T; Yousef, George M; Krylov, Sergey N

    2013-07-01

    Using microRNA (miRNA) as molecular markers of diseases requires a method for accurate measurement of multiple miRNAs in biological samples. Direct quantitative analysis of multiple miRNAs (DQAMmiR) has been recently developed based on a classical hybridization approach. In DQAMmiR, miRNAs are hybridized with excess fluorescently labeled complementary DNA probes. Capillary electrophoresis (CE) is used to separate the unreacted probes from the hybrids and the hybrids from each other. The challenging separation is achieved by using two types of mobility modifiers. Single-strand DNA binding protein (SSB) is added to the running buffer to bind and shift the single-stranded unreacted probes from the double-stranded hybrids. Different drag tags are built into the probes to introduce significant differential mobility between their respective hybrids. For the method to be practical it requires a universal extendable drag tag. Polymers are a logical choice for making extendable drag tags. Our recent theoretical work suggested that short peptides could provide a sufficient mobility shift to facilitate DQAMmiR. Here, we experimentally confirm this prediction in the analysis of five miRNAs: mir10b, mir21, mir125b, mir145, and mir155. We conjugated four fluorescently labeled DNA molecules with peptides of 5, 10, 15, or 20 neutral amino acids in length; the fifth probe was peptide-free. The peptide tags showed no interference with SSB binding to the probes and facilitated separation of the five hybrids. The mobilities of the five hybrids were used to refine the previously suggested theory. The analysis was performed in both a pure buffer and in cell lysate. Our analysis of the experimental data suggests that using DNA-peptide probes can readily facilitate simultaneous analysis of more than 10 miRNAs.

  7. Analysis and verification of a proposed antenna design for an implantable RFID Tag at 915 MHz

    NASA Astrophysics Data System (ADS)

    Bakore, Rahul

    This work focused on the design and analysis of an antenna to be used with an RFID tag implanted in human brain tissue. The goal is to maximize the power transferred between the external RFID measurement system and the implanted RFID tag while minimizing the power dissipated within the surrounding tissue. The commercial computational electromagnetics software package COMSOL, based on the finite element method (FEM), was used for the design process. The COMSOL models were validated against additional simulations using the FEKO commercial package, based on the method of moments (MOM), as well as against measurements of test antenna structures radiating in a bulk homogeneous medium. The proposed antenna geometry is compatible with human tissue and viable for use in an implantable RFID tag. The proposed antenna is a planar folded dipole made from a gold conductor, which is a biocompatible material. The metal thickness is 1 micrometer and the overall antenna dimensions are 22 mm × 3.5 mm. The antenna structure also includes a dielectric substrate and an acrylic coating. The antenna impedance is 28 + j201.5 Ω at 915 MHz. The inductive reactance is high enough to compensate for the capacitive reactance of the RFID tag, and the antenna resistance is close to the effective chip resistance, providing a conjugate match. This antenna fulfills the criteria for minimizing the power dissipation within the human tissue. Also, a radiation efficiency of 87% is achieved with this antenna at 915 MHz. A quality factor greater than 10 is achieved, which is sufficient to turn on the diodes in the electronic circuit of the RFID tag due to the high DC voltage obtained.
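    To illustrate what a conjugate match means here, the sketch below evaluates the standard power transmission coefficient between an antenna and a chip, τ = 4·Ra·Rc / |Za + Zc|², which equals 1 when the chip impedance is the complex conjugate of the antenna impedance. The antenna impedance is taken from the abstract; the chip impedance values are hypothetical, since the abstract does not report them.

```python
def power_transmission_coefficient(z_antenna, z_chip):
    """Fraction of available power delivered to the chip: 4*Ra*Rc / |Za + Zc|^2."""
    z_sum = z_antenna + z_chip
    return 4 * z_antenna.real * z_chip.real / (z_sum.real ** 2 + z_sum.imag ** 2)


Z_ANTENNA = complex(28, 201.5)      # ohms at 915 MHz, from the abstract

# Hypothetical chip impedances (not reported in the abstract).
z_chip_ideal = Z_ANTENNA.conjugate()  # perfect conjugate match
z_chip_near = complex(30, -200)       # slightly off the conjugate value

tau_ideal = power_transmission_coefficient(Z_ANTENNA, z_chip_ideal)
tau_near = power_transmission_coefficient(Z_ANTENNA, z_chip_near)
```

    With the reactances cancelling and the resistances nearly equal, the coefficient stays close to 1, which is the situation the abstract describes qualitatively.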

  8. HPLC-APCI-MS analysis of triacylglycerols (TAGs) in historical pharmaceutical ointments from the eighteenth century.

    PubMed

    Saliu, Francesco; Modugno, Francesca; Orlandi, Marco; Colombini, Maria Perla

    2011-10-01

    The lipid fractions of residues from historical pharmaceutical ointments were analysed by reversed-phase liquid chromatography coupled with atmospheric pressure chemical ionization and mass spectrometer detection. The residues were contained in a series of historical apothecary jars, dating from the eighteenth century and conserved at the "Aboca Museum" in Sansepolcro (Arezzo, Italy) and at the pharmacy of the "Real Cartuja de Valldemossa" in Palma de Majorca (Spain). The analytical protocol was set up using a comparative study based on the evaluation of triacylglycerol (TAG) compositions in raw natural lipid materials and in laboratory-reproduced ointments. These ointments were prepared following pharmaceutical recipes reported in historical treatises and used as reference materials. The reference materials were also subjected to stress treatments in order to evaluate the modification occurring in the TAG profiles as an effect of ageing. TAGs were successfully detected in the reproduced formulations even in mixtures of up to ten ingredients and after harsh degradative treatments, and also in real historical samples. No particular interferences were detected from other non-lipid ingredients of the formulations. The TAG compositions detected in the historical ointments indicated a predominant use of olive oil and pig adipose material as lipid ingredients. The detection of a high level of tristearine and myristyl-palmitoyl-stearyl glycerol in two of the samples suggested the presence of a fatty material of a different origin (maybe a ruminant). On the basis of the positional isomer ratio, sn-PPO/sn-POP, it was possible to hypothesize an exclusive use of pig fat in one sample. We also evaluated the application of principal component analysis of TAG profiles as an approach for the multivariate statistical comparison of the reference and historical ointments. PMID:21713420

  9. Fine Mutational Analysis of 2B8 and 3H7 Tag Epitopes with Corresponding Specific Monoclonal Antibodies.

    PubMed

    Kim, Tae-Lim; Cho, Man-Ho; Sangsawang, Kanidta; Bhoo, Seong Hee

    2016-06-30

    Bacteriophytochromes are phytochrome-like light-sensing photoreceptors that use biliverdin as a chromophore. To study the biochemical properties of the Deinococcus radiodurans bacteriophytochrome (DrBphP) protein, two anti-DrBphP mouse monoclonal antibodies (2B8 and 3H7) were generated. Their specific epitopes were identified in our previous report. We present here fine epitope mapping of these two antibodies by using truncation and substitution of original epitope sequences in order to identify minimized epitope peptides. The previously reported original epitope sequences for 2B8 and 3H7 were truncated from both sides. Our analysis showed that the minimal peptide sequence lengths for 2B8 and 3H7 antibodies were nine amino acids (RDPLPFFPP) and six amino acids (PGEIEE), respectively. We further characterized these peptides in order to investigate their reactivity after single deletion and single substitution of the original peptides. We found that single-substituted 2B8 epitope (RDPLPAFPP) and dual-substituted 3H7 epitope (PGEIAD) showed significantly increased reactivity. These two antibodies with high reactivity for the short modified peptide sequences are valuable for developing new peptide tags for protein research. PMID:27137090
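    The truncation and substitution scheme can be illustrated by enumerating candidate peptides in code. This is only a sketch of how such a panel might be listed: the function names are hypothetical, and substituting each position with alanine is an assumption (the abstract reports specific substituted peptides but does not describe the full substitution panel).

```python
def truncations(peptide, min_len):
    """All contiguous sub-peptides of at least min_len residues, longest first."""
    result = []
    for length in range(len(peptide), min_len - 1, -1):
        for start in range(len(peptide) - length + 1):
            result.append(peptide[start:start + length])
    return result


def single_deletions(peptide):
    """Peptides with exactly one residue removed (duplicates collapsed, order kept)."""
    variants = [peptide[:i] + peptide[i + 1:] for i in range(len(peptide))]
    return list(dict.fromkeys(variants))


def single_substitutions(peptide, replacement="A"):
    """Peptides with one residue replaced by `replacement` (alanine assumed here)."""
    return [
        peptide[:i] + replacement + peptide[i + 1:]
        for i, residue in enumerate(peptide)
        if residue != replacement
    ]


EPITOPE_2B8 = "RDPLPFFPP"  # minimal 2B8 epitope from the abstract
EPITOPE_3H7 = "PGEIEE"     # minimal 3H7 epitope from the abstract

variants_2b8 = single_substitutions(EPITOPE_2B8)
shorter_3h7 = truncations(EPITOPE_3H7, 5)
```

    The reported higher-reactivity 2B8 peptide, RDPLPAFPP, is one of the single alanine substitutions this enumeration produces; the dual-substituted 3H7 peptide, PGEIAD, would require combining two substitutions.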

  10. Fine Mutational Analysis of 2B8 and 3H7 Tag Epitopes with Corresponding Specific Monoclonal Antibodies

    PubMed Central

    Kim, Tae-Lim; Cho, Man-Ho; Sangsawang, Kanidta; Bhoo, Seong Hee

    2016-01-01

    Bacteriophytochromes are phytochrome-like light-sensing photoreceptors that use biliverdin as a chromophore. To study the biochemical properties of the Deinococcus radiodurans bacteriophytochrome (DrBphP) protein, two anti-DrBphP mouse monoclonal antibodies (2B8 and 3H7) were generated. Their specific epitopes were identified in our previous report. We present here fine epitope mapping of these two antibodies by using truncation and substitution of original epitope sequences in order to identify minimized epitope peptides. The previously reported original epitope sequences for 2B8 and 3H7 were truncated from both sides. Our analysis showed that the minimal peptide sequence lengths for 2B8 and 3H7 antibodies were nine amino acids (RDPLPFFPP) and six amino acids (PGEIEE), respectively. We further characterized these peptides in order to investigate their reactivity after single deletion and single substitution of the original peptides. We found that single-substituted 2B8 epitope (RDPLPAFPP) and dual-substituted 3H7 epitope (PGEIAD) showed significantly increased reactivity. These two antibodies with high reactivity for the short modified peptide sequences are valuable for developing new peptide tags for protein research. PMID:27137090

  11. Analysis of myocardial motion using generalized spline models and tagged magnetic resonance images

    NASA Astrophysics Data System (ADS)

    Chen, Fang; Rose, Stephen E.; Wilson, Stephen J.; Veidt, Martin; Bennett, Cameron J.; Doddrell, David M.

    2000-06-01

    Heart wall motion abnormalities are very sensitive indicators of common heart diseases, such as myocardial infarction and ischemia. Regional strain analysis is especially important in diagnosing local abnormalities and mechanical changes in the myocardium. In this work, we present a complete method for the analysis of cardiac motion and the evaluation of regional strain in the left ventricular wall. The method is based on generalized spline models and tagged magnetic resonance images (MRI) of the left ventricle. The whole method combines dynamic tracking of tag deformation, simulation of cardiac movement and accurate computation of the regional strain distribution. More specifically, the analysis of cardiac motion is performed in three stages. Firstly, material points within the myocardium are tracked over time using a semi-automated snake-based tag tracking algorithm developed for this purpose. This procedure is repeated in three orthogonal axes so as to generate a set of one-dimensional sample measurements of the displacement field. The 3D displacement field is then reconstructed from this sample set by using a generalized vector spline model. The spline reconstruction of the displacement field is explicitly expressed as a linear combination of a spline kernel function associated with each sample point and a polynomial term. Finally, the strain tensor (linear or nonlinear) with three direct components and three shear components is calculated by applying a differential operator directly to the displacement function. The proposed method is computationally effective and easy to perform on tagged MR images. The preliminary study has shown potential advantages of using this method for the analysis of myocardial motion and the quantification of regional strain.
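    The reconstruction and strain steps described above can be written out in a generic form. The notation is an assumption, since the abstract gives no symbols: u is the displacement field, x_i are the N sample points, K is the spline kernel, c_i are fitted coefficients, and p is the polynomial term. The linear strain is the small-strain tensor and the nonlinear one is taken here to be the Green-Lagrange tensor, a standard choice that the abstract does not name explicitly.

```latex
\begin{align}
  \mathbf{u}(\mathbf{x}) &= \sum_{i=1}^{N} \mathbf{c}_i \, K(\mathbf{x}, \mathbf{x}_i) + \mathbf{p}(\mathbf{x}), \\
  \boldsymbol{\varepsilon} &= \tfrac{1}{2}\left( \nabla\mathbf{u} + \nabla\mathbf{u}^{\mathsf{T}} \right)
    && \text{(linear strain)}, \\
  \mathbf{E} &= \tfrac{1}{2}\left( \nabla\mathbf{u} + \nabla\mathbf{u}^{\mathsf{T}}
    + \nabla\mathbf{u}^{\mathsf{T}} \nabla\mathbf{u} \right)
    && \text{(nonlinear strain)}.
\end{align}
```

    Both tensors are symmetric 3 × 3 matrices, so each has three direct (diagonal) and three shear (off-diagonal) independent components, matching the count given in the abstract. Because u is an explicit sum of kernels and a polynomial, its gradient can be obtained analytically rather than by finite differences.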

  12. Mining expressed sequence tags of rapeseed (Brassica napus L.) to predict the drought responsive regulatory network.

    PubMed

    Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Ebrahimie, Esmaeil

    2015-07-01

    It is of great significance to understand the regulatory mechanisms by which plants deal with drought stress. Two EST libraries derived from rapeseed (Brassica napus) leaves in non-stressed and drought stress conditions were analyzed in order to obtain the transcriptomic landscape of drought-exposed B. napus plants, and also to identify and characterize significant drought responsive regulatory genes and microRNAs. The functional ontology analysis revealed a substantial shift in the B. napus transcriptome to govern cellular drought responsiveness via different stress-activated mechanisms. The activity of transcription factor and protein kinase modules generally increased in response to drought stress. Twenty-six regulatory genes, consisting of 17 transcription factor genes, eight protein kinase genes and one protein phosphatase gene, showed significant alterations in expression in response to drought stress. We also found six microRNAs that were differentially expressed during drought stress, supporting the involvement of a post-transcriptional level of regulation in the B. napus drought response. The drought responsive regulatory network shed light on the significance of some regulatory components involved in biosynthesis and signaling of various plant hormones (abscisic acid, auxin and brassinosteroids), ubiquitin proteasome system, and signaling through Reactive Oxygen Species (ROS). Our findings suggested a complex and multi-level regulatory system modulating response to drought stress in B. napus. PMID:26261397

  13. Sequence-tagged site (STS) content mapping of human chromosomes: theoretical considerations and early experiences.

    PubMed

    Green, E D; Green, P

    1991-11-01

    The magnitude of the effort required to complete the human genome project will require constant refinements of the tools available for the large-scale study of DNA. Such improvements must include both the development of more powerful technologies and the reformulation of the theoretical strategies that account for the changing experimental capabilities. The two technological advances described here, PCR and YAC cloning, have rapidly become incorporated into the standard armamentarium of genome analysis and represent key examples of how technological developments continue to drive experimental strategies in molecular biology. Because of its high sensitivity, specificity, and potential for automation, PCR is transforming many aspects of DNA mapping. Similarly, by providing the means to isolate and study larger pieces of DNA, YAC cloning has made practical the achievement of megabase-level continuity in physical maps. Taken together, these two technologies can be envisioned as providing a powerful strategy for constructing physical maps of whole chromosomes. Undoubtedly, future technological developments will promote even more effective mapping strategies. Nonetheless, the theoretical projections and practical experience described here suggest that constructing YAC-based STS-content maps of whole human chromosomes is now possible. Random STSs can be efficiently generated and used to screen collections of YAC clones, and contiguous YAC coverage of regions exceeding 2 Mb can be readily obtained. While the predicted laboratory effort required for mapping whole human chromosomes remains daunting, it is clearly feasible.

  14. Expressed sequence tag (EST) profiling in hyper saline shocked Dunaliella salina reveals high expression of protein synthetic apparatus components.

    PubMed

    Alkayal, Fadi; Albion, Rebecca L; Tillett, Richard L; Hathwaik, Leyla T; Lemos, Mark S; Cushman, John C

    2010-11-01

    The unicellular, halotolerant, green alga, Dunaliella salina (Chlorophyceae) has the unique ability to adapt and grow in a wide range of salt conditions from about 0.05 to 5.5M. To better understand the molecular basis of its salinity tolerance, a complementary DNA (cDNA) library was constructed from D. salina cells adapted to 2.5M NaCl, salt-shocked at 3.4M NaCl for 5h, and used to generate an expressed sequence tag (EST) database. ESTs were obtained for 2831 clones representing 1401 unique transcripts. Putative functions were assigned to 1901 (67.2%) ESTs after comparison with protein databases. An additional 154 (5.4%) ESTs had significant similarity to known sequences whose functions are unclear and 776 (27.4%) had no similarity to known sequences. For those D. salina ESTs for which functional assignments could be made, the largest functional categories included protein synthesis (35.7%), energy (photosynthesis) (21.4%), primary metabolism (13.8%) and protein fate (6.8%). Within the protein synthesis category, the vast majority of ESTs (80.3%) encoded ribosomal proteins representing about 95% of the approximately 82 subunits of the cytosolic ribosome indicating that D. salina invests substantial resources in the production and maintenance of protein synthesis. The increased mRNA expression upon salinity shock was verified for a small set of selected genes by real-time, quantitative reverse-transcription-polymerase chain reaction (qRT-PCR). This EST collection also provided important new insights into the genetic underpinnings for the biosynthesis and utilization of glycerol and other osmoprotectants, the carotenoid biosynthetic pathway, reactive oxygen-scavenging enzymes, and molecular chaperones (heat shock proteins) not described previously for D. salina. EST discovery also revealed the existence of RNA interference and signaling pathways associated with osmotic stress adaptation. The unknown ESTs described here provide a rich resource for the identification

  15. The dynamics of the bacterial diversity in the redox transition and anoxic zones of the Cariaco Basin assessed by parallel tag sequencing.

    PubMed

    Rodriguez-Mora, Maria J; Scranton, Mary I; Taylor, Gordon T; Chistoserdov, Andrei Y

    2015-09-01

    Massively parallel tag sequencing was applied to describe the bacterial diversity in the redox transition and anoxic zones of the Cariaco Basin. In total, 14 samples from the Cariaco Basin were collected over a period of eight years from two stations. A total of 244 357 unique bacterial V6 amplicons were sequenced. The total number of operational taxonomic units (OTUs) found in this study was 4692, with a range of 511-1491 OTUs per sample. Approximately 95% of the OTUs found in the redox transition zone and anoxic layers of Cariaco are represented by less than 50 amplicons suggesting that only about 5% of the bacterial OTUs are responsible for the bulk of the microbial processes in the basin redox transition and anoxic zones. The same dominant OTUs were observed across all eight years of sampling although periodic fluctuations in their proportion were apparent. No distinctive differences were observed between the bacterial communities from the redox transition and anoxic layers of the Cariaco Basin water column. The largest proportion of amplicons belongs to Gammaproteobacteria represented mostly by sulfide oxidizers, followed by Marine Group A (originally described as SAR406; Gordon and Giovannoni 1996), a group of uncultured bacteria hypothesized to be involved in metal reduction, and sulfate-reducing Deltaproteobacteria. Gammaproteobacteria, Deltaproteobacteria and Marine Group A make up 67-90% of all V6 amplicons sequenced in this study. This strongly suggests that the basin's microbial communities are actively involved in the sulfur-related metabolism and coupling of the sulfur and carbon cycles. According to detrended canonical correspondence analysis, ecological factors such as chemoautotrophy, nitrate and oxidized and reduced sulfur compounds influence the structuring and distribution of the Cariaco microbial communities. PMID:26209697
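    The observation that most OTUs are represented by few amplicons can be quantified with a short summary like the one below. It is a sketch only: the function name and the toy OTU sizes are made up, and the threshold of 50 amplicons is taken from the abstract.

```python
def rare_otu_summary(otu_sizes, threshold=50):
    """Summarize how many OTUs are rare and how many amplicons they hold.

    otu_sizes: number of amplicons assigned to each OTU.
    An OTU counts as rare when it has fewer than `threshold` amplicons.
    """
    if not otu_sizes:
        raise ValueError("at least one OTU is required")
    rare = [n for n in otu_sizes if n < threshold]
    return {
        "rare_otu_fraction": len(rare) / len(otu_sizes),
        "rare_amplicon_fraction": sum(rare) / sum(otu_sizes),
    }


# Made-up OTU sizes: one dominant OTU and three rare ones.
summary = rare_otu_summary([1000, 10, 5, 5])
```

    A high rare-OTU fraction together with a low rare-amplicon fraction is the pattern the abstract describes: a small set of dominant OTUs accounts for most of the sequenced amplicons.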

  17. Analysis of the early-flowering mechanisms and generation of T-DNA tagging lines in Kitaake, a model rice cultivar

    PubMed Central

    An, Gynheung

    2013-01-01

    As an extremely early flowering cultivar, rice cultivar Kitaake is a suitable model system for molecular studies. Expression analyses revealed that transcript levels of the flowering repressor Ghd7 were decreased while those of its downstream genes, Ehd1, Hd3a, and RFT1, were increased. Sequencing the known flowering-regulator genes revealed mutations in Ghd7 and OsPRR37 that cause early translation termination and amino acid substitutions, respectively. Genetic analysis of F2 progeny from a cross between cv. Kitaake and cv. Dongjin indicated that those mutations additively contribute to the early-flowering phenotype in cv. Kitaake. Because the short life cycle facilitates genetics research, this study generated 10 000 T-DNA tagging lines and deduced 6758 flanking sequence tags (FSTs), in which 3122 were genic and 3636 were intergenic. Among the genic lines, 367 (11.8%) were inserted into new genes that were not previously tagged. Because the lines were generated by T-DNA that contained the promoterless GUS reporter gene, which had an intron with triple splicing donors/acceptors in the right border region, a high efficiency of GUS expression was shown in various organs. Sequencing of the GUS-positive lines demonstrated that the third splicing donor and the first splicing acceptor of the vector were extensively used. The FST data have now been released into the public domain for seed distribution and facilitation of rice research. PMID:23966593

  18. A capture-recapture survival analysis model for radio-tagged animals

    USGS Publications Warehouse

    Pollock, K.H.; Bunck, C.M.; Winterstein, S.R.; Chen, C.-L.; North, P.M.; Nichols, J.D.

    1995-01-01

    In recent years, survival analysis of radio-tagged animals has developed using methods based on the Kaplan-Meier method used in medical and engineering applications (Pollock et al., 1989a,b). An important assumption of this approach is that all tagged animals with a functioning radio can be relocated at each sampling time with probability 1. This assumption may not always be reasonable in practice. In this paper, we show how a general capture-recapture model can be derived which allows for some probability (less than one) for animals to be relocated. This model is not simply a Jolly-Seber model because it is possible to relocate both dead and live animals, unlike when traditional tagging is used. The model can also be viewed as a generalization of the Kaplan-Meier procedure, thus linking the Jolly-Seber and Kaplan-Meier approaches to survival estimation. We present maximum likelihood estimators and discuss testing between submodels. We also discuss model assumptions and their validity in practice. An example is presented based on canvasback data collected by G. M. Haramis of Patuxent Wildlife Research Center, Laurel, Maryland, USA.
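    For orientation, the classical Kaplan-Meier product-limit estimator that the abstract builds on can be sketched as follows. This is only the baseline method, which assumes every animal is relocated with probability 1; it is not the capture-recapture generalization derived in the paper. The function name and the sample data in the usage note are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death.

    times  -- last observation time for each animal
    events -- True if the animal was found dead at that time,
              False if it was censored (e.g. radio failure, end of study)
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # deaths plus censorings at each time
    at_risk = len(times)            # animals still being followed
    survival = 1.0
    curve = []

    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve
```

    A call such as `kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])` returns one survival estimate per death time; censored animals only shrink the risk set.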

  19. Functional categorization of unique expressed sequence tags obtained from the yeast-like growth phase of the elm pathogen Ophiostoma novo-ulmi

    PubMed Central

    2011-01-01

    Background The highly aggressive pathogenic fungus Ophiostoma novo-ulmi continues to be a serious threat to the American elm (Ulmus americana) in North America. Extensive studies have been conducted in North America to understand the mechanisms of virulence of this introduced pathogen and its evolving population structure, with a view to identifying potential strategies for the control of Dutch elm disease. As part of a larger study to examine the genomes of economically important Ophiostoma spp. and the genetic basis of virulence, we have constructed an expressed sequence tag (EST) library using total RNA extracted from the yeast-like growth phase of O. novo-ulmi (isolate H327). Results A total of 4,386 readable EST sequences were annotated by determining their closest matches to known or theoretical sequences in public databases by BLASTX analysis. Searches matched 2,093 sequences to entries found in Genbank, including 1,761 matches with known proteins and 332 matches with unknown (hypothetical/predicted) proteins. Known proteins included a collection of 880 unique transcripts which were categorized to obtain a functional profile of the transcriptome and to evaluate physiological function. These assignments yielded 20 primary functional categories (FunCat), the largest including Metabolism (FunCat 01, 20.28% of total), Sub-cellular localization (70, 10.23%), Protein synthesis (12, 10.14%), Transcription (11, 8.27%), Biogenesis of cellular components (42, 8.15%), Cellular transport, facilitation and routes (20, 6.08%), Classification unresolved (98, 5.80%), Cell rescue, defence and virulence (32, 5.31%) and the unclassified category, or known sequences of unknown metabolic function (99, 7.5%). A list of specific transcripts of interest was compiled to initiate an evaluation of their impact upon strain virulence in subsequent studies. Conclusions This is the first large-scale study of the O. novo-ulmi transcriptome. The expression profile obtained from the yeast

  20. Development and optimization of sequence-tagged microsatellite site markers to detect genetic diversity within Colletotrichum capsici, a causal agent of chilli pepper anthracnose disease.

    PubMed

    Ranathunge, N P; Ford, R; Taylor, P W J

    2009-07-01

    Genomic libraries enriched for microsatellites from Colletotrichum capsici, one of the major causal agents of anthracnose disease in chilli pepper (Capsicum spp.), were developed using a modified hybridization procedure. Twenty-seven robust primer pairs were designed from microsatellite flanking sequences and were characterized using 52 isolates from three countries: India, Sri Lanka and Thailand. The highest gene diversity, 0.857, was observed at CCSSR1, with up to 18 alleles among all the isolates, whereas the differentiation ranged from 0.05 to 0.45. The sequence-tagged microsatellite site markers developed in this study will be useful for genetic analyses of C. capsici populations. PMID:21564867
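    The gene diversity figure quoted above is commonly computed as Nei's gene diversity, H = 1 - Σ p_i², where p_i is the frequency of allele i at a locus. The abstract does not say which estimator the authors used, so the sketch below is an assumption for illustration; the function name and the `unbiased` flag are hypothetical.

```python
from collections import Counter


def gene_diversity(alleles, unbiased=False):
    """Nei's gene diversity for one locus from a list of observed alleles.

    alleles  -- one allele call per isolate (haploid) at the locus
    unbiased -- apply the small-sample correction n / (n - 1)
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele observations")
    # Sum of squared allele frequencies = probability two draws are identical.
    homozygosity = sum((count / n) ** 2 for count in Counter(alleles).values())
    h = 1.0 - homozygosity
    return h * n / (n - 1) if unbiased else h
```

    With many alleles at similar frequencies the value approaches 1, which is consistent with a locus carrying up to 18 alleles showing a high diversity.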

  1. Bacterial diversity assessment of pristine mangrove microbial community from Dhulibhashani, Sundarbans using 16S rRNA gene tag sequencing.

    PubMed

    Basak, Pijush; Pramanik, Arnab; Sengupta, Sohan; Nag, Sudip; Bhattacharyya, Anish; Roy, Debojyoti; Pattanayak, Rudradip; Ghosh, Abhrajyoti; Chattopadhyay, Dhrubajyoti; Bhattacharyya, Maitree

    2016-03-01

    Knowledge of microbial diversity and function in the Sundarbans ecosystem is still scarce, despite global advances in understanding microbial diversity. In the present study, we analyzed the diversity and distribution of bacteria in the tropical mangrove sediments of Sundarbans using 16S rRNA gene amplicon sequencing. The metagenome comprises 153,926 sequences with 108.8 Mbp of data and 55 ± 2% G + C content. Metagenome sequence data are available at NCBI under the Bioproject database with accession no. PRJNA245459. Bacterial community metagenome sequences were analyzed with MG-RAST software, which indicated the presence of 56,547 species belonging to 44 different phyla. The taxonomic analysis revealed the dominance of the phylum Proteobacteria within our dataset. Further taxonomic analysis revealed Bacteroidetes, Acidobacteria, Firmicutes, Actinobacteria, Nitrospirae, Cyanobacteria, Planctomycetes and Fusobacteria as the predominant bacterial assemblages in this largely pristine mangrove habitat. The distribution of the community datasets obtained from four sediment samples, originating from one sampling station at two different depths, provides a better understanding of the sediment bacterial diversity and its relationship to the ecosystem dynamics of this pristine mangrove sediment of Dhulibhashani, Sundarbans.

  2. EGENES: Transcriptome-Based Plant Database of Genes with Metabolic Pathway Information and Expressed Sequence Tag Indices in KEGG

    PubMed Central

    Masoudi-Nejad, Ali; Goto, Susumu; Jauregui, Ruy; Ito, Masumi; Kawashima, Shuichi; Moriya, Yuki; Endo, Takashi R.; Kanehisa, Minoru

    2007-01-01

    EGENES is a knowledge-based database for efficient analysis of plant expressed sequence tags (ESTs) that was recently added to the KEGG suite of databases. It links plant genomic information with higher order functional information in a single database. It also provides gene indices for each genome. The genomic information in EGENES is a collection of EST contigs constructed from assembly of ESTs. Due to the extremely large genomes of plant species, the bulk collection of data such as ESTs is a quick way to capture a complete repertoire of genes expressed in an organism. Using ESTs for reconstructing metabolic pathways is a new expansion in KEGG and provides researchers with a new resource for species in which only EST sequences are available. Functional annotation in EGENES is a process of linking a set of genes/transcripts in each genome with a network of interacting molecules in the cell. EGENES is a multispecies, integrated resource consisting of genomic, chemical, and network information containing a complete set of building blocks (genes and molecules) and wiring diagrams (biological pathways) to represent cellular functions. Using EGENES, genome-based pathway annotation and EST-based annotation can now be compared and mutually validated. The ultimate goals of EGENES will be to: bring new plant species into KEGG by clustering and annotating ESTs; abstract knowledge and principles from large-scale plant EST data; and improve computational prediction of systems of higher complexity. EGENES will be updated at least once a year. EGENES is publicly available and is accessible by the following link or by KEGG's navigation system (http://www.genome.jp/kegg-bin/create_kegg_menu?category=plants_egenes). PMID:17468225

  3. An expressed sequence tag (EST) library from developing fruits of an Hawaiian endemic mint (Stenogyne rugosa, Lamiaceae): characterization and microsatellite markers

    PubMed Central

    Lindqvist, Charlotte; Scheen, Anne-Cathrine; Yoo, Mi-Jeong; Grey, Paris; Oppenheimer, David G; Leebens-Mack, James H; Soltis, Douglas E; Soltis, Pamela S; Albert, Victor A

    2006-01-01

    Background The endemic Hawaiian mints represent a major island radiation that likely originated from hybridization between two North American polyploid lineages. In contrast with the extensive morphological and ecological diversity among taxa, ribosomal DNA sequence variation has been found to be remarkably low. In the past few years, expressed sequence tag (EST) projects on plant species have generated a vast amount of publicly available sequence data that can be mined for simple sequence repeats (SSRs). However, these EST projects have largely focused on crop or otherwise economically important plants, and so far only a few studies have been published on the use of intragenic SSRs in natural plant populations. We constructed an EST library from developing fleshy nutlets of Stenogyne rugosa principally to identify genetic markers for the Hawaiian endemic mints. Results The Stenogyne fruit EST library consisted of 628 unique transcripts derived from 942 high quality ESTs, with 68% of unigenes matching Arabidopsis genes. Relative frequencies of Gene Ontology functional categories were broadly representative of the Arabidopsis proteome. Many unigenes were identified as putative homologs of genes that are active during plant reproductive development. A comparison between unigenes from Stenogyne and tomato (both asterid angiosperms) revealed many homologs that may be relevant for fruit development. Among the 628 unigenes, a total of 44 potentially useful microsatellite loci were predicted. Several of these were successfully tested for cross-transferability to other Hawaiian mint species, and at least five of these demonstrated interesting patterns of polymorphism across a large sample of Hawaiian mints as well as close North American relatives in the genus Stachys. Conclusion Analysis of this relatively small EST library illustrated a broad GO functional representation. Many unigenes could be annotated to involvement in reproductive development. Furthermore, first tests

  4. Expression of the Arabidopsis transposable element Tag1 is targeted to developing gametophytes.

    PubMed Central

    Galli, Mary; Theriault, Angie; Liu, Dong; Crawford, Nigel M

    2003-01-01

    The Arabidopsis transposon Tag1 undergoes late excision during vegetative and germinal development in plants containing 35S-Tag1-GUS constructs. To determine if transcriptional regulation can account for the developmental control of Tag1 excision, the transcriptional activity of Tag1 promoter-GUS fusion constructs of various lengths was examined in transgenic plants. All constructs showed expression in the reproductive organs of developing flowers but no expression in leaves. Expression was restricted to developing gametophytes in both male and female lineages. Quantitative RT-PCR analysis confirmed that Tag1 expression predominates in the reproductive organs of flower buds. These results are consistent with late germinal excision of Tag1, but they cannot explain the vegetative excision activity of Tag1 observed with 35S-Tag1-GUS constructs. To resolve this issue, Tag1 excision was reexamined using elements with no adjacent 35S promoter sequences. Tag1 excision in this context is restricted to germinal events with no detectable vegetative excision. If a 35S enhancer sequence is placed next to Tag1, vegetative excision is restored. These results indicate that the intrinsic activity of Tag1 is restricted to germinal excision due to targeted expression of the Tag1 transposase to developing gametophytes and that this activity is altered by the presence of adjacent enhancers or promoters. PMID:14704189

  5. Shift in prokaryotic diversity in Arctic sediment along a continuum Glacier - River - Fjord using massive 16S rRNA gene tag sequencing

    NASA Astrophysics Data System (ADS)

    Laghdass, M.; Deloffre, J.; Lafite, R.; Hänni, C.; Gillet, B.; Cecillon, S.; Simonet, P.; Petit, F.

    2012-04-01

    In the Arctic environment, one of the indirect consequences of global climate warming is a significant increase in the amount of inland water during the spring thaw, resulting from the melting of snow cover and permafrost. These freshwater transfers to the coast cause sedimentary transfers. The Arctic fjords, which represent deep glacial valleys of the sea, are particularly vulnerable systems. Although previous studies have highlighted the potentially high bacterial diversity of the Arctic environment by pyrosequencing, this new-generation, high-throughput sequencing method does not escape the biases that affect classical molecular biology techniques at different stages of the analysis. In this context, our objective was to characterize the prokaryotic diversity associated with sediment transfer along a gradient from the head of the glacier to mud patch sediment in the Goule River, which flows into Kongsfjorden (Svalbard), during an active thaw. The prokaryotic diversity in sediment was characterized by combining massive 16S rRNA gene tag sequencing with a specific and original approach designed to overcome the bias associated with sampling and extraction. The sediment was extracted by three different methods. One method was done in duplicate. Negative controls performed at extraction and PCR stages were also sequenced. The phylogenetic analysis of the environmental samples below phylum level revealed significant changes in the diversity and function of the prokaryotic community along the gradient. The subglacial Goule River sediment is characterized by bacteria with specific functions: methylotrophic bacteria and aerobic chemoautolithotrophic bacteria (Alphaproteobacteria with Methylobacteriaceae), whereas the mouth of the Goule River and the freshwater part of the Goule River were dominated by sulphate-reducing bacteria, anaerobic chemoorganotrophs (Deltaproteobacteria with the Desulfobulbaceae and Desulfuromonadaceae) and by

  6. Repetitive genome elements in a European corn borer, Ostrinia nubilalis, bacterial artificial chromosome library were indicated by bacterial artificial chromosome end sequencing and development of sequence tag site markers: implications for lepidopteran genomic research.

    PubMed

    Coates, Brad S; Sumerford, Douglas V; Hellmich, Richard L; Lewis, Leslie C

    2009-01-01

    The European corn borer, Ostrinia nubilalis, is a serious pest of food, fiber, and biofuel crops in Europe, North America, and Asia and a model system for insect olfaction and speciation. A bacterial artificial chromosome library constructed for O. nubilalis contains 36 864 clones with an estimated average insert size of ≥120 kb and genome coverage of 8.8-fold. Screening OnB1 clones comprising approximately 2.76 genome equivalents determined the physical position of 24 sequence tag site markers, including markers linked to ecologically important and Bacillus thuringiensis toxin resistance traits. OnB1 bacterial artificial chromosome end sequence reads (GenBank dbGSS accessions ET217010 to ET217273) showed homology to annotated genes or expressed sequence tags and identified repetitive genome elements, O. nubilalis miniature subterminal inverted repeat transposable elements (OnMITE01 and OnMITE02), and ezi-like long interspersed nuclear elements. Mobility of OnMITE01 was demonstrated by the presence or absence in O. nubilalis of introns at two different loci. A (GTCT)n tetranucleotide repeat at the 5' ends of OnMITE01 and OnMITE02 is evidence for transposon-mediated movement of lepidopteran microsatellite loci. The number of repetitive elements in lepidopteran genomes will affect genome assembly and marker development. Single-locus sequence tag site markers described here have downstream application for integration within linkage maps and comparative genomic studies. PMID:19132072

  8. Nonlinear analysis of biological sequences

    SciTech Connect

    Torney, D.C.; Bruno, W.; Detours, V.

    1998-11-01

    This is the final report of a three-year, Laboratory Directed Research and Development (LDRD) project at the Los Alamos National Laboratory (LANL). The main objectives of this project involved deriving new capabilities for analyzing biological sequences. The authors focused on tabulating the statistical properties exhibited by Human coding DNA sequences and on techniques of inferring the phylogenetic relationships among protein sequences related by descent.

  9. Analysis of human collagen sequences.

    PubMed

    Nassa, Manisha; Anand, Pracheta; Jain, Aditi; Chhabra, Aastha; Jaiswal, Astha; Malhotra, Umang; Rani, Vibha

    2012-01-01

    The extracellular matrix is fast emerging as an important component mediating cell-cell interactions, along with its established role as a scaffold for cell support. Collagen, being the principal component of the extracellular matrix, has been implicated in a number of pathological conditions. However, collagens are complex protein structures belonging to a large family consisting of 28 members in humans; hence, there is a lack of in-depth information about their structural features. Annotating and appreciating the functions of these proteins is possible with the help of the numerous biocomputational tools that are currently available. This study reports a comparative analysis and characterization of the alpha-1 chain of human collagen sequences. Physico-chemical, secondary structural, functional and phylogenetic classification was carried out, on the basis of which collagens 12, 14 and 20, which belong to the FACIT collagen family, were identified as potential players in disease conditions, owing to certain atypical properties such as a very high aliphatic index, a low percentage of glycine and proline residues, and their proximity in evolutionary history. These collagen molecules might be important candidates to be investigated further for their role in skeletal disorders. PMID:22359431

  10. Utilizing Social Bookmarking Tag Space for Web Content Discovery: A Social Network Analysis Approach

    ERIC Educational Resources Information Center

    Wei, Wei

    2010-01-01

    Social bookmarking has gained popularity since the advent of Web 2.0. Keywords known as tags are created to annotate web content, and the resulting tag space composed of the tags, the resources, and the users arises as a new platform for web content discovery. Useful and interesting web resources can be located through searching and browsing based…

  11. Whole-genome sequence-based analysis of thyroid function.

    PubMed

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J; Traglia, Michela; Brown, Suzanne J; Mullin, Benjamin H; Shihab, Hashem A; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R; Beilby, John P; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D; Hui, Jennie; Lim, Ee M; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R B; Bell, Jordana T; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M; Naitza, Silvia; Walsh, John P; Spector, Tim; Davey Smith, George; Durbin, Richard; Richards, J Brent; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J; Wilson, Scott G

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10(-9)) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10(-14)). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10(-9)) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10(-11)). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function. PMID:25743335

  12. Whole-genome sequence-based analysis of thyroid function.

    PubMed

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J; Traglia, Michela; Brown, Suzanne J; Mullin, Benjamin H; Shihab, Hashem A; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R; Beilby, John P; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D; Hui, Jennie; Lim, Ee M; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R B; Bell, Jordana T; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M; Naitza, Silvia; Walsh, John P; Spector, Tim; Davey Smith, George; Durbin, Richard; Richards, J Brent; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J; Wilson, Scott G

    2015-03-06

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10(-9)) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10(-14)). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10(-9)) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10(-11)). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function.

  13. Molecular genetic analysis of activation-tagged transcription factors thought to be involved in photomorphogenesis

    SciTech Connect

    Neff, Michael M.

    2011-06-23

    This is a final report for Department of Energy Grant No. DE-FG02-08ER15927 entitled “Molecular Genetic Analysis of Activation-Tagged Transcription Factors Thought to be Involved in Photomorphogenesis”. Based on our preliminary photobiological and genetic analysis of the sob1-D mutant, we hypothesized that OBP3 is a transcription factor involved in both phytochrome and cryptochrome-mediated signal transduction. In addition, we hypothesized that OBP3 is involved in auxin signaling and root development. Based on our preliminary photobiological and genetic analysis of the sob2-D mutant, we also hypothesized that a related gene, LEP, is involved in hormone signaling and seedling development.

  14. Expressed sequence tags from larval gut of the European corn borer (Ostrinia nubilalis): Exploring candidate genes potentially involved in Bacillus thuringiensis toxicity and resistance

    PubMed Central

    Khajuria, Chitvan; Zhu, Yu Cheng; Chen, Ming-Shun; Buschman, Lawrent L; Higgins, Randall A; Yao, Jianxiu; Crespo, Andre LB; Siegfried, Blair D; Muthukrishnan, Subbaratnam; Zhu, Kun Yan

    2009-01-01

    Background Lepidoptera represents more than 160,000 insect species which include some of the most devastating pests of crops, forests, and stored products. However, the genomic information on lepidopteran insects is very limited. Only a few studies have focused on developing expressed sequence tag (EST) libraries from the guts of lepidopteran larvae. Knowledge of the genes that are expressed in the insect gut is crucial for understanding the basic physiology of food digestion, their interactions with Bacillus thuringiensis (Bt) toxins, and for discovering new targets for novel toxins for use in pest management. This study analyzed the ESTs generated from the larval gut of the European corn borer (ECB, Ostrinia nubilalis), one of the most destructive pests of corn in North America and the western world. Our goals were to establish an ECB larval gut-specific EST database as a genomic resource for future research and to explore candidate genes potentially involved in insect-Bt interactions and Bt resistance in ECB. Results We constructed two cDNA libraries from the guts of the fifth-instar larvae of ECB and sequenced a total of 15,000 ESTs from these libraries. A total of 12,519 ESTs (83.4%) appeared to be high quality with an average length of 656 bp. These ESTs represented 2,895 unique sequences, including 1,738 singletons and 1,157 contigs. Among the unique sequences, 62.7% encoded putative proteins that shared significant sequence similarities (E-value ≤ 10^-3) with the sequences available in GenBank. Our EST analysis revealed 52 candidate genes that potentially have roles in Bt toxicity and resistance. These genes encode 18 trypsin-like proteases, 18 chymotrypsin-like proteases, 13 aminopeptidases, 2 alkaline phosphatases and 1 cadherin-like protein. Comparisons of expression profiles of 41 selected candidate genes between Cry1Ab-susceptible and resistant strains of ECB by RT-PCR showed apparently decreased expressions in 2 trypsin-like and 2 chymotrypsin

  15. Phylogenetic Analysis of Poliovirus Sequences.

    PubMed

    Jorba, Jaume

    2016-01-01

    Comparative genomic sequencing is a major surveillance tool in the Polio Laboratory Network. Due to the rapid evolution of polioviruses (~1% per year), pathways of virus transmission can be reconstructed from the pathways of genomic evolution. Here, we describe three main phylogenetic methods: estimation of genetic distances, reconstruction of a maximum-likelihood (ML) tree, and estimation of substitution rates using Bayesian Markov chain Monte Carlo (MCMC). The data set used consists of complete capsid sequences from a survey of poliovirus sequences available in GenBank. PMID:26983737
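    The first of the three methods, estimation of genetic distances, can be illustrated with a minimal sketch. The abstract does not state which substitution model the chapter uses; the uncorrected p-distance and the Jukes-Cantor correction shown here are assumptions chosen for simplicity, and the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned nucleotide sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance (substitutions per site) from p-distance."""
    if not 0.0 <= p < 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    # The correction accounts for multiple substitutions at the same site.
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

    For closely related sequences the corrected and uncorrected values are nearly identical; the correction matters more as sequences diverge.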

  16. Development and Validation of Single Nucleotide Polymorphism (SNP) Markers from an Expressed Sequence Tag (EST) Database in Olive Flounder (Paralichthys olivaceus).

    PubMed

    Kim, Jung Eun; Lee, Young Mee; Lee, Jeong-Ho; Noh, Jae Koo; Kim, Hyun Chul; Park, Choul-Ji; Park, Jong-Won; Kim, Kyung-Kil

    2014-12-01

    For successful molecular breeding, the identification and functional characterization of breeding-related genes and the development of molecular breeding techniques using DNA markers are essential. Although developing a useful marker is costly in time, money and effort, many markers are being developed for molecular breeding, and developed markers have been used in many fields. Single nucleotide polymorphism (SNP) markers have been widely used for genomic research and breeding, but have hardly been validated for screening functional genes in olive flounder. We identified SNPs from an expressed sequence tag (EST) database in olive flounder; from a total of 4,327 ESTs, 693 contigs and 514 SNPs were detected, and these substitutions include 297 transitions and 217 transversions. From these 514 SNPs, 144 SNP markers were developed by selecting useful gene regions and were then applied to eight wild and eight cultured olive flounder (16 samples in total). In our experiments, only 32 markers detected polymorphism in the samples; these comprised 21 transitions and 11 transversions, and no indels were detected among the polymorphic SNPs. Heterozygosity of wild and cultured olive flounder using the 32 SNP markers was 0.34 and 0.29, respectively. In conclusion, we identified SNPs and polymorphism in olive flounder using newly designed markers, which supports the suitability of the developed markers for SNP detection and diversity analysis in olive flounder. The outcome of this study can serve as basic data for research on immunity genes and traits associated with SNPs.
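    The transition/transversion split reported above follows from a simple rule: a substitution between two purines (A, G) or two pyrimidines (C, T) is a transition, and any purine-pyrimidine change is a transversion. The sketch below illustrates that rule and the resulting ratio; the function names are hypothetical and nothing here reproduces the authors' actual pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(n_transitions, n_transversions):
    """Ratio of transition count to transversion count."""
    if n_transversions == 0:
        raise ValueError("ratio undefined with zero transversions")
    return n_transitions / n_transversions
```

    Applied to the counts in the abstract, `ts_tv_ratio(297, 217)` gives the ratio for all 514 detected SNPs and `ts_tv_ratio(21, 11)` gives it for the 32 polymorphic markers.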

  17. Analysis of passive cardiac constitutive laws for parameter estimation using 3D tagged MRI.

    PubMed

    Hadjicharalambous, Myrianthi; Chabiniok, Radomir; Asner, Liya; Sammut, Eva; Wong, James; Carr-White, Gerald; Lee, Jack; Razavi, Reza; Smith, Nicolas; Nordsletten, David

    2015-08-01

    An unresolved issue in patient-specific models of cardiac mechanics is the choice of an appropriate constitutive law, able to accurately capture the passive behavior of the myocardium, while still having uniquely identifiable parameters tunable from available clinical data. In this paper, we aim to facilitate this choice by examining the practical identifiability and model fidelity of constitutive laws often used in cardiac mechanics. Our analysis focuses on the use of novel 3D tagged MRI, providing detailed displacement information in three dimensions. The practical identifiability of each law is examined by generating synthetic 3D tags from in silico simulations, allowing mapping of the objective function landscape over parameter space and comparison of minimizing parameter values with original ground truth values. Model fidelity was tested by comparing these laws with the more complex transversely isotropic Guccione law, by characterizing their passive end-diastolic pressure-volume relation behavior, as well as by considering the in vivo case of a healthy volunteer. These results show that a reduced form of the Holzapfel-Ogden law provides the best balance between identifiability and model fidelity across the tests considered. PMID:25510227

  18. Genome Sequencing and Analysis Conference IV

    SciTech Connect

    Not Available

    1993-12-31

    J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV, held at Hilton Head, South Carolina, from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present, four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

  19. Active populations of rare microbes in oceanic environments as revealed by bromodeoxyuridine incorporation and 454 tag sequencing.

    PubMed

    Hamasaki, Koji; Taniguchi, Akito; Tada, Yuya; Kaneko, Ryo; Miki, Takeshi

    2016-02-01

    The "rare biosphere" consisting of thousands of low-abundance microbial taxa is important as a seed bank or a gene pool to maintain microbial functional redundancy and robustness of the ecosystem. Here we investigated contemporaneous growth of diverse microbial taxa including rare taxa and determined their variability in environmentally distinctive locations along a north-south transect in the Pacific Ocean in order to assess which taxa were actively growing and how environmental factors influenced bacterial community structures. A bromodeoxyuridine-labeling technique in combination with PCR amplicon pyrosequencing of 16S rRNA genes gave 215-793 OTUs from 1200 to 3500 unique sequences in the total communities and 175-299 OTUs from nearly 860 to 1800 sequences in the active communities. Unexpectedly, many of the active OTUs were not detected in the total fractions. Among these active but rare OTUs, some taxa (2-4% of rare OTUs) showed much higher abundance (>0.10% of total reads) in the active fraction than in the total fraction, suggesting that their contribution to bacterial community productivity or growth was much larger than that expected from their standing stocks at each location. An ordination plot from the principal component analysis showed that bacterial community compositions among the 4 sampling locations and between total and active fractions were distinct from each other. A redundancy analysis revealed that the variability of community compositions was significantly correlated with seawater temperature and dissolved oxygen concentration. Also, a variation partitioning analysis showed that the environmental factors explained 49% of the variability of community compositions and the distance only explained 4.0% of its variability. These results implied very dynamic change of community structures due to environmental filtering. The active bacterial populations are more diverse and spread further in the rare biosphere than we have ever seen. This study implied that rare

  1. Methyl-CpG island-associated genome signature tags

    SciTech Connect

    Dunn, John J

    2014-05-20

    Disclosed is a method for analyzing the organismic complexity of a sample through analysis of the nucleic acid in the sample. In the disclosed method, through a series of steps, including digestion with a type II restriction enzyme, ligation of capture adapters and linkers and digestion with a type IIS restriction enzyme, genome signature tags are produced. The sequences of a statistically significant number of the signature tags are determined and the sequences are used to identify and quantify the organisms in the sample. Various embodiments of the invention described herein include methods for using single point genome signature tags to analyze the related families present in a sample, methods for analyzing sequences associated with hyper- and hypo-methylated CpG islands, methods for visualizing organismic complexity change in a sampling location over time and methods for generating the genome signature tag profile of a sample of fragmented DNA.

  2. Analysis of DNA Sequence Variants Detected by High Throughput Sequencing

    PubMed Central

    Adams, David R; Sincan, Murat; Fajardo, Karin Fuentes; Mullikin, James C; Pierson, Tyler M; Toro, Camilo; Boerkoel, Cornelius F; Tifft, Cynthia J; Gahl, William A; Markello, Tom C

    2014-01-01

    The Undiagnosed Diseases Program at the National Institutes of Health uses High Throughput Sequencing (HTS) to diagnose rare and novel diseases. HTS techniques generate large numbers of DNA sequence variants, which must be analyzed and filtered to find candidates for disease causation. Despite the publication of an increasing number of successful exome-based projects, there has been little formal discussion of the analytic steps applied to HTS variant lists. We present the results of our experience with over 30 families for whom HTS sequencing was used in an attempt to find clinical diagnoses. For each family, exome sequence was augmented with high-density SNP-array data. We present a discussion of the theory and practical application of each analytic step and provide example data to illustrate our approach. The paper is designed to provide an analytic roadmap for variant analysis, thereby enabling a wide range of researchers and clinical genetics practitioners to perform direct analysis of HTS data for their patients and projects. PMID:22290882

  3. Global analysis of the Deinococcus radiodurans proteome by using accurate mass tags

    PubMed Central

    Lipton, Mary S.; Paša-Tolić, Ljiljana; Anderson, Gordon A.; Anderson, David J.; Auberry, Deanna L.; Battista, John R.; Daly, Michael J.; Fredrickson, Jim; Hixson, Kim K.; Kostandarithes, Heather; Masselon, Christophe; Markillie, Lye Meng; Moore, Ronald J.; Romine, Margaret F.; Shen, Yufeng; Stritmatter, Eric; Tolić, Nikola; Udseth, Harold R.; Venkateswaran, Amudhan; Wong, Kwong-Kwok; Zhao, Rui; Smith, Richard D.

    2002-01-01

    Understanding biological systems and the roles of their constituents is facilitated by the ability to make quantitative, sensitive, and comprehensive measurements of how their proteome changes, e.g., in response to environmental perturbations. To this end, we have developed a high-throughput methodology to characterize an organism's dynamic proteome based on the combination of global enzymatic digestion, high-resolution liquid chromatographic separations, and analysis by Fourier transform ion cyclotron resonance mass spectrometry. The peptides produced serve as accurate mass tags for the proteins and have been used to identify with high confidence >61% of the predicted proteome for the ionizing radiation-resistant bacterium Deinococcus radiodurans. This fraction represents the broadest proteome coverage for any organism to date and includes 715 proteins previously annotated as either hypothetical or conserved hypothetical. PMID:12177431

  4. Design and Analysis of Salmonid Tagging Studies in the Columbia Basin, Volume XVI; Alternative Designs for Future Adult PIT-Tag Detection Studies, 2000 Technical Report.

    SciTech Connect

    Perez-Comas, Jose A.; Skalski, John R.

    2000-09-25

    With the advent of the installation of a PIT-tag interrogation system in the Cascades Island fish ladder at Bonneville Dam (BON), and at other Columbia River Basin (CRB) dams, this overview describes in general terms what can and cannot be estimated under seven different scenarios of adult PIT-tag detection capabilities in the CRB. Moreover, this overview attempts to identify the minimal adult PIT-tag detection configurations required by the ten threatened CRB chinook and steelhead ESUs. A minimal adult PIT-tag detection configuration will require the installation of adult PIT-tag detection facilities at Bonneville Dam and another dam above BON. Thus, the Snake River spring/summer and fall chinook salmon, and the Snake River steelhead, will require a minimum of three dams with adult PIT-tag detection capabilities to guarantee estimates of "ocean survival" and of at least one independent, in-river returning adult survival (e.g., adult PIT-tag detection facilities at BON and LGR dams and at any other intermediary dam such as IHR). The Upper Columbia River spring chinook salmon and steelhead will also require a minimum of three dams with adult PIT-tag detection capabilities: BON and two other dams on the BON-WEL reach. The current CRB dam system configuration and BPA's and COE's commitment to install adult PIT-tag detectors only in major CRB projects will not allow the estimation of an "ocean survival" or of any in-river adult survival for the Lower Columbia River chinook salmon and steelhead. The Middle Columbia River steelhead ESU will require a minimum of two dams with adult PIT-tag detection capabilities: BON and another upstream dam on the BON-McN reach. Finally, in spite of their importance in terms of releases, PIT-tag survival studies for the Upper Willamette chinook and Upper Willamette steelhead ESUs cannot be performed with the current CRB dam system configuration and PIT-tag detection capabilities.

  5. New data base-independent, sequence tag-based scoring of peptide MS/MS data validates Mowse scores, recovers below threshold data, singles out modified peptides, and assesses the quality of MS/MS techniques.

    PubMed

    Savitski, Mikhail M; Nielsen, Michael L; Zubarev, Roman A

    2005-08-01

    The Mascot score (M-score) is one of the conventional validity measures in data base identification of peptides and proteins by MS/MS data. Although tremendously useful, M-score has a number of limitations. For the same MS/MS data, M-score may change if the protein data base is expanded. A low M-score may not necessarily mean a poor match but rather poor MS/MS quality. In addition, M-score does not fully utilize the advantage of combined use of the complementary fragmentation techniques, collisionally activated dissociation (CAD) and electron capture dissociation (ECD). To address these issues, a new data base-independent scoring method (S-score) was designed that is based on the maximum length of the peptide sequence tag provided by the combined CAD and ECD data. The quality of MS/MS spectra assessed by S-score allows poor data (39% of all MS/MS spectra) to be filtered out before the data base search, speeding up the data analysis and eliminating a major source of false positive identifications. Spectra with below-threshold M-scores (poor matches) but high S-scores are validated. Spectra with zero M-score (no data base match) but high S-score are classified as belonging to modified sequences. As an extension of S-score, an extremely reliable sequence tag was developed based on complementary fragments simultaneously appearing in CAD and ECD spectra. Comparison of this tag with the data base-derived sequence gives the most reliable peptide identification validation to date. The combined use of M- and S-scoring provides positive sequence identification from >25% of all MS/MS data, a 40% improvement over traditional M-scoring performed on the same Fourier transform MS instrumentation. The number of proteins reliably identified from Escherichia coli cell lysate thereby increased by 29% compared with the traditional M-score approach. Finally, S-scoring provides a quantitative measure of the quality of fragmentation techniques such as the minimum abundance of the precursor ion

  6. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  7. Fractal analysis of DNA sequence data

    SciTech Connect

    Berthelsen, C.L.

    1993-01-01

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method". Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than do invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.
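
    The random-walk representation mentioned above can be illustrated with a short sketch. The abstract does not state which mapping from bases to steps was used, so the purine/pyrimidine rule below (one widely used convention for one-dimensional "DNA walks") is an assumption, as is the function name. The sandbox estimate of fractal dimension is not reproduced here.

```python
def dna_walk(sequence):
    """Return the positions of a one-dimensional walk built from a DNA sequence.

    Assumed step rule: +1 for a purine (A, G), -1 for a pyrimidine (C, T).
    Any other symbol (N, gaps) is skipped and adds no step.
    """
    steps = {"A": 1, "G": 1, "C": -1, "T": -1}
    position = 0
    path = [position]
    for base in sequence.upper():
        step = steps.get(base)
        if step is None:
            continue
        position += step
        path.append(position)
    return path
```

    For the sequence AGCT this yields the path 0, 1, 2, 1, 0. A sequence with long-range correlations tends to drift further from the origin than a shuffled sequence of the same base content, which is the kind of difference the control sequences in the study are meant to expose.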

  8. Fractal Analysis of DNA Sequence Data

    NASA Astrophysics Data System (ADS)

    Berthelsen, Cheryl Lynn

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method." Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than do invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.

  9. Validation of Shewanella oneidensis MR-1 Small Proteins by AMT Tag-based Proteome Analysis

    SciTech Connect

    Romine, Margaret F.; Elias, Dwayne A.; Monroe, Matthew E.; Auberry, Kenneth J.; Fang, Ruihua; Fredrickson, Jim K.; Anderson, Gordon A.; Smith, Richard D.; Lipton, Mary S.

    2004-09-01

    Using stringent criteria for protein identification by accurate mass and time (AMT) tag mass spectrometric methodology, we detected 36 proteins <101 amino acids in length, including 10 that were annotated as hypothetical proteins, in 172 global tryptic digests of Shewanella oneidensis MR-1 proteins analyzed. Peptides that map to the conserved but functionally uncharacterized proteins SO4134 and SO2787 were the most frequently detected small proteins in these samples, while hypotheticals SO2669 and SO2063, conserved hypotheticals SO0335 and SO2176, and the SlyX protein (SO1063) were observed at frequencies similar to those of small ribosomal proteins expected to be abundant and of translation initiation factor IF-1, and are consequently likely to encode important cellular functions. In addition, 30 proteins, including three of the small proteins, that map to genes predicted to encode frameshifts, point mutations, or recoding signals were detected. Of these 30 genes, peptides that map to positions beyond internal stop codons were detected in 13 genes (SO0101, SO0419, SO0590, SO0738, SO1113, SO1211, SO3079, SO3130, SO3240, SO4231, SO4328, SO4422, and SO4657). While expression of the full-length formate dehydrogenase encoded by SO0101 can be explained by incorporation of selenocysteine at the internal stop codon, the mechanism of translating downstream sequences in the remaining genes remains unknown.

  10. Fluorescent Protein-Tagged Sindbis Virus E2 Glycoprotein Allows Single Particle Analysis of Virus Budding from Live Cells.

    PubMed

    Jose, Joyce; Tang, Jinghua; Taylor, Aaron B; Baker, Timothy S; Kuhn, Richard J

    2015-12-01

    Sindbis virus (SINV) is an enveloped, mosquito-borne alphavirus. Here we generated and characterized a fluorescent protein-tagged (FP-tagged) SINV and found that the presence of the FP-tag (mCherry) affected glycoprotein transport to the plasma membrane whereas the specific infectivity of the virus was not affected. We examined the virions by transmission electron cryo-microscopy and determined the arrangement of the FP-tag on the surface of the virion. The fluorescent proteins are arranged icosahedrally on the virus surface in a stable manner that did not adversely affect receptor binding or fusion functions of E2 and E1, respectively. The delay in surface expression of the viral glycoproteins, as demonstrated by flow cytometry analysis, contributed to a 10-fold reduction in mCherry-E2 virus titer. There is a 1:1 ratio of mCherry to E2 incorporated into the virion, which leads to a strong fluorescence signal and thus facilitates single-particle tracking experiments. We used the FP-tagged virus for high-resolution live-cell imaging to study the spatial and temporal aspects of alphavirus assembly and budding from mammalian cells. These processes were further analyzed by thin section microscopy. The results demonstrate that SINV buds from the plasma membrane of infected cells and is dispersed into the surrounding media or spread to neighboring cells facilitated by its close association with filopodial extensions.

  11. Four-dimensional B-spline-based motion analysis of tagged cardiac MR images

    NASA Astrophysics Data System (ADS)

    Ozturk, Cengizhan; McVeigh, Elliot R.

    1999-05-01

    In recent years, with development of new MRI techniques, noninvasive evaluation of global and regional cardiac function is becoming a reality. One of the methods used for this purpose is MRI tagging. In tagging, spatially encoded magnetic saturation planes, tags, are created within tissues. These act as temporary markers and move with the tissue. In cardiac tagging, tag deformation pattern provides useful qualitative and quantitative information about the functional properties of underlying myocardium. The measured deformation of a single tag plane contains only unidirectional information of the past motion. In order to track the motion of a cardiac material point, this sparse, single dimensional data has to be combined with similar information gathered from other tag sets and all time frames. Previously, several methods have been developed which rely on the specific geometry of the chambers. Here, we employ an image plane based, simple cartesian coordinate system and provide a stepwise method to describe the heart motion using a four-dimensional tensor product of B-splines. The proposed displacement and forward motion fields exhibited sub-pixel accuracy. Since our motion fields are parametric and based on an image plane based coordinate system, trajectories or other derived values (velocity, acceleration, strains...) can be calculated for any desired point on the MRI images. This method is sufficiently general so that the motion of any tagged structure can be tracked.

  12. Categorical and Specificity Differences between User-Supplied Tags and Search Query Terms for Images. An Analysis of "Flickr" Tags and Web Image Search Queries

    ERIC Educational Resources Information Center

    Chung, EunKyung; Yoon, JungWon

    2009-01-01

    Introduction: The purpose of this study is to compare characteristics and features of user supplied tags and search query terms for images on the "Flickr" Website in terms of categories of pictorial meanings and level of term specificity. Method: This study focuses on comparisons between tags and search queries using Shatford's categorization…

  13. A High-Throughput Data Mining of Single Nucleotide Polymorphisms in Coffea Species Expressed Sequence Tags Suggests Differential Homeologous Gene Expression in the Allotetraploid Coffea arabica

    PubMed Central

    Vidal, Ramon Oliveira; Mondego, Jorge Maurício Costa; Pot, David; Ambrósio, Alinne Batista; Andrade, Alan Carvalho; Pereira, Luiz Filipe Protasio; Colombo, Carlos Augusto; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães

    2010-01-01

    Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed. PMID:20864545

  14. Sequence and Phylogenetic Analysis of FAD Synthetase

    NASA Astrophysics Data System (ADS)

    Schubert, Luisa; Frago, Susana; Martínez-Júlvez, Marta; Medina, Milagros

    2006-08-01

    An evolutionary analysis of the sequences available to date for FAD synthetases has been carried out. Several identical conserved residues have been observed along the sequences of all the FAD synthetases analyzed, which might correlate with a role for these residues in the catalytic activity of the enzyme. Phylogenetic analysis shows that FAD synthetase sequences can be organized in two main clusters. One of them mainly contains temperature-, pressure- or pH-resistant organisms, whereas the other contains organisms with pathogenic character.

  15. In-vivo motion analysis of bi-ventricular hearts from tagged MR images

    NASA Astrophysics Data System (ADS)

    Park, Kyoungju; Axel, Leon; Metaxas, Dimitris N.

    2005-04-01

    We conduct experiments to look at the in-vivo cardiac motion during systole, to visualize heart contraction, and to examine the clinical usefulness. Our model-based technique incorporates subject-specific modeling, motion analysis and the extraction of clinically relevant parameters within one framework. Previous bi-ventricular model-based methods could only handle up to the mid-ventricles and had few test subjects. Our parameterized model includes the LV, RV and up to the basal area for full ventricular motion study. Finite element methods capture cardiac motion by tracking the material points from tagged Magnetic Resonance (MR) images. A number of experiments from ten subjects are evaluated and analyzed. We tested each subject several times and compared the resulting parameters to assess reproducibility and deviations. The resulting parameters can be used to describe the cardiac motion of normal subjects. The patterns of normal subjects were derived from experiments. While significant shape and motion variations were apparent in normal subjects, the quantitative analysis shows typical patterns. Generally, the basal area moves downwards and the apical area contracts towards the cavity. The principal strain analysis describes the directions and magnitudes of maximum shortening and maximum thickening.

  16. Chemical tagging of chlorinated phenols for their facile detection and analysis by NMR spectroscopy

    SciTech Connect

    Valdez, Carlos A.; Leif, Roald N.

    2015-03-22

    A derivatization method that employs diethyl (bromodifluoromethyl) phosphonate (DBDFP) to efficiently tag the endocrine disruptor pentachlorophenol (PCP) and other chlorinated phenols (CPs), along with their reliable detection and analysis by NMR, is presented. The method accomplishes the efficient alkylation of the hydroxyl group in CPs with the difluoromethyl (CF2H) moiety in extremely rapid fashion (5 min), at room temperature and in an environmentally benign manner. The approach proved successful in difluoromethylating a panel of 18 chlorinated phenols, yielding derivatives that displayed unique 1H and 19F NMR spectra, allowing for clear discrimination between isomerically related CPs. Due to its biphasic nature, the derivatization can be applied to both aqueous and organic mixtures where the analysis of CPs is required. Furthermore, the methodology demonstrates that PCP along with other CPs can be selectively derivatized in the presence of various aliphatic alcohols, underscoring the superiority of the approach over other general derivatization methods that indiscriminately modify all analytes in a given sample. The present work demonstrates the first application of NMR to the qualitative analysis of these highly toxic and environmentally persistent species.

  17. The Design and Analysis of Salmonid Tagging Studies in the Columbia Basin : Volume II: Experiment Salmonid Survival with Combined PIT-CWT Tagging.

    SciTech Connect

    Newman, Ken

    1997-06-01

    Experiment designs to estimate the effect of transportation on survival and return rates of Columbia River system salmonids are discussed along with statistical modeling techniques. Besides transportation, river flow and dam spill are necessary components in the design and analysis; otherwise, questions as to the effects of reservoir drawdowns and increased dam spill may never be satisfactorily answered. Four criteria for comparing different experiment designs are: (1) feasibility, (2) clarity of results, (3) scope of inference, and (4) time to learn. In this report, alternative designs for conducting experimental manipulations of smolt tagging studies to study effects of river operations such as flow levels, spill fractions, and transporting outmigrating salmonids around dams in the Columbia River system are presented. The principles of study design discussed in this report have broad implications for the many studies proposed to investigate both smolt and adult survival relationships. The concepts are illustrated for the case of the design and analysis of smolt transportation experiments. The merits of proposed transportation studies should be measured relative to these principles of proper statistical design and analysis.

  18. Ginger and turmeric expressed sequence tags identify signature genes for rhizome identity and development and the biosynthesis of curcuminoids, gingerols and terpenoids

    PubMed Central

    2013-01-01

    Background: Ginger (Zingiber officinale) and turmeric (Curcuma longa) accumulate important pharmacologically active metabolites at high levels in their rhizomes. Despite their importance, relatively little is known regarding gene expression in the rhizomes of ginger and turmeric. Results: In order to identify rhizome-enriched genes and genes encoding specialized metabolism enzymes and pathway regulators, we evaluated an assembled collection of expressed sequence tags (ESTs) from eight different ginger and turmeric tissues. Comparisons to publicly available sorghum rhizome ESTs revealed a total of 777 gene transcripts expressed in ginger/turmeric and sorghum rhizomes but apparently absent from other tissues. The list of rhizome-specific transcripts was enriched for genes associated with regulation of tissue growth, development, and transcription. In particular, transcripts for ethylene response factors and AUX/IAA proteins appeared to accumulate in patterns mirroring results from previous studies regarding rhizome growth responses to exogenous applications of auxin and ethylene. Thus, these genes may play important roles in defining rhizome growth and development. Additional associations were made for ginger and turmeric rhizome-enriched MADS box transcription factors, their putative rhizome-enriched homologs in sorghum, and rhizomatous QTLs in rice. Additionally, analysis of both primary and specialized metabolism genes indicates that ginger and turmeric rhizomes are primarily devoted to the utilization of leaf supplied sucrose for the production and/or storage of specialized metabolites associated with the phenylpropanoid pathway and putative type III polyketide synthase gene products. This finding reinforces earlier hypotheses predicting roles of this enzyme class in the production of curcuminoids and gingerols. Conclusion: A significant set of genes were found to be exclusively or preferentially expressed in the rhizome of ginger and turmeric. Specific

  19. Proteomic analysis of astrocytic secretion that regulates neurogenesis using quantitative amine-specific isobaric tagging

    SciTech Connect

    Yan, Hu; Zhou, Wenhao; Wei, Liming; Zhong, Fan; Yang, Yi

    2010-01-08

    Astrocytes are essential components of neurogenic niches that affect neurogenesis through membrane association and/or the release of soluble factors. To identify factors released from astrocytes that could regulate neural stem cell differentiation and proliferation, we used mild oxygen-glucose deprivation (OGD) to inhibit the secretory capacity of astrocytes. Using the Transwell co-culture system, we found that OGD-treated astrocytes could not promote neural stem cell differentiation and proliferation. Next, an isobaric tags for relative and absolute quantitation (iTRAQ) proteomics analysis was performed to identify the proteins in the supernatants of astrocytes (with or without OGD). Through a multi-step analysis and gene ontology classification, 130 extracellular proteins were identified, most of which were involved in neuronal development, the inflammatory response, extracellular matrix composition and supportive functions. Of these proteins, 44 had never been reported to be produced by astrocytes. Using ProteinPilot software analysis, we found that 60 extracellular proteins were significantly altered (27 upregulated and 33 downregulated) in the supernatant of OGD-treated astrocytes. Among these proteins, 7 have been reported to be able to regulate neurogenesis, while others may have the potential to regulate neurogenesis. This study profiles the major proteins released by astrocytes, which play important roles in the modulation of neurogenesis.

  20. A White Campion (Silene latifolia) floral expressed sequence tag (EST) library: annotation, EST-SSR characterization, transferability, and utility for comparative mapping

    PubMed Central

    Moccia, Maria Domenica; Oger-Desfeux, Christine; Marais, Gabriel AB; Widmer, Alex

    2009-01-01

    Background: Expressed sequence tag (EST) databases represent a valuable resource for the identification of genes in organisms with uncharacterized genomes and for development of molecular markers. One class of markers derived from EST sequences are simple sequence repeat (SSR) markers, also known as EST-SSRs. These are useful in plant genetic and evolutionary studies because they are located in transcribed genes and a putative function can often be inferred from homology searches. Another important feature of EST-SSR markers is their expected high level of transferability to related species, which makes them very promising for comparative mapping. In the present study we constructed a normalized EST library from floral tissue of Silene latifolia with the aim of identifying expressed genes and developing polymorphic molecular markers. Results: We obtained a total of 3662 high quality sequences from a normalized Silene cDNA library. These represent 3105 unigenes, with 73% of unigenes matching genes in other species. We found 255 sequences containing one or more SSR motifs. More than 60% of these SSRs were trinucleotides. A total of 30 microsatellite loci were identified from 106 ESTs having sufficient flanking sequences for primer design. The inheritance of these loci was tested via segregation analyses and their usefulness for linkage mapping was assessed in an interspecific cross. Tests for cross-amplification of the EST-SSR loci in other Silene species established their applicability to related species. Conclusion: The newly characterized genes and gene-derived markers from our Silene EST library represent a valuable genetic resource for future studies on Silene latifolia and related species. The polymorphism and transferability of EST-SSR markers facilitate comparative linkage mapping and analyses of genetic diversity in the genus Silene. PMID:19467153
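
    The SSR screening step described in this abstract can be sketched with a regular expression. The abstract does not state the software or thresholds the authors used, so the motif lengths (di- and trinucleotides), the minimum repeat count of five, and the function name below are illustrative assumptions only.

```python
import re

def find_ssrs(seq, motif_lengths=(2, 3), min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-mer followed by at least (min_repeats - 1) further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip homopolymer runs such as AAAAAA (mononucleotide repeats)
            if len(set(motif)) == 1:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Hypothetical EST fragment containing one trinucleotide and one dinucleotide SSR
est = "TTGACC" + "CAG" * 6 + "TGAC" + "AT" * 5 + "GGC"
ssrs = find_ssrs(est)
```

    Only perfect repeats are reported; a production screen would typically also check that enough flanking sequence remains on both sides for primer design, as the abstract notes.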

  1. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison.

    PubMed

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-07-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features.

  2. Preparation of a Ytterbium-tagged Gunshot Residue Standard for Quality Control in the Forensic Analysis of GSR.

    PubMed

    Hearns, Nigel G R; Laflèche, Denis N; Sandercock, Mark L

    2015-05-01

    Preparation of a ytterbium-tagged gunshot residue (GSR) reference standard for scanning electron microscopy and energy dispersive X-ray spectroscopic (SEM-EDS) microanalysis is reported. Two different chemical markers, ytterbium and neodymium, were evaluated by spiking the primers of 38 Special ammunition cartridges (no propellant, no projectile) and discharging them onto 12.7 mm diameter aluminum SEM pin stubs. Following SEM-EDS microanalysis, the majority of tri-component particles containing lead, barium, and antimony (PbBaSb) were successfully tagged with the chemical marker. Results demonstrate a primer spiked with 0.75% weight percent of ytterbium nitrate affords PbBaSb particles characteristic of GSR with a ytterbium inclusion efficiency of between 77% and 100%. Reproducibility of the method was verified, and durability of the ytterbium-tagged tri-component particles under repeated SEM-EDS analysis was also tested. The ytterbium-tagged PbBaSb particles impart synthetic traceability to a GSR reference standard and are suitable for analysis alongside case work samples, as a positive control for quality assurance purposes.

  3. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly.

    PubMed

    Feng, Yan-Jie; Liu, Qing-Feng; Chen, Meng-Yun; Liang, Dan; Zhang, Peng

    2016-01-01

    In phylogenetics and population genetics, a large number of loci are often needed to accurately resolve species relationships. Normally, loci are enriched by PCR and sequenced by Sanger sequencing, which is expensive when the number of amplicons is large. Next-generation sequencing (NGS) techniques are increasingly used for parallel amplicon sequencing, which reduces sequencing costs tremendously, but has not reduced preparation costs very much. Moreover, for most current NGS methods, amplicons need to be purified and quantified before sequencing and their lengths are also restricted (normally <700 bp). Here, we describe an approach to sequence pooled amplicons of any length using the Illumina platform. Using this method, amplicons are pooled at equal volume rather than at equal concentration, thus eliminating the laborious purification and quantification steps. We then shear the pooled amplicons, repair the ends, add sample-identifying linkers and pool multiple samples prior to Illumina library preparation. Data are then assembled using the transcriptome assembly program Trinity, which is optimized to deal with templates of highly varying quantities. We demonstrated the utility of our approach by recovering 93.5% of the target amplicons (size up to 1650 bp) in full length for a 16 taxa × 101 loci project, using ~2.0 GB of Illumina HiSeq paired-end 90-bp data. Overall, we validate a rapid, cost-effective and scalable approach to sequence a large number of targeted loci from a large number of samples that is particularly suitable for both phylogenetics and population genetics studies that require a modest scale of data. PMID:25959587
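
    Because samples are pooled after sample-identifying linkers are added, reads must be sorted back to their sample before assembly. The abstract does not describe how this is done, so the following is only a minimal sketch assuming each read begins with an exact copy of its linker; the linker sequences, sample names, and function name are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, linkers):
    """Group reads by the sample whose linker they start with, stripping the linker."""
    by_sample = defaultdict(list)
    for read in reads:
        for sample, linker in linkers.items():
            if read.startswith(linker):
                by_sample[sample].append(read[len(linker):])
                break
        else:
            # No linker matched at the start of the read
            by_sample["unassigned"].append(read)
    return dict(by_sample)

# Hypothetical linkers and reads
linkers = {"sample1": "ACGTAC", "sample2": "TGCATG"}
reads = ["ACGTACGGGTTT", "TGCATGAAAC", "CCCCCC"]
groups = demultiplex(reads, linkers)
```

    Exact prefix matching is the simplest choice; tolerating sequencing errors in the linker would require a mismatch-aware comparison.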

  5. Auditory sequence analysis and phonological skill.

    PubMed

    Grube, Manon; Kumar, Sukhbinder; Cooper, Freya E; Turton, Stuart; Griffiths, Timothy D

    2012-11-01

    This work tests the relationship between auditory and phonological skill in a non-selected cohort of 238 school students (age 11) with the specific hypothesis that sound-sequence analysis would be more relevant to phonological skill than the analysis of basic, single sounds. Auditory processing was assessed across the domains of pitch, time and timbre; a combination of six standard tests of literacy and language ability was used to assess phonological skill. A significant correlation between general auditory and phonological skill was demonstrated, plus a significant, specific correlation between measures of phonological skill and the auditory analysis of short sequences in pitch and time. The data support a limited but significant link between auditory and phonological ability with a specific role for sound-sequence analysis, and provide a possible new focus for auditory training strategies to aid language development in early adolescence. PMID:22951739

  6. Sequencing and Analysis of Neanderthal Genomic DNA

    PubMed Central

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Pääbo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2008-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library are of Neanderthal origin, the strongest being the ascertainment of sequence identities between Neanderthal and chimpanzee at sites where the human genomic sequence is different. These results enabled us to calculate the human-Neanderthal divergence time based on multiple randomly distributed autosomal loci. Our analyses suggest that on average the Neanderthal genomic sequence we obtained and the reference human genome sequence share a most recent common ancestor ~706,000 years ago, and that the human and Neanderthal ancestral populations split ~370,000 years ago, before the emergence of anatomically modern humans. Our finding that the Neanderthal and human genomes are at least 99.5% identical led us to develop and successfully implement a targeted method for recovering specific ancient DNA sequences from metagenomic libraries. This initial analysis of the Neanderthal genome advances our understanding of the evolutionary relationship of Homo sapiens and Homo neanderthalensis and signifies the dawn of Neanderthal genomics. PMID:17110569

  7. Investigating the genetics of Bti resistance using mRNA tag sequencing: application on laboratory strains and natural populations of the dengue vector Aedes aegypti

    PubMed Central

    Paris, Margot; Marcombe, Sebastien; Coissac, Eric; Corbel, Vincent; David, Jean-Philippe; Després, Laurence

    2013-01-01

    Mosquito control is often the main method used to reduce mosquito-transmitted diseases. In order to investigate the genetic basis of resistance to the bio-insecticide Bacillus thuringiensis subsp. israelensis (Bti), we used information on polymorphism obtained from cDNA tag sequences from pooled larvae of laboratory Bti-resistant and susceptible Aedes aegypti mosquito strains to identify and analyse 1520 single nucleotide polymorphisms (SNPs). Of the 372 SNPs tested, 99.2% were validated using DNA Illumina GoldenGate® array, with a strong correlation between the allelic frequencies inferred from the pooled and individual data (r = 0.85). A total of 11 genomic regions and five candidate genes were detected using a genome scan approach. One of these candidate genes showed significant departures from neutrality in the resistant strain at sequence level. Six natural populations from Martinique Island were sequenced for the 372 tested SNPs with a high transferability (87%), and association mapping analyses detected 14 loci associated with Bti resistance, including one located in a putative receptor for Cry11 toxins. Three of these loci were also significantly differentiated between the laboratory strains, suggesting that most of the genes associated with resistance might differ between the two environments. It also suggests that common selected regions might harbour key genes for Bti resistance. PMID:24187584

  8. Optimizing cancer genome sequencing and analysis

    PubMed Central

    Griffith, Malachi; Miller, Christopher A.; Griffith, Obi L.; Krysiak, Kilannin; Skidmore, Zachary L.; Ramu, Avinash; Walker, Jason R.; Dang, Ha X.; Trani, Lee; Larson, David E.; Demeter, Ryan T.; Wendl, Michael C.; McMichael, Joshua F.; Austin, Rachel E.; Magrini, Vincent; McGrath, Sean D.; Ly, Amy; Kulkarni, Shashikant; Cordes, Matthew G.; Fronick, Catrina C.; Fulton, Robert S.; Maher, Christopher A.; Ding, Li; Klco, Jeffery M.; Mardis, Elaine R.; Ley, Timothy J.; Wilson, Richard K.

    2015-01-01

    Summary: Tumors are typically sequenced to depths of 75–100× (exome) or 30–50× (whole genome). We demonstrate that current sequencing paradigms are inadequate for tumors that are impure, aneuploid or clonally heterogeneous. To reassess optimal sequencing strategies, we performed ultra-deep (up to ~312×) whole genome sequencing (WGS) and exome capture (up to ~433×) of a primary acute myeloid leukemia, its subsequent relapse, and a matched normal skin sample. We tested multiple alignment and variant calling algorithms and validated ~200,000 putative SNVs by sequencing them to depths of ~1,000×. Additional targeted sequencing provided over 10,000× coverage and ddPCR assays provided up to ~250,000× sampling of selected sites. We evaluated the effects of different library generation approaches, depth of sequencing, and analysis strategies on the ability to effectively characterize a complex tumor. This dataset, representing the most comprehensively sequenced tumor described to date, will serve as an invaluable community resource (dbGaP accession id phs000159). PMID:26645048
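
    A rough way to see why depth matters for impure or clonally heterogeneous tumors is to model variant-supporting reads as binomial draws. This is a simplified illustration, not the authors' analysis: it assumes independent reads, no sequencing error, and a caller that requires a fixed minimum number of variant reads (three here, a hypothetical threshold).

```python
from math import comb

def detection_probability(depth, vaf, min_alt_reads=3):
    """P(at least min_alt_reads variant reads) under Binomial(depth, vaf)."""
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer

# Hypothetical subclonal variant present in 2% of reads
low_depth = detection_probability(30, 0.02)
high_depth = detection_probability(300, 0.02)
```

    Under these assumptions a variant at a 2% allele fraction is rarely seen three times at 30× but usually is at 300×, which is consistent with the abstract's argument for deeper sequencing of heterogeneous samples.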

  9. RSAT 2015: Regulatory Sequence Analysis Tools.

    PubMed

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-07-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/.

  10. RSAT 2015: Regulatory Sequence Analysis Tools

    PubMed Central

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-01-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  12. Phylogenetic analysis of adenovirus sequences.

    PubMed

    Harrach, Balázs; Benko, Mária

    2007-01-01

    Members of the family Adenoviridae have been isolated from a large variety of hosts, including representatives from every major vertebrate class from fish to mammals. The high prevalence, together with the fairly conserved organization of the central part of their genomes, makes the adenoviruses one of the best models (if not the best) for studying viral evolution on a larger time scale. Phylogenetic calculation can infer the evolutionary distance among adenovirus strains on serotype, species, and genus levels, thus helping the establishment of a correct taxonomy on the one hand, and speeding up the process of typing new isolates on the other. Initially, four major lineages corresponding to four genera were recognized. Later, the demarcation criteria of lower taxon levels, such as species or types, could also be defined with phylogenetic calculations. A limited number of possible host switches have been hypothesized and convincingly supported. Application of the web-based BLAST and MultAlin programs and the freely available PHYLIP package, along with the TreeView program, enables everyone to make correct calculations. In addition to step-by-step instruction on how to perform phylogenetic analysis, critical points where typical mistakes or misinterpretation of the results might occur will be identified and hints for their avoidance will be provided. PMID:17656792
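
    The notion of evolutionary distance between aligned sequences can be made concrete with a small example. The abstract does not say which substitution model the chapter uses, so this sketch assumes aligned nucleotide sequences and the Jukes-Cantor (JC69) correction, d = -3/4 ln(1 - 4p/3), where p is the fraction of differing sites; the function names and sequences are hypothetical.

```python
import math

def p_distance(a, b):
    """Fraction of differing sites, ignoring columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor(p):
    """JC69 distance d = -3/4 * ln(1 - 4p/3); undefined for p >= 0.75."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical aligned fragments differing at 2 of 10 sites
p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
d = jukes_cantor(p)
```

    The corrected distance exceeds the raw proportion because some sites are expected to have changed more than once; a pairwise matrix of such distances is the usual input to distance-based tree building.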

  14. SMASH, a fragmentation and sequencing method for genomic copy number analysis

    PubMed Central

    Wang, Zihua; Andrews, Peter; Kendall, Jude; Ma, Beicong; Hakker, Inessa; Rodgers, Linda; Ronemus, Michael; Wigler, Michael; Levy, Dan

    2016-01-01

    Copy number variants (CNVs) underlie a significant amount of genetic diversity and disease. CNVs can be detected by a number of means, including chromosomal microarray analysis (CMA) and whole-genome sequencing (WGS), but these approaches either suffer from limited resolution (CMA) or are highly expensive for routine screening (both CMA and WGS). As an alternative, we have developed a next-generation sequencing-based method for CNV analysis termed SMASH, for short multiply aggregated sequence homologies. SMASH utilizes random fragmentation of input genomic DNA to create chimeric sequence reads, from which multiple mappable tags can be parsed using maximal almost-unique matches (MAMs). The SMASH tags are then binned and segmented, generating a profile of genomic copy number at the desired resolution. Because fewer reads are necessary relative to WGS to give accurate CNV data, SMASH libraries can be highly multiplexed, allowing large numbers of individuals to be analyzed at low cost. Increased genomic resolution can be achieved by sequencing to higher depth. PMID:27197213
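
    The binning step can be sketched as follows. This is not the SMASH implementation: it assumes tags have already been mapped to positions on a single chromosome, uses fixed-width bins, and normalizes by the median bin count; segmentation is omitted. The function names and example positions are hypothetical.

```python
from collections import Counter
from statistics import median

def bin_counts(tag_positions, bin_size, n_bins):
    """Count mapped tag start positions falling in each fixed-width bin."""
    counts = Counter(pos // bin_size for pos in tag_positions)
    return [counts.get(i, 0) for i in range(n_bins)]

def copy_ratio(counts):
    """Scale bin counts by their median so that a typical bin has ratio 1.0."""
    ref = median(counts)
    if ref == 0:
        raise ValueError("median bin count is zero; use larger bins or more tags")
    return [c / ref for c in counts]

# Hypothetical tag positions; the third bin has twice the typical tag count
positions = [5, 50, 120, 130, 210, 220, 230, 240, 310, 350]
ratios = copy_ratio(bin_counts(positions, bin_size=100, n_bins=4))
```

    Wider bins or more tags per bin reduce counting noise at the cost of resolution, which matches the abstract's remark that higher resolution is obtained by sequencing to greater depth.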

  15. Fluorogenic Tagging of Peptide and Protein 3-Nitrotyrosine with 4-(Aminomethyl)-benzenesulfonic Acid for Quantitative Analysis of Protein Tyrosine Nitration

    PubMed Central

    Sharov, Victor S.; Dremina, Elena S.; Galeva, Nadezhda A.; Gerstenecker, Gary S.; Li, Xiaobao; Dobrowsky, Rick T.; Stobaugh, John F.

    2010-01-01

    Protein 3-nitrotyrosine (3-NT) has been recognized as an important biomarker of nitroxidative stress associated with inflammatory and degenerative diseases, and biological aging. Analysis of protein-bound 3-NT continues to represent a challenge since in vivo it frequently does not accumulate on proteins in amounts detectable by quantitative analytical methods. Here, we describe a novel approach of fluorescent tagging and quantitation of peptide-bound 3-NT residues based on the selective reduction to 3-AT followed by reaction with 4-(amino-methyl)benzenesulfonic acid (ABS) in the presence of K3Fe(CN)6 to form a highly fluorescent 2-phenylbenzoxazole product. Synthetic 3-NT peptide (0.005–1 μM) upon reduction with 10 mM sodium dithionite and tagging with 2 mM ABS and 5 μM K3Fe(CN)6 in 0.1 M Na2HPO4 buffer (pH 9.0) was converted with yields >95% to a single fluorescent product incorporating two ABS molecules per 3-NT residue, with fluorescence excitation and emission maxima at 360 ± 2 and 490 ± 2 nm, respectively, and a quantum yield of 0.77 ± 0.08, based on reverse-phase LC with UV and fluorescence detection, fluorescence spectroscopy and LC–MS–MS analysis. This protocol was successfully tested for quantitative analysis of in vitro Tyr nitration in a model protein, rabbit muscle phosphorylase b, and in a complex mixture of proteins from C2C12 cultured cells exposed to peroxynitrite, with a detection limit of ca. 1 pmol 3-NT by fluorescence spectrometry, and an apparent LOD of 12 and 40 pmol for nitropeptides alone or in the presence of 100 μg digested cell proteins, respectively. LC–MS–MS analysis of ABS tagged peptides revealed that the fluorescent derivatives undergo efficient backbone fragmentations, allowing for sequence-specific characterization of protein Tyr nitration in proteomic studies. Fluorogenic tagging with ABS also can be instrumental for detection and visualization of protein 3-NT in LC and gel-based protein separations. PMID:20703364

  16. Fluorogenic Tagging of Peptide and Protein 3-Nitrotyrosine with 4-(Aminomethyl)-benzenesulfonic Acid for Quantitative Analysis of Protein Tyrosine Nitration.

    PubMed

    Sharov, Victor S; Dremina, Elena S; Galeva, Nadezhda A; Gerstenecker, Gary S; Li, Xiaobao; Dobrowsky, Rick T; Stobaugh, John F; Schöneich, Christian

    2010-01-01

    Protein 3-nitrotyrosine (3-NT) has been recognized as an important biomarker of nitroxidative stress associated with inflammatory and degenerative diseases, and biological aging. Analysis of protein-bound 3-NT continues to represent a challenge since in vivo it frequently does not accumulate on proteins in amounts detectable by quantitative analytical methods. Here, we describe a novel approach of fluorescent tagging and quantitation of peptide-bound 3-NT residues based on the selective reduction to 3-AT followed by reaction with 4-(amino-methyl)benzenesulfonic acid (ABS) in the presence of K(3)Fe(CN)(6) to form a highly fluorescent 2-phenylbenzoxazole product. Synthetic 3-NT peptide (0.005-1 μM) upon reduction with 10 mM sodium dithionite and tagging with 2 mM ABS and 5 μM K(3)Fe(CN)(6) in 0.1 M Na(2)HPO(4) buffer (pH 9.0) was converted with yields >95% to a single fluorescent product incorporating two ABS molecules per 3-NT residue, with fluorescence excitation and emission maxima at 360 ± 2 and 490 ± 2 nm, respectively, and a quantum yield of 0.77 ± 0.08, based on reverse-phase LC with UV and fluorescence detection, fluorescence spectroscopy and LC-MS-MS analysis. This protocol was successfully tested for quantitative analysis of in vitro Tyr nitration in a model protein, rabbit muscle phosphorylase b, and in a complex mixture of proteins from C2C12 cultured cells exposed to peroxynitrite, with a detection limit of ca. 1 pmol 3-NT by fluorescence spectrometry, and an apparent LOD of 12 and 40 pmol for nitropeptides alone or in the presence of 100 μg digested cell proteins, respectively. LC-MS-MS analysis of ABS tagged peptides revealed that the fluorescent derivatives undergo efficient backbone fragmentations, allowing for sequence-specific characterization of protein Tyr nitration in proteomic studies. Fluorogenic tagging with ABS also can be instrumental for detection and visualization of protein 3-NT in LC and gel-based protein separations. PMID

  17. Haplotypes of the TaGS5-A1 Gene Are Associated with Thousand-Kernel Weight in Chinese Bread Wheat

    PubMed Central

    Wang, Shasha; Yan, Xuefang; Wang, Yongyan; Liu, Hongmei; Cui, Dangqun; Chen, Feng

    2016-01-01

    In previous work, we cloned the TaGS5 gene and found an association of TaGS5-A1 alleles with agronomic traits. In this study, the promoter sequence of the TaGS5-A1 gene was isolated from bread wheat. Sequencing results revealed a G insertion at position -1925 bp of the TaGS5-A1 gene (relative to the ATG), which occurred in the Sp1 domain of the promoter sequence. Combined with the previously reported single nucleotide polymorphism (SNP) in the TaGS5-A1 exon sequence, four genotypes were formed at the TaGS5-A1 locus and were designated as TaGS5-A1a-a, TaGS5-A1a-b, TaGS5-A1b-a, and TaGS5-A1b-b, respectively. Analysis of the association of TaGS5-A1 alleles with agronomic traits indicated that cultivars with the TaGS5-A1a-b allele possessed significantly higher thousand-kernel weight (TKW) and lower plant height than cultivars with the TaGS5-A1a-a allele, and cultivars with the TaGS5-A1b-b allele showed higher TKW than cultivars with the TaGS5-A1b-a allele. The differences in these traits between the TaGS5-A1a-a and TaGS5-A1a-b alleles were larger than those between the TaGS5-A1b-a and TaGS5-A1b-b alleles, suggesting that the -1925G insertion plays a more important role in TaGS5-A1a genotypes than in TaGS5-A1b genotypes. qRT-PCR indicated that TaGS5-A1b-b showed a significantly higher expression level than the other three TaGS5-A1 haplotypes in mature seeds and also showed a significantly higher expression level than TaGS5-A1b-a at five different developmental stages of the seeds, suggesting that high expression of TaGS5-A1 was positively associated with high TKW in bread wheat. This study could provide a relatively superior genotype with respect to TKW for wheat breeding programs and could also provide important information for dissection of the regulatory mechanism of yield-related traits. PMID:27375643

  19. Sequence analysis by iterated maps, a review.

    PubMed

    Almeida, Jonas S

    2014-05-01

    Among alignment-free methods, Iterated Maps (IMs) are on a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, 'Chaos Game Representation'. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results. PMID:24162172
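
    The 'Chaos Game Representation' named in this abstract can be illustrated with a minimal Python sketch. It assumes the commonly used corner assignment for the four nucleotides and a start at the centre of the unit square; the names CORNERS and cgr_points are hypothetical and are not taken from the review.

```python
# Minimal Chaos Game Representation (CGR) sketch for DNA.
# Assumption: the usual corner layout A=(0,0), C=(0,1), G=(1,1), T=(1,0).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return one (x, y) coordinate per nucleotide of seq."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Iterated map: move halfway from the current point to the corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("ACGT"):
        print(point)
```

    Because each step halves the distance to a corner, the last k symbols of a sequence fix the cell of side 2^-k in which the point falls, whatever came before. This is one way to read the scale-free property described in the abstract.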

  20. Cladistic analysis of anuran POMC sequences.

    PubMed

    Alrubaian, Jasem; Danielson, Phillip; Walker, David; Dores, Robert M

    2002-03-01

    Procedures for performing cladistic analyses can provide powerful tools for understanding the evolution of neuropeptide and polypeptide hormone coding genes. These analyses can be done on either amino acid data sets or nucleotide data sets and can utilize several different algorithms that are dependent on distinct sets of operating assumptions and constraints. In some cases, the results of these analyses can be used to gauge phylogenetic relationships between taxa. Selecting the proper cladistic analysis strategy is dependent on the taxonomic level of analysis and the rate of evolution within the orthologous genes being evaluated. For example, previous studies have shown that the amino acid sequence of proopiomelanocortin (POMC), the common precursor for the melanocortins and beta-endorphin, can be used to resolve phylogenetic relationships at the class and order level. This study tested the hypothesis that POMC sequences could be used to resolve phylogenetic relationships at the family taxonomic level. Cladistic analyses were performed on amphibian POMC sequences characterized from the marine toad, Bufo marinus (family Bufonidae; this study), the spadefoot toad, Spea multiplicatus (family Pelobatidae), the African clawed frog, Xenopus laevis (family Pipidae) and the laughing frog, Rana ridibunda (family Ranidae). In these analyses the sequence of Australian lungfish POMC was used as the outgroup. The analyses were done at the amino acid level using the maximum parsimony algorithm and at the nucleotide level using the maximum likelihood algorithm. For the anuran POMC genes, analysis at the nucleotide level using the maximum likelihood algorithm generated a cladogram with higher bootstrap values than the maximum parsimony analysis of the POMC amino acid data set. For anuran POMC sequences, analysis of nucleotide sequences using the maximum likelihood algorithm would appear to be the preferred strategy for resolving phylogenetic relationships at the family taxonomic

  1. Computational analysis of wake structure and body forces on marine animal research tag

    NASA Astrophysics Data System (ADS)

    Rosanio, Matthew; Morrida, Jacob; Green, Melissa

    2013-11-01

    The Acousounde 3B marine animal research tag is used to study the relationship between the sounds made by whales and their behaviors, and ultimately to improve whale conservation efforts. In practical implementation, some researchers have attached external GPS Fastloc devices to the top surface of the tag, in order to accurately record the position of the whales throughout the deployment. There is a need to characterize the flow over the tag in order to better understand the body forces being exerted on it and how wake turbulence could affect noise measurements. The addition of the GPS Fastloc exacerbates both of these concerns, as it complicates the hydrodynamics of the device. Using CFD techniques, we were able to simulate the flow over the tag with a GPS attachment at multiple yaw angles. We used Pointwise to construct the mesh and Fluent to simulate the flow. We have also used flow visualization to experimentally validate our computational results. It was found that the GPS has a minimal effect on the wake of the tag at a 0 degree offset from the freestream flow. However, at increasing offset angles, the presence of the GPS greatly increased the amount of wake turbulence observed. Performed work while undergrad at Syracuse.

  2. A consensus linkage map for sugi (Cryptomeria japonica) from two pedigrees, based on microsatellites and expressed sequence tags.

    PubMed Central

    Tani, Naoki; Takahashi, Tomokazu; Iwata, Hiroyoshi; Mukai, Yuzuru; Ujino-Ihara, Tokuko; Matsumoto, Asako; Yoshimura, Kensuke; Yoshimaru, Hiroshi; Murai, Masafumi; Nagasaka, Kazutoshi; Tsumura, Yoshihiko

    2003-01-01

    A consensus map for sugi (Cryptomeria japonica) was constructed by integrating linkage data from two unrelated third-generation pedigrees, one derived from a full-sib cross and the other by self-pollination of F1 individuals. The progeny segregation data of the first pedigree were derived from cleaved amplified polymorphic sequences, microsatellites, restriction fragment length polymorphisms, and single nucleotide polymorphisms. The data of the second pedigree were derived from cleaved amplified polymorphic sequences, isozyme markers, morphological traits, random amplified polymorphic DNA markers, and restriction fragment length polymorphisms. Linkage analyses were done for the first pedigree with JoinMap 3.0, using its parameter set for progeny derived by cross-pollination, and for the second pedigree with the parameter set for progeny derived from selfing of F1 individuals. The 11 chromosomes of C. japonica are represented in the consensus map. A total of 438 markers were assigned to 11 large linkage groups, 1 small linkage group, and 1 nonintegrated linkage group from the second pedigree; their total length was 1372.2 cM. On average, the consensus map showed 1 marker every 3.0 cM. PCR-based codominant DNA markers such as cleaved amplified polymorphic sequences and microsatellite markers were distributed in all linkage groups and occupied about half of mapped loci. These markers are very useful for integration of different linkage maps, QTL mapping, and comparative mapping for evolutional study, especially for species with a large genome size such as conifers. PMID:14668402

  3. Dynamic changes in the composition of photosynthetic picoeukaryotes in the northwestern Pacific Ocean revealed by high-throughput tag sequencing of plastid 16S rRNA genes.

    PubMed

    Choi, Dong H; An, Sung M; Chun, Sungjun; Yang, Eun C; Selph, Karen E; Lee, Charity M; Noh, Jae H

    2016-02-01

    Photosynthetic picoeukaryotes (PPEs) are major oceanic primary producers. However, the diversity of such communities remains poorly understood, especially in the northwestern (NW) Pacific. We investigated the abundance and diversity of PPEs, and recorded environmental variables, along a transect from the coast to the open Pacific Ocean. High-throughput tag sequencing (using the MiSeq system) revealed the diversity of plastid 16S rRNA genes. The dominant PPEs changed at the class level along the transect. Prymnesiophyceae were the only dominant PPEs in the warm pool of the NW Pacific, but Mamiellophyceae dominated in coastal waters of the East China Sea. Phylogenetically, most Prymnesiophyceae sequences could not be resolved at lower taxonomic levels because no close relatives have been cultured. Within the Mamiellophyceae, the genera Micromonas and Ostreococcus dominated in marginal coastal areas affected by open water, whereas Bathycoccus dominated in the lower euphotic depths of oligotrophic open waters. Cryptophyceae and Phaeocystis (of the Prymnesiophyceae) dominated in areas affected principally by coastal water. We also defined the biogeographical distributions of Chrysophyceae, prasinophytes, Bacillariophyceae and Pelagophyceae. These distributions were influenced by temperature, salinity, and chlorophyll a and nutrient concentrations. PMID:26712350

  5. Analysis and Design of a Long Range PTFE Substrate UHF RFID Tag for Cargo Container Identification

    NASA Astrophysics Data System (ADS)

    Petrariu, Adrian-Ioan; Popa, Valentin

    2016-01-01

    In this paper, a high-performance microstrip antenna for a UHF (ultra high frequency) RFID (radio frequency identification) tag is designed, prototyped and tested. The antenna consists of two main components: a 1.52 mm RT/duroid 5880 laminate substrate on which the antenna is designed and a 10 mm polytetrafluoroethylene (PTFE) dielectric material placed as a separator between the antenna and the reference ground plane for the microstrip antenna. With this structure, the RFID tag can reach a maximum reading distance of 19 m, although the antenna has a compact size of 80 mm × 50 mm. The long reading distance is obtained by attaching to the antenna an RFID chip that can provide a reading sensitivity of -20.5 dBm. The wide bandwidth, from 677 MHz to 947 MHz measured at -10 dB, makes the tag usable worldwide, especially for cargo container identification, the main purpose of this research.

  6. Sequence analysis of the AAA protein family.

    PubMed Central

    Beyer, A.

    1997-01-01

    The AAA protein family, a recently recognized group of Walker-type ATPases, has been subjected to an extensive sequence analysis. Multiple sequence alignments revealed the existence of a region of sequence similarity, the so-called AAA cassette. The borders of this cassette were localized and within it, three boxes of a high degree of conservation were identified. Two of these boxes could be assigned to substantial parts of the ATP binding site (namely, to Walker motifs A and B); the third may be a portion of the catalytic center. Phylogenetic trees were calculated to obtain insights into the evolutionary history of the family. Subfamilies with varying degrees of intra-relatedness could be discriminated; these relationships are also supported by analysis of sequences outside the canonical AAA boxes: within the cassette are regions that are strongly conserved within each subfamily, whereas little or even no similarity between different subfamilies can be observed. These regions are well suited to define fingerprints for subfamilies. A secondary structure prediction utilizing all available sequence information was performed and the result was fitted to the general 3D structure of a Walker A/GTPase. The agreement was unexpectedly high and strongly supports the conclusion that the AAA family belongs to the Walker superfamily of A/GTPases. PMID:9336829

  7. Advancing the surgical implantation of electronic tags in fish: a gap analysis and research agenda based on a review of trends in intracoelomic tagging effects studies

    SciTech Connect

    Cooke, Steven J.; Woodley, Christa M.; Eppard, M. B.; Brown, Richard S.; Nielsen, Jennifer L.

    2011-03-08

    Early approaches to surgical implantation of electronic tags in fish often relied on trial and error; in recent years, however, there has been an interest in using scientific research to identify techniques and procedures that improve the outcome of surgical procedures and determine the effects of tagging on individuals. Here we summarize the trends in 108 peer-reviewed electronic tagging effect studies focused on intracoelomic implantation to determine opportunities for future research. To date, almost all of the studies have been conducted in freshwater, typically in laboratory environments, and have focused on biotelemetry devices. The majority of studies have focused on salmonids, cyprinids, ictalurids and centrarchids, with a regional bias towards North America, Europe and Australia. Most studies have focused on determining whether there is a negative effect of tagging relative to control fish, with proportionally fewer that have contrasted different aspects of the surgical procedure (e.g., methods of sterilization, incision location, wound closure material) that could advance the discipline. Many of these studies included routine endpoints such as mortality, growth, healing and tag retention, with fewer addressing sublethal measures such as swimming ability, predator avoidance, physiological costs, or fitness. Continued research is needed to further elevate the practice of electronic tag implantation in fish in order to ensure that the data generated are relevant to untagged conspecifics (i.e., no long-term behavioural or physiological consequences) and the surgical procedure does not impair the health and welfare status of the tagged fish. To that end, we advocate for i) rigorous controlled manipulations based on statistical designs that have adequate power, account for inter-individual variation, and include controls and shams, ii) studies that transcend the laboratory and the field with more studies in marine waters, iii) incorporation of knowledge and

  8. Information theory applications for biological sequence analysis.

    PubMed

    Vinga, Susana

    2014-05-01

    Information theory (IT) addresses the analysis of communication systems and has been widely applied in molecular biology. In particular, alignment-free sequence analysis and comparison greatly benefited from concepts derived from IT, such as entropy and mutual information. This review covers several aspects of IT applications, ranging from genome global analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles. IT has also been applied to high-level correlations that combine DNA, RNA or protein features with sequence-independent properties, such as gene mapping and phenotype analysis, and has also provided models based on communication systems theory to describe information transmission channels at the cell level and also during evolutionary processes. While not exhaustive, this review attempts to categorize existing methods and to indicate their relation with broader transversal topics such as genomic signatures, data compression and complexity, time series analysis and phylogenetic classification, providing a resource for future developments in this promising area.
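
    As a small illustration of the entropy concept this abstract refers to, the following Python sketch computes the Shannon entropy of the overlapping k-mer distribution of a sequence. The function name kmer_entropy and the choice of overlapping windows are assumptions made for the example, not a method taken from the review.

```python
import math
from collections import Counter


def kmer_entropy(seq, k=1):
    """Shannon entropy, in bits, of the overlapping k-mer distribution of seq."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0  # sequence shorter than k: no k-mers to count
    counts = Counter(kmers)
    total = len(kmers)
    # H = -sum(p * log2(p)) over the observed k-mer frequencies
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print(kmer_entropy("ACGTACGTAC", k=1))
    print(kmer_entropy("ACGTACGTAC", k=2))
```

    A sequence made of a single repeated symbol gives 0 bits, and a sequence using all four nucleotides equally often gives 2 bits for k = 1, which are the two extremes of this measure for DNA.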

  9. Computational identification and characterization of conserved miRNAs and their target genes in garlic (Allium sativum L.) expressed sequence tags.

    PubMed

    Panda, Debashis; Dehury, Budheswar; Sahu, Jagajjit; Barooah, Madhumita; Sen, Priyabrata; Modi, Mahendra K

    2014-03-10

    Endogenous small non-coding functional microRNAs (miRNAs), ~21 to 24 nucleotides in length, play a pivotal role in gene expression in plants and animals by silencing genes, either by degrading homologous mRNA or by blocking its translation. Although various high-throughput but time-consuming and expensive techniques, such as forward genetics and direct cloning, are employed to detect miRNAs in plants, comparative genomics complemented with novel bioinformatic tools paves the way for efficient and cost-effective identification of miRNAs through homology searches against previously known miRNAs. In this study, an attempt was made to identify and characterize conserved miRNAs in garlic expressed sequence tags (ESTs) through computational means. For identification of novel miRNAs in garlic, a total of 3227 known mature miRNAs of the plant kingdom (Viridiplantae) were searched for homology against 21,637 EST sequences, resulting in the identification of 6 potential miRNA candidates belonging to 6 different miRNA families. The psRNATarget server predicted 33 potential target genes and their probable functions for the six identified miRNA families in garlic. Most of the garlic miRNA target genes seem to encode transcription factors as well as genes involved in stress response, metabolism, plant growth and development. The results from the present study will shed more light on the molecular mechanisms of miRNAs in garlic, which may aid in the development of novel and precise techniques to understand post-transcriptional gene silencing mechanisms in response to stress.

  10. Analysis of 3-D Tongue Motion from Tagged and Cine Magnetic Resonance Images

    ERIC Educational Resources Information Center

    Xing, Fangxu; Woo, Jonghye; Lee, Junghoon; Murano, Emi Z.; Stone, Maureen; Prince, Jerry L.

    2016-01-01

    Purpose: Measuring tongue deformation and internal muscle motion during speech has been a challenging task because the tongue deforms in 3 dimensions, contains interdigitated muscles, and is largely hidden within the vocal tract. In this article, a new method is proposed to analyze tagged and cine magnetic resonance images of the tongue during…

  11. Sequence analysis by iterated maps, a review

    PubMed Central

    2014-01-01

    Among alignment-free methods, Iterated Maps (IMs) are on a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, ‘Chaos Game Representation’. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results. PMID:24162172

  12. Versatile Trans-Replication Systems for Chikungunya Virus Allow Functional Analysis and Tagging of Every Replicase Protein

    PubMed Central

    Utt, Age; Quirin, Tania; Saul, Sirle; Hellström, Kirsi; Ahola, Tero; Merits, Andres

    2016-01-01

    Chikungunya virus (CHIKV; genus Alphavirus, family Togaviridae) has recently caused several major outbreaks affecting millions of people. There are no licensed vaccines or antivirals, and the knowledge of the molecular biology of CHIKV, crucial for development of efficient antiviral strategies, remains fragmentary. CHIKV has a 12 kb positive-strand RNA genome, which is translated to yield a nonstructural (ns) or replicase polyprotein. CHIKV structural proteins are expressed from a subgenomic RNA synthesized in infected cells. Here we have developed CHIKV trans-replication systems, where replicase expression and RNA replication are uncoupled. Bacteriophage T7 RNA polymerase or cellular RNA polymerase II were used for production of mRNAs for CHIKV ns polyprotein and template RNAs, which are recognized by CHIKV replicase and encode for reporter proteins. CHIKV replicase efficiently amplified such RNA templates and synthesized large amounts of subgenomic RNA in several cell lines. This system was used to create tagged versions of ns proteins including nsP1 fused with enhanced green fluorescent protein and nsP4 with an immunological tag. Analysis of these constructs and a matching set of replicon vectors revealed that the replicases containing tagged ns proteins were functional and maintained their subcellular localizations. When cells were co-transfected with constructs expressing template RNA and wild type or tagged versions of CHIKV replicases, formation of characteristic replicase complexes (spherules) was observed. Analysis of mutations associated with noncytotoxic phenotype in CHIKV replicons showed that a low level of RNA replication is not a pre-requisite for reduced cytotoxicity. The CHIKV trans-replicase does not suffer from genetic instability and represents an efficient, sensitive and reliable tool for studies of different aspects of CHIKV RNA replication process. PMID:26963103

  13. The Design and Analysis of Salmonid Tagging Studies in the Columbia Basin; Volume XII; A Multinomial Model for Estimating Ocean Survival from Salmonid Coded Wire-Tag Data.

    SciTech Connect

    Ryding, Kristen E.; Skalski, John R.

    1999-06-01

    The purpose of this report is to illustrate the development of a stochastic model using coded wire-tag (CWT) release and age-at-return data, in order to regress first year ocean survival probabilities against coastal ocean conditions and climate covariates.

  14. Identification of genes expressed in human CD34+ hematopoietic stem/progenitor cells by expressed sequence tags and efficient full-length cDNA cloning

    PubMed Central

    Mao, Mao; Fu, Gang; Wu, Ji-Sheng; Zhang, Qing-Hua; Zhou, Jun; Kan, Li-Xin; Huang, Qiu-Hua; He, Kai-Li; Gu, Bai-Wei; Han, Ze-Guang; Shen, Yu; Gu, Jian; Yu, Ya-Ping; Xu, Shu-Hua; Wang, Ya-Xin; Chen, Sai-Juan; Chen, Zhu

    1998-01-01

    Hematopoietic stem/progenitor cells (HSPCs) possess the potential for self-renewal, proliferation, and differentiation toward different lineages of blood cells. These cells not only play a primordial role in hematopoietic development but also have important clinical applications. Characterization of the gene expression profile in CD34+ HSPCs may lead to a better understanding of the regulation of normal and pathological hematopoiesis. In the present work, genes expressed in human umbilical cord blood CD34+ cells were catalogued by partially sequencing a large number of cDNA clones [or expressed sequence tags (ESTs)] and analyzing these sequences with the tools of bioinformatics. Among 9,866 ESTs thus obtained, 4,697 (47.6%) showed identity to known genes in the GenBank database, 2,603 (26.4%) matched ESTs previously deposited in a public domain database, 1,415 (14.3%) were previously undescribed ESTs, and the remaining 1,151 (11.7%) were mitochondrial DNA, ribosomal RNA, or repetitive (Alu or L1) sequences. Integration of ESTs of known genes generated a profile including 855 genes that could be divided into different categories according to their functions. Some (8.2%) of the genes in this profile were considered related to early hematopoiesis. The possible functions of ESTs corresponding to so far unknown genes were explored by means of homology and functional motif searches. Moreover, attempts were made to generate libraries enriched for full-length cDNAs, to better explore the genes in HSPCs. Nearly 60% of the cDNA clones of mRNA under 2 kb in our libraries had 5′ ends upstream of the first ATG codon of the ORF. With this satisfactory result, we have developed an efficient working system that allowed fast sequencing of 32 full-length cDNAs, 16 of them being mapped to the chromosomes with radiation hybrid panels. This work may lay a basis for further research on the molecular network of hematopoietic regulation. PMID:9653160

  15. A versatile PCR-based tandem epitope tagging system for Streptomyces coelicolor genome.

    PubMed

    Kim, Ji-Nu; Yi, Jeong Sang; Lee, Bo-Rahm; Kim, Eun-Jung; Kim, Min Woo; Song, Yoseb; Cho, Byung-Kwan; Kim, Byung-Gee

    2012-07-20

    Epitope tagging approaches have been widely used for the analysis of functions, interactions and subcellular distributions of proteins. However, incorporating epitope sequences into protein loci in Streptomyces is a time-consuming procedure due to the absence of versatile tagging methods. Here, we developed a versatile PCR-based tandem epitope tagging tool for Streptomyces genome engineering. We constructed a series of template plasmids that carry repeated sequences of the c-myc epitope, Flp recombinase target (FRT) sites, and an apramycin resistance marker to insert epitope tags into any desired spot of the chromosomal loci. A DNA module which includes the tandem epitope-encoding sequence and a selectable marker was amplified by PCR with primers that carry homologous extensions to the last portion and downstream region of the targeted gene. We fused the epitope tags at the 3' region of global transcription factors of Streptomyces coelicolor to test the validity of this system. The proper insertion of the epitope tag was confirmed by PCR and western blot analysis. The recombinants showed a phenotype identical to the wild type, indicating that the in vivo function of the tagged proteins was conserved. Finally, the direct binding targets were successfully detected by chromatin immunoprecipitation with an increased signal-to-noise ratio. The epitope tagging system described here should find wide application in studying protein functions in S. coelicolor. PMID:22704935

  16. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum)

    PubMed Central

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous, approximately 21nt in length, non-coding RNA, which mediate the expression of target genes primarily at post-transcriptional levels. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, which is an economically important small fruit crop, still remain totally unknown. In this study, we reported a computational identification of miRNAs and their targets in blueberry. By conducting an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156–5p, vco-miR156–3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic process. These findings will greatly contribute to future research in regard to functions and regulatory mechanisms of blueberry miRNAs. PMID:25763692
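
    The target search described here rests on sequence complementarity between a miRNA and a transcript. The following is a minimal sketch of that idea only: it slides the reverse complement of a miRNA along an EST and reports near-complementary windows. The function names, the toy sequences, and the mismatch threshold of 3 are assumptions made for illustration; the abstract does not state the filtering criteria or tools the authors used, and dedicated plant target predictors apply position-dependent scoring rather than a plain mismatch count.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_target_sites(mirna, est, max_mismatches=3):
    """Return (start, mismatches) for EST windows nearly complementary to mirna.

    ESTs are cDNA, so the miRNA is converted from RNA (U) to DNA (T) first.
    max_mismatches is an assumed threshold, not a value from the abstract.
    """
    mirna_dna = mirna.upper().replace("U", "T")
    # Sequence a perfectly complementary target site would have in the EST.
    probe = reverse_complement(mirna_dna)
    est = est.upper()
    hits = []
    for start in range(len(est) - len(probe) + 1):
        window = est[start:start + len(probe)]
        mismatches = sum(1 for a, b in zip(probe, window) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy data (hypothetical): a 20-nt miRNA-like sequence and an EST built
# around its perfectly complementary site, flanked by arbitrary bases.
toy_mirna = "UGACAGAAGAGAGUGAGCAC"
perfect_site = reverse_complement(toy_mirna.replace("U", "T"))
toy_est = "ATGGCC" + perfect_site + "TTAGGA"

print(find_target_sites(toy_mirna, toy_est))
```

    A hit with zero mismatches is reported at the offset where the complementary site was embedded; a real pipeline would also screen candidate precursors for hairpin structure before trusting any miRNA.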

  17. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum).

    PubMed

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous, approximately 21nt in length, non-coding RNA, which mediate the expression of target genes primarily at post-transcriptional levels. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, which is an economically important small fruit crop, still remain totally unknown. In this study, we reported a computational identification of miRNAs and their targets in blueberry. By conducting an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156-5p, vco-miR156-3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic process. These findings will greatly contribute to future research in regard to functions and regulatory mechanisms of blueberry miRNAs.

  18. A high-density genetic recombination map of sequence-tagged sites for sorghum, as a framework for comparative structural and evolutionary genomics of tropical grains and grasses.

    PubMed Central

    Bowers, John E; Abbey, Colette; Anderson, Sharon; Chang, Charlene; Draye, Xavier; Hoppe, Alison H; Jessup, Russell; Lemke, Cornelia; Lennington, Jennifer; Li, Zhikang; Lin, Yann-Rong; Liu, Sin-Chieh; Luo, Lijun; Marler, Barry S; Ming, Reiguang; Mitchell, Sharon E; Qiang, Dou; Reischmann, Kim; Schulze, Stefan R; Skinner, D Neil; Wang, Yue-Wen; Kresovich, Stephen; Schertz, Keith F; Paterson, Andrew H

    2003-01-01

    We report a genetic recombination map for Sorghum of 2512 loci spaced at average 0.4 cM (approximately 300 kb) intervals based on 2050 RFLP probes, including 865 heterologous probes that foster comparative genomics of Saccharum (sugarcane), Zea (maize), Oryza (rice), Pennisetum (millet, buffelgrass), the Triticeae (wheat, barley, oat, rye), and Arabidopsis. Mapped loci identify 61.5% of the recombination events in this progeny set and reveal strong positive crossover interference acting across intervals of sequence-tagged sites will foster many structural, functional and evolutionary genomic studies in major food, feed, and biomass crops. PMID:14504243

  19. A Unique Set of 11,008 Onion Expressed Sequence Tags Reveals Expressed Sequence and Genomic Differences between the Monocot Orders Asparagales and Poales

    PubMed Central

    Kuhl, Joseph C.; Cheung, Foo; Yuan, Qiaoping; Martin, William; Zewdie, Yayeh; McCallum, John; Catanach, Andrew; Rutherford, Paul; Sink, Kenneth C.; Jenderek, Maria; Prince, James P.; Town, Christopher D.; Havey, Michael J.

    2004-01-01

    Enormous genomic resources have been developed for plants in the monocot order Poales; however, it is not clear how representative the Poales are for the monocots as a whole. The Asparagales are a monophyletic order sister to the lineage carrying the Poales and possess economically important plants such as asparagus, garlic, and onion. To assess the genomic differences between the Asparagales and Poales, we generated 11,008 unique ESTs from a normalized cDNA library of onion. Sequence analyses of these ESTs revealed microsatellite markers, single nucleotide polymorphisms, and homologs of transposable elements. Mean nucleotide similarity between rice and the Asparagales was 78% across coding regions. Expressed sequence and genomic comparisons revealed strong differences between the Asparagales and Poales for codon usage and mean GC content, GC distribution, and relative GC content at each codon position, indicating that genomic characteristics are not uniform across the monocots. The Asparagales were more similar to eudicots than to the Poales for these genomic characteristics. PMID:14671025

  20. Molecular Genetic Analysis of Activation-tagged Transcription Factors Thought to be Involved in Photomorphogenesis

    SciTech Connect

    Neff, Michael

    2011-06-23

    Plants utilize light as a source of information via families of photoreceptors such as the red/far-red absorbing phytochromes (PHY) and the blue/UVA absorbing cryptochromes (CRY). The main goal of the Neff lab is to use molecular-genetic mutant screens to elucidate signaling components downstream of these photoreceptors. Activation-tagging mutagenesis led to the identification of two putative transcription factors that may be involved in both photomorphogenesis and hormone signaling pathways. sob1-D (suppressor of phyB-dominant) mutant phenotypes are caused by the over-expression of a Dof transcription factor previously named OBP3. Our previous studies indicate that OBP3 is a negative regulator of light-mediated cotyledon expansion and may be involved in modulating responsiveness to the growth-regulating hormone auxin. The sob2-D mutant uncovers a role for LEP, a putative AP2/EREBP-like transcription factor, in seed germination, hypocotyl elongation and responsiveness to the hormone abscisic acid. Based on photobiological and genetic analysis of OBP3-knockdown and LEP-null mutations, we hypothesize that these transcription factors are involved in both light-mediated seedling development and hormone signaling. To examine the role that these genes play in photomorphogenesis we will: 1) Further explore the genetic role of OBP3 in cotyledon/leaf expansion and other photomorphogenic processes as well as examine potential physical interactions between OBP3 and CRY1 or other signaling components that genetically interact with this transcription factor. 2) Test the hypothesis that OBP3 is genetically involved in auxin signaling and root development as well as examine the effects of this hormone and light on OBP3 protein accumulation. 3) Test the hypothesis that LEP is involved in seed germination, seedling photomorphogenesis and hormone signaling. Together these experiments will lead to a greater understanding of the complexity of interactions between photoreceptors and DNA

  1. In silico identification and characterization of conserved miRNAs and their target genes in sweet potato (Ipomoea batatas L.) expressed sequence tags (ESTs).

    PubMed

    Dehury, Budheswar; Panda, Debashis; Sahu, Jagajjit; Sahu, Mousumi; Sarma, Kishore; Barooah, Madhumita; Sen, Priyabrata; Modi, Mahendra

    2013-01-01

    The endogenous small non-coding micro RNAs (miRNAs), which are typically ~21-24 nucleotides (nt) long, play a crucial role in regulating the intrinsic normal growth of cells and development of the plants as well as in maintaining the integrity of genomes. These small non-coding RNAs function as the universal specificity factors in post-transcriptional gene silencing. Discovering miRNAs, identifying their targets, and further inferring miRNA functions is a routine process to understand normal biological processes of miRNAs and their roles in the development of plants. Comparative genomics-based approaches using expressed sequence tags (ESTs) and genome survey sequences (GSS) offer a cost-effective platform for identification and characterization of miRNAs and their target genes in plants. Despite the fact that sweet potato (Ipomoea batatas L.) is an important staple food source for poor small farmers throughout the world, the role of miRNA in various developmental processes remains largely unknown. In this paper, we report the computational identification of miRNAs and their target genes in sweet potato from their ESTs. Using a comparative genomics-based approach, 8 potential miRNA candidates belonging to miR168, miR2911, and miR156 families were identified from 23 406 ESTs in sweet potato. A total of 42 target genes were predicted and their probable functions were illustrated. Most of the newly identified miRNAs target transcription factors as well as genes involved in plant growth and development, signal transduction, metabolism, defense, and stress response. The identification of miRNAs and their targets is expected to accelerate the pace of miRNA discovery, leading to an improved understanding of the role of miRNA in development and physiology of sweet potato, as well as stress response.

  2. Sorghum expressed sequence tags identify signature genes for drought, pathogenesis, and skotomorphogenesis from a milestone set of 16,801 unique transcripts.

    PubMed

    Pratt, Lee H; Liang, Chun; Shah, Manish; Sun, Feng; Wang, Haiming; Reid, St Patrick; Gingle, Alan R; Paterson, Andrew H; Wing, Rod; Dean, Ralph; Klein, Robert; Nguyen, Henry T; Ma, Hong-Mei; Zhao, Xin; Morishige, Daryl T; Mullet, John E; Cordonnier-Pratt, Marie-Michèle

    2005-10-01

    Improved knowledge of the sorghum transcriptome will enhance basic understanding of how plants respond to stresses and serve as a source of genes of value to agriculture. Toward this goal, Sorghum bicolor L. Moench cDNA libraries were prepared from light- and dark-grown seedlings, drought-stressed plants, Colletotrichum-infected seedlings and plants, ovaries, embryos, and immature panicles. Other libraries were prepared with meristems from Sorghum propinquum (Kunth) Hitchc. that had been photoperiodically induced to flower, and with rhizomes from S. propinquum and johnsongrass (Sorghum halepense L. Pers.). A total of 117,682 expressed sequence tags (ESTs) were obtained representing both 3' and 5' sequences from about half that number of cDNA clones. A total of 16,801 unique transcripts, representing tentative UniScripts (TUs), were identified from 55,783 3' ESTs. Of these TUs, 9,032 are represented by two or more ESTs. Collectively, these libraries were predicted to contain a total of approximately 31,000 TUs. Individual libraries, however, were predicted to contain no more than about 6,000 to 9,000, with the exception of light-grown seedlings, which yielded an estimate of close to 13,000. In addition, each library exhibits about the same level of complexity with respect to both the number of TUs preferentially expressed in that library and the frequency with which two or more ESTs is found in only that library. These results indicate that the sorghum genome is expressed in highly selective fashion in the individual organs and in response to the environmental conditions surveyed here. Close to 2,000 differentially expressed TUs were identified among the cDNA libraries examined, of which 775 were differentially expressed at a confidence level of 98%. From these 775 TUs, signature genes were identified defining drought, Colletotrichum infection, skotomorphogenesis (etiolation), ovary, immature panicle, and embryo.

  3. NexGen Production – Sequencing and Analysis

    SciTech Connect

    Muzny, Donna

    2010-06-02

    Donna Muzny of the Baylor College of Medicine Human Genome Sequencing Center discusses next generation sequencing platforms and evaluating pipeline performance on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  4. An integrated mobile system for non-destructive analysis with tagged neutrons

    NASA Astrophysics Data System (ADS)

    Cester, D.; Nebbia, G.; Stevanato, L.; Viesti, G.; Neri, F.; Petrucci, S.; Selmi, S.; Tintori, C.

    2013-04-01

    An integrated mobile system for port security is presented. The system is designed to perform active investigations by using the tagged neutron inspection technique of suspect dangerous materials as well as passive measurements of neutrons and gamma rays to search and identify radioactive and special nuclear materials. The system has been employed in detection tests of special nuclear material as well as in a seaport demonstration.

  5. Evaluation of codon biology in citrus and Poncirus trifoliata based on genomic features and frame corrected expressed sequence tags.

    PubMed

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-04-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary process in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionary conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid.
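
    Relative synonymous codon usage (RSCU), named in this abstract, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch of that calculation under stated assumptions: only the alanine and glycine codon families are listed (the standard genetic code assignments for the GCT and GGT codons the abstract mentions), and the toy coding sequence and function names are hypothetical, not data or code from the study.

```python
from collections import Counter

# Synonymous codon families for two amino acids (standard genetic code).
# Only Ala and Gly are included to keep the sketch short.
SYNONYMOUS = {
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def codon_counts(cds):
    """Count in-frame codons; trailing bases that do not fill a codon are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def rscu(cds):
    """RSCU = observed codon count / mean count over its synonymous family."""
    counts = codon_counts(cds)
    values = {}
    for codons in SYNONYMOUS.values():
        family_total = sum(counts[c] for c in codons)
        if family_total == 0:
            continue  # amino acid absent from this sequence
        expected = family_total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Hypothetical toy sequence: GCT GCT GCT GCA GGT GGT GGC
toy_cds = "GCTGCTGCTGCAGGTGGTGGC"
values = rscu(toy_cds)
for codon in sorted(values):
    print(codon, round(values[codon], 2))
```

    An RSCU above 1 marks a codon used more often than expected under equal usage; in the toy sequence the T-ending codons GCT and GGT score highest, which is the kind of pattern the abstract describes as a preference for A- or T-ending codons.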

  6. Evaluation of codon biology in citrus and Poncirus trifoliata based on genomic features and frame corrected expressed sequence tags.

    PubMed

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-04-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary process in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionary conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid. PMID:23315666

  7. Evaluation of Codon Biology in Citrus and Poncirus trifoliata Based on Genomic Features and Frame Corrected Expressed Sequence Tags

    PubMed Central

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V.; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-01-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary process in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionary conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid. PMID:23315666

  8. Analysis and Improvement of a Pseudorandom Number Generator for EPC Gen2 Tags

    NASA Astrophysics Data System (ADS)

    Melia-Segui, J.; Garcia-Alfaro, J.; Herrera-Joancomarti, J.

    The EPC Gen2 is an international standard that proposes the use of Radio Frequency Identification (RFID) in the supply chain. It is designed to balance cost and functionality. The development of Gen2 tags faces, in fact, several challenging constraints such as cost, compatibility regulations, power consumption, and performance requirements. As a consequence, security on board of Gen2 tags is often minimal. It is, indeed, mainly based on the use of on board pseudorandomness. This pseudorandomness is used to blind the communication between readers and tags; and to acknowledge the proper execution of password-protected operations. Gen2 manufacturers are often reluctant to show the design of their pseudorandom generators. Security through obscurity has always been ineffective. Some open designs have also been proposed. Most of them fail, however, to prove their correctness. We analyze a recent proposal presented in the literature and demonstrate that it is, in fact, insecure. We propose an alternative mechanism that fits the Gen2 constraints and satisfies the security requirements.

  9. Photoswitching-free FRAP analysis with a genetically encoded fluorescent tag.

    PubMed

    Morisaki, Tatsuya; McNally, James G

    2014-01-01

    Fluorescence recovery after photobleaching (FRAP) is a widely used imaging technique for measuring protein dynamics in live cells that has provided many important biological insights. Although FRAP presumes that the conversion of a fluorophore from a bright to a dark state is irreversible, GFP as well as other genetically encoded fluorescent proteins now in common use can also exhibit a reversible conversion known as photoswitching. Various studies have shown how photoswitching can cause at least four different artifacts in FRAP, leading to false conclusions about various biological phenomena, including the erroneous identification of anomalous diffusion or the overestimation of the freely diffusible fraction of a cellular protein. Unfortunately, identifying and then correcting these artifacts is difficult. Here we report a new characteristic of an organic fluorophore tetramethylrhodamine bound to the HaloTag protein (TMR-HaloTag), which like GFP can be genetically encoded, but which directly and simply overcomes the artifacts caused by photoswitching in FRAP. We show that TMR exhibits virtually no photoswitching in live cells under typical imaging conditions for FRAP. We also demonstrate that TMR eliminates all of the four reported photoswitching artifacts in FRAP. Finally, we apply this photoswitching-free FRAP with TMR to show that the chromatin decondensation following UV irradiation does not involve loss of nucleosomes from the damaged DNA. In sum, we demonstrate that the TMR Halo label provides a genetically encoded fluorescent tag very well suited for accurate FRAP experiments. PMID:25233348

  10. Photoswitching-Free FRAP Analysis with a Genetically Encoded Fluorescent Tag

    PubMed Central

    Morisaki, Tatsuya; McNally, James G.

    2014-01-01

    Fluorescence recovery after photobleaching (FRAP) is a widely used imaging technique for measuring protein dynamics in live cells that has provided many important biological insights. Although FRAP presumes that the conversion of a fluorophore from a bright to a dark state is irreversible, GFP as well as other genetically encoded fluorescent proteins now in common use can also exhibit a reversible conversion known as photoswitching. Various studies have shown how photoswitching can cause at least four different artifacts in FRAP, leading to false conclusions about various biological phenomena, including the erroneous identification of anomalous diffusion or the overestimation of the freely diffusible fraction of a cellular protein. Unfortunately, identifying and then correcting these artifacts is difficult. Here we report a new characteristic of an organic fluorophore tetramethylrhodamine bound to the HaloTag protein (TMR-HaloTag), which like GFP can be genetically encoded, but which directly and simply overcomes the artifacts caused by photoswitching in FRAP. We show that TMR exhibits virtually no photoswitching in live cells under typical imaging conditions for FRAP. We also demonstrate that TMR eliminates all of the four reported photoswitching artifacts in FRAP. Finally, we apply this photoswitching-free FRAP with TMR to show that the chromatin decondensation following UV irradiation does not involve loss of nucleosomes from the damaged DNA. In sum, we demonstrate that the TMR Halo label provides a genetically encoded fluorescent tag very well suited for accurate FRAP experiments. PMID:25233348

  11. Protein sequence analysis using Hewlett-Packard biphasic sequencing cartridges in an applied biosystems 473A protein sequencer.

    PubMed

    Tang, S; Mozdzanowski, J; Anumula, K R

    1999-01-01

    Protein sequence analysis using an adsorptive biphasic sequencing cartridge, a set of two coupled columns introduced by Hewlett-Packard for protein sequencing by Edman degradation, in an Applied Biosystems 473A protein sequencer has been demonstrated. Samples containing salts, detergents, excipients, etc. (e.g., formulated protein drugs) can be easily analyzed using the ABI sequencer. Simple modifications to the ABI sequencer to accommodate the cartridge extend its utility in the analysis of difficult samples. The ABI sequencer solvents and reagents were compatible with the HP cartridge for sequencing. Sequence information up to ten residues can be easily generated by this nonoptimized procedure, and it is sufficient for identifying proteins by database search and for preparing a DNA probe for cloning novel proteins.

  12. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  13. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    PubMed

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  14. Tag removal in cardiac tagged MRI images using coupled dictionary learning.

    PubMed

    Makram, Abram W; Rushdi, Muhammad A; Khalifa, Ayman M; El-Wakad, Mohamed T

    2015-01-01

    Tagged Magnetic Resonance Imaging (tMRI) is considered to be the gold standard for quantitative assessment of the cardiac local functions. However, the tagging patterns and low myocardium-to-blood-pool contrast of tagged images bring great challenges to cardiac image processing and analysis tasks such as myocardium segmentation and tracking. Hence, there has been growing interest in techniques for removing tagging lines. In this work, a method for removing tagging patterns in tagged MR images using a coupled dictionary learning (CDL) model is proposed. In this model, identical sparse representations are assumed for image patches in the tagged MRI and corresponding cine MRI image spaces. First, we learn a dictionary for the tagged MRI image space. Then, we compute a dictionary for the cine MRI image space so that corresponding tagged and cine patches have the same sparse codes in terms of their respective dictionaries. Finally, in order to produce the de-tagged (cine version) of a test tagged image, the sparse codes of the tagged patches and the trained cine dictionary are used together to construct the de-tagged patches. We have tested this tag removal method on a dataset of tagged cardiac MR images. Our experimental results compared favorably with a recently proposed tag removal method that removes tags in the frequency domain using an optimal band-stop filter of harmonic peaks.

  15. Surface Plasmon Resonance Analysis of Histidine-Tagged F1-ATPase Surface Adsorption

    NASA Astrophysics Data System (ADS)

    Tucker, Jenifer K.; Richter, Mark L.; Berrie, Cindy L.

    2015-11-01

    Studies of the rotational activity of the enzymatic core (α3β3γ) of the F1-ATPase motor protein have relied on binding the enzyme to NTA-coated glass surfaces via polyhistidine tags engineered into the C-termini of each of the three α or β subunits. Those studies revealed the rotational motion of the central γ subunit by monitoring the motion of attached micron-long actin filaments or spherical nanoparticles. However, only a small percentage of the attached filaments or particles were observed to rotate, likely due, at least in part, to non-uniform surface attachment of the motor proteins. In this study, we have applied surface plasmon resonance to monitor the kinetics and affinity of binding of the His-tagged motor protein to NTA-coated gold sensor surfaces. The binding data, when fit to a heterogeneous binding model, exhibit two sets of adsorption-desorption rate constants with two dissociation constants of 4.0 × 10⁻⁹ M and 8.6 × 10⁻¹¹ M for 6His-α3β3γ binding to the nickel ion-activated NTA surface. The data are consistent with mixed attachment of the protein via two (bimodal) and three (trimodal) NTA/Ni²⁺-His-tag interactions, respectively, with the less stable bimodal interaction dominating. The results provide a partial explanation for the low number of surface-attached F1 motors previously observed in rotation studies and suggest alternative approaches to uniform F1 motor surface attachment for future fabrication of motor-based nanobiodevices and materials.
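
    To give a feel for what the two dissociation constants imply, the sketch below computes equilibrium fractional occupancy for each attachment mode, assuming two independent classes of simple 1:1 (Langmuir) sites. The Kd values are those in the abstract; the protein concentration, the maximum-response values, and the function names are hypothetical, and the authors' actual fitting procedure is not described here.

```python
# Dissociation constants from the abstract (mol/L).
KD_BIMODAL = 4.0e-9     # attachment via two NTA/Ni-His-tag interactions
KD_TRIMODAL = 8.6e-11   # attachment via three NTA/Ni-His-tag interactions


def fractional_occupancy(conc_molar, kd_molar):
    """Equilibrium occupancy of a 1:1 site: theta = C / (C + Kd)."""
    return conc_molar / (conc_molar + kd_molar)


def two_site_response(conc_molar, rmax_bimodal, rmax_trimodal):
    """Equilibrium signal as the sum of two independent site classes.

    rmax_* are the saturating responses of each class; the values used
    below are hypothetical placeholders, not fitted parameters.
    """
    return (rmax_bimodal * fractional_occupancy(conc_molar, KD_BIMODAL)
            + rmax_trimodal * fractional_occupancy(conc_molar, KD_TRIMODAL))


conc = 1.0e-9  # hypothetical protein concentration, 1 nM
print("bimodal occupancy :", round(fractional_occupancy(conc, KD_BIMODAL), 3))
print("trimodal occupancy:", round(fractional_occupancy(conc, KD_TRIMODAL), 3))
print("total response    :", round(two_site_response(conc, 70.0, 30.0), 2))
```

    At a concentration equal to Kd a site class is half occupied, so the roughly 50-fold difference between the two constants means the trimodal sites approach saturation at concentrations where the bimodal sites are still mostly empty.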

  16. Meshless deformable models for 3D cardiac motion and strain analysis from tagged MRI.

    PubMed

    Wang, Xiaoxu; Chen, Ting; Zhang, Shaoting; Schaerer, Joël; Qian, Zhen; Huh, Suejung; Metaxas, Dimitris; Axel, Leon

    2015-01-01

    Tagged magnetic resonance imaging (TMRI) provides a direct and noninvasive way to visualize the in-wall deformation of the myocardium. Because of through-plane motion, tracking the 3D trajectories of material points and computing the 3D strain field require building 3D cardiac deformable models. The intersections of three stacks of orthogonal tagging planes are material points in the myocardium. With these intersections as control points, 3D motion can be reconstructed with a novel meshless deformable model (MDM). Volumetric MDMs describe an object as a point cloud inside the object boundary, and the coordinates of each point can be written as parametric functions. A generic heart mesh is registered on the TMRI with polar decomposition. A 3D MDM is generated and deformed with MR image tagging lines. Volumetric MDMs are deformed by calculating the dynamics function and minimizing the local Laplacian coordinates. The similarity transformation of each point is computed by assuming its neighboring points are making the same transformation. The deformation is computed iteratively until the control points match the target positions in the consecutive image frame. The 3D strain field is computed from the 3D displacement field with moving least squares. We demonstrate that MDMs outperformed the finite element method and the spline method with a numerical phantom. Meshless deformable models can track the trajectory of any material point in the myocardium and compute the 3D strain field of any particular area. The experimental results on in vivo healthy and patient heart MRI show that the MDM can fully recover the myocardium motion in three dimensions.
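
    As a small illustration of how a strain tensor follows from a displacement field, the sketch below computes the Green-Lagrange strain from a local displacement gradient. The abstract does not state which strain measure the authors use, and it obtains the gradient by moving least squares, which is not reproduced here; the function name and the test deformation are hypothetical.

```python
def green_lagrange_strain(grad_u):
    """Green-Lagrange strain E = 0.5 * (F^T F - I), with F = I + grad_u.

    grad_u[i][j] is the derivative of displacement component i with respect
    to reference coordinate j (a 3x3 nested list).
    """
    n = 3
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Deformation gradient F = I + grad_u.
    F = [[identity[i][j] + grad_u[i][j] for j in range(n)] for i in range(n)]
    strain = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # (F^T F)[i][j] = sum_k F[k][i] * F[k][j]
            c_ij = sum(F[k][i] * F[k][j] for k in range(n))
            strain[i][j] = 0.5 * (c_ij - identity[i][j])
    return strain


# Hypothetical test case: a uniform 10% stretch along x only.
stretch = [[0.1, 0.0, 0.0],
           [0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0]]
strain = green_lagrange_strain(stretch)
```

    For this stretch the only nonzero component is E_xx = 0.5 × (1.1² − 1) = 0.105; a rigid rotation, by contrast, gives zero strain because F^T F is then the identity.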

  18. Meshless deformable models for 3D cardiac motion and strain analysis from tagged MRI

    PubMed Central

    Wang, Xiaoxu; Chen, Ting; Zhang, Shaoting; Schaerer, Joël; Qian, Zhen; Huh, Suejung; Metaxas, Dimitris; Axel, Leon

    2016-01-01

    Tagged magnetic resonance imaging (TMRI) provides a direct and noninvasive way to visualize the in-wall deformation of the myocardium. Because of through-plane motion, tracking the 3D trajectories of material points and computing the 3D strain field require building 3D cardiac deformable models. The intersections of three stacks of orthogonal tagging planes are material points in the myocardium. With these intersections as control points, 3D motion can be reconstructed with a novel meshless deformable model (MDM). Volumetric MDMs describe an object as a point cloud inside the object boundary, and the coordinates of each point can be written as parametric functions. A generic heart mesh is registered on the TMRI with polar decomposition. A 3D MDM is generated and deformed with MR image tagging lines. Volumetric MDMs are deformed by calculating the dynamics function and minimizing the local Laplacian coordinates. The similarity transformation of each point is computed by assuming its neighboring points are making the same transformation. The deformation is computed iteratively until the control points match the target positions in the consecutive image frame. The 3D strain field is computed from the 3D displacement field with moving least squares. We demonstrate that MDMs outperformed the finite element method and the spline method with a numerical phantom. Meshless deformable models can track the trajectory of any material point in the myocardium and compute the 3D strain field of any particular area. The experimental results on in vivo healthy and patient heart MRI show that the MDM can fully recover the myocardium motion in three dimensions. PMID:25157446

  19. Advances in analytical methodology for bioinorganic speciation analysis: metallomics, metalloproteomics and heteroatom-tagged proteomics and metabolomics.

    PubMed

    Szpunar, Joanna

    2005-04-01

    The recent developments in analytical techniques capable of providing information on the identity and quantity of heteroatom-containing biomolecules are critically discussed. Particular attention is paid to the emerging areas of bioinorganic analysis including: (i) a comprehensive analysis of the entirety of metal and metalloid species within a cell or tissue type (metallomics), (ii) the study of the part of the metallome involving the protein ligands (metalloproteomics), and (iii) the use of a heteroelement, naturally present in a protein or introduced in a tag added by means of derivatisation, for the spotting and quantification of proteins (heteroatom-tagged proteomics). Inductively coupled plasma mass spectrometry (ICP MS), used as detector in chromatography and electrophoresis, and supported by electrospray and MALDI MS, appears as the linchpin analytical technique for these emerging areas. This review focuses on the recent advances in ICP MS in biological speciation analysis including sensitive detection of non-metals, especially of sulfur and phosphorus, couplings to capillary and nanoflow HPLC and capillary electrophoresis, laser ablation ICP MS detection of proteins in gel electrophoresis, and isotope dilution quantification of biomolecules. The paper can be considered as a followup of a previous review by the author on a similar topic (J. Szpunar, Analyst, 2000, 125, 963).

  20. Integrating Sequence Evolution into Probabilistic Orthology Analysis.

    PubMed

    Ullah, Ikram; Sjöstrand, Joel; Andersson, Peter; Sennblad, Bengt; Lagergren, Jens

    2015-11-01

    Orthology analysis, that is, finding out whether a pair of homologous genes are orthologs - stemming from a speciation - or paralogs - stemming from a gene duplication - is of central importance in computational biology, genome annotation, and phylogenetic inference. In particular, an orthologous relationship makes functional equivalence of the two genes highly likely. A major approach to orthology analysis is to reconcile a gene tree to the corresponding species tree, (most commonly performed using the most parsimonious reconciliation, MPR). However, most such phylogenetic orthology methods infer the gene tree without considering the constraints implied by the species tree and, perhaps even more importantly, only allow the gene sequences to influence the orthology analysis through the a priori reconstructed gene tree. We propose a sound, comprehensive Bayesian Markov chain Monte Carlo-based method, DLRSOrthology, to compute orthology probabilities. It efficiently sums over the possible gene trees and jointly takes into account the current gene tree, all possible reconciliations to the species tree, and the, typically strong, signal conveyed by the sequences. We compare our method with PrIME-GEM, a probabilistic orthology approach built on a probabilistic duplication-loss model, and MrBayesMPR, a probabilistic orthology approach that is based on conventional Bayesian inference coupled with MPR. We find that DLRSOrthology outperforms these competing approaches on synthetic data as well as on biological data sets and is robust to incomplete taxon sampling artifacts. PMID:26130236

  1. OSIRIS-REx Touch-And-Go (TAG) Navigation Performance

    NASA Technical Reports Server (NTRS)

    Berry, Kevin; Antreasian, Peter; Moreau, Michael C.; May, Alex; Sutter, Brian

    2015-01-01

    The Origins Spectral Interpretation Resource Identification Security Regolith Explorer (OSIRIS-REx) mission is a NASA New Frontiers mission launching in 2016 to rendezvous with the near-Earth asteroid (101955) Bennu in late 2018. Following an extensive campaign of proximity operations activities to characterize the properties of Bennu and select a suitable sample site, OSIRIS-REx will fly a Touch-And-Go (TAG) trajectory to the asteroid's surface to obtain a regolith sample. The paper summarizes the mission design of the TAG sequence, the propulsive maneuvers required to achieve the trajectory, and the sequence of events leading up to the TAG event. The paper will summarize the Monte-Carlo simulation of the TAG sequence and present analysis results that demonstrate the ability to conduct the TAG within 25 meters of the selected sample site and ±2 cm/s of the targeted contact velocity. The paper will describe some of the challenges associated with conducting precision navigation operations and ultimately contacting a very small asteroid.

  2. OSIRIS-REx Touch and Go (TAG) Navigation Performance

    NASA Technical Reports Server (NTRS)

    Berry, Kevin; Antreasian, Peter; Moreau, Michael C.; May, Alex; Sutter, Brian

    2015-01-01

    The Origins Spectral Interpretation Resource Identification Security Regolith Explorer (OSIRIS-REx) mission is a NASA New Frontiers mission launching in 2016 to rendezvous with the near-Earth asteroid (101955) Bennu in late 2018. Following an extensive campaign of proximity operations activities to characterize the properties of Bennu and select a suitable sample site, OSIRIS-REx will fly a Touch-And-Go (TAG) trajectory to the asteroid's surface to obtain a regolith sample. The paper summarizes the mission design of the TAG sequence, the propulsive maneuvers required to achieve the trajectory, and the sequence of events leading up to the TAG event. The paper also summarizes the Monte-Carlo simulation of the TAG sequence and presents analysis results that demonstrate the ability to conduct the TAG within 25 meters of the selected sample site and 2 cm/s of the targeted contact velocity. The paper describes some of the challenges associated with conducting precision navigation operations and ultimately contacting a very small asteroid.

  3. [Genetic mapping of a gene for resistance to southern corn rust and tagging analysis in different genetic backgrounds].

    PubMed

    Chen, Cui-Xia; Xing, Quan-Hua; Liang, Chun-Yang; Yu, Yuan-Jie; Liang, Feng-Shan; Wang, Hong-Gang; Wang, Zhen-Lin; Wang, Bin

    2003-04-01

    Southern corn rust (SCR) is a destructive disease of maize. The inbred line Qi319 is highly resistant to southern corn rust. The SSR technique was employed for preliminary mapping of the resistance gene. Bulked segregant analysis revealed that two primers, phi 118 and phi 041, amplified polymorphic bands. SSR analysis of the populations indicated that the two primers were linked to the rust resistance gene, which was mapped to the short arm of chromosome 10. In addition, comparative analysis of the amplification bands among different populations revealed that the amplification products obtained with the same primer differed between populations. This result indicates that the genetic background may affect the results of gene mapping and tagging. It is therefore important to select a suitable population when performing molecular marker analysis and gene mapping.

  4. Multilocus sequence analysis (MLSA) in prokaryotic taxonomy.

    PubMed

    Glaeser, Stefanie P; Kämpfer, Peter

    2015-06-01

    To obtain a higher resolution of the phylogenetic relationships of species within a genus or genera within a family, multilocus sequence analysis (MLSA) is currently a widely used method. In MLSA studies, partial sequences of genes coding for proteins with conserved functions ('housekeeping genes') are used to generate phylogenetic trees and subsequently deduce phylogenies. However, MLSA is not only suggested as a phylogenetic tool to support and clarify the resolution of bacterial species with a higher resolution, as in 16S rRNA gene-based studies, but has also been discussed as a replacement for DNA-DNA hybridization (DDH) in species delineation. Nevertheless, despite the fact that MLSA has become an accepted and widely used method in prokaryotic taxonomy, no common generally accepted recommendations have been devised to date for either the whole area of microbial taxonomy or for taxa-specific applications of individual MLSA schemes. The different ways MLSA is performed can vary greatly for the selection of genes, their number, and the calculation method used when comparing the sequences obtained. Here, we provide an overview of the historical development of MLSA and critically review its current application in prokaryotic taxonomy by highlighting the advantages and disadvantages of the method's numerous variations. This provides a perspective for its future use in forthcoming genome-based genotypic taxonomic analyses.

  5. Analysis of mixtures using next generation sequencing of mitochondrial DNA hypervariable regions

    PubMed Central

    Kim, Hanna; Erlich, Henry A.; Calloway, Cassandra D.

    2015-01-01

    Aim: To apply massively parallel and clonal sequencing (next generation sequencing or NGS) to the analysis of forensic mixed samples. Methods: A duplex polymerase chain reaction (PCR) assay targeting the mitochondrial DNA (mtDNA) hypervariable regions I/II (HVI/HVII) was developed for NGS analysis on the Roche 454 GS Junior instrument. Eight sets of multiplex identifier-tagged 454 fusion primers were used in a combinatorial approach for amplification and deep sequencing of up to 64 samples in parallel. Results: This assay was shown to be highly sensitive for sequencing limited DNA amounts (~100 mtDNA copies) and analyzing contrived and biological mixtures with low level variants (~1%) as well as “complex” mixtures (≥3 contributors). PCR artifact “hybrid” sequences generated by jumping PCR or template switching were observed at a low level (<2%) in the analysis of mixed samples but could be eliminated by reducing the PCR cycle number. Conclusion: This study demonstrates the power of NGS technologies targeting the mtDNA HVI/HVII regions for analysis of challenging forensic samples, such as mixtures and specimens with limited DNA. PMID:26088845
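
    The combinatorial tagging arithmetic (eight tagged primer sets giving up to 64 samples) can be sketched as below. This assumes that "combinatorial" means pairing each of eight forward tags with each of eight reverse tags, which the abstract implies but does not state; the tag labels, sample names and function name are hypothetical.

```python
from itertools import product

# Hypothetical tag labels; the abstract only says eight tagged primer sets were used.
FORWARD_TAGS = [f"F{i}" for i in range(1, 9)]
REVERSE_TAGS = [f"R{i}" for i in range(1, 9)]


def assign_tag_pairs(sample_ids):
    """Give every sample a unique (forward tag, reverse tag) combination."""
    pairs = list(product(FORWARD_TAGS, REVERSE_TAGS))  # 8 x 8 = 64 combinations
    if len(sample_ids) > len(pairs):
        raise ValueError(f"at most {len(pairs)} samples can be multiplexed")
    return dict(zip(sample_ids, pairs))


samples = [f"sample_{n:02d}" for n in range(1, 65)]
assignment = assign_tag_pairs(samples)
```

    After sequencing, reads would be assigned back to samples by looking up the tag pair found at the two ends of each amplicon; a 65th sample would raise an error because no unused combination remains.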

  6. Comparative characterization of sweetpotato antioxidant genes from expressed sequence tags of dehydration-treated fibrous roots under different abiotic stress conditions.

    PubMed

    Kim, Yun-Hee; Jeong, Jae Cheol; Lee, Haeng-Soon; Kwak, Sang-Soo

    2013-04-01

    Drought stress is one of the most adverse conditions for plant growth and productivity. The plant antioxidant system is an important defense mechanism and includes antioxidant enzymes and low-molecular weight antioxidants. Understanding the biochemical and molecular responses to drought is essential for improving plant resistance to water-limited conditions. Previously, we isolated and characterized expressed sequence tags (ESTs) from a full-length enriched cDNA library prepared from fibrous roots of sweetpotato subjected to dehydration stress (Kim et al. in BMB Rep 42:271-276, [5]). In this study, we isolated and characterized 11 sweetpotato antioxidant genes from a sweetpotato EST library under various abiotic stress conditions, comprising six intracellular genes (CuZn superoxide dismutase (CuZnSOD), ascorbate peroxidase, catalase, glutathione peroxidase (GPX), glutathione-S-transferase, and thioredoxin (TRX)) and five extracellular peroxidase genes. Under dehydration treatments, almost all the antioxidant genes were induced in leaves, with the exception of extracellular swPB6, whereas some antioxidant genes showed increased expression levels in the fibrous roots, such as the intracellular GPX and TRX genes and the extracellular swPA4 and swPB7 genes. During various abiotic stress treatments in leaves, such as exposure to NaCl, cold, and abscisic acid, several intracellular antioxidant genes were strongly expressed compared with the expression of extracellular antioxidant genes. These results indicated that some intracellular antioxidant genes, especially swAPX1 and CuZnSOD, might be specifically involved in important defense mechanisms against oxidative stress induced by various abiotic stresses including dehydration in sweetpotato plants.

  7. Substantial prevalence of microdeletions of the Y-chromosome in infertile men with idiopathic azoospermia and oligozoospermia detected using a sequence-tagged site-based mapping strategy

    SciTech Connect

    Najmabadi, H.; Huang, V.; Bhasin, D.

    1996-04-01

    Genes on the long arm of Y (Yq), particularly within interval 6, are believed to play a critical role in human spermatogenesis. Cytogenetically detectable deletions of this region are associated with azoospermia in men, but are relatively uncommon. The objective of this study was to validate a sequence-tagged site (STS)-mapping strategy for the detection of Yq microdeletions and to use this method to determine the proportion of men with idiopathic azoospermia or severe oligozoospermia who carry microdeletions in Yq. STS mapping of a sufficiently large sample of infertile men should also help further localize the putative gene(s) involved in the pathogenesis of male infertility. Genomic DNA was extracted from peripheral leukocytes of 16 normal fertile men, 7 normal fertile women, 60 infertile men, and 15 patients with the X-linked disorder, ichthyosis. PCR primers were synthesized for 26 STSs that span Yq interval 6. None of the 16 normal men of known fertility had microdeletions. Seven normal fertile women failed to amplify any of the 26 STSs, providing evidence of their Y specificity. No microdeletions were detected in any of the 15 patients with ichthyosis. Of the 60 infertile men typed with 26 STSs, 11 (18%; 10 azoospermic and 1 oligozoospermic) failed to amplify 1 or more STS. Interestingly, 4 of the 11 patients had microdeletions in a region that is outside the Yq region from which the DAZ (deleted in azoospermia gene region) gene was cloned. In an additional 3 patients, microdeletions were present both inside and outside the DAZ region. The physical locations of these microdeletions provide further support for the concept that a gene(s) on Yq deletion interval 6 plays an important role in spermatogenesis. The presence of deletions that do not overlap with the DAZ region suggests that genes other than the DAZ gene may also be implicated in the pathogenesis of some subsets of male infertility. 48 refs., 2 figs., 2 tabs.

  8. Analyses of expressed sequence tags from the maize foliar pathogen Cercospora zeae-maydis identify novel genes expressed during vegetative, infectious, and reproductive growth

    PubMed Central

    Bluhm, Burton H; Dhillon, Braham; Lindquist, Erika A; Kema, Gert HJ; Goodwin, Stephen B; Dunkle, Larry D

    2008-01-01

    Background: The ascomycete fungus Cercospora zeae-maydis is an aggressive foliar pathogen of maize that causes substantial losses annually throughout the Western Hemisphere. Despite its impact on maize production, little is known about the regulation of pathogenesis in C. zeae-maydis at the molecular level. The objectives of this study were to generate a collection of expressed sequence tags (ESTs) from C. zeae-maydis and evaluate their expression during vegetative, infectious, and reproductive growth. Results: A total of 27,551 ESTs was obtained from five cDNA libraries constructed from vegetative and sporulating cultures of C. zeae-maydis. The ESTs, grouped into 4088 clusters and 531 singlets, represented 4619 putative unique genes. Of these, 36% encoded proteins similar (E value ≤ 10⁻⁵) to characterized or annotated proteins from the NCBI non-redundant database representing diverse molecular functions and biological processes based on Gene Ontology (GO) classification. We identified numerous, previously undescribed genes with potential roles in photoreception, pathogenesis, and the regulation of development as well as Zephyr, a novel, actively transcribed transposable element. Differential expression of selected genes was demonstrated by real-time PCR, supporting their proposed roles in vegetative, infectious, and reproductive growth. Conclusion: Novel genes that are potentially involved in regulating growth, development, and pathogenesis were identified in C. zeae-maydis, providing specific targets for characterization by molecular genetics and functional genomics. The EST data establish a foundation for future studies in evolutionary and comparative genomics among species of Cercospora and other groups of plant pathogenic fungi. PMID:18983654

  9. Expression of Epitope-Tagged Proteins in Mammalian Cells in Culture.

    PubMed

    Bhatt, Jay M; Styers, Melanie L; Sztul, Elizabeth

    2016-01-01

    Before the advent of molecular methods to tag proteins, visualization of proteins within cells required the use of antibodies directed against the protein of interest. Thus, only proteins for which antibodies were available could be visualized. Epitope tagging allows the detection of all proteins with existing sequence information, irrespective of the availability of antibodies directed against them. This technique involves the generation of DNA constructs that express the protein of interest tagged with an epitope that can be recognized by a commercially available antibody. Proteins can be tagged with a wide variety of epitopes using commercially available vectors that allow expression in mammalian cells. Epitope-tagged proteins are easily transfected into mammalian cell lines and, in most cases, tightly mimic the behavior of the endogenous protein. Tagged proteins exogenously expressed in cells provide different types of information depending on the subsequent detection approaches. Using immunofluorescence and immunoelectron microscopy with anti-tag antibodies, relative to known markers of cellular organelles, can provide information on the subcellular localization of the tagged protein and may provide clues regarding the protein's function. Immunofluorescence with anti-tag antibodies can also be utilized to assess the tagged protein's responses to cellular signals and pharmacological treatments. Immunoprecipitations with anti-tag antibodies can recover protein complexes containing the protein of interest, resulting in the identification of interacting proteins. Recovery of tagged proteins on affinity matrices allows their purification for use in biochemical assays. In addition, specialized fluorescent tags, such as the green fluorescent protein (GFP) allow the analysis of cellular dynamics in live cells in real time. PMID:27515071

  10. Spawning behavior in Atlantic cod: analysis by use of data storage tags

    USGS Publications Warehouse

    Grabowski, Timothy B.; Thorsteinsson, Vilhjalmur; Marteinsdóttir, Gudrún

    2014-01-01

    Electronic data storage tags (DSTs) were implanted into Atlantic cod captured in Icelandic waters from 2002 to 2007 and the depth profiles recovered from these tags (females: n = 31, males: n = 27) were used to identify patterns consistent with published descriptions of cod courtship and spawning behavior. The individual periods of time that males spent exhibiting behavior consistent with being present in a spawning aggregation (i.e., periods consisting of a clear tidal signature in the DST depth profile associated with an individual remaining on or near the substrate) were longer than those of females. Over the course of a spawning season, male cod spent approximately twice as much time in spawning aggregations as females, but female cod visited more aggregations per unit time. On average, males participated in approximately 57% more putative spawning events, i.e., vertical ascents potentially corresponding to gamete release, than did females. However, males <85 cm total length participated in the same number of putative spawning events as females of comparable size. In both sexes, larger individuals and/or individuals that spent a longer period of time within an aggregation participated in a larger number of putative spawning events. Although further validation and refinement are necessary, particularly in the identification of spawning events, the ability offered by DSTs to quantify cod spawning behavior may aid in the development of management and conservation plans.

  11. Kinetic analysis of his-tagged protein binding to nickel-chelating nanolipoprotein particles.

    PubMed

    Blanchette, Craig D; Fischer, Nicholas O; Corzett, Michele; Bench, Graham; Hoeprich, Paul D

    2010-07-21

    Nanolipoprotein particles (NLPs) are discoidal self-assembling membrane mimetics that have been primarily used as a platform for the solubilization and stabilization of membrane proteins. Nickel-chelating nanolipoprotein particles (NiNLPs) containing nickel-chelating lipids (Ni-lipid) for the targeted immobilization of His-tagged proteins hold promise as carriers of hydrophilic biological molecules for a range of applications. The effect of protein loading (i.e., the number of proteins bound per NiNLP) and Ni-lipid content on the time scales and kinetics of binding are important to various applications such as vaccine development, diagnostic imaging, and drug delivery. We have immobilized hexa-His-tagged LsrB, a Yersinia pestis transport protein, onto NiNLPs to examine the effect of protein binding stoichiometry and Ni-lipid content on the time scales and kinetics of protein binding by surface plasmon resonance (SPR). Data indicate that the dissociation half-time increases with Ni-lipid content up to a molar concentration of 35% and decreases as the number of bound protein per NiNLP increases. These findings indicate that the kinetics of protein binding are highly dependent on both the number of bound protein per NiNLP and Ni-lipid content. PMID:20586461

  12. Bayesian Correlation Analysis for Sequence Count Data

    PubMed Central

    Lau, Nelson; Perkins, Theodore J.

    2016-01-01

    Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities’ measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low—especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities’ signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset. PMID:27701449

  13. Coumarin tags for analysis of peptides by MALDI-TOF MS and MS/MS. 2. Alexa Fluor 350 tag for increased peptide and protein identification by LC-MALDI-TOF/TOF MS.

    PubMed

    Pashkova, Anna; Chen, Hsuan-Shen; Rejtar, Tomas; Zang, Xin; Giese, Roger; Andreev, Victor; Moskovets, Eugene; Karger, Barry L

    2005-04-01

    The goal of this study was the development of N-terminal tags to improve peptide identification using high-throughput MALDI-TOF/TOF MS. Part 1 of the study was focused on the influence of derivatization on the intensities of MALDI-TOF MS signals of peptides. In part 2, various derivatization approaches for the improvement of peptide fragmentation efficiency in MALDI-TOF/TOF MS are explored. We demonstrate that permanent cation tags, while significantly improving signal intensity in the MS mode, lead to severe suppression of MS/MS fragmentation, making these tags unsuitable for high-throughput MALDI-TOF/TOF MS analysis. In the present work, it was found that labeling with Alexa Fluor 350, a coumarin tag containing a sulfo group, along with guanidation of epsilon-amino groups of Lys, could enhance unimolecular fragmentation of peptides with the formation of a high-intensity y-ion series, while the peptide intensities in the MS mode were not severely affected. LC-MALDI-TOF/TOF MS analysis of tryptic peptides from the SCX fractions of an E. coli lysate revealed improved peptide scores, a doubling of the total number of peptides, and a 30% increase in the number of proteins identified, as a result of labeling. Furthermore, by combining the data from native and labeled samples, confidence in correct identification was increased, as many proteins were identified by different peptides in the native and labeled data sets. Additionally, derivatization was found not to impair chromatographic behavior of peptides. All these factors suggest that labeling with Alexa Fluor 350 is a promising approach to the high-throughput LC-MALDI-TOF/TOF MS analysis of proteomic samples.

  14. Integrative visual analysis of protein sequence mutations

    PubMed Central

    2014-01-01

    Background: An important aspect of studying the relationship between protein sequence, structure and function is the molecular characterization of the effect of protein mutations. To understand the functional impact of amino acid changes, the multiple biological properties of protein residues have to be considered together. Results: Here, we present a novel visual approach for analyzing residue mutations. It combines different biological visualizations and integrates them with molecular data derived from external resources. To show various aspects of the biological information on different scales, our approach includes one-dimensional sequence views, three-dimensional protein structure views and two-dimensional views of residue interaction networks as well as aggregated views. The views are linked tightly and synchronized to reduce the cognitive load of the user when switching between them. In particular, the protein mutations are mapped onto the views together with further functional and structural information. We also assess the impact of individual amino acid changes by the detailed analysis and visualization of the involved residue interactions. We demonstrate the effectiveness of our approach and the developed software on the data provided for the BioVis 2013 data contest. Conclusions: Our visual approach and software greatly facilitate the integrative and interactive analysis of protein mutations based on complementary visualizations. The different data views offered to the user are enriched with information about molecular properties of amino acid residues and further biological knowledge. PMID:25237389

  15. Whole genome sequence analysis of Mycobacterium suricattae.

    PubMed

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; van der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah Musa; Siame, Kabengele Keith; Gey van Pittius, Nicolaas Claudius; van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-12-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  16. Whole genome sequence analysis of Mycobacterium suricattae.

    PubMed

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; van der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah Musa; Siame, Kabengele Keith; Gey van Pittius, Nicolaas Claudius; van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-12-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi. PMID:26542221

  17. Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...

  18. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    SciTech Connect

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background-Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results-A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions-This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
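
    The high-quality SNP criterion quoted above (at least four sequences, with the minor allele present in at least two of them) can be illustrated with a minimal Python sketch. The function name `is_high_quality_snp` and its parameters are hypothetical, and the sketch assumes the depth of one alignment column stands in for the number of sequences in the contig; it is not the consortium's pipeline.

```python
from collections import Counter

VALID_BASES = {"A", "C", "G", "T"}


def is_high_quality_snp(column_bases, min_depth=4, min_minor=2):
    """Apply the depth and minor-allele thresholds to one alignment column.

    column_bases: iterable of bases, one per EST covering the position
    (hypothetical input format). Gaps and ambiguous bases are ignored.
    """
    counts = Counter(b.upper() for b in column_bases if b.upper() in VALID_BASES)
    # Need enough covering sequences and at least two distinct alleles.
    if sum(counts.values()) < min_depth or len(counts) < 2:
        return False
    # The minor allele is taken to be the second most frequent base.
    minor_count = counts.most_common(2)[1][1]
    return minor_count >= min_minor


print(is_high_quality_snp("AAGG"))  # two alleles, each seen twice
print(is_high_quality_snp("AAAG"))  # minor allele seen only once
```

    Treating the thresholds as parameters keeps the sketch reusable if a stricter or looser filter is wanted.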

  19. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    PubMed Central

    2010-01-01

    Background Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies. PMID:20096101

  20. An Analysis of the Effects of RFID Tags on Narrowband Navigation and Communication Receivers

    NASA Technical Reports Server (NTRS)

    LaBerge, E. F. Charles

    2007-01-01

    The simulated effects of the Radio Frequency Identification (RFID) tag emissions on ILS Localizer and ILS Glide Slope functions match the analytical models developed in support of DO-294B provided that the measured peak power levels are adjusted for 1) peak-to-average power ratio, 2) effective duty cycle, and 3) spectrum analyzer measurement bandwidth. When these adjustments are made, simulated and theoretical results are in extraordinarily good agreement. The relationships hold over a large range of potential interference-to-desired signal power ratios, provided that the adjusted interference power is significantly higher than the sum of the receiver noise floor and the noise-like contributions of all other interference sources. When the duty-factor adjusted power spectral densities are applied in the evaluation process described in Section 6 of DO-294B, most narrowband guidance and communications radios performance parameters are unaffected by moderate levels of RFID interference. Specific conclusions and recommendations are provided.

  1. Transcriptome Profile Analysis of Sugarcane Responses to Sporisorium scitaminea Infection Using Solexa Sequencing Technology

    PubMed Central

    Xu, Liping; Guo, Jinlong; Su, Yachun

    2013-01-01

    To understand the molecular basis of sugarcane-smut interaction, it is important to identify sugarcane genes that respond to the pathogen attack. High-throughput tag-sequencing (tag-seq) analysis by Solexa technology was performed on sugarcane infected with Sporisorium scitaminea, which should have massively increased the amount of data available for transcriptome profile analysis. After mapping to sugarcane EST databases in NCBI, we obtained 2015 differentially expressed genes, of which 1125 were upregulated and 890 downregulated by infection. Gene ontology (GO) analysis revealed that the differentially expressed genes are involved in many cellular processes. Pathway analysis revealed that metabolic pathways and ribosome function are significantly affected, where upregulation of expression dominates over downregulation. Differential expression of three candidate genes involved in MAP kinase signaling pathway, ScBAK1 (GenBank Accession number: KC857629), ScMapkk (GenBank Accession number: KC857627), and ScGloI (GenBank Accession number: KC857628), was confirmed by reverse transcription polymerase chain reaction (RT-PCR). Real-time quantitative PCR (qRT-PCR) analysis showed that the expression of all three genes was up-regulated after infection with S. scitaminea and that they may play a role in pathogen response in sugarcane. The present study provides insights into the molecular mechanism of sugarcane defense to S. scitaminea infection, leading to a more comprehensive understanding of sugarcane-smut interaction. PMID:24288673

  2. Multilocus sequence analysis of the family Halomonadaceae.

    PubMed

    de la Haba, Rafael R; Márquez, M Carmen; Papke, R Thane; Ventosa, Antonio

    2012-03-01

    Multilocus sequence analysis (MLSA) protocols have been developed for species circumscription for many taxa. However, at present, no studies based on MLSA have been performed within any moderately halophilic bacterial group. To test the usefulness of MLSA with these kinds of micro-organisms, the family Halomonadaceae, which includes mainly halophilic bacteria, was chosen as a model. This family comprises ten genera with validly published names and 85 species of environmental, biotechnological and clinical interest. In some cases, the phylogenetic relationships between members of this family, based on 16S rRNA gene sequence comparisons, are not clear and a deep phylogenetic analysis using several housekeeping genes seemed appropriate. Here, MLSA was applied using the 16S rRNA, 23S rRNA, atpA, gyrB, rpoD and secA genes for species of the family Halomonadaceae. Phylogenetic trees based on the individual and concatenated gene sequences revealed that the family Halomonadaceae formed a monophyletic group of micro-organisms within the order Oceanospirillales. With the exception of the genera Halomonas and Modicisalibacter, all other genera within this family were phylogenetically coherent. Five of the six studied genes (16S rRNA, 23S rRNA, gyrB, rpoD and secA) showed a consistent evolutionary history. However, the results obtained with the atpA gene were different; thus, this gene may not be considered useful as an individual gene phylogenetic marker within this family. The phylogenetic methods produced variable results, with those generated from the maximum-likelihood and neighbour-joining algorithms being more similar than those obtained by maximum-parsimony methods. Horizontal gene transfer (HGT) plays an important evolutionary role in the family Halomonadaceae; however, the impact of recombination events in the phylogenetic analysis was minimized by concatenating the six loci, which agreed with the current taxonomic scheme for this family. Finally, the findings of

  3. A Comparison of Hyperelastic Warping of PET Images with Tagged MRI for the Analysis of Cardiac Deformation

    DOE PAGESBeta

    Veress, Alexander I.; Klein, Gregory; Gullberg, Grant T.

    2013-01-01

    The objectives of the following research were to evaluate the utility of a deformable image registration technique known as hyperelastic warping for the measurement of local strains in the left ventricle through the analysis of clinical, gated PET image datasets. Two normal human male subjects were sequentially imaged with PET and tagged MRI imaging. Strain predictions were made for systolic contraction using warping analyses of the PET images and HARP based strain analyses of the MRI images. Coefficient of determination R² values were computed for the comparison of circumferential and radial strain predictions produced by each methodology. There was good correspondence between the methodologies, with R² values of 0.78 for the radial strains of both hearts and R² = 0.81 and R² = 0.83 for the circumferential strains. The strain predictions were not statistically different (P ≤ 0.01). A series of sensitivity results indicated that the methodology was relatively insensitive to alterations in image intensity, random image noise, and alterations in fiber structure. This study demonstrated that warping was able to provide strain predictions of systolic contraction of the LV consistent with those provided by tagged MRI Warping.
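
    As an illustration of the coefficient of determination used to compare the two strain estimates, the following Python sketch computes R² as the squared Pearson correlation between paired values. This is an assumption: the abstract does not state how R² was obtained, and the function name `r_squared` and the sample numbers are hypothetical, not data from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two paired samples.

    For a simple linear regression of y on x this equals the
    coefficient of determination.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined for a constant sample")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical paired strain values from two methods (illustrative only).
method_a = [1.0, 2.0, 3.0, 4.0]
method_b = [1.0, 3.0, 2.0, 4.0]
print(r_squared(method_a, method_b))
```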

  4. Shark Tagging Activities.

    ERIC Educational Resources Information Center

    Current: The Journal of Marine Education, 1998

    1998-01-01

    In this group activity, children learn about the purpose of tagging and how scientists tag a shark. Using a cut-out of a shark, students identify, measure, record data, read coordinates, and tag a shark. Includes introductory information about the purpose of tagging and the procedure, a data sheet showing original tagging data from Tampa Bay, and…

  5. Whole-Genome Sequencing in Outbreak Analysis

    PubMed Central

    Turner, Stephen D.; Riley, Margaret F.; Petri, William A.; Hewlett, Erik L.

    2015-01-01

    SUMMARY In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  6. Analysis of E. coli promoter sequences.

    PubMed Central

    Harley, C B; Reynolds, R P

    1987-01-01

    We have compiled and analyzed 263 promoters with known transcriptional start points for E. coli genes. Promoter elements (-35 hexamer, -10 hexamer, and spacing between these regions) were aligned by a program which selects the arrangement consistent with the start point and statistically most homologous to a reference list of promoters. The initial reference list was that of Hawley and McClure (Nucl. Acids Res. 11, 2237-2255, 1983). Alignment of the complete list was used for reference until successive analyses did not alter the structure of the list. In the final compilation, all bases in the -35 (TTGACA) and -10 (TATAAT) hexamers were highly conserved, 92% of promoters had inter-region spacing of 17 +/- 1 bp, and 75% of the uniquely defined start points initiated 7 +/- 1 bases downstream of the -10 region. The consensus sequence of promoters with inter-region spacing of 16, 17 or 18 bp did not differ. This compilation and analysis should be useful for studies of promoter structure and function and for programs which identify potential promoter sequences. PMID:3550697
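
    The promoter elements described above can be illustrated with a short Python sketch that slides the two consensus hexamers (TTGACA and TATAAT) along a sequence with a 16 to 18 bp spacer and reports the best-scoring placement. This swaps in a plain match count for the statistical homology measure used by the authors' alignment program, and the function names are hypothetical.

```python
CONSENSUS_35 = "TTGACA"  # -35 hexamer consensus from the abstract
CONSENSUS_10 = "TATAAT"  # -10 hexamer consensus from the abstract


def count_matches(hexamer, consensus):
    """Number of positions at which the hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def best_promoter(seq, spacings=(16, 17, 18)):
    """Return (score, start_of_-35_hexamer, spacer_length) or None.

    The score is the number of matching bases over both hexamers (max 12).
    Ties are resolved in favour of the first placement found.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 5):
        for gap in spacings:
            j = i + 6 + gap  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            score = (count_matches(seq[i:i + 6], CONSENSUS_35)
                     + count_matches(seq[j:j + 6], CONSENSUS_10))
            if best is None or score > best[0]:
                best = (score, i, gap)
    return best


# Synthetic sequence with perfect hexamers separated by a 17 bp spacer.
example = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGGGGG"
print(best_promoter(example))
```

    A real promoter finder would weight each position by its observed conservation rather than counting every match equally.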

  7. Whole-genome sequencing in outbreak analysis.

    PubMed

    Gilchrist, Carol A; Turner, Stephen D; Riley, Margaret F; Petri, William A; Hewlett, Erik L

    2015-07-01

    In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  8. Time fluctuation analysis of forest fire sequences

    NASA Astrophysics Data System (ADS)

    Vega Orozco, Carmen D.; Kanevski, Mikhaïl; Tonini, Marj; Golay, Jean; Pereira, Mário J. G.

    2013-04-01

    Forest fires are complex events involving both space and time fluctuations. Understanding of their dynamics and pattern distribution is of great importance in order to improve the resource allocation and support fire management actions at local and global levels. This study aims at characterizing the temporal fluctuations of forest fire sequences observed in Portugal, which is the country that holds the largest wildfire land dataset in Europe. This research applies several exploratory data analysis measures to 302,000 forest fires occurred from 1980 to 2007. The applied clustering measures are: Morisita clustering index, fractal and multifractal dimensions (box-counting), Ripley's K-function, Allan Factor, and variography. These algorithms enable a global time structural analysis describing the degree of clustering of a point pattern and defining whether the observed events occur randomly, in clusters or in a regular pattern. The considered methods are of general importance and can be used for other spatio-temporal events (i.e. crime, epidemiology, biodiversity, geomarketing, etc.). An important contribution of this research deals with the analysis and estimation of local measures of clustering that helps understanding their temporal structure. Each measure is described and executed for the raw data (forest fires geo-database) and results are compared to reference patterns generated under the null hypothesis of randomness (Poisson processes) embedded in the same time period of the raw data. This comparison enables estimating the degree of the deviation of the real data from a Poisson process. Generalizations to functional measures of these clustering methods, taking into account the phenomena, were also applied and adapted to detect time dependences in a measured variable (i.e. burned area). The time clustering of the raw data is compared several times with the Poisson processes at different thresholds of the measured function. Then, the clustering measure value

  9. Analysis of new microsatellite markers developed from reported sequences of Japanese flounder Paralichthys olivaceus

    NASA Astrophysics Data System (ADS)

    Yu, Haiyang; Jiang, Liming; Chen, Wei; Wang, Xubo; Wang, Zhigang; Zhang, Quanqi

    2010-12-01

    The expressed sequence tags (ESTs) of Japanese flounder, Paralichthys olivaceus, were selected from GenBank to identify simple sequence repeats (SSRs) or microsatellites. A bioinformatic analysis of 11111 ESTs identified 751 SSR-containing ESTs, including 440 dinucleotide, 254 trinucleotide, 53 tetranucleotide, 95 pentanucleotide and 40 hexanucleotide microsatellites respectively. The CA/TG and GA/TC repeats were the most abundant microsatellites. AT-rich types were predominant among trinucleotide and tetranucleotide microsatellites. PCR primers were designed to amplify 10 identified microsatellite loci. The PCR results from eight pairs of primers showed polymorphisms in wild populations. In 30 wild individuals, the mean observed and expected heterozygosities of these 8 polymorphic SSRs were 0.71 and 0.83 respectively and the average PIC value was 0.8. These microsatellite markers should prove to be a useful addition to the microsatellite markers that are now available for this species.
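
    The SSR search described above can be sketched in Python with regular expressions. The minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration (the abstract does not give the thresholds used), and the function names are hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(unit):
    """True if the motif is not itself a repeat of a shorter motif."""
    n = len(unit)
    return all(unit != unit[:k] * (n // k) for k in range(1, n) if n % k == 0)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if not is_primitive(unit):
                continue  # e.g. skip "CACA", which is really a "CA" repeat
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


example = "GGTT" + "CA" * 8 + "GGTC" + "AAT" * 5 + "G"
print(find_ssrs(example))
```

    The primitive-motif check prevents a dinucleotide run from being reported a second time as a tetranucleotide or hexanucleotide repeat.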

  10. EST sequencing of Onychophora and phylogenomic analysis of Metazoa.

    PubMed

    Roeding, Falko; Hagner-Holler, Silke; Ruhberg, Hilke; Ebersberger, Ingo; von Haeseler, Arndt; Kube, Michael; Reinhardt, Richard; Burmester, Thorsten

    2007-12-01

    Onychophora (velvet worms) represent a small animal taxon considered to be related to Euarthropoda. We have obtained 1873 5' cDNA sequences (expressed sequence tags, ESTs) from the velvet worm Epiperipatus sp., which were assembled into 833 contigs. BLAST similarity searches revealed that 51.9% of the contigs had matches in the protein databases with expectation values lower than 10^-4. Most ESTs had the best hit with proteins from either Chordata or Arthropoda (approximately 40% respectively). The ESTs included sequences of 27 ribosomal proteins. The orthologous sequences from 28 other species of a broad range of phyla were obtained from the databases, including other EST projects. A concatenated amino acid alignment comprising 5021 positions was constructed, which covers 4259 positions when problematic regions were removed. Bayesian and maximum likelihood methods place Epiperipatus within the monophyletic Ecdysozoa (Onychophora, Arthropoda, Tardigrada and Nematoda), but its exact relation to the Euarthropoda remained unresolved. The "Articulata" concept was not supported. Tardigrada and Nematoda formed a well-supported monophylum, suggesting that Tardigrada are actually Cycloneuralia. In agreement with previous studies, we have demonstrated that random sequencing of cDNAs results in sequence information suitable for phylogenomic approaches to resolve metazoan relationships. PMID:17933557

  11. EST sequencing of Onychophora and phylogenomic analysis of Metazoa.

    PubMed

    Roeding, Falko; Hagner-Holler, Silke; Ruhberg, Hilke; Ebersberger, Ingo; von Haeseler, Arndt; Kube, Michael; Reinhardt, Richard; Burmester, Thorsten

    2007-12-01

    Onychophora (velvet worms) represent a small animal taxon considered to be related to Euarthropoda. We have obtained 1873 5' cDNA sequences (expressed sequence tags, ESTs) from the velvet worm Epiperipatus sp., which were assembled into 833 contigs. BLAST similarity searches revealed that 51.9% of the contigs had matches in the protein databases with expectation values lower than 10^-4. Most ESTs had the best hit with proteins from either Chordata or Arthropoda (approximately 40% respectively). The ESTs included sequences of 27 ribosomal proteins. The orthologous sequences from 28 other species of a broad range of phyla were obtained from the databases, including other EST projects. A concatenated amino acid alignment comprising 5021 positions was constructed, which covers 4259 positions when problematic regions were removed. Bayesian and maximum likelihood methods place Epiperipatus within the monophyletic Ecdysozoa (Onychophora, Arthropoda, Tardigrada and Nematoda), but its exact relation to the Euarthropoda remained unresolved. The "Articulata" concept was not supported. Tardigrada and Nematoda formed a well-supported monophylum, suggesting that Tardigrada are actually Cycloneuralia. In agreement with previous studies, we have demonstrated that random sequencing of cDNAs results in sequence information suitable for phylogenomic approaches to resolve metazoan relationships.

  12. Image sequence analysis workstation for multipoint motion analysis

    NASA Astrophysics Data System (ADS)

    Mostafavi, Hassan

    1990-08-01

    This paper describes an application-specific engineering workstation designed and developed to analyze motion of objects from video sequences. The system combines the software and hardware environment of a modern graphic-oriented workstation with the digital image acquisition, processing and display techniques. In addition to automation and increase in throughput of data reduction tasks, the objective of the system is to provide less invasive methods of measurement by offering the ability to track objects that are more complex than reflective markers. Grey level image processing and spatial/temporal adaptation of the processing parameters are used for location and tracking of more complex features of objects under uncontrolled lighting and background conditions. The applications of such an automated and noninvasive measurement tool include analysis of the trajectory and attitude of rigid bodies such as human limbs, robots, aircraft in flight, etc. The system's key features are: 1) acquisition and storage of image sequences by digitizing and storing real-time video; 2) computer-controlled movie loop playback, freeze frame display, and digital image enhancement; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from both live input video or a stored image sequence; 4) model-based estimation and tracking of the six degrees of freedom of a rigid body; 5) field-of-view and spatial calibration; 6) image sequence and measurement data base management; and 7) offline analysis software for trajectory plotting and statistical analysis.

  13. Sequencing, Assembly and Analysis of Human Microbial Communities

    SciTech Connect

    Petrosino, Joe

    2010-06-04

    Joe Petrosino of Baylor College of Medicine discusses using next generation sequencing technologies to study human microbial communities associated with health and disease on June 4, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  14. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    PubMed Central

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis. PMID:25329378

  15. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    PubMed

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  16. Schlieren sequence analysis using computer vision

    NASA Astrophysics Data System (ADS)

    Smith, Nathanial Timothy

    Computer vision-based methods are proposed for extraction and measurement of flow structures of interest in schlieren video. As schlieren data has increased with faster frame rates, we are faced with thousands of images to analyze. This presents an opportunity to study global flow structures over time that may not be evident from surface measurements. A degree of automation is desirable to extract flow structures and features to give information on their behavior through the sequence. Using an interdisciplinary approach, the analysis of large schlieren data is recast as a computer vision problem. The double-cone schlieren sequence is used as a testbed for the methodology; it is unique in that it contains 5,000 images, complex phenomena, and is feature rich. Oblique structures such as shock waves and shear layers are common in schlieren images. A vision-based methodology is used to provide an estimate of oblique structure angles through the unsteady sequence. The methodology has been applied to a complex flowfield with multiple shocks. A converged detection success rate between 94% and 97% for these structures is obtained. The modified curvature scale space is used to define features at salient points on shock contours. A challenge in developing methods for feature extraction in schlieren images is the reconciliation of existing techniques with features of interest to an aerodynamicist. Domain-specific knowledge of physics must therefore be incorporated into the definition and detection phases. Known location and physically possible structure representations form a knowledge base that provides a unique feature definition and extraction. Model tip location and the motion of a shock intersection across several thousand frames are identified, localized, and tracked. Images are parsed into physically meaningful labels using segmentation. 
    Using this representation, it is shown that in the double-cone flowfield, the dominant unsteady motion is associated with large scale

  17. Differential Proteomic Analysis of Human Saliva using Tandem Mass Tags Quantification for Gastric Cancer Detection

    PubMed Central

    Xiao, Hua; Zhang, Yan; Kim, Yong; Kim, Sung; Kim, Jae Joon; Kim, Kyoung Mee; Yoshizawa, Janice; Fan, Liu-Yin; Cao, Cheng-Xi; Wong, David T. W.

    2016-01-01

    Novel biomarkers and non-invasive diagnostic methods are urgently needed for the screening of gastric cancer to reduce its high mortality. We employed a quantitative proteomics approach to develop discriminatory biomarker signatures from human saliva for the detection of gastric cancer. Salivary proteins were analyzed and compared between gastric cancer patients and matched control subjects by using tandem mass tags (TMT) technology. More than 500 proteins were identified with quantification, and 48 of them showed significantly different expression (p < 0.05) between normal controls and gastric cancer patients, including 7 up-regulated proteins and 41 down-regulated proteins. Five proteins were selected for initial verification by ELISA and three were successfully verified, namely cystatin B (CSTB), triosephosphate isomerase (TPI1), and deleted in malignant brain tumors 1 protein (DMBT1). All three proteins could significantly differentiate gastric cancer patients from normal control subjects (p < 0.05). The combination of these three biomarkers could reach 85% sensitivity and 80% specificity for the detection of gastric cancer with accuracy of 0.93. This study provides the proof of concept of salivary biomarkers for the non-invasive detection of gastric cancer. It is highly encouraging to turn these biomarkers into an applicable clinical test after large scale validation. PMID:26911362
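
    The sensitivity and specificity figures quoted above follow the standard definitions. As a minimal sketch, the Python below computes both from a confusion matrix. The counts are hypothetical, chosen only so that the arithmetic reproduces 85% and 80%; they are not the study's data, and the function name is illustrative.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true cases that test positive.
    specificity = TN / (TN + FP): fraction of controls that test negative.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts (NOT from the study): 20 patients, 20 controls.
sens, spec = sensitivity_specificity(tp=17, fn=3, tn=16, fp=4)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    Reporting the two values separately matters for screening tests, because a single pooled figure can hide a poor detection rate when controls far outnumber cases.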

  18. Characterization and RNA-seq analysis of underperformer, an activation-tagged potato mutant.

    PubMed

    Aulakh, Sukhwinder S; Veilleux, Richard E; Dickerman, Allan W; Tang, Guozhu; Flinn, Barry S

    2014-04-01

    The potato cv. Bintje and a Bintje activation-tagged mutant, underperformer (up), were compared. Mutant up plants grown in vitro were dwarf, with abundant axillary shoot growth, greater tuber yield, altered tuber traits and early senescence compared to wild type. Under in vivo conditions, the dwarf and early senescence phenotypes of the mutant remained, but the up plants exhibited a lower tuber yield and fewer axillary shoots compared to wild type. Southern blot analyses indicated a single T-DNA insertion in the mutant, located on chromosome 10. Initial PCR-based gene expression studies indicated transcriptional activation/repression of several genes in the mutant flanking the insertion. The gene immediately flanking the right border of the T-DNA insertion, which encoded an uncharacterized Broad complex, Tramtrac, Bric-a-brac; also known as Pox virus and Zinc finger (BTB/POZ) domain-containing protein (StBTB/POZ1) containing an Armadillo repeat region, was up-regulated in the mutant. Global gene expression comparisons between Bintje and up using RNA-seq on leaves from 60-day-old plants revealed a dataset of over 1,600 differentially expressed genes. Gene expression analyses suggested a variety of biological processes and pathways were modified in the mutant, including carbohydrate and lipid metabolism, cell division and cell cycle activity, biotic and abiotic stress responses, and proteolysis.

  19. Whole exome sequence analysis of Peters anomaly

    PubMed Central

    Weh, Eric; Reis, Linda M.; Happ, Hannah C.; Levin, Alex V.; Wheeler, Patricia G.; David, Karen L.; Carney, Erin; Angle, Brad; Hauser, Natalie

    2015-01-01

    Peters anomaly is a rare form of anterior segment ocular dysgenesis, which can also be associated with additional systemic defects. At this time, the majority of cases of Peters anomaly lack a genetic diagnosis. We performed whole exome sequencing of 27 patients with syndromic or isolated Peters anomaly to search for pathogenic mutations in currently known ocular genes. Among the eight previously recognized Peters anomaly genes, we identified a de novo missense mutation in PAX6, c.155G>A, p.(Cys52Tyr), in one patient. Analysis of 691 additional genes currently associated with a different ocular phenotype identified a heterozygous splicing mutation c.1025+2T>A in TFAP2A, a de novo heterozygous nonsense mutation c.715C>T, p.(Gln239*) in HCCS, a hemizygous mutation c.385G>A, p.(Glu129Lys) in NDP, a hemizygous mutation c.3446C>T, p.(Pro1149Leu) in FLNA, and compound heterozygous mutations c.1422T>A, p.(Tyr474*) and c.2544G>A, p.(Met848Ile) in SLC4A11; all mutations, except for the FLNA and SLC4A11 c.2544G>A alleles, are novel. This is the first study to use whole exome sequencing to discern the genetic etiology of a large cohort of patients with syndromic or isolated Peters anomaly. We report five new genes associated with this condition and suggest screening of TFAP2A and FLNA in patients with Peters anomaly and relevant syndromic features and HCCS, NDP and SLC4A11 in patients with isolated Peters anomaly. PMID:25182519

  20. Phylogenetic analysis of Burkholderia species by multilocus sequence analysis.

    PubMed

    Estrada-de los Santos, Paulina; Vinuesa, Pablo; Martínez-Aguilar, Lourdes; Hirsch, Ann M; Caballero-Mellado, Jesús

    2013-07-01

    Burkholderia comprises more than 60 species of environmental, clinical, and agro-biotechnological relevance. Previous phylogenetic analyses of 16S rRNA, recA, gyrB, rpoB, and acdS gene sequences as well as genome sequence comparisons of different Burkholderia species have revealed two major species clusters. In this study, we undertook a multilocus sequence analysis of 77 type and reference strains of Burkholderia using atpD, gltB, lepA, and recA genes in combination with the 16S rRNA gene sequence and employed maximum likelihood and neighbor-joining criteria to test this further. The phylogenetic analysis revealed, with high supporting values, distinct lineages within the genus Burkholderia. The two large groups were named A and B, whereas the B. rhizoxinica/B. endofungorum, and B. andropogonis groups consisted of two and one species, respectively. The group A encompasses several plant-associated and saprophytic bacterial species. The group B comprises the B. cepacia complex (opportunistic human pathogens), the B. pseudomallei subgroup, which includes both human and animal pathogens, and an assemblage of plant pathogenic species. The distinct lineages present in Burkholderia suggest that each group might represent a different genus. However, it will be necessary to analyze the full set of Burkholderia species and explore whether enough phenotypic features exist among the different clusters to propose that these groups should be considered separate genera.

  1. DNA sequence analysis by MALDI mass spectrometry.

    PubMed Central

    Kirpekar, F; Nordhoff, E; Larsen, L K; Kristiansen, K; Roepstorff, P; Hillenkamp, F

    1998-01-01

    Conventional DNA sequencing is based on gel electrophoretic separation of the sequencing products. Gel casting and electrophoresis are the time-limiting steps, and the gel separation is occasionally imperfect due to aberrant mobility of certain fragments, leading to erroneous sequence determination. Furthermore, illegitimately terminated products frequently cannot be distinguished from correctly terminated ones, a phenomenon that also obscures data interpretation. In the present work the use of MALDI mass spectrometry for sequencing of DNA amplified from clinical samples is implemented. The unambiguous and fast identification of deletions and substitutions in DNA amplified from heterozygous carriers realistically suggests MALDI mass spectrometry as a future alternative to conventional sequencing procedures for high throughput screening for mutations. Unique features of the method are demonstrated by sequencing a DNA fragment that could not be sequenced conventionally because of gel electrophoretic band compression and the presence of multiple non-specific termination products. Taking advantage of the accurate mass information provided by MALDI mass spectrometry, the sequence was deduced, and the nature of the non-specific termination could be determined. The method described here increases the fidelity in DNA sequencing, is fast, compatible with standard DNA sequencing procedures, and amenable to automation. PMID:9592136

  2. Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences.

    PubMed

    Irizarry, K; Kustanovich, V; Li, C; Brown, N; Nelson, S; Wong, W; Lee, C J

    2000-10-01

    Single-nucleotide polymorphisms (SNPs) have been explored as a high-resolution marker set for accelerating the mapping of disease genes. Here we report 48,196 candidate SNPs detected by statistical analysis of human expressed sequence tags (ESTs), associated primarily with coding regions of genes. We used Bayesian inference to weigh evidence for true polymorphism versus sequencing error, misalignment or ambiguity, misclustering or chimaeric EST sequences, assessing data such as raw chromatogram height, sharpness, overlap and spacing, sequencing error rates, context-sensitivity and cDNA library origin. Three separate validations (comparison with 54 genes screened for SNPs independently, verification of HLA-A polymorphisms, and restriction fragment length polymorphism (RFLP) testing) verified 70%, 89% and 71% of our predicted SNPs, respectively. Our method detects tenfold more true HLA-A SNPs than previous analyses of the EST data. We found SNPs in a large fraction of known disease genes, including some disease-causing mutations (for example, the HbS sickle-cell mutation). Our comprehensive analysis of human coding region polymorphism provides a public resource for mapping of disease genes (available at http://www.bioinformatics.ucla.edu/snp).
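
    The idea of weighing true polymorphism against sequencing error can be illustrated with a deliberately simplified two-hypothesis Bayes calculation. The sketch below is not the authors' model, which also uses chromatogram quality, context and library origin. It assumes only a binomial read-count likelihood, and the prior, allele fraction and error rate are hypothetical values picked for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def snp_posterior(k: int, n: int, prior: float = 0.001,
                  allele_fraction: float = 0.5, error_rate: float = 0.01) -> float:
    """Posterior probability that a site is a true SNP.

    k: ESTs showing the alternate base; n: ESTs covering the site.
    Under "SNP", each EST shows the alternate base with `allele_fraction`;
    under "error", with `error_rate`. All three defaults are assumptions.
    """
    snp = prior * binom_pmf(k, n, allele_fraction)
    err = (1 - prior) * binom_pmf(k, n, error_rate)
    return snp / (snp + err)


# One discordant EST out of ten is better explained by error;
# four out of ten strongly favours a real polymorphism.
post_one = snp_posterior(k=1, n=10)
post_four = snp_posterior(k=4, n=10)
print(post_one < post_four)
```

    Even with a prior of one SNP per thousand sites, a handful of concordant alternate-base reads quickly outweighs the error hypothesis, which is why read depth per cluster matters for EST-based discovery.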

  3. Project Report: Automatic Sequence Processor Software Analysis

    NASA Technical Reports Server (NTRS)

    Benjamin, Brandon

    2011-01-01

    The Mission Planning and Sequencing (MPS) element of Multi-Mission Ground System and Services (MGSS) provides space missions with multi-purpose software to plan spacecraft activities, sequence spacecraft commands, and then integrate these products and execute them on spacecraft. Jet Propulsion Laboratory (JPL) is currently flying many missions. The processes for building, integrating, and testing the multi-mission uplink software need to be improved to meet the needs of the missions and the operations teams that command the spacecraft. The Multi-Mission Sequencing Team is responsible for collecting and processing the observations, experiments and engineering activities that are to be performed on a selected spacecraft. The collection of these activities is called a sequence and ultimately a sequence becomes a sequence of spacecraft commands. The operations teams check the sequence to make sure that no constraints are violated. The workflow process involves sending a program start command, which activates the Automatic Sequence Processor (ASP). The ASP is currently a file-based system that is composed of scripts written in Perl, C shell and awk. Once this start process is complete, the system checks for errors and aborts if there are any; otherwise the system converts the commands to binary, and then sends the resultant information to be radiated to the spacecraft.

  4. Microbial community analysis of a methane-oxidizing biofilm using ribosomal tag pyrosequencing.

    PubMed

    Kim, Tae Gwan; Lee, Eun-Hee; Cho, Kyung-Suk

    2012-03-01

    Current ecological knowledge of methanotrophic biofilms is incomplete, although they have been broadly studied in biotechnological processes. Four individual DNA samples were prepared from a methanotrophic biofilm, and a multiplex 16S rDNA pyrosequencing was performed. A complete library (before being de-multiplexed) contained 33,639 sequences (average length, 415 nt). Interestingly, methanotrophs were not dominant, only making up 23% of the community. Methylosinus, Methylomonas, and Methylosarcina were the dominant methanotrophs. Type II methanotrophs were more abundant than type I (56 vs. 44%), but less rich and less diverse. Dominant non-methanotrophic genera included Hydrogenophaga, Flavobacterium, and Hyphomicrobium. The library was de-multiplexed into four libraries, with different sequencing efforts (3,915-20,133 sequences). Sørensen abundance similarity results showed that the four libraries were almost identical (indices > 0.97), and phylogenetic comparisons using UniFrac test and P-test revealed the same results. It was demonstrated that the pyrosequencing was highly reproducible. These survey results can provide an insight into the management and/or manipulation of methanotrophic biofilms. PMID:22450792
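
    The abstract does not say which abundance-based form of the Sørensen index was used. As a minimal sketch, the Python below assumes the quantitative Sørensen (Bray-Curtis) similarity computed on relative abundances, so that libraries with different sequencing efforts remain comparable; Chao's abundance-based estimator would be another candidate. The genus counts are hypothetical.

```python
def relative_abundance(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw sequence counts per taxon to fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no sequences")
    return {taxon: n / total for taxon, n in counts.items()}


def sorensen_abundance_similarity(a: dict[str, int], b: dict[str, int]) -> float:
    """Quantitative Sørensen (Bray-Curtis) similarity between two libraries.

    On relative abundances, 2 * sum(min) / (sum_a + sum_b) reduces to
    sum(min), ranging from 0 (no shared taxa) to 1 (identical profiles).
    """
    pa, pb = relative_abundance(a), relative_abundance(b)
    return sum(min(pa.get(t, 0.0), pb.get(t, 0.0)) for t in set(pa) | set(pb))


# Hypothetical counts: same community profile at two sequencing depths.
lib_small = {"Methylosinus": 30, "Methylomonas": 20, "Hydrogenophaga": 50}
lib_large = {"Methylosinus": 60, "Methylomonas": 40, "Hydrogenophaga": 100}
similarity = sorensen_abundance_similarity(lib_small, lib_large)
print(round(similarity, 3))
```

    Normalizing before comparison is a design choice: on raw counts, a library sequenced five times deeper would score as dissimilar even if its composition were the same.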

  5. Automated shielding analysis sequences for spent fuel casks

    SciTech Connect

    Tang, J.S.; Parks, C.V.; Hermann, O.W.

    1987-01-01

    Two important Shielding Analysis Sequences (SAS) have recently been developed within the SCALE computational system. These sequences significantly enhance the existing SCALE system capabilities for evaluating radiation doses exterior to spent fuel casks. These new control module sequences (SAS1 and SAS4) and their capabilities are discussed and demonstrated, together with the existing SAS2 sequence that is used to generate radiation sources for spent fuel. Particular attention is given to the new SAS4 sequence which provides an automated scheme for generating and using biasing parameters in a subsequent Monte Carlo analysis of a cask.

  6. Understanding why users tag: A survey of tagging motivation literature and results from an empirical study

    PubMed Central

    Strohmaier, Markus; Körner, Christian; Kern, Roman

    2012-01-01

    While recent progress has been achieved in understanding the structure and dynamics of social tagging systems, we know little about the underlying user motivations for tagging, and how they influence resulting folksonomies and tags. This paper addresses three issues related to this question. (1) What distinctions of user motivations are identified by previous research, and in what ways are the motivations of users amenable to quantitative analysis? (2) To what extent does tagging motivation vary across different social tagging systems? (3) How does variability in user motivation influence resulting tags and folksonomies? In this paper, we present measures to detect whether a tagger is primarily motivated by categorizing or describing resources, and apply these measures to datasets from seven different tagging systems. Our results show that (a) users’ motivation for tagging varies not only across, but also within tagging systems, and that (b) tag agreement among users who are motivated by categorizing resources is significantly lower than among users who are motivated by describing resources. Our findings are relevant for (1) the development of tag-based user interfaces, (2) the analysis of tag semantics and (3) the design of search algorithms for social tagging systems. PMID:23471473

  7. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    2001-06-05

    A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).

  8. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    1998-08-18

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  9. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    1999-10-26

    A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).

  10. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

    2004-05-11

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  11. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    2003-08-19

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  12. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, M.S.

    1998-08-18

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device. 27 figs.

  13. Molecular cloning and characterization of a plant alpha1,3/4-fucosidase based on sequence tags from almond fucosidase I.

    PubMed

    Zeleny, Reinhard; Leonard, Renaud; Dorfner, Georg; Dalik, Thomas; Kolarich, Daniel; Altmann, Friedrich

    2006-04-01

    Our work with almond peptide N-glycosidase A made us interested also in the alpha1,3/4-fucosidase which is used as a specific reagent for glycoconjugate analysis. The enzyme was purified to presumed homogeneity by a series of chromatographic steps including dye affinity and fast-performance anion exchange chromatography. The 63 kDa band was analyzed by tandem mass spectrometry which yielded several partial sequences. A homology search retrieved the hypothetical protein Q8GW72 from Arabidopsis thaliana. This protein has recently been described as being specific for alpha1,2-linkages. However, cDNA cloning and expression in Pichia pastoris of the A. thaliana fucosidase showed that it hydrolyzed fucose in 3- and 4-linkage to GlcNAc in Lewis determinants whereas neither 2-linked fucose nor fucose in 3-linkage to the innermost GlcNAc residue were attacked. This first cloning of a plant alpha1,3/4-fucosidase also confirmed the identity of the purified almond enzyme and thus settles the not