Science.gov

Sample records for sequence tags analysis

  1. Expressed sequence tags: analysis and annotation.

    PubMed

    Parkinson, John; Blaxter, Mark

    2004-01-01

    Expressed sequence tags (ESTs) present a special set of problems for bioinformatic analysis. They are partial and error-prone, and large datasets can have significant internal redundancy. To facilitate analysis of small EST datasets from in-house projects, we present an integrated "pipeline" of tools that take EST data from sequence trace to database submission. These tools also can be used to provide clustering of ESTs into putative genes and to annotate these genes with preliminary sequence similarity searches. The systems are written to use the public-domain LINUX environment and other openly available analytical tools. PMID:15153624

  2. Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)

    NASA Astrophysics Data System (ADS)

    Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song

    2010-01-01

    In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera (Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10,072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3,519 nonredundant gene groups, including 1,446 clusters and 2,073 singletons. After annotation with the nr database, a large number of genes were found to be related to chloroplast and ribosomal proteins. GO functional classification showed that 1,418 ESTs participated in photosynthesis and 1,359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. Together, the evidence indicates that U. prolifera has the material and energy basis for intense photosynthesis and rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated with Pseudendoclonium akinetum (Ulvophyceae).

  3. Expressed sequence tag analysis in tef (Eragrostis tef (Zucc) Trotter).

    PubMed

    Yu, Ju-Kyung; Sun, Qi; Rota, Mauricio La; Edwards, Hugh; Tefera, Hailu; Sorrells, Mark E

    2006-04-01

    Tef (Eragrostis tef (Zucc.) Trotter) is the most important cereal crop in Ethiopia; however, there is very little DNA sequence information available for this species. Expressed sequence tags (ESTs) were generated from 4 cDNA libraries: seedling leaf, seedling root, and inflorescence of E. tef and seedling leaf of Eragrostis pilosa, a wild relative of E. tef. Clustering of 3603 sequences produced 530 clusters and 1890 singletons, resulting in 2420 tef unigenes. Approximately 3/4 of tef unigenes matched protein or nucleotide sequences in public databases. Annotation of unigenes associated 68% of the putative tef genes with gene ontology categories. Identification of the translated unigenes for conserved protein domains revealed 389 protein family domains (Pfam), the most frequent of which was protein kinase. A total of 170 ESTs containing simple sequence repeats (EST-SSRs) were identified and 80 EST-SSR markers were developed. In addition, 19 single-nucleotide polymorphism (SNP) and (or) insertion-deletion (indel) and 34 intron fragment length polymorphism (IFLP) markers were developed. The EST database and molecular markers generated in this study will be valuable resources for further tef genetic research. PMID:16699556
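    The EST-SSR screening step described in this record can be illustrated with a minimal Python sketch. The abstract does not say which tool or thresholds were used, so the motif lengths and minimum copy numbers below, and the function name find_ssrs, are assumptions chosen only for illustration.

```python
import re

# Assumed thresholds (not taken from the abstract): minimum number of
# tandem copies required for each motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_copies in sorted(min_repeats.items()):
        # One motif followed by at least (min_copies - 1) further copies.
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1)
        )
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # e.g. AAAA or AGAG would duplicate shorter motifs
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)
```

    Calling find_ssrs on each unigene and keeping the sequences with at least one hit gives a candidate list for primer design; only perfect repeats are detected in this sketch.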

  4. Analysis of the dermatophyte Trichophyton rubrum expressed sequence tags

    PubMed Central

    Wang, Lingling; Ma, Li; Leng, Wenchuan; Liu, Tao; Yu, Lu; Yang, Jian; Yang, Li; Zhang, Wenliang; Zhang, Qian; Dong, Jie; Xue, Ying; Zhu, Yafang; Xu, Xingye; Wan, Zhe; Ding, Guohui; Yu, Fudong; Tu, Kang; Li, Yixue; Li, Ruoyu; Shen, Yan; Jin, Qi

    2006-01-01

    Background Dermatophytes are the primary causative agent of dermatophytoses, a disease that affects billions of individuals worldwide. Trichophyton rubrum is the most common of the superficial fungi. Although T. rubrum is a recognized pathogen for humans, little is known about how its transcriptional pattern is related to development of the fungus and establishment of disease. It is therefore necessary to identify genes whose expression is relevant to growth, metabolism and virulence of T. rubrum. Results We generated 10 cDNA libraries covering nearly the entire growth phase and used them to isolate 11,085 unique expressed sequence tags (ESTs), including 3,816 contigs and 7,269 singletons. Comparisons with the GenBank non-redundant (NR) protein database revealed putative functions or matched homologs from other organisms for 7,764 (70%) of the ESTs. The remaining 3,321 (30%) of ESTs were only weakly similar or not similar to known sequences, suggesting that these ESTs represent novel genes. Conclusion The present data provide a comprehensive view of fungal physiological processes including metabolism, sexual and asexual growth cycles, signal transduction and pathogenic mechanisms. PMID:17032460

  5. Analysis of expressed sequence tags (ESTs) from Agrostis species obtained using sequence related amplified polymorphism.

    PubMed

    Dinler, Gizem; Budak, Hikmet

    2008-10-01

    Bentgrass (Agrostis spp.), a genus of the Poaceae family, consists of more than 200 species and is mainly used in athletic fields and golf courses. Creeping bentgrass (A. stolonifera L.) is the most commonly used species in maintaining golf courses, followed by colonial bentgrass (A. capillaris L.) and velvet bentgrass (A. canina L.). The presence and nature of sequence related amplified polymorphism (SRAP) at the cDNA level were investigated. We isolated 80 unique cDNA fragment bands from these species using 56 SRAP primer combinations. Sequence analysis of cDNA clones and analysis of putative translation products revealed that some encoded amino acid sequences were similar to proteins involved in DNA synthesis, transcription, and signal transduction. The cytosolic glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (GenBank accession no. EB812822) was also identified from velvet bentgrass, and the corresponding protein sequence is further analyzed due to its critical role in many cellular processes. The partial peptide sequence obtained was 112 amino acids long, presenting a high degree of homology to parts of the N-terminal and C-terminal regions of cytosolic phosphorylating GAPDH (GapC). The existence of common expressed sequence tags (ESTs) revealed by a minimum evolutionary dendrogram among the Agrostis ESTs indicated the usefulness of SRAP for comparative genome analysis of transcribed genes in the grass species. PMID:18726683

  6. Expressed sequence tags: an overview.

    PubMed

    Parkinson, John; Blaxter, Mark

    2009-01-01

    Expressed sequence tags (ESTs) are fragments of mRNA sequences derived through single sequencing reactions performed on randomly selected clones from cDNA libraries. To date, over 45 million ESTs have been generated from over 1400 different species of eukaryotes. For the most part, EST projects are used to either complement existing genome projects or serve as low-cost alternatives for purposes of gene discovery. However, with improvements in accuracy and coverage, they are beginning to find application in fields such as phylogenetics, transcript profiling and proteomics. This volume provides practical details on the generation and analysis of ESTs. Chapters are presented which cover creation of cDNA libraries; generation and processing of sequence data; bioinformatics analysis of ESTs; and their application to phylogenetics and transcript profiling. PMID:19277571

  7. Sequencing, analysis, and annotation of expressed sequence tags for Camelus dromedarius.

    PubMed

    Al-Swailem, Abdulaziz M; Shehata, Maher M; Abu-Duhier, Faisel M; Al-Yamani, Essam J; Al-Busadah, Khalid A; Al-Arawi, Mohammed S; Al-Khider, Ali Y; Al-Muhaimeed, Abdullah N; Al-Qahtani, Fahad H; Manee, Manee M; Al-Shomrani, Badr M; Al-Qhtani, Saad M; Al-Harthi, Amer S; Akdemir, Kadir C; Inan, Mehmet S; Otu, Hasan H

    2010-01-01

    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and approximately 40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665
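    The ORF length criterion mentioned in this record (ORF > 300 bp) can be sketched as follows. The abstract does not define how ORFs were called, so this sketch assumes an ORF runs from an ATG to the next in-frame stop codon, searches all six reading frames, and counts the stop codon in the length; the name longest_orf is hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def longest_orf(seq):
    """Length in bp of the longest ATG-to-stop ORF over six frames.

    The stop codon is included in the length; 0 means no closed ORF.
    """
    seq = seq.upper()
    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    best = 0
    for strand in (seq, reverse_complement):
        for frame in range(3):
            start = None  # index of the first ATG of the currently open ORF
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    best = max(best, i + 3 - start)
                    start = None
    return best
```

    A contig would pass the filter described above when longest_orf(contig) > 300. ORFs that run off the end of a partial sequence are ignored here, which is a simplification for EST data.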

  8. Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

    PubMed Central

    Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

    2010-01-01

    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665

  9. Generation and analysis of expressed sequence tags from the ciliate protozoan parasite Ichthyophthirius multifiliis

    PubMed Central

    Abernathy, Jason W; Xu, Peng; Li, Ping; Xu, De-Hai; Kucuktas, Huseyin; Klesius, Phillip; Arias, Covadonga; Liu, Zhanjiang

    2007-01-01

    Background The ciliate protozoan Ichthyophthirius multifiliis (Ich) is an important parasite of freshwater fish that causes 'white spot disease' leading to significant losses. A genomic resource for large-scale studies of this parasite has been lacking. To study gene expression involved in Ich pathogenesis and virulence, our goal was to generate expressed sequence tags (ESTs) for the development of a powerful microarray platform for the analysis of global gene expression in this species. Here, we initiated a project to sequence and analyze over 10,000 ESTs. Results We sequenced 10,368 EST clones using a normalized cDNA library made from pooled samples of the trophont, tomont, and theront life-cycle stages, and generated 9,769 sequences (94.2% success rate). Post-sequencing processing led to 8,432 high quality sequences. Clustering analysis of these ESTs allowed identification of 4,706 unique sequences containing 976 contigs and 3,730 singletons. These unique sequences represent over two million base pairs (~10% of Plasmodium falciparum genome, a phylogenetically related protozoan). BLASTX searches produced 2,518 significant (E-value < 10^-5) hits and further Gene Ontology (GO) analysis annotated 1,008 of these genes. The ESTs were analyzed comparatively against the genomes of the related protozoa Tetrahymena thermophila and P. falciparum, allowing putative identification of additional genes. All the EST sequences were deposited by dbEST in GenBank (GenBank: EG957858–EG966289). Gene discovery and annotations are presented and discussed. Conclusion This set of ESTs represents a significant proportion of the Ich transcriptome, and provides a material basis for the development of microarrays useful for gene expression studies concerning Ich development, pathogenesis, and virulence. PMID:17577414
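    The E-value cutoff applied to the BLASTX results in this record can be illustrated with a short Python sketch. It assumes the search output is available in the standard 12-column tab-separated BLAST format (query ID in column 1, subject ID in column 2, E-value in column 11); the abstract does not state the output format, and the function name and sample IDs are hypothetical.

```python
def significant_hits(lines, max_evalue=1e-5):
    """Map each query ID to its lowest-E-value hit below the cutoff.

    `lines` is any iterable of tab-separated BLAST tabular rows.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= max_evalue:
            continue  # not significant under the chosen cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical rows for illustration only (not data from the study).
example = [
    "est1\tprotA\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-20\t150",
    "est1\tprotB\t60.0\t100\t40\t0\t1\t300\t1\t100\t1e-08\t90",
    "est2\tprotC\t30.0\t50\t35\t0\t1\t150\t1\t50\t0.5\t25",
]
hits = significant_hits(example)
```

    The number of keys in the returned dictionary corresponds to the count of sequences with at least one significant hit, which is the kind of figure reported in the abstract.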

  10. Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

    PubMed Central

    Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

    2007-01-01

    Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a highly performed, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730

  11. Analysis of early hepatic stage schistosomula gene expression by subtractive expressed sequence tags library.

    PubMed

    Wang, Xinzhi; Gobert, Geoffrey N; Feng, XinGang; Fu, Zhiqiang; Jin, Yamei; Peng, Jinbiao; Lin, Jiaojiao

    2009-07-01

    Schistosome parasites have a complex lifecycle that requires two hosts and aquatic phases of development. The schistosomulum is a key phase of parasite development within the mammalian host; however, relatively little is understood about the molecular processes underlying this stage. In this study, 5723 subtractive expressed sequence tags (ESTs) were randomly selected from a 7 day hepatic schistosomula enriched library constructed using the suppression subtractive hybridization method. Sequence analysis of these ESTs identified 1762 unique genes (contigs). Among them, 989 contigs were annotated with known genes, 311 contigs were homologous to established genes, 101 contigs were similar to established genes, 72 contigs were weakly similar to established genes and 289 sequences did not match any published sequences. Genes related to metabolism, cellular development, immune evasion and host-parasite interactions were identified as enriched in the hepatic schistosomula stage. Poorly annotated but stage-specific genes identified in future work may represent new drug or vaccine targets for the control of schistosomiasis. PMID:19428674

  12. Expressed sequence tag analysis of the emu (Dromaius novaehollandiae) pituitary by 454 GS Junior pyrosequencing.

    PubMed

    Kim, Ji Eun; Leung, Frederick C; Jiang, Jingwei; Kwok, Amy H Y; Bennett, Darin C; Cheng, Kimberly M

    2013-01-01

    Emus (Dromaius novaehollandiae) are farmed for their oil for pharmaceutical and cosmetic uses. This emu pituitary expressed sequence tag study was undertaken to identify novel transcripts in the emu pituitary to propel their identification and functional studies. By mapping reads derived from the Roche 454 GS Junior pyrosequencer to 8 reference species (human, mouse, chicken, zebra finch, fruit fly, turkey, round worm, and Carolina anole lizard) from the UniGene database, a total of 81,788 reads (53,312 mapped reads) were obtained and assembled with Reference Sequence (RefSeq). We annotated 6,676 potential emu genes by referencing 7 species (excluding lizard) and identified 1,232 potential genes common among 3 species (human, mouse, and chicken) with complete available reference genomes. Gene Ontology analysis revealed 376 Gene Ontology terms showing, with the highest counts, their involvements in biological processes, metabolism, and cellular components. These potential genes were detected to associate with 20 pathways including mitogen-activated protein kinase, insulin, neurotrophin signaling pathways, and carbohydrate digestion and absorption pathway. We also revealed a panel of tissue-specific genes including regulator of G-protein signaling protein (RGS), glucagon-like peptide receptor (GLPR), and growth hormone-inducible transmembrane protein (GHITM). Additionally, fatty acid binding protein (FABP), fatty acid desaturase (FAS), and stearoyl-coenzyme A desaturase (SCD), key enzyme genes in fat metabolism, were found to be also expressed in emu pituitary. This expressed sequence tag study represents the first step in functional characterization of emu pituitary gene expression and SNP identification for the improvement of fat production in the emu. PMID:23243234

  13. Comparative gene expression in the symbiotic and aposymbiotic Aiptasia pulchella by expressed sequence tag analysis.

    PubMed

    Kuo, Jimmy; Chen, Ming-Chyuan; Lin, Chorng-Horng; Fang, Lee-Shing

    2004-05-21

    Intracellular symbiotic relationships are prevalent between cnidarians, such as corals and sea anemones, and photosynthetic dinoflagellate symbionts. However, there is little understanding of how genes are expressed when the symbiotic relationship is established. To characterize genes involved in this association, the endosymbiosis between the sea anemone Aiptasia pulchella and the dinoflagellate zooxanthellae Symbiodinium spp. was employed as a model. Two complementary DNA (cDNA) libraries were constructed from RNA isolated from symbiotic and aposymbiotic A. pulchella. Using single-pass sequencing of cDNA clones, a total of 870 expressed sequence tag (EST) clones were generated from the two libraries: 474 from the symbiotic animal and 396 from the aposymbiotic animal. The initial ESTs consisted of 143 clusters and 231 singletons. A BLASTX search revealed that 147 unique genes had similarities with protein sequences available from databases; 120 of these clones were categorized according to their putative function. However, many ESTs could not be assigned a function. The putative roles of some of the identified genes relative to endosymbiosis were discussed. This is the first report of the use of EST analysis to examine gene expression in symbiotic and aposymbiotic states of cnidarians. The systematic analysis of ESTs from this study provides a useful database for future investigations of the molecular mechanisms involved in algal-cnidarian symbiosis. PMID:15110770

  14. Genome-wide analysis of immune system genes by expressed sequence Tag profiling.

    PubMed

    Giallourakis, Cosmas C; Benita, Yair; Molinie, Benoit; Cao, Zhifang; Despo, Orion; Pratt, Henry E; Zukerberg, Lawrence R; Daly, Mark J; Rioux, John D; Xavier, Ramnik J

    2013-06-01

    Profiling studies of mRNA and microRNA, particularly microarray-based studies, have been extensively used to create compendia of genes that are preferentially expressed in the immune system. In some instances, functional studies have been subsequently pursued. Recent efforts such as the Encyclopedia of DNA Elements have demonstrated the benefit of coupling RNA sequencing analysis with information from expressed sequence tags (ESTs) for transcriptomic analysis. However, the full characterization and identification of transcripts that function as modulators of human immune responses remains incomplete. In this study, we demonstrate that an integrated analysis of human ESTs provides a robust platform to identify the immune transcriptome. Beyond recovering a reference set of immune-enriched genes and providing large-scale cross-validation of previous microarray studies, we discovered hundreds of novel genes preferentially expressed in the immune system, including noncoding RNAs. As a result, we have established the Immunogene database, representing an integrated EST road map of gene expression in human immune cells, which can be used to further investigate the function of coding and noncoding genes in the immune system. Using this approach, we have uncovered a unique metabolic gene signature of human macrophages and identified PRDM15 as a novel overexpressed gene in human lymphomas. Thus, we demonstrate the utility of EST profiling as a basis for further deconstruction of physiologic and pathologic immune processes. PMID:23616578

  15. Expressed sequence tag analysis of functional genes associated with adventitious rooting in Liriodendron hybrids.

    PubMed

    Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X

    2016-01-01

    Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting. PMID:27420958

  16. Expressed sequence tag analysis of the erythrocytic stage of Plasmodium berghei.

    PubMed

    Seok, Ji-Woong; Lee, Yong-Seok; Moon, Eun-Kyung; Lee, Jung-Yub; Jha, Bijay Kumar; Kong, Hyun-Hee; Chung, Dong-Il; Hong, Yeonchul

    2011-09-01

    Rodent malaria parasites, such as Plasmodium berghei, are practical and useful model organisms for human malaria research because of their analogies to the human malaria parasites in terms of structure, physiology, and life cycle. Exploiting the available genetic sequence information, we constructed a cDNA library from the erythrocytic stages of P. berghei and analyzed the expressed sequence tags (ESTs). A total of 10,040 ESTs were generated and assembled into 2,462 clusters. These EST clusters were compared against public protein databases and 48 putative new transcripts, most of which were hypothetical proteins with unknown function, were identified. Genes encoding ribosomal or membrane proteins and purine nucleotide phosphorylases were highly abundant clusters in P. berghei. Protein domain analyses and the Gene Ontology functional categorization revealed translation/protein folding, metabolism, protein degradation, and multiple families of variant antigens to be the most prevalent categories. The ESTs collected here and their bioinformatic analysis will be useful resources for identifying drug targets and vaccine candidates and for validating gene predictions of P. berghei. PMID:22072821

  17. Transcriptome analysis of expressed sequence tags from the venom glands of the fish Thalassophryne nattereri.

    PubMed

    Magalhães, G S; Junqueira-de-Azevedo, I L M; Lopes-Ferreira, M; Lorenzini, D M; Ho, P L; Moura-da-Silva, A M

    2006-06-01

    Thalassophryne nattereri (niquim) is a venomous fish found on the northern and northeastern coasts of Brazil. Every year, hundreds of humans are affected by the poison, which causes excruciating local pain, edema, and necrosis, and can lead to permanent disabilities. In experimental models, T. nattereri venom induces edema and nociception, which are correlated to human symptoms and dependent on venom kininogenase activity; myotoxicity; impairment of blood flow; platelet lysis and cytotoxicity on endothelial cells. These effects were observed with minute amounts of venom. To characterize the primary structure of T. nattereri venom toxins, a list of transcripts within the venom gland was made using the expressed sequence tag (EST) strategy. Here we report the analysis of 775 ESTs that were obtained from a directional cDNA library of T. nattereri venom gland. Of these ESTs, 527 (68%) were related to sequences previously described. These were categorized into 10 groups according to their biological functions. Sequences involved in gene and protein expression accounted for 14.3% of the ESTs, reflecting the important role of protein synthesis in this gland. Other groups included proteins engaged in the assembly of disulfide bonds (0.5%), chaperones involved in the folding of nascent proteins (1.4%), and sequences related to clusterin (1.5%), as well as transcripts related to calcium binding proteins (1.0%). We detected a large cluster (1.3%) related to cocaine- and amphetamine-regulated transcript (CART), a peptide involved in the regulation of food intake. Surprisingly, several retrotransposon-like sequences (1.0%) were found in the library. It may be that their presence accounts for some of the variation in venom toxins. The toxin category (18.8%) included natterins (18%), which are a new group of kininogenases recently described by our group, and a group of C-type lectins (0.8%). In addition, a considerable number of sequences (32%) was not related to sequences in the

  18. Alternative splicing and expression profile analysis of expressed sequence tags in domestic pig.

    PubMed

    Zhang, Liang; Tao, Lin; Ye, Lin; He, Ling; Zhu, Yuan-Zhong; Zhu, Yue-Dong; Zhou, Yan

    2007-02-01

    The domestic pig (Sus scrofa domestica) is one of the most important mammals to humans. Alternative splicing is a cellular mechanism in eukaryotes that greatly increases the diversity of gene products. Expressed sequence tags (ESTs) have been widely used for gene discovery, expression profile analysis, and alternative splicing detection. In this study, a total of 712,905 ESTs extracted from 101 different non-normalized EST libraries of the domestic pig were analyzed. These EST libraries cover the nervous system, digestive system, immune system, and meat production related tissues from embryo, newborn, and adult pigs, contributing to the analysis of alternative splicing variants as well as expression profiles in tissues at various stages. A modified approach was designed to cluster and assemble large EST datasets, aiming to detect alternative splicing together with the EST abundance of each splicing variant. Much effort was made to classify alternative splicing into different types and to apply different filters to each type to get more reliable results. Finally, a total of 1,223 genes with an average of 2.8 splicing variants were detected among 16,540 unique genes. The overview of expression profiles changes when alternative splicing is taken into account. PMID:17572361

  19. Comprehensive analysis of expressed sequence tags from cultivated and wild radish (Raphanus spp.)

    PubMed Central

    2013-01-01

    Background Radish (Raphanus sativus L., 2n = 2× = 18) is an economically important vegetable crop worldwide. A large collection of radish expressed sequence tags (ESTs) has been generated but remains largely uncharacterized. Results In this study, approximately 315,000 ESTs derived from 22 Raphanus cDNA libraries from 18 different genotypes were analyzed, for the purpose of gene and marker discovery and to evaluate large-scale genome duplication and phylogenetic relationships among Raphanus spp. The ESTs were assembled into 85,083 unigenes, of which 90%, 65%, 89% and 89% had homologous sequences in the GenBank nr, SwissProt, TrEMBL and Arabidopsis protein databases, respectively. A total of 66,194 (78%) could be assigned at least one gene ontology (GO) term. Comparative analysis identified 5,595 gene families unique to radish that were significantly enriched with genes related to small molecule metabolism, as well as 12,899 specific to the Brassicaceae that were enriched with genes related to seed oil body biogenesis and responses to phytohormones. The analysis further indicated that the divergence of radish and Brassica rapa occurred approximately 8.9-14.9 million years ago (MYA), following a whole-genome duplication event (12.8-21.4 MYA) in their common ancestor. An additional whole-genome duplication event in radish occurred at 5.1-8.4 MYA, after its divergence from B. rapa. A total of 13,570 simple sequence repeats (SSRs) and 28,758 high-quality single nucleotide polymorphisms (SNPs) were also identified. Using a subset of SNPs, the phylogenetic relationships of eight different accessions of Raphanus were inferred. Conclusion Comprehensive analysis of radish ESTs provided new insights into radish genome evolution and the phylogenetic relationships of different radish accessions. Moreover, the radish EST sequences and the associated SSR and SNP markers described in this study represent a valuable resource for radish functional genomics studies and

  20. pISTil: a pipeline for yeast two-hybrid Interaction Sequence Tags identification and analysis

    PubMed Central

    Pellet, Johann; Meyniel, Laurène; Vidalain, Pierre-Olivier; de Chassey, Benoît; Tafforeau, Lionel; Lotteau, Vincent; Rabourdin-Combe, Chantal; Navratil, Vincent

    2009-01-01

    Background High-throughput screening of protein-protein interactions opens new systems biology perspectives for the comprehensive understanding of cell physiology in normal and pathological conditions. In this context, the yeast two-hybrid system appears as a promising approach to efficiently reconstruct protein interaction networks at the proteome-wide scale. This protein interaction screening method generates a large amount of raw sequence data, i.e. the ISTs (Interaction Sequence Tags), which urgently need appropriate tools for their systematic and standardised analysis. Findings We developed pISTil, a bioinformatics pipeline combined with a user-friendly web interface: (i) to establish a standardised system to analyse and annotate ISTs generated by two-hybrid technologies with high performance and flexibility and (ii) to provide high-quality protein-protein interaction datasets for systems-level approaches. This pipeline has been validated on a large dataset comprising more than 11,000 ISTs. As a case study, a detailed analysis of ISTs obtained from yeast two-hybrid screens of Hepatitis C Virus proteins against human cDNA libraries is also provided. Conclusion We have developed pISTil, an open source pipeline made of a collection of several applications governed by a Perl script. The pISTil pipeline is intended for laboratories with IT expertise in system administration, scripting and database management that wish to automatically process large amounts of IST data for accurate reconstruction of protein interaction networks in a systems biology perspective. pISTil is publicly available for download at . PMID:19874608

  1. Expressed sequence tag analysis in Cycas, the most primitive living seed plant

    PubMed Central

    Brenner, Eric D; Stevenson, Dennis W; McCombie, Richard W; Katari, Manpreet S; Rudd, Stephen A; Mayer, Klaus FX; Palenchar, Peter M; Runko, Suzan J; Twigg, Richard W; Dai, Guangwei; Martienssen, Rob A; Benfey, Phillip N; Coruzzi, Gloria M

    2003-01-01

    Background Cycads are ancient seed plants (living fossils) with origins in the Paleozoic. Cycads are sometimes considered a 'missing link' as they exhibit characteristics intermediate between vascular non-seed plants and the more derived seed plants. Cycads have also been implicated as the source of 'Guam's dementia', possibly due to the production of S(+)-beta-methyl-alpha, beta-diaminopropionic acid (BMAA), which is an agonist of animal glutamate receptors. Results A total of 4,200 expressed sequence tags (ESTs) were created from Cycas rumphii and clustered into 2,458 contigs, of which 1,764 had low-stringency BLAST similarity to other plant genes. Among those cycad contigs with similarity to plant genes, 1,718 cycad 'hits' are to angiosperms, 1,310 match genes in gymnosperms and 734 match lower (non-seed) plants. Forty-six contigs were found that matched only genes in lower plants and gymnosperms. Upon obtaining the complete sequence from the clones of 37/46 contigs, 14 still matched only gymnosperms. Among those cycad contigs common to higher plants, ESTs were discovered that correspond to those involved in development and signaling in present-day flowering plants. We purified a cycad EST for a glutamate receptor (GLR)-like gene, as well as ESTs potentially involved in the synthesis of the GLR agonist BMAA. Conclusions Analysis of cycad ESTs has uncovered conserved and potentially novel genes. Furthermore, the presence of a glutamate receptor agonist, as well as a glutamate receptor-like gene in cycads, supports the hypothesis that such neuroactive plant products are not merely herbivore deterrents but may also serve a role in plant signaling. PMID:14659015

  2. Identification of candidates for cyclotide biosynthesis and cyclisation by expressed sequence tag analysis of Oldenlandia affinis

    PubMed Central

    2010-01-01

    Background Cyclotides are a family of circular peptides that exhibit a range of biological activities, including anti-bacterial, cytotoxic, anti-HIV activities, and are proposed to function in plant defence. Their high stability has motivated their development as scaffolds for the stabilisation of peptide drugs. Oldenlandia affinis is a member of the Rubiaceae (coffee) family from which 18 cyclotides have been sequenced to date, but the details of their processing from precursor proteins have only begun to be elucidated. To increase the speed at which genes involved in cyclotide biosynthesis and processing are being discovered, an expressed sequence tag (EST) project was initiated to survey the transcript profile of O. affinis and to propose some future directions of research on in vivo protein cyclisation. Results Using flow cytometry the holoploid genome size (1C-value) of O. affinis was estimated to be 4,210 - 4,284 Mbp, one of the largest genomes of the Rubiaceae family. High-quality ESTs were identified, 1,117 in total, from leaf cDNAs and assembled into 502 contigs, comprising 202 consensus sequences and 300 singletons. ESTs encoding the cyclotide precursors for kalata B1 (Oak1) and kalata B2 (Oak4) were among the 20 most abundant ESTs. In total, 31 ESTs encoded cyclotide precursors, representing a distinct commitment of 2.8% of the O. affinis transcriptome to cyclotide biosynthesis. The high expression levels of cyclotide precursor transcripts are consistent with the abundance of mature cyclic peptides in O. affinis. A new cyclotide precursor named Oak5 was isolated and represents the first cDNA for the bracelet class of cyclotides in O. affinis. Clones encoding enzymes potentially involved in processing cyclotides were also identified and include enzymes involved in oxidative folding and proteolytic processing. Conclusion The EST library generated in this study provides a valuable resource for the study of the cyclisation of plant peptides. Further analysis

  3. Myocardial tagging by Cardiovascular Magnetic Resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications

    PubMed Central

    2011-01-01

    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, where the technique allowed for the first time for visualizing transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention, which makes this

  4. Not All Sequence Tags Are Created Equal: Designing and Validating Sequence Identification Tags Robust to Indels

    PubMed Central

    Faircloth, Brant C.; Glenn, Travis C.

    2012-01-01

    Ligating adapters with unique synthetic oligonucleotide sequences (sequence tags) onto individual DNA samples before massively parallel sequencing is a popular and efficient way to obtain sequence data from many individual samples. Tag sequences should be numerous and sufficiently different to ensure sequencing, replication, and oligonucleotide synthesis errors do not cause tags to be unrecoverable or confused. However, many design approaches only protect against substitution errors during sequencing and extant tag sets contain too few tag sequences. We developed an open-source software package to validate sequence tags for conformance to two distance metrics and design sequence tags robust to indel and substitution errors. We use this software package to evaluate several commercial and non-commercial sequence tag sets, design several large sets (maxcount = 7,198) of edit metric sequence tags having different lengths and degrees of error correction, and integrate a subset of these edit metric tags to polymerase chain reaction (PCR) primers and sequencing adapters. We validate a subset of these edit metric tagged PCR primers and sequencing adapters by sequencing on several platforms and subsequent comparison to commercially available alternatives. We find that several commonly used sets of sequence tags or design methodologies used to produce sequence tags do not meet the minimum expectations of their underlying distance metric, and we find that PCR primers and sequencing adapters incorporating edit metric sequence tags designed by our software package perform as well as their commercial counterparts. We suggest that researchers evaluate sequence tags prior to use or evaluate tags that they have been using. The sequence tag sets we design improve on extant sets because they are large, valid across the set, and robust to the suite of substitution, insertion, and deletion errors affecting massively parallel sequencing workflows on all currently used platforms
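
    The distinction the abstract draws between substitution-only and indel-aware tag validation can be illustrated with a minimal Python sketch. It is not the authors' software package: the function names and the three example tags are hypothetical, and the code only assumes that "edit metric" refers to Levenshtein distance, contrasted with Hamming distance, which counts substitutions alone.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of mismatched positions; defined only for equal-length tags."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs tags of equal length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def min_pairwise_distance(tags, metric):
    """Smallest distance between any two tags in the set."""
    return min(metric(a, b) for a, b in combinations(tags, 2))


def violating_pairs(tags, metric, min_distance):
    """Tag pairs that sit closer together than the required minimum distance."""
    return [(a, b) for a, b in combinations(tags, 2)
            if metric(a, b) < min_distance]


# Hypothetical 4-nt tags, chosen only to show the two metrics disagreeing.
tags = ["ACGT", "CGTA", "TTTT"]
print(min_pairwise_distance(tags, hamming_distance))  # 3
print(min_pairwise_distance(tags, edit_distance))     # 2
print(violating_pairs(tags, edit_distance, 3))        # [('ACGT', 'CGTA')]
```

    In this toy set every pair differs by at least three substitutions, yet ACGT and CGTA are only two edits apart (one deletion plus one insertion), so a set that looks valid under Hamming distance can still fail an edit-distance check.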

  5. Desiccation survival in an Antarctic nematode: molecular analysis using expressed sequenced tags

    PubMed Central

    Adhikari, Bishwo N; Wall, Diana H; Adams, Byron J

    2009-01-01

    Background Nematodes are the dominant soil animals in Antarctic Dry Valleys and are capable of surviving desiccation and freezing in an anhydrobiotic state. Genes induced by desiccation stress have been successfully enumerated in nematodes; however we have little knowledge of gene regulation by Antarctic nematodes which can survive multiple environmental stresses. To address this problem we investigated the genetic responses of a nematode species, Plectus murrayi, that is capable of tolerating Antarctic environmental extremes, in particular desiccation and freezing. In this study, we provide the first insight into the desiccation induced transcriptome of an Antarctic nematode through cDNA library construction and suppressive subtractive hybridization. Results We obtained 2,486 expressed sequence tags (ESTs) from 2,586 clones derived from the cDNA library of desiccated P. murrayi. The 2,486 ESTs formed 1,387 putative unique transcripts of which 523 (38%) had matches in the model-nematode Caenorhabditis elegans, 107 (7%) in nematodes other than C. elegans, 153 (11%) in non-nematode organisms and 605 (44%) had no significant match to any sequences in the current databases. The 1,387 unique transcripts were functionally classified by using Gene Ontology (GO) hierarchy and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The results indicate that the transcriptome contains a group of transcripts from diverse functional areas. The subtractive library of desiccated nematodes showed 80 transcripts differentially expressed during desiccation stress, of which 28% were metabolism related, 19% were involved in environmental information processing, 28% involved in genetic information processing and 21% were novel transcripts. Expression profiling of 14 selected genes by quantitative Real-time PCR showed 9 genes significantly up-regulated, 3 down-regulated and 2 continuously expressed in response to desiccation. Conclusion The establishment of a desiccation EST

  6. Generation and analysis of end sequence database for T-DNA tagging lines in rice.

    PubMed

    An, Suyoung; Park, Sunhee; Jeong, Dong-Hoon; Lee, Dong-Yeon; Kang, Hong-Gyu; Yu, Jung-Hwa; Hur, Junghe; Kim, Sung-Ryul; Kim, Young-Hea; Lee, Miok; Han, Soonki; Kim, Soo-Jin; Yang, Jungwon; Kim, Eunjoo; Wi, Soo Jin; Chung, Hoo Sun; Hong, Jong-Pil; Choe, Vitnary; Lee, Hak-Kyung; Choi, Jung-Hee; Nam, Jongmin; Kim, Seong-Ryong; Park, Phun-Bum; Park, Ky Young; Kim, Woo Taek; Choe, Sunghwa; Lee, Chin-Bum; An, Gynheung

    2003-12-01

    We analyzed 6749 lines tagged by the gene trap vector pGA2707. This resulted in the isolation of 3793 genomic sequences flanking the T-DNA. Among the insertions, 1846 T-DNAs were integrated into genic regions, and 1864 were located in intergenic regions. Frequencies were also higher at the beginning and end of the coding regions and upstream near the ATG start codon. The overall GC content at the insertion sites was close to that measured from the entire rice (Oryza sativa) genome. Functional classification of these 1846 tagged genes showed a distribution similar to that observed for all the genes in the rice chromosomes. This indicates that T-DNA insertion is not biased toward a particular class of genes. There were 764, 327, and 346 T-DNA insertions in chromosomes 1, 4 and 10, respectively. Insertions were not evenly distributed; frequencies were higher at the ends of the chromosomes and lower near the centromere. At certain sites, the frequency was higher than in the surrounding regions. This sequence database will be valuable in identifying knockout mutants for elucidating gene function in rice. This resource is available to the scientific community at http://www.postech.ac.kr/life/pfg/risd. PMID:14630961

  7. Transcriptome analysis of Loxosceles laeta (Araneae, Sicariidae) spider venomous gland using expressed sequence tags

    PubMed Central

    Fernandes-Pedrosa, Matheus de F; Junqueira-de-Azevedo, Inácio de LM; Gonçalves-de-Andrade, Rute M; Kobashi, Leonardo S; Almeida, Diego D; Ho, Paulo L; Tambourgi, Denise V

    2008-01-01

    Background The bite of spiders belonging to the genus Loxosceles can induce a variety of clinical symptoms, including dermonecrosis, thrombosis, vascular leakage, haemolysis, and persistent inflammation. In order to examine the transcripts expressed in the venom gland of the Loxosceles laeta spider and to unveil the potential of its products on cellular structure and functional aspects, we generated 3,008 expressed sequence tags (ESTs) from a cDNA library. Results All ESTs were clustered into 1,357 clusters, of which 16.4% of the total ESTs belong to recognized toxin-coding sequences, with sphingomyelinases D being the most abundant transcripts; 14.5% include "possible toxins", whose transcripts correspond to metalloproteinases, serinoproteinases, hyaluronidases, lipases, C-lectins, cysteine peptidases and inhibitors. Thirty-three percent of the ESTs are similar to cellular transcripts, most of which represent molecules involved in gene and protein expression, reflecting the specialization of this tissue for protein synthesis. In addition, a considerable number of sequences, 25%, have no significant similarity to any known sequence. Conclusion This study provides a first global view of the gene expression scenario of the venom gland of L. laeta described so far, indicating the molecular bases of its venom composition. PMID:18547439

  8. Pyrosequence analysis of expressed sequence tags for Manduca sexta hemolymph proteins involved in immune responses.

    PubMed

    Zou, Zhen; Najar, Fares; Wang, Yang; Roe, Bruce; Jiang, Haobo

    2008-06-01

    The tobacco hornworm Manduca sexta is widely used as a model organism to investigate the biochemical basis of insect physiological processes but little transcriptome information is available. To get a broad view of the larval hemolymph proteins, particularly those related to immunity, we synthesized and sequenced cDNA fragments from a mixture of eight total RNA samples: fat body and hemocytes from larvae injected with killed bacteria, fat body, hemocytes, integument and trachea from naïve larvae, and fat body and hemocytes from wandering larvae. Using massively parallel pyrosequencing, we obtained 95,458 M. sexta expressed sequence tags (ESTs) at an average size of 185 bp per read. A majority of the sequences (69,429 reads) could be assembled into 7231 contigs with an average size of 300 bp, 1178 of which had significant similarity with Drosophila genes from various functional groups. Only approximately 8% (606) of the contigs matched known M. sexta cDNA sequences, representing 186 of the 375 unique NCBI entries. The remaining 6625 contigs represented newly discovered cDNA segments from this well-studied biochemical model insect. A search of the 7231 contigs using Tribolium castaneum, Drosophila melanogaster, and Bombyx mori immunity-related sequences revealed 424 cDNA contigs with significant similarity (E-value < 1 × 10^-5). These included 218 previously unknown M. sexta sequences coding for putative defense molecules such as pattern recognition receptors, serine proteinases, serpins, Spätzle, Toll-like receptors, intracellular signaling molecules, and antimicrobial peptides. PMID:18510979

  9. Functional annotation of an expressed sequence tag library from Haliotis diversicolor and analysis of its plant-like sequences.

    PubMed

    Jiang, Jing-Zhe; Zhang, Wei; Guo, Zhi-Xun; Cai, Chen-Chen; Su, You-Lu; Wang, Rui-Xuan; Wang, Jiang-Yong

    2011-09-01

    The small abalone, Haliotis diversicolor, is a widely distributed and cultured species in the subtropical coastal area of China. To identify and classify functional genes of this important species, a normalized expressed sequence tag (EST) library, including 7069 high-quality ESTs from the total body of H. diversicolor, was analyzed. A total of 4781 unigenes were assembled and 2991 novel abalone genes were identified. The GC content, codon and amino acid usage of the transcriptome were analyzed. For the accurate annotation of the abalone library, different influencing factors were evaluated. The gene ontology (GO) database provided a higher annotation rate (69.6%), and sequences longer than 800 bp were easily subjected to a BLAST search. The taxonomy of the BLAST results showed that lancelet and invertebrates are most closely related to abalone. Sixty-seven identified plant-like genes were further examined by reverse transcription-polymerase chain reaction (RT-PCR) and sequencing; only seven of these were real transcripts in abalone. Phylogenetic trees were also constructed to illustrate the positions of two Cystatin sequences and one Calmodulin protein sequence identified in abalone. To perform functional classification, three different databases (GO, KEGG and COG) were used and 60 immune or disease-related unigenes were determined. This work has greatly enlarged the known gene pool of H. diversicolor and will have important implications for future molecular and genetic analyses in this organism. PMID:21867971

  10. High-Throughput Tag-Sequencing Analysis of Early Events Induced by Ochratoxin A in HepG-2 Cells.

    PubMed

    Zhang, Yu; Qi, Xiaozhe; Zheng, Juanjuan; Luo, YunBo; Huang, Kunlun; Xu, Wentao

    2016-01-01

    Ochratoxin A (OTA) is produced by fungi of the genera Aspergillus and Penicillium. OTA has displayed hepatotoxicity in mammals. Although recent studies have indicated that OTA influences liver function, little is known regarding its impact on differential early liver toxicity. In this study, we report high-throughput tag-sequencing (Tag-seq) analysis of the transcriptome using the Solexa Analyzer platform after 4 h of OTA treatment of HepG-2 cells. The analyses of differentially expressed genes revealed substantial changes. A total of 21,449 genes were identified and quantified, with 2726 displaying significantly altered expression levels. Expression level data were then integrated with a network of gene-gene interactions and biological pathways to obtain a systems-level view of changes in the transcriptome that occur with OTA resistance. Our data suggest that OTA exposure leads to an imbalance in zinc finger expression and shed light on splicing factor and mitochondrial-based mechanisms. PMID:26377828

  11. Assembly of a gene sequence tag microarray by reversible biotin-streptavidin capture for transcript analysis of Arabidopsis thaliana

    PubMed Central

    Wirta, Valtteri; Holmberg, Anders; Lukacs, Morten; Nilsson, Peter; Hilson, Pierre; Uhlén, Mathias; Bhalerao, Rishikesh P; Lundeberg, Joakim

    2005-01-01

    Background Transcriptional profiling using microarrays has developed into a key molecular tool for the elucidation of gene function and gene regulation. Microarray platforms based on either oligonucleotides or purified amplification products have been utilised in parallel to produce large amounts of data. Irrespective of the platform examined, the availability of genome sequence or a large number of representative expressed sequence tags (ESTs) is, however, a pre-requisite for the design and selection of specific and high-quality microarray probes. This is of great importance for organisms, such as Arabidopsis thaliana, with a high number of duplicated genes, as cross-hybridisation signals between evolutionarily related genes cannot be distinguished from true signals unless the probes are carefully designed to be specific. Results We present an alternative solid-phase purification strategy for efficient preparation of short, biotinylated and highly specific probes suitable for large-scale expression profiling. Twenty-one thousand Arabidopsis thaliana gene sequence tags were amplified and subsequently purified using the described technology. The use of the arrays is exemplified by analysis of gene expression changes caused by a four-hour indole-3-acetic acid (auxin) treatment. A total of 270 genes were identified as differentially expressed (120 up-regulated and 150 down-regulated), including several previously known auxin-affected genes, but also several previously uncharacterised genes. Conclusions The described solid-phase procedure can be used to prepare gene sequence tag microarrays based on short and specific amplified probes, facilitating the analysis of more than 21 000 Arabidopsis transcripts. PMID:15689241

  12. Transcriptome analysis in the midgut of the earthworm (Eisenia andrei) using expressed sequence tags.

    PubMed

    Lee, Myung Sik; Cho, Sung Jin; Tak, Eun Sik; Lee, Jong Ae; Cho, Hyun Ju; Park, Bum Joon; Shin, Chuog; Kim, Dae Kyong; Park, Soon Cheol

    2005-03-25

    In order to gain insight into the expression profiles of the earthworm midgut, we analyzed 1106 expressed sequence tags (ESTs) derived from the earthworm midgut cDNA library. Among the 1106 ESTs analyzed, 557 (50.4%) ESTs showed significant similarity to known genes and represented 229 unique genes of which 166 ESTs were singletons and 63 ESTs manifest as two or more ESTs. While 552 ESTs (49.9%) were sequenced only once, 230 ESTs (20.8%) appeared two to five times and 324 ESTs (29.3%) were sequenced more than five times. Considering this redundancy of expression, it is likely that the gene expression profile of the earthworm midgut would be polarized. The expression of globin-related proteins, including ferritin and linker chain, and fibrinolytic enzymes appeared to account for 10.1% and 4.7% of the total ESTs analyzed in this study, respectively. This suggests that the prime functions of the midgut in the earthworm would be associated with protein hydrolysis as well as globin formation. Among the recognized protein-coding genes, the gene category involved in protein synthesis appeared to be the largest one accounting for 15.6% of the expression in the midgut, followed by gene categories associated with energy (11.2%), homeostasis (10.8%), metabolism (3.6%), cytoskeleton (2.5%), and protein fate (1.4%). With regard to functional aspects, the most abundantly expressed genes were associated with respiratory pigment (10.1%), cellular respiration (8.6%), and fibrin hydrolysis (4.7%). In addition, we were able to identify novel ESTs in the earthworm, which were related to the innate immune system, including destabilase, a possible antagonist of transglutaminase. PMID:15708003
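
    The redundancy percentages quoted in this abstract can be re-derived from the reported counts. The short Python sketch below is only an arithmetic check; the helper name percent_shares and the dictionary labels are illustrative choices, while the counts themselves are taken from the abstract.

```python
def percent_shares(counts, total):
    """Return each count as a percentage of total, rounded to one decimal."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


TOTAL_ESTS = 1106  # ESTs analyzed, as stated in the abstract

# Redundancy classes reported in the abstract.
redundancy = {
    "sequenced once": 552,
    "sequenced two to five times": 230,
    "sequenced more than five times": 324,
}

# The three classes should partition the full EST set.
assert sum(redundancy.values()) == TOTAL_ESTS

shares = percent_shares(redundancy, TOTAL_ESTS)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

    The computed shares (49.9%, 20.8% and 29.3%) agree with the figures in the abstract, and the three classes sum to the 1106 ESTs analyzed.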

  13. Immune gene discovery by expressed sequence tag (EST) analysis of hemocytes in the ridgetail white prawn Exopalaemon carinicauda

    PubMed Central

    Duan, Yafei; Liu, Ping; Li, Jitao; Li, Jian; Chen, Ping

    2013-01-01

    The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. The cluster and assembly analysis yielded 1053 unique sequences including 329 contigs and 724 singletons. Blast analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). Based on the COG and Gene Ontology (GO), 593 unique sequences were classified. Through comparison with previous studies, 153 genes assembled from 367 ESTs have been identified as possibly involved in defense or immune functions. These genes are grouped into seven categories according to their putative functions in the shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information about the immune system and lay the groundwork for development of molecular markers related to disease resistance in prawn species. PMID:23092732

  14. Analysis of expressed sequence tags from Musa acuminata ssp. burmannicoides, var. Calcutta 4 (AA) leaves submitted to temperature stresses.

    PubMed

    Santos, C M R; Martins, N F; Hörberg, H M; de Almeida, E R P; Coelho, M C F; Togawa, R C; da Silva, F R; Caetano, A R; Miller, R N G; Souza, M T

    2005-05-01

    In order to discover genes expressed in leaves of Musa acuminata ssp. burmannicoides var. Calcutta 4 (AA), from plants submitted to temperature stress, we produced and characterized two full-length enriched cDNA libraries. Total RNA from plants subjected to temperatures ranging from 5 °C to 25 °C and from 25 °C to 45 °C was used to produce a COLD and a HOT cDNA library, respectively. We sequenced 1,440 clones from each library. Following quality analysis and vector trimming, we assembled 2,286 sequences from both libraries into 1,019 putative transcripts, consisting of 217 clusters and 802 singletons, which we denoted Musa acuminata assembled expressed sequence tagged (EST) sequences (MaAES). Of these MaAES, 22.87% showed no matches with existing sequences in public databases. A global analysis of the MaAES data set indicated that 10% of the sequenced cDNAs are present in both cDNA libraries, while 42% and 48% are present only in the COLD or in the HOT libraries, respectively. Annotation of the MaAES data set categorized them into 22 functional classes. Of the 2,286 high-quality sequences, 715 (31.28%) originated from full-length cDNA clones and resulted in a set of 149 genes. PMID:15841358

  15. Analysis of expressed sequence tags from the blue-green sharpshooter, Graphocephala atropunctata

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We used a metagenomic approach and identified and sequenced 6,836 genetic sequences isolated from adult blue-green sharpshooters, BGSS, Graphocephala atropunctata. These results provided over 70% of the mitochondrial genome sequence which is being completed. The BGSS is endemic to southern Californ...

  16. Analysis of expressed sequence tags (ESTs) from a normalized cDNA library and isolation of EST simple sequence repeats from the invasive cotton mealybug Phenacoccus solenopsis.

    PubMed

    Li, Hui; Lang, Kun-Ling; Fu, Hai-Bin; Shen, Chang-Peng; Wan, Fang-Hao; Chu, Dong

    2015-12-01

    The cotton mealybug, Phenacoccus solenopsis Tinsley, is a serious and invasive pest. At present, genetic resources for studying P. solenopsis are limited, and this negatively affects genetic research on the organism and, consequently, translational work to improve management of this pest. In the present study, expressed sequence tags (ESTs) were analyzed from a normalized complementary DNA library of P. solenopsis. In addition, EST-derived microsatellite loci (also known as simple sequence repeats or SSRs) were isolated and characterized. A total of 1107 high-quality ESTs were acquired from the library. Clustering and assembly analysis resulted in 785 unigenes, which were classified functionally into 23 categories according to the Gene Ontology database. Seven EST-based SSR markers were developed in this study and are expected to be useful in characterizing how this invasive species was introduced, as well as providing insights into its genetic microevolution. PMID:25380551
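
    Isolating EST-derived SSRs, as described above, usually begins with a scan for short tandem motifs. The Python sketch below shows one way such a scan could look. It is not the method used in the study: the minimum repeat counts in MIN_REPEATS, the function names and the example sequence are all assumptions made for illustration.

```python
import re

# Illustrative thresholds (motif length -> minimum number of tandem copies).
# These values are assumptions, not parameters reported in the abstract.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit ('ATAT' is not)."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif
                   for k in range(1, n))


def find_ssrs(seq: str):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical EST fragment containing a single (AC)6 dinucleotide repeat.
example = "GG" + "AC" * 6 + "TT"
print(find_ssrs(example))  # [(2, 'AC', 6)]
```

    Filtering out non-primitive motifs keeps a mononucleotide run such as poly(A) from being reported as a dinucleotide or trinucleotide repeat. Primer design around each hit would be a separate step that this sketch does not cover.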

  17. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.)

    PubMed Central

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant, a natural medicine and a functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches and annotated with Gene Ontology (GO). A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens; these will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant. PMID:26664999

  18. Gene discovery and expression profile analysis through sequencing of expressed sequence tags from different developmental stages of the chytridiomycete Blastocladiella emersonii.

    PubMed

    Ribichich, Karina F; Salem-Izacc, Silvia M; Georg, Raphaela C; Vêncio, Ricardo Z N; Navarro, Luci D; Gomes, Suely L

    2005-02-01

    Blastocladiella emersonii is an aquatic fungus of the chytridiomycete class which diverged early from the fungal lineage and is notable for the morphogenetic processes which occur during its life cycle. Its particular taxonomic position makes this fungus an interesting system to be considered when investigating phylogenetic relationships and studying the biology of lower fungi. To contribute to the understanding of the complexity of the B. emersonii genome, we present here a survey of expressed sequence tags (ESTs) from various stages of the fungal development. Nearly 20,000 cDNA clones from 10 different libraries were partially sequenced from their 5' end, yielding 16,984 high-quality ESTs. These ESTs were assembled into 4,873 putative transcripts, of which 48% presented no matches with existing sequences in public databases. As a result of Gene Ontology (GO) project annotation, 1,680 ESTs (35%) were classified into biological processes of the GO structure, with transcription and RNA processing, protein biosynthesis, and transport as prevalent processes. We also report full-length sequences, useful for construction of molecular phylogenies, and several ESTs that showed high similarity with known proteins, some of which were not previously described in fungi. Furthermore, we analyzed the expression profile (digital Northern analysis) of each transcript throughout the life cycle of the fungus using Bayesian statistics. The in silico approach was validated by Northern blot analysis with good agreement between the two methodologies. PMID:15701807

  19. Analysis and RT-PCR identification of viral sequences in peanut (Arachis hypogaea L.) expressed sequence tags from different peanut tissues

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Peanut plants grown in the field have been naturally infected with different viruses resulting in economic yield loss in the southeastern US, such as tomato spotted wilt tospovirus (TSWV) in peanuts. The objectives of this study were to investigate peanut sequences of expressed sequence tags (EST) f...

  20. Generation and Analysis of Expressed Sequence Tags from Olea europaea L.

    PubMed Central

    Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal

    2010-01-01

    Olive (Olea europaea L.) is an important source of edible oil that originated in the Near East region. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, discover novel genes, and investigate the function of unknown olive genes. A total of 3840 randomly selected colonies from both libraries were sequenced for EST collection. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were designated by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs show no homology to the database, 2024 ESTs have homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI-GenBank. Unique gene sequences for 635 ESTs were identified by over 80% homology to genes of known function in other species that were not previously described in the Olea family. Only 3.1% of the total ESTs showed similarity with the olive database existing in NCBI. The generated EST data and consensus sequences were submitted to NCBI as a valuable source for functional genome studies of olive. PMID:21197085
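
    The assembly figures in the abstract above can be cross-checked with simple arithmetic: contigs plus singletons give the number of unique sequences. The short sketch below does that tally. The variable names and the "redundancy" measure (the fraction of reads that did not add a new unique sequence) are illustrative assumptions, not quantities defined by the authors.

```python
# Counts as reported in the abstract.
readable = {"leaf": 2228, "fruit": 1506}
contigs = {"leaf": 205, "fruit": 69}
singletons = 2478

total_reads = sum(readable.values())                    # readable ESTs from both libraries
unique_sequences = sum(contigs.values()) + singletons   # contigs + singletons
reads_in_contigs = total_reads - singletons             # reads merged into contigs

# Illustrative redundancy measure (an assumption, not the authors' definition).
redundancy = 1 - unique_sequences / total_reads

print(total_reads, unique_sequences, reads_in_contigs, round(redundancy, 3))
```

    The tally reproduces the 2752 unique sequences stated in the abstract.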

  1. Transcriptome analysis of the phytopathogenic fungus Rhizoctonia solani AG1-IB 7/3/14 applying high-throughput sequencing of expressed sequence tags (ESTs).

    PubMed

    Wibberg, Daniel; Jelonek, Lukas; Rupp, Oliver; Kröber, Magdalena; Goesmann, Alexander; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2014-01-01

    Rhizoctonia solani is a soil-borne plant pathogenic fungus of the phylum Basidiomycota. It affects a wide range of agriculturally important crops and hence is responsible for economically relevant crop losses. Transcriptome analysis of the bottom rot pathogen R. solani AG1-IB (isolate 7/3/14) by applying high-throughput sequencing and bioinformatics methods addressing Expressed Sequence Tag (EST) data interpretation provided new insights into expressed genes of this fungus. Two normalized cDNA libraries representing different cultivation conditions of the fungus were sequenced on the 454 FLX (Roche) system. Subsequent to cDNA sequence assembly and quality control, ESTs were analysed applying advanced bioinformatics methods. More than 14 000 transcript isoforms originating from approximately 10 000 predictable R. solani AG1-IB 7/3/14 genes are represented in each dataset. Comparative analyses revealed several differentially expressed genes depending on the growth conditions applied. Determinants with predicted functions in recognition processes between the fungus and the host plant were identified. Moreover, many R. solani AG1-IB ESTs were predicted to encode putative cellulose, pectin, and lignin degrading enzymes. Furthermore, genes playing a possible role in mitogen-activated protein (MAP) kinase cascades, 4-aminobutyric acid (GABA) metabolism, melanin synthesis, plant defence antagonism, phytotoxin, and mycotoxin synthesis were detected. PMID:25209639

  2. Comparative analysis and functional annotation of a large expressed sequence tag collection of apple

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A total of 34 apple cDNA libraries were constructed from root, leaf, bud, shoot, flower, and fruit tissues, at varying developmental stages and/or under biotic or abiotic stress conditions, and of several genotypes. From these libraries, 190,425 clones were partially sequenced from the 5’ end and 4...

  3. Analysis and functional annotation of expressed sequence tags from the Asian longhorned beetle, Anoplophora glabripennis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We identified 600 genetic sequences of which ~380 were uniquely identified to the Asian longhorned beetle (ALB), Anoplophora glabripennis, (Coleoptera) which is one of the most serious invasive forest insect pests discovered in North America in recent years. Despite the substantial impact of this p...

  4. Analysis of expressed sequence tags from Uromyces appendiculatus hyphae and haustoria and their comparison to sequences from other rust fungi

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Two separate cDNA libraries were prepared for RNA extracted from bean rust (Uromyces appendiculatus) hyphae and haustoria isolated from infected bean leaves (Phaseolus vulgaris cv Pint 111) between 2 and 8 dpi. Approximately 13,000 clones were sequenced from both ends and the sequences assem...

  5. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat

    PubMed Central

    Goswami, Suneha; Kumar, Ranjeet R.; Dubey, Kavita; Singh, Jyoti P.; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C.; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C.; Kala, Yugal K.; Singh, Gyanendra P.; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D.

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger sequencing was performed with 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. The identified ESTs were searched with BLAST against the Ensembl, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in the Pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs: 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helices (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative Real-Time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was observed more in wheat cv. HD2985 (thermotolerant), as compared to HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of

  6. Transcriptomic analysis of the venom gland of the red-headed krait (Bungarus flaviceps) using expressed sequence tags

    PubMed Central

    2010-01-01

    Background The Red-headed krait (Bungarus flaviceps, Squamata: Serpentes: Elapidae) is a medically important venomous snake that inhabits South-East Asia. Although the venoms of most species of the snake genus Bungarus have been well characterized, a detailed compositional analysis of B. flaviceps is currently lacking. Results Here, we have sequenced 845 expressed sequence tags (ESTs) from the venom gland of a B. flaviceps. Of the transcripts, 74.8% were putative toxins; 20.6% were cellular; and 4.6% were unknown. The main venom protein families identified were three-finger toxins (3FTxs), Kunitz-type serine protease inhibitors (including chain B of β-bungarotoxin), phospholipase A2 (including chain A of β-bungarotoxin), natriuretic peptide (NP), CRISPs, and C-type lectin. Conclusion The 3FTxs were found to be the major component of the venom (39%). We found eight groups of unique 3FTxs and most of them were different from the well-characterized 3FTxs. We found three groups of Kunitz-type serine protease inhibitors (SPIs); one group was comparable to the classical SPIs and the other two groups to chain B of β-bungarotoxins (with or without the extra cysteine) based on sequence identity. The latter group may be functional equivalents of dendrotoxins in Bungarus venoms. The natriuretic peptide (NP) found is the first NP for any Asian elapid, and distantly related to Australian elapid NPs. Our study identifies several unique toxins in B. flaviceps venom, which may help in understanding the evolution of venom toxins and the pathophysiological symptoms induced after envenomation. PMID:20350308

  7. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat.

    PubMed

    Goswami, Suneha; Kumar, Ranjeet R; Dubey, Kavita; Singh, Jyoti P; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C; Kala, Yugal K; Singh, Gyanendra P; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger sequencing was performed with 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. The identified ESTs were searched with BLAST against the Ensembl, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in the Pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs: 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helices (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative Real-Time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was observed more in wheat cv. HD2985 (thermotolerant), as compared to HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of "climate-smart" wheat

  8. Sequence tagging reveals unexpected modifications in toxicoproteomics

    PubMed Central

    Dasari, Surendra; Chambers, Matthew C.; Codreanu, Simona G.; Liebler, Daniel C.; Collins, Ben C.; Pennington, Stephen R.; Gallagher, William M.; Tabb, David L.

    2010-01-01

    Toxicoproteomic samples are rich in posttranslational modifications (PTMs) of proteins. Identifying these modifications via standard database searching can incur significant performance penalties. Here we describe the latest developments in TagRecon, an algorithm that leverages inferred sequence tags to identify modified peptides in toxicoproteomic data sets. TagRecon identifies known modifications more effectively than the MyriMatch database search engine. TagRecon outperformed state-of-the-art software in recognizing unanticipated modifications from LTQ, Orbitrap, and QTOF data sets. We developed user-friendly software for detecting persistent mass shifts from samples. We follow a three-step strategy for detecting unanticipated PTMs in samples. First, we identify the proteins present in the sample with a standard database search. Next, identified proteins are interrogated for unexpected PTMs with a sequence tag-based search. Finally, additional evidence is gathered for the detected mass shifts with a refinement search. Application of this technology to toxicoproteomic data sets revealed unintended cross-reactions between proteins and sample processing reagents. Twenty-five proteins in rat liver showed signs of oxidative stress when exposed to potentially toxic drugs. These results demonstrate the value of mining toxicoproteomic data sets for modifications. PMID:21214251

  9. An expressed sequence tag database of T-cell-enriched activated chicken splenocytes: sequence analysis of 5251 clones.

    PubMed

    Tirunagaru, V G; Sofer, L; Cui, J; Burnside, J

    2000-06-01

    The cDNA and gene sequences of many mammalian cytokines and their receptors are known. However, corresponding information on avian cytokines is limited due to the lack of cross-species activity at the functional level or strong homology at the molecular level. To improve the efficiency of identifying cytokines and novel chicken genes, a directionally cloned cDNA library from T-cell-enriched activated chicken splenocytes was constructed, and the partial sequence of 5251 clones was obtained. Sequence clustering indicates that 2357 (42%) of the clones are present as a single copy, and 2961 are distinct clones, demonstrating the high level of complexity of this library. Comparisons of the sequence data with known DNA sequences in GenBank indicate that approximately 25% of the clones match known chicken genes, 39% have similarity to known genes in other species, and 11% had no match to any sequence in the database. Several previously uncharacterized chicken cytokines and their receptors were present in our library. This collection provides a useful database for cataloging genes expressed in T cells and a valuable resource for future investigations of gene expression in avian immunology. A chicken EST Web site (http://udgenome.ags.udel.edu/chickest/chick.htm) has been created to provide access to the data, and a set of unique sequences has been deposited with GenBank (Accession Nos. AI979741-AI982511). Our new Web site (http://www.chickest.udel.edu) will be active as of March 3, 2000, and will also provide keyword-searching capabilities for BLASTX and BLASTN hits of all our clones. PMID:10860659

  10. Construction of cDNA library and preliminary analysis of expressed sequence tags from Siberian tiger

    PubMed Central

    Liu, Chang-Qing; Lu, Tao-Feng; Feng, Bao-Gang; Liu, Dan; Guan, Wei-Jun; Ma, Yue-Hui

    2010-01-01

    In this study we successfully constructed a full-length cDNA library from the Siberian tiger, Panthera tigris altaica, the most well-known wild animal. Total RNA was extracted from Siberian tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.30×10^6 pfu/ml and 1.62×10^9 pfu/ml, respectively. The proportion of recombinants in the unamplified library was 90.5% and the average length of exogenous inserts was 1.13 kb. A total of 282 individual ESTs with sizes ranging from 328 to 1,142 bp were then analyzed; the BLASTX scores revealed that 53.9% of the sequences were classified as strong matches, 38.6% as nominal, and 7.4% as weak matches. Of the ESTs, 28.0% were found to be related to enzyme/catalytic protein, 20.9% to metabolism, 13.1% to transport, 12.1% to signal transducer/cell communication, 9.9% to structure protein, 3.9% to immunity protein/defense metabolism, and 3.2% to cell cycle, and 8.9% were classified as novel genes. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genomic research on Siberian tigers. PMID:20941376

  11. Generation and analysis of expressed sequence tags from Trypanosoma cruzi trypomastigote and amastigote cDNA libraries.

    PubMed

    Agüero, Fernán; Abdellah, Karim Ben; Tekiel, Valeria; Sánchez, Daniel O; González, Antonio

    2004-08-01

    We have generated 2771 expressed sequence tags (ESTs) from two cDNA libraries of Trypanosoma cruzi CL-Brener. The libraries were constructed from trypomastigotes and amastigotes, using a spliced leader primer to synthesize the cDNA second strand, thus selecting for full-length cDNAs. Since the libraries were neither normalized nor pre-screened, we compared the representation of transcripts between the two using a statistical test and identified a subset of transcripts that show apparent differential representation. A non-redundant set of 1619 reconstructed transcripts was generated by sequence clustering. This dataset was used to perform similarity searches against protein and nucleotide databases. Based on these searches, 339 sequences could be assigned a putative identity. One thousand one hundred and sixteen sequences in the non-redundant clustered dataset (68.8%) are new expression tags, not represented in the T. cruzi epimastigote ESTs that are in the public databases. Additional information is provided online at http://genoma.unsam.edu.ar/projects/tram. To the best of our knowledge these are the first ESTs reported for the life cycle stages of T. cruzi that occur in the vertebrate host. PMID:15478800

  12. Expressed sequence tags of the peanut pod nematode Ditylenchus africanus: the first transcriptome analysis of an Anguinid nematode

    PubMed Central

    Haegeman, Annelies; Jacob, Joachim; Vanholme, Bartel; Kyndt, Tina; Mitreva, Makedonka; Gheysen, Godelieve

    2009-01-01

    In this study, 4847 expressed sequence tags (ESTs) from mixed stages of the migratory plant-parasitic nematode Ditylenchus africanus (peanut pod nematode) were investigated. It is the first molecular survey of a nematode belonging to the family Anguinidae (order Rhabditida, superfamily Sphaerularioidea). The sequences were clustered into 2596 unigenes, of which 43% did not show any homology to known protein, nucleotide, nematode EST or plant-parasitic nematode genome sequences. Gene ontology mapping revealed that most putative proteins are involved in developmental and reproductive processes. In addition, unigenes involved in oxidative stress as well as in anhydrobiosis, such as LEA (late embryogenesis abundant protein) and trehalose-6-phosphate synthase, were identified. Other tags showed homology to genes previously described as being involved in parasitism (expansin, SEC-2, calreticulin, 14-3-3b and various allergen proteins). In situ hybridization revealed that the expression of a putative expansin and a venom allergen protein was restricted to the gland cell area of the nematode, in agreement with their presumed role in parasitism. Furthermore, 7 putative novel candidate parasitism genes were identified based on the prediction of a signal peptide in the corresponding protein sequence and homologous ESTs exclusively in parasitic nematodes. These genes are interesting for further research and functional characterization. Finally, 34 unigenes were retained as good target candidates for future RNAi experiments, because of their nematode-specific nature and the observed lethal phenotypes of Caenorhabditis elegans homologs. PMID:19383517

  13. Identification of novel highly expressed genes in pancreatic ductal adenocarcinomas through a bioinformatics analysis of expressed sequence tags.

    PubMed

    Cao, Dengfeng; Hustinx, Steven R; Sui, Guoping; Bala, P; Sato, Norihiro; Martin, Sean; Maitra, Anirban; Murphy, Kathleen M; Cameron, John L; Yeo, Charles J; Kern, Scott E; Goggins, Michael; Pandey, Akhilesh; Hruban, Ralph H

    2004-11-01

    In most microarray experiments, a significant fraction of the differentially expressed mRNAs identified correspond to expressed sequence tags (ESTs) and are generally discarded from further analyses. We used careful bioinformatics analyses to characterize those ESTs that were found to be highly overexpressed in a series of pancreatic adenocarcinomas. cDNA was prepared from 60 non-neoplastic samples (normal pancreas [n = 20], normal colon [n = 10], or normal duodenal mucosa [n = 30]) and from 64 pancreatic cancers (resected cancers [n = 50] or cancer cell lines [n = 14]) and hybridized to the complete Affymetrix Human Genome U133 GeneChip® set (arrays U133A and B) for simultaneous analysis of 45,000 fragments corresponding to 33,000 known genes and 6,000 ESTs. The GeneExpress® software system Fold Change Analysis Tool was used and 60 ESTs were identified that were expressed at levels at least 3-fold greater in the pancreatic cancers as compared to normal tissues. Searches against the human genomic sequence and comparative genomic analysis of human and mouse genomes were carried out using basic local alignment search tools (BLAST), BLASTN, and BLASTX, to identify protein coding genes corresponding to the ESTs. Subsequently, in order to pick the most relevant candidate genes for a more detailed analysis, we looked for domains/motifs in the open reading frames using the SMART and Pfam programs. We were able to definitively map 43 of the 60 ESTs to known or novel genes, and 15 of the ESTs could be localized in close proximity to a gene in the human genome although we were unable to establish that the EST was indeed derived from those genes. The differential expression of a subset of genes was confirmed at the protein level by immunohistochemical labeling of tissue microarrays (inhibin beta A [INHBA] and CD29) and/or at the transcript level by RT-PCR (INHBA, AKAP12, ELK3, FOXQ1, EIF5A2, and EFNA5). We conclude that bioinformatics tools can be used to characterize

  14. Complementary DNA sequencing: Expressed sequence tags and human genome project

    SciTech Connect

    Adams, M.D.; Kelley, J.M.; Gocayne, J.D.; Dubnick, M.; Wu, A.; Olde, B.; Moreno, R.F.; Kerlavage, A.R.; McCombie, W.R.; Venter, J.C.; Polymeropoulos, M.H.; Hong Xiao; Merril, C.R.

    1991-06-21

    Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity to genes from other organisms, such as a yeast RNA polymerase II subunit; Drosophila kinesin, Notch, and Enhancer of split; and a murine tyrosine kinase receptor. Forty-six ESTs were mapped to chromosomes after amplification by the polymerase chain reaction. This fast approach to cDNA characterization will facilitate the tagging of most human genes in a few years at a fraction of the cost of complete genomic sequencing, provide new genetic markers, and serve as a resource in diverse biological research fields.

  15. Expressed sequence tag analysis of khat (Catha edulis) provides a putative molecular biochemical basis for the biosynthesis of phenylpropylamino alkaloids.

    PubMed

    Hagel, Jillian M; Krizevski, Raz; Kilpatrick, Korey; Sitrit, Yaron; Marsolais, Frédéric; Lewinsohn, Efraim; Facchini, Peter J

    2011-10-01

    Khat (Catha edulis Forsk.) is a flowering perennial shrub cultivated for its neurostimulant properties resulting mainly from the occurrence of (S)-cathinone in young leaves. The biosynthesis of (S)-cathinone and the related phenylpropylamino alkaloids (1S,2S)-cathine and (1R,2S)-norephedrine is not well characterized in plants. We prepared a cDNA library from young khat leaves and sequenced 4,896 random clones, generating an expressed sequence tag (EST) library of 3,293 unigenes. Putative functions were assigned to > 98% of the ESTs, providing a key resource for gene discovery. Candidates potentially involved at various stages of phenylpropylamino alkaloid biosynthesis from L-phenylalanine to (1S,2S)-cathine were identified. PMID:22215969

  16. Expressed sequence tag analysis of khat (Catha edulis) provides a putative molecular biochemical basis for the biosynthesis of phenylpropylamino alkaloids

    PubMed Central

    Hagel, Jillian M.; Krizevski, Raz; Kilpatrick, Korey; Sitrit, Yaron; Marsolais, Frédéric; Lewinsohn, Efraim; Facchini, Peter J.

    2011-01-01

    Khat (Catha edulis Forsk.) is a flowering perennial shrub cultivated for its neurostimulant properties resulting mainly from the occurrence of (S)-cathinone in young leaves. The biosynthesis of (S)-cathinone and the related phenylpropylamino alkaloids (1S,2S)-cathine and (1R,2S)-norephedrine is not well characterized in plants. We prepared a cDNA library from young khat leaves and sequenced 4,896 random clones, generating an expressed sequence tag (EST) library of 3,293 unigenes. Putative functions were assigned to > 98% of the ESTs, providing a key resource for gene discovery. Candidates potentially involved at various stages of phenylpropylamino alkaloid biosynthesis from L-phenylalanine to (1S,2S)-cathine were identified. PMID:22215969

  17. Gene Discovery through Expressed Sequence Tag Sequencing in Trypanosoma cruzi

    PubMed Central

    Verdun, Ramiro E.; Di Paolo, Nelson; Urmenyi, Turan P.; Rondinelli, Edson; Frasch, Alberto C. C.; Sanchez, Daniel O.

    1998-01-01

    Analysis of expressed sequence tags (ESTs) constitutes a useful approach for gene identification that, in the case of human pathogens, might result in the identification of new targets for chemotherapy and vaccine development. As part of the Trypanosoma cruzi genome project, we have partially sequenced the 5′ ends of 1,949 clones to generate ESTs. The clones were randomly selected from a normalized CL Brener epimastigote cDNA library. A total of 14.6% of the clones were homologous to previously identified T. cruzi genes, while 18.4% had significant matches to genes from other organisms in the database. A total of 67% of the ESTs had no matches in the database, and thus, some of them might be T. cruzi-specific genes. Functional groups of those sequences with matches in the database were constructed according to their putative biological functions. The two largest categories were protein synthesis (23.3%) and cell surface molecules (10.8%). The information reported in this paper should be useful for researchers in the field to analyze genes and proteins of their own interest. PMID:9784549

  18. Analysis of expressed sequence tags from the anamorphic basidiomycetous yeast, Pseudozyma antarctica, which produces glycolipid biosurfactants, mannosylerythritol lipids.

    PubMed

    Morita, Tomotake; Konishi, Masaaki; Fukuoka, Tokuma; Imura, Tomohiro; Kitamoto, Dai

    2006-07-15

    Pseudozyma antarctica T-34 secretes a large amount of biosurfactants (BS), mannosylerythritol lipids (MEL), from different carbon sources such as hydrocarbons and vegetable oils. The detailed biosynthetic pathway of MEL remained unknown due to a lack of genetic information on the anamorphic basidiomycetous yeasts, including the genus Pseudozyma. Here, in order to obtain genetic information on P. antarctica T-34, we constructed a cDNA library from yeast cells producing MEL from soybean oil and identified the genes expressed through the creation of an expressed sequence tag (EST) library. We generated 398 ESTs, assembled into 146 contiguous sequences. Based upon a BLAST search similarity cut-off of E [...] sequences in the protein database; 60.3% of all contiguous sequences shared significant identities to hypothetical proteins of Ustilago maydis, which is a smut fungus and BS producer. Based on the gene expression study using real-time reverse transcriptase-PCR, the predicted genes, such as mannosyltransferase and acyltransferase, were demonstrated to be highly involved in MEL biosynthesis in soybean oil-grown cells. PMID:16845679

  19. Multiplexed genotyping with sequence-tagged molecular inversion probes.

    PubMed

    Hardenbol, Paul; Banér, Johan; Jain, Maneesh; Nilsson, Mats; Namsaraev, Eugeni A; Karlin-Neumann, George A; Fakhrai-Rad, Hossein; Ronaghi, Mostafa; Willis, Thomas D; Landegren, Ulf; Davis, Ronald W

    2003-06-01

    We report on the development of molecular inversion probe (MIP) genotyping, an efficient technology for large-scale single nucleotide polymorphism (SNP) analysis. This technique uses MIPs to produce inverted sequences, which undergo a unimolecular rearrangement and are then amplified by PCR using common primers and analyzed using universal sequence tag DNA microarrays, resulting in highly specific genotyping. With this technology, multiplex analysis of more than 1,000 probes in a single tube can be done using standard laboratory equipment. Genotypes are generated with a high call rate (95%) and high accuracy (>99%) as determined by independent sequencing. PMID:12730666

  20. Chromatin Interaction Analysis with Paired-End Tag Sequencing (ChIA-PET) for Mapping Chromatin Interactions and Understanding Transcription Regulation

    PubMed Central

    Poh, Huay Mei; Peh, Su Qin; Ong, Chin Thing; Zhang, Jingyao; Ruan, Xiaoan; Ruan, Yijun

    2012-01-01

    Genomes are organized into three-dimensional structures, adopting higher-order conformations inside the micron-sized nuclear spaces [7, 2, 12]. Such architectures are not random and involve interactions between gene promoters and regulatory elements [13]. The binding of transcription factors to specific regulatory sequences brings about a network of transcription regulation and coordination [1, 14]. Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) was developed to identify these higher-order chromatin structures [5, 6]. Cells are fixed and interacting loci are captured by covalent DNA-protein cross-links. To minimize non-specific noise and reduce complexity, as well as to increase the specificity of the chromatin interaction analysis, chromatin immunoprecipitation (ChIP) is used against specific protein factors to enrich chromatin fragments of interest before proximity ligation. Ligation involving half-linkers subsequently forms covalent links between pairs of DNA fragments tethered together within individual chromatin complexes. The flanking MmeI restriction enzyme sites in the half-linkers allow extraction of paired end tag-linker-tag constructs (PETs) upon MmeI digestion. As the half-linkers are biotinylated, these PET constructs are purified using streptavidin-magnetic beads. The purified PETs are ligated with next-generation sequencing adaptors and a catalog of interacting fragments is generated via next-generation sequencers such as the Illumina Genome Analyzer. Mapping and bioinformatics analysis is then performed to identify ChIP-enriched binding sites and ChIP-enriched chromatin interactions [8]. We have produced a video to demonstrate critical aspects of the ChIA-PET protocol, especially the preparation of ChIP as the quality of ChIP plays a major role in the outcome of a ChIA-PET library. As the protocols are very long, only the critical steps are shown in the video. PMID:22564980

  1. Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

    PubMed

    Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

    2016-01-01

    Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family. PMID:27323082
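    The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that gene flow was estimated with the commonly used relation Nm = 0.5(1 - GST)/GST; the abstract does not state which estimator the authors applied, so the function name and the formula choice are illustrative assumptions only.

```python
def gene_flow_from_gst(gst: float) -> float:
    """Estimate gene flow (Nm) from GST.

    Assumes Nm = 0.5 * (1 - GST) / GST; the abstract does not confirm
    that this is the estimator the authors used.
    """
    if not 0.0 < gst < 1.0:
        raise ValueError("GST must lie strictly between 0 and 1")
    return 0.5 * (1.0 - gst) / gst


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    # GST as reported in the abstract (rounded to four decimals there).
    print("Nm from GST = 0.2500:", gene_flow_from_gst(0.2500))
    # Counts as reported in the abstract.
    print("SSR-containing ESTs (%):", percent(133, 5548))
    print("Hexanucleotide motifs (%):", percent(56, 133))
    print("Trinucleotide motifs (%):", percent(50, 133))
    print("Polymorphic primer pairs (%):", percent(7, 24))
```

    Under that assumption, GST = 0.2500 gives Nm = 1.5, which agrees with the reported 1.4996 once the rounding of GST is taken into account.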

  2. Generation and Analysis of Expressed Sequence Tags from Chimonanthus praecox (Wintersweet) Flowers for Discovering Stress-Responsive and Floral Development-Related Genes

    PubMed Central

    Sui, Shunzhao; Luo, Jianghui; Ma, Jing; Zhu, Qinlong; Lei, Xinghua; Li, Mingyang

    2012-01-01

    A complementary DNA library was constructed from the flowers of Chimonanthus praecox, an ornamental perennial shrub blossoming in winter in China. Eight hundred sixty-seven high-quality expressed sequence tag sequences with an average read length of 673.8 bp were acquired. A nonredundant set of 479 unigenes, including 94 contigs and 385 singletons, was identified after the expressed sequence tags were clustered and assembled. BLAST analysis against the nonredundant protein database and nonredundant nucleotide database revealed that 405 unigenes shared significant homology with known genes. The homologous unigenes were categorized according to Gene Ontology hierarchies (biological, cellular, and molecular). By BLAST analysis and Gene Ontology annotation, 95 unigenes involved in stress and defense and 19 unigenes related to floral development were identified based on existing knowledge. Twelve genes, of which 9 were annotated as “cold response,” were examined by real-time RT-PCR to understand the changes in expression patterns under cold stress and to validate the findings. Fourteen genes, including 11 genes related to floral development, were also detected by real-time RT-PCR to validate the expression patterns in the blooming process and in different tissues. This study provides a useful basis for the genomic analysis of C. praecox. PMID:22536115

  3. TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets

    PubMed Central

    2010-01-01

    Background: Sequencing metagenomes that were pre-amplified with primer-based methods requires the removal of the additional tag sequences from the datasets. The sequenced reads can contain deletions or insertions due to sequencing limitations, and the primer sequence may contain ambiguous bases. Furthermore, the tag sequence may be unavailable or incorrectly reported. Because of the potential for downstream inaccuracies introduced by unwanted sequence contaminations, it is important to use reliable tools for pre-processing sequence data. Results: TagCleaner is a web application developed to automatically identify and remove known or unknown tag sequences allowing insertions and deletions in the dataset. TagCleaner is designed to filter the trimmed reads for duplicates, short reads, and reads with high rates of ambiguous sequences. An additional screening for and splitting of fragment-to-fragment concatenations that gave rise to artificial concatenated sequences can increase the quality of the dataset. Users may modify the different filter parameters according to their own preferences. Conclusions: TagCleaner is a publicly available web application that is able to automatically detect and efficiently remove tag sequences from metagenomic datasets. It is easily configurable and provides a user-friendly interface. The interactive web interface facilitates export functionality for subsequent data processing, and is available at http://edwards.sdsu.edu/tagcleaner. PMID:20573248
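    The idea of removing a known tag while tolerating insertions and deletions can be illustrated with a small dynamic-programming sketch. This is not TagCleaner's implementation: the function name, the `max_errors` parameter, and the tie-breaking rule are assumptions made for illustration, and ambiguous bases are not handled.

```python
def trim_5prime_tag(read: str, tag: str, max_errors: int = 2) -> str:
    """Remove `tag` from the 5' end of `read` if it matches within
    `max_errors` edits (substitutions, insertions, or deletions).

    Illustrative sketch only; not the algorithm used by TagCleaner.
    """
    # Only prefixes up to len(tag) + max_errors can match within the budget.
    window = min(len(read), len(tag) + max_errors)
    # prev[j] holds the edit distance between tag[:i] and read[:j].
    prev = list(range(window + 1))
    for i in range(1, len(tag) + 1):
        curr = [i] + [0] * window
        for j in range(1, window + 1):
            cost = 0 if tag[i - 1] == read[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # base of tag missing in read
                          curr[j - 1] + 1,     # extra base in read
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    # Pick the read prefix closest to the tag; ties go to the shortest prefix.
    best_end = min(range(window + 1), key=prev.__getitem__)
    if prev[best_end] <= max_errors:
        return read[best_end:]
    return read


if __name__ == "__main__":
    tag = "ACGTACGT"  # hypothetical tag sequence
    print(trim_5prime_tag("ACGTACGTTTGGCC", tag, max_errors=1))  # exact tag
    print(trim_5prime_tag("ACGACGTTTGGCC", tag, max_errors=1))   # one deletion
    print(trim_5prime_tag("GGGGGGGGGGGG", tag, max_errors=1))    # no tag
```

    Restricting the comparison to the first len(tag) + max_errors bases keeps the cost per read small, which matters when the same check is run over millions of reads.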

  4. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-01-01

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility. PMID:21732283

  5. Exploring the host parasitism of the migratory plant-parasitic nematode Ditylenchus destuctor by expressed sequence tags analysis.

    PubMed

    Peng, Huan; Gao, Bing-li; Kong, Ling-an; Yu, Qing; Huang, Wen-kun; He, Xu-feng; Long, Hai-bo; Peng, De-liang

    2013-01-01

    The potato rot nematode, Ditylenchus destructor, is a very destructive nematode pest on many agriculturally important crops worldwide, but the molecular characterization of its parasitism of plants has been limited. The effectors involved in nematode parasitism of plants for several sedentary endo-parasitic nematodes such as Heterodera glycines, Globodera rostochiensis and Meloidogyne incognita have been identified and extensively studied over the past two decades. Ditylenchus destructor, as a migratory plant-parasitic nematode, has different feeding behavior, life cycle and host response. Comparing the transcriptome and parasitome among different types of plant-parasitic nematodes is the way to understand more fully the parasitic mechanism of plant nematodes. We undertook the approach of sequencing expressed sequence tags (ESTs) derived from a mixed-stage cDNA library of D. destructor. This is the first study of D. destructor ESTs. A total of 9800 ESTs were grouped into 5008 clusters including 3606 singletons and 1402 multi-member contigs, representing a catalog of D. destructor genes. Implementing a bioinformatics workflow, we found 1391 clusters have no match in the available gene database; 31 clusters only have similarities to genes identified from D. africanus, the most closely related species to D. destructor; 1991 clusters were annotated using Gene Ontology (GO); 1550 clusters were assigned enzyme commission (EC) numbers; and 1211 clusters were mapped to 181 KEGG biochemical pathways. 22 ESTs had similarities to reported nematode effectors. Interestingly, most of the effectors identified in this study are involved in host cell wall degradation or modification, such as 1,4-beta-glucanase, 1,3-beta-glucanase, pectate lyase, chitinases and expansin, or host defense suppression such as calreticulin, annexin and venom allergen-like protein. This result implies that the migratory plant-parasitic nematode D. destructor secretes similar effectors to those of sedentary

  6. Exploring the Host Parasitism of the Migratory Plant-Parasitic Nematode Ditylenchus destuctor by Expressed Sequence Tags Analysis

    PubMed Central

    Peng, Huan; Gao, Bing-li; Kong, Ling-an; Yu, Qing; Huang, Wen-kun; He, Xu-feng; Long, Hai-bo; Peng, De-liang

    2013-01-01

    The potato rot nematode, Ditylenchus destructor, is a very destructive nematode pest on many agriculturally important crops worldwide, but the molecular characterization of its parasitism of plants has been limited. The effectors involved in nematode parasitism of plants for several sedentary endo-parasitic nematodes such as Heterodera glycines, Globodera rostochiensis and Meloidogyne incognita have been identified and extensively studied over the past two decades. Ditylenchus destructor, as a migratory plant-parasitic nematode, has different feeding behavior, life cycle and host response. Comparing the transcriptome and parasitome among different types of plant-parasitic nematodes is the way to understand more fully the parasitic mechanism of plant nematodes. We undertook the approach of sequencing expressed sequence tags (ESTs) derived from a mixed-stage cDNA library of D. destructor. This is the first study of D. destructor ESTs. A total of 9800 ESTs were grouped into 5008 clusters including 3606 singletons and 1402 multi-member contigs, representing a catalog of D. destructor genes. Implementing a bioinformatics workflow, we found 1391 clusters have no match in the available gene database; 31 clusters only have similarities to genes identified from D. africanus, the most closely related species to D. destructor; 1991 clusters were annotated using Gene Ontology (GO); 1550 clusters were assigned enzyme commission (EC) numbers; and 1211 clusters were mapped to 181 KEGG biochemical pathways. 22 ESTs had similarities to reported nematode effectors. Interestingly, most of the effectors identified in this study are involved in host cell wall degradation or modification, such as 1,4-beta-glucanase, 1,3-beta-glucanase, pectate lyase, chitinases and expansin, or host defense suppression such as calreticulin, annexin and venom allergen-like protein. This result implies that the migratory plant-parasitic nematode D. destructor secretes similar effectors to those of sedentary

  7. Expressed sequence-tag analysis of ovaries of Brachiaria brizantha reveals genes associated with the early steps of embryo sac differentiation of apomictic plants.

    PubMed

    Silveira, Erica Duarte; Guimarães, Larissa Arrais; de Alencar Dusi, Diva Maria; da Silva, Felipe Rodrigues; Martins, Natália Florencio; do Carmo Costa, Marcos Mota; Alves-Ferreira, Márcio; de Campos Carneiro, Vera Tavares

    2012-02-01

    In apomixis, an asexual mode of plant reproduction through seeds, an unreduced megagametophyte is formed due to circumvented or altered meiosis. The embryo develops autonomously from the unreduced egg cell, independently of fertilization. Brachiaria is a genus of tropical forage grasses that reproduces sexually or by apomixis. A limited number of studies have reported the sequencing of apomixis-related genes, and few Brachiaria sequences have been deposited in gene bank databases. This work presents sequencing and expression analyses of expressed sequence tags (ESTs) of the Brachiaria genus and points to transcripts from ovaries with preferential expression at megasporogenesis in apomictic plants. Of the 11 differentially expressed sequences from immature ovaries of sexual and apomictic Brachiaria brizantha obtained from macroarray analysis, 9 were preferentially detected in ovaries of apomicts, as confirmed by RT-qPCR. A putative involvement in the early steps of Panicum-type embryo sac differentiation is suggested for four sequences from B. brizantha ovaries: BbrizHelic, BbrizRan, BbrizSec13 and BbrizSti1. Two of these, BbrizSti1 and BbrizHelic, with similarity to genes coding for a stress-induced protein and a helicase, respectively, are preferentially expressed in the early stages of apomictic ovary development, especially in the nucellus, at a stage prior to the differentiation of aposporous initials, as verified by in situ hybridization. PMID:22068439

  8. Adult midgut expressed sequence tags from the tsetse fly Glossina morsitans morsitans and expression analysis of putative immune response genes

    PubMed Central

    Lehane, M J; Aksoy, S; Gibson, W; Kerhornou, A; Berriman, M; Hamilton, J; Soares, M B; Bonaldo, M F; Lehane, S; Hall, N

    2003-01-01

    Background: Tsetse flies transmit African trypanosomiasis leading to half a million cases annually. Trypanosomiasis in animals (nagana) remains a massive brake on African agricultural development. While trypanosome biology is widely studied, knowledge of tsetse flies is very limited, particularly at the molecular level. This is a serious impediment to investigations of tsetse-trypanosome interactions. We have undertaken an expressed sequence tag (EST) project on the adult tsetse midgut, the major organ system for establishment and early development of trypanosomes. Results: A total of 21,427 ESTs were produced from the midgut of adult Glossina morsitans morsitans and grouped into 8,876 clusters or singletons potentially representing unique genes. Putative functions were ascribed to 4,035 of these by homology. Of these, a remarkable 3,884 had their most significant matches in the Drosophila protein database. We selected 68 genes with putative immune-related functions, macroarrayed them and determined their expression profiles following bacterial or trypanosome challenge. In both infections many genes are downregulated, suggesting a malaise response in the midgut. Trypanosome and bacterial challenge result in upregulation of different genes, suggesting that different recognition pathways are involved in the two responses. The most notable block of genes upregulated in response to trypanosome challenge are a series of Toll and Imd genes and a series of genes involved in oxidative stress responses. Conclusions: The project increases the number of known Glossina genes by two orders of magnitude. Identification of putative immunity genes and their preliminary characterization provides a resource for the experimental dissection of tsetse-trypanosome interactions. PMID:14519198

  9. Extending RAD tag analysis to microbial ecology: a comparison between MultiLocus Sequence Typing and 2b-RAD to investigate Listeria monocytogenes genetic structure.

    PubMed

    Pauletto, Marianna; Carraro, Lisa; Babbucci, Massimiliano; Lucchini, Rosaria; Bargelloni, Luca; Cardazzo, Barbara

    2016-05-01

    The advent of next-generation sequencing (NGS) has dramatically changed bacterial typing technologies, increasing our ability to differentiate bacterial isolates. Although it is now possible to sequence a bacterial genome in a few days and at reasonable cost, most genetic analyses do not require whole-genome sequencing, which also remains impractical for large population samples due to the cost of individual library preparation and bioinformatics. More traditional sequencing approaches, however, such as MultiLocus Sequence Typing (MLST), are quite laborious and time-consuming, especially for large-scale analyses. In this study, a genotyping approach based on restriction site-associated (RAD) tag sequencing, 2b-RAD, was applied to characterize Listeria monocytogenes strains. To verify the feasibility of the method, an in silico analysis was performed on 30 available complete genomes. For the same set of strains, in silico MLST analysis was conducted as well. Subsequently, 2b-RAD and MLST analyses were experimentally carried out on 58 isolates collected from food samples or food-processing sites. The obtained results demonstrate that 2b-RAD predicts MLST types and often provides more detailed information on population structure than MLST. Moreover, the majority of variants differentiating identical sequence type isolates mapped against accessory fragments, thus providing additional information to characterize strains. Although MLST still represents a reliable typing method, large-scale studies on molecular epidemiology and public health, as well as bacterial phylogenetics, population genetics and biosafety, could benefit from a low-cost, fast-turnaround approach such as the 2b-RAD analysis proposed here. PMID:26613186

  10. Analysis of expression sequence tags from a full-length-enriched cDNA library of developing sesame seeds (Sesamum indicum)

    PubMed Central

    2011-01-01

    Background: Sesame (Sesamum indicum) is one of the most important oilseed crops, with high oil content and rich nutrient value. However, genetic improvement efforts in sesame have not benefited from molecular biology technology due to poor DNA and RNA sequence resources. In this study, we carried out large-scale expressed sequence tag (EST) sequencing from developing sesame seeds and further analyzed genes related to seed storage products. Results: A normalized and full-length-enriched cDNA library from 5- to 30-day-old immature seeds was constructed and randomly sequenced, leading to the generation of 41,248 ESTs, which then formed 4,713 contigs and 27,708 singletons, with 44.9% of uniESTs being putative full-length open reading frames. Approximately 26,091 of these uniESTs have significant matches to counterparts in the Nr database of GenBank, and 21,628 of them were assigned to one or more Gene Ontology (GO) terms. Homologous genes involved in oil biosynthesis were identified, including some conserved transcription factors regulating oil biosynthesis such as LEAFY COTYLEDON1 (LEC1), PICKLE (PKL) and WRINKLED1 (WRI1), and the majority of them were found for the first time in sesame seeds. One hundred and seventeen ESTs were identified as possibly involved in the biosynthesis of the sesame lignans sesamin and sesamolin. In total, 9,347 putative functional genes from developing seeds were identified, which accounts for one third of the total genes in the sesame genome. Further analysis of the uniESTs identified 1,949 non-redundant simple sequence repeats (SSRs). Conclusions: This study has provided an overview of genes expressed during sesame seed development. This collection of sesame full-length cDNAs covered a wide variety of genes in seeds, in particular, candidate genes involved in the biosynthesis of sesame oils and lignans. These EST sequences enriched with full length will contribute to comparative genomic studies on sesame and other oilseed plants

  11. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
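    The SSR mining step described above (finding di- to hexanucleotide repeats in ESTs and grouping motifs such as AAG/CTT) can be sketched with regular expressions. The minimum repeat counts in `MIN_REPEATS` are assumed values, not the thresholds used in the study, and the function names are hypothetical. The scan is deliberately simple and may miss repeats that directly abut a homopolymer run.

```python
import re

# Assumed minimum repeat counts per motif length; the abstract does not
# report the thresholds the authors used.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k)
                   for k in range(1, n))


def canonical_motif(motif: str) -> str:
    """Smallest rotation of the motif or its reverse complement, so that
    AAG, AGA, GAA, CTT, TTC and TCT all map to the same key."""
    rc = motif.translate(_COMPLEMENT)[::-1]
    return min(s[i:] + s[:i] for s in (motif, rc) for i in range(len(s)))


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. ATAT counted as a 4-mer
                hits.append((match.start(), motif,
                             len(match.group(0)) // size))
    return sorted(hits)


if __name__ == "__main__":
    est = "GGC" + "AAG" * 6 + "TCGA" + "CT" * 7 + "G"  # made-up sequence
    for start, motif, count in find_ssrs(est):
        print(start, motif, count, canonical_motif(motif))
```

    Grouping by `canonical_motif` is what allows a statement such as "AAG/CTT was the most abundant motif" to be computed regardless of strand or repeat phase.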

  12. Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

    PubMed Central

    Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551

  13. Development of Microsatellite Markers Derived from Expressed Sequence Tags of Polyporales for Genetic Diversity Analysis of Endangered Polyporus umbellatus.

    PubMed

    Zhang, Yuejin; Chen, Yuanyuan; Wang, Ruihong; Zeng, Ailin; Deyholos, Michael K; Shu, Jia; Guo, Hongbo

    2015-01-01

    A large number of EST sequences of Polyporales was screened in this investigation in order to identify EST-SSR markers for various applications. The distribution of EST sequences and SSRs in five families of Polyporales was analyzed. Mononucleotide repeats were the most abundant type, followed by trinucleotide repeats. Among the five families, Ganodermataceae contained the most SSR markers, followed by Coriolaceae. Functional prediction of SSR marker-containing EST sequences in Ganoderma lucidum yielded three main groups, namely, cellular component, biological process, and molecular function. Thirty EST-SSR primers were designed to evaluate the genetic diversity of 13 natural Polyporus umbellatus accessions. Twenty-one EST-SSRs were polymorphic, with an average PIC value of 0.33 and a transferability rate of 71%. These 13 P. umbellatus accessions showed relatively high genetic diversity. The expected heterozygosity, Nei's gene diversity, and Shannon information index were 0.41, 0.39, and 0.57, respectively. Both the UPGMA dendrogram and principal coordinate analysis (PCA) showed the same cluster result that divided the 13 accessions into three or four groups. PMID:26146636

  14. Expressed sequence tag analysis and development of gene associated markers in a near-isogenic plant system of Eragrostis curvula.

    PubMed

    Cervigni, Gerardo D L; Paniego, Norma; Díaz, Marina; Selva, Juan P; Zappacosta, Diego; Zanazzi, Darío; Landerreche, Iñaki; Martelotto, Luciano; Felitti, Silvina; Pessino, Silvina; Spangenberg, Germán; Echenique, Viviana

    2008-05-01

    Eragrostis curvula (Schrad.) Nees is a forage grass native to the semiarid regions of Southern Africa, which reproduces mainly by pseudogamous diplosporous apomixis. A collection of ESTs was generated from four cDNA libraries, three of them obtained from panicles of near-isogenic lines with different ploidy levels and reproductive modes, and one obtained from 12 days-old plant leaves. A total of 12,295 high-quality ESTs were clustered and assembled, rendering 8,864 unigenes, including 1,490 contigs and 7,394 singletons, with a genome coverage of 22%. A total of 7,029 (79.11%) unigenes were functionally categorized by BLASTX analysis against sequences deposited in public databases, but only 37.80% could be classified according to Gene Ontology. Sequence comparison against the cereals genes indexes (GI) revealed 50% significant hits. A total of 254 EST-SSRs were detected from 219 singletons and 35 from contigs. Di- and tri- motifs were similarly represented with percentages of 38.95 and 40.16%, respectively. In addition, 190 SNPs and Indels were detected in 18 contigs generated from 3 to 4 libraries. The ESTs and the molecular markers obtained in this study will provide valuable resources for a wide range of applications including gene identification, genetic mapping, cultivar identification, analysis of genetic diversity, phenotype mapping and marker assisted selection. PMID:18196464

  15. Identification of salt-induced genes from Salicornia brachiata, an extreme halophyte through expressed sequence tags analysis.

    PubMed

    Jha, Bhavanath; Agarwal, Pradeep K; Reddy, Palakolanu Sudhakar; Lal, Sanjay; Sopory, Sudhir K; Reddy, Malireddy K

    2009-04-01

    Salinity severely affects plant growth and development, causing crop loss worldwide. We have isolated a large number of salt-induced genes as well as unknown and hypothetical genes from Salicornia brachiata Roxb. (Amaranthaceae). This is the first description of the identification of genes responding to salinity stress in this extreme halophyte. Salicornia accumulates salt in its pith and survives even at 2 M NaCl under field conditions. To isolate salt-responsive genes, cDNA subtractive hybridization was performed between control and 500 mM NaCl-treated plants. Out of the 1200 recombinant clones, 930 sequences were submitted to the NCBI database (GenBank accession: EB484528 to EB485289 and EC906125 to EC906292). 789 ESTs matched different genes in the NCBI database. 4.8% of ESTs belonged to the stress-tolerant gene category and approximately 29% of ESTs showed no homology with known functional gene sequences, and were thus classified as unknown or hypothetical. The detection of a large number of ESTs with unknown putative function in this species makes it an interesting contribution. The 90 unknown and hypothetical genes were selected to study their differential regulation by reverse Northern analysis for identifying their role in salinity tolerance. Interestingly, both up- and down-regulation at 500 mM NaCl were observed (21 and 10 genes, respectively). Northern analysis of two important salt-tolerance genes, ASR1 (abscisic acid stress ripening gene) and plasma membrane H+ATPase, showed a basal level of transcripts in the control condition and an increase with NaCl treatment. The ASR1 gene was extended to full length using 5' RACE, and its potential role in imparting salt tolerance is being studied. PMID:19556705

  16. A sequence-tagged site map of human chromosome 11.

    PubMed

    Smith, M W; Clark, S P; Hutchinson, J S; Wei, Y H; Churukian, A C; Daniels, L B; Diggle, K L; Gen, M W; Romo, A J; Lin, Y

    1993-09-01

    We report the construction of 370 sequence-tagged sites (STSs) that are detectable by PCR amplification under sets of standardized conditions and that have been regionally mapped to human chromosome 11. DNA sequences were determined by sequencing directly from cosmid templates using primers complementary to T3 and T7 promoters present in the cloning vector. Oligonucleotide PCR primers were predicted by computer and tested using a battery of genomic DNAs. Cosmids were regionally localized on chromosome 11 by using fluorescence in situ hybridization or by analyzing a somatic cell hybrid panel. Additional STSs corresponding to known genes and markers on chromosome 11 were also produced under the same series of standardized conditions. The resulting STSs provide uniform coverage of chromosome 11 with an average spacing of 340 kb. The DNA sequence determined for use in STS production corresponds to about 0.1% (116 kb) of chromosome 11 and has been analyzed for the presence of repetitive sequences, similarities to known genes and motifs, and possible exons. Computer analysis of this sequence has identified and therefore mapped at least eight new genes on chromosome 11. PMID:8244387

  17. Computational exploration of microRNAs from expressed sequence tags of Humulus lupulus, target predictions and expression analysis.

    PubMed

    Mishra, Ajay Kumar; Duraisamy, Ganesh Selvaraj; Týcová, Anna; Matoušek, Jaroslav

    2015-12-01

    Among computationally predicted and experimentally validated plant miRNAs, several are conserved across species boundaries in the plant kingdom. In this study, a combined experimental and in silico approach was adopted for the identification and characterization of miRNAs in Humulus lupulus (hop), which is widely cultivated for use by the brewing industry and is also used as a medicinal herb. A total of 22 miRNAs belonging to 17 miRNA families were identified in hop following a comparative computational approach and EST-based homology search according to a series of filtering criteria. Selected miRNAs were validated by end-point PCR and quantitative reverse transcription-polymerase chain reaction (qRT-PCR), confirming the existence of conserved miRNAs in hop. Based on the characteristic that miRNAs exhibit perfect or nearly perfect complementarity with their targeted mRNA sequences, a total of 47 potential miRNA targets were identified in hop. Strikingly, the majority of predicted targets were transcription factors that could regulate hop growth and development, including leaf, root and even cone development. Moreover, the identified miRNAs may also be involved in other cellular and metabolic processes, such as stress response, signal transduction, and other physiological processes. Cis-regulatory elements relevant to biotic and abiotic stress, plant hormone response, and flavonoid biosynthesis were identified in the promoter regions of those miRNA genes. Overall, findings from this study will pave the way for further research on miRNAs and their functions in hop, and show a path for the prediction and analysis of miRNAs in species whose genomes are not available. PMID:26476128
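    Target prediction based on perfect or near-perfect complementarity can be sketched as a sliding-window mismatch count against the reverse complement of the miRNA. This is a simplified illustration, not the pipeline used in the study: the `max_mismatches` cutoff is an assumed value, G:U wobble pairs and position-specific penalties are ignored, and the example sequence is arbitrary.

```python
_PAIR = str.maketrans("ACGU", "UGCA")


def to_rna(seq: str) -> str:
    """Upper-case the sequence and write thymine as uracil."""
    return seq.upper().replace("T", "U")


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA (or DNA) sequence, returned as RNA."""
    return to_rna(seq).translate(_PAIR)[::-1]


def find_target_sites(mirna: str, mrna: str, max_mismatches: int = 3):
    """Return (start, mismatches) for mRNA windows that pair with the miRNA
    with at most `max_mismatches` mismatches (assumed cutoff)."""
    mrna = to_rna(mrna)
    ideal_site = reverse_complement_rna(mirna)  # perfectly paired site
    size = len(ideal_site)
    hits = []
    for start in range(len(mrna) - size + 1):
        window = mrna[start:start + size]
        mismatches = sum(a != b for a, b in zip(window, ideal_site))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    mirna = "UGACAGAAGAGAGUGAGCAC"  # arbitrary 20-nt example sequence
    site = reverse_complement_rna(mirna)
    transcript = "GGG" + site + "AAA"  # made-up transcript with one site
    print(find_target_sites(mirna, transcript))
```

    Published plant target predictors typically add scoring rules on top of this basic count, for example weighting mismatches in the 5' region of the miRNA more heavily.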

  18. OligoTag: a program for designing sets of tags for next-generation sequencing of multiplexed samples.

    PubMed

    Coissac, Eric

    2012-01-01

    Next-generation sequencing systems allow high-throughput production of DNA sequence data. But this technology is more adapted for analyzing a small number of samples needing a huge amount of sequences rather than a large number of samples needing a small number of sequences. One solution to this problem is sample multiplexing. To achieve this, one can add a small tag at the extremities of the sequenced DNA molecules. These tags will be identified using bioinformatics tools after the sequencing step to sort sequences among samples. The rules to apply for selecting a good set of tags adapted to each situation are described in this chapter. Depending on the number of samples to tag and on the required quality of assignation, different solutions are possible. The software oligoTag, a part of OBITools that computes these sets of tags, is presented with some example sets of tags. PMID:22665273
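
    As an illustration of what selecting a "good set of tags" can involve, the following sketch greedily builds a set of tags that differ pairwise by a minimum Hamming distance and avoid long homopolymer runs. It is not the oligoTag algorithm: the function names, the greedy strategy, and the `max_run` rule are assumptions made for this example.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length tags differ."""
    return sum(x != y for x, y in zip(a, b))


def max_homopolymer(seq):
    """Length of the longest run of identical bases in seq."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def design_tags(length, min_distance, max_run=2):
    """Greedily collect tags that are pairwise at least min_distance apart.

    Candidates are visited in lexicographic order, so the result is
    deterministic but not guaranteed to be the largest possible set.
    """
    tags = []
    for candidate in map("".join, product("ACGT", repeat=length)):
        if max_homopolymer(candidate) > max_run:
            continue  # skip tags such as AAAA (hypothetical design rule)
        if all(hamming(candidate, t) >= min_distance for t in tags):
            tags.append(candidate)
    return tags
```

    A larger `min_distance` tolerates more sequencing errors per tag before two samples can be confused, at the cost of fewer usable tags for a given length.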

  19. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

    PubMed Central

    de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Ro's, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.

    2000-01-01

    Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus, despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by GENSCAN (http://genes.mit.edu/GENSCAN.html). PMID:11070084

  20. Generation and Analysis of Expressed Sequence Tags (ESTs) from Halophyte Atriplex canescens to Explore Salt-Responsive Related Genes

    PubMed Central

    Li, Jingtao; Sun, Xinhua; Yu, Gang; Jia, Chengguo; Liu, Jinliang; Pan, Hongyu

    2014-01-01

    Little information is available on gene expression profiling of the halophyte A. canescens. To elucidate the molecular mechanism for stress tolerance in A. canescens, a full-length complementary DNA library was generated from A. canescens exposed to 400 mM NaCl, which provided 343 high-quality ESTs. In an evaluation of the 343 valid EST sequences in the cDNA library, 197 unigenes were assembled, among which 190 unigenes (83.1% of ESTs) were identified according to their significant similarities with proteins of known functions. All 343 EST sequences have been deposited in dbEST (GenBank) under accession numbers JZ535802 to JZ536144. According to Arabidopsis MIPS functional category and GO classifications, we identified 193 unigenes among the 311 annotated ESTs, representing 72 non-redundant unigenes sharing similarities with genes related to the defense response. The sets of ESTs obtained provide a rich genetic resource, and 17 up-regulated genes related to salt stress resistance were identified by qRT-PCR. Six of these genes may contribute crucially to earlier and later stage salt stress resistance. Additionally, among the 343 unigene sequences, 22 simple sequence repeats (SSRs) were also identified, contributing to the study of A. canescens resources. PMID:24960361

  1. Generation and analysis of expressed sequence tags (ESTs) for marker development in yam (Dioscorea alata L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A total of 44,757 EST sequences, 1,705 EST-SSR and 104 SNP markers were generated from the cDNA libraries of the resistant and susceptible genotypes. We have developed a comprehensive annotated transcriptome data set in yam to enrich the EST information in public databases. These EST resources prov...

  2. Identification of Tuber borchii Vittad. mycelium proteins separated by two-dimensional polyacrylamide gel electrophoresis using amino acid analysis and sequence tagging.

    PubMed

    Vallorani, L; Bernardini, F; Sacconi, C; Pierleoni, R; Pieretti, B; Piccoli, G; Buffalini, M; Stocchi, V

    2000-11-01

    This paper reports the first results in the proteome analysis of Tuber borchii Vittad. mycelium, an ectomycorrhizal fungus poorly defined genetically, but known for its generation of edible fruit bodies known as white truffles. Employing isoelectric focusing on immobilized pH gradients, followed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, we obtained an electropherogram presenting over 800 spots within the window of isoelectric points (pI) 3.5-9 and a molecular mass of 10-200 kDa. Different reducing agents were tested in the sample preparation buffers, and the standard lysis buffer plus 2% w/v polyvinylpolypyrrolidone allowed the best solubilization and resolution of the proteins. The T. borchii proteins separated in micropreparative gels were electroblotted onto polyvinylidene difluoride membranes and visualized by Coomassie staining. Twenty-three proteins were excised and analyzed by the combination of amino acid and N-terminal analysis. One protein was identified by matching its amino acid composition, estimated isoelectric point and molecular mass against the SWISS-PROT and EMBL databases. Four spots were successfully tagged by Edman microsequencing but no homologous sequences were found in databases. PMID:11271490

  3. Analysis of bacterial and archaeal diversity in coastal microbial mats using massive parallel 16S rRNA gene tag sequencing

    PubMed Central

    Bolhuis, Henk; Stal, Lucas J

    2011-01-01

    Coastal microbial mats are small-scale and largely closed ecosystems in which a plethora of different functional groups of microorganisms are responsible for the biogeochemical cycling of the elements. Coastal microbial mats play an important role in coastal protection and morphodynamics through stabilization of the sediments and by initiating the development of salt-marshes. Little is known about the bacterial and especially archaeal diversity and how it contributes to the ecological functioning of coastal microbial mats. Here, we analyzed three different types of coastal microbial mats that are located along a tidal gradient and can be characterized as marine (ST2), brackish (ST3) and freshwater (ST3) systems. The mats were sampled during three different seasons and subjected to massive parallel tag sequencing of the V6 region of the 16S rRNA genes of Bacteria and Archaea. Sequence analysis revealed that the mats are among the most diverse marine ecosystems studied so far and consist of several novel taxonomic levels ranging from classes to species. The diversity between the different mat types was far more pronounced than the changes between the different seasons at one location. The archaeal community of these mats has not been studied before and revealed a strong reaction to a short period of drought during summer, resulting in a massive increase in halobacterial sequences, whereas the bacterial community was barely affected. We concluded that the community composition and the microbial diversity were intrinsic to the mat type and depended on the location along the tidal gradient, indicating a relation with salinity. PMID:21544102

  4. A molecular analysis of desiccation tolerance mechanisms in the anhydrobiotic nematode Panagrolaimus superbus using expressed sequenced tags

    PubMed Central

    2012-01-01

    Background Some organisms can survive extreme desiccation by entering into a state of suspended animation known as anhydrobiosis. Panagrolaimus superbus is a free-living anhydrobiotic nematode that can survive rapid environmental desiccation. The mechanisms that P. superbus uses to combat the potentially lethal effects of cellular dehydration may include the constitutive and inducible expression of protective molecules, along with behavioural and/or morphological adaptations that slow the rate of cellular water loss. In addition, inducible repair and revival programmes may also be required for successful rehydration and recovery from anhydrobiosis. Results To identify constitutively expressed candidate anhydrobiotic genes we obtained 9,216 ESTs from an unstressed mixed stage population of P. superbus. We derived 4,009 unigenes from these ESTs. These unigene annotations and sequences can be accessed at http://www.nematodes.org/nembase4/species_info.php?species=PSC. We manually annotated a set of 187 constitutively expressed candidate anhydrobiotic genes from P. superbus. Notable among those is a putative lineage expansion of the lea (late embryogenesis abundant) gene family. The most abundantly expressed sequence was a member of the nematode specific sxp/ral-2 family that is highly expressed in parasitic nematodes and secreted onto the surface of the nematodes' cuticles. There were 2,059 novel unigenes (51.7% of the total), 149 of which are predicted to encode intrinsically disordered proteins lacking a fixed tertiary structure. One unigene may encode an exo-β-1,3-glucanase (GHF5 family), most similar to a sequence from Phytophthora infestans. GHF5 enzymes have been reported from several species of plant parasitic nematodes, with horizontal gene transfer (HGT) from bacteria proposed to explain their evolutionary origin. This P. superbus sequence represents another possible HGT event within the Nematoda. The expression of five of the 19 putative stress response

  5. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  6. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  7. Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

    PubMed Central

    Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A

    2008-01-01

    Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non-redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731

  8. Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

    PubMed Central

    Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

    2007-01-01

    Background Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547

  9. Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

    PubMed

    Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

    2015-01-01

    Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes linked to physiological processes were identified, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to be SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasm accessions. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species. PMID:26505424
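
    Detection of dinucleotide and trinucleotide SSRs of the kind counted in this abstract can be sketched with a regular-expression scan. The abstract does not state which tool or thresholds were used, so the minimum repeat counts below (6 for dinucleotides, 5 for trinucleotides) and the function name are assumptions for illustration only.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem repeats.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=None):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs.

    Only uninterrupted tandem repeats are reported; motifs made of a single
    base (for example AA) are skipped so homopolymers are not counted.
    """
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in thresholds.items():
        # One captured motif followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)
```

    Running `find_ssrs` over each unigene and tallying hits by motif length gives the kind of dinucleotide versus trinucleotide breakdown reported above, subject to the chosen thresholds.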

  10. Generation and analysis of expressed sequence tags (ESTs) of Camelina sativa to mine drought stress-responsive genes.

    PubMed

    Kanth, Bashistha Kumar; Kumari, Shipra; Choi, Seo Hee; Ha, Hye-Jeong; Lee, Geung-Joo

    2015-11-01

    Camelina sativa is an oil-producing crop belonging to the family Brassicaceae. Due to its exceptionally high content of omega fatty acid, it is commercially grown around the world as edible oil, biofuel, and animal feed. Commonly referred to as 'false flax' or gold-of-pleasure, Camelina sativa has attracted interest as a biofuel feedstock. The species can grow on marginal land due to its superior drought tolerance with low requirement of agricultural inputs. This crop has been unexploited due to very limited transcriptomic and genomic data. Use of gene-specific molecular markers is an important strategy for new cultivar development in breeding programs. In this study, Illumina paired-end sequencing technology and bioinformatics tools were used to obtain expression profiling of genes responding to drought stress in Camelina sativa BN14. A total of more than 60,000 loci were assembled, corresponding to approximately 275 K transcripts. When the species was exposed to 10 kPa drought stress, 100 kPa drought stress, and rehydrated conditions, a total of 107, 2,989, and 982 genes, respectively, were up-regulated, while 146, 3,659, and 1,189 genes, respectively, were down-regulated compared to the control condition. Some unknown genes were found to be highly expressed under drought conditions, together with some already reported gene families such as senescence-associated genes, CAP160, and LEA under 100 kPa soil water condition, cysteine protease, 2OG, Fe(II)-dependent oxygenase, and RAD-like 1 under rehydrated condition. These genes will be further validated and mapped to determine their function and loci. This EST library will be favorably applied to develop gene-specific molecular markers and discover genes responsible for drought tolerance in Camelina species. PMID:26410535

  11. Applying thiouracil (TU)-tagging for mouse transcriptome analysis

    PubMed Central

    Gay, Leslie; Karfilis, Kate V.; Miller, Michael R.; Doe, Chris Q.; Stankunas, Kryn

    2014-01-01

    Transcriptional profiling is a powerful approach to study mouse development, physiology, and disease models. Here, we describe a protocol for mouse thiouracil-tagging (TU-tagging), a transcriptome analysis technology that includes in vivo covalent labeling, purification, and analysis of cell type-specific RNA. TU-tagging enables 1) the isolation of RNA from a given cell population of a complex tissue, avoiding transcriptional changes induced by cell isolation trauma, and 2) the identification of actively transcribed RNAs and not pre-existing transcripts. Therefore, in contrast to other cell-specific transcriptional profiling methods based on purification of tagged ribosomes or nuclei, TU-tagging provides a direct examination of transcriptional regulation. We describe how to: 1) deliver 4-thiouracil to transgenic mice to thio-label cell lineage-specific transcripts, 2) purify TU-tagged RNA and prepare libraries for Illumina sequencing, and 3) follow a straightforward bioinformatics workflow to identify cell type-enriched or differentially expressed genes. Tissue containing TU-tagged RNA can be obtained in one day, RNA-Seq libraries generated within two days, and, following sequencing, an initial bioinformatics analysis completed in one additional day. PMID:24457332

  12. Needles in the EST Haystack: Large-Scale Identification and Analysis of Excretory-Secretory (ES) Proteins in Parasitic Nematodes Using Expressed Sequence Tags (ESTs)

    PubMed Central

    Nagaraj, Shivashankar H.; Gasser, Robin B.; Ranganathan, Shoba

    2008-01-01

    Background Parasitic nematodes of humans, other animals and plants continue to impose a significant public health and economic burden worldwide, due to the diseases they cause. Promising antiparasitic drug and vaccine candidates have been discovered from excreted or secreted (ES) proteins released from the parasite and exposed to the immune system of the host. Mining the entire expressed sequence tag (EST) data available from parasitic nematodes represents an approach to discover such ES targets. Methods and Findings In this study, we predicted, using EST2Secretome, a novel, high-throughput, computational workflow system, 4,710 ES proteins from 452,134 ESTs derived from 39 different species of nematodes, parasitic in animals (including humans) or plants. In total, 2,632, 786, and 1,292 ES proteins were predicted for animal-, human-, and plant-parasitic nematodes, respectively. Subsequently, we systematically analysed ES proteins using computational methods. Of these 4,710 proteins, 2,490 (52.8%) had orthologues in Caenorhabditis elegans, whereas 621 (13.8%) appeared to be novel, currently having no significant match to any molecule available in public databases. Of the C. elegans homologues, 267 had strong “loss-of-function” phenotypes by RNA interference (RNAi) in this nematode. We could functionally classify 1,948 (41.3%) sequences using the Gene Ontology (GO) terms, establish pathway associations for 573 (12.2%) sequences using Kyoto Encyclopaedia of Genes and Genomes (KEGG), and identify protein interaction partners for 1,774 (37.6%) molecules. We also mapped 758 (16.1%) proteins to protein domains including the nematode-specific protein family “transthyretin-like” and “chromadorea ALT,” considered as vaccine candidates against filariasis in humans. Conclusions We report the large-scale analysis of ES proteins inferred from EST data for a range of parasitic nematodes. This set of ES proteins provides an inventory of known and novel members of ES proteins as a

  13. Identification of potential vaccine and drug target candidates by expressed sequence tag analysis and immunoscreening of Onchocerca volvulus larval cDNA libraries.

    PubMed

    Lizotte-Waniewski, M; Tawe, W; Guiliano, D B; Lu, W; Liu, J; Williams, S A; Lustigman, S

    2000-06-01

    The search for appropriate vaccine candidates and drug targets against onchocerciasis has so far been confronted with several limitations due to the unavailability of biological material, appropriate molecular resources, and knowledge of the parasite biology. To identify targets for vaccine or chemotherapy development we have undertaken two approaches. First, cDNA expression libraries were constructed from life cycle stages that are critical for establishment of Onchocerca volvulus infection, the third-stage larvae (L3) and the molting L3. A gene discovery effort was then initiated by random expressed sequence tag analysis of 5,506 cDNA clones. Cluster analyses showed that many of the transcripts were up-regulated and/or stage specific in either one or both of the cDNA libraries when compared to the microfilariae, L2, and both adult stages of the parasite. Homology searches against the GenBank database facilitated the identification of several genes of interest, such as proteinases, proteinase inhibitors, antioxidant or detoxification enzymes, and neurotransmitter receptors, as well as structural and housekeeping genes. Other O. volvulus genes showed homology only to predicted genes from the free-living nematode Caenorhabditis elegans or were entirely novel. Some of the novel proteins contain potential secretory leaders. Secondly, by immunoscreening the molting L3 cDNA library with a pool of human sera from putatively immune individuals, we identified six novel immunogenic proteins that otherwise would not have been identified as potential vaccinogens using the gene discovery effort. This study lays a solid foundation for a better understanding of the biology of O. volvulus as well as for the identification of novel targets for filaricidal agents and/or vaccines against onchocerciasis based on immunological and rational hypothesis-driven research. PMID:10816503

  14. A score system for quality evaluation of RNA sequence tags: an improvement for gene expression profiling

    PubMed Central

    Pinheiro, Daniel G; Galante, Pedro AF; de Souza, Sandro J; Zago, Marco A; Silva, Wilson A

    2009-01-01

    Background High-throughput molecular approaches for gene expression profiling, such as Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS) or Sequencing-by-Synthesis (SBS), represent powerful techniques that provide global transcription profiles of different cell types through sequencing of short fragments of transcripts, denominated sequence tags. These techniques have improved our understanding of the relationships between these expression profiles and cellular phenotypes. Despite this, more reliable datasets are still necessary. In this work, we present a web-based tool named S3T: Score System for Sequence Tags, to index sequenced tags in accordance with their reliability. This is done through a series of evaluations based on a defined rule set. S3T allows the identification/selection of tags considered more reliable for further gene expression analysis. Results This methodology was applied to a public SAGE dataset. In order to compare data before and after filtering, a hierarchical clustering analysis was performed in samples from the same type of tissue, in distinct biological conditions, using these two datasets. Our results provide evidence suggesting that it is possible to find more congruous clusters after using the S3T scoring system. Conclusion These results substantiate the proposed application to generate more reliable data. This is a significant contribution for determination of global gene expression profiles. The library analysis with S3T is freely available at . S3T source code and datasets can also be downloaded from the aforementioned website. PMID:19500384

  15. Global Transcriptome Analysis of the Tentacle of the Jellyfish Cyanea capillata Using Deep Sequencing and Expressed Sequence Tags: Insight into the Toxin- and Degenerative Disease-Related Transcripts

    PubMed Central

    Liu, Dan; Wang, Qianqian; Ruan, Zengliang; He, Qian; Zhang, Liming

    2015-01-01

    Background Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. Results We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington’s, Alzheimer’s and Parkinson’s diseases. This is the first description of degenerative disease-associated genes in jellyfish. Conclusion We obtained a well-categorized and annotated transcriptome of C. capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular

  16. Tag jumps illuminated--reducing sequence-to-sample misidentifications in metabarcoding studies.

    PubMed

    Schnell, Ida Baerholm; Bohmann, Kristine; Gilbert, M Thomas P

    2015-11-01

    Metabarcoding of environmental samples on second-generation sequencing platforms has rapidly become a valuable tool for ecological studies. A fundamental assumption of this approach is the reliance on being able to track tagged amplicons back to the samples from which they originated. In this study, we address the problem of sequences in metabarcoding sequencing outputs with false combinations of used tags (tag jumps). Unless these sequences can be identified and excluded from downstream analyses, tag jumps creating sequences with false, but already used tag combinations, can cause incorrect assignment of sequences to samples and artificially inflate diversity. In this study, we document and investigate tag jumping in metabarcoding studies on Illumina sequencing platforms by amplifying mixed-template extracts obtained from bat droppings and leech gut contents with tagged generic arthropod and mammal primers, respectively. We found that an average of 2.6% and 2.1% of sequences had tag combinations, which could be explained by tag jumping in the leech and bat diet study, respectively. We suggest that tag jumping can happen during blunt-ending of pools of tagged amplicons during library build and as a consequence of chimera formation during bulk amplification of tagged amplicons during library index PCR. We argue that tag jumping and contamination between libraries represents a considerable challenge for Illumina-based metabarcoding studies, and suggest measures to avoid false assignment of tag jumping-derived sequences to samples. PMID:25740652
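
    The bookkeeping behind this kind of screening can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each read has already been reduced to a (forward tag, reverse tag) pair, and the function names and tag labels are hypothetical. Note that it only catches reads whose tag pair was never assigned to a sample; jumps that land on an already used combination, the case the abstract warns about, cannot be detected this way.

```python
from collections import Counter


def split_by_tag_use(read_tags, used_combinations):
    """Count reads per tag pair, separating assigned pairs from unassigned ones.

    read_tags: iterable of (forward_tag, reverse_tag) tuples, one per read.
    used_combinations: the tag pairs that were actually assigned to samples.
    """
    used = set(used_combinations)
    kept = Counter()
    unexpected = Counter()
    for pair in read_tags:
        if pair in used:
            kept[pair] += 1
        else:
            # A pair nobody assigned: candidate tag jump or contamination.
            unexpected[pair] += 1
    return kept, unexpected


def unexpected_fraction(kept, unexpected):
    """Fraction of all reads that carry an unassigned tag pair."""
    total = sum(kept.values()) + sum(unexpected.values())
    return sum(unexpected.values()) / total if total else 0.0
```

    Leaving some tag combinations deliberately unused gives this count something to measure, which is one reason such a design can be useful.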

  17. Development of expressed sequence tag-based microsatellite markers for the critically endangered Isoëtes sinensis (Isoetaceae) based on transcriptome analysis.

    PubMed

    Gichira, A W; Long, Z C; Wang, Q F; Chen, J M; Liao, K

    2016-01-01

    Isoëtes sinensis is a critically endangered quillwort. To facilitate studies on the conservation genetics of this species, we developed expressed sequence tag-simple sequence repeat (EST-SSR) markers. A total of 50,063 unigenes were predicted by transcriptome sequencing, 5294 (10.6%) of which significantly matched 3011 Gene Ontology annotations and 2363 were assigned to Kyoto Encyclopedia of Genes and Genomes metabolic pathways. Most of these (2297) were involved in metabolism. A total of 1982 SSR motifs were identified, with trinucleotides being the dominant repeat motif, and 1438 (72.6%) SSR primers were designed. Eighteen randomly selected primer pairs were used to genotype 24 I. sinensis accessions, which confirmed the suitability of these novel markers for molecular studies of I. sinensis. The heterozygosity index value ranged between 0.0799 and 0.9106, while the Shannon-Wiener diversity index value ranged between 0.1732 and 2.5589. The EST-SSRs reported in this study are linked to genic sequences, and are therefore ideal for investigating the evolutionary history of I. sinensis. These markers, together with the large EST dataset generated in this study, will greatly facilitate conservation genetic studies of I. sinensis. PMID:27525847
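
    For readers unfamiliar with the two summary statistics quoted above, here is a small sketch of how they are commonly computed from the alleles observed at one locus. The abstract does not say which estimator or logarithm base was used, so the expected-heterozygosity formula and the natural logarithm below are assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def allele_frequencies(alleles):
    """Relative frequency of each allele in a flat list of observed alleles."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(alleles):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    freqs = allele_frequencies(alleles)
    return 1.0 - sum(p * p for p in freqs.values())


def shannon_index(alleles):
    """Shannon-Wiener index: -sum(p * ln p) over allele frequencies (natural log assumed)."""
    freqs = allele_frequencies(alleles)
    return -sum(p * math.log(p) for p in freqs.values())
```

    Both values are zero for a locus with a single allele and grow as alleles become more numerous and more evenly distributed.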

  18. Generation and analysis of a 29,745 unique Expressed Sequence Tags from the Pacific oyster (Crassostrea gigas) assembled into a publicly accessible database: the GigasDatabase

    PubMed Central

    2009-01-01

    Background Although bivalves are among the most-studied marine organisms because of their ecological role and economic importance, very little information is available on the genome sequences of oyster species. This report documents three large-scale cDNA sequencing projects for the Pacific oyster Crassostrea gigas initiated to provide a large number of expressed sequence tags that were subsequently compiled in a publicly accessible database. This resource allowed for the identification of a large number of transcripts and provides valuable information for ongoing investigations of tissue-specific and stimulus-dependent gene expression patterns. These data are crucial for constructing comprehensive DNA microarrays, identifying single nucleotide polymorphisms and microsatellites in coding regions, and for identifying genes when the entire genome sequence of C. gigas becomes available. Description In the present paper, we report the production of 40,845 high-quality ESTs that identify 29,745 unique transcribed sequences consisting of 7,940 contigs and 21,805 singletons. All of these new sequences, together with existing public sequence data, have been compiled into a publicly available website, http://public-contigbrowser.sigenae.org:9090/Crassostrea_gigas/index.html. Approximately 43% of the unique ESTs had significant matches against the SwissProt database and 27% were annotated using Gene Ontology terms. In addition, we identified a total of 208 in silico microsatellites from the ESTs, with 173 having sufficient flanking sequence for primer design. We also identified a total of 7,530 putative in silico single-nucleotide polymorphisms using existing and newly generated EST resources for the Pacific oyster. Conclusion A publicly available database has been populated with 29,745 unique sequences for the Pacific oyster Crassostrea gigas. The database provides many tools to search cleaned and assembled ESTs. The user may input and submit several filters, such as

  19. HIV-1 Quasispecies Delineation by Tag Linkage Deep Sequencing

    PubMed Central

    Wu, Nicholas C.; De La Cruz, Justin; Al-Mawsawi, Laith Q.; Olson, C. Anders; Qi, Hangfei; Luan, Harding H.; Nguyen, Nguyen; Du, Yushen; Le, Shuai; Wu, Ting-Ting; Li, Xinmin; Lewis, Martha J.; Yang, Otto O.; Sun, Ren

    2014-01-01

    Trade-offs between throughput, read length, and error rates in high-throughput sequencing limit certain applications such as monitoring viral quasispecies. Here, we describe a molecular-based tag linkage method that allows assemblage of short sequence reads into long DNA fragments. It enables haplotype phasing with high accuracy and sensitivity to interrogate individual viral sequences in a quasispecies. This approach is demonstrated to deduce ∼2000 unique 1.3 kb viral sequences from HIV-1 quasispecies in vivo and after passaging ex vivo with a detection limit of ∼0.005% to ∼0.001%. Reproducibility of the method is validated quantitatively and qualitatively by a technical replicate. This approach can improve monitoring of the genetic architecture and evolution dynamics in any quasispecies population. PMID:24842159

  20. A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A blackberry (Rubus L.) expressed sequence tag (EST) library was produced for developing simple sequence repeat (SSR) markers from the tetraploid blackberry cultivar, Merton Thornless, the source of the thornless trait in commercial cultivars. RNA was extracted from young expanding leaves and used f...

  1. Genomic and cDNA sequence tags of the hyperthermophilic archaeon Pyrobaculum aerophilum.

    PubMed Central

    Völkl, P; Markiewicz, P; Baikalov, C; Fitz-Gibbon, S; Stetter, K O; Miller, J H

    1996-01-01

    The hyperthermophilic archaeon Pyrobaculum aerophilum grows optimally at 100 degrees C with a doubling time of 180 min. It is a member of the phylogenetically ancient Thermoproteales order, but differs significantly from all other members by its facultatively aerobic metabolism. Due to its simple cultivation requirements and its nearly 100% plating efficiency, it was chosen as a model organism for studying the genome organization of hyperthermophilic ancient archaea. Because the G+C content of its DNA is 52 mol%, sequence analysis was readily possible. At least some of the mRNA of P. aerophilum carried poly-A tails, facilitating the construction of a cDNA library. In total, 245 sequence tags from a poly-A primed cDNA library and 55 sequence tags from a genomic library containing 1-2 kb Sau3AI fragments were analyzed, and the corresponding amino acid sequences were compared with protein sequences from databases. Fourteen percent of the cDNA and >9% of genomic DNA sequence tags revealed significant similarities to proteins in the databases. Matches were obtained to proteins from archaeal, bacterial and eukaryal sources. Some sequences showed greatest similarity to eukaryal rather than to bacterial versions of proteins; other matches were to proteins that had previously been found only in eukaryotes. PMID:8948626

  2. Improved Sequence Tag Generation Method for Peptide Identification in Tandem Mass Spectrometry

    PubMed Central

    Cao, Xia; Nesvizhskii, Alexey I.

    2013-01-01

    The sequence tag-based peptide identification methods are a promising alternative to the traditional database search approach. However, a more comprehensive analysis, optimization, and comparison with established methods are necessary before these methods can gain widespread use in the proteomics community. Using the InsPecT open source code base (Tanner et al., Anal Chem. 2005, 77:4626–39), we present an improved sequence tag generation method that directly incorporates multi-charged fragment ion peaks present in many tandem mass spectra of higher charge states. We also investigate the performance of sequence tagging under different settings using control datasets generated on five different types of mass spectrometers, as well as using a complex phosphopeptide-enriched sample. We also demonstrate that additional modeling of InsPecT search scores using a semi-parametric approach incorporating the accuracy of the precursor ion mass measurement provides additional improvement in the ability to discriminate between correct and incorrect peptide identifications. The overall superior performance of the sequence tag-based peptide identification method is demonstrated by comparison with a commonly used SEQUEST/PeptideProphet approach. PMID:18785767
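
    The core idea of sequence tagging can be shown with a toy example: consecutive fragment peaks that differ by the mass of one amino acid residue spell out a short stretch of sequence. The sketch below is not the InsPecT method or the improved generator described above; it assumes singly charged peaks, ignores intensities and scoring, and uses hypothetical function names and a hypothetical tolerance. The residue masses are standard monoisotopic values, with L standing for both leucine and isoleucine.

```python
# Monoisotopic residue masses in daltons; "L" stands for both Leu and Ile.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def find_tags(peaks, tag_length=3, tol=0.02):
    """Return all residue strings of tag_length spelled by chains of peaks.

    peaks: m/z values of singly charged fragment ions (assumption).
    tol: allowed absolute mass error in daltons (illustrative default).
    """
    peaks = sorted(peaks)
    # edges[i] lists (j, residue) where peaks[j] - peaks[i] matches a residue mass.
    edges = {i: [] for i in range(len(peaks))}
    for i, low in enumerate(peaks):
        for j in range(i + 1, len(peaks)):
            diff = peaks[j] - low
            for residue, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    edges[i].append((j, residue))

    tags = set()

    def extend(i, tag):
        # Depth-first walk along matching peak pairs until the tag is long enough.
        if len(tag) == tag_length:
            tags.add(tag)
            return
        for j, residue in edges[i]:
            extend(j, tag + residue)

    for start in range(len(peaks)):
        extend(start, "")
    return sorted(tags)
```

    Handling multiply charged fragments, which is the improvement the abstract describes, would require converting those peaks to a common charge state before the mass differences are compared.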

  3. Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

    PubMed

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of all kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover and chaperones, 11.3% to transport, 9.9% to signal transducer/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle; only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene encoding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed; it encodes the TAT-Ferritin fusion protein with two 6× His-tags, at the N and C termini. As measured by BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genomic and transcriptomic research on Bengal tigers. PMID:23708105

  4. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    PubMed Central

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of all kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover and chaperones, 11.3% to transport, 9.9% to signal transducer/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle; only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene encoding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed; it encodes the TAT-Ferritin fusion protein with two 6× His-tags, at the N and C termini. As measured by BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genomic and transcriptomic research on Bengal tigers. PMID:23708105

  5. CREST--classification resources for environmental sequence tags.

    PubMed

    Lanzén, Anders; Jørgensen, Steffen L; Huson, Daniel H; Gorfer, Markus; Grindhaug, Svenn Helge; Jonassen, Inge; Øvreås, Lise; Urich, Tim

    2012-01-01

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com. PMID:23145153
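
    The lowest common ancestor step mentioned above can be illustrated with a short sketch. It is not CREST's code: lineages are assumed to be root-to-leaf lists of taxon names taken from the best alignment hits, the function names are hypothetical, and the per-rank identity cutoffs are supplied by the caller rather than taken from CREST.

```python
def lowest_common_ancestor(lineages):
    """Return the shared root-to-leaf prefix of several taxonomic lineages."""
    if not lineages:
        return []
    common = []
    for names_at_rank in zip(*lineages):
        # Stop at the first rank where the hits disagree.
        if any(name != names_at_rank[0] for name in names_at_rank):
            break
        common.append(names_at_rank[0])
    return common


def trim_to_supported_rank(lineage, percent_identity, min_identity_per_rank):
    """Drop ranks whose minimum identity cutoff the alignment does not reach.

    min_identity_per_rank: one cutoff per rank, root first (caller-supplied,
    illustrative values only; not CREST's actual thresholds).
    """
    kept = []
    for name, cutoff in zip(lineage, min_identity_per_rank):
        if percent_identity < cutoff:
            break
        kept.append(name)
    return kept
```

    Assigning a read to the deepest node that all of its good hits share, and then backing off when similarity is too low for that rank, is what keeps a read from a novel taxon from being forced into a known genus.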

  6. Expressed sequence tags from the plant trypanosomatid Phytomonas serpens.

    PubMed

    Pappas, Georgios J; Benabdellah, Karim; Zingales, Bianca; González, Antonio

    2005-08-01

    We have generated 2190 expressed sequence tags (ESTs) from a cDNA library of the plant trypanosomatid Phytomonas serpens. Upon processing and clustering the set of 1893 accepted sequences was reduced to 697 clusters consisting of 452 singletons and 245 contigs. Functional categories were assigned based on BLAST searches against a database of the eukaryotic orthologous groups of proteins (KOG). Thirty six percent of the generated sequences showed no hits against the KOG database and 39.6% presented similarity to the KOG classes corresponding to translation, ribosomal structure and biogenesis. The most populated cluster contained 45 ESTs homologous to members of the glucose transporter family. This fact can be immediately correlated to the reported Phytomonas dependence on anaerobic glycolytic ATP production due to the lack of cytochrome-mediated respiratory chain. In this context, not only a number of enzymes of the glycolytic pathway were identified but also of the Krebs cycle as well as specific components of the respiratory chain. The data here reported, including a few hundred unique sequences and the description of tandemly repeated motifs and putative transcript stability motifs at untranslated mRNA ends, represent an initial approach to overcome the lack of information on the molecular biology of this organism. PMID:15869816

  7. Initiation of a Sarcocystis neurona expressed sequence tag (EST) sequencing project: a preliminary report.

    PubMed

    Howe, D K

    2001-02-26

    To accelerate genetic and molecular characterization of Sarcocystis neurona, the primary causative agent of equine protozoal myeloencephalitis (EPM), a sequencing project has been initiated that will generate approximately 7000-8000 expressed sequence tags (ESTs) from this apicomplexan parasite. Poly(A)(+) RNA was isolated from culture-derived S. neurona merozoites, and a cDNA library was constructed in a unidirectional lambda phage cloning vector. Sixty phage clones were randomly picked from the library, and the cDNA inserts were amplified from these clones using the T3 and T7 primers that flank the multi-cloning site of the lambda vector. This analysis demonstrated that 100% (60/60) of the clones selected from this library contained recombinant cDNA inserts ranging in size from 0.4 to 4.0 kilobases (kb) with an average size of 1.23 kb. Single-pass sequencing from the 5' end of the 60 amplified cDNAs produced high-quality nucleotide sequence from 53 of the clones. Comparison of these ESTs to the current gene databases revealed significant matches for 10 of the ESTs, six of which are similar to sequences from other Apicomplexa (i.e., Toxoplasma gondii). Importantly, none of the ESTs were of obvious mammalian origin, thus indicating that the cDNAs in this library were derived primarily from parasite mRNA and not from mRNA of the bovine turbinate host cells. Collectively, these data indicate that the described cDNA library will provide an excellent substrate for generating a portion of the ESTs that are planned from S. neurona. This sequencing project will greatly hasten gene discovery for this protozoan pathogen thereby enhancing efforts towards the development of improved diagnostics, treatments, and preventatives for EPM. In addition, the S. neurona ESTs will represent a significant contribution to the extensive database of sequences from the Apicomplexa. Comparative analyses of these apicomplexan sequences will likely offer a multitude of important information

  8. Peptides derivatized with bicyclic quaternary ammonium ionization tags. Sequencing via tandem mass spectrometry.

    PubMed

    Setner, Bartosz; Rudowska, Magdalena; Klem, Ewelina; Cebrat, Marek; Szewczuk, Zbigniew

    2014-10-01

    Improving the sensitivity of detection and fragmentation of peptides to provide reliable sequencing of peptides is an important goal of mass spectrometric analysis. Peptides derivatized with the bicyclic quaternary ammonium ionization tags 1-azabicyclo[2.2.2]octane (ABCO) or 1,4-diazabicyclo[2.2.2]octane (DABCO) show increased detection sensitivity in electrospray ionization mass spectrometry (ESI-MS) and longer retention times on reverse-phase (RP) chromatography columns. The improvement of the detection limit was observed even for peptides dissolved in 10 mM NaCl. Collision-induced dissociation tandem mass spectrometry of quaternary ammonium salt derivatives of peptides showed dominant a- and b-type ions, allowing facile sequencing of peptides. The bicyclic ionization tags are stable in collision-induced dissociation experiments, and the resulting fragmentation pattern is not significantly influenced by either acidic or basic amino acid residues in the peptide sequence. The results obtained indicate the general usefulness of the bicyclic quaternary ammonium ionization tags for ESI-MS/MS sequencing of peptides. PMID:25303389

  9. Analysis of Expressed Sequence Tags from Chinese Bayberry Fruit (Myrica rubra Sieb. and Zucc.) at Different Ripening Stages and Their Association with Fruit Quality Development

    PubMed Central

    Zhu, Changqing; Feng, Chao; Li, Xian; Xu, Changjie; Sun, Chongde; Chen, Kunsong

    2013-01-01

    A total of 2000 EST sequences were produced from cDNA libraries generated from Chinese bayberry fruit (Myrica rubra Sieb. and Zucc. cv. “Biqi”) at four different ripening stages. After cluster and assembly analysis of the datasets by UniProt, 395 unigenes were identified, and their presumed functions were assigned to 14 putative cellular roles. Furthermore, a sequence BLAST was done for the top ten highly expressed genes in the ESTs, and genes associated with disease/defense and anthocyanin accumulation were analyzed. Gene-encoding elements associated with ethylene biosynthesis and signal transductions, in addition to other senescence-regulating proteins, as well as those associated with quality formation during fruit ripening, were also identified. Their possible roles were subsequently discussed. PMID:23377019

  10. Characterization of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers for Aspergillus flavus: Emphasis on Variability of Isolates from the Southern United States

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Simple Sequence Repeat (SSR) markers were developed from Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers wer...

  11. Identification of antimicrobial peptides from teleosts and anurans in expressed sequence tag databases using conserved signal sequences.

    PubMed

    Tessera, Valentina; Guida, Filomena; Juretić, Davor; Tossi, Alessandro

    2012-03-01

    The problem of multidrug resistance requires the efficient and accurate identification of new classes of antimicrobial agents. Endogenous antimicrobial peptides produced by most organisms are a promising source of such molecules. We have exploited the high conservation of signal sequences in teleost and anuran antimicrobial peptides to search cDNA (expressed sequence tag) databases for likely candidates. Subject sequences were then analysed for the presence of potential antimicrobial peptides based on physicochemical properties (amphipathic helical structure, cationicity) and use of the D-descriptor model to predict the therapeutic index (relation between the minimum inhibitory concentration and the concentration giving 50% haemolysis). This analysis also suggested mutations to probe the role of the primary structure in determining potency and selectivity. Selected sequences were chemically synthesized and the antimicrobial activity of the peptides was confirmed. In particular, a short (21-residue) sequence, likely of sticklefish origin, showed potent activity and it was possible to tune the spectrum of action and/or selectivity by combining three directed mutations. Membrane permeabilization studies on both bacterial and host cells indicate that the mode of action was prevalently membranolytic. This method opens up the possibility for more effective searching of the vast and continuously growing expressed sequence tag databases for novel antimicrobial peptides, which are likely abundant, and the efficient identification of the most promising candidates among them. PMID:22188679

  12. Analysis of STIS time-tag data

    NASA Technical Reports Server (NTRS)

    Lindler, Don J.; Gull, Theodore R.; Kraemer, Steven B.; Hulbert, Stephen J.

    1997-01-01

    Very high time resolution data can be obtained from the Space Telescope Imaging Spectrograph (STIS) Multi-Anode Microchannel Array (MAMA) detectors using the time-tag observing mode. In this mode, the photon events are not accumulated onboard the spacecraft. Instead, each event is recorded internally and transmitted to the ground as an X and Y location with an event time. Event times are recorded in units of 125 microseconds. Analysis of STIS Crab Pulsar data demonstrates that a time resolution approaching 125 microseconds can be achieved. Furthermore, the time-tag observing mode has been demonstrated to be a very powerful diagnostic tool and can be used to increase the resolution of both imaging and spectral data.
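
    Working with an event list of this kind usually means binning it after the fact. The sketch below is a minimal illustration under stated assumptions: event times arrive as integer tick counts in the 125-microsecond unit given above, X and Y positions are ignored, and the function names are hypothetical rather than part of any STIS software. The period passed to the folding function is whatever the user supplies.

```python
TICK_SECONDS = 125e-6  # one event-time unit, as stated in the abstract


def bin_events(event_ticks, ticks_per_bin):
    """Count events in consecutive time bins, each ticks_per_bin ticks wide."""
    if not event_ticks:
        return []
    n_bins = max(event_ticks) // ticks_per_bin + 1
    counts = [0] * n_bins
    for tick in event_ticks:
        counts[tick // ticks_per_bin] += 1
    return counts


def fold_events(event_ticks, period_seconds, n_phase_bins):
    """Histogram events by phase of a user-supplied period (e.g. a pulsar period)."""
    counts = [0] * n_phase_bins
    for tick in event_ticks:
        phase = (tick * TICK_SECONDS / period_seconds) % 1.0
        # Guard against a phase that rounds up to exactly 1.0.
        index = min(int(phase * n_phase_bins), n_phase_bins - 1)
        counts[index] += 1
    return counts
```

    Because every photon keeps its own time stamp, the bin width and the folding period can be chosen and changed during analysis instead of being fixed at observation time.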

  13. Perceptual learning of contrast discrimination under roving: the role of semantic sequence in stimulus tagging.

    PubMed

    Cong, Lin-Juan; Zhang, Jun-Yun

    2014-01-01

    Perceptual learning may occur when multiple contrasts are practiced in a fixed, but not in a roving (random), temporal sequence. However, learning may escape roving disruption when each contrast is assigned a letter tag (i.e., A, B, C, D). Because these letter tags carry not only stimulus identity information, but also semantic sequence information, here we investigated whether the semantic sequence information is necessary for learning of tagged contrasts under the roving condition. We found that assigning number tags (i.e., 1, 2, 3, 4), which also contained both identity and semantic sequence information, to four roving contrasts enabled significant learning of discrimination of each contrast, confirming previous data. However, learning became insignificant when the contrast tags were replaced with Greek letters that were familiar to our Chinese observers except for their sequence, or with Chinese characters that carried no sequence information. In addition, assigning orientation tags, which carried no sequence information either, to roving contrasts was ineffective as well because learning occurred only with sequenced but not roving contrasts. These results suggest that semantic sequence information is necessary for stimulus tagging to effectively enable perceptual learning of multiple contrast discrimination under roving. PMID:25368338

  14. TagSmart: analysis and visualization for yeast mutant fitness data measured by tag microarrays

    PubMed Central

    Kim, Chulyun; Kim, Sangkyum; Dorer, Russell; Xie, Dan; Han, Jiawei; Zhong, Sheng

    2007-01-01

    Background A nearly complete collection of gene-deletion mutants (96% of annotated open reading frames) of the yeast Saccharomyces cerevisiae has been systematically constructed. Tag microarrays are widely used to measure the fitness of each mutant in a mutant mixture. The tag array experiments can have a complex experimental design, such as time course measurements and drug treatment with multiple dosages. Results TagSmart is a web application for analysis and visualization of Saccharomyces cerevisiae mutant fitness data measured by tag microarrays. It implements a robust statistical approach to assess the concentration differences among S. cerevisiae mutant strains. It also provides an interactive environment for data analysis and visualization. TagSmart has the following advantages over previously described analysis procedures: 1) it is user-friendly software rather than merely a description of analytical procedure; 2) It can handle complicated experimental designs, such as multiple time points and treatment with multiple dosages; 3) it has higher sensitivity and specificity; 4) It allows users to mask out "bad" tags in the analysis. Two biological tests were performed to illustrate the performance of TagSmart. First, we generated titration mixtures of mutant strains, in which the relative concentration of each strain was controlled. We used tag microarrays to measure the numbers of tag copies in each titration mixture. The data was analyzed with TagSmart and the result showed high precision and recall. Second, TagSmart was applied to a dataset in which heterozygous deletion strain mixture pools were treated with a new drug, Cincreasin. TagSmart identified 53 mutant strains as sensitive to Cincreasin treatment. We individually tested each identified mutant, and found 52 out of the 53 predicted mutants were indeed sensitive to Cincreasin. Conclusion TagSmart is provided "as is" to analyze tag array data produced by Affymetrix and Agilent arrays. TagSmart web

  15. Cardiac motion estimation by joint alignment of tagged MRI sequences.

    PubMed

    Oubel, E; De Craene, M; Hero, A O; Pourmorteza, A; Huguet, M; Avegliano, G; Bijnens, B H; Frangi, A F

    2012-01-01

    Image registration has been proposed as an automatic method for recovering cardiac displacement fields from tagged Magnetic Resonance Imaging (tMRI) sequences. Initially performed as a set of pairwise registrations, these techniques have evolved to the use of 3D+t deformation models, requiring metrics of joint image alignment (JA). However, only linear combinations of cost functions defined with respect to the first frame have been used. In this paper, we have applied k-Nearest Neighbors Graphs (kNNG) estimators of the α-entropy (H(α)) to measure the joint similarity between frames, and to combine the information provided by different cardiac views in a unified metric. Experiments performed on six subjects showed a significantly higher accuracy (p<0.05) with respect to a standard pairwise alignment (PA) approach in terms of mean positional error and variance with respect to manually placed landmarks. The developed method was used to study strains in patients with myocardial infarction, showing a consistency between strain, infarction location, and coronary occlusion. This paper also presents an interesting clinical application of graph-based metric estimators, showing their value for solving practical problems found in medical imaging. PMID:22000567

  16. Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

    PubMed Central

    Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

    2010-01-01

    Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087

  17. Analysis of expressed sequence tags from a single wheat cultivar facilitates interpretation of tandem mass spectrometry data and discrimination of gamma gliadin proteins that may play different functional roles in flour

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complement of gamma gliadin genes expressed in the wheat cultivar Butte 86 was evaluated by analyzing publicly available expressed sequence tag (EST) data. Eleven contigs were assembled from 153 Butte 86 ESTs. Nine of the contigs encoded full-length proteins and four of the proteins contained an...

  18. Peanut (Arachis hypogaea) expressed sequence tag (EST) project: Progress and application.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Millions of expressed sequence tag (EST) sequences from several hundred plant species have been deposited in public EST databases. Many plant ESTs have been sequenced as an alternative to whole genome sequences, including peanut because of the genome size and complexity. The US peanut research commu...

  19. Grouping and identification of sequence tags (GRIST): bioinformatics tools for the NEIBank database.

    PubMed

    Wistow, Graeme; Bernstein, Steven L; Touchman, Jeffrey W; Bouffard, Gerald; Wyatt, M Keith; Peterson, Katherine; Behal, Amita; Gao, James; Buchoff, Patee; Smith, Don

    2002-06-15

    NEIBank is a project to develop and organize genomics and bioinformatics resources for the eye. As part of this effort, tools have been developed for bioinformatics analysis and web based display of data from expressed sequence tag (EST) analyses. EST sequences are identified and formed into groups or clusters representing related transcripts from the same gene. This is carried out by a rules-based procedure called GRIST (GRouping and Identification of Sequence Tags) that uses sequence match parameters derived from BLAST programs. Linked procedures are used to eliminate non-mRNA contaminants. All data are assembled in a relational database and assembled for display as web pages with annotations and links to other informatics resources. Genome projects generate huge amounts of data that need to be classified and organized to become easily accessible to the research community. GRIST provides a useful tool for assembling and displaying the results of EST analyses. The NEIBank web site contains a growing set of pages cataloging the known transcriptional repertoire of eye tissues, derived from new NEIBank cDNA libraries and from eye-related data deposited in the dbEST section of GenBank. PMID:12107414
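
    The grouping step described in this abstract can be illustrated with a minimal Python sketch. It is not the GRIST implementation: the pairwise match records, the identity and overlap thresholds, and the function name are all hypothetical, and single-linkage union-find is swapped in as a simple stand-in for the rules-based procedure.

```python
# Hypothetical sketch: single-linkage grouping of ESTs from pairwise match records.
# The thresholds and the input format are assumptions, not GRIST's actual rules.

def group_ests(est_ids, hits, min_identity=95.0, min_overlap=100):
    """Group ESTs whose pairwise matches pass both thresholds.

    hits: iterable of (query_id, subject_id, percent_identity, overlap_length),
    e.g. values parsed from tabular BLAST output.
    Returns a sorted list of sorted groups.
    """
    parent = {e: e for e in est_ids}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for q, s, identity, overlap in hits:
        if q == s or q not in parent or s not in parent:
            continue
        if identity >= min_identity and overlap >= min_overlap:
            parent[find(q)] = find(s)  # merge the two groups

    groups = {}
    for e in est_ids:
        groups.setdefault(find(e), []).append(e)
    return sorted(sorted(g) for g in groups.values())


example_hits = [
    ("est1", "est2", 98.5, 240),
    ("est2", "est3", 97.0, 150),
    ("est4", "est5", 80.0, 300),  # fails the identity threshold
]
clusters = group_ests(["est1", "est2", "est3", "est4", "est5"], example_hits)
```

    In this toy input, est1 to est3 end up in one group because est2 links them, while est4 and est5 stay apart; a real pipeline would also need the contaminant-screening step the abstract mentions.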

  20. Paired-end genomic signature tags: a method for the functional analysis of genomes and epigenomes.

    PubMed

    Dunn, John J; McCorkle, Sean R; Everett, Logan; Anderson, Carl W

    2007-01-01

    Because paired-end genomic signature tags are sequence-based, they have the potential to become an alternative to tiled microarray hybridization as a method for genome-wide localization of transcription factors and other sequence-specific DNA binding proteins. As outlined here, the method can also be used for global analysis of DNA methylation. One advantage of this approach is the ability to easily switch between different genome types without having to fabricate a new microarray for each DNA type. However, the method does have some disadvantages. Among the most rate-limiting steps of our PE-GST protocol are the need to concatemerize the diTAGs, size fractionate them and then clone them prior to sequencing. This is usually followed by additional steps to amplify and size select for long (≥500) concatemer inserts prior to sequencing. These time-consuming steps are important for standard DNA sequencing as they increase efficiency approximately 20-30-fold since each amplified concatemer can now provide information on multiple tags; the limitation on data acquisition is read length during sequencing. However, the development of new sequencing methods such as Life Sciences' 454 new nanotechnology-based sequencing instrument (41) could increase tag sequencing efficiency by several orders of magnitude (≥100,000 diTAG reads/run), which is sufficient to provide in-depth global analysis of all ChIP PE-GSTs in a single run. This is because the lengths of our paired-end diTAGs (approximately 60 bp) fall well within the region of high accuracy for read lengths on this instrument. In principle, sequence analysis of diTAGs could begin as soon as they are generated, thereby completely bypassing the need for the concatemerization, sizing, downstream cloning steps and sequencing template purification. 
In addition, our protocol places any one of several unique four-base long nucleotide sequences, such as GATC, between each and every diTAG pair, which could

  1. Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)

    DOE Data Explorer

    Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.

    Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]
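
    As a rough in-silico illustration of the tag profile described in this record, the Python sketch below collects 21-base tags at each occurrence of a fragmenting-enzyme site and counts their abundance. Several details are assumptions made only for the example: the site "CATG", the choice to start the tag at the site itself, and reading the forward strand only. The record does not specify these, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical sketch of an in-silico tag profile. The site "CATG", the tag
# starting at the site, and forward-strand-only reading are assumptions.

def extract_tags(dna, site="CATG", tag_len=21):
    """Return every tag_len-base tag that starts at an occurrence of `site`."""
    dna = dna.upper()
    tags = []
    start = dna.find(site)
    while start != -1:
        tag = dna[start:start + tag_len]
        if len(tag) == tag_len:  # skip sites too close to the end of the sequence
            tags.append(tag)
        start = dna.find(site, start + 1)
    return tags


def tag_profile(sequences, site="CATG", tag_len=21):
    """Count tag abundances across a collection of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(extract_tags(seq, site, tag_len))
    return counts


fragment = "CATG" + "A" * 17  # a 21-base tag
profile = tag_profile(["GG" + fragment + "CC", "TT" + fragment, "CATGCC"])
```

    Here the same tag is seen twice and the short third sequence contributes nothing, so the profile holds a single tag with a count of two.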

  2. Expressed sequence tags from a NaCl-treated Suaeda salsa cDNA library.

    PubMed

    Zhang, L; Ma, X L; Zhang, Q; Ma, C L; Wang, P P; Sun, Y F; Zhao, Y X; Zhang, H

    2001-04-18

    Past efforts to improve plant tolerance to osmotic stress have had limited success owing to the genetic complexity of stress responses. The first step towards cataloging and categorizing genetically complex abiotic stress responses is the rapid discovery of genes by the large-scale partial sequencing of randomly selected cDNA clones or expressed sequence tags (ESTs). Suaeda salsa, which can survive seawater-level salinity, is a favorite halophytic model for salt-tolerance research. We constructed a NaCl-treated cDNA library of Suaeda salsa and sequenced 1048 randomly selected clones, out of which 1016 clones produced readable sequences (773 showed homology to previously identified genes, 227 matched unknown protein coding regions, 16 anomalous sequences or sequences of bacterial origin were excluded from further analysis). By sequence analysis we identified 492 unique clones: 315 showed homology to previously identified genes, 177 matched unknown protein coding regions (101 of which have been found before in other organisms and 76 are completely novel). All our EST data are available on the Internet. We believe that our dbEST and the associated DNA materials will be a useful source to scientists engaging in stress-tolerance study. PMID:11313146

  3. Insilico analysis of three different tag polypeptides with dual roles in scFv antibodies.

    PubMed

    Mohammadi, Mozafar; Nejatollahi, Foroogh; Sakhteman, Amirhossein; Zarei, Neda

    2016-08-01

    Single chain fragment variable (scFv) antibodies are composed of variable heavy (VH) and variable light (VL) domains that are joined by a polypeptide linker. Typically, a [(Gly4Ser)n] sequence is used as the linker to retain the integrity of the antigen-binding domain. Due to its low immunogenicity, this sequence cannot be used as a tag for scFv detection and purification. Several lines of evidence have shown that the addition of an N- or C-terminal tag for scFv detection and purification decreases the expression and binding capacity of this antibody fragment. In this study, we substituted the traditional linker (GGGGS) with His-tag, C-myc or E-tag sequences through molecular modeling. The stability and integrity of all models were assessed by molecular dynamics (MD) simulation. Based on MD simulation analysis, the model containing the E-tag sequence as a linker was more stable than the other molecules. The results suggest that E-tag not only can be substituted for the traditional linker but also eliminates the need for an additional tag for scFv detection and purification. PMID:27113782

  4. Express Sequence Tag Analysis - Identification of Anseriformes Trypsin Genes from Full-Length cDNA Library of the Duck (Anas platyrhynchos) and Characterization of Their Structure and Function.

    PubMed

    Yu, Haining; Cai, Shasha; Gao, Jiuxiang; Wang, Chen; Qiao, Xue; Wang, Hui; Feng, Lan; Wang, Yipeng

    2016-02-01

    Trypsins are key enzymes in animal protein digestion that break down peptide bonds on the carboxyl side of lysine and arginine residues, and they are therefore widely used in various biotechnological processes. In the current study, a full-length cDNA library with a capacity of 5×10^5 CFU/ml from the duck (Anas platyrhynchos) was constructed. Using expressed sequence tag (EST) sequencing, genes encoding two trypsins were identified, and two full-length trypsin cDNAs were then obtained by rapid amplification of cDNA ends (RACE)-PCR. Using BLAST, they were classified into the trypsin I and II subfamilies; both encoded a signal peptide, an activation peptide, and a 223-a.a. mature protein located at the C-terminus. The two deduced mature proteins were designated trypsin-IAP and trypsin-IIAP, and their theoretical isoelectric points (pI) and molecular weights (MW) were 7.99/23466.4 Da and 4.65/24066.0 Da, respectively. Molecular characterization of the genes was further performed by detailed bioinformatics analysis. Phylogenetic analysis revealed that trypsin-IIAP has an evolutionary pattern distinct from that of trypsin-IAP, suggesting its evolutionary advantage. The duck trypsin-IIAP was then expressed in an Escherichia coli system, and its kinetic parameters were measured. The three-dimensional structures of trypsin-IAP and trypsin-IIAP were predicted by homology modeling, and the conserved residues required for functionality were identified. Two loops controlling the specificity of the trypsin and the substrate-binding pocket represented in the model are almost identical in primary sequences and backbone tertiary structures of the trypsin families. PMID:27260395

  5. Degradation of C-terminal tag sequences on domain antibodies purified from E. coli supernatant.

    PubMed

    Lykkemark, Simon; Mandrup, Ole Aalund; Friis, Niels Anton; Kristensen, Peter

    2014-01-01

    Expression of recombinant proteins often takes advantage of peptide tags expressed in fusion to allow easy detection and purification of the expressed proteins. However, as the fusion peptides most often are flexible appendages at the N- or C-terminal, proteolytic cleavage may result in removal of the tag sequence. Here, we evaluated the functionality and stability of 14 different combinations of commonly used tags for purification and detection of recombinant antibody fragments. The tag sequences were inserted in fusion with the c-terminal end of a domain antibody based on the HEL4 scaffold in a phagemid vector. This particular antibody fragment was able to refold on the membrane after blotting, allowing us to detect c-terminal tag breakdown by use of protein A in combination with detection of the tags in the specific constructs. The degradation of the c-terminal tags suggested specific sites to be particularly prone to proteolytic cleavage, leaving some of the tag combinations partially or completely degraded. This specific work illustrates the importance of tag design with regard to recombinant antibody expression in E. coli, but also aids the more general understanding of protein expression. PMID:25426869

  6. Degradation of C-terminal tag sequences on domain antibodies purified from E. coli supernatant

    PubMed Central

    Lykkemark, Simon; Mandrup, Ole Aalund; Friis, Niels Anton; Kristensen, Peter

    2014-01-01

    Expression of recombinant proteins often takes advantage of peptide tags expressed in fusion to allow easy detection and purification of the expressed proteins. However, as the fusion peptides most often are flexible appendages at the N- or C-terminal, proteolytic cleavage may result in removal of the tag sequence. Here, we evaluated the functionality and stability of 14 different combinations of commonly used tags for purification and detection of recombinant antibody fragments. The tag sequences were inserted in fusion with the c-terminal end of a domain antibody based on the HEL4 scaffold in a phagemid vector. This particular antibody fragment was able to refold on the membrane after blotting, allowing us to detect c-terminal tag breakdown by use of protein A in combination with detection of the tags in the specific constructs. The degradation of the c-terminal tags suggested specific sites to be particularly prone to proteolytic cleavage, leaving some of the tag combinations partially or completely degraded. This specific work illustrates the importance of tag design with regard to recombinant antibody expression in E. coli, but also aids the more general understanding of protein expression. PMID:25426869

  7. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  8. Expressed sequence tags of Chinese cabbage flower bud cDNA.

    PubMed Central

    Lim, C O; Kim, H Y; Kim, M G; Lee, S I; Chung, W S; Park, S H; Hwang, I; Cho, M J

    1996-01-01

    We randomly selected and partially sequenced cDNA clones from a library of Chinese cabbage (Brassica campestris L. ssp. pekinensis) flower bud cDNAs. Out of 1216 expressed sequence tags (ESTs), 904 cDNA clones were unique or nonredundant. Five hundred eighty-eight clones (48.4%) had sequence homology to functionally defined genes at the peptide level. Only 5 clones encoded known flower-specific proteins. Among the cDNAs with no similarity to known protein sequences (628), 184 clones had significant similarity to nucleotide sequences registered in the databases. Among these 184 clones, 142 exhibited similarities at the nucleotide level only with plant ESTs. Also, sequence similarities were evident between these 142 ESTs and their matching ESTs when compared using the deduced amino acid sequences. Therefore, it is possible that the anonymous ESTs encode plant-specific ubiquitous proteins. Our extensive EST analysis of genes expressed in floral organs not only contributes to the understanding of the dynamics of genome expression patterns in floral organs but also adds data to the repertoire of all genomic genes. PMID:8787028

  9. Primer and platform effects on 16S rRNA tag sequencing

    PubMed Central

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-01-01

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. Beta diversity metrics are surprisingly robust to both primer and sequencing platform biases. PMID:26300854

  10. Characterization of Expressed Sequence Tags From a Gallus gallus Pineal Gland cDNA Library

    PubMed Central

    Hartman, Stefanie; Touchton, Greg; Wynn, Jessica; Geng, Tuoyu; Chong, Nelson W.

    2005-01-01

    The pineal gland is the circadian oscillator in the chicken, regulating diverse functions ranging from egg laying to feeding. Here, we describe the isolation and characterization of expressed sequence tags (ESTs) isolated from a chicken pineal gland cDNA library. A total of 192 unique sequences were analysed and submitted to GenBank; 6% of the ESTs matched neither GenBank cDNA sequences nor the newly assembled chicken genomic DNA sequence, three ESTs aligned with sequences designated to be on the Z_random, while one matched a W chromosome sequence and could be useful in cataloguing functionally important genes on this sex chromosome. Additionally, single nucleotide polymorphisms (SNPs) were identified and validated in 10 ESTs that showed 98% or higher sequence similarity to known chicken genes. Here, we have described resources that may be useful in comparative and functional genomic analysis of genes expressed in an important organ, the pineal gland, in a model and agriculturally important organism. PMID:18629218

  11. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGESBeta

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  12. Generation and analysis of a large-scale expressed sequence tags from a full-length enriched cDNA library of Siberian tiger (Panthera tigris altaica).

    PubMed

    Guo, Yu; Liu, Changqing; Lu, Taofeng; Liu, Dan; Bai, Chunyu; Li, Xiangchen; Ma, Yuehui; Guan, Weijun

    2014-05-15

    In this study, a full-length enriched cDNA library was successfully constructed from Siberian tiger, the world's most endangered species. The titers of the primary and amplified libraries were 1.28×10^6 pfu/mL and 1.59×10^10 pfu/mL, respectively. The proportion of recombinants in the unamplified library was 91.3% and the average length of exogenous inserts was 1.06 kb. A total of 279 individual ESTs with sizes ranging from 316 to 1258 bp were then analyzed. Furthermore, 204 unigenes were successfully annotated and assigned to 49 functions of the GO classification; cell (175, 85.5%), cellular process (165, 80.9%), and binding (152, 74.5%) were the dominant terms. 198 unigenes were assigned to 156 KEGG pathways, and the most represented were metabolic pathways (18, 9.1%). The proportion pattern of each COG subcategory was similar among Panthera tigris altaica, P. tigris tigris and Homo sapiens; the general function prediction only cluster (44, 15.8%) was the largest group, followed by translation, ribosomal structure and biogenesis (33, 11.8%), and replication, recombination and repair (24, 8.6%), and only 7.2% of ESTs were classified as novel genes. Moreover, the recombinant plasmid pET32a-TAT-COL6A2 was constructed, encoding the Trx-TAT-COL6A2 fusion protein with two 6× His-tags at the N and C termini. By BCA assay, the concentration of soluble Trx-TAT-COL6A2 recombinant protein was 2.64±0.18 mg/mL. This library will provide a useful platform for future functional genome and transcriptome research on P. tigris and other felid animals. PMID:24630959

  13. Identification of reproduction-related genes and SSR-markers through expressed sequence tags analysis of a monsoon breeding carp rohu, Labeo rohita (Hamilton).

    PubMed

    Sahu, Dinesh K; Panda, Soumya P; Panda, Sujata; Das, Paramananda; Meher, Prem K; Hazra, Rupenangshu K; Peatman, Eric; Liu, Zhanjiang J; Eknath, Ambekar E; Nandi, Samiran

    2013-07-15

    Labeo rohita (Ham.), also called rohu, is the most important freshwater aquaculture species on the Indian subcontinent. Monsoon-dependent breeding restricts its seed production beyond the season, indicating strong genetic control, about which very limited information is available. Additionally, few genomic resources are publicly available for this species. Here we sought to identify reproduction-relevant genes from normalized cDNA libraries of the brain-pituitary-gonad-liver (BPGL-axis) tissues of adult L. rohita collected during post preparatory phase. 6161 random clones sequenced (Sanger-based) from these libraries produced 4642 (75.34%) high-quality sequences. They were assembled into 3631 (78.22%) unique sequences composed of 709 contigs and 2922 singletons. A total of 182 unique sequences were found to be associated with reproduction-related genes, mainly under the GO term categories of reproduction, neuro-peptide hormone activity, hormone and receptor binding, receptor activity, signal transduction, embryonic development, cell-cell signaling, cell death and anti-apoptosis process. Several important reproduction-related genes reported here for the first time in L. rohita are zona pellucida sperm-binding protein 3, aquaporin-12, spermine oxidase, sperm associated antigen 7, testis expressed 261, progesterone receptor membrane component, Neuropeptide Y and Pro-opiomelanocortin. Quantitative RT-PCR-based analyses of 8 known and 8 unknown transcripts during preparatory and post-spawning phase showed increased expression level of most of the transcripts during preparatory phase (except Neuropeptide Y) in comparison to post-spawning phase, indicating possible roles in initiation of gonad maturation. Expression of unknown transcripts was also found in prolific breeder common carp and tilapia, but levels of expression were much higher in seasonal breeder rohu. 3631 unique sequences contained 236 (6.49%) putative microsatellites with the AG (28.16%) repeat as the most

  14. Phylogeny of Saccharina and Laminaria (Laminariaceae, Laminariales, Phaeophyta) in sequence-tagged-site markers

    NASA Astrophysics Data System (ADS)

    Qu, Jieqiong; Zhang, Jing; Wang, Xumin; Chi, Shan; Liu, Cui; Liu, Tao

    2014-01-01

    Laminaria and Saccharina have recently been recognized as two independent clades from the former genus Laminaria. Traditional morphological taxonomy is being challenged by molecular evidence from both nucleus and plastid. Intensive work is in great demand from the perspective of genome colinearity. In this study, 118 sequence-tagged site (STS) markers were screened for phylogenetic analyses; 29 were based on genome sequences, while 89 were based on expressed sequence tag (EST) sequences. EST-based STS marker development (29.37%) had an efficiency twice as high as genome-sequence-based development (9.48%) as a result of high conservation of gene transcripts among the related species. S. ochotensis, S. religiosa, S. japonica, and L. hyperborea showed great homogeneity in all 118 STS markers. Our result supports the view that the diversification between the genera Saccharina and Laminaria was a more recent event and that Saccharina and Laminaria share high phylogenetic affinity. However, at the single nucleotide polymorphism (SNP) level, among the 41 SNPs, L. hyperborea had 29 unique SNPs against 12 within the remaining three Saccharina species, and 12 of the 13 indels were supposedly unique to L. hyperborea, indicating its high variability. Originating from homologous ancestors, species of the recently diverged genera Laminaria and Saccharina may have accumulated enough mutations at the SNP level only, in spite of different evolutionary strategies for better adaptation to the environment. Our study lays a solid foundation from a new perspective, although more accurate phylogenetic analysis is still needed to clarify the evolutionary traces between the genera Saccharina and Laminaria.

  15. [Differentiation, identification and development of database of T. aestivum L. varieties of Ukrainian selection on the basis of sequence-tagged analysis of microsatellite repeats].

    PubMed

    Chebotar', S V; Sivolap, Iu M

    2001-01-01

    Determining the genotype of a variety is very important for the theory and practice of plant breeding and for protecting the rights of a variety's originator. For this reason, attention is focused on molecular markers generated by polymerase chain reaction. On the basis of STMS analysis, principles are formulated for identification and for development of a database reflecting the molecular-genetic characteristics of some varieties of the Plant Breeding and Genetics Institute and other Ukrainian breeding organizations. The allelic state at microsatellite loci and the distribution of alleles were investigated. Wheat varieties were ordered according to genetic distances, and pedigree data were compared with the cluster distribution of varieties obtained using computer programs. PMID:11944322

  16. Development of peanut EST (expressed sequence tag)-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  17. Development of peanut expressed sequence tag-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  18. TAG Sequence Identification of Genomic Regions Using TAGdb.

    PubMed

    Ruperao, Pradeep

    2016-01-01

    Second-generation sequencing (SGS) technology has enabled the sequencing of genomes and identification of genes. However, large complex plant genomes remain particularly difficult for de novo assembly. Access to the vast quantity of raw sequence data may facilitate discoveries; however the volume of this data makes access difficult. This chapter discusses the Web-based tool TAGdb that enables researchers to identify paired read second-generation DNA sequence data that share identity with a submitted query sequence. The identified reads can be used for PCR amplification of genomic regions to identify genes and promoters without the need for genome assembly. PMID:26519409

  19. Comparison of Sequencing (Barcode Region) and Sequence-Tagged-Site PCR for Blastocystis Subtyping

    PubMed Central

    2013-01-01

    Blastocystis is the most common nonfungal microeukaryote of the human intestinal tract and comprises numerous subtypes (STs), nine of which have been found in humans (ST1 to ST9). While efforts continue to explore the relationship between human health status and subtypes, no consensus regarding subtyping methodology exists. It has been speculated that differences detected in subtype distribution in various cohorts may to some extent reflect different approaches. Blastocystis subtypes have been determined primarily in one of two ways: (i) sequencing of small subunit rRNA gene (SSU-rDNA) PCR products and (ii) PCR with subtype-specific sequence-tagged-site (STS) diagnostic primers. Here, STS primers were evaluated against a panel of samples (n = 58) already subtyped by SSU-rDNA sequencing (barcode region), including subtypes for which STS primers are not available, and a small panel of DNAs from four other eukaryotes often present in feces (n = 18). Although the STS primers appeared to be highly specific, their sensitivity was only moderate, and the results indicated that some infections may go undetected when this method is used. False-negative STS results were not linked exclusively to certain subtypes or alleles, and evidence of substantial genetic variation in STS loci was obtained. Since the majority of DNAs included here were extracted from feces, it is possible that STS primers may generally work better with DNAs extracted from Blastocystis cultures. In conclusion, due to its higher applicability and sensitivity, and since sequence information is useful for other forms of research, SSU-rDNA barcoding is recommended as the method of choice for Blastocystis subtyping. PMID:23115257

  20. Gene ontology based characterization of expressed sequence tags (ESTs) of Brassica rapa cv. Osome.

    PubMed

    Arasan, Senthil Kumar Thamil; Park, Jong-In; Ahmed, Nasar Uddin; Jung, Hee-Jeong; Lee, In-Ho; Cho, Yong-Gu; Lim, Yong-Pyo; Kang, Kwon-Kyoo; Nou, Ill-Sup

    2013-07-01

    Chinese cabbage (Brassica rapa) is widely recognized for its economic importance and contribution to human nutrition, but abiotic and biotic stresses are the main obstacles to its quality, nutritional status and production. In this study, 3,429 expressed sequence tag (EST) sequences were generated from a B. rapa cv. Osome cDNA library and the unique transcripts were classified functionally using a gene ontology (GO) hierarchy and the Kyoto Encyclopedia of Genes and Genomes (KEGG). KEGG orthology and the structural domain data were obtained from the biological database for stress related genes (SRG). EST datasets provided a wide outlook of functional characterization of B. rapa cv. Osome. In silico analysis revealed 83% of ESTs to be well annotated towards reeds one dimensional concept. Clustering of ESTs returned 333 contigs and 2,446 singlets, giving a total of 3,284 putative unigene sequences. This dataset contained 1,017 EST sequences functionally annotated to stress responses, from which the expression of randomly selected SRGs was analyzed against cold, salt, drought, ABA, water and PEG stresses. Most of the SRGs showed differential expression against these stresses. Thus, the EST dataset is very important for discovering the potential genes related to stress resistance in Chinese cabbage, and can be a useful resource for genetic engineering of Brassica sp. PMID:23898551

  1. Identification of Simple Sequence Repeat Biomarkers through Cross-Species Comparison in a Tag Cloud Representation

    PubMed Central

    2014-01-01

    Simple sequence repeats (SSRs) are not only applied as genetic markers in evolutionary studies but they also play an important role in gene regulatory activities. Efficient identification of conserved and exclusive SSRs through cross-species comparison is helpful for understanding the evolutionary mechanisms and associations between specific gene groups and SSR motifs. In this paper, we developed an online cross-species comparative system and integrated it with a tag cloud visualization technique for identifying potential SSR biomarkers within fourteen frequently used model species. Ultraconserved or exclusive SSRs among cross-species orthologous genes could be effectively retrieved and displayed through a friendly interface design. Four different types of testing cases were applied to demonstrate and verify the retrieved SSR biomarker candidates. Through statistical analysis and enhanced tag cloud representation on defined functional related genes and cross-species clusters, the proposed system can correctly represent the patterns, loci, colors, and sizes of identified SSRs in accordance with gene functions, pattern qualities, and conserved characteristics among species. PMID:24800246
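
As a rough illustration of what identifying SSRs in a sequence involves, the sketch below scans a DNA string for tandem repeats with a regular expression. The function name `find_ssrs` and the thresholds `min_repeats` and `max_motif` are hypothetical choices for this example; the abstract does not describe how the authors' online system detects repeats.

```python
import re

def find_ssrs(seq, min_repeats=4, max_motif=6):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    Illustrative only: thresholds are assumptions, not values from the paper.
    """
    # Group 1 is the candidate motif (shortest motif tried first); the
    # backreference demands at least min_repeats - 1 further copies of it.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Example: one dinucleotide repeat (AT x 4) starting at index 3.
print(find_ssrs("GGCATATATATCC"))
```

Because the shortest motif is tried first, a run such as AAAAAA is reported as a mononucleotide repeat rather than as copies of AA or AAA.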

  2. AGIA Tag System Based on a High Affinity Rabbit Monoclonal Antibody against Human Dopamine Receptor D1 for Protein Analysis

    PubMed Central

    Yano, Tomoya; Takeda, Hiroyuki; Uematsu, Atsushi; Yamanaka, Satoshi; Nomura, Shunsuke; Nemoto, Keiichirou; Iwasaki, Takahiro; Takahashi, Hirotaka; Sawasaki, Tatsuya

    2016-01-01

    Polypeptide tag technology is widely used for protein detection and affinity purification. It consists of two fundamental elements: a peptide sequence and a binder which specifically binds to the peptide tag. In many tag systems, antibodies have been used as binder due to their high affinity and specificity. Recently, we obtained clone Ra48, a high-affinity rabbit monoclonal antibody (mAb) against dopamine receptor D1 (DRD1). Here, we report a novel tag system composed of Ra48 antibody and its epitope sequence. Using a deletion assay, we identified EEAAGIARP in the C-terminal region of DRD1 as the minimal epitope of Ra48 mAb, and we named this sequence the “AGIA” tag, based on its central sequence. The tag sequence does not include the four amino acids, Ser, Thr, Tyr, or Lys, which are susceptible to post-translational modification. We demonstrated performance of this new tag system in biochemical and cell biology applications. SPR analysis demonstrated that the affinity of the Ra48 mAb to the AGIA tag was 4.90 × 10^-9 M. AGIA tag showed remarkably high sensitivity and specificity in immunoblotting. A number of AGIA-fused proteins overexpressed in animal and plant cells were detected by anti-AGIA antibody in immunoblotting and immunostaining with low background, and were immunoprecipitated efficiently. Furthermore, a single amino acid substitution of the second Glu to Asp (AGIA/E2D) enabled competitive dissociation of AGIA/E2D-tagged protein by adding wild-type AGIA peptide. It enabled one-step purification of AGIA/E2D-tagged recombinant proteins by peptide competition under physiological conditions. The sensitivity and specificity of the AGIA system make it suitable for use in multiple methods for protein analysis. PMID:27271343

  3. AGIA Tag System Based on a High Affinity Rabbit Monoclonal Antibody against Human Dopamine Receptor D1 for Protein Analysis.

    PubMed

    Yano, Tomoya; Takeda, Hiroyuki; Uematsu, Atsushi; Yamanaka, Satoshi; Nomura, Shunsuke; Nemoto, Keiichirou; Iwasaki, Takahiro; Takahashi, Hirotaka; Sawasaki, Tatsuya

    2016-01-01

    Polypeptide tag technology is widely used for protein detection and affinity purification. It consists of two fundamental elements: a peptide sequence and a binder which specifically binds to the peptide tag. In many tag systems, antibodies have been used as binder due to their high affinity and specificity. Recently, we obtained clone Ra48, a high-affinity rabbit monoclonal antibody (mAb) against dopamine receptor D1 (DRD1). Here, we report a novel tag system composed of Ra48 antibody and its epitope sequence. Using a deletion assay, we identified EEAAGIARP in the C-terminal region of DRD1 as the minimal epitope of Ra48 mAb, and we named this sequence the "AGIA" tag, based on its central sequence. The tag sequence does not include the four amino acids, Ser, Thr, Tyr, or Lys, which are susceptible to post-translational modification. We demonstrated performance of this new tag system in biochemical and cell biology applications. SPR analysis demonstrated that the affinity of the Ra48 mAb to the AGIA tag was 4.90 × 10^-9 M. AGIA tag showed remarkably high sensitivity and specificity in immunoblotting. A number of AGIA-fused proteins overexpressed in animal and plant cells were detected by anti-AGIA antibody in immunoblotting and immunostaining with low background, and were immunoprecipitated efficiently. Furthermore, a single amino acid substitution of the second Glu to Asp (AGIA/E2D) enabled competitive dissociation of AGIA/E2D-tagged protein by adding wild-type AGIA peptide. It enabled one-step purification of AGIA/E2D-tagged recombinant proteins by peptide competition under physiological conditions. The sensitivity and specificity of the AGIA system make it suitable for use in multiple methods for protein analysis. PMID:27271343

  4. New aldehyde tag sequences identified by screening formylglycine generating enzymes in vitro and in vivo.

    PubMed

    Rush, Jason S; Bertozzi, Carolyn R

    2008-09-17

    Formylglycine generating enzyme (FGE) performs a critical posttranslational modification of type I sulfatases, converting cysteine within the motif CxPxR to the aldehyde-bearing residue formylglycine (FGly). This concise motif can be installed within heterologous proteins as a genetically encoded "aldehyde tag" for site-specific labeling with aminooxy- or hydrazide-functionalized probes. In this report, we screened FGEs from M. tuberculosis and S. coelicolor against synthetic peptide libraries and identified new substrate sequences that diverge from the canonical motif. We found that E. coli's FGE-like activity is similarly promiscuous, enabling the use of novel aldehyde tag sequences for in vivo modification of recombinant proteins. PMID:18722427
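
The canonical motif CxPxR named in the abstract (Cys, any residue, Pro, any residue, Arg) can be located in a protein sequence with a short pattern match. The sketch below is only an illustration of that motif search; the function name `find_aldehyde_tags` and the example sequence are made up here, and the divergent substrate sequences reported in the paper are not represented.

```python
import re

# Canonical motif from the abstract: C, any residue, P, any residue, R.
# The lookahead lets overlapping occurrences be reported as well.
ALDEHYDE_TAG = re.compile(r"(?=(C[A-Z]P[A-Z]R))")

def find_aldehyde_tags(protein):
    """Return (start_index, matched_motif) for each CxPxR occurrence."""
    return [(m.start(), m.group(1)) for m in ALDEHYDE_TAG.finditer(protein.upper())]

# Hypothetical sequence containing one motif, CTPSR, at index 3.
print(find_aldehyde_tags("MALCTPSRGG"))
```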

  5. Peanut (Arachis hypogaea) Expressed Sequence Tag Project: Progress and Application

    PubMed Central

    Feng, Suping; Wang, Xingjun; Zhang, Xinyou; Dang, Phat M.; Holbrook, C. Corley; Culbreath, Albert K.; Wu, Yaoting; Guo, Baozhu

    2012-01-01

    Many plant ESTs have been sequenced as an alternative to whole genome sequences, including those of peanut, because of its genome size and complexity. The US peanut research community held the historic 2004 Atlanta Genomics Workshop and named the EST project as a main priority. As of August 2011, the peanut research community had deposited 252,832 ESTs in the public NCBI EST database, and this resource has been providing the community valuable tools and core foundations for various genome-scale experiments before the whole genome sequencing project. These EST resources have been used for marker development, gene cloning, microarray gene expression and genetic map construction. Certainly, the peanut EST sequence resources have been shown to have a wide range of applications and fulfilled their essential role at the time of need. The EST project then contributed to the second historic event, the Peanut Genome Project 2010 Inaugural Meeting, also held in Atlanta, where it was decided to sequence the entire peanut genome. After the completion of peanut whole genome sequencing, ESTs or transcriptome data will continue to play an important role in filling knowledge gaps, identifying particular genes and exploring gene function. PMID:22745594

  6. Comparative mapping of expressed sequence tags containing microsatellites in rainbow trout (Oncorhynchus mykiss)

    PubMed Central

    Rexroad, Caird E; Rodriguez, Maria F; Coulibaly, Issa; Gharbi, Karim; Danzmann, Roy G; DeKoning, Jenefer; Phillips, Ruth; Palti, Yniv

    2005-01-01

    Background: Comparative genomics, through the integration of genetic maps from species of interest with whole genome sequences of other species, will facilitate the identification of genes affecting phenotypes of interest. The development of microsatellite markers from expressed sequence tags will serve to increase marker densities on current salmonid genetic maps and initiate in silico comparative maps with species whose genomes have been fully sequenced. Results: Eighty-nine polymorphic microsatellite markers were generated for rainbow trout of which at least 74 amplify in other salmonids. Fifty-five have been associated with functional annotation and 30 were mapped on existing genetic maps. Homologous sequences were identified for 20 of the EST containing microsatellites to identify comparative assignments within the tetraodon, mouse, and/or human genomes. Conclusion: The addition of microsatellite markers constructed from expressed sequence tag data will facilitate the development of high-density genetic maps for rainbow trout and comparative maps with other salmonids and better studied species. PMID:15836796

  7. Analysis of common bean expressed sequence tags identifies sulfur metabolic pathways active in seed and sulfur-rich proteins highly expressed in the absence of phaseolin and major lectins

    PubMed Central

    2011-01-01

    Background: A deficiency in phaseolin and phytohemagglutinin is associated with a near doubling of sulfur amino acid content in genetically related lines of common bean (Phaseolus vulgaris), particularly cysteine, elevated by 70%, and methionine, elevated by 10%. This mostly takes place at the expense of an abundant non-protein amino acid, S-methyl-cysteine. The deficiency in phaseolin and phytohemagglutinin is mainly compensated by increased levels of the 11S globulin legumin and residual lectins. Legumin, albumin-2, defensin and albumin-1 were previously identified as contributing to the increased sulfur amino acid content in the mutant line, on the basis of similarity to proteins from other legumes. Results: Profiling of free amino acid in developing seeds of the BAT93 reference genotype revealed a biphasic accumulation of gamma-glutamyl-S-methyl-cysteine, the main soluble form of S-methyl-cysteine, with a lag phase occurring during storage protein accumulation. A collection of 30,147 expressed sequence tags (ESTs) was generated from four developmental stages, corresponding to distinct phases of gamma-glutamyl-S-methyl-cysteine accumulation, and covering the transitions to reserve accumulation and desiccation. Analysis of gene ontology categories indicated the occurrence of multiple sulfur metabolic pathways, including all enzymatic activities responsible for sulfate assimilation, de novo cysteine and methionine biosynthesis. Integration of genomic and proteomic data enabled the identification and isolation of cDNAs coding for legumin, albumin-2, defensin D1 and albumin-1A and -B induced in the absence of phaseolin and phytohemagglutinin. Their deduced amino acid sequences have a higher content of cysteine than methionine, providing an explanation for the preferential increase of cysteine in the mutant line. Conclusion: The EST collection provides a foundation to further investigate sulfur metabolism and the differential accumulation of sulfur amino acids in seed

  8. De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins.

    PubMed

    Shen, Yufeng; Tolić, Nikola; Hixson, Kim K; Purvine, Samuel O; Anderson, Gordon A; Smith, Richard D

    2008-10-15

    De novo sequencing is a spectrum analysis approach for mass spectrometry data to discover post-translational modifications in proteins; however, such an approach is still in its infancy and is still not widely applied to proteomic practices due to its limited reliability. In this work, we describe a de novo sequencing approach for the discovery of protein modifications based on identification of the proteome UStags (Shen, Y.; Tolić, N.; Hixson, K. K.; Purvine, S. O.; Pasa-Tolić, L.; Qian, W. J.; Adkins, J. N.; Moore, R. J.; Smith, R. D. Anal. Chem. 2008, 80, 1871-1882). The de novo information was obtained from Fourier-transform tandem mass spectrometry data for peptides and polypeptides from a yeast lysate, and the de novo sequences obtained were selected based on filter levels designed to provide a limited yet high quality subset of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or in the UStags (i.e., the UStags' prefix and suffix sequences and the UStags themselves) were used to infer possible sequence modifications. With this de novo-UStag approach, we uncovered some unexpected variances within several yeast protein sequences due to amino acid mutations and/or multiple modifications to the predicted protein sequences. To determine false discovery rates, two random (false) databases were independently used for sequence matching, and ~3% false discovery rates were estimated for the de novo-UStag approach. The factors affecting the reliability (e.g., existence of de novo sequencing noise residues and redundant sequences) and the sensitivity of the approach were investigated and described. The combined de novo-UStag approach complements the UStag method previously reported by enabling the discovery of new protein modifications. PMID:18783246

  9. Characterization of genome-wide ordered sequence-tagged Mycobacterium mutant libraries by Cartesian Pooling-Coordinate Sequencing

    PubMed Central

    Vandewalle, Kristof; Festjens, Nele; Plets, Evelyn; Vuylsteke, Marnik; Saeys, Yvan; Callewaert, Nico

    2015-01-01

    Reverse genetics research approaches require the availability of methods to rapidly generate specific mutants. Alternatively, where these methods are lacking, the construction of pre-characterized libraries of mutants can be extremely valuable. However, this can be complex, expensive and time consuming. Here, we describe a robust, easy to implement parallel sequencing-based method (Cartesian Pooling-Coordinate Sequencing or CP-CSeq) that reports both on the identity as well as on the location of sequence-tagged biological entities in well-plate archived clone collections. We demonstrate this approach using a transposon insertion mutant library of the Mycobacterium bovis BCG vaccine strain, providing the largest resource of mutants in any strain of the M. tuberculosis complex. The method is applicable to any entity for which sequence-tagged identification is possible. PMID:25960123
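
The idea of recovering a clone's well-plate location from pooled sequencing can be sketched generically: if a sequence tag is seen in exactly one plate pool, one row pool and one column pool, the three pool labels together give its coordinates. The code below is a simplified illustration under that assumption; the function name `locate_tags`, the pool labels and the three-axis layout are hypothetical, and the abstract does not specify the actual CP-CSeq pooling scheme.

```python
def locate_tags(plate_pools, row_pools, column_pools):
    """Map each tag to (plate, row, column) when it occurs in exactly one
    pool per axis; tags with missing or multiple hits are set aside."""
    def index(pools):
        # Invert {pool_name: set_of_tags} into {tag: [pool_names]}.
        seen = {}
        for name, tags in pools.items():
            for tag in tags:
                seen.setdefault(tag, []).append(name)
        return seen

    by_plate, by_row, by_col = index(plate_pools), index(row_pools), index(column_pools)
    located, ambiguous = {}, set()
    for tag in set(by_plate) | set(by_row) | set(by_col):
        hits = (by_plate.get(tag, []), by_row.get(tag, []), by_col.get(tag, []))
        if all(len(h) == 1 for h in hits):
            located[tag] = tuple(h[0] for h in hits)
        else:
            ambiguous.add(tag)
    return located, ambiguous

# Toy data: three tags across two plates, two rows and two columns.
plates = {"P1": {"tagA", "tagB"}, "P2": {"tagC"}}
rows = {"A": {"tagA", "tagC"}, "B": {"tagB"}}
cols = {"1": {"tagA"}, "2": {"tagB", "tagC"}}
located, ambiguous = locate_tags(plates, rows, cols)
print(located, ambiguous)
```

A tag that appears in more than one pool along any axis (for example, the same insertion present in two wells) cannot be placed by this simple rule and is returned in the ambiguous set.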

  10. Microbial Diversity in Deep-sea Methane Seep Sediments Presented by SSU rRNA Gene Tag Sequencing

    PubMed Central

    Nunoura, Takuro; Takaki, Yoshihiro; Kazama, Hiromi; Hirai, Miho; Ashi, Juichiro; Imachi, Hiroyuki; Takai, Ken

    2012-01-01

    Microbial community structures in methane seep sediments in the Nankai Trough were analyzed by tag-sequencing analysis for the small subunit (SSU) rRNA gene using a newly developed primer set. The dominant members of Archaea were Deep-sea Hydrothermal Vent Euryarchaeotic Group 6 (DHVEG 6), Marine Group I (MGI) and Deep Sea Archaeal Group (DSAG), and those in Bacteria were Alpha-, Gamma-, Delta- and Epsilonproteobacteria, Chloroflexi, Bacteroidetes, Planctomycetes and Acidobacteria. Diversity and richness were examined by 8,709 and 7,690 tag-sequences from sediments at 5 and 25 cm below the seafloor (cmbsf), respectively. The estimated diversity and richness in the methane seep sediment are as high as those in soil and deep-sea hydrothermal environments, although the tag-sequences obtained in this study were not sufficient to show whole microbial diversity in this analysis. We also compared the diversity and richness of each taxon/division between the sediments from the two depths, and found that the diversity and richness of some taxa/divisions varied significantly along with the depth. PMID:22510646

  11. Mining and comparison of haplotype-based expressed sequence tag single nucleotide polymorphisms among citrus cultivars

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially...

  12. Patterns of gene expression in microarrays and expressed sequence tags from normal and cataractous lenses.

    PubMed

    Sousounis, Konstantinos; Tsonis, Panagiotis A

    2012-01-01

    In this contribution, we have examined the patterns of gene expression in normal and cataractous lenses as presented in five different papers using microarrays and expressed sequence tags. The purpose was to evaluate unique and common patterns of gene expression during development, aging and cataracts. PMID:23244575

  13. Seventy microsatellite markers from Persea americana Miller (avocado) expressed sequence tags

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Expressed sequence tags (ESTs) for Persea americana Mill. were investigated to expand upon the number of informative microsatellite markers available for avocado. Seventy informative loci were discovered using twenty-four P. americana var. americana Mill. accessions. The number of alleles detected r...

  14. CoffeebEST: an integrated resource for Coffea spp expressed sequence tags.

    PubMed

    Paschoal, A R; Fernandes, E D M; Silva, J C; Lopes, F M; Pereira, L F P; Domingues, D S

    2014-01-01

    Coffee is one of the most important commodities in the world, and its production relies mainly on two species, Coffea arabica and Coffea canephora. Although there are diverse transcriptome datasets available for coffee trees, few research groups have exploited the potential knowledge contained in these data, especially with respect to fruit and seed development. Here, we present a comparative analysis of the transcriptomes of Coffea arabica and Coffea canephora with a focus on fruit development using publicly available expressed sequence tags (ESTs). Most of the fruit and seed EST data has been obtained from C. canephora. Therefore, we performed a fruit EST analysis of the 5 developmental stages of this species (18, 22, 30, 42, and 46 weeks after flowering) comprising 29,009 sequences. We compared C. canephora fruit ESTs to reference unigenes of C. canephora (7710 contigs and 8955 singletons) and C. arabica (15,656 contigs and 16,351 singletons). Additional analyses included functional annotation based on Gene Ontology, as well as an annotation using PlantCyc, a curated plant protein database. The Coffee Bean EST (CoffeebEST) is a public database available at http://bioinfo-02.cp.utfpr.edu.br/. This database represents an additional resource for the coffee scientific community, offering a user-friendly collection of information for non-specialists in coffee molecular biology to support experimental research on comparative and functional genomics. PMID:25526212

  15. Identification of genes related to Parkinson's disease using expressed sequence tags.

    PubMed

    Kim, Jeong-Min; Lee, Kyu-Hwa; Jeon, Yeo-Jin; Oh, Jung-Hwa; Jeong, So-Young; Song, In-Sung; Kim, Jin-Man; Lee, Dong-Seok; Kim, Nam-Soon

    2006-12-31

    In a search for novel target genes related to Parkinson's disease (PD), two full-length cDNA libraries were constructed from a human normal substantia nigra (SN) and a PD patient's SN. An analysis of the gene expression profiles between them was done using expressed sequence tag (EST) frequency. Data for the differently expressed genes were verified by quantitative real-time RT-PCR, immunohistochemical analysis and a cell death assay. Among the 76 genes identified with a significant difference (P > 0.9), 21 upregulated genes and 13 downregulated genes were confirmed to be differentially expressed in human PD tissues and/or in an MPTP-treated mouse model by quantitative real-time RT-PCR. Among those genes, an immunohistochemical analysis using an MPTP mouse model for alpha-tubulin including TUBA3 and TUBA6 showed that the protein levels are downregulated, as well as the RNA levels. In addition, MBP, PBP and GNAS were confirmed to accelerate cell death activity, whereas SPP1 and TUBA3 retarded this process. Using an analysis of EST frequency, it was possible to identify a large number of genes related to human PD. These new genes, MBP, PBP, GNAS, SPP1 and TUBA3 in particular, represent potential biomarkers for PD and could serve as useful targets for elucidating the molecular mechanisms associated with PD. PMID:17213182

  16. Expressed sequence tags from Atta laevigata and identification of candidate genes for the control of pest leaf-cutting ants

    PubMed Central

    2011-01-01

    Background: Leafcutters are the highest evolved within Neotropical ants in the tribe Attini and model systems for studying caste formation, labor division and symbiosis with microorganisms. Some species of leafcutters are agricultural pests controlled by chemicals which affect other animals and accumulate in the environment. Aiming to provide a genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for the control of pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with broad geographic distribution in South America. Results: The analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for high level of gene expression, being responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; regulation of gene expression and metabolism. Based on the leafcutters' lifestyle and reports of genes involved in key processes of other social insects, we identified 146 sequences as potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. Conclusion: The generation and analysis of expressed sequence tags from Atta laevigata have provided an important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for the control of agricultural pest leafcutters. PMID:21682882

  17. Expressed sequence tags reveal genetic diversity and putative virulence factors of the pathogenic oomycete Pythium insidiosum.

    PubMed

    Krajaejun, Theerapong; Khositnithikul, Rommanee; Lerksuthirat, Tassanee; Lowhnoo, Tassanee; Rujirawat, Thidarat; Petchthong, Thanom; Yingyong, Wanta; Suriyaphol, Prapat; Smittipat, Nat; Juthayothin, Tada; Phuntumart, Vipaporn; Sullivan, Thomas D

    2011-07-01

    Oomycetes are unique eukaryotic microorganisms that share a mycelial morphology with fungi. Many oomycetes are pathogenic to plants, and a more limited number are pathogenic to animals. Pythium insidiosum is the only oomycete that is capable of infecting both humans and animals, and causes a life-threatening infectious disease, called "pythiosis". In the majority of pythiosis patients life-long handicaps result from the inevitable radical excision of infected organs, and many die from advanced infection. Better understanding P. insidiosum pathogenesis at molecular levels could lead to new forms of treatment. Genetic and genomic information is lacking for P. insidiosum, so we have undertaken an expressed sequence tag (EST) study, and report on the first dataset of 486 ESTs, assembled into 217 unigenes. Of these, 144 had significant sequence similarity with known genes, including 47 with ribosomal protein homology. Potential virulence factors included genes involved in antioxidation, thermal adaptation, immunomodulation, and iron and sterol binding. Effectors resembling pathogenicity factors of plant-pathogenic oomycetes were also discovered, such as, a CBEL-like protein (possible involvement in host cell adhesion and hemagglutination), a putative RXLR effector (possibly involved in host cell modulation) and elicitin-like (ELL) proteins. Phylogenetic analysis mapped P. insidiosum ELLs to several novel clades of oomycete elicitins (ELIs), and homology modeling predicted that P. insidiosum ELLs should bind sterols. Most of the P. insidiosum ESTs showed homology to sequences in the genome or EST databases of other oomycetes, but one putative gene, with unknown function, was found to be unique to P. insidiosum. The EST dataset reported here represents the first steps in identifying genes of P. insidiosum and beginning transcriptome analysis. This genetic information will facilitate understanding of pathogenic mechanisms of this devastating pathogen. PMID:21724174

  18. Sub-wavelength plasmonic readout for direct linear analysis of optically tagged DNA

    NASA Astrophysics Data System (ADS)

    Varsanik, Jonathan; Teynor, William; LeBlanc, John; Clark, Heather; Krogmeier, Jeffrey; Yang, Tian; Crozier, Kenneth; Bernstein, Jonathan

    2010-02-01

    This work describes the development and fabrication of a novel nanofluidic flow-through sensing chip that utilizes a plasmonic resonator to excite fluorescent tags with sub-wavelength resolution. We cover the design of the microfluidic chip and simulation of the plasmonic resonator using Finite Difference Time Domain (FDTD) software. The fabrication methods are presented, with testing procedures and preliminary results. This research is aimed at improving the resolution limits of the Direct Linear Analysis (DLA) technique developed by US Genomics [1]. In DLA, intercalating dyes which tag a specific 8 base-pair sequence are inserted in a DNA sample. This sample is pumped through a nano-fluidic channel, where it is stretched into a linear geometry and interrogated with light which excites the fluorescent tags. The resulting sequence of optical pulses produces a characteristic "fingerprint" of the sample which uniquely identifies any sample of DNA. Plasmonic confinement of light to a 100 nm wide metallic nano-stripe enables resolution of a higher tag density compared to free space optics. Prototype devices have been fabricated and are being tested with fluorophore solutions and tagged DNA. Preliminary results show that evanescent coupling to the plasmonic resonator is occurring with 0.1 micron resolution; however, light scattering limits the S/N of the detector. Two methods to reduce scattered light are presented: index matching and curved waveguides.

  19. Velocity measurement of clay intrusion through a sudden contraction step using a tagging pulse sequence.

    PubMed

    Tsushima, Shohji; Hasegawa, Atsushi; Suekane, Tetsuya; Hirai, Shuichiro; Tanaka, Yoshihiro; Nakasuji, Yoshizumi

    2003-07-01

    Magnetic resonance imaging (MRI) with a spatial tagging sequence was used to measure the velocity distribution of clay that was forced past a sudden contraction. A spatial tagging sequence provided magnetic resonance images of clay that allowed measurement of the velocity distribution in the clay, which can provide profound insights on the deformation process of clay during the intrusion process. The experiments were conducted using a specially-designed vessel that could operate at up to 30 MPa. The vessel offers a rectangular test section with a sudden contraction step that had a contraction ratio of 2:1. The vessel was installed in a commercial magnetic resonance imaging system, and the motion of clay flowing into the narrow contracted channel was then quantitatively investigated to examine the behavior of flowing clay as a non-Newtonian fluid. MRI results are compared with those obtained by computational fluid dynamics (CFD) calculation. Velocity distributions obtained from each tag displacement did not agree well with those predicted by CFD near the contraction step, where the fluid accelerated rapidly. However, post-processing of the calculation results, in which virtual tag displacement is calculated, gave better agreement with experiment and enabled us to compare MRI results with CFD results. PMID:12915199

  20. Multiplexed metagenome mining using short DNA sequence tags facilitates targeted discovery of epoxyketone proteasome inhibitors

    PubMed Central

    Owen, Jeremy G.; Charlop-Powers, Zachary; Smith, Alexandra G.; Ternei, Melinda A.; Calle, Paula Y.; Reddy, Boojala Vijay B.; Montiel, Daniel; Brady, Sean F.

    2015-01-01

    In molecular evolutionary analyses, short DNA sequences are used to infer phylogenetic relationships among species. Here we apply this principle to the study of bacterial biosynthesis, enabling the targeted isolation of previously unidentified natural products directly from complex metagenomes. Our approach uses short natural product sequence tags derived from conserved biosynthetic motifs to profile biosynthetic diversity in the environment and then guide the recovery of gene clusters from metagenomic libraries. The methodology is conceptually simple, requires only a small investment in sequencing, and is not computationally demanding. To demonstrate the power of this approach to natural product discovery we conducted a computational search for epoxyketone proteasome inhibitors within 185 globally distributed soil metagenomes. This led to the identification of 99 unique epoxyketone sequence tags, falling into 6 phylogenetically distinct clades. Complete gene clusters associated with nine unique tags were recovered from four saturating soil metagenomic libraries. Using heterologous expression methodologies, seven potent epoxyketone proteasome inhibitors (clarepoxcins A–E and landepoxcins A and B) were produced from these pathways, including compounds with different warhead structures and a naturally occurring halohydrin prodrug. This study provides a template for the targeted expansion of bacterially derived natural products using the global metagenome. PMID:25831524

  1. Identification of genes encoding Schistosoma mansoni antigens using an antigenic sequence tag strategy.

    PubMed

    Zouain, C S; Azevedo, V A; Franco, G R; Pena, S D; Goes, A M

    1998-12-01

    Another approach for the identification of genes that code for antigenic products is described using an antigenic sequence tag (AST) strategy. A Schistosoma mansoni adult worm cDNA library was screened with affinity chromatography-purified immunoglobulins from infected human sera and a mild oxidation treatment with sodium periodate. From 1 or both ends of 30 cDNA clones, 30 ASTs were obtained. Of these, 22 were previously known Sm antigens. One clone had matches with entries for other organisms in the databases and 6 had homology with Sm-expressed sequence tags (EST) entries. These clones, together with another 1 that had no significant database matches, were considered new antigenic genes in S. mansoni. The strategy proved to be efficient for the identification of genes that could be used for immunological studies and evaluation as vaccine candidates. PMID:9920341

  2. Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...

  3. Transferability of microsatellite and sequence tagged site markers in Oryza species.

    PubMed

    Brondani, Claudio; Rangel, Paulo Hideo Nakano; Borba, Tereza Cristina Oliveira; Brondani, Rosana Pereira Vianello

    2003-01-01

    The genus Oryza comprises 22 species which are potentially useful as a source of genetic variability that can be introgressed into the worldwide cultivated rice, Oryza sativa. Molecular markers are useful tools for monitoring gene introgressions and for detecting polymorphism among species. In this study, cross-amplification was estimated among 28 accessions of 16 Oryza species, representing the genomes AA, BB, CC, BBCC and CCDD, using 59 microsatellite (OG, OS and RM series) and 15 STS (Sequence Tagged Sites) markers. All markers amplified at least one Oryza species, indicating different levels of transferability across species. Markers based on microsatellite sequences amplified 37 % of the accessions, with an average of 6.58 alleles per locus and an average polymorphism information content (PIC) of 70 %. For STS markers, the amplification level was 53.3 %, and the average number of alleles and PIC values were 1.6 and 10 %, respectively. These results showed that although the STS markers detected a reduced level of genetic diversity, the transferability was higher, indicating that they can be used for genetic analysis when evaluating less genetically related species of Oryza. Among the microsatellite markers, an analysis of species with an AA genome showed that the OG markers produced the highest level of polymorphic loci (54.6 %), followed by RM markers (48 %). Highly polymorphic and transferable molecular markers in Oryza can be useful for exploiting the genetic resources of this genus, for detecting allelic variants in loci associated with important agronomic traits, and for monitoring alleles introgressed from wild relatives to cultivated rice. PMID:14641482

  4. Sixteen Polymorphic Simple Sequence Repeat Markers from Expressed Sequence Tags of the Chinese Mitten Crab Eriocheir sinensis

    PubMed Central

    Gao, Xiang-Gang; Li, Hong-Jun; Li, Yun-Feng; Sui, Li-Jun; Zhu, Bao; Liang, Yu; Liu, Wei-Dong; He, Chong-Bo

    2010-01-01

    The Chinese mitten crab (Eriocheir sinensis) is an economically important aquaculture species in China. In this study, we developed and evaluated simple sequence repeat markers from expressed sequence tags of E. sinensis. Among the 40 wild E. sinensis individuals tested, 16 loci were polymorphic. The number of alleles per locus ranged from two to ten. The observed heterozygosity ranged from 0.0667 to 0.9667, whereas the expected heterozygosity ranged from 0.0661 to 0.9051. These markers have the potential for use in genetic studies of population structure and intraspecific variation in E. sinensis. PMID:21152289
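
The observed and expected heterozygosity values reported above are standard per-locus summaries. The following is a minimal sketch of how they can be computed from diploid genotypes, assuming the simple (uncorrected) estimator He = 1 - sum(p_i^2); the record does not say which estimator or software the authors used, and the function name and example genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies are estimated from the 2n sampled gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    # Expected heterozygosity under Hardy-Weinberg proportions: 1 - sum(p_i^2).
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return h_obs, h_exp


# Hypothetical genotypes (allele sizes in bp) for four individuals at one locus.
example = [(150, 154), (150, 150), (154, 158), (150, 154)]
h_obs, h_exp = heterozygosity(example)
```

With small samples, an unbiased variant that multiplies He by 2n/(2n - 1) is often preferred; that correction is omitted here to keep the sketch short.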

  5. Identification and Characterization of Microsatellites in Expressed Sequence Tags and Their Cross Transferability in Different Plants

    PubMed Central

    Haq, Shamshad ul; Jain, Rohit; Sharma, Meenakshi; Kachhwaha, Sumita; Kothari, S. L.

    2014-01-01

    Expressed sequence tags (ESTs) are a potential source for the development of genic microsatellite markers, gene discovery, comparative genomics, and other genomic studies. In the present study, 7630 ESTs from NCBI were examined for SSR identification and characterization. A total of 263 SSRs were identified with an average density of one SSR/4.2 kb (3.4% frequency). Analysis revealed that trinucleotide repeats (47.52%) were most abundant, followed by tetranucleotide (19.77%), dinucleotide (19.01%), pentanucleotide (9.12%), and hexanucleotide repeats (4.56%). Functional annotation was done through homology search and gene ontology, and 35 EST-SSRs were selected. Primer pairs were designed for evaluation of cross transferability and polymorphism among 11 plants belonging to five different families. A total of 402 alleles were generated at 155 loci with an average of 2.6 alleles/locus, and the polymorphic information content (PIC) ranged from 0.15 to 0.92 with an average of 0.75. The cross transferability ranged from 34.84% to 98.06% in different plants, with an average of 67.86%. Thus, the validation study of the 35 annotated EST-SSR markers, which correspond to particular metabolic activities, revealed polymorphism and evolutionary nature in different families of angiosperm plants. PMID:25389527
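
The polymorphic information content (PIC) quoted above is commonly computed per locus from allele frequencies using the formula of Botstein et al. (1980): PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The sketch below illustrates that formula; whether the study used exactly this definition is an assumption, and the function name is hypothetical.

```python
def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies at the locus (assumed to sum to 1).
    """
    # Homozygosity term: sum of squared allele frequencies.
    sum_sq = sum(p * p for p in freqs)
    # Cross term: sum over all allele pairs i < j of 2 * p_i^2 * p_j^2.
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - sum_sq - cross


# Two equally frequent alleles give the maximum PIC for a biallelic locus.
biallelic_max = pic([0.5, 0.5])
```

A locus with a single fixed allele has PIC 0, and PIC approaches 1 only as the number of equally frequent alleles grows, which is why averages such as 0.75 indicate highly informative markers.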

  6. Sequencing degraded DNA from non-destructively sampled museum specimens for RAD-tagging and low-coverage shotgun phylogenetics.

    PubMed

    Tin, Mandy Man-Ying; Economo, Evan Philip; Mikheyev, Alexander Sergeyevich

    2014-01-01

    Ancient and archival DNA samples are valuable resources for the study of diverse historical processes. In particular, museum specimens provide access to biotas distant in time and space, and can provide insights into ecological and evolutionary changes over time. However, archival specimens are difficult to handle; they are often fragile and irreplaceable, and typically contain only short segments of denatured DNA. Here we present a set of tools for processing such samples for state-of-the-art genetic analysis. First, we report a protocol for minimally destructive DNA extraction of insect museum specimens, which produced sequenceable DNA from all of the samples assayed. The 11 specimens analyzed had fragmented DNA, rarely exceeding 100 bp in length, and could not be amplified by conventional PCR targeting the mitochondrial cytochrome oxidase I gene. Our approach made these samples amenable to analysis with commonly used next-generation sequencing-based molecular analytic tools, including RAD-tagging and shotgun genome re-sequencing. First, we used museum ant specimens from three species, each with its own reference genome, for RAD-tag mapping. We were able to use the degraded DNA sequences, which were sequenced in full, to identify duplicate reads and filter them prior to base calling. Second, we re-sequenced six Hawaiian Drosophila species, with millions of years of divergence, but with only a single available reference genome. Despite a shallow coverage of 0.37 ± 0.42 per base, we could recover a sufficient number of overlapping SNPs to fully resolve the species tree, which was consistent with earlier karyotypic studies, and previous molecular studies, at least in the regions of the tree that these studies could resolve. Although developed for use with degraded DNA, all of these techniques are readily applicable to more recent tissue, and are suitable for liquid handling automation. PMID:24828244

  7. Identification of SNP and SSR markers in eggplant using RAD tag sequencing

    PubMed Central

    2011-01-01

    Background The eggplant (Solanum melongena L.) genome is relatively unexplored, especially compared to those of the other major Solanaceae crops tomato and potato. In particular, no SNP markers are publicly available; on the other hand, over 1,000 SSR markers were developed and publicly available. We have combined the recently developed Restriction-site Associated DNA (RAD) approach with Illumina DNA sequencing for rapid and mass discovery of both SNP and SSR markers for eggplant. Results RAD tags were generated from the genomic DNA of a pair of eggplant mapping parents, and sequenced to produce ~17.5 Mb of sequences arrangeable into ~78,000 contigs. The resulting non-redundant genomic sequence dataset consisted of ~45,000 sequences, of which ~29% were putative coding sequences and ~70% were in common between the mapping parents. The shared sequences allowed the discovery of ~10,000 SNPs and nearly 1,000 indels, equivalent to a SNP frequency of 0.8 per Kb and an indel frequency of 0.07 per Kb. Over 2,000 of the SNPs are likely to be mappable via the Illumina GoldenGate assay. A subset of 384 SNPs was used to successfully fingerprint a panel of eggplant germplasm, producing a set of informative diversity data. The RAD sequences also included nearly 2,000 putative SSRs, and primer pairs were designed to amplify 1,155 loci. Conclusion The high throughput sequencing of the RAD tags allowed the discovery of a large number of DNA markers, which will prove useful for extending our current knowledge of the genome organization of eggplant, for assisting in marker-aided selection and for carrying out comparative genomic analyses within the Solanaceae family. PMID:21663628

  8. [Isolation and expression of novel expressed sequence tags (ESTs) from ovarian follicles of Shaoxing ducks].

    PubMed

    Shu, Gang; Chen, Jie; Ni, Ying-Dong; Zhou, Yu-Chuan; Zhao, Ru-Qian

    2004-10-01

    Three expressed sequence tags (ESTs), SXDF0201 (271 bp), SXDF0202 (200 bp) and SXDF0203 (173 bp), were isolated from ovarian follicles of Shaoxing ducks by using silver staining mRNA differential display. GenBank/BLAST analysis revealed that SXDF0201 was not homologous to any of the published sequences from all species, indicating that it was a novel EST, and it was then registered in GenBank (GenBank Accession No.: CB072629), while SXDF0202 and SXDF0203 were found to be highly homologous to seven known chicken ESTs and chicken mRNA for gizzard smooth muscle myosin heavy chain. 5'-RACE was employed to extend SXDF0201 to 544 bp, which was confirmed as novel in a BLAST search. The temporal and spatial expression of SXDF0201 and SXDF0202 was also investigated with semi-quantitative RT-PCR. The results showed that both SXDF0201 and SXDF0202 were expressed in hypothalamus, pituitary, muscle, liver, and fat tissues of Shaoxing ducks; SXDF0201 was expressed significantly higher in ovaries of 30-day-old Shaoxing ducks compared with that of 60-day-old (P < 0.05) and 90-day-old (P = 0.015), but the expression of SXDF0202 showed no difference throughout the ovarian development; granulosa layers expressed higher SXDF0201 than theca layers in almost all hierarchical follicles; the expression of SXDF0202 in granulosa layers increased along with follicular maturation (P < 0.01) from Fw to F3 follicles, but decreased dramatically to the lowest in F1 follicles (P < 0.01). In theca layers, the highest expression of SXDF0202 was found in Fw follicles (P < 0.01). PMID:15552044

  9. Development of expressed sequence tag and expressed sequence tag–simple sequence repeat marker resources for Musa acuminata

    PubMed Central

    Passos, Marco A. N.; de Oliveira Cruz, Viviane; Emediato, Flavia L.; de Camargo Teixeira, Cristiane; Souza, Manoel T.; Matsumoto, Takashi; Rennó Azevedo, Vânia C.; Ferreira, Claudia F.; Amorim, Edson P.; de Alencar Figueiredo, Lucio Flavio; Martins, Natalia F.; de Jesus Barbosa Cavalcante, Maria; Baurens, Franc-Christophe; da Silva, Orzenil Bonfim; Pappas, Georgios J.; Pignolet, Luc; Abadie, Catherine; Ciampi, Ana Y.; Piffanelli, Pietro; Miller, Robert N. G.

    2012-01-01

    Background and aims Banana (Musa acuminata) is a crop contributing to global food security. Many varieties lack resistance to biotic stresses, due to sterility and narrow genetic background. The objective of this study was to develop an expressed sequence tag (EST) database of transcripts expressed during compatible and incompatible banana–Mycosphaerella fijiensis (Mf) interactions. Black leaf streak disease (BLSD), caused by Mf, is a destructive disease of banana. Microsatellite markers were developed as a resource for crop improvement. Methodology cDNA libraries were constructed from in vitro-infected leaves from BLSD-resistant M. acuminata ssp. burmaniccoides Calcutta 4 (MAC4) and susceptible M. acuminata cv. Cavendish Grande Naine (MACV). Clones were 5′-end Sanger sequenced, ESTs assembled with TGICL and unigenes annotated using BLAST, Blast2GO and InterProScan. Mreps was used to screen for simple sequence repeats (SSRs), with markers evaluated for polymorphism using 20 diploid (AA) M. acuminata accessions contrasting in resistance to Mycosphaerella leaf spot diseases. Principal results A total of 9333 high-quality ESTs were obtained for MAC4 and 3964 for MACV, which assembled into 3995 unigenes. Of these, 2592 displayed homology to genes encoding proteins with known or putative function, and 266 to genes encoding proteins with unknown function. Gene ontology (GO) classification identified 543 GO terms, 2300 unigenes were assigned to EuKaryotic orthologous group categories and 312 mapped to Kyoto Encyclopedia of Genes and Genomes pathways. A total of 624 SSR loci were identified, with trinucleotide repeat motifs the most abundant in MAC4 (54.1 %) and MACV (57.6 %). Polymorphism across M. acuminata accessions was observed with 75 markers. Alleles per polymorphic locus ranged from 2 to 8, totalling 289. The polymorphism information content ranged from 0.08 to 0.81. Conclusions This EST collection offers a resource for studying functional genes, including

  10. Gene expression profile in the anterior regeneration of the earthworm using expressed sequence tags.

    PubMed

    Cho, Sung-Jin; Lee, Myung Sik; Tak, Eun Sik; Lee, Eun; Koh, Ki Seok; Ahn, Chi Hyun; Park, Soon Cheol

    2009-01-01

    In order to gain insight into the gene expression profiles associated with anterior regeneration of the earthworm, Perionyx excavatus, we analyzed 1,159 expressed sequence tags (ESTs) derived from a cDNA library of early anterior regenerated tissue. Among the 1,159 ESTs analyzed, 622 (53.7%) ESTs showed significant similarity to known genes and represented 338 genes, of which 233 ESTs were singletons and 105 ESTs manifested as two or more ESTs. While 663 ESTs (57.2%) were sequenced only once, 308 ESTs (26.6%) appeared 2 to 5 times, and 188 ESTs (16.2%) were sequenced more than 5 times. A total of 803 genes were categorized into 15 groups according to their biological functions. Among 1,159 ESTs sequenced, we found several genes encoding signaling molecules, such as Notch and Distal-less. The ESTs used in this study should provide a resource for future research in earthworm regeneration. PMID:19129665

  11. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs occur widely in Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers from sequence data. PMID:26214460

  12. A physical map of the X chromosome of Drosophila melanogaster: Cosmid contigs and sequence tagged sites

    SciTech Connect

    Madueno, E.; Modolell, J.; Papagiannakis, G.

    1995-04-01

    A physical map of the euchromatic X chromosome of Drosophila melanogaster has been constructed by assembling contiguous arrays of cosmids that were selected by screening a library with DNA isolated from microamplified chromosomal divisions. This map, consisting of 893 cosmids, covers approximately 64% of the euchromatic part of the chromosome. In addition, 568 sequence tagged sites (STS), in aggregate representing 120 kb of sequenced DNA, were derived from selected cosmids. Most of these STSs, spaced at an average distance of approximately 35 kb along the euchromatic region of the chromosome, represent DNA tags that can be used as entry points to the fruitfly genome. Furthermore, 42 genes have been placed on the physical map, either through the hybridization of specific probes to the cosmids or through the fact that they were represented among the STSs. These provide a link between the physical and the genetic maps of D. melanogaster. Nine novel genes have been tentatively identified in Drosophila on the basis of matches between STS sequences and sequences from other species. 32 refs., 3 figs., 4 tabs.

  13. De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins

    SciTech Connect

    Shen, Yufeng; Tolic, Nikola; Hixson, Kim K.; Purvine, Samuel O.; Anderson, Gordon A.; Smith, Richard D.

    2008-10-15

    De novo sequencing holds promise for discovering protein post-translational modifications; however, such approaches are still in their infancy and are not widely applied in proteomics practice because of their limited reliability. In this work, we describe a de novo sequencing approach for discovery of protein modifications through identification of the UStags (Anal. Chem. 2008, 80, 1871-1882). The de novo information was obtained from Fourier-transform tandem mass spectrometry for peptides and polypeptides in a yeast lysate, and the de novo sequences obtained were filtered to define a more limited set of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or in the UStags (i.e., the UStags’ prefix and suffix sequences and the UStags themselves) were used to infer the possible sequence modifications. With this de novo-UStag approach, we uncovered some unexpected variances of yeast protein sequences due to amino acid mutations and/or multiple modifications to the predicted protein sequences. Random matching of the de novo sequences to the predicted sequences was examined with the use of two random (false) databases, and ~3% false discovery rates were estimated for the de novo-UStag approach. The factors affecting the reliability (e.g., existence of de novo sequencing noise residues and redundant sequences) and the sensitivity are described. The de novo-UStag approach complements the UStag method previously reported by enabling discovery of new protein modifications.

  14. Comprehensive analyses of prostate gene expression: convergence of expressed sequence tag databases, transcript profiling and proteomics.

    PubMed

    Nelson, P S; Han, D; Rochon, Y; Corthals, G L; Lin, B; Monson, A; Nguyen, V; Franza, B R; Plymate, S R; Aebersold, R; Hood, L

    2000-05-01

    Several methods have been developed for the comprehensive analysis of gene expression in complex biological systems. Generally these procedures assess either a portion of the cellular transcriptome or a portion of the cellular proteome. Each approach has distinct conceptual and methodological advantages and disadvantages. We have investigated the application of both methods to characterize the gene expression pathway mediated by androgens and the androgen receptor in prostate cancer cells. This pathway is of critical importance for the development and progression of prostate cancer. Of clinical importance, modulation of androgens remains the mainstay of treatment for patients with advanced disease. To facilitate global gene expression studies we have first sought to define the prostate transcriptome by assembling and annotating prostate-derived expressed sequence tags (ESTs). A total of 55000 prostate ESTs were assembled into a set of 15953 clusters putatively representing 15953 distinct transcripts. These clusters were used to construct cDNA microarrays suitable for examining the androgen-response pathway at the level of transcription. The expression of 20 genes was found to be induced by androgens. This cohort included known androgen-regulated genes such as prostate-specific antigen (PSA) and several novel complementary DNAs (cDNAs). Protein expression profiles of androgen-stimulated prostate cancer cells were generated by two-dimensional electrophoresis (2-DE). Mass spectrometric analysis of androgen-regulated proteins in these cells identified the metastasis-suppressor gene NDKA/nm23, a finding that may explain a marked reduction in metastatic potential when these cells express a functional androgen receptor pathway. PMID:10870968

  15. Development of expressed sequence tag-simple sequence repeat markers for Chrysanthemum morifolium and closely related species.

    PubMed

    Liu, H; Zhang, Q X; Sun, M; Pan, H T; Kong, Z X

    2015-01-01

    With the development of chrysanthemum breeding in recent years, an increasing number of wild species in genera related to Chrysanthemum were introduced to extend the genetic resources and facilitate the genetic improvement of chrysanthemums via hybridization. However, few simple sequence repeat (SSR) markers are available for marker-assisted breeding and population genetic studies of chrysanthemum and closely related species. Expressed sequence tags (ESTs) in public databases and cross-species transferable markers are considered to be a cost-effective means for developing sequence-based markers. In this study, 25 EST-SSRs were successfully developed from Chrysanthemum EST sequences for Chrysanthemum morifolium and closely related species. In total, 4164 unigene sequences were assembled from 7180 ESTs of chrysanthemum in GenBank, which were subsequently used to screen for the presence of microsatellites with the SSRIT software. The screening criteria were 8, 5, 4, and 3 repeating units for di-, tri-, tetra-, and penta- and higher-order nucleotides, respectively. Moreover, 310 SSR loci from 296 sequences were identified, and 198 primer pairs for SSR amplification were designed with the Primer Premier 5.0 software, of which 25 SSR loci showed polymorphic amplification in 52 species and varieties belonging to Chrysanthemum, Ajania, and Opisthopappus. The application of EST-SSR markers to the identification of intergeneric hybrids between Chrysanthemum and Ajania was demonstrated. Therefore, EST-SSRs can be developed for species that lack gene sequences or ESTs by utilizing ESTs of closely related species. PMID:26214436
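
The screening criteria quoted above (at least 8, 5, 4, and 3 repeat units for di-, tri-, tetra-, and penta- or higher-order motifs) can be illustrated with a small regular-expression scan. This is only a sketch of the thresholds and does not reproduce SSRIT; the function name is hypothetical, and treating "higher-order" as hexanucleotides only is an assumption.

```python
import re

# Minimum repeat counts per motif length, following the criteria quoted above;
# limiting "higher-order" motifs to hexanucleotides is an assumption.
MIN_REPEATS = {2: 8, 3: 5, 4: 4, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one motif; \1{min_rep-1,} requires further tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence containing (AC)8 and (GAT)5 separated by short flanks.
demo = "GG" + "AC" * 8 + "CC" + "GAT" * 5 + "CC"
ssrs = find_ssrs(demo)
```

Because the scan reports the leftmost match, a repeat whose flanking base happens to extend the pattern can be reported in a shifted phase (for example TGA rather than GAT); dedicated tools handle such cases more carefully.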

  16. Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

    PubMed Central

    Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

    2013-01-01

    Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation.

  17. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

    PubMed Central

    Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

    2016-01-01

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  18. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    PubMed

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  19. Linguistic Preprocessing and Tagging for Problem Report Trend Analysis

    NASA Technical Reports Server (NTRS)

    Beil, Robert J.; Malin, Jane T.

    2012-01-01

    Mr. Robert Beil, Systems Engineer at Kennedy Space Center (KSC), requested the NASA Engineering and Safety Center (NESC) develop a prototype tool suite that combines complementary software technology used at Johnson Space Center (JSC) and KSC for problem report preprocessing and semantic tag extraction, to improve input to data mining and trend analysis. This document contains the outcome of the assessment and the Findings, Observations and NESC Recommendations.

  20. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome.

    PubMed

    Camargo, A A; Samaia, H P; Dias-Neto, E; Simão, D F; Migotto, I A; Briones, M R; Costa, F F; Nagai, M A; Verjovski-Almeida, S; Zago, M A; Andrade, L E; Carrer, H; El-Dorry, H F; Espreafico, E M; Habr-Gama, A; Giannella-Neto, D; Goldman, G H; Gruber, A; Hackel, C; Kimura, E T; Maciel, R M; Marie, S K; Martins, E A; Nobrega, M P; Paco-Larson, M L; Pardini, M I; Pereira, G G; Pesquero, J B; Rodrigues, V; Rogatto, S R; da Silva, I D; Sogayar, M C; Sonati, M F; Tajara, E H; Valentini, S R; Alberto, F L; Amaral, M E; Aneas, I; Arnaldi, L A; de Assis, A M; Bengtson, M H; Bergamo, N A; Bombonato, V; de Camargo, M E; Canevari, R A; Carraro, D M; Cerutti, J M; Correa, M L; Correa, R F; Costa, M C; Curcio, C; Hokama, P O; Ferreira, A J; Furuzawa, G K; Gushiken, T; Ho, P L; Kimura, E; Krieger, J E; Leite, L C; Majumder, P; Marins, M; Marques, E R; Melo, A S; Melo, M B; Mestriner, C A; Miracca, E C; Miranda, D C; Nascimento, A L; Nobrega, F G; Ojopi, E P; Pandolfi, J R; Pessoa, L G; Prevedel, A C; Rahal, P; Rainho, C A; Reis, E M; Ribeiro, M L; da Ros, N; de Sa, R G; Sales, M M; Sant'anna, S C; dos Santos, M L; da Silva, A M; da Silva, N P; Silva, W A; da Silveira, R A; Sousa, J F; Stecconi, D; Tsukumo, F; Valente, V; Soares, F; Moreira, E S; Nunes, D N; Correa, R G; Zalcberg, H; Carvalho, A F; Reis, L F; Brentani, R R; Simpson, A J; de Souza, S J; Melo, M

    2001-10-01

    Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription-PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022

  1. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

    PubMed Central

    Camargo, Anamaria A.; Samaia, Helena P. B.; Dias-Neto, Emmanuel; Simão, Daniel F.; Migotto, Italo A.; Briones, Marcelo R. S.; Costa, Fernando F.; Aparecida Nagai, Maria; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; Sonati, Maria de Fátima; Tajara, Eloiza H.; Valentini, Sandro R.; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Arnaldi, Liliane A. T.; de Assis, Angela M.; Bengtson, Mário Henrique; Bergamo, Nadia Aparecida; Bombonato, Vanessa; de Camargo, Maria E. R.; Canevari, Renata A.; Carraro, Dirce M.; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Corrêa, Rosana F. R.; Costa, Maria Cristina R.; Curcio, Cyntia; Hokama, Paula O. M.; Ferreira, Ari J. S.; Furuzawa, Gilberto K.; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Krieger, José E.; Leite, Luciana C. C.; Majumder, Paromita; Marins, Mozart; Marques, Everaldo R.; Melo, Analy S. A.; Melo, Monica; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana G.; Prevedel, Aline C.; Rahal, Paula; Rainho, Claudia A.; Reis, Eduardo M. R.; Ribeiro, Marcelo L.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; Sant'anna, Simone Cristina; dos Santos, Mariana L.; da Silva, Aline M.; da Silva, Neusa P.; Silva, Wilson A.; da Silveira, Rosana A.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Soares, Fernando; Moreira, Eloisa S.; Nunes, Diana N.; Correa, Ricardo G.; Zalcberg, Heloisa; Carvalho, Alex F.; Reis, Luis F. L.; Brentani, Ricardo R.; Simpson, Andrew J. G.; de Souza, Sandro J.

    2001-01-01

    Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022

  2. OSIRIS-REx Touch-And-Go (TAG) Mission Design and Analysis

    NASA Technical Reports Server (NTRS)

    Berry, Kevin; Sutter, Brian; May, Alex; Williams, Ken; Barbee, Brent W.; Beckman, Mark; Williams, Bobby

    2013-01-01

    The Origins Spectral Interpretation Resource Identification Security Regolith Explorer (OSIRIS-REx) mission is a NASA New Frontiers mission launching in 2016 to rendezvous with the near-Earth asteroid (101955) 1999 RQ36 in late 2018. After several months in formation with and orbit about the asteroid, OSIRIS-REx will fly a Touch-And-Go (TAG) trajectory to the asteroid's surface to obtain a regolith sample. This paper describes the mission design of the TAG sequence and the propulsive maneuvers required to achieve the trajectory. This paper also shows preliminary results of orbit covariance analysis and Monte-Carlo analysis that demonstrate the ability to arrive at a targeted location on the surface of RQ36 within a 25 meter radius with 98.3% confidence.

  3. Generation of expressed sequence tags of random root cDNA clones of Brassica napus by single-run partial sequencing.

    PubMed Central

    Park, Y S; Kwak, J M; Kwon, O Y; Kim, Y S; Lee, D S; Cho, M J; Lee, H H; Nam, H G

    1993-01-01

    Two hundred thirty-seven expressed sequence tags (ESTs) of Brassica napus were generated by single-run partial sequencing of 197 random root cDNA clones. A computer search of these root ESTs revealed that 21 ESTs show significant similarity to the protein-coding sequences in the existing data bases, including five stress- or defense-related genes and four clones related to the genes from other kingdoms. Northern blot analysis of the 10 data base-matched cDNA clones revealed that many of the clones are expressed most abundantly in root but less abundantly in other organs. However, two clones were highly root specific. The results show that generation of the root ESTs by partial sequencing of random cDNA clones along with the expression analysis is an efficient approach to isolate genes that are functional in plant root in a large scale. We also discuss the results of the examination of cDNA libraries and sequencing methods suitable for this approach. PMID:8029332

  4. Generation of expressed sequence tags under cadmium stress for gene discovery and development of molecular markers in chickpea.

    PubMed

    Gaur, Rashmi; Bhatia, Sabhyata; Gupta, Meetu

    2014-07-01

    Chickpea is the world's third most important legume crop and belongs to the Fabaceae family, but it suffers severe yield losses due to various biotic and abiotic stresses. Development of modern genomic tools such as molecular markers and identification of resistance genes associated with these stresses facilitate improvement in chickpea breeding towards abiotic stress tolerance. In this study, 1597 high-quality expressed sequence tags (ESTs) were generated from a cDNA library of variety Pusa 1105 root tissue after cadmium (Cd) treatment. Assembly of ESTs resulted in a total of 914 unigenes, of which putative homology was obtained for 38.8% of unigenes after BLASTX search. In terms of species distribution, the majority of sequences showed similarity to Medicago truncatula, followed by Glycine max, Vitis vinifera, Populus trichocarpa and Pisum sativum sequences. Functional annotation was assigned using Blast2Go, and the Gene Ontology (GO) terms were categorized into biological process, molecular function and cellular component. Approximately 10.83% of unigenes were assigned at least one GO term. Moreover, in the distribution of transcripts into various biological pathways, 20 of the annotated transcripts were assigned to ten pathways in the KEGG database. A majority of the genes were found to be involved in sulphur and nitrogen metabolism. In the quantitative real-time PCR analysis, five of the transcription factors and three of the transporter genes were found to be highly expressed after Cd treatment. In addition, the utility of the ESTs was demonstrated by exploiting them for the development of 83 genic molecular markers, including EST-simple sequence repeats and intron targeted polymorphism, that would assist in tagging of genes related to metal stress in future work. PMID:24414095

  5. Development of polymorphic microsatellite markers based on expressed sequence tags in Populus cathayana (Salicaceae).

    PubMed

    Tian, Z Z; Zhang, F Q; Cai, Z Y; Chen, S L

    2016-01-01

    Populus cathayana occupies a large area within the northern, central, and southwestern regions of China, and is considered to be an important reforestation species in western China. In order to investigate the population genetic structure of this species, 10 polymorphic microsatellite loci were identified based on expressed sequence tags from de novo sequencing on the Illumina HiSeq 2000 platform. All microsatellite primers were tested on 48 P. cathayana individuals from four locations on the Qinghai-Tibet Plateau. The observed heterozygosity ranged from 0.000 to 1.000, and the null-allele frequency ranged from 0.000 to 0.904. These microsatellite markers may be a useful tool in genetic studies on P. cathayana and closely related species. PMID:27525845

  6. A known expressed sequence tag, BM742401, is a potent lincRNA inhibiting cancer metastasis.

    PubMed

    Park, Seong-Min; Park, Sung-Joon; Kim, Hee-Jin; Kwon, Oh-Hyung; Kang, Tae-Wook; Sohn, Hyun-Ahm; Kim, Seon-Kyu; Moo Noh, Seung; Song, Kyu-Sang; Jang, Se-Jin; Sung Kim, Yong; Kim, Seon-Young

    2013-01-01

    Long intergenic non-coding RNAs (lincRNAs) have historically been ignored in cancer biology. However, thousands of lincRNAs have been identified in mammals using recently developed genomic tools, including microarray and high-throughput RNA sequencing (RNA-seq). Several of the lincRNAs identified have been well characterized for their functions in carcinogenesis. Here we performed RNA-seq experiments comparing gastric cancer with normal tissues to find differentially expressed transcripts in intergenic regions. By analyzing our own RNA-seq and public microarray data, we identified 31 transcripts, including a known expressed sequence tag, BM742401. BM742401 was downregulated in cancer, and its downregulation was associated with poor survival in gastric cancer patients. Ectopic overexpression of BM742401 inhibited metastasis-related phenotypes and decreased the concentration of extracellular MMP9. These results suggest that BM742401 is a potential lincRNA marker and therapeutic target. PMID:23846333

  7. Species diagnostic single-nucleotide polymorphism and sequence-tagged site markers for the parasitic wasp genus Nasonia (Hymenoptera: Pteromalidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We developed, identified and evaluated eight single nucleotide polymorphism (SNP) and three sequence-tagged site (STS) markers in nuclear gene sequences of the wasp genus Nasonia (Hymenoptera). We studied variation of these markers in natural populations of the closely related and regionally sympatr...

  8. Behavior Analysis Based on Coordinates of Body Tags

    NASA Astrophysics Data System (ADS)

    Luštrek, Mitja; Kaluža, Boštjan; Dovgan, Erik; Pogorelc, Bogdan; Gams, Matjaž

    This paper describes fall detection, activity recognition and the detection of anomalous gait in the Confidence project. The project aims to prolong the independence of the elderly by detecting falls and other types of behavior indicating a health problem. The behavior will be analyzed based on the coordinates of tags worn on the body. The coordinates will be detected with radio sensors. We describe two Confidence modules. The first one classifies the user's activity into one of six classes, including falling. The second one detects walking anomalies, such as limping, dizziness and hemiplegia. The walking analysis can automatically adapt to each person by using only the examples of normal walking of that person. Both modules employ machine learning: the paper focuses on the features they use and the effect of tag placement and sensor noise on the classification accuracy. Four tags were enough for activity recognition accuracy of over 93% at moderate sensor noise, while six were needed to detect walking anomalies with the accuracy of over 90%.

  9. A Comprehensive Approach to Clustering of Expressed Human Gene Sequence: The Sequence Tag Alignment and Consensus Knowledge Base

    PubMed Central

    Miller, Robert T.; Christoffels, Alan G.; Gopalakrishnan, Chella; Burke, John; Ptitsyn, Andrey A.; Broveak, Tania R.; Hide, Winston A.

    1999-01-01

    The expressed human genome is being sequenced and analyzed by disparate groups producing disparate data. The majority of the identified coding portion is in the form of expressed sequence tags (ESTs). The need to discover exonic representation and expression forms of full-length cDNAs for each human gene is frustrated by the partial and variable quality nature of this data delivery. A highly redundant human EST data set has been processed into integrated and unified expressed transcript indices that consist of hierarchically organized human transcript consensi reflecting gene expression forms and genetic polymorphism within an index class. The expression index and its intermediate outputs include cleaned transcript sequence, expression, and alignment information and a higher fidelity subset, SANIGENE. The STACK_PACK clustering system has been applied to dbEST release 121598 (GenBank version 110). Sixty-four percent of 1,313,103 Homo sapiens ESTs are condensed into 143,885 tissue level multiple sequence clusters; linking through clone-ID annotations produces 68,701 total assemblies, such that 81% of the original input set is captured in a STACK multiple sequence or linked cluster. Indexing of alignments by substituent EST accession allows browsing of the data structure and its cross-links to UniGene. STACK metaclusters consolidate a greater number of ESTs by a factor of 1.86 with respect to the corresponding UniGene build. Fidelity comparison with genome reference sequence AC004106 demonstrates consensus expression clusters that reflect significantly lower spurious repeat sequence content and capture alternate splicing within a whole body index cluster and three STACK v.2.3 tissue-level clusters. Statistics of a staggered release whole body index build of STACK v.2.0 are presented. PMID:10568754

  10. Development of expressed sequence tag resources for Vanda Mimi Palmer and data mining for EST-SSR.

    PubMed

    Teh, Seow-Ling; Chan, Wai-Sun; Abdullah, Janna Ong; Namasivayam, Parameswari

    2011-08-01

    Vanda Mimi Palmer (VMP) is a highly sought-after fragrant orchid hybrid in Malaysia. It is economically important in the cosmetic and beauty industries and is also a well-known potted ornamental plant. To date, no work on fragrance-related genes of vandaceous orchids has been reported by other research groups, although floral fragrances and volatiles have been extensively studied. An expressed sequence tag (EST) resource was developed for VMP principally to mine any potential fragrance-related expressed sequence tag-simple sequence repeat (EST-SSR) for future development as markers in the identification of fragrant vandaceous orchids endemic to Malaysia. Clustering, annotation and assembly of the ESTs identified 1,196 unigenes, comprising 966 singletons and 230 contigs. The VMP dbEST was functionally classified by gene ontology (GO) into three groups: molecular functions (51.2%), cellular components (16.4%) and biological processes (24.6%), while the remaining 7.8% showed no hits with a GO identifier. A total of 112 EST-SSRs (9.4%) were mined, in which at least five units of di-, tri-, tetra-, penta-, or hexa-nucleotide repeats were predicted. Di-nucleotide motif repeats were the most frequent among the detected SSRs, with the AT/TA types the most abundant among the dimerics, while the AAG/TTC and AGA/TCT types were the most frequent trimerics. The mined EST-SSRs are believed to be useful in the development of EST-SSR markers applicable to the screening and characterization of fragrance-related transcripts in closely related species. PMID:21116862

  11. Development of Simple Sequence Repeat Markers from Expressed Sequence Tags of the Maize Gray Leaf Spot Pathogen, Cercospora zeae-maydis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Ten simple sequence repeat markers were developed from expressed sequence tags of Cercospora zeae-maydis, the cause of gray leaf spot of maize (Zea mays). All loci were evaluated on 80 isolates from a local population of C. zeae-maydis and all were highly polymorphic, with 4 to 14 alleles per locus....

  12. Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...

  13. Development of polymorphic expressed sequence tag-single sequence repeat markers in the common Chinese cuttlefish, Sepiella maindroni.

    PubMed

    Li, R H; Lu, S K; Zhang, C L; Song, W W; Mu, C K; Wang, C L

    2014-01-01

    The common Chinese cuttlefish (Sepiella maindroni) is one of the popular edible cephalopods consumed across Asia. To facilitate population genetic investigation of this species, we developed fourteen polymorphic microsatellite markers from expressed sequence tags of S. maindroni. The number of alleles at each locus ranged from 6 to 10, with an average of 7.9 alleles per locus. The observed and expected heterozygosity ranged from 0.615 to 0.962 and from 0.685 to 0.888, respectively. Four loci were found to deviate significantly from Hardy-Weinberg equilibrium. The polymorphism information content ranged from 0.638 to 0.833. These polymorphic microsatellite loci will be helpful for population genetic, genetic linkage map, and other genetic studies of S. maindroni. PMID:25117305
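
    The per-locus statistics reported above (observed heterozygosity, expected heterozygosity, polymorphism information content) can be sketched as follows. This is an illustrative calculation, not the software used in the study: the function name `locus_stats` and the toy genotypes are hypothetical, expected heterozygosity is computed without a small-sample correction, and PIC uses the commonly cited form 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2.

```python
from collections import Counter

def locus_stats(genotypes):
    """genotypes: list of (allele_a, allele_b) tuples for diploid individuals at one locus."""
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Allele frequencies from the 2n sampled gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Expected heterozygosity under Hardy-Weinberg proportions (uncorrected).
    he = 1.0 - sum(p * p for p in freqs)
    # PIC subtracts the cross term 2 * p_i^2 * p_j^2 for every allele pair i < j.
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs)) for j in range(i + 1, len(freqs)))
    return ho, he, he - cross

# Toy data (not from the paper): four individuals, two alleles.
print(locus_stats([("A", "B"), ("A", "A"), ("B", "B"), ("A", "B")]))
```

    With many alleles at similar frequencies the cross term shrinks, so PIC approaches the expected heterozygosity.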

  14. Characterization of expressed sequence tags from a full-length enriched cDNA library of Cryptomeria japonica male strobili

    PubMed Central

    Futamura, Norihiro; Totoki, Yasushi; Toyoda, Atsushi; Igasaki, Tomohiro; Nanjo, Tokihiko; Seki, Motoaki; Sakaki, Yoshiyuki; Mari, Adriano; Shinozaki, Kazuo; Shinohara, Kenji

    2008-01-01

    Background: Cryptomeria japonica D. Don is one of the most commercially important conifers in Japan. However, the allergic disease caused by its pollen is a severe public health problem in Japan. Since large-scale analysis of expressed sequence tags (ESTs) in the male strobili of C. japonica should help us to clarify the overall expression of genes during the process of pollen development, we constructed a full-length enriched cDNA library that was derived from male strobili at various developmental stages. Results: We obtained 36,011 expressed sequence tags (ESTs) from either one or both ends of 19,437 clones derived from the cDNA library of C. japonica male strobili at various developmental stages. The 19,437 cDNA clones corresponded to 10,463 transcripts. Approximately 80% of the transcripts resembled ESTs from Pinus and Picea, while approximately 75% had homologs in Arabidopsis. An analysis of homologies between ESTs from C. japonica male strobili and known pollen allergens in the Allergome Database revealed that products of 180 transcripts exhibited significant homology. Approximately 2% of the transcripts appeared to encode transcription factors. We identified twelve genes for MADS-box proteins among these transcription factors. The twelve MADS-box genes were classified as DEF/GLO/GGM13-, AG-, AGL6-, TM3- and TM8-like MIKCC genes and type I MADS-box genes. Conclusion: Our full-length enriched cDNA library derived from C. japonica male strobili provides information on expression of genes during the development of male reproductive organs. We provided potential allergens in C. japonica. We also provided new information about transcription factors including MADS-box genes expressed in male strobili of C. japonica. Large-scale gene discovery using full-length cDNAs is a valuable tool for studies of gymnosperm species. PMID:18691438

  15. Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm.

    PubMed

    Riju, Aykkal; Chandrasekar, Arumugam; Arunachalam, Vadivel

    2007-01-01

    The oil palm is a tropical oil-bearing tree. EST-derived SNPs and SSRs are a free by-product of the currently expanding EST (expressed sequence tag) databases. The development of high-throughput methods for the detection of SNPs (single nucleotide polymorphisms) and small indels (insertions/deletions) has led to a revolution in their use as molecular markers. The 5452 available oil palm EST sequences were mined from dbEST at NCBI. The CAP3 program was used to assemble the EST sequences into contigs. Candidate SNPs and indel polymorphisms were detected using the Perl script auto_snip version 1.0, which used 576 ESTs for detecting SNP and indel sites. We found 1180 SNP sites and 137 indel polymorphisms, with a frequency of 1.36 SNPs per 100 bp. Among the six tissues from which the EST libraries had been generated, mesocarp had the highest frequency, 2.91 SNPs and indels per 100 bp, whereas the zygotic embryos had the lowest frequency, 0.15 per 100 bp. We also used the Shannon index to analyze the proportion of the ten possible types of SNP/indel. ESTs from tissues of normal apex showed the highest value of the Shannon index (0.60), whereas abnormal apex had the lowest value (0.02). The present report deals with the use of the Shannon index for comparing SNP/indel frequencies mined from EST libraries and also confirms that SNPs occur in oil palm at a frequency sufficient for their use as markers in genetic studies. PMID:21670789
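
    The two summary quantities used in the abstract above, a per-100-bp polymorphism frequency and a Shannon index over polymorphism types, can be sketched as below. This is a hedged illustration only: the abstract does not state the logarithm base or the exact category definitions, so the natural logarithm default, the function names `shannon_index` and `per_100_bp`, and the example counts are assumptions.

```python
import math

def shannon_index(counts, base=math.e):
    """Shannon index H = -sum(p_i * log(p_i)) over categories with non-zero counts.

    The log base is an assumption; the source abstract does not specify it.
    """
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)

def per_100_bp(n_sites, total_bp):
    """Polymorphic sites per 100 bp of aligned sequence."""
    return 100.0 * n_sites / total_bp

# Made-up counts for ten SNP/indel types in one hypothetical tissue library.
type_counts = [40, 35, 10, 5, 4, 3, 1, 1, 1, 0]
print(round(shannon_index(type_counts), 3))
print(per_100_bp(3, 200))
```

    A library dominated by a single polymorphism type gives an index near zero, while an even spread across types gives a higher value, which is the sense in which the index compares tissues.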

  16. Serial number tagging reveals a prominent sequence preference of retrotransposon integration.

    PubMed

    Chatterjee, Atreyi Ghatak; Esnault, Caroline; Guo, Yabin; Hung, Stevephen; McQueen, Philip G; Levin, Henry L

    2014-07-01

    Transposable elements (TE) have both negative and positive impact on the biology of their host. As a result, a balance is struck between the host and the TE that relies on directing integration to specific genome territories. The extraordinary capacity of DNA sequencing can create ultra dense maps of integration that are being used to study the mechanisms that position integration. Unfortunately, the great increase in the numbers of insertion sites detected comes with the cost of not knowing which positions are rare targets and which sustain high numbers of insertions. To address this problem we developed the serial number system, a TE tagging method that measures the frequency of integration at single nucleotide positions. We sequenced 1 million insertions of retrotransposon Tf1 in the genome of Schizosaccharomyces pombe and obtained the first profile of integration with frequencies for each individual position. Integration levels at individual nucleotides varied over two orders of magnitude and revealed that sequence recognition plays a key role in positioning integration. The serial number system is a general method that can be applied to determine precise integration maps for retroviruses and gene therapy vectors. PMID:24948612
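
    The idea of measuring integration frequency at single-nucleotide positions can be sketched as a counting step. This is an assumption-laden illustration rather than the authors' pipeline: it assumes each independent insertion event carries its own random serial-number tag, so that reads sharing both a position and a tag are treated as copies of one event. The function name `integration_frequencies` and the read tuples are hypothetical.

```python
from collections import defaultdict

def integration_frequencies(reads):
    """Count independent insertions per site.

    reads: iterable of (chrom, position, strand, serial) tuples, one per sequence read.
    Reads with the same site and the same serial tag are counted once.
    """
    serials = defaultdict(set)
    for chrom, pos, strand, serial in reads:
        serials[(chrom, pos, strand)].add(serial)
    return {site: len(tags) for site, tags in serials.items()}

# Made-up reads: three reads at one site carry two distinct tags (two events);
# a second site has one read (one event).
reads = [
    ("chr1", 1000, "+", "ACGTACGT"),
    ("chr1", 1000, "+", "ACGTACGT"),  # duplicate read of the same event
    ("chr1", 1000, "+", "TTGACCAA"),
    ("chr2", 52, "-", "GGATCCAA"),
]
print(integration_frequencies(reads))
```

    Counting distinct tags instead of raw reads is what separates frequently targeted positions from positions that merely produced many copies of one event.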

  17. Expressed sequence tags from the red imported fire ant, Solenopsis invicta: annotation and utilization for discovery of viruses.

    PubMed

    Valles, Steven M; Strong, Charles A; Hunter, Wayne B; Dang, Phat M; Pereira, Roberto M; Oi, David H; Williams, David F

    2008-09-01

    An expression library was created and 2304 clones sequenced from a monogyne colony of Solenopsis invicta. The primary intention of the project was to utilize homologous gene identification to facilitate discovery of viruses infecting this ant pest that could potentially be used in pest management. Additional genes were identified from the ant host and associated pathogens that serve as an important resource for studying these organisms. After assembly and removal of mitochondrial and poor quality sequences, 1054 unique sequences were obtained and deposited into the GenBank database under Accession Nos. EH412746 through EH413799. At least nine expressed sequence tags (ESTs) were identified as possessing microsatellite motifs and 15 ESTs exhibited significant homology with microsporidian genes. These sequences most likely originated from Thelohania solenopsae, a well-characterized microsporidian that infects S. invicta. Six ESTs exhibited significant homology with single-stranded RNA viruses (3B4, 3F6, 11F1, 12G12, 14D5, and 24C10). Subsequent analysis of these putative viral ESTs revealed that 3B4 was most likely a ribosomal gene of S. invicta, 11F1 was a single-stranded RNA (ssRNA) virus contaminant introduced into the colony from the cricket food source, 12G12 appeared to be a plant-infecting tenuivirus also introduced into the colony as a field contaminant, and 3F6, 14D5, and 24C10 were all from a unique ssRNA virus found to infect S. invicta. The sequencing project illustrates the utility of this method for discovery of viruses and pathogens that may otherwise go undiscovered. PMID:18329665

  18. GST-PRIME: a genome-wide primer design software for the generation of gene sequence tags.

    PubMed

    Varotto, C; Richly, E; Salamini, F; Leister, D

    2001-11-01

    The availability of sequenced genomes has generated a need for experimental approaches that allow the simultaneous analysis of large, or even complete, sets of genes. To facilitate such analyses, we have developed GST-PRIME, a software package for retrieving and assembling gene sequences, even from complex genomes, using the NCBI public database, and then designing sets of primer pairs for use in gene amplification. Primers were designed by the program for the direct amplification of gene sequence tags (GSTs) from either genomic DNA or cDNA. Test runs of GST-PRIME on 2000 randomly selected Arabidopsis and Drosophila genes demonstrate that 93 and 88% of resulting GSTs, respectively, fulfilled imposed length criteria. GST-PRIME primer pairs were tested on a set of 1900 Arabidopsis genes coding for chloroplast-targeted proteins: 95% of the primer pairs used in PCRs with genomic DNA generated the correct amplicons. GST-PRIME can thus be reliably used for large-scale or specific amplification of intron-containing genes of multicellular eukaryotes. PMID:11691924

  19. GST-PRIME: a genome-wide primer design software for the generation of gene sequence tags

    PubMed Central

    Varotto, Claudio; Richly, Erik; Salamini, Francesco; Leister, Dario

    2001-01-01

    The availability of sequenced genomes has generated a need for experimental approaches that allow the simultaneous analysis of large, or even complete, sets of genes. To facilitate such analyses, we have developed GST-PRIME, a software package for retrieving and assembling gene sequences, even from complex genomes, using the NCBI public database, and then designing sets of primer pairs for use in gene amplification. Primers were designed by the program for the direct amplification of gene sequence tags (GSTs) from either genomic DNA or cDNA. Test runs of GST-PRIME on 2000 randomly selected Arabidopsis and Drosophila genes demonstrate that 93 and 88% of resulting GSTs, respectively, fulfilled imposed length criteria. GST-PRIME primer pairs were tested on a set of 1900 Arabidopsis genes coding for chloroplast-targeted proteins: 95% of the primer pairs used in PCRs with genomic DNA generated the correct amplicons. GST-PRIME can thus be reliably used for large-scale or specific amplification of intron-containing genes of multicellular eukaryotes. PMID:11691924

  20. Proteomic analysis of Trypanosoma cruzi developmental stages using isotope-coded affinity tag reagents.

    PubMed

    Paba, Jaime; Ricart, Carlos A O; Fontes, Wagner; Santana, Jaime M; Teixeira, Antonio R L; Marchese, Jason; Williamson, Brian; Hunt, Tony; Karger, Barry L; Sousa, Marcelo V

    2004-01-01

    Comparative proteome analysis of developmental stages of the human pathogen Trypanosoma cruzi was carried out by isotope-coded affinity tag technology (ICAT) associated with liquid chromatography-mass spectrometry peptide sequencing (LC-MS/MS). Protein extracts of the protozoan trypomastigote and amastigote stages were labeled with heavy (D8) and light (D0) ICAT reagents and subjected to cation exchange and avidin affinity chromatographies followed by LC-MS/MS analysis. High confidence sequence information and expression levels for 41 T. cruzi polypeptides, including metabolic enzymes, paraflagellar rod components, tubulins, and heat-shock proteins were reported. Twenty-nine proteins displayed similar levels of expression in both forms of the parasite, nine proteins presented higher levels in trypomastigotes, whereas three were more expressed in amastigotes. PMID:15253433

  1. Mapping of Heterologous Expressed Sequence Tags as an Alternative to Microarrays for Study of Defense Responses in Plants

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this study, we used publicly available EST (expressed sequence tags) database derived from four different plant species infected with a variety of pathogens, to generate an expression profile of orthologous genes involved in defense response of a model organism, Arabidopsis thaliana. Computer-ass...

  2. The HaloTag: Improving Soluble Expression and Applications in Protein Functional Analysis.

    PubMed

    N Peterson, Scott; Kwon, Keehwan

    2012-01-01

    Technological and methodological advances have been critical for the rapidly evolving field of proteomics. The development of fusion tag systems is essential for purification and analysis of recombinant proteins. The HaloTag is a 34 kDa monomeric protein derived from a bacterial haloalkane dehalogenase. The majority of fusion tags in use today utilize a reversible binding interaction with a specific ligand. The HaloTag system is unique in that it forms a covalent linkage to its chloroalkane ligand. This linkage permits attachment of the HaloTag to a variety of functional reporters, which can be used to label and immobilize recombinant proteins. The success rate for HaloTag expression of soluble proteins is very high and comparable to that of the maltose binding protein (MBP) tag. Furthermore, cleavage of the HaloTag does not result in the protein insolubility that is often observed with the MBP tag. In the present report, we describe applications of the HaloTag system in our ongoing investigation of protein-protein interactions of the Y. pestis Type 3 secretion system on a custom protein microarray. We also describe the utilization of affinity purification/mass spectroscopy (AP/MS) to evaluate the utility of the HaloTag system to characterize DNA binding activity and protein specificity. PMID:23115610

  3. The HaloTag: Improving Soluble Expression and Applications in Protein Functional Analysis

    PubMed Central

    Peterson, Scott N; Kwon, Keehwan

    2012-01-01

    Technological and methodological advances have been critical for the rapidly evolving field of proteomics. The development of fusion tag systems is essential for purification and analysis of recombinant proteins. The HaloTag is a 34 kDa monomeric protein derived from a bacterial haloalkane dehalogenase. The majority of fusion tags in use today utilize a reversible binding interaction with a specific ligand. The HaloTag system is unique in that it forms a covalent linkage to its chloroalkane ligand. This linkage permits attachment of the HaloTag to a variety of functional reporters, which can be used to label and immobilize recombinant proteins. The success rate for HaloTag expression of soluble proteins is very high and comparable to that of the maltose binding protein (MBP) tag. Furthermore, cleavage of the HaloTag does not result in the protein insolubility that is often observed with the MBP tag. In the present report, we describe applications of the HaloTag system in our ongoing investigation of protein-protein interactions of the Y. pestis Type 3 secretion system on a custom protein microarray. We also describe the utilization of affinity purification/mass spectrometry (AP/MS) to evaluate the utility of the HaloTag system to characterize DNA binding activity and protein specificity. PMID:23115610

  4. Construction of a Lotus japonicus late nodulin expressed sequence tag library and identification of novel nodule-specific genes.

    PubMed Central

    Szczyglowski, K; Hamburger, D; Kapranov, P; de Bruijn, F J

    1997-01-01

    A range of novel expressed sequence tags (ESTs) associated with late developmental events during nodule organogenesis in the legume Lotus japonicus were identified using mRNA differential display; 110 differentially displayed polymerase chain reaction products were cloned and analyzed. Of 88 unique cDNAs obtained, 22 shared significant homology to DNA/protein sequences in the respective databases. This group comprises, among others, a nodule-specific homolog of protein phosphatase 2C, a peptide transporter protein, and a nodule-specific form of cytochrome P450. RNA gel-blot analysis of 16 differentially displayed ESTs confirmed their nodule-specific expression pattern. The kinetics of mRNA accumulation of the majority of the ESTs analyzed were found to resemble the expression pattern observed for the L. japonicus leghemoglobin gene. These results indicate that the newly isolated molecular markers correspond to genes induced during late developmental stages of L. japonicus nodule organogenesis and provide important, novel tools for the study of nodulation. PMID:9276951

  5. Myocardial motion estimation in tagged MR sequences by using alphaMI-based non rigid registration.

    PubMed

    Oubel, E; Tobon-Gomez, C; Hero, A O; Frangi, A F

    2005-01-01

    Tagged Magnetic Resonance Imaging (MRI) is currently the reference MR modality for myocardial motion and strain analysis. NMI-based non rigid registration has proven to be an accurate method to retrieve cardiac deformation fields. The use of alphaMI permits higher dimensional features to be implemented in myocardial deformation estimation through image registration. This paper demonstrates that this is feasible with a set of Haar wavelet features of high dimension. While we do not demonstrate performance improvement for this set of features, there is no significant degradation as compared to implementing the registration method with the traditional NMI metric. We use Entropic Spanning Graphs (ESGs) to estimate the alphaMI of the wavelet feature vectors WFVs since this is not possible with histograms. To the best of our knowledge, this is the first time that ESGs are used for non rigid registration. PMID:16685969

  6. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform

    PubMed Central

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-01-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length at gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled in equimolar amounts and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus. PMID:24571307

  7. Sequence analysis on microcomputers.

    PubMed

    Cannon, G C

    1987-10-01

    Overall, each of the program packages performed their tasks satisfactorily. For analyses where there was a well-defined answer, such as a search for a restriction site, there were few significant differences between the program sets. However, for tasks in which a degree of flexibility is desirable, such as homology or similarity determinations and database searches, DNASTAR consistently afforded the user more options in conducting the required analysis than did the other two packages. However, for laboratories where sequence analysis is not a major effort and the expense of a full sequence analysis workstation cannot be justified, MicroGenie and IBI-Pustell offer a satisfactory alternative. MicroGenie is a polished program system. Many may find that its user interface is more "user friendly" than the standard menu-driven interfaces. Its system of filing sequences under individual passwords facilitates use by more than one person. MicroGenie uses a hardware device for software protection that occupies a card slot in the computer on which it is used. Although I am sympathetic to the problem of software piracy, I feel that a less drastic solution is in order for a program likely to be sharing limited computer space with other software packages. The IBI-Pustell package performs the required analysis functions as accurately and quickly as MicroGenie but it lacks the clearness and ease of use. The menu system seems disjointed, and new or infrequent users often find themselves at apparent "dead-end menus" where the only clear alternative is to restart the entire program package. It is suggested from published accounts that the user interface is going to be upgraded and perhaps when that version is available, use of the system will be improved. The documentation accompanying each package was relatively clear as to how to run the programs, but all three packages assumed that the user was familiar with the computational techniques employed. 
MicroGenie and IBI-Pustell further

  8. The non-coding RNA composition of the mitotic chromosome by 5′-tag sequencing

    PubMed Central

    Meng, Yicong; Yi, Xianfu; Li, Xinhui; Hu, Chuansheng; Wang, Ju; Bai, Ling; Czajkowsky, Daniel M.; Shao, Zhifeng

    2016-01-01

    Mitotic chromosomes are one of the most commonly recognized sub-cellular structures in eukaryotic cells. Yet basic information necessary to understand their structure and assembly, such as their composition, is still lacking. Recent proteomic studies have begun to fill this void, identifying hundreds of RNA-binding proteins bound to mitotic chromosomes. However, by contrast, there are only two RNA species (U3 snRNA and rRNA) that are known to be associated with the mitotic chromosome, suggesting that there are many mitotic chromosome-associated RNAs (mCARs) not yet identified. Here, using a targeted protocol based on 5′-tag sequencing to profile the mammalian mCAR population, we report the identification of 1279 mCARs, the majority of which are ncRNAs, including lncRNAs that exhibit greater conservation across 60 vertebrate species than the entire population of lncRNAs. There is also a significant enrichment of snoRNAs and specific SINE RNAs. Finally, ∼40% of the mCARs are presently unannotated, many of which are as abundant as the annotated mCARs, suggesting that there are also many novel ncRNAs in the mCARs. Overall, the mCARs identified here, together with the previous proteomic and genomic data, constitute the first comprehensive catalogue of the molecular composition of the eukaryotic mitotic chromosomes. PMID:27016738

  9. Isolation of expressed sequence tags of Agaricus bisporus and their assignment to chromosomes.

    PubMed Central

    Sonnenberg, A S; de Groot, P W; Schaap, P J; Baars, J J; Visser, J; Van Griensven, L J

    1996-01-01

    The genome of the cultivated basidiomycete Agaricus bisporus Horst U1 and of its homokaryotic parents has been characterized by using an optimized method of pulsed-field gel electrophoresis. Expressed sequence tags obtained as expressed cDNAs from a primordial tissue-derived cDNA library and a number of previously isolated genes were used to identify the individual chromosomes of the parental lines of Horst U1. The genome consists of 13 chromosomes, and its total size is 31 Mb. For those chromosomes that could not be resolved by contour-clamped homogeneous electric field electrophoresis, the segregation of marker genes was studied in a set of 86 homokaryotic offspring of Horst U1. At least two markers were assigned to each individual chromosome. In this way all individual chromosomes were unequivocally identified. The large size difference observed between the homologous chromosomes IX, harboring the rDNA repeat, was shown to be largely due to a higher copy number of rDNA in parental strain H97 than in parental strain H39. PMID:8953726

  10. A new method to identify flanking sequence tags in chlamydomonas using 3’-RACE

    PubMed Central

    2012-01-01

    Background The green alga Chlamydomonas reinhardtii, although a premier model organism in biology, still lacks extensive insertion mutant libraries with well-identified Flanking Sequence Tags (FSTs). Rapid and efficient methods are needed for FST retrieval. Results Here, we present a novel method to identify FSTs in insertional mutants of Chlamydomonas. Transformants can be obtained with a resistance cassette lacking a 3’ untranslated region (UTR), suggesting that the RNA that is produced from the resistance marker terminates in the flanking genome when it encounters a cleavage/polyadenylation signal. We have used a robust 3’-RACE method to specifically amplify such chimeric cDNAs. Out of 38 randomly chosen transformants, 27 (71%) yielded valid FSTs, of which 23 could be unambiguously mapped to the genome. Eighteen of the mutants lie within a predicted gene. All but two of the intragenic insertions occur in the sense orientation with respect to transcription, suggesting a bias against situations of convergent transcription. Among the 14 insertion sites tested by genomic PCR, 12 could be confirmed. Among these are insertions in genes coding for PSBS3 (possibly involved in non-photochemical quenching), the NimA-related protein kinase CNK2, the mono-dehydroascorbate reductase MDAR1, and the phosphoglycerate mutase PGM5, among others. Conclusion We propose that our 3’-RACE FST method can be used to build large scale FST libraries in Chlamydomonas and other transformable organisms. PMID:22735168

  11. The non-coding RNA composition of the mitotic chromosome by 5'-tag sequencing.

    PubMed

    Meng, Yicong; Yi, Xianfu; Li, Xinhui; Hu, Chuansheng; Wang, Ju; Bai, Ling; Czajkowsky, Daniel M; Shao, Zhifeng

    2016-06-01

    Mitotic chromosomes are one of the most commonly recognized sub-cellular structures in eukaryotic cells. Yet basic information necessary to understand their structure and assembly, such as their composition, is still lacking. Recent proteomic studies have begun to fill this void, identifying hundreds of RNA-binding proteins bound to mitotic chromosomes. However, by contrast, there are only two RNA species (U3 snRNA and rRNA) that are known to be associated with the mitotic chromosome, suggesting that there are many mitotic chromosome-associated RNAs (mCARs) not yet identified. Here, using a targeted protocol based on 5'-tag sequencing to profile the mammalian mCAR population, we report the identification of 1279 mCARs, the majority of which are ncRNAs, including lncRNAs that exhibit greater conservation across 60 vertebrate species than the entire population of lncRNAs. There is also a significant enrichment of snoRNAs and specific SINE RNAs. Finally, ∼40% of the mCARs are presently unannotated, many of which are as abundant as the annotated mCARs, suggesting that there are also many novel ncRNAs in the mCARs. Overall, the mCARs identified here, together with the previous proteomic and genomic data, constitute the first comprehensive catalogue of the molecular composition of the eukaryotic mitotic chromosomes. PMID:27016738

  12. ISHAN: sequence homology analysis package.

    PubMed

    Shil, Pratip; Dudani, Niraj; Vidyasagar, Pandit B

    2006-01-01

    Sequence based homology studies play an important role in evolutionary tracing and classification of proteins. Various methods are available to analyze biological sequence information. However, with the advent of proteomics era, there is a growing demand for analysis of huge amount of biological sequence information, and it has become necessary to have programs that would provide speedy analysis. ISHAN has been developed as a homology analysis package, built on various sequence analysis tools viz FASTA, ALIGN, CLUSTALW, PHYLIP and CODONW (for DNA sequences). This JAVA application offers the user choice of analysis tools. For testing, ISHAN was applied to perform phylogenetic analysis for sets of Caspase 3 DNA sequences and NF-kappaB p105 amino acid sequences. By integrating several tools it has made analysis much faster and reduced manual intervention. PMID:17274766

  13. Rediscovering Medicinal Plants' Potential with OMICS: Microsatellite Survey in Expressed Sequence Tags of Eleven Traditional Plants with Potent Antidiabetic Properties

    PubMed Central

    Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar

    2014-01-01

    Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs has not been fully utilized in large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8,060 SSR-positive ESTs and putative functions were assigned to 4,483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to

  14. Image analysis methods for tagged MRI cardiac studies

    NASA Astrophysics Data System (ADS)

    Guttman, Michael A.; Prince, Jerry L.

    1990-07-01

    Tracking of magnetic resonance (MR) tags in myocardial tissue promises to be an effective tool in the assessment of myocardial motion. The amount of data acquired is very large, and the measurements are numerous and must be precise, requiring automated tracking methods. We describe a hierarchy of image processing steps that estimate both the endocardial and epicardial boundaries of the left ventricle and also estimate the spines of radial tags that emanate outward from the left ventricular cavity. The first stage determines the position of the myocardial boundaries for each of 128 rays emanating from the origin. To counter the deleterious effects of noise and the presence of the tags when determining the boundary positions, we use nonlinear filtering concepts from mathematical morphology together with a priori knowledge related to boundary smoothness to improve the estimates. The second stage estimates the tag spines by matching a template in a direction orthogonal to the expected tag direction. We show results on tagged images and discuss further research directions.

  15. Chromosome-specific physical localisation of expressed sequence tag loci in Corchorus olitorius L.

    PubMed

    Joshi, A; Das, S K; Samanta, P; Paria, P; Sen, S K; Basu, A

    2014-11-01

    Jute (Corchorus spp.), as a natural fibre-producing species, ranks next only to cotton. Inadequate understanding of its genetic architecture is a major lacuna for genetic improvement of this crop in terms of yield and quality. Establishment of a physical map provides a genomic tool that helps in positional cloning of valuable genes. In this report, an attempt was initiated to study association and localisation of single copy expressed sequence tag (EST) loci in the genome of Corchorus olitorius. The chromosome-specific association of EST was determined based on the appearance of an extra signal for a single copy cDNA probe in mitotic interphase nuclei of specific trisomic(s) for fluorescence in situ hybridisation, and validated using a cDNA fragment of the 26S rRNA gene (600 bp) as molecular probe. The probe exhibited three signals in meiotic interphase nuclei of trisomic 5, instead of two as observed in diploids and other trisomics, indicating its association with chromosome 5. Subsequent hybridisation of the same probe on the pachytene chromosomes of diploids confirmed that 26S rRNA occupies the terminal end of the short arm of chromosome 5 in C. olitorius. Subsequently, chromosome-specific association of 63 single copy EST and their physical localisation were determined on chromosomes 2, 4, 5 and 7. The study describes chromosome-specific physical localisation of genes in jute. The approach used here could be a step towards construction of genome-wide physical maps for any recalcitrant plant species like jute. PMID:24628982

  16. Transmural Myocardial Strain in Mouse: Quantification of High-Resolution MR Tagging using HARP Analysis

    PubMed Central

    Zhong, Jia; Liu, Wei; Yu, Xin

    2009-01-01

    MR tagging allows noninvasive examination of regional myocardial function with high accuracy and reproducibility. Current tagging method is limited by low tagging resolution for accurate transmural strain quantification. Previously, a SPAMM-based method was proposed to increase the tagging resolution by combining two or more tagged images with different tagging grid positions. However, there has been limited application due to the challenge in image processing of multiple data sets. In the current study, we propose a HARP-based method for automated and fast analysis of high-resolution tagged images. First-order harmonic peaks from low tagging resolution images were combined to generate the composite second-order harmonic peak for strain computation. The combined images reached a tagging resolution of 0.3 mm. The proposed method was applied to the quantification of transmural myocardial wall strain in 7 normal C57BL/6 mice. Principal strains, as well as radial and circumferential strains, were quantified using the current method. PMID:19319888

  17. Twin Mitochondrial Sequence Analysis.

    PubMed

    Bouhlal, Yosr; Martinez, Selena; Gong, Henry; Dumas, Kevin; Shieh, Joseph T C

    2013-09-01

    When applying genome-wide sequencing technologies to disease investigation, it is increasingly important to resolve sequence variation in regions of the genome that may have homologous sequences. The human mitochondrial genome challenges interpretation given the potential for heteroplasmy, somatic variation, and homologous nuclear mitochondrial sequences (numts). Identical twins share the same mitochondrial DNA (mtDNA) from early life, but whether the mitochondrial sequence remains similar is unclear. We compared an adult monozygotic twin pair using high throughput-sequencing and evaluated variants with primer extension and mitochondrial pre-enrichment. Thirty-seven variants were shared between the twin individuals, and the variants were verified on the original genomic DNA. These studies support highly identical genetic sequence in this case. Certain low-level variant calls were of high quality and homology to the mitochondrial DNA, and they were further evaluated. When we assessed calls in pre-enriched mitochondrial DNA templates, we found that these may represent numts, which can be differentiated from mtDNA variation. We conclude that twin identity extends to mitochondrial DNA, and it is critical to differentiate between numts and mtDNA in genome sequencing, particularly since significant heteroplasmy could influence genome interpretation. Further studies on mtDNA and numts will aid in understanding how variation occurs and persists. PMID:24040623

  18. Regional localisation of 19 brain expressed sequence tags to human chromosome 11 using PCR amplification of somatic cell hybrid DNAs.

    PubMed

    Slorach, E M; Polymeropoulos, M H; Evans, K L; Seawright, A; Fletcher, J M; Porteous, D J; Brookes, A J

    1995-01-01

    Expressed sequence tags (ESTs) provide an efficient route to the identification of genes involved in normal development and in disease. PCR amplification of somatic cell hybrid DNAs was used to localise 22 brain-derived ESTs to subregions of human chromosome 11. Problems encountered with the standardised PCR conditions were overcome by optimising the annealing temperatures and the use of "touchdown" PCR. Amplification of the correct target sequence allowed the mapping of 19 ESTs, 8 to the short arm and 11 to the long arm of chromosome 11. No definitive localisation could be determined for the three remaining ESTs. PMID:7736794

  19. A direct method for regiospecific analysis of TAG using alpha-MAG.

    PubMed

    Turon, F; Bachain, P; Caro, Y; Pina, M; Graille, J

    2002-08-01

    An analytical procedure was developed for regiodistribution analysis of TAG using alpha-MAG prepared by an ethyl magnesium bromide deacylation. In the present communication, the deacylation procedure is shown to lead to representative alpha-MAG, allowing the composition of the native TAG in the alpha-position to be determined directly. The composition in the beta-position can then be estimated from the composition of the alpha-MAG and TAG according to the formula 3 x TAG - 2 x alpha-MAG. The estimates are superior to those obtained using the alpha,beta-DAG and Brockerhoff calculations as they come closer to the theoretical value and have smaller SD. The present procedure, first demonstrated on a synthetic TAG, was then successfully applied to the analysis of borage oil, milkfat, and tuna oil. PMID:12371754
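
    The formula quoted above follows from a mass balance: if each fatty acid's share in the intact TAG is the mean over the three glycerol positions, and its share in the alpha-MAG is the mean over the two outer positions, then the beta share is 3 x TAG - 2 x alpha-MAG. A minimal Python sketch of that arithmetic is given below. The function name and the example composition are hypothetical, invented only for illustration; they are not data from the paper.

```python
def beta_position_composition(tag, alpha_mag):
    """Estimate the beta-position composition (mol%) per fatty acid.

    tag:       mol% of each fatty acid in the intact TAG
    alpha_mag: mol% of each fatty acid in the alpha-MAG fraction
    Applies beta = 3 * TAG - 2 * alpha-MAG to every fatty acid in `tag`.
    """
    return {fa: 3 * tag[fa] - 2 * alpha_mag.get(fa, 0.0) for fa in tag}


# Hypothetical composition, made up for illustration (not measured values).
tag = {"16:0": 30.0, "18:1": 50.0, "18:2": 20.0}
alpha_mag = {"16:0": 40.0, "18:1": 40.0, "18:2": 20.0}

beta = beta_position_composition(tag, alpha_mag)
for fatty_acid, mol_percent in beta.items():
    print(f"{fatty_acid}: {mol_percent:.1f} mol% in the beta position")
```

    Because the estimate is a difference of two measured quantities, small errors in either input are amplified, and a negative result for a minor fatty acid would signal measurement noise rather than a real composition.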

  20. Chasing migration genes: a brain expressed sequence tag resource for summer and migratory monarch butterflies (Danaus plexippus).

    PubMed

    Zhu, Haisun; Casselman, Amy; Reppert, Steven M

    2008-01-01

    North American monarch butterflies (Danaus plexippus) undergo a spectacular fall migration. In contrast to summer butterflies, migrants are juvenile hormone (JH) deficient, which leads to reproductive diapause and increased longevity. Migrants also utilize time-compensated sun compass orientation to help them navigate to their overwintering grounds. Here, we describe a brain expressed sequence tag (EST) resource to identify genes involved in migratory behaviors. A brain EST library was constructed from summer and migrating butterflies. Of 9,484 unique sequences, 6,068 had positive hits with the non-redundant protein database; the EST database likely represents approximately 52% of the gene-encoding potential of the monarch genome. The brain transcriptome was cataloged using Gene Ontology and compared to Drosophila. Monarch genes were well represented, including those implicated in behavior. Three genes involved in increased JH activity (allatotropin, juvenile hormone acid methyltransferase, and takeout) were upregulated in summer butterflies, compared to migrants. The locomotion-relevant turtle gene was marginally upregulated in migrants, while the foraging and single-minded genes were not differentially regulated. Many of the genes important for the monarch circadian clock mechanism (involved in sun compass orientation) were in the EST resource, including the newly identified cryptochrome 2. The EST database also revealed a novel Na+/K+ ATPase allele predicted to be more resistant to the toxic effects of milkweed than that reported previously. Potential genetic markers were identified from 3,486 EST contigs and included 1,599 double-hit single nucleotide polymorphisms (SNPs) and 98 microsatellite polymorphisms. These data provide a template of the brain transcriptome for the monarch butterfly. Our "snap-shot" analysis of the differential regulation of candidate genes between summer and migratory butterflies suggests that unbiased, comprehensive transcriptional

  1. Identification of Disulfide Bonds in Protein Proteolytic Degradation Products Using de Novo-Protein Unique Sequence Tags Approach

    SciTech Connect

    Shen, Yufeng; Tolic, Nikola; Purvine, Samuel O.; Smith, Richard D.

    2010-08-01

    Disulfide bonds are a form of posttranslational modification that often determines protein structure(s) and function(s). In this work, we report a mass spectrometry method for identification of disulfides in degradation products of proteins, and specifically endogenous peptides in the human blood plasma peptidome. LC-Fourier transform tandem mass spectrometry (FT MS/MS) was used for acquiring mass spectra that were de novo sequenced and then searched against the IPI human protein database. Through the use of unique sequence tags (UStags) we unambiguously correlated the spectra to specific database proteins. Examination of the UStags’ prefix and/or suffix sequences that contain cysteine(s) in conjunction with sequences of the UStags-specified database proteins is shown to enable the unambiguous determination of disulfide bonds. Using this method, we identified the intermolecular and intramolecular disulfides in human blood plasma peptidome peptides that have molecular weights of up to ~10 kDa.

  2. Identification of disulfide bonds in protein proteolytic degradation products using de novo-protein unique sequence tags approach.

    PubMed

    Shen, Yufeng; Tolić, Nikola; Purvine, Samuel O; Smith, Richard D

    2010-08-01

    Disulfide bonds are a form of post-translational modification that often determines protein structure(s) and function(s). In this work, we report a mass spectrometry method for identification of disulfides in degradation products of proteins, specifically endogenous peptides in the human blood plasma peptidome. LC-Fourier transform tandem mass spectrometry (FT MS/MS) was used for acquiring mass spectra that were de novo sequenced and then searched against the IPI human protein database. Through the use of unique sequence tags (UStags), we unambiguously correlated the spectra to specific database proteins. Examination of the UStags' prefix and/or suffix sequences that contain cysteine(s) in conjunction with sequences of the UStags-specified database proteins is shown to enable the unambiguous determination of disulfide bonds. Using this method, we identified the intermolecular and intramolecular disulfides in human blood plasma peptidome peptides that have molecular weights of up to approximately 10 kDa. PMID:20590115

  3. Expressed sequence tags (ESTs) from immune tissues of turbot (Scophthalmus maximus) challenged with pathogens

    PubMed Central

    Pardo, Belén G; Fernández, Carlos; Millán, Adrián; Bouza, Carmen; Vázquez-López, Araceli; Vera, Manuel; Alvarez-Dios, José A; Calaza, Manuel; Gómez-Tato, Antonio; Vázquez, María; Cabaleiro, Santiago; Magariños, Beatriz; Lemos, Manuel L; Leiro, José M; Martínez, Paulino

    2008-01-01

    Background The turbot (Scophthalmus maximus; Scophthalmidae; Pleuronectiformes) is a flatfish species of great relevance for marine aquaculture in Europe. In contrast to other cultured flatfish, very few genomic resources are available in this species. Aeromonas salmonicida and Philasterides dicentrarchi are two pathogens that affect turbot culture causing serious economic losses to the turbot industry. Little is known about the molecular mechanisms for disease resistance and host-pathogen interactions in this species. In this work, thousands of ESTs for functional genomic studies and potential markers linked to ESTs for mapping (microsatellites and single nucleotide polymorphisms (SNPs)) are provided. This information enabled us to obtain a preliminary view of regulated genes in response to these pathogens and it constitutes the basis for subsequent and more accurate microarray analysis. Results A total of 12584 cDNAs partially sequenced from three different cDNA libraries of turbot (Scophthalmus maximus) infected with Aeromonas salmonicida, Philasterides dicentrarchi and from healthy fish were analyzed. Three immune-relevant tissues (liver, spleen and head kidney) were sampled at several time points in the infection process for library construction. The sequences were processed into 9256 high-quality sequences, which constituted the source for the turbot EST database. Clustering and assembly of these sequences, revealed 3482 different putative transcripts, 1073 contigs and 2409 singletons. BLAST searches with public databases detected significant similarity (e-value ≤ 1e-5) in 1766 (50.7%) sequences and 816 of them (23.4%) could be functionally annotated. Two hundred three of these genes (24.9%), encoding for defence/immune-related proteins, were mostly identified for the first time in turbot. Some ESTs showed significant differences in the number of transcripts when comparing the three libraries, suggesting regulation in response to these pathogens. 
A total of

  4. DSAP: deep-sequencing small RNA analysis pipeline.

    PubMed

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw. PMID:20478825
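
    The first two steps above (cleanup and clustering) can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not DSAP's own code: the adaptor sequence, the homopolymer and minimum-length thresholds, and the function names are all hypothetical, and the input is assumed to be lines of `sequence<TAB>count`.

```python
import re
from collections import Counter

# Hypothetical 3' adaptor prefix; the real one depends on the library kit.
ADAPTOR = "TCGTATGCCG"
# A trailing homopolymer run (poly-A/T/C/G/N) of 8+ bases; threshold is assumed.
POLY_TAIL = re.compile(r"(A{8,}|T{8,}|C{8,}|G{8,}|N{8,})$")


def clean_tag(seq, min_len=15):
    """Trim the adaptor and a trailing homopolymer; drop tags that end up too short."""
    seq = seq.upper()
    pos = seq.find(ADAPTOR)
    if pos != -1:
        seq = seq[:pos]
    seq = POLY_TAIL.sub("", seq)
    return seq if len(seq) >= min_len else None


def cluster_tags(lines):
    """Group cleaned tags into unique sequences and sum their copy numbers."""
    clusters = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq, count = line.split("\t")
        cleaned = clean_tag(seq)
        if cleaned is not None:
            clusters[cleaned] += int(count)
    return clusters
```

    Two raw tags that differ only by adaptor read-through collapse into one cluster whose count is the sum of both, which is the quantity the later matching steps would report as an expression level.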

  5. Development and Characterization of 1,827 Expressed Sequence Tag-Derived Simple Sequence Repeat Markers for Ramie (Boehmeria nivea L. Gaud)

    PubMed Central

    Liu, Touming; Zhu, Siyuan; Fu, Lili; Tang, Qingming; Yu, Yongting; Chen, Ping; Luan, Mingbao; Wang, Changbiao; Tang, Shouwei

    2013-01-01

    Ramie (Boehmeria nivea L. Gaud) is one of the most important natural fiber crops, and improvement of fiber yield and quality is the main goal in efforts to breed superior cultivars. However, efforts aimed at enhancing the understanding of ramie genetics and developing more effective breeding strategies have been hampered by the shortage of simple sequence repeat (SSR) markers. In our previous study, we had assembled de novo 43,990 expressed sequence tags (ESTs). In the present study, we searched these previously assembled ESTs for SSRs and identified 1,685 ESTs (3.83%) containing 1,878 SSRs. Next, we designed 1,827 primer pairs complementary to regions flanking these SSRs, and these regions were designated as SSR markers. Among these markers, dinucleotide and trinucleotide repeat motifs were the most abundant types (36.4% and 36.3%, respectively), whereas tetranucleotide, pentanucleotide, and hexanucleotide motifs represented <10% of the markers. The motif AG/CT was the most abundant, accounting for 28.74% of the markers. One hundred EST-SSR markers (97 SSRs located in genes encoding transcription factors and 3 SSRs in genes encoding cellulose synthases) were amplified using polymerase chain reaction for detecting 24 ramie varieties. Of these 100 markers, 98 markers were successfully amplified and 81 markers were polymorphic, with 2–6 alleles among the 24 varieties. Analysis of the genetic diversity of all 24 varieties revealed similarity coefficients that ranged from 0.51 to 0.80. The EST-SSRs developed in this study represent the first large-scale development of SSR markers for ramie. These SSR markers could be used for development of genetic and physical maps, quantitative trait loci mapping, genetic diversity studies, association mapping, and cultivar fingerprinting. PMID:23565230
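
    As an illustration of the SSR search step, the sketch below scans a sequence for perfect di- to hexanucleotide repeats with a regular expression. It is a minimal sketch, not the tool used in the study: the minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions chosen for the example.

```python
import re

# Minimum number of repeat units per motif length (assumed thresholds).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # Inner group is the motif; \2 requires it to repeat at least min_rep times in total.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "AGAG" is really "AG").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return sorted(hits)
```

    Primer design would then target the regions flanking each reported start position; that step is not shown here.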

  6. Sequence analysis of diacylglycerol acyltransferases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Diacylglycerol acyltransferases (DGATs) catalyze the final step of triacylglycerol (TAG) biosynthesis in eukaryotes. DGATs esterify sn-1,2-diacylglycerol with a long-chain fatty acyl-CoA. Plants and animals deficient in DGATs accumulate less TAG and over-expression of DGATs increases TAG. DGAT knock...

  7. Existence of microsatellites in expressed sequence tags of common carp ( Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Jingjie, Hu; Xiaolong, Wang; Xiaoli, Hu; Zhenmin, Bao

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.

  8. Existence of microsatellites in expressed sequence tags of common carp ( Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Hu, Jingjie; Wang, Xiaolong; Hu, Xiaoli; Bao, Zhenmin

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.

  9. Gene expression profiling of coelomic cells and discovery of immune-related genes in the earthworm, Eisenia andrei, using expressed sequence tags.

    PubMed

    Tak, Eun Sik; Cho, Sung-Jin; Park, Soon Cheol

    2015-01-01

    The coelomic cells of the earthworm consist of leukocytes, chlorogocytes, and coelomocytes, which play an important role in innate immunity reactions. To gain insight into the expression profiles of coelomic cells of the earthworm, Eisenia andrei, we analyzed 1151 expressed sequence tags (ESTs) derived from the cDNA library of the coelomic cells. Among the 1151 ESTs analyzed, 493 ESTs (42.8%) showed a significant similarity to known genes and represented 164 unique genes, of which 93 ESTs were singletons and 71 ESTs manifested as two or more ESTs. From the 164 unique genes sequenced, we found 24 immune-related and cell defense genes. Furthermore, real-time PCR analysis showed that levels of lysenin-related protein mRNA in coelomic cells of E. andrei were upregulated after the injection of Bacillus subtilis bacteria. This EST data set would provide a valuable resource for future research on the earthworm immune system. PMID:25496401

  10. Miniaturised wireless smart tag for optical chemical analysis applications.

    PubMed

    Steinberg, Matthew D; Kassal, Petar; Tkalčec, Biserka; Murković Steinberg, Ivana

    2014-01-01

    A novel miniaturised photometer has been developed as an ultra-portable and mobile analytical chemical instrument. The low-cost photometer presents a paradigm shift in mobile chemical sensor instrumentation because it is built around a contactless smart card format. The photometer tag is based on the radio-frequency identification (RFID) smart card system, which provides short-range wireless data and power transfer between the photometer and a proximal reader, and which allows the reader to also energise the photometer by near field electromagnetic induction. RFID is set to become a key enabling technology of the Internet-of-Things (IoT), hence devices such as the photometer described here will enable numerous mobile, wearable and vanguard chemical sensing applications in the emerging connected world. In the work presented here, we demonstrate the characterisation of a low-power RFID wireless sensor tag with an LED/photodiode-based photometric input. The performance of the wireless photometer has been tested through two different model analytical applications. The first is photometry in solution, where colour intensity as a function of dye concentration was measured. The second is an ion-selective optode system in which potassium ion concentrations were determined by using previously well characterised bulk optode membranes. The analytical performance of the wireless photometer smart tag is clearly demonstrated by these optical absorption-based analytical experiments, with excellent data agreement to a reference laboratory instrument. PMID:24274311

  11. Analytic signal phase-based myocardial motion estimation in tagged MRI sequences by a bilinear model and motion compensation.

    PubMed

    Wang, Liang; Basarab, Adrian; Girard, Patrick R; Croisille, Pierre; Clarysse, Patrick; Delachartre, Philippe

    2015-08-01

    Different mathematical tools, such as multidimensional analytic signals, allow for the calculation of 2D spatial phases of real-value images. The motion estimation method proposed in this paper is based on two spatial phases of the 2D analytic signal applied to cardiac sequences. By combining the information of these phases issued from analytic signals of two successive frames, we propose an analytical estimator for 2D local displacements. To improve the accuracy of the motion estimation, a local bilinear deformation model is used within an iterative estimation scheme. The main advantages of our method are: (1) The phase-based method allows the displacement to be estimated with subpixel accuracy and is robust to image intensity variation in time; (2) Preliminary filtering is not required due to the bilinear model. The proposed algorithm, integrating phase-based optical flow motion estimation and the combination of global motion compensation with local bilinear transform, allows spatio-temporal cardiac motion analysis, e.g. strain and dense trajectory estimation over the cardiac cycle. Results from 7 realistic simulated tagged magnetic resonance imaging (MRI) sequences show that our method is more accurate than a state-of-the-art method for cardiac motion analysis and another differential approach from the literature. The motion estimation errors (end point error) of the proposed method are reduced by about 33% compared with those of the two methods. In our work, the frame-to-frame displacements are further accumulated in time, to allow for the calculation of myocardial Lagrangian cardiac strains and point trajectories. Indeed, from the estimated trajectories in time on 11 in vivo data sets (9 patients and 2 healthy volunteers), the shape of myocardial point trajectories belonging to pathological regions is clearly reduced in magnitude compared with the ones from normal regions. Myocardial point trajectories, estimated from our phase-based analytic

  12. Exploring the Structure of Library and Information Science Web Space Based on Multivariate Analysis of Social Tags

    ERIC Educational Resources Information Center

    Joo, Soohyung; Kipp, Margaret E. I.

    2015-01-01

    Introduction: This study examines the structure of Web space in the field of library and information science using multivariate analysis of social tags from the Website, Delicious.com. A few studies have examined mathematical modelling of tags, mainly examining tagging in terms of tripartite graphs, pattern tracing and descriptive statistics. This…

  13. Image analysis for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Palaniappan, Kannappan; Huang, Thomas S.

    1991-07-01

    There is a great deal of interest in automating the process of DNA (deoxyribonucleic acid) sequencing to support the analysis of genomic DNA such as the Human and Mouse Genome projects. In one class of gel-based sequencing protocols, autoradiograph images are generated in the final step and usually require manual interpretation to reconstruct the DNA sequence represented by the image. The need to handle a large volume of sequence information necessitates automation of the manual autoradiograph reading step through image analysis in order to reduce the length of time required to obtain sequence data and reduce transcription errors. Various adaptive image enhancement, segmentation and alignment methods were applied to autoradiograph images. The methods are adaptive to the local characteristics of the image such as noise, background signal, or presence of edges. Once the two-dimensional data is converted to a set of aligned one-dimensional profiles, waveform analysis is used to determine the location of each band, which represents one nucleotide in the sequence. Different classification strategies including a rule-based approach are investigated to map the profile signals, augmented with the original two-dimensional image data as necessary, to textual DNA sequence information.
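
    The band-location step on a one-dimensional lane profile can be pictured with a very simple peak detector. This is only a sketch of the general idea: the abstract does not specify the waveform analysis that was used, and the function name `find_bands` and the fixed `min_height` threshold are assumptions.

```python
def find_bands(profile, min_height):
    """Return indices of local maxima in a 1D intensity profile that reach min_height."""
    bands = []
    for i in range(1, len(profile) - 1):
        rises = profile[i - 1] < profile[i]
        does_not_rise_further = profile[i] >= profile[i + 1]
        if profile[i] >= min_height and rises and does_not_rise_further:
            # For a flat-topped peak, only its first sample is reported.
            bands.append(i)
    return bands
```

    An adaptive method of the kind described would replace the fixed threshold with one derived from local noise and background levels.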

  14. Evidence from sequence-tagged-site markers of a recent progenitor-derivative species pair in conifers

    PubMed Central

    Perron, Martin; Perry, Daniel J.; Andalo, Christophe; Bousquet, Jean

    2000-01-01

    Black spruce (Picea mariana [B.S.P.] Mill.) and red spruce (Picea rubens Sarg.) are two conifer species known to hybridize naturally in northeastern North America. We hypothesized that there is a progenitor-derivative relationship between these two taxa and conducted a genetic investigation by using sequence-tagged-site markers of expressed genes. Based on the 26 sequence-tagged-site loci assayed in this study, the unbiased genetic identity between the two taxa was quite high with a value of 0.920. The mean number of polymorphic loci, the mean number of alleles per polymorphic locus, and the average observed heterozygosity were lower in red spruce (P = 35%, AP = 2.1, Ho = 0.069) than in black spruce (P = 54%, AP = 2.9, Ho = 0.103). No unique alleles were found in red spruce, and the observed patterns of allele distribution indicated that the genetic diversity of red spruce was essentially a subset of that found in black spruce. When considered in combination with ecological evidence and simulation results, these observations clearly support the existence of a progenitor-derivative relationship and suggest that the reduced level of genetic diversity in red spruce may result from allopatric speciation through glaciation-induced isolation of a preexisting black spruce population during the Pleistocene era. Our observations signal a need for a thorough reexamination of several conifer species complexes in which natural hybridization is known to occur. PMID:11016967

  15. Micro- and minisatellite-expressed sequence tag (EST) markers discriminate between populations of Rhipicephalus appendiculatus.

    PubMed

    Kanduma, Esther G; Mwacharo, Joram M; Sunter, Jack D; Nzuki, Inosters; Mwaura, Stephen; Kinyanjui, Peter W; Kibe, Michael; Heyne, Heloise; Hanotte, Olivier; Skilton, Robert A; Bishop, Richard P

    2012-06-01

    Biological differences, including vector competence for the protozoan parasite Theileria parva have been reported among populations of Rhipicephalus appendiculatus (Acari: Ixodidae) from different geographic regions. However, the genetic diversity and population structure of this important tick vector remain unknown due to the absence of appropriate genetic markers. Here, we describe the development and evaluation of a panel of EST micro- and minisatellite markers to characterize the genetic diversity within and between populations of R. appendiculatus and other rhipicephaline species. Sixty-six micro- and minisatellite markers were identified through analysis of the R. appendiculatus Gene Index (RaGI) EST database and selected bacterial artificial chromosome (BAC) sequences. These were used to genotype 979 individual ticks from 10 field populations, 10 laboratory-bred stocks, and 5 additional Rhipicephalus species. Twenty-nine markers were polymorphic and therefore informative for genetic studies while 6 were monomorphic. Primers designed from the remaining 31 loci did not reliably generate amplicons. The 29 polymorphic markers discriminated populations of R. appendiculatus and also 4 other Rhipicephalus species, but not R. zambeziensis. The percentage Principal Component Analysis (PCA) implemented using Multiple Co-inertia Analysis (MCoA) clustered populations of R. appendiculatus into 2 groups. Individual markers however differed in their ability to generate the reference typology using the MCoA approach. This indicates that different panels of markers may be required for different applications. The 29 informative polymorphic micro- and minisatellite markers are the first available tools for the analysis of the phylogeography and population genetics of R. appendiculatus. PMID:22789728

  16. SNP discovery using Paired-End RAD-tag sequencing on pooled genomic DNA of Sisymbrium austriacum (Brassicaceae).

    PubMed

    Vandepitte, K; Honnay, O; Mergeay, J; Breyne, P; Roldán-Ruiz, I; De Meyer, T

    2013-03-01

    Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non-model organisms is hampered by the scarcity of cost-effective approaches to uncover genome-wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD-PE (Restriction site Associated DNA Paired-End sequencing) approach. RAD tags were generated from the PstI-digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired-end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD-PE as an inexpensive genome-wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers. PMID:23231662
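
    For readers unfamiliar with the N50 statistic quoted alongside the mean contig length, a minimal sketch of how it is commonly computed follows. The function name `n50` is an assumption, and the example lengths in the usage note are made up, not taken from this assembly.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")
```

    For contigs of 2, 3, 4, 5 and 6 bp (20 bp in total), the two longest contigs already hold 11 bp, so the N50 is 5 bp while the mean is 4 bp. An N50 above the mean, as reported here, is the usual pattern when a few long contigs carry much of the assembly.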

  17. Precipitation recycling in West Africa - regional modeling, evaporation tagging and atmospheric water budget analysis

    NASA Astrophysics Data System (ADS)

    Arnault, Joel; Kunstmann, Harald; Knoche, Hans-Richard

    2015-04-01

    Many numerical studies have shown that the West African monsoon is highly sensitive to the state of the land surface. It is however questionable to what extent a local change of land surface properties would affect the local climate, especially with respect to precipitation. This issue is traditionally addressed with the concept of precipitation recycling, defined as the contribution of local surface evaporation to local precipitation. For this study the West African monsoon has been simulated with the Weather Research and Forecasting (WRF) model using explicit convection, for the domain (1°S-21°N, 18°W-14°E) at a spatial resolution of 10 km, for the period January-October 2013, and using ERA-Interim reanalyses as driving data. This WRF configuration has been selected for its ability to simulate monthly precipitation amounts and daily histograms close to TRMM (Tropical Rainfall Measuring Mission) data. In order to investigate precipitation recycling in this WRF simulation, surface evaporation tagging has been implemented in the WRF source code as well as the budget of total and tagged atmospheric water. Surface evaporation tagging consists in duplicating all water species and the respective prognostic equations in the source code. Then, tagged water species are set to zero at the lateral boundaries of the simulated domain (no inflow of tagged water vapor), and tagged surface evaporation is considered only in a specified region. All the source terms of the prognostic equations of total and tagged water species are finally saved in the outputs for the budget analysis. This allows quantifying the respective contribution of total and tagged atmospheric water to atmospheric precipitation processes. The WRF simulation with surface evaporation tagging and budgets has been conducted two times, first with a 100 km² tagged region (11-12°N, 1-2°W), and second with a 1000 km² tagged region (7-16°N, 6°W-3°E). In this presentation we will investigate hydro

  18. FAST: FAST Analysis of Sequences Toolbox.

    PubMed

    Lawrence, Travis J; Kauffman, Kyle T; Amrine, Katherine C H; Carper, Dana L; Lee, Raymond S; Becich, Peter J; Canales, Claudia J; Ardell, David H

    2015-01-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought. PMID:26042145

  19. FAST: FAST Analysis of Sequences Toolbox

    PubMed Central

    Lawrence, Travis J.; Kauffman, Kyle T.; Amrine, Katherine C. H.; Carper, Dana L.; Lee, Raymond S.; Becich, Peter J.; Canales, Claudia J.; Ardell, David H.

    2015-01-01

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought. PMID:26042145

  20. Protein identities from 'Graphocephala atropunctata' expressed sequence tags: Expanding leafhopper vector biology

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Heat shock proteins and 44 protein sequences from the blue-green sharpshooter, BGSS, were produced and identified. The sequences were submitted and published under accession numbers: DQ445499-DQ445542, in the National Center for Biotechnology Information, NCBI, Public Database. The blue-green sharps...

  1. Ribosomal proteins and expressed sequence tags from Lysiphlebus testaceipes (Hymenoptera: Aphidiidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A dataset containing 101 putative ribosomal protein (RP) sequences is provided for the aphid parasitoid, Lysiphlebus testaceipes. These data were obtained as a subset from a cDNA library constructed from adult L. testaceipes, and represent one of the largest complete sets of cytoplasmic RP sequence...

  2. Transient Analysis Generator /TAG/ simulates behavior of large class of electrical networks

    NASA Technical Reports Server (NTRS)

    Thomas, W. J.

    1967-01-01

    Transient Analysis Generator program simulates both transient and dc steady-state behavior of a large class of electrical networks. It generates a special analysis program for each circuit described in an easily understood and manipulated programming language. A generator or preprocessor and a simulation system make up the TAG system.

  3. Statistical analysis of nucleotide sequences.

    PubMed Central

    Stückle, E E; Emmrich, C; Grob, U; Nielsen, P J

    1990-01-01

    In order to scan nucleic acid databases for potentially relevant but as yet unknown signals, we have developed an improved statistical model for pattern analysis of nucleic acid sequences by modifying previous methods based on Markov chains. We demonstrate the importance of selecting the appropriate parameters in order for the method to function at all. The model allows the simultaneous analysis of several short sequences with unequal base frequencies and Markov order k not equal to 0 as is usually the case in databases. As a test of these modifications, we show that in E. coli sequences there is a bias against palindromic hexamers which correspond to known restriction enzyme recognition sites. PMID:2251125
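
    To make the Markov-chain idea concrete, the sketch below computes the textbook expected count of a word under a Markov model of order k estimated from the sequences themselves, and tests whether a word is a reverse-complement palindrome. It is not the authors' modified model; the function names are assumptions, and the expected-count formula is the standard ratio of (k+1)-mer counts to k-mer counts.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(word):
    """True if the word equals its own reverse complement (e.g. GAATTC)."""
    return word == word.translate(COMPLEMENT)[::-1]


def count_words(seqs, size):
    """Count all overlapping words of the given size across the sequences."""
    counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - size + 1):
            counts[seq[i:i + size]] += 1
    return counts


def expected_count(word, seqs, k):
    """Expected occurrences of word under an order-k Markov model (1 <= k <= len(word) - 2)."""
    m = len(word)
    if not 1 <= k <= m - 2:
        raise ValueError("k must satisfy 1 <= k <= len(word) - 2")
    long_counts = count_words(seqs, k + 1)
    short_counts = count_words(seqs, k)
    expected = 1.0
    # Numerator: product of counts of the overlapping (k+1)-mers that make up the word.
    for i in range(m - k):
        expected *= long_counts[word[i:i + k + 1]]
    # Denominator: product of counts of the interior k-mers shared by neighbouring (k+1)-mers.
    for i in range(1, m - k):
        denom = short_counts[word[i:i + k]]
        if denom == 0:
            return 0.0
        expected /= denom
    return expected
```

    A bias against a palindromic hexamer would show up as an observed count well below `expected_count(hexamer, seqs, k)`; how far below counts as significant depends on the statistical treatment, which this sketch does not cover.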

  4. [Multilocus sequence typing (MLST) analysis].

    PubMed

    Matsumura, Yasufumi

    2013-12-01

    Multilocus sequence typing (MLST) analysis has been emerging as a powerful tool for genotyping specific bacterial species. MLST utilizes internal fragments of multiple housekeeping genes and the combination of each allele defines the sequence type for each isolate. MLST databases contain reference data and are freely accessible via internet websites. The standard method for investigating short-term hospital outbreaks is still pulsed-field gel electrophoresis and MLST analysis is not a substitute. However, analysis of sequence types and clonal complexes (closely related sequence types) enables identification and understanding of a specific clone that is widely spreading among drug-resistant organisms, or a key clone that is important for evolution of the organism. In the case of Escherichia coli, CTX-M-15 or CTX-M-14 extended-spectrum beta-lactamase producing ST131 clone has emerged and spread globally in the last 10 years. MLST analysis is an unambiguous procedure and is becoming a common typing method to characterize isolates. PMID:24605545
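
    How a combination of alleles defines a sequence type can be shown with a small lookup. This sketch rests on assumptions: the profiles and the labels ST-A, ST-B and ST-C are made up for illustration, the locus names are only examples of a seven-gene scheme, and treating single-locus variants as "closely related" is one common convention rather than something stated in the abstract.

```python
# Example locus names for a seven-gene scheme (illustrative only).
LOCI = ("adk", "fumC", "gyrB", "icd", "mdh", "purA", "recA")

# Made-up allelic profiles and sequence-type labels; not real database entries.
PROFILE_TABLE = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (1, 1, 2, 1, 1, 1, 1): "ST-B",
    (5, 4, 3, 2, 6, 7, 8): "ST-C",
}


def sequence_type(alleles, table=PROFILE_TABLE):
    """Look up the sequence type for a dict of locus -> allele number; None if unknown."""
    key = tuple(alleles[locus] for locus in LOCI)
    return table.get(key)


def loci_differing(profile_a, profile_b):
    """Number of loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


def single_locus_variants(table=PROFILE_TABLE):
    """Pairs of sequence types whose profiles differ at exactly one locus."""
    profiles = list(table.items())
    pairs = []
    for i, (profile_a, st_a) in enumerate(profiles):
        for profile_b, st_b in profiles[i + 1:]:
            if loci_differing(profile_a, profile_b) == 1:
                pairs.append((st_a, st_b))
    return pairs
```

    Because the result depends only on exact allele numbers, two laboratories typing the same isolate should obtain the same label, which is the sense in which the procedure is unambiguous.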

  5. RefNetBuilder: a platform for construction of integrated reference gene regulatory networks from expressed sequence tags

    PubMed Central

    2011-01-01

    Background Gene Regulatory Networks (GRNs) provide integrated views of gene interactions that control biological processes. Many public databases contain biological interactions extracted from experimentally validated literature reports, but most furnish only information for a few genetic model organisms. In order to provide a bioinformatic tool for researchers who work with non-model organisms, we developed RefNetBuilder, a new platform that allows construction of putative reference pathways or GRNs from expressed sequence tags (ESTs). Results RefNetBuilder was designed to have the flexibility to extract and archive pathway or GRN information from public databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG). It features sequence alignment tools such as BLAST to allow mapping ESTs to pathways and GRNs in model organisms. A scoring algorithm was incorporated to rank and select the best match for each query EST. We validated RefNetBuilder using DNA sequences of Caenorhabditis elegans, a model organism having manually curated KEGG pathways. Using the earthworm Eisenia fetida as an example, we demonstrated the functionalities and features of RefNetBuilder. Conclusions RefNetBuilder provides a standalone application for building reference GRNs for non-model organisms on a number of operating system platforms with standard desktop computer hardware. As a new bioinformatic tool aimed at constructing putative GRNs for non-model organisms that have only ESTs available, RefNetBuilder is especially useful to explore pathway- or network-related information in these organisms. PMID:22166047

  6. In silico identification of conserved microRNAs and their target transcripts from expressed sequence tags of three earthworm species.

    PubMed

    Gong, Ping; Xie, Fuliang; Zhang, Baohong; Perkins, Edward J

    2010-12-01

    MicroRNAs are a recently identified class of small regulatory RNAs that target more than 30% of protein-coding genes. Accumulating evidence shows that miRNAs play a critical role in many biological processes, including developmental timing, tissue differentiation, and response to chemical exposure. In this study, we applied a computational approach to analyze expressed sequence tags, and identified 32 miRNAs belonging to 22 miRNA families, in three earthworm species Eisenia fetida, Eisenia andrei, and Lumbricus rubellus. These newly identified earthworm miRNAs possess a difference of 2-4 nucleotides from their homologous counterparts in Caenorhabditis elegans. They also share similar features with other known animal miRNAs, for instance, the nucleotide U being dominant in both mature and pre-miRNA sequences, particularly in the first position of mature miRNA sequences at the 5' end. The newly identified earthworm miRNAs putatively regulate mRNA genes that are involved in many important biological processes and pathways related to development, growth, locomotion, and reproduction as well as response to stresses, particularly oxidative stress. Future efforts will focus on experimental validation of their presence and target mRNA genes to further elucidate their biological functions in earthworms. PMID:21030313

  7. Toward a physical map of Drosophila buzzatii. Use of randomly amplified polymorphic DNA polymorphisms and sequence-tagged site landmarks.

    PubMed Central

    Laayouni, H; Santos, M; Fontdevila, A

    2000-01-01

    We present a physical map based on RAPD polymorphic fragments and sequence-tagged sites (STSs) for the repleta group species Drosophila buzzatii. One hundred forty-four RAPD markers have been used as probes for in situ hybridization to the polytene chromosomes, and positive results allowing the precise localization of 108 RAPDs were obtained. Of these, 73 behave as effectively unique markers for physical map construction, and in 9 additional cases the probes gave two hybridization signals, each on a different chromosome. Most markers (68%) are located on chromosomes 2 and 4, which partially agrees with previous estimates of the distribution of genetic variation over chromosomes. One RAPD maps close to the proximal breakpoint of inversion 2z(3) but is not included within the inverted fragment. However, it was possible to conclude from this RAPD that the distal breakpoint of 2z(3) had previously been wrongly assigned. A total of 39 cytologically mapped RAPDs were converted to STSs and yielded an aggregate sequence of 28,431 bp. Thirty-six RAPDs (25%) did not produce any detectable hybridization signal, and we obtained the DNA sequence from three of them. Further prospects toward obtaining a more developed genetic map than the one currently available for D. buzzatii are discussed. PMID:11102375

  8. RSAT: regulatory sequence analysis tools.

    PubMed

    Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques

    2008-07-01

    The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published. PMID:18495751

  9. The Short ITS2 Sequence Serves as an Efficient Taxonomic Sequence Tag in Comparison with the Full-Length ITS

    PubMed Central

    Han, Jianping; Zhu, Yingjie; Chen, Xiaochen; Liao, Baoshen; Yao, Hui; Song, Jingyuan; Chen, Shilin; Meng, Fanyun

    2013-01-01

    An ideal DNA barcoding region should be short enough to be amplified from degraded DNA. In this paper, we discuss the possibility of using a short nuclear DNA sequence as a barcode to identify a wide range of medicinal plant species. First, the PCR and sequencing success rates of ITS and ITS2 were evaluated based entirely on materials from dry medicinal products and herbarium voucher specimens, including some samples collected up to 90 years ago. The results showed that, using a single primer pair, the PCR and sequencing success rate was 91% for ITS2 but only 23% for ITS. Second, 12861 ITS and ITS2 plant sequences were used to compare the identification efficiency of the two regions. Four identification criteria (BLAST, inter- and intradivergence Wilcoxon signed rank tests, and TaxonDNA) were evaluated. Our results supported the hypothesis that ITS2 can be used as a minibarcode to effectively identify species in a wide variety of specimens and medicinal materials. PMID:23484151

  10. Random Tagging Genotyping by Sequencing (rtGBS), an Unbiased Approach to Locate Restriction Enzyme Sites across the Target Genome

    PubMed Central

    Hilario, Elena; Barron, Lorna; Deng, Cecilia H.; Datson, Paul M.; Davy, Marcus W.; Storey, Roy D.

    2015-01-01

    Genotyping by sequencing (GBS) is a restriction enzyme based targeted approach developed to reduce the genome complexity and discover genetic markers when a priori sequence information is unavailable. Sufficient coverage at each locus is essential to distinguish heterozygous from homozygous sites accurately. The number of GBS samples able to be pooled in one sequencing lane is limited by the number of restriction sites present in the genome and the read depth required at each site per sample for accurate calling of single-nucleotide polymorphisms. Loci bias was observed using a slight modification of the Elshire et al. method: some restriction enzyme sites were represented in higher proportions while others were poorly represented or absent. This bias could be due to the quality of genomic DNA, the endonuclease and ligase reaction efficiency, the distance between restriction sites, the preferential amplification of small library restriction fragments, or bias towards cluster formation of small amplicons during the sequencing process. To overcome these issues, we have developed a GBS method based on randomly tagging genomic DNA (rtGBS). By randomly landing on the genome, we can, with less bias, find restriction sites that are far apart, and undetected by the standard GBS (stdGBS) method. The study comprises two types of biological replicates: six different kiwifruit plants and two independent DNA extractions per plant; and three types of technical replicates: four samples of each DNA extraction, stdGBS vs. rtGBS methods, and two independent library amplifications, each sequenced in separate lanes. A statistically significant unbiased distribution of restriction fragment size by rtGBS showed that this method targeted 49% (39,145) of BamH I sites shared with the reference genome, compared to only 14% (11,513) by stdGBS. PMID:26633193
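
    The role of restriction sites and fragment size in the abstract above can be made concrete with a small in silico digest. The sketch below assumes the standard BamH I recognition sequence GGATCC, cut after the first G, and scans the forward strand of a toy sequence. The function names and the toy sequence are hypothetical and are not part of the rtGBS protocol.

```python
import re
from typing import List

BAMHI_SITE = "GGATCC"  # BamH I recognition sequence
CUT_OFFSET = 1         # BamH I cuts after the first G (G^GATCC)


def cut_positions(genome: str, site: str = BAMHI_SITE, offset: int = CUT_OFFSET) -> List[int]:
    """Return 0-based cut coordinates for every occurrence of the site."""
    genome = genome.upper()
    # A lookahead reports overlapping occurrences as well.
    return [m.start() + offset for m in re.finditer(f"(?={site})", genome)]


def fragment_lengths(genome: str) -> List[int]:
    """Lengths of the fragments produced by a complete digest."""
    bounds = [0] + cut_positions(genome) + [len(genome)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy sequence with two BamH I sites, for illustration only.
toy = "AAAAGGATCCTTTTTTTTGGATCCAA"
sites = cut_positions(toy)
fragments = fragment_lengths(toy)
```

    A histogram of such fragment lengths for sites recovered by each method is one simple way to inspect the size bias discussed above.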

  11. Accurate mass tag retention time database for urine proteome analysis by chromatography--mass spectrometry.

    PubMed

    Agron, I A; Avtonomov, D M; Kononikhin, A S; Popov, I A; Moshkovskii, S A; Nikolaev, E N

    2010-05-01

    Information about peptides and proteins in urine can be used to search for biomarkers of early stages of various diseases. The main technology currently used for identification of peptides and proteins is tandem mass spectrometry, in which peptides are identified by mass spectra of their fragmentation products. However, the presence of the fragmentation stage decreases sensitivity of analysis and increases its duration. We have developed a method for identification of human urinary proteins and peptides. This method, based on the accurate mass and time (AMT) tag approach, does not use tandem mass spectrometry. A database containing more than 1381 peptide AMT tags has been constructed. Software has been developed for populating the database with AMT tags, normalizing the chromatograms, applying the database to identify proteins and peptides, and estimating their quantities. New procedures for peptide identification using tandem mass spectra and the AMT tag database are proposed. The paper also lists novel proteins that have been identified in human urine for the first time. PMID:20632944
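
    Identification without tandem mass spectrometry rests on matching each observed feature to a database entry by mass and retention time. The following is a minimal sketch of that lookup, assuming a mass tolerance in parts per million and a tolerance on normalized retention time. The tolerances, the `AmtTag` record, and the two database entries are invented for illustration and do not come from the paper.

```python
from typing import List, NamedTuple, Optional


class AmtTag(NamedTuple):
    peptide: str   # peptide label (hypothetical)
    mass: float    # monoisotopic mass, Da
    nrt: float     # normalized retention time, 0..1


def match_feature(mass: float, nrt: float, db: List[AmtTag],
                  ppm_tol: float = 5.0, nrt_tol: float = 0.02) -> Optional[AmtTag]:
    """Return the tag within both tolerances with the closest mass, or None."""
    candidates = [
        tag for tag in db
        if abs(mass - tag.mass) / tag.mass * 1e6 <= ppm_tol
        and abs(nrt - tag.nrt) <= nrt_tol
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda tag: abs(mass - tag.mass))


# Made-up database entries: nearly identical masses, different elution times.
database = [
    AmtTag("PEP_A", 1000.0000, 0.30),
    AmtTag("PEP_B", 1000.0040, 0.70),
]
hit = match_feature(1000.0030, 0.31, database)
miss = match_feature(1000.0030, 0.50, database)
```

    The example shows why both coordinates are needed: the two entries lie within 5 ppm of the observed mass, and only the retention time separates them.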

  12. Protein identities - Graphocephala atropunctata expressed sequenced tags: expanding leafhopper vector biology

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A small heat shock protein was isolated and sequenced from the Blue-green sharpshooter, BGSS, Graphocephala atropunctata (Signoret) (Hemiptera: Cicadellidae). The BGSS has been the native vector of Pierce’s disease in vineyards in California for nearly a century. The importance of this vector spec...

  13. Tag Questions across Irish English and British English: A Corpus Analysis of Form and Function

    ERIC Educational Resources Information Center

    Barron, Anne; Pandarova, Irina; Muderack, Karoline

    2015-01-01

    The present study, situated in the area of variational pragmatics, contrasts tag question (TQ) use in Ireland and Great Britain using spoken data from the Irish and British components of the International Corpus of English (ICE). Analysis is on the formal and functional level and also investigates form-functional relationships. Findings reveal…

  14. Construction of a Genetic Linkage Map Based on Amplified Fragment Length Polymorphism Markers and Development of Sequence-Tagged Site Markers for Marker-Assisted Selection of the Sporeless Trait in the Oyster Mushroom (Pleurotus eryngii)

    PubMed Central

    Ueda, Jun; Obatake, Yasushi; Murakami, Shigeyuki; Fukumasa, Yukitaka; Matsumoto, Teruyuki

    2012-01-01

    A large number of spores from fruiting bodies can lead to allergic reactions and other problems during the cultivation of edible mushrooms, including Pleurotus eryngii (DC.) Quél. A cultivar harboring a sporulation-deficient (sporeless) mutation would be useful for preventing these problems, but traditional breeding requires extensive time and labor. In this study, using a sporeless P. eryngii strain, we constructed a genetic linkage map to introduce a molecular breeding program like marker-assisted selection. Based on the segregation of 294 amplified fragment length polymorphism markers, two mating type factors, and the sporeless trait, the linkage map consisted of 11 linkage groups with a total length of 837.2 centimorgans (cM). The gene region responsible for the sporeless trait was located in linkage group IX with 32 amplified fragment length polymorphism markers and the B mating type factor. We also identified eight markers closely linked (within 1.2 cM) to the sporeless locus using bulked-segregant analysis-based amplified fragment length polymorphism. One such amplified fragment length polymorphism marker was converted into two sequence-tagged site markers, SD488-I and SD488-II. Using 14 wild isolates, sequence-tagged site analysis indicated the potential usefulness of the combination of two sequence-tagged site markers in cross-breeding of the sporeless strain. It also suggested that a map constructed for P. eryngii has adequate accuracy for marker-assisted selection. PMID:22210222

  15. Single & Multiple Stellar Populations in Globular Clusters: Chemical Tagging, Photometric Sequences, and Dynamics

    NASA Astrophysics Data System (ADS)

    Piotto, Giampaolo

    2015-08-01

    The discovery of multiple stellar populations in globular clusters has revolutionized our view of these objects, once thought to be simple, single-population stellar systems. Different star formation scenarios have been proposed in order to account for the photometric and spectroscopic properties of the different populations hosted by a single cluster, and some of them imply that the original cluster should have been much more massive than it is now, with a significant fraction of the original stars lost into the environment (Galaxy halo or bulge). Because of this, globular clusters become relevant not only as tracers of the general process of galaxy halo formation, but also as possible incubators of most (all?) halo stars. In my talk I will briefly summarize the basic observational facts that led the community at large to accept the idea of population multiplicity. I will also present the newest results coming from an extensive, multi-wavelength astrometric and photometric survey, which includes UV data from ACS and WFC3/HST of close to half of the Milky Way globular clusters. The increasing number of spectroscopic surveys of stars in globular clusters, coupled with the capability of (UV) photometry to distinguish different populations, has greatly increased our ability to trace the basic chemical properties of the many populations within a single cluster. I will present a census of the presence of multiple populations in GCs, their chemical tagging, radial distribution, and kinematics. Possible correlations between the quantities characterizing multiple populations and the main cluster parameters will also be presented. Implications for the formation of multiple stellar populations will be discussed, as well as the still-open issues.

  16. Analyses of Expressed Sequence Tags from the Maize Foliar Pathogen Cercospora Zeae-Maydis Identifying Novel Genes expressed during Vegetative, Infectious, & Reproductive Growth

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The fungus Cercospora zeae-maydis is an aggressive foliar pathogen of maize that causes substantial yield losses annually throughout the western hemisphere. To learn more about the molecular regulation of pathogenesis in C. zeae-maydis, we generated a collection of expressed sequence tags (ESTs) and...

  17. Development of high-density linkage map and tagging leaf spot resistance in pearl millet using genotyping-by-sequencing markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pearl millet is an important forage and grain crop in many parts of the world. Genome mapping studies are a prerequisite for tagging agronomically important traits. Genotyping-by-Sequencing (GBS) markers can be used to build high density linkage maps even in species lacking a reference genome. A re...

  18. Identification and characterization of 43 microsatellite markers derived from expressed sequence tags of the sea cucumber ( Apostichopus japonicus)

    NASA Astrophysics Data System (ADS)

    Jiang, Qun; Li, Qi; Yu, Hong; Kong, Lingfeng

    2011-06-01

    The sea cucumber Apostichopus japonicus is a commercially and ecologically important species in China. A total of 3056 potential unigenes were generated after assembling 7597 A. japonicus expressed sequence tags (ESTs) downloaded from GenBank. Two hundred and fifty microsatellite-containing ESTs (8.18%) and 299 simple sequence repeats (SSRs) were detected. The average density of SSRs was 1 per 7.403 kb of EST after redundancy elimination. Di-nucleotide repeat motifs appeared to be the most abundant type with a percentage of 69.90%. Of the 126 primer pairs designed, 90 amplified the expected products and 43 showed polymorphism in 30 individuals tested. The number of alleles per locus ranged from 2 to 26 with an average of 7.0 alleles, and the observed and expected heterozygosities varied from 0.067 to 1.000 and from 0.066 to 0.959, respectively. These new EST-derived microsatellite markers would provide sufficient polymorphism for population genetic studies and genome mapping of this sea cucumber species.
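
    Detecting SSRs in EST sequences amounts to scanning for short motifs repeated in tandem. The sketch below uses regular expressions for di- and tri-nucleotide motifs. The minimum repeat counts are assumptions, since the abstract does not state the thresholds used, and the test sequence is synthetic. The last line checks the reported proportion of microsatellite-containing ESTs (250 of 3056 unigenes).

```python
import re
from typing import List, Tuple

# Assumed minimum repeat counts per motif length (not given in the abstract).
MIN_REPEATS = {2: 6, 3: 4}


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each tandem repeat found."""
    seq = seq.upper()
    found = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            found.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(found)


# Synthetic sequence: (AC)7 and (CAG)4 embedded in flanking bases.
toy_est = "TTG" + "AC" * 7 + "GGT" + "CAG" * 4 + "TT"
ssrs = find_ssrs(toy_est)

# Share of unigenes containing microsatellites, from the reported counts.
ssr_est_percent = round(100 * 250 / 3056, 2)
```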

  19. Development and characterization of new single nucleotide polymorphism markers from expressed sequence tags in common carp (Cyprinus carpio).

    PubMed

    Zhu, Chuankun; Cheng, Lei; Tong, Jingou; Yu, Xiaomu

    2012-01-01

    The common carp (Cyprinus carpio) is an important aquaculture fish worldwide, but only a limited number of single nucleotide polymorphism (SNP) markers have been characterized from expressed sequence tags (ESTs) in this species. In this study, 1487 putative SNPs were bioinformatically mined from 14,066 online ESTs mainly from the European common carp, with an occurrence rate of about one SNP every 173 bp. One hundred and twenty-one of these SNPs were selected for validation using PCR fragment sequencing, and 48 out of 81 primers could amplify the expected fragments in the Chinese common carp genome. Only 26 (21.5%) putative SNPs were validated; however, 508 new SNPs and 68 indels were identified. The ratios of transitions to transversions were 1.77 for exon SNPs and 1.05 for intron SNPs. All the 23 SNPs selected for population tests were polymorphic, with the observed heterozygosity (Ho) ranging from 0.053 to 0.526 (mean 0.262), polymorphism information content (PIC) from 0.095 to 0.357 (mean 0.246), and 21 SNPs were in Hardy-Weinberg equilibrium. These results suggest that different common carp populations with geographic isolation have significant genetic variation at the SNP level, and these new EST-SNP markers are readily available for genetics and breeding studies in common carp. PMID:22837697
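
    The transition-to-transversion ratio quoted above follows from classifying each SNP by its two alleles: purine-to-purine or pyrimidine-to-pyrimidine changes are transitions, and all others are transversions. A minimal sketch follows. The five SNPs in the example are made up, and the final line only re-derives the reported validation rate (26 of 121).

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T substitutions."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps: Iterable[Tuple[str, str]]) -> float:
    """Ratio of transitions to transversions for (ref, alt) allele pairs."""
    transitions = transversions = 0
    for ref, alt in snps:
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    return transitions / transversions


# Made-up SNPs: three transitions and two transversions.
example_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(example_snps)

# Validation rate from the reported counts.
validated_percent = round(100 * 26 / 121, 1)
```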

  20. Integration of Expressed Sequence Tag Data Flanking Predicted RNA Secondary Structures Facilitates Novel Non-Coding RNA Discovery

    PubMed Central

    Krzyzanowski, Paul M.; Price, Feodor D.; Muro, Enrique M.; Rudnicki, Michael A.; Andrade-Navarro, Miguel A.

    2011-01-01

    Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, have explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking ESTs data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation. PMID:21698286

  1. Identification of Anhydrobiosis-related Genes from an Expressed Sequence Tag Database in the Cryptobiotic Midge Polypedilum vanderplanki (Diptera; Chironomidae)

    PubMed Central

    Cornette, Richard; Kanamori, Yasushi; Watanabe, Masahiko; Nakahara, Yuichi; Gusev, Oleg; Mitsumasu, Kanako; Kadono-Okuda, Keiko; Shimomura, Michihiko; Mita, Kazuei; Kikawada, Takahiro; Okuda, Takashi

    2010-01-01

    Some organisms are able to survive the loss of almost all their body water content, entering a latent state known as anhydrobiosis. The sleeping chironomid (Polypedilum vanderplanki) lives in the semi-arid regions of Africa, and its larvae can survive desiccation in an anhydrobiotic form during the dry season. To unveil the molecular mechanisms of this resistance to desiccation, an anhydrobiosis-related Expressed Sequence Tag (EST) database was obtained from the sequences of three cDNA libraries constructed from P. vanderplanki larvae after 0, 12, and 36 h of desiccation. The database contained 15,056 ESTs distributed into 4,807 UniGene clusters. ESTs were classified according to gene ontology categories, and putative expression patterns were deduced for all clusters on the basis of the number of clones in each library; expression patterns were confirmed by real-time PCR for selected genes. Among up-regulated genes, antioxidants, late embryogenesis abundant (LEA) proteins, and heat shock proteins (Hsps) were identified as important groups for anhydrobiosis. Genes related to trehalose metabolism and various transporters were also strongly induced by desiccation. Those results suggest that the oxidative stress response plays a central role in successful anhydrobiosis. Similarly, protein denaturation and aggregation may be prevented by marked up-regulation of Hsps and the anhydrobiosis-specific LEA proteins. A third major feature is the predicted increase in trehalose synthesis and in the expression of various transporter proteins allowing the distribution of trehalose and other solutes to all tissues. PMID:20833722

  2. Integration of expressed sequence tag data flanking predicted RNA secondary structures facilitates novel non-coding RNA discovery.

    PubMed

    Krzyzanowski, Paul M; Price, Feodor D; Muro, Enrique M; Rudnicki, Michael A; Andrade-Navarro, Miguel A

    2011-01-01

    Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, have explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking ESTs data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation. PMID:21698286

  3. Transcriptome annotation using tandem SAGE tags

    PubMed Central

    Rivals, Eric; Boureux, Anthony; Lejeune, Mireille; Ottones, Florence; Pecharromàn Pérez, Oscar; Tarhio, Jorma; Pierrat, Fabien; Ruffle, Florence; Commes, Thérèse; Marti, Jacques

    2007-01-01

    Analysis of several million expressed gene signatures (tags) revealed an increasing number of different sequences, largely exceeding that of annotated genes in mammalian genomes. Serial analysis of gene expression (SAGE) can reveal new Poly(A) RNAs transcribed from previously unrecognized chromosomal regions. However, conventional SAGE tags are too short to identify unambiguously unique sites in large genomes. Here, we design a novel strategy with tags anchored on two different restriction sites of cDNAs. New transcripts are then tentatively defined by the two SAGE tags in tandem and by the spanning sequence read on the genome between these tagged sites. Having developed a new algorithm to locate these tag-delimited genomic sequences (TDGS), we first validated its capacity to recognize known genes and its ability to reveal new transcripts with two SAGE libraries built in parallel from a single RNA sample. Our algorithm proves fast enough to test this strategy at a large scale. We then collected and processed the complete sets of human SAGE tags to predict as yet unknown transcripts. A cross-validation with tiling array data shows that 47% of these TDGS overlap transcriptionally active regions. Our method provides a new and complementary approach for complex transcriptome annotation. PMID:17709346
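
    The idea of a tag-delimited genomic sequence can be sketched as a search for two tags that occur in order, on the same strand, within a bounded distance. The code below is a naive forward-strand illustration on a toy string and is not the published algorithm, whose design for speed on whole genomes the abstract does not describe. The names `find_all` and `locate_tdgs`, the tags, and the `max_span` parameter are assumptions.

```python
from typing import List, Tuple


def find_all(text: str, pattern: str) -> List[int]:
    """Start positions of every (possibly overlapping) occurrence."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def locate_tdgs(genome: str, tag_a: str, tag_b: str, max_span: int) -> List[Tuple[int, int]]:
    """Return (start, end) intervals that begin with tag_a and end with tag_b.

    Only the forward strand is searched, and tag_b must start after tag_a ends.
    """
    intervals = []
    starts_b = find_all(genome, tag_b)
    for a in find_all(genome, tag_a):
        for b in starts_b:
            end = b + len(tag_b)
            if b >= a + len(tag_a) and end - a <= max_span:
                intervals.append((a, end))
    return intervals


# Toy genome with one occurrence of each (made-up) tag.
tag_1 = "CATGAAAA"
tag_2 = "GATCCCCC"
toy_genome = "TT" + tag_1 + "GGGGTTTT" + tag_2 + "AATT"
found = locate_tdgs(toy_genome, tag_1, tag_2, max_span=30)
too_strict = locate_tdgs(toy_genome, tag_1, tag_2, max_span=20)
```

    Requiring both tags within a bounded span is what reduces ambiguity compared with a single short tag.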

  4. Harmonic phase interference for the detection of tag line crossings and beyond in homogeneous strain analysis of cardiac tagged MRI data.

    PubMed

    Bilgen, Mehmet

    2010-12-01

    Homogeneous strain analysis (HSA) was developed to evaluate regional cardiac function using tagged cine magnetic resonance images of the heart. Current cardiac applications of HSA are, however, limited in accurately detecting tag intersections within the myocardial wall, producing consistent triangulation of tag cells throughout the image series, and achieving optimal spatial resolution due to the large size of the triangles. To address these issues, this article introduces a harmonic phase (HARP) interference method. In principle, as in the standard HARP analysis, the method uses harmonic phases associated with two of the four fundamental peaks in the spectrum of a tagged image. However, the phase associated with each peak is wrapped when estimated digitally. This article shows that a special combination of wrapped phases results in an image with a unique intensity pattern that can be exploited to automatically detect tag intersections and to produce reliable triangulation with regularly organized partitioning of the mesh for HSA. In addition, the method offers new opportunities and freedom for evaluating myocardial function when the power and angle of the complex filtered spectra are mathematically modified prior to computing the phase. For example, the triangular elements can be shifted spatially by changing the angle and/or their sizes can be reduced by changing the power. Interference patterns obtained under a variety of power and angle conditions are presented, and specific features observed in the results are explained. Together, the advanced processing capabilities increase the power of HSA by making the analysis less prone to errors from human interactions. They also allow strain measurements at higher spatial resolution and at multiple scales, thereby improving the display methods for better interpretation of the analysis results. PMID:21110236

  5. Robust Computational Analysis of rRNA Hypervariable Tag Datasets

    PubMed Central

    Sipos, Maksim; Jeraldo, Patricio; Chia, Nicholas; Qu, Ani; Dhillon, A. Singh; Konkel, Michael E.; Nelson, Karen E.; White, Bryan A.; Goldenfeld, Nigel

    2010-01-01

    Next-generation DNA sequencing is increasingly being utilized to probe microbial communities, such as gastrointestinal microbiomes, where it is important to be able to quantify measures of abundance and diversity. The fragmented nature of the 16S rRNA datasets obtained, coupled with their unprecedented size, has led to the recognition that the results of such analyses are potentially contaminated by a variety of artifacts, both experimental and computational. Here we quantify how multiple alignment and clustering errors contribute to overestimates of abundance and diversity, reflected by incorrect OTU assignment, corrupted phylogenies, inaccurate species diversity estimators, and rank abundance distribution functions. We show that straightforward procedural optimizations, combining preexisting tools, are effective in handling large 16S rRNA datasets, and we describe metrics to measure the effectiveness and quality of the estimators obtained. We introduce two metrics to ascertain the quality of clustering of pyrosequenced rRNA data, and show that complete linkage clustering greatly outperforms other widely used methods. PMID:21217830
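
    Complete linkage clustering merges two clusters only when every pair of members is within the distance threshold. A minimal pure-Python sketch on pre-aligned, equal-length toy sequences is given below. The distance measure (fraction of mismatched positions), the threshold, and the sequences are illustrative assumptions, and the quadratic search would not scale to real pyrosequencing datasets.

```python
from itertools import combinations
from typing import List


def distance(a: str, b: str) -> float:
    """Fraction of mismatched positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def complete_linkage(seqs: List[str], threshold: float) -> List[List[int]]:
    """Agglomerative clustering; returns clusters as lists of sequence indices."""
    clusters = [[i] for i in range(len(seqs))]

    def linkage(c1: List[int], c2: List[int]) -> float:
        # Complete linkage: distance between the two farthest members.
        return max(distance(seqs[i], seqs[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[x], clusters[y]) > threshold:
            break  # closest pair is already too far apart
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # x < y, so index x stays valid
    return clusters


# Toy aligned reads: 0 and 1 differ at one site, 1 and 2 at one site, 0 and 2 at two.
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT", "CCCCCCCCCC"]
otus = complete_linkage(reads, threshold=0.1)
```

    With this threshold, reads 0 and 1 form one cluster and read 2 stays separate because it is too far from read 0. A single linkage rule would instead chain all three together.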

  6. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs)

    PubMed Central

    Liu, Xiao; Li, Zesong; Su, Zheng; Zhang, Junjie; Li, Honggang; Xie, Jun; Xu, Hanshi; Jiang, Tao; Luo, Liya; Zhang, Ruifang; Zeng, Xiaojing; Xu, Huaiqian; Huang, Yi; Mou, Lisha; Hu, Jingchu; Qian, Weiping; Zeng, Yong; Zhang, Xiuqing; Xiong, Chengliang; Yang, Huanming; Kristiansen, Karsten; Cai, Zhiming; Wang, Jun; Gui, Yaoting

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control group, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome. PMID:26907467

  7. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs).

    PubMed

    Liu, Xiao; Li, Zesong; Su, Zheng; Zhang, Junjie; Li, Honggang; Xie, Jun; Xu, Hanshi; Jiang, Tao; Luo, Liya; Zhang, Ruifang; Zeng, Xiaojing; Xu, Huaiqian; Huang, Yi; Mou, Lisha; Hu, Jingchu; Qian, Weiping; Zeng, Yong; Zhang, Xiuqing; Xiong, Chengliang; Yang, Huanming; Kristiansen, Karsten; Cai, Zhiming; Wang, Jun; Gui, Yaoting

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control group, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome. PMID:26907467

  8. AB039. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs)

    PubMed Central

    Li, Zesong

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control group, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome.

  9. A new view of insect-crustacean relationships II. Inferences from expressed sequence tags and comparisons with neural cladistics.

    PubMed

    Andrew, David R

    2011-05-01

    The enormous diversity of Arthropoda has complicated attempts by systematists to deduce the history of this group in terms of phylogenetic relationships and phenotypic change. Traditional hypotheses regarding the relationships of the major arthropod groups (Chelicerata, Myriapoda, Crustacea, and Hexapoda) focus on suites of morphological characters, whereas phylogenomics relies on large amounts of molecular sequence data to infer evolutionary relationships. The present discussion is based on expressed sequence tags (ESTs) that provide large numbers of short molecular sequences and so provide an abundant source of sequence data for phylogenetic inference. This study presents well-supported phylogenies of diverse arthropod and metazoan outgroup taxa obtained from publicly-available databases. An in-house bioinformatics pipeline has been used to compile and align conserved orthologs from each taxon for maximum likelihood inferences. This approach resolves many currently accepted hypotheses regarding internal relationships between the major groups of Arthropoda, including monophyletic Hexapoda, Tetraconata (Crustacea + Hexapoda), Myriapoda, and Chelicerata sensu lato (Pycnogonida + Euchelicerata). "Crustacea" is a paraphyletic group with some taxa more closely related to the monophyletic Hexapoda. These results support studies that have utilized more restricted EST data for phylogenetic inference, yet they differ in important regards from recently published phylogenies employing nuclear protein-coding sequences. The present results do not, however, depart from other phylogenies that resolve Branchiopoda as the crustacean sister group of Hexapoda. Like other molecular phylogenies, EST-derived phylogenies alone are unable to resolve morphological convergences or evolved reversals and thus omit what may be crucial events in the history of life. For example, molecular data are unable to resolve whether a Hexapod-Branchiopod sister relationship infers a branchiopod

  10. WEBSAGE: a web tool for visual analysis of differentially expressed human SAGE tags.

    PubMed

    Pylouster, Jean; Sénamaud-Beaufort, Catherine; Saison-Behmoaras, Tula Ester

    2005-07-01

    The serial analysis of gene expression (SAGE) is a powerful method to compare gene expression of mRNA populations. To provide quantitative expression levels on a genome-wide scale, the Cancer Genome Anatomy Project (CGAP) uses SAGE. Over 7 million SAGE tags, from 171 human cell types have been assembled. The growing number of laboratories involved in SAGE research necessitates the use of software that provides statistical analysis of raw data, allowing the rapid visualization and interpretation of results. We have created the first simple tool that performs statistical analysis on SAGE data, identifies the tags differentially expressed and shows the results in a scatter plot. It is freely available and accessible at http://bioserv.rpbs.jussieu.fr/websage/index.php. PMID:15980565

  11. Moving Away from the Reference Genome: Evaluating a Peptide Sequencing Tagging Approach for Single Amino Acid Polymorphism Identifications in the Genus Populus

    SciTech Connect

    Abraham, Paul E; Adams, Rachel M; Tuskan, Gerald A; Hettich, Robert L

    2013-01-01

    The genetic diversity across natural populations of the model organism, Populus, is extensive, containing a single nucleotide polymorphism roughly every 200 base pairs. When deviations from the reference genome occur in coding regions, they can impact protein sequences. Rather than relying on a static reference database to profile protein expression, we employed a peptide sequence tagging (PST) approach capable of decoding the plasticity of the Populus proteome. Using shotgun proteomics data from two genotypes of P. trichocarpa, a tag-based approach enabled the detection of 6,653 unexpected sequence variants. Through manual validation, our study investigated how the most abundant chemical modification (methionine oxidation) could masquerade as a sequence variant (Ala→Ser) when few site-determining ions existed. In fact, precise localization of an oxidation site for peptides with more than one potential placement was indeterminate for 70% of the MS/MS spectra. We demonstrate that additional fragment ions made available by high energy collisional dissociation enhance the robustness of the peptide sequence tagging approach (81% of oxidation events could be exclusively localized to a methionine). We are confident that augmenting fragmentation processes for a PST approach will further improve the identification of single amino acid polymorphisms in Populus and potentially other species as well.
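
    The ambiguity described above comes down to mass arithmetic: replacing alanine with serine adds one oxygen atom, the same shift as oxidizing a methionine. The sketch below is illustrative only; the monoisotopic masses are standard reference values, not figures from the abstract, and the function name is hypothetical.

```python
# Monoisotopic masses in daltons (standard reference values, not from the abstract).
RESIDUE_MASS = {"A": 71.03711, "S": 87.03203, "M": 131.04049}
OXYGEN = 15.99491  # mass added by a single oxidation (one oxygen atom)


def substitution_shift(old, new):
    """Return the mass change when residue `old` is replaced by residue `new`."""
    return RESIDUE_MASS[new] - RESIDUE_MASS[old]


ala_to_ser = substitution_shift("A", "S")
print(f"Ala->Ser shift: {ala_to_ser:.5f} Da")
print(f"Met oxidation:  {OXYGEN:.5f} Da")
print(f"difference:     {abs(ala_to_ser - OXYGEN):.5f} Da")
```

    Because the two shifts agree to well within instrument tolerance, only fragment ions that pin the extra mass to a specific residue can tell the two explanations apart.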

  12. Gene cataloging and expression profiling in human gastric cancer cells by expressed sequence tags.

    PubMed

    Kim, Nam-Soon; Hahn, Yoonsoo; Oh, Jung-Hwa; Lee, Ju-Yeon; Oh, Kyung-Jin; Kim, Jeong-Min; Park, Hong-Seog; Kim, Sangsoo; Song, Kyu-Sang; Rho, Seung-Moo; Yoo, Hyang-Sook; Kim, Yong Sung

    2004-06-01

    To understand the molecular mechanism associated with gastric carcinogenesis, we identified genes expressed in gastric cancer cell lines and tissues. Of 97,609 high-quality ESTs sequenced from 36 cDNA libraries, 92,545 were coalesced into 10,418 human Unigene clusters (Build 151). The gene expression profile was produced by counting the cluster frequencies in each library. Although the profiles of highly expressed genes varied greatly from library to library, those genes related to cell structure formation, heat shock proteins, the glycolysis pathway, and the signaling pathway were highly represented in human gastric cancer cell lines and in primary tumors. Conversely, the genes encoding immunoglobulins, ribosomal proteins, and digestive proteins were down-regulated in gastric cancer cell lines and tissues compared to normal tissues. The transcription levels of some of these genes were confirmed by RT-PCR. We found that genes related to cell adhesion, apoptosis, and cytoskeleton formation were particularly up-regulated in the gastric cancer cell lines established from malignant ascites compared to those from primary tumors. This comprehensive molecular profiling of human gastric cancer should be useful for elucidating the genetic events associated with human gastric cancer. PMID:15177556
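
    Building an expression profile by counting cluster frequencies per library can be sketched as follows. This is a minimal illustration, not the authors' pipeline; the library names and cluster identifiers are made up, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def expression_profile(assignments):
    """Count ESTs per cluster within each library.

    `assignments` is an iterable of (library, cluster) pairs, one per EST.
    """
    profile = defaultdict(Counter)
    for library, cluster in assignments:
        profile[library][cluster] += 1
    return profile


def relative_frequency(profile, library, cluster):
    """Fraction of a library's ESTs that fall in the given cluster."""
    total = sum(profile[library].values())
    return profile[library][cluster] / total if total else 0.0


# Made-up example data: four ESTs from a cell line library, two from normal tissue.
ests = [
    ("cell_line", "cluster_1"), ("cell_line", "cluster_1"),
    ("cell_line", "cluster_1"), ("cell_line", "cluster_2"),
    ("normal", "cluster_1"), ("normal", "cluster_2"),
]
profile = expression_profile(ests)
print(relative_frequency(profile, "cell_line", "cluster_1"))
print(relative_frequency(profile, "normal", "cluster_1"))
```

    Dividing by the library total matters because libraries differ in sequencing depth; raw counts alone would not be comparable between them.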

  13. Computational identification of microRNAs and their targets in Catharanthus roseus expressed sequence tags

    PubMed Central

    Pani, Alok; Mahapatra, Rajani Kanta

    2013-01-01

    No study has been performed on identifying microRNAs (miRNAs) and their targets in the medicinal plant, Catharanthus roseus. In the present study, using the comparative genomics approach, we have predicted two potential C. roseus miRNAs. Furthermore, twelve potential mRNA targets were identified in the C. roseus genome based on the characteristic that miRNAs exhibit perfect or nearly perfect complementarity with their targeted mRNA sequences. Many of these targets were predicted to encode enzymes that regulate the biosynthesis of terpenoid indole alkaloids (TIA). In addition, most of the predicted targets were genes coding for transcription factors which are mainly involved in cell growth and development, signaling and metabolism. This is the first in silico study to indicate that miRNAs target genes encoding enzymes involved in vinblastine and vincristine biosynthesis, which may help to understand the miRNA-mediated regulation of TIA biosynthesis in C. roseus. PMID:26484050
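
    The "perfect or nearly perfect complementarity" criterion can be illustrated by counting mismatches between a candidate target site and the reverse complement of the miRNA. The sketch below is a simplification under stated assumptions: it ignores G:U wobble pairing and gaps, the miRNA sequence is invented, and the mismatch cutoff is a hypothetical parameter that the abstract does not specify.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def mismatches(mirna, target_site):
    """Count positions where the target site departs from perfect pairing."""
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have equal length")
    expected = reverse_complement(mirna)
    return sum(a != b for a, b in zip(expected, target_site))


def is_candidate_target(mirna, target_site, max_mismatches=3):
    """Apply a hypothetical mismatch cutoff (not taken from the abstract)."""
    return mismatches(mirna, target_site) <= max_mismatches


mirna = "AUGGCUAGCUAGGCUAACGU"          # invented 20-nt sequence
perfect_site = reverse_complement(mirna)
print(mismatches(mirna, perfect_site))   # a perfectly complementary site
```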

  14. Computational identification of microRNAs and their targets in Catharanthus roseus expressed sequence tags.

    PubMed

    Pani, Alok; Mahapatra, Rajani Kanta

    2013-12-01

    No study has been performed on identifying microRNAs (miRNAs) and their targets in the medicinal plant, Catharanthus roseus. In the present study, using the comparative genomics approach, we have predicted two potential C. roseus miRNAs. Furthermore, twelve potential mRNA targets were identified in the C. roseus genome based on the characteristic that miRNAs exhibit perfect or nearly perfect complementarity with their targeted mRNA sequences. Many of these targets were predicted to encode enzymes that regulate the biosynthesis of terpenoid indole alkaloids (TIA). In addition, most of the predicted targets were genes coding for transcription factors which are mainly involved in cell growth and development, signaling and metabolism. This is the first in silico study to indicate that miRNAs target genes encoding enzymes involved in vinblastine and vincristine biosynthesis, which may help to understand the miRNA-mediated regulation of TIA biosynthesis in C. roseus. PMID:26484050

  15. Construction and characterization of subtractive stage-specific expressed sequence tag (EST) libraries of the pinewood nematode Bursaphelenchus xylophilus.

    PubMed

    Kang, Jae Soon; Lee, Hyoungseok; Moon, Il Sung; Lee, Yi; Koh, Young Ho; Je, Yeon Ho; Lim, Kook-Jin; Lee, Si Hyeock

    2009-07-01

    To establish expressed sequence tag databases of the two life stages (the dispersal and propagative stages) of pinewood nematode Bursaphelenchus xylophilus, subtractive EST libraries that were specific to the dispersal 4th larval stage (D4S) and the pine-grown propagative mixed (PGPS) stage were constructed by suppressed subtractive hybridization, and annotated by BLASTx and Gene Ontology (GO). A total of 1112 (57.7%) contigs from the D4S-cDNA library and 1215 (46.7%) contigs from the PGPS-specific cDNA libraries had matched BLASTx hits (E

  16. Fine Mutational Analysis of 2B8 and 3H7 Tag Epitopes with Corresponding Specific Monoclonal Antibodies.

    PubMed

    Kim, Tae-Lim; Cho, Man-Ho; Sangsawang, Kanidta; Bhoo, Seong Hee

    2016-06-30

    Bacteriophytochromes are phytochrome-like light-sensing photoreceptors that use biliverdin as a chromophore. To study the biochemical properties of the Deinococcus radiodurans bacteriophytochrome (DrBphP) protein, two anti-DrBphP mouse monoclonal antibodies (2B8 and 3H7) were generated. Their specific epitopes were identified in our previous report. We present here fine epitope mapping of these two antibodies by using truncation and substitution of original epitope sequences in order to identify minimized epitope peptides. The previously reported original epitope sequences for 2B8 and 3H7 were truncated from both sides. Our analysis showed that the minimal peptide sequence lengths for 2B8 and 3H7 antibodies were nine amino acids (RDPLPFFPP) and six amino acids (PGEIEE), respectively. We further characterized these peptides in order to investigate their reactivity after single deletion and single substitution of the original peptides. We found that single-substituted 2B8 epitope (RDPLPAFPP) and dual-substituted 3H7 epitope (PGEIAD) showed significantly increased reactivity. These two antibodies with high reactivity for the short modified peptide sequences are valuable for developing new peptide tags for protein research. PMID:27137090

  17. Fine Mutational Analysis of 2B8 and 3H7 Tag Epitopes with Corresponding Specific Monoclonal Antibodies

    PubMed Central

    Kim, Tae-Lim; Cho, Man-Ho; Sangsawang, Kanidta; Bhoo, Seong Hee

    2016-01-01

    Bacteriophytochromes are phytochrome-like light-sensing photoreceptors that use biliverdin as a chromophore. To study the biochemical properties of the Deinococcus radiodurans bacteriophytochrome (DrBphP) protein, two anti-DrBphP mouse monoclonal antibodies (2B8 and 3H7) were generated. Their specific epitopes were identified in our previous report. We present here fine epitope mapping of these two antibodies by using truncation and substitution of original epitope sequences in order to identify minimized epitope peptides. The previously reported original epitope sequences for 2B8 and 3H7 were truncated from both sides. Our analysis showed that the minimal peptide sequence lengths for 2B8 and 3H7 antibodies were nine amino acids (RDPLPFFPP) and six amino acids (PGEIEE), respectively. We further characterized these peptides in order to investigate their reactivity after single deletion and single substitution of the original peptides. We found that single-substituted 2B8 epitope (RDPLPAFPP) and dual-substituted 3H7 epitope (PGEIAD) showed significantly increased reactivity. These two antibodies with high reactivity for the short modified peptide sequences are valuable for developing new peptide tags for protein research. PMID:27137090

  18. Data for analysis of mannose-6-phosphate glycans labeled with fluorescent tags.

    PubMed

    Kang, Ji-Yeon; Kwon, Ohsuk; Gil, Jin Young; Oh, Doo-Byoung

    2016-06-01

    Mannose-6-phosphate (M-6-P) glycan plays an important role in lysosomal targeting of most therapeutic enzymes for treatment of lysosomal storage diseases. This article provides data for the analysis of M-6-P glycans by high-performance liquid chromatography (HPLC) and matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. The identities of M-6-P glycan peaks in HPLC profile were confirmed by measuring the masses of the collected peak eluates. The performances of three fluorescent tags (2-aminobenzoic acid [2-AA], 2-aminobenzamide [2-AB], and 3-(acetyl-amino)-6-aminoacridine [AA-Ac]) were compared focusing on the analysis of bi-phosphorylated glycan (containing two M-6-Ps). The bi-phosphorylated glycan analysis is highly affected by the attached fluorescent tag and the hydrophilicity of elution solvent used in HPLC. The data in this article is associated with the research article published in "Comparison of fluorescent tags for analysis of mannose-6-phosphate glycans" (Kang et al., 2016 [1]). PMID:27222848

  19. Data for analysis of mannose-6-phosphate glycans labeled with fluorescent tags

    PubMed Central

    Kang, Ji-Yeon; Kwon, Ohsuk; Gil, Jin Young; Oh, Doo-Byoung

    2016-01-01

    Mannose-6-phosphate (M-6-P) glycan plays an important role in lysosomal targeting of most therapeutic enzymes for treatment of lysosomal storage diseases. This article provides data for the analysis of M-6-P glycans by high-performance liquid chromatography (HPLC) and matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. The identities of M-6-P glycan peaks in HPLC profile were confirmed by measuring the masses of the collected peak eluates. The performances of three fluorescent tags (2-aminobenzoic acid [2-AA], 2-aminobenzamide [2-AB], and 3-(acetyl-amino)-6-aminoacridine [AA-Ac]) were compared focusing on the analysis of bi-phosphorylated glycan (containing two M-6-Ps). The bi-phosphorylated glycan analysis is highly affected by the attached fluorescent tag and the hydrophilicity of elution solvent used in HPLC. The data in this article is associated with the research article published in “Comparison of fluorescent tags for analysis of mannose-6-phosphate glycans” (Kang et al., 2016 [1]). PMID:27222848

  20. HPLC-APCI-MS analysis of triacylglycerols (TAGs) in historical pharmaceutical ointments from the eighteenth century.

    PubMed

    Saliu, Francesco; Modugno, Francesca; Orlandi, Marco; Colombini, Maria Perla

    2011-10-01

    The lipid fractions of residues from historical pharmaceutical ointments were analysed by reversed-phase liquid chromatography coupled with atmospheric pressure chemical ionization and mass spectrometer detection. The residues were contained in a series of historical apothecary jars, dating from the eighteenth century and conserved at the "Aboca Museum" in Sansepolcro (Arezzo, Italy) and at the pharmacy of the "Real Cartuja de Valldemossa" in Palma de Majorca (Spain). The analytical protocol was set up using a comparative study based on the evaluation of triacylglycerol (TAG) compositions in raw natural lipid materials and in laboratory-reproduced ointments. These ointments were prepared following pharmaceutical recipes reported in historical treatises and used as reference materials. The reference materials were also subjected to stress treatments in order to evaluate the modification occurring in the TAG profiles as an effect of ageing. TAGs were successfully detected in the reproduced formulations even in mixtures of up to ten ingredients and after harsh degradative treatments, and also in real historical samples. No particular interferences were detected from other non-lipid ingredients of the formulations. The TAG compositions detected in the historical ointments indicated a predominant use of olive oil and pig adipose material as lipid ingredients. The detection of a high level of tristearine and myristyl-palmitoyl-stearyl glycerol in two of the samples suggested the presence of a fatty material of a different origin (maybe a ruminant). On the basis of the positional isomer ratio, sn-PPO/sn-POP, it was possible to hypothesize an exclusive use of pig fat in one sample. We also evaluated the application of principal component analysis of TAG profiles as an approach for the multivariate statistical comparison of the reference and historical ointments. PMID:21713420

  1. Proteomic analysis of peptides tagged with dimedone and related probes.

    PubMed

    Martínez-Acedo, Pablo; Gupta, Vinayak; Carroll, Kate S

    2014-04-01

    Owing to its labile nature, a new role for cysteine sulfenic acid (-SOH) modification has emerged. This oxidative modification modulates protein function by acting as a redox switch during cellular signaling. The identification of proteins that undergo this modification represents a methodological challenge, and its resolution remains a matter of current interest. The development of strategies to chemically modify cysteinyl-containing peptides for liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis has increased significantly within the past decade. The method of choice to selectively label sulfenic acid is based on the use of dimedone or its derivatives. For these chemical probes to be effective on a proteome-wide level, their reactivity toward -SOH must be high to ensure reaction completion. In addition, the presence of an adduct should not interfere with electrospray ionization, the efficiency of induced dissociation in MS/MS experiments or with the identification of Cys-modified peptides by automated database searching algorithms. Herein, we employ a targeted proteomics approach to study the electrospray ionization and fragmentation effects of different -SOH specific probes and compare them to commonly used alkylating agents. We then extend our study to a whole proteome extract using shotgun proteomic approaches. These experiments enable us to demonstrate that dimedone adducts neither interfere with electrospray by suppressing the ionization nor impede product ion assignment by automated search engines, which detect a +138 Da increase from unmodified peptides. Collectively, these results suggest that dimedone can be a powerful tool to identify sulfenic acid modifications by high-throughput shotgun proteomics of a whole proteome. PMID:24719340

  2. N-terminal sequence tagging using reliably determined b2 ions: a useful approach to deconvolute tandem mass spectra of co-fragmented peptides in proteomics.

    PubMed

    Kryuchkov, Fedor; Verano-Braga, Thiago; Kjeldsen, Frank

    2014-05-30

    With the recent introduction of higher-energy collisional dissociation (HCD) in Orbitrap mass spectrometry, the popularity of that technique has grown tremendously in the proteomics society. HCD spectra, however, are characterized by a limited distribution of bn-type ions, which permit the generation of reliable sequence tags based on complementary b,y pairs both for de novo sequencing and sequence tagging strategies. Instead, most peptide HCD spectra (~95%) are dominated with b2 ions. In this work, we analyzed positive predictive values of b2 ions in HCD, and found that b2 ions can be determined with >97% certainty in the presence of a2 and its complementary yn-2 ions. Analytically, b2 ions provide information on the composition of the first two N-terminal amino acids in peptides. Their utilization in N-terminal sequence tagging leads to a significant decrease in false discovery rate by filtering out false positives while retaining true positive identifications. As a consequence, the number of peptide spectrum matches (PSMs) increased by 4.8% at fixed FDR (1%). This approach allows for deconvolution of mixture spectra and increased the number of PSM to 9.2% in a complex human sample and to 24% in a complex sample of synthetic peptides at 1% FDR. PMID:24726481
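
    The statement that b2 ions report only the composition of the first two residues follows from how their mass is computed: the singly charged b2 m/z is the sum of the two residue masses plus a proton, and the a2 ion lies one CO below it. The sketch below uses standard monoisotopic values (not figures from the abstract) for a handful of residues; the function names are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "E": 129.04259, "K": 128.09496,
}
PROTON = 1.007276
CO = 27.994915  # mass lost going from a b ion to the matching a ion


def b2_mz(peptide):
    """Singly charged b2 m/z: first two residue masses plus one proton."""
    first, second = peptide[0], peptide[1]
    return RESIDUE_MASS[first] + RESIDUE_MASS[second] + PROTON


def a2_mz(peptide):
    """Singly charged a2 m/z: the b2 ion minus CO."""
    return b2_mz(peptide) - CO


for peptide in ("AGLK", "GALK"):
    print(peptide, round(b2_mz(peptide), 4), round(a2_mz(peptide), 4))
```

    The two example peptides give identical b2 and a2 values, which is why a b2 ion constrains which two residues start the peptide but not their order.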

  3. Identification and functional characterization of effectors in expressed sequence tags from various life cycle stages of the potato cyst nematode Globodera pallida.

    PubMed

    Jones, John T; Kumar, Amar; Pylypenko, Liliya A; Thirugnanasambandam, Amarnath; Castelli, Lydia; Chapman, Sean; Cock, Peter J A; Grenier, Eric; Lilley, Catherine J; Phillips, Mark S; Blok, Vivian C

    2009-11-01

    In this article, we describe the analysis of over 9000 expressed sequence tags (ESTs) from cDNA libraries obtained from various life cycle stages of Globodera pallida. We have identified over 50 G. pallida effectors from this dataset using bioinformatics analysis, by screening clones in order to identify secreted proteins up-regulated after the onset of parasitism and using in situ hybridization to confirm the expression in pharyngeal gland cells. A substantial gene family encoding G. pallida SPRYSEC proteins has been identified. The expression of these genes is restricted to the dorsal pharyngeal gland cell. Different members of the SPRYSEC family of proteins from G. pallida show different subcellular localization patterns in plants, with some localized to the cytoplasm and others to the nucleus and nucleolus. Differences in subcellular localization may reflect diverse functional roles for each individual protein or, more likely, variety in the compartmentalization of plant proteins targeted by the nematode. Our data are therefore consistent with the suggestion that the SPRYSEC proteins suppress host defences, as suggested previously, and that they achieve this through interaction with a range of host targets. PMID:19849787

  4. Mining expressed sequence tags of rapeseed (Brassica napus L.) to predict the drought responsive regulatory network.

    PubMed

    Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Ebrahimie, Esmaeil

    2015-07-01

    It is of great significance to understand the regulatory mechanisms by which plants deal with drought stress. Two EST libraries derived from rapeseed (Brassica napus) leaves in non-stressed and drought stress conditions were analyzed in order to obtain the transcriptomic landscape of drought-exposed B. napus plants, and also to identify and characterize significant drought responsive regulatory genes and microRNAs. The functional ontology analysis revealed a substantial shift in the B. napus transcriptome to govern cellular drought responsiveness via different stress-activated mechanisms. The activity of transcription factor and protein kinase modules generally increased in response to drought stress. The 26 regulatory genes consisting of 17 transcription factor genes, eight protein kinase genes and one protein phosphatase gene were identified showing significant alterations in their expressions in response to drought stress. We also found the six microRNAs which were differentially expressed during drought stress supporting the involvement of a post-transcriptional level of regulation for B. napus drought response. The drought responsive regulatory network shed light on the significance of some regulatory components involved in biosynthesis and signaling of various plant hormones (abscisic acid, auxin and brassinosteroids), ubiquitin proteasome system, and signaling through Reactive Oxygen Species (ROS). Our findings suggested a complex and multi-level regulatory system modulating response to drought stress in B. napus. PMID:26261397

  5. The dynamics of the bacterial diversity in the redox transition and anoxic zones of the Cariaco Basin assessed by parallel tag sequencing.

    PubMed

    Rodriguez-Mora, Maria J; Scranton, Mary I; Taylor, Gordon T; Chistoserdov, Andrei Y

    2015-09-01

    Massively parallel tag sequencing was applied to describe the bacterial diversity in the redox transition and anoxic zones of the Cariaco Basin. In total, 14 samples from the Cariaco Basin were collected over a period of eight years from two stations. A total of 244 357 unique bacterial V6 amplicons were sequenced. The total number of operational taxonomic units (OTUs) found in this study was 4692, with a range of 511-1491 OTUs per sample. Approximately 95% of the OTUs found in the redox transition zone and anoxic layers of Cariaco are represented by less than 50 amplicons suggesting that only about 5% of the bacterial OTUs are responsible for the bulk of the microbial processes in the basin redox transition and anoxic zones. The same dominant OTUs were observed across all eight years of sampling although periodic fluctuations in their proportion were apparent. No distinctive differences were observed between the bacterial communities from the redox transition and anoxic layers of the Cariaco Basin water column. The largest proportion of amplicons belongs to Gammaproteobacteria represented mostly by sulfide oxidizers, followed by Marine Group A (originally described as SAR406; Gordon and Giovannoni 1996), a group of uncultured bacteria hypothesized to be involved in metal reduction, and sulfate-reducing Deltaproteobacteria. Gammaproteobacteria, Deltaproteobacteria and Marine Group A make up 67-90% of all V6 amplicons sequenced in this study. This strongly suggests that the basin's microbial communities are actively involved in the sulfur-related metabolism and coupling of the sulfur and carbon cycles. According to detrended canonical correspondence analysis, ecological factors such as chemoautotrophy, nitrate and oxidized and reduced sulfur compounds influence the structuring and distribution of the Cariaco microbial communities. PMID:26209697
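
    The rare-versus-dominant OTU comparison above can be reproduced from a table of amplicon counts per OTU. The sketch below is a minimal illustration with invented counts; the 50-amplicon threshold is the one quoted in the abstract, and the function name is hypothetical.

```python
def rare_otu_summary(otu_counts, threshold=50):
    """Summarize how many OTUs, and how many amplicons, fall below a threshold.

    `otu_counts` maps an OTU identifier to its amplicon count.
    """
    if not otu_counts:
        raise ValueError("otu_counts must not be empty")
    total_amplicons = sum(otu_counts.values())
    rare = {otu: n for otu, n in otu_counts.items() if n < threshold}
    return {
        "rare_otu_fraction": len(rare) / len(otu_counts),
        "rare_amplicon_fraction": sum(rare.values()) / total_amplicons,
    }


# Invented counts: one dominant OTU and three rare ones.
counts = {"otu_a": 9000, "otu_b": 12, "otu_c": 3, "otu_d": 40}
summary = rare_otu_summary(counts)
print(summary)
```

    Reporting both fractions side by side shows the pattern the abstract describes: most OTUs can be rare while contributing only a small share of the amplicons.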

  6. Analysis of the early-flowering mechanisms and generation of T-DNA tagging lines in Kitaake, a model rice cultivar.

    PubMed

    Kim, Song Lim; Choi, Minkyung; Jung, Ki-Hong; An, Gynheung

    2013-11-01

    As an extremely early flowering cultivar, rice cultivar Kitaake is a suitable model system for molecular studies. Expression analyses revealed that transcript levels of the flowering repressor Ghd7 were decreased while those of its downstream genes, Ehd1, Hd3a, and RFT1, were increased. Sequencing the known flowering-regulator genes revealed mutations in Ghd7 and OsPRR37 that cause early translation termination and amino acid substitutions, respectively. Genetic analysis of F2 progeny from a cross between cv. Kitaake and cv. Dongjin indicated that those mutations additively contribute to the early-flowering phenotype in cv. Kitaake. Because the short life cycle facilitates genetics research, this study generated 10 000 T-DNA tagging lines and deduced 6758 flanking sequence tags (FSTs), in which 3122 were genic and 3636 were intergenic. Among the genic lines, 367 (11.8%) were inserted into new genes that were not previously tagged. Because the lines were generated by T-DNA that contained the promoterless GUS reporter gene, which had an intron with triple splicing donors/acceptors in the right border region, a high efficiency of GUS expression was shown in various organs. Sequencing of the GUS-positive lines demonstrated that the third splicing donor and the first splicing acceptor of the vector were extensively used. The FST data have now been released into the public domain for seed distribution and facilitation of rice research. PMID:23966593
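
    The counts reported above are internally consistent, as a quick check shows. The variable names are illustrative; all numbers come from the abstract.

```python
genic, intergenic, reported_total = 3122, 3636, 6758
new_gene_lines = 367

# Genic and intergenic FSTs should add up to the reported total.
assert genic + intergenic == reported_total

# Share of genic lines inserted into previously untagged genes.
pct_new = 100 * new_gene_lines / genic
print(f"{pct_new:.1f}% of genic lines hit previously untagged genes")  # 11.8%
```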

  7. Biological Sequence Analysis with Multivariate String Kernels.

    PubMed

    Kuksa, Pavel P

    2013-03-01

    String kernel-based machine learning methods have yielded great success in practical tasks of structured/sequential data analysis. They often exhibit state-of-the-art performance on many practical tasks of sequence analysis such as biological sequence classification, remote homology detection, or protein superfamily and fold prediction. However, typical string kernel methods rely on analysis of discrete one-dimensional (1D) string data (e.g., DNA or amino acid sequences). In this work we address the multi-class biological sequence classification problems using multivariate representations in the form of sequences of feature vectors (as in biological sequence profiles, or sequences of individual amino acid physico-chemical descriptors) and a class of multivariate string kernels that exploit these representations. On a number of protein sequence classification tasks, the proposed multivariate representations and kernels show significant 15-20% improvements compared to existing state-of-the-art sequence classification methods. PMID:23509193
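
    For orientation, the discrete 1D baseline that the abstract contrasts with its multivariate kernels can be sketched as a k-mer spectrum kernel: two sequences are compared by the inner product of their k-mer count vectors. This is a generic textbook construction, not the author's multivariate kernel, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k=3):
    """Inner product of the k-mer count vectors of two sequences."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    # Counter returns 0 for k-mers absent from y, so only shared k-mers contribute.
    return sum(n * cy[kmer] for kmer, n in cx.items())


print(spectrum_kernel("ACGTACGT", "ACGTTTTT", k=3))
print(spectrum_kernel("ACGT", "TGCA", k=2))
```

    A multivariate kernel of the kind the abstract describes would replace the exact k-mer match with a comparison of feature-vector sequences; that step is not shown here.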

  8. SAGExplore: a web server for unambiguous tag mapping in serial analysis of gene expression oriented to gene discovery and annotation.

    PubMed

    Norambuena, Tomás; Malig, Rodrigo; Melo, Francisco

    2007-07-01

    We describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html. PMID:17626053
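
    The tag ambiguity that the server addresses can be seen in a simplified model of virtual tag extraction. The sketch below assumes the conventional SAGE setup (NlaIII anchoring site CATG, with a 10-base tag taken downstream of the 3'-most site in each transcript); it is not the server's genome-based method or its confidence ranking, and the transcript sequences and names are invented.

```python
from collections import defaultdict

ANCHOR = "CATG"  # NlaIII recognition site, the usual SAGE anchoring enzyme
TAG_LEN = 10     # assumed tag length for conventional (short) SAGE


def virtual_tag(transcript):
    """Return the tag following the 3'-most anchor site, or None if unavailable."""
    pos = transcript.rfind(ANCHOR)
    if pos == -1:
        return None
    start = pos + len(ANCHOR)
    tag = transcript[start:start + TAG_LEN]
    return tag if len(tag) == TAG_LEN else None


def tag_index(transcripts):
    """Map each virtual tag to the set of transcripts that produce it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        tag = virtual_tag(seq.upper())
        if tag is not None:
            index[tag].add(name)
    return index


# Invented transcripts: gene_a and gene_b share a tag, gene_c is unique.
transcripts = {
    "gene_a": "GGGCATGAAACCCGGGTTTAA",
    "gene_b": "TTCATGAAACCCGGGTCC",
    "gene_c": "ACATGCCCCCGGGGGAT",
}
index = tag_index(transcripts)
ambiguous = {tag: genes for tag, genes in index.items() if len(genes) > 1}
print(ambiguous)
```

    Tags that map to more than one transcript are the ambiguous assignments; ranking them requires extra evidence of the kind the abstract attributes to the server.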

  9. Novel Cysteine Tags for the Sequencing of Non-Tryptic Disulfide Peptides of Anurans: ESI-MS Study of Fragmentation Efficiency

    NASA Astrophysics Data System (ADS)

    Samgina, Tatyana Y.; Vorontsov, Egor A.; Gorshkov, Vladimir A.; Artemenko, Konstantin A.; Nifant'ev, Ilya E.; Kanawati, Basem; Schmitt-Kopplin, Philippe; Zubarev, Roman A.; Lebedev, Albert T.

    2011-12-01

    Mass spectrometry faces considerable difficulties in de novo sequencing of long non-tryptic peptides with S-S bonds. Long disulfide-containing peptides brevinins 1E and 2Ec from frog Rana ridibunda were reduced and alkylated with nine novel and three known derivatizing agents. Eight of the novel reagents are maleimide derivatives. Modified samples were subjected to MS/MS studies on FT-ICR and Orbitrap mass spectrometers using CAD/HCD or ECD/ETD techniques. Procedures, fragmentation patterns, and sequence coverage for two peptides modified with 12 tags are described. ECD/ETD and CAD fragmentation revealed complementary sequence information. Higher-energy collisionally activated dissociation (HCD) sufficiently enhanced y-ion formation for brevinin 1E, but not for brevinin 2Ec. Some novel tags [N-benzylmaleimide, N-(2,6-dimethylphenyl)maleimide] along with known N-phenylmaleimide and iodoacetic acid showed high total sequence coverage taking into account combined ETD and HCD fragmentation. Moreover, modification of long (34 residues) brevinin 2Ec with N-benzylmaleimide or N-(2,6-dimethylphenyl)maleimide yielded high sequence coverage and full C-terminal sequence determination with ECD alone.

  10. Long-range effects of tag sequence on marginally stabilized structure in HIV-1 p24 capsid protein monitored using NMR.

    PubMed

    Okazaki, Honoka; Kaneko, Chie; Hirahara, Miyuki; Watanabe, Satoru; Tochio, Naoya; Kigawa, Takanori; Nishimura, Chiaki

    2014-09-01

    The N-terminal domain of HIV-1 p24 capsid protein is a globular fold composed of seven helices and two β-strands with a flexible structure including the α4-5 loop and both N- and C-terminal ends. However, the protein shows a high tendency (48%) for an intrinsically disordered structure based on the PONDR VL-XT prediction from the primary sequence. To assess the possibility of marginally stabilized structure under physiological conditions, the N-terminal domain of p24 was destabilized by the addition of an artificial flexible tag to either N- or C-terminal ends, and it was analyzed using T1, T2, hetero-nuclear NOE, and amide-proton exchange experiments. When the C-terminal tag (12 residues) was attached, the regions of the α3-4 loop and helix 6 as well as the α4-5 loop attained flexible structures. Furthermore, in the protein containing the N-terminal tag (27 residues), helix 4 in addition to the above-mentioned area including α3-4 and α4-5 loops as well as helix 6 exhibited highly disordered structures. Thus, the long-range effects of the tag sequence were observed in the stepwise appearance of disordered structures (step 1: α4-5 loop, step 2: α3-4 loop and helix 6, and step 3: helix 4). Furthermore, the disordered regions in tagged proteins were consistent with the PONDR VL-XT disordered prediction. The dynamic structure located in the middle part (α3-4 loop to helix 6) of the protein shown in this study may be related to the assembly of the viral particle. PMID:24960591

  11. Functional categorization of unique expressed sequence tags obtained from the yeast-like growth phase of the elm pathogen Ophiostoma novo-ulmi

    PubMed Central

    2011-01-01

    Background The highly aggressive pathogenic fungus Ophiostoma novo-ulmi continues to be a serious threat to the American elm (Ulmus americana) in North America. Extensive studies have been conducted in North America to understand the mechanisms of virulence of this introduced pathogen and its evolving population structure, with a view to identifying potential strategies for the control of Dutch elm disease. As part of a larger study to examine the genomes of economically important Ophiostoma spp. and the genetic basis of virulence, we have constructed an expressed sequence tag (EST) library using total RNA extracted from the yeast-like growth phase of O. novo-ulmi (isolate H327). Results A total of 4,386 readable EST sequences were annotated by determining their closest matches to known or theoretical sequences in public databases by BLASTX analysis. Searches matched 2,093 sequences to entries found in Genbank, including 1,761 matches with known proteins and 332 matches with unknown (hypothetical/predicted) proteins. Known proteins included a collection of 880 unique transcripts which were categorized to obtain a functional profile of the transcriptome and to evaluate physiological function. These assignments yielded 20 primary functional categories (FunCat), the largest including Metabolism (FunCat 01, 20.28% of total), Sub-cellular localization (70, 10.23%), Protein synthesis (12, 10.14%), Transcription (11, 8.27%), Biogenesis of cellular components (42, 8.15%), Cellular transport, facilitation and routes (20, 6.08%), Classification unresolved (98, 5.80%), Cell rescue, defence and virulence (32, 5.31%) and the unclassified category, or known sequences of unknown metabolic function (99, 7.5%). A list of specific transcripts of interest was compiled to initiate an evaluation of their impact upon strain virulence in subsequent studies. Conclusions This is the first large-scale study of the O. novo-ulmi transcriptome. The expression profile obtained from the yeast

  12. A capture-recapture survival analysis model for radio-tagged animals

    USGS Publications Warehouse

    Pollock, K.H.; Bunck, C.M.; Winterstein, S.R.; Chen, C.-L.

    1995-01-01

    In recent years, survival analysis of radio-tagged animals has developed using methods based on the Kaplan-Meier method used in medical and engineering applications (Pollock et al., 1989a,b). An important assumption of this approach is that all tagged animals with a functioning radio can be relocated at each sampling time with probability 1. This assumption may not always be reasonable in practice. In this paper, we show how a general capture-recapture model can be derived which allows for some probability (less than one) for animals to be relocated. This model is not simply a Jolly-Seber model because it is possible to relocate both dead and live animals, unlike when traditional tagging is used. The model can also be viewed as a generalization of the Kaplan-Meier procedure, thus linking the Jolly-Seber and Kaplan-Meier approaches to survival estimation. We present maximum likelihood estimators and discuss testing between submodels. We also discuss model assumptions and their validity in practice. An example is presented based on canvasback data collected by G. M. Haramis of Patuxent Wildlife Research Center, Laurel, Maryland, USA.
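
    For orientation, the Kaplan-Meier procedure that this abstract generalizes can be sketched as a product-limit estimator. This is only an illustration, not the authors' capture-recapture model: the function name and the toy data are hypothetical, and the sketch assumes every animal still at risk is relocated at each sampling time (the probability-1 assumption the paper relaxes).

```python
from collections import Counter


def kaplan_meier(times, died):
    """Product-limit survival estimate.

    times: last observation time for each animal
    died:  True if death was observed at that time, False if censored
    Returns a list of (time, survival probability), one per death time.
    """
    deaths = Counter(t for t, d in zip(times, died) if d)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Animals still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: five animals, two of them censored
# (for example, radio failure or end of study).
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
```

    With these made-up inputs the survival estimate drops at times 2, 3 and 5; censored animals leave the risk set without lowering the estimate.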

  13. Development and optimization of sequence-tagged microsatellite site markers to detect genetic diversity within Colletotrichum capsici, a causal agent of chilli pepper anthracnose disease.

    PubMed

    Ranathunge, N P; Ford, R; Taylor, P W J

    2009-07-01

    Genomic libraries enriched for microsatellites from Colletotrichum capsici, one of the major causal agents of anthracnose disease in chilli pepper (Capsicum spp.), were developed using a modified hybridization procedure. Twenty-seven robust primer pairs were designed from microsatellite flanking sequences and were characterized using 52 isolates from three countries: India, Sri Lanka and Thailand. The highest gene diversity, 0.857, was observed at CCSSR1, with up to 18 alleles among all the isolates, whereas the differentiation ranged from 0.05 to 0.45. The sequence-tagged microsatellite site markers developed in this study will be useful for genetic analyses of C. capsici populations. PMID:21564867
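
    The abstract does not say which estimator produced the gene diversity value. A common definition is Nei's expected heterozygosity, H = 1 - sum(p_i^2), where p_i is the frequency of allele i at a locus. The sketch below assumes that definition; the function name and the allele calls are hypothetical and are not data from the study.

```python
from collections import Counter


def gene_diversity(alleles):
    """Expected heterozygosity H = 1 - sum(p_i^2) at a single locus."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical allele calls (fragment sizes in bp) for eight isolates
h = gene_diversity([180, 180, 184, 188, 188, 188, 192, 196])
```

    Under this definition, a locus with many alleles at similar frequencies approaches 1, while a locus fixed for one allele gives 0.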

  14. Bacterial diversity assessment of pristine mangrove microbial community from Dhulibhashani, Sundarbans using 16S rRNA gene tag sequencing.

    PubMed

    Basak, Pijush; Pramanik, Arnab; Sengupta, Sohan; Nag, Sudip; Bhattacharyya, Anish; Roy, Debojyoti; Pattanayak, Rudradip; Ghosh, Abhrajyoti; Chattopadhyay, Dhrubajyoti; Bhattacharyya, Maitree

    2016-03-01

    The global knowledge of microbial diversity and function in the Sundarbans ecosystem is still scarce, despite global advancement in understanding microbial diversity. In the present study, we have analyzed the diversity and distribution of bacteria in the tropical mangrove sediments of the Sundarbans using 16S rRNA gene amplicon sequencing. The metagenome comprises 1,53,926 sequences with 108.8 Mbp of data and 55 ± 2% G + C content. Metagenome sequence data are available at NCBI under the Bioproject database with accession no. PRJNA245459. Bacterial community metagenome sequences were analyzed with MG-RAST software, which indicated the presence of 56,547 species belonging to 44 different phyla. The taxonomic analysis revealed the dominance of the phylum Proteobacteria within our dataset. Further taxonomic analysis revealed an abundance of Bacteroidetes, Acidobacteria, Firmicutes, Actinobacteria, Nitrospirae, Cyanobacteria, Planctomycetes and Fusobacteria as the predominant bacterial assemblages in this largely pristine mangrove habitat. The distribution of the different community datasets, obtained from four sediment samples taken at one sampling station at two different depths, provides a better understanding of the sediment bacterial diversity and its relationship to the ecosystem dynamics of this pristine mangrove sediment of Dhulibhashani, Sundarbans. PMID:26981367

  15. Bacterial diversity assessment of pristine mangrove microbial community from Dhulibhashani, Sundarbans using 16S rRNA gene tag sequencing

    PubMed Central

    Basak, Pijush; Pramanik, Arnab; Sengupta, Sohan; Nag, Sudip; Bhattacharyya, Anish; Roy, Debojyoti; Pattanayak, Rudradip; Ghosh, Abhrajyoti; Chattopadhyay, Dhrubajyoti; Bhattacharyya, Maitree

    2015-01-01

    The global knowledge of microbial diversity and function in the Sundarbans ecosystem is still scarce, despite global advancement in understanding microbial diversity. In the present study, we have analyzed the diversity and distribution of bacteria in the tropical mangrove sediments of the Sundarbans using 16S rRNA gene amplicon sequencing. The metagenome comprises 1,53,926 sequences with 108.8 Mbp of data and 55 ± 2% G + C content. Metagenome sequence data are available at NCBI under the Bioproject database with accession no. PRJNA245459. Bacterial community metagenome sequences were analyzed with MG-RAST software, which indicated the presence of 56,547 species belonging to 44 different phyla. The taxonomic analysis revealed the dominance of the phylum Proteobacteria within our dataset. Further taxonomic analysis revealed an abundance of Bacteroidetes, Acidobacteria, Firmicutes, Actinobacteria, Nitrospirae, Cyanobacteria, Planctomycetes and Fusobacteria as the predominant bacterial assemblages in this largely pristine mangrove habitat. The distribution of the different community datasets, obtained from four sediment samples taken at one sampling station at two different depths, provides a better understanding of the sediment bacterial diversity and its relationship to the ecosystem dynamics of this pristine mangrove sediment of Dhulibhashani, Sundarbans. PMID:26981367

  16. Spiral MR myocardial tagging.

    PubMed

    Ryf, Salome; Kissinger, Kraig V; Spiegel, Marcus A; Börnert, Peter; Manning, Warren J; Boesiger, Peter; Stuber, Matthias

    2004-02-01

    In the present study, complementary spatial modulation of magnetization (CSPAMM) myocardial tagging was extended with an interleaved spiral imaging sequence. The use of a spiral sequence enables the acquisition of grid-tagged images with a tagline distance as low as 4 mm in a single breath-hold. Alternatively, a high temporal resolution of 77 frames per second was obtained with 8-mm grid spacing. Ten healthy adult subjects were studied. With this new approach, high-quality images can be obtained and the tags persist throughout the entire cardiac cycle. PMID:14755646

  17. EGENES: Transcriptome-Based Plant Database of Genes with Metabolic Pathway Information and Expressed Sequence Tag Indices in KEGG

    PubMed Central

    Masoudi-Nejad, Ali; Goto, Susumu; Jauregui, Ruy; Ito, Masumi; Kawashima, Shuichi; Moriya, Yuki; Endo, Takashi R.; Kanehisa, Minoru

    2007-01-01

    EGENES is a knowledge-based database for efficient analysis of plant expressed sequence tags (ESTs) that was recently added to the KEGG suite of databases. It links plant genomic information with higher order functional information in a single database. It also provides gene indices for each genome. The genomic information in EGENES is a collection of EST contigs constructed from assembly of ESTs. Due to the extremely large genomes of plant species, the bulk collection of data such as ESTs is a quick way to capture a complete repertoire of genes expressed in an organism. Using ESTs for reconstructing metabolic pathways is a new expansion in KEGG and provides researchers with a new resource for species in which only EST sequences are available. Functional annotation in EGENES is a process of linking a set of genes/transcripts in each genome with a network of interacting molecules in the cell. EGENES is a multispecies, integrated resource consisting of genomic, chemical, and network information containing a complete set of building blocks (genes and molecules) and wiring diagrams (biological pathways) to represent cellular functions. Using EGENES, genome-based pathway annotation and EST-based annotation can now be compared and mutually validated. The ultimate goals of EGENES will be to: bring new plant species into KEGG by clustering and annotating ESTs; abstract knowledge and principles from large-scale plant EST data; and improve computational prediction of systems of higher complexity. EGENES will be updated at least once a year. EGENES is publicly available and is accessible by the following link or by KEGG's navigation system (http://www.genome.jp/kegg-bin/create_kegg_menu?category=plants_egenes). PMID:17468225

  18. Whole-genome sequence-based analysis of thyroid function

    PubMed Central

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J.; Traglia, Michela; Brown, Suzanne J.; Mullin, Benjamin H.; Shihab, Hashem A.; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R.; Beilby, John P.; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D.; Hui, Jennie; Lim, Ee M.; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R.B.; Bell, Jordana T.; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L.; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M.; Naitza, Silvia; Walsh, John P.; Spector, Tim; Davey Smith, George; Durbin, Richard; Brent Richards, J.; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J.; Wilson, Scott G.; Turki, Saeed Al; Anderson, Carl; Anney, Richard; Antony, Dinu; Artigas, Maria Soler; Ayub, Muhammad; Balasubramaniam, Senduran; Barrett, Jeffrey C.; Barroso, Inês; Beales, Phil; Bentham, Jamie; Bhattacharya, Shoumo; Birney, Ewan; Blackwood, Douglas; Bobrow, Martin; Bochukova, Elena; Bolton, Patrick; Bounds, Rebecca; Boustred, Chris; Breen, Gerome; Calissano, Mattia; Carss, Keren; Chatterjee, Krishna; Chen, Lu; Ciampi, Antonio; Cirak, Sebhattin; Clapham, Peter; Clement, Gail; Coates, Guy; Collier, David; Cosgrove, Catherine; Cox, Tony; Craddock, Nick; Crooks, Lucy; Curran, Sarah; Curtis, David; Daly, Allan; Day-Williams, Aaron; Day, Ian N.M.; Down, Thomas; Du, Yuanping; Dunham, Ian; Edkins, Sarah; Ellis, Peter; Evans, David; Faroogi, Sadaf; Fatemifar, Ghazaleh; Fitzpatrick, David R.; Flicek, Paul; Flyod, James; Foley, A. Reghan; Franklin, Christopher S.; Futema, Marta; Gallagher, Louise; Geihs, Matthias; Geschwind, Daniel; Griffin, Heather; Grozeva, Detelina; Guo, Xueqin; Guo, Xiaosen; Gurling, Hugh; Hart, Deborah; Hendricks, Audrey; Holmans, Peter; Howie, Bryan; Huang, Liren; Hubbard, Tim; Humphries, Steve E.; Hurles, Matthew E.; Hysi, Pirro; Jackson, David K.; Jamshidi, Yalda; Jing, Tian; Joyce, Chris; Kaye, Jane; Keane, Thomas; Keogh, Julia; Kemp, John; Kennedy, Karen; Kolb-Kokocinski, Anja; Lachance, Genevieve; Langford, Cordelia; Lawson, Daniel; Lee, Irene; Lek, Monkol; Liang, Jieqin; Lin, Hong; Li, Rui; Li, Yingrui; Liu, Ryan; Lönnqvist, Jouko; Lopes, Margarida; Lotchkova, Valentina; MacArthur, Daniel; Marchini, Jonathan; Maslen, John; Massimo, Mangino; Mathieson, Iain; Marenne, Gaëlle; McGuffin, Peter; McIntosh, Andrew; McKechanie, Andrew G.; McQuillin, Andrew; Metrustry, Sarah; Mitchison, Hannah; Moayyeri, Alireza; Morris, James; Muntoni, Francesco; Northstone, Kate; O'Donnovan, Michael; Onoufriadis, Alexandros; O'Rahilly, Stephen; Oualkacha, Karim; Owen, Michael J.; Palotie, Aarno; Panoutsopoulou, Kalliope; Parker, Victoria; Parr, Jeremy R.; Paternoster, Lavinia; Paunio, Tiina; Payne, Felicity; Pietilainen, Olli; Plagnol, Vincent; Quaye, Lydia; Quai, Michael A.; Raymond, Lucy; Rehnström, Karola; Richards, Brent; Ring, Susan; Ritchie, Graham R.S.; Roberts, Nicola; Savage, David B.; Scambler, Peter; Schiffels, Stephen; Schmidts, Miriam; Schoenmakers, Nadia; Semple, Robert K.; Serra, Eva; Sharp, Sally I.; Shin, So-Youn; Skuse, David; Small, Kerrin; Southam, Lorraine; Spasic-Boskovic, Olivera; Clair, David St; Stalker, Jim; Stevens, Elizabeth; Pourcian, Beate St; Sun, Jianping; Suvisaari, Jaana; Tachmazidou, Ionna; Tobin, Martin D.; Valdes, Ana; Kogelenberg, Margriet Van; Vijayarangakannan, Parthiban; Visscher, Peter M.; Wain, Louise V.; Walters, James T.R.; Wang, Guangbiao; Wang, Jun; Wang, Yu; Ward, Kirsten; Wheeler, Elanor; Whyte, Tamieka; Williams, Hywel; Williamson, Kathleen A.; Wilson, Crispian; Wong, Kim; Xu, ChangJiang; Yang, Jian; Zhang, Fend; Zhang, Pingbo

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10^-9) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10^-14). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10^-9) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10^-11). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function. PMID:25743335

  19. Whole-genome sequence-based analysis of thyroid function.

    PubMed

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J; Traglia, Michela; Brown, Suzanne J; Mullin, Benjamin H; Shihab, Hashem A; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R; Beilby, John P; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D; Hui, Jennie; Lim, Ee M; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R B; Bell, Jordana T; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M; Naitza, Silvia; Walsh, John P; Spector, Tim; Davey Smith, George; Durbin, Richard; Richards, J Brent; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J; Wilson, Scott G

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10^-9) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10^-14). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10^-9) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10^-11). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function. PMID:25743335

  20. RNA sequence analysis using covariance models.

    PubMed Central

    Eddy, S R; Durbin, R

    1994-01-01

    We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models 'covariance models'. A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences in sequence databases. A model can be built automatically from an existing sequence alignment. We also describe an algorithm for learning a model and hence a consensus secondary structure from initially unaligned example sequences and no prior structural information. Models trained on unaligned tRNA examples correctly predict tRNA secondary structure and produce high-quality multiple alignments. The approach may be applied to any family of small RNA sequences. PMID:8029015

  1. Phylogenetic Analysis of Poliovirus Sequences.

    PubMed

    Jorba, Jaume

    2016-01-01

    Comparative genomic sequencing is a major surveillance tool in the Polio Laboratory Network. Due to the rapid evolution of polioviruses (~1% per year), pathways of virus transmission can be reconstructed from the pathways of genomic evolution. Here, we describe three main phylogenetic methods: estimation of genetic distances, reconstruction of a maximum-likelihood (ML) tree, and estimation of substitution rates using Bayesian Markov chain Monte Carlo (MCMC). The data set used consists of complete capsid sequences from a survey of poliovirus sequences available in GenBank. PMID:26983737
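
    The first of the three methods, estimation of genetic distances, can be illustrated with a short sketch. The abstract does not name a substitution model, so the Jukes-Cantor correction used here is an assumption chosen for simplicity; the function names and the two toy sequences are hypothetical and are not poliovirus data.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance; defined only for p < 0.75."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Two made-up aligned fragments differing at 2 of 10 sites
p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
d = jukes_cantor(p)
```

    The corrected distance is always at least as large as the raw proportion, because it accounts for multiple substitutions at the same site.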

  2. Expression of the Arabidopsis transposable element Tag1 is targeted to developing gametophytes.

    PubMed

    Galli, Mary; Theriault, Angie; Liu, Dong; Crawford, Nigel M

    2003-12-01

    The Arabidopsis transposon Tag1 undergoes late excision during vegetative and germinal development in plants containing 35S-Tag1-GUS constructs. To determine if transcriptional regulation can account for the developmental control of Tag1 excision, the transcriptional activity of Tag1 promoter-GUS fusion constructs of various lengths was examined in transgenic plants. All constructs showed expression in the reproductive organs of developing flowers but no expression in leaves. Expression was restricted to developing gametophytes in both male and female lineages. Quantitative RT-PCR analysis confirmed that Tag1 expression predominates in the reproductive organs of flower buds. These results are consistent with late germinal excision of Tag1, but they cannot explain the vegetative excision activity of Tag1 observed with 35S-Tag1-GUS constructs. To resolve this issue, Tag1 excision was reexamined using elements with no adjacent 35S promoter sequences. Tag1 excision in this context is restricted to germinal events with no detectable vegetative excision. If a 35S enhancer sequence is placed next to Tag1, vegetative excision is restored. These results indicate that the intrinsic activity of Tag1 is restricted to germinal excision due to targeted expression of the Tag1 transposase to developing gametophytes and that this activity is altered by the presence of adjacent enhancers or promoters. PMID:14704189

  3. Repetitive genome elements in a European corn borer, Ostrinia nubilalis, bacterial artificial chromosome library were indicated by bacterial artificial chromosome end sequencing and development of sequence tag site markers: implications for lepidopteran genomic research.

    PubMed

    Coates, Brad S; Sumerford, Douglas V; Hellmich, Richard L; Lewis, Leslie C

    2009-01-01

    The European corn borer, Ostrinia nubilalis, is a serious pest of food, fiber, and biofuel crops in Europe, North America, and Asia and a model system for insect olfaction and speciation. A bacterial artificial chromosome library constructed for O. nubilalis contains 36 864 clones with an estimated average insert size of ≥120 kb and genome coverage of 8.8-fold. Screening OnB1 clones comprising approximately 2.76 genome equivalents determined the physical position of 24 sequence tag site markers, including markers linked to ecologically important and Bacillus thuringiensis toxin resistance traits. OnB1 bacterial artificial chromosome end sequence reads (GenBank dbGSS accessions ET217010 to ET217273) showed homology to annotated genes or expressed sequence tags and identified repetitive genome elements, O. nubilalis miniature subterminal inverted repeat transposable elements (OnMITE01 and OnMITE02), and ezi-like long interspersed nuclear elements. Mobility of OnMITE01 was demonstrated by the presence or absence in O. nubilalis of introns at two different loci. A (GTCT)n tetranucleotide repeat at the 5' ends of OnMITE01 and OnMITE02 is evidence for transposon-mediated movement of lepidopteran microsatellite loci. The number of repetitive elements in lepidopteran genomes will affect genome assembly and marker development. Single-locus sequence tag site markers described here have downstream application for integration within linkage maps and comparative genomic studies. PMID:19132072
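
    The library figures in this abstract are linked by the usual relation coverage = (number of clones × average insert size) / genome size. The sketch below applies that relation to the quoted numbers as a back-of-envelope check. The function names are hypothetical, and the genome size it returns is only what the quoted figures imply under this relation; the abstract itself does not state a genome size.

```python
def library_coverage(n_clones, insert_kb, genome_mb):
    """Fold coverage of a clone library for a genome of the given size."""
    return n_clones * insert_kb / (genome_mb * 1000.0)


def implied_genome_mb(n_clones, insert_kb, coverage):
    """Genome size (Mb) implied by a library's clone count and coverage."""
    return n_clones * insert_kb / (coverage * 1000.0)


# Figures quoted in the abstract: 36 864 clones, 120 kb inserts, 8.8-fold
genome_mb = implied_genome_mb(36864, 120, 8.8)
```

    Because the insert size is quoted as a lower bound, the result is best read as an approximate figure rather than a measurement.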

  4. Genome Sequencing and Analysis Conference IV

    SciTech Connect

    Not Available

    1993-12-31

    J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV, held at Hilton Head, South Carolina, from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present, four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

  5. Sorting of a HaloTag protein that has only a signal peptide sequence into exocrine secretory granules without protein aggregation.

    PubMed

    Fujita-Yoshigaki, Junko; Matsuki-Fukushima, Miwako; Yokoyama, Megumi; Katsumata-Kato, Osamu

    2013-11-15

    The mechanism involved in the sorting and accumulation of secretory cargo proteins, such as amylase, into secretory granules of exocrine cells remains to be solved. To clarify that sorting mechanism, we expressed a reporter protein HaloTag fused with partial sequences of salivary amylase protein in primary cultured parotid acinar cells. We found that a HaloTag protein fused with only the signal peptide sequence (Met(1)-Ala(25)) of amylase, termed SS25H, colocalized well with endogenous amylase, which was confirmed by immunofluorescence microscopy. Percoll-density gradient centrifugation of secretory granule fractions shows that the distributions of amylase and SS25H were similar. These results suggest that SS25H is transported to secretory granules and is not discriminated from endogenous amylase by the machinery that functions to remove proteins other than granule cargo from immature granules. Another reporter protein, DsRed2, that has the same signal peptide sequence also colocalized with amylase, suggesting that the sorting to secretory granules is not dependent on a characteristic of the HaloTag protein. Whereas Blue Native PAGE demonstrates that endogenous amylase forms a high-molecular-weight complex, SS25H does not participate in the complex and does not form self-aggregates. Nevertheless, SS25H was released from cells by the addition of a β-adrenergic agonist, isoproterenol, which also induces amylase secretion. These results indicate that addition of the signal peptide sequence, which is necessary for the translocation in the endoplasmic reticulum, is sufficient for the transportation and storage of cargo proteins in secretory granules of exocrine cells. PMID:24029466

  6. Utilizing Social Bookmarking Tag Space for Web Content Discovery: A Social Network Analysis Approach

    ERIC Educational Resources Information Center

    Wei, Wei

    2010-01-01

    Social bookmarking has gained popularity since the advent of Web 2.0. Keywords known as tags are created to annotate web content, and the resulting tag space composed of the tags, the resources, and the users arises as a new platform for web content discovery. Useful and interesting web resources can be located through searching and browsing based…

  7. Expressed sequence tags from larval gut of the European corn borer (Ostrinia nubilalis): Exploring candidate genes potentially involved in Bacillus thuringiensis toxicity and resistance

    PubMed Central

    Khajuria, Chitvan; Zhu, Yu Cheng; Chen, Ming-Shun; Buschman, Lawrent L; Higgins, Randall A; Yao, Jianxiu; Crespo, Andre LB; Siegfried, Blair D; Muthukrishnan, Subbaratnam; Zhu, Kun Yan

    2009-01-01

    Background Lepidoptera represents more than 160,000 insect species, which include some of the most devastating pests of crops, forests, and stored products. However, the genomic information on lepidopteran insects is very limited. Only a few studies have focused on developing expressed sequence tag (EST) libraries from the guts of lepidopteran larvae. Knowledge of the genes that are expressed in the insect gut is crucial for understanding the basic physiology of food digestion and the interactions with Bacillus thuringiensis (Bt) toxins, and for discovering new targets for novel toxins for use in pest management. This study analyzed the ESTs generated from the larval gut of the European corn borer (ECB, Ostrinia nubilalis), one of the most destructive pests of corn in North America and the western world. Our goals were to establish an ECB larval gut-specific EST database as a genomic resource for future research and to explore candidate genes potentially involved in insect-Bt interactions and Bt resistance in ECB. Results We constructed two cDNA libraries from the guts of the fifth-instar larvae of ECB and sequenced a total of 15,000 ESTs from these libraries. A total of 12,519 ESTs (83.4%) appeared to be high quality with an average length of 656 bp. These ESTs represented 2,895 unique sequences, including 1,738 singletons and 1,157 contigs. Among the unique sequences, 62.7% encoded putative proteins that shared significant sequence similarities (E-value ≤ 10^-3) with the sequences available in GenBank. Our EST analysis revealed 52 candidate genes that potentially have roles in Bt toxicity and resistance. These genes encode 18 trypsin-like proteases, 18 chymotrypsin-like proteases, 13 aminopeptidases, 2 alkaline phosphatases and 1 cadherin-like protein. Comparisons of expression profiles of 41 selected candidate genes between Cry1Ab-susceptible and resistant strains of ECB by RT-PCR showed apparently decreased expressions in 2 trypsin-like and 2 chymotrypsin

  8. Comparison of fluorescent tags for analysis of mannose-6-phosphate glycans.

    PubMed

    Kang, Ji-Yeon; Kwon, Ohsuk; Gil, Jin Young; Oh, Doo-Byoung

    2016-05-15

    Mannose-6-phosphate (M-6-P) glycan analysis is important for quality control of therapeutic enzymes for lysosomal storage diseases. Here, we found that the analysis of glycans containing two M-6-Ps was highly affected by the hydrophilicity of the elution solvent used in high-performance liquid chromatography (HPLC). In addition, the performances of three fluorescent tags--2-aminobenzoic acid (2-AA), 2-aminobenzamide (2-AB), and 3-(acetyl-amino)-6-aminoacridine (AA-Ac)--were compared with each other for M-6-P glycan analysis using HPLC and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. The best performance for analyzing M-6-P glycans was shown by 2-AA labeling in both analyses. PMID:26876105

  9. Molecular genetic analysis of activation-tagged transcription factors thought to be involved in photomorphogenesis

    SciTech Connect

    Neff, Michael M.

    2011-06-23

    This is a final report for Department of Energy Grant No. DE-FG02-08ER15927 entitled “Molecular Genetic Analysis of Activation-Tagged Transcription Factors Thought to be Involved in Photomorphogenesis”. Based on our preliminary photobiological and genetic analysis of the sob1-D mutant, we hypothesized that OBP3 is a transcription factor involved in both phytochrome and cryptochrome-mediated signal transduction. In addition, we hypothesized that OBP3 is involved in auxin signaling and root development. Based on our preliminary photobiological and genetic analysis of the sob2-D mutant, we also hypothesized that a related gene, LEP, is involved in hormone signaling and seedling development.

  10. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    PubMed

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC. PMID:23589541

  11. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  12. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    David J. States

    1998-08-01

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  13. Development and Validation of Single Nucleotide Polymorphism (SNP) Markers from an Expressed Sequence Tag (EST) Database in Olive Flounder (Paralichthys olivaceus)

    PubMed Central

    Kim, Jung Eun; Lee, Young Mee; Lee, Jeong-Ho; Noh, Jae Koo; Kim, Hyun Chul; Park, Choul-Ji; Park, Jong-Won; Kim, Kyung-Kil

    2014-01-01

    For successful molecular breeding, the identification and functional characterization of breeding-related genes and the development of molecular breeding techniques using DNA markers are essential. Although developing a useful marker is costly in time, money and effort, many markers are being developed for molecular breeding, and the resulting markers have been used in many fields. Single nucleotide polymorphism (SNP) markers have been widely used for genomic research and breeding, but have hardly been validated for screening functional genes in olive flounder. We identified SNPs from an expressed sequence tag (EST) database in olive flounder; from a total of 4,327 ESTs, 693 contigs and 514 SNPs were detected, and these substitutions include 297 transitions and 217 transversions. From these 514 SNPs, 144 SNP markers were developed to select useful gene regions and were then applied to eight wild and eight cultured olive flounder (16 samples in total). In our experiments, only 32 markers detected polymorphism in the samples, comprising 21 transitions and 11 transversions, whereas no indels were detected among the polymorphic SNPs. Heterozygosity of wild and cultured olive flounder using the 32 SNP markers was 0.34 and 0.29, respectively. In conclusion, we identified SNPs and polymorphism in olive flounder using newly designed markers, which supports the suitability of the developed markers for SNP detection and diversity analysis in olive flounder. The outcome of this study can serve as basic data for research on immunity genes and SNP-associated traits. PMID:25949198
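
    The transition/transversion split reported in this abstract follows the standard definition: a substitution within purines (A, G) or within pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. The sketch below illustrates only that classification. The function names are hypothetical and it is not the authors' pipeline.

```python
# Illustrative only: classify single-base substitutions the standard way.
# Function names are hypothetical and not taken from the study.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_substitutions(pairs):
    """Tally transitions and transversions over (ref, alt) pairs."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in pairs:
        counts[classify_substitution(ref, alt)] += 1
    return counts


def ts_tv_ratio(n_transitions: int, n_transversions: int) -> float:
    """Transition-to-transversion ratio from two counts."""
    return n_transitions / n_transversions
```

    With the counts quoted in the abstract, ts_tv_ratio(297, 217) is about 1.37 for all detected SNPs.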

  14. Fractal analysis of DNA sequence data

    SciTech Connect

    Berthelsen, C.L.

    1993-01-01

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method." Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than do invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.
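
    The abstract does not say how bases are mapped to steps of the random walk. A minimal sketch, assuming the common purine/pyrimidine rule (A or G steps up, C or T steps down), shows how a sequence becomes a one-dimensional walk. The mapping and the function name are assumptions for illustration, and the sandbox estimate of fractal dimension is not reproduced here.

```python
from itertools import accumulate

# Assumed step rule (not stated in the abstract): purines +1, pyrimidines -1.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}


def dna_walk(sequence: str) -> list[int]:
    """Return the cumulative position of the walk after each usable base.

    Characters outside A/C/G/T (for example N) are skipped in this sketch.
    """
    steps = [STEP[base] for base in sequence.upper() if base in STEP]
    return list(accumulate(steps))
```

    For instance, dna_walk("AGCT") gives [1, 2, 1, 0]: two purine steps up followed by two pyrimidine steps down.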

  15. Fractal Analysis of DNA Sequence Data

    NASA Astrophysics Data System (ADS)

    Berthelsen, Cheryl Lynn

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method." Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than do invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.

  16. The Arabidopsis Root Transcriptome by Serial Analysis of Gene Expression. Gene Identification Using the Genome Sequence

    PubMed Central

    Fizames, Cécile; Muños, Stéphane; Cazettes, Céline; Nacry, Philippe; Boucherez, Jossia; Gaymard, Frédéric; Piquemal, David; Delorme, Valérie; Commes, Thérèse; Doumas, Patrick; Cooke, Richard; Marti, Jacques; Sentenac, Hervé; Gojon, Alain

    2004-01-01

    Large-scale identification of genes expressed in roots of the model plant Arabidopsis was performed by serial analysis of gene expression (SAGE), on a total of 144,083 sequenced tags, representing at least 15,964 different mRNAs. For tag to gene assignment, we developed a computational approach based on 26,620 genes annotated from the complete sequence of the genome. The procedure selected warrants the identification of the genes corresponding to the majority of the tags found experimentally, with a high level of reliability, and provides a reference database for SAGE studies in Arabidopsis. This new resource allowed us to characterize the expression of more than 3,000 genes, for which there is no expressed sequence tag (EST) or cDNA in the databases. Moreover, 85% of the tags were specific for one gene. To illustrate this advantage of SAGE for functional genomics, we show that our data allow an unambiguous analysis of most of the individual genes belonging to 12 different ion transporter multigene families. These results indicate that, compared with EST-based tag to gene assignment, the use of the annotated genome sequence greatly improves gene identification in SAGE studies. However, more than 6,000 different tags remained with no gene match, suggesting that a significant proportion of transcripts present in the roots originate from yet unknown or wrongly annotated genes. The root transcriptome characterized in this study markedly differs from those obtained in other organs, and provides a unique resource for investigating the functional specificities of the root system. As an example of the use of SAGE for transcript profiling in Arabidopsis, we report here the identification of 270 genes differentially expressed between roots of plants grown either with NO3- or NH4NO3 as N source. PMID:14730065

  17. Active populations of rare microbes in oceanic environments as revealed by bromodeoxyuridine incorporation and 454 tag sequencing.

    PubMed

    Hamasaki, Koji; Taniguchi, Akito; Tada, Yuya; Kaneko, Ryo; Miki, Takeshi

    2016-02-01

    The "rare biosphere" consisting of thousands of low-abundance microbial taxa is important as a seed bank or a gene pool to maintain microbial functional redundancy and robustness of the ecosystem. Here we investigated contemporaneous growth of diverse microbial taxa including rare taxa and determined their variability in environmentally distinctive locations along a north-south transect in the Pacific Ocean in order to assess which taxa were actively growing and how environmental factors influenced bacterial community structures. A bromodeoxyuridine-labeling technique in combination with PCR amplicon pyrosequencing of 16S rRNA genes gave 215-793 OTUs from 1200 to 3500 unique sequences in the total communities and 175-299 OTUs from 860 to 1800 sequences in the active communities. Unexpectedly, many of the active OTUs were not detected in the total fractions. Among these active but rare OTUs, some taxa (2-4% of rare OTUs) showed much higher abundance (>0.10% of total reads) in the active fraction than in the total fraction, suggesting that their contribution to bacterial community productivity or growth was much larger than that expected from their standing stocks at each location. An ordination plot from the principal component analysis showed that bacterial community compositions among the 4 sampling locations and between total and active fractions were distinct from each other. A redundancy analysis revealed that the variability of community compositions significantly correlated with seawater temperature and dissolved oxygen concentration. Also, a variation partitioning analysis showed that the environmental factors explained 49% of the variability of community compositions and the distance only explained 4.0% of its variability. These results implied very dynamic change of community structures due to environmental filtering. The active bacterial populations are more diverse and spread further in the rare biosphere than we have ever seen. This study implied that rare...

  18. Methyl-CpG island-associated genome signature tags

    DOEpatents

    Dunn, John J

    2014-05-20

    Disclosed is a method for analyzing the organismic complexity of a sample through analysis of the nucleic acid in the sample. In the disclosed method, through a series of steps, including digestion with a type II restriction enzyme, ligation of capture adapters and linkers and digestion with a type IIS restriction enzyme, genome signature tags are produced. The sequences of a statistically significant number of the signature tags are determined and the sequences are used to identify and quantify the organisms in the sample. Various embodiments of the invention described herein include methods for using single point genome signature tags to analyze the related families present in a sample, methods for analyzing sequences associated with hyper- and hypo-methylated CpG islands, methods for visualizing organismic complexity change in a sampling location over time and methods for generating the genome signature tag profile of a sample of fragmented DNA.

  19. An Expressed Sequence Tag (EST)-enriched genetic map of turbot (Scophthalmus maximus): a useful framework for comparative genomics across model and farmed teleosts

    PubMed Central

    2012-01-01

    Background: The turbot (Scophthalmus maximus) is a relevant species in European aquaculture. The small turbot genome provides a source for genomics strategies to use in order to understand the genetic basis of productive traits, particularly those related to sex, growth and pathogen resistance. Genetic maps represent essential genomic screening tools allowing to localize quantitative trait loci (QTL) and to identify candidate genes through comparative mapping. This information is the backbone to develop marker-assisted selection (MAS) programs in aquaculture. Expressed sequence tag (EST) resources have largely increased in turbot, thus supplying numerous type I markers suitable for extending the previous linkage map, which was mostly based on anonymous loci. The aim of this study was to construct a higher-resolution turbot genetic map using EST-linked markers, which will turn out to be useful for comparative mapping studies. Results: A consensus gene-enriched genetic map of the turbot was constructed using 463 SNP and microsatellite markers in nine reference families. This map contains 438 markers, 180 EST-linked, clustered in 24 linkage groups. Linkage and comparative genomics evidence suggested additional linkage group fusions toward the consolidation of turbot map according to karyotype information. The linkage map showed a total length of 1402.7 cM with low average intermarker distance (3.7 cM; ~2 Mb). A global 1.6:1 female-to-male recombination frequency (RF) ratio was observed, although largely variable among linkage groups and chromosome regions. Comparative sequence analysis revealed large macrosyntenic patterns against model teleost genomes, significant hits decreasing from stickleback (54%) to zebrafish (20%). Comparative mapping supported particular chromosome rearrangements within Acanthopterygii and aided to assign unallocated markers to specific turbot linkage groups. Conclusions: The new gene-enriched high-resolution turbot map represents a...

  20. A New Methodology for Multiscale Myocardial Deformation and Strain Analysis Based on Tagging MRI

    PubMed Central

    Florack, Luc; van Assen, Hans

    2010-01-01

    Myocardial deformation and strain can be investigated using suitably encoded cine MRI that admits disambiguation of material motion. Practical limitations currently restrict the analysis to in-plane motion in cross-sections of the heart (2D + time), but the proposed method readily generalizes to 3D + time. We propose a new, promising methodology, which departs from a multiscale algorithm that exploits local scale selection so as to obtain a robust estimate for the velocity gradient tensor field. Time evolution of the deformation tensor is governed by a first-order ordinary differential equation, which is completely determined by this velocity gradient tensor field. We solve this matrix-ODE analytically and present results obtained from healthy volunteers as well as from patient data. The proposed method requires only off-the-shelf algorithms and is readily applicable to planar or volumetric tagging MRI sampled on arbitrary coordinate grids. PMID:20204157

  1. Global analysis of the Deinococcus radiodurans proteome by using accurate mass tags

    PubMed Central

    Lipton, Mary S.; Paša-Tolić, Ljiljana; Anderson, Gordon A.; Anderson, David J.; Auberry, Deanna L.; Battista, John R.; Daly, Michael J.; Fredrickson, Jim; Hixson, Kim K.; Kostandarithes, Heather; Masselon, Christophe; Markillie, Lye Meng; Moore, Ronald J.; Romine, Margaret F.; Shen, Yufeng; Stritmatter, Eric; Tolić, Nikola; Udseth, Harold R.; Venkateswaran, Amudhan; Wong, Kwong-Kwok; Zhao, Rui; Smith, Richard D.

    2002-01-01

    Understanding biological systems and the roles of their constituents is facilitated by the ability to make quantitative, sensitive, and comprehensive measurements of how their proteome changes, e.g., in response to environmental perturbations. To this end, we have developed a high-throughput methodology to characterize an organism's dynamic proteome based on the combination of global enzymatic digestion, high-resolution liquid chromatographic separations, and analysis by Fourier transform ion cyclotron resonance mass spectrometry. The peptides produced serve as accurate mass tags for the proteins and have been used to identify with high confidence >61% of the predicted proteome for the ionizing radiation-resistant bacterium Deinococcus radiodurans. This fraction represents the broadest proteome coverage for any organism to date and includes 715 proteins previously annotated as either hypothetical or conserved hypothetical. PMID:12177431

  2. Haplotag: Software for Haplotype-Based Genotyping-by-Sequencing Analysis

    PubMed Central

    Tinker, Nicholas A.; Bekele, Wubishet A.; Hattori, Jiro

    2016-01-01

    Genotyping-by-sequencing (GBS), and related methods, are based on high-throughput short-read sequencing of genomic complexity reductions followed by discovery of single nucleotide polymorphisms (SNPs) within sequence tags. This provides a powerful and economical approach to whole-genome genotyping, facilitating applications in genomics, diversity analysis, and molecular breeding. However, due to the complexity of analyzing large data sets, applications of GBS may require substantial time, expertise, and computational resources. Haplotag, the novel GBS software described here, is freely available, and operates with minimal user-investment on widely available computer platforms. Haplotag is unique in fulfilling the following set of criteria: (1) operates without a reference genome; (2) can be used in a polyploid species; (3) provides a discovery mode, and a production mode; (4) discovers polymorphisms based on a model of tag-level haplotypes within sequenced tags; (5) reports SNPs as well as haplotype-based genotypes; and (6) provides an intuitive visual “passport” for each inferred locus. Haplotag is optimized for use in a self-pollinating plant species. PMID:26818073

  3. Expressed sequence tags from the red imported fire ant, Solenopsis invicta: Annotation and utilization for discovery of viruses

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An expression library was created and 2,300 clones sequenced from a monogyne colony of Solenopsis invicta with the primary intention of discovering viruses infecting this ant pest. After assembly and removal of mitochondrial and poor quality sequences, 1,054 unique sequences were yielded and deposi...

  4. Design and Analysis of Salmonid Tagging Studies in the Columbia Basin, Volume XVI; Alternative Designs for Future Adult PIT-Tag Detection Studies, 2000 Technical Report.

    SciTech Connect

    Perez-Comas, Jose A.; Skalski, John R.

    2000-09-25

    With the advent of the installation of a PIT-tag interrogation system in the Cascades Island fish ladder at Bonneville Dam (BON), and other CRB dams, this overview describes in general terms what can and cannot be estimated under seven different scenarios of adult PIT-tag detection capabilities in the CRB. Moreover, this overview attempted to identify minimal adult PIT-tag detection configurations required by the ten threatened Columbia River Basin (CRB) chinook and steelhead ESUs. A minimal adult PIT-tag detection configuration will require the installation of adult PIT-tag detection facilities at Bonneville Dam and another dam above BON. Thus, the Snake River spring/summer and fall chinook salmon, and the Snake River steelhead will require a minimum of three dams with adult PIT-tag detection capabilities to guarantee estimates of "ocean survival" and of at least one independent, in-river returning adult survival (e.g., adult PIT-tag detection facilities at BON and LGR dams and at any other intermediary dam such as IHR). The Upper Columbia River spring chinook salmon and steelhead will also require a minimum of three dams with adult PIT-tag detection capabilities: BON and two other dams on the BON-WEL reach. The current CRB dam system configuration and BPA's and COE's commitment to install adult PIT-tag detectors only in major CRB projects will not allow the estimation of an "ocean survival" and of any in-river adult survival for the Lower Columbia River chinook salmon and steelhead. The Middle Columbia River steelhead ESU will require a minimum of two dams with adult PIT-tag detection capabilities: BON and another upstream dam on the BON-McN reach. Finally, in spite of their importance in terms of releases, PIT-tag survival studies for the Upper Willamette chinook and Upper Willamette steelhead ESUs cannot be performed with the current CRB dam system configuration and PIT-tag detection capabilities.

  5. Long Span DNA Paired-End-Tag (DNA-PET) Sequencing Strategy for the Interrogation of Genomic Structural Mutations and Fusion-Point-Guided Reconstruction of Amplicons

    PubMed Central

    Hillmer, Axel M.; Lee, Wah Heng; Li, Guoliang; Teo, Audrey S. M.; Woo, Xing Yi; Zhang, Zhenshui; Chen, Jieqi P.; Poh, Wan Ting; Zawack, Kelson F. B.; Chan, Chee Seng; Leong, See Ting; Neo, Say Chuan; Choi, Poh Sum D.; Gao, Song; Nagarajan, Niranjan; Thoreau, Hervé; Shahab, Atif; Ruan, Xiaoan; Cacheux-Rataboul, Valère; Wei, Chia-Lin; Bourque, Guillaume; Sung, Wing-Kin; Liu, Edison T.; Ruan, Yijun

    2012-01-01

    Structural variations (SVs) contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET) sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10–20 kb and compared their characteristics with short insert (1 kb) libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer. PMID:23029419

  6. Validation of Shewanella oneidensis MR-1 Small Proteins by AMT Tag-based Proteome Analysis

    SciTech Connect

    Romine, Margaret F.; Elias, Dwayne A.; Monroe, Matthew E.; Auberry, Kenneth J.; Fang, Ruihua; Fredrickson, Jim K.; Anderson, Gordon A.; Smith, Richard D.; Lipton, Mary S.

    2004-09-01

    Using stringent criteria for protein identification by accurate mass and time (AMT) tag mass spectrometric methodology, we detected 36 proteins <101 amino acids in length, including 10 that were annotated as hypothetical proteins, in 172 global tryptic digests of Shewanella oneidensis MR-1 proteins analyzed. Peptides that map to the conserved, but functionally uncharacterized proteins SO4134 and SO2787, were the most frequently detected small proteins in these samples, while hypotheticals SO2669 and SO2063, conserved hypotheticals SO0335 and SO2176, and the SlyX protein (SO1063) were observed at frequencies similar to small expected abundant ribosomal proteins and translation initiation factor IF-1 and consequently, likely to encode important cellular functions. In addition, 30 proteins including three of the small proteins that map to genes predicted to encode frameshifts, point mutations, or recoding signals were detected. Of these 30 genes, peptides that map to positions beyond internal stop codons were detected in 13 genes (SO0101, SO0419, SO0590, SO0738, SO1113, SO1211, SO3079, SO3130, SO3240, SO4231, SO4328, SO4422, and SO4657). While expression of the full-length formate dehydrogenase encoded by SO0101 can be explained by incorporation of selenocysteine at the internal stop codon, the mechanism of translating downstream sequences in the remaining genes remains unknown.

  7. Fluorescent Protein-Tagged Sindbis Virus E2 Glycoprotein Allows Single Particle Analysis of Virus Budding from Live Cells

    PubMed Central

    Jose, Joyce; Tang, Jinghua; Taylor, Aaron B.; Baker, Timothy S.; Kuhn, Richard J.

    2015-01-01

    Sindbis virus (SINV) is an enveloped, mosquito-borne alphavirus. Here we generated and characterized a fluorescent protein-tagged (FP-tagged) SINV and found that the presence of the FP-tag (mCherry) affected glycoprotein transport to the plasma membrane whereas the specific infectivity of the virus was not affected. We examined the virions by transmission electron cryo-microscopy and determined the arrangement of the FP-tag on the surface of the virion. The fluorescent proteins are arranged icosahedrally on the virus surface in a stable manner that did not adversely affect receptor binding or fusion functions of E2 and E1, respectively. The delay in surface expression of the viral glycoproteins, as demonstrated by flow cytometry analysis, contributed to a 10-fold reduction in mCherry-E2 virus titer. There is a 1:1 ratio of mCherry to E2 incorporated into the virion, which leads to a strong fluorescence signal and thus facilitates single-particle tracking experiments. We used the FP-tagged virus for high-resolution live-cell imaging to study the spatial and temporal aspects of alphavirus assembly and budding from mammalian cells. These processes were further analyzed by thin section microscopy. The results demonstrate that SINV buds from the plasma membrane of infected cells and is dispersed into the surrounding media or spread to neighboring cells facilitated by its close association with filopodial extensions. PMID:26633461

  8. Sample size requirements and analysis of tag recoveries for paired releases of lake trout

    USGS Publications Warehouse

    Elrod, Joseph H.; Frank, Anthony

    1990-01-01

    A simple chi-square test can be used to analyze recoveries from a paired-release experiment to determine whether differential survival occurs between two groups of fish. The sample size required for analysis is a function of (1) the proportion of fish stocked, (2) the expected proportion at recovery, (3) the level of significance (α) at which the null hypothesis is tested, and (4) the power (1-β) of the statistical test. Detection of a 20% change from a stocking ratio of 50:50 requires a sample of 172 (α=0.10; 1-β=0.80) to 459 (α=0.01; 1-β=0.95) fish. Pooling samples from replicate pairs is sometimes an appropriate way to increase statistical precision without increasing numbers stocked or sampling intensity. Summing over time is appropriate if catchability or survival of the two groups of fish does not change relative to each other through time. Twelve pairs of identical groups of yearling lake trout Salvelinus namaycush were marked with coded wire tags and stocked into Lake Ontario. Recoveries of fish at ages 2-8 showed differences of 1-14% from the initial stocking ratios. Mean tag recovery rates were 0.217%, 0.156%, 0.128%, 0.121%, 0.093%, 0.042%, and 0.016% for ages 2-8, respectively. At these rates, stocking 12,100-29,700 fish per group would yield samples of 172-459 fish at ages 2-8 combined.
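
    The chi-square comparison described in this abstract can be sketched as a one-degree-of-freedom goodness-of-fit test of the recovered counts against the stocking ratio. The function name and the example counts below are hypothetical, and the sketch does not attempt to reproduce the authors' sample-size figures, whose exact formula the abstract does not give.

```python
import math


def paired_release_chi_square(recovered_a: int, recovered_b: int,
                              stocked_fraction_a: float = 0.5):
    """Chi-square goodness-of-fit test (1 df) of recoveries vs. stocking ratio.

    Returns (chi_square_statistic, p_value). Hypothetical helper for
    illustration; not code from the cited study.
    """
    total = recovered_a + recovered_b
    if total <= 0 or not 0.0 < stocked_fraction_a < 1.0:
        raise ValueError("need positive recoveries and a fraction in (0, 1)")
    expected_a = total * stocked_fraction_a
    expected_b = total - expected_a
    chi2 = ((recovered_a - expected_a) ** 2 / expected_a
            + (recovered_b - expected_b) ** 2 / expected_b)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

    As a made-up example, recoveries of 60 and 40 fish from groups stocked 50:50 give a statistic of 4.0 and a p-value of roughly 0.046.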

  9. [Preparation of monoclonal antibodies against His-tag and epitope analysis of cross antigens].

    PubMed

    Zhao, Xiangrong; Zhang, Haixiang; Liu, Yang; Wang, Xin; Wang, Guanghua; Qi, Zongli; Li, Yuan; Hu, Jun

    2016-05-01

    Objective: To explore the influence of His-tag on recombinant proteins in vaccination, immunization and pathogenesis. Methods: Multiple mouse monoclonal antibodies (mAbs) against His-tag were prepared. The biological and immunoreactive characteristics of these mAbs and their cross-reactivity with normal human tissues were investigated by ELISA, Western blotting and immunohistochemistry (IHC), respectively. Results: The binding activity of these anti-His mAbs was associated with the steric configuration of the His-tagged antigen. In addition, most of these mAbs reacted with human hemoglobin and some normal human tissues. Conclusion: Anti-His antibodies could be elicited by His-tagged recombinant proteins in in vivo experiments. Moreover, the functional studies of the His-tagged recombinant proteins might be affected by the reactions of anti-His6 antibodies with human hemoglobin and normal human tissues. PMID:27126949

  10. Auditory sequence analysis and phonological skill.

    PubMed

    Grube, Manon; Kumar, Sukhbinder; Cooper, Freya E; Turton, Stuart; Griffiths, Timothy D

    2012-11-01

    This work tests the relationship between auditory and phonological skill in a non-selected cohort of 238 school students (age 11) with the specific hypothesis that sound-sequence analysis would be more relevant to phonological skill than the analysis of basic, single sounds. Auditory processing was assessed across the domains of pitch, time and timbre; a combination of six standard tests of literacy and language ability was used to assess phonological skill. A significant correlation between general auditory and phonological skill was demonstrated, plus a significant, specific correlation between measures of phonological skill and the auditory analysis of short sequences in pitch and time. The data support a limited but significant link between auditory and phonological ability with a specific role for sound-sequence analysis, and provide a possible new focus for auditory training strategies to aid language development in early adolescence. PMID:22951739

  11. Categorical and Specificity Differences between User-Supplied Tags and Search Query Terms for Images. An Analysis of "Flickr" Tags and Web Image Search Queries

    ERIC Educational Resources Information Center

    Chung, EunKyung; Yoon, JungWon

    2009-01-01

    Introduction: The purpose of this study is to compare characteristics and features of user supplied tags and search query terms for images on the "Flickr" Website in terms of categories of pictorial meanings and level of term specificity. Method: This study focuses on comparisons between tags and search queries using Shatford's categorization…

  12. Sequencing and Analysis of Neanderthal Genomic DNA

    PubMed Central

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Pääbo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2008-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library are of Neanderthal origin, the strongest being the ascertainment of sequence identities between Neanderthal and chimpanzee at sites where the human genomic sequence is different. These results enabled us to calculate the human-Neanderthal divergence time based on multiple randomly distributed autosomal loci. Our analyses suggest that on average the Neanderthal genomic sequence we obtained and the reference human genome sequence share a most recent common ancestor ~706,000 years ago, and that the human and Neanderthal ancestral populations split ~370,000 years ago, before the emergence of anatomically modern humans. Our finding that the Neanderthal and human genomes are at least 99.5% identical led us to develop and successfully implement a targeted method for recovering specific ancient DNA sequences from metagenomic libraries. This initial analysis of the Neanderthal genome advances our understanding of the evolutionary relationship of Homo sapiens and Homo neanderthalensis and signifies the dawn of Neanderthal genomics. PMID:17110569

  13. Genomic sequence analysis tools: a user's guide.

    PubMed

    Fortna, A; Gardiner, K

    2001-03-01

    The wealth of information from various genome sequencing projects provides the biologist with a new perspective from which to analyze, and design experiments with, mammalian systems. The complexity of the information, however, requires new software tools, and numerous such tools are now available. Which type and which specific system is most effective depends, in part, upon how much sequence is to be analyzed and with what level of experimental support. Here we survey a number of mammalian genomic sequence analysis systems with respect to the data they provide and the ease of their use. The hope is to aid the experimental biologist in choosing the most appropriate tool for their analyses. PMID:11226611

  14. A High-Throughput Data Mining of Single Nucleotide Polymorphisms in Coffea Species Expressed Sequence Tags Suggests Differential Homeologous Gene Expression in the Allotetraploid Coffea arabica

    PubMed Central

    Vidal, Ramon Oliveira; Mondego, Jorge Maurício Costa; Pot, David; Ambrósio, Alinne Batista; Andrade, Alan Carvalho; Pereira, Luiz Filipe Protasio; Colombo, Carlos Augusto; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães

    2010-01-01

    Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed. PMID:20864545

  15. A high-throughput data mining of single nucleotide polymorphisms in Coffea species expressed sequence tags suggests differential homeologous gene expression in the allotetraploid Coffea arabica.

    PubMed

    Vidal, Ramon Oliveira; Mondego, Jorge Maurício Costa; Pot, David; Ambrósio, Alinne Batista; Andrade, Alan Carvalho; Pereira, Luiz Filipe Protasio; Colombo, Carlos Augusto; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães

    2010-11-01

    Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed. PMID:20864545

  16. RSAT 2015: Regulatory Sequence Analysis Tools.

    PubMed

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-07-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  17. RSAT 2015: Regulatory Sequence Analysis Tools

    PubMed Central

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-01-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  18. In-vivo motion analysis of bi-ventricular hearts from tagged MR images

    NASA Astrophysics Data System (ADS)

    Park, Kyoungju; Axel, Leon; Metaxas, Dimitris N.

    2005-04-01

    We conduct experiments to look at the in-vivo cardiac motion during systole, to visualize heart contraction, and to examine the clinical usefulness. Our model-based technique incorporates subject-specific modeling, motion analysis and the extraction of clinically relevant parameters within one framework. Previous bi-ventricular model-based methods could only handle up to the mid-ventricles and had few test subjects. Our parameterized model includes the LV, RV and up to the basal area for full ventricular motion study. Finite element methods capture cardiac motion by tracking the material points from tagged Magnetic Resonance (MR) images. A number of experiments from ten subjects are evaluated and analyzed. We tested each subject several times and compared the resulting parameters to assess reproducibility and deviations. The resulting parameters can be used to describe the cardiac motion of normal subjects. The patterns of normal subjects were derived from experiments. While significant shape and motion variations were apparent in normal subjects, the quantitative analysis shows typical patterns. Generally, the basal area moves downwards and the apical area contracts towards the cavity. The principal strain analysis describes the directions and magnitudes of maximum shortening and maximum thickening.

  19. Chemical tagging of chlorinated phenols for their facile detection and analysis by NMR spectroscopy

    SciTech Connect

    Valdez, Carlos A.; Leif, Roald N.

    2015-03-22

    A derivatization method that employs diethyl (bromodifluoromethyl) phosphonate (DBDFP) to efficiently tag the endocrine disruptor pentachlorophenol (PCP) and other chlorinated phenols (CPs) along with their reliable detection and analysis by NMR is presented. The method accomplishes the efficient alkylation of the hydroxyl group in CPs with the difluoromethyl (CF2H) moiety in extremely rapid fashion (5 min), at room temperature and in an environmentally benign manner. The approach proved successful in difluoromethylating a panel of 18 chlorinated phenols, yielding derivatives that displayed unique 1H, 19F NMR spectra allowing for the clear discrimination between isomerically related CPs. Due to its biphasic nature, the derivatization can be applied to both aqueous and organic mixtures where the analysis of CPs is required. Furthermore, the methodology demonstrates that PCP along with other CPs can be selectively derivatized in the presence of other various aliphatic alcohols, underscoring the superiority of the approach over other general derivatization methods that indiscriminately modify all analytes in a given sample. The present work demonstrates the first application of NMR on the qualitative analysis of these highly toxic and environmentally persistent species.

  20. The Arabidopsis transposable element Tag1 is widely distributed among Arabidopsis ecotypes.

    PubMed

    Frank, M J; Preuss, D; Mack, A; Kuhlmann, T C; Crawford, N M

    1998-02-01

    Tag1 is an autonomous transposable element (3.3 kb in length) first identified as an insertion in the CHL1 (NRT1) gene of Arabidopsis thaliana. Tag1 has been found in the Landsberg erecta ecotype of A. thaliana but not in Columbia or WS. In this paper, 41 additional ecotypes were examined for the presence of Tag1. Using an internal Tag1 fragment as probe, we found that DNA from 19 of the 41 ecotypes strongly hybridized to Tag1. Almost all of the Tag1-containing ecotypes had only one or two copies of Tag1 per haploid genome, as determined by Southern blot analysis. The only exception, Bf-1 from Bretagny-sur-Orge, France, had four copies. Two ecotypes, Di-G and S96, gave identical Southern blot patterns to that of Landsberg erecta and were subsequently shown to contain Tag1 at the same two positions found in Landsberg erecta (loci designated as Tag1-2 and Tag1-3). Two other ecotypes, Ag-0 and Lo-1, had a Tag1 element located at Tag1-2 but not at Tag1-3. The distance between these two loci was determined to be 0.37 cM. Analysis of DNA from two related species, A. griffithiana and A. pumila, showed that both species contain sequences that hybridize to Tag1 and that could be amplified with an oligonucleotide specific to the terminal inverted repeats of Tag1. These results show that Tag1 and related elements are present, and may be useful for insertional mutagenesis, in many A. thaliana ecotypes and several Arabidopsis species. PMID:9529529

  1. Expressed Sequence Tags for Bovine Muscle Satellite Cells, Myotube Formed-Cells and Adipocyte-Like Cells

    PubMed Central

    Pokharel, Smritee; Malik, Adeel; Tareq, K. M. A.; Roouf Bhat, Abdul; Park, Hee-Bok; Lee, Yong Seok; Kim, SangHoon; Yang, Bohsuk; Young Chung, Ki; Choi, Inho

    2013-01-01

    Background Muscle satellite cells (MSCs) represent a devoted stem cell population that is responsible for postnatal muscle growth and skeletal muscle regeneration. An important characteristic of MSCs is that they encompass multipotential mesenchymal stem cell activity and are able to differentiate into myocytes and adipocytes. To achieve a global view of the genes differentially expressed in MSCs, myotube formed-cells (MFCs) and adipocyte-like cells (ALCs), we performed large-scale EST sequencing of normalized cDNA libraries developed from bovine MSCs. Results A total of 24,192 clones were assembled into 3,333 clusters, 5,517 singletons and 3,842 contigs. Functional annotation of these unigenes revealed that a large portion of the differentially expressed genes are involved in cellular and signaling processes. Database for Annotation, Visualization and Integrated Discovery (DAVID) functional analysis of three subsets of highly expressed gene lists (MSC233, MFC258, and ALC248) highlighted some common and unique biological processes among MSC, MFC and ALC. Additionally, genes that may be specific to MSC, MFC and ALC are reported here, and the role of dimethylarginine dimethylaminohydrolase2 (DDAH2) during myogenesis and hemoglobin subunit alpha2 (HBA2) during transdifferentiation in C2C12 were assayed as a case study. DDAH2 was up-regulated during myogenesis and knockdown of DDAH2 by siRNA significantly decreased myogenin (MYOG) expression corresponding with the slight change in cell morphology. In contrast, HBA2 was up-regulated during ALC formation and resulted in decreased intracellular lipid accumulation and CD36 mRNA expression upon knockdown assay. Conclusion In this study, a large number of EST sequences were generated from the MSC, MFC and ALC. Overall, the collection of ESTs generated in this study provides a starting point for the identification of novel genes involved in MFC and ALC formation, which in turn offers a fundamental resource to enable better

  2. Ear tag

    MedlinePlus

    ... are: An inherited tendency to have this facial feature A genetic syndrome that includes having these pits or tags A sinus tract problem (an abnormal connection between the skin and tissue underneath) When to Contact a Medical Professional Your provider will usually find the skin tag ...

  3. The Design and Analysis of Salmonid Tagging Studies in the Columbia Basin : Volume II: Experiment Salmonid Survival with Combined PIT-CWT Tagging.

    SciTech Connect

    Newman, Ken

    1997-06-01

    Experiment designs to estimate the effect of transportation on survival and return rates of Columbia River system salmonids are discussed along with statistical modeling techniques. Besides transportation, river flow and dam spill are necessary components in the design and analysis; otherwise, questions as to the effects of reservoir drawdowns and increased dam spill may never be satisfactorily answered. Four criteria for comparing different experiment designs are: (1) feasibility, (2) clarity of results, (3) scope of inference, and (4) time to learn. In this report, alternative designs for conducting experimental manipulations of smolt tagging studies to study effects of river operations such as flow levels, spill fractions, and transporting outmigrating salmonids around dams in the Columbia River system are presented. The principles of study design discussed in this report have broad implications for the many studies proposed to investigate both smolt and adult survival relationships. The concepts are illustrated for the case of the design and analysis of smolt transportation experiments. The merits of proposed transportation studies should be measured relative to these principles of proper statistical design and analysis.

  4. Phylogenetic analysis of adenovirus sequences.

    PubMed

    Harrach, Balázs; Benko, Mária

    2007-01-01

    Members of the family Adenoviridae have been isolated from a large variety of hosts, including representatives from every major vertebrate class from fish to mammals. The high prevalence, together with the fairly conserved organization of the central part of their genomes, make the adenoviruses one of (if not the) best models for studying viral evolution on a larger time scale. Phylogenetic calculation can infer the evolutionary distance among adenovirus strains on serotype, species, and genus levels, thus helping the establishment of a correct taxonomy on the one hand, and speeding up the process of typing new isolates on the other. Initially, four major lineages corresponding to four genera were recognized. Later, the demarcation criteria of lower taxon levels, such as species or types, could also be defined with phylogenetic calculations. A limited number of possible host switches have been hypothesized and convincingly supported. Application of the web-based BLAST and MultAlin programs and the freely available PHYLIP package, along with the TreeView program, enables everyone to make correct calculations. In addition to step-by-step instruction on how to perform phylogenetic analysis, critical points where typical mistakes or misinterpretation of the results might occur will be identified and hints for their avoidance will be provided. PMID:17656792

  5. Numerical and experimental hydrodynamic analysis of suction cup bio-logging tag designs for marine mammals

    NASA Astrophysics Data System (ADS)

    Murray, Mark; Shorter, Alex; Howle, Laurens; Johnson, Mark; Moore, Michael

    2012-11-01

    The improvement and miniaturization of sensing technologies has made bio-logging tags, utilized for the study of marine mammal behavior, more practical. These sophisticated sensing packages require a housing which protects the electronics from the environment and provides a means of attachment to the animal. The hydrodynamic forces on these housings can inadvertently remove the tag or adversely affect the behavior or energetics of the animal. A modification to the original design of a suction cup bio-logging tag housing was desired to minimize the adverse forces. In this work, hydrodynamic loading of two suction cup tag designs, the original and a modified design, was analyzed using computational fluid dynamics (CFD) models and validated experimentally. Overall, the simulation and experimental results demonstrated that a tag housing that minimized geometric disruptions to the flow reduced drag forces, and that a tag housing with a small frontal cross-sectional area close to the attachment surface reduced lift forces. Preliminary results from experimental work with a common dolphin cadaver indicate that the suction cups used to attach the tags to the animal provide sufficient attachment force to resist failure at predicted drag and lift forces in 10 m/s flow.

  6. Ginger and turmeric expressed sequence tags identify signature genes for rhizome identity and development and the biosynthesis of curcuminoids, gingerols and terpenoids

    PubMed Central

    2013-01-01

    Background Ginger (Zingiber officinale) and turmeric (Curcuma longa) accumulate important pharmacologically active metabolites at high levels in their rhizomes. Despite their importance, relatively little is known regarding gene expression in the rhizomes of ginger and turmeric. Results In order to identify rhizome-enriched genes and genes encoding specialized metabolism enzymes and pathway regulators, we evaluated an assembled collection of expressed sequence tags (ESTs) from eight different ginger and turmeric tissues. Comparisons to publicly available sorghum rhizome ESTs revealed a total of 777 gene transcripts expressed in ginger/turmeric and sorghum rhizomes but apparently absent from other tissues. The list of rhizome-specific transcripts was enriched for genes associated with regulation of tissue growth, development, and transcription. In particular, transcripts for ethylene response factors and AUX/IAA proteins appeared to accumulate in patterns mirroring results from previous studies regarding rhizome growth responses to exogenous applications of auxin and ethylene. Thus, these genes may play important roles in defining rhizome growth and development. Additional associations were made for ginger and turmeric rhizome-enriched MADS box transcription factors, their putative rhizome-enriched homologs in sorghum, and rhizomatous QTLs in rice. Additionally, analysis of both primary and specialized metabolism genes indicates that ginger and turmeric rhizomes are primarily devoted to the utilization of leaf supplied sucrose for the production and/or storage of specialized metabolites associated with the phenylpropanoid pathway and putative type III polyketide synthase gene products. This finding reinforces earlier hypotheses predicting roles of this enzyme class in the production of curcuminoids and gingerols. Conclusion A significant set of genes were found to be exclusively or preferentially expressed in the rhizome of ginger and turmeric. Specific

  7. Sequence analysis by iterated maps, a review.

    PubMed

    Almeida, Jonas S

    2014-05-01

    Among alignment-free methods, Iterated Maps (IMs) are on a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, 'Chaos Game Representation'. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in the same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results. PMID:24162172
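
    The Chaos Game Representation named in the abstract above can be illustrated with a minimal sketch. It assumes the commonly used layout in which A, C, G and T sit at the corners of the unit square and each base moves the current point halfway toward its corner. The function name `cgr_points` and the corner assignment are illustrative choices, not taken from the review.

```python
from typing import List, Tuple

# Assumed corner layout on the unit square; other assignments are possible.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Return one (x, y) point per recognized base of seq.

    Starting from the centre of the square, each base moves the current
    point halfway toward that base's corner (the iterated map).
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("ATGC"):
        print(point)
```

    Because every step halves the distance to a corner, the most recent bases fix the coarse position of a point and earlier bases only refine it, which is one way to read the scale-free property described in the abstract.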

  8. SMASH, a fragmentation and sequencing method for genomic copy number analysis.

    PubMed

    Wang, Zihua; Andrews, Peter; Kendall, Jude; Ma, Beicong; Hakker, Inessa; Rodgers, Linda; Ronemus, Michael; Wigler, Michael; Levy, Dan

    2016-06-01

    Copy number variants (CNVs) underlie a significant amount of genetic diversity and disease. CNVs can be detected by a number of means, including chromosomal microarray analysis (CMA) and whole-genome sequencing (WGS), but these approaches suffer from either limited resolution (CMA) or are highly expensive for routine screening (both CMA and WGS). As an alternative, we have developed a next-generation sequencing-based method for CNV analysis termed SMASH, for short multiply aggregated sequence homologies. SMASH utilizes random fragmentation of input genomic DNA to create chimeric sequence reads, from which multiple mappable tags can be parsed using maximal almost-unique matches (MAMs). The SMASH tags are then binned and segmented, generating a profile of genomic copy number at the desired resolution. Because fewer reads are necessary relative to WGS to give accurate CNV data, SMASH libraries can be highly multiplexed, allowing large numbers of individuals to be analyzed at low cost. Increased genomic resolution can be achieved by sequencing to higher depth. PMID:27197213
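
    The binning step mentioned in the abstract above can be sketched as follows. This is a simplified illustration, not the SMASH implementation: it assumes tags are already mapped to (chromosome, position) pairs, uses fixed-width bins, and normalizes against a reference sample. The segmentation step is omitted, and all names (`bin_counts`, `copy_ratio_profile`) are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Tag = Tuple[str, int]      # (chromosome, mapped position); assumed input format
BinKey = Tuple[str, int]   # (chromosome, bin index)


def bin_counts(tags: Iterable[Tag], bin_size: int) -> Counter:
    """Count mapped tags per fixed-width genomic bin."""
    return Counter((chrom, pos // bin_size) for chrom, pos in tags)


def copy_ratio_profile(
    sample: Iterable[Tag], reference: Iterable[Tag], bin_size: int
) -> Dict[BinKey, float]:
    """Per-bin ratio of the sample's tag share to the reference's tag share.

    Values near 1.0 suggest the same relative copy number as the reference;
    bins with no reference tags are skipped.
    """
    s = bin_counts(sample, bin_size)
    r = bin_counts(reference, bin_size)
    s_total, r_total = sum(s.values()), sum(r.values())
    if s_total == 0 or r_total == 0:
        raise ValueError("both sample and reference need at least one tag")
    return {
        key: (s[key] / s_total) / (r_n / r_total)
        for key, r_n in r.items()
    }


if __name__ == "__main__":
    sample = [("chr1", 10), ("chr1", 20), ("chr1", 150)]
    reference = [("chr1", 5), ("chr1", 120)]
    print(copy_ratio_profile(sample, reference, bin_size=100))
```

    Normalizing by each library's total tag count keeps the profile comparable across samples sequenced to different depths, which matters when many libraries are multiplexed.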

  9. SMASH, a fragmentation and sequencing method for genomic copy number analysis

    PubMed Central

    Wang, Zihua; Andrews, Peter; Kendall, Jude; Ma, Beicong; Hakker, Inessa; Rodgers, Linda; Ronemus, Michael; Wigler, Michael; Levy, Dan

    2016-01-01

    Copy number variants (CNVs) underlie a significant amount of genetic diversity and disease. CNVs can be detected by a number of means, including chromosomal microarray analysis (CMA) and whole-genome sequencing (WGS), but these approaches suffer from either limited resolution (CMA) or are highly expensive for routine screening (both CMA and WGS). As an alternative, we have developed a next-generation sequencing-based method for CNV analysis termed SMASH, for short multiply aggregated sequence homologies. SMASH utilizes random fragmentation of input genomic DNA to create chimeric sequence reads, from which multiple mappable tags can be parsed using maximal almost-unique matches (MAMs). The SMASH tags are then binned and segmented, generating a profile of genomic copy number at the desired resolution. Because fewer reads are necessary relative to WGS to give accurate CNV data, SMASH libraries can be highly multiplexed, allowing large numbers of individuals to be analyzed at low cost. Increased genomic resolution can be achieved by sequencing to higher depth. PMID:27197213

  10. Arabidopsis Genes Involved in Acyl Lipid Metabolism. A 2003 Census of the Candidates, a Study of the Distribution of Expressed Sequence Tags in Organs, and a Web-Based Database

    PubMed Central

    Beisson, Frédéric; Koo, Abraham J.K.; Ruuska, Sari; Schwender, Jörg; Pollard, Mike; Thelen, Jay J.; Paddock, Troy; Salas, Joaquín J.; Savage, Linda; Milcamps, Anne; Mhaske, Vandana B.; Cho, Younghee; Ohlrogge, John B.

    2003-01-01

    The genome of Arabidopsis has been searched for sequences of genes involved in acyl lipid metabolism. Over 600 encoded proteins have been identified, cataloged, and classified according to predicted function, subcellular location, and alternative splicing. At least one-third of these proteins were previously annotated as “unknown function” or with functions unrelated to acyl lipid metabolism; therefore, this study has improved the annotation of over 200 genes. In particular, annotation of the lipolytic enzyme group (at least 110 members total) has been improved by the critical examination of the biochemical literature and the sequences of the numerous proteins annotated as “lipases.” In addition, expressed sequence tag (EST) data have been surveyed, and more than 3,700 ESTs associated with the genes were cataloged. Statistical analysis of the number of ESTs associated with specific cDNA libraries has allowed calculation of probabilities of differential expression between different organs. More than 130 genes have been identified with a statistical probability > 0.95 of preferential expression in seed, leaf, root, or flower. All the data are available as a Web-based database, the Arabidopsis Lipid Gene database (http://www.plantbiology.msu.edu/lipids/genesurvey/index.htm). The combination of the data of the Lipid Gene Catalog and the EST analysis can be used to gain insights into differential expression of gene family members and sets of pathway-specific genes, which in turn will guide studies to understand specific functions of individual genes. PMID:12805597

  11. Proteomic analysis of astrocytic secretion that regulates neurogenesis using quantitative amine-specific isobaric tagging

    SciTech Connect

    Yan, Hu; Zhou, Wenhao; Wei, Liming; Zhong, Fan; Yang, Yi

    2010-01-08

    Astrocytes are essential components of neurogenic niches that affect neurogenesis through membrane association and/or the release of soluble factors. To identify factors released from astrocytes that could regulate neural stem cell differentiation and proliferation, we used mild oxygen-glucose deprivation (OGD) to inhibit the secretory capacity of astrocytes. Using the Transwell co-culture system, we found that OGD-treated astrocytes could not promote neural stem cell differentiation and proliferation. Next, the isobaric tagging for relative and absolute quantitation (iTRAQ) proteomics technique was used to identify the proteins in the supernatants of astrocytes (with or without OGD). Through a multi-step analysis and gene ontology classification, 130 extracellular proteins were identified, most of which were involved in neuronal development, the inflammatory response, extracellular matrix composition and supportive functions. Of these proteins, 44 had never been reported to be produced by astrocytes. Using ProteinPilot software analysis, we found that 60 extracellular proteins were significantly altered (27 upregulated and 33 downregulated) in the supernatant of OGD-treated astrocytes. Among these proteins, 7 have been reported to be able to regulate neurogenesis, while others may have the potential to regulate neurogenesis. This study profiles the major proteins released by astrocytes, which play important roles in the modulation of neurogenesis.

  12. Preparation of a Ytterbium-tagged Gunshot Residue Standard for Quality Control in the Forensic Analysis of GSR.

    PubMed

    Hearns, Nigel G R; Laflèche, Denis N; Sandercock, Mark L

    2015-05-01

    Preparation of a ytterbium-tagged gunshot residue (GSR) reference standard for scanning electron microscopy and energy dispersive X-ray spectroscopic (SEM-EDS) microanalysis is reported. Two different chemical markers, ytterbium and neodymium, were evaluated by spiking the primers of .38 Special ammunition cartridges (no propellant, no projectile) and discharging them onto 12.7 mm diameter aluminum SEM pin stubs. Following SEM-EDS microanalysis, the majority of tri-component particles containing lead, barium, and antimony (PbBaSb) were successfully tagged with the chemical marker. Results demonstrate that a primer spiked with 0.75 weight percent of ytterbium nitrate affords PbBaSb particles characteristic of GSR with a ytterbium inclusion efficiency of between 77% and 100%. Reproducibility of the method was verified, and durability of the ytterbium-tagged tri-component particles under repeated SEM-EDS analysis was also tested. The ytterbium-tagged PbBaSb particles impart synthetic traceability to a GSR reference standard and are suitable for analysis alongside case work samples, as a positive control for quality assurance purposes. PMID:25678346

  13. Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource

    PubMed Central

    Adzhubei, Alexei A; Vlasova, Anna V; Hagen-Larsen, Heidi; Ruden, Torgeir A; Laerdahl, Jon K; Høyheim, Bjørn

    2007-01-01

    Background To identify as many different transcripts/genes in the Atlantic salmon genome as possible, it is crucial to acquire good cDNA libraries from different tissues and developmental stages, their relevant sequences (ESTs or full length sequences) and attempt to predict function. Such libraries allow identification of a large number of different transcripts and can provide valuable information on genes expressed in a particular tissue at a specific developmental stage. This data is important in constructing a microarray chip, identifying SNPs in coding regions, and for future identification of genes in the whole genome sequence. An important factor that determines the usefulness of generated data for biologists is efficient data access. Public searchable databases play a crucial role in providing such service. Description Twenty-three Atlantic salmon cDNA libraries were constructed from 15 tissues, yielding nearly 155,000 clones. From these libraries 58,109 ESTs were generated, of which 57,212 were used for contig assembly. Following deletion of mitochondrial sequences 55,118 EST sequences were submitted to GenBank. In all, 20,019 unique sequences, consisting of 6,424 contigs and 13,595 singlets, were generated. The Norwegian Salmon Genome Project Database has been constructed and annotation performed by the annotation transfer approach. Annotation was successful for 50.3% (10,075) of the sequences and 6,113 sequences (30.5%) were annotated with Gene Ontology terms for molecular function, biological process and cellular component. Conclusion We describe the construction of cDNA libraries from juvenile/pre-smolt Atlantic salmon (Salmo salar), EST sequencing, clustering, and annotation by assigning putative function to the transcripts. These sequences represent 97% of all sequences submitted to GenBank from the pre-smoltification stage. The data has been grouped into datasets according to its source and type of annotation. Various data query options are offered
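
    The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names and the `percent` helper are hypothetical, and the numbers are copied from the abstract rather than taken from the underlying database.

```python
# Counts quoted in the abstract above.
contigs = 6_424
singlets = 13_595
unique_sequences = contigs + singlets  # contigs plus singlets

annotated = 10_075       # sequences with a transferred annotation
with_go_terms = 6_113    # sequences annotated with Gene Ontology terms


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(unique_sequences)                          # 20019
print(percent(annotated, unique_sequences))      # 50.3
print(percent(with_go_terms, unique_sequences))  # 30.5
```

    The totals and both percentages agree with the figures reported in the abstract.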

  14. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly.

    PubMed

    Feng, Yan-Jie; Liu, Qing-Feng; Chen, Meng-Yun; Liang, Dan; Zhang, Peng

    2016-01-01

    In phylogenetics and population genetics, a large number of loci are often needed to accurately resolve species relationships. Normally, loci are enriched by PCR and sequenced by Sanger sequencing, which is expensive when the number of amplicons is large. Next-generation sequencing (NGS) techniques are increasingly used for parallel amplicon sequencing, which reduces sequencing costs tremendously, but has not reduced preparation costs very much. Moreover, for most current NGS methods, amplicons need to be purified and quantified before sequencing and their lengths are also restricted (normally <700 bp). Here, we describe an approach to sequence pooled amplicons of any length using the Illumina platform. Using this method, amplicons are pooled at equal volume rather than at equal concentration, thus eliminating the laborious purification and quantification steps. We then shear the pooled amplicons, repair the ends, add sample identifying linkers and pool multiple samples prior to Illumina library preparation. Data are then assembled using the transcriptome assembly program trinity, which is optimized to deal with templates of highly varying quantities. We demonstrated the utility of our approach by recovering 93.5% of the target amplicons (size up to 1650 bp) in full length for a 16 taxa × 101 loci project, using ~2.0 GB of Illumina HiSeq paired-end 90-bp data. Overall, we validate a rapid, cost-effective and scalable approach to sequence a large number of targeted loci from a large number of samples that is particularly suitable for both phylogenetics and population genetics studies that require a modest scale of data. PMID:25959587

  15. Transcriptome analysis in primary neural stem cells using a tag cDNA amplification method

    PubMed Central

    Sievertzon, Maria; Wirta, Valtteri; Mercer, Alex; Meletis, Konstantinos; Erlandsson, Rikard; Wikström, Lilian; Frisén, Jonas; Lundeberg, Joakim

    2005-01-01

    Background Neural stem cells (NSCs) can be isolated from the adult mammalian brain and expanded in culture, in the form of cellular aggregates called neurospheres. Neurospheres provide an in vitro model for studying NSC behaviour and give information on the factors and mechanisms that govern their proliferation and differentiation. They are also a promising source for cell replacement therapies of the central nervous system. Neurospheres are complex structures consisting of several cell types of varying degrees of differentiation. One way of characterising neurospheres is to analyse their gene expression profiles. The value of such studies is however uncertain since they are heterogeneous structures and different populations of neurospheres may vary significantly in their gene expression. Results To address this issue, we have used cDNA microarrays and a recently reported tag cDNA amplification method to analyse the gene expression profiles of neurospheres originating from separate isolations of the lateral ventricle wall of adult mice and passaged to varying degrees. Separate isolations as well as consecutive passages yield a high variability in gene expression while parallel cultures yield the lowest variability. Conclusions We demonstrate a low technical amplification variability using the employed amplification strategy and conclude that neurospheres from the same isolation and passage are sufficiently similar to be used for comparative gene expression analysis. PMID:15833137

  16. Haplotypes of the TaGS5-A1 Gene Are Associated with Thousand-Kernel Weight in Chinese Bread Wheat

    PubMed Central

    Wang, Shasha; Yan, Xuefang; Wang, Yongyan; Liu, Hongmei; Cui, Dangqun; Chen, Feng

    2016-01-01

    In previous work, we cloned the TaGS5 gene and found an association of TaGS5-A1 alleles with agronomic traits. In this study, the promoter sequence of the TaGS5-A1 gene was isolated from bread wheat. Sequencing results revealed a G insertion at position -1925 bp of the TaGS5-A1 gene (relative to the ATG start codon), which occurred in the Sp1 domain of the promoter sequence. Combined with the previously reported single nucleotide polymorphism (SNP) in the TaGS5-A1 exon sequence, four genotypes were formed at the TaGS5-A1 locus and were designated as TaGS5-A1a-a, TaGS5-A1a-b, TaGS5-A1b-a, and TaGS5-A1b-b, respectively. Analysis of the association of TaGS5-A1 alleles with agronomic traits indicated that cultivars with the TaGS5-A1a-b allele possessed significantly higher thousand-kernel weight (TKW) and lower plant height than cultivars with the TaGS5-A1a-a allele, and cultivars with the TaGS5-A1b-b allele showed higher TKW than cultivars with the TaGS5-A1b-a allele. The differences in these traits between the TaGS5-A1a-a and TaGS5-A1a-b alleles were larger than those between the TaGS5-A1b-a and TaGS5-A1b-b alleles, suggesting that the -1925G insertion plays a more important role in TaGS5-A1a genotypes than in TaGS5-A1b genotypes. qRT-PCR indicated that TaGS5-A1b-b had a significantly higher expression level than the other three TaGS5-A1 haplotypes in mature seeds and further showed a significantly higher expression level than TaGS5-A1b-a at five different developmental stages of the seeds, suggesting that high expression of TaGS5-A1 was positively associated with high TKW in bread wheat. This study could provide a relatively superior genotype in view of TKW in wheat breeding programs and could also provide important information for dissection of the regulatory mechanism of the yield-related traits. PMID:27375643

  17. Sequence analysis of the AAA protein family.

    PubMed Central

    Beyer, A.

    1997-01-01

    The AAA protein family, a recently recognized group of Walker-type ATPases, has been subjected to an extensive sequence analysis. Multiple sequence alignments revealed the existence of a region of sequence similarity, the so-called AAA cassette. The borders of this cassette were localized and within it, three boxes of a high degree of conservation were identified. Two of these boxes could be assigned to substantial parts of the ATP binding site (namely, to Walker motifs A and B); the third may be a portion of the catalytic center. Phylogenetic trees were calculated to obtain insights into the evolutionary history of the family. Subfamilies with varying degrees of intra-relatedness could be discriminated; these relationships are also supported by analysis of sequences outside the canonical AAA boxes: within the cassette are regions that are strongly conserved within each subfamily, whereas little or even no similarity between different subfamilies can be observed. These regions are well suited to define fingerprints for subfamilies. A secondary structure prediction utilizing all available sequence information was performed and the result was fitted to the general 3D structure of a Walker A/GTPase. The agreement was unexpectedly high and strongly supports the conclusion that the AAA family belongs to the Walker superfamily of A/GTPases. PMID:9336829

  18. Mining an Ostrinia nubilalis Midgut Expressed Sequence Tag (EST) Library for Candidate Genes and Single Nucleotide Polymorphisms (SNPs)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    European corn borer, Ostrinia nubilalis, larvae feed upon many plant hosts and are a major target for genetically-engineered corn expressing Bacillus thuringiensis (Bt) toxins. DNA sequencing of a non-normalized O. nubilalis larval midgut cDNA library (ARS-CICGRU ONmgEST) identified 535 unique sequ...

  19. Sequence analysis by iterated maps, a review

    PubMed Central

    2014-01-01

    Among alignment-free methods, Iterated Maps (IMs) are on a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, ‘Chaos Game Representation’. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in the same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results. PMID:24162172
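
    As an illustration of the iterated map named in this abstract, the following is a minimal sketch of a Chaos Game Representation for DNA. The corner assignment, the starting point at the centre of the unit square, and the function name `chaos_game` are assumptions made for the example; the review itself may use different conventions.

```python
from typing import List, Tuple

# Assumed corner assignment on the unit square (a common convention, not
# necessarily the one used in the review).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game(sequence: str) -> List[Tuple[float, float]]:
    """Map a DNA sequence to points by moving halfway to each base's corner."""
    x, y = 0.5, 0.5  # assumed start: centre of the square
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # design choice for this sketch: skip ambiguous symbols
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2  # one iteration of the map
        points.append((x, y))
    return points


print(chaos_game("ACGT")[-1])  # (0.78125, 0.40625)
```

    Each point falls in the quadrant of the most recent base (here the lower right, for T), and finer subdivisions of that quadrant encode progressively older bases, which is one way to picture the scale-free property the abstract describes.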

  20. Gene identification in black cohosh (Actaea racemosa L.): expressed sequence tag profiling and genetic screening yields candidate genes for production of bioactive secondary metabolites.

    PubMed

    Spiering, Martin J; Urban, Lori A; Nuss, Donald L; Gopalan, Vivek; Stoltzfus, Arlin; Eisenstein, Edward

    2011-04-01

    Black cohosh (Actaea racemosa L., syn. Cimicifuga racemosa, Nutt., Ranunculaceae) is a popular herb used for relieving menopausal discomforts. A variety of secondary metabolites, including triterpenoids, phenolic dimers, and serotonin derivatives have been associated with its biological activity, but the genes and metabolic pathways as well as the tissue distribution of their production in this plant are unknown. A gene discovery effort was initiated in A. racemosa by partial sequencing of cDNA libraries constructed from young leaf, rhizome, and root tissues. In total, 2,066 expressed sequence tags (ESTs) were assembled into 1,590 unique genes (unigenes). Most of the unigenes were predicted to encode primary metabolism genes, but about 70 were identified as putative secondary metabolism genes. Several of these candidates were analyzed further and full-length cDNA and genomic sequences for a putative 2,3 oxidosqualene cyclase (CAS1) and two BAHD-type acyltransferases (ACT1 and HCT1) were obtained. Homology-based PCR screening for the central gene in plant serotonin biosynthesis, tryptophan decarboxylase (TDC), identified two TDC-related sequences in A. racemosa. CAS1, ACT1, and HCT1 were expressed in most plant tissues, whereas expression of TDC genes was detected only sporadically in immature flower heads and some very young leaf tissues. The cDNA libraries described and assorted genes identified provide initial insight into gene content and diversity in black cohosh, and provide tools and resources for detailed investigations of secondary metabolite genes and enzymes in this important medicinal plant. PMID:21188383

  1. Computational analysis of wake structure and body forces on marine animal research tag

    NASA Astrophysics Data System (ADS)

    Rosanio, Matthew; Morrida, Jacob; Green, Melissa

    2013-11-01

    The Acousonde 3B marine animal research tag is used to study the relationship between the sounds made by whales and their behaviors, and ultimately to improve whale conservation efforts. In practical implementation, some researchers have attached external GPS Fastloc devices to the top surface of the tag, in order to accurately record the position of the whales throughout the deployment. There is a need to characterize the flow over the tag in order to better understand the body forces being exerted on it and how wake turbulence could affect noise measurements. The addition of the GPS Fastloc exacerbates both of these concerns, as it complicates the hydrodynamics of the device. Using CFD techniques, we were able to simulate the flow over the tag with a GPS attachment at multiple yaw angles. We used Pointwise to construct the mesh and Fluent to simulate the flow. We have also used flow visualization to experimentally validate our computational results. It was found that the GPS has a minimal effect on the wake of the tag at a 0 degree offset from the freestream flow. However, at increasing offset angles, the presence of the GPS greatly increased the amount of wake turbulence observed. Performed work while undergrad at Syracuse.

  2. A consensus linkage map for sugi (Cryptomeria japonica) from two pedigrees, based on microsatellites and expressed sequence tags.

    PubMed Central

    Tani, Naoki; Takahashi, Tomokazu; Iwata, Hiroyoshi; Mukai, Yuzuru; Ujino-Ihara, Tokuko; Matsumoto, Asako; Yoshimura, Kensuke; Yoshimaru, Hiroshi; Murai, Masafumi; Nagasaka, Kazutoshi; Tsumura, Yoshihiko

    2003-01-01

    A consensus map for sugi (Cryptomeria japonica) was constructed by integrating linkage data from two unrelated third-generation pedigrees, one derived from a full-sib cross and the other by self-pollination of F1 individuals. The progeny segregation data of the first pedigree were derived from cleaved amplified polymorphic sequences, microsatellites, restriction fragment length polymorphisms, and single nucleotide polymorphisms. The data of the second pedigree were derived from cleaved amplified polymorphic sequences, isozyme markers, morphological traits, random amplified polymorphic DNA markers, and restriction fragment length polymorphisms. Linkage analyses were done for the first pedigree with JoinMap 3.0, using its parameter set for progeny derived by cross-pollination, and for the second pedigree with the parameter set for progeny derived from selfing of F1 individuals. The 11 chromosomes of C. japonica are represented in the consensus map. A total of 438 markers were assigned to 11 large linkage groups, 1 small linkage group, and 1 nonintegrated linkage group from the second pedigree; their total length was 1372.2 cM. On average, the consensus map showed 1 marker every 3.0 cM. PCR-based codominant DNA markers such as cleaved amplified polymorphic sequences and microsatellite markers were distributed in all linkage groups and occupied about half of mapped loci. These markers are very useful for integration of different linkage maps, QTL mapping, and comparative mapping for evolutional study, especially for species with a large genome size such as conifers. PMID:14668402

  3. An evaluation for cross-species proteomics research by publicly available expressed sequence tag database search using tandem mass spectral data.

    PubMed

    Huang, Mei; Chen, Tong; Chan, ZhuLong

    2006-01-01

    With 1383 tandem mass spectra derived from 120 individual protein spots separated by the two-dimensional (2-D) gel electrophoresis of protein samples from three different species, comparative analyses were performed by searching the Expressed Sequence Tag (EST) database (DB) and the NCBI non-redundant (nr) DB of green plants, respectively, which uses the Mascot search engine to establish a statistical basis. It was confirmed that the former could identify more peptides manually validated by de novo sequencing (DNS) from fewer species in more closely phylogenetic relationships than the latter in a statistically significant manner. Our data demonstrated that correct peptide identifications were given low Mascot scores (e.g. 6-14) and incorrect peptide identifications were given high Mascot scores (e.g. 68-83). Our data also showed that the current evaluation approaches to protein assignments are unsatisfactory because a few 'false-positive' proteins are recognized and several 'false-negative' proteins are rescued by manual validation. PMID:16941525

  4. Green Fluorescent Protein-Tagged Retroviral Envelope Protein for Analysis of Virus-Cell Interactions

    PubMed Central

    Spitzer, Dirk; Dittmar, Kurt E. J.; Rohde, Manfred; Hauser, Hansjörg; Wirth, Dagmar

    2003-01-01

    Fluorescent retroviral envelope (Env) proteins were developed for direct visualization of viral particles. By fusing the enhanced green fluorescent protein (eGFP) to the N terminus of the amphotropic 4070A envelope protein, extracellular presentation of eGFP was achieved. Viruses incorporated the modified Env protein and efficiently infected cells. We used the GFP-tagged viruses for staining retrovirus receptor-positive cells, thereby circumventing indirect labeling techniques. By generating cells which conditionally expressed the GFP-tagged Env protein, we could confirm an inverse correlation between retroviral Env expression and infectivity (superinfection). eGFP-tagged virus particles are suitable for monitoring the dynamics of virus-cell interactions. PMID:12719600

  5. Analysis and Design of a Long Range PTFE Substrate UHF RFID Tag for Cargo Container Identification

    NASA Astrophysics Data System (ADS)

    Petrariu, Adrian-Ioan; Popa, Valentin

    2016-01-01

    In this paper, a high-performance microstrip antenna for a UHF (ultra high frequency) RFID (radio frequency identification) tag is designed, prototyped and tested. The antenna consists of two main components: a 1.52 mm RT/duroid 5880 laminate substrate on which the antenna is designed and a 10 mm polytetrafluoroethylene (PTFE) dielectric material placed as a separator between the antenna and the reference ground plane for the microstrip antenna. With this structure, the RFID tag can reach a maximum reading distance of 19 m, although the antenna has a compact size of 80 mm × 50 mm. The long reading distance is obtained by attaching to the antenna an RFID chip that can provide a reading sensitivity of -20.5 dBm. The wide bandwidth, from 677 MHz to 947 MHz measured at -10 dB, makes the tag usable worldwide, especially for cargo container identification, the main purpose of this research.

  6. NexGen Production – Sequencing and Analysis

    SciTech Connect

    Muzny, Donna

    2010-06-02

    Donna Muzny of the Baylor College of Medicine Human Genome Sequencing Center discusses next generation sequencing platforms and evaluating pipeline performance on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  7. Identification and characterisation of functional expressed sequence tags-derived simple sequence repeat (eSSR) markers for genetic linkage mapping of Schistosoma mansoni juvenile resistance and susceptibility loci in Biomphalaria glabrata

    PubMed Central

    Ittiprasert, Wannaporn; Miller, André; Su, Xin-zhuan; Mu, Jianbing; Bhusudsawang, Ganlayarat; Ukoskit, Kitipat; Knight, Matty

    2013-01-01

    Biomphalaria glabrata susceptibility to Schistosoma mansoni has a strong genetic component, offering the possibility for investigating host–parasite interactions at the molecular level, perhaps leading to novel control approaches. The identification, mapping and molecular characterisation of genes that influence the outcome of parasitic infection in the intermediate snail host is, therefore, seen as fundamental to the control of schistosomiasis. To better understand the evolutionary processes driving disease resistance/susceptibility phenotypes, we previously identified polymorphic random amplification of polymorphic DNA and genomic simple sequence repeats from B. glabrata. In the present study we identified and characterised polymorphic expressed simple sequence repeat markers (Bg-eSSR) from existing B. glabrata expressed sequence tags. Using these markers, and with previously identified genomic simple sequence repeats, genetic linkage mapping for parasite refractory and susceptibility phenotypes, the first known for B. glabrata, was initiated. Data mining of 54,309 expressed sequence tags produced 660 expressed simple sequence repeats, of which dinucleotide motifs (TA)n were the most common (37.88%), followed by trinucleotide (29.55%), mononucleotide (18.64%) and tetranucleotide (10.15%). Penta- and hexanucleotide motifs represented <3% of the Bg-eSSRs identified. While the majority (71%) of Bg-eSSRs were monomorphic between resistant and susceptible snails, several were, however, useful for the construction of a genetic linkage map based on their inheritance in segregating F2 progeny snails derived from crossing juvenile BS-90 and NMRI snails. Polymorphic Bg-eSSRs assorted into six linkage groups at a logarithm of odds score of 3. Interestingly, the heritability of four markers (Prim1_910, Prim1_771, Prim6_1024 and Prim7_823) with juvenile snail resistance were, by t-test, significant (P < 0.05) while an allelic marker, Prim24_524, showed linkage with the
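
    The kind of EST data mining summarised above (finding simple sequence repeats and tallying them by motif length) can be sketched with regular expressions. This is an illustrative sketch only: the abstract does not state which tool or thresholds the authors used, so the minimum repeat counts in `MIN_REPEATS` and the function names `find_ssrs` and `motif_length_shares` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical thresholds: minimum number of tandem copies per motif length.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return (motif, copies, start) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((motif, len(m.group(0)) // k, m.start()))
    return hits


def motif_length_shares(hits):
    """Percentage of SSRs per motif length (1 = mono-, 2 = dinucleotide, ...)."""
    counts = Counter(len(motif) for motif, _, _ in hits)
    total = sum(counts.values())
    return {k: round(100 * n / total, 2) for k, n in sorted(counts.items())}


est = "GGC" + "TA" * 8 + "CCT" + "CAG" * 5 + "T"  # made-up test sequence
print(find_ssrs(est))                       # [('TA', 8, 3), ('CAG', 5, 22)]
print(motif_length_shares(find_ssrs(est)))  # {2: 50.0, 3: 50.0}
```

    Run over a full EST collection, a tally of this kind would yield per-class percentages comparable in form to those reported in the abstract, although the exact values depend on the thresholds chosen.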

  8. Advancing the surgical implantation of electronic tags in fish: a gap analysis and research agenda based on a review of trends in intracoelomic tagging effects studies

    SciTech Connect

    Cooke, Steven J.; Woodley, Christa M.; Eppard, M. B.; Brown, Richard S.; Nielsen, Jennifer L.

    2011-03-08

    Early approaches to surgical implantation of electronic tags in fish were often through trial and error; however, in recent years there has been an interest in using scientific research to identify techniques and procedures that improve the outcome of surgical procedures and determine the effects of tagging on individuals. Here we summarize the trends in 108 peer-reviewed electronic tagging effect studies focused on intracoelomic implantation to determine opportunities for future research. To date, almost all of the studies have been conducted in freshwater, typically in laboratory environments, and have focused on biotelemetry devices. The majority of studies have focused on salmonids, cyprinids, ictalurids and centrarchids, with a regional bias towards North America, Europe and Australia. Most studies have focused on determining whether there is a negative effect of tagging relative to control fish, with proportionally fewer that have contrasted different aspects of the surgical procedure (e.g., methods of sterilization, incision location, wound closure material) that could advance the discipline. Many of these studies included routine endpoints such as mortality, growth, healing and tag retention, with fewer addressing sublethal measures such as swimming ability, predator avoidance, physiological costs, or fitness. Continued research is needed to further elevate the practice of electronic tag implantation in fish in order to ensure that the data generated are relevant to untagged conspecifics (i.e., no long-term behavioural or physiological consequences) and the surgical procedure does not impair the health and welfare status of the tagged fish. To that end, we advocate for i) rigorous controlled manipulations based on statistical designs that have adequate power, account for inter-individual variation, and include controls and shams, ii) studies that transcend the laboratory and the field with more studies in marine waters, iii) incorporation of knowledge and

  9. Tag-SNP analysis of the GFI1-EVI5-RPL5-FAM69 risk locus for multiple sclerosis

    PubMed Central

    Alcina, Antonio; Fernández, Óscar; Gonzalez, Juan Ramón; Catalá-Rabasa, Antonio; Fedetz, María; Ndagire, Dorothy; Leyva, Laura; Guerrero, Miguel; Arnal, Carmen; Delgado, Concepción; Lucas, Miguel; Izquierdo, Guillermo; Matesanz, Fuencisla

    2010-01-01

    A recent genome-wide association study conducted by the International Multiple Sclerosis Genetic Consortium (IMSGC) identified, among others, a number of putative multiple sclerosis (MS) susceptibility variants at position 1p22. Twenty-one SNPs positively associated with MS were located at the GFI1-EVI5-RPL5-FAM69A locus. In this study, we performed an analysis and fine mapping of this locus, genotyping eight Tag-SNPs in 732 MS patients and 974 controls from Spain. We observed an association with MS in three of eight Tag-SNPs: rs11804321 (P=0.008, OR=1.29; 95% CI=1.08–1.54), rs11808092 (P=0.048, OR=1.19; 95% CI=1.03–1.39) and rs6680578 (P=0.0082, OR=1.23; 95% CI=1.07–1.41). After correcting for multiple comparisons and using logistic regression analysis to test the addition of each SNP to the most associated SNPs, we observed that rs11804321 alone was sufficient to model the association. This Tag-SNP captures two SNPs in complete linkage disequilibrium (r2=1), both located within the 17th intron of the EVI5 gene. Our findings agree with the corresponding data of the recent IMSGC study and present new genetic evidence that points to EVI5 as a factor of susceptibility to MS. PMID:20087403
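
    For readers unfamiliar with how an odds ratio (OR) and its 95% confidence interval are obtained from case-control counts, a minimal sketch follows. It uses the standard log-odds (Woolf) approximation, which is not necessarily the method the authors applied, and the counts in the example are made up because the abstract does not report the underlying allele counts.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: cases with the risk allele      b: cases without it
    c: controls with the risk allele   d: controls without it
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(or_)  # 4.0
```

    An interval that excludes 1, as for the three Tag-SNPs listed in the abstract, indicates an association at the chosen confidence level before any correction for multiple comparisons.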

  10. Analysis of 3-D Tongue Motion from Tagged and Cine Magnetic Resonance Images

    ERIC Educational Resources Information Center

    Xing, Fangxu; Woo, Jonghye; Lee, Junghoon; Murano, Emi Z.; Stone, Maureen; Prince, Jerry L.

    2016-01-01

    Purpose: Measuring tongue deformation and internal muscle motion during speech has been a challenging task because the tongue deforms in 3 dimensions, contains interdigitated muscles, and is largely hidden within the vocal tract. In this article, a new method is proposed to analyze tagged and cine magnetic resonance images of the tongue during…

  11. Versatile Trans-Replication Systems for Chikungunya Virus Allow Functional Analysis and Tagging of Every Replicase Protein

    PubMed Central

    Utt, Age; Quirin, Tania; Saul, Sirle; Hellström, Kirsi; Ahola, Tero; Merits, Andres

    2016-01-01

    Chikungunya virus (CHIKV; genus Alphavirus, family Togaviridae) has recently caused several major outbreaks affecting millions of people. There are no licensed vaccines or antivirals, and the knowledge of the molecular biology of CHIKV, crucial for development of efficient antiviral strategies, remains fragmentary. CHIKV has a 12 kb positive-strand RNA genome, which is translated to yield a nonstructural (ns) or replicase polyprotein. CHIKV structural proteins are expressed from a subgenomic RNA synthesized in infected cells. Here we have developed CHIKV trans-replication systems, where replicase expression and RNA replication are uncoupled. Bacteriophage T7 RNA polymerase or cellular RNA polymerase II were used for production of mRNAs for CHIKV ns polyprotein and template RNAs, which are recognized by CHIKV replicase and encode for reporter proteins. CHIKV replicase efficiently amplified such RNA templates and synthesized large amounts of subgenomic RNA in several cell lines. This system was used to create tagged versions of ns proteins including nsP1 fused with enhanced green fluorescent protein and nsP4 with an immunological tag. Analysis of these constructs and a matching set of replicon vectors revealed that the replicases containing tagged ns proteins were functional and maintained their subcellular localizations. When cells were co-transfected with constructs expressing template RNA and wild type or tagged versions of CHIKV replicases, formation of characteristic replicase complexes (spherules) was observed. Analysis of mutations associated with noncytotoxic phenotype in CHIKV replicons showed that a low level of RNA replication is not a pre-requisite for reduced cytotoxicity. The CHIKV trans-replicase does not suffer from genetic instability and represents an efficient, sensitive and reliable tool for studies of different aspects of CHIKV RNA replication process. PMID:26963103

  12. A versatile PCR-based tandem epitope tagging system for Streptomyces coelicolor genome.

    PubMed

    Kim, Ji-Nu; Yi, Jeong Sang; Lee, Bo-Rahm; Kim, Eun-Jung; Kim, Min Woo; Song, Yoseb; Cho, Byung-Kwan; Kim, Byung-Gee

    2012-07-20

    Epitope tagging approaches have been widely used for the analysis of functions, interactions and subcellular distributions of proteins. However, incorporating an epitope sequence into protein loci in Streptomyces is a time-consuming procedure due to the absence of versatile tagging methods. Here, we developed a versatile PCR-based tandem epitope tagging tool for Streptomyces genome engineering. We constructed a series of template plasmids that carry a repeated sequence of the c-myc epitope, Flp recombinase target (FRT) sites, and an apramycin resistance marker to insert epitope tags into any desired spot of the chromosomal loci. A DNA module which includes the tandem epitope-encoding sequence and a selectable marker was amplified by PCR with primers that carry homologous extensions to the last portion and downstream region of the targeted gene. We fused the epitope tags at the 3' region of global transcription factors of Streptomyces coelicolor to test the validity of this system. The proper insertion of the epitope tag was confirmed by PCR and western blot analysis. The recombinants showed a phenotype identical to the wild type, indicating that the in vivo function of the tagged proteins was conserved. Finally, the direct binding targets were successfully detected by chromatin immunoprecipitation with an increase in the signal-to-noise ratio. The epitope tagging system described here would provide wide applications to study protein functions in S. coelicolor. PMID:22704935

  13. The Design and Analysis of Salmonid Tagging Studies in the Columbia Basin; Volume XII; A Multinomial Model for Estimating Ocean Survival from Salmonid Coded Wire-Tag Data.

    SciTech Connect

    Ryding, Kristen E.; Skalski, John R.

    1999-06-01

    The purpose of this report is to illustrate the development of a stochastic model using coded wire-tag (CWT) release and age-at-return data, in order to regress first year ocean survival probabilities against coastal ocean conditions and climate covariates.

  14. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  15. Genome sequence and analysis of Lactobacillus helveticus

    PubMed Central

    Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

  16. Identification of genes expressed in human CD34+ hematopoietic stem/progenitor cells by expressed sequence tags and efficient full-length cDNA cloning

    PubMed Central

    Mao, Mao; Fu, Gang; Wu, Ji-Sheng; Zhang, Qing-Hua; Zhou, Jun; Kan, Li-Xin; Huang, Qiu-Hua; He, Kai-Li; Gu, Bai-Wei; Han, Ze-Guang; Shen, Yu; Gu, Jian; Yu, Ya-Ping; Xu, Shu-Hua; Wang, Ya-Xin; Chen, Sai-Juan; Chen, Zhu

    1998-01-01

    Hematopoietic stem/progenitor cells (HSPCs) possess the potential for self-renewal, proliferation, and differentiation toward different lineages of blood cells. These cells not only play a primordial role in hematopoietic development but also have important clinical applications. Characterization of the gene expression profile in CD34+ HSPCs may lead to a better understanding of the regulation of normal and pathological hematopoiesis. In the present work, genes expressed in human umbilical cord blood CD34+ cells were catalogued by partially sequencing a large number of cDNA clones [or expressed sequence tags (ESTs)] and analyzing these sequences with the tools of bioinformatics. Among 9,866 ESTs thus obtained, 4,697 (47.6%) showed identity to known genes in the GenBank database, 2,603 (26.4%) matched ESTs previously deposited in a public domain database, 1,415 (14.3%) were previously undescribed ESTs, and the remaining 1,151 (11.7%) were mitochondrial DNA, ribosomal RNA, or repetitive (Alu or L1) sequences. Integration of ESTs of known genes generated a profile including 855 genes that could be divided into different categories according to their functions. Some (8.2%) of the genes in this profile were considered related to early hematopoiesis. The possible functions of ESTs corresponding to so far unknown genes were approached by means of homology and functional motif searches. Moreover, attempts were made to generate libraries enriched for full-length cDNAs, to better explore the genes in HSPCs. Nearly 60% of the cDNA clones of mRNA under 2 kb in our libraries had 5′ ends upstream of the first ATG codon of the ORF. With this satisfactory result, we have developed an efficient working system that allowed fast sequencing of 32 full-length cDNAs, 16 of them being mapped to the chromosomes with radiation hybrid panels. This work may lay a basis for further research on the molecular network of hematopoietic regulation. PMID:9653160

  17. Evaporation tagging and atmospheric water budget analysis with WRF: A regional precipitation recycling study for West Africa

    NASA Astrophysics Data System (ADS)

    Arnault, Joel; Knoche, Richard; Wei, Jianhui; Kunstmann, Harald

    2016-03-01

    Regional precipitation recycling is the measure of the contribution of local evaporation E to local precipitation. This study provides a set of two methods developed in the Weather Research and Forecasting (WRF) model system for investigating regional precipitation recycling mechanisms: (1) tracking of tagged atmospheric water species originating from evaporation in a source region, i.e., E-tagging, and (2) three-dimensional budgets of total and tagged atmospheric water species. These methods are used to quantify the effect of return flow and nonwell vertical mixing neglected in the computation of the bulk precipitation recycling ratio. The developed algorithms are applied to a WRF simulation of the West African Monsoon 2003. The simulated region is characterized by vertical wind shear conditions, i.e., southwesterlies in the low levels and easterlies in the mid-levels, which favor return flow and nonwell vertical mixing. Regional precipitation recycling is investigated in 100 × 100 and 1000 × 1000 km2 areas. A prerequisite condition for evaporated water to contribute to the precipitation process in both areas is that it is lifted to the mid-levels where hydrometeors are produced. In the 100 × 100 (1000 × 1000) km2 area the bulk precipitation recycling ratio is 0.9 (7.3) %. Our budget analysis reveals that return flow and nonwell vertically mixed outflow increase this value by about +0.2 (2.9) and +0.2 (1.6) %, respectively, thus strengthening the well-known scale-dependency of regional precipitation recycling.

  18. Selective Chemoprecipitation and Subsequent Release of Tagged Species for the Analysis of Nitropeptides by Liquid Chromatography–Tandem Mass Spectrometry*

    PubMed Central

    Prokai-Tatrai, Katalin; Guo, Jia; Prokai, Laszlo

    2011-01-01

    Tyrosine nitration is a low-abundance post-translational protein modification that requires appropriate enrichment techniques to enable proteomic analyses. We report a simple yet highly specific method to enrich nitropeptides by chemoprecipitation involving only two straightforward chemical modifications of the nitropeptides before capturing the obtained derivatives with a strategically designed solid-phase active ester reagent. Specifically, capping of the aliphatic amines in the peptides is done first by reductive methylation to preserve the charge state of peptides for electrospray ionization mass spectrometric analysis, followed by reduction of nitrotyrosines to the corresponding aminotyrosines. These peptides are then immobilized on the solid-phase active ester reagent, whereas other peptides carrying no free amino groups are separated from the immobilized species by thoroughly washing the beads from which the tagged peptide derivatives can easily be released by acid-catalyzed hydrolysis at room temperature. The benefits of selective enrichment from a matrix of unmodified peptides for liquid chromatography-tandem mass spectrometry are demonstrated on three synthetic nitropeptides that are nitrated fragments of biologically relevant proteins. Identification of several in vitro nitrated human plasma proteins, also implicated under various pathological processes, by database searches from the enriched and tagged tryptic nitropeptides is presented as a practical application. We also show that converting the nitro-group to the small 4-formylbenzoylamido tag does not significantly alter fragmentation properties upon collision-induced dissociation compared with those of the native nitropeptides, and at the same time this derivatization actually improves electron capture dissociation due to conversion of the electron-predator nitro-group to this novel tag. PMID:21540302

  19. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum)

    PubMed Central

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous, approximately 21nt in length, non-coding RNA, which mediate the expression of target genes primarily at post-transcriptional levels. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, which is an economically important small fruit crop, still remain totally unknown. In this study, we reported a computational identification of miRNAs and their targets in blueberry. By conducting an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156–5p, vco-miR156–3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic process. These findings will greatly contribute to future research in regard to functions and regulatory mechanisms of blueberry miRNAs. PMID:25763692

  20. A high-density genetic recombination map of sequence-tagged sites for sorghum, as a framework for comparative structural and evolutionary genomics of tropical grains and grasses.

    PubMed Central

    Bowers, John E; Abbey, Colette; Anderson, Sharon; Chang, Charlene; Draye, Xavier; Hoppe, Alison H; Jessup, Russell; Lemke, Cornelia; Lennington, Jennifer; Li, Zhikang; Lin, Yann-Rong; Liu, Sin-Chieh; Luo, Lijun; Marler, Barry S; Ming, Reiguang; Mitchell, Sharon E; Qiang, Dou; Reischmann, Kim; Schulze, Stefan R; Skinner, D Neil; Wang, Yue-Wen; Kresovich, Stephen; Schertz, Keith F; Paterson, Andrew H

    2003-01-01

    We report a genetic recombination map for Sorghum of 2512 loci spaced at average 0.4 cM (approximately 300 kb) intervals based on 2050 RFLP probes, including 865 heterologous probes that foster comparative genomics of Saccharum (sugarcane), Zea (maize), Oryza (rice), Pennisetum (millet, buffelgrass), the Triticeae (wheat, barley, oat, rye), and Arabidopsis. Mapped loci identify 61.5% of the recombination events in this progeny set and reveal strong positive crossover interference acting across intervals of [...] sequence-tagged sites will foster many structural, functional and evolutionary genomic studies in major food, feed, and biomass crops. PMID:14504243

  1. Molecular Genetic Analysis of Activation-tagged Transcription Factors Thought to be Involved in Photomorphogenesis

    SciTech Connect

    Neff, Michael

    2011-06-23

    Plants utilize light as a source of information via families of photoreceptors such as the red/far-red absorbing phytochromes (PHY) and the blue/UVA absorbing cryptochromes (CRY). The main goal of the Neff lab is to use molecular-genetic mutant screens to elucidate signaling components downstream of these photoreceptors. Activation-tagging mutagenesis led to the identification of two putative transcription factors that may be involved in both photomorphogenesis and hormone signaling pathways. sob1-D (suppressor of phyB-dominant) mutant phenotypes are caused by the over-expression of a Dof transcription factor previously named OBP3. Our previous studies indicate that OBP3 is a negative regulator of light-mediated cotyledon expansion and may be involved in modulating responsiveness to the growth-regulating hormone auxin. The sob2-D mutant uncovers a role for LEP, a putative AP2/EREBP-like transcription factor, in seed germination, hypocotyl elongation and responsiveness to the hormone abscisic acid. Based on photobiological and genetic analysis of OBP3-knockdown and LEP-null mutations, we hypothesize that these transcription factors are involved in both light-mediated seedling development and hormone signaling. To examine the role that these genes play in photomorphogenesis we will: 1) Further explore the genetic role of OBP3 in cotyledon/leaf expansion and other photomorphogenic processes as well as examine potential physical interactions between OBP3 and CRY1 or other signaling components that genetically interact with this transcription factor. 2) Test the hypothesis that OBP3 is genetically involved in auxin signaling and root development as well as examine the effects of this hormone and light on OBP3 protein accumulation. 3) Test the hypothesis that LEP is involved in seed germination, seedling photomorphogenesis and hormone signaling. Together these experiments will lead to a greater understanding of the complexity of interactions between photoreceptors and DNA

  2. Integrating Sequence Evolution into Probabilistic Orthology Analysis.

    PubMed

    Ullah, Ikram; Sjöstrand, Joel; Andersson, Peter; Sennblad, Bengt; Lagergren, Jens

    2015-11-01

    Orthology analysis, that is, finding out whether a pair of homologous genes are orthologs - stemming from a speciation - or paralogs - stemming from a gene duplication - is of central importance in computational biology, genome annotation, and phylogenetic inference. In particular, an orthologous relationship makes functional equivalence of the two genes highly likely. A major approach to orthology analysis is to reconcile a gene tree to the corresponding species tree (most commonly performed using the most parsimonious reconciliation, MPR). However, most such phylogenetic orthology methods infer the gene tree without considering the constraints implied by the species tree and, perhaps even more importantly, only allow the gene sequences to influence the orthology analysis through the a priori reconstructed gene tree. We propose a sound, comprehensive Bayesian Markov chain Monte Carlo-based method, DLRSOrthology, to compute orthology probabilities. It efficiently sums over the possible gene trees and jointly takes into account the current gene tree, all possible reconciliations to the species tree, and the typically strong signal conveyed by the sequences. We compare our method with PrIME-GEM, a probabilistic orthology approach built on a probabilistic duplication-loss model, and MrBayesMPR, a probabilistic orthology approach that is based on conventional Bayesian inference coupled with MPR. We find that DLRSOrthology outperforms these competing approaches on synthetic data as well as on biological data sets and is robust to incomplete taxon sampling artifacts. PMID:26130236
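
    The reconciliation step mentioned in this abstract can be illustrated with the standard lowest-common-ancestor (LCA) mapping used for most parsimonious reconciliation. The sketch below is not DLRSOrthology, which is a Bayesian method; it only shows how a fixed gene tree is labelled with speciation and duplication events. The toy species tree, the gene names, and the "gene:species" leaf encoding are assumptions made for illustration.

```python
# Hypothetical toy species tree ((A,B)AB,C)ABC stored as child -> parent.
SPECIES_PARENT = {"A": "AB", "B": "AB", "AB": "ABC", "C": "ABC", "ABC": None}


def ancestors(node):
    """Return the path from a species-tree node up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = SPECIES_PARENT[node]
    return path


def lca(x, y):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(ancestors(x))
    for node in ancestors(y):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def reconcile(node, events):
    """Map a gene-tree node onto the species tree and record event types.

    A leaf is a string "gene:species"; an internal node is a 2-tuple.
    An internal node is a duplication when it maps to the same species
    node as one of its children, otherwise it is a speciation.
    """
    if isinstance(node, str):
        return node.split(":")[1]
    left, right = node
    left_map = reconcile(left, events)
    right_map = reconcile(right, events)
    mapped = lca(left_map, right_map)
    kind = "duplication" if mapped in (left_map, right_map) else "speciation"
    events.append((node, mapped, kind))
    return mapped


# Hypothetical gene tree: two copies of the gene exist in species A.
gene_tree = (("a1:A", "b1:B"), ("a2:A", "c1:C"))
events = []
root_species = reconcile(gene_tree, events)
for subtree, species_node, kind in events:
    print(species_node, kind)
```

    Under this labelling, two genes are called orthologs when their last common ancestor in the gene tree is a speciation node (a1 and b1 here) and paralogs when it is a duplication node (a1 and a2 here).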

  3. An integrated mobile system for non-destructive analysis with tagged neutrons

    NASA Astrophysics Data System (ADS)

    Cester, D.; Nebbia, G.; Stevanato, L.; Viesti, G.; Neri, F.; Petrucci, S.; Selmi, S.; Tintori, C.

    2013-04-01

    An integrated mobile system for port security is presented. The system is designed to perform active investigations by using the tagged neutron inspection technique of suspect dangerous materials as well as passive measurements of neutrons and gamma rays to search and identify radioactive and special nuclear materials. The system has been employed in detection tests of special nuclear material as well as in a seaport demonstration.

  4. An integrated mobile system for non-destructive analysis with tagged neutrons

    SciTech Connect

    Cester, D.; Stevanato, L.; Viesti, G.; Nebbia, G.; Neri, F.; Petrucci, S.; Selmi, S.; Tintori, C.

    2013-04-19

    An integrated mobile system for port security is presented. The system is designed to perform active investigations by using the tagged neutron inspection technique of suspect dangerous materials as well as passive measurements of neutrons and gamma rays to search and identify radioactive and special nuclear materials. The system has been employed in detection tests of special nuclear material as well as in a seaport demonstration.

  5. Analysis of mixtures using next generation sequencing of mitochondrial DNA hypervariable regions

    PubMed Central

    Kim, Hanna; Erlich, Henry A.; Calloway, Cassandra D.

    2015-01-01

    Aim To apply massively parallel and clonal sequencing (next generation sequencing or NGS) to the analysis of forensic mixed samples. Methods A duplex polymerase chain reaction (PCR) assay targeting the mitochondrial DNA (mtDNA) hypervariable regions I/II (HVI/HVII) was developed for NGS analysis on the Roche 454 GS Junior instrument. Eight sets of multiplex identifier-tagged 454 fusion primers were used in a combinatorial approach for amplification and deep sequencing of up to 64 samples in parallel. Results This assay was shown to be highly sensitive for sequencing limited DNA amounts ( ~ 100 mtDNA copies) and analyzing contrived and biological mixtures with low level variants ( ~ 1%) as well as “complex” mixtures (≥3 contributors). PCR artifact “hybrid” sequences generated by jumping PCR or template switching were observed at a low level (<2%) in the analysis of mixed samples but could be eliminated by reducing the PCR cycle number. Conclusion This study demonstrates the power of NGS technologies targeting the mtDNA HVI/HVII regions for analysis of challenging forensic samples, such as mixtures and specimens with limited DNA. PMID:26088845
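
    The combinatorial tagging arithmetic in this abstract can be sketched as follows, assuming that eight tagged forward primers are paired with eight tagged reverse primers so that each sample receives a unique pair. The tag labels and sample names are hypothetical placeholders; the actual multiplex identifier sequences are not given in the abstract.

```python
from itertools import product

# Hypothetical labels standing in for the tagged primers.
forward_tags = [f"F{i}" for i in range(1, 9)]
reverse_tags = [f"R{i}" for i in range(1, 9)]


def assign_tag_pairs(samples, forward, reverse):
    """Give every sample a unique (forward tag, reverse tag) combination."""
    pairs = list(product(forward, reverse))
    if len(samples) > len(pairs):
        raise ValueError(f"only {len(pairs)} tag pairs available")
    return dict(zip(samples, pairs))


samples = [f"sample_{n:02d}" for n in range(1, 65)]
assignment = assign_tag_pairs(samples, forward_tags, reverse_tags)
print(len(assignment), "samples tagged with", len(forward_tags), "x", len(reverse_tags), "primers")
```

    Pairing tags in this way needs only 16 tagged primers for 64 samples, rather than one dedicated tag per sample.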

  6. Evaluation of codon biology in citrus and Poncirus trifoliata based on genomic features and frame corrected expressed sequence tags.

    PubMed

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-04-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary process in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionary conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid. PMID:23315666
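
    Relative synonymous codon usage (RSCU), named in this abstract, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes that definition; the two codon families shown and the toy coding sequence are illustrative only and are not data from the study.

```python
from collections import Counter

# Two synonymous codon families from the standard genetic code,
# kept short for illustration.
SYNONYMOUS = {
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def rscu(cds, families=SYNONYMOUS):
    """Return RSCU values for every codon in the given families."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Hypothetical in-frame toy sequence: GCT GCT GCA GGT GGT GGT GGC
toy_cds = "GCTGCTGCAGGTGGTGGTGGC"
result = rscu(toy_cds)
for codon, value in sorted(result.items()):
    print(codon, round(value, 2))
```

    Values above 1 mark codons used more often than expected under equal usage, which is how preferred codons such as GCT and GGT would stand out.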

  7. Evaluation of Codon Biology in Citrus and Poncirus trifoliata Based on Genomic Features and Frame Corrected Expressed Sequence Tags

    PubMed Central

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V.; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-01-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary process in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionary conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid. PMID:23315666

  8. Sample Preparation for Fungal Community Analysis by High-Throughput Sequencing of Barcode Amplicons.

    PubMed

    Clemmensen, Karina Engelbrecht; Ihrmark, Katarina; Durling, Mikael Brandström; Lindahl, Björn D

    2016-01-01

    Fungal species participate in vast numbers of processes in the landscape around us. However, their often cryptic growth, inside various substrates and in highly diverse species assemblages, has been a major obstacle to thorough analysis of fungal communities, hampering exhaustive description of the fungal kingdom. Recent technological developments allowing rapid, high-throughput sequencing of mixed communities from many samples at once are currently having a tremendous impact in fungal community ecology. Universal DNA extraction followed by amplification and sequencing of fungal species-level barcodes such as the nuclear internal transcribed spacer (ITS) region now enable identification and relative quantification of fungal community members across well-replicated experimental settings. Here, we present the sample preparation procedure presently used in our laboratory for fungal community analysis by high-throughput sequencing of amplified ITS2 markers. We focus on the procedure optimized for studies of total fungal communities in humus-rich soils, wood, and litter. However, this procedure can be applied to other sample types and markers. We focus on the laboratory-based part of sample preparation, that is, the procedure from the point where samples enter the laboratory until amplicons are submitted for sequencing. Our procedure comprises four main parts: (1) universal DNA extraction, (2) optimization of PCR conditions, (3) production of tagged ITS amplicons, and (4) preparation of the multiplexed amplicon mix to be sequenced. The presented procedure is independent of the specific high-throughput sequencing technology used, which makes it highly versatile. PMID:26791497

  9. DeepSNVMiner: a sequence analysis tool to detect emergent, rare mutations in subsets of cell populations

    PubMed Central

    Andrews, T. Daniel; Jeelall, Yogesh; Talaulikar, Dipti; Goodnow, Christopher C.

    2016-01-01

    Background. Massively parallel sequencing technology is being used to sequence highly diverse populations of DNA such as that derived from heterogeneous cell mixtures containing both wild-type and disease-related states. At the core of such molecule tagging techniques is the tagging and identification of sequence reads derived from individual input DNA molecules, which must be first computationally disambiguated to generate read groups sharing common sequence tags, with each read group representing a single input DNA molecule. This disambiguation typically generates huge numbers of read groups, each of which requires additional variant detection analysis steps to be run specific to each read group, thus representing a significant computational challenge. While sequencing technologies for producing these data are approaching maturity, the lack of available computational tools for analysing such heterogeneous sequence data represents an obstacle to the widespread adoption of this technology. Results. Using synthetic data we successfully detect unique variants at dilution levels of 1 in 1,000,000 molecules, and find DeepSNVMiner obtains significantly lower false positive and false negative rates compared to popular variant callers GATK, SAMTools, FreeBayes and LoFreq, particularly as the variant concentration levels decrease. In a dilution series with genomic DNA from two cell lines, we find DeepSNVMiner identifies a known somatic variant when present at concentrations of only 1 in 1,000 molecules in the input material, the lowest concentration amongst all variant callers tested. Conclusions. Here we present DeepSNVMiner, a tool to disambiguate tagged sequence groups and robustly identify sequence variants specific to subsets of starting DNA molecules that may indicate the presence of a disease. DeepSNVMiner is an automated workflow of custom sequence analysis utilities and open source tools able to differentiate somatic DNA variants from artefactual sequence

  10. DeepSNVMiner: a sequence analysis tool to detect emergent, rare mutations in subsets of cell populations.

    PubMed

    Andrews, T Daniel; Jeelall, Yogesh; Talaulikar, Dipti; Goodnow, Christopher C; Field, Matthew A

    2016-01-01

    Background. Massively parallel sequencing technology is being used to sequence highly diverse populations of DNA such as that derived from heterogeneous cell mixtures containing both wild-type and disease-related states. At the core of such molecule tagging techniques is the tagging and identification of sequence reads derived from individual input DNA molecules, which must be first computationally disambiguated to generate read groups sharing common sequence tags, with each read group representing a single input DNA molecule. This disambiguation typically generates huge numbers of read groups, each of which requires additional variant detection analysis steps to be run specific to each read group, thus representing a significant computational challenge. While sequencing technologies for producing these data are approaching maturity, the lack of available computational tools for analysing such heterogeneous sequence data represents an obstacle to the widespread adoption of this technology. Results. Using synthetic data we successfully detect unique variants at dilution levels of 1 in 1,000,000 molecules, and find DeepSNVMiner obtains significantly lower false positive and false negative rates compared to popular variant callers GATK, SAMTools, FreeBayes and LoFreq, particularly as the variant concentration levels decrease. In a dilution series with genomic DNA from two cell lines, we find DeepSNVMiner identifies a known somatic variant when present at concentrations of only 1 in 1,000 molecules in the input material, the lowest concentration amongst all variant callers tested. Conclusions. Here we present DeepSNVMiner, a tool to disambiguate tagged sequence groups and robustly identify sequence variants specific to subsets of starting DNA molecules that may indicate the presence of a disease. DeepSNVMiner is an automated workflow of custom sequence analysis utilities and open source tools able to differentiate somatic DNA variants from artefactual sequence
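
    The disambiguation step described in this abstract, in which reads sharing a sequence tag are collected into one group per input molecule, can be sketched in a few lines. This is not DeepSNVMiner's code. It assumes, for illustration only, that the tag is a fixed-length prefix of each read and that reads in a group have equal length; the toy reads are invented.

```python
from collections import Counter, defaultdict


def group_reads_by_tag(reads, tag_length):
    """Collect reads into groups keyed by their leading molecular tag."""
    groups = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        groups[tag].append(insert)
    return dict(groups)


def consensus(sequences):
    """Majority base at each position; assumes equal-length sequences."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


# Hypothetical reads: a 4-base tag followed by a 4-base insert.
reads = ["AAAAACGT", "AAAAACGT", "AAAAACTT", "CCCCACGA", "CCCCACGA"]
groups = group_reads_by_tag(reads, tag_length=4)
calls = {tag: consensus(members) for tag, members in groups.items()}
print(calls)
```

    A base change seen in only one read of a group (the T in the third read) is outvoted and treated as a likely artefact, whereas a change shared by all reads of a group (the final A in the second group) survives as a candidate variant of that input molecule.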

  11. Integrative visual analysis of protein sequence mutations

    PubMed Central

    2014-01-01

    Background An important aspect of studying the relationship between protein sequence, structure and function is the molecular characterization of the effect of protein mutations. To understand the functional impact of amino acid changes, the multiple biological properties of protein residues have to be considered together. Results Here, we present a novel visual approach for analyzing residue mutations. It combines different biological visualizations and integrates them with molecular data derived from external resources. To show various aspects of the biological information on different scales, our approach includes one-dimensional sequence views, three-dimensional protein structure views and two-dimensional views of residue interaction networks as well as aggregated views. The views are linked tightly and synchronized to reduce the cognitive load of the user when switching between them. In particular, the protein mutations are mapped onto the views together with further functional and structural information. We also assess the impact of individual amino acid changes by the detailed analysis and visualization of the involved residue interactions. We demonstrate the effectiveness of our approach and the developed software on the data provided for the BioVis 2013 data contest. Conclusions Our visual approach and software greatly facilitate the integrative and interactive analysis of protein mutations based on complementary visualizations. The different data views offered to the user are enriched with information about molecular properties of amino acid residues and further biological knowledge. PMID:25237389

  12. Whole genome sequence analysis of Mycobacterium suricattae.

    PubMed

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; van der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah Musa; Siame, Kabengele Keith; Gey van Pittius, Nicolaas Claudius; van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-12-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi. PMID:26542221

  13. Expressed sequence tag survey of gene expression in the scab mite Psoroptes ovis--allergens, proteases and free-radical scavengers.

    PubMed

    Kenyon, F; Welsh, M; Parkinson, J; Whitton, C; Blaxter, M L; Knox, D P

    2003-05-01

    Psoroptes ovis, the causative agent of sheep scab, is an important ectoparasitic mite infecting sheep, goats and cattle. Infection is characterized by an extensive dermatitis, scab formation and intense itching. Initial focal lesions spread outwards, coalesce and may extend over the whole body. The host response to infestation has all the characteristics of an immediate-type hypersensitivity reaction but the mite antigens and allergens which initiate this response are almost completely undefined. Here, 507 randomly selected cDNAs derived from a mixed population of P. ovis were sequenced and the resultant nucleotide sequences subjected to Cluster analysis and Blast searches. This analysis yielded 280 clusters of which 49 had > 1 sequence with 24 showing significant Blast X homology to another protein in the databases. There were 231 sequences which appeared on one occasion and 109 of these showed significant Blast X homology to other sequences in the databases. This analysis identified homologues of 9 different types of allergens which have been characterized in other allergic conditions such as responses to house dust mites. It also identified a number of cysteine proteases which may contribute to lesion development as well as several free-radical scavenging enzymes which may protect the mite from host immune effector responses. PMID:12793649

  14. Analysis of black fungal biofilms occurring at domestic water taps. I: compositional analysis using Tag-Encoded FLX Amplicon Pyrosequencing.

    PubMed

    Heinrichs, Guido; Hübner, Iris; Schmidt, Carsten K; de Hoog, G Sybren; Haase, Gerhard

    2013-06-01

    Mass growth of dark fungal biofilms on water taps and associated habitats was observed in various German drinking water distribution systems recently. Customers of affected drinking water systems are anxious about potential and unknown health risks. These environments are known to harbour a fungal flora also comprising a variety of fungal opportunists that are well known to cause superficial mycoses in humans (Exophiala equina, Exophiala lecanii-corni) but are not known to establish dark biofilms so far. To gain profound insight on composition of respective biofilms, a metagenomic approach using Tag-Encoded FLX Amplicon Pyrosequencing (TEFAP) of the ribosomal internal transcribed spacer 2 region in comparison with a classical cultivation approach using Sabouraud agar with chloramphenicol and erythritol-chloramphenicol-agar was performed. E. lecanii-corni was found to be the major component in 10 of 13 biofilms analysed independently of the method used. Alternaria sp., E. equina, Fusarium spp. and Ochroconis spp. were also relatively abundant. As expected, TEFAP usually revealed a higher diversity than the cultivation approaches. For example, opportunistic species like Candida albicans or Exophiala dermatitidis were detected in very low amounts. In conclusion, TEFAP turned out to be a promising and powerful tool for the semi-quantitative analysis of fungal biofilms. Referring to relevant literature, potential biological hazards caused by fungi of the dark biofilms can be regarded as low. PMID:23385952

  15. Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...

  16. Discovery and mapping of a new expressed sequence tag-single nucleotide polymorphism and simple sequence repeat panel for large-scale genetic studies and breeding of Theobroma cacao L.

    PubMed Central

    Allegre, Mathilde; Argout, Xavier; Boccara, Michel; Fouet, Olivier; Roguet, Yolande; Bérard, Aurélie; Thévenin, Jean Marc; Chauveau, Aurélie; Rivallan, Ronan; Clement, Didier; Courtois, Brigitte; Gramacho, Karina; Boland-Augé, Anne; Tahi, Mathias; Umaharan, Pathmanathan; Brunel, Dominique; Lanaud, Claire

    2012-01-01

    Theobroma cacao is an economically important tree of several tropical countries. Its genetic improvement is essential to provide protection against major diseases and improve chocolate quality. We discovered and mapped new expressed sequence tag-single nucleotide polymorphism (EST-SNP) and simple sequence repeat (SSR) markers and constructed a high-density genetic map. By screening 149 650 ESTs, 5246 SNPs were detected in silico, of which 1536 corresponded to genes with a putative function, while 851 had a clear polymorphic pattern across a collection of genetic resources. In addition, 409 new SSR markers were detected on the Criollo genome. Lastly, 681 new EST-SNPs and 163 new SSRs were added to the pre-existing 418 co-dominant markers to construct a large consensus genetic map. This high-density map and the set of new genetic markers identified in this study are a milestone in cocoa genomics and for marker-assisted breeding. The data are available at http://tropgenedb.cirad.fr. PMID:22210604

  17. OSIRIS-REx Touch and Go (TAG) Navigation Performance

    NASA Technical Reports Server (NTRS)

    Berry, Kevin; Antreasian, Peter; Moreau, Michael C.; May, Alex; Sutter, Brian

    2015-01-01

    The Origins Spectral Interpretation Resource Identification Security Regolith Explorer (OSIRIS-REx) mission is a NASA New Frontiers mission launching in 2016 to rendezvous with the near-Earth asteroid (101955) Bennu in late 2018. Following an extensive campaign of proximity operations activities to characterize the properties of Bennu and select a suitable sample site, OSIRIS-REx will fly a Touch-And-Go (TAG) trajectory to the asteroid's surface to obtain a regolith sample. The paper summarizes the mission design of the TAG sequence, the propulsive maneuvers required to achieve the trajectory, and the sequence of events leading up to the TAG event. The paper also summarizes the Monte-Carlo simulation of the TAG sequence and presents analysis results that demonstrate the ability to conduct the TAG within 25 meters of the selected sample site and 2 cm/s of the targeted contact velocity. The paper describes some of the challenges associated with conducting precision navigation operations and ultimately contacting a very small asteroid.

  18. OSIRIS-REx Touch-And-Go (TAG) Navigation Performance

    NASA Technical Reports Server (NTRS)

    Berry, Kevin; Antreasian, Peter; Moreau, Michael C.; May, Alex; Sutter, Brian

    2015-01-01

    The Origins Spectral Interpretation Resource Identification Security Regolith Explorer (OSIRIS-REx) mission is a NASA New Frontiers mission launching in 2016 to rendezvous with the near-Earth asteroid (101955) Bennu in late 2018. Following an extensive campaign of proximity operations activities to characterize the properties of Bennu and select a suitable sample site, OSIRIS-REx will fly a Touch-And-Go (TAG) trajectory to the asteroid's surface to obtain a regolith sample. The paper summarizes the mission design of the TAG sequence, the propulsive maneuvers required to achieve the trajectory, and the sequence of events leading up to the TAG event. The paper will summarize the Monte-Carlo simulation of the TAG sequence and present analysis results that demonstrate the ability to conduct the TAG within 25 meters of the selected sample site and ±2 cm/s of the targeted contact velocity. The paper will describe some of the challenges associated with conducting precision navigation operations and ultimately contacting a very small asteroid.

  19. Surface Plasmon Resonance Analysis of Histidine-Tagged F1-ATPase Surface Adsorption

    NASA Astrophysics Data System (ADS)

    Tucker, Jenifer K.; Richter, Mark L.; Berrie, Cindy L.

    2015-11-01

    Studies of the rotational activity of the enzymatic core (α3β3γ) of the F1-ATPase motor protein have relied on binding the enzyme to NTA-coated glass surfaces via polyhistidine tags engineered into the C-termini of each of the three α or β subunits. Those studies revealed the rotational motion of the central γ subunit by monitoring the motion of attached micron-long actin filaments or spherical nanoparticles. However, only a small percentage of the attached filaments or particles were observed to rotate, likely due, at least in part, to non-uniform surface attachment of the motor proteins. In this study, we have applied surface plasmon resonance to monitor the kinetics and affinity of binding of the His-tagged motor protein to NTA-coated gold sensor surfaces. The binding data, when fit to a heterogeneous binding model, exhibit two sets of adsorption-desorption rate constants with two dissociation constants of 4.0 × 10⁻⁹ M and 8.6 × 10⁻¹¹ M for 6His-α3β3γ binding to the nickel ion-activated NTA surface. The data are consistent with mixed attachment of the protein via two (bimodal) and three (trimodal) NTA/Ni²⁺-His-tag interactions, respectively, with the less stable bimodal interaction dominating. The results provide a partial explanation for the low number of surface-attached F1 motors previously observed in rotation studies and suggest alternative approaches to uniform F1 motor surface attachment for future fabrication of motor-based nanobiodevices and materials.
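
    The two dissociation constants reported above can be related to the rate constants and to an equilibrium response with a short sketch. It assumes the common two-site form of a heterogeneous surface model; the abstract does not state the fitted equations, and the function names and the maximum-response values are hypothetical, chosen only for illustration.

```python
def dissociation_constant(k_off, k_on):
    """K_D (M) from a desorption rate k_off (1/s) and an adsorption rate k_on (1/(M*s))."""
    return k_off / k_on


def two_site_equilibrium_response(conc, kd1, rmax1, kd2, rmax2):
    """Equilibrium response as the sum of two independent Langmuir sites.

    conc, kd1 and kd2 are in mol/L; rmax1 and rmax2 are the saturation
    responses of each site class (hypothetical values, arbitrary units).
    """
    site1 = rmax1 * conc / (kd1 + conc)
    site2 = rmax2 * conc / (kd2 + conc)
    return site1 + site2


if __name__ == "__main__":
    KD_WEAK = 4.0e-9     # M, from the abstract (assigned there to the bimodal attachment)
    KD_STRONG = 8.6e-11  # M, from the abstract (assigned there to the trimodal attachment)
    # Hypothetical saturation responses: the weaker site is given the larger share.
    for conc in (1e-10, 1e-9, 1e-8):
        r = two_site_equilibrium_response(conc, KD_WEAK, 70.0, KD_STRONG, 30.0)
        print(f"{conc:.0e} M -> {r:.1f} response units")
```

    At a protein concentration equal to one site's K_D, that site contributes half of its saturation response, which is a quick sanity check on any fitted pair of constants.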

  20. Meshless deformable models for 3D cardiac motion and strain analysis from tagged MRI.

    PubMed

    Wang, Xiaoxu; Chen, Ting; Zhang, Shaoting; Schaerer, Joël; Qian, Zhen; Huh, Suejung; Metaxas, Dimitris; Axel, Leon

    2015-01-01

    Tagged magnetic resonance imaging (TMRI) provides a direct and noninvasive way to visualize the in-wall deformation of the myocardium. Due to the through-plane motion, the tracking of 3D trajectories of the material points and the computation of 3D strain field call for the necessity of building 3D cardiac deformable models. The intersections of three stacks of orthogonal tagging planes are material points in the myocardium. With these intersections as control points, 3D motion can be reconstructed with a novel meshless deformable model (MDM). Volumetric MDMs describe an object as point cloud inside the object boundary and the coordinate of each point can be written in parametric functions. A generic heart mesh is registered on the TMRI with polar decomposition. A 3D MDM is generated and deformed with MR image tagging lines. Volumetric MDMs are deformed by calculating the dynamics function and minimizing the local Laplacian coordinates. The similarity transformation of each point is computed by assuming its neighboring points are making the same transformation. The deformation is computed iteratively until the control points match the target positions in the consecutive image frame. The 3D strain field is computed from the 3D displacement field with moving least squares. We demonstrate that MDMs outperformed the finite element method and the spline method with a numerical phantom. Meshless deformable models can track the trajectory of any material point in the myocardium and compute the 3D strain field of any particular area. The experimental results on in vivo healthy and patient heart MRI show that the MDM can fully recover the myocardium motion in three dimensions. PMID:25157446

  1. Meshless deformable models for 3D cardiac motion and strain analysis from tagged MRI

    PubMed Central

    Wang, Xiaoxu; Chen, Ting; Zhang, Shaoting; Schaerer, Joël; Qian, Zhen; Huh, Suejung; Metaxas, Dimitris; Axel, Leon

    2016-01-01

    Tagged magnetic resonance imaging (TMRI) provides a direct and noninvasive way to visualize the in-wall deformation of the myocardium. Due to the through-plane motion, the tracking of 3D trajectories of the material points and the computation of 3D strain field call for the necessity of building 3D cardiac deformable models. The intersections of three stacks of orthogonal tagging planes are material points in the myocardium. With these intersections as control points, 3D motion can be reconstructed with a novel meshless deformable model (MDM). Volumetric MDMs describe an object as point cloud inside the object boundary and the coordinate of each point can be written in parametric functions. A generic heart mesh is registered on the TMRI with polar decomposition. A 3D MDM is generated and deformed with MR image tagging lines. Volumetric MDMs are deformed by calculating the dynamics function and minimizing the local Laplacian coordinates. The similarity transformation of each point is computed by assuming its neighboring points are making the same transformation. The deformation is computed iteratively until the control points match the target positions in the consecutive image frame. The 3D strain field is computed from the 3D displacement field with moving least squares. We demonstrate that MDMs outperformed the finite element method and the spline method with a numerical phantom. Meshless deformable models can track the trajectory of any material point in the myocardium and compute the 3D strain field of any particular area. The experimental results on in vivo healthy and patient heart MRI show that the MDM can fully recover the myocardium motion in three dimensions. PMID:25157446

  2. Whole-genome sequencing in outbreak analysis.

    PubMed

    Gilchrist, Carol A; Turner, Stephen D; Riley, Margaret F; Petri, William A; Hewlett, Erik L

    2015-07-01

    In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  3. Whole-Genome Sequencing in Outbreak Analysis

    PubMed Central

    Turner, Stephen D.; Riley, Margaret F.; Petri, William A.; Hewlett, Erik L.

    2015-01-01

    SUMMARY In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  4. Substantial prevalence of microdeletions of the Y-chromosome in infertile men with idiopathic azoospermia and oligozoospermia detected using a sequence-tagged site-based mapping strategy

    SciTech Connect

    Najmabadi, H.; Huang, V.; Bhasin, D.

    1996-04-01

    Genes on the long arm of Y (Yq), particularly within interval 6, are believed to play a critical role in human spermatogenesis. Cytogenetically detectable deletions of this region are associated with azoospermia in men, but are relatively uncommon. The objective of this study was to validate a sequence-tagged site (STS)-mapping strategy for the detection of Yq microdeletions and to use this method to determine the proportion of men with idiopathic azoospermia or severe oligozoospermia who carry microdeletions in Yq. STS mapping of a sufficiently large sample of infertile men should also help further localize the putative gene(s) involved in the pathogenesis of male infertility. Genomic DNA was extracted from peripheral leukocytes of 16 normal fertile men, 7 normal fertile women, 60 infertile men, and 15 patients with the X-linked disorder, ichthyosis. PCR primers were synthesized for 26 STSs that span Yq interval 6. None of the 16 normal men of known fertility had microdeletions. Seven normal fertile women failed to amplify any of the 26 STSs, providing evidence of their Y specificity. No microdeletions were detected in any of the 15 patients with ichthyosis. Of the 60 infertile men typed with 26 STSs, 11 (18%; 10 azoospermic and 1 oligozoospermic) failed to amplify 1 or more STS. Interestingly, 4 of the 11 patients had microdeletions in a region that is outside the Yq region from which the DAZ (deleted in azoospermia gene region) gene was cloned. In an additional 3 patients, microdeletions were present both inside and outside the DAZ region. The physical locations of these microdeletions provide further support for the concept that a gene(s) on Yq deletion interval 6 plays an important role in spermatogenesis. The presence of deletions that do not overlap with the DAZ region suggests that genes other than the DAZ gene may also be implicated in the pathogenesis of some subsets of male infertility. 48 refs., 2 figs., 2 tabs.

  5. Analyses of expressed sequence tags in Neurospora reveal rapid evolution of genes associated with the early stages of sexual reproduction in fungi

    PubMed Central

    2012-01-01

    Background The broadly accepted pattern of rapid evolution of reproductive genes is primarily based on studies of animal systems, although several examples of rapidly evolving genes involved in reproduction are found in diverse additional taxa. In fungi, genes involved in mate recognition have been found to evolve rapidly. However, the examples are too few to draw conclusions on a genome scale. Results In this study, we performed microarray hybridizations between RNA from sexual and vegetative tissues of two strains of the heterothallic (self-sterile) filamentous ascomycete Neurospora intermedia, to identify a set of sex-associated genes in this species. We aligned Expressed Sequence Tags (ESTs) from sexual and vegetative tissue of N. intermedia to orthologs from three closely related species: N. crassa, N. discreta and N. tetrasperma. The resulting four-species alignments provided a dataset for molecular evolutionary analyses. Our results confirm a general pattern of rapid evolution of fungal sex-associated genes, compared to control genes with constitutive expression or a high relative expression during vegetative growth. Among the rapidly evolving sex-associated genes, we identified candidates that could be of importance for mating or fruiting-body development. Analyses of five of these candidate genes from additional species of heterothallic Neurospora revealed that three of them evolve under positive selection. Conclusions Taken together, our study represents a novel finding of a genome-wide pattern of rapid evolution of sex-associated genes in the fungal kingdom, and provides a list of candidate genes important for reproductive isolation in Neurospora. PMID:23186325

  6. Expression of Epitope-Tagged Proteins in Mammalian Cells in Culture.

    PubMed

    Bhatt, Jay M; Styers, Melanie L; Sztul, Elizabeth

    2016-01-01

    Before the advent of molecular methods to tag proteins, visualization of proteins within cells required the use of antibodies directed against the protein of interest. Thus, only proteins for which antibodies were available could be visualized. Epitope tagging allows the detection of all proteins with existing sequence information, irrespective of the availability of antibodies directed against them. This technique involves the generation of DNA constructs that express the protein of interest tagged with an epitope that can be recognized by a commercially available antibody. Proteins can be tagged with a wide variety of epitopes using commercially available vectors that allow expression in mammalian cells. Epitope-tagged proteins are easily transfected into mammalian cell lines and, in most cases, tightly mimic the behavior of the endogenous protein. Tagged proteins exogenously expressed in cells provide different types of information depending on the subsequent detection approaches. Using immunofluorescence and immunoelectron microscopy with anti-tag antibodies, relative to known markers of cellular organelles, can provide information on the subcellular localization of the tagged protein and may provide clues regarding the protein's function. Immunofluorescence with anti-tag antibodies can also be utilized to assess the tagged protein's responses to cellular signals and pharmacological treatments. Immunoprecipitations with anti-tag antibodies can recover protein complexes containing the protein of interest, resulting in the identification of interacting proteins. Recovery of tagged proteins on affinity matrices allows their purification for use in biochemical assays. In addition, specialized fluorescent tags, such as the green fluorescent protein (GFP) allow the analysis of cellular dynamics in live cells in real time. PMID:27515071

  7. Time fluctuation analysis of forest fire sequences

    NASA Astrophysics Data System (ADS)

    Vega Orozco, Carmen D.; Kanevski, Mikhaïl; Tonini, Marj; Golay, Jean; Pereira, Mário J. G.

    2013-04-01

    Forest fires are complex events involving both space and time fluctuations. Understanding of their dynamics and pattern distribution is of great importance in order to improve the resource allocation and support fire management actions at local and global levels. This study aims at characterizing the temporal fluctuations of forest fire sequences observed in Portugal, which is the country that holds the largest wildfire land dataset in Europe. This research applies several exploratory data analysis measures to 302,000 forest fires that occurred from 1980 to 2007. The applied clustering measures are: Morisita clustering index, fractal and multifractal dimensions (box-counting), Ripley's K-function, Allan Factor, and variography. These algorithms enable a global time structural analysis describing the degree of clustering of a point pattern and defining whether the observed events occur randomly, in clusters or in a regular pattern. The considered methods are of general importance and can be used for other spatio-temporal events (e.g. crime, epidemiology, biodiversity, geomarketing). An important contribution of this research deals with the analysis and estimation of local measures of clustering that help in understanding their temporal structure. Each measure is described and executed for the raw data (forest fires geo-database) and results are compared to reference patterns generated under the null hypothesis of randomness (Poisson processes) embedded in the same time period of the raw data. This comparison enables estimating the degree of the deviation of the real data from a Poisson process. Generalizations to functional measures of these clustering methods, taking into account the phenomena, were also applied and adapted to detect time dependences in a measured variable (i.e. burned area). The time clustering of the raw data is compared several times with the Poisson processes at different thresholds of the measured function. Then, the clustering measure value
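
    The comparison of an event sequence with a Poisson reference can be illustrated for one of the listed measures, the Allan Factor. The sketch below uses the usual definition, the mean squared difference between counts in adjacent windows divided by twice the mean count, and is not the authors' implementation. The function names, the window length and the simulated data are assumptions made for illustration.

```python
import random


def allan_factor(event_times, window, t_start, t_end):
    """Allan Factor of a point process for counting windows of length `window`.

    Values near 1 are expected for a Poisson process, values above 1 suggest
    clustering at that time scale, and values below 1 suggest regularity.
    """
    n_windows = int((t_end - t_start) // window)
    if n_windows < 2:
        raise ValueError("need at least two counting windows")
    counts = [0] * n_windows
    for t in event_times:
        k = int((t - t_start) // window)
        if 0 <= k < n_windows:
            counts[k] += 1
    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the observation period")
    sq_diffs = [(counts[k + 1] - counts[k]) ** 2 for k in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2 * mean_count)


def poisson_times(rate, t_end, rng):
    """Homogeneous Poisson reference pattern on [0, t_end), built from exponential gaps."""
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t >= t_end:
            return times
        times.append(t)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    reference = poisson_times(rate=5.0, t_end=2000.0, rng=rng)
    print("Poisson reference:", round(allan_factor(reference, 1.0, 0.0, 2000.0), 2))
```

    Repeating the calculation over a range of window lengths gives the Allan Factor as a function of time scale, which is how departures from the Poisson reference are usually read.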

  8. Generation and analysis of expressed sequence tags from the ciliate protozoan parasite Ichthyophthirius multifiliis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The ciliate protozoan Ichthyophthirius multifiliis (Ich) is an important parasite of freshwater fish that causes 'white spot disease' leading to significant losses. A genomic resource for large-scale studies of this parasite has been lacking. To study gene expression involved in Ich pathogenesis and...

  9. Analysis of expressed sequence tags derived from a compatible Mycosphaerella fijiensis-banana interaction.

    PubMed

    Portal, Orelvis; Izquierdo, Yovanny; De Vleesschauwer, David; Sánchez-Rodríguez, Aminael; Mendoza-Rodríguez, Milady; Acosta-Suárez, Mayra; Ocaña, Bárbara; Jiménez, Elio; Höfte, Monica

    2011-05-01

    Mycosphaerella fijiensis, a hemibiotrophic fungus, is the causal agent of black leaf streak disease, the most serious foliar disease of bananas and plantains. To analyze the compatible interaction of M. fijiensis with Musa spp., a suppression subtractive hybridization (SSH) cDNA library was constructed to identify transcripts induced at late stages of infection in the host and the pathogen. In addition, a full-length cDNA library was created from the same mRNA starting material as the SSH library. The SSH procedure was effective in identifying specific genes predicted to be involved in plant-fungal interactions and new information was obtained mainly about genes and pathways activated in the plant. Several plant genes predicted to be involved in the synthesis of phenylpropanoids and detoxification compounds were identified, as well as pathogenesis-related proteins that could be involved in the plant response against M. fijiensis infection. At late stages of infection, jasmonic acid and ethylene signaling transduction pathways appear to be active, which corresponds with the necrotrophic life style of M. fijiensis. Quantitative PCR experiments revealed that antifungal genes encoding PR proteins and GDSL-like lipase are only transiently induced 30 days post inoculation (dpi), indicating that the fungus is probably actively repressing plant defense. The only fungal gene found was induced 37 dpi and encodes UDP-glucose pyrophosphorylase, an enzyme involved in the biosynthesis of trehalose. Trehalose biosynthesis was probably induced in response to prior activation of plant antifungal genes and may act as an osmoprotectant against membrane damage. PMID:21279642

  10. Analysis of transcripts expressed by intracellular stages of Eimeria acervulina using expressed sequence tags (ESTs).

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Coccidiosis in chickens is caused by seven species belonging to the genus Eimeria. Even though coccidiosis is a complex disease that can be caused by any combination of these species most of the molecular research concerning chicken Eimeria has been limited to Eimeria tenella. This study describes...

  11. Identification of key genes involved in root development of tomato using expressed sequence tag analysis.

    PubMed

    Kalidhasan, N; Joshi, Deepti; Bhatt, Tarun Kumar; Gupta, Aditya Kumar

    2015-10-01

    The root systems of plants are fascinating structures, critical not only for plant development but also for storage and conduction. Because of their agronomic importance, the identification of genes involved in root development has been a subject of intense study. Tomato is one of the most consumed vegetables in the world and has been used as a model system for dicot plants because of its small genome, well-established transformation techniques and well-constructed physical map. The present study aims to identify root-specific genes expressed temporally as well as gene(s) involved in lateral root and profuse root development. A total of 890 ESTs were identified from five EST libraries constructed using the SSH approach, covering temporal gene regulation (early and late) and genes involved in morphogenetic traits (lateral and profuse rooting). One hundred sixty-one unique ESTs identified from the various libraries were categorized by putative function and deposited in the NCBI dbEST database. In addition, 36 ESTs were selected for validation of their expression by RT-PCR. The present findings will help shed light on the unexplored developmental process of root growth in tomato and in plants in general. PMID:26600676

  12. Analysis and functional annotation of expressed sequence tags from the Asian longhorned beetle, Anoplophora glabripennis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Asian longhorned beetle (ALB), ‘Anoplophora glabripennis’, is one of the most economically and ecologically important non-native, invasive forest pests recently discovered in North America. Despite the substantial impact of this pest, limited effort has been expended in regards to defining the ge...

  13. Spawning behavior in Atlantic cod: analysis by use of data storage tags

    USGS Publications Warehouse

    Grabowski, Timothy B.; Thorsteinsson, Vilhjalmur; Marteinsdóttir, Gudrún

    2014-01-01

    Electronic data storage tags (DSTs) were implanted into Atlantic cod captured in Icelandic waters from 2002 to 2007 and the depth profiles recovered from these tags (females: n = 31, males: n = 27) were used to identify patterns consistent with published descriptions of cod courtship and spawning behavior. The individual periods of time that males spent exhibiting behavior consistent with being present in a spawning aggregation (i.e. periods consisting of a clear tidal signature in the DST depth profile associated with an individual remaining on or near the substrate) were longer than those of females. Over the course of a spawning season, male cod spent approximately twice as much time in spawning aggregations as females, but female cod visited more aggregations per unit time. On average, males participated in approximately 57% more putative spawning events, i.e. vertical ascents potentially corresponding to gamete release, than did females. However, males <85 cm total length participated in the same number of putative spawning events as females of comparable size. In both sexes, larger individuals and/or individuals that spent a longer period of time within an aggregation participated in a larger number of putative spawning events. Although further validation and refinement is necessary, particularly in the identification of spawning events, the ability offered by DSTs to quantify cod spawning behavior may aid in the development of management and conservation plans.

  14. Analysis of new microsatellite markers developed from reported sequences of Japanese flounder Paralichthys olivaceus

    NASA Astrophysics Data System (ADS)

    Yu, Haiyang; Jiang, Liming; Chen, Wei; Wang, Xubo; Wang, Zhigang; Zhang, Quanqi

    2010-12-01

    The expressed sequence tags (ESTs) of Japanese flounder, Paralichthys olivaceus, were selected from GenBank to identify simple sequence repeats (SSRs) or microsatellites. A bioinformatic analysis of 11111 ESTs identified 751 SSR-containing ESTs, including 440 dinucleotide, 254 trinucleotide, 53 tetranucleotide, 95 pentanucleotide and 40 hexanucleotide microsatellites. The CA/TG and GA/TC repeats were the most abundant microsatellites. AT-rich types were predominant among trinucleotide and tetranucleotide microsatellites. PCR primers were designed to amplify 10 identified microsatellite loci. The PCR results from eight pairs of primers showed polymorphisms in wild populations. In 30 wild individuals, the mean observed and expected heterozygosities of these 8 polymorphic SSRs were 0.71 and 0.83, respectively, and the average PIC value was 0.8. These microsatellite markers should prove to be a useful addition to the microsatellite markers that are now available for this species.
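
    A bioinformatic SSR search of the kind described above can be sketched with a regular expression that looks for perfect tandem repeats of 2 to 6 bases. This is a minimal illustration rather than the tool used in the study: the abstract does not give its search criteria, so the minimum repeat counts and the function name below are assumptions.

```python
import re

# Assumed minimum number of tandem copies per motif length (not from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in a DNA sequence."""
    seq = seq.upper()
    hits = []
    for motif_len, n in sorted(min_repeats.items()):
        # One captured motif followed by at least n - 1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated, e.g. "AAA" or "CACA".
            if any(
                motif == motif[:d] * (motif_len // d)
                for d in range(1, motif_len)
                if motif_len % d == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


if __name__ == "__main__":
    est = "GGTT" + "CA" * 8 + "GATCGA" + "AAT" * 5 + "CC"
    for start, motif, copies in find_ssrs(est):
        print(f"position {start}: ({motif}){copies}")
```

    Counting the hits by motif length over a set of ESTs would give a per-class tally comparable in form to the one reported, although the totals depend entirely on the thresholds chosen.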

  15. EST sequencing of Onychophora and phylogenomic analysis of Metazoa.

    PubMed

    Roeding, Falko; Hagner-Holler, Silke; Ruhberg, Hilke; Ebersberger, Ingo; von Haeseler, Arndt; Kube, Michael; Reinhardt, Richard; Burmester, Thorsten

    2007-12-01

    Onychophora (velvet worms) represent a small animal taxon considered to be related to Euarthropoda. We have obtained 1873 5' cDNA sequences (expressed sequence tags, ESTs) from the velvet worm Epiperipatus sp., which were assembled into 833 contigs. BLAST similarity searches revealed that 51.9% of the contigs had matches in the protein databases with expectation values lower than 10^-4. Most ESTs had the best hit with proteins from either Chordata or Arthropoda (approximately 40% each). The ESTs included sequences of 27 ribosomal proteins. The orthologous sequences from 28 other species of a broad range of phyla were obtained from the databases, including other EST projects. A concatenated amino acid alignment comprising 5021 positions was constructed, which covers 4259 positions when problematic regions were removed. Bayesian and maximum likelihood methods place Epiperipatus within the monophyletic Ecdysozoa (Onychophora, Arthropoda, Tardigrada and Nematoda), but its exact relation to the Euarthropoda remained unresolved. The "Articulata" concept was not supported. Tardigrada and Nematoda formed a well-supported monophylum, suggesting that Tardigrada are actually Cycloneuralia. In agreement with previous studies, we have demonstrated that random sequencing of cDNAs results in sequence information suitable for phylogenomic approaches to resolve metazoan relationships. PMID:17933557

  16. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    PubMed Central

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis. PMID:25329378

  17. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    SciTech Connect

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
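
    The high-quality SNP criterion quoted in this record (at least four sequences, minor allele present in at least two) can be illustrated with a short sketch. The function name `is_high_quality_snp` and the representation of a contig position as a string of observed bases are assumptions made for this example, and the depth rule is applied per alignment column here, which may differ from how the authors counted sequences per contig.

```python
from collections import Counter

def is_high_quality_snp(column, min_depth=4, min_minor=2):
    """Check one alignment column (one base per EST) against the filter.

    column    -- string of bases observed at a contig position; gaps or
                 ambiguous characters are ignored (an assumption).
    min_depth -- minimum number of sequences covering the position.
    min_minor -- minimum number of sequences carrying the minor allele.
    """
    bases = [b for b in column.upper() if b in "ACGT"]
    if len(bases) < min_depth:
        return False
    counts = Counter(bases).most_common()
    if len(counts) < 2:
        return False  # only one allele observed, so not a SNP
    # The second most frequent base is taken as the minor allele.
    return counts[1][1] >= min_minor

print(is_high_quality_snp("AAGG"))  # two alleles, each seen twice
print(is_high_quality_snp("AAAG"))  # minor allele seen only once
```

    Requiring the minor allele in two or more reads is a simple guard against counting single-read sequencing errors as polymorphisms.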

  18. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    PubMed Central

    2010-01-01

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies. PMID:20096101

  19. Whole exome sequence analysis of Peters anomaly.

    PubMed

    Weh, Eric; Reis, Linda M; Happ, Hannah C; Levin, Alex V; Wheeler, Patricia G; David, Karen L; Carney, Erin; Angle, Brad; Hauser, Natalie; Semina, Elena V

    2014-12-01

    Peters anomaly is a rare form of anterior segment ocular dysgenesis, which can also be associated with additional systemic defects. At this time, the majority of cases of Peters anomaly lack a genetic diagnosis. We performed whole exome sequencing of 27 patients with syndromic or isolated Peters anomaly to search for pathogenic mutations in currently known ocular genes. Among the eight previously recognized Peters anomaly genes, we identified a de novo missense mutation in PAX6, c.155G>A, p.(Cys52Tyr), in one patient. Analysis of 691 additional genes currently associated with a different ocular phenotype identified a heterozygous splicing mutation c.1025+2T>A in TFAP2A, a de novo heterozygous nonsense mutation c.715C>T, p.(Gln239*) in HCCS, a hemizygous mutation c.385G>A, p.(Glu129Lys) in NDP, a hemizygous mutation c.3446C>T, p.(Pro1149Leu) in FLNA, and compound heterozygous mutations c.1422T>A, p.(Tyr474*) and c.2544G>A, p.(Met848Ile) in SLC4A11; all mutations, except for the FLNA and SLC4A11 c.2544G>A alleles, are novel. This is the first study to use whole exome sequencing to discern the genetic etiology of a large cohort of patients with syndromic or isolated Peters anomaly. We report five new genes associated with this condition and suggest screening of TFAP2A and FLNA in patients with Peters anomaly and relevant syndromic features and HCCS, NDP and SLC4A11 in patients with isolated Peters anomaly. PMID:25182519

  20. Whole exome sequence analysis of Peters anomaly

    PubMed Central

    Weh, Eric; Reis, Linda M.; Happ, Hannah C.; Levin, Alex V.; Wheeler, Patricia G.; David, Karen L.; Carney, Erin; Angle, Brad; Hauser, Natalie

    2015-01-01

    Peters anomaly is a rare form of anterior segment ocular dysgenesis, which can also be associated with additional systemic defects. At this time, the majority of cases of Peters anomaly lack a genetic diagnosis. We performed whole exome sequencing of 27 patients with syndromic or isolated Peters anomaly to search for pathogenic mutations in currently known ocular genes. Among the eight previously recognized Peters anomaly genes, we identified a de novo missense mutation in PAX6, c.155G>A, p.(Cys52Tyr), in one patient. Analysis of 691 additional genes currently associated with a different ocular phenotype identified a heterozygous splicing mutation c.1025+2T>A in TFAP2A, a de novo heterozygous nonsense mutation c.715C>T, p.(Gln239*) in HCCS, a hemizygous mutation c.385G>A, p.(Glu129Lys) in NDP, a hemizygous mutation c.3446C>T, p.(Pro1149Leu) in FLNA, and compound heterozygous mutations c.1422T>A, p.(Tyr474*) and c.2544G>A, p.(Met848Ile) in SLC4A11; all mutations, except for the FLNA and SLC4A11 c.2544G>A alleles, are novel. This is the first study to use whole exome sequencing to discern the genetic etiology of a large cohort of patients with syndromic or isolated Peters anomaly. We report five new genes associated with this condition and suggest screening of TFAP2A and FLNA in patients with Peters anomaly and relevant syndromic features and HCCS, NDP and SLC4A11 in patients with isolated Peters anomaly. PMID:25182519

  1. An Analysis of the Effects of RFID Tags on Narrowband Navigation and Communication Receivers

    NASA Technical Reports Server (NTRS)

    LaBerge, E. F. Charles

    2007-01-01

    The simulated effects of the Radio Frequency Identification (RFID) tag emissions on ILS Localizer and ILS Glide Slope functions match the analytical models developed in support of DO-294B provided that the measured peak power levels are adjusted for 1) peak-to-average power ratio, 2) effective duty cycle, and 3) spectrum analyzer measurement bandwidth. When these adjustments are made, simulated and theoretical results are in extraordinarily good agreement. The relationships hold over a large range of potential interference-to-desired signal power ratios, provided that the adjusted interference power is significantly higher than the sum of the receiver noise floor and the noise-like contributions of all other interference sources. When the duty-factor adjusted power spectral densities are applied in the evaluation process described in Section 6 of DO-294B, most narrowband guidance and communications radios performance parameters are unaffected by moderate levels of RFID interference. Specific conclusions and recommendations are provided.

  2. A Comparison of Hyperelastic Warping of PET Images with Tagged MRI for the Analysis of Cardiac Deformation

    DOE PAGESBeta

    Veress, Alexander I.; Klein, Gregory; Gullberg, Grant T.

    2013-01-01

    The objectives of the following research were to evaluate the utility of a deformable image registration technique known as hyperelastic warping for the measurement of local strains in the left ventricle through the analysis of clinical, gated PET image datasets. Two normal human male subjects were sequentially imaged with PET and tagged MRI imaging. Strain predictions were made for systolic contraction using warping analyses of the PET images and HARP based strain analyses of the MRI images. Coefficient of determination R² values were computed for the comparison of circumferential and radial strain predictions produced by each methodology. There was good correspondence between the methodologies, with R² values of 0.78 for the radial strains of both hearts and R² values of 0.81 and 0.83 for the circumferential strains. The strain predictions were not statistically different (P ≤ 0.01). A series of sensitivity results indicated that the methodology was relatively insensitive to alterations in image intensity, random image noise, and alterations in fiber structure. This study demonstrated that warping was able to provide strain predictions of systolic contraction of the LV consistent with those provided by tagged MRI Warping.
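
    A coefficient of determination of the kind reported in this record could be computed along the following lines. This is a sketch under the assumption that R² is taken as the squared Pearson correlation between the two sets of strain predictions; the abstract does not say which definition was used, and the function name `r_squared` and the sample values are purely illustrative.

```python
import math

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one input is constant")
    return (sxy * sxy) / (sxx * syy)

# Made-up strain values for two methods, not data from the study.
method_a = [1.0, 2.0, 3.0]
method_b = [1.0, 3.0, 2.0]
print(r_squared(method_a, method_b))
```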

  3. Shark Tagging Activities.

    ERIC Educational Resources Information Center

    Current: The Journal of Marine Education, 1998

    1998-01-01

    In this group activity, children learn about the purpose of tagging and how scientists tag a shark. Using a cut-out of a shark, students identify, measure, record data, read coordinates, and tag a shark. Includes introductory information about the purpose of tagging and the procedure, a data sheet showing original tagging data from Tampa Bay, and…

  4. Time-dependent accident sequence analysis

    SciTech Connect

    Chu, T.L.

    1983-01-01

    One problem of the current event tree methodology is that the transitions between accident sequences are not modeled. The causes of transitions are mostly due to operator actions during an accident. A model for such transitions is presented. A generalized algorithm is used for quantification. In the more realistic accident analysis, the progression of the physical processes, which determines the time available for proper operator response, is modeled. Furthermore, the uncertainty associated with the physical modeling is considered. As an example, the approach is applied to analyze TMI-type accidents. Statistical evidence is collected and used in assessing the frequency of stuck-open pressure operated relief valve at B and W plants as well as the frequency of misdiagnosis. Statistical data are also used in modeling the timing of operator actions during the accident. A thermal code (CUT) is developed to determine the time at which the core uncovery occurs. A response surface is used to propagate the uncertainty associated with the thermal code.

  5. An analysis of the feasibility of short read sequencing

    PubMed Central

    Whiteford, Nava; Haslam, Niall; Weber, Gerald; Prügel-Bennett, Adam; Essex, Jonathan W.; Roach, Peter L.; Bradley, Mark; Neylon, Cameron

    2005-01-01

    Several methods for ultra high-throughput DNA sequencing are currently under investigation. Many of these methods yield very short blocks of sequence information (reads). Here we report on an analysis showing the level of genome sequencing possible as a function of read length. It is shown that re-sequencing and de novo sequencing of the majority of a bacterial genome is possible with read lengths of 20–30 nt, and that reads of 50 nt can provide reconstructed contigs (a contiguous fragment of sequence data) of 1000 nt and greater that cover 80% of human chromosome 1. PMID:16275781
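
    One simple way to see why read length matters for this kind of analysis is to ask what fraction of reads of length k occur only once in a sequence. The sketch below is a rough proxy chosen for illustration, not the method of the paper; the function name `unique_kmer_fraction` and the toy genome are assumptions.

```python
from collections import Counter

def unique_kmer_fraction(genome, k):
    """Fraction of length-k windows whose sequence occurs exactly once."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    return sum(1 for km in kmers if counts[km] == 1) / len(kmers)

# Toy sequence containing a short internal repeat.
toy = "ACGTACGTTT"
for k in (2, 5):
    print(k, unique_kmer_fraction(toy, k))
```

    On a real genome the same calculation would be run over a range of k values; longer reads span repeats more often, so the unique fraction tends to rise with k.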

  6. Project Report: Automatic Sequence Processor Software Analysis

    NASA Technical Reports Server (NTRS)

    Benjamin, Brandon

    2011-01-01

    The Mission Planning and Sequencing (MPS) element of Multi-Mission Ground System and Services (MGSS) provides space missions with multi-purpose software to plan spacecraft activities, sequence spacecraft commands, and then integrate these products and execute them on spacecraft. Jet Propulsion Laboratory (JPL) is currently flying many missions. The processes for building, integrating, and testing the multi-mission uplink software need to be improved to meet the needs of the missions and the operations teams that command the spacecraft. The Multi-Mission Sequencing Team is responsible for collecting and processing the observations, experiments and engineering activities that are to be performed on a selected spacecraft. The collection of these activities is called a sequence and ultimately a sequence becomes a sequence of spacecraft commands. The operations teams check the sequence to make sure that no constraints are violated. The workflow process involves sending a program start command, which activates the Automatic Sequence Processor (ASP). The ASP is currently a file-based system composed of scripts written in Perl, C shell and awk. Once this start process is complete, the system checks for errors and aborts if there are any; otherwise the system converts the commands to binary, and then sends the resultant information to be radiated to the spacecraft.

  7. Automated shielding analysis sequences for spent fuel casks

    SciTech Connect

    Tang, J.S.; Parks, C.V.; Hermann, O.W.

    1987-01-01

    Two important Shielding Analysis Sequences (SAS) have recently been developed within the SCALE computational system. These sequences significantly enhance the existing SCALE system capabilities for evaluating radiation doses exterior to spent fuel casks. These new control module sequences (SAS1 and SAS4) and their capabilities are discussed and demonstrated, together with the existing SAS2 sequence that is used to generate radiation sources for spent fuel. Particular attention is given to the new SAS4 sequence which provides an automated scheme for generating and using biasing parameters in a subsequent Monte Carlo analysis of a cask.

  8. Sequencing and analysis of a genomic fragment provide an insight into the Dunaliella viridis genomic sequence.

    PubMed

    Sun, Xiao-Ming; Tang, Yuan-Ping; Meng, Xiang-Zong; Zhang, Wen-Wen; Li, Shan; Deng, Zhi-Rui; Xu, Zheng-Kai; Song, Ren-Tao

    2006-11-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress tolerance studies. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large numbers of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made it difficult to sequence Dunaliella genomic sequences. Further investigation should be made to reveal the biological significance of these unique sequence features. PMID:17091199
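
    The GC content figure quoted in this record is a simple ratio, sketched below. The function name `gc_content` is an assumption for the example, as is the choice to leave ambiguous bases (such as N) out of the denominator.

```python
def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted

# Toy sequence, not data from the study.
print(gc_content("GGCCAATT"))
```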

  9. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    2001-06-05

    A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).

  10. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    1999-10-26

    A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).

  11. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, M.S.

    1998-08-18

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device. 27 figs.

  12. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    2003-08-19

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  13. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    1998-08-18

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  14. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

    2004-05-11

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  15. Differential Proteomic Analysis of Human Saliva using Tandem Mass Tags Quantification for Gastric Cancer Detection

    PubMed Central

    Xiao, Hua; Zhang, Yan; Kim, Yong; Kim, Sung; Kim, Jae Joon; Kim, Kyoung Mee; Yoshizawa, Janice; Fan, Liu-Yin; Cao, Cheng-Xi; Wong, David T. W.

    2016-01-01

    Novel biomarkers and non-invasive diagnostic methods are urgently needed for the screening of gastric cancer to reduce its high mortality. We employed a quantitative proteomics approach to develop discriminatory biomarker signatures from human saliva for the detection of gastric cancer. Salivary proteins were analyzed and compared between gastric cancer patients and matched control subjects by using tandem mass tags (TMT) technology. More than 500 proteins were identified with quantification, and 48 of them showed significantly different expression (p < 0.05) between normal controls and gastric cancer patients, including 7 up-regulated proteins and 41 down-regulated proteins. Five proteins were selected for initial verification by ELISA and three were successfully verified, namely cystatin B (CSTB), triosephosphate isomerase (TPI1), and deleted in malignant brain tumors 1 protein (DMBT1). All three proteins could significantly differentiate gastric cancer patients from normal control subjects (p < 0.05). The combination of these three biomarkers could reach 85% sensitivity and 80% specificity for the detection of gastric cancer with an accuracy of 0.93. This study provides the proof of concept of salivary biomarkers for the non-invasive detection of gastric cancer. It is highly encouraging to turn these biomarkers into an applicable clinical test after large scale validation. PMID:26911362
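
    Sensitivity and specificity of the kind quoted in this record follow directly from a confusion matrix, as the sketch below shows. The function name `sensitivity_specificity` and the counts used are hypothetical and chosen only so that the arithmetic reproduces 85% and 80%; they are not the study's patient numbers.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp -- patients correctly classified as positive
    fn -- patients missed by the test
    tn -- controls correctly classified as negative
    fp -- controls wrongly classified as positive
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Hypothetical counts: 20 patients and 20 controls.
sens, spec = sensitivity_specificity(tp=17, fn=3, tn=16, fp=4)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```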

  16. Differential Proteomic Analysis of Human Saliva using Tandem Mass Tags Quantification for Gastric Cancer Detection.

    PubMed

    Xiao, Hua; Zhang, Yan; Kim, Yong; Kim, Sung; Kim, Jae Joon; Kim, Kyoung Mee; Yoshizawa, Janice; Fan, Liu-Yin; Cao, Cheng-Xi; Wong, David T W

    2016-01-01

    Novel biomarkers and non-invasive diagnostic methods are urgently needed for the screening of gastric cancer to reduce its high mortality. We employed a quantitative proteomics approach to develop discriminatory biomarker signatures from human saliva for the detection of gastric cancer. Salivary proteins were analyzed and compared between gastric cancer patients and matched control subjects by using tandem mass tags (TMT) technology. More than 500 proteins were identified with quantification, and 48 of them showed significantly different expression (p < 0.05) between normal controls and gastric cancer patients, including 7 up-regulated proteins and 41 down-regulated proteins. Five proteins were selected for initial verification by ELISA and three were successfully verified, namely cystatin B (CSTB), triosephosphate isomerase (TPI1), and deleted in malignant brain tumors 1 protein (DMBT1). All three proteins could significantly differentiate gastric cancer patients from normal control subjects (p < 0.05). The combination of these three biomarkers could reach 85% sensitivity and 80% specificity for the detection of gastric cancer with an accuracy of 0.93. This study provides the proof of concept of salivary biomarkers for the non-invasive detection of gastric cancer. It is highly encouraging to turn these biomarkers into an applicable clinical test after large scale validation. PMID:26911362

  17. Characterization and RNA-seq analysis of underperformer, an activation-tagged potato mutant.

    PubMed

    Aulakh, Sukhwinder S; Veilleux, Richard E; Dickerman, Allan W; Tang, Guozhu; Flinn, Barry S

    2014-04-01

    The potato cv. Bintje and a Bintje activation-tagged mutant, underperformer (up) were compared. Mutant up plants grown in vitro were dwarf, with abundant axillary shoot growth, greater tuber yield, altered tuber traits and early senescence compared to wild type. Under in vivo conditions, the dwarf and early senescence phenotypes of the mutant remained, but the up plants exhibited a lower tuber yield and fewer axillary shoots compared to wild type. Southern blot analyses indicated a single T-DNA insertion in the mutant, located on chromosome 10. Initial PCR-based gene expression studies indicated transcriptional activation/repression of several genes in the mutant flanking the insertion. The gene immediately flanking the right border of the T-DNA insertion, which encoded an uncharacterized Broad complex, Tramtrac, Bric-a-brac; also known as Pox virus and Zinc finger (BTB/POZ) domain-containing protein (StBTB/POZ1) containing an Armadillo repeat region, was up-regulated in the mutant. Global gene expression comparisons between Bintje and up using RNA-seq on leaves from 60 day-old plants revealed a dataset of over 1,600 differentially expressed genes. Gene expression analyses suggested a variety of biological processes and pathways were modified in the mutant, including carbohydrate and lipid metabolism, cell division and cell cycle activity, biotic and abiotic stress responses, and proteolysis. PMID:24306493

  18. Comparative Genomic Sequence Analysis of the Human Chromosome 21 Down Syndrome Critical Region

    PubMed Central

    Toyoda, Atsushi; Noguchi, Hideki; Taylor, Todd D.; Ito, Takehiko; Pletcher, Mathew T.; Sakaki, Yoshiyuki; Reeves, Roger H.; Hattori, Masahira

    2002-01-01

    Comprehensive knowledge of the gene content of human chromosome 21 (HSA21) is essential for understanding the etiology of Down syndrome (DS). Here we report the largest comparison of finished mouse and human sequence to date for a 1.35-Mb region of mouse chromosome 16 (MMU16) that corresponds to human chromosome 21q22.2. This includes a portion of the commonly described “DS critical region,” thought to contain a gene or genes whose dosage imbalance contributes to a number of phenotypes associated with DS. We used comparative sequence analysis to construct a DNA feature map of this region that includes all known genes, plus 144 conserved sequences ≥100 bp long that show ≥80% identity between mouse and human but do not match known exons. Twenty of these have matches to expressed sequence tag and cDNA databases, indicating that they may be transcribed sequences from chromosome 21. Eight putative CpG islands are found at conserved positions. Models for two human genes, DSCR4 and DSCR8, are not supported by conserved sequence, and close examination indicates that low-level transcripts from these loci are unlikely to encode proteins. Gene prediction programs give different results when used to analyze the well-conserved regions between mouse and human sequences. Our findings have implications for evolution and for modeling the genetic basis of DS in mice. [Sequence data described in this paper have been submitted to the DDBJ/GenBank under accession nos. AP003148 through AP003158, and AB066227. Supplemental material is available at http://www.genome.org.] PMID:12213769
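
    The conserved-sequence criterion in this record (stretches of at least 100 bp with at least 80% mouse-human identity) can be approximated with a sliding window over a pairwise alignment. The sketch below is illustrative only: the function name `conserved_windows`, the treatment of gap columns as mismatches, and the fixed-window approach are assumptions, not the authors' procedure.

```python
def conserved_windows(aln_a, aln_b, window=100, min_identity=0.8):
    """Return start columns of windows meeting the identity threshold.

    aln_a, aln_b -- two rows of a pairwise alignment (equal length,
                    '-' for gaps); gap columns count as mismatches.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    match = [int(x == y and x != "-")
             for x, y in zip(aln_a.upper(), aln_b.upper())]
    starts = []
    running = sum(match[:window])  # matches in the first window
    for i in range(len(match) - window + 1):
        if i > 0:
            # Slide the window one column to the right.
            running += match[i + window - 1] - match[i - 1]
        if running / window >= min_identity:
            starts.append(i)
    return starts

# Toy alignment with a small window so the result is easy to follow.
print(conserved_windows("ACGTACGTAC", "ACGTTGGTAC", window=5))
```

    Overlapping windows that pass the threshold would normally be merged into a single conserved region before counting.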

  19. C-Terminally fused affinity Strep-tag II is removed by proteolysis from recombinant human erythropoietin expressed in transgenic tobacco plants

    PubMed Central

    Kittur, Farooqahmed S.; Lalgondar, Mallikarjun; Hung, Chiu-Yueh; Sane, David C.

    2014-01-01

    Asialo-erythropoietin (asialo-EPO), a desialylated form of EPO, is a potent tissue-protective agent. Recently, we and others have exploited a low cost plant-based expression system to produce recombinant human asialo-EPO (asialo-rhuEPOP). To facilitate purification from plant extracts, Strep-tag II was engineered at the C-terminus of EPO. Although asialo-rhuEPOP was efficiently expressed in transgenic tobacco plants, affinity purification based on Strep-tag II did not result in the recovery of the protein. In this study, we investigated the stability of Strep-tag II tagged asialo-rhuEPOP expressed in tobacco plants to understand whether this fused tag is cleaved or inaccessible. Sequencing RT-PCR products confirmed that fused DNA sequences encoding Strep-tag II were properly transcribed, and three-dimensional protein structure model revealed that the tag must be fully accessible. However, Western blot analysis of leaf extracts and purified asialo-rhuEPOP revealed that the Strep-tag II was absent on the protein. Additionally, no peptide fragment containing Strep-tag II was identified in the LC-MS/MS analysis of purified protein further supporting that the affinity tag was absent on asialo-rhuEPOP. However, Strep-tag II was detected on asialo-rhuEPOP that was retained in the endoplasmic reticulum, suggesting that the Strep-tag II is removed during protein secretion or extraction. These findings together with recent reports that C-terminally fused Strep-tag II or IgG Fc domain are also removed from EPO in tobacco plants, suggest that its C-terminus may be highly susceptible to proteolysis in tobacco plants. Therefore, direct fusion of purification tags at the C-terminus of EPO should be avoided while expressing it in tobacco plants. PMID:25504272

  20. Scalable Kernel Methods and Algorithms for General Sequence Analysis

    ERIC Educational Resources Information Center

    Kuksa, Pavel

    2011-01-01

    Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…

  1. Relationships among genera of the Saccharomycotina from multigene sequence analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Most known species of the subphylum Saccharomycotina (budding ascomycetous yeasts) have now been placed in phylogenetically defined clades following multigene sequence analysis. Terminal clades, which are usually well supported from bootstrap analysis, are viewed as phylogenetically circumscribed ge...

  2. Establishing a framework for comparative analysis of genome sequences

    SciTech Connect

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
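
    The notion of a "constrained column" can be illustrated with a minimal Python sketch. This is not the Prolog toolkit described in the abstract: the property class `HYDROPHOBIC`, the function name `constrained_columns`, and the toy alignment are all assumptions made for illustration, and the phylogenetic-tree step is left out entirely.

```python
# Hypothetical illustration: find alignment columns whose residues vary
# (mutation occurred) yet all share one property. The hydrophobic set
# below is an assumed property class, not taken from the paper.
HYDROPHOBIC = set("AVLIMFWC")


def constrained_columns(alignment, property_set):
    """Return indices of columns that mutate but stay inside property_set."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(width):
        residues = {row[col] for row in alignment}
        varies = len(residues) > 1          # at least one substitution
        if varies and residues <= property_set:
            hits.append(col)
    return hits


toy_alignment = [
    "MKVLA",
    "MRILA",
    "MKLLG",
]
print(constrained_columns(toy_alignment, HYDROPHOBIC))
```

    In the toy alignment only the third column (index 2) both varies and stays within the assumed property class; fully conserved columns are deliberately excluded because they show no mutation.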

  3. Phylogenetic analysis of Ostreococcus virus sequences from the Patagonian Coast.

    PubMed

    Manrique, Julieta M; Calvo, Andrea Y; Jones, Leandro R

    2012-10-01

    A phylogenetic analysis of new Ostreococcus virus (OV) sequences from the Patagonian Coast, Argentina, and homologous sequences from public databases was performed. This analysis showed that the Patagonian sequences represented a divergent viral clade and that the rest of OV sequences analyzed here were clustered into six additional phylogenetic groups. Analyses of 18S gene libraries supported a close relationship of the Patagonian Ostreococcus host with clade A sequences described elsewhere, corroborating previous studies indicating that clade A strains are ubiquitous. Besides the Patagonian OV sequences, several phylogenetic groupings were linked to particular geographic locations, suggesting a role for allopatric cladogenesis in viral diversification. However, and in agreement with previous observations, other viral lineages included sequences with diverse geographic origins. These findings, together with analyses of ancestral trait trajectories performed here, are consistent with an evolutionary dynamics in which geographical isolation has a role in OV diversification but can be followed by rapid dispersion to remote places. PMID:22674355

  4. Laser desorption mass spectrometry for DNA analysis and sequencing

    SciTech Connect

    Chen, C.H.; Taranenko, N.I.; Tang, K.; Allman, S.L.

    1995-03-01

    Laser desorption mass spectrometry has been considered as a potential new method for fast DNA sequencing. Our approach is to use matrix-assisted laser desorption to produce parent ions of DNA segments and a time-of-flight mass spectrometer to identify the sizes of DNA segments. Thus, the approach is similar to gel electrophoresis sequencing using Sanger's enzymatic method. However, gel, radioactive tagging, and dye labeling are not required. In addition, the sequencing process can possibly be finished within a few hundred microseconds instead of hours and days. In order to use mass spectrometry for fast DNA sequencing, the following three criteria need to be satisfied: (1) detection of large DNA segments, (2) sensitivity reaching the femtomole region, and (3) mass resolution good enough to separate DNA segments of a single nucleotide difference. It has previously been very difficult to detect large DNA segments by mass spectrometry because of the fragile chemical properties of DNA and the low detection sensitivity of DNA ions. We discovered several new matrices to increase the production of DNA ions. By innovative design of a mass spectrometer, we can increase the ion energy up to 45 keV to enhance the detection sensitivity. Recently, we succeeded in detecting a DNA segment with 500 nucleotides. The sensitivity was 100 femtomole. Thus, we have fulfilled two key criteria for using mass spectrometry for fast DNA sequencing. The major effort in the near future is to improve the resolution. Different approaches are being pursued. When high resolution of mass spectrometry can be achieved and automation of sample preparation is developed, a sequencing speed of 500 megabases per year may become feasible.

  5. Computational and biological analysis of 680 kb of DNA sequence from the human 5q31 cytokine gene cluster region.

    PubMed

    Frazer, K A; Ueda, Y; Zhu, Y; Gifford, V R; Garofalo, M R; Mohandas, N; Martin, C H; Palazzolo, M J; Cheng, J F; Rubin, E M

    1997-05-01

    With the human genome project advancing into what will be a 7- to 10-year DNA sequencing phase, we are presented with the challenge of developing strategies to convert genomic sequence data, as they become available, into biologically meaningful information. We have analyzed 680 kb of noncontiguous DNA sequence from a 1-Mb region of human chromosome 5q31, coupling computational analysis with gene expression studies of tissues isolated from humans as well as from mice containing human YAC transgenes. This genomic interval has been noted previously for containing the cytokine gene cluster and a quantitative trait locus associated with inflammatory diseases. Our analysis identified and verified expression of 16 new genes, as well as 7 previously known genes. Of the total of 23 genes in this region, 78% had similarity matches to sequences in protein databases and 83% had exact expressed sequence tag (EST) database matches. Comparative mapping studies of eight of the new human genes discovered in the 5q31 region revealed that all are located in the syntenic region of mouse chromosome 11q. Our analysis demonstrates an approach for examining human sequence as it is made available from large sequencing programs and has resulted in the discovery of several biomedically important genes, including a cyclin, a transcription factor that is homologous to an oncogene, a protein involved in DNA repair, and several new members of a family of transporter proteins. PMID:9149945

  6. Molecular diet analysis of two African free-tailed bats (Molossidae) using high throughput sequencing.

    PubMed

    Bohmann, Kristine; Monadjem, Ara; Lehmkuhl Noer, Christina; Rasmussen, Morten; Zeale, Matt R K; Clare, Elizabeth; Jones, Gareth; Willerslev, Eske; Gilbert, M Thomas P

    2011-01-01

    Given the diversity of prey consumed by insectivorous bats, it is difficult to discern the composition of their diet using morphological or conventional PCR-based analyses of their faeces. We demonstrate the use of a powerful alternate tool, the use of the Roche FLX sequencing platform to deep-sequence uniquely 5' tagged insect-generic barcode cytochrome c oxidase I (COI) fragments, that were PCR amplified from faecal pellets of two free-tailed bat species Chaerephon pumilus and Mops condylurus (family: Molossidae). Although the analyses were challenged by the paucity of southern African insect COI sequences in the GenBank and BOLD databases, similarity to existing collections allowed the preliminary identification of 25 prey families from six orders of insects within the diet of C. pumilus, and 24 families from seven orders within the diet of M. condylurus. Insects identified to families within the orders Lepidoptera and Diptera were widely present among the faecal samples analysed. The two families that were observed most frequently were Noctuidae and Nymphalidae (Lepidoptera). Species-level analysis of the data was accomplished using novel bioinformatics techniques for the identification of molecular operational taxonomic units (MOTU). Based on these analyses, our data provide little evidence of resource partitioning between sympatric M. condylurus and C. pumilus in the Simunye region of Swaziland at the time of year when the samples were collected, although as more complete databases against which to compare the sequences are generated this may have to be re-evaluated. PMID:21731749

  7. Cloning and sequence analysis of the ces10 gene encoding a Sphingomonas paucimobilis esterase.

    PubMed

    Videira, P A; Fialho, A M; Marques, A R; Coutinho, P M; Sá-Correia, I

    2003-06-01

    The ces10 gene of the gellan gum-producing strain Sphingomonas paucimobilis ATCC 31461 was cloned and sequenced. Multi-sequence alignment of the deduced protein indicated that Ces10 belongs to the serine hydrolase family with a potential catalytic triad comprising Ser(153) (within the G-X-S-X-G consensus sequence), His(75) and Asp(125). The mixed block results obtained following pattern search and the low identities detected in a BLAST analysis indicate that Ces10 is significantly different from other characterised bacterial esterases/lipases. Nevertheless, the Ces10 amino acid sequence showed 45% similarity with Rhodococcus sp. heroin esterase and 48% with Bacillus subtilis p-nitrobenzyl esterase. Ces10, with a predicted molecular mass of 30,641 Da, was overproduced in Escherichia coli and purified to homogeneity in a histidine-tagged form. Enzyme assays using p-nitrophenyl-esters (p-NP-esters) with different acyl chain-lengths as the substrate confirmed the anticipated esterase activity. Ces10 exhibited a marked preference for short-chain fatty acids, yielding the highest activity with p-NP-propionate (optimal pH 7.4, optimal temperature 37 degrees C). PMID:12764567

  8. A Single-Nucleotide Polymorphism of TaGS5 Gene Revealed its Association with Kernel Weight in Chinese Bread Wheat

    PubMed Central

    Wang, Shasha; Zhang, Xiangfen; Chen, Feng; Cui, Dangqun

    2015-01-01

    TaGS5 genes were cloned from bread wheat and were physically mapped on 3AS and 3DS. Sequencing results revealed that a SNP was found in the sixth exon of TaGS5-A1 gene. The SNP resulted in amino acid change from alanine to serine at the 303 bp position of TaGS5-A1. These two alleles were designated as TaGS5-A1a (alanine at the 303 bp position) and TaGS5-A1b (serine at the 303 bp position). Analysis of association of TaGS5-A1 alleles with agronomic traits indicated that cultivars with TaGS5-A1b possessed wider kernel width and higher thousand-kernel weight, as well as significantly lower plant height, spike length, and internode length below spike than those of cultivars with TaGS5-A1a over 3 years. These trait differences between TaGS5-A1a and TaGS5-A1b genotypes were larger in landraces than in modern cultivars. This finding suggested that TaGS5 gene played an important role in modulating yield-related traits in the landraces, which possibly resulted from numerous superior genes gathering in modern cultivars after strong artificial selection. The preferred TaGS5-A1b haplotype underwent very strong positive selection in Chinese modern wheat breeding, but not in Chinese landraces. Expression analysis of the TaGS5-A1 gene indicated that the TaGS5-A1b allele possessed a significantly higher expression level than the TaGS5-A1a allele in seeds at different developmental stages. This study could provide a relatively superior genotype in view of agronomic traits in wheat breeding programs. Likewise, this study could offer important information for the dissection of the molecular and genetic basis of yield-related traits. PMID:26779195

  9. Wavelet Analysis on Symbolic Sequences and Two-Fold de Bruijn Sequences

    NASA Astrophysics Data System (ADS)

    Osipov, V. Al.

    2016-05-01

    The concept of symbolic sequences plays an important role in the study of complex systems. In this work we are interested in the ultrametric structure of the set of cyclic sequences naturally arising in the theory of dynamical systems. Aiming to construct analytic and numerical methods for the investigation of clusters, we introduce an operator language on the space of symbolic sequences and propose an approach based on wavelet analysis for the study of the cluster hierarchy. The analytic power of the approach is demonstrated by the derivation of a formula for counting two-fold de Bruijn sequences, an extension of the notion of de Bruijn sequences. Possible advantages of the developed description are also discussed in the context of the applied problem of constructing efficient DNA sequence assembly algorithms.
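
    For orientation, an ordinary de Bruijn sequence over a k-letter alphabet is a cyclic sequence in which every word of length n occurs exactly once. The Python sketch below builds one with the standard Lyndon-word recursion and checks the defining property; it does not construct or count the two-fold sequences studied in the paper, and the function names `de_bruijn` and `cyclic_word_counts` are chosen here for illustration.

```python
from collections import Counter


def de_bruijn(alphabet, n):
    """Build an ordinary de Bruijn sequence B(k, n) by concatenating Lyndon words."""
    k = len(alphabet)
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


def cyclic_word_counts(seq, n):
    """Count every length-n window of seq, reading it as a cycle."""
    extended = seq + seq[:n - 1]
    return Counter(extended[i:i + n] for i in range(len(seq)))


dna_cycle = de_bruijn("ACGT", 2)
counts = cyclic_word_counts(dna_cycle, 2)
print(dna_cycle, len(counts), set(counts.values()))
```

    In a two-fold de Bruijn sequence, as the name suggests, each word would be expected to occur twice rather than once; the same window-counting helper could be used to check such a candidate.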

  10. Wavelet Analysis on Symbolic Sequences and Two-Fold de Bruijn Sequences

    NASA Astrophysics Data System (ADS)

    Osipov, V. Al.

    2016-07-01

    The concept of symbolic sequences plays an important role in the study of complex systems. In this work we are interested in the ultrametric structure of the set of cyclic sequences naturally arising in the theory of dynamical systems. Aiming to construct analytic and numerical methods for the investigation of clusters, we introduce an operator language on the space of symbolic sequences and propose an approach based on wavelet analysis for the study of the cluster hierarchy. The analytic power of the approach is demonstrated by the derivation of a formula for counting two-fold de Bruijn sequences, an extension of the notion of de Bruijn sequences. Possible advantages of the developed description are also discussed in the context of the applied problem of constructing efficient DNA sequence assembly algorithms.

  11. Modern Computational Techniques for the HMMER Sequence Analysis

    PubMed Central

    2013-01-01

    This paper focuses on the latest research and critical reviews on modern computing architectures, software and hardware accelerated algorithms for bioinformatics data analysis with an emphasis on one of the most important sequence analysis applications—hidden Markov models (HMM). We show the detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics society. The characteristics of the sequence analysis, such as data and compute-intensive natures, make it very attractive to optimize and parallelize by using both traditional software approach and innovated hardware acceleration technologies. PMID:25937944

  12. Understanding why users tag: A survey of tagging motivation literature and results from an empirical study

    PubMed Central

    Strohmaier, Markus; Körner, Christian; Kern, Roman

    2012-01-01

    While recent progress has been achieved in understanding the structure and dynamics of social tagging systems, we know little about the underlying user motivations for tagging, and how they influence resulting folksonomies and tags. This paper addresses three issues related to this question. (1) What distinctions of user motivations are identified by previous research, and in what ways are the motivations of users amenable to quantitative analysis? (2) To what extent does tagging motivation vary across different social tagging systems? (3) How does variability in user motivation influence resulting tags and folksonomies? In this paper, we present measures to detect whether a tagger is primarily motivated by categorizing or describing resources, and apply these measures to datasets from seven different tagging systems. Our results show that (a) users’ motivation for tagging varies not only across, but also within tagging systems, and that (b) tag agreement among users who are motivated by categorizing resources is significantly lower than among users who are motivated by describing resources. Our findings are relevant for (1) the development of tag-based user interfaces, (2) the analysis of tag semantics and (3) the design of search algorithms for social tagging systems. PMID:23471473

  13. Stratigraphic sequence analysis of the Antler foreland

    SciTech Connect

    Silberling, N.J.; Nichols, K.M.; Macke, D.L.

    1993-04-01

    Mid-Upper Devonian to Upper Mississippian strata in western Utah were deposited in the distal Antler foreland. They record lateral and vertical changes in depositional environments that define five successive stratigraphic sequences, each representing a third-order transgressive-regressive cycle. In ascending order, these sequences are informally named the Langenheim (LA) of late Frasnian to mid-Famennian age, the Gutschick (GU) of late Famennian to early Kinderhookian age, the Morris (MO) of late Kinderhookian age; the Sadlick (SA) of Osagean to early Meramecian age, and the Maughan (MA) of mid-Meramecian to Chesterian age. MO is widespread and recognized within carbonate rocks of the Fitchville Formation and Joana Limestone. SA formed in concert with and to the east and south of the Wendover foreland high; the Delle phosphatic event marks maximum marine flooding during SA deposition. The transgressive systems tract of MA includes rhythmic-bedded limestone in the upper part of the Deseret Limestone in west-central Utah and, farther west, the hypoxic limestone and black shale of the Skunk Spring Limestone Bed and part of the overlying Chainman Shale. Traced westward into Nevada, MA first oversteps SA and then MO. Lithostratigraphic correlation of these sequences still farther west into the Eureka thrust belt (ETB) could mean that the youngest strata truncated by the Roberts Mountains thrust belong to the MA and that this thrust is simply part of the post-Mississippian ETB. However, some strata in central Nevada that lithically resemble those of the MA are paleontologically dated as Early Mississippian, the age of sequences overstepped by MA not far to the east. Thus, at least some imbricates of the ETB may contain a sequence stratigraphy which reflects local tectonic control.

  14. Gene CATCHR--gene cloning and tagging for Caenorhabditis elegans using yeast homologous recombination: a novel approach for the analysis of gene expression.

    PubMed

    Sassi, Holly E; Renihan, Stephanie; Spence, Andrew M; Cooperstock, Ramona L

    2005-01-01

    Expression patterns of gene products provide important insights into gene function. Reporter constructs are frequently used to analyze gene expression in Caenorhabditis elegans, but the sequence context of a given gene is inevitably altered in such constructs. As a result, these transgenes may lack regulatory elements required for proper gene expression. We developed Gene Catchr, a novel method of generating reporter constructs that exploits yeast homologous recombination (YHR) to subclone and tag worm genes while preserving their local sequence context. YHR facilitates the cloning of large genomic regions, allowing the isolation of regulatory sequences in promoters, introns, untranslated regions and flanking DNA. The endogenous regulatory context of a given gene is thus preserved, producing expression patterns that are as accurate as possible. Gene Catchr is flexible: any tag can be inserted at any position without introducing extra sequence. Each step is simple and can be adapted to process multiple genes in parallel. We show that expression patterns derived from Gene Catchr transgenes are consistent with previous reports and also describe novel expression data. Mutant rescue assays demonstrate that Gene Catchr-generated transgenes are functional. Our results validate the use of Gene Catchr as a valuable tool to study spatiotemporal gene expression. PMID:16254074

  15. Statistical Survival Analysis of Fish and Wildlife Tagging Studies; SURPH.1 Manual - Analysis of Release-Recapture Data for Survival Studies, 1994 Technical Manual.

    SciTech Connect

    Smith, Steven G.; Skalski, John R.; Schelechte, J. Warren

    1994-12-01

    Program SURPH is the culmination of several years of research to develop a comprehensive computer program to analyze survival studies of fish and wildlife populations. Development of this software was motivated by the advent of the PIT-tag (Passive Integrated Transponder) technology that permits the detection of salmonid smolt as they pass through hydroelectric facilities on the Snake and Columbia Rivers in the Pacific Northwest. Repeated detections of individually tagged smolt and analysis of their capture-histories permits estimates of downriver survival probabilities. Eventual installation of detection facilities at adult fish ladders will also permit estimation of ocean survival and upstream survival of returning salmon using the statistical methods incorporated in SURPH.1. However, the utility of SURPH.1 far exceeds solely the analysis of salmonid tagging studies. Release-recapture and radiotelemetry studies from a wide range of terrestrial and aquatic species have been analyzed using SURPH.1 to estimate discrete time survival probabilities and investigate survival relationships. The interactive computing environment of SURPH.1 was specifically developed to allow researchers to investigate the relationship between survival and capture processes and environmental, experimental and individual-based covariates. Program SURPH.1 represents a significant advancement in the ability of ecologists to investigate the interplay between morphologic, genetic, environmental and anthropogenic factors on the survival of wild species. It is hoped that this better understanding of risk factors affecting survival will lead to greater appreciation of the intricacies of nature and to improvements in the management of wild resources. This technical report is an introduction to SURPH.1 and provides a user guide for both the UNIX and MS-Windows® applications of the SURPH software.

  16. A Novel Function for Arabidopsis CYCLASE1 in Programmed Cell Death Revealed by Isobaric Tags for Relative and Absolute Quantitation (iTRAQ) Analysis of Extracellular Matrix Proteins*

    PubMed Central

    Smith, Sarah J.; Kroon, Johan T. M.; Simon, William J.; Slabas, Antoni R.; Chivasa, Stephen

    2015-01-01

    Programmed cell death is essential for plant development and stress adaptation. A detailed understanding of the signal transduction pathways that regulate plant programmed cell death requires identification of the underpinning protein networks. Here, we have used a protagonist and antagonist of programmed cell death triggered by fumonisin B1 as probes to identify key cell death regulatory proteins in Arabidopsis. Our hypothesis was that changes in the abundance of cell death-regulatory proteins induced by the protagonist should be blocked or attenuated by concurrent treatment with the antagonist. We focused on proteins present in the mobile phase of the extracellular matrix on the basis that they are important for cell–cell communications during growth and stress-adaptive responses. Salicylic acid, a plant hormone that promotes programmed cell death, and exogenous ATP, which can block fumonisin B1-induced cell death, were used to treat Arabidopsis cell suspension cultures prior to isobaric-tagged relative and absolute quantitation analysis of secreted proteins. A total of 33 proteins, whose response to salicylic acid was suppressed by ATP, were identified as putative cell death-regulatory proteins. Among these was CYCLASE1, which was selected for further analysis using reverse genetics. Plants in which CYCLASE1 gene expression was knocked out by insertion of a transfer-DNA sequence manifested dramatically increased cell death when exposed to fumonisin B1 or a bacterial pathogen that triggers the defensive hypersensitive cell death. Although pathogen inoculation altered CYCLASE1 gene expression, multiplication of bacterial pathogens was indistinguishable between wild type and CYCLASE1 knockout plants. However, remarkably severe chlorosis symptoms developed on gene knockout plants in response to inoculation with either a virulent bacterial pathogen or a disabled mutant that is incapable of causing disease in wild type plants. 
These results show that CYCLASE1, which

  17. Comparative analyses of genotype dependent expressed sequence tags and stress-responsive transcriptome of chickpea wilt illustrate predicted and unexpected genes and novel regulators of plant immunity

    PubMed Central

    Ashraf, Nasheeman; Ghai, Deepali; Barman, Pranjan; Basu, Swaraj; Gangisetty, Nagaraju; Mandal, Mihir K; Chakraborty, Niranjan; Datta, Asis; Chakraborty, Subhra

    2009-01-01

    Background The ultimate phenome of any organism is modulated by regulated transcription of many genes. Characterization of genetic makeup is thus crucial for understanding the molecular basis of phenotypic diversity, evolution and response to intra- and extra-cellular stimuli. Chickpea is the world's third most important food legume grown in over 40 countries representing all the continents. Despite its importance in plant evolution, role in human nutrition and stress adaptation, very few ESTs and little differential transcriptome data are available, let alone genotype-specific gene signatures. The present study focuses on Fusarium wilt responsive gene expression in chickpea. Results We report 6272 gene sequences of immune-response pathway that would provide genotype-dependent spatial information on the presence and relative abundance of each gene. The sequence assembly led to the identification of a CaUnigene set of 2013 transcripts comprising 973 contigs and 1040 singletons, two-thirds of which represent new chickpea genes hitherto undiscovered. We identified 209 gene families and 262 genotype-specific SNPs. Further, several novel transcription regulators were identified indicating their possible role in immune response. The transcriptomic analysis revealed 649 non-canonical genes besides many unexpected candidates with known biochemical functions, which have never been associated with pathostress-responsive transcriptome. Conclusion Our study establishes a comprehensive catalogue of the immune-responsive root transcriptome with insight into their identity and function. The development, detailed analysis of CaEST datasets and global gene expression by microarray provide new insight into the commonality and diversity of organ-specific immune-responsive transcript signatures and their regulated expression shaping the species specificity at genotype level. This is the first report on differential transcriptome of an unsequenced genome during vascular wilt. PMID:19732460

  18. HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis

    PubMed Central

    Santana-Quintero, Luis; Dingerdissen, Hayley; Thierry-Mieg, Jean; Mazumder, Raja; Simonyan, Vahan

    2014-01-01

    Due to the size of Next-Generation Sequencing data, the computational challenge of sequence alignment has been vast. Inexact alignments can take up to 90% of total CPU time in bioinformatics pipelines. High-performance Integrated Virtual Environment (HIVE), a cloud-based environment optimized for storage and analysis of extra-large data, presents an algorithmic solution: the HIVE-hexagon DNA sequence aligner. HIVE-hexagon implements novel approaches to exploit both characteristics of sequence space and CPU, RAM and Input/Output (I/O) architecture to quickly compute accurate alignments. Key components of HIVE-hexagon include non-redundification and sorting of sequences; floating diagonals of linearized dynamic programming matrices; and consideration of cross-similarity to minimize computations. Availability https://hive.biochemistry.gwu.edu/hive/ PMID:24918764
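
    The "non-redundification and sorting of sequences" step can be pictured with a minimal Python sketch: identical reads are collapsed to one representative with a multiplicity, and the unique reads are sorted so that similar prefixes end up adjacent. This is an assumed toy version written for illustration, not HIVE-hexagon's implementation; the function name `collapse_reads` and the sample reads are hypothetical.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical reads and return (sequence, count) pairs, sorted by sequence."""
    counts = Counter(reads)
    return sorted(counts.items())


# Hypothetical reads; real NGS input would come from FASTQ/FASTA files.
reads = ["ACGT", "TTGA", "ACGT", "ACGA", "TTGA", "ACGT"]
unique_reads = collapse_reads(reads)
print(unique_reads)
print(f"{len(reads)} reads collapsed to {len(unique_reads)} unique sequences")
```

    Under this simplification an aligner would only need to place each unique sequence once and could then propagate the result to all of its copies via the stored count.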

  19. Initial sequencing and analysis of the human genome.

    PubMed

    Lander, E S; Linton, L M; Birren, B; Nusbaum, C; Zody, M C; Baldwin, J; Devon, K; Dewar, K; Doyle, M; FitzHugh, W; Funke, R; Gage, D; Harris, K; Heaford, A; Howland, J; Kann, L; Lehoczky, J; LeVine, R; McEwan, P; McKernan, K; Meldrim, J; Mesirov, J P; Miranda, C; Morris, W; Naylor, J; Raymond, C; Rosetti, M; Santos, R; Sheridan, A; Sougnez, C; Stange-Thomann, Y; Stojanovic, N; Subramanian, A; Wyman, D; Rogers, J; Sulston, J; Ainscough, R; Beck, S; Bentley, D; Burton, J; Clee, C; Carter, N; Coulson, A; Deadman, R; Deloukas, P; Dunham, A; Dunham, I; Durbin, R; French, L; Grafham, D; Gregory, S; Hubbard, T; Humphray, S; Hunt, A; Jones, M; Lloyd, C; McMurray, A; Matthews, L; Mercer, S; Milne, S; Mullikin, J C; Mungall, A; Plumb, R; Ross, M; Shownkeen, R; Sims, S; Waterston, R H; Wilson, R K; Hillier, L W; McPherson, J D; Marra, M A; Mardis, E R; Fulton, L A; Chinwalla, A T; Pepin, K H; Gish, W R; Chissoe, S L; Wendl, M C; Delehaunty, K D; Miner, T L; Delehaunty, A; Kramer, J B; Cook, L L; Fulton, R S; Johnson, D L; Minx, P J; Clifton, S W; Hawkins, T; Branscomb, E; Predki, P; Richardson, P; Wenning, S; Slezak, T; Doggett, N; Cheng, J F; Olsen, A; Lucas, S; Elkin, C; Uberbacher, E; Frazier, M; Gibbs, R A; Muzny, D M; Scherer, S E; Bouck, J B; Sodergren, E J; Worley, K C; Rives, C M; Gorrell, J H; Metzker, M L; Naylor, S L; Kucherlapati, R S; Nelson, D L; Weinstock, G M; Sakaki, Y; Fujiyama, A; Hattori, M; Yada, T; Toyoda, A; Itoh, T; Kawagoe, C; Watanabe, H; Totoki, Y; Taylor, T; Weissenbach, J; Heilig, R; Saurin, W; Artiguenave, F; Brottier, P; Bruls, T; Pelletier, E; Robert, C; Wincker, P; Smith, D R; Doucette-Stamm, L; Rubenfield, M; Weinstock, K; Lee, H M; Dubois, J; Rosenthal, A; Platzer, M; Nyakatura, G; Taudien, S; Rump, A; Yang, H; Yu, J; Wang, J; Huang, G; Gu, J; Hood, L; Rowen, L; Madan, A; Qin, S; Davis, R W; Federspiel, N A; Abola, A P; Proctor, M J; Myers, R M; Schmutz, J; Dickson, M; Grimwood, J; Cox, D R; Olson, M V; Kaul, R; Raymond, C; Shimizu, N; 
Kawasaki, K; Minoshima, S; Evans, G A; Athanasiou, M; Schultz, R; Roe, B A; Chen, F; Pan, H; Ramser, J; Lehrach, H; Reinhardt, R; McCombie, W R; de la Bastide, M; Dedhia, N; Blöcker, H; Hornischer, K; Nordsiek, G; Agarwala, R; Aravind, L; Bailey, J A; Bateman, A; Batzoglou, S; Birney, E; Bork, P; Brown, D G; Burge, C B; Cerutti, L; Chen, H C; Church, D; Clamp, M; Copley, R R; Doerks, T; Eddy, S R; Eichler, E E; Furey, T S; Galagan, J; Gilbert, J G; Harmon, C; Hayashizaki, Y; Haussler, D; Hermjakob, H; Hokamp, K; Jang, W; Johnson, L S; Jones, T A; Kasif, S; Kaspryzk, A; Kennedy, S; Kent, W J; Kitts, P; Koonin, E V; Korf, I; Kulp, D; Lancet, D; Lowe, T M; McLysaght, A; Mikkelsen, T; Moran, J V; Mulder, N; Pollara, V J; Ponting, C P; Schuler, G; Schultz, J; Slater, G; Smit, A F; Stupka, E; Szustakowki, J; Thierry-Mieg, D; Thierry-Mieg, J; Wagner, L; Wallis, J; Wheeler, R; Williams, A; Wolf, Y I; Wolfe, K H; Yang, S P; Yeh, R F; Collins, F; Guyer, M S; Peterson, J; Felsenfeld, A; Wetterstrand, K A; Patrinos, A; Morgan, M J; de Jong, P; Catanese, J J; Osoegawa, K; Shizuya, H; Choi, S; Chen, Y J; Szustakowki, J

    2001-02-15

    The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence. PMID:11237011

  20. High Throughput Sequence Analysis for Disease Resistance in Maize

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Preliminary results of a computational analysis of high throughput sequencing data from Zea mays and the fungus Aspergillus are reported. The Illumina Genome Analyzer was used to sequence RNA samples from two strains of Z. mays (Va35 and Mp313) collected over a time course as well as several specie...

  1. Error analysis of deep sequencing of phage libraries: peptides censored in sequencing.

    PubMed

    Matochko, Wadim L; Derda, Ratmir

    2013-01-01

    Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N × 1 frequency vector n = ||n_i||, where n_i is the copy number of the ith sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation to the library is an operator acting on n. Selection, amplification, or sequencing could be described as a product of an N × N matrix and a stochastic sampling operator (Sa). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq). Sequencing without any bias and errors is Seq = Sa I_N, where I_N is an N × N unity matrix. Any bias in sequencing changes I_N to a nonunity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process. PMID:24416071
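
    The vector and operator framework in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the sampling model (draws with replacement, weighted by copy number), and the ordering "censor first, then sample" (Seq = Sa CEN) are assumptions made here purely for illustration. Diagonal matrices are stored as plain lists of their diagonal entries.

```python
import random
from fractions import Fraction


def sample_diagonal(n, depth, rng):
    """Diagonal of a hypothetical sampling operator Sa for library vector n.

    Assumption: `depth` reads are drawn with replacement, weighted by copy
    number. Entry i is (reads observed for sequence i) / n[i], so that
    applying the diagonal to n gives the observed read counts.
    """
    weights = [float(x) for x in n]
    counts = [0] * len(n)
    for i in rng.choices(range(len(n)), weights=weights, k=depth):
        counts[i] += 1
    # Sequences with zero copies can never be drawn; give them a 0 entry.
    return [Fraction(c) / Fraction(x) if x else Fraction(0)
            for c, x in zip(counts, n)]


def apply_diagonal(diag, vec):
    """Multiply a diagonal matrix (given by its diagonal) with a vector."""
    return [Fraction(d) * Fraction(v) for d, v in zip(diag, vec)]


def sequence(n, cen, depth, rng):
    """Apply Seq = Sa * CEN to n (assumed order: censor, then sample).

    `cen` is the diagonal of the censorship matrix; all ones reproduces the
    unbiased case Seq = Sa * I_N, and a 0 entry eliminates that sequence.
    """
    censored = apply_diagonal(cen, n)
    sa = sample_diagonal(censored, depth, rng)
    return apply_diagonal(sa, censored)


if __name__ == "__main__":
    rng = random.Random(0)
    library = [100, 50, 25, 25]          # hypothetical copy numbers n_i
    print(sequence(library, [1, 1, 1, 1], 40, rng))   # unbiased sequencing
    print(sequence(library, [1, 0, 1, 1], 40, rng))   # sequence 2 censored
```

    In this toy model a censored sequence is absent from the reads no matter how abundant it was in the library, which is the kind of effect the abstract attributes to CEN.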

  2. De Novo Sequencing and Transcriptome Analysis of the Central Nervous System of Mollusc Lymnaea stagnalis by Deep RNA Sequencing

    PubMed Central

    Sadamoto, Hisayo; Takahashi, Hironobu; Okada, Taketo; Kenmoku, Hiromichi; Toyota, Masao; Asakawa, Yoshinori

    2012-01-01

    The pond snail Lymnaea stagnalis is among several mollusc species that have been well investigated due to the simplicity of their nervous systems and large identifiable neurons. Nonetheless, despite the continued attention given to the physiological characteristics of its nervous system, the genetic information of the Lymnaea central nervous system (CNS) has not yet been fully explored. The absence of genetic information is a large disadvantage for transcriptome sequencing because it makes transcriptome assembly difficult. We here performed transcriptome sequencing for Lymnaea CNS using an Illumina Genome Analyzer IIx platform and obtained 81.9 M 100 base pair (bp) single-end reads. For de novo assembly, five programs were used: ABySS, Velvet, OASES, Trinity and Rnnotator. Based on a comparison of the assemblies, we chose the Rnnotator dataset for the subsequent BLAST searches and gene ontology analyses. The present dataset, 116,355 contigs of Lymnaea transcriptome shotgun assembly (TSA), contained longer sequences and was much larger than the previously reported Lymnaea expressed sequence tag (EST) dataset established by classical Sanger sequencing. The TSA sequences were subjected to BLAST analyses against several protein databases and Aplysia EST data. The results demonstrated that about 20,000 sequences had significant similarity to the reported sequences using a cutoff value of 1e-6, and showed the lack of molluscan sequences in the public databases. The richness of the present TSA data allowed us to identify a large number of new transcripts in Lymnaea and molluscan species. PMID:22870333
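
    The e-value cutoff mentioned in this abstract can be sketched as a simple filter. The abstract does not say how the hits were stored, so this example assumes BLAST tabular output (the 12-column "outfmt 6" layout, where the e-value is the 11th field). The function names, the sample identifiers, and the choice to treat the cutoff as inclusive are likewise assumptions.

```python
def filter_hits(lines, max_evalue=1e-6):
    """Keep tabular BLAST hits whose e-value is at or below max_evalue.

    Assumes tab-separated lines in the default 12-column tabular layout:
    query id, subject id, ..., e-value (field 11), bit score (field 12).
    Returns (query_id, subject_id, evalue) tuples.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept


def count_queries_with_hits(hits):
    """Number of distinct query sequences that have at least one kept hit."""
    return len({query for query, _subject, _evalue in hits})


if __name__ == "__main__":
    # Hypothetical hits for two contigs; only the first passes the cutoff.
    example = [
        "contig_1\tprot_A\t90.0\t100\t1\t0\t1\t100\t1\t100\t1e-30\t200",
        "contig_2\tprot_B\t40.0\t50\t20\t2\t1\t50\t1\t50\t0.5\t30",
    ]
    hits = filter_hits(example)
    print(hits, count_queries_with_hits(hits))
```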

  3. MESSA: MEta-Server for protein Sequence Analysis

    PubMed Central

    2012-01-01

    Background: Computational sequence analysis, that is, prediction of local sequence properties, homologs, spatial structure and function from the sequence of a protein, offers an efficient way to obtain needed information about proteins under study. Since reliable prediction is usually based on the consensus of many computer programs, meta-servers have been developed to fit such needs. Most meta-servers focus on one aspect of sequence analysis, while others incorporate more information, such as PredictProtein for local sequence feature predictions, SMART for domain architecture and sequence motif annotation, and GeneSilico for secondary and spatial structure prediction. However, as predictions of local sequence properties, three-dimensional structure and function are usually intertwined, it is beneficial to address them together. Results: We developed a MEta-Server for protein Sequence Analysis (MESSA) to facilitate comprehensive protein sequence analysis and gather structural and functional predictions for a protein of interest. For an input sequence, the server exploits a number of select tools to predict local sequence properties, such as secondary structure, structurally disordered regions, coiled coils, signal peptides and transmembrane helices; detect homologous proteins and assign the query to a protein family; identify three-dimensional structure templates and generate structure models; and provide predictive statements about the protein's function, including functional annotations, Gene Ontology terms, enzyme classification and possible functionally associated proteins. We tested MESSA on the proteome of Candidatus Liberibacter asiaticus. Manual curation shows that three-dimensional structure models generated by MESSA covered around 75% of all the residues in this proteome and the function of 80% of all proteins could be predicted. Availability: MESSA is free for non-commercial use at http://prodata.swmed.edu/MESSA/ PMID:23031578
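
    The idea of basing a prediction on the consensus of several programs can be shown with a per-residue majority vote. This is only a generic sketch of consensus voting, not a description of how MESSA combines its tools; the function name, the three-state H/E/C alphabet, and the example predictions are hypothetical.

```python
from collections import Counter


def consensus(predictions):
    """Per-residue majority vote over equal-length prediction strings.

    Each string holds one state per residue (for example H, E or C for a
    secondary-structure prediction). Ties go to the state that appears
    first in the column, i.e. the one from the earliest-listed predictor.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        state, _votes = Counter(column).most_common(1)[0]
        result.append(state)
    return "".join(result)


if __name__ == "__main__":
    # Three hypothetical predictors for a five-residue sequence.
    print(consensus(["HHHCC", "HHCCC", "HECCE"]))
```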

  4. A Systematic Analysis of Human Disease-Associated Gene Sequences In Drosophila melanogaster

    PubMed Central

    Reiter, Lawrence T.; Potocki, Lorraine; Chien, Sam; Gribskov, Michael; Bier, Ethan

    2001-01-01

    We performed a systematic BLAST analysis of 929 human disease gene entries associated with at least one mutant allele in the Online Mendelian Inheritance in Man (OMIM) database against the recently completed genome sequence of Drosophila melanogaster. The results of this search have been formatted as an updateable and searchable on-line database called Homophila. Our analysis identified 714 distinct human disease genes (77% of disease genes searched) matching 548 unique Drosophila sequences, which we have summarized by disease category. This breakdown into disease classes creates a picture of disease genes that are amenable to study using Drosophila as the model organism. Of the 548 Drosophila genes related to human disease genes, 153 are associated with known mutant alleles and 56 more are tagged by P-element insertions in or near the gene. Examples of how to use the database to identify Drosophila genes related to human disease genes are presented. We anticipate that cross-genomic analysis of human disease genes using the power of Drosophila second-site mo