Sample records for gene identification software

  1. Pattern identification in time-course gene expression data with the CoGAPS matrix factorization.

    PubMed

    Fertig, Elana J; Stein-O'Brien, Genevieve; Jaffe, Andrew; Colantuoni, Carlo

    2014-01-01

    Patterns in time-course gene expression data can represent the biological processes that are active over the measured time period. However, the orthogonality constraint in standard pattern-finding algorithms, including notably principal components analysis (PCA), confounds expression changes resulting from simultaneous, non-orthogonal biological processes. Previously, we have shown that Markov chain Monte Carlo nonnegative matrix factorization algorithms are particularly adept at distinguishing such concurrent patterns. One such matrix factorization is implemented in the software package CoGAPS. We describe the application of this software and several technical considerations for identification of age-related patterns in a public, prefrontal cortex gene expression dataset.

  2. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human, and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler) has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0 has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  3. Development of unidentified DNA-specific HIF 1α gene of lizard (Hemidactylus platyurus) which plays a role in tissue regeneration process

    NASA Astrophysics Data System (ADS)

    Novianti, T.; Sadikin, M.; Widia, S.; Juniantito, V.; Arida, E. A.

    2018-03-01

    Characterizing a previously unidentified gene is essential for analyzing its role in biological processes. Identifying the unknown DNA sequence of the HIF 1α gene is important for analyzing its contribution to tissue regeneration in the lizard tail (Hemidactylus platyurus). Bioinformatics and PCR techniques are relatively easy methods for identifying an unidentified gene. The most widely used method is BLAST (Basic Local Alignment Search Tool), which aligns sequences from other organisms. BLAST is online software, available at https://blast.ncbi.nlm.nih.gov/Blast.cgi, that can retrieve similar sequences from the closest to the most distant relatives. Gecko japonicus is the species most closely related to H. platyurus. The HIF 1α gene sequence of G. japonicus was compared with those of other species using multiple alignment methods in Mega7 software. Conserved regions were identified using the Clustal IX method. Primers for the HIF 1α gene were designed with Primer3 software. The HIF 1α gene of the lizard (H. platyurus) was successfully amplified on a real-time PCR machine using the primers designed from Gecko japonicus. The previously unidentified HIF 1α gene of the lizard was thus identified successfully with the multiple alignment method. The study analyzed tail growth on days 1, 3, 5, 7, 10, 13 and 17 after autotomy. Amplification of the HIF 1α gene was described by the CT value from the real-time PCR machine. HIF 1α gene expression was quantified with the Livak formula. The Chi-square test gave a value of 0.000, which indicates that HIF 1α gene expression differs between growth days.
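
    The Livak formula named in this abstract is usually written as 2^(-ΔΔCt). The sketch below shows that calculation under two assumptions the abstract does not state: that a reference (housekeeping) gene was measured alongside HIF 1α, and that amplification efficiency is close to 100%. The function name and the Ct values are hypothetical and serve only as an illustration.

```python
def livak_fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^(-ddCt) (Livak) method.

    Assumes roughly 100% PCR efficiency for target and reference genes.
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    # Compare the test condition with the control condition.
    delta_delta_ct = delta_ct_test - delta_ct_ctrl
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values (not taken from the study).
fold = livak_fold_change(24.0, 18.0, 26.0, 18.0)
print(fold)  # 4.0: the target is four times more abundant than in the control
```

    A result above 1 indicates higher expression than in the control condition, and a result below 1 indicates lower expression.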

  4. Synchronous versus asynchronous modeling of gene regulatory networks.

    PubMed

    Garg, Abhishek; Di Cara, Alessandro; Xenarios, Ioannis; Mendoza, Luis; De Micheli, Giovanni

    2008-09-01

    In silico modeling of gene regulatory networks has gained some momentum recently due to increased interest in analyzing the dynamics of biological systems. This has been further facilitated by the increasing availability of experimental data on gene-gene, protein-protein and gene-protein interactions. The two dynamical properties that are often experimentally testable are perturbations and stable steady states. Although a lot of work has been done on the identification of steady states, not much work has been reported on in silico modeling of cellular differentiation processes. In this manuscript, we provide algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks. Algorithms for synchronous and asynchronous transition models have been proposed and their corresponding computational properties have been analyzed. These algorithms allow users to compute cyclic attractors of large networks that are currently not feasible using existing software. Hereby we provide a framework to analyze the effect of multiple gene perturbation protocols, and their effect on cell differentiation processes. These algorithms were validated on the T-helper model showing the correct steady state identification and Th1-Th2 cellular differentiation process. The software binaries for Windows and Linux platforms can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
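
    To make the idea of steady states and cyclic attractors under a synchronous update concrete, here is a minimal brute-force sketch. It is not the ROBDD-based algorithm of the paper: it simply enumerates all 2^n states, which is only practical for very small networks. The two-gene mutual-inhibition network and the function name are hypothetical.

```python
from itertools import product


def synchronous_attractors(update_fns):
    """Enumerate attractors of a Boolean network under synchronous update.

    update_fns[i] maps the current state (a tuple of 0/1) to the next
    value of gene i. All genes are updated at the same time.
    """
    n = len(update_fns)
    attractors = set()
    for start in product((0, 1), repeat=n):
        seen = {}   # state -> position in the trajectory
        path = []
        state = start
        while state not in seen:
            seen[state] = len(path)
            path.append(state)
            state = tuple(f(state) for f in update_fns)
        # The trajectory re-entered a visited state: that suffix is a cycle.
        cycle = path[seen[state]:]
        # Rotate so the smallest state comes first; each cycle is stored once.
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


# Hypothetical toggle switch: each gene is repressed by the other.
toggle = [lambda s: 1 - s[1], lambda s: 1 - s[0]]
for attractor in sorted(synchronous_attractors(toggle)):
    kind = "steady state" if len(attractor) == 1 else "cyclic attractor"
    print(kind, attractor)
```

    For this toy network the two steady states are (0, 1) and (1, 0), while (0, 0) and (1, 1) alternate in a cycle of length two. That cycle exists only because both genes switch simultaneously, which is the kind of difference between synchronous and asynchronous models the abstract refers to.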

  5. Identification by 16S rRNA gene sequencing of an Actinomyces hongkongensis isolate recovered from a patient with pelvic actinomycosis.

    PubMed

    Flynn, A N; Lyndon, C A; Church, D L

    2013-08-01

    A case of Actinomyces hongkongensis pelvic actinomycosis in an adult woman is described. Conventional phenotypic tests failed to identify the Gram-positive bacillus isolated from a fluid aspirate of a pelvic abscess. The bacterium was identified by 16S rRNA gene sequencing and analysis using the SmartGene Integrated Database Network System software.

  6. Barcoding of fresh water fishes from Pakistan.

    PubMed

    Karim, Asma; Iqbal, Asad; Akhtar, Rehan; Rizwan, Muhammad; Amar, Ali; Qamar, Usman; Jahan, Shah

    2016-07-01

    DNA barcoding is a taxonomic method that uses small genetic markers in organisms' mitochondrial DNA (mtDNA) for identification of particular species. It uses sequence diversity in a 658-base pair fragment near the 5' end of the mitochondrial cytochrome c oxidase subunit 1 (CO1) gene as a tool for species identification. DNA barcoding is a more accurate and reliable method than morphological identification. It is equally useful in juvenile and adult stages of fishes. The present study was conducted to genetically identify three farm fish species of Pakistan (Cyprinus carpio, Cirrhinus mrigala, and Ctenopharyngodon idella). All of them belong to the family Cyprinidae. The CO1 gene was amplified. PCR products were sequenced and analyzed by bioinformatic software. Conspecific, congeneric, and confamilial K2P nucleotide divergence was estimated. From these findings, it was concluded that the CO1 gene sequence may serve as a milestone for the identification of related species at the molecular level.
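
    The K2P divergence mentioned here refers to the Kimura two-parameter distance, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences. The sketch below computes it for one pair of sequences; the function name and the toy sequences are hypothetical, and the handling of gaps (skipping them) is an assumption rather than what the study's software did.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (an assumption)
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical 10-base fragments differing by one transition (A<->G).
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))  # 0.1116
```

    Averaging such pairwise distances within species, within genera, and within families would give conspecific, congeneric, and confamilial divergence values of the kind the abstract reports.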

  7. Data on the genome-wide identification of CNL R-genes in Setaria italica (L.) P. Beauv.

    PubMed

    Andersen, Ethan J; Nepal, Madhav P

    2017-08-01

    We report data associated with the identification of 242 disease resistance genes (R-genes) in the genome of Setaria italica as presented in "Genetic diversity of disease resistance genes in foxtail millet (Setaria italica L.)" (Andersen and Nepal, 2017) [1]. Our data describe the structure and evolution of the Coiled-coil, Nucleotide-binding site, Leucine-rich repeat (CNL) R-genes in foxtail millet. The CNL genes were identified through rigorous extraction and analysis of recently available plant genome sequences using cutting-edge analytical software. Data visualization includes gene structure diagrams, chromosomal syntenic maps, a chromosomal density plot, and a maximum-likelihood phylogenetic tree comparing Sorghum bicolor, Panicum virgatum, Setaria italica, and Arabidopsis thaliana. Compilation of InterProScan annotations, Gene Ontology (GO) annotations, and Basic Local Alignment Search Tool (BLAST) results for the 242 R-genes identified in the foxtail millet genome are also included in tabular format.

  8. Identification by 16S rRNA Gene Sequencing of an Actinomyces hongkongensis Isolate Recovered from a Patient with Pelvic Actinomycosis

    PubMed Central

    Flynn, A. N.; Lyndon, C. A.

    2013-01-01

    A case of Actinomyces hongkongensis pelvic actinomycosis in an adult woman is described. Conventional phenotypic tests failed to identify the Gram-positive bacillus isolated from a fluid aspirate of a pelvic abscess. The bacterium was identified by 16S rRNA gene sequencing and analysis using the SmartGene Integrated Database Network System software. PMID:23698532

  9. Simultaneous Identification of Multiple Driver Pathways in Cancer

    PubMed Central

    Leiserson, Mark D. M.; Blokh, Dima

    2013-01-01

    Distinguishing the somatic mutations responsible for cancer (driver mutations) from random, passenger mutations is a key challenge in cancer genomics. Driver mutations generally target cellular signaling and regulatory pathways consisting of multiple genes. This heterogeneity complicates the identification of driver mutations by their recurrence across samples, as different combinations of mutations in driver pathways are observed in different samples. We introduce the Multi-Dendrix algorithm for the simultaneous identification of multiple driver pathways de novo in somatic mutation data from a cohort of cancer samples. The algorithm relies on two combinatorial properties of mutations in a driver pathway: high coverage and mutual exclusivity. We derive an integer linear program that finds set of mutations exhibiting these properties. We apply Multi-Dendrix to somatic mutations from glioblastoma, breast cancer, and lung cancer samples. Multi-Dendrix identifies sets of mutations in genes that overlap with known pathways – including Rb, p53, PI(3)K, and cell cycle pathways – and also novel sets of mutually exclusive mutations, including mutations in several transcription factors or other genes involved in transcriptional regulation. These sets are discovered directly from mutation data with no prior knowledge of pathways or gene interactions. We show that Multi-Dendrix outperforms other algorithms for identifying combinations of mutations and is also orders of magnitude faster on genome-scale data. Software available at: http://compbio.cs.brown.edu/software. PMID:23717195
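
    The two combinatorial properties named in this abstract, coverage and mutual exclusivity, can be scored for a candidate gene set as sketched below. The weight used here, covered samples minus overlapping mutations, follows the scoring commonly described for the earlier Dendrix method and is an assumption about the objective, not something the abstract states. The integer linear program itself is not reproduced, and the sample data and function name are hypothetical.

```python
def coverage_and_overlap(mutations, gene_set):
    """Score a gene set for coverage and mutual exclusivity.

    mutations: dict mapping sample id -> set of mutated genes.
    Returns (covered, overlap, weight) where
      covered = samples with at least one mutation in gene_set,
      overlap = extra mutations beyond one per covered sample,
      weight  = covered - overlap (higher is better).
    """
    gene_set = set(gene_set)
    covered = 0
    total_hits = 0
    for genes in mutations.values():
        hits = len(genes & gene_set)
        total_hits += hits
        if hits:
            covered += 1
    overlap = total_hits - covered
    return covered, overlap, covered - overlap


# Hypothetical cohort of five samples.
cohort = {
    "s1": {"A"},
    "s2": {"B"},
    "s3": {"C"},
    "s4": {"A", "B"},  # violates exclusivity once
    "s5": {"D"},
}
print(coverage_and_overlap(cohort, {"A", "B", "C"}))  # (4, 1, 3)
```

    A perfectly exclusive set has an overlap of zero, so its weight equals its coverage.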

  10. The identification of key genes and pathways in hepatocellular carcinoma by bioinformatics analysis of high-throughput data.

    PubMed

    Zhang, Chaoyang; Peng, Li; Zhang, Yaqin; Liu, Zhaoyang; Li, Wenling; Chen, Shilian; Li, Guancheng

    2017-06-01

    Liver cancer is a serious threat to public health and has fairly complicated pathogenesis. Therefore, the identification of key genes and pathways is of much importance for clarifying molecular mechanism of hepatocellular carcinoma (HCC) initiation and progression. HCC-associated gene expression dataset was downloaded from Gene Expression Omnibus database. Statistical software R was used for significance analysis of differentially expressed genes (DEGs) between liver cancer samples and normal samples. Gene Ontology (GO) term enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, based on R software, were applied for the identification of pathways in which DEGs significantly enriched. Cytoscape software was used for the construction of protein-protein interaction (PPI) network and module analysis to find the hub genes and key pathways. Finally, weighted correlation network analysis (WGCNA) was conducted to further screen critical gene modules with similar expression pattern and explore their biological significance. Significance analysis identified 1230 DEGs with fold change >2, including 632 significantly down-regulated DEGs and 598 significantly up-regulated DEGs. GO term enrichment analysis suggested that up-regulated DEG significantly enriched in immune response, cell adhesion, cell migration, type I interferon signaling pathway, and cell proliferation, and the down-regulated DEG mainly enriched in response to endoplasmic reticulum stress and endoplasmic reticulum unfolded protein response. KEGG pathway analysis found DEGs significantly enriched in five pathways including complement and coagulation cascades, focal adhesion, ECM-receptor interaction, antigen processing and presentation, and protein processing in endoplasmic reticulum. The top 10 hub genes in HCC were separately GMPS, ACACA, ALB, TGFB1, KRAS, ERBB2, BCL2, EGFR, STAT3, and CD8A, which resulted from PPI network. The top 3 gene interaction modules in PPI network enriched in immune response, organ development, and response to other organism, respectively. WGCNA revealed that the confirmed eight gene modules significantly enriched in monooxygenase and oxidoreductase activity, response to endoplasmic reticulum stress, type I interferon signaling pathway, processing, presentation and binding of peptide antigen, cellular response to cadmium and zinc ion, cell locomotion and differentiation, ribonucleoprotein complex and RNA processing, and immune system process, respectively. In conclusion, we identified some key genes and pathways closely related with HCC initiation and progression by a series of bioinformatics analysis on DEGs. These screened genes and pathways provided for a more detailed molecular mechanism underlying HCC occurrence and progression, holding promise for acting as biomarkers and potential therapeutic targets.

  11. Unraveling transcriptional control and cis-regulatory codes using the software suite GeneACT

    PubMed Central

    Cheung, Tom Hiu; Kwan, Yin Lam; Hamady, Micah; Liu, Xuedong

    2006-01-01

    Deciphering gene regulatory networks requires the systematic identification of functional cis-acting regulatory elements. We present a suite of web-based bioinformatics tools, called GeneACT, that can rapidly detect evolutionarily conserved transcription factor binding sites or microRNA target sites that are either unique or over-represented in differentially expressed genes from DNA microarray data. GeneACT provides graphic visualization and extraction of common regulatory sequence elements in the promoters and 3'-untranslated regions that are conserved across multiple mammalian species. PMID:17064417

  12. QSAR Study for Carcinogenic Potency of Aromatic Amines Based on GEP and MLPs

    PubMed Central

    Song, Fucheng; Zhang, Anling; Liang, Hui; Cui, Lianhua; Li, Wenlian; Si, Hongzong; Duan, Yunbo; Zhai, Honglin

    2016-01-01

    A new analysis strategy was used to classify the carcinogenicity of aromatic amines. The physical-chemical parameters are closely related to the carcinogenicity of compounds. Quantitative structure activity relationship (QSAR) is a method of predicting the carcinogenicity of aromatic amines, which can reveal the relationship between carcinogenicity and physical-chemical parameters. This study used gene expression programming, via APS software, and multilayer perceptrons, via Weka software, to predict the carcinogenicity of aromatic amines. Both methods relied on molecular descriptors calculated by CODESSA software, and eight molecular descriptors were selected to build function equations. The accuracy of gene expression programming on the training and test sets was 0.92 and 0.82, and the accuracy of multilayer perceptrons on the training and test sets was 0.84 and 0.74, respectively. Gene expression programming thus outperformed multilayer perceptrons on both the training set and the test set. QSAR is a highly efficient method for the identification of carcinogenic compounds. PMID:27854309

  13. Development of a RAD-Seq Based DNA Polymorphism Identification Software, AgroMarker Finder, and Its Application in Rice Marker-Assisted Breeding

    PubMed Central

    Luo, Zhijing; Chen, Mingjiao; Zhao, Xiangxiang; Zhang, Dabing; Qi, Yiping; Yuan, Zheng

    2016-01-01

    Rapid and accurate genome-wide marker detection is essential to the marker-assisted breeding and functional genomics studies. In this work, we developed an integrated software, AgroMarker Finder (AMF: http://erp.novelbio.com/AMF), for providing graphical user interface (GUI) to facilitate the recently developed restriction-site associated DNA (RAD) sequencing data analysis in rice. By application of AMF, a total of 90,743 high-quality markers (82,878 SNPs and 7,865 InDels) were detected between rice varieties JP69 and Jiaoyuan5A. The density of the identified markers is 0.2 per Kb for SNP markers, and 0.02 per Kb for InDel markers. Sequencing validation revealed that the accuracy of genome-wide marker detection by AMF is 93%. In addition, a validated subset of 82 SNPs and 31 InDels were found to be closely linked to 117 important agronomic trait genes, providing a basis for subsequent marker-assisted selection (MAS) and variety identification. Furthermore, we selected 12 markers from 31 validated InDel markers to identify seed authenticity of variety Jiaoyuanyou69, and we also identified 10 markers closely linked to the fragrant gene BADH2 to minimize linkage drag for Wuxiang075 (BADH2 donor)/Jiachang1 recombinants selection. Therefore, this software provides an efficient approach for marker identification from RAD-seq data, and it would be a valuable tool for plant MAS and variety protection. PMID:26799713

  14. Development of a RAD-Seq Based DNA Polymorphism Identification Software, AgroMarker Finder, and Its Application in Rice Marker-Assisted Breeding.

    PubMed

    Fan, Wei; Zong, Jie; Luo, Zhijing; Chen, Mingjiao; Zhao, Xiangxiang; Zhang, Dabing; Qi, Yiping; Yuan, Zheng

    2016-01-01

    Rapid and accurate genome-wide marker detection is essential to the marker-assisted breeding and functional genomics studies. In this work, we developed an integrated software, AgroMarker Finder (AMF: http://erp.novelbio.com/AMF), for providing graphical user interface (GUI) to facilitate the recently developed restriction-site associated DNA (RAD) sequencing data analysis in rice. By application of AMF, a total of 90,743 high-quality markers (82,878 SNPs and 7,865 InDels) were detected between rice varieties JP69 and Jiaoyuan5A. The density of the identified markers is 0.2 per Kb for SNP markers, and 0.02 per Kb for InDel markers. Sequencing validation revealed that the accuracy of genome-wide marker detection by AMF is 93%. In addition, a validated subset of 82 SNPs and 31 InDels were found to be closely linked to 117 important agronomic trait genes, providing a basis for subsequent marker-assisted selection (MAS) and variety identification. Furthermore, we selected 12 markers from 31 validated InDel markers to identify seed authenticity of variety Jiaoyuanyou69, and we also identified 10 markers closely linked to the fragrant gene BADH2 to minimize linkage drag for Wuxiang075 (BADH2 donor)/Jiachang1 recombinants selection. Therefore, this software provides an efficient approach for marker identification from RAD-seq data, and it would be a valuable tool for plant MAS and variety protection.

  15. Rapid identification and classification of Mycobacterium spp. using whole-cell protein barcodes with matrix assisted laser desorption ionization time of flight mass spectrometry in comparison with multigene phylogenetic analysis.

    PubMed

    Wang, Jun; Chen, Wen Feng; Li, Qing X

    2012-02-24

    The need of quick diagnostics and increasing number of bacterial species isolated necessitate development of a rapid and effective phenotypic identification method. Mass spectrometry (MS) profiling of whole cell proteins has potential to satisfy the requirements. The genus Mycobacterium contains more than 154 species that are taxonomically very close and require use of multiple genes including 16S rDNA for phylogenetic identification and classification. Six strains of five Mycobacterium species were selected as model bacteria in the present study because of their 16S rDNA similarity (98.4-99.8%) and the high similarity of the concatenated 16S rDNA, rpoB and hsp65 gene sequences (95.9-99.9%), requiring high identification resolution. The classification of the six strains by MALDI TOF MS protein barcodes was consistent with, but at much higher resolution than, that of the multi-locus sequence analysis of using 16S rDNA, rpoB and hsp65. The species were well differentiated using MALDI TOF MS and MALDI BioTyper™ software after quick preparation of whole-cell proteins. Several proteins were selected as diagnostic markers for species confirmation. An integration of MALDI TOF MS, MALDI BioTyper™ software and diagnostic protein fragments provides a robust phenotypic approach for bacterial identification and classification. Copyright © 2011 Elsevier B.V. All rights reserved.

  16. Identification of a novel CLRN1 gene mutation in Usher syndrome type 3: two case reports.

    PubMed

    Yoshimura, Hidekane; Oshikawa, Chie; Nakayama, Jun; Moteki, Hideaki; Usami, Shin-Ichi

    2015-05-01

    This study examines the CLRN1 gene mutation analysis in Japanese patients who were diagnosed with Usher syndrome type 3 (USH3) on the basis of clinical findings. Genetic analysis using massively parallel DNA sequencing (MPS) was conducted to search for 9 causative USH genes in 2 USH3 patients. We identified the novel pathogenic mutation in the CLRN1 gene in 2 patients. The missense mutation was confirmed by functional prediction software and segregation analysis. Both patients were diagnosed as having USH3 caused by the CLRN1 gene mutation. This is the first report of USH3 with a CLRN1 gene mutation in Asian populations. Validating the presence of clinical findings is imperative for properly differentiating among USH subtypes. In addition, mutation screening using MPS enables the identification of causative mutations in USH. The clinical diagnosis of this phenotypically variable disease can then be confirmed. © The Author(s) 2015.

  17. FunRich proteomics software analysis, let the fun begin!

    PubMed

    Benito-Martin, Alberto; Peinado, Héctor

    2015-08-01

    Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. KinSNP software for homozygosity mapping of disease genes using SNP microarrays

    PubMed Central

    2010-01-01

    Consanguineous families affected with a recessive genetic disease caused by homozygotisation of a mutation offer a unique advantage for positional cloning of rare diseases. Homozygosity mapping of patient genotypes is a powerful technique for the identification of the genomic locus harbouring the causing mutation. This strategy relies on the observation that in these patients a large region spanning the disease locus is also homozygous with high probability. The high marker density in single nucleotide polymorphism (SNP) arrays is extremely advantageous for homozygosity mapping. We present KinSNP, a user-friendly software tool for homozygosity mapping using SNP arrays. The software searches for stretches of SNPs which are homozygous to the same allele in all ascertained sick individuals. User-specified parameters control the number of allowed genotyping 'errors' within homozygous blocks. Candidate disease regions are then reported in a detailed, coloured Excel file, along with genotypes of family members and healthy controls. An interactive genome browser has been included which shows homozygous blocks, individual genotypes, genes and further annotations along the chromosomes, with zooming and scrolling capabilities. The software has been used to identify the location of a mutated gene causing insensitivity to pain in a large Bedouin family. KinSNP is freely available from http://bioinfo.bgu.ac.il/bsu/software/kinSNP. PMID:20846928
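
    The core search described in this abstract, stretches of SNPs at which every affected individual is homozygous for the same allele, with a tolerated number of genotyping errors, can be sketched as follows. This is a simplified greedy scan written for illustration, not KinSNP's actual algorithm; the genotype encoding ("AA", "AB", "BB"), the parameter names, and the example data are all assumptions.

```python
def shared_homozygous_blocks(genotypes, min_snps=3, max_errors=0):
    """Find runs of SNPs homozygous for the same allele in all patients.

    genotypes: list over SNPs (in chromosomal order); each entry is a
    non-empty list of genotype calls, one per affected individual.
    Returns a list of (start_index, end_index) blocks, inclusive.
    """
    # A SNP is consistent if all calls are identical and homozygous.
    consistent = [
        len(set(calls)) == 1 and calls[0][0] == calls[0][1]
        for calls in genotypes
    ]
    blocks = []
    n = len(consistent)
    i = 0
    while i < n:
        if not consistent[i]:
            i += 1
            continue
        start = end = i
        errors = 0
        j = i + 1
        while j < n:
            if consistent[j]:
                end = j  # extend the block to this SNP
            else:
                errors += 1
                if errors > max_errors:
                    break
            j += 1
        if end - start + 1 >= min_snps:
            blocks.append((start, end))
        i = end + 1
    return blocks


# Hypothetical calls for two affected individuals at eight SNPs.
calls = [
    ["AA", "AA"], ["BB", "BB"], ["AA", "AB"], ["AA", "AA"],
    ["BB", "BB"], ["AB", "AB"], ["AA", "BB"], ["AA", "AA"],
]
print(shared_homozygous_blocks(calls, min_snps=2, max_errors=0))  # [(0, 1), (3, 4)]
print(shared_homozygous_blocks(calls, min_snps=2, max_errors=1))  # [(0, 4)]
```

    Allowing one error merges the two short blocks across the inconsistent SNP at index 2, which illustrates why a user-specified error tolerance matters when genotype calls are noisy.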

  19. oPOSSUM-3: Advanced Analysis of Regulatory Motif Over-Representation Across Genes or ChIP-Seq Datasets

    PubMed Central

    Kwon, Andrew T.; Arenillas, David J.; Hunt, Rebecca Worsley; Wasserman, Wyeth W.

    2012-01-01

    oPOSSUM-3 is a web-accessible software system for identification of over-represented transcription factor binding sites (TFBS) and TFBS families in either DNA sequences of co-expressed genes or sequences generated from high-throughput methods, such as ChIP-Seq. Validation of the system with known sets of co-regulated genes and published ChIP-Seq data demonstrates the capacity for oPOSSUM-3 to identify mediating transcription factors (TF) for co-regulated genes or co-recovered sequences. oPOSSUM-3 is available at http://opossum.cisreg.ca. PMID:22973536

  20. oPOSSUM-3: advanced analysis of regulatory motif over-representation across genes or ChIP-Seq datasets.

    PubMed

    Kwon, Andrew T; Arenillas, David J; Worsley Hunt, Rebecca; Wasserman, Wyeth W

    2012-09-01

    oPOSSUM-3 is a web-accessible software system for identification of over-represented transcription factor binding sites (TFBS) and TFBS families in either DNA sequences of co-expressed genes or sequences generated from high-throughput methods, such as ChIP-Seq. Validation of the system with known sets of co-regulated genes and published ChIP-Seq data demonstrates the capacity for oPOSSUM-3 to identify mediating transcription factors (TF) for co-regulated genes or co-recovered sequences. oPOSSUM-3 is available at http://opossum.cisreg.ca.

  1. Autonomous system for Web-based microarray image analysis.

    PubMed

    Bozinov, Daniel

    2003-12-01

    Software-based feature extraction from DNA microarray images still requires human intervention on various levels. Manual adjustment of grid and metagrid parameters, precise alignment of superimposed grid templates and gene spots, or simply identification of large-scale artifacts have to be performed beforehand to reliably analyze DNA signals and correctly quantify their expression values. Ideally, a Web-based system with input solely confined to a single microarray image and a data table as output containing measurements for all gene spots would directly transform raw image data into abstracted gene expression tables. Sophisticated algorithms with advanced procedures for iterative correction function can overcome imminent challenges in image processing. Herein is introduced an integrated software system with a Java-based interface on the client side that allows for decentralized access and furthermore enables the scientist to instantly employ the most updated software version at any given time. This software tool is extended from PixClust as used in Extractiff incorporated with Java Web Start deployment technology. Ultimately, this setup is destined for high-throughput pipelines in genome-wide medical diagnostics labs or microarray core facilities aimed at providing fully automated service to its users.

  2. BATS: a Bayesian user-friendly software for analyzing time series microarray experiments.

    PubMed

    Angelini, Claudia; Cutillo, Luisa; De Canditiis, Daniela; Mutarelli, Margherita; Pensky, Marianna

    2008-10-06

    Gene expression levels in a given cell can be influenced by different factors, namely pharmacological or medical treatments. The response to a given stimulus is usually different for different genes and may depend on time. One of the goals of modern molecular biology is the high-throughput identification of genes associated with a particular treatment or a biological process of interest. From a methodological and computational point of view, analyzing high-dimensional time course microarray data requires a very specific set of tools which are usually not included in standard software packages. Recently, the authors of this paper developed a fully Bayesian approach which allows one to identify differentially expressed genes in a 'one-sample' time-course microarray experiment, to rank them and to estimate their expression profiles. The method is based on explicit expressions for calculations and is, hence, very computationally efficient. The software package BATS (Bayesian Analysis of Time Series) presented here implements the methodology described above. It allows a user to automatically identify and rank differentially expressed genes and to estimate their expression profiles when at least 5-6 time points are available. The package has a user-friendly interface. BATS successfully manages various technical difficulties which arise in time-course microarray experiments, such as a small number of observations, non-uniform sampling intervals and replicated or missing data. BATS is free, user-friendly software for the analysis of both simulated and real microarray time course experiments. The software, the user manual and a brief illustrative example are freely available online at the BATS website: http://www.na.iac.cnr.it/bats.

  3. Identification of suitable qPCR reference genes in leaves of Brassica oleracea under abiotic stresses.

    PubMed

    Brulle, Franck; Bernard, Fabien; Vandenbulcke, Franck; Cuny, Damien; Dumez, Sylvain

    2014-04-01

    Real-time quantitative PCR is nowadays a standard method to study gene expression variations in various samples and experimental conditions. However, to interpret results accurately, data normalization with appropriate reference genes appears to be crucial. The present study describes the identification and the validation of suitable reference genes in Brassica oleracea leaves. Expression stability of eight candidates was tested following drought and cold abiotic stresses by using three different software tools (BestKeeper, NormFinder and geNorm). Four genes (BolC.TUB6, BolC.SAND1, BolC.UBQ2 and BolC.TBP1) emerged as the most stable across the tested conditions. Further gene expression analysis of a drought- and a cold-responsive gene (BolC.DREB2A and BolC.ELIP, respectively) confirmed the stability and the reliability of the identified reference genes when used for normalization in the leaves of B. oleracea. These four genes were finally tested upon benzene exposure and all appeared to be useful reference genes under this toxicological condition. These results provide a good starting point for future studies involving gene expression measurement on leaves of B. oleracea exposed to environmental modifications.
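
    As an illustration of how expression stability can be ranked, the sketch below computes a geNorm-style stability measure M: for each candidate gene, the average over all other candidates of the standard deviation of the log2 expression ratios across samples. It is written from the commonly cited definition of M and is not the geNorm program itself; the gene names and relative quantities are hypothetical and are not data from the study.

```python
import math
from statistics import stdev


def genorm_m_values(expression):
    """geNorm-style stability measure M for each candidate gene.

    expression: dict mapping gene -> list of relative quantities,
    with samples in the same order for every gene.
    A lower M means more stable expression.
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        variations = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            # Pairwise variation: spread of the ratio across samples.
            variations.append(stdev(log_ratios))
        m_values[gene] = sum(variations) / len(variations)
    return m_values


# Hypothetical relative quantities in three samples.
quantities = {
    "refA": [1.0, 2.0, 4.0],
    "refB": [2.0, 4.0, 8.0],  # constant ratio to refA
    "refC": [1.0, 1.0, 8.0],  # varies relative to the others
}
for gene, m in sorted(genorm_m_values(quantities).items(), key=lambda kv: kv[1]):
    print(gene, round(m, 3))
```

    In this toy data set refA and refB share the lowest M because their ratio never changes, whereas refC would be ranked as the least stable candidate.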

  4. Development of a miniaturized DNA microarray for identification of 66 virulence genes of Legionella pneumophila.

    PubMed

    Żak, Mariusz; Zaborowski, Piotr; Baczewska-Rej, Milena; Zasada, Aleksandra A; Matuszewska, Renata; Krogulska, Bożena

    2011-12-20

    For the last five years, Legionella sp. infections and legionnaire's disease in Poland have been receiving a lot of attention, because of the new regulations concerning microbiological quality of drinking water. This was the inspiration to search for and develop a new assay to identify many virulence genes of Legionella pneumophila to better understand their distribution in environmental and clinical strains. The method might be an invaluable help in infection risk assessment and in epidemiological investigations. The microarray is based on Array Tube technology. It contains 3 positive and 1 negative control. Target genes encode structural elements of T4SS, effector proteins and factors not related to T4SS. Probes were designed using OligoWiz software and data analyzed using IconoClust software. To isolate environmental and clinical strains, BAL samples and samples of hot water from different and independent hot water distribution systems of public utility buildings were collected. We have developed a miniaturized DNA microarray for identification of 66 virulence genes of L. pneumophila. The assay is specific to L. pneumophila sg 1 with sensitivity sufficient to perform the assay using DNA isolated from a single L. pneumophila colony. Seven environmental strains were analyzed. Two exhibited a hybridization pattern distinct from the reference strain. The method is time- and cost-effective. Initial studies have shown that genes encoding effector proteins may vary among environmental strains. Further studies might help to identify set of genes increasing the risk of clinical disease and to determine the pathogenic potential of environmental strains.

  5. GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data.

    PubMed

    Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E

    2016-03-11

    Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.

  6. Identification of Importin 8 (IPO8) as the most accurate reference gene for the clinicopathological analysis of lung specimens

    PubMed Central

    Nguewa, Paul A; Agorreta, Jackeline; Blanco, David; Lozano, Maria Dolores; Gomez-Roman, Javier; Sanchez, Blas A; Valles, Iñaki; Pajares, Maria J; Pio, Ruben; Rodriguez, Maria Jose; Montuenga, Luis M; Calvo, Alfonso

    2008-01-01

    Background The accurate normalization of differentially expressed genes in lung cancer is essential for the identification of novel therapeutic targets and biomarkers by real time RT-PCR and microarrays. Although classical "housekeeping" genes, such as GAPDH, HPRT1, and beta-actin have been widely used in the past, their accuracy as reference genes for lung tissues has not been proven. Results We have conducted a thorough analysis of a panel of 16 candidate reference genes for lung specimens and lung cell lines. Gene expression was measured by quantitative real time RT-PCR and expression stability was analyzed with the softwares GeNorm and NormFinder, mean of |ΔCt| (= |Ct Normal-Ct tumor|) ± SEM, and correlation coefficients among genes. Systematic comparison between candidates led us to the identification of a subset of suitable reference genes for clinical samples: IPO8, ACTB, POLR2A, 18S, and PPIA. Further analysis showed that IPO8 had a very low mean of |ΔCt| (0.70 ± 0.09), with no statistically significant differences between normal and malignant samples and with excellent expression stability. Conclusion Our data show that IPO8 is the most accurate reference gene for clinical lung specimens. In addition, we demonstrate that the commonly used genes GAPDH and HPRT1 are inappropriate to normalize data derived from lung biopsies, although they are suitable as reference genes for lung cell lines. We thus propose IPO8 as a novel reference gene for lung cancer samples. PMID:19014639
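
    The mean of |ΔCt| ± SEM statistic quoted above can be sketched as follows. This is a minimal sketch that assumes paired normal and tumor Ct values for one gene, with one pair per patient; the function name and the Ct values are hypothetical and not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def mean_abs_delta_ct(ct_normal, ct_tumor):
    """Return (mean, SEM) of |Ct normal - Ct tumor| over paired samples."""
    if len(ct_normal) != len(ct_tumor):
        raise ValueError("normal and tumor Ct lists must be paired")
    deltas = [abs(n - t) for n, t in zip(ct_normal, ct_tumor)]
    # SEM = sample standard deviation / sqrt(number of pairs).
    sem = stdev(deltas) / sqrt(len(deltas))
    return mean(deltas), sem


# Hypothetical Ct values for one candidate gene in four patients.
ct_normal = [24.0, 25.0, 26.0, 24.5]
ct_tumor = [24.5, 24.0, 26.5, 25.5]
m, sem = mean_abs_delta_ct(ct_normal, ct_tumor)
print(f"mean |dCt| = {m:.2f} +/- {sem:.2f}")
```

    A candidate with a small mean |ΔCt| changes little between normal and malignant tissue, which is the property the abstract reports for IPO8.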

  7. KinSNP software for homozygosity mapping of disease genes using SNP microarrays.

    PubMed

    Amir, El-Ad David; Bartal, Ofer; Morad, Efrat; Nagar, Tal; Sheynin, Jony; Parvari, Ruti; Chalifa-Caspi, Vered

    2010-08-01

    Consanguineous families affected with a recessive genetic disease caused by homozygotisation of a mutation offer a unique advantage for positional cloning of rare diseases. Homozygosity mapping of patient genotypes is a powerful technique for the identification of the genomic locus harbouring the causing mutation. This strategy relies on the observation that in these patients a large region spanning the disease locus is also homozygous with high probability. The high marker density in single nucleotide polymorphism (SNP) arrays is extremely advantageous for homozygosity mapping. We present KinSNP, a user-friendly software tool for homozygosity mapping using SNP arrays. The software searches for stretches of SNPs which are homozygous to the same allele in all ascertained sick individuals. User-specified parameters control the number of allowed genotyping 'errors' within homozygous blocks. Candidate disease regions are then reported in a detailed, coloured Excel file, along with genotypes of family members and healthy controls. An interactive genome browser has been included which shows homozygous blocks, individual genotypes, genes and further annotations along the chromosomes, with zooming and scrolling capabilities. The software has been used to identify the location of a mutated gene causing insensitivity to pain in a large Bedouin family. KinSNP is freely available from.
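
    The search for stretches of SNPs that are homozygous for the same allele in all affected individuals can be sketched as below. This is a simplified greedy left-to-right scan written for illustration and is not KinSNP's actual algorithm, which the abstract does not describe; the genotype encoding ("AA", "AB", "BB"), the function names and the parameters max_errors and min_snps are assumptions, and missing genotypes are not handled.

```python
def shared_homozygous(genotypes_at_snp):
    """True if every individual is homozygous for the same allele at this SNP."""
    first = genotypes_at_snp[0]
    is_homozygous = len(first) == 2 and first[0] == first[1]
    return is_homozygous and all(g == first for g in genotypes_at_snp)


def homozygous_blocks(affected, max_errors=0, min_snps=3):
    """Return (start, end) SNP index pairs of shared homozygous blocks.

    affected: one genotype list per affected individual, all ordered by
    chromosomal position. A block may contain up to max_errors SNPs that
    break the pattern; blocks start and end on a consistent SNP.
    """
    n_snps = len(affected[0])
    ok = [shared_homozygous([ind[i] for ind in affected]) for i in range(n_snps)]
    blocks = []
    i = 0
    while i < n_snps:
        if not ok[i]:
            i += 1
            continue
        start = last_ok = i
        errors = 0
        j = i + 1
        while j < n_snps:
            if ok[j]:
                last_ok = j
            else:
                errors += 1
                if errors > max_errors:
                    break
            j += 1
        if last_ok - start + 1 >= min_snps:
            blocks.append((start, last_ok))
        i = last_ok + 1  # continue scanning after the block just closed
    return blocks


# Hypothetical genotypes for two affected relatives at nine consecutive SNPs.
affected = [
    ["AA", "AB", "BB", "BB", "AB", "BB", "BB", "AA", "AB"],
    ["AA", "AA", "BB", "BB", "BB", "BB", "BB", "AB", "AB"],
]
strict = homozygous_blocks(affected, max_errors=0, min_snps=2)
tolerant = homozygous_blocks(affected, max_errors=1, min_snps=2)
```

    Raising max_errors lets a block absorb an isolated inconsistent SNP, which mirrors the user-specified tolerance for genotyping 'errors' mentioned in the abstract. A production tool would also weigh physical distance and compare against healthy controls.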

  8. Computational Identification of Novel Genes: Current and Future Perspectives.

    PubMed

    Klasberg, Steffen; Bitard-Feildel, Tristan; Mallet, Ludovic

    2016-01-01

    While it has long been thought that all genomic novelties are derived from the existing material, many genes lacking homology to known genes were found in recent genome projects. Some of these novel genes were proposed to have evolved de novo, ie, out of noncoding sequences, whereas some have been shown to follow a duplication and divergence process. Their discovery called for an extension of the historical hypotheses about gene origination. Besides the theoretical breakthrough, increasing evidence accumulated that novel genes play important roles in evolutionary processes, including adaptation and speciation events. Different techniques are available to identify genes and classify them as novel. Their classification as novel is usually based on their similarity to known genes, or lack thereof, detected by comparative genomics or against databases. Computational approaches are further prime methods that can be based on existing models or leveraging biological evidences from experiments. Identification of novel genes remains however a challenging task. With the constant software and technologies updates, no gold standard, and no available benchmark, evaluation and characterization of genomic novelty is a vibrant field. In this review, the classical and state-of-the-art tools for gene prediction are introduced. The current methods for novel gene detection are presented; the methodological strategies and their limits are discussed along with perspective approaches for further studies.

  9. VirtualPlant: A Software Platform to Support Systems Biology Research

    PubMed Central

    Katari, Manpreet S.; Nowicki, Steve D.; Aceituno, Felipe F.; Nero, Damion; Kelfer, Jonathan; Thompson, Lee Parnell; Cabello, Juan M.; Davidson, Rebecca S.; Goldberg, Arthur P.; Shasha, Dennis E.; Coruzzi, Gloria M.; Gutiérrez, Rodrigo A.

    2010-01-01

    Data generation is no longer the limiting factor in advancing biological research. In addition, data integration, analysis, and interpretation have become key bottlenecks and challenges that biologists conducting genomic research face daily. To enable biologists to derive testable hypotheses from the increasing amount of genomic data, we have developed the VirtualPlant software platform. VirtualPlant enables scientists to visualize, integrate, and analyze genomic data from a systems biology perspective. VirtualPlant integrates genome-wide data concerning the known and predicted relationships among genes, proteins, and molecules, as well as genome-scale experimental measurements. VirtualPlant also provides visualization techniques that render multivariate information in visual formats that facilitate the extraction of biological concepts. Importantly, VirtualPlant helps biologists who are not trained in computer science to mine lists of genes, microarray experiments, and gene networks to address questions in plant biology, such as: What are the molecular mechanisms by which internal or external perturbations affect processes controlling growth and development? We illustrate the use of VirtualPlant with three case studies, ranging from querying a gene of interest to the identification of gene networks and regulatory hubs that control seed development. Whereas the VirtualPlant software was developed to mine Arabidopsis (Arabidopsis thaliana) genomic data, its data structures, algorithms, and visualization tools are designed in a species-independent way. VirtualPlant is freely available at www.virtualplant.org. PMID:20007449

  10. A parallel implementation of the network identification by multiple regression (NIR) algorithm to reverse-engineer regulatory gene networks.

    PubMed

    Gregoretti, Francesco; Belcastro, Vincenzo; di Bernardo, Diego; Oliva, Gennaro

    2010-04-21

    The reverse engineering of gene regulatory networks using gene expression profile data has become crucial to gain novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computational-intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However it cannot be used with large networks with thousands of nodes--as is the case in biological networks--due to the high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm reaches a very good accuracy even for large gene networks, improving our understanding of the gene regulatory networks that is crucial for a wide range of biomedical applications.

  11. Identification of neuron-related genes for cell therapy of neurological disorders by network analysis.

    PubMed

    Su, Li-Ning; Song, Xiao-Qing; Wei, Hui-Ping; Yin, Hai-Feng

    Bone mesenchymal stem cells (BMSCs) differentiated into neurons have been widely proposed for use in cell therapy of many neurological disorders. It is therefore important to understand the molecular mechanisms underlying this differentiation. We screened differentially expressed genes between immature neural tissues and untreated BMSCs to identify the genes responsible for neuronal differentiation from BMSCs. GSE68243 gene microarray data of rat BMSCs and GSE18860 gene microarray data of rat neurons were received from the Gene Expression Omnibus database. Transcriptome Analysis Console software showed that 1248 genes were up-regulated and 1273 were down-regulated in neurons compared with BMSCs. Gene Ontology functional enrichment, protein-protein interaction networks, functional modules, and hub genes were analyzed using DAVID, STRING 10, BiNGO tool, and Network Analyzer software, revealing that nine hub genes, Nrcam, Sema3a, Mapk8, Dlg4, Slit1, Creb1, Ntrk2, Cntn2, and Pax6, may play a pivotal role in neuronal differentiation from BMSCs. Seven genes, Dcx, Nrcam, sema3a, Cntn2, Slit1, Ephb1, and Pax6, were shown to be hub nodes within the neuronal development network, while six genes, Fgf2, Tgfβ1, Vegfa, Serpine1, Il6, and Stat1, appeared to play an important role in suppressing neuronal differentiation. However, additional studies are required to confirm these results.

  12. Post-Mortem Identification of a Fire Carbonized Body by STR Genotyping.

    PubMed

    Dumache, Raluca; Muresan, Camelia; Ciocan, Veronica; Rogobete, Alexandru F; Enache, Alexandra

    2016-10-01

    Identification of bodies of unknown identity that are victims of exposure to very high temperatures, resulting from fires, plane crashes, and terrorist attacks, represents one of the most difficult sides of forensic genetics, because of the advanced state of decomposition. The aim of this study was the identification of the carbonized cadaver of a fire victim through STR genotyping. We used blood samples obtained from the iliac artery during the autopsy examination as biological samples from the unidentified victim. After DNA isolation and quantification, we proceeded to its amplification using the multiplex PCR kit AmpFlSTR Identifiler. The DNA products were separated using an ABI 3500 genetic analyzer. Further analysis of the data was done using Gene Mapper ID-X version 1.4 software. In this case, it was possible to obtain a complete DNA profile from the biological samples. Due to the fact that the amelogenin gene presented two alleles, X and Y, we concluded that the victim was a man. We conclude that STR profiling of unidentified bodies (carbonized, decomposed) represents a powerful method of human identification in forensic medicine.

  13. BMRF-Net: a software tool for identification of protein interaction subnetworks by a bagging Markov random field-based method.

    PubMed

    Shi, Xu; Barnes, Robert O; Chen, Li; Shajahan-Haq, Ayesha N; Hilakivi-Clarke, Leena; Clarke, Robert; Wang, Yue; Xuan, Jianhua

    2015-07-15

    Identification of protein interaction subnetworks is an important step to help us understand complex molecular mechanisms in cancer. In this paper, we develop a BMRF-Net package, implemented in Java and C++, to identify protein interaction subnetworks based on a bagging Markov random field (BMRF) framework. By integrating gene expression data and protein-protein interaction data, this software tool can be used to identify biologically meaningful subnetworks. A user friendly graphic user interface is developed as a Cytoscape plugin for the BMRF-Net software to deal with the input/output interface. The detailed structure of the identified networks can be visualized in Cytoscape conveniently. The BMRF-Net package has been applied to breast cancer data to identify significant subnetworks related to breast cancer recurrence. The BMRF-Net package is available at http://sourceforge.net/projects/bmrfcjava/. The package is tested under Ubuntu 12.04 (64-bit), Java 7, glibc 2.15 and Cytoscape 3.1.0.

  14. FISH Oracle: a web server for flexible visualization of DNA copy number data in a genomic context.

    PubMed

    Mader, Malte; Simon, Ronald; Steinbiss, Sascha; Kurtz, Stefan

    2011-07-28

    The rapidly growing amount of array CGH data requires improved visualization software supporting the process of identifying candidate cancer genes. Optimally, such software should work across multiple microarray platforms, should be able to cope with data from different sources and should be easy to operate. We have developed a web-based software FISH Oracle to visualize data from multiple array CGH experiments in a genomic context. Its fast visualization engine and advanced web and database technology supports highly interactive use. FISH Oracle comes with a convenient data import mechanism, powerful search options for genomic elements (e.g. gene names or karyobands), quick navigation and zooming into interesting regions, and mechanisms to export the visualization into different high quality formats. These features make the software especially suitable for the needs of life scientists. FISH Oracle offers a fast and easy to use visualization tool for array CGH and SNP array data. It allows for the identification of genomic regions representing minimal common changes based on data from one or more experiments. FISH Oracle will be instrumental to identify candidate onco and tumor suppressor genes based on the frequency and genomic position of DNA copy number changes. The FISH Oracle application and an installed demo web server are available at http://www.zbh.uni-hamburg.de/fishoracle.

  15. FISH Oracle: a web server for flexible visualization of DNA copy number data in a genomic context

    PubMed Central

    2011-01-01

    Background The rapidly growing amount of array CGH data requires improved visualization software supporting the process of identifying candidate cancer genes. Optimally, such software should work across multiple microarray platforms, should be able to cope with data from different sources and should be easy to operate. Results We have developed a web-based software FISH Oracle to visualize data from multiple array CGH experiments in a genomic context. Its fast visualization engine and advanced web and database technology supports highly interactive use. FISH Oracle comes with a convenient data import mechanism, powerful search options for genomic elements (e.g. gene names or karyobands), quick navigation and zooming into interesting regions, and mechanisms to export the visualization into different high quality formats. These features make the software especially suitable for the needs of life scientists. Conclusions FISH Oracle offers a fast and easy to use visualization tool for array CGH and SNP array data. It allows for the identification of genomic regions representing minimal common changes based on data from one or more experiments. FISH Oracle will be instrumental to identify candidate onco and tumor suppressor genes based on the frequency and genomic position of DNA copy number changes. The FISH Oracle application and an installed demo web server are available at http://www.zbh.uni-hamburg.de/fishoracle. PMID:21884636

  16. rpoB-Based Identification of Nonpigmented and Late-Pigmenting Rapidly Growing Mycobacteria

    PubMed Central

    Adékambi, Toïdi; Colson, Philippe; Drancourt, Michel

    2003-01-01

    Nonpigmented and late-pigmenting rapidly growing mycobacteria (RGM) are increasingly isolated in clinical microbiology laboratories. Their accurate identification remains problematic because classification is labor intensive work and because new taxa are not often incorporated into classification databases. Also, 16S rRNA gene sequence analysis underestimates RGM diversity and does not distinguish between all taxa. We determined the complete nucleotide sequence of the rpoB gene, which encodes the bacterial β subunit of the RNA polymerase, for 20 RGM type strains. After using in-house software which analyzes and graphically represents variability stretches of 60 bp along the nucleotide sequence, our analysis focused on a 723-bp variable region exhibiting 83.9 to 97% interspecies similarity and 0 to 1.7% intraspecific divergence. Primer pair Myco-F-Myco-R was designed as a tool for both PCR amplification and sequencing of this region for molecular identification of RGM. This tool was used for identification of 63 RGM clinical isolates previously identified at the species level on the basis of phenotypic characteristics and by 16S rRNA gene sequence analysis. Of 63 clinical isolates, 59 (94%) exhibited <2% partial rpoB gene sequence divergence from 1 of 20 species under study and were regarded as correctly identified at the species level. Mycobacterium abscessus and Mycobacterium mucogenicum isolates were clearly distinguished from Mycobacterium chelonae; Mycobacterium mageritense isolates were clearly distinguished from “Mycobacterium houstonense.” Four isolates were not identified at the species level because they exhibited >3% partial rpoB gene sequence divergence from the corresponding type strain; they belonged to three taxa related to M. mucogenicum, Mycobacterium smegmatis, and Mycobacterium porcinum. For M. abscessus and M. mucogenicum, this partial sequence yielded a high genetic heterogeneity within the clinical isolates. We conclude that molecular identification by analysis of the 723-bp rpoB sequence is a rapid and accurate tool for identification of RGM. PMID:14662964
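
    The divergence-based decision rule reported above (below 2% divergence from a type strain counts as identified, above 3% as not identified) can be sketched as follows. This is a minimal sketch assuming pre-aligned sequences of equal length and a simple mismatch count; the function names, the "ambiguous" label for the 2 to 3% range (which the abstract does not address) and the toy sequences are assumptions, not the authors' software.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of mismatching positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return 100.0 * mismatches / len(seq_a)


def identify(query, type_strains, identified_below=2.0, unidentified_above=3.0):
    """Return (closest type strain, divergence in %, verdict) for a query."""
    best_species, best_div = min(
        ((name, percent_divergence(query, seq)) for name, seq in type_strains.items()),
        key=lambda item: item[1],
    )
    if best_div < identified_below:
        verdict = "identified"
    elif best_div > unidentified_above:
        verdict = "not identified"
    else:
        verdict = "ambiguous"  # assumption: the source gives no rule for 2-3%
    return best_species, best_div, verdict


# Toy 100-bp "type strain" sequences; real use would compare the 723-bp region.
type_strains = {
    "species_1": "ACGT" * 25,
    "species_2": "ACGA" * 25,
}
close_query = "T" + ("ACGT" * 25)[1:]  # one substitution relative to species_1
far_query = "GGGG" * 25

close_result = identify(close_query, type_strains)
far_result = identify(far_query, type_strains)
```

    The thresholds are exposed as parameters because they are specific to this marker and study; a different gene region would need its own validated cut-offs.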

  17. 48 CFR 227.7203-10 - Contractor identification and marking of computer software or computer software documentation to...

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... and marking of computer software or computer software documentation to be furnished with restrictive... Rights in Computer Software and Computer Software Documentation 227.7203-10 Contractor identification and marking of computer software or computer software documentation to be furnished with restrictive markings...

  18. 48 CFR 227.7203-10 - Contractor identification and marking of computer software or computer software documentation to...

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... and marking of computer software or computer software documentation to be furnished with restrictive... Rights in Computer Software and Computer Software Documentation 227.7203-10 Contractor identification and marking of computer software or computer software documentation to be furnished with restrictive markings...

  19. 48 CFR 227.7203-10 - Contractor identification and marking of computer software or computer software documentation to...

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... and marking of computer software or computer software documentation to be furnished with restrictive... Rights in Computer Software and Computer Software Documentation 227.7203-10 Contractor identification and marking of computer software or computer software documentation to be furnished with restrictive markings...

  20. 48 CFR 227.7203-10 - Contractor identification and marking of computer software or computer software documentation to...

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... and marking of computer software or computer software documentation to be furnished with restrictive... Rights in Computer Software and Computer Software Documentation 227.7203-10 Contractor identification and marking of computer software or computer software documentation to be furnished with restrictive markings...

  1. 48 CFR 227.7203-10 - Contractor identification and marking of computer software or computer software documentation to...

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... and marking of computer software or computer software documentation to be furnished with restrictive... Rights in Computer Software and Computer Software Documentation 227.7203-10 Contractor identification and marking of computer software or computer software documentation to be furnished with restrictive markings...

  2. Software Risk Identification for Interplanetary Probes

    NASA Technical Reports Server (NTRS)

    Dougherty, Robert J.; Papadopoulos, Periklis E.

    2005-01-01

    The need for a systematic and effective software risk identification methodology is critical for interplanetary probes that are using increasingly complex and critical software. Several probe failures are examined that suggest more attention and resources need to be dedicated to identifying software risks. The direct causes of these failures can often be traced to systemic problems in all phases of the software engineering process. These failures have led to the development of a practical methodology to identify risks for interplanetary probes. The proposed methodology is based upon the tailoring of the Software Engineering Institute's (SEI) method of taxonomy-based risk identification. The use of this methodology will ensure a more consistent and complete identification of software risks in these probes.

  3. Identification of Bacillus Probiotics Isolated from Soil Rhizosphere Using 16S rRNA, recA, rpoB Gene Sequencing and RAPD-PCR.

    PubMed

    Mohkam, Milad; Nezafat, Navid; Berenjian, Aydin; Mobasher, Mohammad Ali; Ghasemi, Younes

    2016-03-01

    Some Bacillus species, especially the Bacillus subtilis and Bacillus pumilus groups, have highly similar 16S rRNA gene sequences and are therefore hard to identify based on 16S rDNA sequence analysis. To overcome this drawback, rpoB and recA sequence analysis along with randomly amplified polymorphic DNA (RAPD) fingerprinting was examined as an alternative method for differentiating Bacillus species. The 16S rRNA, rpoB and recA genes were amplified via a polymerase chain reaction using their specific primers. The resulting PCR amplicons were sequenced, and phylogenetic analysis was performed using MEGA 6 software. Identification based on 16S rRNA gene sequencing was underpinned by rpoB and recA gene sequencing as well as the RAPD-PCR technique. Subsequently, concatenation and phylogenetic analysis showed that the extent of diversity and similarity was better resolved by rpoB and recA primers, a finding also reinforced by RAPD-PCR methods. However, in one case, these approaches failed to identify one isolate, which was resolved by combining them with the phenotypical method. Overall, RAPD fingerprinting, rpoB and recA along with concatenated gene sequence analysis discriminated closely related Bacillus species, which highlights the significance of the multigenic method in more precisely distinguishing Bacillus strains. This research emphasizes the benefit of RAPD fingerprinting and rpoB and recA sequence analysis over 16S rRNA gene sequence analysis for suitable and effective identification of Bacillus species as recommended for probiotic products.

  4. 48 CFR 227.7203-3 - Early identification of computer software or computer software documentation to be furnished to...

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... computer software or computer software documentation to be furnished to the Government with restrictions on..., DATA, AND COPYRIGHTS Rights in Computer Software and Computer Software Documentation 227.7203-3 Early identification of computer software or computer software documentation to be furnished to the Government with...

  5. 48 CFR 227.7203-3 - Early identification of computer software or computer software documentation to be furnished to...

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... computer software or computer software documentation to be furnished to the Government with restrictions on..., DATA, AND COPYRIGHTS Rights in Computer Software and Computer Software Documentation 227.7203-3 Early identification of computer software or computer software documentation to be furnished to the Government with...

  6. 48 CFR 227.7203-3 - Early identification of computer software or computer software documentation to be furnished to...

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... computer software or computer software documentation to be furnished to the Government with restrictions on..., DATA, AND COPYRIGHTS Rights in Computer Software and Computer Software Documentation 227.7203-3 Early identification of computer software or computer software documentation to be furnished to the Government with...

  7. 48 CFR 227.7203-3 - Early identification of computer software or computer software documentation to be furnished to...

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... computer software or computer software documentation to be furnished to the Government with restrictions on..., DATA, AND COPYRIGHTS Rights in Computer Software and Computer Software Documentation 227.7203-3 Early identification of computer software or computer software documentation to be furnished to the Government with...

  8. 48 CFR 227.7203-3 - Early identification of computer software or computer software documentation to be furnished to...

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... computer software or computer software documentation to be furnished to the Government with restrictions on..., DATA, AND COPYRIGHTS Rights in Computer Software and Computer Software Documentation 227.7203-3 Early identification of computer software or computer software documentation to be furnished to the Government with...

  9. Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets.

    PubMed

    Bengtsson, Johan; Eriksson, K Martin; Hartmann, Martin; Wang, Zheng; Shenoy, Belle Damodara; Grelet, Gwen-Aëlle; Abarenkov, Kessy; Petri, Anna; Rosenblad, Magnus Alm; Nilsson, R Henrik

    2011-10-01

    The ribosomal small subunit (SSU) rRNA gene has emerged as an important genetic marker for taxonomic identification in environmental sequencing datasets. In addition to being present in the nucleus of eukaryotes and the core genome of prokaryotes, the gene is also found in the mitochondria of eukaryotes and in the chloroplasts of photosynthetic eukaryotes. These three sets of genes are conceptually paralogous and should in most situations not be aligned and analyzed jointly. To identify the origin of SSU sequences in complex sequence datasets has hitherto been a time-consuming and largely manual undertaking. However, the present study introduces Metaxa ( http://microbiology.se/software/metaxa/ ), an automated software tool to extract full-length and partial SSU sequences from larger sequence datasets and assign them to an archaeal, bacterial, nuclear eukaryote, mitochondrial, or chloroplast origin. Using data from reference databases and from full-length organelle and organism genomes, we show that Metaxa detects and scores SSU sequences for origin with very low proportions of false positives and negatives. We believe that this tool will be useful in microbial and evolutionary ecology as well as in metagenomics.

  10. A mass graph-based approach for the identification of modified proteoforms using top-down tandem mass spectra.

    PubMed

    Kou, Qiang; Wu, Si; Tolic, Nikola; Paša-Tolic, Ljiljana; Liu, Yunlong; Liu, Xiaowen

    2017-05-01

    Although proteomics has rapidly developed in the past decade, researchers are still in the early stage of exploring the world of complex proteoforms, which are protein products with various primary structure alterations resulting from gene mutations, alternative splicing, post-translational modifications, and other biological processes. Proteoform identification is essential to mapping proteoforms to their biological functions as well as discovering novel proteoforms and new protein functions. Top-down mass spectrometry is the method of choice for identifying complex proteoforms because it provides a 'bird's eye view' of intact proteoforms. The combinatorial explosion of various alterations on a protein may result in billions of possible proteoforms, making proteoform identification a challenging computational problem. We propose a new data structure, called the mass graph, for efficient representation of proteoforms and design mass graph alignment algorithms. We developed TopMG, a mass graph-based software tool for proteoform identification by top-down mass spectrometry. Experiments on top-down mass spectrometry datasets showed that TopMG outperformed existing methods in identifying complex proteoforms. Availability: http://proteomics.informatics.iupui.edu/software/topmg/. Contact: xwliu@iupui.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  11. Identification of reference genes for RT-qPCR analysis in peach genotypes with contrasting chilling requirements.

    PubMed

    Marini, N; Bevilacqua, C B; Büttow, M V; Raseira, M C B; Bonow, S

    2017-05-25

    Selecting and validating reference genes are the first steps in studying gene expression by reverse transcriptase-quantitative polymerase chain reaction (RT-qPCR). The present study aimed to evaluate the stability of five reference genes for the purpose of normalization when studying gene expression in various cultivars of Prunus persica with different chilling requirements. Flower bud tissues of nine peach genotypes from Embrapa's peach breeding program with different chilling requirements were used, and the expression stability of five candidate reference genes, measured by RT-qPCR for relative quantification, was evaluated using the geNorm, NormFinder, and BestKeeper software packages. The results indicated that, among the genes tested, the most stable genes for use as references are Act and UBQ10. This study is the first survey of the stability of reference genes in peaches under chilling stress and provides guidelines for more accurate RT-qPCR results.

  12. Fast gene ontology based clustering for microarray experiments.

    PubMed

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  13. Particle detection, number estimation, and feature measurement in gene transfer studies: optical fractionator stereology integrated with digital image processing and analysis.

    PubMed

    King, Michael A; Scotty, Nicole; Klein, Ronald L; Meyer, Edwin M

    2002-10-01

    Assessing the efficacy of in vivo gene transfer often requires a quantitative determination of the number, size, shape, or histological visualization characteristics of biological objects. The optical fractionator has become a choice stereological method for estimating the number of objects, such as neurons, in a structure, such as a brain subregion. Digital image processing and analytic methods can increase detection sensitivity and quantify structural and/or spectral features located in histological specimens. We describe a hardware and software system that we have developed for conducting the optical fractionator process. A microscope equipped with a video camera and motorized stage and focus controls is interfaced with a desktop computer. The computer contains a combination live video/computer graphics adapter with a video frame grabber and controls the stage, focus, and video via a commercial imaging software package. Specialized macro programs have been constructed with this software to execute command sequences requisite to the optical fractionator method: defining regions of interest, positioning specimens in a systematic uniform random manner, and stepping through known volumes of tissue for interactive object identification (optical dissectors). The system affords the flexibility to work with count regions that exceed the microscope image field size at low magnifications and to adjust the parameters of the fractionator sampling to best match the demands of particular specimens and object types. Digital image processing can be used to facilitate object detection and identification, and objects that meet criteria for counting can be analyzed for a variety of morphometric and optical properties. Copyright 2002 Elsevier Science (USA)

  14. Identification of possible genetic polymorphisms involved in cancer cachexia: a systematic review.

    PubMed

    Tan, Benjamin H L; Ross, James A; Kaasa, Stein; Skorpen, Frank; Fearon, Kenneth C H

    2011-04-01

    Cancer cachexia is a polygenic and complex syndrome. Genetic variations in regulation of the inflammatory response, muscle and fat metabolic pathways, and pathways in appetite regulation are likely to contribute to the susceptibility or resistance to developing cancer cachexia. A systematic search of Medline and EmBase databases, covering 1986-2008 was performed for potential candidate genes/genetic polymorphisms relating to cancer cachexia. Related genes were then identified using pathway functional analysis software. All candidate genes were reviewed for functional polymorphisms or clinically significant polymorphisms associated with cachexia using the OMIM and GeneRIF databases. Genes with variants which had functional or clinical associations with cachexia and replicated in at least one study were entered into pathway analysis software to reveal possible network associations between genes. A total of 184 polymorphisms with functional or clinical relevance to cancer cachexia were identified in 92 candidate genes. Of these, 42 polymorphisms (in 33 genes) were replicated in more than one study with 13 polymorphisms found to influence two or more hallmarks of cachexia (i.e. inflammation, loss of fat mass and/or lean mass and reduced survival). Thirty-three genes were found to be significantly interconnected in two major networks with four genes (ADIPOQ, IL6, NFKB1 and TLR4) interlinking both networks. Selection of candidate genes and polymorphisms is a key element of multigene study design. The present study provides an initial framework to select genes/polymorphisms for further study in cancer cachexia, and to develop their potential as susceptibility biomarkers of developing cachexia.

  15. HomSI: a homozygous stretch identifier from next-generation sequencing data.

    PubMed

    Görmez, Zeliha; Bakir-Gungor, Burcu; Sagiroglu, Mahmut Samil

    2014-02-01

    In consanguineous families, as a result of inheriting the same genomic segments through both parents, the individuals have stretches of their genomes that are homozygous. This situation leads to the prevalence of recessive diseases among the members of these families. Homozygosity mapping is based on this observation, and in consanguineous families, several recessive disease genes have been discovered with the help of this technique. Researchers typically use single nucleotide polymorphism arrays to determine the homozygous regions and then search for the disease gene by sequencing the genes within these candidate disease loci. Recently, the advent of next-generation sequencing has enabled the concurrent identification of homozygous regions and the detection of mutations relevant for diagnosis, using data from a single sequencing experiment. In this respect, we have developed a novel tool that identifies homozygous regions using deep sequence data. Using *.vcf (variant call format) files as input, our program identifies the majority of homozygous regions found by microarray single nucleotide polymorphism genotype data. HomSI software is freely available at www.igbam.bilgem.tubitak.gov.tr/softwares/HomSI, with an online manual.
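
    The idea of scanning per-site genotype calls for homozygous stretches can be sketched as follows. This is a minimal illustration and not HomSI's actual algorithm: the function name homozygous_runs, the min_sites cutoff, and the toy VCF records are assumptions made for the example. It only reports runs of consecutive homozygous calls for the first sample of a VCF-formatted input, assuming GT is the first FORMAT key.

```python
def homozygous_runs(vcf_lines, min_sites=3):
    """Return (chrom, start, end, n_sites) for runs of consecutive homozygous calls.

    Only the first sample column is inspected; heterozygous or missing
    genotypes, and a change of chromosome, end the current run.
    """
    runs = []
    current = None  # [chrom, start, end, n_sites] of the run being extended
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        # GT is assumed to be the first colon-separated entry of the sample column.
        alleles = fields[9].split(":")[0].replace("|", "/").split("/")
        is_hom = "." not in alleles and len(set(alleles)) == 1
        if is_hom and current and current[0] == chrom:
            current[2] = pos
            current[3] += 1
        else:
            if current and current[3] >= min_sites:
                runs.append(tuple(current))
            current = [chrom, pos, pos, 1] if is_hom else None
    if current and current[3] >= min_sites:
        runs.append(tuple(current))
    return runs


# Hypothetical toy input, not data from the study.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t1/1",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t1/1",
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT\t1|1",
    "chr1\t400\t.\tT\tC\t50\tPASS\t.\tGT\t0/1",
    "chr1\t500\t.\tA\tC\t50\tPASS\t.\tGT\t1/1",
    "chr2\t150\t.\tA\tC\t50\tPASS\t.\tGT\t1/1",
]
runs = homozygous_runs(toy_vcf)  # -> [('chr1', 100, 300, 3)]
```

    A practical implementation would also need to tolerate occasional heterozygous calls caused by sequencing error and to filter runs by physical length; both are left out of this sketch.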

  16. Identification of appropriate reference genes for human mesenchymal stem cell analysis by quantitative real-time PCR.

    PubMed

    Li, Xiuying; Yang, Qiwei; Bai, Jinping; Xuan, Yali; Wang, Yimin

    2015-01-01

    Normalization to a reference gene is the method of choice for quantitative reverse transcription-PCR (RT-qPCR) analysis. The stability of reference genes is critical for accurate experimental results and conclusions. We have evaluated the expression stability of eight commonly used reference genes in four different human mesenchymal stem cells (MSC). Using the geNorm, NormFinder and BestKeeper algorithms, we show that beta-2-microglobulin and peptidyl-prolyl isomerase A were the optimal reference genes for normalizing RT-qPCR data obtained from MSC, whereas the TATA box binding protein was not suitable due to its extensive variability in expression. Our findings emphasize the significance of validating reference genes for qPCR analyses. We offer a short list of reference genes to use for normalization and recommend some commercially available software programs as a rapid approach to validate reference genes. We also demonstrate that two frequently used reference genes, β-actin and glyceraldehyde-3-phosphate dehydrogenase, are not always suitable.
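
    As an illustration of what an expression-stability ranking involves, the sketch below computes a geNorm-style stability value M: for each gene, the average standard deviation of its per-sample log2 expression ratio to every other candidate gene. It is a simplified sketch under stated assumptions, not the geNorm software itself; the function name genorm_m and the toy quantities are hypothetical, inputs are taken to be relative quantities on a linear scale, and the stepwise exclusion of the least stable gene is omitted.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """Return a geNorm-style stability value M for each gene (lower = more stable).

    expr maps gene name -> list of relative expression quantities
    (linear scale, one value per sample, same sample order for every gene).
    """
    logs = {gene: [math.log2(v) for v in values] for gene, values in expr.items()}
    m_values = {}
    for gene_j, log_j in logs.items():
        # Pairwise variation: SD across samples of the log2 ratio gene_j / gene_k.
        pairwise_sd = [
            stdev([a - b for a, b in zip(log_j, log_k)])
            for gene_k, log_k in logs.items()
            if gene_k != gene_j
        ]
        m_values[gene_j] = mean(pairwise_sd)
    return m_values


# Hypothetical toy data: geneA and geneB keep a constant ratio, geneC does not.
toy = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],
    "geneC": [5.0, 5.0, 5.0, 5.0],
}
stability = genorm_m(toy)
ranking = sorted(stability, key=stability.get)  # most stable first
```

    In this toy case geneC ranks last because its ratio to the other two genes changes from sample to sample, even though its own level is constant. This is why ratio-based measures evaluate candidates relative to one another rather than in isolation.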

  17. IMPACT_S: integrated multiprogram platform to analyze and combine tests of selection.

    PubMed

    Maldonado, Emanuel; Sunagar, Kartik; Almeida, Daniela; Vasconcelos, Vitor; Antunes, Agostinho

    2014-01-01

    Among the major goals of research in evolutionary biology are the identification of genes targeted by natural selection and understanding how various regimes of evolution affect the fitness of an organism. In particular, adaptive evolution enables organisms to adapt to changing ecological factors such as diet, temperature, habitat, predatory pressures and prey abundance. An integrative approach is crucial for the identification of non-synonymous mutations that introduce radical changes in protein biochemistry and thus in turn influence the structure and function of proteins. Performing such analyses manually is often a time-consuming process, due to the large number of statistical files generated from multiple approaches, especially when assessing numerous taxa and/or large datasets. We present IMPACT_S, easy-to-use software with a Graphical User Interface (GUI), which rapidly and effectively integrates, filters and combines results from three widely used programs for assessing the influence of selection: Codeml (PAML package), Datamonkey and TreeSAAP. It enables the identification and tabulation of sites detected by these programs as evolving under the influence of positive, neutral and/or negative selection in protein-coding genes. IMPACT_S further facilitates the automatic mapping of these sites onto the three-dimensional structures of proteins. Other useful tools incorporated in IMPACT_S include Jmol, Archaeopteryx, Gnuplot, PhyML, a built-in Swiss-Model interface and a PDB downloader. The relevance and functionality of IMPACT_S are shown through a case study on the toxicoferan-reptilian Cysteine-rich Secretory Proteins (CRiSPs). IMPACT_S is platform-independent software released under the GPLv3 license, freely available online from http://impact-s.sourceforge.net.

  18. Analysis of lipid experiments (ALEX): a software framework for analysis of high-resolution shotgun lipidomics data.

    PubMed

    Husen, Peter; Tarasov, Kirill; Katafiasz, Maciej; Sokol, Elena; Vogt, Johannes; Baumgart, Jan; Nitsch, Robert; Ekroos, Kim; Ejsing, Christer S

    2013-01-01

    Global lipidomics analysis across large sample sizes produces high-content datasets that require dedicated software tools supporting lipid identification and quantification, efficient data management and lipidome visualization. Here we present a novel software-based platform for streamlined data processing, management and visualization of shotgun lipidomics data acquired using high-resolution Orbitrap mass spectrometry. The platform features the ALEX framework designed for automated identification and export of lipid species intensity directly from proprietary mass spectral data files, and an auxiliary workflow using database exploration tools for integration of sample information, computation of lipid abundance and lipidome visualization. A key feature of the platform is the organization of lipidomics data in "database table format" which provides the user with an unsurpassed flexibility for rapid lipidome navigation using selected features within the dataset. To demonstrate the efficacy of the platform, we present a comparative neurolipidomics study of cerebellum, hippocampus and somatosensory barrel cortex (S1BF) from wild-type and knockout mice devoid of the putative lipid phosphate phosphatase PRG-1 (plasticity related gene-1). The presented framework is generic, extendable to processing and integration of other lipidomic data structures, can be interfaced with post-processing protocols supporting statistical testing and multivariate analysis, and can serve as an avenue for disseminating lipidomics data within the scientific community. The ALEX software is available at www.msLipidomics.info.

  19. Sequence comparison of phoR, gyrB, groEL, and cheA genes as phylogenetic markers for distinguishing Bacillus amyloliquefaciens and B. subtilis and for identifying Bacillus strain B29.

    PubMed

    Yu, C; Jin, J; Meng, L-Q; Xia, H-H; Yuan, H-F; Wang, J; Yu, D-S; Zhao, X-Y; Sha, C-Q

    2017-05-20

    Given the close genetic relationship between Bacillus amyloliquefaciens and B. subtilis, distinguishing the two solely based on their physiological and biochemical characteristics and 16S rRNA sequences is difficult. Because molecular identification has received increasing attention, it was used here to find suitable genes for distinguishing the two bacteria and to identify the biocontrol strain B29. The similarity of four genes, cheA, gyrB, groEL and phoR, between the two species was compared using the software BLASTN and MEGA, and a phylogenetic tree was constructed. The B29 strain was re-identified using the screened genes. The similarities of the four genes gyrB, groEL, cheA and phoR between the two species were 93-95%, 82-84%, 76-78% and 76-77%, respectively. The homologies of the four genes between strain B29 and strains of B. amyloliquefaciens were more than 95%. We determined how well the phoR and cheA genes could be used to differentiate B. amyloliquefaciens and B. subtilis. The previously isolated biological control strain B29, initially classified as B. subtilis, was re-classified as B. amyloliquefaciens. Our data indicate that, in addition to the phoR gene, the cheA gene might be a useful phylogenetic marker for differentiating B. subtilis and B. amyloliquefaciens.

  20. Computational annotation of genes differentially expressed along olive fruit development

    PubMed Central

    Galla, Giulio; Barcaccia, Gianni; Ramina, Angelo; Collani, Silvio; Alagna, Fiammetta; Baldoni, Luciana; Cultrera, Nicolò GM; Martinelli, Federico; Sebastiani, Luca; Tonutti, Pietro

    2009-01-01

    Background Olea europaea L. is a traditional tree crop of the Mediterranean basin with a high economic impact worldwide. Unlike other fruit tree species, little is known about the physiological and molecular basis of olive fruit development, and few sequences of genes and gene products are available for olive in public databases. This study deals with the identification of large sets of differentially expressed genes in developing olive fruits and their subsequent computational annotation using several software tools. Results mRNA from fruits of the cv. Leccino sampled at three different stages [i.e., initial fruit set (stage 1), completed pit hardening (stage 2) and veraison (stage 3)] was used for the identification of differentially expressed genes putatively involved in the main processes of fruit development. Four subtractive hybridization libraries were constructed: forward and reverse between stages 1 and 2 (libraries A and B), and between stages 2 and 3 (libraries C and D). All sequenced clones (1,132 in total) were analyzed through BlastX against non-redundant NCBI databases and about 60% of them showed similarity to known proteins. A total of 89 out of 642 differentially expressed unique sequences were further investigated by real-time PCR, which validated the SSH results at a rate as high as 69%. Library-specific cDNA repertoires were annotated according to the three main vocabularies of the gene ontology (GO): cellular component, biological process and molecular function. BlastX analysis, GO term mapping and annotation analysis were performed using the Blast2GO software, a research tool designed with the main purpose of enabling GO-based data mining on sequence sets for which no GO annotation is yet available. Bioinformatic analysis revealed a significantly different distribution of the annotated sequences for each GO category when comparing the three fruit developmental stages.
The olive fruit-specific transcriptome dataset was used to query all known KEGG (Kyoto Encyclopaedia of Genes and Genomes) metabolic pathways for characterizing and positioning retrieved EST records. The integration of the olive sequence datasets within the MapMan platform for microarray analysis allowed the identification of specific biosynthetic pathways useful for the definition of key functional categories in time course analyses for gene groups. Conclusion The bioinformatic annotation of all gene sequences helped shed light on metabolic pathways and transcriptional aspects related to carbohydrates, fatty acids, secondary metabolites, transcription factors and hormones, as well as the response to biotic and abiotic stresses throughout olive drupe development. These results represent a first step toward both functional genomics and systems biology research for understanding the gene functions and regulatory networks in olive fruit growth and ripening. PMID:19852839

  1. An integrated workflow for analysis of ChIP-chip data.

    PubMed

    Weigelt, Karin; Moehle, Christoph; Stempfl, Thomas; Weber, Bernhard; Langmann, Thomas

    2008-08-01

    Although ChIP-chip is a powerful tool for genome-wide discovery of transcription factor target genes, the steps involving raw data analysis, identification of promoters, and correlation with binding sites are still laborious processes. Therefore, we report an integrated workflow for the analysis of promoter tiling arrays with the Genomatix ChipInspector system. We compare this tool with open-source software packages to identify PU.1 regulated genes in mouse macrophages. Our results suggest that ChipInspector data analysis, comparative genomics for binding site prediction, and pathway/network modeling significantly facilitate and enhance whole-genome promoter profiling to reveal in vivo sites of transcription factor-DNA interactions.

  2. C-mii: a tool for plant miRNA and target identification.

    PubMed

    Numnark, Somrak; Mhuantong, Wuttichai; Ingsriswang, Supawadee; Wichadakul, Duangdao

    2012-01-01

    MicroRNAs (miRNAs) have been known to play an important role in several biological processes in both animals and plants. Although several tools for miRNA and target identification are available, the number of tools tailored towards plants is limited, and those that are available have specific functionality, lack graphical user interfaces, and restrict the number of input sequences. Large-scale computational identifications of miRNAs and/or targets of several plants have been also reported. Their methods, however, are only described as flow diagrams, which require programming skills and the understanding of input and output of the connected programs to reproduce. To overcome these limitations and programming complexities, we proposed C-mii as a ready-made software package for both plant miRNA and target identification. C-mii was designed and implemented based on established computational steps and criteria derived from previous literature with the following distinguishing features. First, software is easy to install with all-in-one programs and packaged databases. Second, it comes with graphical user interfaces (GUIs) for ease of use. Users can identify plant miRNAs and targets via step-by-step execution, explore the detailed results from each step, filter the results according to proposed constraints in plant miRNA and target biogenesis, and export sequences and structures of interest. Third, it supplies bird's eye views of the identification results with infographics and grouping information. Fourth, in terms of functionality, it extends the standard computational steps of miRNA target identification with miRNA-target folding and GO annotation. Fifth, it provides helper functions for the update of pre-installed databases and automatic recovery. Finally, it supports multi-project and multi-thread management. 
C-mii constitutes the first complete software package with graphical user interfaces enabling computational identification of both plant miRNA genes and miRNA targets. With the provided functionalities, it can help accelerate the study of plant miRNAs and targets, especially for small and medium plant molecular labs without bioinformaticians. C-mii is freely available at http://www.biotec.or.th/isl/c-mii for both Windows and Ubuntu Linux platforms.

  3. C-mii: a tool for plant miRNA and target identification

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) have been known to play an important role in several biological processes in both animals and plants. Although several tools for miRNA and target identification are available, the number of tools tailored towards plants is limited, and those that are available have specific functionality, lack graphical user interfaces, and restrict the number of input sequences. Large-scale computational identifications of miRNAs and/or targets of several plants have been also reported. Their methods, however, are only described as flow diagrams, which require programming skills and the understanding of input and output of the connected programs to reproduce. Results To overcome these limitations and programming complexities, we proposed C-mii as a ready-made software package for both plant miRNA and target identification. C-mii was designed and implemented based on established computational steps and criteria derived from previous literature with the following distinguishing features. First, software is easy to install with all-in-one programs and packaged databases. Second, it comes with graphical user interfaces (GUIs) for ease of use. Users can identify plant miRNAs and targets via step-by-step execution, explore the detailed results from each step, filter the results according to proposed constraints in plant miRNA and target biogenesis, and export sequences and structures of interest. Third, it supplies bird's eye views of the identification results with infographics and grouping information. Fourth, in terms of functionality, it extends the standard computational steps of miRNA target identification with miRNA-target folding and GO annotation. Fifth, it provides helper functions for the update of pre-installed databases and automatic recovery. Finally, it supports multi-project and multi-thread management. 
Conclusions C-mii constitutes the first complete software package with graphical user interfaces enabling computational identification of both plant miRNA genes and miRNA targets. With the provided functionalities, it can help accelerate the study of plant miRNAs and targets, especially for small and medium plant molecular labs without bioinformaticians. C-mii is freely available at http://www.biotec.or.th/isl/c-mii for both Windows and Ubuntu Linux platforms. PMID:23281648

  4. Evaluative Assay of Nuclear and Mitochondrial Genes to Diagnose Leishmania Species in Clinical Specimens.

    PubMed

    Esmaeili Rastaghi, Ahmad Reza; Spotin, Adel; Khataminezhad, Mohammad Reza; Jafarpour, Mostafa; Alaeenovin, Elnaz; Najafzadeh, Narmin; Samei, Neda; Taleshi, Neda; Mohammadi, Somayeh; Parvizi, Parviz

    2017-10-01

    Leishmaniasis, an emerging and re-emerging disease, is increasing worldwide, with high prevalence and new incidence in recent years. For epidemiological investigation and accurate identification of Leishmania species, three nuclear and mitochondrial genes (ITS-rDNA, Hsp70, and Cyt b) were employed and analyzed from clinical samples in three important Zoonotic Cutaneous Leishmaniasis (ZCL) foci of Iran. In this cross-sectional/descriptive study conducted in 2014-15, serous smears of lesions were prepared directly from suspected ZCL patients in Turkmen in the northeast, Abarkouh in the center and Shush district in the southwest of Iran, and DNA was extracted. Two nuclear genes (ITS-rDNA and Hsp70) and one mitochondrial gene (Cyt b) of Leishmania parasites were amplified. RFLP was performed on PCR-positive samples. PCR products were sequenced, aligned and edited with Sequencher 4.1.4, and phylogenetic analyses were performed using MEGA 5.05 software. Overall, 203 out of 360 clinical samples from suspected patients were Leishmania positive using routine laboratory methods and 231 samples were positive by molecular techniques. L. major, L. tropica, and L. turanica were firmly identified by employing different molecular genes and phylogenetic analyses. By combining different molecular genes, Leishmania parasites were identified accurately. The sensitivity and specificity of the three genes were evaluated and showed advantages over routine laboratory methods. The ITS-rDNA gene is more appropriate for firm identification of Leishmania species.

  5. Identification of a Novel Reference Gene for Apple Transcriptional Profiling under Postharvest Conditions

    PubMed Central

    Storch, Tatiane Timm; Pegoraro, Camila; Finatto, Taciane; Quecini, Vera; Rombaldi, Cesar Valmor; Girardi, César Luis

    2015-01-01

    Reverse Transcription quantitative PCR (RT-qPCR) is one of the most important techniques for gene expression profiling due to its high sensitivity and reproducibility. However, the reliability of the results is highly dependent on data normalization, performed by comparing the expression profiles of the genes of interest against those of constitutively expressed reference genes. Although the technique is widely used in fruit postharvest experiments, the transcription stability of reference genes has not been thoroughly investigated under these experimental conditions. Thus, we have determined the transcriptional profile, under these conditions, of three genes commonly used as references, ACTIN (MdACT), PROTEIN DISULPHIDE ISOMERASE (MdPDI) and UBIQUITIN-CONJUGATING ENZYME E2 (MdUBC), along with two novel candidates, HISTONE 1 (MdH1) and NUCLEOSOME ASSEMBLY 1 PROTEIN (MdNAP1). The expression profile of the genes was investigated throughout five experiments, three of them encompassing the postharvest period and the other two consisting of developmental and spatial phases. The transcriptional stability was comparatively investigated using four distinct software packages: BestKeeper, NormFinder, geNorm and DataAssist. Gene ranking results for transcriptional stability were similar for the investigated software packages, with the exception of BestKeeper. The classic reference gene MdUBC ranked among the most stably transcribed in all investigated experimental conditions. Transcript accumulation profiles for the novel reference candidate gene MdH1 were stable throughout the tested conditions, especially in experiments encompassing the postharvest period. Thus, our results present a novel reference gene for postharvest experiments in apple and reinforce the importance of checking the transcription profile of reference genes under the experimental conditions of interest. PMID:25774904

  6. Identification of a novel reference gene for apple transcriptional profiling under postharvest conditions.

    PubMed

    Storch, Tatiane Timm; Pegoraro, Camila; Finatto, Taciane; Quecini, Vera; Rombaldi, Cesar Valmor; Girardi, César Luis

    2015-01-01

    Reverse Transcription quantitative PCR (RT-qPCR) is one of the most important techniques for gene expression profiling due to its high sensitivity and reproducibility. However, the reliability of the results is highly dependent on data normalization, performed by comparing the expression profiles of the genes of interest against those of constitutively expressed reference genes. Although the technique is widely used in fruit postharvest experiments, the transcription stability of reference genes has not been thoroughly investigated under these experimental conditions. Thus, we have determined the transcriptional profile, under these conditions, of three genes commonly used as references, ACTIN (MdACT), PROTEIN DISULPHIDE ISOMERASE (MdPDI) and UBIQUITIN-CONJUGATING ENZYME E2 (MdUBC), along with two novel candidates, HISTONE 1 (MdH1) and NUCLEOSOME ASSEMBLY 1 PROTEIN (MdNAP1). The expression profile of the genes was investigated throughout five experiments, three of them encompassing the postharvest period and the other two consisting of developmental and spatial phases. The transcriptional stability was comparatively investigated using four distinct software packages: BestKeeper, NormFinder, geNorm and DataAssist. Gene ranking results for transcriptional stability were similar for the investigated software packages, with the exception of BestKeeper. The classic reference gene MdUBC ranked among the most stably transcribed in all investigated experimental conditions. Transcript accumulation profiles for the novel reference candidate gene MdH1 were stable throughout the tested conditions, especially in experiments encompassing the postharvest period. Thus, our results present a novel reference gene for postharvest experiments in apple and reinforce the importance of checking the transcription profile of reference genes under the experimental conditions of interest.

  7. Heuristic Identification of Biological Architectures for Simulating Complex Hierarchical Genetic Interactions

    PubMed Central

    Moore, Jason H; Amos, Ryan; Kiralis, Jeff; Andrews, Peter C

    2015-01-01

    Simulation plays an essential role in the development of new computational and statistical methods for the genetic analysis of complex traits. Most simulations start with a statistical model using methods such as linear or logistic regression that specify the relationship between genotype and phenotype. This is appealing due to its simplicity and because these statistical methods are commonly used in genetic analysis. It is our working hypothesis that simulations need to move beyond simple statistical models to more realistically represent the biological complexity of genetic architecture. The goal of the present study was to develop a prototype genotype–phenotype simulation method and software that are capable of simulating complex genetic effects within the context of a hierarchical biology-based framework. Specifically, our goal is to simulate multilocus epistasis or gene–gene interaction where the genetic variants are organized within the framework of one or more genes, their regulatory regions and other regulatory loci. We introduce here the Heuristic Identification of Biological Architectures for simulating Complex Hierarchical Interactions (HIBACHI) method and prototype software for simulating data in this manner. This approach combines a biological hierarchy, a flexible mathematical framework, a liability threshold model for defining disease endpoints, and a heuristic search strategy for identifying high-order epistatic models of disease susceptibility. We provide several simulation examples using genetic models exhibiting independent main effects and three-way epistatic effects. PMID:25395175

  8. Microarray analysis of retinal gene expression in Egr-1 knockout mice

    PubMed Central

    Schippert, Ruth; Schaeffel, Frank

    2009-01-01

    Purpose We found earlier that 42 day-old Egr-1 knockout mice had longer eyes and a more myopic refractive error compared to their wild-types. To identify genes that could be responsible for the temporarily enhanced axial eye growth, a microarray analysis was performed in knockout and wild-type mice at the postnatal ages of 30 and 42 days. Methods The retinas of homozygous and wild-type Egr-1 knockout mice (Taconic, Ry, Denmark) were prepared for RNA isolation (RNeasy Mini Kit, Qiagen) at the age of 30 or 42 days, respectively (n=12 each). Three retinas were pooled and labeled cRNA was made. The samples were hybridized to Affymetrix GeneChip Mouse Genome 430 2.0 Arrays. Hybridization signals were calculated using GC-RMA normalization. Genes were identified as differentially expressed if they showed a fold-change (FC) of at least 1.5 and a p-value <0.05. A false-discovery rate of 5% was applied. Ten genes with potential biologic relevance were examined further with semiquantitative real-time RT–PCR. Results Comparing mRNA expression levels between wild-type and homozygous Egr-1 knockout mice, we found 73 differentially expressed genes at the age of 30 days and 135 genes at the age of 42 days. Testing for differences in gene expression between the two ages (30 versus 42 days), 54 genes were differently expressed in wild-type mice and 215 genes in homozygous animals. Based on three networks proposed by Ingenuity pathway analysis software, nine differently expressed genes in the homozygous Egr-1 knockout mice were chosen for further validation by real-time RT–PCR, three genes in each network. In addition, the gene that was most prominently regulated in the knockout mice, compared to wild-type, at both 30 days and 42 days of age (protocadherin beta-9 [Pcdhb9]), was tested with real-time RT–PCR. 
Changes in four of the ten genes could be confirmed by real-time RT–PCR: nuclear prelamin A recognition factor (Narf), oxoglutarate dehydrogenase (Ogdh), selenium binding protein 1 (Selenbp1), and Pcdhb9. Except for Pcdhb9, the genes whose mRNA expression levels were validated were listed in one of the networks proposed by Ingenuity pathway analysis software. In addition to these genes, the software proposed several key-regulators which did not change in our study: retinoic acid, vascular endothelial growth factor A (VEGF-A), FBJ murine osteosarcoma viral oncogene homolog (cFos), and others. Conclusions Identification of genes that are differentially regulated during the development period between postnatal day 30 (when both homozygous and wild-type mice still have the same axial length) and day 42 (where the difference in eye length is apparent) could improve the understanding of mechanisms for the control of axial eye growth and may lead to potential targets for pharmacological intervention. With the aid of pathway-analysis software, a coarse picture of possible biochemical pathways could be generated. Although the mRNA expression levels of proteins proposed by the software, like VEGF, FOS, retinoic acid (RA) receptors, or cellular RA binding protein, did not show any changes in our experiment, these molecules have previously been implicated in the signaling cascades controlling axial eye growth. According to the pathway-analysis software, they represent links between several proteins whose mRNA expression was changed in our study. PMID:20019881

  9. Microarray analysis of retinal gene expression in Egr-1 knockout mice.

    PubMed

    Schippert, Ruth; Schaeffel, Frank; Feldkaemper, Marita Pauline

    2009-12-10

    We found earlier that 42 day-old Egr-1 knockout mice had longer eyes and a more myopic refractive error compared to their wild-types. To identify genes that could be responsible for the temporarily enhanced axial eye growth, a microarray analysis was performed in knockout and wild-type mice at the postnatal ages of 30 and 42 days. The retinas of homozygous and wild-type Egr-1 knockout mice (Taconic, Ry, Denmark) were prepared for RNA isolation (RNeasy Mini Kit, Qiagen) at the age of 30 or 42 days, respectively (n=12 each). Three retinas were pooled and labeled cRNA was made. The samples were hybridized to Affymetrix GeneChip Mouse Genome 430 2.0 Arrays. Hybridization signals were calculated using GC-RMA normalization. Genes were identified as differentially expressed if they showed a fold-change (FC) of at least 1.5 and a p-value <0.05. A false-discovery rate of 5% was applied. Ten genes with potential biologic relevance were examined further with semiquantitative real-time RT-PCR. Comparing mRNA expression levels between wild-type and homozygous Egr-1 knockout mice, we found 73 differentially expressed genes at the age of 30 days and 135 genes at the age of 42 days. Testing for differences in gene expression between the two ages (30 versus 42 days), 54 genes were differently expressed in wild-type mice and 215 genes in homozygous animals. Based on three networks proposed by Ingenuity pathway analysis software, nine differently expressed genes in the homozygous Egr-1 knockout mice were chosen for further validation by real-time RT-PCR, three genes in each network. In addition, the gene that was most prominently regulated in the knockout mice, compared to wild-type, at both 30 days and 42 days of age (protocadherin beta-9 [Pcdhb9]), was tested with real-time RT-PCR. 
Changes in four of the ten genes could be confirmed by real-time RT-PCR: nuclear prelamin A recognition factor (Narf), oxoglutarate dehydrogenase (Ogdh), selenium binding protein 1 (Selenbp1), and Pcdhb9. Except for Pcdhb9, the genes whose mRNA expression levels were validated were listed in one of the networks proposed by Ingenuity pathway analysis software. In addition to these genes, the software proposed several key-regulators which did not change in our study: retinoic acid, vascular endothelial growth factor A (VEGF-A), FBJ murine osteosarcoma viral oncogene homolog (cFos), and others. Identification of genes that are differentially regulated during the development period between postnatal day 30 (when both homozygous and wild-type mice still have the same axial length) and day 42 (where the difference in eye length is apparent) could improve the understanding of mechanisms for the control of axial eye growth and may lead to potential targets for pharmacological intervention. With the aid of pathway-analysis software, a coarse picture of possible biochemical pathways could be generated. Although the mRNA expression levels of proteins proposed by the software, like VEGF, FOS, retinoic acid (RA) receptors, or cellular RA binding protein, did not show any changes in our experiment, these molecules have previously been implicated in the signaling cascades controlling axial eye growth. According to the pathway-analysis software, they represent links between several proteins whose mRNA expression was changed in our study.
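
    The selection rule stated in this abstract (fold-change of at least 1.5 together with a p-value below 0.05) can be sketched as a simple filter. This is only an illustration under assumptions: the function name, the record layout, and the example values are hypothetical, the inputs are taken to be linear-scale group means (GC-RMA output is normally on a log2 scale and would need converting first), and the false-discovery-rate step is not included.

```python
def differentially_expressed(records, min_fold_change=1.5, max_p=0.05):
    """Return gene names that pass both the fold-change and the p-value cutoff.

    records: iterable of (gene, mean_group_a, mean_group_b, p_value), where the
    means are linear-scale expression values (hypothetical layout).
    """
    selected = []
    for gene, mean_a, mean_b, p_value in records:
        if mean_a <= 0 or mean_b <= 0:
            continue  # a ratio is undefined for non-positive means
        ratio = mean_b / mean_a
        # Treat up- and down-regulation symmetrically.
        fold_change = ratio if ratio >= 1 else 1 / ratio
        if fold_change >= min_fold_change and p_value < max_p:
            selected.append(gene)
    return selected


# Made-up values for illustration only, not data from the study.
example = [
    ("geneA", 100.0, 160.0, 0.01),   # 1.6-fold up, significant
    ("geneB", 100.0, 120.0, 0.001),  # only 1.2-fold
    ("geneC", 300.0, 150.0, 0.04),   # 2-fold down, significant
    ("geneD", 100.0, 400.0, 0.20),   # large change, not significant
]
print(differentially_expressed(example))
```

    Only geneA and geneC satisfy both cutoffs in this toy input.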

  10. Evaluation of stability and validation of reference genes for RT-qPCR expression studies in rice plants under water deficit.

    PubMed

    Auler, Priscila Ariane; Benitez, Letícia Carvalho; do Amaral, Marcelo Nogueira; Vighi, Isabel Lopes; Dos Santos Rodrigues, Gabriela; da Maia, Luciano Carlos; Braga, Eugenia Jacira Bolacel

    2017-05-01

    Many studies use strategies that allow for the identification of a large number of genes expressed in response to different stress conditions to which the plant is subjected throughout its cycle. In order to obtain accurate and reliable results in gene expression studies, it is necessary to use reference genes, which must have uniform expression in the majority of cells in the organism studied. RNA isolation of leaves and expression analysis in real-time quantitative polymerase chain reaction (RT-qPCR) were carried out. In this study, nine candidate reference genes were tested, actin 11 (ACT11), ubiquitin conjugated to E2 enzyme (UBC-E2), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), beta tubulin (β-tubulin), eukaryotic initiation factor 4α (eIF-4α), ubiquitin 10 (UBQ10), ubiquitin 5 (UBQ5), aquaporin TIP41 (TIP41-Like) and cyclophilin, in two genotypes of rice, AN Cambará and BRS Querência, with different levels of soil moisture (20%, 10% and recovery) in the vegetative (V5) and reproductive stages (period preceding flowering). Currently, there are different software packages that perform stability analyses and define the most suitable reference genes for a particular study. In this study, we used five different methods: geNorm, BestKeeper, ΔCt method, NormFinder and RefFinder. The results indicate that UBC-E2 and UBQ5 can be used as reference genes in all samples and software packages evaluated. The genes β-tubulin and eIF-4α, traditionally used as reference genes, along with GAPDH, presented lower stability values. The gene expression of basic leucine zipper (bZIP23 and bZIP72) was used to validate the selected reference genes, demonstrating that the use of an inappropriate reference can induce erroneous results.
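
    As a rough illustration of how a pairwise stability ranking of candidate reference genes can work, the sketch below computes, for each gene, the mean standard deviation of its log2 expression ratio against every other candidate across samples, in the spirit of the geNorm M value. It is not the geNorm implementation: it assumes a perfect amplification efficiency of 2 (so the log2 ratio is a plain Ct difference), and the gene names and Ct values are invented placeholders.

```python
from statistics import mean, stdev


def stability_m_values(ct):
    """Return a pairwise-variation stability score per gene (lower = more stable).

    ct: dict mapping gene name -> list of Ct values, one per sample, with the
    same sample order for every gene.
    """
    genes = list(ct)
    scores = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            # With an assumed efficiency of 2, log2(gene / other) in a sample
            # equals Ct_other - Ct_gene.
            log_ratios = [c_other - c_gene
                          for c_gene, c_other in zip(ct[gene], ct[other])]
            pairwise_sd.append(stdev(log_ratios))
        scores[gene] = mean(pairwise_sd)
    return scores


# Invented Ct values: refA and refB move together, refC fluctuates.
ct_values = {
    "refA": [20.0, 21.0, 22.0, 20.5],
    "refB": [22.0, 23.0, 24.0, 22.5],
    "refC": [18.0, 21.0, 19.0, 23.0],
}
m_values = stability_m_values(ct_values)
ranking = sorted(m_values, key=m_values.get)  # most stable first
print(ranking)
```

    In this toy input the fluctuating candidate, refC, ends up last in the ranking.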

  11. EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    PubMed Central

    Blom, Jochen; Albaum, Stefan P; Doppmeier, Daniel; Pühler, Alfred; Vorhölter, Frank-Jörg; Zakrzewski, Martha; Goesmann, Alexander

    2009-01-01

    Background The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results To support these studies EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface, where the precomputed data sets can be browsed. PMID:19457249
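
    The abstract names BLAST score ratios but does not spell out the procedure. One common way to use such ratios, sketched here purely as an assumption, is to divide each hit's bit score by the query's self-hit score and accept a gene pair as orthologous only if each gene is the other's best hit and both ratios pass a cutoff. The function names, the data layout, the fixed cutoff of 0.3, and all scores below are hypothetical.

```python
def best_hit(scores, self_scores, cutoff):
    """Map each query to its best-scoring subject if the score ratio passes.

    scores: dict query -> dict subject -> BLAST bit score.
    self_scores: dict query -> bit score of the query against itself.
    """
    best = {}
    for query, hits in scores.items():
        if not hits:
            continue
        subject = max(hits, key=hits.get)
        if hits[subject] / self_scores[query] >= cutoff:
            best[query] = subject
    return best


def reciprocal_pairs(a_vs_b, b_vs_a, self_a, self_b, cutoff=0.3):
    """Return gene pairs that are best hits of each other in both directions."""
    best_ab = best_hit(a_vs_b, self_a, cutoff)
    best_ba = best_hit(b_vs_a, self_b, cutoff)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Invented scores for two small genomes A and B.
a_vs_b = {"a1": {"b1": 180.0, "b2": 40.0}, "a2": {"b2": 50.0}}
b_vs_a = {"b1": {"a1": 175.0}, "b2": {"a2": 55.0, "a1": 30.0}}
self_a = {"a1": 200.0, "a2": 400.0}
self_b = {"b1": 190.0, "b2": 210.0}
print(reciprocal_pairs(a_vs_b, b_vs_a, self_a, self_b))
```

    Here only the a1/b1 pair survives; a2 and b2 are each other's best hits, but their score ratios fall below the assumed cutoff.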

  12. High-Throughput Identification and Screening of Novel Methylobacterium Species Using Whole-Cell MALDI-TOF/MS Analysis

    PubMed Central

    Tani, Akio; Sahin, Nurettin; Matsuyama, Yumiko; Enomoto, Takashi; Nishimura, Naoki; Yokota, Akira; Kimbara, Kazuhide

    2012-01-01

    Methylobacterium species are ubiquitous α-proteobacteria that reside in the phyllosphere and are fed by methanol that is emitted from plants. In this study, we applied whole-cell matrix-assisted laser desorption/ionization time-of-flight mass spectrometry analysis (WC-MS) to evaluate the diversity of Methylobacterium species collected from a variety of plants. The WC-MS spectrum was reproducible through two weeks of cultivation on different media. WC-MS spectrum peaks of M. extorquens strain AM1 cells were attributed to ribosomal proteins, but peaks that were not were also found. We developed a simple method for rapid identification based on spectra similarity. Using all available type strains of Methylobacterium species, the method provided a certain threshold similarity value for species-level discrimination, although the genus contains some type strains that could not be easily discriminated solely by 16S rRNA gene sequence similarity. Next, we evaluated the WC-MS data of approximately 200 methylotrophs isolated from various plants with MALDI Biotyper software (Bruker Daltonics). Isolates representing each cluster were further identified by 16S rRNA gene sequencing. In most cases, the identification by WC-MS matched that by sequencing, and isolates with unique spectra represented possible novel species. The strains belonging to M. extorquens, M. adhaesivum, M. marchantiae, M. komagatae, M. brachiatum, M. radiotolerans, and novel lineages close to M. adhaesivum, many of which were isolated from bryophytes, were found to be the most frequent phyllospheric colonizers. The WC-MS technique provides emerging high-throughputness in the identification of known/novel species of bacteria, enabling the selection of novel species in a library and identification without 16S rRNA gene sequencing. PMID:22808262

  13. Software architecture of the III/FBI segment of the FBI's integrated automated identification system

    NASA Astrophysics Data System (ADS)

    Booker, Brian T.

    1997-02-01

    This paper will describe the software architecture of the Interstate Identification Index (III/FBI) Segment of the FBI's Integrated Automated Fingerprint Identification System (IAFIS). IAFIS is currently under development, with deployment to begin in 1998. III/FBI will provide the repository of criminal history and photographs for criminal subjects, as well as identification data for military and civilian federal employees. Services provided by III/FBI include maintenance of the criminal and civil data, subject search of the criminal and civil data, and response generation services for IAFIS. III/FBI software will be comprised of both COTS and an estimated 250,000 lines of developed C code. This paper will describe the following: (1) the high-level requirements of the III/FBI software; (2) the decomposition of the III/FBI software into Computer Software Configuration Items (CSCIs); (3) the top-level design of the III/FBI CSCIs; and (4) the relationships among the developed CSCIs and the COTS products that will comprise the III/FBI software.

  14. Differentiation of Toxocara canis and Toxocara cati based on PCR-RFLP analyses of rDNA-ITS and mitochondrial cox1 and nad1 regions.

    PubMed

    Mikaeili, Fattaneh; Mathis, Alexander; Deplazes, Peter; Mirhendi, Hossein; Barazesh, Afshin; Ebrahimi, Sepideh; Kia, Eshrat Beigom

    2017-09-26

    The definitive genetic identification of Toxocara species is currently based on PCR/sequencing. The objectives of the present study were to design and conduct an in silico polymerase chain reaction-restriction fragment length polymorphism method for identification of Toxocara species. In silico analyses using the DNASIS and NEBcutter software were performed with rDNA internal transcribed spacers, and mitochondrial cox1 and nad1 sequences obtained in our previous studies along with relevant sequences deposited in GenBank. Consequently, RFLP profiles were designed and all isolates of T. canis and T. cati collected from dogs and cats in different geographical areas of Iran were investigated with the RFLP method using some of the identified suitable enzymes. The findings of in silico analyses predicted that on the cox1 gene only the MboII enzyme is appropriate for PCR-RFLP to reliably distinguish the two species. No suitable enzyme for PCR-RFLP on the nad1 gene was identified that yields the same pattern for all isolates of a species. DNASIS software showed that there are 241 suitable restriction enzymes for the differentiation of T. canis from T. cati based on ITS sequences. RsaI, MvaI and SalI enzymes were selected to evaluate the reliability of the in silico PCR-RFLP. The sizes of restriction fragments obtained by PCR-RFLP of all samples consistently matched the expected RFLP patterns. The ITS sequences are usually conserved and the PCR-RFLP approach targeting the ITS sequence is recommended for the molecular differentiation of Toxocara species and can provide a reliable tool for identification purposes particularly at the larval and egg stages.
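
    The idea behind an in silico PCR-RFLP can be illustrated with a minimal digest routine that cuts a linear amplicon at every occurrence of a recognition site and reports the fragment lengths. The sketch uses RsaI, which recognizes GTAC and cuts between T and A (blunt ends). The two sequences are toy strings, not Toxocara ITS sequences, and the routine is not how DNASIS or NEBcutter work internally; it also searches one strand only, which is sufficient for a palindromic site such as GTAC.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    site: recognition sequence on the top strand.
    cut_offset: number of bases from the start of the site to the cut.
    """
    sequence = sequence.upper()
    cuts = []
    start = 0
    while True:
        index = sequence.find(site, start)
        if index == -1:
            break
        cuts.append(index + cut_offset)
        start = index + 1
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


RSAI = ("GTAC", 2)  # GT^AC, blunt cut

# Toy amplicons of equal length that differ at a single base.
amplicon_with_site = "AAAAGTACCCCCCC"
amplicon_without_site = "AAAAGTTCCCCCCC"

print(digest(amplicon_with_site, *RSAI))
print(digest(amplicon_without_site, *RSAI))
```

    Two amplicons of identical length give different fragment patterns when only one of them carries the site, which is the property an RFLP assay exploits to tell species apart.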

  15. Gene expression profiles in rainbow trout, Onchorynchus mykiss, exposed to a simple chemical mixture.

    PubMed

    Hook, Sharon E; Skillman, Ann D; Gopalan, Banu; Small, Jack A; Schultz, Irvin R

    2008-03-01

    Among proposed uses for microarrays in environmental toxicology is the identification of key contributors to toxicity within a mixture. However, it remains uncertain whether the transcriptomic profiles resulting from exposure to a mixture have patterns of altered gene expression that contain identifiable contributions from each toxicant component. We exposed isogenic rainbow trout, Onchorynchus mykiss, to sublethal levels of ethynylestradiol, 2,2,4,4-tetrabromodiphenyl ether, and chromium VI or to a mixture of all three toxicants. Fluorescently labeled complementary DNAs (cDNAs) were generated and hybridized against a commercially available Salmonid array spotted with 16,000 cDNAs. Data were analyzed using analysis of variance (p<0.05) with a Benjamini-Hochberg multiple test correction (Genespring [Agilent] software package) to identify up- and downregulated genes. Gene clustering patterns that can be used as "expression signatures" were determined using hierarchical cluster analysis. The gene ontology terms associated with significantly altered genes were also used to identify functional groups that were associated with toxicant exposure. A cross-ontological analytics approach was used to assign functional annotations to genes with "unknown" function. Our analysis indicates that transcriptomic profiles resulting from the mixture exposure resemble those of the individual contaminant exposures, but are not a simple additive list. However, patterns of altered genes representative of each component of the mixture are clearly discernible, and the functional classes of genes altered represent the individual components of the mixture. These findings indicate that the use of microarrays to identify transcriptomic profiles may aid in the identification of key stressors within a chemical mixture, ultimately improving environmental assessment.
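
    The multiple test correction mentioned in this abstract can be sketched in a few lines. The function below computes Benjamini-Hochberg adjusted p-values with the standard step-up procedure; it is a generic illustration, not the Genespring implementation, and the input p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


raw_p = [0.01, 0.04, 0.03, 0.20]  # invented values
adjusted_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

    With these four tests, three raw p-values fall below 0.05 but only the smallest remains significant after adjustment.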

  16. Microarray evaluation of gene expression profiles in inflamed and healthy human dental pulp: the role of IL1beta and CD40 in pulp inflammation.

    PubMed

    Gatta, V; Zizzari, V L; Dd'Amico, V; Salini, L; D'Aurora, M; Franchi, S; Antonucci, I; Sberna, M T; Gherlone, E; Stuppia, L; Tetè, S

    2012-01-01

    Dental pulp undergoes a number of changes passing from healthy status to inflammation due to deep decay. These changes are regulated by several genes that are differentially expressed in inflamed and healthy dental pulp, and the knowledge of the processes underlying this differential expression is of great relevance in the identification of the pathogenesis of the disease. In this study, the gene expression profiles of inflamed and healthy dental pulps were compared by microarray analysis, and the data obtained were analyzed by Ingenuity Pathway Analysis (IPA) software. This analysis makes it possible to focus on a variety of genes typically expressed in inflamed tissues. The comparison analysis showed an increased expression of several genes in inflamed pulp, among which IL1β and CD40 were of particular interest. These results indicate that the gene expression profile of human dental pulp in different physiological and pathological conditions may become a useful tool for improving our knowledge about processes regulating pulp inflammation.

  17. Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus.

    PubMed

    Condon, David E; Tran, Phu V; Lien, Yu-Chin; Schug, Jonathan; Georgieff, Michael K; Simmons, Rebecca A; Won, Kyoung-Jae

    2018-02-05

    Identification of differentially methylated regions (DMRs) is the initial step towards the study of DNA methylation-mediated gene regulation. Previous approaches to call DMRs suffer from false prediction, use extreme resources, and/or require library installation and input conversion. We developed a new approach called Defiant to identify DMRs. Employing Weighted Welch Expansion (WWE), Defiant showed superior performance to other predictors in a series of benchmarking tests on artificial and real data. Defiant was subsequently used to investigate DNA methylation changes in iron-deficient rat hippocampus. Defiant identified DMRs close to genes associated with neuronal development and plasticity, which were not identified by its competitor. Importantly, Defiant runs between 5 and 479 times faster than currently available software packages. Also, Defiant accepts 10 different input formats widely used for DNA methylation data. Defiant effectively identifies DMRs for whole-genome bisulfite sequencing (WGBS), reduced-representation bisulfite sequencing (RRBS), Tet-assisted bisulfite sequencing (TAB-seq), and HpaII tiny fragment enrichment by ligation-mediated PCR-tag (HELP) assays.

  18. Ub-ISAP: a streamlined UNIX pipeline for mining unique viral vector integration sites from next generation sequencing data.

    PubMed

    Kamboj, Atul; Hallwirth, Claus V; Alexander, Ian E; McCowage, Geoffrey B; Kramer, Belinda

    2017-06-17

    The analysis of viral vector genomic integration sites is an important component in assessing the safety and efficiency of patient treatment using gene therapy. Alongside this clinical application, integration site identification is a key step in the genetic mapping of viral elements in mutagenesis screens that aim to elucidate gene function. We have developed a UNIX-based vector integration site analysis pipeline (Ub-ISAP) that utilises a UNIX-based workflow for automated integration site identification and annotation of both single and paired-end sequencing reads. Reads that contain viral sequences of interest are selected and aligned to the host genome, and unique integration sites are then classified as transcription start site-proximal, intragenic or intergenic. Ub-ISAP provides a reliable and efficient pipeline to generate large datasets for assessing the safety and efficiency of integrating vectors in clinical settings, with broader applications in cancer research. Ub-ISAP is available as an open source software package at https://sourceforge.net/projects/ub-isap/ .

  19. "Plasmo2D": an ancillary proteomic tool to aid identification of proteins from Plasmodium falciparum.

    PubMed

    Khachane, Amit; Kumar, Ranjit; Jain, Sanyam; Jain, Samta; Banumathy, Gowrishankar; Singh, Varsha; Nagpal, Saurabh; Tatu, Utpal

    2005-01-01

    Bioinformatics tools to aid gene and protein sequence analysis have become an integral part of biology in the post-genomic era. Release of the Plasmodium falciparum genome sequence has allowed biologists to define the gene and the predicted protein content as well as their sequences in the parasite. Using pI and molecular weight as characteristics unique to each protein, we have developed a bioinformatics tool to aid identification of proteins from Plasmodium falciparum. The tool makes use of a Virtual 2-DE generated by plotting all of the proteins from the Plasmodium database on a pI versus molecular weight scale. Proteins are identified by comparing the position of migration of desired protein spots from an experimental 2-DE and that on a virtual 2-DE. The procedure has been automated in the form of user-friendly software called "Plasmo2D". The tool can be downloaded from http://144.16.89.25/Plasmo2D.zip.
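
    The comparison of an experimental spot with a virtual 2-DE map can be pictured as a nearest-neighbour lookup in the pI versus molecular weight plane. The abstract does not describe how Plasmo2D performs the match, so the distance measure below (pI difference combined with the difference in log10 molecular weight) is an assumption, and the catalog entries are placeholders rather than Plasmodium proteins.

```python
import math


def closest_proteins(spot_pi, spot_mw_kda, catalog, top_n=3):
    """Return the names of the catalog proteins nearest to an observed spot.

    catalog: list of (name, predicted_pI, predicted_mw_kda) tuples.
    Molecular weight is compared on a log10 scale because gel migration is
    roughly logarithmic in mass (an assumption of this sketch).
    """
    def distance(entry):
        _, pi, mw_kda = entry
        return math.hypot(pi - spot_pi,
                          math.log10(mw_kda) - math.log10(spot_mw_kda))

    return [name for name, _, _ in sorted(catalog, key=distance)[:top_n]]


# Placeholder catalog; values are invented.
catalog = [
    ("protein_1", 5.2, 70.0),
    ("protein_2", 6.8, 45.0),
    ("protein_3", 9.1, 15.0),
]
print(closest_proteins(5.4, 68.0, catalog, top_n=1))
```

    A real tool would also need a tolerance, since post-translational modifications can shift a spot away from its predicted position.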

  20. Characterization of Proteoforms with Unknown Post-translational Modifications Using the MIScore

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kou, Qiang; Zhu, Binhai; Wu, Si

    Various proteoforms may be generated from a single gene due to primary structure alterations (PSAs) such as genetic variations, alternative splicing, and post-translational modifications (PTMs). Top-down mass spectrometry is capable of analyzing intact proteins and identifying patterns of multiple PSAs, making it the method of choice for studying complex proteoforms. In top-down proteomics, proteoform identification is often performed by searching tandem mass spectra against a protein sequence database that contains only one reference protein sequence for each gene or transcript variant in a proteome. Because of the incompleteness of the protein database, an identified proteoform may contain unknown PSAs compared with the reference sequence. Proteoform characterization is to identify and localize PSAs in a proteoform. Although many software tools have been proposed for proteoform identification by top-down mass spectrometry, the characterization of proteoforms in identified proteoform-spectrum matches still relies mainly on manual annotation. We propose to use the Modification Identification Score (MIScore), which is based on Bayesian models, to automatically identify and localize PTMs in proteoforms. Experiments showed that the MIScore is accurate in identifying and localizing one or two modifications.

  1. Identification of Aspergillus sections Flavi, Nigri, and Fumigati and their differentiation using specific primers.

    PubMed

    Ashtiani, Nafiseh Mohebbi; Kachuei, Reza; Yalfani, Roozbeh; Harchegani, Asghar Beigi; Nosratabadi, Mohsen

    2017-06-01

    Aspergillus species are important in medicine, agriculture and various industries. The sections Fumigati, Flavi, and Nigri are the most important members of the Aspergillus genus. This study intended to identify and separate these three Aspergillus sections and to differentiate among them using specific primers. A bioinformatics study was initially performed to analyse the sequences of five genes, namely, beta-tubulin, calmodulin, the pre-rRNA processing protein Tsr1, the DNA-replication licensing factor Mcm7, and RNA polymerase II second largest subunit (RPB2) in the three Aspergillus sections using MEGA6 software and the NCBI database. Primers were designed to select genes for each of the Aspergillus sections being analysed. A total of 134 environmental and clinical Aspergillus species were isolated, purified and initially identified by colony morphology. Subsequently, DNA was extracted using the phenol-chloroform method, specific primers were synthesized, PCR was performed for DNA from all isolates, and the results were compared to morphological characteristics. Of the 134 isolates tested, 56 were Nigri, 32 were Fumigati, 32 were Flavi, and the rest (14 isolates) belonged to other sections. The beta-tubulin and calmodulin genes were found to be the most suitable for differentiating among these three groups; the beta-tubulin gene was used for molecular identification of Aspergillus section Fumigati, and the calmodulin gene for identifying sections Flavi and Nigri.

  2. Scoring clustering solutions by their biological relevance.

    PubMed

    Gat-Viks, I; Sharan, R; Shamir, R

    2003-12-12

    A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering gene expression data into homogeneous groups was shown to be instrumental in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on clustering algorithms for gene expression analysis, very few works addressed the systematic comparison and evaluation of clustering results. Typically, different clustering algorithms yield different clustering solutions on the same data, and there is no agreed upon guideline for choosing among them. We developed a novel statistically based method for assessing a clustering solution according to prior biological knowledge. Our method can be used to compare different clustering solutions or to optimize the parameters of a clustering algorithm. The method is based on projecting vectors of biological attributes of the clustered elements onto the real line, such that the ratio of between-groups and within-group variance estimators is maximized. The projected data are then scored using a non-parametric analysis of variance test, and the score's confidence is evaluated. We validate our approach using simulated data and show that our scoring method outperforms several extant methods, including the separation to homogeneity ratio and the silhouette measure. We apply our method to evaluate results of several clustering methods on yeast cell-cycle gene expression data. The software is available from the authors upon request.

  3. Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization.

    PubMed

    Jung, Sang-Kyu; McDonald, Karen

    2011-08-16

    Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net.

  4. Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization

    PubMed Central

    2011-01-01

    Background: Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. Results: The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Conclusion: Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net. PMID:21846353

  5. Proteomic characterization of hempseed (Cannabis sativa L.).

    PubMed

    Aiello, Gilda; Fasoli, Elisa; Boschin, Giovanna; Lammi, Carmen; Zanoni, Chiara; Citterio, Attilio; Arnoldi, Anna

    2016-09-16

    This paper presents an investigation of the hempseed proteome. The experimental approach, based on combinatorial peptide ligand libraries (CPLLs), SDS-PAGE separation, nLC-ESI-MS/MS identification, and database search, permitted the identification of 181 expressed proteins in total. This very large number of identifications was achieved by searching in two databases: Cannabis sativa L. (56 gene products identified) and Arabidopsis thaliana (125 gene products identified). By performing a protein-protein association network analysis using the STRING software, it was possible to build the first interactomic map of all detected proteins, characterized by 137 nodes and 410 interactions. Finally, a Gene Ontology analysis of the identified species made it possible to classify their molecular functions: the great majority are involved in seed metabolic processes (41%), responses to stimulus (8%), and biological process (7%). Hempseed is an underexploited non-legume protein-rich seed. Although its protein is well known for its digestibility, essential amino acid composition, and useful techno-functional properties, a comprehensive proteome characterization is still lacking. The objective of this work was to fill this knowledge gap and provide information useful for a better exploitation of this seed in different food products. Copyright © 2016 Elsevier B.V. All rights reserved.

  6. Alternatives for jet engine control

    NASA Technical Reports Server (NTRS)

    Sain, M. K.

    1983-01-01

    Tensor model order reduction, recursive tensor model identification, input design for tensor model identification, software development for nonlinear feedback control laws based upon tensors, and development of the CATNAP software package for tensor modeling, identification and simulation were studied. The last of these are discussed.

  7. Improved methods and resources for paramecium genomics: transcription units, gene annotation and gene expression.

    PubMed

    Arnaiz, Olivier; Van Dijk, Erwin; Bétermier, Mireille; Lhuillier-Akakpo, Maoussi; de Vanssay, Augustin; Duharcourt, Sandra; Sallet, Erika; Gouzy, Jérôme; Sperling, Linda

    2017-06-26

    The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage. We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for 3 other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource. We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTR will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis regulatory motifs). The P. tetraurelia improved transcriptome resource, gene annotations for P. tetraurelia, P. biaurelia, P. sexaurelia and P. caudatum, and Paramecium-trained EuGene configuration are available through ParameciumDB (http://paramecium.i2bc.paris-saclay.fr). TrUC software is freely distributed under a GNU GPL v3 licence (https://github.com/oarnaiz/TrUC).

  8. INfORM: Inference of NetwOrk Response Modules.

    PubMed

    Marwah, Veer Singh; Kinaret, Pia Anneli Sofia; Serra, Angela; Scala, Giovanni; Lauerma, Antti; Fortino, Vittorio; Greco, Dario

    2018-06-15

    Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R shiny application that enables non-expert users to detect, evaluate and select gene modules with high statistical and biological significance. INfORM is a comprehensive tool for the identification of biologically meaningful response modules from consensus gene networks inferred by using multiple algorithms. It is accessible through an intuitive graphical user interface allowing for a level of abstraction from the computational steps. INfORM is freely available for academic use at https://github.com/Greco-Lab/INfORM. Supplementary data are available at Bioinformatics online.

  9. MetNet: Software to Build and Model the Biogenetic Lattice of Arabidopsis

    DOE PAGES

    Wurtele, Eve Syrkin; Li, Jie; Diao, Lixia; ...

    2003-01-01

    MetNet (http://www.botany.iastate.edu/∼mash/metnetex/metabolicnetex.html) is publicly available software in development for analysis of genome-wide RNA, protein and metabolite profiling data. The software is designed to enable the biologist to visualize, statistically analyse and model a metabolic and regulatory network map of Arabidopsis, combined with gene expression profiling data. It contains a JAVA interface to an interactions database (MetNetDB) containing information on regulatory and metabolic interactions derived from a combination of web databases (TAIR, KEGG, BRENDA) and input from biologists in their area of expertise. FCModeler captures input from MetNetDB in a graphical form. Sub-networks can be identified and interpreted using simple fuzzy cognitive maps. FCModeler is intended to develop and evaluate hypotheses, and provide a modelling framework for assessing the large amounts of data captured by high-throughput gene expression experiments. FCModeler and MetNetDB are currently being extended to three-dimensional virtual reality display. The MetNet map, together with gene expression data, can be viewed using multivariate graphics tools in GGobi linked with the data analytic tools in R. Users can highlight different parts of the metabolic network and see the relevant expression data highlighted in other data plots. Multi-dimensional expression data can be rotated through different dimensions. Statistical analysis can be computed alongside the visual. MetNet is designed to provide a framework for the formulation of testable hypotheses regarding the function of specific genes, and in the long term provide the basis for identification of metabolic and regulatory networks that control plant composition and development.

  10. LitMiner and WikiGene: identifying problem-related key players of gene regulation using publication abstracts.

    PubMed

    Maier, Holger; Döhr, Stefanie; Grote, Korbinian; O'Keeffe, Sean; Werner, Thomas; Hrabé de Angelis, Martin; Schneider, Ralf

    2005-07-01

    The LitMiner software is a literature data-mining tool that facilitates the identification of major gene regulation key players related to a user-defined field of interest in PubMed abstracts. The prediction of gene-regulatory relationships is based on co-occurrence analysis of key terms within the abstracts. LitMiner predicts relationships between key terms from the biomedical domain in four categories (genes, chemical compounds, diseases and tissues). Owing to the limitations (no direction, unverified automatic prediction) of the co-occurrence approach, the primary data in the LitMiner database represent postulated basic gene-gene relationships. The usefulness of the LitMiner system has been demonstrated recently in a study that reconstructed disease-related regulatory networks by promoter modelling that was initiated by a LitMiner generated primary gene list. To overcome the limitations and to verify and improve the data, we developed WikiGene, a Wiki-based curation tool that allows revision of the data by expert users over the Internet. LitMiner (http://andromeda.gsf.de/litminer) and WikiGene (http://andromeda.gsf.de/wiki) can be used unrestricted with any Internet browser.
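
    The co-occurrence analysis described above can be sketched in a few lines. This is only an illustration of the general idea, counting how many abstracts mention both terms of a pair; it is not LitMiner's implementation. The function name, the naive substring matching and the example sentences are all assumptions made for the sketch.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(abstracts, key_terms):
    """Count, for every pair of key terms, the abstracts mentioning both.

    Matching is a naive case-insensitive substring test (an assumption of
    this sketch); a real system would need proper term recognition.
    Pairs are unordered, reflecting that co-occurrence carries no direction.
    """
    terms = [term.lower() for term in key_terms]
    counts = Counter()
    for text in abstracts:
        lowered = text.lower()
        present = sorted({term for term in terms if term in lowered})
        counts.update(combinations(present, 2))
    return counts


# Made-up example sentences standing in for PubMed abstracts.
abstracts = [
    "TP53 regulates MDM2 in tumour tissue.",
    "MDM2 binds TP53.",
    "BRCA1 is expressed in breast tissue.",
]
counts = cooccurrence_counts(abstracts, ["TP53", "MDM2", "BRCA1"])
print(counts[("mdm2", "tp53")])  # 2
```

    Because the counts say nothing about direction or mechanism, pairs found this way are postulated relationships only, which matches the limitation the abstract names and the motivation for manual curation.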

  11. Validation of Reference Genes for Relative Quantitative Gene Expression Studies in Cassava (Manihot esculenta Crantz) by Using Quantitative Real-Time PCR

    PubMed Central

    Hu, Meizhen; Hu, Wenbin; Xia, Zhiqiang; Zhou, Xincheng; Wang, Wenquan

    2016-01-01

    Reverse transcription quantitative real-time polymerase chain reaction (real-time PCR, also referred to as quantitative RT-PCR or RT-qPCR) is a highly sensitive and high-throughput method used to study gene expression. Despite the numerous advantages of RT-qPCR, its accuracy is strongly influenced by the stability of internal reference genes used for normalizations. To date, few studies on the identification of reference genes have been performed on cassava (Manihot esculenta Crantz). Therefore, we selected 26 candidate reference genes mainly via the three following channels: reference genes used in previous studies on cassava, the orthologs of the most stable Arabidopsis genes, and the sequences obtained from 32 cassava transcriptome sequence data. Then, we employed ABI 7900 HT and SYBR Green PCR mix to assess the expression of these genes in 21 materials obtained from various cassava samples under different developmental and environmental conditions. The stability of gene expression was analyzed using two statistical algorithms, namely geNorm and NormFinder. geNorm software suggests the combination of cassava4.1_017977 and cassava4.1_006391 as sufficient reference genes for major cassava samples, the union of cassava4.1_014335 and cassava4.1_006884 as best choice for drought stressed samples, and the association of cassava4.1_012496 and cassava4.1_006391 as optimal choice for normally grown samples. NormFinder software recommends cassava4.1_006884 or cassava4.1_006776 as superior reference for qPCR analysis of different materials and organs of drought stressed or normally grown cassava, respectively. Results provide an important resource for cassava reference genes under specific conditions. The limitations of these findings were also discussed. Furthermore, we suggested some strategies that may be used to select candidate reference genes. PMID:27242878
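
    For readers unfamiliar with how geNorm ranks candidates, the sketch below computes a geNorm-style stability value M: for each gene, the average over all other candidates of the standard deviation of the log2 expression ratio across samples. Lower M means more stable expression. It assumes the input is relative quantities (not raw Ct values); the gene names and numbers are invented and are unrelated to the cassava data.

```python
import math
from statistics import stdev


def genorm_m(expression):
    """Return a geNorm-style stability value M for each candidate gene.

    expression: dict mapping gene name -> list of relative quantities,
                one per sample, all strictly positive, same sample order.
    """
    genes = list(expression)
    if len(genes) < 2:
        raise ValueError("need at least two candidate genes")
    m_values = {}
    for gene in genes:
        pairwise_sds = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            # Pairwise variation: SD of the log ratio across samples.
            pairwise_sds.append(stdev(log_ratios))
        m_values[gene] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Invented quantities: refA and refB vary in parallel, "noisy" does not.
quantities = {
    "refA": [1, 2, 4, 8],
    "refB": [2, 4, 8, 16],
    "noisy": [1, 8, 2, 16],
}
m = genorm_m(quantities)
print(min(m, key=m.get) in {"refA", "refB"})  # True
```

    The full geNorm procedure goes further, repeatedly dropping the least stable gene and recomputing M; that stepwise exclusion is omitted here.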

  12. SPIRE, a modular pipeline for eQTL analysis of RNA-Seq data, reveals a regulatory hotspot controlling miRNA expression in C. elegans.

    PubMed

    Kel, Ivan; Chang, Zisong; Galluccio, Nadia; Romeo, Margherita; Beretta, Stefano; Diomede, Luisa; Mezzelani, Alessandra; Milanesi, Luciano; Dieterich, Christoph; Merelli, Ivan

    2016-10-18

    The interpretation of genome-wide association study is difficult, as it is hard to understand how polymorphisms can affect gene regulation, in particular for trans-regulatory elements located far from their controlling gene. Using RNA or protein expression data as phenotypes, it is possible to correlate their variations with specific genotypes. This technique is usually referred to as expression Quantitative Trait Loci (eQTLs) analysis and only few packages exist for the integration of genotype patterns and expression profiles. In particular, tools are needed for the analysis of next-generation sequencing (NGS) data on a genome-wide scale, which is essential to identify eQTLs able to control a large number of genes (hotspots). Here we present SPIRE (Software for Polymorphism Identification Regulating Expression), a generic, modular and functionally highly flexible pipeline for eQTL processing. SPIRE integrates different univariate and multivariate approaches for eQTL analysis, paying particular attention to the scalability of the procedure in order to support cis- as well as trans-mapping, thus allowing the identification of hotspots in NGS data. In particular, we demonstrated how SPIRE can handle big association study datasets, reproducing published results and improving the identification of trans-eQTLs. Furthermore, we employed the pipeline to analyse novel data concerning the genotypes of two different C. elegans strains (N2 and Hawaii) and related miRNA expression data, obtained using RNA-Seq. A miRNA regulatory hotspot was identified in chromosome 1, overlapping the transcription factor grh-1, known to be involved in the early phases of embryonic development of C. elegans. In a follow-up qPCR experiment we were able to verify most of the predicted eQTLs, as well as to show, for a novel miRNA, a significant difference in the sequences of the two analysed strains of C. elegans. 
SPIRE is publicly available as open source software at , together with some example data, a readme file, supplementary material and a short tutorial.

  13. Robust identification of transcriptional regulatory networks using a Gibbs sampler on outlier sum statistic.

    PubMed

    Gu, Jinghua; Xuan, Jianhua; Riggins, Rebecca B; Chen, Li; Wang, Yue; Clarke, Robert

    2012-08-01

    Identification of transcriptional regulatory networks (TRNs) is of significant importance in computational biology for cancer research, providing a critical building block to unravel disease pathways. However, existing methods for TRN identification suffer from the inclusion of excessive 'noise' in microarray data and false-positives in binding data, especially when applied to human tumor-derived cell line studies. More robust methods that can counteract the imperfection of data sources are therefore needed for reliable identification of TRNs in this context. In this article, we propose to establish a link between the quality of one target gene to represent its regulator and the uncertainty of its expression to represent other target genes. Specifically, an outlier sum statistic was used to measure the aggregated evidence for regulation events between target genes and their corresponding transcription factors. A Gibbs sampling method was then developed to estimate the marginal distribution of the outlier sum statistic, hence, to uncover underlying regulatory relationships. To evaluate the effectiveness of our proposed method, we compared its performance with that of an existing sampling-based method using both simulation data and yeast cell cycle data. The experimental results show that our method consistently outperforms the competing method in different settings of signal-to-noise ratio and network topology, indicating its robustness for biological applications. Finally, we applied our method to breast cancer cell line data and demonstrated its ability to extract biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. The Gibbs sampler MATLAB package is freely available at http://www.cbil.ece.vt.edu/software.htm. Contact: xuan@vt.edu. Supplementary data are available at Bioinformatics online.

  14. Robust identification of transcriptional regulatory networks using a Gibbs sampler on outlier sum statistic

    PubMed Central

    Gu, Jinghua; Xuan, Jianhua; Riggins, Rebecca B.; Chen, Li; Wang, Yue; Clarke, Robert

    2012-01-01

    Motivation: Identification of transcriptional regulatory networks (TRNs) is of significant importance in computational biology for cancer research, providing a critical building block to unravel disease pathways. However, existing methods for TRN identification suffer from the inclusion of excessive ‘noise’ in microarray data and false-positives in binding data, especially when applied to human tumor-derived cell line studies. More robust methods that can counteract the imperfection of data sources are therefore needed for reliable identification of TRNs in this context. Results: In this article, we propose to establish a link between the quality of one target gene to represent its regulator and the uncertainty of its expression to represent other target genes. Specifically, an outlier sum statistic was used to measure the aggregated evidence for regulation events between target genes and their corresponding transcription factors. A Gibbs sampling method was then developed to estimate the marginal distribution of the outlier sum statistic, hence, to uncover underlying regulatory relationships. To evaluate the effectiveness of our proposed method, we compared its performance with that of an existing sampling-based method using both simulation data and yeast cell cycle data. The experimental results show that our method consistently outperforms the competing method in different settings of signal-to-noise ratio and network topology, indicating its robustness for biological applications. Finally, we applied our method to breast cancer cell line data and demonstrated its ability to extract biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. Availability and implementation: The Gibbs sampler MATLAB package is freely available at http://www.cbil.ece.vt.edu/software.htm. Contact: xuan@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22595208

  15. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    PubMed Central

    2010-01-01

    Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value < 0.05). Enrichment ratio 2 calculations showed that > 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped redundant clones together and illustrated that the SSHscreen plots are a useful tool for choosing anonymous clones for sequencing, since redundant clones cluster together on the enrichment ratio plots. Conclusions: We developed the SSHscreen-SSHdb software pipeline, which greatly facilitates gene discovery using suppression subtractive hybridization by improving the selection of clones for sequencing after screening the library on a small number of microarrays. Annotation of the sequence information and collaboration was further enhanced through a web-based SSHdb database, and we illustrated this through identification of drought responsive genes from cowpea, which can now be investigated in gene function studies. SSH is a popular and powerful gene discovery tool, and therefore this pipeline will have application for gene discovery in any biological system, particularly non-model organisms. SSHscreen 2.0.1 and a link to SSHdb are available from http://microarray.up.ac.za/SSHscreen. PMID:20359330

  16. Identification of 28 cytochrome P450 genes from the transcriptome of the marine rotifer Brachionus plicatilis and analysis of their expression.

    PubMed

    Kim, Hui-Su; Han, Jeonghoon; Kim, Hee-Jin; Hagiwara, Atsushi; Lee, Jae-Seong

    2017-09-01

    Whole transcriptomes of the rotifer Brachionus plicatilis were analyzed using an Illumina sequencer. De novo assembly was performed with 49,122,780 raw reads using Trinity software. Among the assembled 42,820 contigs, 27,437 putative open reading frame contigs were identified (average length 1235bp; N50=1707bp). Functional gene annotation with Gene Ontology and InterProScan, in addition to Kyoto Encyclopedia of Genes and Genomes pathway analysis, highlighted the metabolism of xenobiotics by cytochrome P450 (CYP). In addition, 28 CYP genes were identified, and their transcriptional responses to benzo[α]pyrene (B[α]P) were investigated. Most of the CYPs were significantly upregulated or downregulated (P<0.05) in response to B[α]P, suggesting that Bp-CYP genes play a crucial role in detoxification mechanisms in response to xenobiotics. This study sheds light on the molecular defense mechanisms of the rotifer B. plicatilis in response to exposure to various chemicals. Copyright © 2017 Elsevier Inc. All rights reserved.
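
    The N50 figure quoted above is a standard assembly summary: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembled length. A minimal sketch follows, with made-up contig lengths rather than the rotifer assembly; the function name `n50` is chosen for this illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until the
    running total reaches at least half of the total assembled length; the
    length of the contig that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Made-up contig lengths in bp: total 290, half 145, reached at the 70 bp contig.
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```

    Because it is weighted towards long contigs, N50 is usually larger than the average contig length, as in the values reported in the abstract.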

  17. Software For Computer-Security Audits

    NASA Technical Reports Server (NTRS)

    Arndt, Kate; Lonsford, Emily

    1994-01-01

    Information relevant to potential breaches of security gathered efficiently. Automated Auditing Tools for VAX/VMS program includes following automated software tools performing noted tasks: Privileged ID Identification, program identifies users and their privileges to circumvent existing computer security measures; Critical File Protection, critical files not properly protected identified; Inactive ID Identification, identifications of users no longer in use found; Password Lifetime Review, maximum lifetimes of passwords of all identifications determined; and Password Length Review, minimum allowed length of passwords of all identifications determined. Written in DEC VAX DCL language.

  18. Identification of key genes in Gram-positive and Gram-negative sepsis using stochastic perturbation

    PubMed Central

    Li, Zhenliang; Zhang, Ying; Liu, Yaling; Liu, Yanchun; Li, Youyi

    2017-01-01

    Sepsis is an inflammatory response to pathogens (such as Gram-positive and Gram-negative bacteria), which has high morbidity and mortality in critically ill patients. The present study aimed to identify the key genes in Gram-positive and Gram-negative sepsis. GSE6535 was downloaded from Gene Expression Omnibus, containing 17 control samples, 18 Gram-positive samples and 25 Gram-negative samples. Subsequently, the limma package in R was used to screen the differentially expressed genes (DEGs). Hierarchical clustering was conducted for the specific DEGs in Gram-positive and Gram-negative samples using cluster software and the TreeView software. To analyze the correlation of samples at the gene level, a similarity network was constructed using Cytoscape software. Functional and pathway enrichment analyses were conducted for the DEGs using DAVID. Finally, stochastic perturbation was used to determine the significantly differential functions between Gram-positive and Gram-negative samples. A total of 340 and 485 DEGs were obtained in Gram-positive and Gram-negative samples, respectively. Hierarchical clustering revealed that there were significant differences between control and sepsis samples. In Gram-positive and Gram-negative samples, myeloid cell leukemia sequence 1 was associated with apoptosis and programmed cell death. Additionally, NADH:ubiquinone oxidoreductase subunit S4 was associated with mitochondrial respiratory chain complex I assembly. Stochastic perturbation analysis revealed that NADH:ubiquinone oxidoreductase subunit B2 (NDUFB2), NDUFB8 and ubiquinol-cytochrome c reductase hinge protein (UQCRH) were associated with cellular respiration in Gram-negative samples, whereas large tumor suppressor kinase 2 (LATS2) was associated with G1/S transition of the mitotic cell cycle in Gram-positive samples. NDUFB2, NDUFB8 and UQCRH may be biomarkers for Gram-negative sepsis, whereas LATS2 may be a biomarker for Gram-positive sepsis.
These findings may promote the therapies of sepsis caused by Gram-positive and Gram-negative bacteria. PMID:28714002

  19. Searching for Beta-Haemolysin hlb Gene in Staphylococcus pseudintermedius with Species-Specific Primers.

    PubMed

    Kmieciak, Wioletta; Szewczyk, Eligia M; Ciszewski, Marcin

    2016-07-01

    The paper presents an analysis of 51 Staphylococcus pseudintermedius clinically isolated strains from humans and from animals. Staphylococcus pseudintermedius strains' ability to produce β-haemolysin was evaluated with phenotypic methods (hot-cold effect, reverse CAMP test). In order to determine the hlb gene presence (coding for β-haemolysin) in a genomic DNA, PCR reactions were conducted with two different pairs of primers: one described in the literature for Staphylococcus aureus and recommended for analysing SIG group staphylococci and newly designed one in CLC Main Workbench software. Only reactions with newly designed primers resulted in product amplification, the presence of which was fully compatible with the results of phenotypic β-haemolysin test. Negative results for S. aureus and S. intermedius reference ATCC strains suggest that after further analysis the fragment of hlb gene amplified with primers described in this study might be included in the process of S. pseudintermedius strains identification.

  20. A mass graph-based approach for the identification of modified proteoforms using top-down tandem mass spectra

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kou, Qiang; Wu, Si; Tolić, Nikola

    Motivation: Although proteomics has rapidly developed in the past decade, researchers are still in the early stage of exploring the world of complex proteoforms, which are protein products with various primary structure alterations resulting from gene mutations, alternative splicing, post-translational modifications, and other biological processes. Proteoform identification is essential to mapping proteoforms to their biological functions as well as discovering novel proteoforms and new protein functions. Top-down mass spectrometry is the method of choice for identifying complex proteoforms because it provides a “bird’s eye view” of intact proteoforms. The combinatorial explosion of various alterations on a protein may result in billions of possible proteoforms, making proteoform identification a challenging computational problem. Results: We propose a new data structure, called the mass graph, for efficient representation of proteoforms and design mass graph alignment algorithms. We developed TopMG, a mass graph-based software tool for proteoform identification by top-down mass spectrometry. Experiments on top-down mass spectrometry data sets showed that TopMG outperformed existing methods in identifying complex proteoforms.

  1. Selection of Reference Genes for Quantitative Gene Expression in Porcine Mesenchymal Stem Cells Derived from Various Sources along with Differentiation into Multilineages

    PubMed Central

    Lee, Won-Jae; Jeon, Ryoung-Hoon; Jang, Si-Jung; Park, Ji-Sung; Lee, Seung-Chan; Baregundi Subbarao, Raghavendra; Lee, Sung-Lim; Park, Bong-Wook; King, William Allan; Rho, Gyu-Jin

    2015-01-01

    The identification of stable reference genes is a prerequisite for ensuring accurate validation of gene expression, yet too little is known about stable reference genes of porcine MSCs. The present study was, therefore, conducted to assess the stability of reference genes in porcine MSCs derived from bone marrow (BMSCs), adipose (AMSCs), and skin (SMSCs) with their in vitro differentiated cells into mesenchymal lineages such as adipocytes, osteocytes, and chondrocytes. Twelve commonly used reference genes were investigated for their threshold cycle (Ct) values by qRT-PCR. The Ct values of candidate reference genes were analyzed by geNorm software to clarify stable expression regardless of experimental conditions. Thus, Pearson's correlation was applied to determine correlation between the three most stable reference genes (NF3) and optimal number of reference genes (NFopt). In assessment of stability of reference gene across experimental conditions by geNorm analysis, undifferentiated MSCs and each differentiated status into mesenchymal lineages showed slightly different results but similar patterns about more or less stable rankings. Furthermore, Pearson's correlation revealed high correlation (r > 0.9) between NF3 and NFopt. Overall, the present study showed that HMBS, YWHAZ, SDHA, and TBP are suitable reference genes for qRT-PCR in porcine MSCs. PMID:25972899

  2. Validation of reference genes for quantitative real-time PCR during leaf and flower development in Petunia hybrida

    PubMed Central

    2010-01-01

    Background Identification of genes with invariant levels of gene expression is a prerequisite for validating transcriptomic changes accompanying development. Ideally expression of these genes should be independent of the morphogenetic process or environmental condition tested as well as the methods used for RNA purification and analysis. Results In an effort to identify endogenous genes meeting these criteria nine reference genes (RG) were tested in two Petunia lines (Mitchell and V30). Growth conditions differed in Mitchell and V30, and different methods were used for RNA isolation and analysis. Four different software tools were employed to analyze the data. We merged the four outputs by means of a non-weighted unsupervised rank aggregation method. The genes identified as optimal for transcriptomic analysis of Mitchell and V30 were EF1α in Mitchell and CYP in V30, whereas the least suitable gene was GAPDH in both lines. Conclusions The least adequate gene turned out to be GAPDH indicating that it should be rejected as reference gene in Petunia. The absence of correspondence of the best-suited genes suggests that assessing reference gene stability is needed when performing normalization of data from transcriptomic analysis of flower and leaf development. PMID:20056000
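
    As a rough illustration of merging several tool outputs by non-weighted rank aggregation, the sketch below averages each gene's rank position across tools (a Borda-style mean rank). This is an assumed stand-in and not necessarily the aggregation method the authors used; the function name and the four example rankings are hypothetical.

```python
def aggregate_ranks(rankings):
    """Merge rankings (each a list ordered from most to least stable).

    Every tool gets equal weight; genes are ordered by mean rank position,
    with ties broken alphabetically for reproducibility.
    """
    genes = rankings[0]
    mean_rank = {
        g: sum(r.index(g) + 1 for r in rankings) / len(rankings) for g in genes
    }
    return sorted(genes, key=lambda g: (mean_rank[g], g))

# Hypothetical outputs of four stability tools for three candidate genes.
tool_rankings = [
    ["EF1a", "CYP", "GAPDH"],
    ["CYP", "EF1a", "GAPDH"],
    ["EF1a", "CYP", "GAPDH"],
    ["EF1a", "GAPDH", "CYP"],
]
consensus = aggregate_ranks(tool_rankings)
```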

  3. In Silico Identification Software (ISIS): A Machine Learning Approach to Tandem Mass Spectral Identification of Lipids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kangas, Lars J.; Metz, Thomas O.; Isaac, Georgis

    2012-05-15

    Liquid chromatography-mass spectrometry-based metabolomics has gained importance in the life sciences, yet it is not supported by software tools for high throughput identification of metabolites based on their fragmentation spectra. An algorithm (ISIS: in silico identification software) and its implementation are presented and show great promise in generating in silico spectra of lipids for the purpose of structural identification. Instead of using chemical reaction rate equations or rules-based fragmentation libraries, the algorithm uses machine learning to find accurate bond cleavage rates in a mass spectrometer employing collision-induced dissociation tandem mass spectrometry. A preliminary test of the algorithm with 45 lipids from a subset of lipid classes shows both high sensitivity and specificity.

  4. Identification of a mouse synaptic glycoprotein gene in cultured neurons.

    PubMed

    Yu, Albert Cheung-Hoi; Sun, Chun Xiao; Li, Qiang; Liu, Hua Dong; Wang, Chen Ran; Zhao, Guo Ping; Jin, Meilei; Lau, Lok Ting; Fung, Yin-Wan Wendy; Liu, Shuang

    2005-10-01

    Neuronal differentiation and aging are known to involve many genes, which may also be differentially expressed during these developmental processes. From primary cultured cerebral cortical neurons, we previously identified various differentially expressed gene transcripts using the technique of arbitrarily primed PCR (RAP-PCR). Among these transcripts, clone 0-2 was found to have high homology to rat and human synaptic glycoprotein. By in silico analysis using an EST database and the FACTURA software, the full-length sequence of 0-2 was assembled and the clone was named mouse synaptic glycoprotein homolog 2 (mSC2). DNA sequencing revealed that the mSC2 transcript is smaller than the human and rat homologs. RT-PCR indicated that mSC2 was expressed differentially at various culture days. The mSC2 gene was detected in various tissues, with higher expression in brain, lung, and liver. Functions of mSC2 in neurons and other tissues remain elusive and will require more investigation.

  5. Comprehensive Analysis of Gene Expression Profiles of Sepsis-Induced Multiorgan Failure Identified Its Valuable Biomarkers.

    PubMed

    Wang, Yumei; Yin, Xiaoling; Yang, Fang

    2018-02-01

    Sepsis is an inflammatory-related disease, and severe sepsis would induce multiorgan dysfunction, which is the most common cause of death of patients in noncoronary intensive care units. Progression of novel therapeutic strategies has proven to be of little impact on the mortality of severe sepsis, and unfortunately, its mechanisms still remain poorly understood. In this study, we analyzed gene expression profiles of severe sepsis with failure of lung, kidney, and liver for the identification of potential biomarkers. We first downloaded the gene expression profiles from the Gene Expression Omnibus and performed preprocessing of raw microarray data sets and identification of differential expression genes (DEGs) through the R programming software; then, significantly enriched functions of DEGs in lung, kidney, and liver failure sepsis samples were obtained from the Database for Annotation, Visualization, and Integrated Discovery; finally, protein-protein interaction network was constructed for DEGs based on the STRING database, and network modules were also obtained through the MCODE cluster method. As a result, lung failure sepsis has the highest number of DEGs of 859, whereas the number of DEGs in kidney and liver failure sepsis samples is 178 and 175, respectively. In addition, 17 overlaps were obtained among the three lists of DEGs. Biological processes related to immune and inflammatory response were found to be significantly enriched in DEGs. Network and module analysis identified four gene clusters in which all or most of genes were upregulated. The expression changes of Icam1 and Socs3 were further validated through quantitative PCR analysis. This study should shed light on the development of sepsis and provide potential therapeutic targets for sepsis-induced multiorgan failure.

  6. Identification of causal genes for complex traits

    PubMed Central

    Hormozdiari, Farhad; Kichaev, Gleb; Yang, Wen-Yun; Pasaniuc, Bogdan; Eskin, Eleazar

    2015-01-01

    Motivation: Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider ‘causal variants’ as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations. Results: In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that is able to operate across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that it provides as output a minimally sized set of genes that captures the genes which harbor causal variants with probability ρ. Through extensive simulations, we demonstrate that our method not only speeds up computation, but also has an average of 10% higher recall rate compared with the existing approaches. We validate our method using a real mouse high-density lipoprotein (HDL) dataset and show that CAVIAR-Gene is able to identify Apoa2 (a gene known to harbor causal variants for HDL), while reducing the number of genes that need to be tested for functionality by a factor of 2. Availability and implementation: Software is freely available for download at genetics.cs.ucla.edu/caviar. Contact: eeskin@cs.ucla.edu PMID:26072484

  7. Identification of causal genes for complex traits.

    PubMed

    Hormozdiari, Farhad; Kichaev, Gleb; Yang, Wen-Yun; Pasaniuc, Bogdan; Eskin, Eleazar

    2015-06-15

    Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider 'causal variants' as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations. In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that is able to operate across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that it provides as output a minimally sized set of genes that captures the genes which harbor causal variants with probability ρ. Through extensive simulations, we demonstrate that our method not only speeds up computation, but also has an average of 10% higher recall rate compared with the existing approaches. We validate our method using a real mouse high-density lipoprotein (HDL) dataset and show that CAVIAR-Gene is able to identify Apoa2 (a gene known to harbor causal variants for HDL), while reducing the number of genes that need to be tested for functionality by a factor of 2. Software is freely available for download at genetics.cs.ucla.edu/caviar. © The Author 2015. Published by Oxford University Press.

  8. FragIdent--automatic identification and characterisation of cDNA-fragments.

    PubMed

    Seelow, Dominik; Goehler, Heike; Hoffmann, Katrin

    2009-03-02

    Many genetic studies and functional assays are based on cDNA fragments. After the generation of cDNA fragments from an mRNA sample, their content is at first unknown and must be assigned by sequencing reactions or hybridisation experiments. Even in characterised libraries, a considerable number of clones are wrongly annotated. Furthermore, mix-ups can happen in the laboratory. It is therefore essential to the relevance of experimental results to confirm or determine the identity of the employed cDNA fragments. However, the manual approach for the characterisation of these fragments using BLAST web interfaces is not suited for larger number of sequences and so far, no user-friendly software is publicly available. Here we present the development of FragIdent, an application for the automatic identification of open reading frames (ORFs) within cDNA-fragments. The software performs BLAST analyses to identify the genes represented by the sequences and suggests primers to complete the sequencing of the whole insert. Gene-specific information as well as the protein domains encoded by the cDNA fragment are retrieved from Internet-based databases and included in the output. The application features an intuitive graphical interface and is designed for researchers without any bioinformatics skills. It is suited for projects comprising up to several hundred different clones. We used FragIdent to identify 84 cDNA clones from a yeast two-hybrid experiment. Furthermore, we identified 131 protein domains within our analysed clones. The source code is freely available from our homepage at http://compbio.charite.de/genetik/FragIdent/.

  9. Identification of Suitable Reference Genes for Investigating Gene Expression in Anterior Cruciate Ligament Injury by Using Reverse Transcription-Quantitative PCR.

    PubMed

    Leal, Mariana Ferreira; Astur, Diego Costa; Debieux, Pedro; Arliani, Gustavo Gonçalves; Silveira Franciozi, Carlos Eduardo; Loyola, Leonor Casilla; Andreoli, Carlos Vicente; Smith, Marília Cardoso; Pochini, Alberto de Castro; Ejnisman, Benno; Cohen, Moises

    2015-01-01

    The anterior cruciate ligament (ACL) is one of the most frequently injured structures during high-impact sporting activities. Gene expression analysis may be a useful tool for understanding ACL tears and healing failure. Reverse transcription-quantitative polymerase chain reaction (RT-qPCR) has emerged as an effective method for such studies. However, this technique requires the use of suitable reference genes for data normalization. Here, we evaluated the suitability of six reference genes (18S, ACTB, B2M, GAPDH, HPRT1, and TBP) by using ACL samples of 39 individuals with ACL tears (20 with isolated ACL tears and 19 with ACL tear and combined meniscal injury) and of 13 controls. The stability of the candidate reference genes was determined by using the NormFinder, geNorm, BestKeeper, DataAssist, and RefFinder software packages and the comparative ΔCt method. ACTB was the best single reference gene and ACTB+TBP was the best gene pair. The GenEx software showed that the accumulated standard deviation is reduced when a larger number of reference genes is used for gene expression normalization. However, the use of a single reference gene may not be suitable. To identify the optimal combination of reference genes, we evaluated the expression of FN1 and PLOD1. We observed that at least 3 reference genes should be used. ACTB+HPRT1+18S is the best trio for the analyses involving isolated ACL tears and controls. Conversely, ACTB+TBP+18S is the best trio for the analyses involving (1) injured ACL tears and controls, and (2) ACL tears of patients with meniscal tears and controls. Therefore, if the gene expression study aims to compare non-injured ACL, isolated ACL tears and ACL tears from patients with meniscal tear as three independent groups, ACTB+TBP+18S+HPRT1 should be used. In conclusion, 3 or more genes should be used as reference genes for analysis of ACL samples of individuals with and without ACL tears.

  10. De novo transcriptome assembly in chili pepper (Capsicum frutescens) to identify genes involved in the biosynthesis of capsaicinoids.

    PubMed

    Liu, Shaoqun; Li, Wanshun; Wu, Yimin; Chen, Changming; Lei, Jianjun

    2013-01-01

    The capsaicinoids are a group of compounds produced by chili pepper fruits and are used widely in many fields, especially for medical purposes. The capsaicinoid biosynthetic pathway has not yet been established clearly. To better understand capsaicinoid biosynthesis, we applied RNA-seq to a mixture of placenta and pericarp of pungent pepper (Capsicum frutescens L.). We have assessed the effect of various assembly parameters using different assembly software, and obtained one of the best strategies for de novo assembly of transcriptome data. We obtained a total of 54,045 high-quality unigenes (transcripts) using Trinity software. About 92.65% of unigenes showed similarity to the public protein sequences, genome of potato and tomato and pepper (C. annuum) ESTs databases. Our results predicted 3 new structural genes (DHAD, TD, PAT), which filled gaps of the capsaicinoid biosynthetic pathway predicted by Mazourek, and revealed new candidate genes involved in capsaicinoid biosynthesis based on KEGG (Kyoto Encyclopedia of Genes and Genomes) analysis. A significant number of SSR (Simple Sequence Repeat) and SNP (Single Nucleotide Polymorphism) markers were predicted in C. frutescens and C. annuum sequences, which will be helpful in the identification of polymorphisms within chili pepper populations. These data will provide new insights into the pathway of capsaicinoid biosynthesis and subsequent research on chili peppers. In addition, our strategy of de novo transcriptome assembly is applicable to a wide range of similar studies.

  11. De Novo Transcriptome Assembly in Chili Pepper (Capsicum frutescens) to Identify Genes Involved in the Biosynthesis of Capsaicinoids

    PubMed Central

    Liu, Shaoqun; Li, Wanshun; Wu, Yimin; Chen, Changming; Lei, Jianjun

    2013-01-01

    The capsaicinoids are a group of compounds produced by chili pepper fruits and are used widely in many fields, especially for medical purposes. The capsaicinoid biosynthetic pathway has not yet been established clearly. To better understand capsaicinoid biosynthesis, we applied RNA-seq to a mixture of placenta and pericarp of pungent pepper (Capsicum frutescens L.). We have assessed the effect of various assembly parameters using different assembly software, and obtained one of the best strategies for de novo assembly of transcriptome data. We obtained a total of 54,045 high-quality unigenes (transcripts) using Trinity software. About 92.65% of unigenes showed similarity to the public protein sequences, genome of potato and tomato and pepper (C. annuum) ESTs databases. Our results predicted 3 new structural genes (DHAD, TD, PAT), which filled gaps of the capsaicinoid biosynthetic pathway predicted by Mazourek, and revealed new candidate genes involved in capsaicinoid biosynthesis based on KEGG (Kyoto Encyclopedia of Genes and Genomes) analysis. A significant number of SSR (Simple Sequence Repeat) and SNP (Single Nucleotide Polymorphism) markers were predicted in C. frutescens and C. annuum sequences, which will be helpful in the identification of polymorphisms within chili pepper populations. These data will provide new insights into the pathway of capsaicinoid biosynthesis and subsequent research on chili peppers. In addition, our strategy of de novo transcriptome assembly is applicable to a wide range of similar studies. PMID:23349661

  12. Identification of differentially expressed genes in the oviduct of two rabbit lines divergently selected for uterine capacity using suppression subtractive hybridization.

    PubMed

    Ballester, M; Castelló, A; Peiró, R; Argente, M J; Santacreu, M A; Folch, J M

    2013-06-01

    Suppressive subtractive hybridization libraries from oviduct at 62 h post-mating of two lines of rabbits divergently selected for uterine capacity were generated to identify differentially expressed genes. A total of 438 singletons and 126 contigs were obtained by cluster assembly and sequence alignment of 704 expressed sequence tags (ESTs), of which 54% showed homology to known proteins of the non-redundant NCBI databases. Differential screening by dot blot validated 71 ESTs, of which 47 showed similarity to known genes. Transcripts of genes were functionally annotated in the molecular function and the biological process gene ontology categories using the BLAST2GO software and were assigned to reproductive developmental process, immune response, amino acid metabolism and degradation, response to stress and apoptosis terms. Finally, three interesting genes, PGR, HSD17B4 and ERO1L, were identified as overexpressed in the low line using RT-qPCR. Our study provides a list of candidate genes that can be useful to understanding the molecular mechanisms underlying the phenotypic differences observed in early embryo survival and development traits. © 2012 The Authors, Animal Genetics © 2012 Stichting International Foundation for Animal Genetics.

  13. Empirical Bayes conditional independence graphs for regulatory network recovery.

    PubMed

    Mahdi, Rami; Madduri, Abishek S; Wang, Guoqing; Strulovici-Barel, Yael; Salit, Jacqueline; Hackett, Neil R; Crystal, Ronald G; Mezey, Jason G

    2012-08-01

    Computational inference methods that make use of graphical models to extract regulatory networks from gene expression data can have difficulty reconstructing dense regions of a network, a consequence of both computational complexity and unreliable parameter estimation when sample size is small. As a result, identification of hub genes is of special difficulty for these methods. We present a new algorithm, Empirical Light Mutual Min (ELMM), for large network reconstruction that has properties well suited for recovery of graphs with high-degree nodes. ELMM reconstructs the undirected graph of a regulatory network using empirical Bayes conditional independence testing with a heuristic relaxation of independence constraints in dense areas of the graph. This relaxation allows only one gene of a pair with a putative relation to be aware of the network connection, an approach that is aimed at easing multiple testing problems associated with recovering densely connected structures. Using in silico data, we show that ELMM has better performance than commonly used network inference algorithms including GeneNet, ARACNE, FOCI, GENIE3 and GLASSO. We also apply ELMM to reconstruct a network among 5492 genes expressed in human lung airway epithelium of healthy non-smokers, healthy smokers and individuals with chronic obstructive pulmonary disease assayed using microarrays. The analysis identifies dense sub-networks that are consistent with known regulatory relationships in the lung airway and also suggests novel hub regulatory relationships among a number of genes that play roles in oxidative stress and secretion. Software for running ELMM is made available at http://mezeylab.cb.bscb.cornell.edu/Software.aspx. Contact: ramimahdi@yahoo.com or jgm45@cornell.edu. Supplementary data are available at Bioinformatics online.

  14. Identification of Candida Species Using MP65 Gene and Evaluation of the Candida albicans MP65 Gene Expression in BALB/C Mice.

    PubMed

    Bineshian, Farahnaz; Yadegari, Mohammad Hossien; Sharifi, Zohre; Akbari Eidgahi, Mohammadreza; Nasr, Reza

    2015-05-01

    Systemic candidiasis is a major public health concern, particularly in immunocompromised people, such as patients with neutropenia, patients with Acquired Immune Deficiency Syndrome (AIDS) and cancer who are undergoing antiballistic chemotherapy or bone marrow transplants, and people with diabetes. Since the clinical signs and symptoms are nonspecific, early diagnosis is often difficult. The 65-kDa mannoprotein (MP65) gene of Candida albicans is appropriate for detection and identification of systemic candidiasis. This gene encodes a putative β-glucanase mannoprotein of 65 kDa, which plays a major role in the host-fungus relationship, morphogenesis and pathogenicity. The current study aimed to identify different species of Candida (C. albicans, C. glabrata and C. parapsilosis) using the Polymerase Chain Reaction (PCR) technique and also to evaluate C. albicans MP65 gene expression in BALB/C mice. All yeast isolates were identified on cornmeal agar supplemented with tween-80, germ tube formation in serum, and assimilation of carbon sources in the API 20 C AUX yeast identification system. Polymerase Chain Reaction was performed on all samples using species-specific primers for the MP65 65 kDa gene. After RNA extraction, cDNA synthesis was performed by the Maxime RT Pre Mix kit. Candida albicans MP65 gene expression was evaluated by quantitative Real-Time (q Real-Time) and Real-Time (RT) PCR techniques. The 2^-ΔΔCT method was used to analyze relative changes in gene expression of MP65. For statistical analysis, the nonparametric Wilcoxon test was applied using the SPSS version 16 software. Using biochemical methods, one hundred, six and one isolates of clinical samples were determined as C. albicans, C. glabrata and C. parapsilosis, respectively. Species-specific primers for PCR experiments were applied to clinical specimens, and in all cases a single expected band for C. albicans, C. glabrata and C. parapsilosis was obtained (475, 361 and 124 base pairs, respectively). All species isolated by culture methods (100% positivity) were evaluated with PCR using species-specific primers to identify Candida species. Relative expression of Mp65 genes increased significantly after C. albicans injection into the mice (P < 0.05). The results of the current study showed that the PCR method is reproducible for rapid identification of Candida species with specific primers. Mp65 gene expression of C. albicans after injection into the mice was 2.3-fold higher than before injection, with this difference being significant. These results indicated that increase of Mp65 gene expression might be an early stage of infection; however, definitive conclusions require further studies.
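
    The relative quantification named in the abstract above, the 2-ΔΔCT method (that is, 2 raised to the power of minus ΔΔCT), can be written out as a short sketch. It assumes roughly 100% amplification efficiency for both the target and the reference gene; the function name and the Ct values are hypothetical and are not data from the study.

```python
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference), computed separately per condition
    ddCt = dCt(treated) - dCt(control)
    fold = 2 ** (-ddCt)
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** (-dd_ct)

# Hypothetical Ct values: the target crosses threshold one cycle earlier
# (relative to the reference gene) after treatment than before.
example_fold = fold_change(22.0, 18.0, 23.0, 18.0)
```

    With these made-up numbers ΔΔCT is -1, so the target is estimated to be expressed twice as strongly after treatment.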

  15. Identification of candidate regions for a novel Usher syndrome type II locus.

    PubMed

    Ben Rebeh, Imen; Benzina, Zeineb; Dhouib, Houria; Hadjamor, Imen; Amyere, Mustapha; Ayadi, Leila; Turki, Khalil; Hammami, Bouthaina; Kmiha, Noureddine; Kammoun, Hassen; Hakim, Bochra; Charfedine, Ilhem; Vikkula, Miikka; Ghorbel, Abdelmonem; Ayadi, Hammadi; Masmoudi, Saber

    2008-09-19

    Chronic diseases affecting the inner ear and the retina cause severe impairments to our communication systems. In more than half of the cases, Usher syndrome (USH) is the origin of these double defects. Patients with USH type II (USH2) have retinitis pigmentosa (RP) that develops during puberty, moderate to severe hearing impairment with downsloping pure-tone audiogram, and normal vestibular function. Four loci and three genes are known for USH2. In this study, we proposed to localize the gene responsible for USH2 in a consanguineous family of Tunisian origin. Affected members underwent detailed ocular and audiologic characterization. One Tunisian family with USH2 and 45 healthy controls unrelated to the family were recruited. Two affected and six unaffected family members attended our study. DNA samples of eight family members were genotyped with polymorphic markers. Two-point and multipoint LOD scores were calculated using Genehunter software v2.1. Sequencing was used to investigate candidate genes. Haplotype analysis showed no significant linkage to any known USH gene or locus. A genome-wide screen, using microsatellite markers, was performed, allowing the identification of three homozygous regions in chromosomes 2, 4, and 15. We further confirmed and refined these three regions using microsatellite and single-nucleotide polymorphisms. With recessive mode of inheritance, the highest multipoint LOD score of 1.765 was identified for the candidate regions on chromosomes 4 and 15. The chromosome 15 locus is large (55 Mb), underscoring the limited number of meioses in the consanguineous pedigree. Moreover, the linked, homozygous chromosome 15q alleles, unlike those of the chromosome 2 and 4 loci, are infrequent in the local population. Thus, the data strongly suggest that the novel locus for USH2 is likely to reside on 15q. Our data provide a basis for the localization and the identification of a novel gene implicated in USH2, most likely localized on 15q.
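
    For readers unfamiliar with LOD scores, the sketch below computes the simplest case: a two-point, phase-known LOD score at a given recombination fraction θ. It is only an illustration of the statistic; it does not reproduce the multipoint calculation performed by Genehunter, and the function name and the counts are hypothetical.

```python
import math

def two_point_lod(recombinants, meioses, theta):
    """LOD(theta) = log10( L(theta) / L(0.5) ) for phase-known meioses.

    L(theta) = theta**recombinants * (1 - theta)**(meioses - recombinants)
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked

# Hypothetical example: 1 recombinant in 10 informative meioses, theta = 0.1.
example_lod = two_point_lod(1, 10, 0.1)
```

    A small pedigree yields few informative meioses, which caps the attainable LOD score regardless of how tightly the marker is linked.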

  16. Multiplex polymerase chain reaction for identification of Escherichia coli, Escherichia albertii and Escherichia fergusonii.

    PubMed

    Lindsey, Rebecca L; Garcia-Toledo, L; Fasulo, D; Gladney, L M; Strockbine, N

    2017-09-01

    Escherichia coli, Escherichia albertii, and Escherichia fergusonii are closely related bacteria that can cause illness in humans, such as bacteremia, urinary tract infections and diarrhea. Current identification strategies for these three species vary in complexity and typically rely on the use of multiple phenotypic and genetic tests. To facilitate their rapid identification, we developed a multiplex PCR assay targeting conserved, species-specific genes. We used the Daydreamer™ (Pattern Genomics, USA) software platform to concurrently analyze whole genome sequence assemblies (WGS) from 150 Enterobacteriaceae genomes (107 E. coli, 5 Shigella spp., 21 E. albertii, 12 E. fergusonii and 5 other species) and design primers for the following species-specific regions: a 212bp region of the cyclic di-GMP regulator gene (cdgR, AW869_22935 from genome K-12 MG1655, CP014225) for E. coli/Shigella; a 393bp region of the DNA-binding transcriptional activator of cysteine biosynthesis gene (EAKF1_ch4033 from genome KF1, CP007025) for E. albertii; and a 575bp region of the palmitoleoyl-acyl carrier protein (ACP)-dependent acyltransferase (EFER_0790 from genome ATCC 35469, CU928158) for E. fergusonii. We incorporated the species-specific primers into a conventional multiplex PCR assay and assessed its performance with a collection of 97 Enterobacteriaceae strains. The assay was 100% sensitive and specific for detecting the expected species and offers a quick and accurate strategy for identifying E. coli, E. albertii, and E. fergusonii in either a single reaction or by in silico PCR with sequence assemblies. Published by Elsevier B.V.
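
    The idea of in silico PCR against a sequence assembly can be sketched as a search for a forward primer and the reverse complement of a reverse primer, reporting the predicted amplicon length. This is a deliberately simplified illustration (exact matches only, one strand, first hit only), not the assay or software used by the authors; the function names, primers, and template are made up.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))

def in_silico_pcr(template, fwd_primer, rev_primer):
    """Return the predicted amplicon length, or None if no product is found.

    Simplifications: exact primer matches, forward strand of the template
    only, and the first binding site of each primer.
    """
    start = template.find(fwd_primer)
    if start == -1:
        return None
    rev_site = revcomp(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return end + len(rev_site) - start

# Hypothetical template: 3 nt flank, forward site, 5 nt spacer, reverse site, 2 nt flank.
toy_template = "TTT" + "ACGTAC" + "GGGGG" + "CATTGA" + "AA"
toy_amplicon = in_silico_pcr(toy_template, "ACGTAC", "TCAATG")
```

    In a multiplex setting, the same search would be run once per primer pair, and the set of amplicon lengths found would indicate which species-specific targets are present.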

  17. SSTAR, a Stand-Alone Easy-To-Use Antimicrobial Resistance Gene Predictor.

    PubMed

    de Man, Tom J B; Limbago, Brandi M

    2016-01-01

    We present the easy-to-use Sequence Search Tool for Antimicrobial Resistance, SSTAR. It combines a locally executed BLASTN search against a customizable database with an intuitive graphical user interface for identifying antimicrobial resistance (AR) genes from genomic data. Although the database is initially populated from a public repository of acquired resistance determinants (i.e., ARG-ANNOT), it can be customized for particular pathogen groups and resistance mechanisms. For instance, outer membrane porin sequences associated with carbapenem resistance phenotypes can be added, and known intrinsic mechanisms can be included. Unique about this tool is the ability to easily detect putative new alleles and truncated versions of existing AR genes. Variants and potential new alleles are brought to the attention of the user for further investigation. For instance, SSTAR is able to identify modified or truncated versions of porins, which may be of great importance in carbapenemase-negative carbapenem-resistant Enterobacteriaceae. SSTAR is written in Java and is therefore platform independent and compatible with both Windows and Unix operating systems. SSTAR and its manual, which includes a simple installation guide, are freely available from https://github.com/tomdeman-bio/Sequence-Search-Tool-for-Antimicrobial-Resistance-SSTAR-. IMPORTANCE Whole-genome sequencing (WGS) is quickly becoming a routine method for identifying genes associated with antimicrobial resistance (AR). However, for many microbiologists, the use and analysis of WGS data present a substantial challenge. We developed SSTAR, software with a graphical user interface that enables the identification of known AR genes from WGS and has the unique capacity to easily detect new variants of known AR genes, including truncated protein variants. Current software solutions do not notify the user when genes are truncated and, therefore, likely nonfunctional, which makes phenotype predictions less accurate. SSTAR users can apply any AR database of interest as a reference comparator and can manually add genes that impact resistance, even if such genes are not resistance determinants per se (e.g., porins and efflux pumps).

  18. Identification of single nucleotide polymorphism in protein phosphatase 1 regulatory subunit 11 gene in Murrah bulls

    PubMed Central

    Jain, Varsha; Patel, Brijesh; Umar, Farhat Paul; Ajithakumar, H. M.; Gurjar, Suraj K.; Gupta, I. D.; Verma, Archana

    2017-01-01

    Aim: This study was conducted with the objective to identify single nucleotide polymorphisms (SNPs) in the protein phosphatase 1 regulatory subunit 11 (PPP1R11) gene in Murrah bulls. Materials and Methods: Genomic DNA was isolated by the phenol–chloroform extraction method from the frozen semen samples of 65 Murrah bulls maintained at Artificial Breeding Research Centre, ICAR-National Dairy Research Institute, Karnal. The quality and concentration of DNA were checked by spectrophotometer reading and agarose gel electrophoresis. The target region of the PPP1R11 gene was amplified using four sets of primers designed based on the Bos taurus reference sequence. The amplified products were sequenced and aligned using Clustal Omega for identification of SNPs. Animals were genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) using EcoNI restriction enzyme. Results: The sequences in the NCBI accession number NW_005785016.1 for Bubalus bubalis were compared and aligned with the edited sequences of Murrah bulls with Clustal Omega software. A total of 10 SNPs were found, of which 1 was in the 5’UTR, 3 in intron 1, and 6 in intron 2. PCR-RFLP using restriction enzyme EcoNI revealed only the AA genotype, indicating monomorphism in the PPP1R11 gene of all Murrah animals included in the study. Conclusion: A total of 10 SNPs were found. PCR-RFLP revealed only the AA genotype, indicating monomorphism in the PPP1R11 gene of all Murrah animals included in the study, due to which association analysis with conception rate was not feasible. PMID:28344410

  19. Using RNA Sequence and Structure for the Prediction of Riboswitch Aptamer: A Comprehensive Review of Available Software and Tools

    PubMed Central

    Antunes, Deborah; Jorge, Natasha A. N.; Caffarena, Ernesto R.; Passetti, Fabio

    2018-01-01

    RNA molecules are essential players in many fundamental biological processes. Prokaryotes and eukaryotes have distinct RNA classes with specific structural features and functional roles. Computational prediction of protein structures is a research field in which high confidence three-dimensional protein models can be proposed based on the sequence alignment between target and templates. However, to date, only a few approaches have been developed for the computational prediction of RNA structures. Similar to proteins, RNA structures may be altered due to the interaction with various ligands, including proteins, other RNAs, and metabolites. A riboswitch is a molecular mechanism, found in the three kingdoms of life, in which the RNA structure is modified by the binding of a metabolite. It can regulate multiple gene expression mechanisms, such as transcription, translation initiation, and mRNA splicing and processing. Due to their nature, these entities also act on the regulation of gene expression and detection of small metabolites and have the potential to help in the discovery of new classes of antimicrobial agents. In this review, we describe software and web servers currently available for riboswitch aptamer identification and secondary and tertiary structure prediction, including applications. PMID:29403526

  20. PanACEA: a bioinformatics tool for the exploration and visualization of bacterial pan-chromosomes.

    PubMed

    Clarke, Thomas H; Brinkac, Lauren M; Inman, Jason M; Sutton, Granger; Fouts, Derrick E

    2018-06-27

    Bacterial pan-genomes, comprised of conserved and variable genes across multiple sequenced bacterial genomes, allow for identification of genomic regions that are phylogenetically discriminating or functionally important. Pan-genomes consist of large amounts of data, which can restrict researchers' ability to locate and analyze these regions. Multiple software packages are available to visualize pan-genomes, but currently their ability to address these concerns is limited by using only pre-computed data sets, prioritizing core over variable gene clusters, or by not accounting for pan-chromosome positioning in the viewer. We introduce PanACEA (Pan-genome Atlas with Chromosome Explorer and Analyzer), which utilizes locally-computed interactive web-pages to view ordered pan-genome data. It consists of multi-tiered, hierarchical display pages that extend from pan-chromosomes to both core and variable regions to single genes. Regions and genes are functionally annotated to allow for rapid searching and visual identification of regions of interest with the option that user-supplied genomic phylogenies and metadata can be incorporated. PanACEA's memory and time requirements are within the capacities of standard laptops. The capability of PanACEA as a research tool is demonstrated by highlighting a variable region important in differentiating strains of Enterobacter hormaechei. PanACEA can rapidly translate the results of pan-chromosome programs into an intuitive and interactive visual representation. It will empower researchers to visually explore and identify regions of the pan-chromosome that are most biologically interesting, and to obtain publication quality images of these regions.

  1. Functional Analysis of OMICs Data and Small Molecule Compounds in an Integrated "Knowledge-Based" Platform.

    PubMed

    Dubovenko, Alexey; Nikolsky, Yuri; Rakhmatulin, Eugene; Nikolskaya, Tatiana

    2017-01-01

    Analysis of NGS and other sequencing data, gene variants, gene expression, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high fidelity annotated knowledgebase of protein interactions, pathways, and functional ontologies. This knowledgebase has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here, we present MetaCore™ and Key Pathway Advisor (KPA), an integrated platform for functional data analysis. On the content side, MetaCore and KPA encompass a comprehensive database of molecular interactions of different types, pathways, network models, and ten functional ontologies covering human, mouse, and rat genes. The analytical toolkit includes tools for gene/protein list enrichment analysis, statistical "interactome" tool for the identification of over- and under-connected proteins in the dataset, and a biological network analysis module made up of network generation algorithms and filters. The suite also features Advanced Search, an application for combinatorial search of the database content, as well as a Java-based tool called Pathway Map Creator for drawing and editing custom pathway maps. Applications of MetaCore and KPA include molecular mode of action of disease research, identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds and clinical applications (analysis of large cohorts of patients, and translational and personalized medicine).

  2. Oligo Design: a computer program for development of probes for oligonucleotide microarrays.

    PubMed

    Herold, Keith E; Rasooly, Avraham

    2003-12-01

    Oligonucleotide microarrays have demonstrated potential for the analysis of gene expression, genotyping, and mutational analysis. Our work focuses primarily on the detection and identification of bacteria based on known short sequences of DNA. Oligo Design, the software described here, automates several design aspects that enable the improved selection of oligonucleotides for use with microarrays for these applications. Two major features of the program are: (i) a tiling algorithm for the design of short overlapping temperature-matched oligonucleotides of variable length, which are useful for the analysis of single nucleotide polymorphisms and (ii) a set of tools for the analysis of multiple alignments of gene families and related short DNA sequences, which allow for the identification of conserved DNA sequences for PCR primer selection and variable DNA sequences for the selection of unique probes for identification. Note that the program does not address the full genome perspective but, instead, is focused on the genetic analysis of short segments of DNA. The program is Internet-enabled and includes a built-in browser and the automated ability to download sequences from GenBank by specifying the GI number. The program also includes several utilities, including audio recital of a DNA sequence (useful for verifying sequences against a written document), a random sequence generator that provides insight into the relationship between melting temperature and GC content, and a PCR calculator.
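
    The tiling of overlapping, temperature-matched oligonucleotides described in this record can be sketched as follows. This is only an illustration, not Oligo Design's actual algorithm: the abstract does not say which melting-temperature model the program uses, so the sketch assumes the simple Wallace rule, Tm ≈ 2(A+T) + 4(G+C). The function names, target temperature, length bounds, and step size are all hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Rough melting temperature (deg C) of a short oligo by the Wallace rule.

    Assumption: this model is chosen for illustration only; the abstract does
    not state which Tm estimate Oligo Design uses.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def tile_temperature_matched(seq, target_tm=60, min_len=15, max_len=30, step=5):
    """Return (start, oligo, tm) tiles whose estimated Tm reaches target_tm.

    A new tile starts every `step` bases. Each tile is extended one base at a
    time from min_len up to max_len until its estimated Tm reaches the target,
    so GC-rich regions yield shorter tiles than AT-rich regions.
    """
    tiles = []
    for start in range(0, len(seq) - min_len + 1, step):
        for length in range(min_len, max_len + 1):
            oligo = seq[start:start + length]
            if len(oligo) < length:
                break  # ran off the end of the sequence
            tm = wallace_tm(oligo)
            if tm >= target_tm:
                tiles.append((start, oligo, tm))
                break
    return tiles
```

    Under these assumptions a GC-only sequence reaches the target at 15 bases while an AT-only sequence needs 30, which is one way to see the relationship between melting temperature and GC content that the record mentions.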

  3. DNAGear--a free software for spa type identification in Staphylococcus aureus.

    PubMed

    AL-Tam, Faroq; Brunel, Anne-Sophie; Bouzinbi, Nicolas; Corne, Philippe; Bañuls, Anne-Laure; Shahbazkia, Hamid Reza

    2012-11-19

    Staphylococcus aureus is both a human commensal and an important human pathogen, responsible for community-acquired and nosocomial infections ranging from superficial wound infections to invasive infections, such as osteomyelitis, bacteremia and endocarditis, pneumonia or toxic shock syndrome with a mortality rate up to 40%. S. aureus reveals a high genetic polymorphism and detecting the genotypes is extremely useful to manage and prevent possible outbreaks and to understand the route of infection. One current and widespread typing method is based on the X region of the spa gene, composed of a succession of repeats of 21 to 27 bp. More than 10000 types are known. Extracting the repeats is impossible by hand and needs dedicated software. Unfortunately the only software on the market is a commercial program from Ridom. This article presents DNAGear, free and open source software with a user friendly interface, written entirely in Java on top of the NetBeans Platform, to perform spa typing, detect new repeats and new spa types, and automatically synchronize the files with the open access database. The installation is easy and the application is platform independent. In fact, the spa identification is a formal regular expression matching problem and the results are 100% exact. As the program is using Java embedded modules written over string manipulation of well established algorithms, the exactitude of the solution is perfectly established. DNAGear is able to identify the types of the S. aureus sequences and detect both new types and repeats. Compared to manual processing, which is time consuming and error prone, this application saves a lot of time and effort and gives very reliable results. Additionally, the users do not need to prepare the forward-reverse sequences manually, or even by using additional tools. They can simply create them in DNAGear and perform the typing task. In short, researchers who do not have commercial software will benefit a lot from this application.
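
    The record frames spa identification as a regular expression matching problem. A minimal sketch of that idea follows, assuming a lookup table of known repeats. The repeat sequences and IDs below are invented for illustration (real spa repeats are 21 to 27 bp and come from the typing database), and the function name is hypothetical; this is not DNAGear's code.

```python
import re

# Hypothetical repeat table: these sequences and IDs are made up and are
# NOT real spa repeats; they only keep the example short.
REPEATS = {
    "AAAGAAGACGGC": "r01",
    "AAAGAAGATGGC": "r02",
    "GAGGAAGACAAC": "r03",
}

# One alternation over all known repeats, longest first, so a longer repeat
# is preferred over a shorter repeat that happens to be its prefix.
REPEAT_RE = re.compile("|".join(sorted(REPEATS, key=len, reverse=True)))


def repeat_succession(x_region: str):
    """Read the X region left to right as a succession of known repeats.

    Returns (ids, remainder). An empty remainder means the whole region was
    explained by known repeats; a non-empty remainder is the unmatched tail,
    which a real tool might examine as a candidate new repeat.
    """
    ids = []
    pos = 0
    while pos < len(x_region):
        m = REPEAT_RE.match(x_region, pos)  # anchored at pos, not a search
        if m is None:
            return ids, x_region[pos:]
        ids.append(REPEATS[m.group()])
        pos = m.end()
    return ids, ""
```

    The resulting list of repeat IDs is what would then be looked up against a table of known spa types; that lookup is omitted here because the table format is not described in the abstract.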

  4. DNAGear- a free software for spa type identification in Staphylococcus aureus

    PubMed Central

    2012-01-01

    Background Staphylococcus aureus is both a human commensal and an important human pathogen, responsible for community-acquired and nosocomial infections ranging from superficial wound infections to invasive infections, such as osteomyelitis, bacteremia and endocarditis, pneumonia or toxic shock syndrome with a mortality rate up to 40%. S. aureus reveals a high genetic polymorphism and detecting the genotypes is extremely useful to manage and prevent possible outbreaks and to understand the route of infection. One current and widespread typing method is based on the X region of the spa gene, composed of a succession of repeats of 21 to 27 bp. More than 10000 types are known. Extracting the repeats is impossible by hand and needs dedicated software. Unfortunately the only software on the market is a commercial program from Ridom. Findings This article presents DNAGear, free and open source software with a user friendly interface, written entirely in Java on top of the NetBeans Platform, to perform spa typing, detect new repeats and new spa types, and automatically synchronize the files with the open access database. The installation is easy and the application is platform independent. In fact, the spa identification is a formal regular expression matching problem and the results are 100% exact. As the program is using Java embedded modules written over string manipulation of well established algorithms, the exactitude of the solution is perfectly established. Conclusions DNAGear is able to identify the types of the S. aureus sequences and detect both new types and repeats. Compared to manual processing, which is time consuming and error prone, this application saves a lot of time and effort and gives very reliable results. Additionally, the users do not need to prepare the forward-reverse sequences manually, or even by using additional tools. They can simply create them in DNAGear and perform the typing task. In short, researchers who do not have commercial software will benefit a lot from this application. PMID:23164452

  5. A Review of Feature Extraction Software for Microarray Gene Expression Data

    PubMed Central

    Tan, Ching Siang; Ting, Wai Soon; Mohamad, Mohd Saberi; Chan, Weng Howe; Deris, Safaai; Ali Shah, Zuraini

    2014-01-01

    When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method. PMID:25250315

  6. PAINT: a promoter analysis and interaction network generation tool for gene regulatory network identification.

    PubMed

    Vadigepalli, Rajanikanth; Chakravarthula, Praveen; Zak, Daniel E; Schwaber, James S; Gonye, Gregory E

    2003-01-01

    We have developed a bioinformatics tool named PAINT that automates the promoter analysis of a given set of genes for the presence of transcription factor binding sites. Based on coincidence of regulatory sites, this tool produces an interaction matrix that represents a candidate transcriptional regulatory network. This tool currently consists of (1) a database of promoter sequences of known or predicted genes in the Ensembl annotated mouse genome database, (2) various modules that can retrieve and process the promoter sequences for binding sites of known transcription factors, and (3) modules for visualization and analysis of the resulting set of candidate network connections. This information provides a substantially pruned list of genes and transcription factors that can be examined in detail in further experimental studies on gene regulation. Also, the candidate network can be incorporated into network identification methods in the form of constraints on feasible structures in order to render the algorithms tractable for large-scale systems. The tool can also produce output in various formats suitable for use in external visualization and analysis software. In this manuscript, PAINT is demonstrated in two case studies involving analysis of differentially regulated genes chosen from two microarray data sets. The first set is from a neuroblastoma N1E-115 cell differentiation experiment, and the second set is from neuroblastoma N1E-115 cells at different time intervals following exposure to neuropeptide angiotensin II. PAINT is available for use as an agent in BioSPICE simulation and analysis framework (www.biospice.org), and can also be accessed via a WWW interface at www.dbi.tju.edu/dbi/tools/paint/.
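
    The interaction matrix described in this record (genes against transcription factors, filled in from binding sites found in each promoter) can be illustrated with a small sketch. It assumes exact matching of a single consensus string per factor, which is far simpler than a real binding-site search; the gene names, promoter sequences, and function name are hypothetical, and nothing here reflects PAINT's actual modules.

```python
def interaction_matrix(promoters, motifs):
    """Build a binary gene-by-TF matrix from promoter sequences.

    promoters: dict mapping gene name -> promoter sequence
    motifs:    dict mapping TF name -> consensus binding site (exact string;
               an assumption made to keep the example short)

    Returns (tf_names, matrix) where matrix[gene] is a list of 0/1 flags in
    the order of tf_names; 1 means the site occurs in that gene's promoter.
    """
    tf_names = sorted(motifs)
    matrix = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        matrix[gene] = [1 if motifs[tf] in seq else 0 for tf in tf_names]
    return tf_names, matrix


# Toy input: hypothetical genes with short made-up promoter fragments.
promoters = {"geneA": "ttgacgtcaatt", "geneB": "ccTATAAAgg"}
motifs = {"CREB": "TGACGTCA", "TBP": "TATAAA"}
tf_names, matrix = interaction_matrix(promoters, motifs)
```

    Each row of such a matrix is a set of candidate regulators for one gene, which is the kind of pruned list the abstract says can be passed on to further experiments or used as structural constraints in network identification.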

  7. MS Data Miner: a web-based software tool to analyze, compare, and share mass spectrometry protein identifications.

    PubMed

    Dyrlund, Thomas F; Poulsen, Ebbe T; Scavenius, Carsten; Sanggaard, Kristian W; Enghild, Jan J

    2012-09-01

    Data processing and analysis of proteomics data are challenging and time consuming. In this paper, we present MS Data Miner (MDM) (http://sourceforge.net/p/msdataminer), a freely available web-based software solution aimed at minimizing the time required for the analysis, validation, data comparison, and presentation of data files generated in MS software, including Mascot (Matrix Science), Mascot Distiller (Matrix Science), and ProteinPilot (AB Sciex). The program was developed to significantly decrease the time required to process large proteomic data sets for publication. This open sourced system includes a spectra validation system and an automatic screenshot generation tool for Mascot-assigned spectra. In addition, a Gene Ontology term analysis function and a tool for generating comparative Excel data reports are included. We illustrate the benefits of MDM during a proteomics study comprised of more than 200 LC-MS/MS analyses recorded on an AB Sciex TripleTOF 5600, identifying more than 3000 unique proteins and 3.5 million peptides. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Microarray Data Analysis of Space Grown Arabidopsis Leaves for Genes Important in Vascular Patterning. Analysis of Space Grown Arabidopsis with Microarray Data from GeneLab: Identification of Genes Important in Vascular Patterning

    NASA Technical Reports Server (NTRS)

    Weitzel, A. J.; Wyatt, S. E.; Parsons-Wingerter, P.

    2016-01-01

    Venation patterning in leaves is a major determinant of photosynthesis efficiency because of its dependency on vascular transport of photo-assimilates, water, and minerals. Arabidopsis thaliana grown in microgravity show delayed growth and leaf maturation. Gene expression data from the roots, hypocotyl, and leaves of A. thaliana grown during spaceflight vs. ground control analyzed by Affymetrix microarray are available through NASA's GeneLab (GLDS-7). We analyzed the data for differential expression of genes in leaves resulting from the effects of spaceflight on vascular patterning. Two genes were found by preliminary analysis to be up-regulated during spaceflight that may be related to vascular formation. The genes are responsible for coding an ARGOS (Auxin-Regulated Gene Involved in Organ Size)-like protein (potentially affecting cell elongation in the leaves), and an F-box/kelch-repeat protein (possibly contributing to protoxylem specification). Further analysis that will focus on raw data quality assessment and a moderated t-test may further confirm up-regulation of the two genes and/or identify other gene candidates. Plants defective in these genes will then be assessed for phenotype by the mapping and quantification of leaf vascular patterning by NASA's VESsel GENeration (VESGEN) software to model specific vascular differences of plants grown in spaceflight.

  9. LipidMatch: an automated workflow for rule-based lipid identification using untargeted high-resolution tandem mass spectrometry data.

    PubMed

    Koelmel, Jeremy P; Kroeger, Nicholas M; Ulmer, Candice Z; Bowden, John A; Patterson, Rainey E; Cochran, Jason A; Beecher, Christopher W W; Garrett, Timothy J; Yost, Richard A

    2017-07-10

    Lipids are ubiquitous and serve numerous biological functions; thus lipids have been shown to have great potential as candidates for elucidating biomarkers and pathway perturbations associated with disease. Methods expanding coverage of the lipidome increase the likelihood of biomarker discovery and could lead to more comprehensive understanding of disease etiology. We introduce LipidMatch, an R-based tool for lipid identification for liquid chromatography tandem mass spectrometry workflows. LipidMatch currently has over 250,000 lipid species spanning 56 lipid types contained in in silico fragmentation libraries. Unique fragmentation libraries, compared to other open source software, include oxidized lipids, bile acids, sphingosines, and previously uncharacterized adducts, including ammoniated cardiolipins. LipidMatch uses rule-based identification. For each lipid type, the user can select which fragments must be observed for identification. Rule-based identification allows for correct annotation of lipids based on the fragments observed, unlike typical identification based solely on spectral similarity scores, where over-reporting structural details that are not conferred by fragmentation data is common. Another unique feature of LipidMatch is ranking lipid identifications for a given feature by the sum of fragment intensities. For each lipid candidate, the intensities of experimental fragments with exact mass matches to expected in silico fragments are summed. The lipid identifications with the greatest summed intensity using this ranking algorithm were comparable to other lipid identification software annotations, MS-DIAL and Greazy. For example, for features with identifications from all three software packages, 92% of LipidMatch identifications by fatty acyl constituents were corroborated by at least one other package in positive mode and 98% in negative ion mode. LipidMatch allows users to annotate lipids across a wide range of high resolution tandem mass spectrometry experiments, including imaging experiments, direct infusion experiments, and experiments employing liquid chromatography. LipidMatch leverages the most extensive in silico fragmentation libraries of freely available software. When integrated into a larger lipidomics workflow, LipidMatch may increase the probability of finding lipid-based biomarkers and determining etiology of disease by covering a greater portion of the lipidome and using annotation which does not over-report biologically relevant structural details of identified lipid molecules.
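
    The two ideas in this record, rule-based identification (required fragments must be observed) and ranking candidates by summed fragment intensity, can be sketched as below. LipidMatch itself is R-based; this Python sketch is not its code. The mass tolerance, the candidate data layout, the function names, and every m/z value are assumptions chosen for illustration.

```python
def matched_intensity(mz, spectrum, tol=0.005):
    """Sum the intensities of experimental peaks within `tol` of `mz`.

    spectrum is a list of (mz, intensity) pairs; returns 0 if nothing matches.
    The tolerance value is an assumption, not taken from the abstract.
    """
    return sum(inten for peak_mz, inten in spectrum if abs(peak_mz - mz) <= tol)


def rank_candidates(candidates, spectrum, tol=0.005):
    """Apply the required-fragment rule, then rank by summed intensity.

    candidates maps a lipid name to {"required": [...], "optional": [...]},
    each a list of expected in silico fragment m/z values (hypothetical layout).
    A candidate is kept only if every required fragment is observed.
    """
    ranked = []
    for name, frags in candidates.items():
        required = frags["required"]
        optional = frags.get("optional", [])
        if any(matched_intensity(mz, spectrum, tol) == 0 for mz in required):
            continue  # rule not satisfied: a required fragment is missing
        score = sum(matched_intensity(mz, spectrum, tol)
                    for mz in required + optional)
        ranked.append((name, score))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Toy data with arbitrary m/z values, not real lipid fragments.
spectrum = [(100.0, 50.0), (200.0, 30.0), (300.0, 20.0)]
candidates = {
    "lipid_A": {"required": [100.0], "optional": [200.0]},
    "lipid_B": {"required": [100.0, 400.0]},
    "lipid_C": {"required": [300.0]},
}
ranking = rank_candidates(candidates, spectrum)
```

    In this toy case lipid_B is dropped because one of its required fragments is absent, even though it shares a strong peak with lipid_A; a purely similarity-based score would not necessarily exclude it.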

  10. Identification and function analysis of contrary genes in Dupuytren's contracture.

    PubMed

    Ji, Xianglu; Tian, Feng; Tian, Lijie

    2015-07-01

    The present study aimed to analyze the expression of genes involved in Dupuytren's contracture (DC), using bioinformatic methods. The profile of GSE21221 was downloaded from the Gene Expression Omnibus, which included six samples derived from fibroblasts and six healthy control samples derived from carpal-tunnel fibroblasts. A Distributed Intrusion Detection System was used in order to identify differentially expressed genes. The term contrary genes is proposed. Contrary genes were the genes that exhibited opposite expression patterns in the positive and negative groups, and likely exhibited opposite functions. These were identified using Coexpress software. Gene ontology (GO) function analysis was conducted for the contrary genes. A network of GO terms was constructed using the reduce and visualize gene ontology database. Significantly expressed genes (801) and contrary genes (98) were screened. A significant association was observed between Chitinase-3-like protein 1 and ten genes in the positive gene set. Positive regulation of transcription and the activation of nuclear factor-κB (NF-κB)-inducing kinase activity exhibited the highest degree values in the network of GO terms. In the present study, the expression of genes involved in the development of DC was analyzed, and the concept of contrary genes proposed. The genes identified in the present study are involved in the positive regulation of transcription and activation of NF-κB-inducing kinase activity. The contrary genes and GO terms identified in the present study may potentially be used for DC diagnosis and treatment.

  11. The Ability of Different Imputation Methods to Preserve the Significant Genes and Pathways in Cancer.

    PubMed

    Aghdam, Rosa; Baghfalaki, Taban; Khosravi, Pegah; Saberi Ansari, Elnaz

    2017-12-01

    Deciphering important genes and pathways from incomplete gene expression data could facilitate a better understanding of cancer. Different imputation methods can be applied to estimate the missing values. In our study, we evaluated various imputation methods for their performance in preserving significant genes and pathways. In the first step, 5% of genes are considered at random for two types of missingness mechanisms, ignorable and non-ignorable, with various missing rates. Next, 10 well-known imputation methods were applied to the complete datasets. The significance analysis of microarrays (SAM) method was applied to detect the significant genes in rectal and lung cancers to showcase the utility of imputation approaches in preserving significant genes. To determine the impact of different imputation methods on the identification of important genes, the chi-squared test was used to compare the proportions of overlaps between significant genes detected from original data and those detected from the imputed datasets. Additionally, the significant genes are tested for their enrichment in important pathways, using the ConsensusPathDB. Our results showed that almost all the significant genes and pathways of the original dataset can be detected in all imputed datasets, indicating that there is no significant difference in the performance of various imputation methods tested. The source code and selected datasets are available on http://profiles.bs.ipm.ir/softwares/imputation_methods/. Copyright © 2017. Production and hosting by Elsevier B.V.
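
    The quantity compared across imputation methods in this record is the proportion of originally significant genes that are still detected after imputation. A minimal sketch of that proportion is given below; the function name and gene labels are hypothetical, and the chi-squared comparison itself is not reproduced because the abstract does not describe how the test was set up.

```python
def overlap_proportion(original_sig, imputed_sig):
    """Fraction of genes significant in the original data that are also
    significant in an imputed dataset.

    Both arguments are iterables of gene identifiers.
    """
    original = set(original_sig)
    if not original:
        raise ValueError("original_sig must contain at least one gene")
    return len(original & set(imputed_sig)) / len(original)


# Hypothetical gene labels, for illustration only.
original = {"G1", "G2", "G3", "G4"}
after_imputation = {"G1", "G2", "G3", "G9"}
proportion = overlap_proportion(original, after_imputation)
```

    One such proportion per imputation method gives the set of values whose differences the study then assessed statistically.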

  12. Implementation of Whole Genome Sequencing (WGS) for Identification and Characterization of Shiga Toxin-Producing Escherichia coli (STEC) in the United States

    PubMed Central

    Lindsey, Rebecca L.; Pouseele, Hannes; Chen, Jessica C.; Strockbine, Nancy A.; Carleton, Heather A.

    2016-01-01

    Shiga toxin-producing Escherichia coli (STEC) is an important foodborne pathogen capable of causing severe disease in humans. Rapid and accurate identification and characterization techniques are essential during outbreak investigations. Current methods for characterization of STEC are expensive and time-consuming. With the advent of rapid and cheap whole genome sequencing (WGS) benchtop sequencers, the potential exists to replace traditional workflows with WGS. The aim of this study was to validate tools to do reference identification and characterization from WGS for STEC in a single workflow within an easy to use commercially available software platform. Publicly available serotype, virulence, and antimicrobial resistance databases were downloaded from the Center for Genomic Epidemiology (CGE) (www.genomicepidemiology.org) and integrated into a genotyping plug-in with in silico PCR tools to confirm some of the virulence genes detected from WGS data. Additionally, down sampling experiments on the WGS sequence data were performed to determine a threshold for sequence coverage needed to accurately predict serotype and virulence genes using the established workflow. The serotype database was tested on a total of 228 genomes and correctly predicted from WGS for 96.1% of O serogroups and 96.5% of H serogroups identified by conventional testing techniques. A total of 59 genomes were evaluated to determine the threshold of coverage to detect the different WGS targets; 40 were evaluated for serotype and virulence gene detection and 19 for the stx gene subtypes. For serotype, 95% of the O and 100% of the H serogroups were detected at > 40x and ≥ 30x coverage, respectively. For virulence targets and stx gene subtypes, nearly all genes were detected at > 40x, though some targets were 100% detectable from genomes with coverage ≥20x. The resistance detection tool was 97% concordant with phenotypic testing results. With isolates sequenced to > 40x coverage, the different databases accurately predicted serotype, virulence, and resistance from WGS data, providing a faster and cheaper alternative to conventional typing techniques. PMID:27242777
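
    The down-sampling experiments in this record depend on the usual estimate of mean coverage, (number of reads × read length) / genome size. A short sketch of how a read set might be thinned to a target coverage follows. The function names, read length, read count, and the round 5 Mb genome size are assumptions for illustration; the study's own down-sampling procedure is not described in the abstract.

```python
import random


def mean_coverage(n_reads, read_length, genome_size):
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def downsample_fraction(current_cov, target_cov):
    """Fraction of reads to keep to go from current_cov to target_cov.

    Returns 1.0 when the data are already at or below the target.
    """
    if target_cov >= current_cov:
        return 1.0
    return target_cov / current_cov


def subsample(reads, fraction, seed=0):
    """Randomly keep about `fraction` of the reads (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(reads, round(len(reads) * fraction))


# Assumed numbers: 1,000,000 reads of 250 bp over a 5 Mb genome.
cov = mean_coverage(1_000_000, 250, 5_000_000)
keep = downsample_fraction(cov, 40.0)
```

    Repeating the typing workflow on subsets drawn at several target coverages is one way to locate thresholds such as the > 40x figure reported in the abstract.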

  13. Implementation of Whole Genome Sequencing (WGS) for Identification and Characterization of Shiga Toxin-Producing Escherichia coli (STEC) in the United States.

    PubMed

    Lindsey, Rebecca L; Pouseele, Hannes; Chen, Jessica C; Strockbine, Nancy A; Carleton, Heather A

    2016-01-01

    Shiga toxin-producing Escherichia coli (STEC) is an important foodborne pathogen capable of causing severe disease in humans. Rapid and accurate identification and characterization techniques are essential during outbreak investigations. Current methods for characterization of STEC are expensive and time-consuming. With the advent of rapid and cheap whole genome sequencing (WGS) benchtop sequencers, the potential exists to replace traditional workflows with WGS. The aim of this study was to validate tools to do reference identification and characterization from WGS for STEC in a single workflow within an easy to use commercially available software platform. Publicly available serotype, virulence, and antimicrobial resistance databases were downloaded from the Center for Genomic Epidemiology (CGE) (www.genomicepidemiology.org) and integrated into a genotyping plug-in with in silico PCR tools to confirm some of the virulence genes detected from WGS data. Additionally, down sampling experiments on the WGS sequence data were performed to determine a threshold for sequence coverage needed to accurately predict serotype and virulence genes using the established workflow. The serotype database was tested on a total of 228 genomes and correctly predicted from WGS for 96.1% of O serogroups and 96.5% of H serogroups identified by conventional testing techniques. A total of 59 genomes were evaluated to determine the threshold of coverage to detect the different WGS targets; 40 were evaluated for serotype and virulence gene detection and 19 for the stx gene subtypes. For serotype, 95% of the O and 100% of the H serogroups were detected at > 40x and ≥ 30x coverage, respectively. For virulence targets and stx gene subtypes, nearly all genes were detected at > 40x, though some targets were 100% detectable from genomes with coverage ≥20x. The resistance detection tool was 97% concordant with phenotypic testing results. With isolates sequenced to > 40x coverage, the different databases accurately predicted serotype, virulence, and resistance from WGS data, providing a faster and cheaper alternative to conventional typing techniques.

  14. Model Transformation for a System of Systems Dependability Safety Case

    NASA Technical Reports Server (NTRS)

    Murphy, Judy; Driskell, Stephen B.

    2010-01-01

    Software plays an increasingly larger role in all aspects of NASA's science missions. This has been extended to the identification, management and control of faults which affect safety-critical functions and by default, the overall success of the mission. Traditionally, the analysis of fault identification, management and control is hardware based. Due to the increasing complexity of systems, there has been a corresponding increase in the complexity of fault management software. The NASA Independent Validation & Verification (IV&V) program is creating processes and procedures to identify and incorporate safety-critical software requirements along with corresponding software faults so that potential hazards may be mitigated. This Specific to Generic ... A Case for Reuse paper describes the phases of a dependability and safety study which identifies a new process to create a foundation for reusable assets. These assets support the identification and management of specific software faults and their transformation from specific to generic software faults. This approach also has applications to other systems outside of the NASA environment. This paper addresses how a mission specific dependability and safety case is being transformed to a generic dependability and safety case which can be reused for any type of space mission with an emphasis on software fault conditions.

  15. An in silico model for identification of small RNAs in whole bacterial genomes: characterization of antisense RNAs in pathogenic Escherichia coli and Streptococcus agalactiae strains.

    PubMed

    Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal

    2012-04-01

    Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed theoretically but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of the virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli.

  16. An in silico model for identification of small RNAs in whole bacterial genomes: characterization of antisense RNAs in pathogenic Escherichia coli and Streptococcus agalactiae strains

    PubMed Central

    Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal

    2012-01-01

    Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed in theory, but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli. PMID:22139924

  17. Software-implemented fault insertion: An FTMP example

    NASA Technical Reports Server (NTRS)

    Czeck, Edward W.; Siewiorek, Daniel P.; Segall, Zary Z.

    1987-01-01

    This report presents a model for fault insertion through software; describes its implementation on a fault-tolerant computer, FTMP; presents a summary of fault detection, identification, and reconfiguration data collected with software-implemented fault insertion; and compares the results to hardware fault insertion data. Experimental results show detection time to be a function of time of insertion and system workload. For the fault detection time, there is no correlation between software-inserted faults and hardware-inserted faults; this is because hardware-inserted faults must manifest as errors before detection, whereas software-inserted faults immediately exercise the error detection mechanisms. In summary, the software-implemented fault insertion is able to be used as an evaluation technique for the fault-handling capabilities of a system in fault detection, identification and recovery. Although the software-inserted faults do not map directly to hardware-inserted faults, experiments show software-implemented fault insertion is capable of emulating hardware fault insertion, with greater ease and automation.

  18. [Molecular identification of medicinal plant genus Uncaria in Guizhou].

    PubMed

    Gang, Tao; Liu, Tao; Zhu, Ying; Liu, Zuo-Yi

    2008-06-01

    To analyze the rDNA ITS regions of medicinal plants of the genus Uncaria in Guizhou and construct their phylogenetic tree, in order to supply molecular evidence for the taxonomy and identification of these medicinal plants at the genetic level. The ITS gene fragments of the 4 medicinal plants were PCR amplified and sequenced. The rDNA ITS regions were analyzed with the software ClustalX, BioEdit and PAUP* 4.0 beta 10. The entire sequences of rDNA ITS1, ITS2, and 5.8S rDNA were obtained. A maximum-parsimony tree was built from the four ITS regions together with similar sequences from GenBank, with Mitragyna rubrostipulata (AJ492621) and Mitragyna rubrostipulata (AJ605988) designated as the outgroup. The 4 medicinal plants are 4 species in the genus Uncaria and are most similar to Uncaria rhynchophylla.

  19. In silico identification of genetic variants in glucocerebrosidase (GBA) gene involved in Gaucher's disease using multiple software tools.

    PubMed

    Manickam, Madhumathi; Ravanan, Palaniyandi; Singh, Pratibha; Talwar, Priti

    2014-01-01

    Gaucher's disease (GD) is an autosomal recessive disorder caused by the deficiency of glucocerebrosidase, a lysosomal enzyme that catalyses the hydrolysis of the glycolipid glucocerebroside to ceramide and glucose. Polymorphisms in the GBA gene have been associated with the development of Gaucher's disease. We hypothesize that prediction of SNPs using multiple state-of-the-art software tools will help increase confidence in the identification of SNPs involved in GD. Enzyme replacement therapy is the only option for GD. Our goal is to use several state-of-the-art SNP algorithms to predict harmful SNPs through comparative studies. In this study, seven different algorithms (SIFT, MutPred, nsSNP Analyzer, PANTHER, PMUT, PROVEAN, and SNPs&GO) were used to predict the harmful polymorphisms. Among the seven programs, SIFT found 47 nsSNPs to be deleterious and MutPred found 46 nsSNPs to be harmful. nsSNP Analyzer found 43 out of 47 nsSNPs to be disease-causing SNPs, PANTHER found 32 out of 47 to be highly deleterious, PMUT classified 22 out of 47 as pathological mutations, the PROVEAN server predicted 44 out of 47 to be deleterious, and SNPs&GO classified all 47 as disease-related mutations. Twenty-two nsSNPs were predicted in common by all seven algorithms. The 22 common targeted mutations are F251L, C342G, W312C, P415R, R463C, D127V, A309V, G46E, G202E, P391L, Y363C, Y205C, W378C, I402T, S366R, F397S, Y418C, P401L, G195E, W184R, R48W, and T43R.
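
    The consensus step described above, keeping only variants that every predictor flags, amounts to counting votes across tools. The following is a minimal sketch of that idea only; the tool names, variant labels and calls are placeholders invented for illustration, and the function `consensus_calls` is hypothetical rather than part of any of the seven programs.

```python
from collections import Counter


def consensus_calls(predictions, min_tools=None):
    """Return variants flagged as deleterious by at least `min_tools` predictors.

    predictions: dict mapping a tool name to the set of variants it flags.
    min_tools:   vote threshold; defaults to all tools (unanimous consensus).
    """
    if min_tools is None:
        min_tools = len(predictions)
    # Each tool contributes at most one vote per variant.
    votes = Counter(v for calls in predictions.values() for v in set(calls))
    return sorted(v for v, n in votes.items() if n >= min_tools)


# Toy input: placeholder tools and variants, NOT results from the study.
toy = {
    "tool_a": {"mutA", "mutB", "mutC"},
    "tool_b": {"mutA", "mutB"},
    "tool_c": {"mutA", "mutC", "mutD"},
}

unanimous = consensus_calls(toy)               # flagged by all three tools
majority = consensus_calls(toy, min_tools=2)   # flagged by at least two tools
print(unanimous, majority)
```

    Lowering `min_tools` trades specificity for sensitivity; the abstract reports the unanimous case (all seven tools).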

  20. CRISPR-Cas9 Toolkit for Actinomycete Genome Editing.

    PubMed

    Tong, Yaojun; Robertsen, Helene Lunde; Blin, Kai; Weber, Tilmann; Lee, Sang Yup

    2018-01-01

    Bacteria of the order Actinomycetales are one of the most important sources of bioactive natural products, which are the source of many drugs. However, many of them still lack efficient genome editing methods, some strains even cannot be manipulated at all. This restricts systematic metabolic engineering approaches for boosting known and discovering novel natural products. In order to facilitate the genome editing for actinomycetes, we developed a CRISPR-Cas9 toolkit with high efficiency for actinomyces genome editing. This basic toolkit includes a software for spacer (sgRNA) identification, a system for in-frame gene/gene cluster knockout, a system for gene loss-of-function study, a system for generating a random size deletion library, and a system for gene knockdown. For the latter, a uracil-specific excision reagent (USER) cloning technology was adapted to simplify the CRISPR vector construction process. The application of this toolkit was successfully demonstrated by perturbation of genomes of Streptomyces coelicolor A3(2) and Streptomyces collinus Tü 365. The CRISPR-Cas9 toolkit and related protocol described here can be widely used for metabolic engineering of actinomycetes.

  1. BABAR: an R package to simplify the normalisation of common reference design microarray-based transcriptomic datasets

    PubMed Central

    2010-01-01

    Background The development of DNA microarrays has facilitated the generation of hundreds of thousands of transcriptomic datasets. The use of a common reference microarray design allows existing transcriptomic data to be readily compared and re-analysed in the light of new data, and the combination of this design with large datasets is ideal for 'systems'-level analyses. One issue is that these datasets are typically collected over many years and may be heterogeneous in nature, containing different microarray file formats and gene array layouts, dye-swaps, and showing varying scales of log2- ratios of expression between microarrays. Excellent software exists for the normalisation and analysis of microarray data but many data have yet to be analysed as existing methods struggle with heterogeneous datasets; options include normalising microarrays on an individual or experimental group basis. Our solution was to develop the Batch Anti-Banana Algorithm in R (BABAR) algorithm and software package which uses cyclic loess to normalise across the complete dataset. We have already used BABAR to analyse the function of Salmonella genes involved in the process of infection of mammalian cells. Results The only input required by BABAR is unprocessed GenePix or BlueFuse microarray data files. BABAR provides a combination of 'within' and 'between' microarray normalisation steps and diagnostic boxplots. When applied to a real heterogeneous dataset, BABAR normalised the dataset to produce a comparable scaling between the microarrays, with the microarray data in excellent agreement with RT-PCR analysis. When applied to a real non-heterogeneous dataset and a simulated dataset, BABAR's performance in identifying differentially expressed genes showed some benefits over standard techniques. Conclusions BABAR is an easy-to-use software tool, simplifying the simultaneous normalisation of heterogeneous two-colour common reference design cDNA microarray-based transcriptomic datasets. 
We show BABAR transforms real and simulated datasets to allow for the correct interpretation of these data, and is the ideal tool to facilitate the identification of differentially expressed genes or network inference analysis from transcriptomic datasets. PMID:20128918

  2. Detecting long tandem duplications in genomic sequences.

    PubMed

    Audemard, Eric; Schiex, Thomas; Faraut, Thomas

    2012-05-08

    Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. In this paper, we introduce ReD Tandem, software that uses a flow-based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS < 1) and that it is also able to predict tandem duplications involving non-coding elements such as pseudogenes or RNA genes. ReD Tandem can identify large tandem duplications without any annotation, leading to agnostic identification of tandem duplications. This approach nicely complements the usual protein-gene-based approach, which ignores duplications involving non-coding regions. It is, however, inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also help improve existing annotations.

  3. The PluriNetWork: An Electronic Representation of the Network Underlying Pluripotency in Mouse, and Its Applications

    PubMed Central

    Greber, Boris; Siatkowski, Marcin; Paudel, Yogesh; Warsow, Gregor; Cap, Clemens; Schöler, Hans; Fuellen, Georg

    2010-01-01

    Background Analysis of the mechanisms underlying pluripotency and reprogramming would benefit substantially from easy access to an electronic network of genes, proteins and mechanisms. Moreover, interpreting gene expression data needs to move beyond just the identification of the up-/downregulation of key genes and of overrepresented processes and pathways, towards clarifying the essential effects of the experiment in molecular terms. Methodology/Principal Findings We have assembled a network of 574 molecular interactions, stimulations and inhibitions, based on a collection of research data from 177 publications until June 2010, involving 274 mouse genes/proteins, all in a standard electronic format, enabling analyses by readily available software such as Cytoscape and its plugins. The network includes the core circuit of Oct4 (Pou5f1), Sox2 and Nanog, its periphery (such as Stat3, Klf4, Esrrb, and c-Myc), connections to upstream signaling pathways (such as Activin, WNT, FGF, BMP, Insulin, Notch and LIF), and epigenetic regulators as well as some other relevant genes/proteins, such as proteins involved in nuclear import/export. We describe the general properties of the network, as well as a Gene Ontology analysis of the genes included. We use several expression data sets to condense the network to a set of network links that are affected in the course of an experiment, yielding hypotheses about the underlying mechanisms. Conclusions/Significance We have initiated an electronic data repository that will be useful to understand pluripotency and to facilitate the interpretation of high-throughput data. To keep up with the growth of knowledge on the fundamental processes of pluripotency and reprogramming, we suggest to combine Wiki and social networking software towards a community curation system that is easy to use and flexible, and tailored to provide a benefit for the scientist, and to improve communication and exchange of research results. 
A PluriNetWork tutorial is available at http://www.ibima.med.uni-rostock.de/IBIMA/PluriNetWork/. PMID:21179244

  4. Reference gene identification for reliable normalisation of quantitative RT-PCR data in Setaria viridis.

    PubMed

    Nguyen, Duc Quan; Eamens, Andrew L; Grof, Christopher P L

    2018-01-01

    Quantitative real-time polymerase chain reaction (RT-qPCR) is the key platform for the quantitative analysis of gene expression in a wide range of experimental systems and conditions. However, the accuracy and reproducibility of gene expression quantification via RT-qPCR is entirely dependent on the identification of reliable reference genes for data normalisation. Green foxtail (Setaria viridis) has recently been proposed as a potential experimental model for the study of C4 photosynthesis and is closely related to many economically important crop species of the Panicoideae subfamily of grasses, including Zea mays (maize), Sorghum bicolor (sorghum) and Saccharum officinarum (sugarcane). Setaria viridis (Accession 10) possesses a number of key traits as an experimental model, namely: (i) a small, sequenced and well-annotated genome; (ii) short stature and generation time; (iii) prolific seed production; and (iv) amenability to Agrobacterium tumefaciens-mediated transformation. There is currently, however, a lack of reference gene expression information for Setaria viridis (S. viridis). We therefore aimed to identify a cohort of suitable S. viridis reference genes for accurate and reliable normalisation of S. viridis RT-qPCR expression data. Eleven putative candidate reference genes were identified and examined across thirteen different S. viridis tissues. Of these, the geNorm and NormFinder analysis software identified SERINE/THREONINE-PROTEIN PHOSPHATASE 2A (PP2A), 5'-ADENYLYLSULFATE REDUCTASE 6 (ASPR6) and DUAL SPECIFICITY PHOSPHATASE (DUSP) as the most suitable combination of reference genes for the accurate and reliable normalisation of S. viridis RT-qPCR expression data. To demonstrate the suitability of the three selected reference genes, PP2A, ASPR6 and DUSP were used to normalise the expression of CINNAMYL ALCOHOL DEHYDROGENASE (CAD) genes across the same tissues. This approach readily demonstrated the suitability of the three selected reference genes for the accurate and reliable normalisation of S. viridis RT-qPCR expression data. Further, the work reported here forms a highly useful platform for future gene expression quantification in S. viridis and is potentially directly translatable to other closely related and agronomically important C4 crop species.
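
    Reference gene ranking of the kind geNorm performs rests on a pairwise stability measure, usually called M: for each candidate gene, the standard deviation of its log2 expression ratio against every other candidate is computed across samples, and those standard deviations are averaged. The sketch below illustrates one round of that calculation under stated assumptions; the function `genorm_m`, the gene labels and the input values are hypothetical toy data, not measurements from the study, and the full geNorm procedure additionally removes the least stable gene and repeats.

```python
import math
import statistics


def genorm_m(expr):
    """Compute a geNorm-style stability value M for each candidate gene.

    expr: dict mapping gene name -> list of relative expression quantities,
          one per sample, in the same sample order for every gene.
    Lower M means the gene varies less relative to the other candidates.
    """
    genes = list(expr)
    m_values = {}
    for g in genes:
        pairwise_sd = []
        for h in genes:
            if h == g:
                continue
            # Log2 ratio of the two genes in each sample.
            log_ratios = [math.log2(a / b) for a, b in zip(expr[g], expr[h])]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[g] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Toy quantities for three hypothetical candidates across three samples.
toy = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],   # constant ratio to geneA, so co-stable
    "geneC": [1.0, 4.0, 1.0],   # fluctuates relative to both
}

m = genorm_m(toy)
ranking = sorted(m, key=m.get)  # most stable first
print(m, ranking)
```

    In this toy input geneA and geneB keep a fixed ratio in every sample and therefore share the lowest M, while geneC ranks last.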

  5. Array-based comparative genomic hybridization-guided identification of reference genes for normalization of real-time quantitative polymerase chain reaction assay data for lymphomas, histiocytic sarcomas, and osteosarcomas of dogs.

    PubMed

    Tsai, Pei-Chien; Breen, Matthew

    2012-09-01

    To identify suitable reference genes for normalization of real-time quantitative PCR (RT-qPCR) assay data for common tumors of dogs. Malignant lymph node (n = 8), appendicular osteosarcoma (9), and histiocytic sarcoma (12) samples and control samples of various nonneoplastic canine tissues. Array-based comparative genomic hybridization (aCGH) data were used to guide selection of 9 candidate reference genes. Expression stability of candidate reference genes and 4 commonly used reference genes was determined for tumor samples with RT-qPCR assays and 3 software programs. LOC611555 was the candidate reference gene with the highest expression stability among the 3 tumor types. Of the commonly used reference genes, expression stability of HPRT was high in histiocytic sarcoma samples, and expression stability of Ubi and RPL32 was high in osteosarcoma samples. Some of the candidate reference genes had higher expression stability than did the commonly used reference genes. Data for constitutively expressed genes with high expression stability are required for normalization of RT-qPCR assay results. Without such data, accurate quantification of gene expression in tumor tissue samples is difficult. Results of the present study indicated LOC611555 may be a useful RT-qPCR assay reference gene for multiple tissue types. Some commonly used reference genes may be suitable for normalization of gene expression data for tumors of dogs, such as lymphomas, osteosarcomas, or histiocytic sarcomas.

  6. Molecular tools for cryptic Candida species identification with applications in a clinical laboratory.

    PubMed

    Gamarra, Soledad; Dudiuk, Catiana; Mancilla, Estefanía; Vera Garate, María Verónica; Guerrero, Sergio; Garcia-Effron, Guillermo

    2013-01-01

    Candida spp. includes more than 160 species but only 20 species pose clinical problems. C. albicans and C. parapsilosis account for more than 75% of all the fungemias worldwide. In 1995 and 2005, one C. albicans and two C. parapsilosis-related species were described, respectively. Using phenotypic traits, the identification of these newly described species is inconclusive or impossible. Thus, molecular-based procedures are mandatory. In the proposed educational experiment we have adapted different basic molecular biology techniques designed to identify these species including PCR, multiplex PCR, PCR-based restriction endonuclease analysis and nuclear ribosomal RNA amplification. During the classes, students acquired the ability to search and align gene sequences, design primers, and use bioinformatics software. Also, in the performed experiments, fungal molecular taxonomy concepts were introduced and the obtained results demonstrated that classic identification (phenotypic) in some cases needs to be complemented with molecular-based techniques. As a conclusion we can state that we present an inexpensive and well accepted group of classes involving important concepts that can be recreated in any laboratory. Copyright © 2013 International Union of Biochemistry and Molecular Biology, Inc.

  7. Genexpi: a toolset for identifying regulons and validating gene regulatory networks using time-course expression data.

    PubMed

    Modrák, Martin; Vohradský, Jiří

    2018-04-13

    Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq combined with static binding experiments (e.g., ChIP-seq) or literature mining may be used for inference of sigma factor regulatory networks. We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. Genexpi is a useful part of the gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.

  8. An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data.

    PubMed

    Hsu, Arthur L; Tang, Sen-Lin; Halgamuge, Saman K

    2003-11-01

    Current Self-Organizing Maps (SOMs) approaches to gene expression pattern clustering require the user to predefine the number of clusters likely to be expected. Hierarchical clustering methods used in this area do not provide unique partitioning of data. We describe an unsupervised dynamic hierarchical self-organizing approach, which suggests an appropriate number of clusters, to perform class discovery and marker gene identification in microarray data. In the process of class discovery, the proposed algorithm identifies corresponding sets of predictor genes that best distinguish one class from other classes. The approach integrates merits of hierarchical clustering with robustness against noise known from self-organizing approaches. The proposed algorithm applied to DNA microarray data sets of two types of cancers has demonstrated its ability to produce the most suitable number of clusters. Further, the corresponding marker genes identified through the unsupervised algorithm also have a strong biological relationship to the specific cancer class. The algorithm tested on leukemia microarray data, which contains three leukemia types, was able to determine three major and one minor cluster. Prediction models built for the four clusters indicate that the prediction strength for the smaller cluster is generally low, therefore labelled as uncertain cluster. Further analysis shows that the uncertain cluster can be subdivided further, and the subdivisions are related to two of the original clusters. Another test performed using colon cancer microarray data has automatically derived two clusters, which is consistent with the number of classes in data (cancerous and normal). JAVA software of dynamic SOM tree algorithm is available upon request for academic use. A comparison of rectangular and hexagonal topologies for GSOM is available from http://www.mame.mu.oz.au/mechatronics/journalinfo/Hsu2003supp.pdf

  9. Identification of TMEM208 and PQLC2 as reference genes for normalizing mRNA expression in colorectal cancer treated with aspirin

    PubMed Central

    Zhu, Yuanyuan; Yang, Chao; Weng, Mingjiao; Zhang, Yan; Yang, Chunhui; Jin, Yinji; Yang, Weiwei; He, Yan; Wu, Yiqi; Zhang, Yuhua; Wang, Guangyu; RajkumarEzakiel Redpath, Riju James; Zhang, Lei; Jin, Xiaoming; Liu, Ying; Sun, Yuchun; Ning, Ning; Qiao, Yu; Zhang, Fengmin; Li, Zhiwei; Wang, Tianzhen; Zhang, Yanqiao; Li, Xiaobo

    2017-01-01

    Considerable evidence indicates that aspirin usage causes a significant reduction in colorectal cancer. However, the molecular mechanisms by which aspirin prevents colon cancer are largely unknown. Quantitative reverse transcription polymerase chain reaction (qRT-PCR) is one of the most frequently used methods to identify the target molecules regulated by a given compound. However, this method needs stable internal reference genes to analyze the expression change of the targets. In this study, the transcriptional stabilities of several traditional reference genes were evaluated in colon cancer cells treated with aspirin; suitable internal reference genes were then screened by using a microarray, further identified by using the geNorm and NormFinder software, and validated in more cell lines and xenografts. We have shown that three traditional internal reference genes, β-actin, GAPDH and α-tubulin, are not suitable for studying gene transcription in colon cancer cells treated with aspirin, and we have identified and validated TMEM208 and PQLC2 as the ideal internal reference genes for detecting the molecular targets of aspirin in colon cancer in vitro and in vivo. This study reveals stable internal reference genes for studying the target genes of aspirin in colon cancer, which will help identify the molecular mechanism by which aspirin prevents colon cancer. PMID:28184026

  10. Identification of TMEM208 and PQLC2 as reference genes for normalizing mRNA expression in colorectal cancer treated with aspirin.

    PubMed

    Zhu, Yuanyuan; Yang, Chao; Weng, Mingjiao; Zhang, Yan; Yang, Chunhui; Jin, Yinji; Yang, Weiwei; He, Yan; Wu, Yiqi; Zhang, Yuhua; Wang, Guangyu; RajkumarEzakiel Redpath, Riju James; Zhang, Lei; Jin, Xiaoming; Liu, Ying; Sun, Yuchun; Ning, Ning; Qiao, Yu; Zhang, Fengmin; Li, Zhiwei; Wang, Tianzhen; Zhang, Yanqiao; Li, Xiaobo

    2017-04-04

    Considerable evidence indicates that aspirin usage causes a significant reduction in colorectal cancer. However, the molecular mechanisms by which aspirin prevents colon cancer are largely unknown. Quantitative reverse transcription polymerase chain reaction (qRT-PCR) is one of the most frequently used methods to identify the target molecules regulated by a given compound. However, this method needs stable internal reference genes to analyze the expression change of the targets. In this study, the transcriptional stabilities of several traditional reference genes were evaluated in colon cancer cells treated with aspirin; suitable internal reference genes were then screened by using a microarray, further identified by using the geNorm and NormFinder software, and validated in more cell lines and xenografts. We have shown that three traditional internal reference genes, β-actin, GAPDH and α-tubulin, are not suitable for studying gene transcription in colon cancer cells treated with aspirin, and we have identified and validated TMEM208 and PQLC2 as the ideal internal reference genes for detecting the molecular targets of aspirin in colon cancer in vitro and in vivo. This study reveals stable internal reference genes for studying the target genes of aspirin in colon cancer, which will help identify the molecular mechanism by which aspirin prevents colon cancer.

  11. Identification of Key Pathways and Genes in L4 Dorsal Root Ganglion (DRG) After Sciatic Nerve Injury via Microarray Analysis.

    PubMed

    Zhao, He; Duan, Li-Jun; Sun, Qing-Ling; Gao, Yu-Shan; Yang, Yong-Dong; Tang, Xiang-Sheng; Zhao, Ding-Yan; Xiong, Yang; Hu, Zhen-Guo; Li, Chuan-Hong; Chen, Si-Xue; Liu, Tao; Yu, Xing

    2018-04-19

    Peripheral nerve injury (PNI) has devastating consequences. The dorsal root ganglion, as a pivotal locus, participates in the process of neuropathic pain and nerve regeneration. In recent years, gene sequencing technology has seen a rapid rise in the biomedicine field. We therefore attempt to gain insight into the mechanism of neuropathic pain and nerve regeneration at the transcriptional level and to explore novel genes through bioinformatics analysis. The gene expression profiles of GSE96051 were downloaded from the GEO database. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed, and a protein-protein interaction (PPI) network of the differentially expressed genes (DEGs) was constructed with Cytoscape software. Our results showed that the IL-6 and Jun genes and the MAPK, apoptosis and P53 signaling pathways play vital modulatory roles in nerve regeneration and neuropathic pain. Notably, 13 hub genes associated with neuropathic pain and nerve regeneration, including Ccl12, Ppp1r15a, Cdkn1a, Atf3, Nts, Dusp1, Ccl7, Csf, Gadd45a, Serpine1, Timp1, have rarely been reported in the PubMed database; these genes may provide new directions for experimental research and clinical study. Our results may provide deeper insight into the mechanism and a promising therapeutic target. The next step is to verify experimentally the novel genes among the 13 hub genes.

  12. Understanding the molecular aspects of oriental obesity pattern differentiation using DNA microarray.

    PubMed

    Hong, Sun Woo; Yoo, Jae-Wook; Bose, Shambhunath; Park, Jung-Hyun; Han, Kyungsun; Kim, Soyoun; Lim, Chi-Yeon; Kim, Hojun; Lee, Dong-Ki

    2015-10-19

    Human constitution, the fundamental basis of oriental medicine, is categorized into different patterns for a particular disease according to the physical, physiological, and clinical characteristics of the individuals. Obesity, a condition of metabolic disorder, is classified according to six patterns in oriental medicine, as follows: spleen deficiency syndrome, phlegm fluid syndrome, yang deficiency syndrome (YDS), food accumulation syndrome (FAS), liver depression syndrome (LDS), and blood stasis syndrome. In oriental medicine, identification of the disease pattern for individual obese patients is performed on the basis of differentiation in obesity syndrome index and, accordingly, personalized treatment is provided to the patients. The aim of the current study was to understand the obesity patterns in oriental medicine from the genomic point of view via determining the gene expression signature of obese patients using peripheral blood mononuclear cells as the samples. The study was conducted in 23 South Korean obese subjects (19 female and four male) with BMI ≥25 kg/m(2). Identification of oriental obesity pattern was based on the software-guided evaluation of the responses of the subjects to a questionnaire developed by the Korean Institute of Oriental Medicine. The expression profiles of genes were determined using DNA microarray and the level of transcription of genes of interest was further evaluated using quantitative real-time PCR (qRT-PCR). Gene clustering analysis of the microarray data from the FAS, LDS, and YDS subjects exhibited disease pattern-specific upregulation of expression of several genes in a particular cluster. 
Further analysis of transcription of selected genes using qRT-PCR led to identification of specific genes, including prostaglandin endoperoxide synthase 2, G0/G1 switch 2, carcinoembryonic antigen-related cell adhesion molecule 3, cysteine-serine-rich nuclear protein 1, and interleukin 8 receptor, alpha, which were highly expressed in the LDS obesity constitution. Our current study can be considered a valuable contribution to understanding a possible explanation for obesity pattern differentiation in oriental medicine. Further studies can address the novel possibility that the genomic and oriental empirical approaches can be combined and implemented in the systematic and synergistic development of personalized medicine. This clinical trial was registered in the Clinical Research Information Service of the Korea National Institute of Health ( https://cris.nih.go.kr/cris/index.jsp ). KCT0000387.

  13. A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics

    PubMed Central

    Tang, Haixu; Li, Sujun; Ye, Yuzhen

    2016-01-01

    Metaproteomic studies adopt the common bottom-up proteomics approach to investigate the protein composition and the dynamics of protein expression in microbial communities. When matched metagenomic and/or metatranscriptomic data of the microbial communities are available, metaproteomic data analyses often employ a metagenome-guided approach, in which complete or fragmentary protein-coding genes are first directly predicted from metagenomic (and/or metatranscriptomic) sequences or from their assemblies, and the resulting protein sequences are then used as the reference database for peptide/protein identification from MS/MS spectra. This approach is often limited because protein-coding genes predicted from metagenomes are incomplete and fragmentary. In this paper, we present a graph-centric approach to improving metagenome-guided peptide and protein identification in metaproteomics. Our method exploits the de Bruijn graph structure reported by metagenome assembly algorithms to generate a comprehensive database of protein sequences encoded in the community. We tested our method using several public metaproteomic datasets with matched metagenomic and metatranscriptomic sequencing data acquired from complex microbial communities in a biological wastewater treatment plant. The results showed that many more peptides and proteins can be identified when assembly graphs are utilized, improving the characterization of the proteins expressed in the microbial communities. The additional proteins we identified contribute to the characterization of important pathways such as those involved in degradation of chemical hazards. Our tools are released as open-source software on GitHub at https://github.com/COL-IU/Graph2Pro. PMID:27918579
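
    To make the role of the assembly graph concrete, the sketch below builds a small de Bruijn graph from overlapping reads and walks it to recover a sequence longer than any single read, which is why a graph can expose coding sequence that fragmentary gene predictions miss. This is a generic textbook construction under simplifying assumptions (error-free reads, no branching), not the Graph2Pro algorithm; the reads and the helper names `de_bruijn` and `walk` are invented for illustration.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges come from k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk(graph, start):
    """Extend a sequence from `start` along unambiguous edges.

    Stops at a dead end, at a branch (more than one successor),
    or when the next node has already been visited.
    """
    seq, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:
            break
        seq += nxt[-1]
        seen.add(nxt)
        node = nxt
    return seq


# Three hypothetical overlapping reads, each only 6 bases long.
reads = ["ATGGCG", "GCGTAC", "GTACTT"]
graph = de_bruijn(reads, k=4)
contig = walk(graph, "ATG")
print(contig)
```

    A real assembly graph contains branches and cycles, so a practical tool has to explore many paths rather than follow a single unambiguous one.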

  14. MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

    PubMed Central

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Rajakumar, Kumar

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’. ArrayOme permits recognition of discordances between physical genome and MVG sizes, thereby enabling identification of strains rich in microarray-elusive novel genes. Individual tRNAcc tools facilitate automated identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites and other integration hotspots in closely related sequenced genomes. Accessory tools facilitate design of hotspot-flanking primers for in silico and/or wet-science-based interrogation of cognate loci in unsequenced strains and analysis of islands for features suggestive of foreign origins; island-specific and genome-contextual features are tabulated and represented in schematic and graphical forms. To date we have used MobilomeFINDER to analyse several Enterobacteriaceae, Pseudomonas aeruginosa and Streptococcus suis genomes. MobilomeFINDER enables high-throughput island identification and characterization through increased exploitation of emerging sequence data and PCR-based profiling of unsequenced test strains; subsequent targeted yeast recombination-based capture permits full-length sequencing and detailed functional studies of novel genomic islands. PMID:17537813

  15. ChIPnorm: a statistical method for normalizing and identifying differential regions in histone modification ChIP-seq libraries.

    PubMed

    Nair, Nishanth Ulhas; Sahu, Avinash Das; Bucher, Philipp; Moret, Bernard M E

    2012-01-01

    The advent of high-throughput technologies such as ChIP-seq has made possible the study of histone modifications. A problem of particular interest is the identification of regions of the genome where different cell types from the same organism exhibit different patterns of histone enrichment. This problem turns out to be surprisingly difficult, even in simple pairwise comparisons, because of the significant level of noise in ChIP-seq data. In this paper we propose a two-stage statistical method, called ChIPnorm, to normalize ChIP-seq data, and to find differential regions in the genome, given two libraries of histone modifications of different cell types. We show that the ChIPnorm method removes most of the noise and bias in the data and outperforms other normalization methods. We correlate the histone marks with gene expression data and confirm that histone modifications H3K27me3 and H3K4me3 act as respectively a repressor and an activator of genes. Compared to what was previously reported in the literature, we find that a substantially higher fraction of bivalent marks in ES cells for H3K27me3 and H3K4me3 move into a K27-only state. We find that most of the promoter regions in protein-coding genes have differential histone-modification sites. The software for this work can be downloaded from http://lcbb.epfl.ch/software.html.

  16. Screening and confirmation of microRNA markers for forensic body fluid identification.

    PubMed

    Wang, Zheng; Zhang, Ji; Luo, Haibo; Ye, Yi; Yan, Jing; Hou, Yiping

    2013-01-01

    MicroRNAs (miRNAs, ∼22 nucleotides) are small, non-protein coding RNAs that regulate gene expression at the post-transcriptional level. MiRNAs can be expressed in a tissue-specific manner, and have been introduced to forensic body fluid identification. In this study, we employed the qPCR-array (TaqMan® Array Human MicroRNA Cards) to screen the body fluid-specific miRNAs. Seven candidate miRNAs were identified as potentially body fluid-specific and could be used as forensically relevant body fluid markers: miR16 and miR486 for venous blood, miR888 and miR891a for semen, miR214 for menstrual blood, miR124a for vaginal secretions, and miR138-2 for saliva. The candidate miRNA markers were then validated via hydrolysis probes quantitative real-time polymerase chain reaction (TaqMan-qPCR). In addition, BestKeeper software was used to validate the expression stability of four genes, RNU44, RNU48, U6 and U6b, regularly used as reference genes (RGs) for studies involving forensic body fluids. The current study suggests that U6 could be used as a proper RG of miRNAs in forensic body fluid identification. The relative expression ratios (R) of miR486, miR888, miR214, miR16 and miR891a can differentiate the target body fluid from other body fluids that were tested in this study. The detection limit of TaqMan-qPCR of the five confirmed miRNA markers was 10 pg of total RNA. The effect of time-wise degradation of blood stains and semen stains for 1 month under normal laboratory conditions was tested and did not significantly affect the detection results. Herein, this study proposes five body fluid-specific miRNAs for the forensic identification of venous blood, semen, and menstrual blood, of which miR486, miR888, and miR214 may be used as new markers for body fluid identification. Additional work remains necessary in search for suitable miRNA markers and stable RGs for forensic body fluid identification. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  17. Binding Sites Analyser (BiSA): Software for Genomic Binding Sites Archiving and Overlap Analysis

    PubMed Central

    Khushi, Matloob; Liddle, Christopher; Clarke, Christine L.; Graham, J. Dinny

    2014-01-01

    Genome-wide mapping of transcription factor binding and histone modification reveals complex patterns of interactions. Identifying overlaps in binding patterns by different factors is a major objective of genomic studies, but existing methods to archive large numbers of datasets in a personalised database lack sophistication and utility. Therefore we have developed transcription factor DNA binding site analyser software (BiSA), for archiving of binding regions and easy identification of overlap with or proximity to other regions of interest. Analysis results can be restricted by chromosome or base pair overlap between regions or maximum distance between binding peaks. BiSA is capable of reporting overlapping regions that share common base pairs; regions that are nearby; regions that are not overlapping; and average region sizes. BiSA can identify genes located near binding regions of interest, genomic features near a gene or locus of interest and statistical significance of overlapping regions can also be reported. Overlapping results can be visualized as Venn diagrams. A major strength of BiSA is that it is supported by a comprehensive database of publicly available transcription factor binding sites and histone modifications, which can be directly compared to user data. The documentation and source code are available on http://bisa.sourceforge.net PMID:24533055
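
    The overlap and proximity criteria described above can be sketched in a few lines. This is a hedged illustration, not BiSA's code: the `Region` type, the half-open coordinate convention, and the use of region midpoints as a stand-in for binding peaks are all assumptions made for the example.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive


def overlap_bp(a: Region, b: Region) -> int:
    """Number of base pairs shared by two regions (0 if none)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def peak_distance(a: Region, b: Region) -> Optional[int]:
    """Distance between region midpoints, used here as a proxy for peaks.

    Returns None when the regions lie on different chromosomes.
    """
    if a.chrom != b.chrom:
        return None
    return abs((a.start + a.end) // 2 - (b.start + b.end) // 2)


def find_overlaps(queries, targets, min_overlap=1):
    """All (query, target) pairs sharing at least `min_overlap` base pairs."""
    return [(q, t) for q in queries for t in targets
            if overlap_bp(q, t) >= min_overlap]


# Hypothetical regions for illustration.
a = Region("chr1", 100, 200)
b = Region("chr1", 150, 300)
c = Region("chr2", 150, 300)
print(overlap_bp(a, b), peak_distance(a, b), find_overlaps([a], [b, c]))
```

    The pairwise scan is quadratic in the number of regions; for genome-scale data sets a sorted sweep or an interval tree would be the usual replacement.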

  18. Identification of internal control genes for quantitative expression analysis by real-time PCR in bovine peripheral lymphocytes.

    PubMed

    Spalenza, Veronica; Girolami, Flavia; Bevilacqua, Claudia; Riondato, Fulvio; Rasero, Roberto; Nebbia, Carlo; Sacchi, Paola; Martin, Patrice

    2011-09-01

    Gene expression studies in blood cells, particularly lymphocytes, are useful for monitoring potential exposure to toxicants or environmental pollutants in humans and livestock species. Quantitative PCR is the method of choice for obtaining accurate quantification of mRNA transcripts although variations in the amount of starting material, enzymatic efficiency, and the presence of inhibitors can lead to evaluation errors. As a result, normalization of data is of crucial importance. The most common approach is the use of endogenous reference genes as an internal control, whose expression should ideally not vary among individuals and under different experimental conditions. The accurate selection of reference genes is therefore an important step in interpreting quantitative PCR studies. Since no systematic investigation in bovine lymphocytes has been performed, the aim of the present study was to assess the expression stability of seven candidate reference genes in circulating lymphocytes collected from 15 dairy cows. Following the characterization by flow cytometric analysis of the cell populations obtained from blood through a density gradient procedure, three popular software packages were used to evaluate the gene expression data. The results showed that two genes are sufficient for normalization of quantitative PCR studies in cattle lymphocytes and that YWHAZ, S24 and PPIA are the most stable genes. Copyright © 2010 Elsevier Ltd. All rights reserved.

  19. Identification of stable reference genes for quantitative PCR in cells derived from chicken lymphoid organs.

    PubMed

    Borowska, D; Rothwell, L; Bailey, R A; Watson, K; Kaiser, P

    2016-02-01

    Quantitative polymerase chain reaction (qPCR) is a powerful technique for quantification of gene expression, especially genes involved in immune responses. Although qPCR is a very efficient and sensitive tool, variations in the enzymatic efficiency, quality of RNA and the presence of inhibitors can lead to errors. Therefore, qPCR needs to be normalised to obtain reliable results and allow comparison. The most common approach is to use reference genes as internal controls in qPCR analyses. In this study, expression of seven genes, including β-actin (ACTB), β-2-microglobulin (B2M), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), β-glucuronidase (GUSB), TATA box binding protein (TBP), α-tubulin (TUBAT) and 28S ribosomal RNA (r28S), was determined in cells isolated from chicken lymphoid tissues and stimulated with three different mitogens. The stability of the genes was measured using geNorm, NormFinder and BestKeeper software. The results from both geNorm and NormFinder were that the three most stably expressed genes in this panel were TBP, GAPDH and r28S. BestKeeper did not generate clear answers because of the highly heterogeneous sample set. Based on these data we will include TBP in future qPCR normalisation. The study shows the importance of appropriate reference gene normalisation in other tissues before qPCR analysis. Copyright © 2016 Elsevier B.V. All rights reserved.
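
    As a rough illustration of how a pairwise stability ranking of candidate reference genes can work, the sketch below computes a geNorm-style stability value M: for each gene, the mean over all other genes of the standard deviation, across samples, of the log2 expression ratio. Lower M means more stable expression. This is a simplified sketch and not the geNorm software itself; the gene names and relative quantities are invented for the example, and the use of the sample standard deviation is an assumption.

```python
import math
from statistics import mean, stdev


def stability_m(expr):
    """Return {gene: M} for a dict {gene: [relative quantity per sample]}.

    M for gene j is the average, over every other gene k, of the standard
    deviation across samples of log2(quantity_j / quantity_k).
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values


# Invented relative quantities for three hypothetical genes in three samples.
# geneA and geneB keep a constant ratio; geneC fluctuates independently.
expr = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],
    "geneC": [1.0, 4.0, 1.0],
}
for gene, m in sorted(stability_m(expr).items(), key=lambda item: item[1]):
    print(gene, round(m, 3))
```

    Because the measure depends only on ratios between genes, two co-regulated genes can look stable together, which is one reason studies such as the one above compare several tools.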

  20. Metagenomic gene annotation by a homology-independent approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Froula, Jeff; Zhang, Tao; Salmeen, Annette

    2011-06-02

    Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly-related or structural homologs. Furthermore, homology searches with millions of genes are very computationally intensive. To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMER but with comparable accuracy, at 94.5% and 99.9% accuracy, respectively. More importantly, rhModeller has the ability to detect novel proteins that do not share significant homology to any known protein families. As ~50% of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulase candidates and 8,196 novel carbonic anhydrase candidates. In summary, we expect rhModeller to dramatically increase the speed and quality of metagenomic gene annotation.

  1. Evidence of Molecular Adaptation to Extreme Environments and Applicability to Space Environments

    NASA Astrophysics Data System (ADS)

    Filipovic, M. D.; Ognjanovic, S.; Ognjanovic, M.

    2008-06-01

    This is an initial investigation of gene signatures responsible for adapting microscopic life to extreme Earth environments. We present preliminary results: identifying the clusters of orthologous groups (COGs) common to several hyperthermophiles and excluding those also found in a mesophile (non-hyperthermophile), Escherichia coli (E. coli K12), should yield a group of proteins possibly involved in adaptation to life under extreme temperatures. Comparative genome analyses represent a powerful tool in the discovery of novel genes responsible for adaptation to specific extreme environments. Methanogens stand out as the only group of organisms that have species capable of growth at 0 °C (Methanogenium frigidum (M. frigidum) and Methanococcoides burtonii (M. burtonii)) and 110 °C (Methanopyrus kandleri (M. kandleri)). Although not all the components of heat adaptation can be attributed to novel genes, the chaperones known as heat shock proteins stabilize the enzymes under elevated temperature. However, the highly conserved chaperones found in bacteria and eukaryotes are not present in hyperthermophilic Archaea; rather, they have a unique chaperone, TF55. Our aim was to use software which we specifically developed for extremophile genome comparative analyses in order to search for additional novel genes involved in hyperthermophile adaptation. The following hyperthermophile genomes incorporated in this software were used for these studies: Methanocaldococcus jannaschii (M. jannaschii), M. kandleri, Archaeoglobus fulgidus (A. fulgidus) and three species of Pyrococcus. Common genes were annotated and grouped according to their roles in cellular processes where such information was available, and proteins not previously implicated in the heat adaptation of hyperthermophiles were identified. Additional experimental data are needed in order to learn more about these proteins. To address non-gene-based components of thermal adaptation, all sequenced extremophiles were analysed for their GC content and amino acid hydrophobicity. Finally, we develop a prediction model for optimal growth temperature.
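
    The filtering step described in this abstract (COGs shared by every hyperthermophile, minus those also present in the mesophile) amounts to a set intersection followed by a set difference. The sketch below shows that logic only; it is not the authors' software, and the organism labels and COG identifiers are placeholders invented for the example.

```python
def candidate_cogs(hyperthermophile_cogs, mesophile_cogs):
    """COGs present in every hyperthermophile but absent from the mesophile.

    hyperthermophile_cogs: {organism: iterable of COG identifiers}
    mesophile_cogs: iterable of COG identifiers
    """
    shared = set.intersection(*(set(c) for c in hyperthermophile_cogs.values()))
    return shared - set(mesophile_cogs)


# Placeholder identifiers, not real annotations.
hyperthermophiles = {
    "organism_1": {"COG_A", "COG_B", "COG_C", "COG_D"},
    "organism_2": {"COG_A", "COG_B", "COG_D"},
    "organism_3": {"COG_A", "COG_B", "COG_D", "COG_E"},
}
mesophile = {"COG_A", "COG_E"}

print(sorted(candidate_cogs(hyperthermophiles, mesophile)))
```

    Requiring presence in every hyperthermophile is a strict criterion; relaxing it to "most" genomes would need a count per COG instead of a plain intersection.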

  2. Frequency Domain Identification Toolbox

    NASA Technical Reports Server (NTRS)

    Horta, Lucas G.; Juang, Jer-Nan; Chen, Chung-Wen

    1996-01-01

    This report documents software written in the MATLAB programming language for performing identification of systems from frequency response functions. MATLAB is a commercial software environment which allows easy manipulation of data matrices and provides other intrinsic matrix function capabilities. Algorithms programmed in this collection of subroutines have been documented elsewhere, but all references are provided in this document. A main feature of this software is the use of matrix fraction descriptions and system realization theory to identify state space models directly from test data. All subroutines have templates for the user to use as guidelines.

  3. Lessons learned from gene identification studies in Mendelian epilepsy disorders

    PubMed Central

    Hardies, Katia; Weckhuysen, Sarah; De Jonghe, Peter; Suls, Arvid

    2016-01-01

    Next-generation sequencing (NGS) technologies are now routinely used for gene identification in Mendelian disorders. Setting up cost-efficient NGS projects and managing the large number of variants remain, however, a challenging job. Here we provide insights into the decision-making processes before and after the use of NGS in gene identification studies. Genetic factors are thought to have a role in ~70% of all epilepsies, and a variety of inheritance patterns have been described for seizure-associated gene defects. We therefore chose epilepsy as disease model and selected 35 NGS studies that focused on patients with a Mendelian epilepsy disorder. The strategies used for gene identification and their respective outcomes were reviewed. High-throughput NGS strategies have led to the identification of several new epilepsy-causing genes, enlarging our knowledge on both known and novel pathomechanisms. NGS findings have furthermore extended the awareness of phenotypical and genetic heterogeneity. By discussing recent studies we illustrate: (I) the power of NGS for gene identification in Mendelian disorders, (II) the accelerating pace in which this field evolves, and (III) the considerations that have to be made when performing NGS studies. Despite the enormous rise in gene discovery over the last decade, many patients and families included in gene identification studies still remain without a molecular diagnosis; hence, further genetic research is warranted. On the basis of successful NGS studies in epilepsy, we discuss general approaches to guide human geneticists and clinicians in setting up cost-efficient gene identification NGS studies. PMID:26603999

  4. Identification and evaluation of reference genes for qRT-PCR studies in Lentinula edodes

    PubMed Central

    Qin, Peng; He, Maolan; Yu, Xiumei; Zhao, Ke; Zhang, Xiaoping; Ma, Menggen; Chen, Qiang; Chen, Xiaoqiong; Zeng, Xianfu; Gu, Yunfu

    2018-01-01

    Lentinula edodes (shiitake mushroom) is a common edible mushroom with a number of potential therapeutic and nutritional applications. It contains various medically important molecules, such as polysaccharides, terpenoids, sterols, and lipids. Quantitative real-time polymerase chain reaction (qRT-PCR) is a powerful tool to analyze the mechanisms underlying the biosynthetic pathways of these substances. qRT-PCR is used for accurate analyses of transcript levels owing to its rapidity, sensitivity, and reliability. However, its accuracy and reliability for the quantification of transcripts rely on the expression stability of the reference genes used for data normalization. To ensure the reliability of gene expression analyses using qRT-PCR in L. edodes molecular biology research, it is necessary to systematically evaluate reference genes. In the current study, ten potential reference genes were selected from L. edodes genomic data and their expression levels were measured by qRT-PCR using various samples. The expression stability of each candidate gene was analyzed by three commonly used software packages: geNorm, NormFinder, and BestKeeper. Based on the results, Rpl4 was the most stable reference gene across all experimental conditions, and Atu was the most stable gene among strains. 18S was found to be the best reference gene for different development stages, and Rpl4 was the most stably expressed gene under various nutrient conditions. The present work will contribute to qRT-PCR studies in L. edodes. PMID:29293626

  5. Identification and evaluation of reference genes for qRT-PCR studies in Lentinula edodes.

    PubMed

    Xiang, Quanju; Li, Jin; Qin, Peng; He, Maolan; Yu, Xiumei; Zhao, Ke; Zhang, Xiaoping; Ma, Menggen; Chen, Qiang; Chen, Xiaoqiong; Zeng, Xianfu; Gu, Yunfu

    2018-01-01

    Lentinula edodes (shiitake mushroom) is a common edible mushroom with a number of potential therapeutic and nutritional applications. It contains various medically important molecules, such as polysaccharides, terpenoids, sterols, and lipids. Quantitative real-time polymerase chain reaction (qRT-PCR) is a powerful tool to analyze the mechanisms underlying the biosynthetic pathways of these substances. qRT-PCR is used for accurate analyses of transcript levels owing to its rapidity, sensitivity, and reliability. However, its accuracy and reliability for the quantification of transcripts rely on the expression stability of the reference genes used for data normalization. To ensure the reliability of gene expression analyses using qRT-PCR in L. edodes molecular biology research, it is necessary to systematically evaluate reference genes. In the current study, ten potential reference genes were selected from L. edodes genomic data and their expression levels were measured by qRT-PCR using various samples. The expression stability of each candidate gene was analyzed by three commonly used software packages: geNorm, NormFinder, and BestKeeper. Based on the results, Rpl4 was the most stable reference gene across all experimental conditions, and Atu was the most stable gene among strains. 18S was found to be the best reference gene for different development stages, and Rpl4 was the most stably expressed gene under various nutrient conditions. The present work will contribute to qRT-PCR studies in L. edodes.

  6. A web server for mining Comparative Genomic Hybridization (CGH) data

    NASA Astrophysics Data System (ADS)

    Liu, Jun; Ranka, Sanjay; Kahveci, Tamer

    2007-11-01

    Advances in cytogenetics and molecular biology have established that chromosomal alterations are critical in the pathogenesis of human cancer. Recurrent chromosomal alterations provide cytological and molecular markers for the diagnosis and prognosis of disease. They also facilitate the identification of genes that are important in carcinogenesis, which in the future may help in the development of targeted therapy. A large and growing amount of cancer genetic data is now publicly available. There is a need for public domain tools that allow users to analyze their data and visualize the results. This chapter describes a web-based software tool that will allow researchers to analyze and visualize Comparative Genomic Hybridization (CGH) datasets. It employs novel data mining methodologies for clustering and classification of CGH datasets as well as algorithms for identifying important markers (small sets of genomic intervals with aberrations) that are potentially cancer signatures. The developed software will help in understanding the relationships between genomic aberrations and cancer types.

  7. Isolation and identification of halotolerant soil bacteria from coastal Patenga area.

    PubMed

    Rahman, Shafkat Shamim; Siddique, Romana; Tabassum, Nafisa

    2017-10-30

    Halotolerant bacteria have multiple uses, viz. fermentation with less stringent sterility control and industrial production of bioplastics. Moreover, transferring their salt-tolerance genes into plants may increase the crop productivity of coastal saline lands in Bangladesh. The study focused on the isolation and identification of the halotolerant bacteria from three soil samples collected from the coastal Patenga area. The samples were inoculated in nutrient media containing a wide range of salt concentrations. All the samples showed 2, 4 and 6% (w/v) salt tolerance. The isolates from Patenga soil (4, 6%) and beach soil (2%) showed catalase activity, and all the isolates showed negative results for oxidase activity, indole production, lactose and motility. All the samples provided positive results for dextrose fermentation. Other tests provided mixed results. Based on the morphological characteristics, biochemical tests and ABIS software analysis, the isolates fall within the Enterobacteriaceae, Clostridium and Corynebacterium, with a predominance of Vibrios. Overall the isolates can be considered mildly halotolerant, with the best growth observed at lower salinities and no halophilism detected. Among many possibilities, the genes responsible for the salt-tolerance trait in these species can be identified, extracted and inserted into crop plants to form transgenic plants that give higher yields for the rest of the year.

  8. ODEion--a software module for structural identification of ordinary differential equations.

    PubMed

    Gennemark, Peter; Wedelin, Dag

    2014-02-01

    In the systems biology field, algorithms for structural identification of ordinary differential equations (ODEs) have mainly focused on fixed model spaces like S-systems and/or on methods that require sufficiently good data so that derivatives can be accurately estimated. There is therefore a lack of methods and software that can handle more general models and realistic data. We present ODEion, a software module for structural identification of ODEs. Main characteristic features of the software are: • The model space is defined by arbitrary user-defined functions that can be nonlinear in both variables and parameters, such as for example chemical rate reactions. • ODEion implements computationally efficient algorithms that have been shown to efficiently handle sparse and noisy data. It can run a range of realistic problems that previously required a supercomputer. • ODEion is easy to use and provides SBML output. We describe the mathematical problem, the ODEion system itself, and provide several examples of how the system can be used. Available at: http://www.odeidentification.org.

  9. Whole-genome transcription and DNA methylation analysis of peripheral blood mononuclear cells identified aberrant gene regulation pathways in systemic lupus erythematosus.

    PubMed

    Zhu, Honglin; Mi, Wentao; Luo, Hui; Chen, Tao; Liu, Shengxi; Raman, Indu; Zuo, Xiaoxia; Li, Quan-Zhen

    2016-07-13

    Recent achievements in genetics and epigenetics have led to the exploration of the pathogenesis of systemic lupus erythematosus (SLE). Identification of differentially expressed genes and their regulatory mechanism(s) at whole-genome level will provide a comprehensive understanding of the development of SLE and its devastating complication, lupus nephritis (LN). We performed whole-genome transcription and DNA methylation analysis in PBMC of 30 SLE patients, including 15 with LN (SLE LN(+)) and 15 without LN (SLE LN(-)), and 25 normal controls (NC) using HumanHT-12 Beadchips and Illumina Human Methy450 chips. The serum proinflammatory cytokines were quantified using Bio-plex Human Cytokine 27-plex assay. Differentially expressed genes and differentially methylated CpG were analyzed with GenomeStudio, R, and SAM software. The association between DNA methylation and gene expression was tested. Gene interaction pathways of the differentially expressed genes were analyzed by IPA software. We identified 552 upregulated genes and 550 downregulated genes in PBMC of SLE. Integration of DNA methylation and gene expression profiling showed that 334 upregulated genes were hypomethylated, and 479 downregulated genes were hypermethylated. Pathway analysis on the differential genes in SLE revealed significant enrichment in interferon (IFN) signaling and toll-like receptor (TLR) signaling pathways. Nine IFN- and seven TLR-related genes were identified and displayed step-wise increase in SLE LN(-) and SLE LN(+). Hypomethylated CpG sites were detected on these genes. The gene expressions for MX1, GPR84, and E2F2 were increased in SLE LN(+) as compared to SLE LN(-) patients. The serum levels of inflammatory cytokines, including IL17A, IP-10, bFGF, TNF-α, IL-6, IL-15, GM-CSF, IL-1RA, IL-5, and IL-12p70, were significantly elevated in SLE compared with NC. The levels of IL-15 and IL-1RA correlated with their mRNA expression. The upregulation of IL-15 may be regulated by hypomethylated CpG sites in the promoter region of the gene. Our study has demonstrated that a significant number of differential genes in SLE were involved in IFN, TLR signaling pathways, and inflammatory cytokines. The enrichment of differential genes has been associated with aberrant DNA methylation, which may be relevant to the pathogenesis of SLE. Our observations have laid the groundwork for further diagnostic and mechanistic studies of SLE and LN.

  10. Species identification in forensic samples using the SPInDel approach: A GHEP-ISFG inter-laboratory collaborative exercise.

    PubMed

    Alves, Cíntia; Pereira, Rui; Prieto, Lourdes; Aler, Mercedes; Amaral, Cesar R L; Arévalo, Cristina; Berardi, Gabriela; Di Rocco, Florencia; Caputo, Mariela; Carmona, Cristian Hernandez; Catelli, Laura; Costa, Heloísa Afonso; Coufalova, Pavla; Furfuro, Sandra; García, Óscar; Gaviria, Anibal; Goios, Ana; Gómez, Juan José Builes; Hernández, Alexis; Hernández, Eva Del Carmen Betancor; Miranda, Luís; Parra, David; Pedrosa, Susana; Porto, Maria João Anjos; Rebelo, Maria de Lurdes; Spirito, Matteo; Torres, María Del Carmen Villalobos; Amorim, António; Pereira, Filipe

    2017-05-01

    DNA is a powerful tool available for forensic investigations requiring identification of species. However, it is necessary to develop and validate methods able to produce results in degraded and/or low-quality DNA samples with the high standards obligatory in forensic research. Here, we describe a voluntary collaborative exercise to test the recently developed Species Identification by Insertions/Deletions (SPInDel) method. The SPInDel kit allows the identification of species by the generation of numeric profiles combining the lengths of six mitochondrial ribosomal RNA (rRNA) gene regions amplified in a single reaction followed by capillary electrophoresis. The exercise was organized during 2014 by a Working Commission of the Spanish and Portuguese-Speaking Working Group of the International Society for Forensic Genetics (GHEP-ISFG), created in 2013. The 24 participating laboratories from 10 countries were asked to identify the species in 11 DNA samples from previous GHEP-ISFG proficiency tests using a SPInDel primer mix and control samples of the 10 target species. Computer software was also provided to the participants to assist the analyses of the results. All samples were correctly identified by 22 of the 24 laboratories, including samples with low amounts of DNA (hair shafts) and mixtures of saliva and blood. Correct species identifications were obtained in 238 of the 241 (98.8%) reported SPInDel profiles. Two laboratories were responsible for the three cases of misclassifications. The SPInDel was efficient in the identification of species in mixtures considering that only a single laboratory failed to detect a mixture in one sample. This result suggests that SPInDel is a valid method for mixture analyses without the need for DNA sequencing, with the advantage of identifying more than one species in a single reaction. The low frequency of wrong (5.0%) and missing (2.1%) alleles did not interfere with the correct species identification, which demonstrated the advantage of using a method based on the analysis of multiple loci. Overall, the SPInDel method was easily implemented by laboratories using different genotyping platforms, the interpretation of results was straightforward and the SPInDel software was used without any problems. The results of this collaborative exercise indicate that the SPInDel method can be applied successfully in forensic casework investigations. Copyright © 2017 Elsevier B.V. All rights reserved.
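
    The idea of identifying a species from a numeric profile of fragment lengths, while tolerating an occasional wrong or missing allele, can be sketched as a best-match lookup over reference profiles. This is an illustrative sketch only and not the SPInDel software; the species labels, fragment lengths, and scoring rule (count of agreeing loci) are assumptions made for the example.

```python
def best_match(observed, references):
    """Return (species, matching_loci) for the closest reference profile.

    observed: tuple of six fragment lengths, with None for a missing allele
    references: {species: tuple of six reference fragment lengths}
    Ties are broken alphabetically by species label.
    """
    def score(ref):
        # Count loci where an allele was observed and agrees with the reference.
        return sum(1 for o, r in zip(observed, ref) if o is not None and o == r)

    ranked = sorted(references, key=lambda s: (-score(references[s]), s))
    top = ranked[0]
    return top, score(references[top])


# Invented reference profiles (fragment lengths in base pairs).
references = {
    "species_A": (100, 152, 98, 210, 175, 130),
    "species_B": (100, 149, 101, 210, 180, 130),
}

# One locus missing (None) and one locus miscalled (131 instead of 130).
observed = (100, 152, None, 210, 175, 131)
print(best_match(observed, references))
```

    With six loci, a single wrong or missing length still leaves several loci that separate the two hypothetical species, which mirrors the multi-locus advantage noted in the abstract.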

  11. Bioinformatics approach for choosing the correct reference genes when studying gene expression in human keratinocytes.

    PubMed

    Beer, Lucian; Mlitz, Veronika; Gschwandtner, Maria; Berger, Tanja; Narzt, Marie-Sophie; Gruber, Florian; Brunner, Patrick M; Tschachler, Erwin; Mildner, Michael

    2015-10-01

    Quantitative reverse transcription polymerase chain reaction (qRT-PCR) has become a mainstay in many areas of skin research. To enable quantitative analysis, it is necessary to analyse expression of reference genes (RGs) for normalization of target gene expression. The selection of reliable RGs therefore has an important impact on the experimental outcome. In this study, we aimed to identify and validate the best suited RGs for qRT-PCR in human primary keratinocytes (KCs) over a broad range of experimental conditions using the novel bioinformatics tool 'RefGenes', which is based on a manually curated database of published microarray data. Expression of 6 RGs identified by RefGenes software and 12 commonly used RGs was validated by qRT-PCR. We assessed whether these 18 markers fulfilled the requirements for a valid RG by the comprehensive ranking of four bioinformatics tools and the coefficient of variation (CV). In an overall ranking, we found GUSB to be the most stably expressed RG, whereas the expression values of the commonly used RGs GAPDH and B2M were significantly affected by varying experimental conditions. Our results identify RefGenes as a powerful tool for the identification of valid RGs and suggest GUSB as the most reliable RG for KCs. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  12. Gene expression profile of human Down syndrome leukocytes.

    PubMed

    Malagó, Wilson; Sommer, César A; Del Cistia Andrade, Camillo; Soares-Costa, Andrea; Abrao Possik, Patricia; Cassago, Alexandre; Santejo Silveira, Henrique C; Henrique-Silva, Flavio

    2005-08-01

    Identification of differences in the gene expression patterns of Down syndrome and normal leukocytes. We constructed the first Down syndrome leukocyte serial analysis of gene expression (SAGE) library from a 28-year-old patient. This library was analyzed and compared with a normal leukocyte SAGE library using the eSAGE software. Reverse transcriptase polymerase chain reaction (RT-PCR) was used to validate the results. We found that a large number of unidentified transcripts were overexpressed in Down syndrome leukocytes and some transcripts coding for growth factors (e.g., interleukin 8, IL-8), ribosomal proteins (e.g., L13a, L29, and L37), and transcription factors (e.g., Jun B, Jun D, and C/EBP beta) were underexpressed. The SAGE data were successfully validated for the genes IL-8, CXCR4, BCL2A1, L13a, L29, L37, and GTF3A using RT-PCR. Our analysis identified significant changes in the expression pattern of Down syndrome leukocytes compared with normal ones, including key regulators of growth and proliferation, ribosomal proteins, and a large number of overexpressed transcripts that were not matched in UniGene clusters and that may represent novel genes related to Down syndrome. This study offers a new insight into transcriptional changes in Down syndrome leukocytes and indicates candidate genes for further investigations into the molecular mechanism of Down syndrome pathology.

  13. Identification of essential genes and synthetic lethal gene combinations in Escherichia coli K-12.

    PubMed

    Mori, Hirotada; Baba, Tomoya; Yokoyama, Katsushi; Takeuchi, Rikiya; Nomura, Wataru; Makishi, Kazuichi; Otsuka, Yuta; Dose, Hitomi; Wanner, Barry L

    2015-01-01

    Here we describe the systematic identification of single genes and gene pairs whose knockout causes lethality in Escherichia coli K-12. During construction of a precise single-gene knockout library of E. coli K-12, we identified 328 essential gene candidates for growth in complex (LB) medium. Upon establishment of the Keio single-gene deletion library, we undertook the development of the ASKA single-gene deletion library carrying a different antibiotic resistance. In addition, we developed tools for identification of synthetic lethal gene combinations by systematic construction of double-gene knockout mutants. We introduce these methods herein.

  14. Deep sequencing with intronic capture enables identification of an APC exon 10 inversion in a patient with polyposis.

    PubMed

    Shirts, Brian H; Salipante, Stephen J; Casadei, Silvia; Ryan, Shawnia; Martin, Judith; Jacobson, Angela; Vlaskin, Tatyana; Koehler, Karen; Livingston, Robert J; King, Mary-Claire; Walsh, Tom; Pritchard, Colin C

    2014-10-01

    Single-exon inversions have rarely been described in clinical syndromes and are challenging to detect using Sanger sequencing. We report the case of a 40-year-old woman with adenomatous colon polyps too numerous to count and who had a complex inversion spanning the entire exon 10 in APC (the gene encoding adenomatous polyposis coli), causing exon skipping and resulting in a frameshift and premature protein truncation. In this study, we employed complete APC gene sequencing using high-coverage next-generation sequencing by ColoSeq, analysis with BreakDancer and SLOPE software, and confirmatory transcript analysis. ColoSeq identified a complex small genomic rearrangement consisting of an inversion that results in translational skipping of exon 10 in the APC gene. This mutation would not have been detected by traditional sequencing or gene-dosage methods. We report a case of adenomatous polyposis resulting from a complex single-exon inversion. Our report highlights the benefits of large-scale sequencing methods that capture intronic sequences with high enough depth of coverage, as well as the use of informatics tools, to enable detection of small pathogenic structural rearrangements.

  15. 8D.07: GENE EXPRESSION ANALYSIS AND BIOINFORMATICS REVEALED POTENTIAL TRANSCRIPTION FACTORS ASSOCIATED WITH RENIN-ANGIOTENSIN-ALDOSTERONE SYSTEM IN ATHEROMA.

    PubMed

    Nehme, A; Zibara, K; Cerutti, C; Bricca, G

    2015-06-01

    The implication of the renin-angiotensin-aldosterone system (RAAS) in atheroma development is well described. However, a complete view of the local RAAS in atheroma is still missing. In this study we aimed to reveal the organization of RAAS in atheroma at the transcriptomic level and identify the transcriptional regulators behind it. Extended RAAS (extRAAS) was defined as the set of 37 genes coding for classical and novel RAAS participants (Figure 1). Five microarray datasets containing overall 590 samples representing carotid and peripheral atheroma were downloaded from the GEO database. Correlation-based hierarchical clustering (R software) of extRAAS genes within each dataset allowed the identification of modules of co-expressed genes. Reproducible co-expression modules across datasets were then extracted. Transcription factors (TFs) having common binding sites (TFBSs) in the promoters of coordinated genes were identified using the Genomatix database tools and analyzed for their correlation with extRAAS genes in the microarray datasets. Expression data revealed the expressed extRAAS components and their relative abundance displaying the favored pathways in atheroma. Three co-expression modules with more than 80% reproducibility across datasets were extracted. Two of them (M1 and M2) contained genes coding for angiotensin metabolizing enzymes involved in different pathways: M1 included ACE, MME, RNPEP, and DPP3, in addition to 7 other genes; and M2 included CMA1, CTSG, and CPA3. The third module (M3) contained genes coding for receptors known to be implicated in atheroma (AGTR1, MR, GR, LNPEP, EGFR and GPER). M1 and M3 were negatively correlated in 3 of 5 datasets. We identified 19 TFs that have enriched TFBSs in the promoters of genes of M1, and two for M3, but none was found for M2. 
    Among the extracted TFs, ELF1, MAX, and IRF5 showed significant positive correlations with peptidase-coding genes from M1 and negative correlations with receptor-coding genes from M3 (p < 0.05). The identified co-expression modules display the transcriptional organization of local extRAAS in human carotid atheroma. The identification of several TFs potentially associated with extRAAS genes may provide a frame for the discovery of atheroma-specific modulators of extRAAS activity. (Figure is included in full-text article.)

  16. An open-source framework for large-scale, flexible evaluation of biomedical text mining systems.

    PubMed

    Baumgartner, William A; Cohen, K Bretonnel; Hunter, Lawrence

    2008-01-29

    Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework and its ability to uncover system-wide characteristics by analyzing component parts as well as its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain. Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision. The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net.
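
    The precision and recall measures discussed above can be sketched as follows. This is a minimal illustration assuming exact-match scoring of predicted gene mention spans against a gold-standard set; the framework itself supports several correctness measures, and the function name and character offsets below are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted mention spans against gold spans using exact matching."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (start, end) character offsets of gene mentions in one document.
gold_spans = {(0, 4), (10, 15), (20, 26)}
predicted_spans = {(0, 4), (10, 15), (30, 33), (40, 44)}
precision, recall, f1 = precision_recall_f1(predicted_spans, gold_spans)
```

    Here the hypothetical system finds two of three gold mentions (recall 2/3) but half of its predictions are spurious (precision 1/2), the kind of trade-off the abstract examines for downstream gene normalization.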

  17. An open-source framework for large-scale, flexible evaluation of biomedical text mining systems

    PubMed Central

    Baumgartner, William A; Cohen, K Bretonnel; Hunter, Lawrence

    2008-01-01

    Background Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework and its ability to uncover system-wide characteristics by analyzing component parts as well as its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain. Results Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision. Conclusion The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net. PMID:18230184

  18. Does filler database size influence identification accuracy?

    PubMed

    Bergold, Amanda N; Heaton, Paul

    2018-06-01

    Police departments increasingly use large photo databases to select lineup fillers using facial recognition software, but this technological shift's implications have been largely unexplored in eyewitness research. Database use, particularly if coupled with facial matching software, could enable lineup constructors to increase filler-suspect similarity and thus enhance eyewitness accuracy (Fitzgerald, Oriet, Price, & Charman, 2013). However, with a large pool of potential fillers, such technologies might theoretically produce lineup fillers too similar to the suspect (Fitzgerald, Oriet, & Price, 2015; Luus & Wells, 1991; Wells, Rydell, & Seelau, 1993). This research proposes a new factor, filler database size, as a lineup feature affecting eyewitness accuracy. In a facial recognition experiment, we select lineup fillers in a legally realistic manner using facial matching software applied to filler databases of 5,000, 25,000, and 125,000 photos, and find that larger databases are associated with a higher objective similarity rating between suspects and fillers and lower overall identification accuracy. In target-present lineups, witnesses viewing lineups created from the larger databases were less likely to make correct identifications and more likely to select known innocent fillers. When the target was absent, database size was associated with a lower rate of correct rejections and a higher rate of filler identifications. Higher algorithmic similarity ratings were also associated with decreases in eyewitness identification accuracy. The results suggest that using facial matching software to select fillers from large photograph databases may reduce identification accuracy, and provide support for filler database size as a meaningful system variable. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  19. Spacelab user implementation assessment study. (Software requirements analysis). Volume 2: Technical report

    NASA Technical Reports Server (NTRS)

    1976-01-01

    The engineering analyses and evaluation studies conducted for the Software Requirements Analysis are discussed. Included are the development of the study data base, synthesis of implementation approaches for software required by both mandatory onboard computer services and command/control functions, and identification and implementation of software for ground processing activities.

  20. Identification of appropriate reference genes for normalizing transcript expression by quantitative real-time PCR in Litsea cubeba.

    PubMed

    Lin, Liyuan; Han, Xiaojiao; Chen, Yicun; Wu, Qingke; Wang, Yangdong

    2013-12-01

    Quantitative real-time PCR has emerged as a highly sensitive and widely used method for detection of gene expression profiles, for which accurate detection depends on reliable normalization. Since no single control is appropriate for all experimental treatments, it is generally advocated to select suitable internal controls prior to use for normalization. This study reported the evaluation of the expression stability of twelve potential reference genes in different tissue/organs and six fruit developmental stages of Litsea cubeba in order to screen the superior internal reference genes for data normalization. Two software programs, geNorm and NormFinder, were used to identify the stability of these candidate genes. The cycle threshold difference and coefficient of variance were also calculated to evaluate the expression stability of candidate genes. F-BOX, EF1α, UBC, and TUA were selected as the most stable reference genes across 11 sample pools. F-BOX, EF1α, and EIF4α exhibited the highest expression stability in different tissue/organs and different fruit developmental stages. In addition, a combination of two stable reference genes would be sufficient for gene expression normalization in different fruit developmental stages. The relative expression profiles of DXS and DXR were also evaluated using EF1α, UBC, and SAMDC. The results further validated the reliability of stable reference genes and also highlighted the importance of selecting suitable internal controls for L. cubeba. These reference genes will be of great importance for transcript normalization in future gene expression studies on L. cubeba.

  1. Functional Annotation of the Arabidopsis Genome Using Controlled Vocabularies

    PubMed Central

    Berardini, Tanya Z.; Mundodi, Suparna; Reiser, Leonore; Huala, Eva; Garcia-Hernandez, Margarita; Zhang, Peifen; Mueller, Lukas A.; Yoon, Jungwoon; Doyle, Aisling; Lander, Gabriel; Moseyko, Nick; Yoo, Danny; Xu, Iris; Zoeckler, Brandon; Montoya, Mary; Miller, Neil; Weems, Dan; Rhee, Seung Y.

    2004-01-01

    Controlled vocabularies are increasingly used by databases to describe genes and gene products because they facilitate identification of similar genes within an organism or among different organisms. One of The Arabidopsis Information Resource's goals is to associate all Arabidopsis genes with terms developed by the Gene Ontology Consortium that describe the molecular function, biological process, and subcellular location of a gene product. We have also developed terms describing Arabidopsis anatomy and developmental stages and use these to annotate published gene expression data. As of March 2004, we used computational and manual annotation methods to make 85,666 annotations representing 26,624 unique loci. We focus on associating genes to controlled vocabulary terms based on experimental data from the literature and use The Arabidopsis Information Resource-developed PubSearch software to facilitate this process. Each annotation is tagged with a combination of evidence codes, evidence descriptions, and references that provide a robust means to assess data quality. Annotation of all Arabidopsis genes will allow quantitative comparisons between sets of genes derived from sources such as microarray experiments. The Arabidopsis annotation data will also facilitate annotation of newly sequenced plant genomes by using sequence similarity to transfer annotations to homologous genes. In addition, complete and up-to-date annotations will make unknown genes easy to identify and target for experimentation. Here, we describe the process of Arabidopsis functional annotation using a variety of data sources and illustrate several ways in which this information can be accessed and used to infer knowledge about Arabidopsis and other plant species. PMID:15173566

  2. antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters

    PubMed Central

    Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko

    2015-01-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. PMID:25948579

  3. Constructing an integrated gene similarity network for the identification of disease genes.

    PubMed

    Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

    2017-09-20

    Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We proposed a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB stems from the IGSN, which has wider coverage and higher reliability than current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
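
    The random walk with restart step named above can be sketched in a few lines. This is a minimal illustration on a single small weighted graph, not the bilayer phenotype-gene network or the RWRB implementation itself; it assumes the standard update p(t+1) = (1 - r) W p(t) + r p(0) with a column-normalized transition matrix W. The node names, restart probability and function name are hypothetical.

```python
def random_walk_with_restart(graph, seeds, restart=0.7, tol=1e-12, max_iter=10_000):
    """Return steady-state visiting probabilities for a walker restarting at `seeds`.

    `graph` maps each node to a dict of neighbor -> edge weight (undirected edges
    listed in both directions). Every seed is assumed to be a node of `graph`,
    and every node is assumed to have at least one edge.
    """
    nodes = list(graph)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    strength = {n: sum(graph[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term, then add probability mass flowing along each edge.
        nxt = {n: restart * p0[n] for n in nodes}
        for j in nodes:
            for i, weight in graph[j].items():
                nxt[i] += (1.0 - restart) * p[j] * weight / strength[j]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical path graph a - b - c - d with unit weights; "a" is the seed.
toy_graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
scores = random_walk_with_restart(toy_graph, seeds={"a"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

    Nodes closer to the seed receive higher scores, which is the property such methods use to rank candidate genes by proximity to known disease genes or phenotypes.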

  4. Identification of Methicillin-Resistant Staphylococcus aureus (MRSA) Using Simultaneous Detection of mecA, nuc, and femB by Loop-Mediated Isothermal Amplification (LAMP).

    PubMed

    Chen, Changguo; Zhao, Qiangyuan; Guo, Jianwei; Li, Yanjun; Chen, Qiuyuan

    2017-08-01

    The aim of this study was to develop a rapid detection assay to identify methicillin-resistant Staphylococcus aureus by simultaneous testing for the mecA, nuc, and femB genes using the loop-mediated isothermal amplification (LAMP) method. LAMP primers were designed using online bio-software ( http://primerexplorer.jp/e/ ), and amplification reactions were performed in an isothermal temperature bath. The products were then examined using 2% agarose gel electrophoresis. MecA, nuc, and femB were confirmed by triplex TaqMan real-time PCR. For better naked-eye inspection of the reaction result, hydroxy naphthol blue (HNB) was added to the amplification system. Within 60 min, LAMP successfully amplified the genes of interest under isothermal conditions at 63 °C. The results of 2% gel electrophoresis indicated that when the Mg2+ concentration in the reaction system was 6 μmol, the amplification of the mecA gene was relatively good, while the amplification of the nuc and femB genes was better at an Mg2+ concentration of 8 μmol. Obvious color differences were observed after adding 1 μL (3.75 mM) of HNB to the 25 μL reaction system. The LAMP assay was applied to 128 isolates of methicillin-resistant Staphylococcus aureus, which were separated from the daily specimens and identified by Vitek microbial identification instruments. The results were identical for both LAMP and PCR. LAMP offers an alternative detection assay for mecA, nuc, and femB and is faster than other methods.

  5. Ortholog Identification and Comparative Analysis of Microbial Genomes Using MBGD and RECOG.

    PubMed

    Uchiyama, Ikuo

    2017-01-01

    Comparative genomics is becoming an essential approach for identification of genes associated with a specific function or phenotype. Here, we introduce the microbial genome database for comparative analysis (MBGD), which is a comprehensive ortholog database among the microbial genomes available so far. MBGD contains several precomputed ortholog tables including the standard ortholog table covering the entire taxonomic range and taxon-specific ortholog tables for various major taxa. In addition, MBGD allows the users to create an ortholog table within any specified set of genomes through dynamic calculations. In particular, MBGD has a "My MBGD" mode where users can upload their original genome sequences and incorporate them into orthology analysis. The created ortholog table can serve as the basis for various comparative analyses. Here, we describe the use of MBGD and briefly explain how to utilize the orthology information during comparative genome analysis in combination with the stand-alone comparative genomics software RECOG, focusing on the application to comparison of closely related microbial genomes.

  6. Assessment of nuclear and mitochondrial genes in precise identification and analysis of genetic polymorphisms for the evaluation of Leishmania parasites.

    PubMed

    Fotouhi-Ardakani, Reza; Dabiri, Shahriar; Ajdari, Soheila; Alimohammadian, Mohammad Hossein; AlaeeNovin, Elnaz; Taleshi, Neda; Parvizi, Parviz

    2016-12-01

    The polymorphism and genetic diversity of the genus Leishmania remain under discussion and depend on many factors, such as the nuclear and/or mitochondrial genes examined, molecular tools, Leishmania species, geographical origin, condition of the micro-environment of Leishmania parasites, and isolation of Leishmania from clinical samples, reservoir hosts and vectors. The genetic variation of Leishmania species (L. major, L. tropica, L. tarentolae, L. mexicana, L. infantum) was analyzed and compared using mitochondrial (COII and Cyt b) and nuclear (nagt, ITS-rDNA and HSP70) genes. The role of each enzymatic (COII, Cyt b and nagt) or housekeeping (ITS-rDNA, HSP70) gene in the accurate identification of Leishmania parasites was assessed. After DNA extraction and amplification from native, natural and reference strains of Leishmania parasites, polymerase chain reaction (PCR) products were sequenced, and genetic proximity and phylogenetic analyses were performed using MEGA6 and DnaSP5 software. Among the 72 sequences of the five genes, the number of polymorphic sites was significantly lower than the number of monomorphic sites. Of the 72 sequences, 54 new haplotypes (five genes) of Leishmania species were submitted to GenBank (accession numbers: KU680818 - KU680871). Four genes had a remarkable number of informative sites (P=0.00); the exception was HSP70, possibly because of its microsatellite regions. The nagt gene had more non-synonymous (dN) variants than the other expression genes (47.4%). The synonymous (dS)/dN ratio in three expression genes showed a significant variation between the five Leishmania species (P=0.001). The highest and lowest levels of haplotype diversity were observed in L. tropica (81.35%) and L. major (28.38%) populations, respectively. Tajima's D index analyses showed that the Cyt b gene in L. tropica was significantly negative (Tajima's D=-2.2, P<0.01), while the COII and nagt genes were produced through evolutionary processes for both L. tropica and L. major (Tajima's D=2.85 & 2.91, P<0.01). More different clinical lesions with extensive phylogenetic and evolutionary analyses should be employed to avoid confusion in the diagnosis of leishmaniasis and development of vaccines for eradicating Leishmania parasites. Copyright © 2016 Elsevier B.V. All rights reserved.
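
    Two of the summary statistics reported above, haplotype diversity and the number of polymorphic sites, can be illustrated with a short sketch. It assumes Nei's estimator Hd = n/(n-1) * (1 - sum of squared haplotype frequencies) and aligned, equal-length sequences; whether DnaSP was configured exactly this way in the study is not stated. The sequences and function names are hypothetical.

```python
from collections import Counter


def haplotype_diversity(sequences):
    """Nei's haplotype diversity for a sample of aligned sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def polymorphic_sites(sequences):
    """Return the alignment columns at which more than one nucleotide occurs."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


# Hypothetical aligned fragments (illustrative only).
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]
hd = haplotype_diversity(sample)
sites = polymorphic_sites(sample)
```

    With three distinct haplotypes among four sequences, the diversity is high even though only two of the four columns vary, which shows why the two statistics are reported separately.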

  7. GenePRIMP: A software quality control tool

    ScienceCinema

    Amrita Pati

    2017-12-09

    Amrita Pati of the DOE Joint Genome Institute's Genome Biology group describes the software tool GenePRIMP and how it fits into the quality control pipeline for microbial genomics. Further details regarding GenePRIMP appear in a paper published online May 2, 2010 in Nature Methods.

  8. GeMS: an advanced software package for designing synthetic genes.

    PubMed

    Jayaraj, Sebastian; Reid, Ralph; Santi, Daniel V

    2005-01-01

    A user-friendly, advanced software package for gene design is described. The software comprises an integrated suite of programs (also provided as stand-alone tools) that automatically performs the following tasks in gene design: restriction site prediction, codon optimization for any expression host, restriction site inclusion and exclusion, separation of long sequences into synthesizable fragments, Tm and stem-loop determinations, optimal oligonucleotide component design and design verification/error-checking. The output is a complete design report and a list of optimized oligonucleotides to be prepared for subsequent gene synthesis. The user interface accommodates both inexperienced and experienced users. For inexperienced users, explanatory notes are provided such that detailed instructions are not necessary; for experienced users, a streamlined interface is provided without such notes. The software has been extensively tested in the design and successful synthesis of over 400 kb of genes, many of which exceeded 5 kb in length.
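
    As a rough illustration of the melting temperature determination listed among the tasks above, the sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, a common approximation for short oligonucleotides. This is an assumption made for illustration; the abstract does not say which Tm model GeMS uses. The function names and the example oligonucleotide are hypothetical.

```python
def gc_fraction(oligo):
    """Return the fraction of G and C bases in a DNA oligonucleotide."""
    oligo = oligo.upper()
    if not oligo or set(oligo) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G and T")
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def wallace_tm(oligo):
    """Approximate melting temperature in degrees Celsius by the Wallace rule."""
    oligo = oligo.upper()
    if not oligo or set(oligo) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G and T")
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


# Hypothetical 8-mer with two of each base.
tm = wallace_tm("ATGCATGC")
gc = gc_fraction("ATGCATGC")
```

    The rule is only a first approximation; design tools generally rely on more detailed thermodynamic models for longer oligonucleotides.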

  9. Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are noncoding RNAs that direct post-transcriptional regulation of protein coding genes. Recent studies have shown miRNAs are important for controlling many biological processes, including nervous system development, and are highly conserved across species. Given their importance, computational tools are necessary for analysis, interpretation and integration of high-throughput (HTP) miRNA data in an increasing number of model species. The Bioinformatics Resource Manager (BRM) v2.3 is a software environment for data management, mining, integration and functional annotation of HTP biological data. In this study, we report recent updates to BRM for miRNA data analysis and cross-species comparisons across datasets. Results BRM v2.3 has the capability to query predicted miRNA targets from multiple databases, retrieve potential regulatory miRNAs for known genes, integrate experimentally derived miRNA and mRNA datasets, perform ortholog mapping across species, and retrieve annotation and cross-reference identifiers for an expanded number of species. Here we use BRM to show that developmental exposure of zebrafish to 30 uM nicotine from 6–48 hours post fertilization (hpf) results in behavioral hyperactivity in larval zebrafish and alteration of putative miRNA gene targets in whole embryos at developmental stages that encompass early neurogenesis. We show typical workflows for using BRM to integrate experimental zebrafish miRNA and mRNA microarray datasets with example retrievals for zebrafish, including pathway annotation and mapping to human ortholog. Functional analysis of differentially regulated (p<0.05) gene targets in BRM indicates that nicotine exposure disrupts genes involved in neurogenesis, possibly through misregulation of nicotine-sensitive miRNAs. 
    Conclusions BRM provides the ability to mine complex data for identification of candidate miRNAs or pathways that drive phenotypic outcome and, therefore, is a useful hypothesis generation tool for systems biology. The miRNA workflow in BRM allows for efficient processing of multiple miRNA and mRNA datasets in a single software environment with the added capability to interact with public data sources and visual analytic tools for HTP data analysis at a systems level. BRM is developed using Java™ and other open-source technologies for free distribution (http://www.sysbio.org/dataresources/brm.stm). PMID:23174015

  10. SiGN: large-scale gene network estimation environment for high performance computing.

    PubMed

    Tamada, Yoshinori; Shimamura, Teppei; Yamaguchi, Rui; Imoto, Seiya; Nagasaki, Masao; Miyano, Satoru

    2011-01-01

    Our research group is currently developing software for estimating large-scale gene networks from gene expression data. The software, called SiGN, is specifically designed for the Japanese flagship supercomputer "K computer" which is planned to achieve 10 petaflops in 2012, and other high performance computing environments including Human Genome Center (HGC) supercomputer system. SiGN is a collection of gene network estimation software with three different sub-programs: SiGN-BN, SiGN-SSM and SiGN-L1. In these three programs, five different models are available: static and dynamic nonparametric Bayesian networks, state space models, graphical Gaussian models, and vector autoregressive models. All these models require a huge amount of computational resources for estimating large-scale gene networks and therefore are designed to be able to exploit the speed of 10 petaflops. The software will be available freely for "K computer" and HGC supercomputer system users. The estimated networks can be viewed and analyzed by Cell Illustrator Online and SBiP (Systems Biology integrative Pipeline). The software project web site is available at http://sign.hgc.jp/ .

  11. Mining, identification and function analysis of microRNAs and target genes in peanut (Arachis hypogaea L.).

    PubMed

    Zhang, Tingting; Hu, Shuhao; Yan, Caixia; Li, Chunjuan; Zhao, Xiaobo; Wan, Shubo; Shan, Shihua

    2017-02-01

    In the present investigation, a total of 60 conserved peanut (Arachis hypogaea L.) microRNA (miRNA) sequences, belonging to 16 families, were identified using bioinformatics methods. A total of 392 target gene sequences were identified for 58 miRNAs with Target-align software and BLASTx analyses. Gene Ontology (GO) functional analysis suggested that these target genes were involved in mediating peanut growth and development, signal transduction and stress resistance. Fifty-five miRNA sequences were verified using a poly(A) tailing test, with a success rate of 91.67%. Twenty peanut target gene sequences were randomly selected, and the 5' rapid amplification of cDNA ends (5'-RACE) method was used to validate the cleavage sites of these target genes. Of these, 14 (70%) peanut miRNA targets were verified by means of gel electrophoresis, cloning and sequencing. Furthermore, functional analysis and homologous sequence retrieval were conducted for target gene sequences, and 26 target genes were chosen for experimental study of stress resistance. Real-time fluorescence quantitative PCR (qRT-PCR) was applied to measure the expression levels of resistance-associated miRNAs and their target genes in peanut exposed to Aspergillus flavus (A. flavus) infection and drought stress, respectively. As a result, 5 miRNA-target groups were found to accord with the model of miRNAs negatively controlling the expression of their target genes. This study preliminarily determined the biological functions of some resistance-associated miRNAs and their target genes in peanut. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  12. Mutation analysis of the COL1A1 and COL1A2 genes in Vietnamese patients with osteogenesis imperfecta.

    PubMed

    Ho Duy, Binh; Zhytnik, Lidiia; Maasalu, Katre; Kändla, Ivo; Prans, Ele; Reimann, Ene; Märtson, Aare; Kõks, Sulev

    2016-08-12

    The genetics of osteogenesis imperfecta (OI) have not been studied in a Vietnamese population before. We performed mutational analysis of the COL1A1 and COL1A2 genes in 91 unrelated OI patients of Vietnamese origin. We then systematically characterized the mutation profiles of these two genes, which are most commonly related to OI. Genomic DNA was extracted from EDTA-preserved blood according to standard high-salt extraction methods. Sequence analysis and pathogenic variant identification were performed with Mutation Surveyor DNA variant analysis software. Prediction of the pathogenicity of mutations was conducted using Alamut Visual software. The presence of variants was checked against Dalgleish's osteogenesis imperfecta mutation database. The sample consisted of 91 unrelated osteogenesis imperfecta patients. We identified 54 patients with COL1A1/2 pathogenic variants; 33 with COL1A1 and 21 with COL1A2. Two patients had multiple pathogenic variants. Seventeen novel COL1A1 and 10 novel COL1A2 variants were identified. The majority of identified COL1A1/2 pathogenic variants were glycine substitutions (36/56, 64.3 %), usually by serine (23/36, 63.9 %). We found two pathogenic variants of the COL1A1 gene c.2461G > A (p.Gly821Ser) in four unrelated patients and one, c.2005G > A (p.Ala669Thr), in two unrelated patients. Our data showed a lower number of collagen OI pathogenic variants in Vietnamese patients compared to reported rates for Asian populations. The OI mutational profile of the Vietnamese population is unique and related to the presence of a high number of recessive mutations in non-collagenous OI genes. Further analysis of OI patients negative for collagen mutations is required.

  13. QUADrATiC: scalable gene expression connectivity mapping for repurposing FDA-approved therapeutics.

    PubMed

    O'Reilly, Paul G; Wen, Qing; Bankhead, Peter; Dunne, Philip D; McArt, Darragh G; McPherson, Suzanne; Hamilton, Peter W; Mills, Ken I; Zhang, Shu-Dong

    2016-05-04

    Gene expression connectivity mapping has proven to be a powerful and flexible tool for research. Its application has been shown in a broad range of research topics, most commonly as a means of identifying potential small molecule compounds, which may be further investigated as candidates for repurposing to treat diseases. The public release of voluminous data from the Library of Integrated Cellular Signatures (LINCS) programme further enhanced the utilities and potentials of gene expression connectivity mapping in biomedicine. We describe QUADrATiC ( http://go.qub.ac.uk/QUADrATiC ), a user-friendly tool for the exploration of gene expression connectivity on the subset of the LINCS data set corresponding to FDA-approved small molecule compounds. It enables the identification of compounds for repurposing therapeutic potentials. The software is designed to cope with the increased volume of data over existing tools, by taking advantage of multicore computing architectures to provide a scalable solution, which may be installed and operated on a range of computers, from laptops to servers. This scalability is provided by the use of the modern concurrent programming paradigm provided by the Akka framework. The QUADrATiC Graphical User Interface (GUI) has been developed using advanced Javascript frameworks, providing novel visualization capabilities for further analysis of connections. There is also a web services interface, allowing integration with other programs or scripts. QUADrATiC has been shown to provide an improvement over existing connectivity map software, in terms of scope (based on the LINCS data set), applicability (using FDA-approved compounds), usability and speed. It offers potential to biological researchers to analyze transcriptional data and generate potential therapeutics for focussed study in the lab. 
    QUADrATiC represents a step change in the process of investigating gene expression connectivity and provides more biologically-relevant results than previous alternative solutions.

  14. [Hydrophidae identification through analysis on Cyt b gene barcode].

    PubMed

    Liao, Li-xi; Zeng, Ke-wu; Tu, Peng-fei

    2015-08-01

    Hydrophidae, one of the precious traditional Chinese medicines, is generally preserved dry to prevent spoilage, but it is hard to identify the species of Hydrophidae by appearance because of changes caused by the drying process. Identification through gene barcode analysis, a new technique in species identification, can avoid this problem. The gene barcodes of 6 species of Hydrophidae, such as Lapemis hardwickii, were acquired through DNA extraction and gene sequencing. These barcodes were then aligned, and their identification efficiency was tested by BLAST. Our results revealed that the barcode sequences achieved high identification efficiency and showed an obvious difference between intra- and inter-species variation. Together, these results indicate that Cyt b DNA barcoding can confirm the identification of Hydrophidae.

  15. DGCA: A comprehensive R package for Differential Gene Correlation Analysis.

    PubMed

    McKenzie, Andrew T; Katsyv, Igor; Song, Won-Min; Wang, Minghui; Zhang, Bin

    2016-11-15

    Dissecting the regulatory relationships between genes is a critical step towards building accurate predictive models of biological systems. A powerful approach towards this end is to systematically study the differences in correlation between gene pairs in more than one distinct condition. In this study we develop an R package, DGCA (for Differential Gene Correlation Analysis), which offers a suite of tools for computing and analyzing differential correlations between gene pairs across multiple conditions. To minimize parametric assumptions, DGCA computes empirical p-values via permutation testing. To understand differential correlations at a systems level, DGCA performs higher-order analyses such as measuring the average difference in correlation and multiscale clustering analysis of differential correlation networks. Through a simulation study, we show that the straightforward z-score based method that DGCA employs significantly outperforms the existing alternative methods for calculating differential correlation. Application of DGCA to the TCGA RNA-seq data in breast cancer not only identifies key changes in the regulatory relationships between TP53 and PTEN and their target genes in the presence of inactivating mutations, but also reveals an immune-related differential correlation module that is specific to triple negative breast cancer (TNBC). DGCA is an R package for systematically assessing the difference in gene-gene regulatory relationships under different conditions. This user-friendly, effective, and comprehensive software tool will greatly facilitate the application of differential correlation analysis in many biological studies and thus will help identification of novel signaling pathways, biomarkers, and targets in complex biological systems and diseases.
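    The z-score comparison of correlations mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the usual Fisher z-transformation test for the difference between two independent Pearson correlations; the abstract does not spell out DGCA's exact formula, and the function names `pearson`, `diff_corr_z` and `two_sided_p` are hypothetical, not part of the DGCA R package. The permutation-based empirical p-values described above are not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def diff_corr_z(r1, n1, r2, n2):
    """z-score for the difference between two independent correlations.

    Assumption: Fisher z-transformation (atanh) with the standard error
    sqrt(1/(n1-3) + 1/(n2-3)). Requires |r| < 1 and n > 3 per condition.
    """
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical example: a gene pair correlated at r = 0.8 in one
    # condition (30 samples) and r = -0.2 in another (30 samples).
    z = diff_corr_z(0.8, 30, -0.2, 30)
    print(z, two_sided_p(z))
```

    A large positive z here indicates that the gene pair is more strongly positively correlated in the first condition than in the second; the parametric p-value is only a stand-in for the permutation procedure the abstract describes.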

  16. Novel linkage disequilibrium clustering algorithm identifies new lupus genes on meta-analysis of GWAS datasets.

    PubMed

    Saeed, Mohammad

    2017-05-01

    Systemic lupus erythematosus (SLE) is a complex disorder. Genetic association studies of complex disorders suffer from the following three major issues: phenotypic heterogeneity, false positive (type I error), and false negative (type II error) results. Hence, genes with low to moderate effects are missed in standard analyses, especially after statistical corrections. OASIS is a novel linkage disequilibrium clustering algorithm that can potentially address false positives and negatives in genome-wide association studies (GWAS) of complex disorders such as SLE. OASIS was applied to two SLE dbGAP GWAS datasets (6077 subjects; ∼0.75 million single-nucleotide polymorphisms). OASIS identified three known SLE genes viz. IFIH1, TNIP1, and CD44, not previously reported using these GWAS datasets. In addition, 22 novel loci for SLE were identified and the 5 SLE genes previously reported using these datasets were verified. OASIS methodology was validated using single-variant replication and gene-based analysis with GATES. This led to the verification of 60% of OASIS loci. New SLE genes that OASIS identified and were further verified include TNFAIP6, DNAJB3, TTF1, GRIN2B, MON2, LATS2, SNX6, RBFOX1, NCOA3, and CHAF1B. This study presents the OASIS algorithm, software, and the meta-analyses of two publicly available SLE GWAS datasets along with the novel SLE genes. Hence, OASIS is a novel linkage disequilibrium clustering method that can be universally applied to existing GWAS datasets for the identification of new genes.

  17. A multi-center study benchmarks software tools for label-free proteome quantification

    PubMed Central

    Gillet, Ludovic C; Bernhardt, Oliver M.; MacLean, Brendan; Röst, Hannes L.; Tate, Stephen A.; Tsou, Chih-Chiang; Reiter, Lukas; Distler, Ute; Rosenberger, George; Perez-Riverol, Yasset; Nesvizhskii, Alexey I.; Aebersold, Ruedi; Tenzer, Stefan

    2016-01-01

    The consistent and accurate quantification of proteins by mass spectrometry (MS)-based proteomics depends on the performance of instruments, acquisition methods and data analysis software. In collaboration with the software developers, we evaluated OpenSWATH, SWATH2.0, Skyline, Spectronaut and DIA-Umpire, five of the most widely used software methods for processing data from SWATH-MS (sequential window acquisition of all theoretical fragment ion spectra), a method that uses data-independent acquisition (DIA) for label-free protein quantification. We analyzed high-complexity test datasets from hybrid proteome samples of defined quantitative composition acquired on two different MS instruments using different SWATH isolation windows setups. For consistent evaluation we developed LFQbench, an R-package to calculate metrics of precision and accuracy in label-free quantitative MS, and report the identification performance, robustness and specificity of each software tool. Our reference datasets enabled developers to improve their software tools. After optimization, all tools provided highly convergent identification and reliable quantification performance, underscoring their robustness for label-free quantitative proteomics. PMID:27701404

  18. A multicenter study benchmarks software tools for label-free proteome quantification.

    PubMed

    Navarro, Pedro; Kuharev, Jörg; Gillet, Ludovic C; Bernhardt, Oliver M; MacLean, Brendan; Röst, Hannes L; Tate, Stephen A; Tsou, Chih-Chiang; Reiter, Lukas; Distler, Ute; Rosenberger, George; Perez-Riverol, Yasset; Nesvizhskii, Alexey I; Aebersold, Ruedi; Tenzer, Stefan

    2016-11-01

    Consistent and accurate quantification of proteins by mass spectrometry (MS)-based proteomics depends on the performance of instruments, acquisition methods and data analysis software. In collaboration with the software developers, we evaluated OpenSWATH, SWATH 2.0, Skyline, Spectronaut and DIA-Umpire, five of the most widely used software methods for processing data from sequential window acquisition of all theoretical fragment-ion spectra (SWATH)-MS, which uses data-independent acquisition (DIA) for label-free protein quantification. We analyzed high-complexity test data sets from hybrid proteome samples of defined quantitative composition acquired on two different MS instruments using different SWATH isolation-window setups. For consistent evaluation, we developed LFQbench, an R package, to calculate metrics of precision and accuracy in label-free quantitative MS and report the identification performance, robustness and specificity of each software tool. Our reference data sets enabled developers to improve their software tools. After optimization, all tools provided highly convergent identification and reliable quantification performance, underscoring their robustness for label-free quantitative proteomics.

  19. Identification of stably expressed reference genes for RT-qPCR data normalization in defined localizations of cyclic bovine ovaries.

    PubMed

    Schoen, K; Plendl, J; Gabler, C; Kaessmeyer, S

    2015-06-01

    Ovaries are highly complex organs displaying morphological, molecular and functional differences between their cortical zona parenchymatosa and medullary zona vasculosa, and also between the different cyclic luteal stages. The objective of the present study was to validate the expression stability of twelve putative reference genes (RGs) in bovine ovaries, considering the intrinsic heterogeneity of bovine ovarian tissue with regard to different luteal stages and intra-ovarian localizations. The focus was on identifying RGs that are suitable for normalizing RT-qPCR results from ovaries collected from clinically healthy cattle, irrespective of localization and hormonal stage. Expression profiles of twelve potential reference genes (GAPDH, ACTB, YWHAZ, HPRT1, SDHA, UBA52, POLR2C, RPS9, ACTG2, H3F3B, RPS18 and RPL19) were analysed. Evaluation of gene expression differences was performed using geNorm, NormFinder, and BestKeeper software. The most stably expressed genes according to the geNorm, NormFinder and BestKeeper approaches included the candidates H3F3B, RPS9, YWHAZ, RPS18, POLR2C and UBA52. Of this group, the genes YWHAZ, H3F3B and RPS9 could be recommended as the best-suited RGs for normalization purposes in healthy bovine ovaries irrespective of the luteal stage or intra-ovarian localization. © 2014 Blackwell Verlag GmbH.
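    The kind of stability ranking performed by the tools named in this abstract can be illustrated with a minimal Python sketch. It assumes the commonly described definition of the geNorm M value: for each gene, the average, over all other candidate genes, of the standard deviation of the log2 expression ratio across samples. The function name `genorm_m`, the gene labels and the numbers are hypothetical, the input is assumed to be relative quantities rather than raw Cq values, and this is not the geNorm software itself.

```python
import math
import statistics


def genorm_m(quantities):
    """Return a geNorm-style stability value M for each gene.

    quantities: dict mapping gene name -> list of relative expression
    quantities, one per sample, in the same sample order for every gene.
    A lower M means more stable expression relative to the other genes.
    """
    genes = list(quantities)
    stability = {}
    for gene_j in genes:
        pairwise_variation = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # log2 ratio of the two genes in every sample
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(quantities[gene_j], quantities[gene_k])
            ]
            # sample standard deviation of the ratios across samples
            pairwise_variation.append(statistics.stdev(log_ratios))
        stability[gene_j] = sum(pairwise_variation) / len(pairwise_variation)
    return stability


if __name__ == "__main__":
    # Hypothetical data: genes A and B vary proportionally, C does not.
    example = {
        "A": [1.0, 2.0, 4.0],
        "B": [2.0, 4.0, 8.0],
        "C": [1.0, 4.0, 1.0],
    }
    for gene, m in sorted(genorm_m(example).items(), key=lambda kv: kv[1]):
        print(gene, round(m, 3))
```

    In this toy input, A and B keep a constant ratio across samples and therefore receive the lowest M values, while C is ranked as the least stable candidate.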

  20. Reference Gene Validation for RT-qPCR, a Note on Different Available Software Packages

    PubMed Central

    De Spiegelaere, Ward; Dern-Wieloch, Jutta; Weigel, Roswitha; Schumacher, Valérie; Schorle, Hubert; Nettersheim, Daniel; Bergmann, Martin; Brehm, Ralph; Kliesch, Sabine; Vandekerckhove, Linos; Fink, Cornelia

    2015-01-01

    Background An appropriate normalization strategy is crucial for data analysis from real-time reverse transcription polymerase chain reactions (RT-qPCR). Identifying and validating stable reference genes is widely recommended, since no single biological gene is stably expressed between cell types or within cells under different conditions. Different algorithms exist to validate optimal reference genes for normalization. Using human cells, we here compare the three main methods with the online RefFinder tool, which integrates these algorithms, and with R-based software packages that include the NormFinder and geNorm algorithms. Results 14 candidate reference genes were assessed by RT-qPCR in two sample sets, i.e. a set of samples of human testicular tissue containing carcinoma in situ (CIS), and a set of samples from the human adult Sertoli cell line (FS1) either cultured alone or in co-culture with the seminoma-like cell line (TCam-2) or with equine bone marrow-derived mesenchymal stem cells (eBM-MSC). Expression stabilities of the reference genes were evaluated using geNorm, NormFinder, and BestKeeper. Similar results were obtained by the three approaches for the most and least stably expressed genes. The R-based packages NormqPCR, SLqPCR and the NormFinder for R script gave identical gene rankings. Interestingly, different outputs were obtained between the original software packages and the RefFinder tool, which is based on raw Cq values for input. When the raw data were reanalysed assuming 100% efficiency for all genes, the outputs of the original software packages were similar to those of the RefFinder software, indicating that RefFinder outputs may be biased because PCR efficiencies are not taken into account. Conclusions This report shows that assay efficiency is an important parameter for reference gene validation. New software tools that incorporate these algorithms should be carefully validated prior to use. PMID:25825906

  1. Reference gene validation for RT-qPCR, a note on different available software packages.

    PubMed

    De Spiegelaere, Ward; Dern-Wieloch, Jutta; Weigel, Roswitha; Schumacher, Valérie; Schorle, Hubert; Nettersheim, Daniel; Bergmann, Martin; Brehm, Ralph; Kliesch, Sabine; Vandekerckhove, Linos; Fink, Cornelia

    2015-01-01

    An appropriate normalization strategy is crucial for data analysis from real-time reverse transcription polymerase chain reactions (RT-qPCR). Identifying and validating stable reference genes is widely recommended, since no single biological gene is stably expressed between cell types or within cells under different conditions. Different algorithms exist to validate optimal reference genes for normalization. Using human cells, we here compare the three main methods with the online RefFinder tool, which integrates these algorithms, and with R-based software packages that include the NormFinder and geNorm algorithms. 14 candidate reference genes were assessed by RT-qPCR in two sample sets, i.e. a set of samples of human testicular tissue containing carcinoma in situ (CIS), and a set of samples from the human adult Sertoli cell line (FS1) either cultured alone or in co-culture with the seminoma-like cell line (TCam-2) or with equine bone marrow-derived mesenchymal stem cells (eBM-MSC). Expression stabilities of the reference genes were evaluated using geNorm, NormFinder, and BestKeeper. Similar results were obtained by the three approaches for the most and least stably expressed genes. The R-based packages NormqPCR, SLqPCR and the NormFinder for R script gave identical gene rankings. Interestingly, different outputs were obtained between the original software packages and the RefFinder tool, which is based on raw Cq values for input. When the raw data were reanalysed assuming 100% efficiency for all genes, the outputs of the original software packages were similar to those of the RefFinder software, indicating that RefFinder outputs may be biased because PCR efficiencies are not taken into account. This report shows that assay efficiency is an important parameter for reference gene validation. New software tools that incorporate these algorithms should be carefully validated prior to use.
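    The point about PCR efficiency can be made concrete with a minimal Python sketch. It assumes the widely used conversion of Cq values to relative quantities, Q = (1 + E)^(Cq_min - Cq), where E is the amplification efficiency (1.0 for 100%, i.e. perfect doubling per cycle). The function name `relative_quantities` and the Cq values are hypothetical; the abstract does not state which conversion each package applies.

```python
def relative_quantities(cq_values, efficiency=1.0):
    """Convert Cq values of one gene to relative quantities.

    efficiency: amplification efficiency as a fraction, where 1.0 means
    100% (the template doubles every cycle). The sample with the lowest
    Cq (highest expression) is scaled to 1.0.
    """
    base = 1.0 + efficiency
    cq_min = min(cq_values)
    return [base ** (cq_min - cq) for cq in cq_values]


if __name__ == "__main__":
    cq = [20.0, 21.0, 22.0]  # hypothetical Cq values for one gene
    # Assuming 100% efficiency, one cycle corresponds to a 2-fold change.
    print(relative_quantities(cq))
    # With 90% efficiency, the same Cq differences imply smaller fold changes.
    print(relative_quantities(cq, efficiency=0.9))
```

    Because the same Cq spread maps to different quantity ratios depending on E, a tool that works directly on raw Cq values implicitly treats every assay as 100% efficient, which is consistent with the bias the abstract reports.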

  2. ChimeRScope: a novel alignment-free algorithm for fusion transcript prediction using paired-end RNA-Seq data

    PubMed Central

    Li, You; Heavican, Tayla B.; Vellichirammal, Neetha N.; Iqbal, Javeed

    2017-01-01

    The RNA-Seq technology has revolutionized transcriptome characterization not only by accurately quantifying gene expression, but also by the identification of novel transcripts like chimeric fusion transcripts. The ‘fusion’ or ‘chimeric’ transcripts have improved the diagnosis and prognosis of several tumors, and have led to the development of novel therapeutic regimens. The fusion transcript detection is currently accomplished by several software packages, primarily relying on sequence alignment algorithms. The alignment of sequencing reads from fusion transcript loci in cancer genomes can be highly challenging due to the incorrect mapping induced by genomic alterations, thereby limiting the performance of alignment-based fusion transcript detection methods. Here, we developed a novel alignment-free method, ChimeRScope, that accurately predicts fusion transcripts based on the gene fingerprint (as k-mers) profiles of the RNA-Seq paired-end reads. Results on published datasets and in-house cancer cell line datasets followed by experimental validations demonstrate that ChimeRScope consistently outperforms other popular methods irrespective of the read lengths and sequencing depth. More importantly, results on our in-house datasets show that ChimeRScope is a better tool that is capable of identifying novel fusion transcripts with potential oncogenic functions. ChimeRScope is accessible as standalone software at (https://github.com/ChimeRScope/ChimeRScope/wiki) or via the Galaxy web-interface at (https://galaxy.unmc.edu/). PMID:28472320

  3. GEnomes Management Application (GEM.app): a new software tool for large-scale collaborative genome analysis.

    PubMed

    Gonzalez, Michael A; Lebrigio, Rafael F Acosta; Van Booven, Derek; Ulloa, Rick H; Powell, Eric; Speziani, Fiorella; Tekin, Mustafa; Schüle, Rebecca; Züchner, Stephan

    2013-06-01

    Novel genes are now identified at a rapid pace for many Mendelian disorders, and increasingly, for genetically complex phenotypes. However, new challenges have also become evident: (1) effectively managing larger exome and/or genome datasets, especially for smaller labs; (2) direct hands-on analysis and contextual interpretation of variant data in large genomic datasets; and (3) many small and medium-sized clinical and research-based investigative teams around the world are generating data that, if combined and shared, will significantly increase the opportunities for the entire community to identify new genes. To address these challenges, we have developed GEnomes Management Application (GEM.app), a software tool to annotate, manage, visualize, and analyze large genomic datasets (https://genomics.med.miami.edu/). GEM.app currently contains ∼1,600 whole exomes from 50 different phenotypes studied by 40 principal investigators from 15 different countries. The focus of GEM.app is on user-friendly analysis for nonbioinformaticians to make next-generation sequencing data directly accessible. Yet, GEM.app provides powerful and flexible filter options, including single family filtering, across family/phenotype queries, nested filtering, and evaluation of segregation in families. In addition, the system is fast, obtaining results within 4 sec across ∼1,200 exomes. We believe that this system will further enhance identification of genetic causes of human disease. © 2013 Wiley Periodicals, Inc.

  4. Identification of Differentially Expressed miRNAs between White and Black Hair Follicles by RNA-Sequencing in the Goat (Capra hircus)

    PubMed Central

    Wu, Zhenyang; Fu, Yuhua; Cao, Jianhua; Yu, Mei; Tang, Xiaohui; Zhao, Shuhong

    2014-01-01

    MicroRNAs (miRNAs) play a key role in many biological processes by regulating gene expression at the post-transcriptional level. A number of miRNAs have been identified from livestock species. However, compared with other animals, such as pigs and cows, the number of miRNAs identified in goats is quite low, particularly in hair follicles. In this study, to investigate the functional roles of miRNAs in hair follicles of goats with different coat colors, we sequenced miRNAs from two hair follicle samples (white and black) using Solexa sequencing. A total of 35,604,016 reads were obtained, which included 30,878,637 clean reads (86.73%). MiRDeep2 software identified 214 miRNAs. Among them, 205 were conserved among species and nine were novel miRNAs. Furthermore, DESeq software identified six differentially expressed miRNAs. Quantitative PCR confirmed differential expression of two miRNAs, miR-10b and miR-211. KEGG pathways were analyzed using the DAVID website for the predicted target genes of the differentially expressed miRNAs. Several signaling pathways including Notch and MAPK pathways may affect the process of coat color formation. Our study showed that the identified miRNAs might play an essential role in black and white follicle formation in goats. PMID:24879525

  5. Identification of Cronobacter species by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry with an optimized analysis method.

    PubMed

    Wang, Qi; Zhao, Xiao-Juan; Wang, Zi-Wei; Liu, Li; Wei, Yong-Xin; Han, Xiao; Zeng, Jing; Liao, Wan-Jin

    2017-08-01

    Rapid and precise identification of Cronobacter species is important for foodborne pathogen detection; however, commercial biochemical methods can only identify Cronobacter strains to genus level in most cases. To evaluate the power of mass spectrometry based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF MS) for Cronobacter species identification, 51 Cronobacter strains (eight reference and 43 wild strains) were identified by both MALDI-TOF MS and 16S rRNA gene sequencing. Biotyper RTC provided by Bruker identified all eight reference and 43 wild strains as Cronobacter species, which demonstrated the power of MALDI-TOF MS to identify Cronobacter strains to genus level. However, using Bruker's database (6903 main spectra products) and Biotyper software, the MALDI-TOF MS analysis could not identify the investigated strains to species level. When MALDI-TOF MS analysis was performed using the combined in-house Cronobacter database and Bruker's database, bin setting, and unweighted pair group method with arithmetic mean (UPGMA) clustering, all 51 strains were clearly identified into six Cronobacter species and the identification accuracy increased from 60% to 100%. We demonstrated that MALDI-TOF MS was reliable and easy to use for Cronobacter species identification and highlighted the importance of establishing a reliable database and improving the current data analysis methods by integrating the bin setting and UPGMA clustering. Copyright © 2017. Published by Elsevier B.V.

  6. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing.

    PubMed

    Lim, Hansaim; Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; He, Di; Zhuang, Luke; Meng, Patrick; Xie, Lei

    2016-10-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performance of most machine learning based algorithms has mainly been evaluated for predicting off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we present a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and side effect prediction. The software and benchmark are available at https://github.com/hansaimlim/REMAP.

  7. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing

    PubMed Central

    Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; Meng, Patrick; Xie, Lei

    2016-01-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performance of most machine learning based algorithms has mainly been evaluated for predicting off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we present a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and side effect prediction. The software and benchmark are available at https://github.com/hansaimlim/REMAP. PMID:27716836

  8. CosmoQuest Transient Tracker: Opensource Photometry & Astrometry software

    NASA Astrophysics Data System (ADS)

    Myers, Joseph L.; Lehan, Cory; Gay, Pamela; Richardson, Matthew; CosmoQuest Team

    2018-01-01

    CosmoQuest is moving from online citizen science to observational astronomy with the creation of Transient Trackers. This open source software is designed to identify asteroids and other transient/variable objects in image sets. Transient Tracker’s features in final form will include: astrometric and photometric solutions, identification of moving/transient objects, identification of variable objects, and lightcurve analysis. In this poster we present our initial v0.1 release and seek community input. This software builds on the existing NIH funded ImageJ libraries. This suite of open source image manipulation routines, whose creation is led by Wayne Rasband, is released primarily under the MIT license. In this release, we are building on these libraries to add source identification for point / point-like sources, and to do astrometry. Our materials are released under the Apache 2.0 license on GitHub (http://github.com/CosmoQuestTeam) and documentation can be found at http://cosmoquest.org/TransientTracker.

  9. PROGRAM FOR THE IDENTIFICATION AND REPLACEMENT OF ENDOCRINE DISRUPTING CHEMICALS

    EPA Science Inventory

    A computer software program is being developed to aid in the identification and replacement of endocrine disrupting chemicals (EDC). This program will be comprised of two distinct areas of research: identification of potential EDC and suggestions for replacing those potential EDC. ...

  10. A wing expressed sequence tag resource for Bicyclus anynana butterflies, an evo-devo model

    PubMed Central

    Beldade, Patrícia; Rudd, Stephen; Gruber, Jonathan D; Long, Anthony D

    2006-01-01

    Background: Butterfly wing color patterns are a key model for integrating evolutionary developmental biology and the study of adaptive morphological evolution. Yet, despite the biological, economical and educational value of butterflies, they are still relatively under-represented in terms of available genomic resources. Here, we describe an Expressed Sequence Tag (EST) project for Bicyclus anynana that has identified the largest available collection to date of expressed genes for any butterfly. Results: By targeting cDNAs from developing wings at the stages when pattern is specified, we biased gene discovery towards genes potentially involved in pattern formation. Assembly of 9,903 ESTs from a subtracted library allowed us to identify 4,251 genes, of which 2,461 were annotated based on BLAST analyses against relevant gene collections. Gene prediction software identified 2,202 peptides, of which 215 longer than 100 amino acids had no homology to any known proteins and, thus, potentially represent novel or highly diverged butterfly genes. We combined gene and Single Nucleotide Polymorphism (SNP) identification by constructing cDNA libraries from pools of outbred individuals, and by sequencing clones from the 3' end to maximize alignment depth. Alignments of multi-member contigs allowed us to identify over 14,000 putative SNPs, with 316 genes having at least one high confidence double-hit SNP. We furthermore identified 320 microsatellites in transcribed genes that can potentially be used as genetic markers. Conclusion: Our project was designed to combine gene and sequence polymorphism discovery and has generated the largest gene collection available for any butterfly and many potential markers in expressed genes. These resources will be invaluable for exploring the potential of B. anynana in particular, and butterflies in general, as models in ecological, evolutionary, and developmental genetics. PMID:16737530

  11. Identification and Validation of Reference Genes for RT-qPCR Analysis in Non-Heading Chinese Cabbage Flowers

    PubMed Central

    Wang, Cheng; Cui, Hong-Mi; Huang, Tian-Hong; Liu, Tong-Kun; Hou, Xi-Lin; Li, Ying

    2016-01-01

    Non-heading Chinese cabbage (Brassica rapa ssp. chinensis Makino) is an important vegetable member of Brassica rapa crops. It exhibits a typical sporophytic self-incompatibility (SI) system and is an ideal model plant to explore the mechanism of SI. Gene expression studies are frequently used to unravel the complex genetic mechanism, and in such studies appropriate reference selection is vital. Validation of reference genes has been conducted neither in Brassica rapa flowers nor for the SI trait. In this study, 13 candidate reference genes were selected and examined systematically in 96 non-heading Chinese cabbage flower samples that represent four strategic groups in compatible and self-incompatible lines of non-heading Chinese cabbage. Two RT-qPCR analysis software packages, geNorm and NormFinder, were used to evaluate the expression stability of these genes systematically. Results revealed that the best-ranked reference genes should be selected according to specific sample subsets. DNAJ, UKN1, and PP2A were identified as the most stable reference genes among all samples. Moreover, our research further revealed that the widely used reference genes, CYP and ACP, were the least suitable reference genes in most non-heading Chinese cabbage flower sample sets. To further validate the suitability of the reference genes identified in this study, the expression levels of the SRK and Exo70A1 genes, which play important roles in regulating the interaction between pollen and stigma, were studied. Our study presents the first systematic study of reference gene selection for SI studies and provides guidelines to obtain more accurate RT-qPCR results in non-heading Chinese cabbage. PMID:27375663
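
    As an illustration of how a geNorm-style ranking can be computed, the sketch below follows the commonly described definition of the stability measure M: for each gene, the average over all other genes of the standard deviation of their log2 expression ratios across samples. It is a minimal sketch, not the geNorm program itself; the function name, gene names, and toy quantities are hypothetical, and the input is assumed to be relative quantities on a linear scale.

```python
import math
import statistics


def genorm_m_values(expression):
    """Return a geNorm-style stability measure M for each gene.

    expression: dict mapping gene name -> list of relative expression
    quantities (linear scale), one value per sample, same sample order.
    A lower M means a more stable expression ratio against the other genes.
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            # Log2 ratio of the two genes in every sample.
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[gene] = statistics.mean(pairwise_sd)
    return m_values


# Hypothetical quantities for three genes in four samples.
toy = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],
    "geneC": [1.0, 4.0, 2.0, 16.0],
}
m = genorm_m_values(toy)
ranking = sorted(m, key=m.get)  # most stable gene first
```

    In the toy data, geneA and geneB keep a constant ratio and therefore share the lowest M, while geneC ranks last. The full geNorm procedure additionally removes the least stable gene and recomputes M iteratively, which this sketch omits.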

  12. Eigensystem realization algorithm user's guide for VAX/VMS computers: Version 931216

    NASA Technical Reports Server (NTRS)

    Pappa, Richard S.

    1994-01-01

    The eigensystem realization algorithm (ERA) is a multiple-input, multiple-output, time domain technique for structural modal identification and minimum-order system realization. Modal identification is the process of calculating structural eigenvalues and eigenvectors (natural vibration frequencies, damping, mode shapes, and modal masses) from experimental data. System realization is the process of constructing state-space dynamic models for modern control design. This user's guide documents VAX/VMS-based FORTRAN software developed by the author since 1984 in conjunction with many applications. It consists of a main ERA program and 66 pre- and post-processors. The software provides complete modal identification capabilities and most system realization capabilities.

  13. RNA sequencing to study gene expression and SNP variations associated with growth in zebrafish fed a plant protein-based diet.

    PubMed

    Ulloa, Pilar E; Rincón, Gonzalo; Islas-Trejo, Alma; Araneda, Cristian; Iturra, Patricia; Neira, Roberto; Medrano, Juan F

    2015-06-01

    The objectives of this study were to measure gene expression in zebrafish and then identify SNP to be used as potential markers in a growth association study. We developed an approach where muscle samples collected from low- and high-growth fish were analyzed using RNA-Sequencing (RNA-seq), and SNP were chosen from the genes that were differentially expressed between the low and high groups. A population of 24 families was fed a plant protein-based diet from the larval to adult stages. From a total of 440 males, 5 % of the fish from both tails of the weight gain distribution were selected. Total RNA was extracted from individual muscle of 8 low-growth and 8 high-growth fish. Two pooled RNA-Seq libraries were prepared for each phenotype using 4 fish per library. Libraries were sequenced using the Illumina GAII Sequencer and analyzed using the CLCBio genomic workbench software. One hundred and twenty-four genes were differentially expressed between phenotypes (p value < 0.05 and FDR < 0.2). From these genes, 164 SNP were selected and genotyped in 240 fish samples. Marker-trait analysis revealed 5 SNP associated with growth in key genes (Nars, Lmod2b, Cuzd1, Acta1b, and Plac8l1). These genes are good candidates for further growth studies in fish and to consider for identification of potential SNPs associated with different growth rates in response to a plant protein-based diet.
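
    The abstract filters genes on a false discovery rate (FDR) threshold but does not say how the FDR was estimated. One widely used approach is the Benjamini-Hochberg adjustment, sketched below purely as an assumption about what such a filter could look like; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.5]
fdr = benjamini_hochberg(raw)
# Genes passing both thresholds quoted in the abstract.
selected = [i for i in range(len(raw)) if raw[i] < 0.05 and fdr[i] < 0.2]
```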

  14. Selection of housekeeping genes and demonstration of RNAi in cotton leafhopper, Amrasca biguttula biguttula (Ishida)

    PubMed Central

    Gupta, Mridula; Pandher, Suneet; Kaur, Gurmeet; Rathore, Pankaj; Palli, Subba Reddy

    2018-01-01

    Amrasca biguttula biguttula (Ishida), commonly known as cotton leafhopper, is a severe pest of cotton and okra. Little is known about this insect at the molecular level due to a lack of genomic and transcriptomic data. To prepare for functional genomic studies in this insect, we evaluated 15 common housekeeping genes (Tub, B-Tub, EF alpha, GADPH, UbiCF, RP13, Ubiq, G3PD, VATPase, Actin, 18s, 28s, TATA, ETF, SOD and Cytolytic actin) during different developmental stages and under starvation stress. We selected early (1st and 2nd) and late (3rd and 4th) stage nymphs and adults for identification of stable housekeeping genes using geNorm, NormFinder, BestKeeper and RefFinder software. Based on the different algorithms, RP13 and VATPase were identified as the most suitable reference genes for quantification of gene expression by reverse transcriptase quantitative PCR (RT-qPCR). Based on RefFinder, which integrates the results of the three algorithms, RP13 in adults, Tubulin (Tub) in late nymphs, 28S in early nymphs and UbiCF under starvation stress were identified as the most stable genes. We also developed methods for feeding double-stranded RNA (dsRNA) incorporated in the diet. Feeding dsRNA targeting Snf7, IAP, AQP1, and VATPase caused 56.17–77.12% knockdown of targeted genes and 16 to 48% mortality of treated insects when compared to control. PMID:29329327

  15. Visualization and Analysis of MiRNA-Targets Interactions Networks.

    PubMed

    León, Luis E; Calligaris, Sebastián D

    2017-01-01

    MicroRNAs are a class of small, noncoding RNA molecules of 21-25 nucleotides in length that regulate the gene expression by base-pairing with the target mRNAs, mainly leading to down-regulation or repression of the target genes. MicroRNAs are involved in diverse regulatory pathways in normal and pathological conditions. In this context, it is highly important to identify the targets of specific microRNA in order to understand the mechanism of its regulation and consequently its involvement in disease. However, the microRNA target identification is experimentally laborious and time-consuming. The in silico prediction of microRNA targets is an extremely useful approach because you can identify potential mRNA targets, reduce the number of possibilities and then, validate a few microRNA-mRNA interactions in an in vitro experimental model. In this chapter, we describe, in a simple way, bioinformatics guidelines to use miRWalk database and Cytoscape software for analyzing microRNA-mRNA interactions through their visualization as a network.

  16. A tracking system for laboratory mice to support medical researchers in behavioral analysis.

    PubMed

    Macrì, S; Mainetti, L; Patrono, L; Pieretti, S; Secco, A; Sergi, I

    2015-08-01

    The behavioral analysis of laboratory mice plays a key role in several medical and scientific research areas, such as biology, toxicology, pharmacology, and so on. Important information on mice behavior and their reaction to a particular stimulus is deduced from a careful analysis of their movements. Moreover, behavioral analysis of genetically modified mice allows obtaining important information about particular genes, phenotypes or drug effects. The techniques commonly adopted to support such analysis have many limitations, which make the related systems particularly ineffective. Currently, the engineering community is working to explore innovative identification and sensing technologies to develop new tracking systems able to guarantee benefits to animals' behavior analysis. This work presents a tracking solution based on passive Radio Frequency Identification Technology (RFID) in Ultra High Frequency (UHF) band. Much emphasis is given to the software component of the system, based on a Web-oriented solution, able to process the raw tracking data coming from a hardware system, and offer 2D and 3D tracking information as well as reports and dashboards about mice behavior. The system has been widely tested using laboratory mice and compared with an automated video-tracking software (i.e., EthoVision). The obtained results have demonstrated the effectiveness and reliability of the proposed solution, which is able to correctly detect the events occurring in the animals' cage, and to offer a complete and user-friendly tool to support researchers in behavioral analysis of laboratory mice.

  17. Evaluation and Selection of Candidate Reference Genes for Normalization of Quantitative RT-PCR in Withania somnifera (L.) Dunal

    PubMed Central

    Singh, Varinder; Kaul, Sunil C.; Wadhwa, Renu; Pati, Pratap Kumar

    2015-01-01

    Quantitative real-time PCR (qRT-PCR) is now globally used for accurate analysis of transcript levels in plants. For reliable quantification of transcripts, identification of the best reference genes is a prerequisite in qRT-PCR analysis. Recently, Withania somnifera has attracted a lot of attention due to its immense therapeutic potential. At present, biotechnological intervention for the improvement of this plant is being seriously pursued. Against this background, it is important to have comprehensive studies on finding suitable reference genes for this high-value medicinal plant. In the present study, 11 candidate genes were evaluated for their expression stability under biotic (fungal disease) and abiotic (wounding, salt, drought, heat and cold) stresses, in different plant tissues and in response to various plant growth regulators (methyl jasmonate, salicylic acid, abscisic acid). The data, as analyzed by various software packages (geNorm, NormFinder, BestKeeper and the ΔCt method), suggested that cyclophilin (CYP) is the most stable gene under wounding, heat, methyl jasmonate, different tissues and all stress conditions. T-SAND was found to be the best reference gene for salt and salicylic acid (SA) treated samples, while 26S ribosomal RNA (26S), ubiquitin (UBQ) and beta-tubulin (TUB) were the most stably expressed genes under drought, biotic and cold treatment, respectively. For abscisic acid (ABA) treated samples, 18S-rRNA was found to be the most stably expressed gene. Finally, the relative expression level of the three genes involved in the withanolide biosynthetic pathway was detected to validate the selection of reliable reference genes. The present work will significantly contribute to gene analysis studies in W. somnifera and help improve the quality of gene expression data in this plant as well as in other related plant species. PMID:25769035

  18. LIQUID: an-open source software for identifying lipids in LC-MS/MS-based lipidomics data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kyle, Jennifer E.; Crowell, Kevin L.; Casey, Cameron P.

    2017-01-31

    We introduce an open-source software, LIQUID, for semi-automated processing and visualization of LC-MS/MS based lipidomics data. LIQUID provides users with the capability to process high throughput data and contains a customizable target library and scoring model per project needs. The graphical user interface provides visualization of multiple lines of spectral evidence for each lipid identification, allowing rapid examination of data for making confident identifications of lipid molecular species.

  19. Short tandem repeat analysis in Japanese population.

    PubMed

    Hashiyada, M

    2000-01-01

    Short tandem repeats (STRs), known as microsatellites, are one of the most informative genetic markers for characterizing biological materials. Because of the relatively small size of STR alleles (generally 100-350 nucleotides), amplification by polymerase chain reaction (PCR) is relatively easy, affording a high sensitivity of detection. In addition, STR loci can be amplified simultaneously in a multiplex PCR. Thus, substantial information can be obtained in a single analysis with the benefits of using less template DNA, reducing labor, and reducing contamination. We investigated 14 STR loci in a Japanese population living in Sendai using three multiplex PCR kits: GenePrint PowerPlex 1.1 and 2.2 Fluorescent STR Systems (Promega, Madison, WI, USA) and AmpFlSTR Profiler (Perkin-Elmer, Norwalk, CT, USA). Genomic DNA was extracted using sodium dodecyl sulfate (SDS) proteinase K or Chelex 100 treatment followed by phenol/chloroform extraction. PCR was performed according to the manufacturers' protocols. Electrophoresis was carried out on an ABI 377 sequencer and the alleles were determined by GeneScan 2.0.2 software (Perkin-Elmer). For the 14 STR loci, statistical parameters indicated a relatively high rate, and no significant deviation from Hardy-Weinberg equilibrium was detected. We apply this STR system to paternity testing and forensic casework, e.g., personal identification in rape cases. This system is an effective tool in the forensic sciences to obtain information on individual identification.
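
    The abstract reports no significant deviation from Hardy-Weinberg equilibrium without naming the test used. As a hedged illustration only, the sketch below computes a Pearson chi-square statistic that compares observed genotype counts at one locus with the counts expected from the allele frequencies. The function name and the genotype data are hypothetical, and for loci with many rare alleles an exact test is usually preferred over this approximation.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotypes):
    """Chi-square statistic for Hardy-Weinberg proportions at one locus.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    """
    n = len(genotypes)
    # Genotypes are unordered, so (9, 10) and (10, 9) count together.
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    freq = {a: c / (2 * n) for a, c in allele_counts.items()}
    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freq), 2):
        # Expected count: n*p^2 for homozygotes, n*2pq for heterozygotes.
        expected = n * (freq[a] ** 2 if a == b else 2 * freq[a] * freq[b])
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    return chi2


# Hypothetical locus with two alleles named by repeat number.
in_equilibrium = [(9, 9)] * 25 + [(9, 10)] * 50 + [(10, 10)] * 25
all_heterozygous = [(9, 10)] * 100
chi2_eq = hwe_chi_square(in_equilibrium)
chi2_het = hwe_chi_square(all_heterozygous)
```

    The statistic is compared against a chi-square distribution whose degrees of freedom equal the number of genotype classes minus the number of alleles; that lookup is left out to keep the sketch dependency-free.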

  20. Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology.

    PubMed

    Cock, Peter J A; Grüning, Björn A; Paszkiewicz, Konrad; Pritchard, Leighton

    2013-01-01

    The Galaxy Project offers the popular web browser-based platform Galaxy for running bioinformatics tools and constructing simple workflows. Here, we present a broad collection of additional Galaxy tools for large scale analysis of gene and protein sequences. The motivating research theme is the identification of specific genes of interest in a range of non-model organisms, and our central example is the identification and prediction of "effector" proteins produced by plant pathogens in order to manipulate their host plant. This functional annotation of a pathogen's predicted capacity for virulence is a key step in translating sequence data into potential applications in plant pathology. This collection includes novel tools, and widely-used third-party tools such as NCBI BLAST+ wrapped for use within Galaxy. Individual bioinformatics software tools are typically available separately as standalone packages, or in online browser-based form. The Galaxy framework enables the user to combine these and other tools to automate organism scale analyses as workflows, without demanding familiarity with command line tools and scripting. Workflows created using Galaxy can be saved and are reusable, so may be distributed within and between research groups, facilitating the construction of a set of standardised, reusable bioinformatic protocols. The Galaxy tools and workflows described in this manuscript are open source and freely available from the Galaxy Tool Shed (http://usegalaxy.org/toolshed or http://toolshed.g2.bx.psu.edu).

  1. oPOSSUM: identification of over-represented transcription factor binding sites in co-expressed genes

    PubMed Central

    Ho Sui, Shannan J.; Mortimer, James R.; Arenillas, David J.; Brumm, Jochen; Walsh, Christopher J.; Kennedy, Brian P.; Wasserman, Wyeth W.

    2005-01-01

    Targeted transcript profiling studies can identify sets of co-expressed genes; however, identification of the underlying functional mechanism(s) is a significant challenge. Established methods for the analysis of gene annotations, particularly those based on the Gene Ontology, can identify functional linkages between genes. Similar methods for the identification of over-represented transcription factor binding sites (TFBSs) have been successful in yeast, but extension to human genomics has largely proved ineffective. Creation of a system for the efficient identification of common regulatory mechanisms in a subset of co-expressed human genes promises to break a roadblock in functional genomics research. We have developed an integrated system that searches for evidence of co-regulation by one or more transcription factors (TFs). oPOSSUM combines a pre-computed database of conserved TFBSs in human and mouse promoters with statistical methods for identification of sites over-represented in a set of co-expressed genes. The algorithm successfully identified mediating TFs in control sets of tissue-specific genes and in sets of co-expressed genes from three transcript profiling studies. Simulation studies indicate that oPOSSUM produces few false positives using empirically defined thresholds and can tolerate up to 50% noise in a set of co-expressed genes. PMID:15933209
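
    The abstract refers to "statistical methods for identification of sites over-represented in a set of co-expressed genes" without detailing them here. One standard way to score over-representation, shown below as an assumption rather than as oPOSSUM's actual procedure, is a one-sided hypergeometric (Fisher-type) tail probability. The function name, parameter names, and counts are hypothetical.

```python
from math import comb


def overrepresentation_p(hits_in_set, set_size, hits_in_background, background_size):
    """One-sided hypergeometric tail probability P(X >= hits_in_set).

    background_size: number of genes in the background (including the set).
    hits_in_background: background genes carrying the binding site.
    set_size: number of co-expressed genes drawn from the background.
    hits_in_set: co-expressed genes carrying the binding site.
    """
    total = comb(background_size, set_size)
    upper = min(set_size, hits_in_background)
    tail = sum(
        comb(hits_in_background, k)
        * comb(background_size - hits_in_background, set_size - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / total


# Hypothetical example: 5 of 10 co-expressed genes carry a site that
# occurs in 50 of 1000 background genes.
p_enriched = overrepresentation_p(5, 10, 50, 1000)
p_baseline = overrepresentation_p(1, 10, 50, 1000)
```

    A small value of `p_enriched` suggests the site occurs in the gene set more often than random sampling from the background would explain. Correction for testing many binding-site profiles is not shown.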

  2. Development of Automated Image Analysis Software for Suspended Marine Particle Classification

    DTIC Science & Technology

    2003-09-30

    Development of Automated Image Analysis Software for Suspended Marine Particle Classification Scott Samson Center for Ocean Technology...REPORT TYPE 3. DATES COVERED 00-00-2003 to 00-00-2003 4. TITLE AND SUBTITLE Development of Automated Image Analysis Software for Suspended...objective is to develop automated image analysis software to reduce the effort and time required for manual identification of plankton images. Automated

  3. Toward the identification of causal genes in complex diseases: a gene-centric joint test of significance combining genomic and transcriptomic data.

    PubMed

    Charlesworth, Jac C; Peralta, Juan M; Drigalenko, Eugene; Göring, Harald Hh; Almasy, Laura; Dyer, Thomas D; Blangero, John

    2009-12-15

    Gene identification using linkage, association, or genome-wide expression is often underpowered. We propose that formal combination of information from multiple gene-identification approaches may lead to the identification of novel loci that are missed when only one form of information is available. Firstly, we analyze the Genetic Analysis Workshop 16 Framingham Heart Study Problem 2 genome-wide association data for HDL-cholesterol using a "gene-centric" approach. Then we formally combine the association test results with genome-wide transcriptional profiling data for high-density lipoprotein cholesterol (HDL-C), from the San Antonio Family Heart Study, using a Z-transform test (Stouffer's method). We identified 39 genes by the joint test at a conservative 1% false-discovery rate, including 9 from the significant gene-based association test and 23 whose expression was significantly correlated with HDL-C. Seven genes identified as significant in the joint test were not independently identified by either the association or expression tests. This combined approach has increased power and leads to the direct nomination of novel candidate genes likely to be involved in the determination of HDL-C levels. Such information can then be used as justification for a more exhaustive search for functional sequence variation within the nominated genes. We anticipate that this type of analysis will improve our speed of identification of regulatory genes causally involved in disease risk.
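
    Stouffer's Z-transform method, named in the abstract, converts each p-value to a standard normal deviate, sums the deviates (optionally weighted), and rescales the sum so that it is again standard normal under the null hypothesis. The sketch below is a generic version of that method; the function name, the equal default weights, and the example p-values are assumptions, not details taken from the study.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def stouffer_combined_p(p_values, weights=None):
    """Combine one-sided p-values with Stouffer's Z-transform method.

    Each p-value must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the endpoints.
    """
    if weights is None:
        weights = [1.0] * len(p_values)
    # Small p-values map to large positive Z-scores.
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    return 1.0 - _STD_NORMAL.cdf(combined_z)


# Hypothetical gene with modest evidence from two sources
# (association test and expression correlation).
joint_p = stouffer_combined_p([0.05, 0.05])
```

    Two p-values of 0.05 combine to roughly 0.01, which illustrates how a gene can reach significance in the joint test without being significant in either test alone.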

  4. Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data.

    PubMed

    Paisitkriangkrai, Sakrapee; Quek, Kelly; Nievergall, Eva; Jabbour, Anissa; Zannettino, Andrew; Kok, Chung Hoow

    2018-06-07

    Recurrent oncogenic fusion genes play a critical role in the development of various cancers and diseases and provide, in some cases, excellent therapeutic targets. To date, analysis tools that can identify and compare recurrent fusion genes across multiple samples have not been available to researchers. To address this deficiency, we developed Co-occurrence Fusion (Co-fuse), a new and easy to use software tool that enables biologists to merge RNA-seq information, allowing them to identify recurrent fusion genes, without the need for exhaustive data processing. Notably, Co-fuse is based on pattern mining and statistical analysis which enables the identification of hidden patterns of recurrent fusion genes. In this report, we show that Co-fuse can be used to identify 2 distinct groups within a set of 49 leukemic cell lines based on their recurrent fusion genes: a multiple myeloma (MM) samples-enriched cluster and an acute myeloid leukemia (AML) samples-enriched cluster. Our experimental results further demonstrate that Co-fuse can identify known driver fusion genes (e.g., IGH-MYC, IGH-WHSC1) in MM, when compared to AML samples, indicating the potential of Co-fuse to aid the discovery of yet unknown driver fusion genes through cohort comparisons. Additionally, using a 272 primary glioma sample RNA-seq dataset, Co-fuse was able to validate recurrent fusion genes, further demonstrating the power of this analysis tool to identify recurrent fusion genes. Taken together, Co-fuse is a powerful new analysis tool that can be readily applied to large RNA-seq datasets, and may lead to the discovery of new disease subgroups and potentially new driver genes, for which, targeted therapies could be developed. The Co-fuse R source code is publicly available at https://github.com/sakrapee/co-fuse .
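
    The core idea of recurrence, a fusion gene seen in more than one sample, can be sketched as a simple tally. This is not the Co-fuse algorithm, which the abstract describes as pattern mining plus statistical analysis; it only illustrates the counting step such a tool would start from. The function name, sample identifiers, the "GENEA-GENEB" fusion, and the `min_samples` threshold are hypothetical.

```python
from collections import defaultdict


def recurrent_fusions(calls, min_samples=2):
    """Count in how many distinct samples each fusion gene was called.

    calls: iterable of (sample_id, fusion_name) pairs; duplicate calls
    within one sample are counted once.
    Returns a dict of fusions seen in at least min_samples samples,
    ordered by decreasing sample count and then by name.
    """
    samples_by_fusion = defaultdict(set)
    for sample_id, fusion in calls:
        samples_by_fusion[fusion].add(sample_id)
    recurrent = {
        fusion: len(samples)
        for fusion, samples in samples_by_fusion.items()
        if len(samples) >= min_samples
    }
    return dict(sorted(recurrent.items(), key=lambda item: (-item[1], item[0])))


# Hypothetical fusion calls from three samples.
calls = [
    ("s1", "IGH-MYC"),
    ("s2", "IGH-MYC"),
    ("s2", "IGH-MYC"),  # duplicate call in the same sample
    ("s3", "IGH-WHSC1"),
    ("s1", "IGH-WHSC1"),
    ("s3", "GENEA-GENEB"),
]
result = recurrent_fusions(calls)
```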

  5. Network-based co-expression analysis for exploring the potential diagnostic biomarkers of metastatic melanoma.

    PubMed

    Wang, Li-Xin; Li, Yang; Chen, Guan-Zhi

    2018-01-01

    Metastatic melanoma is an aggressive skin cancer and is one of the global malignancies with high mortality and morbidity. It is essential to identify and verify diagnostic biomarkers of early metastatic melanoma. Previous studies have systematically assessed protein biomarkers and mRNA-based expression characteristics. However, molecular markers for the early diagnosis of metastatic melanoma have not been identified. To explore potential regulatory targets, we have analyzed the gene microarray expression profiles of malignant melanoma samples by co-expression analysis based on the network approach. The differentially expressed genes (DEGs) were screened by the EdgeR package of R software. A weighted gene co-expression network analysis (WGCNA) was used for the identification of DEGs in the special gene modules and hub genes. Subsequently, a protein-protein interaction network was constructed to extract hub genes associated with gene modules. Finally, twenty-four important hub genes (RASGRP2, IKZF1, CXCR5, LTB, BLK, LINGO3, CCR6, P2RY10, RHOH, JUP, KRT14, PLA2G3, SPRR1A, KRT78, SFN, CLDN4, IL1RN, PKP3, CBLC, KRT16, TMEM79, KLK8, LYPD3 and LYPD5) were treated as valuable factors involved in the immune response and tumor cell development in tumorigenesis. In addition, a transcriptional regulatory network was constructed for these specific modules or hub genes, and a few core transcriptional regulators were found to be mostly associated with our hub genes, including GATA1, STAT1, SP1, and PSG1. In summary, our findings enhance our understanding of the biological process of malignant melanoma metastasis, enabling us to identify specific genes to use for diagnostic and prognostic markers and possibly for targeted therapy.

  6. Identification of potential therapeutic target genes, key miRNAs and mechanisms in oral lichen planus by bioinformatics analysis.

    PubMed

    Gong, Cuihua; Sun, Shangtong; Liu, Bing; Wang, Jing; Chen, Xiaodong

    2017-06-01

    The study aimed to identify the potential target genes and key miRNAs as well as to explore the underlying mechanisms in the pathogenesis of oral lichen planus (OLP) by bioinformatics analysis. The microarray data of GSE38617 were downloaded from the Gene Expression Omnibus (GEO) database. A total of 7 OLP and 7 normal samples were used to identify the differentially expressed genes (DEGs) and miRNAs. Functional enrichment analyses were then performed on the DEGs. Furthermore, a DEG-miRNA network and a miRNA-function network were constructed with Cytoscape software. In total, 1758 DEGs (598 up- and 1160 down-regulated genes) and 40 miRNAs (17 up- and 23 down-regulated miRNAs) were selected. The up-regulated genes were related to the nuclear factor-kappa B (NF-κB) signaling pathway, while down-regulated genes were mainly enriched in the function of the ribosome. Tumor necrosis factor (TNF), caspase recruitment domain family, member 11 (CARD11) and mitochondrial ribosomal protein (MRP) genes were identified in these functions. In addition, miR-302 was a hub node in the DEG-miRNA network and regulated cyclin D1 (CCND1). MiR-548a-2 was the key miRNA in the miRNA-function network, regulating multiple functions including ribosomal function. The NF-κB signaling pathway and ribosome function may be the pathogenic mechanisms of OLP. Genes such as TNF, CARD11, MRP genes and CCND1 may be potential therapeutic target genes in OLP. MiR-548a-2 and miR-302 may play important roles in OLP development. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Design and Pedagogical Issues in the Development of the InSight Series of Instructional Software.

    ERIC Educational Resources Information Center

    Baro, John A.; Lehmkulke, Stephen

    1993-01-01

    Design issues in development of InSight software for optometric education include choice of hardware, identification of audience, definition of scope and limitations of content, selection of user interface and programming environment, obtaining user feedback, and software distribution. Pedagogical issues include practicality and improvement on…

  8. An Analysis of Open Source Security Software Products Downloads

    ERIC Educational Resources Information Center

    Barta, Brian J.

    2014-01-01

    Despite the continued demand for open source security software, a gap in the identification of success factors related to the success of open source security software persists. There are no studies that accurately assess the extent of this persistent gap, particularly with respect to the strength of the relationships of open source software…

  9. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.

    PubMed

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H

    2015-07-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. The regulatory software of cellular metabolism.

    PubMed

    Segrè, Daniel

    2004-06-01

    Understanding the regulation of metabolic pathways in the cell is like unraveling the 'software' that is running on the 'hardware' of the metabolic network. Transcriptional regulation of enzymes is an important component of this software. A recent systematic analysis of metabolic gene-expression data in Saccharomyces cerevisiae reveals a complex modular organization of co-expressed genes, which could increase our ability to understand and engineer cellular metabolic functions.

  11. Software forecasting as it is really done: A study of JPL software engineers

    NASA Technical Reports Server (NTRS)

    Griesel, Martha Ann; Hihn, Jairus M.; Bruno, Kristin J.; Fouser, Thomas J.; Tausworthe, Robert C.

    1993-01-01

    This paper presents a summary of the results to date of a Jet Propulsion Laboratory internally funded research task to study the costing process and parameters used by internally recognized software cost estimating experts. Protocol Analysis and Markov process modeling were used to capture software engineers' forecasting mental models. While there is significant variation between the mental models that were studied, it was nevertheless possible to identify a core set of cost forecasting activities, and it was also found that the mental models cluster around three forecasting techniques. Further partitioning of the mental models revealed a clustering of activities that is very suggestive of a forecasting lifecycle. The different forecasting methods identified were based on the use of multiple-decomposition steps or multiple forecasting steps. The multiple forecasting steps involved either forecasting software size or an additional effort forecast. Virtually no subject used risk reduction steps in combination. The results of the analysis include: the identification of a core set of well defined costing activities, a proposed software forecasting life cycle, and the identification of several basic software forecasting mental models. The paper concludes with a discussion of the implications of the results for current individual and institutional practices.

  12. DynGO: a tool for visualizing and mining of Gene Ontology and its associations

    PubMed Central

    Liu, Hongfang; Hu, Zhang-Zhi; Wu, Cathy H

    2005-01-01

    Background A large volume of data and information about genes and gene products has been stored in various molecular biology databases. A major challenge for knowledge discovery using these databases is to identify related genes and gene products in disparate databases. The development of Gene Ontology (GO) as a common vocabulary for annotation allows integrated queries across multiple databases and identification of semantically related genes and gene products (i.e., genes and gene products that have similar GO annotations). Meanwhile, dozens of tools have been developed for browsing, mining or editing GO terms, their hierarchical relationships, or their "associated" genes and gene products (i.e., genes and gene products annotated with GO terms). Tools that allow users to directly search and inspect relations among all GO terms and their associated genes and gene products from multiple databases are needed. Results We present a standalone package called DynGO, which provides several advanced functionalities in addition to the standard browsing capability of the official GO browsing tool (AmiGO). DynGO allows users to conduct batch retrieval of GO annotations for a list of genes and gene products, and semantic retrieval of genes and gene products sharing similar GO annotations. The results are shown in an association tree organized according to GO hierarchies and supported with many dynamic display options such as sorting tree nodes or changing orientation of the tree. For GO curators and frequent GO users, DynGO provides fast and convenient access to GO annotation data. DynGO is generally applicable to any data set where the records are annotated with GO terms, as illustrated by two examples. Conclusion We have presented a standalone package DynGO that provides functionalities to search and browse GO and its association databases as well as several additional functions such as batch retrieval and semantic retrieval. The complete documentation and software are freely available for download from the website. PMID:16091147
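
    The idea of retrieving genes that share similar GO annotations can be illustrated with a small sketch. The abstract does not say how DynGO scores similarity, so the Jaccard overlap of annotation sets used here is an assumption, and the gene names, GO term sets and the 0.5 threshold are hypothetical placeholders.

```python
def jaccard(terms_a, terms_b):
    """Jaccard overlap of two GO term sets (an assumed similarity measure)."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_genes(query, annotations, threshold=0.5):
    """Return (gene, score) pairs whose annotation overlap with `query` meets the threshold."""
    query_terms = annotations[query]
    scored = [
        (gene, jaccard(query_terms, terms))
        for gene, terms in annotations.items()
        if gene != query
    ]
    hits = [(gene, score) for gene, score in scored if score >= threshold]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical annotation table: gene name -> set of GO term identifiers.
annotations = {
    "geneA": {"GO:0006096", "GO:0005829", "GO:0005524"},
    "geneB": {"GO:0006096", "GO:0005829"},
    "geneC": {"GO:0006355"},
}
result = similar_genes("geneA", annotations)
```

    A flat set overlap ignores the GO hierarchy; a tool that organizes results by GO hierarchies would presumably also consider parent and child terms.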

  13. IEEE Computer Society/Software Engineering Institute Watts S. Humphrey Software Process Achievement Award 2016: Raytheon Integrated Defense Systems Design for Six Sigma Team

    DTIC Science & Technology

    2017-04-01

    notice for non-US Government use and distribution. External use: This material may be reproduced in its entirety, without modification, and freely...Combinatorial Design Methods 4 2.1 Identification of Significant Improvement Opportunity 4 2.2 Methodology Development 4 2.3 Piloting...11 3 Process Performance Modeling and Analysis 13 3.1 Identification of Significant Improvement Opportunity 13 3.2 Methodology Development 13 3.3

  14. Gene identification in the congenital disorders of glycosylation type I by whole-exome sequencing.

    PubMed

    Timal, Sharita; Hoischen, Alexander; Lehle, Ludwig; Adamowicz, Maciej; Huijben, Karin; Sykut-Cegielska, Jolanta; Paprocka, Justyna; Jamroz, Ewa; van Spronsen, Francjan J; Körner, Christian; Gilissen, Christian; Rodenburg, Richard J; Eidhof, Ilse; Van den Heuvel, Lambert; Thiel, Christian; Wevers, Ron A; Morava, Eva; Veltman, Joris; Lefeber, Dirk J

    2012-10-01

    Congenital disorders of glycosylation type I (CDG-I) form a growing group of recessive neurometabolic diseases. Identification of disease genes is compromised by the enormous heterogeneity in clinical symptoms and the large number of potential genes involved. Until now, gene identification included the sequential application of biochemical methods in blood samples and fibroblasts. In genetically unsolved cases, homozygosity mapping has been applied in consanguineous families. Altogether, this time-consuming diagnostic strategy led to the identification of defects in 17 different CDG-I genes. Here, we applied whole-exome sequencing (WES) in combination with the knowledge of the protein N-glycosylation pathway for gene identification in our remaining group of six unsolved CDG-I patients from unrelated non-consanguineous families. Exome variants were prioritized based on a list of 76 potential CDG-I candidate genes, leading to the rapid identification of one known and two novel CDG-I gene defects. These included the first X-linked CDG-I due to a de novo mutation in ALG13, and compound heterozygous mutations in DPAGT1, together the first two steps in dolichol-PP-glycan assembly, and mutations in PGM1 in two cases, involved in nucleotide sugar biosynthesis. The pathogenicity of the mutations was confirmed by showing the deficient activity of the corresponding enzymes in patient fibroblasts. Combined with these results, the gene defect has been identified in 98% of our CDG-I patients. Our results implicate the potential of WES to unravel disease genes in the CDG-I in newly diagnosed singleton families.
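
    The prioritization step described above (restricting exome variants to a candidate gene list) can be sketched as follows. This is a minimal illustration under stated assumptions: the variant records, the `zygosity` labels and the second-stage recessive-compatibility rule are hypothetical, the toy variants are made-up placeholders rather than patient data, and the study's actual filtering pipeline is not described in the abstract.

```python
from collections import defaultdict


def prioritize(variants, candidate_genes):
    """Keep variants in candidate genes whose genotype pattern fits a recessive or hemizygous model."""
    by_gene = defaultdict(list)
    for variant in variants:
        # Stage 1: restrict to the candidate gene list.
        if variant["gene"] in candidate_genes:
            by_gene[variant["gene"]].append(variant)

    hits = {}
    for gene, gene_variants in by_gene.items():
        # Stage 2 (assumed rule): one homozygous or hemizygous variant,
        # or at least two heterozygous variants (possible compound heterozygosity).
        single_hit = any(v["zygosity"] in ("hom", "hemi") for v in gene_variants)
        het_count = sum(v["zygosity"] == "het" for v in gene_variants)
        if single_hit or het_count >= 2:
            hits[gene] = gene_variants
    return hits


# Hypothetical candidate list (the study used 76 genes) and toy variant calls.
candidates = {"DPAGT1", "PGM1", "ALG13"}
toy_variants = [
    {"gene": "DPAGT1", "id": "var1", "zygosity": "het"},
    {"gene": "DPAGT1", "id": "var2", "zygosity": "het"},
    {"gene": "PGM1", "id": "var3", "zygosity": "het"},
    {"gene": "GENE_X", "id": "var4", "zygosity": "hom"},
]
hits = prioritize(toy_variants, candidates)
```

    Two heterozygous calls in one gene only suggest compound heterozygosity; confirming that they lie on different alleles needs phasing or parental data, which this sketch does not model.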

  15. Transcriptomic analysis of neuregulin-1 regulated genes following ischemic stroke by computational identification of promoter binding sites: A role for the ETS-1 transcription factor.

    PubMed

    Surles-Zeigler, Monique C; Li, Yonggang; Distel, Timothy J; Omotayo, Hakeem; Ge, Shaokui; Ford, Byron D

    2018-01-01

    Ischemic stroke is a major cause of mortality in the United States. We previously showed that neuregulin-1 (NRG1) was neuroprotective in rat models of ischemic stroke. We used gene expression profiling to understand the early cellular and molecular mechanisms of NRG1's effects after the induction of ischemia. Ischemic stroke was induced by middle cerebral artery occlusion (MCAO). Rats were allocated to 3 groups: (1) control, (2) MCAO and (3) MCAO + NRG1. Cortical brain tissues were collected three hours following MCAO and NRG1 treatment and subjected to microarray analysis. Data and statistical analyses were performed using R/Bioconductor platform alongside Genesis, Ingenuity Pathway Analysis and Enrichr software packages. There were 2693 genes differentially regulated following ischemia and NRG1 treatment. These genes were organized by expression patterns into clusters using a K-means clustering algorithm. We further analyzed genes in clusters where ischemia altered gene expression, which was reversed by NRG1 (clusters 4 and 10). NRG1, IRS1, OPA3, and POU6F1 were central linking (node) genes in cluster 4. Conserved Transcription Factor Binding Site Finder (CONFAC) identified ETS-1 as a potential transcriptional regulator of NRG1 suppressed genes following ischemia. A transcription factor activity array showed that ETS-1 activity was increased 2-fold, 3 hours following ischemia and this activity was attenuated by NRG1. These findings reveal key early transcriptional mechanisms associated with neuroprotection by NRG1 in the ischemic penumbra.

  16. Characterization of reference genes for qPCR analysis in various tissues of the Fujian oyster Crassostrea angulata

    NASA Astrophysics Data System (ADS)

    Pu, Fei; Yang, Bingye; Ke, Caihuan

    2015-07-01

    Accurate quantification of transcripts using quantitative real-time polymerase chain reaction (qPCR) depends on the identification of reliable reference genes for normalization. This study aimed to identify and validate seven reference genes, including actin-2 (ACT-2), elongation factor 1 alpha (EF-1α), elongation factor 1 beta (EF-1β), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), ubiquitin (UBQ), β-tubulin (β-TUB), and 18S ribosomal RNA, from Crassostrea angulata, a valuable marine bivalve cultured worldwide. Transcript levels of the candidate reference genes were examined using qPCR analysis and showed differential expression patterns in the mantle, gill, adductor muscle, labial palp, visceral mass, hemolymph and gonad tissues. Quantitative data were analyzed using the geNorm software to assess the expression stability of the candidate reference genes, revealing that β-TUB and UBQ were the most stable genes. The commonly used GAPDH and 18S rRNA showed low stability, making them unsuitable candidates in this system. The expression pattern of the G protein β-subunit gene (Gβ) across tissue types was also examined and normalized to the expression of each or both of UBQ and β-TUB as internal controls. This revealed consistent trends with all three normalization approaches, thus validating the reliability of UBQ and β-TUB as optimal internal controls. The study provides the first validated reference genes for accurate data normalization in transcript profiling in Crassostrea angulata, which will be indispensable for further functional genomics studies in this economically valuable marine bivalve.
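
    The kind of expression-stability ranking that geNorm reports can be illustrated with a short sketch. It follows the stability measure M as it is commonly described for geNorm (the average standard deviation of the log2 expression ratios between one gene and every other candidate, with lower M meaning more stable); it is not the geNorm program itself, and the gene names and quantities are hypothetical toy values, not data from the study.

```python
import math
import statistics


def stability_m_values(expr):
    """expr maps gene -> list of relative quantities (linear scale), one per sample.

    Returns gene -> M, the mean pairwise variation against all other genes.
    """
    genes = list(expr)
    m_values = {}
    for gene_j in genes:
        pairwise = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # Log2 ratio of the two genes in every sample.
            log_ratios = [
                math.log2(a / b) for a, b in zip(expr[gene_j], expr[gene_k])
            ]
            # Pairwise variation: sample standard deviation of those ratios.
            pairwise.append(statistics.stdev(log_ratios))
        m_values[gene_j] = statistics.mean(pairwise)
    return m_values


# Hypothetical toy data: geneA and geneB keep a constant ratio, geneC fluctuates.
toy = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],
    "geneC": [1.0, 8.0, 1.0, 8.0],
}
m_values = stability_m_values(toy)
ranking = sorted(m_values, key=m_values.get)  # most stable first
```

    In this toy set geneA and geneB tie with the lowest M and geneC ranks last. The full geNorm procedure also removes the least stable gene stepwise and recomputes M, which the sketch omits.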

  17. Formularity: Software for Automated Formula Assignment of Natural and Other Organic Matter from Ultrahigh-Resolution Mass Spectra

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tolić, Nikola; Liu, Yina; Liyu, Andrey

    Ultrahigh-resolution mass spectrometry, such as Fourier transform ion-cyclotron resonance mass spectrometry (FT-ICR MS), can resolve thousands of molecular ions in complex organic matrices. A Compound Identification Algorithm (CIA) was previously developed for automated elemental formula assignment for natural organic matter (NOM). In this work we describe a user friendly interface for CIA, titled Formularity, which includes an additional functionality to perform search of formulas based on an Isotopic Pattern Algorithm (IPA). While CIA assigns elemental formulas for compounds containing C, H, O, N, S, and P, IPA is capable of assigning formulas for compounds containing other elements. We used halogenated organic compounds (HOC), a chemical class that is ubiquitous in nature as well as anthropogenic systems, as an example to demonstrate the capability of Formularity with IPA. A HOC standard mix was used to evaluate the identification confidence of IPA. The HOC spike in NOM and tap water were used to assess HOC identification in natural and anthropogenic matrices. Strategies for reconciliation of CIA and IPA assignments are discussed. Software and sample databases with documentation are freely available from the PNNL OMICS software repository https://omics.pnl.gov/software/formularity.

  18. EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries

    PubMed Central

    Smith, Robin P; Buchser, William J; Lemmon, Marcus B; Pardinas, Jose R; Bixby, John L; Lemmon, Vance P

    2008-01-01

    Background Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds. Results We have developed "EST Express", a suite of analytical tools that identify and annotate ESTs originating from specific mRNA populations. The software consists of a user-friendly GUI powered by PHP and MySQL that allows for online collaboration between researchers and continuity with UniGene, Entrez Gene and RefSeq. Two key features of the software include a novel, simplified Entrez Gene parser and tools to manage cDNA library sequencing projects. We have tested the software on a large data set (2,016 samples) produced by subtractive hybridization. Conclusion EST Express is an open-source, cross-platform web server application that imports sequences from cDNA libraries, such as those generated through subtractive hybridization or yeast two-hybrid screens. It then provides several layers of annotation based on Entrez Gene and RefSeq to allow the user to highlight useful genes and manage cDNA library projects. PMID:18402700

  19. EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries.

    PubMed

    Smith, Robin P; Buchser, William J; Lemmon, Marcus B; Pardinas, Jose R; Bixby, John L; Lemmon, Vance P

    2008-04-10

    Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds. We have developed "EST Express", a suite of analytical tools that identify and annotate ESTs originating from specific mRNA populations. The software consists of a user-friendly GUI powered by PHP and MySQL that allows for online collaboration between researchers and continuity with UniGene, Entrez Gene and RefSeq. Two key features of the software include a novel, simplified Entrez Gene parser and tools to manage cDNA library sequencing projects. We have tested the software on a large data set (2,016 samples) produced by subtractive hybridization. EST Express is an open-source, cross-platform web server application that imports sequences from cDNA libraries, such as those generated through subtractive hybridization or yeast two-hybrid screens. It then provides several layers of annotation based on Entrez Gene and RefSeq to allow the user to highlight useful genes and manage cDNA library projects.

  20. Identification of triacylglycerol using automated annotation of high resolution multistage mass spectral trees.

    PubMed

    Wang, Xiupin; Peng, Qingzhi; Li, Peiwu; Zhang, Qi; Ding, Xiaoxia; Zhang, Wen; Zhang, Liangxiao

    2016-10-12

    High complexity of identification for non-target triacylglycerols (TAGs) is a major challenge in lipidomics analysis. To identify non-target TAGs, a powerful tool named accurate MS(n) spectrometry generating so-called ion trees is used. In this paper, we presented a technique for efficient structural elucidation of TAGs on MS(n) spectral trees produced by LTQ Orbitrap MS(n), which was implemented as an open source software package, or TIT. The TIT software was used to support automatic annotation of non-target TAGs on MS(n) ion trees from a self-built fragment ion database. This database includes 19108 simulated TAG molecules from a random combination of fatty acids and corresponding 500582 self-built multistage fragment ions (MS ≤ 3). Our software can identify TAGs using a "stage-by-stage elimination" strategy. By utilizing the MS(1) accurate mass and referenced RKMD, the TIT software can discriminate unique elemental composition candidates. The regiospecific isomers of fatty acyl chains will be distinguished using MS(2) and MS(3) fragment spectra. We applied the algorithm to the selection of 45 TAG standards and demonstrated that the molecular ions could be 100% correctly assigned. Therefore, the TIT software could be applied to TAG identification in complex biological samples such as mouse plasma extracts. Copyright © 2016 Elsevier B.V. All rights reserved.

  1. Identification of key target genes and pathways in laryngeal carcinoma

    PubMed Central

    Liu, Feng; Du, Jintao; Liu, Jun; Wen, Bei

    2016-01-01

    The purpose of the present study was to screen the key genes associated with laryngeal carcinoma and to investigate the molecular mechanism of laryngeal carcinoma progression. The gene expression profile of GSE10935 [Gene Expression Omnibus (GEO) accession number], including 12 specimens from laryngeal papillomas and 12 specimens from normal laryngeal epithelia controls, was downloaded from the GEO database. Differentially expressed genes (DEGs) were screened in laryngeal papillomas compared with normal controls using Limma package in R language, followed by Gene Ontology (GO) enrichment analysis and pathway enrichment analysis. Furthermore, the protein-protein interaction (PPI) network of DEGs was constructed using Cytoscape software and modules were analyzed using MCODE plugin from the PPI network. Furthermore, significant biological pathway regions (sub-pathway) were identified by using iSubpathwayMiner analysis. A total of 67 DEGs were identified, including 27 up-regulated genes and 40 down-regulated genes and they were involved in different GO terms and pathways. PPI network analysis revealed that Ras association (RalGDS/AF-6) domain family member 1 (RASSF1) was a hub protein. The sub-pathway analysis identified 9 significantly enriched sub-pathways, including glycolysis/gluconeogenesis and nitrogen metabolism. Genes such as phosphoglycerate kinase 1 (PGK1), carbonic anhydrase II (CA2), and carbonic anhydrase XII (CA12) whose node degrees were >10 were identified in the disease risk sub-pathway. Genes in the sub-pathway, such as RASSF1, PGK1, CA2 and CA12 were presumed to serve critical roles in laryngeal carcinoma. The present study identified DEGs and their sub-pathways in the disease, which may serve as potential targets for treatment of laryngeal carcinoma. PMID:27446427
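
    Identifying hub proteins by node degree, as in the PPI analysis above, can be sketched in a few lines. The study used Cytoscape and the MCODE plugin; the standard-library version below only illustrates the degree count and the "degree > 10" cut-off, and the edge list is a hypothetical toy network rather than the study's data.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def hub_nodes(edges, min_degree=10):
    """Return nodes whose degree is strictly greater than min_degree."""
    degrees = node_degrees(edges)
    return sorted(node for node, degree in degrees.items() if degree > min_degree)


# Hypothetical toy network: one protein with 11 partners plus one extra edge.
toy_edges = [("HUB", f"P{i}") for i in range(11)] + [("P0", "P1")]
degrees = node_degrees(toy_edges)
hubs = hub_nodes(toy_edges, min_degree=10)
```

    Using sets of neighbours means a duplicated edge in the input does not inflate a node's degree.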

  2. The use of laser microdissection in the identification of suitable reference genes for normalization of quantitative real-time PCR in human FFPE epithelial ovarian tissue samples.

    PubMed

    Cai, Jing; Li, Tao; Huang, Bangxing; Cheng, Henghui; Ding, Hui; Dong, Weihong; Xiao, Man; Liu, Ling; Wang, Zehua

    2014-01-01

    Quantitative real-time PCR (qPCR) is a powerful and reproducible method of gene expression analysis in which expression levels are quantified by normalization against reference genes. Therefore, to investigate the potential biomarkers and therapeutic targets for epithelial ovarian cancer by qPCR, it is critical to identify stable reference genes. In this study, twelve housekeeping genes (ACTB, GAPDH, 18S rRNA, GUSB, PPIA, PBGD, PUM1, TBP, HPRT1, RPLP0, RPL13A, and B2M) were analyzed in 50 ovarian samples from normal, benign, borderline, and malignant tissues. For reliable results, laser microdissection (LMD), an effective technique used to prepare homogeneous starting material, was utilized to precisely excise target tissues or cells. One-way analysis of variance (ANOVA) and nonparametric (Kruskal-Wallis) tests were used to compare the expression differences. NormFinder and geNorm software were employed to further validate the suitability and stability of the candidate genes. Results showed that epithelial cells indeed occupied only a small percentage of the normal ovary. The expression of ACTB, PPIA, RPL13A, RPLP0, and TBP was stable independent of the disease progression. In addition, NormFinder and geNorm identified the most stable combination (ACTB, PPIA, RPLP0, and TBP) and the relatively unstable reference gene GAPDH from the twelve commonly used housekeeping genes. Our results highlight the use of homogeneous ovarian tissues and multiple-reference normalization strategy, e.g. the combination of ACTB, PPIA, RPLP0, and TBP, for qPCR in epithelial ovarian tissues, whereas GAPDH, the most commonly used reference gene, is not recommended, especially as a single reference gene.

  3. Gene expression, signal transduction pathways and functional networks associated with growth of sporadic vestibular schwannomas.

    PubMed

    Sass, Hjalte C R; Borup, Rehannah; Alanin, Mikkel; Nielsen, Finn Cilius; Cayé-Thomasen, Per

    2017-01-01

    The objective of this study was to determine global gene expression in relation to Vestibular schwannomas (VS) growth rate and to identify signal transduction pathways and functional molecular networks associated with growth. Repeated magnetic resonance imaging (MRI) prior to surgery determined tumor growth rate. Following tissue sampling during surgery, mRNA was extracted from 16 sporadic VS. Double stranded cDNA was synthesized from the mRNA and used as template for in vitro transcription reaction to synthesize biotin-labeled antisense cRNA, which was hybridized to Affymetrix HG-U133A arrays and analyzed by dChip software. Differential gene expression was defined as a 1.5-fold difference between fast and slow growing tumors (above vs. below 0.5 ccm/year), employing a p-value <0.01. Deregulated transcripts were matched against established gene ontology. Ingenuity Pathway Analysis was used for identification of signal transduction pathways and functional molecular networks associated with tumor growth. In total 109 genes were deregulated in relation to tumor growth rate. Genes associated with apoptosis, growth and cell proliferation were deregulated. Gene ontology included regulation of the cell cycle, cell differentiation and proliferation, among other functions. Fourteen pathways were associated with tumor growth. Five functional molecular networks were generated. This first study on global gene expression in relation to vestibular schwannoma growth rate identified several genes, signal transduction pathways and functional networks associated with tumor progression. Specific genes involved in apoptosis, cell growth and proliferation were deregulated in fast growing tumors. Fourteen pathways were associated with tumor growth. Generated functional networks underlined the importance of the PI3K family, among others.
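
    The differential-expression criterion stated above (at least a 1.5-fold difference between the two tumor groups with p < 0.01) can be written as a simple filter. This is a sketch, not the dChip procedure: it assumes group means on a linear intensity scale and p-values that have already been computed elsewhere, and the probe names and numbers are hypothetical.

```python
def is_deregulated(mean_fast, mean_slow, p_value, fold=1.5, alpha=0.01):
    """True if the fast/slow ratio changes by at least `fold` in either direction and p < alpha."""
    ratio = mean_fast / mean_slow
    changed = ratio >= fold or ratio <= 1.0 / fold
    return changed and p_value < alpha


# Hypothetical probes: (mean in fast-growing tumors, mean in slow-growing tumors, p-value).
probes = {
    "probe_1": (300.0, 100.0, 0.001),   # 3-fold up, significant
    "probe_2": (120.0, 100.0, 0.0005),  # significant but only 1.2-fold
    "probe_3": (50.0, 100.0, 0.005),    # 2-fold down, significant
    "probe_4": (400.0, 100.0, 0.2),     # 4-fold up, not significant
}
deregulated = sorted(
    name for name, (fast, slow, p) in probes.items() if is_deregulated(fast, slow, p)
)
```

    Requiring both conditions keeps small but statistically significant shifts (probe_2) and large but noisy ones (probe_4) out of the list.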

  4. Identification of Suitable Reference Genes for mRNA Studies in Bone Marrow in a Mouse Model of Hematopoietic Stem Cell Transplantation.

    PubMed

    Li, H; Chen, C; Yao, H; Li, X; Yang, N; Qiao, J; Xu, K; Zeng, L

    2016-10-01

    Bone marrow micro-environment changes during hematopoietic stem cell transplantation (HSCT) with subsequent alteration of gene expression. Quantitative polymerase chain reaction (q-PCR) is a reliable and reproducible technique for the analysis of gene expression. To obtain more accurate results, it is essential to find a suitable reference gene during HSCT. However, which gene is suitable during HSCT remains unclear. This study aimed to identify suitable reference genes for mRNA studies in bone marrow after HSCT. C57BL/6 mice were treated with either total body irradiation (group T) or busulfan/cyclophosphamide (BU/CY) (group B) followed by infusion of bone marrow cells. Normal mice without treatment served as controls. All samples (group T + group B + control) were defined as group G. On days 7, 14, and 21 after transplantation, transcription levels of 7 candidate genes, ACTB, B2M, GAPDH, HMBS, HPRT, SDHA, and YWHAZ, in bone marrow cells were measured by use of real-time quantitative PCR. The expression stability of these 7 candidate reference genes was analyzed by 2 statistical software programs, GeNorm and NormFinder. Our results showed that ACTB displayed the highest expression in group G, with lowest expression of SDHA in group T and HPRT in groups B and G. Analysis of expression stability by use of GeNorm or NormFinder demonstrated that expression of B2M in bone marrow was much more stable during HSCT, compared with other candidate genes including commonly used reference genes GAPDH and ACTB. B2M could be used as a suitable reference gene for mRNA studies in bone marrow after HSCT. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Software automation tools for increased throughput metabolic soft-spot identification in early drug discovery.

    PubMed

    Zelesky, Veronica; Schneider, Richard; Janiszewski, John; Zamora, Ismael; Ferguson, James; Troutman, Matthew

    2013-05-01

    The ability to supplement high-throughput metabolic clearance data with structural information defining the site of metabolism should allow design teams to streamline their synthetic decisions. However, broad application of metabolite identification in early drug discovery has been limited, largely due to the time required for data review and structural assignment. The advent of mass defect filtering and its application toward metabolite scouting paved the way for the development of software automation tools capable of rapidly identifying drug-related material in complex biological matrices. Two semi-automated commercial software applications, MetabolitePilot™ and Mass-MetaSite™, were evaluated to assess the relative speed and accuracy of structural assignments using data generated on a high-resolution MS platform. Review of these applications has demonstrated their utility in providing accurate results in a time-efficient manner, leading to acceleration of metabolite identification initiatives while highlighting the continued need for biotransformation expertise in the interpretation of more complex metabolic reactions.
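
    Mass defect filtering, mentioned above as the basis for automated metabolite scouting, can be illustrated with a small sketch. It is not how MetabolitePilot or Mass-MetaSite work internally: it assumes the simplified definition of mass defect as the measured m/z minus its nearest integer, keeps ions whose mass defect lies within a window of the parent drug's, and uses a hypothetical 50 mDa window and made-up m/z values.

```python
def mass_defect(mz):
    """Measured m/z minus its nearest integer (a simplified nominal mass)."""
    return mz - round(mz)


def mass_defect_filter(parent_mz, ion_mzs, window=0.050):
    """Keep ions whose mass defect is within `window` Da of the parent's mass defect."""
    target = mass_defect(parent_mz)
    return [mz for mz in ion_mzs if abs(mass_defect(mz) - target) <= window]


# Hypothetical parent ion and detected ions (m/z values are illustrative only).
parent = 300.1500
ions = [
    316.1449,  # parent + 15.9949, consistent with an added oxygen
    250.9000,  # unrelated matrix ion with a very different mass defect
    476.1821,  # parent + 176.0321, consistent with glucuronidation
]
kept = mass_defect_filter(parent, ions)
```

    The filter works because common biotransformations shift the mass defect only slightly, while most endogenous matrix ions fall outside the window; the window width is a tuning choice, not a fixed rule.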

  6. A new scoring function for top-down spectral deconvolution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kou, Qiang; Wu, Si; Liu, Xiaowen

    2014-12-18

    Background: Top-down mass spectrometry plays an important role in intact protein identification and characterization. Top-down mass spectra are more complex than bottom-up mass spectra because they often contain many isotopomer envelopes from highly charged ions, which may overlap with one another. As a result, spectral deconvolution, which converts a complex top-down mass spectrum into a monoisotopic mass list, is a key step in top-down spectral interpretation. Results: In this paper, we propose a new scoring function, L-score, for evaluating isotopomer envelopes. By combining L-score with MS-Deconv, a new software tool, MS-Deconv+, was developed for top-down spectral deconvolution. Experimental results showed that MS-Deconv+ outperformed existing software tools in top-down spectral deconvolution. Conclusions: L-score shows high discriminative ability in identification of isotopomer envelopes. Using L-score, MS-Deconv+ reports many correct monoisotopic masses missed by other software tools, which are valuable for proteoform identification and characterization.

  7. Pim-1: A Molecular Target to Modulate Cellular Resistance to Therapy in Prostate Cancer

    DTIC Science & Technology

    2005-10-01

    Reiter RE, Lilly MB: Gene expression profiling in R-flurbiprofen-treated prostate cancer: Identification of prostate stem cell antigen as a...flurbiprofen-regulated gene. (submitted, 2006). 51. Holder SL, Zemskova M, Bremner R, Neidigh J, Lilly MB: Identification of specific, cell-permeable...profiling in R-flurbiprofen-treated prostate cancer: Identification of prostate stem cell antigen as a flurbiprofen-regulated gene. (poster

  8. Software Reviews.

    ERIC Educational Resources Information Center

    Science and Children, 1988

    1988-01-01

    Reviews five software packages for use with school age children. Includes "Science Toolkit Module 2: Earthquake Lab"; "Adaptations and Identification"; "Geoworld"; "Body Systems II Series: The Blood System: A Liquid of Life," all for Apple II, and "Science Courseware: Life Science/Biology" for…

  9. Identification of Patient Safety Risks Associated with Electronic Health Records: A Software Quality Perspective.

    PubMed

    Virginio, Luiz A; Ricarte, Ivan Luiz Marques

    2015-01-01

    Although Electronic Health Records (EHR) can offer benefits to the health care process, there is a growing body of evidence that these systems can also incur risks to patient safety when developed or used improperly. This work is a literature review to identify these risks from a software quality perspective. Therefore, the risks were classified based on the ISO/IEC 25010 software quality model. The risks identified were related mainly to the characteristics of "functional suitability" (i.e., software bugs) and "usability" (i.e., interface prone to user error). This work elucidates the fact that EHR quality problems can adversely affect patient safety, resulting in errors such as incorrect patient identification, incorrect calculation of medication dosages, and lack of access to patient data. Therefore, the risks presented here provide the basis for developers and EHR regulating bodies to pay attention to the quality aspects of these systems that can result in patient harm.

  10. Identification and Evaluation of Reliable Reference Genes in the Medicinal Fungus Shiraia bambusicola.

    PubMed

    Song, Liang; Li, Tong; Fan, Li; Shen, Xiao-Ye; Hou, Cheng-Lin

    2016-04-01

    The stability of reference genes plays a vital role in real-time quantitative reverse transcription polymerase chain reaction (qRT-PCR) analysis, which is generally regarded as a convenient and sensitive tool for the analysis of gene expression. A well-known medicinal fungus, Shiraia bambusicola, has great potential in the pharmaceutical, agricultural and food industries, but its suitable reference genes have not yet been determined. In the present study, 11 candidate reference genes in S. bambusicola were first evaluated and validated comprehensively. To identify the suitable reference genes for qRT-PCR analysis, three software-based algorithms, geNorm, NormFinder and BestKeeper, were applied to rank the tested genes. RNA samples were collected from seven fermentation stages using different media (potato dextrose or Czapek medium) and under different light conditions (12-h light/12-h dark and all-dark). The three most appropriate reference genes, ubi, tfc and ags, were able to normalize the qRT-PCR results under the culturing conditions of 12-h light/12-h dark, whereas the other three genes, vac, gke and acyl, performed better in the culturing conditions of all-dark growth. Therefore, under different light conditions, at least two reference genes (ubi and vac) could be employed to assure the reliability of qRT-PCR results. For both the natural culture medium (the most appropriate genes of this group: ubi, tfc and ags) and the chemically defined synthetic medium (the most stable genes of this group: tfc, vac and ef), the tfc gene remained the best gene used for normalizing the gene expression found with qRT-PCR. It is anticipated that these results would improve the selection of suitable reference genes for qRT-PCR assays and lay the foundation for an accurate analysis of gene expression in S. bambusicola.

  11. CrossTalk: The Journal of Defense Software Engineering. Volume 18, Number 11

    DTIC Science & Technology

    2005-11-01

    languages. Our discipline of software engineering has really experienced phenomenal growth right before our eyes. A sign that software design has...approach on a high level of abstraction. The main emphasis is on the identification and allocation of a needed functionality (e.g., a target tracker), rather...messaging software that is the backbone of teenage culture. As increasing security constraints will increase the cost of developing and maintaining any

  12. ICESat (GLAS) Science Processing Software Document Series. Volume 2; Science Data Management Plan; 4.0

    NASA Technical Reports Server (NTRS)

    Jester, Peggy L.; Hancock, David W., III

    1999-01-01

    This document provides the Data Management Plan for the GLAS Standard Data Software (SDS) supporting the GLAS instrument of the EOS ICESat Spacecraft. The SDS encompasses the ICESat Science Investigator-led Processing System (I-SIPS) Software and the Instrument Support Facility (ISF) Software. This Plan addresses the identification, authority, and description of the interface nodes associated with the GLAS Standard Data Products and the GLAS Ancillary Data.

  13. MONGKIE: an integrated tool for network analysis and visualization for multi-omics data.

    PubMed

    Jang, Yeongjun; Yu, Namhee; Seo, Jihae; Kim, Sun; Lee, Sanghyuk

    2016-03-18

    Network-based integrative analysis is a powerful technique for extracting biological insights from multilayered omics data such as somatic mutations, copy number variations, and gene expression data. However, integrated analysis of multi-omics data is quite complicated and can hardly be done in an automated way. Thus, a powerful interactive visual mining tool supporting diverse analysis algorithms for identification of driver genes and regulatory modules is much needed. Here, we present a software platform that integrates network visualization with omics data analysis tools seamlessly. The visualization unit supports various options for displaying multi-omics data as well as unique network models for describing sophisticated biological networks such as complex biomolecular reactions. In addition, we implemented diverse in-house algorithms for network analysis including network clustering and over-representation analysis. Novel functions include facile definition and optimized visualization of subgroups, comparison of a series of data sets in an identical network by data-to-visual mapping and subsequent overlaying function, and management of custom interaction networks. Utility of MONGKIE for network-based visual data mining of multi-omics data was demonstrated by analysis of the TCGA glioblastoma data. MONGKIE was developed in Java based on the NetBeans plugin architecture, thus being OS-independent with intrinsic support of module extension by third-party developers. We believe that MONGKIE would be a valuable addition to network analysis software by supporting many unique features and visualization options, especially for analysing multi-omics data sets in cancer and other diseases.

  14. Specific identification of Bacillus anthracis strains

    NASA Astrophysics Data System (ADS)

    Krishnamurthy, Thaiya; Deshpande, Samir; Hewel, Johannes; Liu, Hongbin; Wick, Charles H.; Yates, John R., III

    2007-01-01

    Accurate identification of human pathogens is the vital first step in treating civilian terrorism victims and military personnel afflicted in biological threat situations. We have applied a powerful multi-dimensional protein identification technology (MudPIT) along with newly generated software termed Profiler to identify the sequences of specific proteins observed for a few strains of Bacillus anthracis, a human pathogen. Profiler was created to initially screen the MudPIT data of B. anthracis strains and establish the observed proteins specific for its strains. A database containing marker proteins of B. anthracis and its strains was also generated using Profiler, which in turn could be used for detecting the organism and its corresponding strains in samples. Analysis of the unknowns by our methodology, combining MudPIT and Profiler, led to the accurate identification of the anthracis strains present in samples. Thus, a new approach for the identification of B. anthracis strains in unknown samples, based on the molecular mass and sequences of marker proteins, has been established.

  15. hSAGEing: an improved SAGE-based software for identification of human tissue-specific or common tumor markers and suppressors.

    PubMed

    Yang, Cheng-Hong; Chuang, Li-Yeh; Shih, Tsung-Mu; Chang, Hsueh-Wei

    2010-12-17

    SAGE (serial analysis of gene expression) is a powerful method of analyzing gene expression for the entire transcriptome. There are currently many well-developed SAGE tools. However, the cross-comparison of different tissues is seldom addressed, thus limiting the identification of common- and tissue-specific tumor markers. To improve the SAGE mining methods, we propose a novel function for cross-tissue comparison of SAGE data by combining the mathematical set theory and logic with a unique "multi-pool method" that analyzes multiple pools of pair-wise case controls individually. When all the settings are in "inclusion", the common SAGE tag sequences are mined. When one tissue type is in "inclusion" and the other types of tissues are not in "inclusion", the selected tissue-specific SAGE tag sequences are generated. They are displayed in tags-per-million (TPM) and fold values, as well as visually displayed in four kinds of scales in a color gradient pattern. In the fold visualization display, the top scores of the SAGE tag sequences are provided, along with cluster plots. A user-defined matrix file is designed for cross-tissue comparison by selecting libraries from publically available databases or user-defined libraries. The hSAGEing tool provides a combination of friendly cross-tissue analysis and an interface for comparing SAGE libraries for the first time. Some up- or down-regulated genes with tissue-specific or common tumor markers and suppressors are identified computationally. The tool is useful and convenient for in silico cancer transcriptomic studies and is freely available at http://bio.kuas.edu.tw/hSAGEing.
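
    The set-theoretic "inclusion" logic and the tags-per-million (TPM) normalization described above can be sketched as follows. This is an illustrative reading of the abstract, not the hSAGEing implementation; the function names, tissue names, and tag sequences are hypothetical.

```python
def tags_per_million(counts):
    """Scale raw tag counts of one SAGE library to tags per million (TPM)."""
    total = sum(counts.values())
    return {tag: count * 1_000_000 / total for tag, count in counts.items()}


def common_tags(pools):
    """Tags present in every tissue pool (all pools set to 'inclusion')."""
    return set.intersection(*(set(tags) for tags in pools.values()))


def tissue_specific_tags(pools, tissue):
    """Tags present in the selected tissue pool and in no other pool."""
    others = set().union(
        *(set(tags) for name, tags in pools.items() if name != tissue)
    )
    return set(pools[tissue]) - others


# Hypothetical pools: each maps a tissue to the tags that passed its own
# pair-wise case/control comparison.
pools = {
    "brain": {"AAA", "CCC"},
    "liver": {"AAA", "GGG"},
    "lung": {"AAA", "CCC", "TTT"},
}
shared = common_tags(pools)                      # tags common to all tissues
liver_only = tissue_specific_tags(pools, "liver")
tpm = tags_per_million({"AAA": 3, "CCC": 1})
```

    Treating each tissue as an independent pool first, and combining the pools with set operations afterwards, is what allows the same data to answer both the "common marker" and the "tissue-specific marker" question.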

  16. Comparison between rpoB and 16S rRNA Gene Sequencing for Molecular Identification of 168 Clinical Isolates of Corynebacterium

    PubMed Central

    Khamis, Atieh; Raoult, Didier; La Scola, Bernard

    2005-01-01

    Higher proportions (91%) of 168 corynebacterial isolates were positively identified by partial rpoB gene determination than by that based on 16S rRNA gene sequences. This method is thus a simple, molecular-analysis-based method for identification of corynebacteria, but it should be used in conjunction with other tests for definitive identification. PMID:15815024

  17. Suitability of partial 16S ribosomal RNA gene sequence analysis for the identification of dangerous bacterial pathogens.

    PubMed

    Ruppitsch, W; Stöger, A; Indra, A; Grif, K; Schabereiter-Gurtner, C; Hirschl, A; Allerberger, F

    2007-03-01

    In a bioterrorism event a rapid tool is needed to identify relevant dangerous bacteria. The aim of the study was to assess the usefulness of partial 16S rRNA gene sequence analysis and the suitability of diverse databases for identifying dangerous bacterial pathogens. For rapid identification purposes a 500-bp fragment of the 16S rRNA gene of 28 isolates comprising Bacillus anthracis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Francisella tularensis, Yersinia pestis, and eight genus-related and unrelated control strains was amplified and sequenced. The obtained sequence data were submitted to three public and two commercial sequence databases for species identification. The most frequent reason for incorrect identification was the lack of the respective 16S rRNA gene sequences in the database. Sequence analysis of a 500-bp 16S rDNA fragment allows the rapid identification of dangerous bacterial species. However, for discrimination of closely related species sequencing of the entire 16S rRNA gene, additional sequencing of the 23S rRNA gene or sequencing of the 16S-23S rRNA intergenic spacer is essential. This work provides comprehensive information on the suitability of partial 16S rDNA analysis and diverse databases for rapid and accurate identification of dangerous bacterial pathogens.

  18. PanGEA: identification of allele specific gene expression using the 454 technology.

    PubMed

    Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian

    2009-05-14

    Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA not only provides an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: http://www.kofler.or.at/bioinformatics/PanGEA
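
    For orientation, the textbook Smith-Waterman local alignment score with a linear gap penalty can be sketched as below. This is the standard algorithm only; the homopolymer-aware modification implemented in PanGEA is not described in the abstract and is not reproduced here. The scoring parameters are arbitrary assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = h[i - 1][j] + gap      # gap in b
            left = h[i][j - 1] + gap    # gap in a
            h[i][j] = max(0, diag, up, left)  # 0 restarts the local alignment
            best = max(best, h[i][j])
    return best


identical = smith_waterman_score("ACGT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
# A read with one extra G in a homopolymer run, as can happen with 454 data.
homopolymer = smith_waterman_score("ACGGGT", "ACGGT")
```

    With a uniform gap penalty, a miscalled homopolymer length costs as much as any other indel, which is the kind of behaviour a homopolymer-aware scoring scheme would presumably relax.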

  19. PanGEA: Identification of allele specific gene expression using the 454 technology

    PubMed Central

    Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian

    2009-01-01

    Background Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA not only provides an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. Results We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. Conclusion To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: PMID:19442283

  20. Identification of the transcriptional regulators by expression profiling infected with hepatitis B virus.

    PubMed

    Chai, Xiaoqiang; Han, Yanan; Yang, Jian; Zhao, Xianxian; Liu, Yewang; Hou, Xugang; Tang, Yiheng; Zhao, Shirong; Li, Xiao

    2016-02-01

    The molecular pathogenesis of infection by hepatitis B virus with human is extremely complex and heterogeneous. To date the molecular information is not clearly defined despite intensive research efforts. Thus, studies aimed at transcription and regulation during virus infection or combined researches of those already known to be beneficial are needed. With the purpose of identifying the transcriptional regulators related to infection of hepatitis B virus in gene level, the gene expression profiles from some normal individuals and hepatitis B patients were analyzed in our study. In this work, the differential expressed genes were selected primarily. The several genes among those were validated in an independent set by qRT-PCR. Then the differentially co-expression analysis was conducted to identify differentially co-expressed links and differential co-expressed genes. Next, the analysis of the regulatory impact factors was performed through mapping the links and regulatory data. In order to give a further insight to these regulators, the co-expression gene modules were identified using a threshold-based hierarchical clustering method. Incidentally, the construction of the regulatory network was generated using the computer software. A total of 137,284 differentially co-expressed links and 780 differential co-expressed genes were identified. These co-expressed genes were significantly enriched inflammatory response. The results of regulatory impact factors revealed several crucial regulators related to hepatocellular carcinoma and other high-rank regulators. Meanwhile, more than one hundred co-expression gene modules were identified using clustering method. In our study, some important transcriptional regulators were identified using a computational method, which may enhance the understanding of disease mechanisms and lead to an improved treatment of hepatitis B. However, further experimental studies are required to confirm these findings. 

  1. Gene expression profiles in rat mesenteric lymph nodes upon supplementation with Conjugated Linoleic Acid during gestation and suckling

    PubMed Central

    2011-01-01

    Background Diet plays a role in the development of the immune system, and polyunsaturated fatty acids can modulate the expression of a variety of genes. Human milk contains conjugated linoleic acid (CLA), a fatty acid that seems to contribute to immune development. Indeed, recent studies carried out in our group in suckling animals have shown that the immune function is enhanced after feeding them with an 80:20 isomer mix composed of c9,t11 and t10,c12 CLA. However, little work has been done on the effects of CLA on gene expression, and even less regarding immune system development in early life. Results The expression profile of mesenteric lymph nodes from animals supplemented with CLA during gestation and suckling through dam's milk (Group A) or by oral gavage (Group B), supplemented just during suckling (Group C) and control animals (Group D) was determined with the aid of the specific GeneChip® Rat Genome 230 2.0 (Affymetrix). Bioinformatics analyses were performed using the GeneSpring GX software package v10.0.2 and led to the identification of 89 genes differentially expressed in all three dietary approaches. Generation of a biological association network evidenced several genes, such as connective tissue growth factor (Ctgf), tissue inhibitor of metalloproteinase 1 (Timp1), galanin (Gal), synaptotagmin 1 (Syt1), growth factor receptor bound protein 2 (Grb2), actin gamma 2 (Actg2) and smooth muscle alpha actin (Acta2), as highly interconnected nodes of the resulting network. Gene underexpression was confirmed by Real-Time RT-PCR. Conclusions Ctgf, Timp1, Gal and Syt1, among others, are genes modulated by CLA supplementation that may have a role in mucosal immune responses in early life. PMID:21481241

  2. Gene expression profiles in rat mesenteric lymph nodes upon supplementation with conjugated linoleic acid during gestation and suckling.

    PubMed

    Selga, Elisabet; Pérez-Cano, Francisco J; Franch, Angels; Ramírez-Santana, Carolina; Rivero, Montserrat; Ciudad, Carlos J; Castellote, Cristina; Noé, Véronique

    2011-04-11

    Diet plays a role in the development of the immune system, and polyunsaturated fatty acids can modulate the expression of a variety of genes. Human milk contains conjugated linoleic acid (CLA), a fatty acid that seems to contribute to immune development. Indeed, recent studies carried out in our group in suckling animals have shown that the immune function is enhanced after feeding them with an 80:20 isomer mix composed of c9,t11 and t10,c12 CLA. However, little work has been done on the effects of CLA on gene expression, and even less regarding immune system development in early life. The expression profile of mesenteric lymph nodes from animals supplemented with CLA during gestation and suckling through dam's milk (Group A) or by oral gavage (Group B), supplemented just during suckling (Group C) and control animals (Group D) was determined with the aid of the specific GeneChip® Rat Genome 230 2.0 (Affymetrix). Bioinformatics analyses were performed using the GeneSpring GX software package v10.0.2 and led to the identification of 89 genes differentially expressed in all three dietary approaches. Generation of a biological association network evidenced several genes, such as connective tissue growth factor (Ctgf), tissue inhibitor of metalloproteinase 1 (Timp1), galanin (Gal), synaptotagmin 1 (Syt1), growth factor receptor bound protein 2 (Grb2), actin gamma 2 (Actg2) and smooth muscle alpha actin (Acta2), as highly interconnected nodes of the resulting network. Gene underexpression was confirmed by Real-Time RT-PCR. Ctgf, Timp1, Gal and Syt1, among others, are genes modulated by CLA supplementation that may have a role in mucosal immune responses in early life.

  3. Epigenomics of Development in Populus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strauss, Steve; Freitag, Michael; Mockler, Todd

    2013-01-10

    We conducted research to determine the role of epigenetic modifications during tree development using poplar (Populus trichocarpa), a model woody feedstock species. Using methylated DNA immunoprecipitation (MeDIP) or chromatin immunoprecipitation (ChIP), followed by high-throughput sequencing, we analyzed DNA and histone methylation patterns in the P. trichocarpa genome in relation to four biological processes: bud dormancy and release, mature organ maintenance, in vitro organogenesis, and methylation suppression. Our project is now completed. We have 1) produced 22 transgenic events for a gene involved in DNA methylation suppression and studied its phenotypic consequences; 2) completed sequencing of methylated DNA from eleven target tissues in wildtype P. trichocarpa; 3) updated our customized poplar genome browser using the open-source software tools (2.13) and (V2.2) of the P. trichocarpa genome; 4) produced summary data for genome methylation in P. trichocarpa, including distribution of methylation across chromosomes and in and around genes; 5) employed bioinformatic and statistical methods to analyze differences in methylation patterns among tissue types; 6) used bisulfite sequencing of selected target genes to confirm bioinformatics and sequencing results, and gain a higher-resolution view of methylation at selected genes; and 7) compared methylation patterns to expression using available microarray data.
Our main findings of biological significance are the identification of extensive regions of the genome that display developmental variation in DNA methylation; highly distinctive gene-associated methylation profiles in reproductive tissues, particularly male catkins; a strong whole genome/all tissue inverse association of methylation at gene bodies and promoters with gene expression; a lack of evidence that tissue specificity of gene expression is associated with gene methylation; and evidence that genome methylation is a significant impediment to tissue dedifferentiation and redifferentiation in vitro.

  4. Evaluation of Reference Genes for RT qPCR Analyses of Structure-Specific and Hormone Regulated Gene Expression in Physcomitrella patens Gametophytes

    PubMed Central

    Le Bail, Aude; Scholz, Sebastian; Kost, Benedikt

    2013-01-01

    The use of the moss Physcomitrella patens as a model system to study plant development and physiology is rapidly expanding. The strategic position of P. patens within the green lineage between algae and vascular plants, the high efficiency with which transgenes are incorporated by homologous recombination, advantages associated with the haploid gametophyte representing the dominant phase of the P. patens life cycle, the simple structure of protonemata, leafy shoots and rhizoids that constitute the haploid gametophyte, as well as a readily accessible high-quality genome sequence make this moss a very attractive experimental system. The investigation of the genetic and hormonal control of P. patens development heavily depends on the analysis of gene expression patterns by real time quantitative PCR (RT qPCR). This technique requires well characterized sets of reference genes, which display minimal expression level variations under all analyzed conditions, for data normalization. Sets of suitable reference genes have been described for most widely used model systems including e.g. Arabidopsis thaliana, but not for P. patens. Here, we present a RT qPCR based comparison of transcript levels of 12 selected candidate reference genes in a range of gametophytic P. patens structures at different developmental stages, and in P. patens protonemata treated with hormones or hormone transport inhibitors. Analysis of these RT qPCR data using GeNorm and NormFinder software resulted in the identification of sets of P. patens reference genes suitable for gene expression analysis under all tested conditions, and suggested that the two best reference genes are sufficient for effective data normalization under each of these conditions. PMID:23951063

  5. Identification of potential transcriptomic markers in developing pediatric sepsis: a weighted gene co-expression network analysis and a case-control validation study.

    PubMed

    Li, Yiping; Li, Yanhong; Bai, Zhenjiang; Pan, Jian; Wang, Jian; Fang, Fang

    2017-12-13

    Sepsis represents a complex disease with the dysregulated inflammatory response and high mortality rate. The goal of this study was to identify potential transcriptomic markers in developing pediatric sepsis by a co-expression module analysis of the transcriptomic dataset. Using the R software and Bioconductor packages, we performed a weighted gene co-expression network analysis to identify co-expression modules significantly associated with pediatric sepsis. Functional interpretation (gene ontology and pathway analysis) and enrichment analysis with known transcription factors and microRNAs of the identified candidate modules were then performed. In modules significantly associated with sepsis, the intramodular analysis was further performed and "hub genes" were identified and validated by quantitative real-time PCR (qPCR) in this study. 15 co-expression modules in total were detected, and four modules ("midnight blue", "cyan", "brown", and "tan") were most significantly associated with pediatric sepsis and suggested as potential sepsis-associated modules. Gene ontology analysis and pathway analysis revealed that these four modules strongly associated with immune response. Three of the four sepsis-associated modules were also enriched with known transcription factors (false discovery rate-adjusted P < 0.05). Hub genes were identified in each of the four modules. Four of the identified hub genes (MYB proto-oncogene like 1, killer cell lectin like receptor G1, stomatin, and membrane spanning 4-domains A4A) were further validated to be differentially expressed between septic children and controls by qPCR. Four pediatric sepsis-associated co-expression modules were identified in this study. qPCR results suggest that hub genes in these modules are potential transcriptomic markers for pediatric sepsis diagnosis. These results provide novel insights into the pathogenesis of pediatric sepsis and promote the generation of diagnostic gene sets.
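
    The core quantities behind a weighted gene co-expression network, the soft-thresholded adjacency |cor|^beta and the per-gene connectivity used to nominate hub genes, can be sketched as follows. The study used R and Bioconductor; this Python/NumPy version is only an illustration of the idea. The soft-threshold power beta=6, the function names, and the toy data are assumptions, and module detection via topological overlap and hierarchical clustering is omitted.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(gene_i, gene_j)| ** beta.

    expr: array of shape (samples, genes).
    """
    cor = np.corrcoef(expr, rowvar=False)  # gene-by-gene Pearson correlation
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)             # ignore self-connections
    return adj


def connectivity(adj):
    """Sum of connection weights per gene; high values suggest hub genes."""
    return adj.sum(axis=1)


# Synthetic data: genes 0 and 1 are strongly co-expressed, gene 2 is not.
toy = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 1.0],
    [5.0, 10.5, 5.0],
])
adj = soft_threshold_adjacency(toy)
k = connectivity(adj)
```

    Raising the correlation to a power suppresses weak correlations while keeping the network weighted, so no hard cutoff decides whether two genes are connected.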

  6. Identification of genes in anonymous DNA sequences. Final report: Report period, 15 April 1993--15 April 1994

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fields, C.A.

    1994-09-01

    This Report concludes the DOE Human Genome Program project, ``Identification of Genes in Anonymous DNA Sequence.`` The central goals of this project have been (1) understanding the problem of identifying genes in anonymous sequences, and (2) development of tools, primarily the automated identification system gm, for identifying genes. The activities supported under the previous award are summarized here to provide a single complete report on the activities supported as part of the project from its inception to its completion.

  7. Direct identification of bacteria causing urinary tract infections by combining matrix-assisted laser desorption ionization-time of flight mass spectrometry with UF-1000i urine flow cytometry.

    PubMed

    Wang, X-H; Zhang, G; Fan, Y-Y; Yang, X; Sui, W-J; Lu, X-X

    2013-03-01

    Rapid identification of bacterial pathogens from clinical specimens is essential to establish an adequate empirical antibiotic therapy to treat urinary tract infections (UTIs). We used matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) combined with UF-1000i urine flow cytometry of urine specimens to quickly and accurately identify bacteria causing UTIs. We divided each urine sample into three aliquots for conventional identification, UF-1000i, and MALDI-TOF MS, respectively. We compared the results of the conventional method with those of MALDI-TOF MS combined with UF-1000i, and discrepancies were resolved by 16S rRNA gene sequencing. We analyzed 1456 urine samples from patients with UTI symptoms, and 932 (64.0%) were negative using each of the three testing methods. The combined method used UF-1000i to eliminate negative specimens and then MALDI-TOF MS to identify the remaining positive samples. The combined method was consistent with the conventional method in 1373 of 1456 cases (94.3%), and gave the correct result in 1381 of 1456 cases (94.8%). Therefore, the combined method described here can directly provide a rapid, accurate, definitive bacterial identification for the vast majority of urine samples, though the MALDI-TOF MS software analysis capabilities should be improved, with regard to mixed bacterial infection.
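
    The percentages reported above follow directly from the stated counts. A short check, using only the figures given in the abstract (the helper name is an assumption):

```python
def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * part / whole, 1)


total_samples = 1456
negative_by_all_methods = percent(932, total_samples)       # reported as 64.0%
consistent_with_conventional = percent(1373, total_samples)  # reported as 94.3%
correct_results = percent(1381, total_samples)               # reported as 94.8%
```

    The gap between 1381 correct results and 1373 results consistent with the conventional method is consistent with discrepant cases having been adjudicated by 16S rRNA gene sequencing, as the abstract describes.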

  8. Drainage identification analysis and mapping, phase 2 : technical brief.

    DOT National Transportation Integrated Search

    2017-01-01

    This research studied, tested and rectified the compatibility issue related to the recent upgrades of NJDOT vendor inspection software, and uploaded all collected data to make Drainage Identification Analysis and Mapping System (DIAMS) current an...

  9. PAnalyzer: a software tool for protein inference in shotgun proteomics.

    PubMed

    Prieto, Gorka; Aloria, Kerman; Osinalde, Nerea; Fullaondo, Asier; Arizmendi, Jesus M; Matthiesen, Rune

    2012-11-05

    Protein inference from peptide identifications in shotgun proteomics must deal with ambiguities that arise due to the presence of peptides shared between different proteins, which is common in higher eukaryotes. Recently data independent acquisition (DIA) approaches have emerged as an alternative to the traditional data dependent acquisition (DDA) in shotgun proteomics experiments. MSE is the term used to name one of the DIA approaches used in QTOF instruments. MSE data require specialized software to process acquired spectra and to perform peptide and protein identifications. However the software available at the moment does not group the identified proteins in a transparent way by taking into account peptide evidence categories. Furthermore the inspection, comparison and report of the obtained results require tedious manual intervention. Here we report a software tool to address these limitations for MSE data. In this paper we present PAnalyzer, a software tool focused on the protein inference process of shotgun proteomics. Our approach considers all the identified proteins and groups them when necessary indicating their confidence using different evidence categories. PAnalyzer can read protein identification files in the XML output format of the ProteinLynx Global Server (PLGS) software provided by Waters Corporation for their MSE data, and also in the mzIdentML format recently standardized by HUPO-PSI. Multiple files can also be read simultaneously and are considered as technical replicates. Results are saved to CSV, HTML and mzIdentML (in the case of a single mzIdentML input file) files. An MSE analysis of a real sample is presented to compare the results of PAnalyzer and ProteinLynx Global Server. We present a software tool to deal with the ambiguities that arise in the protein inference process. Key contributions are support for MSE data analysis by ProteinLynx Global Server and technical replicates integration. 
PAnalyzer is an easy to use multiplatform and free software tool.
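
    The shared-peptide ambiguity described above can be made concrete with a small grouping sketch. It is a deliberately simplified illustration and not PAnalyzer's actual rule set: the three labels used here ("unique-evidence", "indistinguishable", "shared-only") and all protein and peptide names are hypothetical.

```python
from collections import defaultdict


def group_proteins(protein_peptides):
    """Group proteins by peptide evidence.

    protein_peptides: dict mapping protein id -> set of identified peptides.
    Returns a dict mapping a tuple of protein ids -> evidence label.
    """
    # For every peptide, record which proteins could have produced it.
    peptide_owners = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            peptide_owners[peptide].add(protein)

    # Proteins with identical peptide sets cannot be told apart.
    by_peptide_set = defaultdict(list)
    for protein, peptides in protein_peptides.items():
        by_peptide_set[frozenset(peptides)].append(protein)

    groups = {}
    for peptides, proteins in by_peptide_set.items():
        # A peptide is discriminating if only this group can explain it.
        has_own_peptide = any(peptide_owners[p] == set(proteins) for p in peptides)
        if has_own_peptide and len(proteins) == 1:
            label = "unique-evidence"
        elif has_own_peptide:
            label = "indistinguishable"
        else:
            label = "shared-only"
        groups[tuple(sorted(proteins))] = label
    return groups


example = {
    "P1": {"pepA", "pepB"},
    "P2": {"pepB"},
    "P3": {"pepC"},
    "P4": {"pepC"},
}
groups = group_proteins(example)
```

    Here P1 is supported by a peptide of its own, P2 is explained entirely by a peptide it shares with P1, and P3 and P4 are reported together because no peptide separates them.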

  10. PAnalyzer: A software tool for protein inference in shotgun proteomics

    PubMed Central

    2012-01-01

    Background Protein inference from peptide identifications in shotgun proteomics must deal with ambiguities that arise due to the presence of peptides shared between different proteins, which is common in higher eukaryotes. Recently data independent acquisition (DIA) approaches have emerged as an alternative to the traditional data dependent acquisition (DDA) in shotgun proteomics experiments. MSE is the term used to name one of the DIA approaches used in QTOF instruments. MSE data require specialized software to process acquired spectra and to perform peptide and protein identifications. However the software available at the moment does not group the identified proteins in a transparent way by taking into account peptide evidence categories. Furthermore the inspection, comparison and report of the obtained results require tedious manual intervention. Here we report a software tool to address these limitations for MSE data. Results In this paper we present PAnalyzer, a software tool focused on the protein inference process of shotgun proteomics. Our approach considers all the identified proteins and groups them when necessary indicating their confidence using different evidence categories. PAnalyzer can read protein identification files in the XML output format of the ProteinLynx Global Server (PLGS) software provided by Waters Corporation for their MSE data, and also in the mzIdentML format recently standardized by HUPO-PSI. Multiple files can also be read simultaneously and are considered as technical replicates. Results are saved to CSV, HTML and mzIdentML (in the case of a single mzIdentML input file) files. An MSE analysis of a real sample is presented to compare the results of PAnalyzer and ProteinLynx Global Server. Conclusions We present a software tool to deal with the ambiguities that arise in the protein inference process. Key contributions are support for MSE data analysis by ProteinLynx Global Server and technical replicates integration. 
PAnalyzer is an easy to use multiplatform and free software tool. PMID:23126499

  11. Computational methods in sequence and structure prediction

    NASA Astrophysics Data System (ADS)

    Lang, Caiyi

    This dissertation is organized into two parts. In the first part, we will discuss three computational methods for cis-regulatory element recognition in three different gene regulatory networks, as follows: (a) Using a comprehensive "Phylogenetic Footprinting Comparison" method, we will investigate the promoter sequence structures of three enzymes (PAL, CHS and DFR) that catalyze sequential steps in the pathway from phenylalanine to anthocyanins in plants. Our result shows there exists a putative cis-regulatory element "AC(C/G)TAC(C)" in the upstream region of these enzyme genes. We propose that this cis-regulatory element is responsible for the genetic regulation of these three enzymes and that this element might also be the binding site for the MYB class transcription factor PAP1. (b) We will investigate the role of the Arabidopsis gene glutamate receptor 1.1 (AtGLR1.1) in C and N metabolism by utilizing the microarray data we obtained from AtGLR1.1 deficient lines (antiAtGLR1.1). We focus our investigation on the putatively co-regulated transcript profile of 876 genes we have collected in antiAtGLR1.1 lines. By (i) scanning the occurrence of several groups of known abscisic acid (ABA) related cis-regulatory elements in the upstream regions of 876 Arabidopsis genes; and (ii) exhaustively scanning the occurrence of all possible 6-10 bp motifs in the upstream regions of the same set of genes, we are able to make a quantitative estimation of the enrichment level of each of the cis-regulatory element candidates. We finally conclude that one specific cis-regulatory element group, called "ABRE" elements, is statistically highly enriched within the 876-gene group as compared to its occurrence within the genome. (c) We will introduce a new general purpose algorithm, called "fuzzy REDUCE1", which we have developed recently for automated cis-regulatory element identification. In the second part, we will discuss our newly devised protein design framework. 
With this framework we have developed a software package which is capable of designing novel protein structures at the atomic resolution. This software package allows us to perform protein structure design with a flexible backbone. The backbone flexibility includes loop region relaxation as well as a secondary structure collective mode relaxation scheme. (Abstract shortened by UMI.)
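
    The enrichment estimate described in part (b) of this abstract can be illustrated with a minimal sketch. The abstract does not say which statistic was used, so the hypergeometric upper-tail test below is an assumption, and the function names, the toy sequences and the motif "ACGTG" are hypothetical illustrations only. Only the forward strand is scanned.

```python
from math import comb

def genes_with_motif(upstream, motif):
    """Return the IDs of genes whose upstream sequence contains the motif."""
    return {gene for gene, seq in upstream.items() if motif in seq}

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N genes, K of which carry the motif."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total

def motif_enrichment(upstream, gene_set, motif):
    """Count motif carriers inside gene_set and test them against the background."""
    carriers = genes_with_motif(upstream, motif)
    k = len(carriers & gene_set)
    p = hypergeom_upper_tail(k, len(gene_set), len(carriers), len(upstream))
    return k, p

# Toy "genome" of six upstream regions (hypothetical data).
upstream = {
    "g1": "TTACGTGGC", "g2": "CCACGTGTA", "g3": "TTTTTTTT",
    "g4": "GGGGCCCC", "g5": "ATATATAT", "g6": "CACGTGAA",
}
co_regulated = {"g1", "g2"}  # stands in for the 876-gene group
k, p_value = motif_enrichment(upstream, co_regulated, "ACGTG")
print(k, p_value)
```

    In a real analysis the same test would be repeated for every candidate 6-10 bp motif, so a multiple-testing correction would be needed before calling any motif enriched.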

  12. Pathway results from the chicken data set using GOTM, Pathway Studio and Ingenuity softwares

    PubMed Central

    Bonnet, Agnès; Lagarrigue, Sandrine; Liaubet, Laurence; Robert-Granié, Christèle; SanCristobal, Magali; Tosser-Klopp, Gwenola

    2009-01-01

    Background: As presented in the introduction paper, three sets of differentially regulated genes were found after the analysis of the chicken infection data set from EADGENE. Different methods were used to interpret these results. Results: GOTM, Pathway Studio and Ingenuity software were used to investigate the three lists of genes. The three tools allowed the analysis of the data and highlighted different networks. However, only one set of genes, showing a differential expression between primary and secondary response, gave a significant biological interpretation. Conclusion: Combining these databases, which were developed independently on different annotation sources, supplies a useful tool for a global biological interpretation of microarray data, even if they may contain some imperfections (e.g. genes not annotated or not well annotated). PMID:19615111

  13. RANGER-DTL 2.0: Rigorous Reconstruction of Gene-Family Evolution by Duplication, Transfer, and Loss.

    PubMed

    Bansal, Mukul S; Kellis, Manolis; Kordi, Misagh; Kundu, Soumya

    2018-04-24

    RANGER-DTL 2.0 is a software program for inferring gene family evolution using Duplication-Transfer-Loss reconciliation. This new software is highly scalable and easy to use, and offers many new features not currently available in any other reconciliation program. RANGER-DTL 2.0 has a particular focus on reconciliation accuracy and can account for many sources of reconciliation uncertainty including uncertain gene tree rooting, gene tree topological uncertainty, multiple optimal reconciliations, and alternative event cost assignments. RANGER-DTL 2.0 is open-source and written in C++ and Python. Pre-compiled executables, source code (open-source under GNU GPL), and a detailed manual are freely available from http://compbio.engr.uconn.edu/software/RANGER-DTL/.

  14. Identification of early zygotic genes in the yellow fever mosquito Aedes aegypti and discovery of a motif involved in early zygotic genome activation.

    PubMed

    Biedler, James K; Hu, Wanqi; Tae, Hongseok; Tu, Zhijian

    2012-01-01

    During early embryogenesis the zygotic genome is transcriptionally silent and all mRNAs present are of maternal origin. The maternal-zygotic transition marks the time over which embryogenesis changes its dependence from maternal RNAs to zygotically transcribed RNAs. Here we present the first systematic investigation of early zygotic genes (EZGs) in a mosquito species and focus on genes involved in the onset of transcription during 2-4 hr. We used transcriptome sequencing to identify the "pure" (without maternal expression) EZGs by analyzing transcripts from four embryonic time ranges of 0-2, 2-4, 4-8, and 8-12 hr, which includes the time of cellular blastoderm formation and up to the start of gastrulation. Blast of 16,789 annotated transcripts vs. the transcriptome reads revealed evidence for 63 (P<0.001) and 143 (P<0.05) nonmaternally derived transcripts having a significant increase in expression at 2-4 hr. One third of the 63 EZG transcripts do not have predicted introns compared to 10% of all Ae. aegypti genes. We have confirmed by RT-PCR that zygotic transcription starts as early as 2-3 hours. A degenerate motif VBRGGTA was found to be overrepresented in the upstream sequences of the identified EZGs using a motif identification software called SCOPE. We find evidence for homology between this motif and the TAGteam motif found in Drosophila that has been implicated in EZG activation. A 38 bp sequence in the proximal upstream sequence of a kinesin light chain EZG (KLC2.1) contains two copies of the mosquito motif. This sequence was shown to support EZG transcription by luciferase reporter assays performed on injected early embryos, and confers early zygotic activity to a heterologous promoter from a divergent mosquito species. The results of these studies are consistent with the model of early zygotic genome activation via transcriptional activators, similar to what has been found recently in Drosophila.
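
    A degenerate motif such as VBRGGTA is written in IUPAC nucleotide codes (V = A/C/G, B = C/G/T, R = A/G). As a minimal sketch of how such a motif can be located in a sequence, the code below expands the codes into a regular expression. This is not how SCOPE works internally; the function names and the test sequence are hypothetical, and only the forward strand is searched.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead lets overlapping hits be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")

def find_motif(seq, motif):
    """Return (0-based start, matched site) for every hit on the forward strand."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

hits = find_motif("TTACAGGTAGGCTAGGTACC", "VBRGGTA")
print(hits)  # [(2, 'ACAGGTA'), (11, 'CTAGGTA')]
```

    To cover both strands, the same search could be repeated on the reverse complement of the sequence.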

  15. [Isolation and identification of Streptococcus suis serotype 2 from sick-pig samples of Sichuan province].

    PubMed

    Zhu, Hong; He, Jun; Jing, Hong-bo; Wang, Zheng-qiang; Duan, Qing

    2006-08-01

    Streptococcus suis serotype 2 (SS2) is a major pathogen frequently associated with infections in pigs. There are presently 35 recognized serotypes of S. suis (serotypes 1 to 34 and serotype 1/2), defined on the basis of capsular antigens. Few human infections with SS2 had been reported in past years. However, an incident occurred in Sichuan province, China, in 2005: some people became ill and died, and all of them had been in close contact with sick pigs. Based on clinical features and epidemiologic data, this incident could have been caused by SS2 infection. Liver, spleen, kidney, lung and serum samples were collected and used for pathogen isolation and identification in the laboratory, and three bacterial strains were isolated. The three strains showed the typical morphology of SS2 on blood agar and under the microscope with Gram stain. They also agglutinated with standard SS2 serum. Biochemical characteristics of the three strains were tested using API 20 Strep and analyzed with API software (version 3.3); the results showed that they were SS2. Four primer pairs were designed that exactly matched the extracellular factor gene, the muramidase-released protein gene, the capsular polysaccharide gene and the 16S rRNA gene, respectively. These primers were used in polymerase chain reaction (PCR), and the PCR products were 626 bp, 885 bp, 487 bp and 297 bp on agarose gel, respectively. Drug sensitivity tests were also done, and the results showed that the strains were sensitive to cefazolin, clindamycin, erythromycin, levofloxacin, nitrofurantoin, penicillin-G, and vancomycin and resistant to tetracycline. Balb/c mice infected with the isolated SS2 strain showed swelling of the stomach and intestine, cyanochroia at the mouth and suggillation under the skin, which were similar to the clinical features of patients. Streptococcus suis serotype 2 was also found on lung sheeting samples under the microscope with Gram stain. Rabbits infected with the isolated SS2 showed clinical features similar to those of the mice.

  16. Refined identification of Vibrio bacterial flora from Acanthasther planci based on biochemical profiling and analysis of housekeeping genes.

    PubMed

    Rivera-Posada, J A; Pratchett, M; Cano-Gomez, A; Arango-Gomez, J D; Owens, L

    2011-09-09

    We used a polyphasic approach for precise identification of bacterial flora (Vibrionaceae) isolated from crown-of-thorns starfish (COTS) from Lizard Island (Great Barrier Reef, Australia) and Guam (U.S.A., Western Pacific Ocean). Previous 16S rRNA gene phylogenetic analysis was useful to allocate and identify isolates within the Photobacterium, Splendidus and Harveyi clades but failed in the identification of Vibrio harveyi-like isolates. Species of the V. harveyi group have almost indistinguishable phenotypes and genotypes, and thus, identification by standard biochemical tests and 16S rRNA gene analysis is commonly inaccurate. Biochemical profiling and sequence analysis of additional topA and mreB housekeeping genes were carried out for definitive identification of 19 bacterial isolates recovered from sick and wild COTS. For 8 isolates, biochemical profiles and topA and mreB gene sequence alignments with the closest relatives (GenBank) confirmed previous 16S rRNA-based identification: V. fortis and Photobacterium eurosenbergii species (from wild COTS), and V. natriegens (from diseased COTS). Further phylogenetic analysis based on topA and mreB concatenated sequences served to identify the remaining 11 V. harveyi-like isolates: V. owensii and V. rotiferianus (from wild COTS), and V. owensii, V. rotiferianus, and V. harveyi (from diseased COTS). This study further confirms the reliability of topA-mreB gene sequence analysis for identification of these close species, and it reveals a wider distribution range of the potentially pathogenic V. harveyi group.

  17. The Role of 16S rRNA Gene Sequencing in Identification of Microorganisms Misidentified by Conventional Methods

    PubMed Central

    Petti, C. A.; Polage, C. R.; Schreckenberger, P.

    2005-01-01

    Traditional methods for microbial identification require the recognition of differences in morphology, growth, enzymatic activity, and metabolism to define genera and species. Full and partial 16S rRNA gene sequencing methods have emerged as useful tools for identifying phenotypically aberrant microorganisms. We report on three bacterial blood isolates from three different College of American Pathologists-certified laboratories that were referred to ARUP Laboratories for definitive identification. Because phenotypic identification suggested unusual organisms not typically associated with the submitted clinical diagnosis, consultation with the Medical Director was sought and further testing was performed including partial 16S rRNA gene sequencing. All three patients had endocarditis, and conventional methods identified isolates from patients A, B, and C as a Facklamia sp., Eubacterium tenue, and a Bifidobacterium sp. 16S rRNA gene sequencing identified the isolates as Enterococcus faecalis, Cardiobacterium valvarum, and Streptococcus mutans, respectively. We conclude that the initial identifications of these three isolates were erroneous, may have misled clinicians, and potentially impacted patient care. 16S rRNA gene sequencing is a more objective identification tool, unaffected by phenotypic variation or technologist bias, and has the potential to reduce laboratory errors. PMID:16333109

  18. CMIP: a software package capable of reconstructing genome-wide regulatory networks using gene expression data.

    PubMed

    Zheng, Guangyong; Xu, Yaochen; Zhang, Xiujun; Liu, Zhi-Ping; Wang, Zhuo; Chen, Luonan; Zhu, Xin-Guang

    2016-12-23

    A gene regulatory network (GRN) represents interactions of genes inside a cell or tissue, in which vertices and edges stand for genes and their regulatory interactions, respectively. Reconstruction of gene regulatory networks, in particular genome-scale networks, is essential for comparative exploration of different species and mechanistic investigation of biological processes. Currently, most network inference methods are computationally intensive: they are usually effective for small-scale tasks (e.g., networks with a few hundred genes) but have difficulty constructing GRNs at genome scale. Here, we present a software package for gene regulatory network reconstruction at a genomic level, in which gene interaction is measured by conditional mutual information using a parallel computing framework (so the package is named CMIP). The package is a greatly improved implementation of our previous PCA-CMI algorithm. In CMIP, we provide not only an automatic threshold determination method but also an effective parallel computing framework for network inference. Performance tests on benchmark datasets show that the accuracy of CMIP is comparable to most current network inference methods. Moreover, running tests on synthetic datasets demonstrate that CMIP can handle large datasets, especially genome-wide datasets, within an acceptable time period. In addition, successful application to a real genomic dataset confirms the practical applicability of the package. This new software package provides a powerful tool for genomic network reconstruction to the biological community. The software can be accessed at http://www.picb.ac.cn/CMIP/.
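
    Conditional mutual information I(X;Y|Z) measures how much two genes X and Y still share once a third gene Z is accounted for, which is what lets a method of this kind drop indirect edges. The abstract does not describe CMIP's estimator, so the following is only a generic plug-in estimate on discretized expression values; the function name and the toy vectors are hypothetical.

```python
from collections import Counter
from math import log2

def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits from three equal-length discrete vectors."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), c in c_xyz.items():
        # p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), with the 1/n factors cancelled
        cmi += (c / n) * log2(c * c_z[zi] / (c_xz[(xi, zi)] * c_yz[(yi, zi)]))
    return cmi

# Case 1: regulator z drives both x and y, so x and y are only indirectly linked.
z = [0, 0, 1, 1]
indirect = conditional_mutual_information(z, z, z)

# Case 2: z is uninformative and x, y co-vary, so the x-y link survives conditioning.
x = [0, 0, 1, 1]
direct = conditional_mutual_information(x, x, [0, 0, 0, 0])
print(indirect, direct)  # 0.0 1.0
```

    An edge would be kept only when the estimate exceeds a threshold; the abstract states that CMIP determines that threshold automatically, which this sketch does not attempt.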

  19. CGO: utilizing and integrating gene expression microarray data in clinical research and data management.

    PubMed

    Bumm, Klaus; Zheng, Mingzhong; Bailey, Clyde; Zhan, Fenghuang; Chiriva-Internati, M; Eddlemon, Paul; Terry, Julian; Barlogie, Bart; Shaughnessy, John D

    2002-02-01

    Clinical GeneOrganizer (CGO) is a novel windows-based archiving, organization and data mining software for the integration of gene expression profiling in clinical medicine. The program implements various user-friendly tools and extracts data for further statistical analysis. This software was written for Affymetrix GeneChip *.txt files, but can also be used for any other microarray-derived data. The MS-SQL server version acts as a data mart and links microarray data with clinical parameters of any other existing database and therefore represents a valuable tool for combining gene expression analysis and clinical disease characteristics.

  20. FAMIAS - A userfriendly new software tool for the mode identification of photometric and spectroscopic times series

    NASA Astrophysics Data System (ADS)

    Zima, W.

    2008-12-01

    FAMIAS (Frequency Analysis and Mode Identification for AsteroSeismology) is a collection of state-of-the-art software tools for the analysis of photometric and spectroscopic time series data. It is one of the deliverables of the Work Package NA5: Asteroseismology of the European Coordination Action in Helio- and Asteroseismology (HELAS). Two main sets of tools are incorporated in FAMIAS. The first set allows one to search for periodicities in the data using Fourier and non-linear least-squares fitting algorithms. The other set allows one to carry out a mode identification for the detected pulsation frequencies to determine their pulsational quantum numbers, the harmonic degree, ℓ, and the azimuthal order, m. For the spectroscopic mode identification, the Fourier parameter fit method and the moment method are available. The photometric mode identification is based on pre-computed grids of atmospheric parameters and non-adiabatic observables, and uses the method of amplitude ratios and phase differences in different filters. The types of stars to which FAMIAS is applicable are main-sequence pulsators hotter than the Sun. This includes the Gamma Dor stars, Delta Sct stars, the slowly pulsating B stars and the Beta Cep stars - basically all pulsating main-sequence stars, for which empirical mode identification is required to successfully carry out asteroseismology. The complete manual for FAMIAS is published in a special issue of Communications in Asteroseismology, Vol 155. The homepage of FAMIAS provides the possibility to download the software and to read the on-line documentation.

  1. Identification, cloning, and expression analysis of three putative Lymantria dispar nuclear polyhedrosis virus immediate early genes

    Treesearch

    James M. Slavicek; Nancy Hayes-Plazolles

    1991-01-01

    Viral immediate early gene products are usually regulatory proteins that control expression of other viral genes at the transcriptional level or are proteins that are part of the viral DNA replication complex. The identification and functional characterization of the immediate early gene products of Lymantria dispar nuclear polyhedrosis virus (LdNPV...

  2. Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology

    PubMed Central

    Grüning, Björn A.; Paszkiewicz, Konrad; Pritchard, Leighton

    2013-01-01

    The Galaxy Project offers the popular web browser-based platform Galaxy for running bioinformatics tools and constructing simple workflows. Here, we present a broad collection of additional Galaxy tools for large scale analysis of gene and protein sequences. The motivating research theme is the identification of specific genes of interest in a range of non-model organisms, and our central example is the identification and prediction of “effector” proteins produced by plant pathogens in order to manipulate their host plant. This functional annotation of a pathogen’s predicted capacity for virulence is a key step in translating sequence data into potential applications in plant pathology. This collection includes novel tools, and widely-used third-party tools such as NCBI BLAST+ wrapped for use within Galaxy. Individual bioinformatics software tools are typically available separately as standalone packages, or in online browser-based form. The Galaxy framework enables the user to combine these and other tools to automate organism scale analyses as workflows, without demanding familiarity with command line tools and scripting. Workflows created using Galaxy can be saved and are reusable, so may be distributed within and between research groups, facilitating the construction of a set of standardised, reusable bioinformatic protocols. The Galaxy tools and workflows described in this manuscript are open source and freely available from the Galaxy Tool Shed (http://usegalaxy.org/toolshed or http://toolshed.g2.bx.psu.edu). PMID:24109552

  3. 48 CFR 209.571-6 - Identification of organizational conflicts of interest.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... business units performing systems engineering and technical assistance, professional services, or... parent corporate entity, particularly the award of a subcontract for software integration or the development of a proprietary software system architecture; and (c) The performance by, or assistance of...

  4. 48 CFR 209.571-6 - Identification of organizational conflicts of interest.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... business units performing systems engineering and technical assistance, professional services, or... parent corporate entity, particularly the award of a subcontract for software integration or the development of a proprietary software system architecture; and (c) The performance by, or assistance of...

  5. 48 CFR 209.571-6 - Identification of organizational conflicts of interest.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... business units performing systems engineering and technical assistance, professional services, or... parent corporate entity, particularly the award of a subcontract for software integration or the development of a proprietary software system architecture; and (c) The performance by, or assistance of...

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baumgardt, D.R.; Carter, S.; Maxson, M.

    The objective of this project is to design and develop an Intelligent Event Identification System, or ISEIS, which will be a prototype for routine event identification of small explosions and earthquakes and to serve as a tool for discrimination research. The first part of this study gives an overview of the system design and the results of a preliminary evaluation of the system on events in Scandinavia and the Soviet Union. The system was designed to be highly modular to allow the easy incorporation of new discriminants and/or discrimination processes. Because the main objective of the system is the identification of small events, most of the initial ISEIS prototype discriminants utilize regional seismic data recorded by the regional arrays, NORESS and ARCESS. However, ISEIS can easily process other regional array data (e.g., from GERESS and FINESA), as well as data from three-component single stations, as more of this data becomes available. The second part of this study is entitled Intelligent Event Identification System: User's Manual, and gives a detailed description of all the processing interfaces of ISEIS. The third part of this study is entitled Intelligent Event Identification System: Software Maintenance Manual, which describes the ISEIS software from the programmer's perspective and provides information for maintenance and modification of the software modules in the system.

  7. In Silico Detection of Sequence Variations Modifying Transcriptional Regulation

    PubMed Central

    Andersen, Malin C; Engström, Pär G; Lithwick, Stuart; Arenillas, David; Eriksson, Per; Lenhard, Boris; Wasserman, Wyeth W; Odeberg, Jacob

    2008-01-01

    Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation. PMID:18208319

  8. A greedy, graph-based algorithm for the alignment of multiple homologous gene lists.

    PubMed

    Fostier, Jan; Proost, Sebastian; Dhoedt, Bart; Saeys, Yvan; Demeester, Piet; Van de Peer, Yves; Vandepoele, Klaas

    2011-03-15

    Many comparative genomics studies rely on the correct identification of homologous genomic regions using accurate alignment tools. In such cases, the alphabet of the input sequences consists of complete genes, rather than nucleotides or amino acids. As optimal multiple sequence alignment is computationally impractical, a progressive alignment strategy is often employed. However, such an approach is susceptible to the propagation of alignment errors in early pairwise alignment steps, especially when dealing with strongly diverged genomic regions. In this article, we present a novel accurate and efficient greedy, graph-based algorithm for the alignment of multiple homologous genomic segments, represented as ordered gene lists. Based on provable properties of the graph structure, several heuristics are developed to resolve local alignment conflicts that occur due to gene duplication and/or rearrangement events on the different genomic segments. The performance of the algorithm is assessed by comparing the alignment results of homologous genomic segments in Arabidopsis thaliana to those obtained by using both a progressive alignment method and an earlier graph-based implementation. Especially for datasets that contain strongly diverged segments, the proposed method achieves a substantially higher alignment accuracy, and proves to be sufficiently fast for large datasets including a few dozen eukaryotic genomes. The algorithm is implemented as a part of the i-ADHoRe 3.0 package, available at http://bioinformatics.psb.ugent.be/software.

  9. Comparison of traditional phenotypic identification methods with partial 5' 16S rRNA gene sequencing for species-level identification of nonfermenting Gram-negative bacilli.

    PubMed

    Cloud, Joann L; Harmsen, Dag; Iwen, Peter C; Dunn, James J; Hall, Gerri; Lasala, Paul Rocco; Hoggan, Karen; Wilson, Deborah; Woods, Gail L; Mellmann, Alexander

    2010-04-01

    Correct identification of nonfermenting Gram-negative bacilli (NFB) is crucial for patient management. We compared phenotypic identifications of 96 clinical NFB isolates with identifications obtained by 5' 16S rRNA gene sequencing. Sequencing identified 88 isolates (91.7%) with >99% similarity to a sequence from the assigned species; 61.5% of sequencing results were concordant with phenotypic results, indicating the usability of sequencing to identify NFB.

  10. Selection of reference genes for RT-qPCR analysis in tumor tissues from male hepatocellular carcinoma patients with hepatitis B infection and cirrhosis.

    PubMed

    Liu, Shuang; Zhu, Pengfei; Zhang, Ling; Ding, Shanlong; Zheng, Sujun; Wang, Yang; Lu, Fengmin

    2013-01-01

    Reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) has been widely used to quantify relative gene expression because of the high specificity, sensitivity and accuracy of this technique. However, its reliability strongly depends on the expression stability of the reference gene used for data normalization. Therefore, identification of reliable and condition-specific reference genes is critical for the success of RT-qPCR. Hepatitis B virus (HBV) infection, male gender and the presence of cirrhosis are widely recognized as the leading independent risk factors for the development of hepatocellular carcinoma (HCC). This study aimed to select a reliable reference gene for RT-qPCR analysis in HCC patients with all of those risk factors. Six candidate reference genes were analyzed in 33 paired tumor and non-tumor tissues from untreated HCC patients. Gene expression stabilities were assessed by geNorm and NormFinder. C-terminal binding protein 1 (CTBP1) was the most stable gene among the 6 candidate genes evaluated by both geNorm and NormFinder. The expression stability values were 0.08 for CTBP1 and UBC, 0.09 for HPRT1, 0.12 for HMBS, 0.14 for GAPDH and 0.18 for 18S with geNorm analysis. The stability values suggested by NormFinder software were CTBP1: 0.044, UBC: 0.063, HMBS: 0.072, HPRT1: 0.072, GAPDH: 0.098 and 18S rRNA: 0.161. This is the first systematic analysis to suggest CTBP1 as the most stably expressed gene in human male HBV infection-related HCC with cirrhosis. We recommend CTBP1 as the best candidate reference gene when RT-qPCR is used to determine gene expression in HCC. This may facilitate relevant HBV-related HCC studies in the future.
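
    For context on the geNorm values quoted above: the geNorm stability measure M of a gene is commonly described as the average, over all other candidates, of the standard deviation of the log2 expression ratio between the two genes across samples, so a lower M means more stable expression. The sketch below follows that description; the function name and the toy expression values are hypothetical and are not the study's data.

```python
from math import log2, sqrt
from statistics import stdev

def genorm_m(expr):
    """Map each gene to its stability measure M.

    expr: dict of gene -> list of relative expression quantities,
    with every list in the same sample order.
    """
    m = {}
    for gene_j, a in expr.items():
        pairwise = []
        for gene_k, b in expr.items():
            if gene_k == gene_j:
                continue
            ratios = [log2(aj / ak) for aj, ak in zip(a, b)]
            pairwise.append(stdev(ratios))  # pairwise variation V_jk
        m[gene_j] = sum(pairwise) / len(pairwise)
    return m

# Toy data: geneA and geneB keep a constant ratio; geneC fluctuates.
expr = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],
    "geneC": [1.0, 4.0, 2.0, 8.0],
}
m_values = genorm_m(expr)
for gene, m in sorted(m_values.items(), key=lambda kv: kv[1]):
    print(gene, round(m, 3))
```

    The full geNorm procedure also removes the least stable gene and recomputes M repeatedly; that stepwise exclusion is omitted here.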

  11. Top Down Implementation Plan for system performance test software

    NASA Technical Reports Server (NTRS)

    Jacobson, G. N.; Spinak, A.

    1982-01-01

    The top down implementation plan used for the development of system performance test software during the Mark IV-A era is described. The plan is based upon the identification of the hierarchical relationship of the individual elements of the software design, the development of a sequence of functionally oriented demonstrable steps, the allocation of subroutines to the specific step where they are first required, and objective status reporting. The results are: determination of milestones, improved managerial visibility, better project control, and a successful software development.

  12. Analyzing gene perturbation screens with nested effects models in R and bioconductor.

    PubMed

    Fröhlich, Holger; Beissbarth, Tim; Tresch, Achim; Kostka, Dennis; Jacob, Juby; Spang, Rainer; Markowetz, F

    2008-11-01

    Nested effects models (NEMs) are a class of probabilistic models introduced to analyze the effects of gene perturbation screens visible in high-dimensional phenotypes like microarrays or cell morphology. NEMs reverse engineer upstream/downstream relations of cellular signaling cascades. NEMs take as input a set of candidate pathway genes and phenotypic profiles of perturbing these genes. NEMs return a pathway structure explaining the observed perturbation effects. Here, we describe the package nem, an open-source software to efficiently infer NEMs from data. Our software implements several search algorithms for model fitting and is applicable to a wide range of different data types and representations. The methods we present summarize the current state-of-the-art in NEMs. Our software is written in the R language and freely available via the Bioconductor project at http://www.bioconductor.org.

  13. CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design

    PubMed Central

    Rose, Timothy M.; Henikoff, Jorja G.; Henikoff, Steven

    2003-01-01

    We have developed a new primer design strategy for PCR amplification of distantly related gene sequences based on consensus-degenerate hybrid oligonucleotide primers (CODEHOPs). An interactive program has been written to design CODEHOP PCR primers from conserved blocks of amino acids within multiply-aligned protein sequences. Each CODEHOP consists of a pool of related primers containing all possible nucleotide sequences encoding 3–4 highly conserved amino acids within a 3′ degenerate core. A longer 5′ non-degenerate clamp region contains the most probable nucleotide predicted for each flanking codon. CODEHOPs are used in PCR amplification to isolate distantly related sequences encoding the conserved amino acid sequence. The primer design software and the CODEHOP PCR strategy have been utilized for the identification and characterization of new gene orthologs and paralogs in different plant, animal and bacterial species. In addition, this approach has been successful in identifying new pathogen species. The CODEHOP designer (http://blocks.fhcrc.org/codehop.html) is linked to BlockMaker and the Multiple Alignment Processor within the Blocks Database World Wide Web (http://blocks.fhcrc.org). PMID:12824413
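
    The size of a CODEHOP primer pool is driven by the degeneracy of its 3′ core, that is, by how many nucleotide sequences can encode the 3-4 conserved amino acids. A minimal sketch of that count using the standard genetic code is given below. It is not the CODEHOP program's algorithm, and the function names and example peptides are hypothetical.

```python
from itertools import product

# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS = {}
for triplet, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(triplet))

def core_degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    count = 1
    for aa in peptide.upper():
        count *= len(CODONS[aa])
    return count

def core_sequences(peptide):
    """Enumerate every nucleotide sequence encoding the peptide."""
    choices = [CODONS[aa] for aa in peptide.upper()]
    return ["".join(codons) for codons in product(*choices)]

print(core_degeneracy("WDMK"))  # 4: W and M have one codon each, D and K two each
print(core_degeneracy("LSRA"))  # 864: L, S and R have six codons each, A has four
print(core_sequences("MWD"))    # ['ATGTGGGAT', 'ATGTGGGAC']
```

    The contrast between the two four-residue examples suggests why conserved blocks rich in Trp, Met, Asp or Lys make more practical cores than blocks rich in Leu, Ser or Arg.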

  14. Formularity: Software for Automated Formula Assignment of Natural and Other Organic Matter from Ultrahigh-Resolution Mass Spectra.

    PubMed

    Tolić, Nikola; Liu, Yina; Liyu, Andrey; Shen, Yufeng; Tfaily, Malak M; Kujawinski, Elizabeth B; Longnecker, Krista; Kuo, Li-Jung; Robinson, Errol W; Paša-Tolić, Ljiljana; Hess, Nancy J

    2017-12-05

    Ultrahigh resolution mass spectrometry, such as Fourier transform ion cyclotron resonance mass spectrometry (FT ICR MS), can resolve thousands of molecular ions in complex organic matrices. A Compound Identification Algorithm (CIA) was previously developed for automated elemental formula assignment for natural organic matter (NOM). In this work, we describe software Formularity with a user-friendly interface for CIA function and newly developed search function Isotopic Pattern Algorithm (IPA). While CIA assigns elemental formulas for compounds containing C, H, O, N, S, and P, IPA is capable of assigning formulas for compounds containing other elements. We used halogenated organic compounds (HOC), a chemical class that is ubiquitous in nature as well as anthropogenic systems, as an example to demonstrate the capability of Formularity with IPA. A HOC standard mix was used to evaluate the identification confidence of IPA. Tap water and HOC spike in Suwannee River NOM were used to assess HOC identification in complex environmental samples. Strategies for reconciliation of CIA and IPA assignments were discussed. Software and sample databases with documentation are freely available.

  15. LIQUID: an-open source software for identifying lipids in LC-MS/MS-based lipidomics data.

    PubMed

    Kyle, Jennifer E; Crowell, Kevin L; Casey, Cameron P; Fujimoto, Grant M; Kim, Sangtae; Dautel, Sydney E; Smith, Richard D; Payne, Samuel H; Metz, Thomas O

    2017-06-01

    We introduce an open-source software, LIQUID, for semi-automated processing and visualization of LC-MS/MS-based lipidomics data. LIQUID provides users with the capability to process high-throughput data and contains a target library and scoring model that can be customized to project needs. The graphical user interface provides visualization of multiple lines of spectral evidence for each lipid identification, allowing rapid examination of data for making confident identifications of lipid molecular species. LIQUID was compared to other freely available software commonly used to identify lipids and other small molecules (e.g. CFM-ID, MetFrag, GNPS, LipidBlast and MS-DIAL), and was found to have a faster processing time to arrive at a higher number of validated lipid identifications. LIQUID is available at http://github.com/PNNL-Comp-Mass-Spec/LIQUID. Contact: jennifer.kyle@pnnl.gov or thomas.metz@pnnl.gov. Supplementary data are available at Bioinformatics online.

  16. Identification of new genes in a cell envelope-cell division gene cluster of Escherichia coli: cell envelope gene murG.

    PubMed Central

    Salmond, G P; Lutkenhaus, J F; Donachie, W D

    1980-01-01

    We report the identification, cloning, and mapping of a new cell envelope gene, murG. This lies in a group of five genes of similar phenotype (in the order murE murF murG murC ddl) all concerned with peptidoglycan biosynthesis. This group is in a larger cluster of at least 10 genes, all of which are involved in some way with cell envelope growth. PMID:6998962

  17. RAPTR-SV: a hybrid method for the detection of structural variants

    USDA-ARS's Scientific Manuscript database

    Motivation: Identification of Structural Variants (SV) in sequence data results in a large number of false positive calls using existing software, which overburdens subsequent validation. Results: Simulations using RAPTR-SV and another software package that uses a similar algorithm for SV detection...

  18. STRAP PTM: Software Tool for Rapid Annotation and Differential Comparison of Protein Post-Translational Modifications.

    PubMed

    Spencer, Jean L; Bhatia, Vivek N; Whelan, Stephen A; Costello, Catherine E; McComb, Mark E

    2013-12-01

    The identification of protein post-translational modifications (PTMs) is an increasingly important component of proteomics and biomarker discovery, but very few tools exist for performing fast and easy characterization of global PTM changes and differential comparison of PTMs across groups of data obtained from liquid chromatography-tandem mass spectrometry experiments. STRAP PTM (Software Tool for Rapid Annotation of Proteins: Post-Translational Modification edition) is a program that was developed to facilitate the characterization of PTMs using spectral counting and a novel scoring algorithm to accelerate the identification of differential PTMs from complex data sets. The software facilitates multi-sample comparison by collating, scoring, and ranking PTMs and by summarizing data visually. The freely available software (beta release) installs on a PC and processes data in protXML format obtained from files parsed through the Trans-Proteomic Pipeline. The easy-to-use interface allows examination of results at protein, peptide, and PTM levels, and the overall design offers tremendous flexibility that provides proteomics insight beyond simple assignment and counting.

  19. Computer applications making rapid advances in high throughput microbial proteomics (HTMP).

    PubMed

    Anandkumar, Balakrishna; Haga, Steve W; Wu, Hui-Fen

    2014-02-01

    The last few decades have seen the rise of widely available proteomics tools. From new data acquisition devices, such as MALDI-MS and 2DE, to new database searching software, these new products have paved the way for high throughput microbial proteomics (HTMP). These tools are enabling researchers to gain new insights into microbial metabolism, and are opening up new areas of study, such as protein-protein interactions (interactomics) discovery. Computer software is a key part of these emerging fields. This review considers: 1) software tools for identifying the proteome, such as MASCOT or PDQuest, 2) online databases of proteomes, such as SWISS-PROT, Proteome Web, or the Proteomics Facility of the Pathogen Functional Genomics Resource Center, and 3) software tools for applying proteomic data, such as PSI-BLAST or VESPA. These tools allow for research in network biology, protein identification, functional annotation, target identification/validation, protein expression, protein structural analysis, metabolic pathway engineering and drug discovery.

  20. A bioinformatic survey of RNA-binding proteins in Plasmodium.

    PubMed

    Reddy, B P Niranjan; Shrestha, Sony; Hart, Kevin J; Liang, Xiaoying; Kemirembe, Karen; Cui, Liwang; Lindner, Scott E

    2015-11-02

    The malaria parasites in the genus Plasmodium have a very complicated life cycle involving an invertebrate vector and a vertebrate host. RNA-binding proteins (RBPs) are critical factors involved in every aspect of the development of these parasites. However, very few RBPs have been functionally characterized to date in the human parasite Plasmodium falciparum. Using different bioinformatic methods and tools, we searched the P. falciparum genome to list and annotate RBPs. Representative 3D models for each RBD identified in P. falciparum were created using I-TASSER and SWISS-MODEL. Microarray and RNAseq data analysis pertaining to PfRBPs was performed using MeV software. Finally, Cytoscape was used to create protein-protein interaction networks for the CITH-Dozi and Caf1-CCR4-Not complexes. We report the identification of 189 putative RBP genes belonging to 13 different families in Plasmodium, which comprise 3.5% of all annotated genes. Almost 90% (169/189) of these genes belong to six prominent RBP classes, namely RNA recognition motifs, DEAD/H-box RNA helicases, K homology, Zinc finger, Puf and Alba gene families. Interestingly, almost all of the identified RNA-binding helicases and KH genes have cognate homologs in model species, suggesting their evolutionary conservation. Exploration of the existing P. falciparum blood-stage transcriptomes revealed that most RBPs have peak mRNA expression levels early during the intraerythrocytic development cycle, which taper off in later stages. Nearly 27% of RBPs have elevated expression in gametocytes, while 47 and 24% have elevated mRNA expression in ookinete and asexual stages, respectively. Comparative interactome analyses using human and Plasmodium protein-protein interaction datasets suggest extensive conservation of the PfCITH/PfDOZI and PfCaf1-CCR4-NOT complexes. The Plasmodium parasites possess a large number of putative RBPs belonging to most of the RBP families identified so far, suggesting the presence of extensive post-transcriptional regulation in these parasites. Taken together, in silico identification of these putative RBPs provides a foundation for future functional studies aimed at defining a unique network of post-transcriptional regulation in P. falciparum.

  1. Polymerase Chain Reaction (PCR)-based methods for detection and identification of mycotoxigenic Penicillium species using conserved genes

    USDA-ARS's Scientific Manuscript database

    Polymerase chain reaction amplification of conserved genes and sequence analysis provides a very powerful tool for the identification of toxigenic as well as non-toxigenic Penicillium species. Sequences are obtained by amplification of the gene fragment, sequencing via capillary electrophoresis of d...

  2. Perceptron ensemble of graph-based positive-unlabeled learning for disease gene identification.

    PubMed

    Jowkar, Gholam-Hossein; Mansoori, Eghbal G

    2016-10-01

    Identification of disease genes using computational methods is an important issue in biomedical and bioinformatics research. Based on the observation that diseases with the same or similar phenotype share biological characteristics, researchers have tried to identify genes by using machine learning tools. In recent attempts, semi-supervised learning methods called positive-unlabeled learning have been used for disease gene identification. In this paper, we present a Perceptron ensemble of graph-based positive-unlabeled learning (PEGPUL) on three types of biological attributes: gene ontologies, protein domains and protein-protein interaction networks. In our method, a reliable set of positive and negative genes is extracted using a co-training schema. Then, the similarity graph of genes is built using metric learning by concentrating on the multi-rank-walk method to perform inference from labeled genes. Finally, a Perceptron ensemble is learned from three weighted classifiers: multilevel support vector machine, k-nearest neighbor and decision tree. The main contributions of this paper are: (i) incorporating the statistical properties of gene data through choosing proper metrics, (ii) statistical evaluation of biological features, and (iii) the noise robustness of PEGPUL, achieved by using a multilevel schema. In order to assess PEGPUL, we have applied it to 12950 disease genes, comprising 949 positive genes from six classes of diseases and 12001 unlabeled genes. Compared with some popular disease gene identification methods, the experimental results show that PEGPUL has reasonable performance. Copyright © 2016 Elsevier Ltd. All rights reserved.
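
    The final step, learning a perceptron over the outputs of three base classifiers, can be sketched as follows. This is only an illustration under stated assumptions, not the PEGPUL implementation: the base classifiers are replaced by made-up ±1 votes, and the function names, learning rate and epoch count are hypothetical.

```python
def train_perceptron(rows, labels, epochs=20, lr=1.0):
    """Classic perceptron rule: update weights only on misclassified rows."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if score > 0 else -1
            if pred != y:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def ensemble_predict(w, b, votes):
    """Combine the three base-classifier votes into one +1/-1 decision."""
    return 1 if sum(wi * vi for wi, vi in zip(w, votes)) + b > 0 else -1


# Each row holds made-up votes (+1 disease gene, -1 not) standing in for
# the SVM, k-NN and decision-tree outputs on one gene.
VOTES = [(1, 1, 1), (1, 1, -1), (-1, -1, 1), (-1, -1, -1), (1, -1, 1), (-1, 1, -1)]
LABELS = [1, 1, -1, -1, 1, -1]

if __name__ == "__main__":
    w, b = train_perceptron(VOTES, LABELS)
    print([ensemble_predict(w, b, v) for v in VOTES])
```

    The learned weights express how much each base classifier is trusted; on linearly separable votes such as this toy set the perceptron rule is guaranteed to converge.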

  3. A systems biology approach to detect key pathways and interaction networks in gastric cancer on the basis of microarray analysis.

    PubMed

    Guo, Leilei; Song, Chunhua; Wang, Peng; Dai, Liping; Zhang, Jianying; Wang, Kaijuan

    2015-11-01

    The aim of the present study was to explore key molecular pathways contributing to gastric cancer (GC) and to construct an interaction network between significant pathways and potential biomarkers. Publicly available gene expression profiles of GSE29272 for GC, and data for the corresponding normal tissue, were downloaded from Gene Expression Omnibus. Pre‑processing and differential analysis were performed with R statistical software packages, and a number of differentially expressed genes (DEGs) were obtained. A functional enrichment analysis was performed for all the DEGs with a BiNGO plug‑in in Cytoscape. Their correlation was analyzed in order to construct a network. The modularity analysis and pathway identification operations were used to identify graph clusters and associated pathways. The underlying molecular mechanisms involving these DEGs were also assessed by data mining. A total of 249 DEGs, which were markedly upregulated and downregulated, were identified. The extracellular region contained the most significantly over‑represented functional terms, with respect to upregulated and downregulated genes, and the closest topological matches were identified for taste transduction and regulation of autophagy. In addition, extracellular matrix‑receptor interactions were identified as the most relevant pathway associated with the progression of GC. The genes for fibronectin 1, secreted phosphoprotein 1, collagen type 4 variant α‑1/2 and thrombospondin 1, which are involved in the pathways, may be considered as potential therapeutic targets for GC. A series of associations between candidate genes and key pathways were also identified for GC, and their correlation may provide novel insights into the pathogenesis of GC.

  4. Rapid identification of fungal pathogens in BacT/ALERT, BACTEC, and BBL MGIT media using polymerase chain reaction and DNA sequencing of the internal transcribed spacer regions.

    PubMed

    Pryce, Todd M; Palladino, Silvano; Price, Diane M; Gardam, Dianne J; Campbell, Peter B; Christiansen, Keryn J; Murray, Ronan J

    2006-04-01

    We report a direct polymerase chain reaction/sequence (d-PCRS)-based method for the rapid identification of clinically significant fungi from 5 different types of commercial broth enrichment media inoculated with clinical specimens. Media including BacT/ALERT FA (BioMérieux, Marcy l'Etoile, France) (n = 87), BACTEC Plus Aerobic/F (Becton Dickinson, Microbiology Systems, Sparks, MD) (n = 16), BACTEC Peds Plus/F (Becton Dickinson) (n = 15), BACTEC Lytic/10 Anaerobic/F (Becton Dickinson) (n = 11) bottles, and BBL MGIT (Becton Dickinson) (n = 11) were inoculated with specimens from 138 patients. A universal DNA extraction method was used combining a novel pretreatment step to remove PCR inhibitors with a column-based DNA extraction kit. Target sequences in the noncoding internal transcribed spacer regions of the rRNA gene were amplified by PCR and sequenced using a rapid (24 h) automated capillary electrophoresis system. Using sequence alignment software, fungi were identified by sequence similarity with sequences derived from isolates identified by upper-level reference laboratories or isolates defined as ex-type strains. We identified Candida albicans (n = 14), Candida parapsilosis (n = 8), Candida glabrata (n = 7), Candida krusei (n = 2), Scedosporium prolificans (n = 4), and 1 each of Candida orthopsilosis, Candida dubliniensis, Candida kefyr, Candida tropicalis, Candida guilliermondii, Saccharomyces cerevisiae, Cryptococcus neoformans, Aspergillus fumigatus, Histoplasma capsulatum, and Malassezia pachydermatis by d-PCRS analysis. All d-PCRS identifications from positive broths were in agreement with the final species identification of the isolates grown from subculture. Earlier identification of fungi using d-PCRS may facilitate prompt and more appropriate antifungal therapy.

  5. Identification of key microRNAs and genes in preeclampsia by bioinformatics analysis

    PubMed Central

    Luo, Shouling; Cao, Nannan; Tang, Yao; Gu, Weirong

    2017-01-01

    Preeclampsia is a leading cause of perinatal maternal–foetal mortality and morbidity. The aim of this study is to identify the key microRNAs and genes in preeclampsia and uncover their potential functions. We downloaded the miRNA expression profile of GSE84260 and the gene expression profile of GSE73374 from the Gene Expression Omnibus database. Differentially expressed miRNAs and genes were identified and compared to miRNA-target information from MiRWalk 2.0, and a total of 65 differentially expressed miRNAs (DEMIs), including 32 up-regulated miRNAs and 33 down-regulated miRNAs, and 91 differentially expressed genes (DEGs), including 83 up-regulated genes and 8 down-regulated genes, were identified. The pathway enrichment analyses of the DEMIs showed that the up-regulated DEMIs were enriched in the Hippo signalling pathway and MAPK signalling pathway, and the down-regulated DEMIs were enriched in HTLV-I infection and miRNAs in cancers. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) enrichment analyses of the DEGs were performed using Multifaceted Analysis Tool for Human Transcriptome. The up-regulated DEGs were enriched in biological processes (BPs), including the response to cAMP, response to hydrogen peroxide and cell-cell adhesion mediated by integrin; no enrichment of down-regulated DEGs was identified. KEGG analysis showed that the up-regulated DEGs were enriched in the Hippo signalling pathway and pathways in cancer. A PPI network of the DEGs was constructed by using Cytoscape software, and FOS, STAT1, MMP14, ITGB1, VCAN, DUSP1, LDHA, MCL1, MET, and ZFP36 were identified as the hub genes. The current study illustrates a characteristic microRNA profile and gene profile in preeclampsia, which may contribute to the interpretation of the progression of preeclampsia and provide novel biomarkers and therapeutic targets for preeclampsia. PMID:28594854

  6. Track-A-Worm, An Open-Source System for Quantitative Assessment of C. elegans Locomotory and Bending Behavior

    PubMed Central

    Wang, Sijie Jason; Wang, Zhao-Wen

    2013-01-01

    A major challenge of neuroscience is to understand the circuit and gene bases of behavior. C. elegans is commonly used as a model system to investigate how various gene products function at specific tissue, cellular, and synaptic foci to produce complicated locomotory and bending behavior. The investigation generally requires quantitative behavioral analyses using an automated single-worm tracker, which constantly records and analyzes the position and body shape of a freely moving worm at a high magnification. Many single-worm trackers have been developed to meet lab-specific needs, but none has been widely implemented for various reasons, such as hardware difficult to assemble, and software lacking sufficient functionality, having closed source code, or using a programming language that is not broadly accessible. The lack of a versatile system convenient for wide implementation makes data comparisons difficult and compels other labs to develop new worm trackers. Here we describe Track-A-Worm, a system rich in functionality, open in source code, and easy to use. The system includes plug-and-play hardware (a stereomicroscope, a digital camera and a motorized stage), custom software written to run with Matlab in Windows 7, and a detailed user manual. Grayscale images are automatically converted to binary images followed by head identification and placement of 13 markers along a deduced spline. The software can extract and quantify a variety of parameters, including distance traveled, average speed, distance/time/speed of forward and backward locomotion, frequency and amplitude of dominant bends, overall bending activities measured as root mean square, and sum of all bends. It also plots worm travel path, bend trace, and bend frequency spectrum. All functionality is performed through graphical user interfaces and data is exported to clearly-annotated and documented Excel files. These features make Track-A-Worm a good candidate for implementation in other labs. PMID:23922769
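
    Two of the steps named above, grayscale-to-binary conversion and the root-mean-square summary of bending, are simple enough to sketch. The snippet is a Python illustration only (the actual system is written for Matlab); the fixed threshold, the function names and the sample bend trace are assumptions, and "sum of all bends" is interpreted here as the sum of absolute bend angles.

```python
import math


def binarize(image, threshold):
    """Convert a grayscale image (rows of 0-255 values) to 0/1 pixels."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def bend_rms(angles):
    """Root mean square of a bend-angle trace (same units as the input)."""
    return math.sqrt(sum(a * a for a in angles) / len(angles))


def bend_sum(angles):
    """Sum of absolute bend angles over the trace."""
    return sum(abs(a) for a in angles)


if __name__ == "__main__":
    frame = [[0, 200], [128, 50]]          # tiny made-up grayscale frame
    print(binarize(frame, threshold=128))
    trace = [3.0, -4.0]                    # made-up bend angles in degrees
    print(round(bend_rms(trace), 4), bend_sum(trace))
```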

  7. JavaGenes Molecular Evolution

    NASA Technical Reports Server (NTRS)

    Lohn, Jason; Smith, David; Frank, Jeremy; Globus, Al; Crawford, James

    2007-01-01

    JavaGenes is a general-purpose, evolutionary software system written in Java. It implements several versions of a genetic algorithm, simulated annealing, stochastic hill climbing, and other search techniques. This software has been used to evolve molecules, atomic force field parameters, digital circuits, Earth Observing Satellite schedules, and antennas. This version differs from version 0.7.28 in that it includes the molecule evolution code and other improvements. Except for the antenna code, JavaGenes is available for NASA Open Source distribution.

  8. The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison

    PubMed Central

    Sioson, Allan A; Mane, Shrinivasrao P; Li, Pinghua; Sha, Wei; Heath, Lenwood S; Bohnert, Hans J; Grene, Ruth

    2006-01-01

    Background: Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. Results: The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. Conclusion: The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. The results of Expresso using MIV data have the best agreement with qRT-PCR results. In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity. PMID:16626497

  9. Genome-Wide Discovery of Long Non-Coding RNAs in Rainbow Trout.

    PubMed

    Al-Tobasei, Rafet; Paneru, Bam; Salem, Mohamed

    2016-01-01

    The ENCODE project revealed that ~70% of the human genome is transcribed. While only 1-2% of the RNAs encode proteins, the rest are non-coding RNAs. Long non-coding RNAs (lncRNAs) form a diverse class of non-coding RNAs that are longer than 200 nt. Emerging evidence indicates that lncRNAs play critical roles in various cellular processes including regulation of gene expression. LncRNAs show low levels of gene expression and sequence conservation, which make their computational identification in genomes difficult. In this study, more than two billion Illumina sequence reads were mapped to the genome reference using the TopHat and Cufflinks software. Transcripts shorter than 200 nt, with an ORF longer than 83-100 amino acids, or with significant homologies to the NCBI nr-protein database were removed. In addition, a computational pipeline was used to filter the remaining transcripts based on a protein-coding-score test. Depending on the filtering stringency conditions, between 31,195 and 54,503 lncRNAs were identified, with only 421 matching known lncRNAs in other species. A digital gene expression atlas revealed 2,935 tissue-specific and 3,269 ubiquitously-expressed lncRNAs. This study annotates lncRNAs in the rainbow trout genome and provides a valuable resource for functional genomics research in salmonids.
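
    The first two filters (transcript length and ORF size) can be sketched in a few lines. This is a simplified illustration, not the study's pipeline: only the forward strand is scanned, an ORF is taken to run from an ATG to the next in-frame stop or the end of the transcript, the 100-residue cutoff is one end of the quoted 83-100 range, and the homology and coding-score steps are omitted. All function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq):
    """Length in amino acids of the longest forward-strand ORF (ATG..stop)."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        length = None  # None means we are not inside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if length is None:
                if codon == "ATG":
                    length = 1
            elif codon in STOP_CODONS:
                best = max(best, length)
                length = None
            else:
                length += 1
        if length is not None:  # ORF still open at the transcript end
            best = max(best, length)
    return best


def is_candidate_lncrna(seq, min_len=200, max_orf_aa=100):
    """Keep transcripts of at least min_len nt whose longest ORF is short."""
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa


if __name__ == "__main__":
    coding_like = "ATG" + "GCT" * 120 + "TAA"
    print(is_candidate_lncrna("A" * 250), is_candidate_lncrna(coding_like))
```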

  10. De Novo Transcriptome Sequencing Analysis of cDNA Library and Large-Scale Unigene Assembly in Japanese Red Pine (Pinus densiflora)

    PubMed Central

    Liu, Le; Zhang, Shijie; Lian, Chunlan

    2015-01-01

    Japanese red pine (Pinus densiflora) is extensively cultivated in Japan, Korea, China, and Russia and is harvested for timber, pulpwood, garden, and paper markets. However, genetic information and molecular markers have been very scarce for this species. In this study, over 51 million clean sequencing reads from P. densiflora mRNA were produced using Illumina paired-end sequencing technology. Assembly yielded 83,913 unigenes with a mean length of 751 bp, of which 54,530 (64.98%) showed similarity to sequences in the NCBI database. Among these, the best matches in the NCBI Nr database were Picea sitchensis (41.60%), Amborella trichopoda (9.83%), and Pinus taeda (4.15%). A total of 1953 putative microsatellites were identified in 1784 unigenes using MISA (MicroSAtellite) software, of which the tri-nucleotide repeats were most abundant (50.18%), and 629 EST-SSR (expressed sequence tag-simple sequence repeat) primer pairs were successfully designed. Among 20 EST-SSR primer pairs randomly chosen, 17 markers yielded amplification products of the expected size in P. densiflora. Our results will provide a valuable resource for gene-function analysis, germplasm identification, molecular marker-assisted breeding and resistance-related gene(s) mapping for P. densiflora. PMID:26690126
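
    Microsatellite screening of this kind amounts to finding short motifs repeated in tandem. The sketch below uses regular-expression backreferences for that purpose; it is not MISA, and the minimum repeat counts are assumed values, since the abstract does not state the settings used.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are themselves repeats of a shorter motif,
            # e.g. "ATAT" is reported as the dinucleotide "AT" instead.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCT" + "AAG" * 5 + "T"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

    Primer design around each hit, as done for the 629 EST-SSR pairs, would be a separate step and is not shown.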

  11. Validation of Vitek 2 Nonfermenting Gram-Negative Cards and Vitek 2 Version 4.02 Software for Identification and Antimicrobial Susceptibility Testing of Nonfermenting Gram-Negative Rods from Patients with Cystic Fibrosis

    PubMed Central

    Otto-Karg, Ines; Jandl, Stefanie; Müller, Tobias; Stirzel, Beate; Frosch, Matthias; Hebestreit, Helge; Abele-Horn, Marianne

    2009-01-01

    Accurate identification and antimicrobial susceptibility testing (AST) of nonfermenters from cystic fibrosis patients are essential for appropriate antimicrobial treatment. This study examined the ability of the newly designed Vitek 2 nonfermenting gram-negative card (NGNC) (new gram-negative identification card; bioMérieux, Marcy-l'Ètoile, France) to identify nonfermenting gram-negative rods from cystic fibrosis patients in comparison to reference methods and the accuracy of the new Vitek 2 version 4.02 software for AST compared to the broth microdilution method. Two hundred twenty-four strains for identification and 138 strains for AST were investigated. The Vitek 2 NGNC identified 211 (94.1%) of the nonfermenters correctly. Among morphologically atypical microorganisms, five strains were misidentified and eight strains were determined with low discrimination, requiring additional tests which raised the correct identification rate to 97.8%. Regarding AST, the overall essential agreement of Vitek 2 was 97.6%, and the overall categorical agreement was 92.9%. Minor errors were found in 5.1% of strains, and major and very major errors were found in 1.6% and 0.3% of strains, respectively. In conclusion, the Vitek NGNC appears to be a reliable method for identification of morphologically typical nonfermenters and is an improvement over the API NE system and the Vitek 2 GNC database version 4.01. However, classification in morphologically atypical nonfermenters must be interpreted with care to avoid misidentification. Moreover, the new Vitek 2 version 4.02 software showed good results for AST and is suitable for routine clinical use. More work is needed for the reliable testing of strains whose MICs are close to the breakpoints. PMID:19710272

  12. Ontology-based specification, identification and analysis of perioperative risks.

    PubMed

    Uciteli, Alexandr; Neumann, Juliane; Tahar, Kais; Saleh, Kutaiba; Stucke, Stephan; Faulbrück-Röhr, Sebastian; Kaeding, André; Specht, Martin; Schmidt, Tobias; Neumuth, Thomas; Besting, Andreas; Stegemann, Dominik; Portheine, Frank; Herre, Heinrich

    2017-09-06

    Medical personnel in hospitals often work under great physical and mental strain. In medical decision-making, errors can never be completely ruled out. Several studies have shown that between 50 and 60% of adverse events could have been avoided through better organization, more attention or more effective security procedures. Critical situations especially arise during interdisciplinary collaboration and the use of complex medical technology, for example during surgical interventions and in perioperative settings (the period of time before, during and after surgical intervention). In this paper, we present an ontology and an ontology-based software system, which can identify risks across medical processes and supports the avoidance of errors, in particular in the perioperative setting. We developed a practicable definition of the risk notion, which is easily understandable by the medical staff and is usable for the software tools. Based on this definition, we developed a Risk Identification Ontology (RIO) and used it for the specification and the identification of perioperative risks. An agent system was developed, which gathers risk-relevant data during the whole perioperative treatment process from various sources and provides it for risk identification and analysis in a centralized fashion. The results of such an analysis are provided to the medical personnel in the form of context-sensitive hints and alerts. For the identification of the ontologically specified risks, we developed an ontology-based software module, called Ontology-based Risk Detector (OntoRiDe). About 20 risks relating to cochlear implantation (CI) have already been implemented. Comprehensive testing has indicated the correctness of the data acquisition, risk identification and analysis components, as well as the web-based visualization of results.

  13. Using Malware Analysis to Tailor SQUARE for Mobile Platforms

    DTIC Science & Technology

    2014-11-01

    identification data (SIM card and International Mobile Station Equipment Identity Number [IMEI]) to duplicate the phone in another device so that it can...applications. Key logging software can be used to steal passwords for financial websites and credit card information [Sophos 2014]. Data theft...for consumption. Apple provides a limited set of APIs and provides the iTunes store as the only avenue to install new software. All software

  14. For operation of the Computer Software Management and Information Center (COSMIC)

    NASA Technical Reports Server (NTRS)

    Carmon, J. L.

    1983-01-01

    During the month of June, the Survey Research Center (SRC) at the University of Georgia designed new benefits questionnaires for computer software management and information center (COSMIC). As a test of their utility, these questionnaires are now used in the benefits identification process.

  15. Using genetic markers to orient the edges in quantitative trait networks: the NEO software.

    PubMed

    Aten, Jason E; Fuller, Tova F; Lusis, Aldons J; Horvath, Steve

    2008-04-15

    Systems genetic studies have been used to identify genetic loci that affect transcript abundances and clinical traits such as body weight. The pairwise correlations between gene expression traits and/or clinical traits can be used to define undirected trait networks. Several authors have argued that genetic markers (e.g., expression quantitative trait loci, eQTLs) can serve as causal anchors for orienting the edges of a trait network. The availability of hundreds of thousands of genetic markers poses new challenges: how to relate (anchor) traits to multiple genetic markers, how to score the genetic evidence in favor of an edge orientation, and how to weigh the information from multiple markers. We develop and implement Network Edge Orienting (NEO) methods and software that address the challenges of inferring unconfounded and directed gene networks from microarray-derived gene expression data by integrating mRNA levels with genetic marker data and Structural Equation Model (SEM) comparisons. The NEO software implements several manual and automatic methods for incorporating genetic information to anchor traits. The networks are oriented by considering each edge separately, thus reducing error propagation. To summarize the genetic evidence in favor of a given edge orientation, we propose Local SEM-based Edge Orienting (LEO) scores that compare the fit of several competing causal graphs. SEM fitting indices allow the user to assess local and overall model fit. The NEO software allows the user to carry out a robustness analysis with regard to genetic marker selection. We demonstrate the utility of NEO by recovering known causal relationships in the sterol homeostasis pathway using liver gene expression data from an F2 mouse cross. Further, we use NEO to study the relationship between a disease gene and a biologically important gene co-expression module in liver tissue. The NEO software can be used to orient the edges of gene co-expression networks or quantitative trait networks if the edges can be anchored to genetic marker data. R software tutorials, data, and supplementary material can be downloaded from: http://www.genetics.ucla.edu/labs/horvath/aten/NEO.

  16. Identification of a Maize Locus that Modulates the Hypersensitive Defense Response, Using Mutant-Assisted Gene Identification and Characterization (MAGIC)

    USDA-ARS's Scientific Manuscript database

    The hypersensitive response (HR) is the most visible and arguably the most important defense response in plants, although the details of how it is controlled and executed remain patchy. In this paper a novel genetic technique called MAGIC (Mutant-Assisted Gene Identification and Characterization) i...

  17. Phisherman v 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Phisherman is an online software tool that was created to help experimenters study phishing. It can potentially be re-purposed to run other human studies. Phisherman enables studies to be run online, so that users can participate from their own computers. This means that experimenters can get data from subjects in their natural settings. Alternatively, an experimenter can also run the app online in a lab-based setting, if that is desired. The software enables the online deployment of a study that is comprised of three main parts: (1) a consent page, (2) a survey, and (3) an identification task, with instruction/transition screens between each part, allowing the experimenter to provide the user with instructions and messages. Upon logging in, the subject is taken to the permission page, where they agree to or do not agree to take part in the study. If the subject agrees to participate, then the software randomly chooses between doing the survey first (and identification task second) or the identification task first (and survey second). This is to balance possible order effects in the data. Procedurally, in the identification task, the software shows the stimuli to the subject, and asks if she thinks it is a phish (yes/no) and how confident she is about her answer. The subject is given 5 levels of certainty to select from, labeled "low" (1), to "medium" (3), to "high" (5), with the option of picking a level between low and medium (2), and between medium and high (4). After the subject selects a confidence level, the "Next" button activates, allowing the user to move to the next email. The software saves a given subject's progress in the identification task, so that she may log in and out of the site. The consent page is a space for the experimenter to provide the subject with human studies board/internal review board information, and for the subject to formally consent to participate in the study.
The survey is a space for the experimenter to provide questions and spaces for the users to input answers (allowing both multiple-choice and free-answer options). Phisherman includes administrative pages for managing the stimuli and users. This includes a tool for the experimenter to create, preview, edit, delete (if desired), and manage stimuli (emails). The stimuli may include pictures (uploaded to an appropriate folder) and links, for realism. The software includes a safety feature that prevents the user from going to any link location or opening a file/image. Instead of re-directing the subject's browser, the software provides a pop-up box with the URL location of where the user would have gone. Another administrative page may be used to create fake subject accounts for testing the software prior to deployment, as well as to delete subject accounts when necessary. Data from the experiment can be downloaded from another administrative page.
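
    The counterbalancing and confidence-scale logic described in this record can be sketched as follows. This is a minimal illustration in Python, not Phisherman's actual code: the function names, the dictionary layout, and the validation rule are all assumptions made for the example.

```python
import random

# Hypothetical names throughout; this is NOT Phisherman's actual code.
# Only levels 1, 3 and 5 carry labels; 2 and 4 are the in-between levels.
CONFIDENCE_LABELS = {1: "low", 3: "medium", 5: "high"}


def assign_order(rng):
    """Return the study parts in order: consent first, then the survey and
    the identification task in a randomly chosen order (to balance order effects)."""
    middle = ["survey", "identification_task"]
    rng.shuffle(middle)
    return ["consent"] + middle


def record_response(is_phish, confidence):
    """Validate one identification-task answer (yes/no plus a 1-5 confidence level)."""
    if confidence not in range(1, 6):
        raise ValueError("confidence must be an integer from 1 to 5")
    return {"is_phish": bool(is_phish), "confidence": confidence}
```

    Passing a seeded `random.Random` instance to `assign_order` makes the assignment reproducible for testing, while an unseeded instance gives each subject an independent random order.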

  18. Identification and expression of the WRKY transcription factors of Carica papaya in response to abiotic and biotic stresses.

    PubMed

    Pan, Lin-Jie; Jiang, Ling

    2014-03-01

    The WRKY transcription factor (TF) plays a very important role in the response of plants to various abiotic and biotic stresses. A local papaya database was built according to the GenBank expressed sequence tag database using the BioEdit software. Fifty-two coding sequences of Carica papaya WRKY TFs were predicted using the tBLASTn tool. The phylogenetic tree of the WRKY proteins was classified. The expression profiles of 13 selected C. papaya WRKY TF genes under stress induction were constructed by quantitative real-time polymerase chain reaction. The expression levels of these WRKY genes in response to 3 abiotic and 2 biotic stresses were evaluated. TF807.3 and TF72.14 are upregulated by low temperature; TF807.3, TF43.76, TF12.199 and TF12.62 are involved in the response to drought stress; TF9.35, TF18.51, TF72.14 and TF12.199 are involved in the response to wounding; TF12.199, TF807.3, TF21.156 and TF18.51 were induced by the PRSV pathogen; TF72.14 and TF43.76 are upregulated by SA. The regulated expression levels of the above eight genes, normalized against the housekeeping gene actin, were significant at the 0.01 probability level. These WRKY TFs could be related to corresponding stress resistance and selected as the candidate genes, especially, the two genes TF807.3 and TF12.199, which were regulated notably by four stresses respectively. This study may provide useful information and candidate genes for the development of transgenic stress tolerant papaya varieties.

  19. Identification of Differentially Expressed K-Ras Transcript Variants in Patients With Leiomyoma.

    PubMed

    Zolfaghari, Nooshin; Shahbazi, Shirin; Torfeh, Mahnaz; Khorasani, Maryam; Hashemi, Mehrdad; Mahdian, Reza

    2017-10-01

    Molecular studies have demonstrated a wide range of gene expression variations in uterine leiomyoma. The rat sarcoma virus/rapidly accelerated fibrosarcoma/mitogen-activated protein kinase (RAS/RAF/MAPK) pathway is the crucial cellular pathway in transmitting external signals into the nucleus. Deregulation of this pathway contributes to excessive cell proliferation and tumorigenesis. The present study aims to investigate the expression profile of the K-Ras transcripts in tissue samples from patients with leiomyoma. The patients were leiomyoma cases who had no mutation in the mediator complex subunit 12 (MED12) gene. A quantitative approach has been applied to determine the difference in the expression of the 2 main K-Ras messenger RNA (mRNA) variants. The comparison between gene expression levels in the leiomyoma and normal myometrium groups was performed using the relative expression software tool. The expression of the K-Ras4B gene was upregulated in the leiomyoma group (P = .016), suggesting the involvement of K-Ras4B in the disease pathogenesis. Pairwise comparison of the K-Ras4B expression between each leiomyoma tissue and its matched adjacent normal myometrium revealed gene upregulation in 68% of the cases. The expression of K-Ras4A mRNA was relatively upregulated in the leiomyoma group (P = .030). In addition, the mean expression of the K-Ras4A gene in leiomyoma tissues relative to normal samples was 4.475 (95% confidence interval: 0.10-20.42; standard error: 0.53-12.67). In total, 58% of the cases showed more than 2-fold increase in K-Ras4A gene expression. Our results demonstrated increased expression of both K-Ras mRNA splicing variants in leiomyoma tissue. However, the ultimate result of KRAS expression on leiomyoma development depends on the overall KRAS isoform balance and, consequently, on activated signaling pathways.

  20. System IDentification Programs for AirCraft (SIDPAC)

    NASA Technical Reports Server (NTRS)

    Morelli, Eugene A.

    2002-01-01

    A collection of computer programs for aircraft system identification is described and demonstrated. The programs, collectively called System IDentification Programs for AirCraft, or SIDPAC, were developed in MATLAB as m-file functions. SIDPAC has been used successfully at NASA Langley Research Center with data from many different flight test programs and wind tunnel experiments. SIDPAC includes routines for experiment design, data conditioning, data compatibility analysis, model structure determination, equation-error and output-error parameter estimation in both the time and frequency domains, real-time and recursive parameter estimation, low order equivalent system identification, estimated parameter error calculation, linear and nonlinear simulation, plotting, and 3-D visualization. An overview of SIDPAC capabilities is provided, along with a demonstration of the use of SIDPAC with real flight test data from the NASA Glenn Twin Otter aircraft. The SIDPAC software is available without charge to U.S. citizens by request to the author, contingent on the requestor completing a NASA software usage agreement.

  1. Development of a computer-assisted forensic radiographic identification method using the lateral cervical and lumbar spine.

    PubMed

    Derrick, Sharon M; Raxter, Michelle H; Hipp, John A; Goel, Priya; Chan, Elaine F; Love, Jennifer C; Wiersema, Jason M; Akella, N Shastry

    2015-01-01

    Medical examiners and coroners (ME/C) in the United States hold statutory responsibility to identify deceased individuals who fall under their jurisdiction. The computer-assisted decedent identification (CADI) project was designed to modify software used in diagnosis and treatment of spinal injuries into a mathematically validated tool for ME/C identification of fleshed decedents. CADI software analyzes the shapes of targeted vertebral bodies imaged in an array of standard radiographs and quantifies the likelihood that any two of the radiographs contain matching vertebral bodies. Six validation tests measured the repeatability, reliability, and sensitivity of the method, and the effects of age, sex, and number of radiographs in array composition. CADI returned a 92-100% success rate in identifying the true matching pair of vertebrae within arrays of five to 30 radiographs. Further development of CADI is expected to produce a novel identification method for use in ME/C offices that is reliable, timely, and cost-effective. © 2014 American Academy of Forensic Sciences.

  2. GEAR: genomic enrichment analysis of regional DNA copy number changes.

    PubMed

    Kim, Tae-Min; Jung, Yu-Chae; Rhyu, Mun-Gan; Jung, Myeong Ho; Chung, Yeun-Jun

    2008-02-01

    We developed an algorithm named GEAR (genomic enrichment analysis of regional DNA copy number changes) for functional interpretation of genome-wide DNA copy number changes identified by array-based comparative genomic hybridization. GEAR selects two types of chromosomal alterations with potential biological relevance, i.e. recurrent and phenotype-specific alterations. Then it performs functional enrichment analysis using a priori selected functional gene sets to identify primary and clinical genomic signatures. The genomic signatures identified by GEAR represent functionally coordinated genomic changes, which can provide clues on the underlying molecular mechanisms related to the phenotypes of interest. GEAR can help identify key molecular functions that are activated or repressed in tumor genomes, leading to an improved understanding of tumor biology. GEAR software is available with an online manual at http://www.systemsbiology.co.kr/GEAR/.
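
    The abstract does not say which statistic GEAR uses for its functional enrichment step. As an illustration only, a common choice for this kind of gene-set enrichment is a one-sided hypergeometric test, sketched below in Python; the function name and parameters are hypothetical and are not taken from the GEAR software.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, set_genes, altered_genes, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    Assumed setup (not GEAR's documented method): `altered_genes` genes lie in
    the selected chromosomal alterations, drawn without replacement from
    `total_genes` genes on the array, of which `set_genes` belong to the
    functional gene set; `overlap` of the altered genes are in that set.
    """
    max_overlap = min(set_genes, altered_genes)
    numerator = sum(
        comb(set_genes, k) * comb(total_genes - set_genes, altered_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return numerator / comb(total_genes, altered_genes)
```

    A small p-value would suggest that the gene set is over-represented among the altered genes; in practice such p-values are usually corrected for the number of gene sets tested.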

  3. 48 CFR 252.227-7017 - Identification and assertion of use, release, or disclosure restrictions.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... and Computer Software—Small Business Innovation Research (SBIR) Program clause. (2) If a successful offeror will not be required to deliver technical data, the Rights in Noncommercial Computer Software and Noncommercial Computer Software Documentation clause, or, if this solicitation contemplates a contract under the...

  4. 48 CFR 252.227-7017 - Identification and assertion of use, release, or disclosure restrictions.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... and Computer Software—Small Business Innovation Research (SBIR) Program clause. (2) If a successful offeror will not be required to deliver technical data, the Rights in Noncommercial Computer Software and Noncommercial Computer Software Documentation clause, or, if this solicitation contemplates a contract under the...

  5. 48 CFR 252.227-7017 - Identification and assertion of use, release, or disclosure restrictions.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... and Computer Software—Small Business Innovative Research (SBIR) Program clause. (2) If a successful offeror will not be required to deliver technical data, the Rights in Noncommercial Computer Software and Noncommercial Computer Software Documentation clause, or, if this solicitation contemplates a contract under the...

  6. 48 CFR 252.227-7017 - Identification and assertion of use, release, or disclosure restrictions.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... and Computer Software—Small Business Innovation Research (SBIR) Program clause. (2) If a successful offeror will not be required to deliver technical data, the Rights in Noncommercial Computer Software and Noncommercial Computer Software Documentation clause, or, if this solicitation contemplates a contract under the...

  7. 48 CFR 252.227-7017 - Identification and assertion of use, release, or disclosure restrictions.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... and Computer Software—Small Business Innovation Research (SBIR) Program clause. (2) If a successful offeror will not be required to deliver technical data, the Rights in Noncommercial Computer Software and Noncommercial Computer Software Documentation clause, or, if this solicitation contemplates a contract under the...

  8. Fastidious Gram-Negatives: Identification by the Vitek 2 Neisseria-Haemophilus Card and by Partial 16S rRNA Gene Sequencing Analysis.

    PubMed

    Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita

    2010-12-31

    Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave 'low discrimination', seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification.

  9. Fastidious Gram-Negatives: Identification by the Vitek 2 Neisseria-Haemophilus Card and by Partial 16S rRNA Gene Sequencing Analysis

    PubMed Central

    Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita

    2010-01-01

    Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave 'low discrimination', seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification. PMID:21347215

  10. BeeSpace Navigator: exploratory analysis of gene function using semantic indexing of biological literature.

    PubMed

    Sen Sarma, Moushumi; Arcoleo, David; Khetani, Radhika S; Chee, Brant; Ling, Xu; He, Xin; Jiang, Jing; Mei, Qiaozhu; Zhai, ChengXiang; Schatz, Bruce

    2011-07-01

    With the rapid decrease in cost of genome sequencing, the classification of gene function is becoming a primary problem. Such classification has been performed by human curators who read biological literature to extract evidence. BeeSpace Navigator is a prototype software for exploratory analysis of gene function using biological literature. The software supports an automatic analogue of the curator process to extract functions, with a simple interface intended for all biologists. Since extraction is done on selected collections that are semantically indexed into conceptual spaces, the curation can be task specific. Biological literature containing references to gene lists from expression experiments can be analyzed to extract concepts that are computational equivalents of a classification such as Gene Ontology, yielding discriminating concepts that differentiate gene mentions from other mentions. The functions of individual genes can be summarized from sentences in biological literature, to produce results resembling a model organism database entry that is automatically computed. Statistical frequency analysis based on literature phrase extraction generates offline semantic indexes to support these gene function services. The website with BeeSpace Navigator is free and open to all; there is no login requirement at www.beespace.illinois.edu for version 4. Materials from the 2010 BeeSpace Software Training Workshop are available at www.beespace.illinois.edu/bstwmaterials.php.

  11. MASH Suite Pro: A Comprehensive Software Tool for Top-Down Proteomics

    PubMed Central

    Cai, Wenxuan; Guner, Huseyin; Gregorich, Zachery R.; Chen, Albert J.; Ayaz-Guner, Serife; Peng, Ying; Valeja, Santosh G.; Liu, Xiaowen; Ge, Ying

    2016-01-01

    Top-down mass spectrometry (MS)-based proteomics is arguably a disruptive technology for the comprehensive analysis of all proteoforms arising from genetic variation, alternative splicing, and posttranslational modifications (PTMs). However, the complexity of top-down high-resolution mass spectra presents a significant challenge for data analysis. In contrast to the well-developed software packages available for data analysis in bottom-up proteomics, the data analysis tools in top-down proteomics remain underdeveloped. Moreover, despite recent efforts to develop algorithms and tools for the deconvolution of top-down high-resolution mass spectra and the identification of proteins from complex mixtures, a multifunctional software platform, which allows for the identification, quantitation, and characterization of proteoforms with visual validation, is still lacking. Herein, we have developed MASH Suite Pro, a comprehensive software tool for top-down proteomics with multifaceted functionality. MASH Suite Pro is capable of processing high-resolution MS and tandem MS (MS/MS) data using two deconvolution algorithms to optimize protein identification results. In addition, MASH Suite Pro allows for the characterization of PTMs and sequence variations, as well as the relative quantitation of multiple proteoforms in different experimental conditions. The program also provides visualization components for validation and correction of the computational outputs. Furthermore, MASH Suite Pro facilitates data reporting and presentation via direct output of the graphics. Thus, MASH Suite Pro significantly simplifies and speeds up the interpretation of high-resolution top-down proteomics data by integrating tools for protein identification, quantitation, characterization, and visual validation into a customizable and user-friendly interface. We envision that MASH Suite Pro will play an integral role in advancing the burgeoning field of top-down proteomics. PMID:26598644

  12. Cooperation on Improved Isotopic Identification and Analysis Software for Portable, Electrically Cooled High-Resolution Gamma Spectrometry Systems Final Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dreyer, Jonathan G.; Wang, Tzu-Fang; Vo, Duc T.

    Under a 2006 agreement between the Department of Energy (DOE) of the United States of America and the Institut de Radioprotection et de Sûreté Nucléaire (IRSN) of France, the National Nuclear Security Administration (NNSA) within DOE and IRSN initiated a collaboration to improve isotopic identification and analysis of nuclear material [i.e., plutonium (Pu) and uranium (U)]. The specific aim of the collaborative project was to develop new versions of two types of isotopic identification and analysis software: (1) the fixed-energy response-function analysis for multiple energies (FRAM) codes and (2) multi-group analysis (MGA) codes. The project is entitled Action Sheet 4 – Cooperation on Improved Isotopic Identification and Analysis Software for Portable, Electrically Cooled, High-Resolution Gamma Spectrometry Systems (Action Sheet 4). FRAM and MGA/U235HI are software codes used to analyze isotopic ratios of U and Pu. FRAM is an application that uses parameter sets for the analysis of U or Pu. MGA and U235HI are two separate applications that analyze Pu or U, respectively. They have traditionally been used by safeguards practitioners to analyze gamma spectra acquired with high-resolution gamma spectrometry (HRGS) systems that are cooled by liquid nitrogen. However, it was discovered that these analysis programs were not as accurate when used on spectra acquired with a newer generation of more portable, electrically cooled HRGS (ECHRGS) systems. In response to this need, DOE/NNSA and IRSN collaborated to update the FRAM and U235HI codes to improve their performance with newer ECHRGS systems. Lawrence Livermore National Laboratory (LLNL) and Los Alamos National Laboratory (LANL) performed this work for DOE/NNSA.

  13. AU-FREDI - AUTONOMOUS FREQUENCY DOMAIN IDENTIFICATION

    NASA Technical Reports Server (NTRS)

    Yam, Y.

    1994-01-01

    The Autonomous Frequency Domain Identification program, AU-FREDI, is a system of methods, algorithms and software that was developed for the identification of structural dynamic parameters and system transfer function characterization for control of large space platforms and flexible spacecraft. It was validated in the CALTECH/Jet Propulsion Laboratory's Large Spacecraft Control Laboratory. Due to the unique characteristics of this laboratory environment, and the environment-specific nature of many of the software's routines, AU-FREDI should be considered to be a collection of routines which can be modified and reassembled to suit system identification and control experiments on large flexible structures. The AU-FREDI software was originally designed to command plant excitation and handle subsequent input/output data transfer, and to conduct system identification based on the I/O data. Key features of the AU-FREDI methodology are as follows: 1. AU-FREDI has on-line digital filter design to support on-orbit optimal input design and data composition. 2. Data composition of experimental data in overlapping frequency bands overcomes finite actuator power constraints. 3. Recursive least squares sine-dwell estimation accurately handles digitized sinusoids and low frequency modes. 4. The system also includes automated estimation of model order using a product moment matrix. 5. A sample-data transfer function parametrization supports digital control design. 6. Minimum variance estimation is assured with a curve fitting algorithm with iterative reweighting. 7. Robust root solvers accurately factorize high order polynomials to determine frequency and damping estimates. 8. Output error characterization of model additive uncertainty supports robustness analysis. The research objectives associated with AU-FREDI were particularly useful in focusing the identification methodology for realistic on-orbit testing conditions. 
Rather than estimating the entire structure, as is typically done in ground structural testing, AU-FREDI identifies only the key transfer function parameters and uncertainty bounds that are necessary for on-line design and tuning of robust controllers. AU-FREDI's system identification algorithms are independent of the JPL-LSCL environment, and can easily be extracted and modified for use with input/output data files. The basic approach of AU-FREDI's system identification algorithms is to non-parametrically identify the sampled data in the frequency domain using either stochastic or sine-dwell input, and then to obtain a parametric model of the transfer function by curve-fitting techniques. A cross-spectral analysis of the output error is used to determine the additive uncertainty in the estimated transfer function. The nominal transfer function estimate and the estimate of the associated additive uncertainty can be used for robust control analysis and design. AU-FREDI's I/O data transfer routines are tailored to the environment of the CALTECH/ JPL-LSCL which included a special operating system to interface with the testbed. Input commands for a particular experiment (wideband, narrowband, or sine-dwell) were computed on-line and then issued to respective actuators by the operating system. The operating system also took measurements through displacement sensors and passed them back to the software for storage and off-line processing. In order to make use of AU-FREDI's I/O data transfer routines, a user would need to provide an operating system capable of overseeing such functions between the software and the experimental setup at hand. The program documentation contains information designed to support users in either providing such an operating system or modifying the system identification algorithms for use with input/output data files. 
It provides a history of the theoretical, algorithmic and software development efforts including operating system requirements and listings of some of the various special purpose subroutines which were developed and optimized for Lahey FORTRAN compilers on IBM PC-AT computers before the subroutines were integrated into the system software. Potential purchasers are encouraged to purchase and review the documentation before purchasing the AU-FREDI software. AU-FREDI is distributed in DEC VAX BACKUP format on a 1600 BPI 9-track magnetic tape (standard media) or a TK50 tape cartridge. AU-FREDI was developed in 1989 and is a copyrighted work with all copyright vested in NASA.

  14. Genepleio software for effective estimation of gene pleiotropy from protein sequences.

    PubMed

    Chen, Wenhai; Chen, Dandan; Zhao, Ming; Zou, Yangyun; Zeng, Yanwu; Gu, Xun

    2015-01-01

    Though pleiotropy, the phenomenon of a gene affecting multiple traits, has long played a central role in genetics, development, and evolution, estimating the number of pleiotropy components remains difficult. In this paper, we report a newly developed software package, Genepleio, to estimate the effective gene pleiotropy from phylogenetic analysis of protein sequences. Since this estimate can be interpreted as the minimum pleiotropy of a gene, it can serve as a reference for many empirical pleiotropy measures. This work would facilitate our understanding of how gene pleiotropy affects the pattern of the genotype-phenotype map and the consequences for organismal evolution.

  15. Spatial Identification of Passive Radio Frequency Identification Tags Using Software Defined Radios

    DTIC Science & Technology

    2012-03-01

    ... 3.4 Experiment Configurations; 4.1 Simulation Environmental Elements ... tabletop zReader: 20 cm; Tag vertical offset from reader z: 10 cm; 3dB angle of sensor antenna theat3db: 0.698 radians; Table 4.1: Simulation Environmental ...

  16. 49 CFR Appendix D to Part 236 - Independent Review of Verification and Validation

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... standards. (f) The reviewer shall analyze all Fault Tree Analyses (FTA), Failure Mode and Effects... for each product vulnerability cited by the reviewer; (4) Identification of any documentation or... not properly followed; (6) Identification of the software verification and validation procedures, as...

  17. matK-QR classifier: a patterns based approach for plant species identification.

    PubMed

    More, Ravi Prabhakar; Mane, Rupali Chandrashekhar; Purohit, Hemant J

    2016-01-01

    DNA barcoding is a widely used and highly efficient approach that facilitates rapid and accurate identification of plant species based on a short standardized segment of the genome. The nucleotide sequences of the maturaseK (matK) and ribulose-1,5-bisphosphate carboxylase (rbcL) marker loci are commonly used in plant species identification. Here, we present a new and highly efficient approach for identifying a unique set of discriminating nucleotide patterns to generate a signature (i.e. regular expression) for plant species identification. In order to generate molecular signatures, we used matK and rbcL loci datasets, which encompass 125 plant species in 52 genera reported by the CBOL plant working group. Initially, we performed Multiple Sequence Alignment (MSA) of all species followed by Position Specific Scoring Matrix (PSSM) for both loci to achieve a percentage of discrimination among species. Further, we detected Discriminating Patterns (DP) at genus and species level using PSSM for the matK dataset. Combining DP and consecutive pattern distances, we generated molecular signatures for each species. Finally, we performed a comparative assessment of these signatures with the existing methods including BLASTn, Support Vector Machines (SVM), Jrip-RIPPER, J48 (C4.5 algorithm), and the Naïve Bayes (NB) methods against the NCBI-GenBank matK dataset. Due to the higher discrimination success obtained with matK as compared to rbcL, we selected the matK gene for signature generation. We generated signatures for 60 species based on identified discriminating patterns at genus and species level.
Our comparative assessment results suggest that a total of 46 out of 60 species could be correctly identified using generated signatures, followed by BLASTn (34 species), SVM (18 species), C4.5 (7 species), NB (4 species) and RIPPER (3 species) methods. As a final outcome of this study, we converted signatures into QR codes and developed the software matK-QR Classifier (http://www.neeri.res.in/matk_classifier/index.htm), which searches for signatures in the query matK gene sequences and predicts the corresponding plant species. This novel approach of employing pattern-based signatures opens new avenues for the classification of species. In addition to existing methods, we believe that matK-QR Classifier would be a valuable tool for molecular taxonomists, enabling precise identification of plant species.
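
    The idea of a signature as a regular expression, built from discriminating patterns separated by bounded distances, can be sketched in Python as follows. The species names and patterns below are invented for illustration and are not real matK signatures; the `classify` function is likewise a hypothetical stand-in, not the matK-QR Classifier's code.

```python
import re

# Hypothetical signatures for illustration only; NOT real matK patterns.
# Each signature joins two discriminating patterns with a bounded gap,
# mirroring "discriminating patterns plus consecutive pattern distances".
SIGNATURES = {
    "Species A": re.compile(r"ATGGA.{3,5}TTCAC"),
    "Species B": re.compile(r"ATGGC.{3,5}GGTAC"),
}


def classify(sequence):
    """Return the names of all species whose signature occurs in the sequence."""
    seq = sequence.upper()  # accept lower-case nucleotide input
    return [name for name, pattern in SIGNATURES.items() if pattern.search(seq)]
```

    A query that matches no signature returns an empty list, and a query that matches several returns all of them, so a caller would need its own rule for ambiguous cases.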

  18. The significance of gtf genes in caries expression: a rapid identification of Streptococcus mutans from dental plaque of child patients.

    PubMed

    Mishra, Apurva; Pandey, Ramesh K; Manickam, Natesan

    2015-01-01

    Rapid phylogenetic and functional gene (gtfB) identification of S. mutans from dental plaque derived from children. Dental plaque collected from fifteen patients aged 7-12 years underwent centrifugation followed by genomic DNA extraction for S. mutans. Genomic DNA was processed with S. mutans-specific primers under suitable PCR conditions for phylogenetic and functional gene (gtfB) identification. The yield and results were confirmed by agarose gel electrophoresis. 1% agarose gel electrophoresis showed positive PCR amplification at 1,485 bp when compared with a 1 kbp standard, indicating the presence of S. mutans in the test sample. Another PCR reaction was set up using gtfB primers specific for S. mutans for functional gene identification. 1.2% agarose gel electrophoresis was performed and a positive amplification was observed at 192 bp when compared with 100 bp standards. With the advancement in molecular biology techniques, PCR-based identification and quantification of the bacterial load can be done within hours using species-specific primers and DNA probes. Thus, this technique may reduce the laboratory time spent on conventional culture methods, reduce the possibility of colony identification errors, and is more sensitive than culture techniques.

  19. WGCNA: an R package for weighted correlation network analysis.

    PubMed

    Langfelder, Peter; Horvath, Steve

    2008-12-29

    Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. 
The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA.
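
    The network construction step can be illustrated with the unsigned weighted adjacency, a_ij = |cor(x_i, x_j)|^beta. The sketch below is plain Python rather than the WGCNA R API; the function names, the toy expression values and the choice beta = 6 are assumptions made for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def soft_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes for h in genes if g != h
    }

def connectivity(adj, gene):
    """Sum of a gene's connection strengths to all other genes."""
    return sum(w for (g, _), w in adj.items() if g == gene)

# Toy expression profiles (gene -> values across four samples).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.1],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
adj = soft_adjacency(expr)
for gene in expr:
    print(gene, round(connectivity(adj, gene), 4))
```

    Raising the correlation to a power keeps strong correlations close to 1 while pushing weak ones toward 0, so no hard threshold is needed.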

  20. WGCNA: an R package for weighted correlation network analysis

    PubMed Central

    Langfelder, Peter; Horvath, Steve

    2008-01-01

    Background Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. Results The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. Conclusion The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA. PMID:19114008

  1. Chromosomal Anomalies in Individuals with Autism: A Strategy Towards the Identification of Genes Involved in Autism

    ERIC Educational Resources Information Center

    Castermans, Dries; Wilquet, Valerie; Steyaert, Jean; van de Ven, Wim; Fryns, Jean-Pierre; Devriendt, Koen

    2004-01-01

    We review the different strategies currently used to try to identify susceptibility genes for idiopathic autism. Although identification of genes is usually straightforward in Mendelian disorders, it has proved to be much more difficult to establish in polygenic disorders like autism. Neither genome screens of affected siblings nor the large…

  2. A software suite for the generation and comparison of peptide arrays from sets of data collected by liquid chromatography-mass spectrometry.

    PubMed

    Li, Xiao-jun; Yi, Eugene C; Kemp, Christopher J; Zhang, Hui; Aebersold, Ruedi

    2005-09-01

    There is an increasing interest in the quantitative proteomic measurement of the protein contents of substantially similar biological samples, e.g. for the analysis of cellular response to perturbations over time or for the discovery of protein biomarkers from clinical samples. Technical limitations of current proteomic platforms such as limited reproducibility and low throughput make this a challenging task. A new LC-MS-based platform is able to generate complex peptide patterns from the analysis of proteolyzed protein samples at high throughput and represents a promising approach for quantitative proteomics. A crucial component of the LC-MS approach is the accurate evaluation of the abundance of detected peptides over many samples and the identification of peptide features that can stratify samples with respect to their genetic, physiological, or environmental origins. We present here a new software suite, SpecArray, that generates a peptide versus sample array from a set of LC-MS data. A peptide array stores the relative abundance of thousands of peptide features in many samples and is in a format identical to that of a gene expression microarray. A peptide array can be subjected to an unsupervised clustering analysis to stratify samples or to a discriminant analysis to identify discriminatory peptide features. We applied the SpecArray to analyze two sets of LC-MS data: one was from four repeat LC-MS analyses of the same glycopeptide sample, and another was from LC-MS analysis of serum samples of five male and five female mice. We demonstrate through these two study cases that the SpecArray software suite can serve as an effective software platform in the LC-MS approach for quantitative proteomics.

  3. Infinity: An In-Silico Tool for Genome-Wide Prediction of Specific DNA Matrices in miRNA Genomic Loci.

    PubMed

    Falcone, Emmanuela; Grandoni, Luca; Garibaldi, Francesca; Manni, Isabella; Filligoi, Giancarlo; Piaggio, Giulia; Gurtner, Aymone

    2016-01-01

    miRNAs are potent regulators of gene expression and modulate multiple cellular processes in physiology and pathology. Deregulation of miRNA expression has been found in various cancer types; thus, miRNAs may be potential targets for cancer therapy. However, the mechanisms through which miRNAs are regulated in cancer remain unclear. Therefore, the identification of transcription factor-miRNA crosstalk is one of the most current aspects of the study of miRNA regulation. In the present study we describe the development of fast and user-friendly software, named infinity, able to find the presence of DNA matrices, such as binding sequences for transcription factors, on ~65kb (kilobase) of 939 human miRNA genomic sequences, simultaneously. Of note, the power of this software has been validated in vivo by performing chromatin immunoprecipitation assays on a subset of new in silico identified target sequences (CCAAT) for the transcription factor NF-Y on colon cancer deregulated miRNA loci. Moreover, for the first time, we have demonstrated that NF-Y, through its CCAAT binding activity, regulates the expression of miRNA-181a, -181b, -21, -17, -130b, -301b in colon cancer cells. The infinity software that we have developed is a powerful tool to underscore new TF/miRNA regulatory networks. Infinity was implemented in pure Java using the Eclipse framework, and runs on Linux and MS Windows machines, with a MySQL database. The software is freely available on the web at https://github.com/bio-devel/infinity. The website is implemented in JavaScript, PHP and HTML with all major browsers supported.
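
    The core operation described here, locating a DNA matrix such as the CCAAT box in many genomic sequences at once, can be sketched as an exact-match scan on both strands. The code below is an illustrative Python sketch, not the Java implementation of infinity; the function names and the two example loci are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (N is kept as N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_motif(seq, motif="CCAAT"):
    """Return (position, strand) for every motif occurrence on either strand.

    Positions are 0-based and refer to the forward strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)

# Hypothetical sequences standing in for miRNA genomic loci.
loci = {"locus_1": "GGCCAATCTGATTGGAA", "locus_2": "ACGTACGTACGT"}
for name, seq in loci.items():
    print(name, find_motif(seq))
# locus_1 [(2, '+'), (10, '-')]
# locus_2 []
```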

  4. RED: A Java-MySQL Software for Identifying and Visualizing RNA Editing Sites Using Rule-Based and Statistical Filters.

    PubMed

    Sun, Yongmei; Li, Xing; Wu, Di; Pan, Qi; Ji, Yuefeng; Ren, Hong; Ding, Keyue

    2016-01-01

    RNA editing is one of the post- or co-transcriptional processes that can lead to amino acid substitutions in protein sequences, alternative pre-mRNA splicing, and changes in gene expression levels. Although several methods have been suggested to identify RNA editing sites, challenges remain in distinguishing true RNA editing sites from their genomic counterparts and from technical artifacts. In addition, a software framework to identify and visualize potential RNA editing sites is lacking. Here, we present the software 'RED' (RNA Editing sites Detector) for the identification of RNA editing sites by integrating multiple rule-based and statistical filters. The potential RNA editing sites can be visualized at the genome and the site levels through a graphical user interface (GUI). To improve performance, we used the MySQL database management system (DBMS) for high-throughput data storage and query. We demonstrated the validity and utility of RED by identifying the presence and absence of experimentally validated C→U RNA editing sites, in comparison with REDItools, a command line tool to perform high-throughput investigation of RNA editing. In an analysis of a sample data-set with 28 experimentally validated C→U RNA editing sites, RED had sensitivity and specificity of 0.64 and 0.5. In comparison, REDItools had a better sensitivity (0.75) but similar specificity (0.5). RED is an easy-to-use, platform-independent Java-based software, and can be applied to RNA-seq data without or with DNA sequencing data. The package is freely available under the GPLv3 license at http://github.com/REDetector/RED or https://sourceforge.net/projects/redetector.
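
    For readers unfamiliar with the two metrics used in the comparison, sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The short sketch below uses made-up confusion counts; the abstract does not report the counts behind RED's or REDItools' values, so none are reproduced here.

```python
def sensitivity(tp, fn):
    """Fraction of true editing sites that were detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of non-edited sites that were correctly rejected."""
    return tn / (tn + fp)

# Hypothetical counts, chosen only to illustrate the arithmetic.
tp, fn, tn, fp = 8, 2, 6, 4
print(sensitivity(tp, fn))  # 0.8
print(specificity(tn, fp))  # 0.6
```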

  5. Infinity: An In-Silico Tool for Genome-Wide Prediction of Specific DNA Matrices in miRNA Genomic Loci

    PubMed Central

    Garibaldi, Francesca; Manni, Isabella; Filligoi, Giancarlo; Piaggio, Giulia; Gurtner, Aymone

    2016-01-01

    Motivation miRNAs are potent regulators of gene expression and modulate multiple cellular processes in physiology and pathology. Deregulation of miRNA expression has been found in various cancer types; thus, miRNAs may be potential targets for cancer therapy. However, the mechanisms through which miRNAs are regulated in cancer remain unclear. Therefore, the identification of transcription factor–miRNA crosstalk is one of the most current aspects of the study of miRNA regulation. Results In the present study we describe the development of fast and user-friendly software, named infinity, able to find the presence of DNA matrices, such as binding sequences for transcription factors, on ~65kb (kilobase) of 939 human miRNA genomic sequences, simultaneously. Of note, the power of this software has been validated in vivo by performing chromatin immunoprecipitation assays on a subset of new in silico identified target sequences (CCAAT) for the transcription factor NF-Y on colon cancer deregulated miRNA loci. Moreover, for the first time, we have demonstrated that NF-Y, through its CCAAT binding activity, regulates the expression of miRNA-181a, -181b, -21, -17, -130b, -301b in colon cancer cells. Conclusions The infinity software that we have developed is a powerful tool to underscore new TF/miRNA regulatory networks. Availability and Implementation Infinity was implemented in pure Java using the Eclipse framework, and runs on Linux and MS Windows machines, with a MySQL database. The software is freely available on the web at https://github.com/bio-devel/infinity. The website is implemented in JavaScript, PHP and HTML with all major browsers supported. PMID:27082112

  6. RED: A Java-MySQL Software for Identifying and Visualizing RNA Editing Sites Using Rule-Based and Statistical Filters

    PubMed Central

    Sun, Yongmei; Li, Xing; Wu, Di; Pan, Qi; Ji, Yuefeng; Ren, Hong; Ding, Keyue

    2016-01-01

    RNA editing is one of the post- or co-transcriptional processes that can lead to amino acid substitutions in protein sequences, alternative pre-mRNA splicing, and changes in gene expression levels. Although several methods have been suggested to identify RNA editing sites, challenges remain in distinguishing true RNA editing sites from their genomic counterparts and from technical artifacts. In addition, a software framework to identify and visualize potential RNA editing sites is lacking. Here, we present the software 'RED' (RNA Editing sites Detector) for the identification of RNA editing sites by integrating multiple rule-based and statistical filters. The potential RNA editing sites can be visualized at the genome and the site levels through a graphical user interface (GUI). To improve performance, we used the MySQL database management system (DBMS) for high-throughput data storage and query. We demonstrated the validity and utility of RED by identifying the presence and absence of experimentally validated C→U RNA editing sites, in comparison with REDItools, a command line tool to perform high-throughput investigation of RNA editing. In an analysis of a sample data-set with 28 experimentally validated C→U RNA editing sites, RED had sensitivity and specificity of 0.64 and 0.5. In comparison, REDItools had a better sensitivity (0.75) but similar specificity (0.5). RED is an easy-to-use, platform-independent Java-based software, and can be applied to RNA-seq data without or with DNA sequencing data. The package is freely available under the GPLv3 license at http://github.com/REDetector/RED or https://sourceforge.net/projects/redetector. PMID:26930599

  7. Identification of human circadian genes based on time course gene expression profiles by using a deep learning method.

    PubMed

    Cui, Peng; Zhong, Tingyan; Wang, Zhuo; Wang, Tao; Zhao, Hongyu; Liu, Chenglin; Lu, Hui

    2018-06-01

    Circadian genes are expressed periodically with an approximately 24-h period, and the identification and study of these genes can provide a deep understanding of circadian control, which plays significant roles in human health. Although many circadian gene identification algorithms have been developed, large numbers of false positives and low coverage are still major problems in this field. In this study we constructed a novel computational framework for circadian gene identification using deep neural networks (DNN), a deep learning algorithm which can represent the raw form of data patterns without imposing assumptions on the expression distribution. First, we transformed time-course gene expression data into categorical-state data to denote the changing trend of gene expression. Two distinct expression patterns emerged after clustering of the state data for circadian genes from our manually created learning dataset. DNN was then applied to discriminate the aperiodic genes and the two subtypes of periodic genes. In order to assess the performance of DNN, four commonly used machine learning methods, namely k-nearest neighbors, logistic regression, naïve Bayes, and support vector machines, were used for comparison. The results show that the DNN model achieves the best balanced precision and recall. Next, we conducted large-scale circadian gene detection using the trained DNN model for the remaining transcription profiles. Compared with JTK_CYCLE and a study performed by Möller-Levet et al. (doi: https://doi.org/10.1073/pnas.1217154110), we identified 1132 novel periodic genes. Through the functional analysis of these novel circadian genes, we found that the GTPase superfamily exhibits distinct circadian expression patterns and may provide a molecular switch of circadian control of the functioning of the immune system in human blood. 
Our study provides novel insights into both the circadian gene identification field and the study of complex circadian-driven biological control. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang. Copyright © 2017. Published by Elsevier B.V.
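
    The first preprocessing step, turning a time course into categorical states that denote the changing trend of expression, might look like the sketch below. The abstract does not give the exact encoding, so the three-state scheme (up, down, flat), the tolerance `tol` and the function name are assumptions.

```python
def to_states(values, tol=0.1):
    """Encode each step between consecutive time points as a trend state.

    'U' = up (increase above tol), 'D' = down (decrease below -tol),
    'F' = flat (change within +/- tol).
    """
    states = []
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        if delta > tol:
            states.append("U")
        elif delta < -tol:
            states.append("D")
        else:
            states.append("F")
    return "".join(states)

# Hypothetical expression values sampled at seven time points.
profile = [1.0, 1.5, 2.0, 2.05, 1.2, 0.8, 1.4]
print(to_states(profile))  # UUFDDU
```

    The resulting state strings have one symbol fewer than the number of time points and can be clustered or fed to a classifier in place of the raw values.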

  8. Genome-Wide Identification of Host-Segregating Epidemiological Markers for Source Attribution in Campylobacter jejuni

    PubMed Central

    Thépault, Amandine; Méric, Guillaume; Rivoal, Katell; Pascoe, Ben; Mageiros, Leonardos; Touzain, Fabrice; Rose, Valérie; Béven, Véronique; Chemaly, Marianne

    2017-01-01

    ABSTRACT Campylobacter is among the most common worldwide causes of bacterial gastroenteritis. This organism is part of the commensal microbiota of numerous host species, including livestock, and these animals constitute potential sources of human infection. Molecular typing approaches, especially multilocus sequence typing (MLST), have been used to attribute the source of human campylobacteriosis by quantifying the relative abundance of alleles at seven MLST loci among isolates from animal reservoirs and human infection, implicating chicken as a major infection source. The increasing availability of bacterial genomes provides data on allelic variation at loci across the genome, providing the potential to improve the discriminatory power of data for source attribution. Here we present a source attribution approach based on the identification of novel epidemiological markers among a reference pan-genome list of 1,810 genes identified by gene-by-gene comparison of 884 genomes of Campylobacter jejuni isolates from animal reservoirs, the environment, and clinical cases. Fifteen loci involved in metabolic activities, protein modification, signal transduction, and stress response or coding for hypothetical proteins were selected as host-segregating markers and used to attribute the source of 42 French and 281 United Kingdom clinical C. jejuni isolates. Consistent with previous studies of British campylobacteriosis, analyses performed using STRUCTURE software attributed 56.8% of British clinical cases to chicken, emphasizing the importance of this host reservoir as an infection source in the United Kingdom. 
However, among French clinical isolates, approximately equal proportions of isolates were attributed to chicken and ruminant reservoirs, suggesting possible differences in the relative importance of animal host reservoirs and indicating a benefit for further national-scale attribution modeling to account for differences in production, behavior, and food consumption. IMPORTANCE Accurately quantifying the relative contribution of different host reservoirs to human Campylobacter infection is an ongoing challenge. This study, based on the development of a novel source attribution approach, provides the first results of source attribution in Campylobacter jejuni in France. A systematic analysis using gene-by-gene comparison of 884 genomes of C. jejuni isolates, with a pan-genome list of genes, identified 15 novel epidemiological markers for source attribution. The different proportions of French and United Kingdom clinical isolates attributed to each host reservoir illustrate a potential role for local/national variations in C. jejuni transmission dynamics. PMID:28115376

  9. GENIE: a software package for gene-gene interaction analysis in genetic association studies using multiple GPU or CPU cores.

    PubMed

    Chikkagoudar, Satish; Wang, Kai; Li, Mingyao

    2011-05-26

    Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/.
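
    The partitioning scheme can be sketched as follows: SNPs are split into non-overlapping fragments, and each fragment yields within-fragment pairs plus pairs against other fragments. This Python sketch only enumerates the pairs sequentially; it is not GENIE's GPU/CPU code. Restricting the between-fragment step to later fragments (so that no pair is visited twice) is an assumption of this sketch, as are the function names and SNP identifiers.

```python
from itertools import combinations

def partition(snps, fragment_size):
    """Split the SNP list into non-overlapping fragments."""
    return [snps[i:i + fragment_size] for i in range(0, len(snps), fragment_size)]

def interaction_tasks(fragments):
    """Yield every SNP pair exactly once.

    For each fragment: 1) pairs within the fragment, then
    2) pairs between this fragment and each later fragment.
    In a parallel implementation each of these groups would be one work unit.
    """
    for i, frag in enumerate(fragments):
        yield from combinations(frag, 2)
        for other in fragments[i + 1:]:
            for a in frag:
                for b in other:
                    yield (a, b)

snps = ["rs%d" % k for k in range(1, 8)]  # seven hypothetical SNP identifiers
fragments = partition(snps, 3)
pairs = list(interaction_tasks(fragments))
print(len(fragments), len(pairs))  # 3 21
```

    Seven SNPs give 7 * 6 / 2 = 21 unordered pairs, which matches the number of tasks produced.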

  10. GENIE: a software package for gene-gene interaction analysis in genetic association studies using multiple GPU or CPU cores

    PubMed Central

    2011-01-01

    Background Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Findings Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. Conclusions GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/. PMID:21615923

  11. Identification of potential internal control genes for real-time PCR analysis during stress response in Pyropia haitanensis

    NASA Astrophysics Data System (ADS)

    Wang, Xia; Feng, Jianhua; Huang, Aiyou; He, Linwen; Niu, Jianfeng; Wang, Guangce

    2017-11-01

    Pyropia haitanensis has prominent stress-resistance characteristics and is endemic to China. Studies into the stress responses in these algae could provide valuable information on the stress-response mechanisms in the intertidal Rhodophyta. Here, the effects of salinity and light intensity on the quantum yield of photosystem II in Py. haitanensis were investigated using pulse-amplitude-modulation fluorometry. Total RNA and genomic DNA of the samples under different stress conditions were isolated. By normalizing to the genomic DNA quantity, the RNA content in each sample was evaluated. The cDNA was synthesized and the expression levels of seven potential internal control genes were evaluated using qRT-PCR method. Then, we used geNorm, a common statistical algorithm, to analyze the qRT-PCR data of seven reference genes. Potential genes that may constantly be expressed under different conditions were selected, and these genes showed stable expression levels in samples under a salinity treatment, while tubulin, glyceraldehyde-3-phosphate dehydrogenase and actin showed stability in samples stressed by strong light. Based on the results of the pulse amplitude-modulation fluorometry, an absolute quantification was performed to obtain gene copy numbers in certain stress-treated samples. The stably expressed genes as determined by the absolute quantification in certain samples conformed to the results of the geNorm screening. Based on the results of the software analysis and absolute quantification, we proposed that elongation factor 3 and 18S ribosomal RNA could be used as internal control genes when the Py. haitanensis blades were subjected to salinity stress, and that α-tubulin and 18S ribosomal RNA could be used as the internal control genes when the stress was from strong light. In general, our findings provide a convenient reference for the selection of internal control genes when designing experiments related to stress responses in Py. haitanensis.
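
    The geNorm stability measure M used above is, for each candidate gene, the average standard deviation of its log2 expression ratios against every other candidate across samples; a lower M indicates more stable expression. The sketch below computes M on made-up relative quantities. The gene names and values are hypothetical and are not data from this study.

```python
from math import log2
from statistics import stdev

def genorm_m(expr):
    """Compute the geNorm stability value M for each candidate gene.

    expr: gene -> list of relative expression quantities, one per sample.
    """
    m = {}
    for gene, values in expr.items():
        variations = []
        for other, other_values in expr.items():
            if other == gene:
                continue
            ratios = [log2(a / b) for a, b in zip(values, other_values)]
            variations.append(stdev(ratios))  # pairwise variation
        m[gene] = sum(variations) / len(variations)
    return m

# Hypothetical relative quantities for three candidates in three samples.
expr = {
    "refA": [1.0, 2.0, 4.0],
    "refB": [2.0, 4.0, 8.0],  # constant ratio to refA
    "refC": [1.0, 4.0, 2.0],  # varies relative to both
}
m_values = genorm_m(expr)
print(sorted(m_values, key=m_values.get))  # most stable first
```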

  12. DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data

    PubMed Central

    Glez-Peña, Daniel; Álvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino

    2009-01-01

    Background Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. Results DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. Conclusion DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. 
Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released. PMID:19178723
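
    The discretization step, assigning a linguistic label to an expression level through fuzzy membership functions, can be illustrated as below. This is a Python sketch using simple piecewise-linear memberships; the DFP package itself is written in R and its actual membership functions may differ, so the function shapes, names and breakpoints here are assumptions.

```python
def membership(x, low, mid, high):
    """Return the degree to which x belongs to the Low, Medium and High sets.

    Piecewise-linear memberships anchored at three breakpoints (low < mid < high).
    """
    if x <= low:
        return {"Low": 1.0, "Medium": 0.0, "High": 0.0}
    if x >= high:
        return {"Low": 0.0, "Medium": 0.0, "High": 1.0}
    if x <= mid:
        m = (x - low) / (mid - low)
        return {"Low": 1.0 - m, "Medium": m, "High": 0.0}
    m = (x - mid) / (high - mid)
    return {"Low": 0.0, "Medium": 1.0 - m, "High": m}

def label(x, low=0.0, mid=0.5, high=1.0):
    """Assign the linguistic label with the largest membership degree."""
    degrees = membership(x, low, mid, high)
    return max(degrees, key=degrees.get)

# Hypothetical normalized expression levels for three genes.
for gene, level in {"g1": 0.1, "g2": 0.5, "g3": 0.9}.items():
    print(gene, label(level))
# g1 Low
# g2 Medium
# g3 High
```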

  13. DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data.

    PubMed

    Glez-Peña, Daniel; Alvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino

    2009-01-29

    Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. 
Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released.
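
The labelling step described above (fuzzy membership functions mapping an expression level to a linguistic label) can be illustrated with a minimal Python sketch. It is not the DFP package's implementation: the triangular membership shapes, the three labels, the [0, 1] expression scale and the function names `memberships` and `linguistic_label` are all assumptions made for illustration.

```python
def memberships(x, low=0.0, mid=0.5, high=1.0):
    """Return fuzzy membership degrees of x in 'Low', 'Medium' and 'High'.

    Assumes triangular functions on a normalised expression scale;
    the real package may use different shapes and parameters.
    """
    x = min(max(x, low), high)  # clamp to the assumed scale
    if x <= mid:
        med = (x - low) / (mid - low)
        return {"Low": 1.0 - med, "Medium": med, "High": 0.0}
    med = (high - x) / (high - mid)
    return {"Low": 0.0, "Medium": med, "High": 1.0 - med}


def linguistic_label(x):
    """Pick the label with the largest membership degree."""
    degrees = memberships(x)
    return max(degrees, key=degrees.get)


if __name__ == "__main__":
    for level in (0.1, 0.5, 0.9):
        print(level, linguistic_label(level))
```

A gene's discretised profile would then be the sequence of labels across samples, which is the kind of representation a fuzzy pattern could be built from.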

  14. Software support environment design knowledge capture

    NASA Technical Reports Server (NTRS)

    Dollman, Tom

    1990-01-01

    The objective of this task is to assess the potential for using the software support environment (SSE) workstations and associated software for design knowledge capture (DKC) tasks. This assessment will include the identification of required capabilities for DKC and hardware/software modifications needed to support DKC. Several approaches to achieving this objective are discussed and interim results are provided: (1) research into the problem of knowledge engineering in a traditional computer-aided software engineering (CASE) environment, like the SSE; (2) research into the problem of applying SSE CASE tools to develop knowledge based systems; and (3) direct utilization of SSE workstations to support a DKC activity.

  15. DHS-STEM Internship at Lawrence Livermore National Laboratory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Feldman, B

    2008-08-18

    This summer I had the fortunate opportunity through the DHS-STEM program to attend Lawrence Livermore National Laboratory (LLNL) to work with Tom Slezak on the bioinformatics team. The bioinformatics team, among other things, helps to develop TaqMan and microarray probes for the identification of pathogens. My main project at the laboratory was to test such probe identification capabilities against metagenomic (unsequenced) data from around the world. Using various sequence analysis tools (Vmatch and Blastall) and several we developed ourselves, about 120 metagenomic sequencing projects were compared against a collection of all completely sequenced genomes and LLNL's current probe database. For the probes, the Blastall algorithms compared each individual metagenomic project using various parameters allowing for the natural ambiguities of in vitro hybridization (mismatches, deletions, insertions, hairpinning, etc.). A low level cutoff was used to eliminate poor sequence matches, and to leave a large variety of higher quality matches for future research into the hybridization of sequences with mutations and variations: any hits with at least 80% base pair conservation over 80% of the length of the match were kept. Because of the size of our whole genome database, we utilized the exact match algorithm of Vmatch to quickly search and compare genomes for exact matches with varying lower level limits on sequence length. I also provided preliminary feasibility analyses to support a potential industry-funded project to develop a multiplex assay on several genera and species. Each genus and species was evaluated based on the number of sequenced genomes, the number of near neighbor sequenced genomes, the presence of identifying genes--metabolic or antibiotic resistance genes--and the availability of research on the identification of the specific genera or species.
Utilizing the bioinformatic team's software, I was able to develop and/or update several TaqMan probes for these and develop a plan of identification for the more difficult ones. One suggestion for a genus with low conservation was to separate species into several groups and look for probes within these and then use a combination of probes to identify a genus. This has the added benefit of also providing subgenus identification in larger genera. During both projects I had developed a set of computer programs to simplify or consolidate several processes. These programs were constructed with the intent of being reused to either repeat these results, further this research, or to start a similar project. A big problem in the bioinformatic/sequencing field is the variability of data storage formats which make using data from various sources extremely difficult. Excluding for the moment the many errors present in online database genome sequences, there are still many difficulties in converting one data type into another successfully every time. Dealing with hundreds of files, each hundreds of megabytes, requires automation which in turn requires good data mining software. The programs I developed will help ease this issue and make more genomic sources available for use. With these programs it is extremely easy to gather the data, cleanse it, convert it and run it through some analysis software and even analyze the output of this software. When dealing with vast amounts of data it is vital for the researcher to optimize the process--which became clear to me with only ten weeks to work with. Due to the time constraint of the internship, I was unable to finish my metagenomic project; I did finish with success, my second project, discovering TaqMan identification for genera and species. Although I did not complete my first project I made significant findings along the way that suggest the need for further research on the subject. 
I found several instances of false positives in the metagenomic data from our microarrays which indicates the need to sequence more metagenomic samples. My initial research shows the importance of expanding our known metagenomic world; at this point there is always the likelihood of developing probes with unknown interactions because there is not enough sequencing. On the other hand my research did point out the sensitivity and quality of LLNL's microarrays when it identified a parvoviridae infection in a mosquito metagenomic sample from southern California. It also uniquely identified the presence of several species of the adenovirus which could mean that there was some archaic strain of the adenovirus present in the metagenomic sample or there was a contamination in the sample, requiring a further investigation to clarify.
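
The 80%/80% cutoff mentioned in this report can be sketched in Python as a filter over tab-separated hit lines. This is only an illustration under stated assumptions: the report does not say how the cutoff was implemented, so the column layout (query, subject, percent identity, alignment length first, as in the common BLAST tabular output), the reading of "80% of the length" as alignment length relative to probe length, and the names `passes_cutoff`, `filter_hits` and `probe_lengths` are all hypothetical.

```python
def passes_cutoff(pident, align_len, probe_len, min_ident=80.0, min_cov=80.0):
    """True if identity and probe coverage both reach the cutoff (percent)."""
    coverage = 100.0 * align_len / probe_len
    return pident >= min_ident and coverage >= min_cov


def filter_hits(lines, probe_lengths):
    """Keep hits meeting the cutoff.

    lines: iterable of tab-separated records whose first four columns are
           query id, subject id, percent identity, alignment length
           (assumed layout).
    probe_lengths: dict mapping query id to probe length in bases.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        query, subject = fields[0], fields[1]
        pident, align_len = float(fields[2]), int(fields[3])
        if passes_cutoff(pident, align_len, probe_lengths[query]):
            kept.append((query, subject, pident, align_len))
    return kept


if __name__ == "__main__":
    hits = [
        "probeA\tgenome1\t85.0\t24",
        "probeA\tgenome2\t79.0\t25",
        "probeA\tgenome3\t90.0\t15",
    ]
    print(filter_hits(hits, {"probeA": 25}))
```

Only the first hit survives here: the second falls below 80% identity and the third covers only 60% of the 25-base probe.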

  16. SEURAT: visual analytics for the integrated analysis of microarray data.

    PubMed

    Gribov, Alexander; Sill, Martin; Lück, Sonja; Rücker, Frank; Döhner, Konstanze; Bullinger, Lars; Benner, Axel; Unwin, Antony

    2010-06-03

    In translational cancer research, gene expression data is collected together with clinical data and genomic data arising from other chip based high throughput technologies. Software tools for the joint analysis of such high dimensional data sets together with clinical data are required. We have developed an open source software tool which provides interactive visualization capability for the integrated analysis of high-dimensional gene expression data together with associated clinical data, array CGH data and SNP array data. The different data types are organized by a comprehensive data manager. Interactive tools are provided for all graphics: heatmaps, dendrograms, barcharts, histograms, eventcharts and a chromosome browser, which displays genetic variations along the genome. All graphics are dynamic and fully linked so that any object selected in a graphic will be highlighted in all other graphics. For exploratory data analysis the software provides unsupervised data analytics like clustering, seriation algorithms and biclustering algorithms. The SEURAT software meets the growing needs of researchers to perform joint analysis of gene expression, genomical and clinical data.

  17. Combining mouse mammary gland gene expression and comparative mapping for the identification of candidate genes for QTL of milk production traits in cattle

    PubMed Central

    Ron, Micha; Israeli, Galit; Seroussi, Eyal; Weller, Joel I; Gregg, Jeffrey P; Shani, Moshe; Medrano, Juan F

    2007-01-01

    Background Many studies have found segregating quantitative trait loci (QTL) for milk production traits in different dairy cattle populations. However, even for relatively large effects with a saturated marker map the confidence interval for QTL location by linkage analysis spans tens of map units, or hundreds of genes. Combining mapping and arraying has been suggested as an approach to identify candidate genes. Thus, gene expression analysis in the mammary gland of genes positioned in the confidence interval of the QTL can bridge the gap between fine mapping and quantitative trait nucleotide (QTN) determination. Results We hybridized Affymetrix microarray (MG-U74v2), containing 12,488 murine probes, with RNA derived from mammary gland of virgin, pregnant, lactating and involuting C57BL/6J mice in a total of nine biological replicates. We combined microarray data from two additional studies that used the same design in mice with a total of 75 biological replicates. The same filtering and normalization was applied to each microarray data using GeneSpring software. Analysis of variance identified 249 differentially expressed probe sets common to the three experiments along the four developmental stages of puberty, pregnancy, lactation and involution. 212 genes were assigned to their bovine map positions through comparative mapping, and thus form a list of candidate genes for previously identified QTLs for milk production traits. A total of 82 of the genes showed mammary gland-specific expression with at least 3-fold expression over the median representing all tissues tested in GeneAtlas. Conclusion This work presents a web tool for candidate genes for QTL (cgQTL) that allows navigation between the map of bovine milk production QTL, potential candidate genes and their level of expression in mammary gland arrays and in GeneAtlas. Three out of four confirmed genes that affect QTL in livestock (ABCG2, DGAT1, GDF8, IGF2) were over expressed in the target organ. 
Thus, cgQTL can be used to determine priority of candidate genes for QTN analysis based on differential expression in the target organ. PMID:17584498
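
The tissue-specificity criterion used above (expression in the target organ at least 3-fold over the median of all tissues tested) can be sketched as a simple check. The function name `is_tissue_specific`, the dictionary layout and the example values are illustrative assumptions, not the cgQTL tool's code or data.

```python
from statistics import median


def is_tissue_specific(expression, target, fold=3.0):
    """Check whether `target` tissue exceeds `fold` times the median level.

    expression: dict mapping tissue name to expression level for one gene,
                including the target tissue (assumed layout).
    """
    med = median(expression.values())
    return med > 0 and expression[target] >= fold * med


if __name__ == "__main__":
    # Made-up levels for a single gene across five tissues.
    gene = {"mammary": 900, "liver": 100, "brain": 200, "kidney": 300, "heart": 250}
    print(is_tissue_specific(gene, "mammary"))
```

With these made-up values the median is 250, so the mammary level of 900 clears the 750 threshold.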

  18. Behavioral biometrics for verification and recognition of malicious software agents

    NASA Astrophysics Data System (ADS)

    Yampolskiy, Roman V.; Govindaraju, Venu

    2008-04-01

    Homeland security requires technologies capable of positive and reliable identification of humans for law enforcement, government, and commercial applications. As artificially intelligent agents improve in their abilities and become a part of our everyday life, the possibility of using such programs for undermining homeland security increases. Virtual assistants, shopping bots, and game playing programs are used daily by millions of people. We propose applying statistical behavior modeling techniques developed by us for recognition of humans to the identification and verification of intelligent and potentially malicious software agents. Our experimental results demonstrate feasibility of such methods for both artificial agent verification and even for recognition purposes.

  19. Optimizing the Performance of Radionuclide Identification Software in the Hunt for Nuclear Security Threats

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fotion, Katherine A.

    2016-08-18

    The Radionuclide Analysis Kit (RNAK), my team’s most recent nuclide identification software, is entering the testing phase. A question arises: will removing rare nuclides from the software’s library improve its overall performance? An affirmative response indicates fundamental errors in the software’s framework, while a negative response confirms the effectiveness of the software’s key machine learning algorithms. After thorough testing, I found that the performance of RNAK cannot be improved with the library choice effect, thus verifying the effectiveness of RNAK’s algorithms—multiple linear regression, Bayesian network using the Viterbi algorithm, and branch and bound search.

  20. NASA Tech Briefs, January 2004

    NASA Technical Reports Server (NTRS)

    2004-01-01

    Topics covered include: Multisensor Instrument for Real-Time Biological Monitoring; Sensor for Monitoring Nanodevice-Fabrication Plasmas; Backed Bending Actuator; Compact Optoelectronic Compass; Micro Sun Sensor for Spacecraft; Passive IFF: Autonomous Nonintrusive Rapid Identification of Friendly Assets; Finned-Ladder Slow-Wave Circuit for a TWT; Directional Radio-Frequency Identification Tag Reader; Integrated Solar-Energy-Harvesting and -Storage Device; Event-Driven Random-Access-Windowing CCD Imaging System; Stroboscope Controller for Imaging Helicopter Rotors; Software for Checking State-charts; Program Predicts Broadband Noise from a Turbofan Engine; Protocol for a Delay-Tolerant Data-Communication Network; Software Implements a Space-Mission File-Transfer Protocol; Making Carbon-Nanotube Arrays Using Block Copolymers: Part 2; Modular Rake of Pitot Probes; Preloading To Accelerate Slow-Crack-Growth Testing; Miniature Blimps for Surveillance and Collection of Samples; Hybrid Automotive Engine Using Ethanol-Burning Miller Cycle; Fabricating Blazed Diffraction Gratings by X-Ray Lithography; Freeze-Tolerant Condensers; The StarLight Space Interferometer; Champagne Heat Pump; Controllable Sonar Lenses and Prisms Based on ERFs; Measuring Gravitation Using Polarization Spectroscopy; Serial-Turbo-Trellis-Coded Modulation with Rate-1 Inner Code; Enhanced Software for Scheduling Space-Shuttle Processing; Bayesian-Augmented Identification of Stars in a Narrow View; Spacecraft Orbits for Earth/Mars-Lander Radio Relay; and Self-Inflatable/Self-Rigidizable Reflectarray Antenna.

  1. Development and validation of a real-time TaqMan assay for the detection and enumeration of Pseudomonas fluorescens ATCC 13525 used as a challenge organism in testing of food equipments.

    PubMed

    Saha, Ratul; Bestervelt, Lorelle L; Donofrio, Robert S

    2012-02-01

    Pseudomonas fluorescens ATCC 13525 is used as the challenge organism to evaluate the efficacy of the clean-in-place (CIP) process of food equipment (automatic ice-maker) as per NSF/ANSI Standard 12. Traditional culturing methodology is presently used to determine the concentration of the challenge organism, which takes 48 h to confirm the cell density. Storage of the challenge preparation in the refrigerator might alter the cell density as P. fluorescens is capable of growing at 4 °C. Also, background organism can grow on the Pseudomonas F agar (PFA) used for the recovery of P. fluorescens thus affecting the results of the test. Real-time TaqMan assay targeting the cpn60 gene was developed for the enumeration and the identification of P. fluorescens because of its specificity, accuracy, and shorter turnaround time. The TaqMan primer-probe pair developed using the Allele ID® 7.0 probe design software was highly specific and sensitive for the target organism. The sensitivity of the assay was 10 colony forming units (CFU)/mL. The assay was also successful in determining the concentration of the challenge preparation within 2 h. Based on these observations, TaqMan assay targeting the cpn60 gene can be efficiently used for strain level identification and enumeration of bacteria. Pseudomonas fluorescens ATCC 13525 is used as a challenge organism in the efficacy testing of clean-in-place process of food equipments. Currently, culturing technique is used for its identification and estimation, which is not only time-consuming but also prone to error. Real-time TaqMan assay is more specific, sensitive, and accurate along with a shorter turnaround time compared to culturing techniques, thereby increasing the overall quality of the testing methodology to evaluate the clean-in-place process critical for the food industry to protect public health and safety. © 2012 Institute of Food Technologists®

  2. In silico analysis of cacao (Theobroma cacao L.) genes that involved in pathogen and disease responses

    NASA Astrophysics Data System (ADS)

    Agung, Muhammad Budi; Budiarsa, I. Made; Suwastika, I. Nengah

    2017-02-01

    Cocoa beans are one of Indonesia's main commodities for the world market, but yields are still degraded by pathogen and disease attack. Developing robust cacao plants that are genetically resistant to pathogen and disease attack is an ideal solution to this problem. The aim of this study was to identify Theobroma cacao genes in the cacao genome database that are homologous to pathogen- and disease-response genes in other plants, through in silico analysis. Basic information survey and gene identification were performed in GenBank and The Arabidopsis Information Resource database. The in silico analysis comprised protein BLAST, a homology test of each gene's candidate proteins, and identification of homologous genes in the Cacao Genome Database using the "Theobroma cacao cv. Matina 1-6 v1.1" genome as the data source. The analysis found that Thecc1EG011959t1 (EDS1), Thecc1EG006803t1 (EDS5), Thecc1EG013842t1 (ICS1), and Thecc1EG015614t1 (BG_PPAP) in the Cacao Genome Database are Theobroma cacao genes homologous to plant resistance genes, and are highly likely to have functions similar to those of their homologues.

  3. Preliminary input to the space shuttle reaction control subsystem failure detection and identification software requirements (uncontrolled)

    NASA Technical Reports Server (NTRS)

    Bergmann, E.

    1976-01-01

    The current baseline method and software implementation of the space shuttle reaction control subsystem failure detection and identification (RCS FDI) system is presented. This algorithm is recommended for inclusion in the redundancy management (RM) module of the space shuttle guidance, navigation, and control system. Supporting software is presented, and recommended for inclusion in the system management (SM) and display and control (D&C) systems. RCS FDI uses data from sensors in the jets, in the manifold isolation valves, and in the RCS fuel and oxidizer storage tanks. A list of jet failures and fuel imbalance warnings is generated for use by the jet selection algorithm of the on-orbit and entry flight control systems, and to inform the crew and ground controllers of RCS failure status. Manifold isolation valve close commands are generated in the event of failed-on or leaking jets to prevent loss of large quantities of RCS fuel.

  4. Rice Molecular Breeding Laboratories in the Genomics Era: Current Status and Future Considerations

    PubMed Central

    Collard, Bert C. Y.; Vera Cruz, Casiana M.; McNally, Kenneth L.; Virk, Parminder S.; Mackill, David J.

    2008-01-01

    Using DNA markers in plant breeding with marker-assisted selection (MAS) could greatly improve the precision and efficiency of selection, leading to the accelerated development of new crop varieties. The numerous examples of MAS in rice have prompted many breeding institutes to establish molecular breeding labs. The last decade has produced an enormous amount of genomics research in rice, including the identification of thousands of QTLs for agronomically important traits, the generation of large amounts of gene expression data, and cloning and characterization of new genes, including the detection of single nucleotide polymorphisms. The pinnacle of genomics research has been the completion and annotation of genome sequences for indica and japonica rice. This information—coupled with the development of new genotyping methodologies and platforms, and the development of bioinformatics databases and software tools—provides even more exciting opportunities for rice molecular breeding in the 21st century. However, the great challenge for molecular breeders is to apply genomics data in actual breeding programs. Here, we review the current status of MAS in rice, current genomics projects and promising new genotyping methodologies, and evaluate the probable impact of genomics research. We also identify critical research areas to “bridge the application gap” between QTL identification and applied breeding that need to be addressed to realize the full potential of MAS, and propose ideas and guidelines for establishing rice molecular breeding labs in the postgenome sequence era to integrate molecular breeding within the context of overall rice breeding and research programs. PMID:18528527

  5. Whole-exome sequencing supports genetic heterogeneity in childhood apraxia of speech.

    PubMed

    Worthey, Elizabeth A; Raca, Gordana; Laffin, Jennifer J; Wilk, Brandon M; Harris, Jeremy M; Jakielski, Kathy J; Dimmock, David P; Strand, Edythe A; Shriberg, Lawrence D

    2013-10-02

    Childhood apraxia of speech (CAS) is a rare, severe, persistent pediatric motor speech disorder with associated deficits in sensorimotor, cognitive, language, learning and affective processes. Among other neurogenetic origins, CAS is the disorder segregating with a mutation in FOXP2 in a widely studied, multigenerational London family. We report the first whole-exome sequencing (WES) findings from a cohort of 10 unrelated participants, ages 3 to 19 years, with well-characterized CAS. As part of a larger study of children and youth with motor speech sound disorders, 32 participants were classified as positive for CAS on the basis of a behavioral classification marker using auditory-perceptual and acoustic methods that quantify the competence, precision and stability of a speaker's speech, prosody and voice. WES of 10 randomly selected participants was completed using the Illumina Genome Analyzer IIx Sequencing System. Image analysis, base calling, demultiplexing, read mapping, and variant calling were performed using Illumina software. Software developed in-house was used for variant annotation, prioritization and interpretation to identify those variants likely to be deleterious to neurodevelopmental substrates of speech-language development. Among potentially deleterious variants, clinically reportable findings of interest occurred on a total of five chromosomes (Chr3, Chr6, Chr7, Chr9 and Chr17), which included six genes either strongly associated with CAS (FOXP1 and CNTNAP2) or associated with disorders with phenotypes overlapping CAS (ATP13A4, CNTNAP1, KIAA0319 and SETX). A total of 8 (80%) of the 10 participants had clinically reportable variants in one or two of the six genes, with variants in ATP13A4, KIAA0319 and CNTNAP2 being the most prevalent. 
Similar to the results reported in emerging WES studies of other complex neurodevelopmental disorders, our findings from this first WES study of CAS are interpreted as support for heterogeneous genetic origins of this pediatric motor speech disorder with multiple genes, pathways and complex interactions. We also submit that our findings illustrate the potential use of WES for both gene identification and case-by-case clinical diagnostics in pediatric motor speech disorders.

  6. SiGN-SSM: open source parallel software for estimating gene networks with state space models.

    PubMed

    Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru

    2011-04-15

    SiGN-SSM is open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), which is a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint that is effective in stabilizing the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM are available on our web site. Contact: tamada@ims.u-tokyo.ac.jp.

  7. The Application of COI Gene for Species Identification of Forensically Important Muscid Flies (Diptera: Muscidae).

    PubMed

    Ren, Lipin; Chen, Wei; Shang, Yanjie; Meng, Fanming; Zha, Lagabaiyila; Wang, Yong; Guo, Yadong

    2018-05-17

    Muscid Flies (Diptera: Muscidae) are of great forensic importance due to their wide distribution, ubiquitous and synanthropic nature. They are frequently neglected as they tend to arrive at the corpses later than the flesh flies and blow flies. Moreover, the lack of species-level identification also hinders investigation of medicolegal purposes. To overcome the difficulty of morphological identification, molecular method has gained relevance. Cytochrome c oxidase subunit I (COI) gene has been widely utilized. Nonetheless, to achieve correct identification of an unknown sample, it is important to survey certain muscid taxa from its geographic distribution range. Accordingly, the aim of this study is to contribute more geographically specific. We sequenced the COI gene of 51 muscid specimens of 12 species, and added all correct sequences available in GenBank to yield a total data set of 125 COI sequences from 33 muscid species to evaluate the COI gene as a molecular diagnostic tool. The interspecific distances were extremely high (4.7-19.8%) in either the standard barcoding fragment (658 bp) or the long COI sequence (1,019-1,535 bp), demonstrating that these two genetic markers were nearly identical in the species identification. However, the intraspecific distances of the long COI sequences were significantly higher than the barcoding region for the conspecific species that geographical locations vary greatly. Therefore, genetic diversity presented in this study provides a reference for species identification of muscid flies. Nevertheless, further investigation and data from more muscid species are required to enhance the efficacy of species-level identification using COI gene as a genetic marker.
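
The inter- and intraspecific distances reported above are pairwise genetic distances between aligned COI sequences. The abstract does not name the distance model, so the following Python sketch assumes the simplest one, the uncorrected p-distance (proportion of differing sites); barcoding studies often use a corrected model instead. The function name `p_distance` and the toy sequences are illustrative.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Toy 10-base alignment with one mismatch.
    print(p_distance("ACGTACGTAC", "ACGTACGTAA"))
```

Multiplying the result by 100 gives the percentage form used in the abstract (for example, 4.7-19.8% between species).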

  8. Then and now: use of 16S rDNA gene sequencing for bacterial identification and discovery of novel bacteria in clinical microbiology laboratories.

    PubMed

    Woo, P C Y; Lau, S K P; Teng, J L L; Tse, H; Yuen, K-Y

    2008-10-01

    In the last decade, as a result of the widespread use of PCR and DNA sequencing, 16S rDNA sequencing has played a pivotal role in the accurate identification of bacterial isolates and the discovery of novel bacteria in clinical microbiology laboratories. For bacterial identification, 16S rDNA sequencing is particularly important in the case of bacteria with unusual phenotypic profiles, rare bacteria, slow-growing bacteria, uncultivable bacteria and culture-negative infections. Not only has it provided insights into aetiologies of infectious disease, but it also helps clinicians in choosing antibiotics and in determining the duration of treatment and infection control procedures. With the use of 16S rDNA sequencing, 215 novel bacterial species, 29 of which belong to novel genera, have been discovered from human specimens in the past 7 years of the 21st century (2001-2007). One hundred of the 215 novel species, 15 belonging to novel genera, have been found in four or more subjects. The largest number of novel species discovered were of the genera Mycobacterium (n = 12) and Nocardia (n = 6). The oral cavity/dental-related specimens (n = 19) and the gastrointestinal tract (n = 26) were the most important sites for discovery and/or reservoirs of novel species. Among the 100 novel species, Streptococcus sinensis, Laribacter hongkongensis, Clostridium hathewayi and Borrelia spielmanii have been most thoroughly characterized, with the reservoirs and routes of transmission documented, and S. sinensis, L. hongkongensis and C. hathewayi have been found globally. One of the greatest hurdles in putting 16S rDNA sequencing into routine use in clinical microbiology laboratories is automation of the technology. The only step that can be automated at the moment is input of the 16S rDNA sequence of the bacterial isolate for identification into one of the software packages that will generate the result of the identity of the isolate on the basis of its sequence database. 
However, studies on the accuracy of the software packages have given highly varied results, and interpretation of results remains difficult for most technicians, and even for clinical microbiologists. To fully utilize 16S rDNA sequencing in clinical microbiology, better guidelines are needed for interpretation of the identification results, and additional/supplementary methods are necessary for bacterial species that cannot be identified confidently by 16S rDNA sequencing alone.

  9. Identification of novel mutations in endometrial cancer patients by whole-exome sequencing.

    PubMed

    Chang, Ya-Sian; Huang, Hsien-Da; Yeh, Kun-Tu; Chang, Jan-Gowth

    2017-05-01

    The aim of the present study was to identify genomic alterations in Taiwanese endometrial cancer patients. This information is vitally important in Taiwan, where endometrial cancer is the second most common gynecological cancer. We performed whole-exome sequencing on DNA from 14 tumor tissue samples from Taiwanese endometrial cancer patients. We used the Genome Analysis Tool kit software package for data analysis, and the dbSNP, Catalogue of Somatic Mutations in Cancer (COSMIC) and The Cancer Genome Atlas (TCGA) databases for comparisons. Variants were validated via Sanger sequencing. We identified 143 non-synonymous mutations in 756 canonical cancer-related genes and 1,271 non-synonymous mutations in non-canonical cancer-related genes in 14 endometrial samples. PTEN, KRAS and PIK3R1 were the most frequently mutated canonical cancer-related genes. Our results revealed nine potential driver genes (MAPT, IL24, MCM6, TSC1, BIRC2, CIITA, DST, CASP8 and NOTCH2) and 21 potential passenger genes (ARMCX4, IGSF10, VPS13C, DCT, DNAH14, TLN1, ZNF605, ZSCAN29, MOCOS, CMYA5, PCDH17, UGT1A8, CYFIP2, MACF1, NUDT5, JAKMIP1, PCDHGB4, FAM178A, SNX6, IMP4 and PCMTD1). The detected molecular aberrations led to putative activation of the mTOR, Wnt, MAPK, VEGF and ErbB pathways, as well as aberrant DNA repair, cell cycle control and apoptosis pathways. We characterized the mutational landscape and genetic alterations in multiple cellular pathways of endometrial cancer in the Taiwanese population.

  10. Identification, expression and phylogenetic analysis of EgG1Y162 from Echinococcus granulosus.

    PubMed

    Zhang, Fengbo; Ma, Xiumin; Zhu, Yuejie; Wang, Hongying; Liu, Xianfei; Zhu, Min; Ma, Haimei; Wen, Hao; Fan, Haining; Ding, Jianbing

    2014-01-01

    This study aimed to clone, identify and analyze the characteristics of the egG1Y162 gene from Echinococcus granulosus. Genomic DNA and total RNA were extracted from four developmental stages of Echinococcus granulosus: protoscolex, germinal layer, adult and egg. Fluorescent quantitative PCR was used to analyze the expression of the egG1Y162 gene. The prokaryotic expression plasmid pET41a-EgG1Y162 was constructed to express recombinant His-EgG1Y162 antigen. Western blot analysis was performed to detect the antigenicity of the EgG1Y162 antigen. The gene sequence, amino acid alignment and phylogenetic tree of EgG1Y162 were analyzed with BLAST, online Spidey and MEGA4 software, respectively. The egG1Y162 gene was expressed in all four developmental stages of Echinococcus granulosus. Its expression was highest in the adult stage, with a relative value of 19.526, significantly higher than in the other three stages. Additionally, Western blot analysis revealed that the EgG1Y162 recombinant protein reacted well with serum samples from Echinococcus granulosus-infected humans and dogs. Moreover, the EgG1Y162 antigen was phylogenetically closest to the EmY162 antigen, with a similarity of over 90%. Our study identified the EgG1Y162 antigen in Echinococcus granulosus for the first time. The EgG1Y162 antigen was highly similar to the EmY162 antigen, with the genetic differences lying mainly in the intron region, and the EgG1Y162 recombinant protein showed good antigenicity.

  11. Streptococcus iniae, a Human and Animal Pathogen: Specific Identification by the Chaperonin 60 Gene Identification Method

    PubMed Central

    Goh, Swee Han; Driedger, David; Gillett, Sandra; Low, Donald E.; Hemmingsen, Sean M.; Amos, Mayben; Chan, David; Lovgren, Marguerite; Willey, Barbara M.; Shaw, Carol; Smith, John A.

    1998-01-01

    It was recently reported that Streptococcus iniae, a bacterial pathogen of aquatic animals, can cause serious disease in humans. Using the chaperonin 60 (Cpn60) gene identification method with reverse checkerboard hybridization and chemiluminescent detection, we identified correctly each of 12 S. iniae samples among 34 aerobic gram-positive isolates from animal and clinical human sources. PMID:9650992

  12. Automated designation of tie-points for image-to-image coregistration.

    Treesearch

    R.E. Kennedy; W.B. Cohen

    2003-01-01

    Image-to-image registration requires identification of common points in both images (image tie-points: ITPs). Here we describe software implementing an automated, area-based technique for identifying ITPs. The ITP software was designed to follow two strategies: (1) capitalize on human knowledge and pattern recognition strengths, and (2) favour robustness in many...

  13. Identification of Factors That Affect Software Complexity.

    ERIC Educational Resources Information Center

    Kaiser, Javaid

    A survey of computer scientists was conducted to identify factors that affect software complexity. A total of 160 items were selected from the literature to include in a questionnaire sent to 425 individuals who were employees of computer-related businesses in Lawrence and Kansas City. The items were grouped into nine categories called system…

  14. Practical Issues in Implementing Software Reliability Measurement

    NASA Technical Reports Server (NTRS)

    Nikora, Allen P.; Schneidewind, Norman F.; Everett, William W.; Munson, John C.; Vouk, Mladen A.; Musa, John D.

    1999-01-01

    Many ways of estimating software systems' reliability, or reliability-related quantities, have been developed over the past several years. Of particular interest are methods that can be used to estimate a software system's fault content prior to test, or to discriminate between components that are fault-prone and those that are not. The results of these methods can be used to: 1) More accurately focus scarce fault identification resources on those portions of a software system most in need of it. 2) Estimate and forecast the risk of exposure to residual faults in a software system during operation, and develop risk and safety criteria to guide the release of a software system to fielded use. 3) Estimate the efficiency of test suites in detecting residual faults. 4) Estimate the stability of the software maintenance process.

  15. Automatic identification approach for high-performance liquid chromatography-multiple reaction monitoring fatty acid global profiling.

    PubMed

    Tie, Cai; Hu, Ting; Jia, Zhi-Xin; Zhang, Jin-Lan

    2015-08-18

    Fatty acids (FAs) are a group of lipid molecules that are essential to organisms. As potential biomarkers for different diseases, FAs have attracted increasing attention from both biological researchers and the pharmaceutical industry. A sensitive and accurate method for globally profiling and identifying FAs is required for biomarker discovery. The high selectivity and sensitivity of high-performance liquid chromatography-multiple reaction monitoring (HPLC-MRM) gives it great potential to fulfill the need to identify FAs from complicated matrices. This paper developed a new approach for global FA profiling and identification for HPLC-MRM FA data mining. Mathematical models for identifying FAs were simulated using the isotope-induced retention time (RT) shift (IRS) and peak area ratios between parallel isotope peaks for a series of FA standards. The FA structures were predicted using another model based on the RT and molecular weight. Fully automated FA identification software was coded using the Qt platform based on these mathematical models. Different samples were used to verify the software. A high identification efficiency (greater than 75%) was observed when 96 FA species were identified in plasma. This FA identification strategy promises to accelerate FA research and applications.

  16. Cell type-selective disease-association of genes under high regulatory load

    PubMed Central

    Galhardo, Mafalda; Berninger, Philipp; Nguyen, Thanh-Phuong; Sauter, Thomas; Sinkkonen, Lasse

    2015-01-01

    We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3′ UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner. PMID:26338775

  17. Dissemination of metallo-β-lactamase in Pseudomonas aeruginosa isolates in Egypt: mutation in blaVIM-4.

    PubMed

    Hashem, Hany; Hanora, Amro; Abdalla, Salah; Shaeky, Alaa; Saad, Alaa

    2017-05-01

    This study was designed to investigate the prevalence of metallo-β-lactamase (MBL) in Pseudomonas aeruginosa isolates collected from Suez Canal University Hospital in Ismailia, Egypt. Antibiotic susceptibility testing and phenotypic and genotypic screening for MBLs were performed on 147 isolates of P. aeruginosa. MICs were determined by agar dilution method for carbapenem that was ≥2 μg/mL for meropenem. MBL genes were detected by multiplex and monoplex PCR for P. aeruginosa-harbored plasmids. Mutation profile of sequenced MBL genes was screened using online software Clustal Omega. Out of 147 P. aeruginosa, 39 (26.5%) were carbapenem-resistant isolates and 25 (64%) were confirmed to be positive for MBLs. The susceptibility rate of P. aeruginosa toward polymyxin B and norfloxacin was 99% and 88%, respectively. Identification of collected isolates by API analysis and constructed phylogenetic tree of 16S rRNA showed that the isolates were related to P. aeruginosa species. The frequency of blaGIM-1, blaSIM-1, and blaSPM-1 was 52%, 48%, and 24%, respectively. BlaVIM and blaIMP-like genes were 20% and 4% and the sequences confirm the isolate to be blaVIM-1, blaVIM-2, blaVIM-4, and blaIMP-1. Three mutations were identified in blaVIM-4 gene. Our study emphasizes the high occurrence of multidrug-resistant P. aeruginosa-producing MBL enzymes. © 2017 APMIS. Published by John Wiley & Sons Ltd.

  18. VIP Barcoding: composition vector-based software for rapid species identification based on DNA barcoding.

    PubMed

    Fan, Long; Hui, Jerome H L; Yu, Zu Guo; Chu, Ka Hou

    2014-07-01

    Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern and has been criticized for its local optimization. However, current more accurate software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: a user-friendly software in graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce searching space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/. © 2014 John Wiley & Sons Ltd.
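    The two-stage search described in this abstract can be sketched as follows. This is a minimal illustration, not the VIP Barcoding implementation: it assumes plain k-mer frequency vectors compared by cosine distance for the alignment-free screening stage (the published composition vector method is more elaborate), and it assumes the query and reference sequences are already aligned and of equal length for the K2P stage. The function names, `shortlist_size` and `k` are hypothetical.

```python
import math
from collections import Counter
from itertools import product

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kmer_vector(seq, k=3):
    """Relative frequency of every DNA k-mer (a simplified composition vector)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values()) or 1
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


def cosine_distance(u, v):
    """1 - cosine similarity; 1.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def k2p_distance(s1, s2):
    """Kimura 2-parameter distance for two pre-aligned sequences."""
    pairs = [(a, b) for a, b in zip(s1, s2) if a in "ACGT" and b in "ACGT"]
    if not pairs:
        return math.inf
    diffs = [(a, b) for a, b in pairs if a != b]
    transitions = sum(
        1 for a, b in diffs if {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    )
    p = transitions / len(pairs)                 # transition proportion
    q = (len(diffs) - transitions) / len(pairs)  # transversion proportion
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    # The formula is undefined for saturated sequences; treat them as infinitely far.
    return -0.5 * math.log(inner) if inner > 0 else math.inf


def identify(query, references, shortlist_size=2, k=3):
    """Stage 1: shortlist by composition; stage 2: nearest neighbour by K2P."""
    qv = kmer_vector(query, k)
    shortlist = sorted(
        references,
        key=lambda name: cosine_distance(qv, kmer_vector(references[name], k)),
    )[:shortlist_size]
    return min(shortlist, key=lambda name: k2p_distance(query, references[name]))
```

    The screening stage only ranks candidates, so the more expensive distance is computed on the shortlist alone; that is the scalability argument the abstract makes.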

  19. DNASynth: a software application to optimization of artificial gene synthesis

    NASA Astrophysics Data System (ADS)

    Muczyński, Jan; Nowak, Robert M.

    2017-08-01

    DNASynth is a client-server software application in which the client runs in a web browser. The aim of this program is to support and optimize the process of artificial gene synthesis using the Ligase Chain Reaction (LCR). With LCR it is possible to obtain a DNA strand coding for a user-defined peptide. The DNA sequence is calculated by an optimization algorithm that considers optimal codon usage, minimal energy of secondary structures and a minimal number of required LCRs. Additionally, the absence of sequences characteristic of a user-defined set of restriction enzymes is guaranteed. The presented software was tested on synthetic and real data.
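    Two of the constraints named in this abstract, preferred codon usage and the absence of restriction sites, can be illustrated with a small greedy sketch. It is not the DNASynth algorithm: it ignores secondary-structure energy and the number of LCR steps, checks only the forward strand, and uses a hypothetical preference table (`CODON_PREFERENCE`) whose ordering is illustrative rather than taken from real codon-usage data.

```python
# Hypothetical preference order per amino acid, most preferred codon first.
# The codons are valid, but the ordering is illustrative only.
CODON_PREFERENCE = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "L": ["CTG", "TTA", "CTT"],
    "E": ["GAA", "GAG"],
    "F": ["TTC", "TTT"],
    "G": ["GGC", "GGT"],
}


def back_translate(peptide, forbidden_sites=()):
    """Greedily pick the most preferred codon that creates no forbidden site."""
    dna = ""
    for amino_acid in peptide:
        for codon in CODON_PREFERENCE[amino_acid]:
            candidate = dna + codon
            # The prefix is already site-free, so this only rejects new sites.
            if not any(site in candidate for site in forbidden_sites):
                dna = candidate
                break
        else:
            raise ValueError(
                f"no codon for {amino_acid!r} avoids the forbidden sites"
            )
    return dna


if __name__ == "__main__":
    # GAATTC is the EcoRI recognition site; it is its own reverse complement.
    print(back_translate("MEF"))
    print(back_translate("MEF", ["GAATTC"]))
```

    A greedy choice can fail where a global search would succeed, which is one reason a real tool would treat this as an optimization problem rather than a single pass.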

  20. Systemic analysis of genome-wide expression profiles identified potential therapeutic targets of demethylation drugs for glioblastoma.

    PubMed

    Ning, Tongbo; Cui, Hao; Sun, Feng; Zou, Jidian

    2017-09-05

    Glioblastoma is one of the most aggressive malignant brain tumors, with high morbidity and mortality. Demethylation drugs have been developed for its treatment, but little efficacy has been observed. The purpose of this study was to screen therapeutic targets of demethylation drugs or bioactive molecules for glioblastoma through systemic bioinformatics analysis. First, we downloaded genome-wide expression profiles from the Gene Expression Omnibus (GEO) and conducted the primary analysis with R software, mainly including preprocessing of raw microarray data, conversion between probe IDs and gene symbols, and identification of differentially expressed genes (DEGs). Second, functional enrichment analysis was conducted via the Database for Annotation, Visualization and Integrated Discovery (DAVID) to explore biological processes involved in the development of glioblastoma. Third, we constructed a protein-protein interaction (PPI) network of the genes of interest and conducted cross analysis of multiple datasets to obtain potential therapeutic targets for glioblastoma. Finally, we further confirmed the therapeutic targets through real-time RT-PCR. As a result, biological processes related to cancer development, amino metabolism, immune response, etc. were found to be significantly enriched in genes that were differentially expressed in glioblastoma and regulated by 5'aza-dC. In addition, network and cross analysis identified ACAT2, UFC1 and CYB5R1 as novel therapeutic targets of demethylation drugs, which was also confirmed by real-time RT-PCR. In conclusion, our study identified several biological processes and genes that are involved in the development of glioblastoma and regulated by 5'aza-dC, which may be helpful for the treatment of glioblastoma. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Matrix-Assisted Laser Desorption Ionization (MALDI)-Time of Flight Mass Spectrometry- and MALDI Biotyper-Based Identification of Cultured Biphenyl-Metabolizing Bacteria from Contaminated Horseradish Rhizosphere Soil

    PubMed Central

    Uhlik, Ondrej; Strejcek, Michal; Junkova, Petra; Sanda, Miloslav; Hroudova, Miluse; Vlcek, Cestmir; Mackova, Martina; Macek, Tomas

    2011-01-01

    Bacteria that are able to utilize biphenyl as a sole source of carbon were extracted and isolated from polychlorinated biphenyl (PCB)-contaminated soil vegetated by horseradish. Isolates were identified using matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS). The usage of MALDI Biotyper for the classification of isolates was evaluated and compared to 16S rRNA gene sequence analysis. A wide spectrum of bacteria was isolated, with Arthrobacter, Serratia, Rhodococcus, and Rhizobium being predominant. Arthrobacter isolates also represented the most diverse group. The use of MALDI Biotyper in many cases permitted the identification at the level of species, which was not achieved by 16S rRNA gene sequence analyses. However, some isolates had to be identified by 16S rRNA gene analyses if MALDI Biotyper-based identification was at the level of probable or not reliable identification, usually due to a lack of reference spectra included in the database. Overall, this study shows the possibility of using MALDI-TOF MS and MALDI Biotyper for the fast and relatively nonlaborious identification/classification of soil isolates. At the same time, it demonstrates the dominant role of employing 16S rRNA gene analyses for the identification of recently isolated strains that can later fill the gaps in the protein-based identification databases. PMID:21821747

  2. Transcriptional profile of breast muscle in heat stressed layers is similar to that of broiler chickens at control temperature.

    PubMed

    Zahoor, Imran; de Koning, Dirk-Jan; Hocking, Paul M

    2017-09-20

    In recent years, the commercial importance of changes in muscle function of broiler chickens and of the corresponding effects on meat quality has increased. Furthermore, broilers are more sensitive to heat stress during transport and at high ambient temperatures than smaller egg-laying chickens. We hypothesised that heat stress would amplify muscle damage and expression of genes that are involved in such changes and, thus, lead to the identification of pathways and networks associated with broiler muscle and meat quality traits. Broiler and layer chickens were exposed to control or high ambient temperatures to characterise differences in gene expression between the two genotypes and the two environments. Whole-genome expression studies in breast muscles of broiler and layer chickens were conducted before and after heat stress; 2213 differentially-expressed genes were detected based on a significant (P < 0.05) genotype × treatment interaction. This gene set was analysed with the BioLayout Express 3D and Ingenuity Pathway Analysis software and relevant biological pathways and networks were identified. Genes involved in functions related to inflammatory reactions, cell death, oxidative stress and tissue damage were upregulated in control broilers compared with control and heat-stressed layers. Expression of these genes was further increased in heat-stressed broilers. Differences in gene expression between broiler and layer chickens under control and heat stress conditions suggest that damage of breast muscles in broilers at normal ambient temperatures is similar to that in heat-stressed layers and is amplified when broilers are exposed to heat stress. The patterns of gene expression of the two genotypes under heat stress were almost the polar opposite of each other, which is consistent with the conclusion that broiler chickens were not able to cope with heat stress by dissipating their body heat. 
    The differentially expressed gene networks and pathways were consistent with the pathological changes that are observed in the breast muscle of heat-stressed broilers.

  3. A regulation probability model-based meta-analysis of multiple transcriptomics data sets for cancer biomarker identification.

    PubMed

    Xie, Xin-Ping; Xie, Yu-Feng; Wang, Hong-Qiang

    2017-08-23

    Large-scale accumulation of omics data poses a pressing challenge of integrative analysis of multiple data sets in bioinformatics. An open question of such integrative analysis is how to pinpoint consistent but subtle gene activity patterns across studies. Study heterogeneity needs to be addressed carefully for this goal. This paper proposes a regulation probability model-based meta-analysis, jGRP, for identifying differentially expressed genes (DEGs). The method integrates multiple transcriptomics data sets in a gene regulatory space instead of in a gene expression space, which makes it easy to capture and manage data heterogeneity across studies from different laboratories or platforms. Specifically, we transform gene expression profiles into a united gene regulation profile across studies by mathematically defining two gene regulation events between two conditions and estimating their occurring probabilities in a sample. Finally, a novel differential expression statistic is established based on the gene regulation profiles, realizing accurate and flexible identification of DEGs in gene regulation space. We evaluated the proposed method on simulation data and real-world cancer datasets and showed the effectiveness and efficiency of jGRP in identifying DEGs in the context of meta-analysis. Data heterogeneity largely influences the performance of meta-analysis for DEG identification. Existing meta-analysis methods were revealed to exhibit very different degrees of sensitivity to study heterogeneity. The proposed method, jGRP, can be a standalone tool due to its united framework and controllable way to deal with study heterogeneity.

  4. Identifying Mendelian disease genes with the Variant Effect Scoring Tool

    PubMed Central

    2013-01-01

    Background: Whole exome sequencing studies identify hundreds to thousands of rare protein coding variants of ambiguous significance for human health. Computational tools are needed to accelerate the identification of specific variants and genes that contribute to human disease. Results: We have developed the Variant Effect Scoring Tool (VEST), a supervised machine learning-based classifier, to prioritize rare missense variants with likely involvement in human disease. The VEST classifier training set comprised ~45,000 disease mutations from the latest Human Gene Mutation Database release and another ~45,000 high frequency (allele frequency >1%) putatively neutral missense variants from the Exome Sequencing Project. VEST outperforms some of the most popular methods for prioritizing missense variants in carefully designed holdout benchmarking experiments (VEST ROC AUC = 0.91, PolyPhen2 ROC AUC = 0.86, SIFT4.0 ROC AUC = 0.84). VEST estimates variant score p-values against a null distribution of VEST scores for neutral variants not included in the VEST training set. These p-values can be aggregated at the gene level across multiple disease exomes to rank genes for probable disease involvement. We tested the ability of an aggregate VEST gene score to identify candidate Mendelian disease genes, based on whole-exome sequencing of a small number of disease cases. We used whole-exome data for two Mendelian disorders for which the causal gene is known. Considering only genes that contained variants in all cases, the VEST gene score ranked dihydroorotate dehydrogenase (DHODH) number 2 of 2253 genes in four cases of Miller syndrome, and myosin-3 (MYH3) number 2 of 2313 genes in three cases of Freeman Sheldon syndrome. Conclusions: Our results demonstrate the potential power gain of aggregating bioinformatics variant scores into gene-level scores and the general utility of bioinformatics in assisting the search for disease genes in large-scale exome sequencing studies. VEST is available as a stand-alone software package at http://wiki.chasmsoftware.org and is hosted by the CRAVAT web server at http://www.cravat.us PMID:23819870
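    The idea of scoring a variant against a null distribution and then aggregating p-values per gene can be sketched as below. The abstract does not say how VEST combines p-values, so Fisher's method is used here purely as an assumption, and the function names and input layout (a mapping from gene to one p-value per disease case) are hypothetical.

```python
import math
from bisect import bisect_left


def empirical_p_value(score, sorted_null_scores):
    """Share of null (neutral-variant) scores >= score, with +1 smoothing."""
    n = len(sorted_null_scores)
    n_at_least = n - bisect_left(sorted_null_scores, score)
    return (n_at_least + 1) / (n + 1)


def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) follows chi-square with 2k degrees of freedom.

    For even degrees of freedom the chi-square survival function has a
    closed form, so no statistics library is needed.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


def rank_genes(gene_to_p_values):
    """Order genes from most to least significant combined p-value."""
    return sorted(
        gene_to_p_values, key=lambda g: fisher_combined_p(gene_to_p_values[g])
    )
```

    With a single p-value the combined result equals that p-value, which is a quick sanity check on the closed form.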

  5. Identification and functional analysis of a core gene module associated with hepatitis C virus-induced human hepatocellular carcinoma progression.

    PubMed

    Bai, Gaobo; Zheng, Wenling; Ma, Wenli

    2018-05-01

    Hepatitis C virus (HCV)-induced human hepatocellular carcinoma (HCC) progression may be due to a complex multi-step processes. The developmental mechanism of these processes is worth investigating for the prevention, diagnosis and therapy of HCC. The aim of the present study was to investigate the molecular mechanism underlying the progression of HCV-induced hepatocarcinogenesis. First, the dynamic gene module, consisting of key genes associated with progression between the normal stage and HCC, was identified using the Weighted Gene Co-expression Network Analysis tool from R language. By defining those genes in the module as seeds, the change of co-expression in differentially expressed gene sets in two consecutive stages of pathological progression was examined. Finally, interaction pairs of HCV viral proteins and their directly targeted proteins in the identified module were extracted from the literature and a comprehensive interaction dataset from yeast two-hybrid experiments. By combining the interactions between HCV and their targets, and protein-protein interactions in the Search Tool for the Retrieval of Interacting Genes database (STRING), the HCV-key genes interaction network was constructed and visualized using Cytoscape software 3.2. As a result, a module containing 44 key genes was identified to be associated with HCC progression, due to the dynamic features and functions of those genes in the module. Several important differentially co-expressed gene pairs were identified between non-HCC and HCC stages. In the key genes, cyclin dependent kinase 1 (CDK1), NDC80, cyclin A2 (CCNA2) and rac GTPase activating protein 1 (RACGAP1) were shown to be targeted by the HCV nonstructural proteins NS5A, NS3 and NS5B, respectively. The four genes perform an intermediary role between the HCV viral proteins and the dysfunctional module in the HCV key genes interaction network. 
    These findings provided valuable information for understanding the mechanism of HCV-induced HCC progression and for seeking drug targets for the therapy and prevention of HCC.
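    The core quantity behind weighted co-expression networks is a soft-thresholded correlation: the adjacency between two genes is the absolute Pearson correlation of their expression profiles raised to a power. The sketch below shows only that step, in Python rather than the R package the study used; the exponent `beta=6` and the input layout (gene name mapped to a list of expression values) are assumptions, and module detection is not covered.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: |cor(g, h)| ** beta for every gene pair."""
    genes = list(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }
```

    Raising the correlation to a power keeps strong co-expression close to 1 while pushing weak correlations towards 0, so no hard cut-off is needed.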

  6. Semi-automated De-identification of German Content Sensitive Reports for Big Data Analytics.

    PubMed

    Seuss, Hannes; Dankerl, Peter; Ihle, Matthias; Grandjean, Andrea; Hammon, Rebecca; Kaestle, Nicola; Fasching, Peter A; Maier, Christian; Christoph, Jan; Sedlmayr, Martin; Uder, Michael; Cavallaro, Alexander; Hammon, Matthias

    2017-07-01

    Purpose  Projects involving collaborations between different institutions require data security via selective de-identification of words or phrases. A semi-automated de-identification tool was developed and evaluated on different types of medical reports natively and after adapting the algorithm to the text structure. Materials and Methods  A semi-automated de-identification tool was developed and evaluated for its sensitivity and specificity in detecting sensitive content in written reports. Data from 4671 pathology reports (4105 + 566 in two different formats), 2804 medical reports, 1008 operation reports, and 6223 radiology reports of 1167 patients suffering from breast cancer were de-identified. The content was itemized into four categories: direct identifiers (name, address), indirect identifiers (date of birth/operation, medical ID, etc.), medical terms, and filler words. The software was tested natively (without training) in order to establish a baseline. The reports were manually edited and the model re-trained for the next test set. After manually editing 25, 50, 100, 250, 500 and if applicable 1000 reports of each type re-training was applied. Results  In the native test, 61.3 % of direct and 80.8 % of the indirect identifiers were detected. The performance (P) increased to 91.4 % (P25), 96.7 % (P50), 99.5 % (P100), 99.6 % (P250), 99.7 % (P500) and 100 % (P1000) for direct identifiers and to 93.2 % (P25), 97.9 % (P50), 97.2 % (P100), 98.9 % (P250), 99.0 % (P500) and 99.3 % (P1000) for indirect identifiers. Without training, 5.3 % of medical terms were falsely flagged as critical data. The performance increased, after training, to 4.0 % (P25), 3.6 % (P50), 4.0 % (P100), 3.7 % (P250), 4.3 % (P500), and 3.1 % (P1000). Roughly 0.1 % of filler words were falsely flagged. Conclusion  Training of the developed de-identification tool continuously improved its performance. 
    Training with roughly 100 edited reports enables reliable detection and labeling of sensitive data in different types of medical reports. Key Points: · Collaborations between different institutions require de-identification of patients' data. · Software-based de-identification of content-sensitive reports grows in importance as a result of 'Big data'. · A de-identification software was developed and tested natively and after training. · The proposed de-identification software worked quite reliably, following training with roughly 100 edited reports. · A final check of the texts by an authorized person remains necessary. Citation Format: Seuss H, Dankerl P, Ihle M et al. Semi-automated De-identification of German Content Sensitive Reports for Big Data Analytics. Fortschr Röntgenstr 2017; 189: 661 - 671. © Georg Thieme Verlag KG Stuttgart · New York.

  7. Qualification of Simulation Software for Safety Assessment of Sodium Cooled Fast Reactors. Requirements and Recommendations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Nicholas R.; Pointer, William David; Sieger, Matt

    2016-04-01

    The goal of this review is to enable application of codes or software packages for safety assessment of advanced sodium-cooled fast reactor (SFR) designs. To address near-term programmatic needs, the authors have focused on two objectives. First, the authors have focused on identification of requirements for software QA that must be satisfied to enable the application of software to future safety analyses. Second, the authors have collected best practices applied by other code development teams to minimize cost and time of initial code qualification activities and to recommend a path to the stated goal.

  8. Mugshot Identification Database (MID)

    National Institute of Standards and Technology Data Gateway

    NIST Mugshot Identification Database (MID) (Web, free access)   NIST Special Database 18 is being distributed for use in development and testing of automated mugshot identification systems. The database consists of three CD-ROMs, containing a total of 3248 images of variable size using lossless compression. A newer version of the compression/decompression software on the CDROM can be found at the website http://www.nist.gov/itl/iad/ig/nigos.cfm as part of the NBIS package.

  9. Diagnostic test for prenatal identification of Down's syndrome and mental retardation and gene therapy therefor

    DOEpatents

    Smith, Desmond J.; Rubin, Edward M.

    2000-01-01

    A diagnostic test useful for prenatal identification of Down syndrome and mental retardation. A method for gene therapy for correction and treatment of Down syndrome. DYRK gene involved in the ability to learn. A method for diagnosing Down's syndrome and mental retardation and an assay therefor. A pharmaceutical composition for treatment of Down's syndrome mental retardation.

  10. Targeting Conserved Genes in Penicillium Species.

    PubMed

    Peterson, Stephen W

    2017-01-01

    Polymerase chain reaction amplification of conserved genes and sequence analysis provides a very powerful tool for the identification of toxigenic as well as non-toxigenic Penicillium species. Sequences are obtained by amplification of the gene fragment, sequencing via capillary electrophoresis of dideoxynucleotide-labeled fragments or NGS. The sequences are compared to a database of validated isolates. Identification of species indicates the potential of the fungus to make particular mycotoxins.

  11. Multiscale global identification of porous structures

    NASA Astrophysics Data System (ADS)

    Hatłas, Marcin; Beluch, Witold

    2018-01-01

    The paper is devoted to the evolutionary identification of the material constants of porous structures based on measurements conducted on a macro scale. Numerical homogenization with the RVE concept is used to determine the equivalent properties of a macroscopically homogeneous material. Finite element method software is applied to solve the boundary-value problem in both scales. Global optimization methods in the form of an evolutionary algorithm are employed to solve the identification task. Modal analysis is performed to collect the data necessary for the identification. A numerical example presenting the effectiveness of the proposed approach is attached.

  12. Developing a standard for de-identifying electronic patient records written in Swedish: precision, recall and F-measure in a manual and computerized annotation trial.

    PubMed

    Velupillai, Sumithra; Dalianis, Hercules; Hassel, Martin; Nilsson, Gunnar H

    2009-12-01

    Electronic patient records (EPRs) contain a large amount of information written in free text. This information is considered very valuable for research but is also very sensitive since the free text parts may contain information that could reveal the identity of a patient. Therefore, methods for de-identifying EPRs are needed. The work presented here aims to perform a manual and automatic Protected Health Information (PHI)-annotation trial for EPRs written in Swedish. This study consists of two main parts: the initial creation of a manually PHI-annotated gold standard, and the porting and evaluation of an existing de-identification software written for American English to Swedish in a preliminary automatic de-identification trial. Results are measured with precision, recall and F-measure. This study reports fairly high Inter-Annotator Agreement (IAA) results on the manually created gold standard, especially for specific tags such as names. The average IAA over all tags was 0.65 F-measure (0.84 F-measure highest pairwise agreement). For name tags the average IAA was 0.80 F-measure (0.91 F-measure highest pairwise agreement). Porting a de-identification software written for American English to Swedish directly was unfortunately non-trivial, yielding poor results. Developing gold standard sets as well as automatic systems for de-identification tasks in Swedish is feasible. However, discussions and definitions of identifiable information are needed, as well as further developments both on the tag sets and the annotation guidelines, in order to get a reliable gold standard. A completely new de-identification software needs to be developed.
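    The three measures named in this abstract can be computed from two sets of annotations as in the sketch below. It assumes each annotation is a `(start, end, tag)` tuple and that only exact matches count as agreement; the study's own matching criteria are not given in the abstract, so this representation is hypothetical.

```python
def precision_recall_f(gold, predicted):
    """Precision, recall and balanced F-measure for exact-match annotations."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    gold = {(0, 4, "Name"), (10, 18, "Date"), (25, 30, "Location")}
    system = {(0, 4, "Name"), (10, 18, "Date"), (40, 45, "Name"), (50, 52, "Age")}
    print(precision_recall_f(gold, system))
```

    Swapping the two arguments swaps precision and recall but leaves the F-measure unchanged, which is why it can serve as a symmetric agreement score between two annotators.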

  13. Evaluation of mass spectrometric data using principal component analysis for determination of the effects of organic lakes on protein binder identification.

    PubMed

    Hrdlickova Kuckova, Stepanka; Rambouskova, Gabriela; Hynek, Radovan; Cejnar, Pavel; Oltrogge, Doris; Fuchs, Robert

    2015-11-01

    Matrix-assisted laser desorption/ionisation-time of flight (MALDI-TOF) mass spectrometry is commonly used for the identification of proteinaceous binders and their mixtures in artworks. The determination of protein binders is based on a comparison between the m/z values of tryptic peptides in the unknown sample and a reference one (egg, casein, animal glues etc.), but this method has greater potential to study changes due to ageing and the influence of organic/inorganic components on protein identification. However, it is then necessary to carry out statistical evaluation of the obtained data. Until now, it has been complicated to routinely convert the mass spectrometric data into a statistical programme and to extract and match the appropriate peaks. Only several 'homemade' computer programmes without user-friendly interfaces are available for these purposes. In this paper, we present our completely new, publicly available, non-commercial software, ms-alone and multiMS-toolbox, for principal component analyses of MALDI-TOF MS data for R software, and their application to the study of the influence of heterogeneous matrices (organic lakes) on protein identification. Using this new software, we determined the main factors that influence the protein analyses of artificially aged model mixtures of organic lakes and fish glue, prepared according to historical recipes that were used for book illumination, using MALDI-TOF peptide mass mapping. Copyright © 2015 John Wiley & Sons, Ltd.

  14. Supporting Fourth-Grade Students' Word Identification Using Application Software

    ERIC Educational Resources Information Center

    Moser, Gary P.; Morrison, Timothy G.; Wilcox, Brad

    2017-01-01

    A quasi-experimental study examined effects of a 10-week word structure intervention with fourth-grade students. During daily 10-15-minute practice periods, students worked individually with mobile apps focused on specific aspects of word identification. Pre- and post-treatment assessments showed no differences in rate and accuracy of oral reading…

  15. Exploring State-of-the-Art Software for Forensic Authorship Identification

    ERIC Educational Resources Information Center

    Guillén-Nieto, Victoria; Vargas-Sierra, Chelo; Pardiño-Juan, Maria; Martinez-Barco, Patricio; Suárez-Cueto, Armando

    2008-01-01

    Back in the 1990s Malcolm Coulthard announced the beginnings of an emerging discipline, "forensic linguistics", resulting from the interface of language, crime and the law. Today the courts are more than ever calling on language experts to help in certain types of cases, such as authorship identification, plagiarism, legal interpreting…

  16. The Use of Computer-Assisted Identification of ARIMA Time-Series.

    ERIC Educational Resources Information Center

    Brown, Roger L.

    This study was conducted to determine the effects of using various levels of tutorial statistical software for the tentative identification of nonseasonal ARIMA models, a statistical technique proposed by Box and Jenkins for the interpretation of time-series data. The Box-Jenkins approach is an iterative process encompassing several stages of…

  17. 75 FR 60689 - Hazardous Waste Management System; Identification and Listing of Hazardous Waste; Proposed Rule

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-01

    ... exclude (or delist) a certain solid waste generated by its Beaumont, Texas, facility from the lists of hazardous wastes. EPA used the Delisting Risk Assessment Software (DRAS) Version 3.0 in the evaluation of... Waste Management System; Identification and Listing of Hazardous Waste; Proposed Rule AGENCY...

  18. A novel statistical approach for identification of the master regulator transcription factor.

    PubMed

    Sikdar, Sinjini; Datta, Susmita

    2017-02-02

    Transcription factors are known to play key roles in carcinogenesis and therefore, are gaining popularity as potential therapeutic targets in drug development. A 'master regulator' transcription factor often appears to control most of the regulatory activities of the other transcription factors and the associated genes. This 'master regulator' transcription factor is at the top of the hierarchy of the transcriptomic regulation. Therefore, it is important to identify and target the master regulator transcription factor for proper understanding of the associated disease process and identifying the best therapeutic option. We present a novel two-step computational approach for identification of master regulator transcription factor in a genome. At the first step of our method we test whether there exists any master regulator transcription factor in the system. We evaluate the concordance of two ranked lists of transcription factors using a statistical measure. In case the concordance measure is statistically significant, we conclude that there is a master regulator. At the second step, our method identifies the master regulator transcription factor, if there exists one. In the simulation scenario, our method performs reasonably well in validating the existence of a master regulator when the number of subjects in each treatment group is reasonably large. In application to two real datasets, our method ensures the existence of master regulators and identifies biologically meaningful master regulators. An R code for implementing our method in a sample test data can be found in http://www.somnathdatta.org/software . We have developed a screening method of identifying the 'master regulator' transcription factor just using only the gene expression data. Understanding the regulatory structure and finding the master regulator help narrowing the search space for identifying biomarkers for complex diseases such as cancer. In addition to identifying the master regulator our method provides an overview of the regulatory structure of the transcription factors which control the global gene expression profiles and consequently the cell functioning.

  19. Techniques for development of safety-related software for surgical robots.

    PubMed

    Varley, P

    1999-12-01

    Regulatory bodies require evidence that software controlling potentially hazardous devices is developed to good manufacturing practices. Effective techniques used in other industries assume long timescales and high staffing levels and can be unsuitable for use without adaptation in developing electronic healthcare devices. This paper discusses a set of techniques used in practice to develop software for a particular innovative medical product, an endoscopic camera manipulator. These techniques include identification of potential hazards and tracing their mitigating factors through the project lifecycle.

  20. [Analysis of software for identifying spectral line of laser-induced breakdown spectroscopy based on LabVIEW].

    PubMed

    Hu, Zhi-yu; Zhang, Lei; Ma, Wei-guang; Yan, Xiao-juan; Li, Zhi-xin; Zhang, Yong-zhi; Wang, Le; Dong, Lei; Yin, Wang-bao; Jia, Suo-tang

    2012-03-01

    Self-designed software for identifying LIBS spectral lines is introduced. Based on LabVIEW, the software can smooth spectra and pick peaks using the second-difference and threshold methods. The characteristic spectra of several elements are matched against the NIST database, enabling automatic spectral line identification and qualitative analysis of the basic composition of a sample. The software analyzes spectra conveniently and rapidly and should be a useful tool for LIBS.
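
    The smoothing and peak-picking steps mentioned in this abstract can be sketched as follows. This is an illustrative Python version, not the authors' LabVIEW implementation: the moving-average smoother, the function names, the thresholds and the toy spectrum are all assumptions.

```python
def moving_average(signal, window=3):
    """Smooth a spectrum with a centred moving average (edges use fewer points)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def pick_peaks(signal, intensity_threshold, curvature_threshold=0.0):
    """Return indices of local maxima that pass both tests.

    Second-difference test: s[i+1] - 2*s[i] + s[i-1] must be below
    -curvature_threshold (the curve bends downward at a peak).
    Threshold test: the intensity must reach intensity_threshold.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        second_diff = signal[i + 1] - 2 * signal[i] + signal[i - 1]
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if (is_local_max and second_diff < -curvature_threshold
                and signal[i] >= intensity_threshold):
            peaks.append(i)
    return peaks


# Toy intensities (arbitrary units); a real spectrum would pair them with wavelengths.
spectrum = [1, 1, 2, 9, 2, 1, 1, 3, 1, 1, 6, 12, 6, 1]
raw_peaks = pick_peaks(spectrum, intensity_threshold=5)
smooth_peaks = pick_peaks(moving_average(spectrum), intensity_threshold=3)
print(raw_peaks, smooth_peaks)
```

    The picked indices would then be converted to wavelengths and compared with reference line positions, for example from the NIST database mentioned in the abstract.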

  1. VISPA2: a scalable pipeline for high-throughput identification and annotation of vector integration sites.

    PubMed

    Spinozzi, Giulio; Calabria, Andrea; Brasca, Stefano; Beretta, Stefano; Merelli, Ivan; Milanesi, Luciano; Montini, Eugenio

    2017-11-25

    Bioinformatics tools designed to identify lentiviral or retroviral vector insertion sites in the genome of host cells are used to address the safety and long-term efficacy of hematopoietic stem cell gene therapy applications and to study the clonal dynamics of hematopoietic reconstitution. The increasing number of gene therapy clinical trials, combined with the increasing amount of Next Generation Sequencing data aimed at identifying integration sites, requires highly accurate and efficient computational software able to correctly process "big data" in a reasonable computational time. Here we present VISPA2 (Vector Integration Site Parallel Analysis, version 2), the latest optimized computational pipeline for integration site identification and analysis with the following features: (1) the sequence analysis for the integration site processing is fully compliant with paired-end reads and includes a sequence quality filter before and after the alignment on the target genome; (2) a heuristic algorithm to reduce false positive integration sites at nucleotide level to reduce the impact of Polymerase Chain Reaction or trimming/alignment artifacts; (3) a classification and annotation module for integration sites; (4) a user friendly web interface as researcher front-end to perform integration site analyses without computational skills; (5) the time speedup of all steps through parallelization (Hadoop free). We tested VISPA2 performance using simulated and real datasets of lentiviral vector integration sites, previously obtained from patients enrolled in a hematopoietic stem cell gene therapy clinical trial, and compared the results with other preexisting tools for integration site analysis. On the computational side, VISPA2 showed a > 6-fold speedup and improved precision and recall metrics (1 and 0.97 respectively) compared to previously developed computational pipelines. These results indicate that VISPA2 is a fast, reliable and user-friendly tool for integration site analysis, which allows gene therapy integration data to be handled in a cost and time effective fashion. Moreover, the web access of VISPA2 ( http://openserver.itb.cnr.it/vispa/ ) gives researchers easy access to a complex analytical tool. We released the source code of VISPA2 in a public repository ( https://bitbucket.org/andreacalabria/vispa2 ).

  2. Gene expression complex networks: synthesis, identification, and analysis.

    PubMed

    Lopes, Fabrício M; Cesar, Roberto M; Costa, Luciano Da F

    2011-10-01

    Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdös-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG). The experimental results indicate that the inference method was sensitive to average degree variation, decreasing its network recovery rate with the increase of the average degree. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not able to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.
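
    The validation idea in this abstract, generating an artificial network and comparing the inferred network against it, can be sketched in a few lines. The sketch assumes a directed Erdös-Rényi graph and edge-level precision and recall as the comparison; the abstract does not say how the authors define their recovery rate, and the function names are hypothetical.

```python
import random


def erdos_renyi_edges(n_genes, avg_degree, rng):
    """Directed ER graph: every ordered gene pair is linked with probability p.

    p is chosen so that the expected out-degree equals avg_degree.
    """
    p = avg_degree / (n_genes - 1)
    return {(src, dst)
            for src in range(n_genes)
            for dst in range(n_genes)
            if src != dst and rng.random() < p}


def recovery_scores(true_edges, inferred_edges):
    """Edge-level precision and recall of an inferred network."""
    hits = len(true_edges & inferred_edges)
    precision = hits / len(inferred_edges) if inferred_edges else 0.0
    recall = hits / len(true_edges) if true_edges else 0.0
    return precision, recall


# An artificial gene network that could drive a simulation of expression data.
original = erdos_renyi_edges(n_genes=20, avg_degree=3, rng=random.Random(0))

# Toy comparison with hand-written edge sets standing in for an inference result.
toy_true = {(0, 1), (1, 2), (2, 3), (3, 0)}
toy_inferred = {(0, 1), (1, 2), (0, 2)}
toy_precision, toy_recall = recovery_scores(toy_true, toy_inferred)
print(len(original), toy_precision, toy_recall)
```

    Repeating the comparison over networks generated with different average degrees would show how recovery changes with network density, which is the kind of sensitivity the abstract reports.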

  3. Selecting Advanced Software Technology in Two Small Manufacturing Enterprises

    DTIC Science & Technology

    2004-05-01

    improving workflow to further reduce delivery times, enhance customer service, and obtain a competitive advantage. The company wanted help... environment, stakeholders’ needs, ecommerce, shop floor visualization, and collaboration capability. These statements are not significantly different...for the purpose of describing a software environment. This identification does not imply any recommendation or endorsement by NIST, the SEI, CMU, or

  4. The Role of Dynamic Software in the Identification and Construction of Mathematical Relationships

    ERIC Educational Resources Information Center

    Santos-Trigo, Manuel

    2004-01-01

    What features of mathematical thinking do students exhibit when they use dynamic software in their problem solving approaches? To what extent does the systematic use of technology favour students' development of problem solving competences? What type of reasoning do students develop as a result of using a particular tool? This study documents…

  5. Is Chinese Software Engineering Professionalizing or Not?: Specialization of Knowledge, Subjective Identification and Professionalization

    ERIC Educational Resources Information Center

    Yang, Yan

    2012-01-01

    Purpose: This paper aims to discuss the challenge for the classical idea of professionalism in understanding the Chinese software engineering industry after giving a close insight into the development of this industry as well as individual engineers with a psycho-societal perspective. Design/methodology/approach: The study starts with the general…

  6. Species-Level Identification of Actinomyces Isolates Causing Invasive Infections: Multiyear Comparison of Vitek MS (Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry) to Partial Sequencing of the 16S rRNA Gene.

    PubMed

    Lynch, T; Gregson, D; Church, D L

    2016-03-01

    Actinomyces species are uncommon but important causes of invasive infections. The ability of our regional clinical microbiology laboratory to report species-level identification of Actinomyces relied on molecular identification by partial sequencing of the 16S ribosomal gene prior to the implementation of the Vitek MS (matrix-assisted laser desorption ionization-time of flight mass spectrometry [MALDI-TOF MS]) system. We compared the use of the Vitek MS to that of 16S rRNA gene sequencing for reliable species-level identification of invasive infections caused by Actinomyces spp. because limited data had been published for this important genera. A total of 115 cases of Actinomyces spp., either alone or as part of a polymicrobial infection, were diagnosed between 2011 and 2014. Actinomyces spp. were considered the principal pathogen in bloodstream infections (n = 17, 15%), in skin and soft tissue abscesses (n = 25, 22%), and in pulmonary (n = 26, 23%), bone (n = 27, 23%), intraabdominal (n = 16, 14%), and central nervous system (n = 4, 3%) infections. Compared to sequencing and identification from the SmartGene Integrated Database Network System (IDNS), Vitek MS identified 47/115 (41%) isolates to the correct species and 10 (9%) isolates to the correct genus. However, the Vitek MS was unable to provide identification for 43 (37%) isolates while 15 (13%) had discordant results. Phylogenetic analyses of the 16S rRNA sequences demonstrate high diversity in recovered Actinomyces spp. and provide additional information to compare/confirm discordant identifications between MALDI-TOF and 16S rRNA gene sequences. This study highlights the diversity of clinically relevant Actinomyces spp. and provides an important typing comparison. Based on our analysis, 16S rRNA gene sequencing should be used to rapidly identify Actinomyces spp. until MALDI-TOF databases are optimized. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  7. Species-Level Identification of Actinomyces Isolates Causing Invasive Infections: Multiyear Comparison of Vitek MS (Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry) to Partial Sequencing of the 16S rRNA Gene

    PubMed Central

    Gregson, D.; Church, D. L.

    2016-01-01

    Actinomyces species are uncommon but important causes of invasive infections. The ability of our regional clinical microbiology laboratory to report species-level identification of Actinomyces relied on molecular identification by partial sequencing of the 16S ribosomal gene prior to the implementation of the Vitek MS (matrix-assisted laser desorption ionization–time of flight mass spectrometry [MALDI-TOF MS]) system. We compared the use of the Vitek MS to that of 16S rRNA gene sequencing for reliable species-level identification of invasive infections caused by Actinomyces spp. because limited data had been published for this important genera. A total of 115 cases of Actinomyces spp., either alone or as part of a polymicrobial infection, were diagnosed between 2011 and 2014. Actinomyces spp. were considered the principal pathogen in bloodstream infections (n = 17, 15%), in skin and soft tissue abscesses (n = 25, 22%), and in pulmonary (n = 26, 23%), bone (n = 27, 23%), intraabdominal (n = 16, 14%), and central nervous system (n = 4, 3%) infections. Compared to sequencing and identification from the SmartGene Integrated Database Network System (IDNS), Vitek MS identified 47/115 (41%) isolates to the correct species and 10 (9%) isolates to the correct genus. However, the Vitek MS was unable to provide identification for 43 (37%) isolates while 15 (13%) had discordant results. Phylogenetic analyses of the 16S rRNA sequences demonstrate high diversity in recovered Actinomyces spp. and provide additional information to compare/confirm discordant identifications between MALDI-TOF and 16S rRNA gene sequences. This study highlights the diversity of clinically relevant Actinomyces spp. and provides an important typing comparison. Based on our analysis, 16S rRNA gene sequencing should be used to rapidly identify Actinomyces spp. until MALDI-TOF databases are optimized. PMID:26739153

  8. The effectiveness of three regions in mitochondrial genome for aphid DNA barcoding: a case in Lachininae.

    PubMed

    Chen, Rui; Jiang, Li-Yun; Qiao, Ge-Xia

    2012-01-01

    The mitochondrial gene COI has been widely used by taxonomists as a standard DNA barcode sequence for the identification of many animal species. However, the COI region is of limited use for identifying certain species and is not efficiently amplified by PCR in all animal taxa. To evaluate the utility of COI as a DNA barcode and to identify other barcode genes, we chose the aphid subfamily Lachninae (Hemiptera: Aphididae) as the focus of our study. We compared the results obtained using COI with two other mitochondrial genes, COII and Cytb. In addition, we propose a new method to improve the efficiency of species identification using DNA barcoding. Three mitochondrial genes (COI, COII and Cytb) were sequenced and were used in the identification of over 80 species of Lachninae. The COI and COII genes demonstrated a greater PCR amplification efficiency than Cytb. Species identification using COII sequences had a higher frequency of success (96.9% in "best match" and 90.8% in "best close match") and yielded lower intra- and higher interspecific genetic divergence values than the other two markers. The use of "tag barcodes" is a new approach that involves attaching a species-specific tag to the standard DNA barcode. With this method, the "barcoding overlap" can be nearly eliminated. As a result, we were able to increase the identification success rate from 83.9% to 95.2% by using COI and the "best close match" technique. A COII-based identification system should be more effective in identifying lachnine species than COI or Cytb. However, the Cytb gene is an effective marker for the study of aphid population genetics due to its high sequence diversity. Furthermore, the use of "tag barcodes" can improve the accuracy of DNA barcoding identification by reducing or removing the overlap between intra- and inter-specific genetic divergence values.
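
    The "best match" and "best close match" criteria mentioned in this abstract can be illustrated with a small sketch. It assumes the usual formulation, in which a query is assigned to the species of its nearest reference barcode and "best close match" additionally requires that distance to fall within a threshold. The uncorrected p-distance, the threshold values, the sequences and the species labels are all hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def best_close_match(query, reference, threshold):
    """Identify a query barcode against (species, sequence) reference pairs.

    Returns the species of the nearest reference barcode, "ambiguous" if the
    nearest barcodes belong to several species, or None if even the nearest
    barcode is farther away than the threshold.
    """
    distances = [(p_distance(query, seq), species) for species, seq in reference]
    best = min(dist for dist, _ in distances)
    if best > threshold:
        return None
    nearest_species = {species for dist, species in distances if dist == best}
    return nearest_species.pop() if len(nearest_species) == 1 else "ambiguous"


# Hypothetical reference library of aligned barcode fragments.
reference = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_B", "ACGTTCGTTC"),
    ("sp_C", "TTTTACGGGG"),
]
query = "ACGTACGTAA"
accepted = best_close_match(query, reference, threshold=0.15)
rejected = best_close_match(query, reference, threshold=0.05)
print(accepted, rejected)
```

    Setting the threshold to 1.0 removes the distance cut-off, which corresponds to the plain "best match" criterion.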

  9. Identification of three duplicated Spin genes in medaka (Oryzias latipes).

    PubMed

    Wang, Xiao-Lei; Mei, Jie; Sun, Min; Hong, Yun-Han; Gui, Jian-Fang

    2005-05-09

    Gene and genomic duplications are very important and frequent events in fish evolution, and the divergence of duplicated genes in sequence and function is a focus of research on gene evolution. Here, we report the identification and characterization of three duplicated Spindlin (Spin) genes from medaka (Oryzias latipes): OlSpinA, OlSpinB, and OlSpinC. Molecular cloning, genomic DNA BLAST analysis and phylogenetic analysis indicated that the three OlSpin genes arose through gene duplication. Furthermore, Western blot analysis revealed significant expression differences of the three OlSpins among different tissues and during embryogenesis in medaka, and suggested that sequence and functional divergence might have occurred among them during evolution.

  10. Identification of pathogenic gene variants in small families with intellectually disabled siblings by exome sequencing.

    PubMed

    Schuurs-Hoeijmakers, Janneke H M; Vulto-van Silfhout, Anneke T; Vissers, Lisenka E L M; van de Vondervoort, Ilse I G M; van Bon, Bregje W M; de Ligt, Joep; Gilissen, Christian; Hehir-Kwa, Jayne Y; Neveling, Kornelia; del Rosario, Marisol; Hira, Gausiya; Reitano, Santina; Vitello, Aurelio; Failla, Pinella; Greco, Donatella; Fichera, Marco; Galesi, Ornella; Kleefstra, Tjitske; Greally, Marie T; Ockeloen, Charlotte W; Willemsen, Marjolein H; Bongers, Ernie M H F; Janssen, Irene M; Pfundt, Rolph; Veltman, Joris A; Romano, Corrado; Willemsen, Michèl A; van Bokhoven, Hans; Brunner, Han G; de Vries, Bert B A; de Brouwer, Arjan P M

    2013-12-01

    Intellectual disability (ID) is a common neurodevelopmental disorder affecting 1-3% of the general population. Mutations in more than 10% of all human genes are considered to be involved in this disorder, although the majority of these genes are still unknown. We investigated 19 small non-consanguineous families with two to five affected siblings in order to identify pathogenic gene variants in known, novel and potential ID candidate genes. Non-consanguineous families have been largely ignored in gene identification studies as small family size precludes prior mapping of the genetic defect. Using exome sequencing, we identified pathogenic mutations in three genes, DDHD2, SLC6A8, and SLC9A6, of which the latter two have previously been implicated in X-linked ID phenotypes. In addition, we identified potentially pathogenic mutations in BCORL1 on the X-chromosome and in MCM3AP, PTPRT, SYNE1, and ZNF528 on autosomes. We show that potentially pathogenic gene variants can be identified in small, non-consanguineous families with as few as two affected siblings, thus emphasising their value in the identification of syndromic and non-syndromic ID genes.

  11. Integrated Quantitative Transcriptome Maps of Human Trisomy 21 Tissues and Cells

    PubMed Central

    Pelleri, Maria Chiara; Cattani, Chiara; Vitale, Lorenza; Antonaros, Francesca; Strippoli, Pierluigi; Locatelli, Chiara; Cocchi, Guido; Piovesan, Allison; Caracausi, Maria

    2018-01-01

    Down syndrome (DS) is due to the presence of an extra full or partial chromosome 21 (Hsa21). The identification of genes contributing to DS pathogenesis could be the key to any rational therapy of the associated intellectual disability. We aim at generating quantitative transcriptome maps in DS integrating all gene expression profile datasets available for any cell type or tissue, to obtain a complete model of the transcriptome in terms of both expression values for each gene and segmental trend of gene expression along each chromosome. We used the TRAM (Transcriptome Mapper) software for this meta-analysis, comparing transcript expression levels and profiles between DS and normal brain, lymphoblastoid cell lines, blood cells, fibroblasts, thymus and induced pluripotent stem cells, respectively. TRAM combined, normalized, and integrated datasets from different sources and across diverse experimental platforms. The main output was a linear expression value that may be used as a reference for each of up to 37,181 mapped transcripts analyzed, related to both known genes and expression sequence tag (EST) clusters. An independent example in vitro validation of fibroblast transcriptome map data was performed through “Real-Time” reverse transcription polymerase chain reaction showing an excellent correlation coefficient (r = 0.93, p < 0.0001) with data obtained in silico. The availability of linear expression values for each gene allowed the testing of the gene dosage hypothesis of the expected 3:2 DS/normal ratio for Hsa21 as well as other human genes in DS, in addition to listing genes differentially expressed with statistical significance. Although a fraction of Hsa21 genes escapes dosage effects, Hsa21 genes are selectively over-expressed in DS samples compared to genes from other chromosomes, reflecting a decisive role in the pathogenesis of the syndrome. Finally, the analysis of chromosomal segments reveals a high prevalence of Hsa21 over-expressed segments over the other genomic regions, suggesting, in particular, a specific region on Hsa21 that appears to be frequently over-expressed (21q22). Our complete datasets are released as a new framework to investigate transcription in DS for individual genes as well as chromosomal segments in different cell types and tissues. PMID:29740474

  12. Gene and protein nomenclature in public databases

    PubMed Central

    Fundel, Katrin; Zimmer, Ralf

    2006-01-01

    Background: Frequently, several alternative names are in use for biological objects such as genes and proteins. Applications like manual literature search, automated text-mining, named entity identification, gene/protein annotation, and linking of knowledge from different information sources require the knowledge of all used names referring to a given gene or protein. Various organism-specific or general public databases aim at organizing knowledge about genes and proteins. These databases can be used for deriving gene and protein name dictionaries. So far, little is known about the differences between databases in terms of size, ambiguities and overlap. Results: We compiled five gene and protein name dictionaries for each of the five model organisms (yeast, fly, mouse, rat, and human) from different organism-specific and general public databases. We analyzed the degree of ambiguity of gene and protein names within and between dictionaries, to a lexicon of common English words and domain-related non-gene terms, and we compared different data sources in terms of size of extracted dictionaries and overlap of synonyms between those. The study shows that the number of genes/proteins and synonyms covered in individual databases varies significantly for a given organism, and that the degree of ambiguity of synonyms varies significantly between different organisms. Furthermore, it shows that, despite considerable efforts of co-curation, the overlap of synonyms in different data sources is rather moderate and that the degree of ambiguity of gene names with common English words and domain-related non-gene terms varies depending on the considered organism. Conclusion: In conclusion, these results indicate that the combination of data contained in different databases allows the generation of gene and protein name dictionaries that contain significantly more used names than dictionaries obtained from individual data sources. Furthermore, curation of combined dictionaries considerably increases size and decreases ambiguity. The entries of the curated synonym dictionary are available for manual querying, editing, and PubMed- or Google-search via the ProThesaurus-wiki. For automated querying via custom software, we offer a web service and an exemplary client application. PMID:16899134

  13. Identification of pathogen avirulence genes in the fusiform rust pathosystem

    Treesearch

    John M. Davis; Katherine E. Smith; Amanda Pendleton; Jason A. Smith; C. Dana Nelson

    2012-01-01

    The Cronartium quercuum f.sp. fusiforme (Cqf) whole genome sequencing project will enable identification of avirulence genes in the most devastating pine fungal pathogen in the southeastern United States. Amerson and colleagues (unpublished) have mapped nine fusiform rust resistance genes in loblolly pine,...

  14. 20 years since the introduction of DNA barcoding: from theory to application.

    PubMed

    Fišer Pečnikar, Živa; Buzan, Elena V

    2014-02-01

    Traditionally, taxonomic identification has relied upon morphological characters. In the last two decades, molecular tools based on DNA sequences of short standardised gene fragments, termed DNA barcodes, have been developed for species discrimination. The most common DNA barcode used in animals is a fragment of the cytochrome c oxidase (COI) mitochondrial gene, while for plants, two chloroplast gene fragments from the RuBisCo large subunit (rbcL) and maturase K (matK) genes are widely used. Information gathered from DNA barcodes can be used beyond taxonomic studies and will have far-reaching implications across many fields of biology, including ecology (rapid biodiversity assessment and food chain analysis), conservation biology (monitoring of protected species), biosecurity (early identification of invasive pest species), medicine (identification of medically important pathogens and their vectors) and pharmacology (identification of active compounds). However, it is important that the limitations of DNA barcoding are understood and techniques continually adapted and improved as this young science matures.

  15. Identification and characterization of novel microRNAs for fruit development and quality in hot pepper (Capsicum annuum L.).

    PubMed

    Liu, Zhoubin; Zhang, Yuping; Ou, Lijun; Kang, Linyu; Liu, Yuhua; Lv, Junheng; Wei, Ge; Yang, Bozhi; Yang, Sha; Chen, Wenchao; Dai, Xiongze; Li, Xuefeng; Zhou, Shudong; Zhang, Zhuqing; Ma, Yanqing; Zou, Xuexiao

    2017-04-15

    MicroRNAs (miRNAs) are non-coding small RNAs which play an important regulatory role in various biological processes. Previous studies have reported that miRNAs are involved in fruit development in model plants. However, the miRNAs related to fruit development and quality in hot pepper (Capsicum annuum L.) remain unknown. In this study, small RNA populations from different fruit ripening stages and different varieties were compared using next-generation sequencing technology. In total, 59 known miRNAs and 310 novel miRNAs were identified from four libraries using miRDeep2 software. For these novel miRNAs, 656 targets were predicted and 402 of them were annotated. GO analysis and KEGG pathways suggested that some of the predicted miRNAs targeted genes involved in starch and sucrose metabolism and in amino sugar and nucleotide sugar metabolism. Quantitative RT-PCR validated the contrasting expression patterns between several miRNAs and their target genes. These results will provide an important foundation for future studies on the regulation of miRNAs involved in fruit development and quality. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. [Infection and molecular characteristics of Giardia in clinical diarrheal patients].

    PubMed

    Liu, Hua; Shen, Yu-juan; Zhang, Yu-mei; Wang, Bin; Liu, Hui; Cao, Jian-ping

    2015-04-01

    To initially understand the infection status and the molecular characteristics of Giardia in clinical diarrheal patients. A total of 95 stool samples were collected from the clinical diarrheal patients admitted in a hospital in Shanghai from May to July, 2014, and the Giardia cysts in the samples were examined by an optical microscope. Then the tpi gene of Giardia in the positive samples were amplified by using the nested-PCR method, and the PCR products were sequenced and analyzed by using BLAST, ClustalX 1.83, and the phylogenetic tree was drawn by using MEGA6.0 software. Only one patient was infected with Giardia and the positive detection rate was 1.05%. The Giardia cysts in the fecal specimen were seen clearly under the microscope. Through the identification by PCR, the amplified fragment was about 530 bp, and the sequencing analysis indicated it was Giardia and which was further identified as assemblage B by drawing phylogenetic tree based on tpi gene. Meanwhile, the sequence had 100% homology with the reported sequence from huian (KF271445). Giardia infection can occur in the clinical diarrheal patients. The study could provide more data for understanding the genetic characteristics of Giardia and the epidemiological study of giardiasis.

  17. Principal network analysis: identification of subnetworks representing major dynamics using gene expression data

    PubMed Central

    Kim, Yongsoo; Kim, Taek-Kyun; Kim, Yungu; Yoo, Jiho; You, Sungyong; Lee, Inyoul; Carlson, George; Hood, Leroy; Choi, Seungjin; Hwang, Daehee

    2011-01-01

    Motivation: Systems biology attempts to describe complex systems behaviors in terms of dynamic operations of biological networks. However, there is a lack of tools that can effectively decode complex network dynamics over multiple conditions. Results: We present principal network analysis (PNA) that can automatically capture major dynamic activation patterns over multiple conditions and then generate protein and metabolic subnetworks for the captured patterns. We first demonstrated the utility of this method by applying it to a synthetic dataset. The results showed that PNA correctly captured the subnetworks representing dynamics in the data. We further applied PNA to two time-course gene expression profiles collected from (i) MCF7 cells after treatments of HRG at multiple doses and (ii) brain samples of four strains of mice infected with two prion strains. The resulting subnetworks and their interactions revealed network dynamics associated with HRG dose-dependent regulation of cell proliferation and differentiation and early PrPSc accumulation during prion infection. Availability: The web-based software is available at: http://sbm.postech.ac.kr/pna. Contact: dhhwang@postech.ac.kr; seungjin@postech.ac.kr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21193522

  18. Differential gene expression analysis in glioblastoma cells and normal human brain cells based on GEO database.

    PubMed

    Wang, Anping; Zhang, Guibin

    2017-11-01

    Differentially expressed genes between glioblastoma (GBM) cells and normal human brain cells were investigated, and pathway analysis and protein interaction network analysis were performed on them. The GSE12657 and GSE42656 gene chips, which contain gene expression profiles of GBM, were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). The 'limma' package in R was used to analyze the differentially expressed genes in the two gene chips, and gene integration was performed using the 'RobustRankAggreg' package. Finally, pheatmap software was used for heatmap analysis, and Cytoscape, DAVID, STRING and KOBAS were used for protein-protein interaction, Gene Ontology (GO) and KEGG analyses. The results were as follows: i) 702 differentially expressed genes were identified in GSE12657, of which 548 were significantly upregulated and 154 were significantly downregulated (p<0.01, fold-change >1), and 1,854 differentially expressed genes were identified in GSE42656, of which 1,068 were significantly upregulated and 786 were significantly downregulated (p<0.01, fold-change >1). A total of 167 differentially expressed genes, including 100 upregulated genes and 67 downregulated genes, were identified after gene integration, and these genes showed significantly different expression levels in GBM compared with normal human brain cells (p<0.05). ii) Interactions between the protein products of 101 differentially expressed genes were identified using STRING, and an expression network was established. A key gene, CALM3, was identified using Cytoscape software. iii) GO enrichment analysis showed that the differentially expressed genes were mainly enriched in 'neurotransmitter:sodium symporter activity' and 'neurotransmitter transporter activity', which can affect the activity of neurotransmitter transportation. KEGG pathway analysis showed that the differentially expressed genes were mainly enriched in 'protein processing in endoplasmic reticulum', which can affect protein processing in endoplasmic reticulum. The results showed that: i) 167 differentially expressed genes were identified from two gene chips after integration; and ii) a protein interaction network was established, and GO and KEGG pathway analyses were successfully performed to identify and annotate the key gene, which provides new insights for studies on GBM at the gene level.
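
    The thresholding step reported in the abstract above (p<0.01, fold-change >1) can be illustrated with a minimal Python sketch. This is not the authors' limma workflow: the record type, the gene names, and the reading of the cutoff as an absolute log2 fold-change are all assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple, Tuple


class GeneStat(NamedTuple):
    gene: str        # hypothetical gene label
    log2_fc: float   # log2 fold-change, tumour vs. normal (assumed scale)
    p_value: float   # p-value from some differential expression test


def split_degs(stats: Iterable[GeneStat],
               p_cutoff: float = 0.01,
               lfc_cutoff: float = 1.0) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) gene names passing both cutoffs."""
    up, down = [], []
    for s in stats:
        # Skip genes that fail either the significance or the effect-size cutoff.
        if s.p_value >= p_cutoff or abs(s.log2_fc) <= lfc_cutoff:
            continue
        (up if s.log2_fc > 0 else down).append(s.gene)
    return up, down


# Toy input, invented for illustration only.
toy = [
    GeneStat("geneA", 2.3, 0.001),    # significant, up
    GeneStat("geneB", -1.8, 0.004),   # significant, down
    GeneStat("geneC", 0.4, 0.0001),   # significant but small effect
    GeneStat("geneD", 3.0, 0.2),      # large effect but not significant
]
up, down = split_degs(toy)
print(up, down)
```

    Counting the two returned lists gives the kind of up/down tallies quoted in the abstract; integrating ranked lists across datasets, as RobustRankAggreg does, is a separate step not shown here.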

  19. DNA Commission of the International Society for Forensic Genetics: Recommendations on the validation of software programs performing biostatistical calculations for forensic genetics applications.

    PubMed

    Coble, M D; Buckleton, J; Butler, J M; Egeland, T; Fimmers, R; Gill, P; Gusmão, L; Guttman, B; Krawczak, M; Morling, N; Parson, W; Pinto, N; Schneider, P M; Sherry, S T; Willuweit, S; Prinz, M

    2016-11-01

    The use of biostatistical software programs to assist in data interpretation and calculate likelihood ratios is essential to forensic geneticists and part of the daily case work flow for both kinship and DNA identification laboratories. Previous recommendations issued by the DNA Commission of the International Society for Forensic Genetics (ISFG) covered the application of bio-statistical evaluations for STR typing results in identification and kinship cases, and this is now being expanded to provide best practices regarding validation and verification of the software required for these calculations. With larger multiplexes, more complex mixtures, and increasing requests for extended family testing, laboratories are relying more than ever on specific software solutions, and sufficient validation, training and extensive documentation are of utmost importance. Here, we present recommendations for the minimum requirements to validate bio-statistical software to be used in forensic genetics. We distinguish between developmental validation and the responsibilities of the software developer or provider, and the internal validation studies to be performed by the end user. Recommendations for the software provider address, for example, the documentation of the underlying models used by the software, validation data expectations, version control, implementation and training support, as well as continuity and user notifications. For the internal validations the recommendations include: creating a validation plan, requirements for the range of samples to be tested, Standard Operating Procedure development, and internal laboratory training and education. To ensure that all laboratories have access to a wide range of samples for validation and training purposes the ISFG DNA commission encourages collaborative studies and public repositories of STR typing results. Published by Elsevier Ireland Ltd.

  20. Identification of gene regulation models from single-cell data

    NASA Astrophysics Data System (ADS)

    Weber, Lisa; Raymond, William; Munsky, Brian

    2018-09-01

    In quantitative analyses of biological processes, one may use many different scales of models (e.g. spatial or non-spatial, deterministic or stochastic, time-varying or at steady-state) or many different approaches to match models to experimental data (e.g. model fitting or parameter uncertainty/sloppiness quantification with different experiment designs). These different analyses can lead to surprisingly different results, even when applied to the same data and the same model. We use a simplified gene regulation model to illustrate many of these concerns, especially for ODE analyses of deterministic processes, chemical master equation and finite state projection analyses of heterogeneous processes, and stochastic simulations. For each analysis, we employ MATLAB and PYTHON software to consider a time-dependent input signal (e.g. a kinase nuclear translocation) and several model hypotheses, along with simulated single-cell data. We illustrate different approaches (e.g. deterministic and stochastic) to identify the mechanisms and parameters of the same model from the same simulated data. For each approach, we explore how uncertainty in parameter space varies with respect to the chosen analysis approach or specific experiment design. We conclude with a discussion of how our simulated results relate to the integration of experimental and computational investigations to explore signal-activated gene expression models in yeast (Neuert et al 2013 Science 339 584–7) and human cells (Senecal et al 2014 Cell Rep. 8 75–83).
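
    As a rough illustration of the stochastic simulations mentioned in the abstract above, the following Python sketch runs a Gillespie-style simulation of a very small gene expression model. The model itself (constant mRNA production at rate k_prod, first-order degradation at rate k_deg), the parameter values, and the function name are assumptions chosen for brevity; the paper's model includes a time-dependent input signal that is not reproduced here.

```python
import random
from typing import List, Tuple


def gillespie_birth_death(k_prod: float, k_deg: float, t_end: float,
                          x0: int = 0, seed: int = 0) -> Tuple[List[float], List[int]]:
    """Simulate one trajectory of a birth-death process for mRNA copy number."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod          # propensity of producing one molecule
        a_deg = k_deg * x        # propensity of degrading one molecule
        a_total = a_prod + a_deg
        if a_total == 0:
            break                # nothing can happen any more
        t += rng.expovariate(a_total)   # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, t_end=50.0)
print(len(times), counts[-1])
```

    For this assumed model the corresponding deterministic ODE, dx/dt = k_prod - k_deg * x, settles at k_prod / k_deg, whereas single stochastic trajectories fluctuate around that value. This is the kind of contrast between deterministic and stochastic analyses the abstract refers to.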

  1. Identification of a single nucleotide polymorphism indicative of high risk in acute myocardial infarction

    PubMed Central

    Shalia, Kavita; Saranath, Dhananjaya; Rayar, Jaipreet; Shah, Vinod K.; Mashru, Manoj R.; Soneji, Surendra L.

    2017-01-01

    Background & objectives: Acute myocardial infarction (AMI) is a major health concern in India. The aim of the study was to identify single nucleotide polymorphisms (SNPs) associated with AMI in patients using a dedicated chip and to validate the identified SNPs on custom-designed chips using high-throughput microarray analysis. Methods: In the pilot phase, 48 AMI patients and 48 healthy controls were screened for SNPs using the human CVD55K BeadChip with 48,472 SNP probes on the Illumina high-throughput microarray platform. The identified SNPs were validated by genotyping an additional 160 patients and 179 controls using a custom-made Illumina VeraCode GoldenGate Genotyping Assay. Analysis was carried out using PLINK software. Results: From the pilot phase, 98 SNPs present on 94 genes were identified with increased risk of AMI (odds ratio of 1.84-8.85, P=0.04861-0.003337). Five of these SNPs demonstrated association with AMI in the validation phase (P<0.05). Among these, one SNP, rs9978223 on the interferon gamma receptor 2 [IFNGR2, interferon (IFN)-gamma transducer 1] gene, showed a significant association (P=0.00021) with AMI, below the Bonferroni corrected P value (P=0.00061). IFNGR2 is the second subunit of the receptor for IFN-gamma, an important cytokine in inflammatory reactions. Interpretation & conclusions: The study identified an SNP, rs9978223 on the IFNGR2 gene, associated with increased risk in AMI patients from India. PMID:29434065
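
    The two statistics quoted in the abstract above, the odds ratio and the Bonferroni-corrected significance threshold, can be sketched in a few lines of Python. The allele counts and the number of tests used below are invented for illustration; the abstract does not report them, and the actual analysis was carried out in PLINK.

```python
def odds_ratio(case_exposed: int, case_unexposed: int,
               control_exposed: int, control_unexposed: int) -> float:
    """Odds ratio from a 2x2 table of carrier/non-carrier counts."""
    if case_unexposed == 0 or control_exposed == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (case_exposed * control_unexposed) / (case_unexposed * control_exposed)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical counts: 30 of 48 cases and 15 of 48 controls carry the risk allele.
example_or = odds_ratio(30, 18, 15, 33)
# Hypothetical number of tests, chosen only to show the arithmetic.
example_threshold = bonferroni_threshold(0.05, 100)
print(round(example_or, 2), example_threshold)
```

    An association is then called significant after correction when its P value falls below the returned threshold, which is the comparison the abstract reports for rs9978223 (P=0.00021 against a corrected threshold of P=0.00061).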

  2. Cell type-selective disease-association of genes under high regulatory load.

    PubMed

    Galhardo, Mafalda; Berninger, Philipp; Nguyen, Thanh-Phuong; Sauter, Thomas; Sinkkonen, Lasse

    2015-10-15

    We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3' UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. [Exploration of common biological pathways for attention deficit hyperactivity disorder and low birth weight].

    PubMed

    Xiang, Bo; Yu, Minglan; Liang, Xuemei; Lei, Wei; Huang, Chaohua; Chen, Jing; He, Wenying; Zhang, Tao; Li, Tao; Liu, Kezhi

    2017-12-10

    To explore common biological pathways for attention deficit hyperactivity disorder (ADHD) and low birth weight (LBW). The i-Gsea4GwasV2 software was used to analyze the results of a genome-wide association study (GWAS) for LBW (pathways were derived from Reactome), and nominally significant (P< 0.05, FDR< 0.25) pathways were tested for replication in ADHD. Significant pathways were analyzed with DAPPLE and Reactome FI software to identify genes involved in such pathways, with each cluster enriched with the gene ontology (GO). The Centiscape 2.0 software was used to calculate the degree of genetic networks and the betweenness value to explore the core node (gene). Weighted gene co-expression network analysis (WGCNA) was then used to explore the co-expression of genes in these pathways. With gene expression data derived from BrainSpan, GO enrichment was carried out for each gene module. Eleven significant biological pathways were identified in association with LBW, among which two (Selenoamino acid metabolism and Diseases associated with glycosaminoglycan metabolism) were replicated during subsequent ADHD analysis. Network analysis of 130 genes in these pathways revealed that some of the sub-networks are related to the morphology of the cerebellum, development of the hippocampus, and plasticity of synaptic structure. Upon co-expression network analysis, 120 genes passed quality control and were found to be expressed in 3 gene modules. These modules are mainly related to the regulation of synaptic structure and activity. ADHD and LBW share some biological regulation processes. Anomalies of such processes may predispose to ADHD.

  4. COPS: Detecting Co-Occurrence and Spatial Arrangement of Transcription Factor Binding Motifs in Genome-Wide Datasets

    PubMed Central

    Lohmann, Ingrid

    2012-01-01

    In multi-cellular organisms, spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been extensively used to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Even though genome-wide binding profiles are increasingly becoming available for different TFs, single TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Thus, potent computational tools detecting statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) on genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which allows efficient performance of the software with respect to both data structure and implementation speed, in particular when mining large datasets. Since transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distance between co-occurring TF motifs. The performance of our software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool mining statistically and biologically significant TFBS co-occurrences and therefore allows the identification of TFs that combinatorially regulate gene expression. PMID:23272209
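
    The idea of scoring motif co-occurrence with association-rule measures, as described in the abstract above, can be illustrated with a small Python sketch. This is a plain pair-counting approach, not the Frequent-Pattern tree algorithm or the Markov chain model that COPS uses, and the motif names and regions are invented for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def pair_support(regions: Iterable[Set[str]],
                 min_support: float = 0.5) -> Dict[Tuple[str, str], float]:
    """Fraction of regions containing each motif pair, kept if >= min_support."""
    regions = list(regions)
    n = len(regions)
    pair_counts: Counter = Counter()
    for motifs in regions:
        # Count each unordered motif pair once per region.
        for pair in combinations(sorted(motifs), 2):
            pair_counts[pair] += 1
    return {pair: count / n
            for pair, count in pair_counts.items()
            if count / n >= min_support}


# Hypothetical genomic regions, each annotated with the motifs found in it.
toy_regions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
frequent_pairs = pair_support(toy_regions)
print(frequent_pairs)
```

    Support is only the first ingredient of an association rule. A fuller treatment would also compare each pair's frequency with what is expected from background sequence, and would record the distances between co-occurring sites.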

  5. Sorting through the chaff, nDNA gene trees for phylogenetic inference and hybrid identification of annual sunflowers (Helianthus sect. Helianthus).

    PubMed

    Moody, Michael L; Rieseberg, Loren H

    2012-07-01

    The annual sunflowers (Helianthus sect. Helianthus) present a formidable challenge for phylogenetic inference because of ancient hybrid speciation, recent introgression, and suspected issues with deep coalescence. Here we analyze sequence data from 11 nuclear DNA (nDNA) genes for multiple genotypes of species within the section to (1) reconstruct the phylogeny of this group, (2) explore the utility of nDNA gene trees for detecting hybrid speciation and introgression; and (3) test an empirical method of hybrid identification based on the phylogenetic congruence of nDNA gene trees from tightly linked genes. We uncovered considerable topological heterogeneity among gene trees with or without three previously identified hybrid species included in the analyses, as well as a general lack of reciprocal monophyly of species. Nonetheless, partitioned Bayesian analyses provided strong support for the reciprocal monophyly of all species except H. annuus (0.89 PP), the most widespread and abundant annual sunflower. Previous hypotheses of relationships among taxa were generally strongly supported (1.0 PP), except among taxa typically associated with H. annuus, apparently due to the paraphyly of the latter in all gene trees. While the individual nDNA gene trees provided a useful means for detecting recent hybridization, identification of ancient hybridization was problematic for all ancient hybrid species, even when linkage was considered. We discuss biological factors that affect the efficacy of phylogenetic methods for hybrid identification.

  6. Identification and Functional Analysis of the Nocardithiocin Gene Cluster in Nocardia pseudobrasiliensis

    PubMed Central

    Sakai, Kanae; Komaki, Hisayuki; Gonoi, Tohru

    2015-01-01

    Nocardithiocin is a thiopeptide compound isolated from the opportunistic pathogen Nocardia pseudobrasiliensis. It shows a strong activity against acid-fast bacteria and is also active against rifampicin-resistant Mycobacterium tuberculosis. Here, we report the identification of the nocardithiocin gene cluster in N. pseudobrasiliensis IFM 0761 based on conserved thiopeptide biosynthesis gene sequence and the whole genome sequence. The predicted gene cluster was confirmed by gene disruption and complementation. As expected, strains containing the disrupted gene did not produce nocardithiocin while gene complementation restored nocardithiocin production in these strains. The predicted cluster was further analyzed using RNA-seq which showed that the nocardithiocin gene cluster contains 12 genes within a 15.2-kb region. This finding will promote the improvement of nocardithiocin productivity and its derivatives production. PMID:26588225

  7. P185-M Protein Identification and Validation of Results in Workflows that Integrate over Various Instruments, Datasets, Search Engines

    PubMed Central

    Hufnagel, P.; Glandorf, J.; Körting, G.; Jabs, W.; Schweiger-Hufnagel, U.; Hahner, S.; Lubeck, M.; Suckau, D.

    2007-01-01

    Analysis of complex proteomes often results in long protein lists, but falls short in measuring the validity of identification and quantification results on a greater number of proteins. Biological and technical replicates are mandatory, as is the combination of the MS data from various workflows (gels, 1D-LC, 2D-LC), instruments (TOF/TOF, trap, qTOF or FTMS), and search engines. We describe a database-driven study that combines two workflows, two mass spectrometers, and four search engines with protein identification following a decoy database strategy. The sample was a tryptically digested lysate (10,000 cells) of a human colorectal cancer cell line. Data from two LC-MALDI-TOF/TOF runs and a 2D-LC-ESI-trap run using capillary and nano-LC columns were submitted to the proteomics software platform ProteinScape. The combined MALDI data and the ESI data were searched using Mascot (Matrix Science), Phenyx (GeneBio), ProteinSolver (Bruker and Protagen), and Sequest (Thermo) against a decoy database generated from IPI-human in order to obtain one protein list across all workflows and search engines at a defined maximum false-positive rate of 5%. ProteinScape combined the data to one LC-MALDI and one LC-ESI dataset. The initial separate searches from the two combined datasets generated eight independent peptide lists. These were compiled into an integrated protein list using the ProteinExtractor algorithm. An initial evaluation of the generated data led to the identification of approximately 1200 proteins. Result integration on a peptide level allowed discrimination of protein isoforms that would not have been possible with a mere combination of protein lists.
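
    The decoy database strategy mentioned in the abstract above is commonly used to pick a score cutoff at a chosen error rate. The Python sketch below shows one simple way to do this, under the assumption that the error rate is estimated as the ratio of decoy hits to target hits above the cutoff. The scores and the function name are invented; this is not the ProteinScape or ProteinExtractor implementation.

```python
from typing import Iterable, Optional, Tuple


def score_cutoff_at_fdr(hits: Iterable[Tuple[float, bool]],
                        max_fdr: float = 0.05) -> Optional[float]:
    """Lowest score cutoff whose estimated decoy/target ratio stays <= max_fdr.

    hits: (score, is_decoy) pairs from a search against targets plus decoys.
    Returns None if no cutoff satisfies the limit.
    """
    targets = decoys = 0
    cutoff = None
    # Walk down the ranked list, accepting hits one at a time.
    for score, is_decoy in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy ranked list: 30 target hits, then a mix of decoy and target hits.
toy_hits = [(100.0 - i, False) for i in range(30)]
toy_hits += [(70.0, True), (69.0, False), (68.0, True), (67.0, False)]
cutoff = score_cutoff_at_fdr(toy_hits)
print(cutoff)
```

    Applying the same rule to each search engine's output gives lists that share a comparable error rate, which is the precondition for merging them as the abstract describes.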

  8. rpoB Gene Sequencing for Identification of Corynebacterium Species

    PubMed Central

    Khamis, Atieh; Raoult, Didier; La Scola, Bernard

    2004-01-01

    The genus Corynebacterium is a heterogeneous group of species comprising human and animal pathogens and environmental bacteria. It is defined on the basis of several phenotypic characters and the results of DNA-DNA relatedness and, more recently, 16S rRNA gene sequencing. However, the 16S rRNA gene is not polymorphic enough to ensure reliable phylogenetic studies and needs to be completely sequenced for accurate identification. The almost complete rpoB sequences of 56 Corynebacterium species were determined by both PCR and genome walking methods. In all cases the percent similarities between different species were lower than those observed by 16S rRNA gene sequencing, even for those species with degrees of high similarity. Several clusters supported by high bootstrap values were identified. In order to propose a method for strain identification which does not require sequencing of the complete rpoB sequence (approximately 3,500 bp), we identified an area with a high degree of polymorphism, bordered by conserved sequences that can be used as universal primers for PCR amplification and sequencing. The sequence of this fragment (434 to 452 bp) allows accurate species identification and may be used in the future for routine sequence-based identification of Corynebacterium species. PMID:15364970
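
    The percent similarity figures discussed in the abstract above are typically computed from aligned sequences. The following Python sketch shows one simple definition, identical positions divided by compared positions, with gap columns skipped; the exact definition used in the study is not stated in the abstract, and the sequences below are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments, not real rpoB sequence.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

    A more polymorphic marker gives lower values between different species under this measure, which is the property the abstract reports for rpoB relative to the 16S rRNA gene.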

  9. Identification of a key recombinant narrows the CADASIL gene region to 8 cM and argues against allelism of CADASIL and familial hemiplegic migraine

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dichgans, M.; Mayer, M.; Straube, A.

    1996-02-15

    This article reports on new information regarding the genetic mapping of the human CADASIL gene region. Previously, the gene had been mapped to human chromosome 19q12. Using the identification of a chromosomal crossover, the region has been refined to an 8-cM interval. 11 refs., 2 figs., 1 tab.

  10. Large Scale Single Nucleotide Polymorphism Study of PD Susceptibility

    DTIC Science & Technology

    2005-03-01

    identification of eight genetic loci in the familial PD, the results of intensive investigations of polymorphisms in dozens of genes related to sporadic, late...1) investigate the association between classical, sporadic PD and 2386 SNPs in 23 genes implicated in the pathogenesis of PD; (2) construct...addition, experiences derived from this study may be applied in other complex disorders for the identification of susceptibility genes, as well as in genome

  11. SEURAT: Visual analytics for the integrated analysis of microarray data

    PubMed Central

    2010-01-01

    Background In translational cancer research, gene expression data is collected together with clinical data and genomic data arising from other chip based high throughput technologies. Software tools for the joint analysis of such high dimensional data sets together with clinical data are required. Results We have developed an open source software tool which provides interactive visualization capability for the integrated analysis of high-dimensional gene expression data together with associated clinical data, array CGH data and SNP array data. The different data types are organized by a comprehensive data manager. Interactive tools are provided for all graphics: heatmaps, dendrograms, barcharts, histograms, eventcharts and a chromosome browser, which displays genetic variations along the genome. All graphics are dynamic and fully linked so that any object selected in a graphic will be highlighted in all other graphics. For exploratory data analysis the software provides unsupervised data analytics like clustering, seriation algorithms and biclustering algorithms. Conclusions The SEURAT software meets the growing needs of researchers to perform joint analysis of gene expression, genomical and clinical data. PMID:20525257

  12. A genome-wide analysis of the flax (Linum usitatissimum L.) dirigent protein family: from gene identification and evolution to differential regulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija

    Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species. Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.

  13. A genome-wide analysis of the flax (Linum usitatissimum L.) dirigent protein family: from gene identification and evolution to differential regulation.

    PubMed

    Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija; Auguin, Daniel; Lainé, Éric; Davin, Laurence B; Cort, John R; Lewis, Norman G; Hano, Christophe

    2018-05-01

    Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species. Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.

  14. [Genome-wide identification and expression analysis of the WRKY gene family in peach].

    PubMed

    Gu, Yan-bing; Ji, Zhi-rui; Chi, Fu-mei; Qiao, Zhuang; Xu, Cheng-nan; Zhang, Jun-xiang; Zhou, Zong-shan; Dong, Qing-long

    2016-03-01

    The WRKY transcription factors are one of the largest families of transcriptional regulators and play diverse regulatory roles in biotic and abiotic stresses, plant growth and development processes. In this study, the WRKY DNA-binding domain (Pfam Database number: PF03106) downloaded from the Pfam protein families database was used to identify WRKY genes in the peach (Prunus persica 'Lovell') genome using HMMER 3.0. The obtained amino acid sequences were analyzed with the DNAMAN 5.0, WebLogo 3, MEGA 5.1, MapInspect and MEME bioinformatics software. In total, 61 WRKY genes were found in the peach genome. Our phylogenetic analysis revealed that peach WRKY genes were classified into three groups: Ⅰ, Ⅱ and Ⅲ. The WRKY N-terminal and C-terminal domains of Group Ⅰ (group I-N and group I-C) were monophyletic. Group Ⅱ was sub-divided into five distinct clades (group Ⅱ-a, Ⅱ-b, Ⅱ-c, Ⅱ-d and Ⅱ-e). Our domain analysis indicated that the WRKY regions contained a highly conserved heptapeptide stretch, WRKYGQK, at the N-terminus followed by a zinc-finger motif. The chromosome mapping analysis showed that peach WRKY genes were distributed with different densities over 8 chromosomes. The intron-exon structure analysis revealed that the structures of the WRKY genes were highly conserved in peach. The conserved motif analysis showed that conserved motifs 1, 2 and 3, which specify the WRKY domain, were observed in all peach WRKY proteins; motif 5, an unknown domain, was observed in group Ⅱ-d; and two WRKY domains were assigned to Group Ⅰ. SqRT-PCR and qRT-PCR results indicated that 16 PpWRKY genes were expressed in roots, stems, leaves, flowers and fruits at various expression levels. Our analysis thus identified the PpWRKY gene family, and future functional studies are needed to reveal its specific roles.
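
    The conserved heptapeptide WRKYGQK described in the abstract above is easy to locate in protein sequences with a plain pattern search. The Python sketch below counts occurrences per protein, which loosely mirrors the observation that Group Ⅰ members carry two WRKY domains. It is only a string search, not the HMMER profile search the authors used, and the sequences are invented toy strings.

```python
import re
from typing import Dict

# The conserved WRKY heptapeptide named in the abstract.
WRKY_HEPTAPEPTIDE = re.compile("WRKYGQK")


def count_wrky_motifs(protein: str) -> int:
    """Number of non-overlapping WRKYGQK occurrences in a protein sequence."""
    return len(WRKY_HEPTAPEPTIDE.findall(protein.upper()))


def motif_counts(proteins: Dict[str, str]) -> Dict[str, int]:
    """Map each protein name to its WRKYGQK count."""
    return {name: count_wrky_motifs(seq) for name, seq in proteins.items()}


# Hypothetical sequences, invented for illustration.
toy_proteins = {
    "two_domain": "MSTDGYNWRKYGQKQVKGSEAAADDGYRWRKYGQKVVKGNP",
    "one_domain": "MAEDGYRWRKYGQKAVKNSP",
    "no_domain": "MSTPLLKKAAGG",
}
counts_by_protein = motif_counts(toy_proteins)
print(counts_by_protein)
```

    An exact-match search like this misses variant heptapeptides and says nothing about the adjoining zinc-finger motif, which is why profile-based tools are preferred for genome-wide identification.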

  15. An Expressed Sequence Tag collection from the male antennae of the Noctuid moth Spodoptera littoralis: a resource for olfactory and pheromone detection research

    PubMed Central

    2011-01-01

    Background Nocturnal insects such as moths are ideal models to study the molecular bases of olfaction, which they use, for example, for the detection of mating partners and host plants. Knowing how an odour generates a neuronal signal in insect antennae is crucial for understanding the physiological bases of olfaction, and also could lead to the identification of original targets for the development of olfactory-based control strategies against herbivorous moth pests. Here, we describe an Expressed Sequence Tag (EST) project to characterize the antennal transcriptome of the noctuid pest model, Spodoptera littoralis, and to identify candidate genes involved in odour/pheromone detection. Results By targeting cDNAs from male antennae, we biased gene discovery towards genes potentially involved in male olfaction, including pheromone reception. A total of 20760 ESTs were obtained from a normalized library and were assembled in 9033 unigenes. 6530 were annotated based on BLAST analyses and gene prediction software identified 6738 ORFs. The unigenes were compared to the Bombyx mori proteome and to ESTs derived from Lepidoptera transcriptome projects. We identified a large number of candidate genes involved in odour and pheromone detection and turnover, including 31 candidate chemosensory receptor genes, but also genes potentially involved in olfactory modulation. Conclusions Our project has generated a large collection of antennal transcripts from a lepidopteran. The normalization process, allowing enrichment in low abundant genes, proved to be particularly relevant to identify chemosensory receptors in a species for which no genomic data are available. Our results also suggest that olfactory modulation can take place at the level of the antennae themselves. These EST resources will be invaluable for exploring the mechanisms of olfaction and pheromone detection in S. littoralis, and for ultimately identifying original targets to fight against moth herbivorous pests. PMID:21276261

  16. Analysis of the resistance mechanisms in sugarcane during Sporisorium scitamineum infection using RNA-seq and microscopy.

    PubMed

    McNeil, Meredith D; Bhuiyan, Shamsul A; Berkman, Paul J; Croft, Barry J; Aitken, Karen S

    2018-01-01

    Smut caused by the biotrophic fungus Sporisorium scitamineum is a major disease of cultivated sugarcane that can cause considerable yield losses. It has been suggested in the literature that there are at least two types of resistance mechanisms in sugarcane plants: an external resistance, due to chemical or physical barriers in the sugarcane bud, and an internal resistance governed by the interaction of plant and fungus within the plant tissue. Detailed molecular studies interrogating these two different resistance mechanisms in sugarcane are scarce. Here, we use light microscopy and global expression profiling with RNA-seq to investigate these mechanisms in sugarcane cultivar CP74-2005, a cultivar that possibly possesses both internal and external defence mechanisms. A total of 861 differentially expressed genes (DEGs) were identified in a comparison between infected and non-infected buds at 48 hours post-inoculation (hpi), with 457 (53%) genes successfully annotated using BLAST2GO software. These include genes involved in the phenylpropanoid pathway, cell wall biosynthesis and plant hormone signal transduction, as well as disease resistance genes. Finally, the expression of 13 DEGs with putative roles in S. scitamineum resistance was confirmed by quantitative real-time reverse transcription PCR (qRT-PCR) analysis, and the results were consistent with the RNA-seq data. These results highlight that the early sugarcane response to S. scitamineum infection is complex and that many of the disease response genes are attenuated in sugarcane cultivar CP74-2005, while others, such as genes involved in the phenylpropanoid pathway, are induced. This may point to the role of the different disease resistance mechanisms that operate in cultivars such as CP74-2005, whereby the early response is dominated by external mechanisms and the internal mechanisms are switched on as the infection progresses. Identification of genes underlying resistance in sugarcane will increase our knowledge of the sugarcane-S. scitamineum interaction and facilitate the introgression of new resistance genes into commercial sugarcane cultivars.

  17. Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

    PubMed Central

    Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

    2016-01-01

    Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene-specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes combined with Unique Sequence-Predictor (USP), a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif in the 5′→3′ direction, the 3′→5′ direction or both, adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome using the Frequency Counter program. This step is iterated until the frequency of the extended motif in the genome becomes unity. Thus, for each given motif, we get three possible unique sequences. The Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-protein structural information further makes Onco-Regulon a highly informative repository for gene-specific drug development. We believe that Onco-Regulon will help researchers to design drugs that bind to an exclusive site in the genome with, theoretically, no off-target effects. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
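
    The motif-extension loop described in this abstract can be sketched as follows. This is a minimal illustration, not the USP implementation: the function names, the forward-strand-only counting, and the choice to add one base on each side per step in "both" mode are assumptions made for the example.

```python
def count_occurrences(genome: str, motif: str) -> int:
    """Count (possibly overlapping) occurrences of motif on the forward strand."""
    count, pos = 0, genome.find(motif)
    while pos != -1:
        count += 1
        pos = genome.find(motif, pos + 1)
    return count


def extend_until_unique(genome: str, start: int, end: int, direction: str = "both"):
    """Grow genome[start:end] one nucleotide per step until it occurs exactly once.

    direction is "left", "right" or "both" (hypothetical labels for the
    extension directions named in the abstract). Returns the unique
    sequence, or None if the sequence cannot be extended any further.
    """
    while count_occurrences(genome, genome[start:end]) > 1:
        grew = False
        if direction in ("left", "both") and start > 0:
            start -= 1
            grew = True
        if direction in ("right", "both") and end < len(genome):
            end += 1
            grew = True
        if not grew:
            return None  # hit the sequence boundary before becoming unique
    return genome[start:end]


# Toy "genome" in which the motif ACG occurs twice (positions 2 and 8).
toy = "TTACGAGGACGTCC"
unique_variants = {d: extend_until_unique(toy, 2, 5, d) for d in ("left", "right", "both")}
```

    Running the three directions on the same motif yields three candidate unique sequences, which mirrors the "three possible unique sequences" mentioned in the abstract.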

  18. Software and database for the analysis of mutations in the human FBN1 gene.

    PubMed Central

    Collod, G; Béroud, C; Soussi, T; Junien, C; Boileau, C

    1996-01-01

    Fibrillin is the major component of extracellular microfibrils. Mutations in the fibrillin gene on chromosome 15 (FBN1) were first described in the heritable connective tissue disorder Marfan syndrome (MFS). More recently, FBN1 has also been shown to harbor mutations related to a spectrum of conditions phenotypically related to MFS, and many mutations will have to be accumulated before genotype/phenotype relationships emerge. To facilitate mutational analysis of the FBN1 gene, a software package and a computerized database (currently listing 63 entries) have been created. PMID:8594563

  19. How DNA barcoding can be more effective in microalgae identification: a case of cryptic diversity revelation in Scenedesmus (Chlorophyceae)

    PubMed Central

    Zou, Shanmei; Fei, Cong; Wang, Chun; Gao, Zhan; Bao, Yachao; He, Meilin; Wang, Changhai

    2016-01-01

    Microalgae identification is extremely difficult. The efficiency of DNA barcoding in microalgae identification depends on the gene markers and approaches employed, both of which are still being developed. Although Scenedesmus has been studied extensively for lipid production, its identification is difficult. Here we present a comprehensive coalescent, distance and character-based DNA barcoding analysis of 118 Scenedesmus strains based on rbcL, tufA, ITS and 16S. The four genes and their combined data (rbcL + tufA + ITS + 16S, rbcL + tufA, and ITS + 16S) were each analyzed by GMYC, P ID, PTP, ABGD and character-based barcoding. The three combined gene data sets showed a higher proportion of resolution success than the single genes. In comparison, the GMYC and PTP analyses produced more taxonomic lineages. ABGD gave varying resolution in discrimination among the single and combined data. Character-based barcoding proved to be the most effective approach for species discrimination in both single and combined data, producing consistent species identification. The integrated results recovered 11 species, five of which were revealed as potential cryptic species. We suggest that character-based DNA barcoding, together with other approaches based on multiple genes and their combined data, could be more effective in revealing microalgae diversity. PMID:27827440

  20. How DNA barcoding can be more effective in microalgae identification: a case of cryptic diversity revelation in Scenedesmus (Chlorophyceae).

    PubMed

    Zou, Shanmei; Fei, Cong; Wang, Chun; Gao, Zhan; Bao, Yachao; He, Meilin; Wang, Changhai

    2016-11-09

    Microalgae identification is extremely difficult. The efficiency of DNA barcoding in microalgae identification depends on the gene markers and approaches employed, both of which are still being developed. Although Scenedesmus has been studied extensively for lipid production, its identification is difficult. Here we present a comprehensive coalescent, distance and character-based DNA barcoding analysis of 118 Scenedesmus strains based on rbcL, tufA, ITS and 16S. The four genes and their combined data (rbcL + tufA + ITS + 16S, rbcL + tufA, and ITS + 16S) were each analyzed by GMYC, P ID, PTP, ABGD and character-based barcoding. The three combined gene data sets showed a higher proportion of resolution success than the single genes. In comparison, the GMYC and PTP analyses produced more taxonomic lineages. ABGD gave varying resolution in discrimination among the single and combined data. Character-based barcoding proved to be the most effective approach for species discrimination in both single and combined data, producing consistent species identification. The integrated results recovered 11 species, five of which were revealed as potential cryptic species. We suggest that character-based DNA barcoding, together with other approaches based on multiple genes and their combined data, could be more effective in revealing microalgae diversity.

  1. A benchmarking tool to evaluate computer tomography perfusion infarct core predictions against a DWI standard.

    PubMed

    Cereda, Carlo W; Christensen, Søren; Campbell, Bruce Cv; Mishra, Nishant K; Mlynash, Michael; Levi, Christopher; Straka, Matus; Wintermark, Max; Bammer, Roland; Albers, Gregory W; Parsons, Mark W; Lansberg, Maarten G

    2016-10-01

    Differences in research methodology have hampered the optimization of Computer Tomography Perfusion (CTP) for identification of the ischemic core. We aim to optimize CTP core identification using a novel benchmarking tool. The benchmarking tool consists of an imaging library and a statistical analysis algorithm to evaluate the performance of CTP. The tool was used to optimize and evaluate an in-house developed CTP-software algorithm. Imaging data of 103 acute stroke patients were included in the benchmarking tool. Median time from stroke onset to CT was 185 min (IQR 180-238), and the median time between completion of CT and start of MRI was 36 min (IQR 25-79). Volumetric accuracy of the CTP-ROIs was optimal at an rCBF threshold of <38%; at this threshold, the mean difference was 0.3 ml (SD 19.8 ml), the mean absolute difference was 14.3 (SD 13.7) ml, and CTP was 67% sensitive and 87% specific for identification of DWI positive tissue voxels. The benchmarking tool can play an important role in optimizing CTP software as it provides investigators with a novel method to directly compare the performance of alternative CTP software packages. © The Author(s) 2015.
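
    The voxel-wise comparison reported here (sensitivity, specificity and volume difference of a thresholded CTP map against a DWI reference) can be illustrated with a small sketch. The function name, the flat-list voxel representation and the 1 ml voxel size are assumptions for the example and are not taken from the benchmarking tool itself.

```python
def core_metrics(rcbf, dwi_positive, threshold=0.38, voxel_ml=1.0):
    """Compare a thresholded rCBF map with a DWI reference, voxel by voxel.

    rcbf         : relative CBF per voxel (fraction of normal, e.g. 0.30 = 30%)
    dwi_positive : True where the DWI reference marks infarct core
    threshold    : voxels with rCBF below this value are predicted core
    voxel_ml     : volume of one voxel in ml (assumed value)
    Assumes both classes are present, so neither denominator is zero.
    """
    predicted = [value < threshold for value in rcbf]
    tp = sum(p and t for p, t in zip(predicted, dwi_positive))
    fp = sum(p and not t for p, t in zip(predicted, dwi_positive))
    fn = sum(t and not p for p, t in zip(predicted, dwi_positive))
    tn = sum(not p and not t for p, t in zip(predicted, dwi_positive))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "volume_difference_ml": (sum(predicted) - sum(dwi_positive)) * voxel_ml,
    }


# Six made-up voxels: three DWI-positive, three DWI-negative.
metrics = core_metrics(
    rcbf=[0.10, 0.30, 0.50, 0.90, 0.35, 0.80],
    dwi_positive=[True, True, True, False, False, False],
)
```

    Sweeping `threshold` over a range and recording these metrics per patient is one way such a tool could locate the threshold with the best volumetric agreement.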

  2. Polymerase chain reaction-based identification of clinically relevant Pasteurellaceae isolated from cats and dogs in Poland.

    PubMed

    Król, Jaroslaw; Bania, Jacek; Florek, Magdalena; Pliszczak-Król, Aleksandra; Staroniewicz, Zdzislaw

    2011-05-01

    A set of polymerase chain reaction (PCR) assays for identification of the most important Pasteurellaceae species encountered in cats and dogs were developed. Primers for Pasteurella multocida were designed to detect a fragment of the kmt, a gene encoding the outer-membrane protein. Primers specific to Pasteurella canis, Pasteurella dagmatis, and Pasteurella stomatis were based on the manganese-dependent superoxide dismutase gene (sodA) and those specific to [Haemophilus] haemoglobinophilus on species-specific sequences of the 16S ribosomal RNA gene. All the primers were tested on respective reference and control strains and applied to the identification of 47 canine and feline field isolates of Pasteurellaceae. The PCR assays were shown to be species specific, providing a valuable supplement to phenotypic identification of species within this group of bacteria. © 2011 The Author(s)

  3. ExAtlas: An interactive online tool for meta-analysis of gene expression data.

    PubMed

    Sharov, Alexei A; Schlessinger, David; Ko, Minoru S H

    2015-12-01

    We have developed ExAtlas, an on-line software tool for meta-analysis and visualization of gene expression data. In contrast to existing software tools, ExAtlas compares multi-component data sets and generates results for all combinations (e.g. all gene expression profiles versus all Gene Ontology annotations). ExAtlas handles both users' own data and data extracted semi-automatically from the public repository (GEO/NCBI database). ExAtlas provides a variety of tools for meta-analyses: (1) standard meta-analysis (fixed effects, random effects, z-score, and Fisher's methods); (2) analyses of global correlations between gene expression data sets; (3) gene set enrichment; (4) gene set overlap; (5) gene association by expression profile; (6) gene specificity; and (7) statistical analysis (ANOVA, pairwise comparison, and PCA). ExAtlas produces graphical outputs, including heatmaps, scatter-plots, bar-charts, and three-dimensional images. Some of the most widely used public data sets (e.g. GNF/BioGPS, Gene Ontology, KEGG, GAD phenotypes, BrainScan, ENCODE ChIP-seq, and protein-protein interaction) are pre-loaded and can be used for functional annotations.
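
    Of the meta-analysis methods listed, Fisher's method is the simplest to show in code. The sketch below is a generic textbook version written for illustration (the function name is hypothetical, and it is not ExAtlas code): it combines independent p-values through the statistic X = -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values (each > 0) with Fisher's method.

    For even degrees of freedom 2k the chi-square upper tail has the
    closed form exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!, so no
    statistics library is needed.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i          # (X/2)^i / i!
        total += term
    return math.exp(-half_x) * total


combined = fisher_combined_p([0.05, 0.05])
```

    Two studies that each reach p = 0.05 combine to roughly p = 0.017, which is the kind of gain in evidence that motivates meta-analysis across data sets.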

  4. 49 CFR Appendix F to Part 236 - Minimum Requirements of FRA Directed Independent Third-Party Assessment of PTC System Safety...

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ..., national, or international standards. (f) The reviewer shall analyze all Fault Tree Analyses (FTA), Failure... cited by the reviewer; (4) Identification of any documentation or information sought by the reviewer...) Identification of the hardware and software verification and validation procedures for the PTC system's safety...

  5. Rapid identification and classification of Listeria spp. and serotype assignment of Listeria monocytogenes using fourier transform-infrared spectroscopy and artificial neural network analysis

    USDA-ARS's Scientific Manuscript database

    The use of Fourier Transform-Infrared Spectroscopy (FT-IR) in conjunction with Artificial Neural Network software, NeuroDeveloper™ was examined for the rapid identification and classification of Listeria species and serotyping of Listeria monocytogenes. A spectral library was created for 245 strains...

  6. A standardized framing for reporting protein identifications in mzIdentML 1.2

    PubMed Central

    Seymour, Sean L.; Farrah, Terry; Binz, Pierre-Alain; Chalkley, Robert J.; Cottrell, John S.; Searle, Brian C.; Tabb, David L.; Vizcaíno, Juan Antonio; Prieto, Gorka; Uszkoreit, Julian; Eisenacher, Martin; Martínez-Bartolomé, Salvador; Ghali, Fawaz; Jones, Andrew R.

    2015-01-01

    Inferring which protein species have been detected in bottom-up proteomics experiments has been a challenging problem for which solutions have been maturing over the past decade. While many inference approaches now function well in isolation, comparing and reconciling the results generated across different tools remains difficult. It presently stands as one of the greatest barriers in collaborative efforts such as the Human Proteome Project and public repositories like the PRoteomics IDEntifications (PRIDE) database. Here we present a framework for reporting protein identifications that seeks to improve capabilities for comparing results generated by different inference tools. This framework standardizes the terminology for describing protein identification results, associated with the HUPO-Proteomics Standards Initiative (PSI) mzIdentML standard, while still allowing for differing methodologies to reach that final state. It is proposed that developers of software for reporting identification results will adopt this terminology in their outputs. While the new terminology does not require any changes to the core mzIdentML model, it represents a significant change in practice, and, as such, the rules will be released via a new version of the mzIdentML specification (version 1.2) so that consumers of files are able to determine whether the new guidelines have been adopted by export software. PMID:25092112

  7. POTAMOS mass spectrometry calculator: computer aided mass spectrometry to the post-translational modifications of proteins. A focus on histones.

    PubMed

    Vlachopanos, A; Soupsana, E; Politou, A S; Papamokos, G V

    2014-12-01

    Mass spectrometry is a widely used technique for protein identification and it has also become the method of choice in order to detect and characterize the post-translational modifications (PTMs) of proteins. Many software tools have been developed to deal with this complication. In this paper we introduce a new, free and user friendly online software tool, named POTAMOS Mass Spectrometry Calculator, which was developed in the open source application framework Ruby on Rails. It can provide calculated mass spectrometry data in a time saving manner, independently of instrumentation. In this web application we have focused on a well known protein family of histones whose PTMs are believed to play a crucial role in gene regulation, as suggested by the so called "histone code" hypothesis. The PTMs implemented in this software are: methylations of arginines and lysines, acetylations of lysines and phosphorylations of serines and threonines. The application is able to calculate the kind, the number and the combinations of the possible PTMs corresponding to a given peptide sequence and a given mass along with the full set of the unique primary structures produced by the possible distributions along the amino acid sequence. It can also calculate the masses and charges of a fragmented histone variant, which carries predefined modifications already implemented. Additional functionality is provided by the calculation of the masses of fragments produced upon protein cleavage by the proteolytic enzymes that are most widely used in proteomics studies. Copyright © 2014 Elsevier Ltd. All rights reserved.
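
    The core calculation described here, finding which combinations of PTMs are compatible with a peptide sequence and a measured mass, can be sketched as below. This is an illustrative simplification and not the POTAMOS algorithm: the function name, the tolerance, and the site rules (up to three methyl groups per lysine, up to two per arginine, acetylation on lysine only, and no methylation of an acetylated lysine) are assumptions stated for the example. The mass shifts are standard monoisotopic values.

```python
# Monoisotopic mass shifts in Da.
MASS_SHIFT = {"methyl": 14.01565, "acetyl": 42.01057, "phospho": 79.96633}


def ptm_combinations(peptide, observed_shift, tol=0.01):
    """List PTM counts whose summed mass shift matches observed_shift.

    observed_shift is the measured peptide mass minus the unmodified mass.
    """
    n_k = peptide.count("K")
    n_r = peptide.count("R")
    n_st = peptide.count("S") + peptide.count("T")
    hits = []
    for n_ac in range(n_k + 1):
        # Assumed rule: an acetylated lysine cannot also carry methyl groups.
        max_me = 3 * (n_k - n_ac) + 2 * n_r
        for n_me in range(max_me + 1):
            for n_ph in range(n_st + 1):
                shift = (n_me * MASS_SHIFT["methyl"]
                         + n_ac * MASS_SHIFT["acetyl"]
                         + n_ph * MASS_SHIFT["phospho"])
                if abs(shift - observed_shift) <= tol:
                    hits.append({"methyl": n_me, "acetyl": n_ac, "phospho": n_ph})
    return hits


# Histone H3 residues 9-17; a +42.0106 Da shift at two mass tolerances.
tight = ptm_combinations("KSTGGKAPR", 42.0106, tol=0.01)
loose = ptm_combinations("KSTGGKAPR", 42.0106, tol=0.05)
```

    The example shows why mass accuracy matters for histones: acetylation (+42.011 Da) and trimethylation (+42.047 Da) differ by only about 0.036 Da, so the tight tolerance returns a single acetylation while the loose one also admits three methyl groups.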

  8. catcher: A Software Program to Detect Answer Copying in Multiple-Choice Tests Based on Nominal Response Model

    ERIC Educational Resources Information Center

    Kalender, Ilker

    2012-01-01

    catcher is a software program designed to compute the [omega] index, a common statistical index for the identification of collusions (cheating) among examinees taking an educational or psychological test. It requires (a) responses and (b) ability estimations of individuals, and (c) item parameters to make computations and outputs the results of…

  9. Hierarchical Segmentation Enhances Diagnostic Imaging

    NASA Technical Reports Server (NTRS)

    2007-01-01

    Bartron Medical Imaging LLC (BMI), of New Haven, Connecticut, gained a nonexclusive license from Goddard Space Flight Center to use the RHSEG software in medical imaging. To manage image data, BMI then licensed two pattern-matching software programs from NASA's Jet Propulsion Laboratory that were used in image analysis and three data-mining and edge-detection programs from Kennedy Space Center. More recently, BMI made NASA history by being the first company to partner with the Space Agency through a Cooperative Research and Development Agreement to develop a 3-D version of RHSEG. With U.S. Food and Drug Administration clearance, BMI will sell its Med-Seg imaging system with the 2-D version of the RHSEG software to analyze medical imagery from CAT and PET scans, MRI, ultrasound, digitized X-rays, digitized mammographies, dental X-rays, soft tissue analyses, moving object analyses, and soft-tissue slides such as Pap smears for the diagnoses and management of diseases. Extending the software's capabilities to three dimensions will eventually enable production of pixel-level views of a tumor or lesion, early identification of plaque build-up in arteries, and identification of density levels of microcalcification in mammographies.

  10. Specification-based software sizing: An empirical investigation of function metrics

    NASA Technical Reports Server (NTRS)

    Jeffery, Ross; Stathis, John

    1993-01-01

    For some time the software industry has espoused the need for improved specification-based software size metrics. This paper reports on a study of nineteen recently developed systems in a variety of application domains. The systems were developed by a single software services corporation using a variety of languages. The study investigated several metric characteristics. It shows that: earlier research into inter-item correlation within the overall function count is partially supported; a priori function counts, in themselves, do not explain the majority of the effort variation in software development in the organization studied; documentation quality is critical to accurate function identification; and rater error is substantial in manual function counting. The implications of these findings for organizations using function-based metrics are explored.

  11. The Tetracorder user guide: version 4.4

    USGS Publications Warehouse

    Livo, Keith Eric; Clark, Roger N.

    2014-01-01

    Imaging spectroscopy mapping software assists in the identification and mapping of materials based on their chemical properties as expressed in spectral measurements of a planet including the solid or liquid surface or atmosphere. Such software can be used to analyze field, aircraft, or spacecraft data; remote sensing datasets; or laboratory spectra. Tetracorder is a set of software algorithms commanded through an expert system to identify materials based on their spectra (Clark and others, 2003). Tetracorder also can be used in traditional remote sensing analyses, because some of the algorithms are a version of a matched filter. Thus, depending on the instructions fed to the Tetracorder system, results can range from simple matched filter output, to spectral feature fitting, to full identification of surface materials (within the limits of the spectral signatures of materials over the spectral range and resolution of the imaging spectroscopy data). A basic understanding of spectroscopy by the user is required for developing an optimum mapping strategy and assessing the results.

  12. Multi-Agent Diagnosis and Control of an Air Revitalization System for Life Support in Space

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.; Kowing, Jeffrey; Nieten, Joseph; Graham, Jeffrey S.; Schreckenghost, Debra; Bonasso, Pete; Fleming, Land D.; MacMahon, Matt; Thronesbery, Carroll

    2000-01-01

    An architecture of interoperating agents has been developed to provide control and fault management for advanced life support systems in space. In this adjustable autonomy architecture, software agents coordinate with human agents and provide support in novel fault management situations. This architecture combines the Livingstone model-based mode identification and reconfiguration (MIR) system with the 3T architecture for autonomous flexible command and control. The MIR software agent performs model-based state identification and diagnosis. MIR identifies novel recovery configurations and the set of commands required for the recovery. The 3T procedural executive and the human operator use the diagnoses and recovery recommendations, and provide command sequencing. User interface extensions have been developed to support human monitoring of both 3T and MIR data and activities. This architecture has been demonstrated performing control and fault management for an oxygen production system for air revitalization in space. The software operates in a dynamic simulation testbed.

  13. Comparison of growth on mannitol salt agar, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry, VITEK® 2 with partial sequencing of 16S rRNA gene for identification of coagulase-negative staphylococci.

    PubMed

    Ayeni, Funmilola A; Andersen, Camilla; Nørskov-Lauritsen, Niels

    2017-04-01

    Mannitol salt agar (MSA) is often used in resource-limited laboratories for identification of S. aureus; however, coagulase-negative staphylococci (CoNS) also grow on MSA and ferment mannitol. A total of 171 strains of CoNS that had previously been misidentified as S. aureus because of growth on MSA were collected from different locations in Nigeria, and two methods for identification of CoNS, VITEK 2 and MALDI-TOF MS, were compared, with partial 16S rRNA gene sequencing as the gold standard. Partial tuf gene sequencing was used to resolve contradictory identifications. All 171 strains (13 species) grew on MSA and fermented mannitol. All tested strains of S. epidermidis, S. haemolyticus, S. nepalensis, S. pasteuri, S. sciuri, S. warneri, S. xylosus and S. capitis were correctly identified by MALDI-TOF, while variable identification was observed for S. saprophyticus and S. cohnii (90% and 81%, respectively). Identification of S. arlettae was low (14%), and all strains of S. kloosii and S. gallinarum were misidentified. S. gallinarum was absent from the MALDI-TOF database at the time of this study. All tested strains of S. epidermidis, S. gallinarum, S. haemolyticus, S. sciuri, S. warneri, S. xylosus and S. capitis were correctly identified by VITEK, while variable identification was observed for S. saprophyticus, S. arlettae, S. cohnii and S. kloosii (84%, 86%, 75% and 60%, respectively), and S. nepalensis and S. pasteuri were misidentified. Partial sequencing of the 16S rRNA gene was used as the gold standard for most strains except S. capitis and S. xylosus, which were misidentified by partial 16S rRNA sequencing, contrary to the MALDI-TOF and VITEK identifications; tuf gene sequencing was used for their correct identification. The characteristic growth of CoNS on MSA is identical to that of S. aureus, and MSA therefore could not differentiate between S. aureus and CoNS. The percentage accuracy of VITEK was better than that of MALDI-TOF in identification of CoNS. Although partial sequencing of the 16S rRNA gene was used as the gold standard in this study, it could not correctly identify S. capitis and S. xylosus. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. [Isolation and identification of cow-origin Cryptosporidium isolates in Hefei].

    PubMed

    Sun, Tao; Liu, Wei; Wang, Ju-Hua; Xue, Xiu-Heng; Zhao, Chang-Cheng; Li, Pei-Ying

    2011-12-01

    The aim was to isolate cow-origin Cryptosporidium in Hefei and identify its species. A total of 285 dairy cattle fecal samples collected from a farm in Hefei were examined by saturated sucrose solution flotation and modified acid-fast staining. Cryptosporidium oocysts were isolated and purified from positive fecal samples. Genomic DNA was extracted for use as the template. Primers were designed and synthesized according to the sequences of the 18S rRNA gene and HSP70 gene of Cryptosporidium sp. The target fragments were amplified by PCR and nested PCR. The nested PCR products were cloned and sequenced. Homology searches and phylogenetic tree construction were done with DNAStar software. Five fecal samples were positive by morphological methods, an infection rate of 1.8% (5/285). Oocysts from the 5 positive fecal samples were elliptical or ovoid; by saturated sucrose solution flotation and modified acid-fast staining they measured 7.37 µm × 6.13 µm and 7.58 µm × 6.20 µm, with shape indices of 1.20 and 1.22, respectively. Nested PCR yielded 18S rRNA and HSP70 gene fragments of approximately 250 bp and 325 bp, respectively. The five isolates showed a high level of nucleic acid identity with the 18S rRNA gene sequence of Cryptosporidium andersoni (DQ989573), and they clustered in the same clade. The highest HSP70 gene sequence identity was found between the five isolates and other reported C. andersoni isolates (AY954892 and DQ989576), and they were placed in the same clade. The cow-origin Cryptosporidium isolates derived from Hefei are Cryptosporidium andersoni.

  15. Identification, expression and phylogenetic analysis of EgG1Y162 from Echinococcus granulosus

    PubMed Central

    Zhang, Fengbo; Ma, Xiumin; Zhu, Yuejie; Wang, Hongying; Liu, Xianfei; Zhu, Min; Ma, Haimei; Wen, Hao; Fan, Haining; Ding, Jianbing

    2014-01-01

    Objective: This study aimed to clone, identify and analyze the characteristics of the egG1Y162 gene from Echinococcus granulosus. Methods: Genomic DNA and total RNA were extracted from four different developmental stages of Echinococcus granulosus (protoscolex, germinal layer, adult and egg). Fluorescent quantitative PCR was used to analyze the expression of the egG1Y162 gene. The prokaryotic expression plasmid pET41a-EgG1Y162 was constructed to express recombinant His-EgG1Y162 antigen. Western blot analysis was performed to detect the antigenicity of the EgG1Y162 antigen. The gene sequence, amino acid alignment and phylogenetic tree of EgG1Y162 were analyzed with BLAST, online Spidey and MEGA4 software, respectively. Results: The egG1Y162 gene was expressed in all four developmental stages of Echinococcus granulosus. Expression was highest in the adult stage, with a relative value of 19.526, significantly higher than in the other three stages. Additionally, Western blot analysis revealed that EgG1Y162 recombinant protein reacted well with serum samples from Echinococcus granulosus-infected humans and dogs. Moreover, EgG1Y162 antigen was phylogenetically closest to EmY162 antigen, with a similarity of over 90%. Conclusion: Our study identified EgG1Y162 antigen in Echinococcus granulosus for the first time. EgG1Y162 antigen had a high similarity with EmY162 antigen, with the genetic differences mainly in the intron region, and EgG1Y162 recombinant protein showed good antigenicity. PMID:25337206

  16. DM-BLD: differential methylation detection using a hierarchical Bayesian model exploiting local dependency.

    PubMed

    Wang, Xiao; Gu, Jinghua; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua

    2017-01-15

    The advent of high-throughput DNA methylation profiling techniques has enabled accurate identification of differentially methylated genes for cancer research. The large number of measured loci facilitates whole-genome methylation studies, yet poses great challenges for differential methylation detection because of the high variability in tumor samples. We have developed a novel probabilistic approach, Differential Methylation detection using a hierarchical Bayesian model exploiting Local Dependency (DM-BLD), to detect differentially methylated genes based on a Bayesian framework. The DM-BLD approach features a joint model to capture both the local dependency of measured loci and the dependency of methylation change in samples. Specifically, the local dependency is modeled by a Leroux conditional autoregressive structure; the dependency of methylation changes is modeled by a discrete Markov random field. A hierarchical Bayesian model is developed to fully take into account the local dependency for differential analysis, in which differential states are embedded as hidden variables. Simulation studies demonstrate that DM-BLD outperforms existing methods for differential methylation detection, particularly when the methylation change is moderate and the variability of methylation in samples is high. DM-BLD has been applied to breast cancer data to identify important methylated genes (such as polycomb target genes and genes involved in transcription factor activity) associated with breast cancer recurrence. A Matlab package of DM-BLD is available at http://www.cbil.ece.vt.edu/software.htm. Contact: Xuan@vt.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  17. Identification of conserved drought stress responsive gene-network across tissues and developmental stages in rice.

    PubMed

    Smita, Shuchi; Katiyar, Amit; Pandey, Dev Mani; Chinnusamy, Viswanathan; Archak, Sunil; Bansal, Kailash Chander

    2013-01-01

    Identification of genes that are coexpressed across various tissues and environmental stresses is biologically interesting, since they may play coordinated roles in similar biological processes. Genes with correlated expression patterns can best be identified by coexpression network analysis of transcriptome data. In the present study, we analyzed the temporal-spatial coordination of gene expression in root, leaf and panicle of rice under drought stress and constructed a network using WGCNA and Cytoscape. A total of 2199 differentially expressed genes (DEGs) were identified in at least three tissues, of which 88 genes had a coordinated expression profile among all the six tissues under drought stress. These 88 highly coordinated genes were further subjected to module identification in the coexpression network. Based on the chief topological properties, we identified 18 hub genes, such as ABC transporter, ATP-binding protein, dehydrin, protein phosphatase 2C, LTPL153 - Protease inhibitor, phosphatidylethanolamine-binding protein, lactose permease-related, NADP-dependent malic enzyme, etc. Motif enrichment analysis showed the presence of ABRE cis-elements in the promoters of > 62% of the coordinately expressed genes. Our results suggest that drought-stress-mediated upregulation of gene expression was coordinated through an ABA-dependent signaling pathway across tissues, at least for the subset of genes identified in this study, while downregulation appears to be regulated by tissue-specific pathways in rice.
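
    The idea behind coexpression hub detection can be sketched in a few lines. WGCNA is an R package; the Python below is only a stand-in that follows its unsigned soft-threshold scheme (adjacency = |Pearson r| raised to a power β, connectivity = sum of a gene's adjacencies). The function names, the toy expression values and β = 6 are assumptions for illustration, and the study's actual hub criteria may have used further topological properties.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def connectivity_ranking(expr, beta=6):
    """Rank genes by connectivity in an unsigned soft-thresholded network.

    expr maps gene name -> expression values across samples.
    Returns (gene, connectivity) pairs, most connected first.
    """
    genes = list(expr)
    conn = {g: 0.0 for g in genes}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            adjacency = abs(pearson(expr[a], expr[b])) ** beta
            conn[a] += adjacency
            conn[b] += adjacency
    return sorted(conn.items(), key=lambda item: item[1], reverse=True)


# Toy data: g1, g2 and g3 are tightly (anti)correlated, g4 is not.
toy_expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 1, 3],
}
ranking = connectivity_ranking(toy_expr)
```

    Raising the correlation to a power suppresses weak correlations, so loosely coupled genes such as g4 end up with connectivity near zero while candidate hubs stand out.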

  18. Gene finding in metatranscriptomic sequences.

    PubMed

    Ismail, Wazim Mohammed; Ye, Yuzhen; Tang, Haixu

    2014-01-01

    Metatranscriptomic sequencing is a highly sensitive bioassay of functional activity in a microbial community, providing complementary information to the metagenomic sequencing of the community. The acquisition of the metatranscriptomic sequences will enable us to refine the annotations of the metagenomes, and to study the gene activities and their regulation in complex microbial communities and their dynamics. In this paper, we present TransGeneScan, a software tool for finding genes in assembled transcripts from metatranscriptomic sequences. By incorporating several features of metatranscriptomic sequencing, including strand-specificity, short intergenic regions, and putative antisense transcripts into a Hidden Markov Model, TransGeneScan can predict a sense transcript containing one or multiple genes (in an operon) or an antisense transcript. We tested TransGeneScan on a mock metatranscriptomic data set containing three known bacterial genomes. The results showed that TransGeneScan performs better than metagenomic gene finders (MetaGeneMark and FragGeneScan) on predicting protein coding genes in assembled transcripts, and achieves comparable or even higher accuracy than gene finders for microbial genomes (Glimmer and GeneMark). These results imply that, with the assistance of metatranscriptomic sequencing, we can obtain a broad and precise picture of the genes (and their functions) in a microbial community. TransGeneScan is available as open-source software on SourceForge at https://sourceforge.net/projects/transgenescan/.

  19. Identification of optimal reference genes for RT-qPCR in the rat hypothalamus and intestine for the study of obesity.

    PubMed

    Li, B; Matter, E K; Hoppert, H T; Grayson, B E; Seeley, R J; Sandoval, D A

    2014-02-01

    Obesity has a complicated metabolic pathology, and defining the underlying mechanisms of obesity requires integrative studies with molecular end points. Real-time quantitative PCR (RT-qPCR) is a powerful tool that has been widely utilized. However, the importance of using carefully validated reference genes in RT-qPCR seems to have been overlooked in obesity-related research. The objective of this study was to select a set of reference genes with stable expressions to be used for RT-qPCR normalization in rats under fasted vs re-fed and chow vs high-fat diet (HFD) conditions. Male Long-Evans rats were treated under four conditions: chow/fasted, chow/re-fed, HFD/fasted and HFD/re-fed. Expression stabilities of 13 candidate reference genes were evaluated in the rat hypothalamus, duodenum, jejunum and ileum using the ReFinder software program. The optimal number of reference genes needed for RT-qPCR analyses was determined using geNorm. Using geNorm analysis, we found that it was sufficient to use the two most stably expressed genes as references in RT-qPCR analyses for each tissue under specific experimental conditions. B2M and RPLP0 in the hypothalamus, RPS18 and HMBS in the duodenum, RPLP2 and RPLP0 in the jejunum and RPS18 and YWHAZ in the ileum were the most suitable pairs for a normalization study when the four aforementioned experimental conditions were considered. Our study demonstrates that gene expression levels of reference genes commonly used in obesity-related studies, such as ACTB or RPS18, are altered by changes in acute or chronic energy status. These findings underline the importance of using reference genes that are stable in expression across experimental conditions when studying the rat hypothalamus and intestine, because these tissues have an integral role in the regulation of energy homeostasis. It is our hope that this study will raise awareness among obesity researchers on the essential need for reference gene validation in gene expression studies.
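
    The stability ranking behind this kind of reference-gene selection can be sketched as follows. The code computes a geNorm-style stability value M for each candidate gene: the mean, over all other candidates, of the standard deviation of the log2 expression ratio across samples (lower M means more stable). This is a simplified reimplementation for illustration, not the geNorm program. The gene labels and relative quantities are hypothetical, not data from the study.

```python
import math
import statistics


def genorm_m(quantities):
    """Return {gene: M} from {gene: [relative quantity per sample]}.

    M for a gene is the average pairwise variation with every other candidate,
    where pairwise variation is the standard deviation of log2 ratios.
    """
    genes = list(quantities)
    m_values = {}
    for gene_j in genes:
        pairwise = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(quantities[gene_j], quantities[gene_k])
            ]
            pairwise.append(statistics.stdev(log_ratios))
        m_values[gene_j] = sum(pairwise) / len(pairwise)
    return m_values


# Hypothetical relative quantities for three candidates in four samples.
# geneA and geneB keep a constant ratio; geneC fluctuates independently.
example = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],
    "geneC": [1.0, 8.0, 1.0, 8.0],
}
m = genorm_m(example)
for gene, value in sorted(m.items(), key=lambda item: item[1]):
    print(f"{gene}: M = {value:.3f}")
```

    In practice the least stable gene is removed and M is recomputed until the most stable pair remains; the sketch shows only a single round.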

  20. DNA typing from skeletal remains following an explosion in a military fort--first experience in Ecuador (South-America).

    PubMed

    González-Andrade, Fabricio; Sánchez, Dora

    2005-10-01

    We present individual body identification efforts to identify skeletal remains and relatives of persons missing after an explosion that took place inside one of the munitions recesses of the Armoured Brigade of the Galapagos Armoured Cavalry, in the city of Riobamba, Ecuador, on Wednesday, November 20, 2002. Nineteen samples of bone remains and two tissue samples (a blood stain on a piece of fabric) from the zero zone were analysed. DNA was extracted with phenol-chloroform-isoamyl alcohol and proteinase K. In some cases we increased the number of PCR cycles to 35 to identify DNA from bones. An ABI 310 sequencer was used. Determination of the fragment size and the allelic designation of the different loci was carried out by comparison with the allelic ladders of the PowerPlex 16 kit and the Gene Scan Analysis Software programme. Five possible family groups were established and were compared with the profiles found. Classical Bayesian methods were used to calculate the Likelihood Ratio and it was possible to identify five different genetic profiles in our country. This paper is important because it describes a novel experience for our forensic services: this was the first time DNA had been used as an identification method in disasters, and it was validated by the Ecuadorian justice system as a very effective method.

  1. An automated graphics tool for comparative genomics: the Coulson plot generator

    PubMed Central

    2013-01-01

    Background Comparative analysis is an essential component to biology. When applied to genomics for example, analysis may require comparisons between the predicted presence and absence of genes in a group of genomes under consideration. Frequently, genes can be grouped into small categories based on functional criteria, for example membership of a multimeric complex, participation in a metabolic or signaling pathway or shared sequence features and/or paralogy. These patterns of retention and loss are highly informative for the prediction of function, and hence possible biological context, and can provide great insights into the evolutionary history of cellular functions. However, representation of such information in a standard spreadsheet is a poor visual means from which to extract patterns within a dataset. Results We devised the Coulson Plot, a new graphical representation that exploits a matrix of pie charts to display comparative genomics data. Each pie is used to describe a complex or process from a separate taxon, and is divided into sectors corresponding to the number of proteins (subunits) in a complex/process. The predicted presence or absence of proteins in each complex are delineated by occupancy of a given sector; this format is visually highly accessible and makes pattern recognition rapid and reliable. A key to the identity of each subunit, plus hierarchical naming of taxa and coloring are included. A java-based application, the Coulson plot generator (CPG) automates graphic production, with a tab or comma-delineated text file as input and generating an editable portable document format or svg file. Conclusions CPG software may be used to rapidly convert spreadsheet data to a graphical matrix pie chart format. The representation essentially retains all of the information from the spreadsheet but presents a graphically rich format making comparisons and identification of patterns significantly clearer. While the Coulson plot format is highly useful in comparative genomics, its original purpose, the software can be used to visualize any dataset where entity occupancy is compared between different classes. Availability CPG software is available at sourceforge http://sourceforge.net/projects/coulson and http://dl.dropbox.com/u/6701906/Web/Sites/Labsite/CPG.html PMID:23621955

  2. Identification of new stress-induced microRNA and their targets in wheat using computational approach.

    PubMed

    Pandey, Bharati; Gupta, Om Prakash; Pandey, Dev Mani; Sharma, Indu; Sharma, Pradeep

    2013-05-01

    MicroRNAs (miRNAs) are a class of short endogenous non-coding small RNA molecules of about 18-22 nucleotides in length. Their main function is to downregulate gene expression in different manners like translational repression, mRNA cleavage and epigenetic modification. Computational predictions have raised the number of miRNAs in wheat significantly using an EST based approach. Hence, a combinatorial approach amalgamating bioinformatics software and Perl scripts was used to identify new miRNAs to add to the growing database of wheat miRNA. Identification of miRNAs was initiated by mining the EST (Expressed Sequence Tags) database available at the National Center for Biotechnology Information. In this investigation, 4677 mature microRNA sequences belonging to 50 miRNA families from different plant species were used to predict miRNA in wheat. A total of five abiotic stress-responsive new miRNAs were predicted and named Ta-miR5653, Ta-miR855, Ta-miR819k, Ta-miR3708 and Ta-miR5156. In addition, four previously identified miRNAs, i.e., Ta-miR1122, miR1117, Ta-miR1134 and Ta-miR1133, were predicted in newly identified EST sequences, and 14 potential target genes were subsequently predicted, most of which seem to encode ubiquitin carrier protein, serine/threonine protein kinase, 40S ribosomal protein, F-box/kelch-repeat protein, BTB/POZ domain-containing protein, and transcription factors, which are involved in growth, development, metabolism and stress response. Our result has increased the number of miRNAs in wheat, which should be useful for further investigation into the biological functions and evolution of miRNAs in wheat and other plant species.

  3. Navigating Microbiological Food Safety in the Era of Whole-Genome Sequencing

    PubMed Central

    Nasheri, Neda; Petronella, Nicholas; Pagotto, Franco

    2016-01-01

    SUMMARY The epidemiological investigation of a foodborne outbreak, including identification of related cases, source attribution, and development of intervention strategies, relies heavily on the ability to subtype the etiological agent at a high enough resolution to differentiate related from nonrelated cases. Historically, several different molecular subtyping methods have been used for this purpose; however, emerging techniques, such as single nucleotide polymorphism (SNP)-based techniques, that use whole-genome sequencing (WGS) offer a resolution that was previously not possible. With WGS, unlike traditional subtyping methods that lack complete information, data can be used to elucidate phylogenetic relationships and disease-causing lineages can be tracked and monitored over time. The subtyping resolution and evolutionary context provided by WGS data allow investigators to connect related illnesses that would be missed by traditional techniques. The added advantage of data generated by WGS is that these data can also be used for secondary analyses, such as virulence gene detection, antibiotic resistance gene profiling, synteny comparisons, mobile genetic element identification, and geographic attribution. In addition, several software packages are now available to generate in silico results for traditional molecular subtyping methods from the whole-genome sequence, allowing for efficient comparison with historical databases. Metagenomic approaches using next-generation sequencing have also been successful in the detection of nonculturable foodborne pathogens. This review addresses state-of-the-art techniques in microbial WGS and analysis and then discusses how this technology can be used to help support food safety investigations. Retrospective outbreak investigations using WGS are presented to provide organism-specific examples of the benefits, and challenges, associated with WGS in comparison to traditional molecular subtyping techniques. PMID:27559074

  4. DEIsoM: a hierarchical Bayesian model for identifying differentially expressed isoforms using biological replicates

    PubMed Central

    Peng, Hao; Yang, Yifan; Zhe, Shandian; Wang, Jian; Gribskov, Michael; Qi, Yuan

    2017-01-01

    Abstract Motivation High-throughput mRNA sequencing (RNA-Seq) is a powerful tool for quantifying gene expression. Identification of transcript isoforms that are differentially expressed in different conditions, such as in patients and healthy subjects, can provide insights into the molecular basis of diseases. Current transcript quantification approaches, however, do not take advantage of the shared information in the biological replicates, potentially decreasing sensitivity and accuracy. Results We present a novel hierarchical Bayesian model called Differentially Expressed Isoform detection from Multiple biological replicates (DEIsoM) for identifying differentially expressed (DE) isoforms from multiple biological replicates representing two conditions, e.g. multiple samples from healthy and diseased subjects. DEIsoM first estimates isoform expression within each condition by (1) capturing common patterns from sample replicates while allowing individual differences, and (2) modeling the uncertainty introduced by ambiguous read mapping in each replicate. Specifically, we introduce a Dirichlet prior distribution to capture the common expression pattern of replicates from the same condition, and treat the isoform expression of individual replicates as samples from this distribution. Ambiguous read mapping is modeled as a multinomial distribution, and ambiguous reads are assigned to the most probable isoform in each replicate. Additionally, DEIsoM couples an efficient variational inference and a post-analysis method to improve the accuracy and speed of identification of DE isoforms over alternative methods. Application of DEIsoM to a hepatocellular carcinoma (HCC) dataset identifies biologically relevant DE isoforms. The relevance of these genes/isoforms to HCC is supported by principal component analysis (PCA), read coverage visualization, and the biological literature. Availability and implementation The software is available at https://github.com/hao-peng/DEIsoM Contact pengh@alumni.purdue.edu Supplementary information Supplementary data are available at Bioinformatics online. PMID:28595376

  5. Evaluation of tools for highly variable gene discovery from single-cell RNA-seq data.

    PubMed

    Yip, Shun H; Sham, Pak Chung; Wang, Junwen

    2018-02-21

    Traditional RNA sequencing (RNA-seq) allows the detection of gene expression variations between two or more cell populations through differentially expressed gene (DEG) analysis. However, genes that contribute to cell-to-cell differences are not discoverable with RNA-seq because RNA-seq samples are obtained from a mixture of cells. Single-cell RNA-seq (scRNA-seq) allows the detection of gene expression in each cell. With scRNA-seq, highly variable gene (HVG) discovery allows the detection of genes that contribute strongly to cell-to-cell variation within a homogeneous cell population, such as a population of embryonic stem cells. This analysis is implemented in many software packages. In this study, we compare seven HVG methods from six software packages, including BASiCS, Brennecke, scLVM, scran, scVEGs and Seurat. Our results demonstrate that reproducibility in HVG analysis requires a larger sample size than DEG analysis. Discrepancies between methods and potential issues in these tools are discussed and recommendations are made.
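
    The basic quantity behind many HVG approaches can be shown with a short sketch: the squared coefficient of variation (variance divided by squared mean) of each gene across cells. This is a deliberately simplified ranking and is not one of the seven methods compared in the study; several of those fit a trend of variability against mean expression and test for deviation from it. The gene labels and counts are hypothetical.

```python
import statistics


def squared_cv(counts):
    """Squared coefficient of variation (sample variance / mean**2) for one gene."""
    mean = statistics.fmean(counts)
    if mean == 0:
        return 0.0  # a gene that is never detected carries no variability signal
    return statistics.variance(counts) / mean ** 2


def rank_by_variability(expression):
    """Sort gene names from most to least variable across cells."""
    return sorted(expression, key=lambda g: squared_cv(expression[g]), reverse=True)


# Hypothetical normalised counts for three genes in four cells.
cells = {
    "geneA": [10, 10, 10, 10],  # constant across cells
    "geneB": [0, 20, 0, 20],    # on/off pattern, same mean as geneA
    "geneC": [9, 11, 10, 10],   # small fluctuations
}
ranking = rank_by_variability(cells)
print(ranking)
```

    All three toy genes share the same mean, so the ranking reflects variability alone; with real data, technical noise depends strongly on the mean, which is why dedicated HVG methods model that relationship.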

  6. Functional clustering of time series gene expression data by Granger causality

    PubMed Central

    2012-01-01

    Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
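
    A pairwise Granger causality test can be sketched as follows. The code compares a restricted autoregressive model of y (using only the past of y) with an unrestricted model that also includes the past of x, and returns the F statistic for the improvement in fit. This bivariate form is an assumption made for illustration; the abstract describes causality between and within sets of genes, which requires a multivariate extension not shown here. The function name and simulated series are hypothetical.

```python
import numpy as np


def granger_f(x, y, lag=1):
    """F statistic for the hypothesis that x Granger-causes y at the given lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs = len(y) - lag
    target = y[lag:]
    # Column k holds the series shifted back by k time points.
    y_lags = np.column_stack([y[lag - k:len(y) - k] for k in range(1, lag + 1)])
    x_lags = np.column_stack([x[lag - k:len(x) - k] for k in range(1, lag + 1)])
    ones = np.ones((n_obs, 1))
    restricted = np.hstack([ones, y_lags])
    unrestricted = np.hstack([ones, y_lags, x_lags])

    def rss(design):
        coef = np.linalg.lstsq(design, target, rcond=None)[0]
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = n_obs - unrestricted.shape[1]
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Simulated example: y follows x with a delay of one time point.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.empty(200)
y[0] = 0.0
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=199)

f_xy = granger_f(x, y)
f_yx = granger_f(y, x)
print(f"x -> y: F = {f_xy:.1f}; y -> x: F = {f_yx:.1f}")
```

    The asymmetry between the two statistics is what gives the inferred network a direction, in contrast to correlation-based clustering.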

  7. Using PATIMDB to Create Bacterial Transposon Insertion Mutant Libraries

    PubMed Central

    Urbach, Jonathan M.; Wei, Tao; Liberati, Nicole; Grenfell-Lee, Daniel; Villanueva, Jacinto; Wu, Gang; Ausubel, Frederick M.

    2015-01-01

    PATIMDB is a software package for facilitating the generation of transposon mutant insertion libraries. The software has two main functions: process tracking and automated sequence analysis. The process tracking function specifically includes recording the status and fates of multiwell plates and samples in various stages of library construction. Automated sequence analysis refers specifically to the pipeline of sequence analysis starting with ABI files from a sequencing facility and ending with insertion location identifications. The protocols in this unit describe installation and use of PATIMDB software. PMID:19343706

  8. IEEE/AIAA/NASA Digital Avionics Systems Conference, 9th, Virginia Beach, VA, Oct. 15-18, 1990, Proceedings

    NASA Technical Reports Server (NTRS)

    1990-01-01

    The present conference on digital avionics discusses vehicle-management systems, spacecraft avionics, special vehicle avionics, communication/navigation/identification systems, software qualification and quality assurance, launch-vehicle avionics, Ada applications, sensor and signal processing, general aviation avionics, automated software development, design-for-testability techniques, and avionics-software engineering. Also discussed are optical technology and systems, modular avionics, fault-tolerant avionics, commercial avionics, space systems, data buses, crew-station technology, embedded processors and operating systems, AI and expert systems, data links, and pilot/vehicle interfaces.

  9. [Measurement of intracranial hematoma volume by personal computer].

    PubMed

    DU, Wanping; Tan, Lihua; Zhai, Ning; Zhou, Shunke; Wang, Rui; Xue, Gongshi; Xiao, An

    2011-01-01

    To explore the method for intracranial hematoma volume measurement by the personal computer. Forty cases of various intracranial hematomas were measured by the computer tomography with quantitative software and personal computer with Photoshop CS3 software, respectively. The data from the 2 methods were analyzed and compared. There was no difference between the data from the computer tomography and the personal computer (P>0.05). The personal computer with Photoshop CS3 software can measure the volume of various intracranial hematomas precisely, rapidly and simply. It should be recommended for clinical medicolegal identification.

  10. Astrometrica: Astrometric data reduction of CCD images

    NASA Astrophysics Data System (ADS)

    Raab, Herbert

    2012-03-01

    Astrometrica is an interactive software tool for scientific grade astrometric data reduction of CCD images. The current version of the software is for the Windows 32bit operating system family. Astrometrica reads FITS (8, 16 and 32 bit integer files) and SBIG image files. The size of the images is limited only by available memory. It also offers automatic image calibration (Dark Frame and Flat Field correction), automatic reference star identification, automatic moving object detection and identification, and access to new-generation star catalogs (PPMXL, UCAC 3 and CMC-14), in addition to online help and other features. Astrometrica is shareware, available for use for a limited period of time (100 days) for free; special arrangements can be made for educational projects.

  11. Genome-Wide Temporal Expression Profiling in Caenorhabditis elegans Identifies a Core Gene Set Related to Long-Term Memory.

    PubMed

    Freytag, Virginie; Probst, Sabine; Hadziselimovic, Nils; Boglari, Csaba; Hauser, Yannick; Peter, Fabian; Gabor Fenyves, Bank; Milnik, Annette; Demougin, Philippe; Vukojevic, Vanja; de Quervain, Dominique J-F; Papassotiropoulos, Andreas; Stetak, Attila

    2017-07-12

    The identification of genes related to encoding, storage, and retrieval of memories is a major interest in neuroscience. In the current study, we analyzed the temporal gene expression changes in a neuronal mRNA pool during an olfactory long-term associative memory (LTAM) in Caenorhabditis elegans hermaphrodites. Here, we identified a core set of 712 (538 upregulated and 174 downregulated) genes that follows three distinct temporal peaks demonstrating multiple gene regulation waves in LTAM. Compared with the previously published positive LTAM gene set (Lakhina et al., 2015), 50% of the identified upregulated genes here overlap with the previous dataset, possibly representing stimulus-independent memory-related genes. On the other hand, the remaining genes were not previously identified in positive associative memory and may specifically regulate aversive LTAM. Our results suggest a multistep gene activation process during the formation and retrieval of long-term memory and define general memory-implicated genes as well as conditioning-type-dependent gene sets. SIGNIFICANCE STATEMENT The identification of genes regulating different steps of memory is of major interest in neuroscience. Identification of common memory genes across different learning paradigms and the temporal activation of the genes are poorly studied. Here, we investigated the temporal aspects of Caenorhabditis elegans gene expression changes using aversive olfactory associative long-term memory (LTAM) and identified three major gene activation waves. Like in previous studies, aversive LTAM is also CREB dependent, and CREB activity is necessary immediately after training. Finally, we define a list of memory paradigm-independent core gene sets as well as conditioning-dependent genes. Copyright © 2017 the authors 0270-6474/17/376661-12$15.00/0.

  12. Identification of blood meal sources of Lutzomyia longipalpis using polymerase chain reaction-restriction fragment length polymorphism analysis of the cytochrome B gene

    PubMed Central

    Soares, Vítor Yamashiro Rocha; da Silva, Jailthon Carlos; da Silva, Kleverton Ribeiro; Cruz, Maria do Socorro Pires e; Santos, Marcos Pérsio Dantas; Ribolla, Paulo Eduardo Martins; Alonso, Diego Peres; Coelho, Luiz Felipe Leomil; Costa, Dorcas Lamounier; Costa, Carlos Henrique Nery

    2014-01-01

    An analysis of the dietary content of haematophagous insects can provide important information about the transmission networks of certain zoonoses. The present study evaluated the potential of polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis of the mitochondrial cytochrome B (cytb) gene to differentiate between vertebrate species that were identified as possible sources of sandfly meals. The complete cytb gene sequences of 11 vertebrate species available in the National Center for Biotechnology Information database were digested with Aci I, Alu I, Hae III and Rsa I restriction enzymes in silico using Restriction Mapper software. The cytb gene fragment (358 bp) was amplified from tissue samples of vertebrate species and the dietary contents of sandflies and digested with restriction enzymes. Vertebrate species presented a restriction fragment profile that differed from that of other species, with the exception of Canis familiaris and Cerdocyon thous. The 358 bp fragment was identified in 76 sandflies. Of these, 10 were evaluated using the restriction enzymes and the food sources were predicted for four: Homo sapiens (1), Bos taurus (1) and Equus caballus (2). Thus, the PCR-RFLP technique could be a potential method for identifying the food sources of arthropods. However, some points must be clarified regarding the applicability of the method, such as the extent of DNA degradation through intestinal digestion, the potential for multiple sources of blood meals and the need for greater knowledge regarding intraspecific variations in mtDNA. PMID:24821056
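
    The in silico digestion step can be approximated with a few lines of code that locate recognition sites and report fragment lengths. This is a minimal sketch and not the Restriction Mapper software used in the study. It covers only the three palindromic enzymes named in the abstract (Alu I, AG^CT; Hae III, GG^CC; Rsa I, GT^AC); Aci I is omitted because its recognition site is not palindromic and would need both strands searched. The test sequence is made up and is not a cytb sequence.

```python
import re

# Recognition site and cut position (bases from the 5' end of the site, top strand).
ENZYMES = {
    "AluI": ("AGCT", 2),
    "HaeIII": ("GGCC", 2),
    "RsaI": ("GTAC", 2),
}


def digest(sequence, enzyme_names):
    """Return fragment lengths after cutting a linear sequence with the given enzymes."""
    sequence = sequence.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        # The lookahead lets overlapping sites be found as well.
        for match in re.finditer(f"(?={site})", sequence):
            cuts.add(match.start() + offset)
    bounds = [0, *sorted(cuts), len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Made-up 22 bp sequence containing one site for each enzyme.
toy = "AAAGCTTTTGGCCAAAGTACAA"
print("AluI only:", digest(toy, ["AluI"]))
print("all three:", digest(toy, ["AluI", "HaeIII", "RsaI"]))
```

    Comparing the fragment-length lists predicted for each candidate host species is what allows a species to be assigned from the banding pattern, and identical lists (as reported for Canis familiaris and Cerdocyon thous) cannot be told apart.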

  13. The landscape of transposable elements in the finished genome of the fungal wheat pathogen Mycosphaerella graminicola

    USDA-ARS's Scientific Manuscript database

    Repetitive sequence analysis has become an integral part of genome sequencing projects in addition to gene identification and annotation. Identification of repeats is important not only because it improves gene prediction, but also because of the role that repetitive sequences play in determining th...

  14. Identification, Characterisation and Clinical Development of the New Generation of Breast Cancer Susceptibility Alleles

    DTIC Science & Technology

    2010-03-01

    amino acid substitution in this gene has been associated with uric acid nephrolithiasis (32). Recent GWAS have identified another variant within this...Identification of a novel gene and a common variant associated with uric acid nephrolithiasis in a Sardinian genetic isolate. Am J Hum Genet 72

  15. Possibilities in identification of genomic species of Burkholderia cepacia complex by PCR and RFLP.

    PubMed

    Navrátilová, Lucie; Chromá, Magdalena; Hanulík, Vojtech; Raclavský, Vladislav

    2013-01-01

    The strains belonging to the Burkholderia cepacia complex are important opportunistic pathogens in immunocompromised patients and cause serious diseases. It is possible to obtain isolates from soil, water, plants and human samples. Taxonomy of this group is difficult. The Burkholderia cepacia complex consists of seventeen genomic species and the genetic scheme is based on the recA gene. Commonly, the first five genomovars occur in humans, mostly genomovars II and III (subdivision IIIA). In this study we tested identification of the first five genomovars by PCR followed by melting analysis and RFLP. The experiments targeted eubacterial 16S rDNA and the specific gene recA, which allowed identification of all five genomovars. The recA gene proved more suitable than 16S rDNA, which enabled direct identification of only genomovars II and V; genomovars I, III and IV were similar in 16S rDNA sequence.

  16. SeMPI: a genome-based secondary metabolite prediction and identification web server.

    PubMed

    Zierep, Paul F; Padilla, Natàlia; Yonchev, Dimitar G; Telukunta, Kiran K; Klementz, Dennis; Günther, Stefan

    2017-07-03

    The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data provides the opportunity for an efficient identification of gene clusters by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not been identified yet. Even though genome mining tools have become significantly more efficient in the identification of biosynthetic gene clusters, structural elucidation of the actual secondary metabolite is still challenging, especially due to as yet unpredictable post-modifications. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by type I modular polyketide synthases (PKS). In order to limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The web server of the pipeline (SeMPI) is freely available at: http://www.pharmaceutical-bioinformatics.de/sempi. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Visual identification system for homeland security and law enforcement support

    NASA Astrophysics Data System (ADS)

    Samuel, Todd J.; Edwards, Don; Knopf, Michael

    2005-05-01

    This paper describes the basic configuration for a visual identification system (VIS) for Homeland Security and law enforcement support. Security and law enforcement systems with an integrated VIS will accurately and rapidly provide identification of vehicles or containers that have entered, exited or passed through a specific monitoring location. The VIS system stores all images and makes them available for recall for approximately one week. Images of alarming vehicles will be archived indefinitely as part of the alarming vehicle's or cargo container's record. Depending on user needs, the digital imaging information will be provided electronically to the individual inspectors, supervisors, and/or control center at the customer's office. The key components of the VIS are the high-resolution cameras that capture images of vehicles, lights, presence sensors, image cataloging software, and image recognition software. In addition to the cameras, the physical integration and network communications of the VIS components with the balance of the security system and client must be ensured.

  18. MetCCS predictor: a web server for predicting collision cross-section values of metabolites in ion mobility-mass spectrometry based metabolomics.

    PubMed

    Zhou, Zhiwei; Xiong, Xin; Zhu, Zheng-Jiang

    2017-07-15

    In metabolomics, rigorous structural identification of metabolites presents a challenge for bioinformatics. The use of collision cross-section (CCS) values of metabolites derived from ion mobility-mass spectrometry effectively increases the confidence of metabolite identification, but this technique suffers from the limited number of available CCS values. Currently, there is no software available for rapidly generating the metabolites' CCS values. Here, we developed the first web server, namely, MetCCS Predictor, for predicting CCS values. It can predict the CCS values of metabolites using molecular descriptors within a few seconds. Common users with limited background in bioinformatics can benefit from this software and effectively improve the metabolite identification in metabolomics. The web server is freely available at: http://www.metabolomics-shanghai.org/MetCCS/ . Contact: jiangzhu@sioc.ac.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  19. geneGIS: Computational Tools for Spatial Analyses of DNA Profiles with Associated Photo-Identification and Telemetry Records of Marine Mammals

    DTIC Science & Technology

    2012-09-30

    computational tools provide the ability to display, browse, select, filter and summarize spatio-temporal relationships of these individual-based...her research assistant at Esri, Shaun Walbridge, and members of the Marine Mammal Institute (MMI), including Tomas Follet and Debbie Steel. This...Genomics Laboratory, MMI, OSU. 4 As part of the geneGIS initiative, these SPLASH photo-identification records and the geneSPLASH DNA profiles

  20. Oligonucleotide microarray for the identification of potential mycotoxigenic fungi

    PubMed Central

    2010-01-01

    Background Mycotoxins are secondary metabolites which are produced by numerous fungi and pose a continuous challenge to the safety and quality of food commodities in South Africa. These toxins have toxicologically relevant effects on humans and animals that eat contaminated foods. In this study, a diagnostic DNA microarray was developed for the identification of the most common food-borne fungi, as well as the genes leading to toxin production. Results A total of 40 potentially mycotoxigenic fungi isolated from different food commodities, as well as the genes that are involved in the mycotoxin synthetic pathways, were analyzed. For fungal identification, oligonucleotide probes were designed by exploiting the sequence variations of the elongation factor 1-alpha (EF-1 α) coding regions and the internal transcribed spacer (ITS) regions of the rRNA gene cassette. For the detection of fungi able to produce mycotoxins, oligonucleotide probes directed towards genes leading to toxin production from different fungal strains were identified in data available in the public domain. The probes selected for fungal identification and the probes specific for toxin producing genes were spotted onto microarray slides. Conclusions The diagnostic microarray developed can be used to identify single pure strains or cultures of potentially mycotoxigenic fungi as well as genes leading to toxin production in both laboratory samples and maize-derived foods offering an interesting potential for microbiological laboratories. PMID:20307326

  1. Genotyping microsatellite DNA markers at putative disease loci in inbred/multiplex families with respiratory chain complex I deficiency allows rapid identification of a novel nonsense mutation (IVS1nt -1) in the NDUFS4 gene in Leigh syndrome.

    PubMed

    Bénit, Paule; Steffann, Julie; Lebon, Sophie; Chretien, Dominique; Kadhom, Noman; de Lonlay, Pascale; Goldenberg, Alice; Dumez, Yves; Dommergues, Marc; Rustin, Pierre; Munnich, Arnold; Rötig, Agnès

    2003-05-01

    Complex I deficiency, the most common cause of mitochondrial disorders, accounts for a variety of clinical symptoms and its genetic heterogeneity makes identification of the disease genes particularly tedious. Indeed, most of the 43 complex I subunits are encoded by nuclear genes, only seven of them being mitochondrially encoded. In order to offer urgent prenatal diagnosis, we have studied an inbred/multiplex family with complex I deficiency by using microsatellite DNA markers flanking the putative disease loci. Microsatellite DNA markers have allowed us to exclude the NDUFS7, NDUFS8, NDUFV1 and NDUFS1 genes and to find homozygosity at the NDUFS4 locus. Direct sequencing has led to identification of a homozygous splice acceptor site mutation in intron 1 of the NDUFS4 gene (IVS1nt -1, G-->A); this was not found in chorion villi of the ongoing pregnancy. We suggest that genotyping microsatellite DNA markers at putative disease loci in inbred/multiplex families helps to identify the disease-causing mutation. More generally, we suggest giving consideration to a more systematic microsatellite analysis of putative disease loci for identification of disease genes in inbred/multiplex families affected with genetically heterogeneous conditions.

  2. QTL Mapping and CRISPR/Cas9 Editing to Identify a Drug Resistance Gene in Toxoplasma gondii.

    PubMed

    Shen, Bang; Powell, Robin H; Behnke, Michael S

    2017-06-22

    Scientific knowledge is intrinsically linked to available technologies and methods. This article will present two methods that allowed for the identification and verification of a drug resistance gene in the Apicomplexan parasite Toxoplasma gondii: Quantitative Trait Locus (QTL) mapping using a Whole Genome Sequence (WGS)-based genetic map, and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9-based gene editing. The approach of QTL mapping allows one to test if there is a correlation between a genomic region(s) and a phenotype. Two datasets are required to run a QTL scan: a genetic map based on the progeny of a recombinant cross and a quantifiable phenotype assessed in each of the progeny of that cross. These datasets are then formatted to be compatible with R/qtl software that generates a QTL scan to identify significant loci correlated with the phenotype. Although this can greatly narrow the search window of possible candidates, QTLs span regions containing a number of genes from which the causal gene needs to be identified. Having WGS of the progeny was critical to identify the causal drug resistance mutation at the gene level. Once identified, the candidate mutation can be verified by genetic manipulation of drug sensitive parasites. The most facile and efficient method to genetically modify T. gondii is the CRISPR/Cas9 system. This system is composed of just two components, both encoded on a single plasmid: a single guide RNA (gRNA) containing a 20 bp sequence complementary to the genomic target, and the Cas9 endonuclease, which generates a double-strand DNA break (DSB) at the target; repair of this break allows for insertion or deletion of sequences around the break site. This article provides detailed protocols to use CRISPR/Cas9 based genome editing tools to verify the gene responsible for sinefungin resistance and to construct transgenic parasites.
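
    The QTL scan described in this abstract can be illustrated with a minimal sketch. It assumes haploid progeny with genotypes coded 0/1 and uses simple single-marker regression to compute a LOD score per marker. This is not the R/qtl API; the function names and the data are hypothetical, and the significance thresholds a real scan needs are left out.

```python
import math


def lod_score(genotypes, phenotypes):
    """Single-marker LOD: (n / 2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    if rss_null == 0.0:
        return 0.0  # no phenotypic variation, so nothing to explain
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_marker == 0.0:
        return math.inf  # marker explains the phenotype perfectly
    return (n / 2) * math.log10(rss_null / rss_marker)


def qtl_scan(genotype_table, phenotypes):
    """Return {marker: LOD} for every marker in the table."""
    return {m: lod_score(g, phenotypes) for m, g in genotype_table.items()}


# Hypothetical data: six progeny, two markers, one quantitative phenotype.
phenotypes = [1.0, 1.2, 0.9, 5.1, 4.8, 5.0]
genotype_table = {
    "marker_A": [0, 0, 0, 1, 1, 1],  # segregates with the phenotype
    "marker_B": [0, 1, 0, 1, 0, 1],  # does not segregate with it
}
lods = qtl_scan(genotype_table, phenotypes)
best_marker = max(lods, key=lods.get)
print(best_marker, lods)
```

    The marker with the highest LOD marks the region to inspect for candidate genes; as the abstract notes, such a region usually still contains several genes.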

  3. QTL Mapping and CRISPR/Cas9 Editing to Identify a Drug Resistance Gene in Toxoplasma gondii

    PubMed Central

    Shen, Bang; Powell, Robin H.; Behnke, Michael S.

    2017-01-01

    Scientific knowledge is intrinsically linked to available technologies and methods. This article will present two methods that allowed for the identification and verification of a drug resistance gene in the Apicomplexan parasite Toxoplasma gondii: Quantitative Trait Locus (QTL) mapping using a Whole Genome Sequence (WGS)-based genetic map, and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9-based gene editing. The approach of QTL mapping allows one to test if there is a correlation between a genomic region(s) and a phenotype. Two datasets are required to run a QTL scan: a genetic map based on the progeny of a recombinant cross and a quantifiable phenotype assessed in each of the progeny of that cross. These datasets are then formatted to be compatible with R/qtl software that generates a QTL scan to identify significant loci correlated with the phenotype. Although this can greatly narrow the search window of possible candidates, QTLs span regions containing a number of genes from which the causal gene needs to be identified. Having WGS of the progeny was critical to identify the causal drug resistance mutation at the gene level. Once identified, the candidate mutation can be verified by genetic manipulation of drug sensitive parasites. The most facile and efficient method to genetically modify T. gondii is the CRISPR/Cas9 system. This system is composed of just two components, both encoded on a single plasmid: a single guide RNA (gRNA) containing a 20 bp sequence complementary to the genomic target, and the Cas9 endonuclease, which generates a double-strand DNA break (DSB) at the target; repair of this break allows for insertion or deletion of sequences around the break site. This article provides detailed protocols to use CRISPR/Cas9 based genome editing tools to verify the gene responsible for sinefungin resistance and to construct transgenic parasites. PMID:28671645
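
    The 20 bp guide sequence mentioned in this abstract can be illustrated with a short sketch that lists candidate target sites in a DNA string. It assumes the commonly used Streptococcus pyogenes Cas9, which needs an NGG motif (the PAM) directly after the 20 bp target; the abstract does not state which Cas9 variant was used, so the PAM rule, the function name and the example sequence are all assumptions.

```python
import re


def find_guides(sequence, guide_len=20):
    """List (position, guide) pairs on the forward strand only.

    A site qualifies when guide_len bases are followed by N-G-G.
    The lookahead lets overlapping sites all be reported.
    """
    seq = sequence.upper()
    pattern = r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len
    return [(m.start(), m.group(1)) for m in re.finditer(pattern, seq)]


# Hypothetical 23-base target: 20 bases of guide followed by the PAM "TGG".
example = "A" * 20 + "TGG"
guides = find_guides(example)
print(guides)
```

    A practical design step would also scan the reverse strand and check each guide for off-target matches elsewhere in the genome; both are omitted here.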

  4. Identification of Key Pathways and Genes in the Dynamic Progression of HCC Based on WGCNA.

    PubMed

    Yin, Li; Cai, Zhihui; Zhu, Baoan; Xu, Cunshuan

    2018-02-14

    Hepatocellular carcinoma (HCC) is a devastating disease worldwide. Though many efforts have been made to elucidate the process of HCC, its molecular mechanisms of development remain elusive due to its complexity. To explore the stepwise carcinogenic process from pre-neoplastic lesions to the end stage of HCC, we employed weighted gene co-expression network analysis (WGCNA), which has proved to be an effective method in many diseases, to detect co-expressed modules and hub genes using eight pathological stages including normal, cirrhosis without HCC, cirrhosis, low-grade dysplastic, high-grade dysplastic, very early and early, advanced HCC and very advanced HCC. Among the eight consecutive pathological stages, five representative modules were selected to perform canonical pathway enrichment and upstream regulator analysis by using ingenuity pathway analysis (IPA) software. We found that cell cycle related biological processes were activated at four neoplastic stages, and the degree of activation of the cell cycle corresponded to the deterioration degree of HCC. The orange and yellow modules were enriched in energy metabolism, especially oxidative metabolism, and the expression value of the genes decreased only at four neoplastic stages. The brown module, enriched in protein ubiquitination and ephrin receptor signaling pathways, correlated mainly with the very early stage of HCC. The darkred module, enriched in hepatic fibrosis/hepatic stellate cell activation, correlated with the cirrhotic stage only. The high degree hub genes were identified based on the protein-protein interaction (PPI) network and were verified by Kaplan-Meier survival analysis. The novel signature of five high-degree hub genes that was identified in our study may shed light on future prognostic and therapeutic approaches. Our study brings a new perspective to the understanding of the key pathways and genes in the dynamic changes of HCC progression. These findings shed light on further investigations.
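
    The first step of a weighted co-expression analysis can be sketched as follows: pairwise Pearson correlations between genes are raised to a soft-threshold power to give a weighted adjacency. The power of 6, the unsigned form, the gene names and the values are assumptions for illustration; the abstract does not report the settings used, and this is not the WGCNA package itself.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical expression profiles across four samples.
expression = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],  # perfectly correlated with geneA
    "geneC": [1, 3, 2, 4],  # correlation 0.8 with geneA
}
adjacency = soft_threshold_adjacency(expression)
print(adjacency[("geneA", "geneB")], adjacency[("geneA", "geneC")])
```

    Raising correlations to a power keeps strong links close to 1 and pushes moderate ones toward 0, which is what lets modules of tightly co-expressed genes stand out in later clustering.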

  5. Indel-seq: a fast-forward genetics approach for identification of trait-associated putative candidate genomic regions and its application in pigeonpea (Cajanus cajan).

    PubMed

    Singh, Vikas K; Khan, Aamir W; Saxena, Rachit K; Sinha, Pallavi; Kale, Sandip M; Parupalli, Swathi; Kumar, Vinay; Chitikineni, Annapurna; Vechalapu, Suryanarayana; Sameer Kumar, Chanda Venkata; Sharma, Mamta; Ghanta, Anuradha; Yamini, Kalinati Narasimhan; Muniswamy, Sonnappa; Varshney, Rajeev K

    2017-07-01

    Identification of candidate genomic regions associated with target traits using conventional mapping methods is challenging and time-consuming. In recent years, a number of single nucleotide polymorphism (SNP)-based mapping approaches have been developed and used for identification of candidate/putative genomic regions. However, in the majority of these studies, insertion-deletions (Indels) were largely ignored. For efficient use of Indels in mapping target traits, we propose the Indel-seq approach, which is a combination of whole-genome resequencing (WGRS) and bulked segregant analysis (BSA) and relies on the Indel frequencies in extreme bulks. Deployment of the Indel-seq approach for identification of candidate genomic regions associated with fusarium wilt (FW) and sterility mosaic disease (SMD) resistance in pigeonpea has identified 16 Indels affecting 26 putative candidate genes. Of these 26 affected putative candidate genes, 24 genes showed effect in the upstream/downstream of the genic region and two genes showed effect in the genes. Validation of these 16 candidate Indels in other FW- and SMD-resistant and FW- and SMD-susceptible genotypes revealed a significant association of five Indels (three for FW and two for SMD resistance). Comparative analysis of Indel-seq with other genetic mapping approaches highlighted the importance of the approach in identification of significant genomic regions associated with target traits. Therefore, the Indel-seq approach can be used for quick and precise identification of candidate genomic regions for any target traits in any crop species. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
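
    The idea of comparing Indel frequencies between extreme bulks can be sketched in a few lines. The example assumes read counts per Indel site are already available for a resistant and a susceptible bulk; the field names, the depth filter and the 0.8 frequency-difference cutoff are hypothetical and are not taken from the Indel-seq publication.

```python
def indel_frequency(alt_reads, total_reads):
    """Fraction of reads at a site that carry the Indel allele."""
    return alt_reads / total_reads if total_reads else 0.0


def candidate_indels(sites, min_delta=0.8, min_depth=10):
    """Keep sites whose Indel frequency differs strongly between bulks."""
    hits = []
    for site in sites:
        if min(site["res_depth"], site["sus_depth"]) < min_depth:
            continue  # too few reads to trust either frequency
        delta = (indel_frequency(site["res_alt"], site["res_depth"])
                 - indel_frequency(site["sus_alt"], site["sus_depth"]))
        if abs(delta) >= min_delta:
            hits.append((site["id"], round(delta, 3)))
    return hits


# Hypothetical read counts for three Indel sites.
sites = [
    {"id": "indel_1", "res_alt": 19, "res_depth": 20, "sus_alt": 1, "sus_depth": 20},
    {"id": "indel_2", "res_alt": 10, "res_depth": 20, "sus_alt": 9, "sus_depth": 20},
    {"id": "indel_3", "res_alt": 5, "res_depth": 5, "sus_alt": 0, "sus_depth": 5},
]
hits = candidate_indels(sites)
print(hits)
```

    Only the first site passes: the second shows almost no frequency difference between the bulks, and the third is dropped by the depth filter.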

  6. Subpathway-GM: identification of metabolic subpathways via joint power of interesting genes and metabolites and their topologies within pathways.

    PubMed

    Li, Chunquan; Han, Junwei; Yao, Qianlan; Zou, Chendan; Xu, Yanjun; Zhang, Chunlong; Shang, Desi; Zhou, Lingyun; Zou, Chaoxia; Sun, Zeguo; Li, Jing; Zhang, Yunpeng; Yang, Haixiu; Gao, Xu; Li, Xia

    2013-05-01

    Various 'omics' technologies, including microarrays and gas chromatography mass spectrometry, can be used to identify hundreds of interesting genes, proteins and metabolites, such as differential genes, proteins and metabolites associated with diseases. Identifying metabolic pathways has become an invaluable aid to understanding the genes and metabolites associated with studying conditions. However, the classical methods used to identify pathways fail to accurately consider joint power of interesting gene/metabolite and the key regions impacted by them within metabolic pathways. In this study, we propose a powerful analytical method referred to as Subpathway-GM for the identification of metabolic subpathways. This provides a more accurate level of pathway analysis by integrating information from genes and metabolites, and their positions and cascade regions within the given pathway. We analyzed two colorectal cancer and one metastatic prostate cancer data sets and demonstrated that Subpathway-GM was able to identify disease-relevant subpathways whose corresponding entire pathways might be ignored using classical entire pathway identification methods. Further analysis indicated that the power of a joint genes/metabolites and subpathway strategy based on their topologies may play a key role in reliably recalling disease-relevant subpathways and finding novel subpathways.

  7. Novel Method for Reliable Identification of Siccibacter and Franconibacter Strains: from “Pseudo-Cronobacter” to New Enterobacteriaceae Genera

    PubMed Central

    Vlach, Jiří; Junková, Petra; Karamonová, Ludmila; Blažková, Martina; Fukal, Ladislav

    2017-01-01

    ABSTRACT In the last decade, strains of the genera Franconibacter and Siccibacter have been misclassified as first Enterobacter and later Cronobacter. Because Cronobacter is a serious foodborne pathogen that affects premature neonates and elderly individuals, such misidentification may not only falsify epidemiological statistics but also lead to tests of powdered infant formula or other foods giving false results. Currently, the main ways of identifying Franconibacter and Siccibacter strains are by biochemical testing or by sequencing of the fusA gene as part of Cronobacter multilocus sequence typing (MLST), but in relation to these strains the former is generally highly difficult and unreliable while the latter remains expensive. To address this, we developed a fast, simple, and most importantly, reliable method for Franconibacter and Siccibacter identification based on intact-cell matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS). Our method integrates the following steps: data preprocessing using mMass software; principal-component analysis (PCA) for the selection of mass spectrum fingerprints of Franconibacter and Siccibacter strains; optimization of the Biotyper database settings for the creation of main spectrum projections (MSPs). This methodology enabled us to create an in-house MALDI MS database that extends the current MALDI Biotyper database by including Franconibacter and Siccibacter strains. Finally, we verified our approach using seven previously unclassified strains, all of which were correctly identified, thereby validating our method. IMPORTANCE We show that the majority of methods currently used for the identification of Franconibacter and Siccibacter bacteria are not able to properly distinguish these strains from those of Cronobacter. While sequencing of the fusA gene as part of Cronobacter MLST remains the most reliable such method, it is highly expensive and time-consuming. 
    Here, we demonstrate a cost-effective and reliable alternative that correctly distinguishes between Franconibacter, Siccibacter, and Cronobacter bacteria and identifies Franconibacter and Siccibacter at the species level. Using intact-cell MALDI-TOF MS, we extend the current MALDI Biotyper database with 11 Franconibacter and Siccibacter MSPs. In addition, the use of our approach is likely to lead to a more reliable identification scheme for Franconibacter and Siccibacter strains and, consequently, a more trustworthy epidemiological picture of their involvement in disease. PMID:28455327

  8. ENCoRE: an efficient software for CRISPR screens identifies new players in extrinsic apoptosis.

    PubMed

    Trümbach, Dietrich; Pfeiffer, Susanne; Poppe, Manuel; Scherb, Hagen; Doll, Sebastian; Wurst, Wolfgang; Schick, Joel A

    2017-11-25

    As CRISPR/Cas9 mediated screens with pooled guide libraries in somatic cells become increasingly established, an unmet need for rapid and accurate companion informatics tools has emerged. We have developed a lightweight and efficient software to easily manipulate large raw next generation sequencing datasets derived from such screens into informative relational context with graphical support. The advantages of the software entitled ENCoRE (Easy NGS-to-Gene CRISPR REsults) include a simple graphical workflow, platform independence, local and fast multithreaded processing, data pre-processing and gene mapping with custom library import. We demonstrate the capabilities of ENCoRE to interrogate results from a pooled CRISPR cellular viability screen following Tumor Necrosis Factor-alpha challenge. The results not only identified stereotypical players in extrinsic apoptotic signaling but two as yet uncharacterized members of the extrinsic apoptotic cascade, Smg7 and Ces2a. We further validated and characterized cell lines containing mutations in these genes against a panel of cell death stimuli and involvement in p53 signaling. In summary, this software enables bench scientists with sensitive data or without access to informatic cores to rapidly interpret results from large scale experiments resulting from pooled CRISPR/Cas9 library screens.

  9. Co-acting gene networks predict TRAIL responsiveness of tumour cells with high accuracy.

    PubMed

    O'Reilly, Paul; Ortutay, Csaba; Gernon, Grainne; O'Connell, Enda; Seoighe, Cathal; Boyce, Susan; Serrano, Luis; Szegezdi, Eva

    2014-12-19

    Identification of differentially expressed genes from transcriptomic studies is one of the most common mechanisms to identify tumor biomarkers. This approach however is not well suited to identify interaction between genes whose protein products potentially influence each other, which limits its power to identify molecular wiring of tumour cells dictating response to a drug. Due to the fact that signal transduction pathways are not linear and highly interlinked, the biological response they drive may be better described by the relative amount of their components and their functional relationships than by their individual, absolute expression. Gene expression microarray data for 109 tumor cell lines with known sensitivity to the death ligand cytokine tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) was used to identify genes with potential functional relationships determining responsiveness to TRAIL-induced apoptosis. The machine learning technique Random Forest in the statistical environment "R" with backward elimination was used to identify the key predictors of TRAIL sensitivity and differentially expressed genes were identified using the software GeneSpring. Gene co-regulation and statistical interaction was assessed with q-order partial correlation analysis and non-rejection rate. Biological (functional) interactions amongst the co-acting genes were studied with Ingenuity network analysis. Prediction accuracy was assessed by calculating the area under the receiver operator curve using an independent dataset. We show that the gene panel identified could predict TRAIL-sensitivity with a very high degree of sensitivity and specificity (AUC=0·84). The genes in the panel are co-regulated and at least 40% of them functionally interact in signal transduction pathways that regulate cell death and cell survival, cellular differentiation and morphogenesis. 
    Importantly, only 12% of the TRAIL-predictor genes were differentially expressed highlighting the importance of functional interactions in predicting the biological response. The advantage of co-acting gene clusters is that this analysis does not depend on differential expression and is able to incorporate direct- and indirect gene interactions as well as tissue- and cell-specific characteristics. This approach (1) identified a descriptor of TRAIL sensitivity which performs significantly better as a predictor of TRAIL sensitivity than any previously reported gene signatures, (2) identified potential novel regulators of TRAIL-responsiveness and (3) provided a systematic view highlighting fundamental differences between the molecular wiring of sensitive and resistant cell types.
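
    The accuracy measure quoted in this abstract, the area under the receiver operator curve (AUC), can be computed directly from predicted scores and known labels. The sketch below uses the pairwise-ranking definition of AUC: the share of (sensitive, resistant) pairs in which the sensitive sample gets the higher score. The labels and scores are made up for illustration and do not come from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical cell lines: 1 = TRAIL-sensitive, 0 = resistant.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the value reported in the abstract.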

  10. Mouse forward genetics in the study of the peripheral nervous system and human peripheral neuropathy

    PubMed Central

    Douglas, Darlene S.; Popko, Brian

    2009-01-01

    Forward genetics, the phenotype-driven approach to investigating gene identity and function, has a long history in mouse genetics. Random mutations in the mouse transcend bias about gene function and provide avenues towards unique discoveries. The study of the peripheral nervous system is no exception; from historical strains such as the trembler mouse, which led to the identification of PMP22 as a human disease gene causing multiple forms of peripheral neuropathy, to the more recent identification of the claw paw and sprawling mutations, forward genetics has long been a tool for probing the physiology, pathogenesis, and genetics of the PNS. Even as spontaneous and mutagenized mice continue to enable the identification of novel genes, provide allelic series for detailed functional studies, and generate models useful for clinical research, new methods, such as the piggyBac transposon, are being developed to further harness the power of forward genetics. PMID:18481175

  11. In silico mining and PCR-based approaches to transcription factor discovery in non-model plants: gene discovery of the WRKY transcription factors in conifers.

    PubMed

    Liu, Jun-Jun; Xiang, Yu

    2011-01-01

    WRKY transcription factors are key regulators of numerous biological processes in plant growth and development, as well as plant responses to abiotic and biotic stresses. Research on biological functions of plant WRKY genes has focused in the past on model plant species or species with largely characterized transcriptomes. However, a variety of non-model plants, such as forest conifers, are essential as feed, biofuel, and wood or for sustainable ecosystems. Identification of WRKY genes in these non-model plants is equally important for understanding the evolutionary and function-adaptive processes of this transcription factor family. Because of limited genomic information, the rarity of regulatory gene mRNAs in transcriptomes, and the sequence divergence to model organism genes, identification of transcription factors in non-model plants using methods similar to those generally used for model plants is difficult. This chapter describes a gene family discovery strategy for identification of WRKY transcription factors in conifers by a combination of in silico-based prediction and PCR-based experimental approaches. Compared to traditional cDNA library screening or EST sequencing at transcriptome scales, this integrated gene discovery strategy provides fast, simple, reliable, and specific methods to unveil the WRKY gene family at both genome and transcriptome levels in non-model plants.

  12. Bioinformatics-Based Identification of Candidate Genes from QTLs Associated with Cell Wall Traits in Populus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ranjan, Priya; Yin, Tongming; Zhang, Xinye

    2009-11-01

    Quantitative trait locus (QTL) studies are an integral part of plant research and are used to characterize the genetic basis of phenotypic variation observed in structured populations and inform marker-assisted breeding efforts. These QTL intervals can span large physical regions on a chromosome comprising hundreds of genes, thereby hampering candidate gene identification. Genome history, evolution, and expression evidence can be used to narrow the genes in the interval to a smaller list that is manageable for detailed downstream functional genomics characterization. Our primary motivation for the present study was to address the need for a research methodology that identifies candidate genes within a broad QTL interval. Here we present a bioinformatics-based approach for subdividing candidate genes within QTL intervals into alternate groups of high probability candidates. Application of this approach in the context of studying cell wall traits, specifically lignin content and S/G ratios of stem and root in Populus plants, resulted in manageable sets of genes of both known and putative cell wall biosynthetic function. These results provide a roadmap for future experimental work leading to identification of new genes controlling cell wall recalcitrance and, ultimately, in the utility of plant biomass as an energy feedstock.

  13. GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis.

    PubMed

    Zheng, Qi; Wang, Xiu-Jie

    2008-07-01

    Gene Ontology (GO) analysis has become a commonly used approach for functional studies of large-scale genomic or transcriptomic data. Although many software tools already offer GO-related analysis functions, new tools are still needed to meet the requirements for data generated by newly developed technologies or for advanced analysis purposes. Here, we present a Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), and therefore provides better understanding of the correlations among enriched GO terms; (ii) GOEAST supports analysis for data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) GOEAST allows cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them, a feature unique to this toolkit. GOEAST also provides rigorous statistical tests to enhance the reliability of analysis results. GOEAST is freely accessible at http://omicslab.genetics.ac.cn/GOEAST/
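
    What "statistically overrepresented" means for a GO term can be shown with a small worked example. The sketch below uses the one-sided hypergeometric test, a common choice for over-representation analysis; the abstract does not name the tests GOEAST applies, so the test choice, the function name and the numbers are assumptions.

```python
from math import comb


def enrichment_p_value(hits, sample_size, annotated, population):
    """P(X >= hits) when drawing sample_size genes without replacement.

    population  : genes in the background set
    annotated   : background genes carrying the GO term
    sample_size : genes in the submitted gene list
    hits        : submitted genes carrying the GO term
    """
    total = comb(population, sample_size)
    upper = min(sample_size, annotated)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical case: 10 background genes, 5 annotated, all 5 submitted genes hit.
p_value = enrichment_p_value(hits=5, sample_size=5, annotated=5, population=10)
print(p_value)
```

    When many GO terms are tested at once, the resulting p-values also need a multiple-testing correction, which this sketch leaves out.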

  14. Turning publicly available gene expression data into discoveries using gene set context analysis.

    PubMed

    Ji, Zhicheng; Vokes, Steven A; Dang, Chi V; Ji, Hongkai

    2016-01-08

    Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types, etc.). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Gene Discovery in Prostate Cancer: Functional Identification and Isolation of PAC-1, a Novel Tumor Suppressor Gene Within Chromosome 10p

    DTIC Science & Technology

    1999-09-01

    I.. Zbar. B.. androle for the VHL gene in the development of hyperplasia in a number Lerman. I. I. Identification of the son Hippel-Lindau disease...of heterozy- gosity of chromosome 3p markers in small-cell lung cancer. Nature (Lond.). 329: eleguns produced hyperplasia in all tissues (26...central fibrovascular core lined by cuboidal tumor cells. Tumor weights were determined (Fig. 2d). At the end of 47 days after cells were

  16. Identification and phylogenetic analysis of a sheep pox virus isolated from the Ningxia Hui Autonomous Region of China.

    PubMed

    Zhu, X L; Yang, F; Li, H X; Dou, Y X; Meng, X L; Li, H; Luo, X N; Cai, X P

    2013-05-14

    An outbreak of sheep pox was investigated in the Ningxia Hui Autonomous Region in China. Through immunofluorescence testing, isolated viruses, polymerase chain reaction identification, and electron microscopic examination, the isolated strain was identified as a sheep pox virus. The virus was identified through sequence and phylogenetic analysis of the P32 gene, open reading frame (ORF) 095, and ORF 103 genes. This study is the first to use the ORF 095 and ORF 103 genes as candidate genes for the analysis of sheep pox. The results showed that the ORF 095 and ORF 103 genes could be used for the genotyping of the sheep pox virus.

  17. Comprehensive Identification Of Specific Genes Controlling Complex Traits Through A Genome-Wide Screen for Cis-Acting Regulatory Elements - An Example Using Marek's Disease

    USDA-ARS's Scientific Manuscript database

    The comprehensive identification of genes underlying phenotypic variation of complex traits remains a major challenge. Most genome-wide screens lack sufficient resolving power as they typically depend on linkage. An alternate method is to screen for allele-specific expression (ASE), a simple yet pow...

  18. BioCreative III interactive task: an overview

    PubMed Central

    2011-01-01

    Background The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested. Results A User Advisory Group (UAG) actively participated in the IAT design and assessment. The task focused on gene normalization (identifying gene mentions in the article and linking these genes to standard database identifiers), gene ranking based on the overall importance of each gene mentioned in the article, and gene-oriented document retrieval (identifying full text papers relevant to a selected gene). Six systems participated and all processed and displayed the same set of articles. The articles were selected based on content known to be problematic for curation, such as ambiguity of gene names, coverage of multiple genes and species, or introduction of a new gene name. Members of the UAG curated three articles for training and assessment purposes, and each member was assigned a system to review. A questionnaire related to the interface usability and task performance (as measured by precision and recall) was answered after systems were used to curate articles. 
    Although the limited number of articles analyzed and users involved in the IAT experiment precluded rigorous quantitative analysis of the results, a qualitative analysis provided valuable insight into some of the problems encountered by users when using the systems. The overall assessment indicates that the system usability features appealed to most users, but the system performance was suboptimal (mainly due to low accuracy in gene normalization). Some of the issues included failure of species identification and gene name ambiguity in the gene normalization task leading to an extensive list of gene identifiers to review, which, in some cases, did not contain the relevant genes. The document retrieval suffered from the same shortfalls. The UAG favored achieving high performance (measured by precision and recall), but strongly recommended the addition of features that facilitate the identification of correct gene and its identifier, such as contextual information to assist in disambiguation. Discussion The IAT was an informative exercise that advanced the dialog between curators and developers and increased the appreciation of challenges faced by each group. A major conclusion was that the intended users should be actively involved in every phase of software development, and this will be strongly encouraged in future tasks. The IAT Task provides the first steps toward the definition of metrics and functional requirements that are necessary for designing a formal evaluation of interactive curation systems in the BioCreative IV challenge. PMID:22151968
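
    Precision and recall, the two performance measures named in this abstract, can be computed for a gene normalization result by comparing the identifiers a system returns with those a curator assigned. The identifiers below are placeholders, and the function is an illustrative sketch rather than the scoring code used in the challenge.

```python
def precision_recall(predicted_ids, gold_ids):
    """Precision and recall of predicted gene identifiers against a gold set."""
    predicted, gold = set(predicted_ids), set(gold_ids)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical identifiers for one article.
system_output = ["GENE_1", "GENE_2", "GENE_3", "GENE_4"]
curator_gold = ["GENE_2", "GENE_3", "GENE_5"]
precision, recall = precision_recall(system_output, curator_gold)
print(precision, recall)
```

    A long list of candidate identifiers, as described in the abstract, tends to raise recall while lowering precision, because every extra wrong identifier enlarges the denominator of precision.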

  19. In Silico Identification of Candidate Genes for Fertility Restoration in Cytoplasmic Male Sterile Perennial Ryegrass (Lolium perenne L.)

    PubMed Central

    Sykes, Timothy; Yates, Steven; Nagy, Istvan; Asp, Torben; Small, Ian

    2017-01-01

    Perennial ryegrass (Lolium perenne L.) is widely used for forage production in both permanent and temporary grassland systems. To increase yields in perennial ryegrass, recent breeding efforts have been focused on strategies to more efficiently exploit heterosis by hybrid breeding. Cytoplasmic male sterility (CMS) is a widely applied mechanism to control pollination for commercial hybrid seed production and although CMS systems have been identified in perennial ryegrass, they are yet to be fully characterized. Here, we present a bioinformatics pipeline for efficient identification of candidate restorer of fertility (Rf) genes for CMS. From a high-quality draft of the perennial ryegrass genome, 373 pentatricopeptide repeat (PPR) genes were identified and classified, further identifying 25 restorer of fertility-like PPR (RFL) genes through a combination of DNA sequence clustering and comparison to known Rf genes. This extensive gene family was targeted as the majority of Rf genes in higher plants are RFL genes. These RFL genes were further investigated by phylogenetic analyses, identifying three groups of perennial ryegrass RFLs. These three groups likely represent genomic regions of active RFL generation and identify the probable location of perennial ryegrass PPR-Rf genes. This pipeline allows for the identification of candidate PPR-Rf genes from genomic sequence data and can be used in any plant species. Functional markers for PPR-Rf genes will facilitate map-based cloning of Rf genes and enable the use of CMS as an efficient tool to control pollination for hybrid crop production. PMID:26951780

  20. Pyviko: an automated Python tool to design gene knockouts in complex viruses with overlapping genes.

    PubMed

    Taylor, Louis J; Strebel, Klaus

    2017-01-07

    Gene knockouts are a common tool used to study gene function in various organisms. However, designing gene knockouts is complicated in viruses, which frequently contain sequences that code for multiple overlapping genes. Designing mutants that can be traced by the creation of new or elimination of existing restriction sites further compounds the difficulty in experimental design of knockouts of overlapping genes. While software is available to rapidly identify restriction sites in a given nucleotide sequence, no existing software addresses experimental design of mutations involving multiple overlapping amino acid sequences in generating gene knockouts. Pyviko performed well on a test set of over 240,000 gene pairs collected from viral genomes deposited in the National Center for Biotechnology Information Nucleotide database, identifying a point mutation which added a premature stop codon within the first 20 codons of the target gene in 93.2% of all tested gene-overprinted gene pairs. This shows that Pyviko can be used successfully in a wide variety of contexts to facilitate the molecular cloning and study of viral overprinted genes. Pyviko is an extensible and intuitive Python tool for designing knockouts of overlapping genes. Freely available as both a Python package and a web-based interface (http://louiejtaylor.github.io/pyViKO/), Pyviko simplifies the experimental design of gene knockouts in complex viruses with overlapping genes.
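
    The core idea in the abstract above, a point mutation that introduces a premature stop codon in the target gene while leaving the overlapping gene's protein unchanged, can be illustrated with a short sketch. This is not Pyviko's API or its actual algorithm. The function names are hypothetical, and the sketch assumes the standard genetic code, both genes on the same strand, 0-based coordinates, and an overlapping reading frame that runs to the end of the supplied sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate every complete codon of seq; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def find_knockout(seq, target_start, overlap_start, max_codons=20):
    """Search the first max_codons codons of the target gene (skipping the
    start codon) for a single-base change that creates a stop codon in the
    target frame without changing the overlapping frame's translation.

    Returns (position, old_base, new_base), or None if no such change exists.
    """
    overlap_protein = translate(seq[overlap_start:])
    for codon_index in range(1, max_codons):
        codon_start = target_start + 3 * codon_index
        if codon_start + 3 > len(seq):
            break
        if CODON_TABLE[seq[codon_start:codon_start + 3]] == "*":
            break  # reached the natural end of the target gene
        for pos in range(codon_start, codon_start + 3):
            for new_base in "ACGT":
                if new_base == seq[pos]:
                    continue
                mutant = seq[:pos] + new_base + seq[pos + 1:]
                makes_stop = CODON_TABLE[mutant[codon_start:codon_start + 3]] == "*"
                if makes_stop and translate(mutant[overlap_start:]) == overlap_protein:
                    return pos, seq[pos], new_base
    return None


# Toy sequence: target gene in frame 0 (ATG TTA CGA TAA),
# overlapping frame starting at index 1 (TGT TAC GAT).
print(find_knockout("ATGTTACGATAA", target_start=0, overlap_start=1))
# -> (6, 'C', 'T'): CGA becomes TGA (stop), while TAC becomes TAT (still Tyr)
```

    In the toy sequence, mutations that turn TTA into a stop codon are rejected because they alter the overlapping frame, so the search moves on to the next codon. A real design tool would also need to handle genes on opposite strands and restriction-site tracing, which this sketch does not attempt.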
