Bioconductor: open software development for computational biology and bioinformatics
Gentleman, Robert C; Carey, Vincent J; Bates, Douglas M; Bolstad, Ben; Dettling, Marcel; Dudoit, Sandrine; Ellis, Byron; Gautier, Laurent; Ge, Yongchao; Gentry, Jeff; Hornik, Kurt; Hothorn, Torsten; Huber, Wolfgang; Iacus, Stefano; Irizarry, Rafael; Leisch, Friedrich; Li, Cheng; Maechler, Martin; Rossini, Anthony J; Sawitzki, Gunther; Smith, Colin; Smyth, Gordon; Tierney, Luke; Yang, Jean YH; Zhang, Jianhua
2004-01-01
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples. PMID:15461798
Orchestrating high-throughput genomic analysis with Bioconductor
Huber, Wolfgang; Carey, Vincent J.; Gentleman, Robert; Anders, Simon; Carlson, Marc; Carvalho, Benilton S.; Bravo, Hector Corrada; Davis, Sean; Gatto, Laurent; Girke, Thomas; Gottardo, Raphael; Hahne, Florian; Hansen, Kasper D.; Irizarry, Rafael A.; Lawrence, Michael; Love, Michael I.; MacDonald, James; Obenchain, Valerie; Oleś, Andrzej K.; Pagès, Hervé; Reyes, Alejandro; Shannon, Paul; Smyth, Gordon K.; Tenenbaum, Dan; Waldron, Levi; Morgan, Martin
2015-01-01
Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors. PMID:25633503
ggCyto: Next Generation Open-Source Visualization Software for Cytometry.
Van, Phu; Jiang, Wenxin; Gottardo, Raphael; Finak, Greg
2018-06-01
Open source software for computational cytometry has gained in popularity over the past few years. Efforts such as FlowCAP, the Lyoplate and Euroflow projects have highlighted the importance of efforts to standardize both experimental and computational aspects of cytometry data analysis. The R/BioConductor platform hosts the largest collection of open source cytometry software covering all aspects of data analysis and providing infrastructure to represent and analyze cytometry data with all relevant experimental, gating, and cell population annotations enabling fully reproducible data analysis. Data visualization frameworks to support this infrastructure have lagged behind. ggCyto is a new open-source BioConductor software package for cytometry data visualization built on ggplot2 that enables ggplot-like functionality with the core BioConductor flow cytometry data structures. Amongst its features are the ability to transform data and axes on-the-fly using cytometry-specific transformations, plot faceting by experimental meta-data variables, and partial matching of channel, marker and cell population names to the contents of the BioConductor cytometry data structures. We demonstrate the salient features of the package using publicly available cytometry data with complete reproducible examples in a supplementary material vignette. Availability: https://bioconductor.org/packages/devel/bioc/html/ggcyto.html. Contact: gfinak@fredhutch.org. Supplementary information: Supplementary data are available at Bioinformatics online and at http://rglab.org/ggcyto/.
GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor.
Davis, Sean; Meltzer, Paul S
2007-07-15
Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140,000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. GEOquery is available as part of the BioConductor project.
Analyzing gene perturbation screens with nested effects models in R and bioconductor.
Fröhlich, Holger; Beissbarth, Tim; Tresch, Achim; Kostka, Dennis; Jacob, Juby; Spang, Rainer; Markowetz, F
2008-11-01
Nested effects models (NEMs) are a class of probabilistic models introduced to analyze the effects of gene perturbation screens visible in high-dimensional phenotypes like microarrays or cell morphology. NEMs reverse engineer upstream/downstream relations of cellular signaling cascades. NEMs take as input a set of candidate pathway genes and phenotypic profiles of perturbing these genes. NEMs return a pathway structure explaining the observed perturbation effects. Here, we describe the package nem, an open-source software package to efficiently infer NEMs from data. Our software implements several search algorithms for model fitting and is applicable to a wide range of different data types and representations. The methods we present summarize the current state-of-the-art in NEMs. Our software is written in the R language and freely available via the Bioconductor project at http://www.bioconductor.org.
R classes and methods for SNP array data.
Scharpf, Robert B; Ruczinski, Ingo
2010-01-01
The Bioconductor project is an "open source and open development software project for the analysis and comprehension of genomic data" (1), primarily based on the R programming language. Infrastructure packages, such as Biobase, are maintained by Bioconductor core developers and serve several key roles to the broader community of Bioconductor software developers and users. In particular, Biobase introduces an S4 class, the eSet, for high-dimensional assay data. Encapsulating the assay data as well as meta-data on the samples, features, and experiment in the eSet class definition ensures propagation of the relevant sample and feature meta-data throughout an analysis. Extending the eSet class promotes code reuse through inheritance as well as interoperability with other R packages and is less error-prone. Recently proposed class definitions for high-throughput SNP arrays extend the eSet class. This chapter highlights the advantages of adopting and extending Biobase class definitions through a working example of one implementation of classes for the analysis of high-throughput SNP arrays.
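The design idea in the abstract above (one object that holds assay data together with sample and feature metadata, extended by inheritance for SNP arrays) can be sketched in a few lines. This is a minimal illustration in Python, not the Biobase API: the class names `AssaySet` and `SnpSet`, their fields, and the method `subset_samples` are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AssaySet:
    """Hypothetical eSet-like container (illustration only, not Biobase)."""
    assay: List[List[float]]                 # rows = features, columns = samples
    feature_names: List[str]
    sample_names: List[str]
    sample_meta: Dict[str, Dict[str, str]]   # sample name -> annotations

    def __post_init__(self) -> None:
        # Validate that the assay matrix agrees with both annotation axes.
        if len(self.assay) != len(self.feature_names):
            raise ValueError("one assay row is required per feature")
        if any(len(row) != len(self.sample_names) for row in self.assay):
            raise ValueError("one assay column is required per sample")
        missing = [s for s in self.sample_names if s not in self.sample_meta]
        if missing:
            raise ValueError(f"samples without metadata: {missing}")

    def _columns(self, keep: List[str]) -> List[int]:
        # Column indices of the requested samples, in the requested order.
        return [self.sample_names.index(s) for s in keep]

    def subset_samples(self, keep: List[str]) -> "AssaySet":
        """Subset data and metadata together so they cannot drift apart."""
        idx = self._columns(keep)
        return AssaySet(
            assay=[[row[i] for i in idx] for row in self.assay],
            feature_names=list(self.feature_names),
            sample_names=list(keep),
            sample_meta={s: self.sample_meta[s] for s in keep},
        )


@dataclass
class SnpSet(AssaySet):
    """Hypothetical extension: adds genotype calls with the assay's shape."""
    calls: List[List[str]]

    def __post_init__(self) -> None:
        super().__post_init__()              # reuse the parent's validation
        if [len(r) for r in self.calls] != [len(r) for r in self.assay]:
            raise ValueError("calls must have the same shape as assay")

    def subset_samples(self, keep: List[str]) -> "SnpSet":
        base = super().subset_samples(keep)  # reuse the parent's subsetting
        idx = self._columns(keep)
        return SnpSet(base.assay, base.feature_names, base.sample_names,
                      base.sample_meta,
                      calls=[[row[i] for i in idx] for row in self.calls])


snps = SnpSet(
    assay=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    feature_names=["snp_a", "snp_b"],
    sample_names=["s1", "s2", "s3"],
    sample_meta={"s1": {"group": "case"}, "s2": {"group": "control"},
                 "s3": {"group": "case"}},
    calls=[["AA", "AB", "BB"], ["AB", "AB", "AA"]],
)
cases = snps.subset_samples(["s1", "s3"])
print(cases.assay, cases.calls, sorted(cases.sample_meta))
```

Because subsetting goes through a single method, the sample annotations travel with the data, which is the propagation property the abstract attributes to the eSet design; the subclass inherits validation and subsetting instead of reimplementing them.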
BeadArray Expression Analysis Using Bioconductor
Ritchie, Matthew E.; Dunning, Mark J.; Smith, Mike L.; Shi, Wei; Lynch, Andy G.
2011-01-01
Illumina whole-genome expression BeadArrays are a popular choice in gene profiling studies. Aside from the vendor-provided software tools for analyzing BeadArray expression data (GenomeStudio/BeadStudio), there exists a comprehensive set of open-source analysis tools in the Bioconductor project, many of which have been tailored to exploit the unique properties of this platform. In this article, we explore a number of these software packages and demonstrate how to perform a complete analysis of BeadArray data in various formats. The key steps of importing data, performing quality assessments, preprocessing, and annotation in the common setting of assessing differential expression in designed experiments will be covered. PMID:22144879
The Risa R/Bioconductor package: integrative data analysis from experimental metadata and back again
González-Beltrán, Alejandra; Neumann, Steffen; Maguire, Eamonn; Sansone, Susanna-Assunta; Rocca-Serra, Philippe
2014-01-01
Background: The ISA-Tab format and software suite have been developed to break the silo effect induced by technology-specific formats for a variety of data types and to better support experimental metadata tracking. Experimentalists seldom use a single technique to monitor biological signals. Providing a multi-purpose, pragmatic and accessible format that abstracts away common constructs for describing Investigations, Studies and Assays, ISA is increasingly popular. To attract further interest towards the format and extend support to ensure reproducible research and reusable data, we present the Risa package, which delivers a central component to support the ISA format by enabling effortless integration with R, the popular, open source data crunching environment. Results: The Risa package bridges the gap between the metadata collection and curation in an ISA-compliant way and the data analysis using the widely used statistical computing environment R. The package offers functionality for: i) parsing ISA-Tab datasets into R objects; ii) augmenting annotation with extra metadata not explicitly stated in the ISA syntax; iii) interfacing with domain specific R packages; iv) suggesting potentially useful R packages available in Bioconductor for subsequent processing of the experimental data described in the ISA format; and finally v) saving back to ISA-Tab files augmented with analysis specific metadata from R. We demonstrate these features by presenting use cases for mass spectrometry data and DNA microarray data. Conclusions: The Risa package is open source (with LGPL license) and freely available through Bioconductor. By making Risa available, we aim to facilitate the task of processing experimental data, encouraging a uniform representation of experimental information and results while delivering tools for ensuring traceability and provenance tracking.
Software availability: The Risa package has been available since Bioconductor 2.11 (version 1.0.0), and version 1.2.1 appeared in Bioconductor 2.12, both along with documentation and examples. The latest version of the code is at the development branch in Bioconductor and can also be accessed from GitHub https://github.com/ISA-tools/Risa, where the issue tracker allows users to report bugs or feature requests. PMID:24564732
Analysis of ChIP-seq Data in R/Bioconductor.
de Santiago, Ines; Carroll, Thomas
2018-01-01
The development of novel high-throughput sequencing methods for ChIP (chromatin immunoprecipitation) has provided a very powerful tool to study gene regulation in multiple conditions at unprecedented resolution and scale. Proactive quality control and appropriate data analysis techniques are of critical importance to extract the most meaningful results from the data. In recent years, an array of R/Bioconductor tools has been developed allowing researchers to process and analyze ChIP-seq data. This chapter provides an overview of the methods available to analyze ChIP-seq data based primarily on software packages from the open-source Bioconductor project. Protocols described in this chapter cover basic steps including data alignment, peak calling, quality control and data visualization, as well as more complex methods such as the identification of differentially bound regions and functional analyses to annotate regulatory regions. The steps in the data analysis process are demonstrated on publicly available data sets and illustrate the computational procedures routinely used for the analysis of ChIP-seq data in R/Bioconductor, from which readers can construct their own analysis pipelines.
Software for the Integration of Multiomics Experiments in Bioconductor.
Ramos, Marcel; Schiffer, Lucas; Re, Angela; Azhar, Rimsha; Basunia, Azfar; Rodriguez, Carmen; Chan, Tiffany; Chapman, Phil; Davis, Sean R; Gomez-Cabrero, David; Culhane, Aedin C; Haibe-Kains, Benjamin; Hansen, Kasper D; Kodali, Hanish; Louis, Marie S; Mer, Arvind S; Riester, Markus; Morgan, Martin; Carey, Vince; Waldron, Levi
2017-11-01
Multiomics experiments are increasingly commonplace in biomedical research and add layers of complexity to experimental design, data integration, and analysis. R and Bioconductor provide a generic framework for statistical analysis and visualization, as well as specialized data classes for a variety of high-throughput data types, but methods are lacking for integrative analysis of multiomics experiments. The MultiAssayExperiment software package, implemented in R and leveraging Bioconductor software and design principles, provides for the coordinated representation of, storage of, and operation on multiple diverse genomics data. We provide the unrestricted multiple 'omics data for each cancer tissue in The Cancer Genome Atlas as ready-to-analyze MultiAssayExperiment objects and demonstrate in these and other datasets how the software simplifies data representation, statistical analysis, and visualization. The MultiAssayExperiment Bioconductor package reduces major obstacles to efficient, scalable, and reproducible statistical analysis of multiomics data and enhances data science applications of multiple omics datasets. Cancer Res; 77(21); e39-42. ©2017 AACR.
The Risa R/Bioconductor package: integrative data analysis from experimental metadata and back again
González-Beltrán, Alejandra; Neumann, Steffen; Maguire, Eamonn; Sansone, Susanna-Assunta; Rocca-Serra, Philippe
2014-01-01
The ISA-Tab format and software suite have been developed to break the silo effect induced by technology-specific formats for a variety of data types and to better support experimental metadata tracking. Experimentalists seldom use a single technique to monitor biological signals. Providing a multi-purpose, pragmatic and accessible format that abstracts away common constructs for describing Investigations, Studies and Assays, ISA is increasingly popular. To attract further interest towards the format and extend support to ensure reproducible research and reusable data, we present the Risa package, which delivers a central component to support the ISA format by enabling effortless integration with R, the popular, open source data crunching environment. The Risa package bridges the gap between the metadata collection and curation in an ISA-compliant way and the data analysis using the widely used statistical computing environment R. The package offers functionality for: i) parsing ISA-Tab datasets into R objects, ii) augmenting annotation with extra metadata not explicitly stated in the ISA syntax; iii) interfacing with domain specific R packages iv) suggesting potentially useful R packages available in Bioconductor for subsequent processing of the experimental data described in the ISA format; and finally v) saving back to ISA-Tab files augmented with analysis specific metadata from R. We demonstrate these features by presenting use cases for mass spectrometry data and DNA microarray data. The Risa package is open source (with LGPL license) and freely available through Bioconductor. By making Risa available, we aim to facilitate the task of processing experimental data, encouraging a uniform representation of experimental information and results while delivering tools for ensuring traceability and provenance tracking. 
The Risa package has been available since Bioconductor 2.11 (version 1.0.0), and version 1.2.1 appeared in Bioconductor 2.12, both along with documentation and examples. The latest version of the code is at the development branch in Bioconductor and can also be accessed from GitHub https://github.com/ISA-tools/Risa, where the issue tracker allows users to report bugs or feature requests.
RBioCloud: A Light-Weight Framework for Bioconductor and R-based Jobs on the Cloud.
Varghese, Blesson; Patel, Ishan; Barker, Adam
2015-01-01
Large-scale ad hoc analytics of genomic data is commonly performed in the R programming language, supported by over 700 software packages provided by Bioconductor. More recently, analytical jobs have begun to benefit from on-demand computing and storage, scalability and low maintenance cost, all of which are offered by the cloud. While biologists and bioinformaticists can take an analytical job and execute it on their personal workstations, it remains challenging to seamlessly execute the job on cloud infrastructure without extensive knowledge of the cloud dashboard. This paper explores how analytical jobs can be executed on the cloud with minimum effort and how the resources and data required by a job can be managed. An open-source light-weight framework for executing R scripts using Bioconductor packages, referred to as 'RBioCloud', is designed and developed. RBioCloud offers a set of simple command-line tools for managing the cloud resources, the data and the execution of the job. Three biological test cases validate the feasibility of RBioCloud. The framework is available from http://www.rbiocloud.com.
McCarthy, Davis J; Campbell, Kieran R; Lun, Aaron T L; Wills, Quin F
2017-04-15
Single-cell RNA sequencing (scRNA-seq) is increasingly used to study gene expression at the level of individual cells. However, preparing raw sequence data for further analysis is not a straightforward process. Biases, artifacts and other sources of unwanted variation are present in the data, requiring substantial time and effort to be spent on pre-processing, quality control (QC) and normalization. We have developed the R/Bioconductor package scater to facilitate rigorous pre-processing, quality control, normalization and visualization of scRNA-seq data. The package provides a convenient, flexible workflow to process raw sequencing reads into a high-quality expression dataset ready for downstream analysis. scater provides a rich suite of plotting tools for single-cell data and a flexible data structure that is compatible with existing tools and can be used as infrastructure for future software development. Availability: The open-source code, along with installation instructions, vignettes and case studies, is available through Bioconductor at http://bioconductor.org/packages/scater. Contact: davis@ebi.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
Blattmann, Peter; Heusel, Moritz; Aebersold, Ruedi
2016-01-01
SWATH-MS is an acquisition and analysis technique of targeted proteomics that enables measuring several thousand proteins with high reproducibility and accuracy across many samples. OpenSWATH is popular open-source software for peptide identification and quantification from SWATH-MS data. For downstream statistical and quantitative analysis there exist different tools such as MSstats, mapDIA and aLFQ. However, the transfer of data from OpenSWATH to the downstream statistical tools is currently technically challenging. Here we introduce the R/Bioconductor package SWATH2stats, which allows convenient processing of the data into a format directly readable by the downstream analysis tools. In addition, SWATH2stats allows annotation, analyzing the variation and the reproducibility of the measurements, FDR estimation, and advanced filtering before submitting the processed data to downstream tools. These functionalities are important to quickly analyze the quality of the SWATH-MS data. Hence, SWATH2stats is a new open-source tool that summarizes several practical functionalities for analyzing, processing, and converting SWATH-MS data and thus facilitates the efficient analysis of large-scale SWATH/DIA datasets.
Importing MAGE-ML format microarray data into BioConductor.
Durinck, Steffen; Allemeersch, Joke; Carey, Vincent J; Moreau, Yves; De Moor, Bart
2004-12-12
The microarray gene expression markup language (MAGE-ML) is a widely used XML (eXtensible Markup Language) standard for describing and exchanging information about microarray experiments. It can describe microarray designs, microarray experiment designs, gene expression data and data analysis results. We describe RMAGEML, a new Bioconductor package that provides a link between cDNA microarray data stored in MAGE-ML format and the Bioconductor framework for preprocessing, visualization and analysis of microarray experiments. Availability: http://www.bioconductor.org. Open source.
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
Robinson, Mark D; McCarthy, Davis J; Smyth, Gordon K
2010-01-01
It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).
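The "overdispersed Poisson model" mentioned in the abstract above is commonly written as a negative binomial mean-variance relationship. The block below is a sketch under that assumption; the abstract itself does not give the parameterization, and the symbols ($Y_{gi}$ for the count of transcript $g$ in library $i$, $\mu_{gi}$ for its mean, $\phi_g$ for the transcript-wise dispersion) are introduced here only for illustration.

```latex
\begin{aligned}
  \mathbb{E}[Y_{gi}] &= \mu_{gi}, \\
  \operatorname{Var}(Y_{gi}) &= \mu_{gi} + \phi_g \,\mu_{gi}^{2}, \qquad \phi_g \ge 0 .
\end{aligned}
```

Setting $\phi_g = 0$ recovers the Poisson case, where the variance equals the mean and only technical (sampling) variability is modeled; $\phi_g > 0$ adds a term that grows with the square of the mean, which is where biological variability enters. Moderating the degree of overdispersion across transcripts, as the abstract describes, amounts to shrinking the individual $\phi_g$ estimates toward a shared value.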
RImmPort: an R/Bioconductor package that enables ready-for-analysis immunology research data.
Shankar, Ravi D; Bhattacharya, Sanchita; Jujjavarapu, Chethan; Andorf, Sandra; Wiser, Jeffery A; Butte, Atul J
2017-04-01
Open access to raw clinical and molecular data related to immunological studies has created a tremendous opportunity for data-driven science. We have developed RImmPort, which prepares NIAID-funded research study datasets in ImmPort (immport.org) for analysis in R. RImmPort comprises three main components: (i) a specification of R classes that encapsulate study data, (ii) foundational methods to load data of a specific study and (iii) generic methods to slice and dice data across different dimensions in one or more studies. Furthermore, RImmPort supports open formalisms, such as CDISC standards on the open source bioinformatics platform Bioconductor, to ensure that ImmPort curated study datasets are seamlessly accessible and ready for analysis, thus enabling innovative bioinformatics research in immunology. Availability: RImmPort is available as part of Bioconductor (bioconductor.org/packages/RImmPort). Contact: rshankar@stanford.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages
Silva, Tiago C; Colaprico, Antonio; Olsen, Catharina; D'Angelo, Fulvio; Bontempi, Gianluca; Ceccarelli, Michele; Noushmehr, Houtan
2016-01-01
Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no one comprehensive tool that provides a complete integrative analysis of the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative analyses of different molecular data. We describe how to download, process and prepare TCGA data and, by harnessing several key Bioconductor packages, how to extract biologically meaningful genomic and epigenomic data. Using Roadmap and ENCODE data, we provide a work plan to identify biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiforme or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks. PMID:28232861
GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases.
Zhu, Lihua Julie; Lawrence, Michael; Gupta, Ankit; Pagès, Hervé; Kucukural, Alper; Garber, Manuel; Wolfe, Scot A
2017-05-15
Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency of these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased genome-wide detection of nuclease cleavage sites within the genome. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed. Here, we describe an open source, open development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to different nuclease platforms with different length and complexity in their guide and PAM recognition sequences or their DNA cleavage position. They also enable users to customize sequence aggregation criteria, and vary peak calling thresholds that can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. 
For each identified off-target, the GUIDEseq package outputs the mapped GUIDE-seq read count as well as the cleavage score from a user-specified off-target cleavage score prediction algorithm, permitting the identification of genomic sequences with unexpected cleavage activity. The GUIDEseq package enables analysis of GUIDE-seq data from various nuclease platforms for any species with a defined genomic sequence. This software package has been used successfully to analyze several GUIDE-seq datasets. The software, source code and documentation are freely available at http://www.bioconductor.org/packages/release/bioc/html/GUIDEseq.html.
Lun, Aaron T.L.; Smyth, Gordon K.
2016-01-01
Chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq) is widely used to identify binding sites for a target protein in the genome. An important scientific application is to identify changes in protein binding between different treatment conditions, i.e. to detect differential binding (DB). This can reveal potential mechanisms through which changes in binding may contribute to the treatment effect. The csaw package provides a framework for the de novo detection of differentially bound genomic regions. It uses a window-based strategy to summarize read counts across the genome. It exploits existing statistical software to test for significant differences in each window. Finally, it clusters windows into regions for output and controls the false discovery rate properly over all detected regions. The csaw package can handle arbitrarily complex experimental designs involving biological replicates. It can be applied to both transcription factor and histone mark datasets, and, more generally, to any type of sequencing data measuring genomic coverage. csaw performs favorably against existing methods for de novo DB analyses on both simulated and real data. csaw is implemented as an R software package and is freely available from the open-source Bioconductor project. PMID:26578583
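The window-based strategy in the abstract above (count reads per window, select windows, cluster selected windows into regions) can be illustrated with a toy example. This is a sketch in Python and not the csaw implementation or its API: the function names, the single-chromosome coordinates, and especially the plain count threshold that stands in for the per-window statistical test are assumptions made for illustration. Control of the false discovery rate over regions is not shown.

```python
from typing import List, Tuple


def window_counts(read_starts: List[int], genome_length: int,
                  width: int, step: int) -> List[Tuple[int, int]]:
    """Count reads whose start falls in each window [start, start + width)."""
    counts = []
    for start in range(0, max(genome_length - width, 0) + 1, step):
        n = sum(1 for pos in read_starts if start <= pos < start + width)
        counts.append((start, n))
    return counts


def merge_windows(starts: List[int], width: int,
                  max_gap: int) -> List[Tuple[int, int]]:
    """Cluster windows into regions when the gap between them is <= max_gap."""
    regions: List[Tuple[int, int]] = []
    for start in sorted(starts):
        end = start + width
        if regions and start - regions[-1][1] <= max_gap:
            # Close enough to the previous region: extend it.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# Toy data: read start positions on one hypothetical 100 bp chromosome.
reads = [5, 7, 8, 12, 55, 58, 90]
counts = window_counts(reads, genome_length=100, width=10, step=10)

# Stand-in for the statistical test: keep windows with at least one read.
selected = [start for start, n in counts if n >= 1]
regions = merge_windows(selected, width=10, max_gap=0)
print(regions)  # [(0, 20), (50, 60), (90, 100)]
```

In this toy run the two adjacent windows starting at 0 and 10 are merged into one region, while the isolated windows at 50 and 90 stay separate, which shows why the final error rate has to be controlled over regions and not over individual windows.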
Dai, Yilin; Guo, Ling; Li, Meng; Chen, Yi-Bu
2012-06-08
Microarray data analysis presents a significant challenge to researchers who are unable to use the powerful Bioconductor and its numerous tools due to their lack of knowledge of the R language. Among the few existing software programs that offer a graphical user interface to Bioconductor packages, none has implemented a comprehensive strategy to address the accuracy and reliability issues of microarray data analysis that arise from the well-known probe design problems associated with many widely used microarray chips. There is also a lack of tools that would expedite the functional analysis of microarray results. We present Microarray Я US, an R-based graphical user interface that implements over a dozen popular Bioconductor packages to offer researchers a streamlined workflow for routine differential microarray expression data analysis without the need to learn the R language. In order to enable a more accurate analysis and interpretation of microarray data, we incorporated the latest custom probe re-definition and re-annotation for Affymetrix and Illumina chips. A versatile microarray results output utility tool was also implemented for easy and fast generation of input files for over 20 of the most widely used functional analysis software programs. Coupled with a well-designed user interface, Microarray Я US leverages cutting-edge Bioconductor packages for researchers with no knowledge of the R language. It also enables a more reliable and accurate microarray data analysis and expedites downstream functional analysis of microarray results.
rTANDEM, an R/Bioconductor package for MS/MS protein identification.
Fournier, Frédéric; Joly Beauparlant, Charles; Paradis, René; Droit, Arnaud
2014-08-01
rTANDEM is an R/Bioconductor package that interfaces the X!Tandem protein identification algorithm. The package can run the multi-threaded algorithm on proteomic data files directly from R. It also provides functions to convert search parameters and results to/from R as well as functions to manipulate parameters and automate searches. An associated R package, shinyTANDEM, provides a web-based graphical interface to visualize and interpret the results. Together, those two packages form an entry point for a general MS/MS-based proteomic pipeline in R/Bioconductor. rTANDEM and shinyTANDEM are distributed in R/Bioconductor, http://bioconductor.org/packages/release/bioc/. The packages are under open licenses (GPL-3 and Artistic-1.0). Contact: frederic.fournier@crchuq.ulaval.ca or arnaud.droit@crchuq.ulaval.ca. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
This proposal develops scalable R / Bioconductor software infrastructure and data resources to integrate complex, heterogeneous, and large cancer genomic experiments. The falling cost of genomic assays facilitates collection of multiple data types (e.g., gene and transcript expression, structural variation, copy number, methylation, and microRNA data) from a set of clinical specimens. Furthermore, substantial resources are now available from large consortium activities like The Cancer Genome Atlas (TCGA).
Computational methods for evaluation of cell-based data assessment--Bioconductor.
Le Meur, Nolwenn
2013-02-01
Recent advances in miniaturization and automation of technologies have enabled cell-based assay high-throughput screening, bringing along new challenges in data analysis. Automation, standardization, and reproducibility have become requirements for qualitative research. The Bioconductor community has worked in that direction, proposing several R packages to handle high-throughput data including flow cytometry (FCM) experiments. Altogether, these packages cover the main steps of an FCM analysis workflow, that is, data management, quality assessment, normalization, outlier detection, automated gating, cluster labeling, and feature extraction. Additionally, the open-source philosophy of R and Bioconductor, which offers room for new development, continuously drives research and improvement of these analysis methods, especially in the field of clustering and data mining. This review presents the principal FCM packages currently available in R and Bioconductor, their advantages and their limits. Copyright © 2012 Elsevier Ltd. All rights reserved.
WebArray: an online platform for microarray data analysis
Xia, Xiaoqin; McClelland, Michael; Wang, Yipeng
2005-01-01
Background Many cutting-edge microarray analysis tools and algorithms, including commonly used limma and affy packages in Bioconductor, need sophisticated knowledge of mathematics, statistics and computer skills for implementation. Commercially available software can provide a user-friendly interface at considerable cost. To facilitate the use of these tools for microarray data analysis on an open platform we developed an online microarray data analysis platform, WebArray, for bench biologists to utilize these tools to explore data from single/dual color microarray experiments. Results The currently implemented functions were based on limma and affy package from Bioconductor, the spacings LOESS histogram (SPLOSH) method, PCA-assisted normalization method and genome mapping method. WebArray incorporates these packages and provides a user-friendly interface for accessing a wide range of key functions of limma and others, such as spot quality weight, background correction, graphical plotting, normalization, linear modeling, empirical bayes statistical analysis, false discovery rate (FDR) estimation, chromosomal mapping for genome comparison. Conclusion WebArray offers a convenient platform for bench biologists to access several cutting-edge microarray data analysis tools. The website is freely available at . It runs on a Linux server with Apache and MySQL. PMID:16371165
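As an illustration of the false discovery rate (FDR) estimation step mentioned above, the sketch below computes Benjamini-Hochberg adjusted p-values in plain Python. The abstract does not say which FDR procedure WebArray applies, so the choice of Benjamini-Hochberg is an assumption, and the function name `bh_adjust` and the example p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four genes.
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.5])
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

With these inputs only the first gene remains below a 0.05 threshold after adjustment, although three of the raw p-values were below 0.05.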
chimeraviz: a tool for visualizing chimeric RNA.
Lågstad, Stian; Zhao, Sen; Hoff, Andreas M; Johannessen, Bjarne; Lingjærde, Ole Christian; Skotheim, Rolf I
2017-09-15
Advances in high-throughput RNA sequencing have enabled more efficient detection of fusion transcripts, but the technology and associated software used for fusion detection from sequencing data often yield a high false discovery rate. Good prioritization of the results is important, and this can be helped by a visualization framework that automatically integrates RNA data with known genomic features. Here we present chimeraviz, a Bioconductor package that automates the creation of chimeric RNA visualizations. The package supports input from nine different fusion-finder tools: deFuse, EricScript, InFusion, JAFFA, FusionCatcher, FusionMap, PRADA, SOAPfuse and STAR-FUSION. chimeraviz is an R package available via Bioconductor ( https://bioconductor.org/packages/release/bioc/html/chimeraviz.html ) under Artistic-2.0. Source code and support are available at GitHub ( https://github.com/stianlagstad/chimeraviz ). Contact: rolf.i.skotheim@rr-research.no. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
Using Kepler for Tool Integration in Microarray Analysis Workflows.
Gan, Zhuohui; Stowe, Jennifer C; Altintas, Ilkay; McCulloch, Andrew D; Zambon, Alexander C
Increasing numbers of genomic technologies are leading to massive amounts of genomic data, all of which require complex analysis. More and more bioinformatics analysis tools are being developed by scientists to simplify these analyses. However, different pipelines have been developed using different software environments. This makes integration of these diverse bioinformatics tools difficult. Kepler provides an open source environment to integrate these disparate packages. Using Kepler, we integrated several external tools, including Bioconductor packages, AltAnalyze (a Python-based open source tool), and an R-based comparison tool, to build an automated workflow to meta-analyze both online and local microarray data. The automated workflow connects the integrated tools seamlessly, delivers data flow between the tools smoothly, and hence improves efficiency and accuracy of complex data analyses. Our workflow exemplifies the usage of Kepler as a scientific workflow platform for bioinformatics pipelines.
The use of open source bioinformatics tools to dissect transcriptomic data.
Nitsche, Benjamin M; Ram, Arthur F J; Meyer, Vera
2012-01-01
Microarrays are a valuable technology to study fungal physiology on a transcriptomic level. Various microarray platforms are available comprising both single and two channel arrays. Despite different technologies, preprocessing of microarray data generally includes quality control, background correction, normalization, and summarization of probe level data. Subsequently, depending on the experimental design, diverse statistical analyses can be performed, including the identification of differentially expressed genes and the construction of gene coexpression networks. We describe how Bioconductor, a collection of open source and open development packages for the statistical programming language R, can be used for dissecting microarray data. We provide fundamental details that facilitate the process of getting started with R and Bioconductor. Using two publicly available microarray datasets from Aspergillus niger, we give detailed protocols on how to identify differentially expressed genes and how to construct gene coexpression networks.
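One simple way to construct a gene coexpression network is to connect genes whose expression profiles are strongly correlated. The sketch below does this in plain Python. It is an assumption about the general technique, not the protocol from the chapter: the absolute Pearson correlation cutoff, the gene labels and the expression values are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def coexpression_edges(expr, threshold):
    """Return gene pairs whose absolute correlation reaches the threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


# Hypothetical expression values for four genes across four arrays.
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
}
edges = coexpression_edges(expr, threshold=0.9)
```

Using the absolute value keeps anti-correlated pairs such as g1 and g3 in the network; dropping `abs` would restrict it to positively coexpressed genes.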
Haunsberger, Stefan J; Connolly, Niamh M C; Prehn, Jochen H M
2017-02-15
The miRBase database is the central and official repository for miRNAs and the current release is miRBase version 21.0. Name changes in different miRBase releases cause inconsistencies in miRNA names from version to version. When working with only a small number of miRNAs the translation can be done manually. However, with large sets of miRNAs, the necessary correction of such inconsistencies becomes burdensome and error-prone. We developed miRNAmeConverter, available as a Bioconductor R package and web interface, that addresses the challenges associated with mature miRNA name inconsistencies. The main algorithm implemented enables high-throughput automatic translation of species-independent mature miRNA names to user selected miRBase versions. The web interface enables users less familiar with R to translate miRNA names given in the form of a list or embedded in text and to download the results. The miRNAmeConverter R package is open source under the Artistic-2.0 license. It is freely available from Bioconductor ( http://bioconductor.org/packages/miRNAmeConverter ). The web interface is based on R Shiny and can be accessed under the URL http://www.systemsmedicineireland.ie/tools/mirna-name-converter/ . The database that miRNAmeConverter depends on is provided by the annotation package miRBaseVersions.db and can be downloaded from Bioconductor ( http://bioconductor.org/packages/miRBaseVersions.db ). Minimum R version 3.3.0 is required. Contact: stefanhaunsberger@rcsi.ie. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
esATAC: An Easy-to-use Systematic pipeline for ATAC-seq data analysis.
Wei, Zheng; Zhang, Wei; Fang, Huan; Li, Yanda; Wang, Xiaowo
2018-03-07
ATAC-seq is rapidly emerging as one of the major experimental approaches to probe chromatin accessibility genome-wide. Here, we present "esATAC", a highly integrated easy-to-use R/Bioconductor package, for systematic ATAC-seq data analysis. It covers the essential steps of a full analysis procedure, including raw data processing, quality control and downstream statistical analysis such as peak calling, enrichment analysis and transcription factor footprinting. esATAC supports one command line execution for preset pipelines, and provides flexible interfaces for building customized pipelines. The esATAC package is open source under the GPL-3.0 license. It is implemented in R and C++. Source code and binaries for Linux, Mac OS X and Windows are available through Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/esATAC.html). Contact: xwwang@tsinghua.edu.cn. Supplementary data are available at Bioinformatics online.
ATACseqQC: a Bioconductor package for post-alignment quality assessment of ATAC-seq data.
Ou, Jianhong; Liu, Haibo; Yu, Jun; Kelliher, Michelle A; Castilla, Lucio H; Lawson, Nathan D; Zhu, Lihua Julie
2018-03-01
ATAC-seq (Assays for Transposase-Accessible Chromatin using sequencing) is a recently developed technique for genome-wide analysis of chromatin accessibility. Compared to earlier methods for assaying chromatin accessibility, ATAC-seq is faster and easier to perform, does not require cross-linking, has higher signal to noise ratio, and can be performed on small cell numbers. However, to ensure a successful ATAC-seq experiment, step-by-step quality assurance processes, including both wet lab quality control and in silico quality assessment, are essential. While several tools have been developed or adopted for assessing read quality, identifying nucleosome occupancy and accessible regions from ATAC-seq data, none of the tools provide a comprehensive set of functionalities for preprocessing and quality assessment of aligned ATAC-seq datasets. We have developed a Bioconductor package, ATACseqQC, for easily generating various diagnostic plots to help researchers quickly assess the quality of their ATAC-seq data. In addition, this package contains functions to preprocess aligned ATAC-seq data for subsequent peak calling. Here we demonstrate the utilities of our package using 25 publicly available ATAC-seq datasets from four studies. We also provide guidelines on what the diagnostic plots should look like for an ideal ATAC-seq dataset. This software package has been used successfully for preprocessing and assessing several in-house and public ATAC-seq datasets. Diagnostic plots generated by this package will facilitate the quality assessment of ATAC-seq data, and help researchers to evaluate their own ATAC-seq experiments as well as select high-quality ATAC-seq datasets from public repositories such as GEO to avoid generating hypotheses or drawing conclusions from low-quality ATAC-seq experiments. The software, source code, and documentation are freely available as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/ATACseqQC.html .
Wen, Bo; Xu, Shaohang; Sheynkman, Gloria M; Feng, Qiang; Lin, Liang; Wang, Quanhui; Xu, Xun; Wang, Jun; Liu, Siqi
2014-11-01
Single nucleotide variations (SNVs) located within a reading frame can result in single amino acid polymorphisms (SAPs), leading to alteration of the corresponding amino acid sequence as well as function of a protein. Accurate detection of SAPs is an important issue in proteomic analysis at the experimental and bioinformatic level. Herein, we present sapFinder, an R software package, for detection of the variant peptides based on tandem mass spectrometry (MS/MS)-based proteomics data. This package automates the construction of variation-associated databases from public SNV repositories or sample-specific next-generation sequencing (NGS) data and the identification of SAPs through database searching, post-processing and generation of HTML-based report with visualized interface. sapFinder is implemented as a Bioconductor package in R. The package and the vignette can be downloaded at http://bioconductor.org/packages/devel/bioc/html/sapFinder.html and are provided under a GPL-2 license. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
ReQON: a Bioconductor package for recalibrating quality scores from next-generation sequencing data
2012-01-01
Background Next-generation sequencing technologies have become important tools for genome-wide studies. However, the quality scores that are assigned to each base have been shown to be inaccurate. If the quality scores are used in downstream analyses, these inaccuracies can have a significant impact on the results. Results Here we present ReQON, a tool that recalibrates the base quality scores from an input BAM file of aligned sequencing data using logistic regression. ReQON also generates diagnostic plots showing the effectiveness of the recalibration. We show that ReQON produces quality scores that are both more accurate, in the sense that they more closely correspond to the probability of a sequencing error, and do a better job of discriminating between sequencing errors and non-errors than the original quality scores. We also compare ReQON to other available recalibration tools and show that ReQON is less biased and performs favorably in terms of quality score accuracy. Conclusion ReQON is an open source software package, written in R and available through Bioconductor, for recalibrating base quality scores for next-generation sequencing data. ReQON produces a new BAM file with more accurate quality scores, which can improve the results of downstream analysis, and produces several diagnostic plots showing the effectiveness of the recalibration. PMID:22946927
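The relationship between a base quality score and the probability of a sequencing error, which the recalibration above relies on, follows the standard Phred scale Q = -10 log10(p). The plain Python sketch below converts between the two and shows how a fitted logistic model could be turned back into a quality score. It is not the ReQON implementation: the single-covariate model and the `intercept` and `slope` coefficients are hypothetical placeholders for whatever a real regression fit would produce.

```python
import math


def phred_to_prob(q):
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def prob_to_phred(p):
    """Phred quality score implied by an error probability."""
    return -10 * math.log10(p)


def recalibrate(q_reported, intercept, slope):
    """Toy recalibration: error probability = logistic(intercept + slope * q)."""
    p_error = 1 / (1 + math.exp(-(intercept + slope * q_reported)))
    return round(prob_to_phred(p_error))


p30 = phred_to_prob(30)          # reported Q30 claims one error in 1000 bases
q_from_p = prob_to_phred(0.01)   # a 1% error rate corresponds to Q20
```

A real model would typically include further covariates, for example position in the read, alongside the reported quality score.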
From reads to regions: a Bioconductor workflow to detect differential binding in ChIP-seq data
Lun, Aaron T. L.; Smyth, Gordon K.
2016-01-01
Chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq) is widely used to identify the genomic binding sites for protein of interest. Most conventional approaches to ChIP-seq data analysis involve the detection of the absolute presence (or absence) of a binding site. However, an alternative strategy is to identify changes in the binding intensity between two biological conditions, i.e., differential binding (DB). This may yield more relevant results than conventional analyses, as changes in binding can be associated with the biological difference being investigated. The aim of this article is to facilitate the implementation of DB analyses, by comprehensively describing a computational workflow for the detection of DB regions from ChIP-seq data. The workflow is based primarily on R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, from alignment of read sequences to interpretation and visualization of putative DB regions. In particular, detection of DB regions will be conducted using the counts for sliding windows from the csaw package, with statistical modelling performed using methods in the edgeR package. Analyses will be demonstrated on real histone mark and transcription factor data sets. This will provide readers with practical usage examples that can be applied in their own studies. PMID:26834993
Reproducible Bioconductor workflows using browser-based interactive notebooks and containers.
Almugbel, Reem; Hung, Ling-Hong; Hu, Jiaming; Almutairy, Abeer; Ortogero, Nicole; Tamta, Yashaswi; Yeung, Ka Yee
2018-01-01
Bioinformatics publications typically include complex software workflows that are difficult to describe in a manuscript. We describe and demonstrate the use of interactive software notebooks to document and distribute bioinformatics research. We provide a user-friendly tool, BiocImageBuilder, that allows users to easily distribute their bioinformatics protocols through interactive notebooks uploaded to either a GitHub repository or a private server. We present four different interactive Jupyter notebooks using R and Bioconductor workflows to infer differential gene expression, analyze cross-platform datasets, process RNA-seq data and KinomeScan data. These interactive notebooks are available on GitHub. The analytical results can be viewed in a browser. Most importantly, the software contents can be executed and modified. This is accomplished using Binder, which runs the notebook inside software containers, thus avoiding the need to install any software and ensuring reproducibility. All the notebooks were produced using custom files generated by BiocImageBuilder. BiocImageBuilder facilitates the publication of workflows with a point-and-click user interface. We demonstrate that interactive notebooks can be used to disseminate a wide range of bioinformatics analyses. The use of software containers to mirror the original software environment ensures reproducibility of results. Parameters and code can be dynamically modified, allowing for robust verification of published results and encouraging rapid adoption of new methods. Given the increasing complexity of bioinformatics workflows, we anticipate that these interactive software notebooks will become as necessary for documenting software methods as traditional laboratory notebooks have been for documenting bench protocols, and as ubiquitous. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. 
MIRA: An R package for DNA methylation-based inference of regulatory activity.
Lawson, John T; Tomazou, Eleni M; Bock, Christoph; Sheffield, Nathan C
2018-03-01
DNA methylation contains information about the regulatory state of the cell. MIRA aggregates genome-scale DNA methylation data into a DNA methylation profile for independent region sets with shared biological annotation. Using this profile, MIRA infers and scores the collective regulatory activity for each region set. MIRA facilitates regulatory analysis in situations where classical regulatory assays would be difficult and allows public sources of open chromatin and protein binding regions to be leveraged for novel insight into the regulatory state of DNA methylation datasets. R package available on Bioconductor: http://bioconductor.org/packages/release/bioc/html/MIRA.html. nsheffield@virginia.edu.
Finak, Greg; Frelinger, Jacob; Jiang, Wenxin; Newell, Evan W.; Ramey, John; Davis, Mark M.; Kalams, Spyros A.; De Rosa, Stephen C.; Gottardo, Raphael
2014-01-01
Flow cytometry is used increasingly in clinical research for cancer, immunology and vaccines. Technological advances in cytometry instrumentation are increasing the size and dimensionality of data sets, posing a challenge for traditional data management and analysis. Automated analysis methods, despite a general consensus of their importance to the future of the field, have been slow to gain widespread adoption. Here we present OpenCyto, a new BioConductor infrastructure and data analysis framework designed to lower the barrier of entry to automated flow data analysis algorithms by addressing key areas that we believe have held back wider adoption of automated approaches. OpenCyto supports end-to-end data analysis that is robust and reproducible while generating results that are easy to interpret. We have improved the existing, widely used core BioConductor flow cytometry infrastructure by allowing analysis to scale in a memory efficient manner to the large flow data sets that arise in clinical trials, and integrating domain-specific knowledge as part of the pipeline through the hierarchical relationships among cell populations. Pipelines are defined through a text-based csv file, limiting the need to write data-specific code, and are data agnostic to simplify repetitive analysis for core facilities. We demonstrate how to analyze two large cytometry data sets: an intracellular cytokine staining (ICS) data set from a published HIV vaccine trial focused on detecting rare, antigen-specific T-cell populations, where we identify a new subset of CD8 T-cells with a vaccine-regimen specific response that could not be identified through manual analysis, and a CyTOF T-cell phenotyping data set where a large staining panel and many cell populations are a challenge for traditional analysis. The substantial improvements to the core BioConductor flow cytometry packages give OpenCyto the potential for wide adoption. 
It can rapidly leverage new developments in computational cytometry and facilitate reproducible analysis in a unified environment. PMID:25167361
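The idea of defining a pipeline through a text-based csv file that encodes the hierarchical relationships among cell populations can be sketched as follows. This is plain Python, not the OpenCyto API. The two columns `alias` and `parent` and the population names are assumptions chosen for illustration, and a real template would carry further columns such as the gating method and its arguments.

```python
import csv
import io

# Hypothetical two-column template: each row names a population and its parent.
TEMPLATE = """alias,parent
lymphocytes,root
cd3,lymphocytes
cd4,cd3
cd8,cd3
"""


def read_hierarchy(text):
    """Map each parent population to the list of its child populations."""
    children = {}
    for row in csv.DictReader(io.StringIO(text)):
        children.setdefault(row["parent"], []).append(row["alias"])
    return children


def gating_order(children, node="root"):
    """Depth-first order in which populations would be gated, parents first."""
    order = []
    for child in children.get(node, []):
        order.append(child)
        order.extend(gating_order(children, child))
    return order


hierarchy = read_hierarchy(TEMPLATE)
order = gating_order(hierarchy)
```

Because the hierarchy lives in the csv text rather than in code, the same two functions can be reused for a different staining panel by editing only the template.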
Prins, Pjotr; Goto, Naohisa; Yates, Andrew; Gautier, Laurent; Willis, Scooter; Fields, Christopher; Katayama, Toshiaki
2012-01-01
Open-source software (OSS) encourages computer programmers to reuse software components written by others. In evolutionary bioinformatics, OSS comes in a broad range of programming languages, including C/C++, Perl, Python, Ruby, Java, and R. To avoid writing the same functionality multiple times for different languages, it is possible to share components by bridging computer languages and Bio* projects, such as BioPerl, Biopython, BioRuby, BioJava, and R/Bioconductor. In this chapter, we compare the two principal approaches for sharing software between different programming languages: either by remote procedure call (RPC) or by sharing a local call stack. RPC provides a language-independent protocol over a network interface; examples are RSOAP and Rserve. The local call stack provides a between-language mapping not over the network interface, but directly in computer memory; examples are R bindings, RPy, and languages sharing the Java Virtual Machine stack. This functionality provides strategies for sharing of software between Bio* projects, which can be exploited more often. Here, we present cross-language examples for sequence translation, and measure throughput of the different options. We compare calling into R through native R, RSOAP, Rserve, and RPy interfaces, with the performance of native BioPerl, Biopython, BioJava, and BioRuby implementations, and with call stack bindings to BioJava and the European Molecular Biology Open Software Suite. In general, call stack approaches outperform native Bio* implementations and these, in turn, outperform RPC-based approaches. To test and compare strategies, we provide a downloadable BioNode image with all examples, tools, and libraries included. The BioNode image can be run on VirtualBox-supported operating systems, including Windows, OSX, and Linux.
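The sequence translation task used for the cross-language comparison above can be written natively in a few lines. The plain Python sketch below uses the standard genetic code and is meant only to show what such a benchmark function computes. It is not taken from any of the Bio* libraries, and the function name `translate` and the example sequence are assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


protein = translate("ATGGCCTGA")
```

Timing many calls to a function like this, once natively and once through an RPC or call stack bridge, is the kind of throughput measurement the chapter describes.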
Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E
2015-09-29
In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow those methods to be evaluated accurately and reproducibly. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework, which uses various datasets, highlights the specialization of some methods toward network types and data. As a result, it is possible to identify the techniques that have broad overall performance.
Aryee, Martin J.; Jaffe, Andrew E.; Corrada-Bravo, Hector; Ladd-Acosta, Christine; Feinberg, Andrew P.; Hansen, Kasper D.; Irizarry, Rafael A.
2014-01-01
Motivation: The recently released Infinium HumanMethylation450 array (the ‘450k’ array) provides a high-throughput assay to quantify DNA methylation (DNAm) at ∼450 000 loci across a range of genomic features. Although less comprehensive than high-throughput sequencing-based techniques, this product is more cost-effective and promises to be the most widely used DNAm high-throughput measurement technology over the next several years. Results: Here we describe a suite of computational tools that incorporate state-of-the-art statistical techniques for the analysis of DNAm data. The software is structured to easily adapt to future versions of the technology. We include methods for preprocessing, quality assessment and detection of differentially methylated regions from the kilobase to the megabase scale. We show how our software provides a powerful and flexible development platform for future methods. We also illustrate how our methods empower the technology to make discoveries previously thought to be possible only with sequencing-based methods. Availability and implementation: http://bioconductor.org/packages/release/bioc/html/minfi.html. Contact: khansen@jhsph.edu; rafa@jimmy.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24478339
Hung, Ling-Hong; Kristiyanto, Daniel; Lee, Sung Bong; Yeung, Ka Yee
2016-01-01
Reproducibility is vital in science. For complex computational methods, it is often necessary, not just to recreate the code, but also the software and hardware environment to reproduce results. Virtual machines, and container software such as Docker, make it possible to reproduce the exact environment regardless of the underlying hardware and operating system. However, workflows that use Graphical User Interfaces (GUIs) remain difficult to replicate on different host systems as there is no high level graphical software layer common to all platforms. GUIdock allows for the facile distribution of a systems biology application along with its graphics environment. Complex graphics based workflows, ubiquitous in systems biology, can now be easily exported and reproduced on many different platforms. GUIdock uses Docker, an open source project that provides a container with only the absolutely necessary software dependencies and configures a common X Windows (X11) graphic interface on Linux, Macintosh and Windows platforms. As proof of concept, we present a Docker package that contains a Bioconductor application written in R and C++ called networkBMA for gene network inference. Our package also includes Cytoscape, a java-based platform with a graphical user interface for visualizing and analyzing gene networks, and the CyNetworkBMA app, a Cytoscape app that allows the use of networkBMA via the user-friendly Cytoscape interface. PMID:27045593
Ostrovnaya, Irina; Seshan, Venkatraman E; Olshen, Adam B; Begg, Colin B
2011-06-15
If a cancer patient develops multiple tumors, it is sometimes impossible to determine whether these tumors are independent or clonal based solely on pathological characteristics. Investigators have studied how to address this diagnostic challenge by comparing the presence of loss of heterozygosity (LOH) at selected genetic locations of tumor samples, or by comparing genomewide copy number array profiles. We have previously developed statistical methodology to compare such genomic profiles for evidence of clonality. We assembled the software for these tests in a new R package called 'Clonality'. For LOH profiles, the package contains significance tests. The analysis of copy number profiles includes a likelihood ratio statistic and reference distribution, as well as an option to produce various plots that summarize the results. The package is available from Bioconductor (http://bioconductor.org/packages/release/bioc/html/Clonality.html) and http://www.mskcc.org/mskcc/html/13287.cfm.
Bioconductor | Informatics Technology for Cancer Research (ITCR)
Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. R/Bioconductor will be enhanced to meet the increasing complexity of multiassay cancer genomics experiments.
BioconductorBuntu: a Linux distribution that implements a web-based DNA microarray analysis server.
Geeleher, Paul; Morris, Dermot; Hinde, John P; Golden, Aaron
2009-06-01
BioconductorBuntu is a custom distribution of Ubuntu Linux that automatically installs a server-side microarray processing environment, providing a user-friendly web-based GUI to many of the tools developed by the Bioconductor Project, accessible locally or across a network. The system is installed by booting from a CD image or by using a Debian package provided to upgrade an existing Ubuntu installation. The current version supports several microarray analysis pipelines, including oligonucleotide and dual- or single-dye experiments, with post-processing by Gene Set Enrichment Analysis. BioconductorBuntu is designed to be extensible, by server-side integration of further relevant Bioconductor modules as required, facilitated by its straightforward underlying Python-based infrastructure. BioconductorBuntu offers an ideal environment for the development of processing procedures to facilitate the analysis of next-generation sequencing datasets. BioconductorBuntu is available for download under a Creative Commons license, along with additional documentation and a tutorial, from http://bioinf.nuigalway.ie.
Gautier, Laurent
2010-12-21
Computer languages can be domain-related, and in the case of multidisciplinary projects, knowledge of several languages will be needed in order to quickly implement ideas. Moreover, each computer language has relative strong points, making some languages better suited than others for a given task. The Bioconductor project, based on the R language, has become a reference for the numerical processing and statistical analysis of data coming from high-throughput biological assays, providing a rich selection of methods and algorithms to the research community. At the same time, Python has matured as a rich and reliable language for the agile development of prototypes or final implementations, as well as for handling large data sets. The data structures and functions from Bioconductor can be exposed to Python as a regular library. This allows a fully transparent and native use of Bioconductor from Python, without one having to know the R language and with only a small community of translators required to know both. To demonstrate this, we have implemented such Python representations for key infrastructure packages in Bioconductor, letting a Python programmer handle annotation data, microarray data, and next-generation sequencing data. Bioconductor is no longer reserved solely for R users. Building a Python application using Bioconductor functionality can be done just as if Bioconductor were a Python package. Moreover, similar principles can be applied to other languages and libraries. Our Python package is available at: http://pypi.python.org/pypi/rpy2-bioconductor-extensions/.
2010-01-01
Background: Computer languages can be domain-related, and in the case of multidisciplinary projects, knowledge of several languages will be needed in order to quickly implement ideas. Moreover, each computer language has relative strong points, making some languages better suited than others for a given task. The Bioconductor project, based on the R language, has become a reference for the numerical processing and statistical analysis of data coming from high-throughput biological assays, providing a rich selection of methods and algorithms to the research community. At the same time, Python has matured as a rich and reliable language for the agile development of prototypes or final implementations, as well as for handling large data sets. Results: The data structures and functions from Bioconductor can be exposed to Python as a regular library. This allows a fully transparent and native use of Bioconductor from Python, without one having to know the R language and with only a small community of translators required to know both. To demonstrate this, we have implemented such Python representations for key infrastructure packages in Bioconductor, letting a Python programmer handle annotation data, microarray data, and next-generation sequencing data. Conclusions: Bioconductor is no longer reserved solely for R users. Building a Python application using Bioconductor functionality can be done just as if Bioconductor were a Python package. Moreover, similar principles can be applied to other languages and libraries. Our Python package is available at: http://pypi.python.org/pypi/rpy2-bioconductor-extensions/ PMID:21210978
Polyester: simulating RNA-seq datasets with differential transcript expression.
Frazee, Alyssa C; Jaffe, Andrew E; Langmead, Ben; Leek, Jeffrey T
2015-09-01
Statistical methods development for differential expression analysis of RNA sequencing (RNA-seq) requires software tools to assess accuracy and error rate control. Since true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be utilized, either by generating costly spike-in experiments or by simulating RNA-seq data. Polyester is an R package designed to simulate RNA-seq data, beginning with an experimental design and ending with collections of RNA-seq reads. Its main advantage is the ability to simulate reads indicating isoform-level differential expression across biological replicates for a variety of experimental designs. Data generated by Polyester is a reasonable approximation to real RNA-seq data and standard differential expression workflows can recover differential expression set in the simulation by the user. Polyester is freely available from Bioconductor (http://bioconductor.org/). Contact: jtleek@gmail.com. Supplementary data are available at Bioinformatics online.
MeV+R: using MeV as a graphical user interface for Bioconductor applications in microarray analysis
Chu, Vu T; Gottardo, Raphael; Raftery, Adrian E; Bumgarner, Roger E; Yeung, Ka Yee
2008-01-01
We present MeV+R, an integration of the JAVA MultiExperiment Viewer program with Bioconductor packages. This integration of MultiExperiment Viewer and R is easily extensible to other R packages and provides users with point and click access to traditionally command line driven tools written in R. We demonstrate the ability to use MultiExperiment Viewer as a graphical user interface for Bioconductor applications in microarray data analysis by incorporating three Bioconductor packages, RAMA, BRIDGE and iterativeBMA. PMID:18652698
Chen, Yunshun; Lun, Aaron T L; Smyth, Gordon K
2016-01-01
In recent years, RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification are conducted using the Rsubread package, and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR.
Sanges, Remo; Cordero, Francesca; Calogero, Raffaele A
2007-12-15
oneChannelGUI is an add-on Bioconductor package providing a new set of functions extending the capability of the affylmGUI package. This library provides a graphical user interface (GUI) for Bioconductor libraries to be used for quality control, normalization, filtering, statistical validation and data mining for single channel microarrays. Affymetrix 3' expression (IVT) arrays as well as the new whole transcript expression arrays, i.e. gene/exon 1.0 ST, are currently implemented. oneChannelGUI is available for most platforms on which R runs, i.e. Windows and Unix-like machines. http://www.bioconductor.org/packages/2.0/bioc/html/oneChannelGUI.html
Li, Ruidong; Qu, Han; Wang, Shibo; Wei, Julong; Zhang, Le; Ma, Renyuan; Lu, Jianming; Zhu, Jianguo; Zhong, Wei-De; Jia, Zhenyu
2018-03-02
The large-scale multidimensional omics data in the Genomic Data Commons (GDC) provide opportunities to investigate the crosstalk among different RNA species and their regulatory mechanisms in cancers. Easy-to-use bioinformatics pipelines are needed to facilitate such studies. We have developed a user-friendly R/Bioconductor package, named GDCRNATools, for downloading, organizing, and analyzing RNA data in GDC with an emphasis on deciphering the lncRNA-mRNA related competing endogenous RNAs (ceRNAs) regulatory network in cancers. Many widely used bioinformatics tools and databases are utilized in our package. Users can easily pack preferred downstream analysis pipelines or integrate their own pipelines into the workflow. Interactive shiny web apps built in GDCRNATools greatly improve visualization of results from the analysis. GDCRNATools is an R/Bioconductor package that is freely available at Bioconductor (http://bioconductor.org/packages/devel/bioc/html/GDCRNATools.html). Detailed instructions, manual and example code are also available on GitHub (https://github.com/Jialab-UCR/GDCRNATools). Contact: arthur.jia@ucr.edu or zhongwd2009@live.cn or doctorzhujianguo@163.com.
Morgan, Martin; Anders, Simon; Lawrence, Michael; Aboyoun, Patrick; Pagès, Hervé; Gentleman, Robert
2009-01-01
Summary: ShortRead is a package for input, quality assessment, manipulation and output of high-throughput sequencing data. ShortRead is provided in the R and Bioconductor environments, allowing ready access to additional facilities for advanced statistical analysis, data transformation, visualization and integration with diverse genomic resources. Availability and Implementation: This package is implemented in R and available at the Bioconductor web site; the package contains a ‘vignette’ outlining typical work flows. Contact: mtmorgan@fhcrc.org PMID:19654119
flowAI: automatic and interactive anomaly discerning tools for flow cytometry data.
Monaco, Gianni; Chen, Hao; Poidinger, Michael; Chen, Jinmiao; de Magalhães, João Pedro; Larbi, Anis
2016-08-15
Flow cytometry (FCM) is widely used in both clinical and basic research to characterize cell phenotypes and functions. The latest FCM instruments analyze up to 20 markers of individual cells, producing high-dimensional data. This requires the use of the latest clustering and dimensionality reduction techniques to automatically segregate cell sub-populations in an unbiased manner. However, automated analyses may lead to false discoveries due to inter-sample differences in quality and properties. We present an R package, flowAI, containing two methods to clean FCM files from unwanted events: (i) an automatic method that adopts algorithms for the detection of anomalies and (ii) an interactive method with a graphical user interface implemented into an R shiny application. The general approach behind the two methods consists of three key steps to check and remove suspected anomalies that derive from (i) abrupt changes in the flow rate, (ii) instability of signal acquisition and (iii) outliers in the lower limit and margin events in the upper limit of the dynamic range. For each file analyzed our software generates a summary of the quality assessment from the aforementioned steps. The software presented is an intuitive solution seeking to improve the results not only of manual but also and in particular of automatic analysis on FCM data. R source code available through Bioconductor: http://bioconductor.org/packages/flowAI/ Contact: mongianni1@gmail.com or Anis_Larbi@immunol.a-star.edu.sg. Supplementary data are available at Bioinformatics online.
destiny: diffusion maps for large-scale single-cell data in R.
Angerer, Philipp; Haghverdi, Laleh; Büttner, Maren; Theis, Fabian J; Marr, Carsten; Buettner, Florian
2016-04-15
Diffusion maps are a spectral method for non-linear dimension reduction and have recently been adapted for the visualization of single-cell expression data. Here we present destiny, an efficient R implementation of the diffusion map algorithm. Our package includes a single-cell specific noise model allowing for missing and censored values. In contrast to previous implementations, we further present an efficient nearest-neighbour approximation that allows for the processing of hundreds of thousands of cells and a functionality for projecting new data on existing diffusion maps. As an example, we apply destiny to a recent time-resolved mass cytometry dataset of cellular reprogramming. destiny is an open-source R/Bioconductor package (bioconductor.org/packages/destiny), also available at www.helmholtz-muenchen.de/icb/destiny. A detailed vignette describing functions and workflows is provided with the package. Contact: carsten.marr@helmholtz-muenchen.de or f.buettner@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online.
HiTC: exploration of high-throughput ‘C’ experiments
Servant, Nicolas; Lajoie, Bryan R.; Nora, Elphège P.; Giorgetti, Luca; Chen, Chong-Jian; Heard, Edith; Dekker, Job; Barillot, Emmanuel
2012-01-01
Summary: The R/Bioconductor package HiTC facilitates the exploration of high-throughput 3C-based data. It allows users to import and export ‘C’ data, to transform, normalize, annotate and visualize interaction maps. The package operates within the Bioconductor framework and thus offers new opportunities for future development in this field. Availability and implementation: The R package HiTC is available from the Bioconductor website. A detailed vignette provides additional documentation and help for using the package. Contact: nicolas.servant@curie.fr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22923296
Egea, Jose A; Henriques, David; Cokelaer, Thomas; Villaverde, Alejandro F; MacNamara, Aidan; Danciu, Diana-Patricia; Banga, Julio R; Saez-Rodriguez, Julio
2014-05-10
Optimization is the key to solving many problems in computational biology. Global optimization methods, which provide a robust methodology, and metaheuristics in particular have proven to be the most efficient methods for many applications. Despite their utility, there is a limited availability of metaheuristic tools. We present MEIGO, an R and Matlab optimization toolbox (also available in Python via a wrapper of the R version), that implements metaheuristics capable of solving diverse problems arising in systems biology and bioinformatics. The toolbox includes the enhanced scatter search method (eSS) for continuous nonlinear programming (cNLP) and mixed-integer programming (MINLP) problems, and variable neighborhood search (VNS) for Integer Programming (IP) problems. Additionally, the R version includes BayesFit for parameter estimation by Bayesian inference. The eSS and VNS methods can be run on a single-thread or in parallel using a cooperative strategy. The code is supplied under GPLv3 and is available at http://www.iim.csic.es/~gingproc/meigo.html. Documentation and examples are included. The R package has been submitted to BioConductor. We evaluate MEIGO against optimization benchmarks, and illustrate its applicability to a series of case studies in bioinformatics and systems biology where it outperforms other state-of-the-art methods. MEIGO provides a free, open-source platform for optimization that can be applied to multiple domains of systems biology and bioinformatics. It includes efficient state of the art metaheuristics, and its open and modular structure allows the addition of further methods.
2014-01-01
Background Optimization is the key to solving many problems in computational biology. Global optimization methods, which provide a robust methodology, and metaheuristics in particular have proven to be the most efficient methods for many applications. Despite their utility, there is a limited availability of metaheuristic tools. Results We present MEIGO, an R and Matlab optimization toolbox (also available in Python via a wrapper of the R version), that implements metaheuristics capable of solving diverse problems arising in systems biology and bioinformatics. The toolbox includes the enhanced scatter search method (eSS) for continuous nonlinear programming (cNLP) and mixed-integer programming (MINLP) problems, and variable neighborhood search (VNS) for Integer Programming (IP) problems. Additionally, the R version includes BayesFit for parameter estimation by Bayesian inference. The eSS and VNS methods can be run on a single-thread or in parallel using a cooperative strategy. The code is supplied under GPLv3 and is available at http://www.iim.csic.es/~gingproc/meigo.html. Documentation and examples are included. The R package has been submitted to BioConductor. We evaluate MEIGO against optimization benchmarks, and illustrate its applicability to a series of case studies in bioinformatics and systems biology where it outperforms other state-of-the-art methods. Conclusions MEIGO provides a free, open-source platform for optimization that can be applied to multiple domains of systems biology and bioinformatics. It includes efficient state of the art metaheuristics, and its open and modular structure allows the addition of further methods. PMID:24885957
MWASTools: an R/bioconductor package for metabolome-wide association studies.
Rodriguez-Martinez, Andrea; Posma, Joram M; Ayala, Rafael; Neves, Ana L; Anwar, Maryam; Petretto, Enrico; Emanueli, Costanza; Gauguier, Dominique; Nicholson, Jeremy K; Dumas, Marc-Emmanuel
2018-03-01
MWASTools is an R package designed to provide an integrated pipeline to analyse metabonomic data in large-scale epidemiological studies. Key functionalities of our package include: quality control analysis; metabolome-wide association analysis using various models (partial correlations, generalized linear models); visualization of statistical outcomes; metabolite assignment using statistical total correlation spectroscopy (STOCSY); and biological interpretation of metabolome-wide association studies results. The MWASTools package is implemented in R (version >= 3.4) and is available from Bioconductor: https://bioconductor.org/packages/MWASTools/. Contact: m.dumas@imperial.ac.uk. Supplementary data are available at Bioinformatics online.
msgbsR: An R package for analysing methylation-sensitive restriction enzyme sequencing data.
Mayne, Benjamin T; Leemaqz, Shalem Y; Buckberry, Sam; Rodriguez Lopez, Carlos M; Roberts, Claire T; Bianco-Miotto, Tina; Breen, James
2018-02-01
Genotyping-by-sequencing (GBS) or restriction-site associated DNA marker sequencing (RAD-seq) is a practical and cost-effective method for analysing large genomes from high diversity species. This method of sequencing, coupled with methylation-sensitive enzymes (often referred to as methylation-sensitive restriction enzyme sequencing or MRE-seq), is an effective tool to study DNA methylation in parts of the genome that are inaccessible in other sequencing techniques or are not annotated in microarray technologies. Current software tools do not fulfil all methylation-sensitive restriction sequencing assays for determining differences in DNA methylation between samples. To fill this computational need, we present msgbsR, an R package that contains tools for the analysis of methylation-sensitive restriction enzyme sequencing experiments. msgbsR can be used to identify and quantify read counts at methylated sites directly from alignment files (BAM files) and enables verification of restriction enzyme cut sites with the correct recognition sequence of the individual enzyme. In addition, msgbsR assesses DNA methylation based on read coverage, similar to RNA sequencing experiments, rather than methylation proportion and is a useful tool in analysing differential methylation on large populations. The package is fully documented and available freely online as a Bioconductor package (https://bioconductor.org/packages/release/bioc/html/msgbsR.html).
Cyrface: An interface from Cytoscape to R that provides a user interface to R packages.
Gonçalves, Emanuel; Mirlach, Franz; Saez-Rodriguez, Julio
2013-01-01
There is an increasing number of software packages to analyse biological experimental data in the R environment. In particular, Bioconductor, a repository of curated R packages, is one of the most comprehensive resources for bioinformatics and biostatistics. The use of these packages is increasing, but it requires a basic understanding of the R language, as well as the syntax of the specific package used. The availability of graphical user interfaces for these packages would flatten the learning curve and broaden their application. Here, we present a Cytoscape app termed Cyrface that allows Cytoscape apps to connect to any function and package developed in R. Cyrface can be used to run R packages from within the Cytoscape environment making use of a graphical user interface. Moreover, it can link R packages with the capabilities of Cytoscape and its apps, in particular network visualization and analysis. Cyrface's utility has been demonstrated for two Bioconductor packages (CellNOptR and DrugVsDisease), and here we further illustrate its usage by implementing a workflow of data analysis and visualization. Download links, installation instructions and user guides can be accessed from the Cyrface homepage (http://www.ebi.ac.uk/saezrodriguez/cyrface/) and from the Cytoscape app store (http://apps.cytoscape.org/apps/cyrface).
Menu-driven cloud computing and resource sharing for R and Bioconductor.
Bolouri, Hamid; Dulepet, Rajiv; Angerman, Michael
2011-08-15
We report CRdata.org, a cloud-based, free, open-source web server for running analyses and sharing data and R scripts with others. In addition to using the free, public service, CRdata users can launch their own private Amazon Elastic Computing Cloud (EC2) nodes and store private data and scripts on Amazon's Simple Storage Service (S3) with user-controlled access rights. All CRdata services are provided via point-and-click menus. CRdata is open-source and free under the permissive MIT License (opensource.org/licenses/mit-license.php). The source code is in Ruby (ruby-lang.org/en/) and available at: github.com/seerdata/crdata. Contact: hbolouri@fhcrc.org.
DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data
Glez-Peña, Daniel; Álvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino
2009-01-01
Background: Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. Results: DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. Conclusion: DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released. PMID:19178723
DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data.
Glez-Peña, Daniel; Alvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino
2009-01-29
Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released.
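The idea of assigning linguistic labels to expression levels through fuzzy membership functions can be illustrated with a small sketch. This is not the DFP package's implementation: the piecewise-linear membership shapes, the three label names, the breakpoints `lo`, `mid`, `hi`, and the function names `fuzzy_memberships` and `linguistic_label` are all assumptions made for illustration.

```python
def fuzzy_memberships(x, lo, mid, hi):
    """Return membership degrees of expression value x in three fuzzy sets.

    Assumed shapes (illustrative only, not taken from DFP):
      Low    is 1 below `lo` and falls linearly to 0 at `mid`;
      High   is 0 below `mid` and rises linearly to 1 at `hi`;
      Medium takes the remainder, so the three degrees sum to 1.
    """
    if x <= lo:
        low = 1.0
    elif x < mid:
        low = (mid - x) / (mid - lo)
    else:
        low = 0.0

    if x >= hi:
        high = 1.0
    elif x > mid:
        high = (x - mid) / (hi - mid)
    else:
        high = 0.0

    medium = max(0.0, 1.0 - low - high)
    return {"Low": low, "Medium": medium, "High": high}


def linguistic_label(x, lo, mid, hi):
    """Pick the label with the largest membership degree.

    Ties are resolved in the order Low, Medium, High.
    """
    degrees = fuzzy_memberships(x, lo, mid, hi)
    return max(degrees, key=degrees.get)


if __name__ == "__main__":
    # Hypothetical log-scale expression values for one gene across samples.
    for value in (2.0, 6.0, 8.0, 11.0):
        print(value, fuzzy_memberships(value, 4.0, 8.0, 12.0),
              linguistic_label(value, 4.0, 8.0, 12.0))
```

A discretized profile of this kind (one label per gene per sample) is the sort of input from which per-class patterns could then be derived; how DFP itself builds and intersects fuzzy patterns is described in the package documentation rather than here.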
Del Carratore, Francesco; Jankevics, Andris; Eisinga, Rob; Heskes, Tom; Hong, Fangxin; Breitling, Rainer
2017-09-01
The Rank Product (RP) is a statistical technique widely used to detect differentially expressed features in molecular profiling experiments such as transcriptomics, metabolomics and proteomics studies. An implementation of the RP and the closely related Rank Sum (RS) statistics has been available in the RankProd Bioconductor package for several years. However, several recent advances in the understanding of the statistical foundations of the method have made a complete refactoring of the existing package desirable. We implemented a completely refactored version of the RankProd package, which provides a more principled implementation of the statistics for unpaired datasets. Moreover, the permutation-based P-value estimation methods have been replaced by exact methods, providing faster and more accurate results. RankProd 2.0 is available at Bioconductor (https://www.bioconductor.org/packages/devel/bioc/html/RankProd.html) and as part of the mzMatch pipeline (http://www.mzmatch.sourceforge.net). Contact: rainer.breitling@manchester.ac.uk. Supplementary data are available at Bioinformatics online.
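As a rough illustration of the statistic itself, the Rank Product of a feature is the geometric mean of its ranks across replicates. The sketch below computes only that statistic; it is not the RankProd package's code, it does not reproduce the exact P-value methods mentioned above, and the function names and toy data are assumptions.

```python
import math


def ranks_descending(values):
    """Rank values so that the largest gets rank 1.

    Ties are broken by original position (an assumption for simplicity).
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def rank_product(replicates):
    """Compute the Rank Product for each feature.

    `replicates` is a list of replicates, each a list holding one score
    per feature (for example a log fold change). The result is the
    geometric mean of each feature's ranks over all replicates; small
    values indicate features that rank consistently near the top.
    """
    k = len(replicates)
    n_features = len(replicates[0])
    per_replicate_ranks = [ranks_descending(rep) for rep in replicates]
    result = []
    for g in range(n_features):
        # Sum logs instead of multiplying ranks to avoid overflow.
        log_sum = sum(math.log(r[g]) for r in per_replicate_ranks)
        result.append(math.exp(log_sum / k))
    return result


if __name__ == "__main__":
    # Hypothetical log fold changes: 2 replicates, 3 features.
    toy = [[3.0, 1.0, 2.0],
           [1.0, 3.0, 2.0]]
    print(rank_product(toy))
```

Working in log space keeps the computation stable when there are many replicates or many features, since the raw product of ranks grows quickly.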
AOP: An R Package For Sufficient Causal Analysis in Pathway ...
Summary: How can I quickly find the key events in a pathway that I need to monitor to predict that a beneficial or adverse outcome will occur? This is a key question when using signaling pathways for drug/chemical screening in pharmacology, toxicology and risk assessment. By identifying these sufficient causal key events, we have fewer events to monitor for a pathway, thereby decreasing assay costs and time, while maximizing the value of the information. I have developed the “aop” package which uses backdoor analysis of causal networks to identify these minimal sets of key events that are sufficient for making causal predictions. Availability and Implementation: The source and binary are available online through the Bioconductor project (http://www.bioconductor.org/) as an R package titled “aop”. The R/Bioconductor package runs within the R statistical environment. The package has functions that can take pathways (as directed graphs) formatted as a Cytoscape JSON file as input, or pathways can be represented as directed graphs using the R/Bioconductor “graph” package. The “aop” package has functions that can perform backdoor analysis to identify the minimal set of key events for making causal predictions. Contact: burgoon.lyle@epa.gov This paper describes an R/Bioconductor package that was developed to facilitate the identification of key events within an AOP that are the minimal set of sufficient key events that need to be tested/monit
2012-01-01
Visualization and analysis of molecular networks are both central to systems biology. However, there still exists a large technological gap between them, especially when assessing multiple network levels or hierarchies. Here we present RedeR, an R/Bioconductor package combined with a Java core engine for representing modular networks. The functionality of RedeR is demonstrated in two different scenarios: hierarchical and modular organization in gene co-expression networks and nested structures in time-course gene expression subnetworks. Our results demonstrate RedeR as a new framework to deal with the multiple network levels that are inherent to complex biological systems. RedeR is available from http://bioconductor.org/packages/release/bioc/html/RedeR.html. PMID:22531049
Menu-driven cloud computing and resource sharing for R and Bioconductor
Bolouri, Hamid; Angerman, Michael
2011-01-01
Summary: We report CRdata.org, a cloud-based, free, open-source web server for running analyses and sharing data and R scripts with others. In addition to using the free, public service, CRdata users can launch their own private Amazon Elastic Computing Cloud (EC2) nodes and store private data and scripts on Amazon's Simple Storage Service (S3) with user-controlled access rights. All CRdata services are provided via point-and-click menus. Availability and Implementation: CRdata is open-source and free under the permissive MIT License (opensource.org/licenses/mit-license.php). The source code is in Ruby (ruby-lang.org/en/) and available at: github.com/seerdata/crdata. Contact: hbolouri@fhcrc.org PMID:21685055
Huntley, Melanie A; Larson, Jessica L; Chaivorapol, Christina; Becker, Gabriel; Lawrence, Michael; Hackney, Jason A; Kaminker, Joshua S
2013-12-15
It is common for computational analyses to generate large amounts of complex data that are difficult to process and share with collaborators. Standard methods are needed to transform such data into a more useful and intuitive format. We present ReportingTools, a Bioconductor package, that automatically recognizes and transforms the output of many common Bioconductor packages into rich, interactive, HTML-based reports. Reports are not generic, but have been individually designed to reflect content specific to the result type detected. Tabular output included in reports is sortable, filterable and searchable and contains context-relevant hyperlinks to external databases. Additionally, in-line graphics have been developed for specific analysis types and are embedded by default within table rows, providing a useful visual summary of underlying raw data. ReportingTools is highly flexible and reports can be easily customized for specific applications using the well-defined API. The ReportingTools package is implemented in R and available from Bioconductor (version ≥ 2.11) at the URL: http://bioconductor.org/packages/release/bioc/html/ReportingTools.html. Installation instructions and usage documentation can also be found at the above URL.
2013-01-01
Background: Surrogate variable analysis (SVA) is a powerful method to identify, estimate, and utilize the components of gene expression heterogeneity due to unknown and/or unmeasured technical, genetic, environmental, or demographic factors. These sources of heterogeneity are common in gene expression studies, and failing to incorporate them into the analysis can obscure results. Using SVA increases the biological accuracy and reproducibility of gene expression studies by identifying these sources of heterogeneity and correctly accounting for them in the analysis. Results: Here we have developed a web application called SVAw (Surrogate variable analysis Web app) that provides a user-friendly interface for SVA analyses of genome-wide expression studies. The software has been developed based on the open-source Bioconductor SVA package. In our software, we have extended the SVA program functionality in three aspects: (i) SVAw performs a fully automated and user-friendly analysis workflow; (ii) it calculates probe/gene statistics for both pre- and post-SVA analysis and provides a table of results for the regression of gene expression on the primary variable of interest before and after correcting for surrogate variables; and (iii) it generates a comprehensive report file, including graphical comparison of the outcome for the user. Conclusions: SVAw is a freely accessible web server solution for the surrogate variable analysis of high-throughput datasets and facilitates removing unwanted and unknown sources of variation. It is freely available for use at http://psychiatry.igm.jhmi.edu/sva. The executable packages for both web and standalone application and the instructions for installation can be downloaded from our web site. PMID:23497726
Pathview: an R/Bioconductor package for pathway-based data integration and visualization.
Luo, Weijun; Brouwer, Cory
2013-07-15
Pathview is a novel tool set for pathway-based data integration and visualization. It maps and renders user data on relevant pathway graphs. Users only need to supply their data and specify the target pathway. Pathview automatically downloads the pathway graph data, parses the data file, maps and integrates user data onto the pathway and renders pathway graphs with the mapped data. Although built as a stand-alone program, Pathview may seamlessly integrate with pathway and functional analysis tools for large-scale and fully automated analysis pipelines. The package is freely available under the GPLv3 license through Bioconductor and R-Forge. It is available at http://bioconductor.org/packages/release/bioc/html/pathview.html and at http://Pathview.r-forge.r-project.org/. luo_weijun@yahoo.com Supplementary data are available at Bioinformatics online.
Model-based gene set analysis for Bioconductor.
Bauer, Sebastian; Robinson, Peter N; Gagneur, Julien
2011-07-01
Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA), that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach. The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0. peter.robinson@charite.de; julien.gagneur@embl.de.
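The single-category enrichment test that this abstract contrasts with MGSA can be illustrated with a short sketch. The Python code below is not the mgsa package or its API; it only computes the one-sided Fisher's exact (hypergeometric) p-value for one category, and the gene identifiers and set sizes are invented for illustration.

```python
from math import comb


def enrichment_p_value(population, category, study):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= k), where X is the number of category genes drawn when
    len(study) genes are sampled without replacement from the population.
    """
    N = len(population)                   # all genes considered
    K = len(category & population)        # genes annotated to the category
    n = len(study & population)           # genes in the study (hit) set
    k = len(study & category & population)  # observed overlap
    total = comb(N, n)
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical toy data: 20 genes, a category of 5, a study set of 4.
population = {f"g{i}" for i in range(20)}
category = {"g0", "g1", "g2", "g3", "g4"}
study = {"g0", "g1", "g2", "g10"}

p = enrichment_p_value(population, category, study)
print(f"overlap p-value: {p:.4f}")
```

Because each category is tested in isolation, a parent term and its child term that share most of their genes would both receive small p-values, which is the redundancy the abstract refers to.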
iGC-an integrated analysis package of gene expression and copy number alteration.
Lai, Yi-Pin; Wang, Liang-Bo; Wang, Wei-An; Lai, Liang-Chuan; Tsai, Mong-Hsun; Lu, Tzu-Pin; Chuang, Eric Y
2017-01-14
With the advancement in high-throughput technologies, researchers can simultaneously investigate gene expression and copy number alteration (CNA) data from individual patients at a lower cost. Traditional analysis methods analyze each type of data individually and integrate their results using Venn diagrams. Challenges arise, however, when the results are irreproducible and inconsistent across multiple platforms. To address these issues, one possible approach is to concurrently analyze both gene expression profiling and CNAs in the same individual. We have developed an open-source R/Bioconductor package (iGC). Multiple input formats are supported and users can define their own criteria for identifying differentially expressed genes driven by CNAs. The analysis of two real microarray datasets demonstrated that the CNA-driven genes identified by the iGC package showed significantly higher Pearson correlation coefficients with their gene expression levels and copy numbers than those genes located in a genomic region with CNA. Compared with the Venn diagram approach, the iGC package showed better performance. The iGC package is effective and useful for identifying CNA-driven genes. By simultaneously considering both comparative genomic and transcriptomic data, it can provide better understanding of biological and medical questions. The iGC package's source code and manual are freely available at https://www.bioconductor.org/packages/release/bioc/html/iGC.html .
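As a rough illustration of the per-gene statistic mentioned in this abstract, the following Python sketch computes a Pearson correlation coefficient between copy number and expression across samples. It is not the iGC implementation; the sample values and the variable names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for constant input
    return sxy / sqrt(sxx * syy)


# Hypothetical values for one gene across five patients.
copy_number_log_ratio = [-0.8, -0.1, 0.0, 0.6, 1.1]
expression_level = [4.9, 6.2, 6.0, 7.4, 8.3]

r = pearson(copy_number_log_ratio, expression_level)
print(f"copy number vs expression: r = {r:.3f}")
```

A gene whose expression tracks its copy number across patients yields a coefficient close to 1; how large the coefficient must be to call a gene CNA-driven is a user-defined criterion and is not fixed by this sketch.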
User-friendly solutions for microarray quality control and pre-processing on ArrayAnalysis.org
Eijssen, Lars M. T.; Jaillard, Magali; Adriaens, Michiel E.; Gaj, Stan; de Groot, Philip J.; Müller, Michael; Evelo, Chris T.
2013-01-01
Quality control (QC) is crucial for any scientific method producing data. Applying adequate QC introduces new challenges in the genomics field where large amounts of data are produced with complex technologies. For DNA microarrays, specific algorithms for QC and pre-processing including normalization have been developed by the scientific community, especially for expression chips of the Affymetrix platform. Many of these have been implemented in the statistical scripting language R and are available from the Bioconductor repository. However, application is hampered by lack of integrative tools that can be used by users of any experience level. To fill this gap, we developed a freely available tool for QC and pre-processing of Affymetrix gene expression results, extending, integrating and harmonizing functionality of Bioconductor packages. The tool can be easily accessed through a wizard-like web portal at http://www.arrayanalysis.org or downloaded for local use in R. The portal provides extensive documentation, including user guides, interpretation help with real output illustrations and detailed technical documentation. It assists newcomers to the field in performing state-of-the-art QC and pre-processing while offering data analysts an integral open-source package. Providing the scientific community with this easily accessible tool will allow improving data quality and reuse and adoption of standards. PMID:23620278
FunChIP: an R/Bioconductor package for functional classification of ChIP-seq shapes.
Parodi, Alice C L; Sangalli, Laura M; Vantini, Simone; Amati, Bruno; Secchi, Piercesare; Morelli, Marco J
2017-08-15
Chromatin Immunoprecipitation followed by sequencing (ChIP-seq) generates local accumulations of sequencing reads on the genome ("peaks"), which correspond to specific protein-DNA interactions or chromatin modifications. Peaks are detected by considering their total area above a background signal, usually neglecting their shapes, which instead may convey additional biological information. We present FunChIP, an R/Bioconductor package for clustering peaks according to a functional representation of their shapes: after approximating their profiles with cubic B-splines, FunChIP minimizes their functional distance and classifies the peaks applying a k-mean alignment and clustering algorithm. The whole pipeline is user-friendly and provides visualization functions for a quick inspection of the results. An application to the transcription factor Myc in 3T9 murine fibroblasts shows that clusters of peaks with different shapes are associated with different genomic locations and different transcriptional regulatory activity. The package is implemented in R and is available under Artistic Licence 2.0 from the Bioconductor website (http://bioconductor.org/packages/FunChIP). marco.morelli@iit.it. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Lareau, Caleb A; Aryee, Martin J; Berger, Bonnie
2018-02-15
The 3D architecture of DNA within the nucleus is a key determinant of interactions between genes, regulatory elements, and transcriptional machinery. As a result, differences in DNA looping structure are associated with variation in gene expression and cell state. To systematically assess changes in DNA looping architecture between samples, we introduce diffloop, an R/Bioconductor package that provides a suite of functions for the quality control, statistical testing, annotation, and visualization of DNA loops. We demonstrate this functionality by detecting differences between ENCODE ChIA-PET samples and relate looping to variability in epigenetic state. Diffloop is implemented as an R/Bioconductor package available at https://bioconductor.org/packages/release/bioc/html/diffloop.html. aryee.martin@mgh.harvard.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.
recount workflow: Accessing over 70,000 human RNA-seq samples with Bioconductor
Collado-Torres, Leonardo; Nellore, Abhinav; Jaffe, Andrew E.
2017-01-01
The recount2 resource is composed of over 70,000 uniformly processed human RNA-seq samples spanning TCGA and SRA, including GTEx. The processed data can be accessed via the recount2 website and the recount Bioconductor package. This workflow explains in detail how to use the recount package and how to integrate it with other Bioconductor packages for several analyses that can be carried out with the recount2 resource. In particular, we describe how the coverage count matrices were computed in recount2 as well as different ways of obtaining public metadata, which can facilitate downstream analyses. Step-by-step directions show how to do a gene-level differential expression analysis, visualize base-level genome coverage data, and perform an analysis at multiple feature levels. This workflow thus provides further information to understand the data in recount2 and a compendium of R code to use the data. PMID:29043067
GEOmetadb: powerful alternative search engine for the Gene Expression Omnibus
Zhu, Yuelin; Davis, Sean; Stephens, Robert; Meltzer, Paul S.; Chen, Yidong
2008-01-01
The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records as well as the relationships between them are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb, that uses a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with the full power of SQL-based queries from within R. Availability: The web interface and SQLite databases are available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website. Contact: yidong@mail.nih.gov PMID:18842599
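The idea of querying metadata records and their relationships with SQL can be sketched against a toy SQLite database. The Python example below does not use the GEOmetadb package or its real schema; the table names, column names and accessions are hypothetical stand-ins chosen only to show a join across related records.

```python
import sqlite3

# Build a small in-memory database with a made-up, two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE series (accession TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE sample (
        accession TEXT PRIMARY KEY,
        series_accession TEXT REFERENCES series(accession),
        organism TEXT
    );
    INSERT INTO series VALUES
        ('SERIES_A', 'toy heat-shock time course'),
        ('SERIES_B', 'toy knockout comparison');
    INSERT INTO sample VALUES
        ('SAMPLE_1', 'SERIES_A', 'Homo sapiens'),
        ('SAMPLE_2', 'SERIES_A', 'Homo sapiens'),
        ('SAMPLE_3', 'SERIES_B', 'Homo sapiens'),
        ('SAMPLE_4', 'SERIES_B', 'Mus musculus');
    """
)


def series_for_organism(connection, organism):
    """List series with the number of their samples from one organism."""
    query = """
        SELECT s.accession, s.title, COUNT(*) AS n_samples
        FROM series AS s
        JOIN sample AS m ON m.series_accession = s.accession
        WHERE m.organism = ?
        GROUP BY s.accession, s.title
        ORDER BY n_samples DESC, s.accession
    """
    return connection.execute(query, (organism,)).fetchall()


human_series = series_for_organism(conn, "Homo sapiens")
for accession, title, n_samples in human_series:
    print(accession, title, n_samples)
```

Passing the organism as a bound parameter rather than formatting it into the SQL string keeps the query safe for arbitrary search terms.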
Gel, Bernat; Díez-Villanueva, Anna; Serra, Eduard; Buschbeck, Marcus; Peinado, Miguel A; Malinverni, Roberto
2016-01-15
Statistically assessing the relation between a set of genomic regions and other genomic features is a common challenging task in genomic and epigenomic analyses. Randomization based approaches implicitly take into account the complexity of the genome without the need of assuming an underlying statistical model. regioneR is an R package that implements a permutation test framework specifically designed to work with genomic regions. In addition to the predefined randomization and evaluation strategies, regioneR is fully customizable allowing the use of custom strategies to adapt it to specific questions. Finally, it also implements a novel function to evaluate the local specificity of the detected association. regioneR is an R package released under Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/regioneR). rmalinverni@carrerasresearch.org. © The Author 2015. Published by Oxford University Press.
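A permutation test on genomic regions can be sketched in a few lines. The Python code below is only an illustration of the general idea and not regioneR's API: it assumes a single linear chromosome of hypothetical length, uses uniform random placement as the randomization strategy, and counts overlaps as the evaluation function. All function names and coordinates are invented.

```python
import random


def count_overlaps(regions, features):
    """Number of regions that overlap at least one feature (half-open intervals)."""
    return sum(
        any(start < f_end and f_start < end for f_start, f_end in features)
        for start, end in regions
    )


def randomize(regions, genome_length, rng):
    """Place each region at a uniformly random position, keeping its width."""
    placed = []
    for start, end in regions:
        width = end - start
        new_start = rng.randrange(0, genome_length - width + 1)
        placed.append((new_start, new_start + width))
    return placed


def permutation_test(regions, features, genome_length, n_perm=1000, seed=0):
    """Return the observed overlap count and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = count_overlaps(regions, features)
    null = [
        count_overlaps(randomize(regions, genome_length, rng), features)
        for _ in range(n_perm)
    ]
    # The +1 terms count the observed value as one of the permutations,
    # so the p-value can never be exactly zero.
    p_value = (sum(x >= observed for x in null) + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical coordinates on a chromosome of length 10,000.
features = [(100, 200), (500, 600)]
regions = [(150, 160), (550, 560), (120, 130)]
observed, p_value = permutation_test(regions, features, genome_length=10_000)
print(observed, p_value)
```

Swapping in a different `randomize` or `count_overlaps` function changes the null model or the statistic without touching the test loop, which mirrors the customizable strategies the abstract describes.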
Computational Biology Methods for Characterization of Pluripotent Cells.
Araúzo-Bravo, Marcos J
2016-01-01
Pluripotent cells are a powerful tool for regenerative medicine and drug discovery. Several techniques have been developed to induce pluripotency, or to extract pluripotent cells from different tissues and biological fluids. However, the characterization of pluripotency requires tedious, expensive, time-consuming, and not always reliable wet-lab experiments; thus, an easy, standard quality-control protocol of pluripotency assessment remains to be established. High-throughput techniques can help here, in particular gene expression microarrays, which have become a complementary technique for cellular characterization. Research has shown that transcriptomic comparison with a reference Embryonic Stem Cell (ESC) is a good approach to assess pluripotency. Under the premise that the best protocol is computer software source code, here I propose and explain line by line a software protocol coded in R-Bioconductor for pluripotency assessment based on the comparison of transcriptomics data of pluripotent cells with a reference ESC. I provide advice on experimental design, warnings about possible pitfalls, and guidance for interpreting results.
Cuadros-Inostroza, Alvaro; Caldana, Camila; Redestig, Henning; Kusano, Miyako; Lisec, Jan; Peña-Cortés, Hugo; Willmitzer, Lothar; Hannah, Matthew A
2009-12-16
Metabolite profiling, the simultaneous quantification of multiple metabolites in an experiment, is becoming increasingly popular, particularly with the rise of systems-level biology. The workhorse in this field is gas-chromatography hyphenated with mass spectrometry (GC-MS). The high-throughput of this technology coupled with a demand for large experiments has led to data pre-processing, i.e. the quantification of metabolites across samples, becoming a major bottleneck. Existing software has several limitations, including restricted maximum sample size, systematic errors and low flexibility. However, the biggest limitation is that the resulting data usually require extensive hand-curation, which is subjective and can typically take several days to weeks. We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R. TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by carrying out an analysis against both a set of known chemical standard mixtures and of a biological experiment. In addition we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data.
2009-01-01
Background Metabolite profiling, the simultaneous quantification of multiple metabolites in an experiment, is becoming increasingly popular, particularly with the rise of systems-level biology. The workhorse in this field is gas-chromatography hyphenated with mass spectrometry (GC-MS). The high-throughput of this technology coupled with a demand for large experiments has led to data pre-processing, i.e. the quantification of metabolites across samples, becoming a major bottleneck. Existing software has several limitations, including restricted maximum sample size, systematic errors and low flexibility. However, the biggest limitation is that the resulting data usually require extensive hand-curation, which is subjective and can typically take several days to weeks. Results We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R. Conclusions TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by carrying out an analysis against both a set of known chemical standard mixtures and of a biological experiment. In addition we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data. PMID:20015393
Tra, Yolande V; Evans, Irene M
2010-01-01
BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course.
Evans, Irene M.
2010-01-01
BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course. PMID:20810954
duVerle, David A; Yotsukura, Sohiya; Nomura, Seitaro; Aburatani, Hiroyuki; Tsuda, Koji
2016-09-13
Single-cell RNA sequencing is fast becoming one of the standard methods for gene expression measurement, providing unique insights into cellular processes. A number of methods, based on general dimensionality reduction techniques, have been suggested to help infer and visualise the underlying structure of cell populations from single-cell expression levels, yet their models generally lack proper biological grounding and struggle to identify complex differentiation paths. Here we introduce cellTree: an R/Bioconductor package that uses a novel statistical approach, based on document analysis techniques, to produce tree structures outlining the hierarchical relationship between single-cell samples, while identifying latent groups of genes that can provide biological insights. With cellTree, we provide experimentalists with an easy-to-use tool, based on statistically and biologically sound algorithms, to efficiently explore and visualise single-cell RNA data. The cellTree package is publicly available in the online Bioconductor repository at: http://bioconductor.org/packages/cellTree/ .
Meyer, Patrick E; Lafitte, Frédéric; Bontempi, Gianluca
2008-10-29
This paper presents the R/Bioconductor package minet (version 1.1.6) which provides a set of functions to infer mutual information networks from a dataset. Once fed with a microarray dataset, the package returns a network where nodes denote genes, edges model statistical dependencies between genes and the weight of an edge quantifies the statistical evidence of a specific (e.g., transcriptional) gene-to-gene interaction. Four different entropy estimators are made available in the package minet (empirical, Miller-Madow, Schurmann-Grassberger and shrink) as well as four different inference methods, namely relevance networks, ARACNE, CLR and MRNET. Also, the package integrates accuracy assessment tools, like F-scores, PR-curves and ROC-curves, in order to compare the inferred network with a reference one. The package minet provides a series of tools for inferring transcriptional networks from microarray data. It is freely available from the Comprehensive R Archive Network (CRAN) as well as from the Bioconductor website.
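Two of the ingredients named in this abstract, the empirical and Miller-Madow entropy estimators and the relevance network inference method, are simple enough to sketch. The Python code below is an illustrative reimplementation under stated assumptions, not the minet API: expression values are assumed to be already discretized, and the gene names, profiles and threshold are hypothetical.

```python
from collections import Counter
from math import log


def entropy(values, miller_madow=False):
    """Empirical (plug-in) entropy in nats, optionally Miller-Madow corrected."""
    n = len(values)
    counts = Counter(values)
    h = -sum((c / n) * log(c / n) for c in counts.values())
    if miller_madow:
        # Bias correction: (number of non-empty bins - 1) / (2 * sample size).
        h += (len(counts) - 1) / (2 * n)
    return h


def mutual_information(x, y, miller_madow=False):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discretized profiles."""
    joint = list(zip(x, y))
    return (
        entropy(x, miller_madow)
        + entropy(y, miller_madow)
        - entropy(joint, miller_madow)
    )


def relevance_network(profiles, threshold):
    """Keep an edge for every gene pair whose mutual information reaches the threshold."""
    genes = sorted(profiles)
    edges = {}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            mi = mutual_information(profiles[a], profiles[b])
            if mi >= threshold:
                edges[(a, b)] = mi
    return edges


# Hypothetical discretized expression profiles over four samples.
profiles = {
    "gene1": [0, 0, 1, 1],
    "gene2": [0, 0, 1, 1],  # identical to gene1, so maximally informative
    "gene3": [0, 1, 0, 1],  # independent of gene1 and gene2
}
edges = relevance_network(profiles, threshold=0.1)
print(edges)
```

ARACNE, CLR and MRNET start from the same pairwise mutual information matrix and differ in how they prune or rescore it; those steps are not shown here.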
DOSE: an R/Bioconductor package for disease ontology semantic and enrichment analysis.
Yu, Guangchuang; Wang, Li-Gen; Yan, Guang-Rong; He, Qing-Yu
2015-02-15
Disease ontology (DO) annotates human genes in the context of disease. DO is an important annotation for translating molecular findings from high-throughput data to clinical relevance. DOSE is an R package providing semantic similarity computations among DO terms and genes, which allows biologists to explore the similarities of diseases and of gene functions from a disease perspective. Enrichment analyses including the hypergeometric model and gene set enrichment analysis are also implemented to support discovering disease associations of high-throughput biological data. This allows biologists to verify disease relevance in a biological experiment and identify unexpected disease associations. Comparison among gene clusters is also supported. DOSE is released under Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/DOSE.html). Supplementary data are available at Bioinformatics online. gcyu@connect.hku.hk or tqyhe@jnu.edu.cn. © The Author 2014. Published by Oxford University Press.
Yassour, Moran; Grabherr, Manfred; Blood, Philip D.; Bowden, Joshua; Couger, Matthew Brian; Eccles, David; Li, Bo; Lieber, Matthias; MacManes, Matthew D.; Ott, Michael; Orvis, Joshua; Pochet, Nathalie; Strozzi, Francesco; Weeks, Nathan; Westerman, Rick; William, Thomas; Dewey, Colin N.; Henschel, Robert; LeDuc, Richard D.; Friedman, Nir; Regev, Aviv
2013-01-01
De novo assembly of RNA-Seq data allows us to study transcriptomes without the need for a genome sequence, such as in non-model organisms of ecological and evolutionary importance, cancer samples, or the microbiome. In this protocol, we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-Seq data in non-model organisms. We also present Trinity’s supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples, and approaches to identify protein coding genes. In an included tutorial we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sf.net. PMID:23845962
GenoGAM: genome-wide generalized additive models for ChIP-Seq analysis.
Stricker, Georg; Engelhardt, Alexander; Schulz, Daniel; Schmid, Matthias; Tresch, Achim; Gagneur, Julien
2017-08-01
Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) is a widely used approach to study protein-DNA interactions. Often, the quantities of interest are the differential occupancies relative to controls, between genetic backgrounds, treatments, or combinations thereof. Current methods for differential occupancy of ChIP-Seq data, however, rely on binning or sliding window techniques, for which the choice of the window and bin sizes is subjective. Here, we present GenoGAM (Genome-wide Generalized Additive Model), which brings the well-established and flexible generalized additive models framework to genomic applications using a data parallelism strategy. We model ChIP-Seq read count frequencies as products of smooth functions along chromosomes. Smoothing parameters are objectively estimated from the data by cross-validation, eliminating ad hoc binning and windowing needed by current approaches. GenoGAM provides base-level and region-level significance testing for full factorial designs. Application to a ChIP-Seq dataset in yeast showed increased sensitivity over existing differential occupancy methods while controlling for type I error rate. By analyzing a set of DNA methylation data and illustrating an extension to a peak caller, we further demonstrate the potential of GenoGAM as a generic statistical modeling tool for genome-wide assays. Software is available from Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/GenoGAM.html . gagneur@in.tum.de. Supplementary information is available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.
Wang, Likun; Yang, Luhe; Peng, Zuohan; Lu, Dan; Jin, Yan; McNutt, Michael; Yin, Yuxin
2015-01-01
With the burgeoning development of cloud technology and services, there are an increasing number of users who prefer cloud to run their applications. All software and associated data are hosted on the cloud, allowing users to access them via a web browser from any computer, anywhere. This paper presents cisPath, an R/Bioconductor package deployed on cloud servers for client users to visualize, manage, and share functional protein interaction networks. With this R package, users can easily integrate downloaded protein-protein interaction information from different online databases with private data to construct new and personalized interaction networks. Additional functions allow users to generate specific networks based on private databases. Since the results produced with the use of this package are in the form of web pages, cloud users can easily view and edit the network graphs via the browser, using a mouse or touch screen, without the need to download them to a local computer. This package can also be installed and run on a local desktop computer. Depending on user preference, results can be publicized or shared by uploading to a web server or cloud driver, allowing other users to directly access results via a web browser. This package can be installed and run on a variety of platforms. Since all network views are shown in web pages, such package is particularly useful for cloud users. The easy installation and operation is an attractive quality for R beginners and users with no previous experience with cloud services.
Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E
2016-03-11
Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.
2015-01-01
Background With the burgeoning development of cloud technology and services, an increasing number of users prefer to run their applications on the cloud. All software and associated data are hosted on the cloud, allowing users to access them via a web browser from any computer, anywhere. This paper presents cisPath, an R/Bioconductor package deployed on cloud servers for client users to visualize, manage, and share functional protein interaction networks. Results With this R package, users can easily integrate downloaded protein-protein interaction information from different online databases with private data to construct new and personalized interaction networks. Additional functions allow users to generate specific networks based on private databases. Since the results produced with the use of this package are in the form of web pages, cloud users can easily view and edit the network graphs via the browser, using a mouse or touch screen, without the need to download them to a local computer. This package can also be installed and run on a local desktop computer. Depending on user preference, results can be publicized or shared by uploading to a web server or cloud drive, allowing other users to directly access results via a web browser. Conclusions This package can be installed and run on a variety of platforms. Since all network views are shown in web pages, such a package is particularly useful for cloud users. The easy installation and operation is an attractive quality for R beginners and users with no previous experience with cloud services. PMID:25708840
MAPI: towards the integrated exploitation of bioinformatics Web Services.
Ramirez, Sergio; Karlsson, Johan; Trelles, Oswaldo
2011-10-27
Bioinformatics is commonly featured as a well-assorted list of available web resources. Although diversity of services is positive in general, the proliferation of tools, their dispersion and heterogeneity complicate the integrated exploitation of such data processing capacity. To facilitate the construction of software clients and make integrated use of this variety of tools, we present a modular programmatic application interface (MAPI) that provides the necessary functionality for uniform representation of Web Services metadata descriptors, including their management and the invocation protocols of the services they represent. This document describes the main functionality of the framework and how it can be used to facilitate the deployment of new software under a unified structure of bioinformatics Web Services. A notable feature of MAPI is the modular organization of the functionality into different modules associated with specific tasks. This means that only the modules needed for the client have to be installed, and that the module functionality can be extended without the need for re-writing the software client. The potential utility and versatility of the software library has been demonstrated by the implementation of several currently available clients that cover different aspects of integrated data processing, ranging from service discovery to service invocation with advanced features such as workflow composition and asynchronous service calls to multiple types of Web Services including those registered in repositories (e.g. GRID-based, SOAP, BioMOBY, R-bioconductor, and others).
SpeCond: a method to detect condition-specific gene expression
2011-01-01
Transcriptomic studies routinely measure expression levels across numerous conditions. These datasets allow identification of genes that are specifically expressed in a small number of conditions. However, there are currently no statistically robust methods for identifying such genes. Here we present SpeCond, a method to detect condition-specific genes that outperforms alternative approaches. We apply the method to a dataset of 32 human tissues to determine 2,673 specifically expressed genes. An implementation of SpeCond is freely available as a Bioconductor package at http://www.bioconductor.org/packages/release/bioc/html/SpeCond.html. PMID:22008066
Zackay, Arie; Steinhoff, Christine
2010-12-15
Exploration of DNA methylation and its impact on various regulatory mechanisms has become a very active field of research. Simultaneously, there is a growing need for tools to process and analyse the data together with statistical investigation and visualisation. MethVisual is a new application that enables exploratory analysis and intuitive visualization of DNA methylation data as is typically generated by bisulfite sequencing. The package allows the import of DNA methylation sequences, aligns them and performs quality control comparison. It comprises basic analysis steps such as lollipop visualization, co-occurrence display of methylation of neighbouring and distant CpG sites, summary statistics on methylation status, clustering and correspondence analysis. The package has been developed for methylation data but can also be used for other data types for which binary coding can be inferred. The application of the package, as well as a comparison to existing DNA methylation analysis tools and its workflow based on two datasets, is presented in this paper. The R package MethVisual offers various analysis procedures for data that can be binarized, in particular for bisulfite sequenced methylation data. R/Bioconductor has become one of the most important environments for statistical analysis of various types of biological and medical data. Therefore, any data analysis within R that allows the integration of various data types as provided from different technological platforms is convenient. It is the first and so far the only specific package for DNA methylation analysis, in particular for bisulfite sequenced data, available in the R/Bioconductor environment. The package is available for free at http://methvisual.molgen.mpg.de/ and from the Bioconductor Consortium http://www.bioconductor.org.
2010-01-01
Background Exploration of DNA methylation and its impact on various regulatory mechanisms has become a very active field of research. Simultaneously, there is a growing need for tools to process and analyse the data together with statistical investigation and visualisation. Findings MethVisual is a new application that enables exploratory analysis and intuitive visualization of DNA methylation data as is typically generated by bisulfite sequencing. The package allows the import of DNA methylation sequences, aligns them and performs quality control comparison. It comprises basic analysis steps such as lollipop visualization, co-occurrence display of methylation of neighbouring and distant CpG sites, summary statistics on methylation status, clustering and correspondence analysis. The package has been developed for methylation data but can also be used for other data types for which binary coding can be inferred. The application of the package, as well as a comparison to existing DNA methylation analysis tools and its workflow based on two datasets, is presented in this paper. Conclusions The R package MethVisual offers various analysis procedures for data that can be binarized, in particular for bisulfite sequenced methylation data. R/Bioconductor has become one of the most important environments for statistical analysis of various types of biological and medical data. Therefore, any data analysis within R that allows the integration of various data types as provided from different technological platforms is convenient. It is the first and so far the only specific package for DNA methylation analysis, in particular for bisulfite sequenced data, available in the R/Bioconductor environment. The package is available for free at http://methvisual.molgen.mpg.de/ and from the Bioconductor Consortium http://www.bioconductor.org. PMID:21159174
Chimera: a Bioconductor package for secondary analysis of fusion products.
Beccuti, Marco; Carrara, Matteo; Cordero, Francesca; Lazzarato, Fulvio; Donatelli, Susanna; Nadalin, Francesca; Policriti, Alberto; Calogero, Raffaele A
2014-12-15
Chimera is a Bioconductor package that organizes, annotates, analyses and validates fusions reported by different fusion detection tools; the current implementation can deal with output from bellerophontes, chimeraScan, deFuse, fusionCatcher, FusionFinder, FusionHunter, FusionMap, mapSplice, Rsubread, tophat-fusion and STAR. The core of Chimera is a fusion data structure that can store fusion events detected with any of the aforementioned tools. Fusions are then easily manipulated with standard R functions or through the set of functionalities specifically developed in Chimera with the aim of supporting the user in managing fusions and discriminating false-positive results. © The Author 2014. Published by Oxford University Press.
clusterProfiler: an R package for comparing biological themes among gene clusters.
Yu, Guangchuang; Wang, Li-Gen; Han, Yanyan; He, Qing-Yu
2012-05-01
Increasing quantitative data generated from transcriptomics and proteomics require integrative strategies for analysis. Here, we present an R package, clusterProfiler, that automates the process of biological-term classification and the enrichment analysis of gene clusters. The analysis module and visualization module were combined into a reusable workflow. Currently, clusterProfiler supports three species: humans, mice, and yeast. Methods provided in this package can be easily extended to other species and ontologies. The clusterProfiler package is released under the Artistic-2.0 License within the Bioconductor project. The source code and vignette are freely available at http://bioconductor.org/packages/release/bioc/html/clusterProfiler.html.
psygenet2r: a R/Bioconductor package for the analysis of psychiatric disease genes.
Gutiérrez-Sacristán, Alba; Hernández-Ferrer, Carles; González, Juan R; Furlong, Laura I
2017-12-15
Psychiatric disorders have a great impact on morbidity and mortality. Genotype-phenotype resources for psychiatric diseases are key to enable the translation of research findings to a better care of patients. PsyGeNET is a knowledge resource on psychiatric diseases and their genes, developed by text mining and curated by domain experts. We present psygenet2r, an R package that contains a variety of functions for leveraging the PsyGeNET database and facilitating its analysis and interpretation. The package offers different types of queries to the database along with a variety of analysis and visualization tools, including the study of the anatomical structures in which the genes are expressed and gaining insight into the genes' molecular functions. Psygenet2r is especially suited for network medicine analysis of psychiatric disorders. The package is implemented in R and is available under the MIT license from Bioconductor (http://bioconductor.org/packages/release/bioc/html/psygenet2r.html). Contact: juanr.gonzalez@isglobal.org or laura.furlong@upf.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data
Colaprico, Antonio; Silva, Tiago C.; Olsen, Catharina; Garofano, Luciano; Cava, Claudia; Garolini, Davide; Sabedot, Thais S.; Malta, Tathiane M.; Pagnotta, Stefano M.; Castiglioni, Isabella; Ceccarelli, Michele; Bontempi, Gianluca; Noushmehr, Houtan
2016-01-01
The Cancer Genome Atlas (TCGA) research network has made public a large collection of clinical and molecular phenotypes of more than 10 000 tumor patients across 33 different tumor types. Using this cohort, TCGA has published over 20 marker papers detailing the genomic and epigenomic alterations associated with these tumor types. Although many important discoveries have been made by TCGA's research network, opportunities still exist to implement novel methods, thereby elucidating new biological pathways and diagnostic markers. However, mining the TCGA data presents several bioinformatics challenges, such as data retrieval and integration with clinical data and other molecular data types (e.g. RNA and DNA methylation). We developed an R/Bioconductor package called TCGAbiolinks to address these challenges and offer bioinformatics solutions by using a guided workflow to allow users to query, download and perform integrative analyses of TCGA data. We combined methods from computer science and statistics into the pipeline and incorporated methodologies developed in previous TCGA marker studies and in our own group. Using four different TCGA tumor types (Kidney, Brain, Breast and Colon) as examples, we provide case studies to illustrate examples of reproducibility, integrative analysis and utilization of different Bioconductor packages to advance and accelerate novel discoveries. PMID:26704973
lpNet: a linear programming approach to reconstruct signal transduction networks.
Matos, Marta R A; Knapp, Bettina; Kaderali, Lars
2015-10-01
With the widespread availability of high-throughput experimental technologies it has become possible to study hundreds to thousands of cellular factors simultaneously, such as coding- or non-coding mRNA or protein concentrations. Still, extracting information about the underlying regulatory or signaling interactions from these data remains a difficult challenge. We present a flexible approach towards network inference based on linear programming. Our method reconstructs the interactions of factors from a combination of perturbation/non-perturbation and steady-state/time-series data. We show both on simulated and real data that our methods are able to reconstruct the underlying networks fast and efficiently, thus shedding new light on biological processes and, in particular, on disease mechanisms of action. We have implemented the approach as an R package available through Bioconductor. This R package is freely available under the GNU General Public License (GPL-3) from bioconductor.org (http://bioconductor.org/packages/release/bioc/html/lpNet.html) and is compatible with most operating systems (Windows, Linux, Mac OS) and hardware architectures. Contact: bettina.knapp@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Taminau, Jonatan; Meganck, Stijn; Lazar, Cosmin; Steenhoff, David; Coletta, Alain; Molter, Colin; Duque, Robin; de Schaetzen, Virginie; Weiss Solís, David Y; Bersini, Hugues; Nowé, Ann
2012-12-24
With an abundant amount of microarray gene expression data sets available through public repositories, new possibilities lie in combining multiple existing data sets. In this new context, analysis itself is no longer the problem, but retrieving and consistently integrating all this data before delivering it to the wide variety of existing analysis tools becomes the new bottleneck. We present the newly released inSilicoMerging R/Bioconductor package which, together with the earlier released inSilicoDb R/Bioconductor package, allows consistent retrieval, integration and analysis of publicly available microarray gene expression data sets. Inside the inSilicoMerging package a set of five visual and six quantitative validation measures are available as well. By providing (i) access to uniformly curated and preprocessed data, (ii) a collection of techniques to remove the batch effects between data sets from different sources, and (iii) several validation tools enabling the inspection of the integration process, these packages enable researchers to fully explore the potential of combining gene expression data for downstream analysis. The power of using both packages is demonstrated by programmatically retrieving and integrating gene expression studies from the InSilico DB repository [https://insilicodb.org/app/].
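As an illustration of what removing batch effects between data sets can mean in practice, the following is a minimal Python sketch of per-batch mean centering, one simple technique of this kind. It is not taken from inSilicoMerging; the function names, the dictionary layout (gene name mapped to a list of per-sample values) and the choice to keep only genes shared by all batches are assumptions made for this example.

```python
from statistics import mean


def batch_mean_center(batch):
    """Subtract each gene's within-batch mean from its expression values.

    `batch` maps a gene name to a list of values, one per sample
    (a hypothetical layout chosen for this sketch).
    """
    centered = {}
    for gene, values in batch.items():
        gene_mean = mean(values)
        centered[gene] = [v - gene_mean for v in values]
    return centered


def merge_batches(batches):
    """Center every batch separately, then concatenate samples gene by gene.

    Only genes present in every batch are kept (an assumption of this sketch).
    """
    common = set(batches[0])
    for batch in batches[1:]:
        common &= set(batch)
    merged = {gene: [] for gene in sorted(common)}
    for batch in batches:
        centered = batch_mean_center(batch)
        for gene in merged:
            merged[gene].extend(centered[gene])
    return merged
```

After centering, every gene has mean zero within each batch, so a constant offset between two studies no longer dominates the merged data. Real batch effects are rarely a pure offset, which is why validation measures of the kind described in the abstract matter.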
Mathelier, Anthony; Zhao, Xiaobei; Zhang, Allen W.; Parcy, François; Worsley-Hunt, Rebecca; Arenillas, David J.; Buchman, Sorana; Chen, Chih-yu; Chou, Alice; Ienasescu, Hans; Lim, Jonathan; Shyr, Casper; Tan, Ge; Zhou, Michelle; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W.
2014-01-01
JASPAR (http://jaspar.genereg.net) is the largest open-access database of matrix-based nucleotide profiles describing the binding preference of transcription factors from multiple species. The fifth major release greatly expands the heart of JASPAR—the JASPAR CORE subcollection, which contains curated, non-redundant profiles—with 135 new curated profiles (74 in vertebrates, 8 in Drosophila melanogaster, 10 in Caenorhabditis elegans and 43 in Arabidopsis thaliana; a 30% increase in total) and 43 older updated profiles (36 in vertebrates, 3 in D. melanogaster and 4 in A. thaliana; a 9% update in total). The new and updated profiles are mainly derived from published chromatin immunoprecipitation-seq experimental datasets. In addition, the web interface has been enhanced with advanced capabilities in browsing, searching and subsetting. Finally, the new JASPAR release is accompanied by a new BioPython package, a new R tool package and a new R/Bioconductor data package to facilitate access for both manual and automated methods. PMID:24194598
Mathelier, Anthony; Zhao, Xiaobei; Zhang, Allen W; Parcy, François; Worsley-Hunt, Rebecca; Arenillas, David J; Buchman, Sorana; Chen, Chih-yu; Chou, Alice; Ienasescu, Hans; Lim, Jonathan; Shyr, Casper; Tan, Ge; Zhou, Michelle; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W
2014-01-01
JASPAR (http://jaspar.genereg.net) is the largest open-access database of matrix-based nucleotide profiles describing the binding preference of transcription factors from multiple species. The fifth major release greatly expands the heart of JASPAR (the JASPAR CORE subcollection, which contains curated, non-redundant profiles) with 135 new curated profiles (74 in vertebrates, 8 in Drosophila melanogaster, 10 in Caenorhabditis elegans and 43 in Arabidopsis thaliana; a 30% increase in total) and 43 older updated profiles (36 in vertebrates, 3 in D. melanogaster and 4 in A. thaliana; a 9% update in total). The new and updated profiles are mainly derived from published chromatin immunoprecipitation-seq experimental datasets. In addition, the web interface has been enhanced with advanced capabilities in browsing, searching and subsetting. Finally, the new JASPAR release is accompanied by a new BioPython package, a new R tool package and a new R/Bioconductor data package to facilitate access for both manual and automated methods.
Waardenberg, Ashley J; Basset, Samuel D; Bouveret, Romaric; Harvey, Richard P
2015-09-02
Gene ontology (GO) enrichment is commonly used for inferring biological meaning from systems biology experiments. However, determining differential GO and pathway enrichment between DNA-binding experiments or using the GO structure to classify experiments has received little attention. Herein, we present a bioinformatics tool, CompGO, for identifying Differentially Enriched Gene Ontologies, called DiEGOs, and pathways, through the use of a z-score derivation of log odds ratios, and visualizing these differences at GO and pathway level. Through public experimental data focused on the cardiac transcription factor NKX2-5, we illustrate the problems associated with comparing GO enrichments between experiments using a simple overlap approach. We have developed an R/Bioconductor package, CompGO, which implements a new statistic normally used in epidemiological studies for performing comparative GO analyses and visualizing comparisons from .BED data containing genomic coordinates, as well as gene lists, as inputs. We justify the statistic through inclusion of experimental data and compare to the commonly used overlap method. CompGO is freely available as an R/Bioconductor package enabling easy integration into existing pipelines and is available at: http://www.bioconductor.org/packages/release/bioc/html/CompGO.html.
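To make the phrase "z-score derivation of log odds ratios" concrete, here is a minimal Python sketch of the textbook epidemiological comparison of two odds ratios. It is an assumption that CompGO's statistic has this general shape; the function names, the argument layout and the 0.5 continuity correction are choices made for this example, not the package's API.

```python
import math


def log_odds_ratio(hits_in_term, hits_not_in_term, bg_in_term, bg_not_in_term):
    """Return (log odds ratio, standard error) for one 2x2 enrichment table.

    A 0.5 continuity correction is added to each cell so that empty cells
    do not cause a division by zero (an assumption of this sketch).
    """
    a, b, c, d = (
        x + 0.5 for x in (hits_in_term, hits_not_in_term, bg_in_term, bg_not_in_term)
    )
    log_or = math.log((a * d) / (b * c))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, standard_error


def differential_z(table1, table2):
    """z-score for the difference in enrichment of one GO term between two experiments."""
    lor1, se1 = log_odds_ratio(*table1)
    lor2, se2 = log_odds_ratio(*table2)
    return (lor1 - lor2) / math.sqrt(se1 ** 2 + se2 ** 2)
```

A call such as `differential_z((30, 70, 100, 900), (10, 90, 100, 900))` returns a positive value because the term is more strongly enriched in the first experiment. Unlike a simple overlap of significant term lists, the score reflects both the size of the difference and its uncertainty.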
Iorio, Francesco; Bernardo-Faura, Marti; Gobbi, Andrea; Cokelaer, Thomas; Jurman, Giuseppe; Saez-Rodriguez, Julio
2016-12-20
Networks are popular and powerful tools to describe and model biological processes. Many computational methods have been developed to infer biological networks from literature, high-throughput experiments, and combinations of both. Additionally, a wide range of tools has been developed to map experimental data onto reference biological networks, in order to extract meaningful modules. Many of these methods assess results' significance against null distributions of randomized networks. However, these standard unconstrained randomizations do not preserve the functional characterization of the nodes in the reference networks (i.e. their degrees and connection signs), hence including potential biases in the assessment. Building on our previous work about rewiring bipartite networks, we propose a method for rewiring any type of unweighted network. In particular we formally demonstrate that the problem of rewiring a signed and directed network preserving its functional connectivity (F-rewiring) reduces to the problem of rewiring two induced bipartite networks. Additionally, we reformulate the lower bound on the number of iterations of the switching algorithm to make it suitable for the F-rewiring of networks of any size. Finally, we present BiRewire3, an open-source Bioconductor package enabling the F-rewiring of any type of unweighted network. We illustrate its application to a case study about the identification of modules from gene expression data mapped on protein interaction networks, and a second one focused on building logic models from more complex signed-directed reference signaling networks and phosphoproteomic data. BiRewire3 is freely available at https://www.bioconductor.org/packages/BiRewire/, and it should have a broad application as it allows an efficient and analytically derived statistical assessment of results from any network biology tool.
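The switching algorithm mentioned above can be sketched for a single bipartite network as follows. This is a generic degree-preserving edge-switching routine in Python, offered as an illustration under stated assumptions: the function names, the edge-list representation and the fixed step count are hypothetical and do not reproduce BiRewire3's implementation or its bound on the number of iterations.

```python
import random


def switch_rewire(edges, n_steps, seed=0):
    """Randomise a bipartite edge list while preserving every node's degree.

    `edges` is a list of unique (left_node, right_node) pairs. Each step picks
    two edges (a, b) and (c, d) and replaces them with (a, d) and (c, b),
    unless that would create a duplicate edge.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_steps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        new1, new2 = (a, d), (c, b)
        # Skip switches that change nothing or would duplicate an existing edge.
        if a == c or b == d or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


def degrees(edges):
    """Return (left_degrees, right_degrees) as dictionaries of node -> degree."""
    left, right = {}, {}
    for a, b in edges:
        left[a] = left.get(a, 0) + 1
        right[b] = right.get(b, 0) + 1
    return left, right
```

Following the reduction stated in the abstract, a signed and directed network would be handled by applying such a routine separately to its two induced bipartite networks (taken here to be one per edge sign, which is an interpretation rather than a detail given in the abstract), so that node degrees and connection signs are both preserved.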
Pagès, Hervé
2018-01-01
Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set. PMID:29723188
Lun, Aaron T L; Pagès, Hervé; Smith, Mike L
2018-05-01
Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set.
Perkins, James R; Dawes, John M; McMahon, Steve B; Bennett, David L H; Orengo, Christine; Kohl, Matthias
2012-07-02
Measuring gene transcription using real-time reverse transcription polymerase chain reaction (RT-qPCR) technology is a mainstay of molecular biology. Technologies now exist to measure the abundance of many transcripts in parallel. The selection of the optimal reference gene for the normalisation of this data is a recurring problem, and several algorithms have been developed in order to solve it. So far nothing in R exists to unite these methods, together with other functions to read in and normalise the data using the chosen reference gene(s). We have developed two R/Bioconductor packages, ReadqPCR and NormqPCR, intended for a user with some experience with high-throughput data analysis using R, who wishes to use R to analyse RT-qPCR data. We illustrate their potential use in a workflow analysing a generic RT-qPCR experiment, and apply this to a real dataset. Packages are available from http://www.bioconductor.org/packages/release/bioc/html/ReadqPCR.html and http://www.bioconductor.org/packages/release/bioc/html/NormqPCR.html. These packages increase the repertoire of RT-qPCR analysis tools available to the R user and allow them to (amongst other things) read their data into R, hold it in an ExpressionSet compatible R object, choose appropriate reference genes, normalise the data and look for differential expression between samples.
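Normalising RT-qPCR data against chosen reference genes can be illustrated with the widely used delta-delta-Cq calculation. The Python sketch below is not the API of ReadqPCR or NormqPCR; the function names and the dictionary layout are hypothetical, and it assumes perfect amplification efficiency (a doubling per cycle).

```python
from statistics import mean


def delta_cq(cq_values, reference_genes):
    """Subtract the mean Cq of the reference genes from every target gene's Cq.

    `cq_values` maps gene name -> Cq for one sample (hypothetical layout).
    """
    reference_mean = mean(cq_values[gene] for gene in reference_genes)
    return {
        gene: cq - reference_mean
        for gene, cq in cq_values.items()
        if gene not in reference_genes
    }


def relative_expression(dcq_sample, dcq_control):
    """Fold change of sample over control as 2 ** -(delta-delta Cq).

    Assumes 100% amplification efficiency for every gene.
    """
    return {
        gene: 2 ** -(dcq_sample[gene] - dcq_control[gene])
        for gene in dcq_sample
    }
```

For example, a target gene that sits 4 cycles above the reference mean in a treated sample and 6 cycles above it in the control has a delta-delta Cq of -2, which corresponds to a four-fold higher relative expression in the treated sample.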
The Emergence of Open-Source Software in China
ERIC Educational Resources Information Center
Pan, Guohua; Bonk, Curtis J.
2007-01-01
The open-source software movement is gaining increasing momentum in China. Of the limited numbers of open-source software in China, "Red Flag Linux" stands out most strikingly, commanding 30 percent share of Chinese software market. Unlike the spontaneity of open-source movement in North America, open-source software development in…
ChAMP: updated methylation analysis pipeline for Illumina BeadChips.
Tian, Yuan; Morris, Tiffany J; Webster, Amy P; Yang, Zhen; Beck, Stephan; Feber, Andrew; Teschendorff, Andrew E
2017-12-15
The Illumina Infinium HumanMethylationEPIC BeadChip is the new platform for high-throughput DNA methylation analysis, effectively doubling the coverage compared to the older 450k array. Here we present a significantly updated and improved version of the Bioconductor package ChAMP, which can be used to analyze EPIC and 450k data. Many enhanced functionalities have been added, including correction for cell-type heterogeneity, network analysis and a series of interactive graphical user interfaces. ChAMP is a BioC package available from https://bioconductor.org/packages/release/bioc/html/ChAMP.html. Contact: a.teschendorff@ucl.ac.uk or s.beck@ucl.ac.uk or a.feber@ucl.ac.uk. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
Zha, Xianfeng; Yin, Qingsong; Tan, Huo; Wang, Chunyan; Chen, Shaohua; Yang, Lijian; Li, Bo; Wu, Xiuli; Li, Yangqiu
2013-05-01
Antigen-specific, T-cell receptor (TCR)-modified cytotoxic T lymphocytes (CTLs) that target tumors are an attractive strategy for specific adoptive immunotherapy. Little is known about whether there are any alterations in the gene expression profile after TCR gene transduction in T cells. We constructed TCR gene-redirected CTLs with specificity for diffuse large B-cell lymphoma (DLBCL)-associated antigens to elucidate the gene expression profiles of TCR gene-redirected T-cells, and we further analyzed the gene expression profile pattern of these redirected T-cells by Affymetrix microarrays. The resulting data were analyzed using Bioconductor software; a two-fold cut-off expression change was applied together with anti-correlation of the profile ratios to render the microarray analysis set. The fold change of all genes was calculated by comparing the three TCR gene-modified T-cells and a negative control counterpart. The gene pathways were analyzed using Bioconductor and Kyoto Encyclopedia of Genes and Genomes. Identical genes whose fold change was greater than or equal to 2.0 in all three TCR gene-redirected T-cell groups in comparison with the negative control were identified as the differentially expressed genes. The differentially expressed genes comprised 33 up-regulated genes and 1 down-regulated gene, including JUNB, FOS, TNF, IFN-γ, DUSP2, IL-1B, CXCL1, CXCL2, CXCL9, CCL2, CCL4, and CCL8. These genes are mainly involved in the TCR signaling, mitogen-activated protein kinase signaling, and cytokine-cytokine receptor interaction pathways. In conclusion, we characterized the gene expression profile of DLBCL-specific TCR gene-redirected T-cells. The changes corresponded to an up-regulation in the differentiation and proliferation of the T-cells. These data may help to explain some of the characteristics of the redirected T-cells.
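The selection rule described above (a fold change of at least 2.0 in all three modified groups relative to the negative control) can be sketched in Python as follows. The function name, the dictionary layout and the symmetric treatment of down-regulation (ratio at or below 1/cutoff) are assumptions made for this example; the abstract does not spell out how the down-regulated gene was called, and expression values are assumed to be positive and on a linear scale.

```python
def consistent_changes(groups, control, cutoff=2.0):
    """Return (up, down) gene lists that pass the cutoff in every group.

    `control` and each entry of `groups` map gene name -> expression value
    on a linear, strictly positive scale (hypothetical layout).
    """
    up, down = [], []
    for gene, baseline in control.items():
        ratios = [group[gene] / baseline for group in groups]
        if all(ratio >= cutoff for ratio in ratios):
            up.append(gene)
        elif all(ratio <= 1 / cutoff for ratio in ratios):
            down.append(gene)
    return up, down
```

Requiring agreement across all groups is stricter than testing each group on its own: a gene that changes three-fold in two groups but only 1.5-fold in the third is excluded.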
Detecting Rhythms in Time Series with RAIN
Thaben, Paul F.; Westermark, Pål O.
2014-01-01
A fundamental problem in research on biological rhythms is that of detecting and assessing the significance of rhythms in large sets of data. Classic methods based on Fourier theory are often hampered by the complex and unpredictable characteristics of experimental and biological noise. Robust nonparametric methods are available but are limited to specific wave forms. We present RAIN, a robust nonparametric method for the detection of rhythms of prespecified periods in biological data that can detect arbitrary wave forms. When applied to measurements of the circadian transcriptome and proteome of mouse liver, the sets of transcripts and proteins with rhythmic abundances were significantly expanded due to the increased detection power, when we controlled for false discovery. Validation against independent data confirmed the quality of these results. The large expansion of the circadian mouse liver transcriptomes and proteomes reflected the prevalence of nonsymmetric wave forms and led to new conclusions about function. RAIN was implemented as a freely available software package for R/Bioconductor and is presently also available as a web interface. PMID:25326247
Haas, Brian J; Papanicolaou, Alexie; Yassour, Moran; Grabherr, Manfred; Blood, Philip D; Bowden, Joshua; Couger, Matthew Brian; Eccles, David; Li, Bo; Lieber, Matthias; MacManes, Matthew D; Ott, Michael; Orvis, Joshua; Pochet, Nathalie; Strozzi, Francesco; Weeks, Nathan; Westerman, Rick; William, Thomas; Dewey, Colin N; Henschel, Robert; LeDuc, Richard D; Friedman, Nir; Regev, Aviv
2013-08-01
De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.
ADaCGH: A Parallelized Web-Based Application and R Package for the Analysis of aCGH Data
Díaz-Uriarte, Ramón; Rueda, Oscar M.
2007-01-01
Background: Copy number alterations (CNAs) in genomic DNA have been associated with complex human diseases, including cancer. One of the most common techniques to detect CNAs is array-based comparative genomic hybridization (aCGH). The availability of aCGH platforms and the need for identification of CNAs have resulted in a wealth of methodological studies. Methodology/Principal Findings: ADaCGH is an R package and a web-based application for the analysis of aCGH data. It implements eight methods for detection of CNAs, gains and losses of genomic DNA, including all of the best performing ones from two recent reviews (CBS, GLAD, CGHseg, HMM). For improved speed, we use parallel computing (via MPI). Additional information (GO terms, PubMed citations, KEGG and Reactome pathways) is available for individual genes, and for sets of genes with altered copy numbers. Conclusions/Significance: ADaCGH represents a qualitative increase in the standards of these types of applications: a) all of the best performing algorithms are included, not just one or two; b) we do not limit ourselves to providing a thin layer of CGI on top of existing BioConductor packages, but instead carefully use parallelization, examining different schemes, and are able to achieve significant decreases in user waiting time (factors up to 45×); c) we have added functionality not currently available in some methods, to adapt to recent recommendations (e.g., merging of segmentation results in wavelet-based and CGHseg algorithms); d) we incorporate redundancy, fault-tolerance and checkpointing, which are unique among web-based, parallelized applications; e) all of the code is available under open source licenses, allowing others to build upon, copy, and adapt our code for other software projects. PMID:17710137
ADaCGH: A parallelized web-based application and R package for the analysis of aCGH data.
Díaz-Uriarte, Ramón; Rueda, Oscar M
2007-08-15
Copy number alterations (CNAs) in genomic DNA have been associated with complex human diseases, including cancer. One of the most common techniques to detect CNAs is array-based comparative genomic hybridization (aCGH). The availability of aCGH platforms and the need for identification of CNAs have resulted in a wealth of methodological studies. ADaCGH is an R package and a web-based application for the analysis of aCGH data. It implements eight methods for detection of CNAs, gains and losses of genomic DNA, including all of the best performing ones from two recent reviews (CBS, GLAD, CGHseg, HMM). For improved speed, we use parallel computing (via MPI). Additional information (GO terms, PubMed citations, KEGG and Reactome pathways) is available for individual genes, and for sets of genes with altered copy numbers. ADaCGH represents a qualitative increase in the standards of these types of applications: a) all of the best performing algorithms are included, not just one or two; b) we do not limit ourselves to providing a thin layer of CGI on top of existing BioConductor packages, but instead carefully use parallelization, examining different schemes, and are able to achieve significant decreases in user waiting time (factors up to 45x); c) we have added functionality not currently available in some methods, to adapt to recent recommendations (e.g., merging of segmentation results in wavelet-based and CGHseg algorithms); d) we incorporate redundancy, fault-tolerance and checkpointing, which are unique among web-based, parallelized applications; e) all of the code is available under open source licenses, allowing others to build upon, copy, and adapt our code for other software projects.
missMethyl: an R package for analyzing data from Illumina's HumanMethylation450 platform.
Phipson, Belinda; Maksimovic, Jovana; Oshlack, Alicia
2016-01-15
DNA methylation is one of the most commonly studied epigenetic modifications due to its role in both disease and development. The Illumina HumanMethylation450 BeadChip is a cost-effective way to profile >450 000 CpGs across the human genome, making it a popular platform for profiling DNA methylation. Here we introduce missMethyl, an R package with a suite of tools for performing normalization, removal of unwanted variation in differential methylation analysis, differential variability testing and gene set analysis for the 450K array. missMethyl is an R package available from the Bioconductor project at www.bioconductor.org. alicia.oshlack@mcri.edu.au Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Grote, Steffi; Prüfer, Kay; Kelso, Janet; Dannemann, Michael
2016-10-15
We present ABAEnrichment, an R package that tests for expression enrichment in specific brain regions at different developmental stages using expression information gathered from multiple regions of the adult and developing human brain, together with ontologically organized structural information about the brain, both provided by the Allen Brain Atlas. We validate ABAEnrichment by successfully recovering the origin of gene sets identified in specific brain cell-types and developmental stages. ABAEnrichment was implemented as an R package and is available under GPL (≥ 2) from the Bioconductor website (http://bioconductor.org/packages/3.3/bioc/html/ABAEnrichment.html). Contact: steffi_grote@eva.mpg.de, kelso@eva.mpg.de or michael_dannemann@eva.mpg.de. Supplementary information: Supplementary data are available at Bioinformatics online.
2012-01-01
Background: Measuring gene transcription using real-time reverse transcription polymerase chain reaction (RT-qPCR) technology is a mainstay of molecular biology. Technologies now exist to measure the abundance of many transcripts in parallel. The selection of the optimal reference gene for the normalisation of these data is a recurring problem, and several algorithms have been developed in order to solve it. So far nothing in R exists to unite these methods, together with other functions to read in and normalise the data using the chosen reference gene(s). Results: We have developed two R/Bioconductor packages, ReadqPCR and NormqPCR, intended for a user with some experience with high-throughput data analysis using R, who wishes to use R to analyse RT-qPCR data. We illustrate their potential use in a workflow analysing a generic RT-qPCR experiment, and apply this to a real dataset. Packages are available from http://www.bioconductor.org/packages/release/bioc/html/ReadqPCR.html and http://www.bioconductor.org/packages/release/bioc/html/NormqPCR.html. Conclusions: These packages increase the repertoire of RT-qPCR analysis tools available to the R user and allow them to (amongst other things) read their data into R, hold it in an ExpressionSet compatible R object, choose appropriate reference genes, normalise the data and look for differential expression between samples. PMID:22748112
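To make the idea of normalising against chosen reference gene(s) concrete, here is a minimal Python sketch of the widely used delta-Ct and 2^(-ddCt) calculation. It is not the ReadqPCR or NormqPCR API; the function names and Ct values are hypothetical, and the fold-change rule assumes roughly 100% amplification efficiency.

```python
from statistics import mean

def delta_ct(ct_target, ct_references):
    """Normalise a target Ct against the mean Ct of the reference gene(s)."""
    return ct_target - mean(ct_references)

def relative_expression(dct_sample, dct_control):
    """Fold change of sample versus control by the 2^(-ddCt) rule.

    Assumes the amount of product doubles every cycle.
    """
    return 2.0 ** -(dct_sample - dct_control)

# Hypothetical Ct values: one target gene and two reference genes.
treated = delta_ct(24.0, [18.0, 20.0])   # 24 - 19 = 5
control = delta_ct(26.0, [18.0, 20.0])   # 26 - 19 = 7
fold_change = relative_expression(treated, control)  # 2^-(5 - 7) = 4
```

Averaging several reference genes, as done here, is one simple way to reduce dependence on any single reference; which genes to use is the selection problem the abstract describes.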
Software for Real-Time Analysis of Subsonic Test Shot Accuracy
2014-03-01
used the C++ programming language, the Open Source Computer Vision (OpenCV®) software library, and Microsoft Windows® Application Programming...video for comparison through OpenCV image analysis tools. Based on the comparison, the software then computed the coordinates of each shot relative to...DWB researchers wanted to use the Open Source Computer Vision (OpenCV) software library for capturing and analyzing frames of video. OpenCV contains
diffuStats: an R package to compute diffusion-based scores on biological networks.
Picart-Armada, Sergio; Thompson, Wesley K; Buil, Alfonso; Perera-Lluna, Alexandre
2018-02-01
Label propagation and diffusion over biological networks are a common mathematical formalism in computational biology for giving context to molecular entities and prioritizing novel candidates in the area of study. There are several choices in conceiving the diffusion process (involving the graph kernel, the score definitions and the presence of a posterior statistical normalization) which have an impact on the results. This manuscript describes diffuStats, an R package that provides a collection of graph kernels and diffusion scores, as well as a parallel permutation analysis for the normalized scores, that eases the computation of the scores and their benchmarking for an optimal choice. The R package diffuStats is publicly available in Bioconductor, https://bioconductor.org, under the GPL-3 license. Contact: sergi.picart@upc.edu. Supplementary data are available at Bioinformatics online.
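As an illustration of label propagation over a network, the following Python sketch iterates a simple propagation rule on a toy graph. It is a stand-in for the general idea and not diffuStats itself: the package works with graph kernels, whereas this sketch uses an iterative update, and the function name, the `alpha` parameter and the graph are all hypothetical.

```python
def propagate(adjacency, labels, alpha=0.5, iterations=50):
    """Iterate f <- (1 - alpha) * y + alpha * P f.

    P is the row-normalised adjacency matrix and y holds the input labels.
    Every node is assumed to have at least one neighbour.
    """
    n = len(labels)
    degree = [sum(row) for row in adjacency]
    scores = list(labels)
    for _ in range(iterations):
        scores = [
            (1 - alpha) * labels[i]
            + alpha * sum(adjacency[i][j] * scores[j] for j in range(n)) / degree[i]
            for i in range(n)
        ]
    return scores

# Hypothetical path graph 0 - 1 - 2 - 3 with only node 0 labelled positive.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = propagate(adjacency, [1.0, 0.0, 0.0, 0.0])
```

Scores decay with distance from the labelled node, which is the sense in which diffusion gives unlabelled entities a network context. A permutation-based normalization like the one the abstract mentions would compare each score against scores obtained from shuffled labels.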
Madrigal, Pedro
2017-03-01
Computational evaluation of variability across DNA or RNA sequencing datasets is a crucial step in genomic science, as it allows one both to evaluate the reproducibility of biological or technical replicates and to compare different datasets to identify their potential correlations. Here we present fCCAC, an application of functional canonical correlation analysis to assess covariance of nucleic acid sequencing datasets such as chromatin immunoprecipitation followed by deep sequencing (ChIP-seq). We show how this method differs from other measures of correlation, and exemplify how it can reveal shared covariance between histone modifications and DNA binding proteins, such as the relationship between the H3K4me3 chromatin mark and its epigenetic writers and readers. An R/Bioconductor package is available at http://bioconductor.org/packages/fCCAC/. Contact: pmb59@cam.ac.uk. Supplementary data are available at Bioinformatics online.
ImpulseDE: detection of differentially expressed genes in time series data using impulse models.
Sander, Jil; Schultze, Joachim L; Yosef, Nir
2017-03-01
Perturbations in the environment lead to distinctive gene expression changes within a cell. Observed over time, those variations can be characterized by single impulse-like progression patterns. ImpulseDE is an R package suited to capture these patterns in high-throughput time series datasets. By fitting a representative impulse model to each gene, it reports differentially expressed genes across time points within a single time course or between two time courses from two experiments. To optimize running time, the code uses clustering and multi-threading. By applying ImpulseDE, we demonstrate its power to represent the underlying biology of gene expression in microarray and RNA-Seq data. ImpulseDE is available on Bioconductor (https://bioconductor.org/packages/ImpulseDE/). Contact: niryosef@berkeley.edu. Supplementary data are available at Bioinformatics online.
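The abstract does not give the model's formula. As a hedged sketch, one parameterization of an impulse-like curve from the impulse-model literature is a product of two sigmoids: expression starts at `h0`, moves to a peak level `h1` around time `t1`, and settles at `h2` around time `t2`. Whether ImpulseDE uses exactly this form is an assumption, and all parameter names and values below are hypothetical.

```python
import math

def sigmoid(beta, t_onset, t):
    """Logistic transition centred at t_onset with slope beta."""
    return 1.0 / (1.0 + math.exp(-beta * (t - t_onset)))

def impulse(t, h0, h1, h2, t1, t2, beta):
    """Product of two sigmoids; h1 must be non-zero."""
    rise = h0 + (h1 - h0) * sigmoid(beta, t1, t)    # h0 -> h1 around t1
    fall = h2 + (h1 - h2) * sigmoid(-beta, t2, t)   # h1 -> h2 around t2
    return rise * fall / h1

# Hypothetical gene: baseline 1, peak 4, new steady state 2.
params = dict(h0=1.0, h1=4.0, h2=2.0, t1=5.0, t2=15.0, beta=2.0)
early = impulse(0.0, **params)
peak = impulse(10.0, **params)
late = impulse(20.0, **params)
```

Fitting such a curve to each gene's time course and comparing it with a flat or shared model is one way an impulse model can be turned into a differential expression test.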
A ChIP-Seq Data Analysis Pipeline Based on Bioconductor Packages.
Park, Seung-Jin; Kim, Jong-Hwan; Yoon, Byung-Ha; Kim, Seon-Young
2017-03-01
Nowadays, huge volumes of chromatin immunoprecipitation-sequencing (ChIP-Seq) data are generated to increase the knowledge on DNA-protein interactions in the cell, and accordingly, many tools have been developed for ChIP-Seq analysis. Here, we provide an example of a streamlined workflow for ChIP-Seq data analysis composed of only four packages in Bioconductor: dada2, QuasR, mosaics, and ChIPseeker. 'dada2' performs trimming of the high-throughput sequencing data. 'QuasR' and 'mosaics' perform quality control and mapping of the input reads to the reference genome and peak calling, respectively. Finally, 'ChIPseeker' performs annotation and visualization of the called peaks. This workflow runs well independently of operating systems (e.g., Windows, Mac, or Linux) and processes the input fastq files into various results in one run. R code is available at github: https://github.com/ddhb/Workflow_of_Chipseq.git.
A ChIP-Seq Data Analysis Pipeline Based on Bioconductor Packages
Park, Seung-Jin; Kim, Jong-Hwan; Yoon, Byung-Ha; Kim, Seon-Young
2017-01-01
Nowadays, huge volumes of chromatin immunoprecipitation-sequencing (ChIP-Seq) data are generated to increase the knowledge on DNA-protein interactions in the cell, and accordingly, many tools have been developed for ChIP-Seq analysis. Here, we provide an example of a streamlined workflow for ChIP-Seq data analysis composed of only four packages in Bioconductor: dada2, QuasR, mosaics, and ChIPseeker. ‘dada2’ performs trimming of the high-throughput sequencing data. ‘QuasR’ and ‘mosaics’ perform quality control and mapping of the input reads to the reference genome and peak calling, respectively. Finally, ‘ChIPseeker’ performs annotation and visualization of the called peaks. This workflow runs well independently of operating systems (e.g., Windows, Mac, or Linux) and processes the input fastq files into various results in one run. R code is available at github: https://github.com/ddhb/Workflow_of_Chipseq.git. PMID:28416945
Low, Diana H P; Motakis, Efthymios
2013-10-01
Binding free energy calculations obtained through molecular dynamics simulations reflect intermolecular interaction states through a series of independent snapshots. Typically, the free energies of multiple simulated series (each with slightly different starting conditions) need to be estimated. Previous approaches carry out this task by moving averages at certain decorrelation times, assuming that the system comes from a single conformation description of binding events. Here, we discuss a more general approach that uses statistical modeling, wavelets denoising and hierarchical clustering to estimate the significance of multiple statistically distinct subpopulations, reflecting potential macrostates of the system. We present the deltaGseg R package that performs macrostate estimation from multiple replicated series and allows molecular biologists/chemists to gain physical insight into the molecular details that are not easily accessible by experimental techniques. deltaGseg is a Bioconductor R package available at http://bioconductor.org/packages/release/bioc/html/deltaGseg.html.
Behind Linus's Law: Investigating Peer Review Processes in Open Source
ERIC Educational Resources Information Center
Wang, Jing
2013-01-01
Open source software has revolutionized the way people develop software, organize collaborative work, and innovate. The numerous open source software systems that have been created and adopted over the past decade are influential and vital in all aspects of work and daily life. The understanding of open source software development can enhance its…
Evaluation and selection of open-source EMR software packages based on integrated AHP and TOPSIS.
Zaidan, A A; Zaidan, B B; Al-Haiqi, Ahmed; Kiah, M L M; Hussain, Muzammil; Abdulnabi, Mohamed
2015-02-01
Evaluating and selecting software packages that meet the requirements of an organization are difficult aspects of the software engineering process. Selecting the wrong open-source EMR software package can be costly and may adversely affect business processes and functioning of the organization. This study aims to evaluate and select open-source EMR software packages based on multi-criteria decision-making. A hands-on study was performed and a set of open-source EMR software packages were implemented locally on separate virtual machines to examine the systems more closely. Several measures were specified as the evaluation basis, and the systems were selected based on a set of metric outcomes using Integrated Analytic Hierarchy Process (AHP) and TOPSIS. The experimental results showed that GNUmed and OpenEMR software can provide a better basis on ranking score records than other open-source EMR software packages. Copyright © 2014 Elsevier Inc. All rights reserved.
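The ranking step can be sketched with a plain TOPSIS implementation in Python. This is a generic illustration, not the study's actual criteria or data: the alternatives, scores and weights are hypothetical, and the weights are simply assumed here, whereas the study derives them with AHP.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    benefit[j] is True when larger values of criterion j are better.
    Assumes the alternatives are not all identical.
    """
    n_criteria = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)] for row in matrix]
    columns = list(zip(*weighted))
    best = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical systems scored on usability, support (higher is better)
# and setup hours (lower is better).
systems = ["A", "B", "C"]
matrix = [[9, 8, 2], [5, 5, 5], [1, 2, 9]]
scores = topsis(matrix, weights=[0.5, 0.3, 0.2], benefit=[True, True, False])
ranking = [name for _, name in sorted(zip(scores, systems), reverse=True)]
```

An alternative that is best on every criterion scores 1 and one that is worst on every criterion scores 0; real comparisons fall in between, which is where the choice of weights matters.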
diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data.
Lun, Aaron T L; Smyth, Gordon K
2015-08-19
Chromatin conformation capture with high-throughput sequencing (Hi-C) is a technique that measures the in vivo intensity of interactions between all pairs of loci in the genome. Most conventional analyses of Hi-C data focus on the detection of statistically significant interactions. However, an alternative strategy involves identifying significant changes in the interaction intensity (i.e., differential interactions) between two or more biological conditions. This is more statistically rigorous and may provide more biologically relevant results. Here, we present the diffHic software package for the detection of differential interactions from Hi-C data. diffHic provides methods for read pair alignment and processing, counting into bin pairs, filtering out low-abundance events and normalization of trended or CNV-driven biases. It uses the statistical framework of the edgeR package to model biological variability and to test for significant differences between conditions. Several options for the visualization of results are also included. The use of diffHic is demonstrated with real Hi-C data sets. Performance against existing methods is also evaluated with simulated data. On real data, diffHic is able to successfully detect interactions with significant differences in intensity between biological conditions. It also compares favourably to existing software tools on simulated data sets. These results suggest that diffHic is a viable approach for differential analyses of Hi-C data.
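The counting and filtering steps described above can be illustrated with a short Python sketch. It is not the diffHic API: the function names, the fixed bin size and the read pairs are hypothetical, and the simple count threshold stands in for whatever abundance filter an analysis actually uses.

```python
from collections import Counter

def count_bin_pairs(read_pairs, bin_size):
    """Count read pairs per unordered pair of genomic bins.

    read_pairs holds ((chrom, pos), (chrom, pos)) tuples for the two ends.
    """
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in read_pairs:
        bin_a = (chrom_a, pos_a // bin_size)
        bin_b = (chrom_b, pos_b // bin_size)
        # Sort so that (a, b) and (b, a) land in the same bin pair.
        counts[tuple(sorted((bin_a, bin_b)))] += 1
    return counts

def filter_low_abundance(counts, min_count):
    """Keep only bin pairs supported by at least min_count read pairs."""
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Hypothetical mapped read pairs.
read_pairs = [
    (("chr1", 1500), ("chr1", 52000)),
    (("chr1", 53000), ("chr1", 1200)),
    (("chr1", 100), ("chr2", 900)),
]
counts = count_bin_pairs(read_pairs, bin_size=10000)
kept = filter_low_abundance(counts, min_count=2)
```

The resulting count table, one row per bin pair and one column per library, is the kind of input that a count-based framework such as edgeR can then model.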
NASA Astrophysics Data System (ADS)
Yetman, G.; Downs, R. R.
2011-12-01
Software deployment is needed to process and distribute scientific data throughout the data lifecycle. Developing software in-house can take software development teams away from other software development projects and can require efforts to maintain the software over time. Adopting and reusing software and system modules that have been previously developed by others can reduce in-house software development and maintenance costs and can contribute to the quality of the system being developed. A variety of models are available for reusing and deploying software and systems that have been developed by others. These deployment models include open source software, vendor-supported open source software, commercial software, and combinations of these approaches. Deployment in Earth science data processing and distribution has demonstrated the advantages and drawbacks of each model. Deploying open source software offers advantages for developing and maintaining scientific data processing systems and applications. By joining an open source community that is developing a particular system module or application, a scientific data processing team can contribute to aspects of the software development without having to commit to developing the software alone. Communities of interested developers can share the work while focusing on activities that utilize in-house expertise and addresses internal requirements. Maintenance is also shared by members of the community. Deploying vendor-supported open source software offers similar advantages to open source software. However, by procuring the services of a vendor, the in-house team can rely on the vendor to provide, install, and maintain the software over time. Vendor-supported open source software may be ideal for teams that recognize the value of an open source software component or application and would like to contribute to the effort, but do not have the time or expertise to contribute extensively. 
Vendor-supported software may also have the additional benefits of guaranteed up-time, bug fixes, and vendor-added enhancements. Deploying commercial software can be advantageous for obtaining system or software components offered by a vendor that meet in-house requirements. The vendor can be contracted to provide installation, support and maintenance services as needed. Combining these options offers a menu of choices, enabling selection of system components or software modules that meet the evolving requirements encountered throughout the scientific data lifecycle.
The Emergence of Open-Source Software in North America
ERIC Educational Resources Information Center
Pan, Guohua; Bonk, Curtis J.
2007-01-01
Unlike conventional models of software development, the open source model is based on the collaborative efforts of users who are also co-developers of the software. Interest in open source software has grown exponentially in recent years. A "Google" search for the phrase open source in early 2005 returned 28.8 million webpage hits, while…
Adopting Open Source Software to Address Software Risks during the Scientific Data Life Cycle
NASA Astrophysics Data System (ADS)
Vinay, S.; Downs, R. R.
2012-12-01
Software enables the creation, management, storage, distribution, discovery, and use of scientific data throughout the data lifecycle. However, the capabilities offered by software also present risks for the stewardship of scientific data, since future access to digital data is dependent on the use of software. From operating systems to applications for analyzing data, the dependence of data on software presents challenges for the stewardship of scientific data. Adopting open source software provides opportunities to address some of the proprietary risks of data dependence on software. For example, in some cases, open source software can be deployed to avoid licensing restrictions for using, modifying, and transferring proprietary software. The availability of the source code of open source software also enables the inclusion of modifications, which may be contributed by various community members who are addressing similar issues. Likewise, an active community that is maintaining open source software can be a valuable source of help, providing an opportunity to collaborate to address common issues facing adopters. As part of the effort to meet the challenges of software dependence for scientific data stewardship, risks from software dependence have been identified that exist during various times of the data lifecycle. The identification of these risks should enable the development of plans for mitigating software dependencies, where applicable, using open source software, and to improve understanding of software dependency risks for scientific data and how they can be reduced during the data life cycle.
Free for All: Open Source Software
ERIC Educational Resources Information Center
Schneider, Karen
2008-01-01
Open source software has become a catchword in libraryland. Yet many remain unclear about open source's benefits--or even what it is. So what is open source software (OSS)? It's software that is free in every sense of the word: free to download, free to use, and free to view or modify. Most OSS is distributed on the Web and one doesn't need to…
2016-01-06
of- breed software components and software products lines (SPLs) that are subject to different IP license and cybersecurity requirements. The... commercially priced closed source software components, to be used in the design, implementation, deployment, and evolution of open architecture (OA... breed software components and software products lines (SPLs) that are subject to different IP license and cybersecurity requirements. The Department
Developing open-source codes for electromagnetic geophysics using industry support
NASA Astrophysics Data System (ADS)
Key, K.
2017-12-01
Funding for open-source software development in academia often takes the form of grants and fellowships awarded by government bodies and foundations where there is no conflict-of-interest between the funding entity and the free dissemination of the open-source software products. Conversely, funding for open-source projects in the geophysics industry presents challenges to conventional business models where proprietary licensing offers value that is not present in open-source software. Such proprietary constraints make it easier to convince companies to fund academic software development under exclusive software distribution agreements. A major challenge for obtaining commercial funding for open-source projects is to offer a value proposition that overcomes the criticism that such funding is a give-away to the competition. This work draws upon a decade of experience developing open-source electromagnetic geophysics software for the oil, gas and minerals exploration industry, and examines various approaches that have been effective for sustaining industry sponsorship.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-01
... Packard Company Business Critical Systems, Mission Critical Business Software Division, OpenVMS Operating... Software Division, OpenVMS Operating System Development Group, Including an Employee Operating Out of the..., Mission Critical Business Software Division, OpenVMS Operating System Development Group, including...
A Study of Clinically Related Open Source Software Projects
Hogarth, Michael A.; Turner, Stuart
2005-01-01
Open source software development has recently gained significant interest due to several successful mainstream open source projects. This methodology has been proposed as being similarly viable and beneficial in the clinical application domain as well. However, the clinical software development venue differs significantly from the mainstream software venue. Existing clinical open source projects have not been well characterized nor formally studied so the ‘fit’ of open source in this domain is largely unknown. In order to better understand the open source movement in the clinical application domain, we undertook a study of existing open source clinical projects. In this study we sought to characterize and classify existing clinical open source projects and to determine metrics for their viability. This study revealed several findings which we believe could guide the healthcare community in its quest for successful open source clinical software projects. PMID:16779056
methylPipe and compEpiTools: a suite of R packages for the integrative analysis of epigenomics data.
Kishore, Kamal; de Pretis, Stefano; Lister, Ryan; Morelli, Marco J; Bianchi, Valerio; Amati, Bruno; Ecker, Joseph R; Pelizzola, Mattia
2015-09-29
Numerous methods are available to profile several epigenetic marks, providing data with different genome coverage and resolution. Large epigenomic datasets are then generated, and often combined with other high-throughput data, including RNA-seq, ChIP-seq for transcription factors (TFs) binding and DNase-seq experiments. Despite the numerous computational tools covering specific steps in the analysis of large-scale epigenomics data, comprehensive software solutions for their integrative analysis are still missing. Multiple tools must be identified and combined to jointly analyze histone marks, TFs binding and other -omics data together with DNA methylation data, complicating the analysis of these data and their integration with publicly available datasets. To overcome the burden of integrating various data types with multiple tools, we developed two companion R/Bioconductor packages. The former, methylPipe, is tailored to the analysis of high- or low-resolution DNA methylomes in several species, accommodating (hydroxy-)methyl-cytosines in both CpG and non-CpG sequence context. The analysis of multiple whole-genome bisulfite sequencing experiments is supported, while maintaining the ability of integrating targeted genomic data. The latter, compEpiTools, seamlessly incorporates the results obtained with methylPipe and supports their integration with other epigenomics data. It provides a number of methods to score these data in regions of interest, leading to the identification of enhancers, lncRNAs, and RNAPII stalling/elongation dynamics. Moreover, it allows a fast and comprehensive annotation of the resulting genomic regions, and the association of the corresponding genes with non-redundant GeneOntology terms. Finally, the package includes a flexible method based on heatmaps for the integration of various data types, combining annotation tracks with continuous or categorical data tracks. 
methylPipe and compEpiTools provide a comprehensive Bioconductor-compliant solution for the integrative analysis of heterogeneous epigenomics data. These packages are instrumental in providing biologists with minimal R skills a complete toolkit facilitating the analysis of their own data, or in accelerating the analyses performed by more experienced bioinformaticians.
Ciobanu, O
2009-01-01
The objective of this study was to obtain three-dimensional (3D) images and to perform biomechanical simulations starting from DICOM images obtained by computed tomography (CT). Open source software was used to prepare digitized 2D images of tissue sections and to create 3D reconstructions from the segmented structures. Finally, the 3D images were used in open source software to perform biomechanical simulations. This study demonstrates the applicability and feasibility of currently available open source software for 3D reconstruction and biomechanical simulation. The use of open source software may improve the efficiency of investments in imaging technologies and in CAD/CAM technologies for implant and prosthesis fabrication, which otherwise need expensive specialized software.
Government Technology Acquisition Policy: The Case of Proprietary versus Open Source Software
ERIC Educational Resources Information Center
Hemphill, Thomas A.
2005-01-01
This article begins by explaining the concepts of proprietary and open source software technology, which are now competing in the marketplace. A review of recent individual and cooperative technology development and public policy advocacy efforts, by both proponents of open source software and advocates of proprietary software, subsequently…
Mifsud, Borbala; Martincorena, Inigo; Darbo, Elodie; Sugar, Robert; Schoenfelder, Stefan; Fraser, Peter; Luscombe, Nicholas M
2017-01-01
Hi-C is one of the main methods for investigating spatial co-localisation of DNA in the nucleus. However, the raw sequencing data obtained from Hi-C experiments suffer from large biases and spurious contacts, making it difficult to identify true interactions. Existing methods use complex models to account for biases and do not provide a significance threshold for detecting interactions. Here we introduce a simple binomial probabilistic model that resolves complex biases and distinguishes between true and false interactions. The model corrects biases of known and unknown origin and yields a p-value for each interaction, providing a reliable threshold based on significance. We demonstrate this experimentally by testing the method against a random ligation dataset. Our method outperforms previous methods and provides a statistical framework for further data analysis, such as comparisons of Hi-C interactions between different conditions. GOTHiC is available as a BioConductor package (http://www.bioconductor.org/packages/release/bioc/html/GOTHiC.html).
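A binomial test for a single interaction can be sketched in Python as follows. This is a simplified illustration and not the GOTHiC implementation: it assumes that the chance of a read pair linking two loci under random ligation is proportional to the product of their relative coverages, and the function names and numbers are hypothetical.

```python
import math

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def interaction_p_value(observed, total_pairs, coverage_i, coverage_j):
    """p-value for seeing at least `observed` read pairs between two loci.

    coverage_i and coverage_j are each locus's share of all read ends.
    The factor 2 accounts for either end of a pair mapping to either locus
    (an assumption of this sketch).
    """
    expected_p = 2 * coverage_i * coverage_j
    return binomial_upper_tail(observed, total_pairs, expected_p)

# Hypothetical experiment: 1000 read pairs, loci holding 1% and 2% of read ends.
p_strong = interaction_p_value(5, 1000, 0.01, 0.02)
p_none = interaction_p_value(0, 1000, 0.01, 0.02)
```

With an expected count of about 0.4 read pairs, observing five gives a very small p-value, while observing none gives a p-value of 1. In practice the p-values from all locus pairs would then be corrected for multiple testing.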
OncoSimulR: genetic simulation with arbitrary epistasis and mutator genes in asexual populations.
Diaz-Uriarte, Ramon
2017-06-15
OncoSimulR implements forward-time genetic simulations of biallelic loci in asexual populations with special focus on cancer progression. Fitness can be defined as an arbitrary function of genetic interactions between multiple genes or modules of genes, including epistasis, restrictions in the order of accumulation of mutations, and order effects. Mutation rates can differ among genes, and can be affected by (anti)mutator genes. Also available are sampling from simulations (including single-cell sampling), plotting the genealogical relationships of clones and generating and plotting fitness landscapes. Implemented in R and C++, freely available from BioConductor for Linux, Mac and Windows under the GNU GPL license. Version 2.5.9 or higher available from: http://www.bioconductor.org/packages/devel/bioc/html/OncoSimulR.html. GitHub repository at: https://github.com/rdiaz02/OncoSimul. Contact: ramon.diaz@iib.uam.es. Supplementary data are available at Bioinformatics online.
Vrahatis, Aristidis G; Balomenos, Panos; Tsakalidis, Athanasios K; Bezerianos, Anastasios
2016-12-15
DEsubs is a network-based systems biology R package that extracts disease-perturbed subpathways within a pathway network as recorded by RNA-seq experiments. It contains an extensive and customized framework with a broad range of operation modes at all stages of the subpathway analysis, thus enabling a case-specific approach. The operation modes include pathway network construction and processing, subpathway extraction, visualization and enrichment analysis with regard to various biological and pharmacological features. Its capabilities render DEsubs a tool-guide for both the modeler and experimentalist for the identification of more robust systems-level drug targets and biomarkers for complex diseases. DEsubs is implemented as an R package following Bioconductor guidelines: http://bioconductor.org/packages/DEsubs/. Contact: tassos.bezerianos@nus.edu.sg. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Using Peptide-Level Proteomics Data for Detecting Differentially Expressed Proteins.
Suomi, Tomi; Corthals, Garry L; Nevalainen, Olli S; Elo, Laura L
2015-11-06
Protein expression can be quantified in a high-throughput manner using different types of mass spectrometers. In recent years, label-free methods for determining protein abundance have emerged. Although expression is initially measured at the peptide level, a common approach is to combine the peptide-level measurements into protein-level values before differential expression analysis. However, this simple combination is prone to inconsistencies between peptides and may lose valuable information. To address this, we introduce a method for detecting differentially expressed proteins by combining peptide-level expression-change statistics. Using controlled spike-in experiments, we show that averaging peptide-level expression changes yields more accurate lists of differentially expressed proteins than the conventional protein-level approach. This is particularly true when there are only a few replicate samples or the differences between the sample groups are small. The proposed technique is implemented in the Bioconductor package PECA, which can be downloaded from http://www.bioconductor.org.
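The idea of testing at the peptide level first and aggregating afterwards can be sketched as follows. This is an illustration only, not the PECA implementation: the choice of a Welch t-statistic per peptide, the median as the aggregation rule, and all function names are assumptions made for the example.

```python
import math
from statistics import mean, median, variance


def welch_t(group_a, group_b):
    """Welch t-statistic for one peptide (needs >= 2 values per group
    and non-zero variance in at least one group)."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def protein_scores(peptides):
    """Aggregate peptide-level statistics into one score per protein.

    peptides: iterable of (protein_id, group_a_values, group_b_values),
    where the values are log-scale intensities of one peptide.
    """
    per_protein = {}
    for protein_id, group_a, group_b in peptides:
        per_protein.setdefault(protein_id, []).append(welch_t(group_a, group_b))
    # The median keeps one inconsistent peptide from dominating the score.
    return {pid: median(stats) for pid, stats in per_protein.items()}
```

The contrast with the conventional approach is the order of operations: here each peptide is tested on its own and the statistics are summarized afterwards, rather than summarizing intensities into a protein-level value and testing once.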
ERIC Educational Resources Information Center
Pfaffman, Jay
2008-01-01
Free/Open Source Software (FOSS) applications meet many of the software needs of high school science classrooms. In spite of the availability and quality of FOSS tools, they remain unknown to many teachers and utilized by fewer still. In a world where most software has restrictions on copying and use, FOSS is an anomaly, free to use and to…
Ardal, Christine; Alstadsæter, Annette; Røttingen, John-Arne
2011-09-28
Innovation through an open source model has proven to be successful for software development. This success has led many to speculate if open source can be applied to other industries with similar success. We attempt to provide an understanding of open source software development characteristics for researchers, business leaders and government officials who may be interested in utilizing open source innovation in other contexts and with an emphasis on drug discovery. A systematic review was performed by searching relevant, multidisciplinary databases to extract empirical research regarding the common characteristics and barriers of initiating and maintaining an open source software development project. Common characteristics to open source software development pertinent to open source drug discovery were extracted. The characteristics were then grouped into the areas of participant attraction, management of volunteers, control mechanisms, legal framework and physical constraints. Lastly, their applicability to drug discovery was examined. We believe that the open source model is viable for drug discovery, although it is unlikely that it will exactly follow the form used in software development. Hybrids will likely develop that suit the unique characteristics of drug discovery. We suggest potential motivations for organizations to join an open source drug discovery project. We also examine specific differences between software and medicines, specifically how the need for laboratories and physical goods will impact the model as well as the effect of patents.
An Analysis of Open Source Security Software Products Downloads
ERIC Educational Resources Information Center
Barta, Brian J.
2014-01-01
Despite the continued demand for open source security software, a gap in the identification of success factors related to the success of open source security software persists. There are no studies that accurately assess the extent of this persistent gap, particularly with respect to the strength of the relationships of open source software…
NASA Astrophysics Data System (ADS)
Daniell, James; Simpson, Alanna; Gunasekara, Rashmin; Baca, Abigail; Schaefer, Andreas; Ishizawa, Oscar; Murnane, Rick; Tijssen, Annegien; Deparday, Vivien; Forni, Marc; Himmelfarb, Anne; Leder, Jan
2015-04-01
Over the past few decades, a plethora of open access software packages for the calculation of earthquake, volcanic, tsunami, storm surge, wind and flood risk have been produced globally. As part of the World Bank GFDRR Review released at the Understanding Risk 2014 Conference, over 80 such open access risk assessment software packages were examined. Commercial software was not considered in the evaluation. A preliminary analysis was used to determine whether the 80 models were currently supported and if they were open access. This process was used to select a subset of 31 models that include 8 earthquake models, 4 cyclone models, 11 flood models, and 8 storm surge/tsunami models for more detailed analysis. By using multi-criteria decision analysis (MCDA) and simple descriptions of the software uses, the review allows users to select a few relevant software packages for their own testing and development. The detailed analysis evaluated the models on the basis of over 100 criteria and provides a synopsis of available open access natural hazard risk modelling tools. In addition, volcano software packages have since been added making the compendium of risk software tools in excess of 100. There has been a huge increase in the quality and availability of open access/source software over the past few years. For example, private entities such as Deltares now have an open source policy regarding some flood models (NGHS). In addition, leaders in developing risk models in the public sector, such as Geoscience Australia (EQRM, TCRM, TsuDAT, AnuGA) or CAPRA (ERN-Flood, Hurricane, CRISIS2007 etc.), are launching and/or helping many other initiatives. As we achieve greater interoperability between modelling tools, we will also achieve a future wherein different open source and open access modelling tools will be increasingly connected and adapted towards unified multi-risk model platforms and highly customised solutions. 
It was seen that many software tools could be improved by enabling user-defined exposure and vulnerability. Without this function, many tools can only be used regionally and not at global or continental scale. It is becoming increasingly easy to use multiple packages for a single region and/or hazard to characterize the uncertainty in the risk, or use as checks for the sensitivities in the analysis. There is a potential for valuable synergy between existing software. A number of open source software packages could be combined to generate a multi-risk model with multiple views of a hazard. This extensive review has simply attempted to provide a platform for dialogue between all open source and open access software packages and to hopefully inspire collaboration between developers, given the great work done by all open access and open source developers.
Open-source software: not quite endsville.
Stahl, Matthew T
2005-02-01
Open-source software will never achieve ubiquity. There are environments in which it simply does not flourish. By its nature, open-source development requires free exchange of ideas, community involvement, and the efforts of talented and dedicated individuals. However, pressures can come from several sources that prevent this from happening. In addition, openness and complex licensing issues invite misuse and abuse. Care must be taken to avoid the pitfalls of open-source software.
Open Source Molecular Modeling
Pirhadi, Somayeh; Sunseri, Jocelyn; Koes, David Ryan
2016-01-01
The success of molecular modeling and computational chemistry efforts are, by definition, dependent on quality software applications. Open source software development provides many advantages to users of modeling applications, not the least of which is that the software is free and completely extendable. In this review we categorize, enumerate, and describe available open source software packages for molecular modeling and computational chemistry. PMID:27631126
2011-01-01
Background Innovation through an open source model has proven to be successful for software development. This success has led many to speculate if open source can be applied to other industries with similar success. We attempt to provide an understanding of open source software development characteristics for researchers, business leaders and government officials who may be interested in utilizing open source innovation in other contexts and with an emphasis on drug discovery. Methods A systematic review was performed by searching relevant, multidisciplinary databases to extract empirical research regarding the common characteristics and barriers of initiating and maintaining an open source software development project. Results Common characteristics to open source software development pertinent to open source drug discovery were extracted. The characteristics were then grouped into the areas of participant attraction, management of volunteers, control mechanisms, legal framework and physical constraints. Lastly, their applicability to drug discovery was examined. Conclusions We believe that the open source model is viable for drug discovery, although it is unlikely that it will exactly follow the form used in software development. Hybrids will likely develop that suit the unique characteristics of drug discovery. We suggest potential motivations for organizations to join an open source drug discovery project. We also examine specific differences between software and medicines, specifically how the need for laboratories and physical goods will impact the model as well as the effect of patents. PMID:21955914
2015-05-01
Achieving Better Buying Power through Acquisition of Open Architecture Software Systems for Web-Based and Mobile Devices. Walt Scacchi and Thomas... Topics: open architecture (OA) software systems; emerging challenges in achieving Better Buying Power (BBP) via OA software systems for Web-based and mobile devices.
Khan, Aziz; Fornes, Oriol; Stigliani, Arnaud; Gheorghe, Marius; Castro-Mondragon, Jaime A; van der Lee, Robin; Bessy, Adrien; Chèneby, Jeanne; Kulkarni, Shubhada R; Tan, Ge; Baranasic, Damir; Arenillas, David J; Sandelin, Albin; Vandepoele, Klaas; Lenhard, Boris; Ballester, Benoît; Wasserman, Wyeth W; Parcy, François; Mathelier, Anthony
2018-01-04
JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. In the 2018 release of JASPAR, the CORE collection has been expanded with 322 new PFMs (60 for vertebrates and 262 for plants) and 33 PFMs were updated (24 for vertebrates, 8 for plants and 1 for insects). These new profiles represent a 30% expansion compared to the 2016 release. In addition, we have introduced 316 TFFMs (95 for vertebrates, 218 for plants and 3 for insects). This release incorporates clusters of similar PFMs in each taxon and each TF class per taxon. The JASPAR 2018 CORE vertebrate collection of PFMs was used to predict TF-binding sites in the human genome. The predictions are made available to the scientific community through a UCSC Genome Browser track data hub. Finally, this update comes with a new web framework with an interactive and responsive user-interface, along with new features. All the underlying data can be retrieved programmatically using a RESTful API and through the JASPAR 2018 R/Bioconductor package. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
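To make the role of a position frequency matrix concrete, the sketch below converts a small PFM into a log-odds position weight matrix and scans a sequence for the best-scoring window. The toy matrix is invented for illustration and is not a JASPAR profile; the pseudocount of 0.8 and the uniform background of 0.25 are assumptions, and the function names are hypothetical.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert per-position base counts into log2-odds weights.

    pfm: dict mapping each base to a list of counts, one per position.
    """
    width = len(pfm["A"])
    pwm = []
    for pos in range(width):
        total = sum(pfm[base][pos] for base in BASES) + pseudocount
        column = {}
        for base in BASES:
            prob = (pfm[base][pos] + pseudocount * background) / total
            column[base] = math.log2(prob / background)
        pwm.append(column)
    return pwm


def score(pwm, window):
    """Sum of per-position weights for a window as long as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the best window; the sequence must be at
    least as long as the matrix and contain only A, C, G, T."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Toy 3-bp profile that strongly prefers "ACG" (illustrative only).
toy_pfm = {"A": [8, 0, 0], "C": [0, 8, 0], "G": [0, 0, 8], "T": [0, 0, 0]}
toy_pwm = pfm_to_pwm(toy_pfm)
```

Calling `best_hit(toy_pwm, "TTACGTT")` returns the score and start position of the best match. Genome-wide binding-site prediction applies the same scan to both strands and then thresholds the scores.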
Fornes, Oriol; Stigliani, Arnaud; Gheorghe, Marius; Castro-Mondragon, Jaime A; Bessy, Adrien; Chèneby, Jeanne; Kulkarni, Shubhada R; Tan, Ge; Baranasic, Damir; Arenillas, David J; Vandepoele, Klaas; Parcy, François
2018-01-01
JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. In the 2018 release of JASPAR, the CORE collection has been expanded with 322 new PFMs (60 for vertebrates and 262 for plants) and 33 PFMs were updated (24 for vertebrates, 8 for plants and 1 for insects). These new profiles represent a 30% expansion compared to the 2016 release. In addition, we have introduced 316 TFFMs (95 for vertebrates, 218 for plants and 3 for insects). This release incorporates clusters of similar PFMs in each taxon and each TF class per taxon. The JASPAR 2018 CORE vertebrate collection of PFMs was used to predict TF-binding sites in the human genome. The predictions are made available to the scientific community through a UCSC Genome Browser track data hub. Finally, this update comes with a new web framework with an interactive and responsive user-interface, along with new features. All the underlying data can be retrieved programmatically using a RESTful API and through the JASPAR 2018 R/Bioconductor package. PMID:29140473
Mathelier, Anthony; Fornes, Oriol; Arenillas, David J.; Chen, Chih-yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W.; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W.
2016-01-01
JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. PMID:26531826
The Open Source Teaching Project (OSTP): Research Note.
ERIC Educational Resources Information Center
Hirst, Tony
The Open Source Teaching Project (OSTP) is an attempt to apply a variant of the successful open source software approach to the development of educational materials. Open source software is software licensed in such a way as to allow anyone the right to modify and use it. From such a simple premise, a whole industry has arisen, most notably in the…
ERIC Educational Resources Information Center
Simpson, James Daniel
2014-01-01
Free, libre, and open source software (FLOSS) is software that is collaboratively developed. FLOSS provides end-users with the source code and the freedom to adapt or modify a piece of software to fit their needs (Deek & McHugh, 2008; Stallman, 2010). FLOSS has a 30 year history that dates to the open hacker community at the Massachusetts…
Preparing a scientific manuscript in Linux: Today's possibilities and limitations.
Tchantchaleishvili, Vakhtang; Schmitto, Jan D
2011-10-22
An increasing number of scientists are enthusiastic about using free, open source software for their research. The authors' specific goal was to examine whether a Linux-based operating system with open source software packages would allow the preparation of a submission-ready scientific manuscript without the need for proprietary software. Preparation and editing of scientific manuscripts is possible using Linux and open source software. This letter to the editor describes the key steps for preparing a publication-ready scientific manuscript in a Linux-based operating system and discusses the necessary software components. This manuscript was created using Linux and open source programs for Linux.
NASA Astrophysics Data System (ADS)
Abdullah, Johari Yap; Omar, Marzuki; Pritam, Helmi Mohd Hadi; Husein, Adam; Rajion, Zainul Ahmad
2016-12-01
3D printing of mandible is important for pre-operative planning, diagnostic purposes, as well as for education and training. Currently, the processing of CT data is routinely performed with commercial software which increases the cost of operation and patient management for a small clinical setting. Usage of open-source software as an alternative to commercial software for 3D reconstruction of the mandible from CT data is scarce. The aim of this study is to compare two methods of 3D reconstruction of the mandible using commercial Materialise Mimics software and open-source Medical Imaging Interaction Toolkit (MITK) software. Head CT images with a slice thickness of 1 mm and a matrix of 512x512 pixels each were retrieved from the server located at the Radiology Department of Hospital Universiti Sains Malaysia. The CT data were analysed and the 3D models of mandible were reconstructed using both commercial Materialise Mimics and open-source MITK software. Both virtual 3D models were saved in STL format and exported to 3matic and MeshLab software for morphometric and image analyses. Both models were compared using Wilcoxon Signed Rank Test and Hausdorff Distance. No significant differences were obtained between the 3D models of the mandible produced using Mimics and MITK software. The 3D model of the mandible produced using MITK open-source software is comparable to the commercial MIMICS software. Therefore, open-source software could be used in clinical setting for pre-operative planning to minimise the operational cost.
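The Hausdorff distance used to compare the two reconstructions can be illustrated with a brute-force sketch over mesh vertices. This is not the MeshLab procedure used in the study: it treats each model as a plain list of 3D points, runs in quadratic time, and the function names are hypothetical.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


# Two tiny "models" in millimetres; the second has one outlying vertex.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(hausdorff(model_a, model_b))
```

Because the measure reports the single worst mismatch, a small value means that every vertex of one model lies close to the other model. For meshes with many vertices, a spatial index such as a k-d tree would replace the inner loop.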
Open source IPSEC software in manned and unmanned space missions
NASA Astrophysics Data System (ADS)
Edwards, Jacob
Network security is a major topic of research because cyber attackers pose a threat to national security. Securing ground-space communications for NASA missions is important because attackers could endanger mission success and human lives. This thesis describes how an open source IPsec software package was used to create a secure and reliable channel for ground-space communications. A cost-efficient, reproducible hardware testbed was also created to simulate ground-space communications. The testbed enables simulation of low-bandwidth and high-latency communication links to examine how the open source IPsec software reacts to these network constraints. Test cases were built that allowed for validation of the testbed and the open source IPsec software. The test cases also simulate using an IPsec connection from mission control ground routers to points of interest in outer space. The tested open source IPsec software did not meet all the requirements, and software changes were suggested to meet them.
OSIRIX: open source multimodality image navigation software
NASA Astrophysics Data System (ADS)
Rosset, Antoine; Pysher, Lance; Spadola, Luca; Ratib, Osman
2005-04-01
The goal of our project is to develop a completely new software platform that will allow users to efficiently and conveniently navigate through large sets of multidimensional data without the need for expensive high-end hardware or software. We also elected to develop our system on new open source software libraries, allowing other institutions and developers to contribute to this project. OsiriX is free and open-source imaging software designed to manipulate and visualize large sets of medical images: http://homepage.mac.com/rossetantoine/osirix/
76 FR 75875 - Defense Federal Acquisition Regulation Supplement; Open Source Software Public Meeting
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-05
... Regulation Supplement; Open Source Software Public Meeting AGENCY: Defense Acquisition Regulations System... initiate a dialogue with industry regarding the use of open source software in DoD contracts. DATES: Public... be held in the General Services Administration (GSA), Central Office Auditorium, 1800 F Street NW...
Open Source Software Development and Lotka's Law: Bibliometric Patterns in Programming.
ERIC Educational Resources Information Center
Newby, Gregory B.; Greenberg, Jane; Jones, Paul
2003-01-01
Applies Lotka's Law to metadata on open source software development. Authoring patterns found in software development productivity are found to be comparable to prior studies of Lotka's Law for scientific and scholarly publishing, and offer promise in predicting aggregate behavior of open source developers. (Author/LRW)
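Lotka's Law states that the number of authors making n contributions is roughly proportional to 1/n^a, with a close to 2 in the classical formulation. The sketch below computes the expected shares under that law and estimates the exponent from observed counts by a log-log least-squares fit. The fitting method, the function names, and the sample counts are assumptions for illustration and are not taken from the study.

```python
import math


def lotka_fractions(max_n, exponent=2.0):
    """Expected share of authors with 1..max_n contributions."""
    weights = [1.0 / n ** exponent for n in range(1, max_n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def fit_exponent(counts):
    """Estimate the Lotka exponent from {contributions: number_of_authors}.

    Fits log(authors) = c - a * log(contributions) by ordinary least
    squares and returns a. Needs at least two distinct contribution
    levels, all with positive author counts.
    """
    xs = [math.log(n) for n in counts]
    ys = [math.log(c) for c in counts.values()]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return -slope


# Invented counts that follow an inverse-square law exactly.
observed = {1: 1600, 2: 400, 4: 100}
print(fit_exponent(observed))
```

With real developer metadata the fit is only approximate, and a goodness-of-fit test would be needed before concluding that the pattern matches.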
Cölfen, Helmut; Laue, Thomas M; Wohlleben, Wendel; Schilling, Kristian; Karabudak, Engin; Langhorst, Bradley W; Brookes, Emre; Dubbs, Bruce; Zollars, Dan; Rocco, Mattia; Demeler, Borries
2010-02-01
Progress in analytical ultracentrifugation (AUC) has been hindered by obstructions to hardware innovation and by software incompatibility. In this paper, we announce and outline the Open AUC Project. The goals of the Open AUC Project are to stimulate AUC innovation by improving instrumentation, detectors, acquisition and analysis software, and collaborative tools. These improvements are needed for the next generation of AUC-based research. The Open AUC Project combines on-going work from several different groups. A new base instrument is described, one that is designed from the ground up to be an analytical ultracentrifuge. This machine offers an open architecture, hardware standards, and application programming interfaces for detector developers. All software will use the GNU Public License to assure that intellectual property is available in open source format. The Open AUC strategy facilitates collaborations, encourages sharing, and eliminates the chronic impediments that have plagued AUC innovation for the last 20 years. This ultracentrifuge will be equipped with multiple and interchangeable optical tracks so that state-of-the-art electronics and improved detectors will be available for a variety of optical systems. The instrument will be complemented by a new rotor, enhanced data acquisition and analysis software, as well as collaboration software. Described here are the instrument, the modular software components, and a standardized database that will encourage and ease integration of data analysis and interpretation software.
ArrayInitiative - a tool that simplifies creating custom Affymetrix CDFs
2011-01-01
Background Probes on a microarray represent a frozen view of a genome and are quickly outdated when new sequencing studies extend our knowledge, resulting in significant measurement error when analyzing any microarray experiment. There are several bioinformatics approaches to improve probe assignments, but without in-house programming expertise, standardizing these custom array specifications as a usable file (e.g. as Affymetrix CDFs) is difficult, owing mostly to the complexity of the specification file format. However, without correctly standardized files there is a significant barrier for testing competing analysis approaches since this file is one of the required inputs for many commonly used algorithms. The need to test combinations of probe assignments and analysis algorithms led us to develop ArrayInitiative, a tool for creating and managing custom array specifications. Results ArrayInitiative is a standalone, cross-platform, rich client desktop application for creating correctly formatted, custom versions of manufacturer-provided (default) array specifications, requiring only minimal knowledge of the array specification rules and file formats. Users can import default array specifications, import probe sequences for a default array specification, design and import a custom array specification, export any array specification to multiple output formats, export the probe sequences for any array specification and browse high-level information about the microarray, such as version and number of probes. The initial release of ArrayInitiative supports the Affymetrix 3' IVT expression arrays we currently analyze, but as an open source application, we hope that others will contribute modules for other platforms. Conclusions ArrayInitiative allows researchers to create new array specifications, in a standard format, based upon their own requirements. This makes it easier to test competing design and analysis strategies that depend on probe definitions. 
Since the custom array specifications are easily exported to the manufacturer's standard format, researchers can analyze these customized microarray experiments using established software tools, such as those available in Bioconductor. PMID:21548938
Preparing a scientific manuscript in Linux: Today's possibilities and limitations
2011-01-01
Background An increasing number of scientists are enthusiastic about using free, open source software for their research. The authors' specific goal was to examine whether a Linux-based operating system with open source software packages would allow the preparation of a submission-ready scientific manuscript without the need for proprietary software. Findings Preparation and editing of scientific manuscripts is possible using Linux and open source software. This letter to the editor describes the key steps for preparing a publication-ready scientific manuscript in a Linux-based operating system and discusses the necessary software components. This manuscript was created using Linux and open source programs for Linux. PMID:22018246
The case for open-source software in drug discovery.
DeLano, Warren L
2005-02-01
Widespread adoption of open-source software for network infrastructure, web servers, code development, and operating systems leads one to ask how far it can go. Will "open source" spread broadly, or will it be restricted to niches frequented by hopeful hobbyists and midnight hackers? Here we identify reasons for the success of open-source software and predict how consumers in drug discovery will benefit from new open-source products that address their needs with increased flexibility and in ways complementary to proprietary options.
Integrating open-source software applications to build molecular dynamics systems.
Allen, Bruce M; Predecki, Paul K; Kumosa, Maciej
2014-04-05
Three open-source applications, NanoEngineer-1, packmol, and mis2lmp are integrated using an open-source file format to quickly create molecular dynamics (MD) cells for simulation. The three software applications collectively make up the open-source software (OSS) suite known as MD Studio (MDS). The software is validated through software engineering practices and is verified through simulation of the diglycidyl ether of bisphenol-a and isophorone diamine (DGEBA/IPD) system. Multiple simulations are run using the MDS software to create MD cells, and the data generated are used to calculate density, bulk modulus, and glass transition temperature of the DGEBA/IPD system. Simulation results compare well with published experimental and numerical results. The MDS software prototype confirms that OSS applications can be analyzed against real-world research requirements and integrated to create a new capability. Copyright © 2014 Wiley Periodicals, Inc.
The Visible Human Data Sets (VHD) and Insight Toolkit (ITk): Experiments in Open Source Software
Ackerman, Michael J.; Yoo, Terry S.
2003-01-01
From its inception in 1989, the Visible Human Project was designed as an experiment in open source software. In 1994 and 1995 the male and female Visible Human data sets were released by the National Library of Medicine (NLM) as open source data sets. In 2002 the NLM released the first version of the Insight Toolkit (ITk) as open source software. PMID:14728278
Free and open source software for the manipulation of digital images.
Solomon, Robert W
2009-06-01
Free and open source software is a type of software that is nearly as powerful as commercial software but is freely downloadable. This software can do almost everything that the expensive programs can. GIMP (GNU Image Manipulation Program) is a free program comparable to Photoshop, and versions are available for Windows, Macintosh, and Linux platforms. This article briefly describes how GIMP can be installed and used to manipulate radiology images. It is no longer necessary to budget large amounts of money for high-quality software to achieve the goals of image processing and document creation because free and open source software is available for the user to download at will.
The 2017 Bioinformatics Open Source Conference (BOSC)
Harris, Nomi L.; Cock, Peter J.A.; Chapman, Brad; Fields, Christopher J.; Hokamp, Karsten; Lapp, Hilmar; Munoz-Torres, Monica; Tzovaras, Bastian Greshake; Wiencko, Heather
2017-01-01
The Bioinformatics Open Source Conference (BOSC) is a meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. The 18th annual BOSC ( http://www.open-bio.org/wiki/BOSC_2017) took place in Prague, Czech Republic in July 2017. The conference brought together nearly 250 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, open and reproducible science, and this year’s theme, open data. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community, called the OBF Codefest. PMID:29118973
The 2017 Bioinformatics Open Source Conference (BOSC).
Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Munoz-Torres, Monica; Tzovaras, Bastian Greshake; Wiencko, Heather
2017-01-01
The Bioinformatics Open Source Conference (BOSC) is a meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. The 18th annual BOSC ( http://www.open-bio.org/wiki/BOSC_2017) took place in Prague, Czech Republic in July 2017. The conference brought together nearly 250 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, open and reproducible science, and this year's theme, open data. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community, called the OBF Codefest.
Open core control software for surgical robots.
Arata, Jumpei; Kozuka, Hiroaki; Kim, Hyung Wook; Takesue, Naoyuki; Vladimirov, B; Sakaguchi, Masamichi; Tokuda, Junichi; Hata, Nobuhiko; Chinzei, Kiyoyuki; Fujimoto, Hideo
2010-05-01
Patients and doctors in the operating room are now surrounded by many medical devices as a result of recent advances in medical technology. However, these cutting-edge devices work independently rather than collaborating with each other, even though collaboration between devices such as navigation systems and medical imaging devices is becoming very important for accomplishing complex surgical tasks (such as a tumor removal procedure while checking the tumor location in neurosurgery). Meanwhile, several surgical robots have been commercialized and are becoming common, but these robots are not currently open to collaboration with external medical devices. A cutting-edge "intelligent surgical robot" will become possible through collaboration among surgical robots, various kinds of sensors, navigation systems and so on. In addition, most academic software for surgical robots is "home-made" within individual research institutions and not open to the public. Open source control software for surgical robots can therefore be beneficial in this field. From these perspectives, we developed the Open Core Control software for surgical robots to overcome these challenges. In general, control software has hardware dependencies based on actuators, sensors and various kinds of internal devices, so it cannot be used on different types of robots without modification. However, the structure of the Open Core Control software can be reused for various types of robots by abstracting the hardware-dependent parts. In addition, network connectivity is crucial for collaboration between advanced medical devices. OpenIGTLink is adopted in the Interface class, which communicates with external medical devices. At the same time, it is essential to maintain stable operation during asynchronous data transactions over the network. 
Several techniques for this purpose were introduced in the Open Core Control software. The virtual fixture is a well-known technique that acts as a "force guide", helping operators perform precise manipulation with a master-slave robot. A virtual fixture for precise and safe surgery was implemented on the system to demonstrate high-level collaboration between a surgical robot and a navigation system. The virtual fixture extension is not part of the Open Core Control system; however, a function such as the virtual fixture cannot be realized without tight collaboration between cutting-edge medical devices. With the virtual fixture, operators can pre-define an accessible area on the navigation system, and the area information is transferred to the robot. The surgical console then generates a reflection force when the operator tries to leave the pre-defined accessible area during surgery. The Open Core Control software was implemented on a surgical master-slave robot, and stable operation was observed in a motion test. The tip of the surgical robot was displayed on a navigation system by connecting the surgical robot with a 3D position sensor through OpenIGTLink. The accessible area was pre-defined before the operation, and the virtual fixture was displayed as a "force guide" on the surgical console. In addition, the system showed stable performance in a duration test with network disturbance. In this paper, the design of the Open Core Control software for surgical robots and the implementation of the virtual fixture were described. The Open Core Control software was implemented on a surgical robot system and showed stable performance in high-level collaborative work. The Open Core Control software is being developed to become a widely used platform for surgical robots. Safety is essential for the control software of these complex medical devices. 
It is important to follow global specifications such as the FDA requirement "General Principles of Software Validation" or IEC62304. To follow these regulations, it is important to develop a self-test environment. Therefore, a test environment is now under development to test various kinds of interference in the operating room, such as noise from an electric knife, taking into account safety and test environment regulations such as ISO13849 and IEC60508. The Open Core Control software is currently being developed in an open-source manner and is available on the Internet. Adoption of common software interfaces is becoming a major trend in this field. From this perspective, the Open Core Control software can be expected to contribute to this field.
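The reflection force described in the abstract above can be illustrated with a minimal sketch. This is not the Open Core Control implementation: the spherical accessible area, the spring-like force law, and the names `reflection_force`, `center`, `radius` and `stiffness` are all assumptions made for illustration only.

```python
import math

def reflection_force(tip, center, radius, stiffness):
    """Force pushing the tool tip back toward the accessible area.

    Assumption: the accessible area is a sphere (center, radius) and the
    force grows linearly with how far the tip has left it.
    """
    offset = [t - c for t, c in zip(tip, center)]
    dist = math.sqrt(sum(d * d for d in offset))
    penetration = dist - radius
    if penetration <= 0.0:
        # Tip is inside the accessible area: no guiding force.
        return (0.0, 0.0, 0.0)
    # Spring-like force directed from the tip back toward the center.
    scale = -stiffness * penetration / dist
    return tuple(scale * d for d in offset)

inside = reflection_force((0.2, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0, 100.0)
outside = reflection_force((2.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0, 100.0)
```

In this sketch the operator feels nothing while the tip stays inside the sphere and an increasing restoring force once it leaves. A real system would receive the area definition from the navigation system rather than hard-coding it.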
Journal of Open Source Software (JOSS): design and first-year review
NASA Astrophysics Data System (ADS)
Smith, Arfon M.
2018-01-01
JOSS is a free and open-access journal that publishes articles describing research software across all disciplines. It has the dual goals of improving the quality of the software submitted and providing a mechanism for research software developers to receive credit. While designed to work within the current merit system of science, JOSS addresses the dearth of rewards for key contributions to science made in the form of software. JOSS publishes articles that encapsulate scholarship contained in the software itself, and its rigorous peer review targets the software components: functionality, documentation, tests, continuous integration, and the license. A JOSS article contains an abstract describing the purpose and functionality of the software, references, and a link to the software archive. JOSS published more than 100 articles in its first year, many from the scientific Python ecosystem (including a number of articles related to astronomy and astrophysics). JOSS is a sponsored project of the nonprofit organization NumFOCUS and is an affiliate of the Open Source Initiative. In this presentation, I'll describe the motivation, design, and progress of the Journal of Open Source Software (JOSS) and how it compares to other avenues for publishing research software in astronomy.
ERIC Educational Resources Information Center
Kamthan, Pankaj
2007-01-01
Open Source Software (OSS) has introduced a new dimension in software community. As the development and use of OSS becomes prominent, the question of its integration in education arises. In this paper, the following practices fundamental to projects and processes in software engineering are examined from an OSS perspective: project management;…
Open source software integrated into data services of Japanese planetary explorations
NASA Astrophysics Data System (ADS)
Yamamoto, Y.; Ishihara, Y.; Otake, H.; Imai, K.; Masuda, K.
2015-12-01
Scientific data obtained by Japanese scientific satellites and lunar and planetary explorations are archived in DARTS (Data ARchives and Transmission System). DARTS provides the data through simple methods such as HTTP directory listing for long-term preservation, while also trying to provide rich web applications for ease of access, built with modern web technologies based on open source software. This presentation showcases the use of open source software throughout our services. KADIAS is a web-based application to search, analyze, and obtain scientific data measured by SELENE (Kaguya), a Japanese lunar orbiter. KADIAS uses OpenLayers to display maps distributed from a Web Map Service (WMS). The open source software MapServer is adopted as the WMS server. KAGUYA 3D GIS (KAGUYA 3D Moon NAVI) provides a virtual globe for SELENE's data. The main purpose of this application is public outreach. It is developed with the NASA World Wind Java SDK. C3 (Cross-Cutting Comparisons) is a tool to compare data from various observations and simulations. It uses Highcharts to draw graphs in web browsers. FLOW is a tool to simulate the field of view of an instrument onboard a spacecraft. This tool itself is open source software developed by JAXA/ISAS, and its license is the BSD 3-Clause License. The SPICE Toolkit is essential to compile FLOW. The SPICE Toolkit is also open source software, developed by NASA/JPL, whose website distributes data for many spacecraft. Nowadays, open source software is an indispensable tool for integrating DARTS services.
Oros Klein, Kathleen; Grinek, Stepan; Bernatsky, Sasha; Bouchard, Luigi; Ciampi, Antonio; Colmegna, Ines; Fortin, Jean-Philippe; Gao, Long; Hivert, Marie-France; Hudson, Marie; Kobor, Michael S; Labbe, Aurelie; MacIsaac, Julia L; Meaney, Michael J; Morin, Alexander M; O'Donnell, Kieran J; Pastinen, Tomi; Van Ijzendoorn, Marinus H; Voisin, Gregory; Greenwood, Celia M T
2016-02-15
DNA methylation patterns are well known to vary substantially across cell types or tissues. Hence, existing normalization methods may not be optimal if they do not take this into account. We therefore present a new R package for normalization of data from the Illumina Infinium Human Methylation450 BeadChip (Illumina 450 K) built on the concepts in the recently published funNorm method, and introducing cell-type or tissue-type flexibility. funtooNorm is relevant for data sets containing samples from two or more cell or tissue types. A visual display of cross-validated errors informs the choice of the optimal number of components in the normalization. Benefits of cell (tissue)-specific normalization are demonstrated in three data sets. Improvement can be substantial; it is strikingly better on chromosome X, where methylation patterns have unique inter-tissue variability. An R package is available at https://github.com/GreenwoodLab/funtooNorm, and has been submitted to Bioconductor at http://bioconductor.org. © The Author 2015. Published by Oxford University Press.
SpidermiR: An R/Bioconductor Package for Integrative Analysis with miRNA Data.
Cava, Claudia; Colaprico, Antonio; Bertoli, Gloria; Graudenzi, Alex; Silva, Tiago C; Olsen, Catharina; Noushmehr, Houtan; Bontempi, Gianluca; Mauri, Giancarlo; Castiglioni, Isabella
2017-01-27
Gene Regulatory Networks (GRNs) control many biological systems, but how such network coordination is shaped is still unknown. GRNs can be subdivided into basic connections that describe how the network members interact, e.g., co-expression, physical interaction, co-localization, genetic influence, pathways, and shared protein domains. The important regulatory mechanisms of these networks involve miRNAs. We developed an R/Bioconductor package, SpidermiR, which offers easy access to both GRNs and miRNAs to the end user, and integrates this information with differentially expressed genes obtained from The Cancer Genome Atlas. Specifically, SpidermiR allows users to: (i) query and download GRNs and miRNAs from validated and predicted repositories; (ii) integrate miRNAs with GRNs in order to obtain miRNA-gene-gene and miRNA-protein-protein interactions, and to analyze miRNA GRNs in order to identify miRNA-gene communities; and (iii) graphically visualize the results of the analyses. These analyses can be performed through a single interface and without the need for any downloads. The full data sets are then rapidly integrated and processed locally.
TSSi--an R package for transcription start site identification from 5' mRNA tag data.
Kreutz, C; Gehring, J S; Lang, D; Reski, R; Timmer, J; Rensing, S A
2012-06-15
High-throughput sequencing has become an essential experimental approach for the investigation of transcriptional mechanisms. For some applications like ChIP-seq, several approaches for the prediction of peak locations exist. However, these methods are not designed for the identification of transcription start sites (TSSs) because such datasets contain qualitatively different noise. In this application note, the R package TSSi is presented which provides a heuristic framework for the identification of TSSs based on 5' mRNA tag data. Probabilistic assumptions for the distribution of the data, i.e. for the observed positions of the mapped reads, as well as for systematic errors, i.e. for reads which map closely but not exactly to a real TSS, are made and can be adapted by the user. The framework also comprises a regularization procedure which can be applied as a preprocessing step to decrease the noise and thereby reduce the number of false predictions. The R package TSSi is available from the Bioconductor web site: www.bioconductor.org/packages/release/bioc/html/TSSi.html.
NASA's Earth Imagery Service as Open Source Software
NASA Astrophysics Data System (ADS)
De Cesare, C.; Alarcon, C.; Huang, T.; Roberts, J. T.; Rodriguez, J.; Cechini, M. F.; Boller, R. A.; Baynes, K.
2016-12-01
The NASA Global Imagery Browse Service (GIBS) is a software system that provides access to an archive of historical and near-real-time Earth imagery from NASA-supported satellite instruments. The imagery itself is open data, and is accessible via standards such as the Open Geospatial Consortium (OGC)'s Web Map Tile Service (WMTS) protocol. GIBS includes three core software projects: The Imagery Exchange (TIE), OnEarth, and the Meta Raster Format (MRF) project. These projects are developed using a variety of open source software, including: Apache HTTPD, GDAL, Mapserver, Grails, Zookeeper, Eclipse, Maven, git, and Apache Commons. TIE has recently been released for open source, and is now available on GitHub. OnEarth, MRF, and their sub-projects have been on GitHub since 2014, and the MRF project in particular receives many external contributions from the community. Our software has been successful beyond the scope of GIBS: the PO.DAAC State of the Ocean and COVERAGE visualization projects reuse components from OnEarth. The MRF source code has recently been incorporated into GDAL, which is a core library in many widely-used GIS software such as QGIS and GeoServer. This presentation will describe the challenges faced in incorporating open software and open data into GIBS, and also showcase GIBS as a platform on which scientists and the general public can build their own applications.
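As an illustration of the WMTS access mentioned in the abstract above, the sketch below builds a key-value-pair `GetTile` request URL using the parameter names defined by the OGC WMTS 1.0.0 standard. The base URL, layer name and tile matrix set used here are placeholders, not actual GIBS values, and `wmts_get_tile_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

def wmts_get_tile_url(base_url, layer, tile_matrix_set, tile_matrix,
                      row, col, image_format="image/png", style="default"):
    """Build a WMTS 1.0.0 KVP GetTile URL (no request is sent)."""
    params = [
        ("SERVICE", "WMTS"),
        ("REQUEST", "GetTile"),
        ("VERSION", "1.0.0"),
        ("LAYER", layer),
        ("STYLE", style),
        ("TILEMATRIXSET", tile_matrix_set),
        ("TILEMATRIX", str(tile_matrix)),
        ("TILEROW", str(row)),
        ("TILECOL", str(col)),
        ("FORMAT", image_format),
    ]
    return base_url + "?" + urlencode(params)

# Placeholder endpoint and identifiers, for illustration only.
url = wmts_get_tile_url("https://example.org/wmts", "example_layer",
                        "example_matrix_set", 2, 1, 3)
```

A client would consult the service's `GetCapabilities` document to discover the real layer identifiers, tile matrix sets and supported formats before issuing such requests.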
Developing an Open Source Option for NASA Software
NASA Technical Reports Server (NTRS)
Moran, Patrick J.; Parks, John W. (Technical Monitor)
2003-01-01
We present arguments in favor of developing an Open Source option for NASA software; in particular we discuss how Open Source is compatible with NASA's mission. We compare and contrast several of the leading Open Source licenses, and propose one - the Mozilla license - for use by NASA. We also address some of the related issues for NASA with respect to Open Source. In particular, we discuss some of the elements in the External Release of NASA Software document (NPG 2210.1A) that will likely have to be changed in order to make Open Source a reality within the agency.
Analysis of Cisco Open Network Environment (ONE) OpenFlow Controller Implementation
2014-08-01
Software-Defined Networking (SDN), when fully realized, offer many improvements over the current rigid and...functionalities like handshake, connection setup, switch management, and security. 15. SUBJECT TERMS OpenFlow, software-defined networking, Cisco ONE, SDN...innovating packet-forwarding technologies. Network device roles are strictly defined with little or no flexibility. In Software-Defined Networks (SDNs),
ERIC Educational Resources Information Center
Ge, Xun; Huang, Kun; Dong, Yifei
2010-01-01
A semester-long ethnography study was carried out to investigate project-based learning in a graduate software engineering course through the implementation of an Open-Source Software Development (OSSD) learning environment, which featured authentic projects, learning community, cognitive apprenticeship, and technology affordances. The study…
puma: a Bioconductor package for propagating uncertainty in microarray analysis.
Pearson, Richard D; Liu, Xuejun; Sanguinetti, Guido; Milo, Marta; Lawrence, Neil D; Rattray, Magnus
2009-07-09
Most analyses of microarray data are based on point estimates of expression levels and ignore the uncertainty of such estimates. By determining uncertainties from Affymetrix GeneChip data and propagating these uncertainties to downstream analyses it has been shown that we can improve results of differential expression detection, principal component analysis and clustering. Previously, implementations of these uncertainty propagation methods have only been available as separate packages, written in different languages. Previous implementations have also suffered from being very costly to compute, and in the case of differential expression detection, have been limited in the experimental designs to which they can be applied. puma is a Bioconductor package incorporating a suite of analysis methods for use on Affymetrix GeneChip data. puma extends the differential expression detection methods of previous work from the 2-class case to the multi-factorial case. puma can be used to automatically create design and contrast matrices for typical experimental designs, which can be used both within the package itself but also in other Bioconductor packages. The implementation of differential expression detection methods has been parallelised leading to significant decreases in processing time on a range of computer architectures. puma incorporates the first R implementation of an uncertainty propagation version of principal component analysis, and an implementation of a clustering method based on uncertainty propagation. All of these techniques are brought together in a single, easy-to-use package with clear, task-based documentation. For the first time, the puma package makes a suite of uncertainty propagation methods available to a general audience. These methods can be used to improve results from more traditional analyses of microarray data. puma also offers improvements in terms of scope and speed of execution over previously available methods. 
puma is recommended for anyone working with the Affymetrix GeneChip platform for gene expression analysis and can also be applied more generally.
Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data.
Paulson, Joseph N; Chen, Cho-Yi; Lopes-Ramos, Camila M; Kuijjer, Marieke L; Platig, John; Sonawane, Abhijeet R; Fagny, Maud; Glass, Kimberly; Quackenbush, John
2017-10-03
Although ultrahigh-throughput RNA-Sequencing has become the dominant technology for genome-wide transcriptional profiling, the vast majority of RNA-Seq studies typically profile only tens of samples, and most analytical pipelines are optimized for these smaller studies. However, projects are generating ever-larger data sets comprising RNA-Seq data from hundreds or thousands of samples, often collected at multiple centers and from diverse tissues. These complex data sets present significant analytical challenges due to batch and tissue effects, but provide the opportunity to revisit the assumptions and methods that we use to preprocess, normalize, and filter RNA-Seq data - critical first steps for any subsequent analysis. We find that analysis of large RNA-Seq data sets requires both careful quality control and the need to account for sparsity due to the heterogeneity intrinsic in multi-group studies. We developed Yet Another RNA Normalization software pipeline (YARN), that includes quality control and preprocessing, gene filtering, and normalization steps designed to facilitate downstream analysis of large, heterogeneous RNA-Seq data sets and we demonstrate its use with data from the Genotype-Tissue Expression (GTEx) project. An R package instantiating YARN is available at http://bioconductor.org/packages/yarn .
OntoCAT -- simple ontology search and integration in Java, R and REST/JavaScript
2011-01-01
Background: Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately, map their contents to research data, and much of this effort is being replicated across different research groups. Results: OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy to learn Java, Bioconductor/R and REST web service commands enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application. Conclusions: OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases. Availability: http://www.ontocat.org PMID:21619703
OntoCAT--simple ontology search and integration in Java, R and REST/JavaScript.
Adamusiak, Tomasz; Burdett, Tony; Kurbatova, Natalja; Joeri van der Velde, K; Abeygunawardena, Niran; Antonakaki, Despoina; Kapushesky, Misha; Parkinson, Helen; Swertz, Morris A
2011-05-29
Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately, map their contents to research data, and much of this effort is being replicated across different research groups. OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy to learn Java, Bioconductor/R and REST web service commands enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application. OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases. http://www.ontocat.org.
Goldstein, Darlene R
2006-10-01
Studies of gene expression using high-density short oligonucleotide arrays have become a standard in a variety of biological contexts. Of the expression measures that have been proposed to quantify expression in these arrays, multi-chip-based measures have been shown to perform well. As gene expression studies increase in size, however, utilizing multi-chip expression measures is more challenging in terms of computing memory requirements and time. A strategic alternative to exact multi-chip quantification on a full large chip set is to approximate expression values based on subsets of chips. This paper introduces an extrapolation method, Extrapolation Averaging (EA), and a resampling method, Partition Resampling (PR), to approximate expression in large studies. An examination of properties indicates that subset-based methods can perform well compared with exact expression quantification. The focus is on short oligonucleotide chips, but the same ideas apply equally well to any array type for which expression is quantified using an entire set of arrays, rather than for only a single array at a time. Software implementing Partition Resampling and Extrapolation Averaging is under development as an R package for the BioConductor project.
Espino, Jeremy U; Wagner, M; Szczepaniak, C; Tsui, F C; Su, H; Olszewski, R; Liu, Z; Chapman, W; Zeng, X; Ma, L; Lu, Z; Dara, J
2004-09-24
Computer-based outbreak and disease surveillance requires high-quality software that is well-supported and affordable. Developing software in an open-source framework, which entails free distribution and use of software and continuous, community-based software development, can produce software with such characteristics, and can do so rapidly. The objective of the Real-Time Outbreak and Disease Surveillance (RODS) Open Source Project is to accelerate the deployment of computer-based outbreak and disease surveillance systems by writing software and catalyzing the formation of a community of users, developers, consultants, and scientists who support its use. The University of Pittsburgh seeded the Open Source Project by releasing the RODS software under the GNU General Public License. An infrastructure was created, consisting of a website, mailing lists for developers and users, designated software developers, and shared code-development tools. These resources are intended to encourage growth of the Open Source Project community. Progress is measured by assessing website usage, number of software downloads, number of inquiries, number of system deployments, and number of new features or modules added to the code base. During September--November 2003, users generated 5,370 page views of the project website, 59 software downloads, 20 inquiries, one new deployment, and addition of four features. Thus far, health departments and companies have been more interested in using the software as is than in customizing or developing new features. The RODS laboratory anticipates that after initial installation has been completed, health departments and companies will begin to customize the software and contribute their enhancements to the public code base.
Open Source, Openness, and Higher Education
ERIC Educational Resources Information Center
Wiley, David
2006-01-01
In this article David Wiley provides an overview of how the general expansion of open source software has affected the world of education in particular. In doing so, Wiley not only addresses the development of open source software applications for teachers and administrators, he also discusses how the fundamental philosophy of the open source…
Open source molecular modeling.
Pirhadi, Somayeh; Sunseri, Jocelyn; Koes, David Ryan
2016-09-01
The success of molecular modeling and computational chemistry efforts is, by definition, dependent on quality software applications. Open source software development provides many advantages to users of modeling applications, not the least of which is that the software is free and completely extendable. In this review we categorize, enumerate, and describe available open source software packages for molecular modeling and computational chemistry. An updated online version of this catalog can be found at https://opensourcemolecularmodeling.github.io. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
Open Source Paradigm: A Synopsis of The Cathedral and the Bazaar for Health and Social Care.
Benson, Tim
2016-07-04
Open source software (OSS) is becoming more fashionable in health and social care, although the ideas are not new. However, progress has been slower than many had expected. The purpose is to summarise the Free/Libre Open Source Software (FLOSS) paradigm in terms of what it is, how it impacts users and software engineers and how it can work as a business model in health and social care sectors. Much of this paper is a synopsis of Eric Raymond's seminal book The Cathedral and the Bazaar, which was the first comprehensive description of the open source ecosystem, set out in three long essays. Direct quotes from the book are used liberally, without reference to specific passages. The first part contrasts open and closed source approaches to software development and support. The second part describes the culture and practices of the open source movement. The third part considers business models. A key benefit of open source is that users can access and collaborate on improving the software if they wish. Closed source code may be regarded as a strategic business risk that may be unacceptable if there is an open source alternative. The sharing culture of the open source movement fits well with that of health and social care.
Looking toward the Future: A Case Study of Open Source Software in the Humanities
ERIC Educational Resources Information Center
Quamen, Harvey
2006-01-01
In this article Harvey Quamen examines how the philosophy of open source software might be of particular benefit to humanities scholars in the near future--particularly for academic journals with limited financial resources. To this end he provides a case study in which he describes his use of open source technology (MySQL database software and…
Open Source Software in Medium Size Organizations: Key Factors for Adoption
ERIC Educational Resources Information Center
Solomon, Jerry T.
2010-01-01
For-profit organizations are constantly evaluating new technologies to gain competitive advantage. One such technology, application software, has changed significantly over the past 25 years with the introduction of Open Source Software (OSS). In contrast to commercial software that is developed by private companies and sold to organizations, OSS…
Busby, Ben; Lesko, Matthew; Federer, Lisa
2016-01-01
In genomics, bioinformatics and other areas of data science, gaps exist between extant public datasets and the open-source software tools built by the community to analyze similar data types. The purpose of biological data science hackathons is to assemble groups of genomics or bioinformatics professionals and software developers to rapidly prototype software to address these gaps. The only two rules for the NCBI-assisted hackathons run so far are that 1) data either must be housed in public data repositories or be deposited to such repositories shortly after the hackathon's conclusion, and 2) all software comprising the final pipeline must be open-source or open-use. Proposed topics, as well as suggested tools and approaches, are distributed to participants at the beginning of each hackathon and refined during the event. Software, scripts, and pipelines are developed and published on GitHub, a web service providing publicly available, free-usage tiers for collaborative software development. The code resulting from each hackathon is published at https://github.com/NCBI-Hackathons/ with separate directories or repositories for each team.
The 2016 Bioinformatics Open Source Conference (BOSC).
Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather
2016-01-01
Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC ( http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.
ERIC Educational Resources Information Center
Thankachan, Briju; Moore, David Richard
2017-01-01
The use of Free and Open Source Software (FOSS), a subset of Information and Communication Technology (ICT), can reduce the cost of purchasing software. Despite the benefit in the initial purchase price of software, deploying software requires total cost that goes beyond the initial purchase price. Total cost is a silent issue of FOSS and can only…
ERIC Educational Resources Information Center
Kapor, Mitchell
2005-01-01
Open source software projects involve the production of goods, but in software projects, the "goods" consist of information. The open source model is an alternative to the conventional centralized, command-and-control way in which things are usually made. In contrast, open source projects are genuinely decentralized and transparent. Transparent…
Open core control software for surgical robots
Kozuka, Hiroaki; Kim, Hyung Wook; Takesue, Naoyuki; Vladimirov, B.; Sakaguchi, Masamichi; Tokuda, Junichi; Hata, Nobuhiko; Chinzei, Kiyoyuki; Fujimoto, Hideo
2010-01-01
Objective: Recent advances in medical technology have surrounded patients and doctors in the operating room with many medical devices. However, these cutting-edge devices work independently and do not collaborate with each other, even though collaboration between devices such as navigation systems and medical imaging devices is becoming very important for accomplishing complex surgical tasks (such as removing a tumor while checking its location in neurosurgery). Several surgical robots have been commercialized and are becoming common, but they are not currently open to collaboration with external medical devices. A cutting-edge “intelligent surgical robot” will become possible when surgical robots collaborate with various kinds of sensors, navigation systems and so on. In addition, most academic software for surgical robots is “home-made” within research institutions and not open to the public. Open source control software for surgical robots can therefore be beneficial in this field. From these perspectives, we developed the Open Core Control software for surgical robots to overcome these challenges. Materials and methods: In general, control software has hardware dependencies based on actuators, sensors and various kinds of internal devices, so it cannot be used on different types of robots without modification. The structure of the Open Core Control software, however, can be reused for various types of robots by abstracting the hardware-dependent parts. In addition, network connectivity is crucial for collaboration between advanced medical devices. OpenIGTLink is adopted in the Interface class, which communicates with external medical devices. At the same time, it is essential to maintain stable operation during asynchronous data transactions over the network, and several techniques for this purpose were introduced in the Open Core Control software. The virtual fixture is a well-known technique that acts as a “force guide”, helping operators perform precise manipulation with a master–slave robot. A virtual fixture for precise and safe surgery was implemented on the system to demonstrate the idea of high-level collaboration between a surgical robot and a navigation system. The virtual fixture extension is not part of the Open Core Control system; however, a function such as the virtual fixture cannot be realized without tight collaboration between cutting-edge medical devices. Using the virtual fixture, operators can pre-define an accessible area on the navigation system, and the area information is transferred to the robot. The surgical console then generates a reflection force when the operator tries to leave the pre-defined accessible area during surgery. Results: The Open Core Control software was implemented on a surgical master–slave robot, and stable operation was observed in a motion test. The tip of the surgical robot was displayed on a navigation system by connecting the surgical robot with a 3D position sensor through OpenIGTLink. The accessible area was pre-defined before the operation, and the virtual fixture was displayed as a “force guide” on the surgical console. In addition, the system showed stable performance in a duration test with network disturbance. Conclusion: This paper described the design of the Open Core Control software for surgical robots and the implementation of a virtual fixture. The software was implemented on a surgical robot system and showed stable performance in high-level collaborative work. The Open Core Control software is being developed to become a widely used platform for surgical robots. Safety is essential for the control software of such complex medical devices, and it is important to follow global specifications such as the FDA requirement “General Principles of Software Validation” or IEC62304. To follow these regulations, it is important to develop a self-test environment. A test environment is therefore under development to test various kinds of interference in the operating room, such as noise from an electric knife, taking into account safety and test environment regulations such as ISO13849 and IEC60508. The Open Core Control software is currently being developed in an open-source manner and is available on the Internet. Adopting common software interfaces is becoming a major trend in this field, and from this perspective the Open Core Control software can be expected to contribute to the field. PMID:20033506
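The reflection-force behaviour of the virtual fixture described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the Open Core Control software: the spherical accessible area, the spring-like force law, the stiffness value and the function name `reflection_force` are all assumptions made for illustration.

```python
import math

def reflection_force(tip, center, radius, stiffness=200.0):
    """Return a force vector pushing the tool tip back into the accessible area.

    Assumption: the accessible area is a sphere (center, radius) and the
    force grows linearly with how far the tip has left the sphere.
    """
    offset = [t - c for t, c in zip(tip, center)]
    distance = math.sqrt(sum(d * d for d in offset))
    if distance <= radius:
        # Inside the pre-defined area: no guiding force is applied.
        return (0.0, 0.0, 0.0)
    penetration = distance - radius
    # Force points from the tip back toward the center of the sphere.
    return tuple(-stiffness * penetration * d / distance for d in offset)

inside = reflection_force((0.0, 0.0, 0.5), (0.0, 0.0, 0.0), 1.0)
outside = reflection_force((0.0, 0.0, 1.5), (0.0, 0.0, 0.0), 1.0)
```

In a real system the accessible area would come from the navigation system rather than a hard-coded sphere, and the force would be sent to the surgical console at the control-loop rate.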
Open Source and Design Thinking at NASA: A Vision for Future Software
NASA Technical Reports Server (NTRS)
Trimble, Jay
2017-01-01
NASA Mission Control Software for the Visualization of data has historically been closed, accessible only to small groups of flight controllers, often bound to a specific mission discipline such as flight dynamics, health and status or mission planning. Open Mission Control Technologies (MCT) provides new capability for NASA mission controllers and, by being fully open source, opens up NASA software for the visualization of mission data to broader communities inside and outside of NASA. Open MCT is the product of a design thinking process within NASA, using participatory design and design sprints to build a product that serves users.
Locking Down the Software Development Environment
2014-12-01
OpenSSL code [13]. The OpenSSL software is, as the name implies, open source, a result of many developers coding beginning in 1998 using the C...programming language to build crypto services. OpenSSL is used widely both on the Internet and in firmware [13], further delaying the ability of many
Open Source Software Development
2011-01-01
Software, 2002, 149(1), 3-17. 3. DiBona, C., Cooper, D., and Stone, M. (Eds.), Open Sources 2.0, 2005, O’Reilly Media, Sebastopol, CA. Also see, C... DiBona, S. Ockman, and M. Stone (Eds.). Open Sources: Voices from the Open Source Revolution, 1999. O’Reilly Media, Sebastopol, CA. 4. Ducheneaut, N
Improving Software Sustainability: Lessons Learned from Profiles in Science.
Gallagher, Marie E
2013-01-01
The Profiles in Science® digital library features digitized surrogates of historical items selected from the archival collections of the U.S. National Library of Medicine as well as collaborating institutions. In addition, it contains a database of descriptive, technical and administrative metadata. It also contains various software components that allow creation of the metadata, management of the digital items, and access to the items and metadata through the Profiles in Science Web site [1]. The choices made building the digital library were designed to maximize the sustainability and long-term survival of all of the components of the digital library [2]. For example, selecting standard and open digital file formats rather than proprietary formats increases the sustainability of the digital files [3]. Correspondingly, using non-proprietary software may improve the sustainability of the software--either through in-house expertise or through the open source community. Limiting our digital library software exclusively to open source software or to software developed in-house has not been feasible. For example, we have used proprietary operating systems, scanning software, a search engine, and office productivity software. We did this when either lack of essential capabilities or the cost-benefit trade-off favored using proprietary software. We also did so knowing that in the future we would need to replace or upgrade some of our proprietary software, analogous to migrating from an obsolete digital file format to a new format as the technological landscape changes. Since our digital library's start in 1998, all of its software has been upgraded or replaced, but the digitized items have not yet required migration to other formats. Technological changes that compelled us to replace proprietary software included the cost of product licensing, product support, incompatibility with other software, prohibited use due to evolving security policies, and product abandonment. 
Sometimes these changes happen on short notice, so we continually monitor our library's software for signs of endangerment. We have attempted to replace proprietary software with suitable in-house or open source software. When the replacement involves a standalone piece of software with a nearly equivalent version, such as replacing a commercial HTTP server with an open source HTTP server, the replacement is straightforward. Recently we replaced software that functioned not only as our search engine but also as the backbone of the architecture of our Web site. In this paper, we describe the lessons learned and the pros and cons of replacing this software with open source software.
ERIC Educational Resources Information Center
Lin, Yu-Wei; Zini, Enrico
2008-01-01
This empirical paper shows how free/libre open source software (FLOSS) contributes to mutual and collaborative learning in an educational environment. Unlike proprietary software, FLOSS allows extensive customisation of software to support the needs of local users better. This also allows users to participate more proactively in the development…
Shaping Software Engineering Curricula Using Open Source Communities: A Case Study
ERIC Educational Resources Information Center
Bowring, James; Burke, Quinn
2016-01-01
This paper documents four years of a novel approach to teaching a two-course sequence in software engineering as part of the ABET-accredited computer science curriculum at the College of Charleston. This approach is team-based and centers on learning software engineering in the context of open source software projects. In the first course, teams…
The Value of Open Source Software Tools in Qualitative Research
ERIC Educational Resources Information Center
Greenberg, Gary
2011-01-01
In an era of global networks, researchers using qualitative methods must consider the impact of any software they use on the sharing of data and findings. In this essay, I identify researchers' main areas of concern regarding the use of qualitative software packages for research. I then examine how open source software tools, wherein the publisher…
Mathelier, Anthony; Fornes, Oriol; Arenillas, David J; Chen, Chih-Yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W
2016-01-04
JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
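To make the notion of a position frequency matrix concrete, the following Python sketch converts a small count matrix into log-odds weights and scores candidate sites. The toy counts, the pseudocount, the uniform background of 0.25 and the function names are illustrative assumptions; this is not the JASPAR API or any JASPAR profile.

```python
import math

BASES = "ACGT"

def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights.

    pfm maps each base to a list of counts, one per motif position.
    The pseudocount and uniform background are assumptions of this sketch.
    """
    length = len(pfm["A"])
    pwm = {base: [] for base in BASES}
    for pos in range(length):
        total = sum(pfm[base][pos] for base in BASES)
        for base in BASES:
            prob = (pfm[base][pos] + pseudocount * background) / (total + pseudocount)
            pwm[base].append(math.log2(prob / background))
    return pwm

def score_site(pwm, site):
    """Sum the weights of the bases observed at each position of the site."""
    return sum(pwm[base][i] for i, base in enumerate(site))

# Hypothetical 3-position profile, not a real JASPAR entry.
toy_pfm = {"A": [8, 0, 1], "C": [0, 9, 1], "G": [1, 0, 7], "T": [0, 0, 0]}
toy_pwm = pfm_to_pwm(toy_pfm)
consensus_score = score_site(toy_pwm, "ACG")
mismatch_score = score_site(toy_pwm, "TTT")
```

A site matching the most frequent base at each position scores above zero, while a site built from bases never observed in the profile scores below zero.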
3D reconstruction software comparison for short sequences
NASA Astrophysics Data System (ADS)
Strupczewski, Adam; Czupryński, Błażej
2014-11-01
Large-scale multiview reconstruction has recently become a very popular area of research. There are many open source tools that can be downloaded and run on a personal computer. However, there are few, if any, comparisons of the available software in terms of accuracy on small datasets that a single user can create. The typical datasets for testing the software are archeological sites or cities, comprising thousands of images. This paper presents a comparison of currently available open source multiview reconstruction software for small datasets. It also compares the open source solutions with a simple structure-from-motion pipeline developed by the authors from scratch using the OpenCV and Eigen libraries.
The 2016 Bioinformatics Open Source Conference (BOSC)
Harris, Nomi L.; Cock, Peter J.A.; Chapman, Brad; Fields, Christopher J.; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather
2016-01-01
Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC ( http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science. PMID:27781083
pubmed.mineR: an R package with text-mining algorithms to analyse PubMed abstracts.
Rani, Jyoti; Shah, A B Rauf; Ramachandran, Srinivasan
2015-10-01
The PubMed literature database is a valuable source of information for scientific research. It is rich in biomedical literature with more than 24 million citations. Data-mining of voluminous literature is a challenging task. Although several text-mining algorithms focused on data visualization have been developed in recent years, they are limited in speed, are rigid, and are not available as open source. We have developed an R package, pubmed.mineR, in which we have combined the advantages of existing algorithms, overcome their limitations, and offered user flexibility and links with other packages in Bioconductor and the Comprehensive R Archive Network (CRAN) in order to expand the user's capabilities for executing multifaceted approaches. Three case studies are presented, namely, 'Evolving role of diabetes educators', 'Cancer risk assessment' and 'Dynamic concepts on disease and comorbidity' to illustrate the use of pubmed.mineR. The package generally runs fast, with small elapsed times on regular workstations even for large corpus sizes and compute-intensive functions. pubmed.mineR is available at http://cran.r-project.org/web/packages/pubmed.mineR.
2015-08-21
using the Open Computer Vision ( OpenCV ) libraries [6] for computer vision and the Qt library [7] for the user interface. The software has the...depth. The software application calibrates the cameras using the plane based calibration model from the OpenCV calib3D module and allows the...6] OpenCV . 2015. OpenCV Open Source Computer Vision. [Online]. Available at: opencv.org [Accessed]: 09/01/2015. [7] Qt. 2015. Qt Project home
Managing Digital Archives Using Open Source Software Tools
NASA Astrophysics Data System (ADS)
Barve, S.; Dongare, S.
2007-10-01
This paper describes the use of open source software tools such as MySQL and PHP for creating database-backed websites. Such websites offer many advantages over ones built from static HTML pages. This paper will discuss how OSS tools are used and their benefits, and after the successful implementation of these tools how the library took the initiative in implementing an institutional repository using DSpace open source software.
2016-02-22
SPONSORED REPORT SERIES Achieving Better Buying Power through Acquisition of Open Architecture Software Systems for Web and Mobile Devices 22...ACQUISITION RESEARCH PROGRAM SPONSORED REPORT SERIES Achieving Better Buying Power through Acquisition of Open Architecture Software Systems for Web ...Policy Naval Postgraduate School Executive Summary Many people within large enterprises rely on up to four Web -based or mobile devices for their
CrossTalk: The Journal of Defense Software Engineering. Volume 24, Number 6. November/December 2011
2011-11-01
Software Development.” Software Quality Professional Journal, American Society for Quality (ASQ), (March 2010) 4-14. 3. Nair, Gopalakrishnan T.R...Inspection Performance Metric”. Software Quality Professional Journal, American Society for Quality (ASQ), Volume 13, Issue 2, (March 2011) 14-26...the discovery process and are marketed by companies such as Black Duck Software, OpenLogic, Palamida, and Protecode, among others.7 A number of open
Busby, Ben; Lesko, Matthew; Federer, Lisa
2016-01-01
In genomics, bioinformatics and other areas of data science, gaps exist between extant public datasets and the open-source software tools built by the community to analyze similar data types. The purpose of biological data science hackathons is to assemble groups of genomics or bioinformatics professionals and software developers to rapidly prototype software to address these gaps. The only two rules for the NCBI-assisted hackathons run so far are that 1) data either must be housed in public data repositories or be deposited to such repositories shortly after the hackathon’s conclusion, and 2) all software comprising the final pipeline must be open-source or open-use. Proposed topics, as well as suggested tools and approaches, are distributed to participants at the beginning of each hackathon and refined during the event. Software, scripts, and pipelines are developed and published on GitHub, a web service providing publicly available, free-usage tiers for collaborative software development. The code resulting from each hackathon is published at https://github.com/NCBI-Hackathons/ with separate directories or repositories for each team. PMID:27134733
Katzman, G L; Morris, D; Lauman, J; Cochella, C; Goede, P; Harnsberger, H R
2001-06-01
The aim is to foster a community-supported evaluation process for open-source digital teaching file (DTF) development and maintenance. The mechanisms used to support this process will include standard web browsers, web servers, forum software, and custom additions to the forum software to potentially enable a mediated voting protocol. The web server will also serve as a focal point for beta and release software distribution, which is the desired end-goal of this process. We foresee that www.mdtf.org will provide for widespread distribution of open source DTF software that will incorporate function and interface design decisions arising from community participation on the website forums.
The Role of Standards in Cloud-Computing Interoperability
2012-10-01
services are not shared outside the organization. CloudStack, Eucalyptus, HP, Microsoft, OpenStack, Ubuntu, and VMWare provide tools for building...center requirements • Developing usage models for cloud vendors • Independent IT consortium OpenStack http://www.openstack.org • Open-source...software for running private clouds • Currently consists of three core software projects: OpenStack Compute (Nova), OpenStack Object Storage (Swift
Open Source Software and the Intellectual Commons.
ERIC Educational Resources Information Center
Dorman, David
2002-01-01
Discusses the Open Source Software method of software development and its relationship to control over information content. Topics include digital library resources; reference services; preservation; the legal and economic status of information; technical standards; access to digital data; control of information use; and copyright and patent laws.…
Open source hardware and software platform for robotics and artificial intelligence applications
NASA Astrophysics Data System (ADS)
Liang, S. Ng; Tan, K. O.; Lai Clement, T. H.; Ng, S. K.; Mohammed, A. H. Ali; Mailah, Musa; Azhar Yussof, Wan; Hamedon, Zamzuri; Yussof, Zulkifli
2016-02-01
Recent developments in open source hardware and software platforms (Android, Arduino, Linux, OpenCV etc.) have enabled rapid development of previously expensive and sophisticated systems on a lower budget and with a flatter learning curve for developers. Using these platforms, we designed and developed a Java-based 3D robotic simulation system with a graph database, which is integrated in online and offline modes with an Android-Arduino based rubbish-picking remote control car. The combination of open source hardware and software created a flexible and expandable platform for further development in both the software and hardware areas, in particular in combination with a graph database for artificial intelligence, as well as with more sophisticated hardware such as legged or humanoid robots.
Creating an open environment software infrastructure
NASA Technical Reports Server (NTRS)
Jipping, Michael J.
1992-01-01
As the development of complex computer hardware accelerates at increasing rates, the ability of software to keep pace is essential. The development of software design tools, however, is falling behind the development of hardware for several reasons, the most prominent of which is the lack of a software infrastructure to provide an integrated environment for all parts of a software system. The research was undertaken to provide a basis for answering this problem by investigating the requirements of open environments.
Power enhancement via multivariate outlier testing with gene expression arrays.
Asare, Adam L; Gao, Zhong; Carey, Vincent J; Wang, Richard; Seyfert-Margolis, Vicki
2009-01-01
As the use of microarrays in human studies continues to increase, stringent quality assurance is necessary to ensure accurate experimental interpretation. We present a formal approach for microarray quality assessment that is based on dimension reduction of established measures of signal and noise components of expression followed by parametric multivariate outlier testing. We applied our approach to several data resources. First, as a negative control, we found that the Affymetrix and Illumina contributions to MAQC data were free from outliers at a nominal outlier flagging rate of alpha=0.01. Second, we created a tunable framework for artificially corrupting intensity data from the Affymetrix Latin Square spike-in experiment to allow investigation of sensitivity and specificity of quality assurance (QA) criteria. Third, we applied the procedure to 507 Affymetrix microarray GeneChips processed with RNA from human peripheral blood samples. We show that exclusion of arrays by this approach substantially increases inferential power, or the ability to detect differential expression, in large clinical studies. http://bioconductor.org/packages/2.3/bioc/html/arrayMvout.html and http://bioconductor.org/packages/2.3/bioc/html/affyContam.html affyContam (credentials: readonly/readonly)
RELIC: a novel dye-bias correction method for Illumina Methylation BeadChip.
Xu, Zongli; Langie, Sabine A S; De Boever, Patrick; Taylor, Jack A; Niu, Liang
2017-01-03
The Illumina Infinium HumanMethylation450 BeadChip and its successor, the Infinium MethylationEPIC BeadChip, have been extensively utilized in epigenome-wide association studies. Both arrays use two fluorescent dyes (Cy3-green/Cy5-red) to measure methylation level at CpG sites. However, performance differences between dyes can result in biased estimates of methylation levels. Here we describe a novel method, called REgression on Logarithm of Internal Control probes (RELIC), to correct for dye bias on the whole array by utilizing the intensity values of paired internal control probes that monitor the two color channels. We evaluate the method in several datasets against other widely used dye-bias correction methods. Results on data quality improvement showed that RELIC correction statistically significantly outperforms alternative dye-bias correction methods. We incorporated the method into the R package ENmix, which is freely available from the Bioconductor website (https://www.bioconductor.org/packages/release/bioc/html/ENmix.html). RELIC is an efficient and robust method to correct for dye bias in Illumina Methylation BeadChip data. It outperforms alternative methods and is conveniently implemented in the R package ENmix to facilitate DNA methylation studies.
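The general idea of regressing the logarithm of one channel's control-probe intensities on the other's can be sketched in Python as follows. This is only an illustration of the idea named in the abstract, not the ENmix implementation: which channel is adjusted, the simple least-squares fit, the toy intensities and the function names are all assumptions.

```python
import math

def fit_log_regression(red_controls, green_controls):
    """Least-squares fit of log(green) = intercept + slope * log(red).

    Inputs are intensities of paired internal control probes
    (hypothetical values in this sketch).
    """
    x = [math.log(r) for r in red_controls]
    y = [math.log(g) for g in green_controls]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def correct_red(red_values, intercept, slope):
    """Map red-channel intensities onto the green-channel scale (assumed direction)."""
    return [math.exp(intercept + slope * math.log(r)) for r in red_values]

# Toy controls in which the green channel reads twice as bright as red.
red_ctrl = [100.0, 200.0, 400.0, 800.0]
green_ctrl = [200.0, 400.0, 800.0, 1600.0]
intercept, slope = fit_log_regression(red_ctrl, green_ctrl)
corrected = correct_red([300.0], intercept, slope)
```

With these toy values the fitted slope is 1 and the intercept is log 2, so a red intensity of 300 is mapped to roughly 600 on the green scale.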
ctsGE-clustering subgroups of expression data.
Sharabi-Schwager, Michal; Or, Etti; Ophir, Ron
2017-07-01
A pre-requisite to clustering noisy data, such as gene-expression data, is the filtering step. As an alternative to this step, the ctsGE R-package applies a sorting step in which all of the data are divided into small groups. The groups are divided according to how the time points are related to the time-series median. Then clustering is performed separately on each group. Thus, the clustering is done in two steps. First, an expression index (i.e. a sequence of 1, -1 and 0) is defined and genes with the same index are grouped together, and then each group of genes is clustered by k-means to create subgroups. The ctsGE package also provides an interactive tool to visualize and explore the gene-expression patterns and their subclusters. ctsGE proposes a way of organizing and exploring expression data without eliminating valuable information. Availability: freely available as part of the Bioconductor project at https://bioconductor.org/packages/ctsGE/. Contact: ron@agri.gov.il. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
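The first of the two steps described above, building an expression index and grouping genes that share it, can be sketched in Python. The package itself is written in R; the fixed cutoff on the deviation from each gene's median, the toy data and the function names here are assumptions for illustration, and the second step (k-means within each group) is omitted.

```python
from collections import defaultdict
from statistics import median

def expression_index(series, cutoff=0.5):
    """Return a tuple of 1, -1 or 0 per time point, relative to the series median.

    Assumption: a value counts as 'up' (1) or 'down' (-1) only when it differs
    from the gene's median by more than `cutoff`; otherwise it is 0.
    """
    centre = median(series)
    index = []
    for value in series:
        if value - centre > cutoff:
            index.append(1)
        elif value - centre < -cutoff:
            index.append(-1)
        else:
            index.append(0)
    return tuple(index)

def group_by_index(expression, cutoff=0.5):
    """Group gene names by their expression index."""
    groups = defaultdict(list)
    for gene, series in expression.items():
        groups[expression_index(series, cutoff)].append(gene)
    return dict(groups)

# Hypothetical time series for three genes at four time points.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [0.0, 1.0, 2.0, 3.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
groups = group_by_index(toy)
```

Here geneA and geneB share the rising pattern (-1, 0, 0, 1) and land in one group, while geneC, which falls over time, gets the index (1, 0, 0, -1); each group would then be clustered separately.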
Develop Direct Geo-referencing System Based on Open Source Software and Hardware Platform
NASA Astrophysics Data System (ADS)
Liu, H. S.; Liao, H. M.
2015-08-01
A direct geo-referencing system uses remote sensing technology to quickly capture images, GPS tracks, and camera positions. These data allow the construction of large volumes of images with geographic coordinates, so that users can take measurements directly on the images. To calculate positions correctly, all sensor signals must be synchronized. Traditional aerial photography uses a Position and Orientation System (POS) to integrate images, coordinates and camera position. However, a POS is very expensive, and users cannot use the results immediately because the position information is not embedded in the images. For reasons of economy and efficiency, this study aims to develop a direct geo-referencing system based on open source software and hardware platforms. After using an Arduino microcontroller board to integrate the signals, we calculate positions with the open source software OpenCV. Finally, we use the open source panorama browser panini and integrate all of these into the open source GIS software Quantum GIS. In this way, a complete data collection and processing system can be constructed.
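The synchronization requirement mentioned in this abstract can be illustrated with a small Python sketch that assigns a position to an image by interpolating a GPS track at the image's timestamp. The abstract does not say how the authors synchronize their signals; linear interpolation, the track layout and the function name are assumptions of this sketch.

```python
from bisect import bisect_left

def interpolate_position(track, timestamp):
    """Linearly interpolate (lat, lon) at `timestamp`.

    `track` is a list of (time_s, lat, lon) tuples sorted by time
    (hypothetical layout for this sketch).
    """
    times = [point[0] for point in track]
    if not times[0] <= timestamp <= times[-1]:
        raise ValueError("timestamp lies outside the GPS track")
    i = bisect_left(times, timestamp)
    if times[i] == timestamp:
        # Exact fix available: no interpolation needed.
        return track[i][1], track[i][2]
    t0, lat0, lon0 = track[i - 1]
    t1, lat1, lon1 = track[i]
    weight = (timestamp - t0) / (t1 - t0)
    return lat0 + weight * (lat1 - lat0), lon0 + weight * (lon1 - lon0)

# Two GPS fixes ten seconds apart and an image taken halfway between them.
toy_track = [(0.0, 10.0, 20.0), (10.0, 11.0, 22.0)]
image_position = interpolate_position(toy_track, 5.0)
```

The interpolated coordinates could then be written into the image metadata so that position information travels with the image.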
Evaluation of software maintainability with openEHR - a comparison of architectures.
Atalag, Koray; Yang, Hong Yul; Tempero, Ewan; Warren, James R
2014-11-01
To assess whether it is easier to maintain a clinical information system developed using openEHR model driven development versus mainstream methods. A new open source application (GastrOS) has been developed following openEHR's multi-level modelling approach using .Net/C# based on the same requirements of an existing clinically used application developed using Microsoft Visual Basic and Access database. Almost all the domain knowledge was embedded into the software code and data model in the latter. The same domain knowledge has been expressed as a set of openEHR Archetypes in GastrOS. We then introduced eight real-world change requests that had accumulated during live clinical usage, and implemented these in both systems while measuring time for various development tasks and change in software size for each change request. Overall it took half the time to implement changes in GastrOS. However it was the more difficult application to modify for one change request, suggesting the nature of change is also important. It was not possible to implement changes by modelling only. Comparison of relative measures of time and software size change within each application highlights how architectural differences affected maintainability across change requests. The use of openEHR model driven development can result in better software maintainability. The degree to which openEHR affects software maintainability depends on the extent and nature of domain knowledge involved in changes. Although we used relative measures for time and software size, confounding factors could not be totally excluded as a controlled study design was not feasible. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
The Commercial Open Source Business Model
NASA Astrophysics Data System (ADS)
Riehle, Dirk
Commercial open source software projects are open source software projects that are owned by a single firm that derives a direct and significant revenue stream from the software. Commercial open source at first glance represents an economic paradox: How can a firm earn money if it is making its product available for free as open source? This paper presents the core properties of commercial open source business models and discusses how they work. Using a commercial open source approach, firms can get to market faster with a superior product at lower cost than possible for traditional competitors. The paper shows how these benefits accrue from an engaged and self-supporting user community. Lacking any prior comprehensive reference, this paper is based on an analysis of public statements by practitioners of commercial open source. It forges the various anecdotes into a coherent description of revenue generation strategies and relevant business functions.
BioContainers: an open-source and community-driven framework for software standardization
da Veiga Leprevost, Felipe; Grüning, Björn A.; Alves Aflitos, Saulo; Röst, Hannes L.; Uszkoreit, Julian; Barsnes, Harald; Vaudel, Marc; Moreno, Pablo; Gatto, Laurent; Weber, Jonas; Bai, Mingze; Jimenez, Rafael C.; Sachsenberg, Timo; Pfeuffer, Julianus; Vera Alvarez, Roberto; Griss, Johannes; Nesvizhskii, Alexey I.; Perez-Riverol, Yasset
2017-01-01
Motivation: BioContainers (biocontainers.pro) is an open-source and community-driven framework which provides platform independent executable environments for bioinformatics software. BioContainers allows labs of all sizes to easily install bioinformatics software, maintain multiple versions of the same software and combine tools into powerful analysis pipelines. BioContainers is based on the popular open-source Docker and rkt frameworks, which allow software to be installed and executed in an isolated and controlled environment. It also provides infrastructure and basic guidelines to create, manage and distribute bioinformatics containers with a special focus on omics technologies. These containers can be integrated into more comprehensive bioinformatics pipelines and different architectures (local desktop, cloud environments or HPC clusters). Availability and Implementation: The software is freely available at github.com/BioContainers/. Contact: yperez@ebi.ac.uk PMID:28379341
NASA Astrophysics Data System (ADS)
Xing, Fangyuan; Wang, Honghuan; Yin, Hongxi; Li, Ming; Luo, Shenzi; Wu, Chenguang
2016-02-01
With the extensive application of cloud computing and data centres, as well as constantly emerging services, big data with bursty characteristics has brought huge challenges to optical networks. Consequently, the software defined optical network (SDON), which combines optical networks with software defined networking (SDN), has attracted much attention. In this paper, an OpenFlow-enabled optical node employed in an optical cross-connect (OXC) and a reconfigurable optical add/drop multiplexer (ROADM) is proposed. The routing strategies of an open source OpenFlow controller are extended. In addition, an experimental platform based on the OpenFlow protocol for software defined optical networks is designed. The feasibility and availability of the OpenFlow-enabled optical nodes and the extended OpenFlow controller are validated by connectivity, protection switching and load balancing experiments on this test platform.
Weather forecasting with open source software
NASA Astrophysics Data System (ADS)
Rautenhaus, Marc; Dörnbrack, Andreas
2013-04-01
To forecast the weather situation during aircraft-based atmospheric field campaigns, we employ a tool chain of existing and self-developed open source software tools and open standards. Of particular value are the Python programming language with its extension libraries NumPy, SciPy, PyQt4, Matplotlib and the basemap toolkit, the NetCDF standard with the Climate and Forecast (CF) Metadata conventions, and the Open Geospatial Consortium Web Map Service standard. These open source libraries and open standards helped to implement the "Mission Support System", a Web Map Service based tool to support weather forecasting and flight planning during field campaigns. The tool has been implemented in Python and has also been released as open source (Rautenhaus et al., Geosci. Model Dev., 5, 55-71, 2012). In this presentation we discuss the usage of free and open source software for weather forecasting in the context of research flight planning, and highlight how the field campaign work benefits from using open source tools and open standards.
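The Web Map Service standard mentioned in the abstract above defines a GetMap request made up of a fixed set of query parameters. The following is a minimal sketch of building such a request URL for WMS 1.3.0 in Python. It is not taken from the Mission Support System; the server address and layer name are placeholders, and the function name build_getmap_url is an assumption made for illustration.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height, time=None):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat); CRS:84 uses
    longitude/latitude axis order, so the values are passed through as-is.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty value requests the default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    if time is not None:
        # Optional TIME dimension, e.g. a forecast valid time in ISO 8601.
        params["TIME"] = time
    return base_url + "?" + urlencode(params)


# Hypothetical server and layer name, for illustration only.
url = build_getmap_url(
    "https://example.org/wms", "temperature",
    (-10, 40, 30, 70), 800, 600, time="2013-04-01T12:00:00Z",
)
print(url)
```

A client would fetch the resulting URL and display the returned PNG; which layers, styles and times are actually available depends on the server's GetCapabilities response.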
Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field.
Wójcikowski, Maciej; Zielenkiewicz, Piotr; Siedlecki, Pawel
2015-01-01
There has been huge progress in the open cheminformatics field in both methods and software development. Unfortunately, there has been little effort to unite those methods and software into one package. We here describe the Open Drug Discovery Toolkit (ODDT), which aims to fulfill the need for comprehensive and open source drug discovery software. The Open Drug Discovery Toolkit was developed as a free and open source tool for both computer aided drug discovery (CADD) developers and researchers. ODDT reimplements many state-of-the-art methods, such as machine learning scoring functions (RF-Score and NNScore) and wraps other external software to ease the process of developing CADD pipelines. ODDT is an out-of-the-box solution designed to be easily customizable and extensible. Therefore, users are strongly encouraged to extend it and develop new methods. We here present three use cases for ODDT in common tasks in computer-aided drug discovery. Open Drug Discovery Toolkit is released under a permissive 3-clause BSD license for both academic and industrial use. ODDT's source code, additional examples and documentation are available on GitHub (https://github.com/oddt/oddt).
Understanding How the "Open" of Open Source Software (OSS) Will Improve Global Health Security.
Hahn, Erin; Blazes, David; Lewis, Sheri
2016-01-01
Improving global health security will require bold action in all corners of the world, particularly in developing settings, where poverty often contributes to an increase in emerging infectious diseases. In order to mitigate the impact of emerging pandemic threats, enhanced disease surveillance is needed to improve early detection and rapid response to outbreaks. However, the technology to facilitate this surveillance is often unattainable because of high costs, software and hardware maintenance needs, limited technical competence among public health officials, and internet connectivity challenges experienced in the field. One potential solution is to leverage open source software, a concept that is unfortunately often misunderstood. This article describes the principles and characteristics of open source software and how it may be applied to solve global health security challenges.
NASA Astrophysics Data System (ADS)
Hasan, B.; Hasbullah; Purnama, W.; Hery, A.
2016-04-01
The development of software-based creative industries using Free Open Source Software (FOSS) is expected to be one way to foster new student entrepreneurs who can create job opportunities and contribute to economic development in Indonesia. This study aims to create an entrepreneurial coaching model for the software-based creative industries that uses FOSS, and to build understanding of, and foster, software-based creative-industry entrepreneurship among students of Universitas Pendidikan Indonesia (UPI). The activity begins with identifying the entrepreneurs or software technology businesses to be developed, followed by training and mentoring, apprenticeship at industrial partners, creation of business plans, and monitoring and evaluation. It involves 30 UPI students who are motivated to become self-employed and are competent in information technology. The expected outcome is a number of new student entrepreneurs engaged in the software industry, in commerce (e-commerce), education/learning (e-learning/LMS), and games.
The open-source movement: an introduction for forestry professionals
Patrick Proctor; Paul C. Van Deusen; Linda S. Heath; Jeffrey H. Gove
2005-01-01
In recent years, the open-source movement has yielded a generous and powerful suite of software and utilities that rivals those developed by many commercial software companies. Open-source programs are available for many scientific needs: operating systems, databases, statistical analysis, Geographic Information System applications, and object-oriented programming....
Rey-Martinez, Jorge; Pérez-Fernández, Nicolás
2016-12-01
The proposed validation goal of 0.9 for the intra-class correlation coefficient was reached with the results of this study. With the obtained results we consider the developed software (RombergLab) to be validated balance assessment software. The reliability of this software depends on the technical specifications of the force platform used. The objective was to develop and validate posturography software and share its source code under open source terms. Prospective non-randomized validation study: 20 consecutive adults underwent two balance assessment tests; six-condition posturography was performed using clinically approved software and a force platform, and the same conditions were measured using the newly developed open source software and a low-cost force platform. The intra-class correlation index of the sway area obtained from the center of pressure variations in both devices for the six conditions was the main variable used for validation. Excellent concordance between RombergLab and the clinically approved force platform was obtained (intra-class correlation coefficient = 0.94). A Bland and Altman graphic concordance plot was also obtained. The source code used to develop RombergLab was published under open source terms.
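The abstract above reports a Bland and Altman concordance plot for two measurement systems. The quantities behind such a plot are the mean difference (bias) between paired measurements and the 95% limits of agreement, bias ± 1.96 standard deviations of the differences. The following is a minimal Python sketch of that calculation. It is not the RombergLab code and does not compute the intra-class correlation coefficient; the function name and the sway-area values are invented for illustration.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the pairwise differences a - b; the limits of
    agreement are bias -/+ 1.96 times the sample standard deviation
    of those differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods need the same number of measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up sway-area values (arbitrary units) from two hypothetical devices.
reference_device = [12.1, 15.4, 9.8, 20.3, 14.0]
low_cost_device = [11.8, 15.9, 10.1, 19.7, 14.2]

bias, lower, upper = bland_altman(reference_device, low_cost_device)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

In the plot itself, each pair is drawn as a point with the mean of the two measurements on the horizontal axis and their difference on the vertical axis, with horizontal lines at the bias and at the two limits.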
NASA Astrophysics Data System (ADS)
Baudin, Veronique; Gomez-Diaz, Teresa
2013-04-01
The first goal of the PLUME open platform (https://www.projet-plume.org) is to share competences and to highlight the knowledge of software experts within the French higher education and research communities. The platform gives access to more than 380 index cards describing useful and economic software for this community, open to everybody. The second goal of PLUME is to improve the visibility of software produced by research laboratories within the higher education and research communities. The "development-ESR" index cards briefly describe the main features of the software, including references to associated research publications. The platform holds more than 300 cards describing research software, of which 89 have an English version. In this talk we describe the theme classification and the taxonomy of the index cards, and how they have evolved as new themes were added to the project. We will also focus on the organisation of PLUME as an open project and its interest in the promotion of free/open source software from and for research, contributing to the creation of a community of shared knowledge.
Open-source meteor detection software for low-cost single-board computers
NASA Astrophysics Data System (ADS)
Vida, D.; Zubović, D.; Šegon, D.; Gural, P.; Cupec, R.
2016-01-01
This work aims to overcome the current price threshold of meteor stations which can sometimes deter meteor enthusiasts from owning one. In recent years small card-sized computers became widely available and are used for numerous applications. To utilize such computers for meteor work, software which can run on them is needed. In this paper we present a detailed description of newly-developed open-source software for fireball and meteor detection optimized for running on low-cost single board computers. Furthermore, an update on the development of automated open-source software which will handle video capture, fireball and meteor detection, astrometry and photometry is given.
Free and Open Source Software for Geospatial in the field of planetary science
NASA Astrophysics Data System (ADS)
Frigeri, A.
2012-12-01
Information technology applied to geospatial analyses has spread quickly in the last ten years. The availability of OpenData and data from collaborative mapping projects has increased interest in tools, procedures and methods to handle spatially related information. Free Open Source Software projects devoted to geospatial data handling are succeeding because interoperable formats and protocols allow users to choose the pipeline of tools and libraries needed to solve a particular task, adapting the software to their specific problem. In particular, the Free Open Source model of development mimics the scientific method very well, and researchers should be naturally encouraged to take part in the development of these software projects, as this represents a very agile way for several institutions to interact. In planetary science, geospatial Free Open Source Software is gaining a key role in projects that commonly involve different parties in an international setting. Very popular software suites for processing scientific mission data (for example, ISIS) and for navigation/planning (SPICE) are distributed along with their source code, and the interaction between user and developer is often very close, creating a continuum between the two roles. A very widely used library for handling geospatial data (GDAL) has started to support planetary data from the Planetary Data System, and recent contributions have added support for other popular data formats used in planetary science, such as VICAR. The use of Geographic Information Systems in planetary science is now widespread, and Free Open Source GIS, open GIS formats and network protocols allow existing tools and methods developed to solve Earth-based problems to be extended to the study of solar system bodies.
A day in the working life of a researcher using Free Open Source Software for geospatial will be presented, as well as benefits and solutions to possible detriments coming from the effort required by using, supporting and contributing.
OpenROCS: a software tool to control robotic observatories
NASA Astrophysics Data System (ADS)
Colomé, Josep; Sanz, Josep; Vilardell, Francesc; Ribas, Ignasi; Gil, Pere
2012-09-01
We present the Open Robotic Observatory Control System (OpenROCS), an open source software platform developed for the robotic control of telescopes. It acts as a software infrastructure that executes all the necessary processes to implement responses to the system events that appear in the routine and non-routine operations associated to data-flow and housekeeping control. The OpenROCS software design and implementation provides a high flexibility to be adapted to different observatory configurations and event-action specifications. It is based on an abstract model that is independent of the specific hardware or software and is highly configurable. Interfaces to the system components are defined in a simple manner to achieve this goal. We give a detailed description of the version 2.0 of this software, based on a modular architecture developed in PHP and XML configuration files, and using standard communication protocols to interface with applications for hardware monitoring and control, environment monitoring, scheduling of tasks, image processing and data quality control. We provide two examples of how it is used as the core element of the control system in two robotic observatories: the Joan Oró Telescope at the Montsec Astronomical Observatory (Catalonia, Spain) and the SuperWASP Qatar Telescope at the Roque de los Muchachos Observatory (Canary Islands, Spain).
Open high-level data formats and software for gamma-ray astronomy
NASA Astrophysics Data System (ADS)
Deil, Christoph; Boisson, Catherine; Kosack, Karl; Perkins, Jeremy; King, Johannes; Eger, Peter; Mayer, Michael; Wood, Matthew; Zabalza, Victor; Knödlseder, Jürgen; Hassan, Tarek; Mohrmann, Lars; Ziegler, Alexander; Khelifi, Bruno; Dorner, Daniela; Maier, Gernot; Pedaletti, Giovanna; Rosado, Jaime; Contreras, José Luis; Lefaucheur, Julien; Brügge, Kai; Servillat, Mathieu; Terrier, Régis; Walter, Roland; Lombardi, Saverio
2017-01-01
In gamma-ray astronomy, a variety of data formats and proprietary software have been traditionally used, often developed for one specific mission or experiment. Especially for ground-based imaging atmospheric Cherenkov telescopes (IACTs), data and software are mostly private to the collaborations operating the telescopes. However, there is a general movement in science towards the use of open data and software. In addition, the next-generation IACT instrument, the Cherenkov Telescope Array (CTA), will be operated as an open observatory. We have created a Github organisation at https://github.com/open-gamma-ray-astro where we are developing high-level data format specifications. A public mailing list was set up at https://lists.nasa.gov/mailman/listinfo/open-gamma-ray-astro and a first face-to-face meeting on the IACT high-level data model and formats took place in April 2016 in Meudon (France). This open multi-mission effort will help to accelerate the development of open data formats and open-source software for gamma-ray astronomy, leading to synergies in the development of analysis codes and eventually better scientific results (reproducible, multi-mission). This write-up presents this effort for the first time, explaining the motivation and context, the available resources and process we use, as well as the status and planned next steps for the data format specifications. We hope that it will stimulate feedback and future contributions from the gamma-ray astronomy community.
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
An OpenStudio Measure is a script that can manipulate an OpenStudio model and associated data to apply energy conservation measures (ECMs), run supplemental simulations, or visualize simulation results. The OpenStudio software development kit (SDK) and accessibility of the Ruby scripting language makes measure authorship accessible to both software developers and energy modelers. This paper discusses the life cycle of an OpenStudio Measure from development, testing, and distribution, to application.
Learning from hackers: open-source clinical trials.
Dunn, Adam G; Day, Richard O; Mandl, Kenneth D; Coiera, Enrico
2012-05-02
Open sharing of clinical trial data has been proposed as a way to address the gap between the production of clinical evidence and the decision-making of physicians. A similar gap was addressed in the software industry by their open-source software movement. Here, we examine how the social and technical principles of the movement can guide the growth of an open-source clinical trial community.
2016-04-30
software (OSS) and proprietary (CSS) software elements or remote services (Scacchi, 2002, 2010), eventually including recent efforts to support Web ...specific platforms, including those operating on secured Web /mobile devices. Common Development Technology provides AC development tools and common...transition to OA systems and OSS software elements, specifically for Web and Mobile devices within the realm of C3CB. OA, Open APIs, OSS, and CSS OA
Moody, George B; Mark, Roger G; Goldberger, Ary L
2011-01-01
PhysioNet provides free web access to over 50 collections of recorded physiologic signals and time series, and related open-source software, in support of basic, clinical, and applied research in medicine, physiology, public health, biomedical engineering and computing, and medical instrument design and evaluation. Its three components (PhysioBank, the archive of signals; PhysioToolkit, the software library; and PhysioNetWorks, the virtual laboratory for collaborative development of future PhysioBank data collections and PhysioToolkit software components) connect researchers and students who need physiologic signals and relevant software with researchers who have data and software to share. PhysioNet's annual open engineering challenges stimulate rapid progress on unsolved or poorly solved questions of basic or clinical interest, by focusing attention on achievable solutions that can be evaluated and compared objectively using freely available reference data.
OpenMS: a flexible open-source software platform for mass spectrometry data analysis.
Röst, Hannes L; Sachsenberg, Timo; Aiche, Stephan; Bielow, Chris; Weisser, Hendrik; Aicheler, Fabian; Andreotti, Sandro; Ehrlich, Hans-Christian; Gutenbrunner, Petra; Kenar, Erhan; Liang, Xiao; Nahnsen, Sven; Nilse, Lars; Pfeuffer, Julianus; Rosenberger, George; Rurik, Marc; Schmitt, Uwe; Veit, Johannes; Walzer, Mathias; Wojnar, David; Wolski, Witold E; Schilling, Oliver; Choudhary, Jyoti S; Malmström, Lars; Aebersold, Ruedi; Reinert, Knut; Kohlbacher, Oliver
2016-08-30
High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.
Interim Open Source Software (OSS) Policy
This interim Policy establishes a framework to implement the requirements of the Office of Management and Budget's (OMB) Federal Source Code Policy to achieve efficiency, transparency and innovation through reusable and open source software.
NASA Technical Reports Server (NTRS)
Fountain, T.; Tilak, S.; Shin, P.; Hubbard, P.; Freudinger, L.
2009-01-01
The Open Source DataTurbine Initiative is an international community of scientists and engineers sharing a common interest in real-time streaming data middleware and applications. The technology base of the OSDT Initiative is the DataTurbine open source middleware. Key applications of DataTurbine include coral reef monitoring, lake monitoring and limnology, biodiversity and animal tracking, structural health monitoring and earthquake engineering, airborne environmental monitoring, and environmental sustainability. DataTurbine software emerged as a commercial product in the 1990s from collaborations between NASA and private industry. In October 2007, a grant from the USA National Science Foundation (NSF) Office of Cyberinfrastructure allowed us to transition DataTurbine from a proprietary software product into an open source software initiative. This paper describes the DataTurbine software and highlights key applications in environmental monitoring.
Is There Such a Thing as Free Software? The Pros and Cons of Open-Source Software
ERIC Educational Resources Information Center
Trappler, Thomas J.
2009-01-01
Today's higher education environment is marked by heightened accountability and decreased budgets. In such an environment, no higher education institution can afford to ignore alternative approaches that could result in more effective and less costly solutions. Open-source software (OSS) can serve as a viable alternative to traditional proprietary…
Getting Open Source Software into Schools: Strategies and Challenges
ERIC Educational Resources Information Center
Hepburn, Gary; Buley, Jan
2006-01-01
In this article Gary Hepburn and Jan Buley outline different approaches to implementing open source software (OSS) in schools; they also address the challenges that open source advocates should anticipate as they try to convince educational leaders to adopt OSS. With regard to OSS implementation, they note that schools have a flexible range of…
Open Source as Appropriate Technology for Global Education
ERIC Educational Resources Information Center
Carmichael, Patrick; Honour, Leslie
2002-01-01
Economic arguments for the adoption of "open source" software in business have been widely discussed. In this paper we draw on personal experience in the UK, South Africa and Southeast Asia to forward compelling reasons why open source software should be considered as an appropriate and affordable alternative to the currently prevailing…
Perceptions of Open Source versus Commercial Software: Is Higher Education Still on the Fence?
ERIC Educational Resources Information Center
van Rooij, Shahron Williams
2007-01-01
This exploratory study investigated the perceptions of technology and academic decision-makers about open source benefits and risks versus commercial software applications. The study also explored reactions to a concept for outsourcing campus-wide deployment and maintenance of open source. Data collected from telephone interviews were analyzed,…
Teaching Robotics Software with the Open Hardware Mobile Manipulator
ERIC Educational Resources Information Center
Vona, M.; Shekar, N. H.
2013-01-01
The "open hardware mobile manipulator" (OHMM) is a new open platform with a unique combination of features for teaching robotics software and algorithms. On-board low- and high-level processors support real-time embedded programming and motor control, as well as higher-level coding with contemporary libraries. Full hardware designs and…
User Driven Development of Software Tools for Open Data Discovery and Exploration
NASA Astrophysics Data System (ADS)
Schlobinski, Sascha; Keppel, Frank; Dihe, Pascal; Boot, Gerben; Falkenroth, Esa
2016-04-01
The use of open data in research faces challenges not restricted to inherent properties such as the quality and resolution of open data sets. Often open data is catalogued insufficiently or is fragmented. Software tools that support effective discovery, including the assessment of the data's appropriateness for research, have shortcomings such as the lack of essential functionality like support for data provenance. We believe that one of the reasons is the neglect of real end users' requirements in the development process of these software tools. In the context of the FP7 Switch-On project we have pro-actively engaged the relevant user community to collaboratively develop a means to publish, find and bind open data relevant for hydrologic research. Implementing key concepts of data discovery and exploration, we have used state of the art web technologies to provide an interactive software tool that is easy to use yet powerful enough to satisfy the data discovery and access requirements of the hydrological research community.
Maintaining Quality and Confidence in Open-Source, Evolving Software: Lessons Learned with PFLOTRAN
NASA Astrophysics Data System (ADS)
Frederick, J. M.; Hammond, G. E.
2017-12-01
Software evolution in an open-source framework poses a major challenge to a geoscientific simulator, but when properly managed, the pay-off can be enormous for both the developers and the community at large. Developers must juggle implementing new scientific process models, adopting increasingly efficient numerical methods and programming paradigms, changing funding sources (or total lack of funding), while also ensuring that legacy code remains functional and reported bugs are fixed in a timely manner. With robust software engineering and a plan for long-term maintenance, a simulator can evolve over time incorporating and leveraging many advances in the computational and domain sciences. In this positive light, what practices in software engineering and code maintenance can be employed within open-source development to maximize the positive aspects of software evolution and community contributions while minimizing its negative side effects? This presentation will discuss steps taken in the development of PFLOTRAN (www.pflotran.org), an open source, massively parallel subsurface simulator for multiphase, multicomponent, and multiscale reactive flow and transport processes in porous media. As PFLOTRAN's user base and development team continue to grow, it has become increasingly important to implement strategies which ensure sustainable software development while maintaining software quality and community confidence. In this presentation, we will share our experiences and "lessons learned" within the context of our open-source development framework and community engagement efforts. Topics discussed will include how we've leveraged both standard software engineering principles, such as coding standards, version control, and automated testing, as well as unique advantages of object-oriented design in process model coupling, to ensure software quality and confidence.
We will also be prepared to discuss the major challenges faced by most open-source software teams, such as on-boarding new developers or one-time contributions, dealing with competitors or lookie-loos, and other downsides of complete transparency, as well as our approach to community engagement, including a user group email list, hosting short courses and workshops for new users, and maintaining a website. SAND2017-8174A
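The abstract above names automated testing as one practice for keeping an evolving simulator trustworthy. One common form is a regression test that compares freshly computed output values against a stored baseline within a numerical tolerance. The following is a minimal Python sketch of that idea. It is not PFLOTRAN's test harness; the function name, variable names and values are assumptions made for illustration.

```python
import math


def compare_to_baseline(current, baseline, rel_tol=1e-9, abs_tol=0.0):
    """Compare named output values against a stored baseline.

    Returns a list of (name, actual, expected) tuples for every value that
    is missing from `current` or differs by more than the tolerance.
    An empty list means the regression test passes.
    """
    failures = []
    for name, expected in baseline.items():
        if name not in current:
            failures.append((name, None, expected))
            continue
        actual = current[name]
        if not math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol):
            failures.append((name, actual, expected))
    return failures


# Hypothetical baseline values saved from a trusted earlier run.
baseline = {"pressure": 101325.0, "saturation": 0.5}
# Hypothetical values from the current build.
current = {"pressure": 101325.0000001, "saturation": 0.5}

failures = compare_to_baseline(current, baseline)
print("PASS" if not failures else f"FAIL: {failures}")
```

Comparing within a tolerance, and not bit for bit, lets harmless floating-point differences between compilers or platforms pass while still flagging changes in the computed results.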
NASA Astrophysics Data System (ADS)
Zelt, C. A.
2017-12-01
Earth science attempts to understand how the earth works. This research often depends on software for modeling, processing, inverting or imaging. Freely sharing open-source software is essential to prevent reinventing the wheel and allows software to be improved and applied in ways the original author may never have envisioned. For young scientists, releasing software can increase their name ID when applying for jobs and funding, and create opportunities for collaborations when scientists who collect data want the software's creator to be involved in their project. However, we frequently hear scientists say software is a tool, it's not science. Creating software that implements a new or better way of earth modeling or geophysical processing, inverting or imaging should be viewed as earth science. Creating software for things like data visualization, format conversion, storage, or transmission, or programming to enhance computational performance, may be viewed as computer science. The former, ideally with an application to real data, can be published in earth science journals, the latter possibly in computer science journals. Citations in either case should accurately reflect the impact of the software on the community. Funding agencies need to support more software development and open-source releasing, and the community should give more high-profile awards for developing impactful open-source software. Funding support and community recognition for software development can have far reaching benefits when the software is used in foreseen and unforeseen ways, potentially for years after the original investment in the software development. For funding, an open-source release that is well documented should be required, with example input and output files. Appropriate funding will provide the incentive and time to release user-friendly software, and minimize the need for others to duplicate the effort. 
All funded software should be available through a single web site, ideally maintained by someone in a funded position. Perhaps the biggest challenge is the reality that researchers who use software, as opposed to those who develop it, are more attractive university hires because they are more likely to be "big picture" scientists who publish in the highest profile journals, although sometimes the two go together.
2010-03-01
associated with certain software systems [Breaux and Anton 2008]. With this basis to build on, it is now possible to analyze the alignment of...Kazman, R., (2003). Software Architecture in Practice, 2nd Edition, Addison-Wesley Professional, New York. Breaux, T.D. and Anton, A.I. (2008... calculus for license rights and obligations in license and context models. Using them, we calculate rights and obligations for specific systems, identify
Open cyberGIS software for geospatial research and education in the big data era
NASA Astrophysics Data System (ADS)
Wang, Shaowen; Liu, Yan; Padmanabhan, Anand
CyberGIS represents an interdisciplinary field combining advanced cyberinfrastructure, geographic information science and systems (GIS), spatial analysis and modeling, and a number of geospatial domains to improve research productivity and enable scientific breakthroughs. It has emerged as a new generation of GIS that enables unprecedented advances in data-driven knowledge discovery, visualization and visual analytics, and collaborative problem solving and decision-making. This paper describes three open software strategies (open access, source, and integration) to serve various research and education purposes of diverse geospatial communities. These strategies have been implemented in a leading-edge cyberGIS software environment through three corresponding software modalities: CyberGIS Gateway, Toolkit, and Middleware, and have achieved broad and significant impacts.
NASA Astrophysics Data System (ADS)
Melton, R.; Thomas, J.
With the rapid growth in the number of space actors, there has been a marked increase in the complexity and diversity of software systems utilized to support SSA target tracking, indication, warning, and collision avoidance. Historically, most SSA software has been constructed with "closed" proprietary code, which limits interoperability, inhibits the code transparency that some SSA customers need to develop domain expertise, and prevents the rapid injection of innovative concepts into these systems. Open-source aerospace software, a rapidly emerging alternative trend in code development, is based on open collaboration, which has the potential to bring greater transparency, interoperability, flexibility, and reduced development costs. Open-source software is easily adaptable, geared to rapidly changing mission needs, and can generally be delivered at lower costs to meet mission requirements. This paper outlines Ball's COSMOS C2 system, a fully open-source, web-enabled, command-and-control software architecture which provides several unique capabilities to move the current legacy SSA software paradigm to an open source model that effectively enables pre- and post-launch asset command and control. Among the unique characteristics of COSMOS is the ease with which it can integrate with diverse hardware. This characteristic enables COSMOS to serve as the command-and-control platform for the full life-cycle development of SSA assets, from board test, to box test, to system integration and test, to on-orbit operations. The use of a modern scripting language, Ruby, also permits automated procedures to provide highly complex decision making for the tasking of SSA assets based on both telemetry data and data received from outside sources. Detailed logging enables quick anomaly detection and resolution. Integrated real-time and offline data graphing renders the visualization of both ground and on-orbit assets simple and straightforward.
Software Writing Skills for Your Research - Lessons Learned from Workshops in the Geosciences
NASA Astrophysics Data System (ADS)
Hammitzsch, Martin
2016-04-01
Findings presented in scientific papers are based on data and software. Once in a while they come along with data, but not commonly with software. However, the software used to gain findings plays a crucial role in the scientific work. Nevertheless, software is rarely seen as publishable. Thus researchers may be unable to reproduce the findings without the software, which conflicts with the principle of reproducibility in science. For both the writing of publishable software and the reproducibility issue, the quality of software is of utmost importance. For many programming scientists the treatment of source code, e.g. with code design, version control, documentation, and testing, is associated with additional work that is not covered in the primary research task. This includes the adoption of processes following the software development life cycle. However, the adoption of software engineering rules and best practices has to be recognized and accepted as part of the scientific performance. Most scientists have little incentive to improve code and do not publish code because software engineering habits are rarely practised by researchers or students. Software engineering skills are not passed on to successors the way paper-writing skills are. Thus it is often felt that the software or code produced is not publishable. The quality of software and its source code has a decisive influence on the quality of research results obtained and their traceability. So establishing best practices from software engineering to serve scientific needs is crucial for the success of scientific software. Even though scientists use existing software and code, i.e., from open source software repositories, only a few contribute their code back into the repositories. So writing and opening code for Open Science means that subsequent users are able to run the code, e.g. 
by the provision of sufficient documentation, sample data sets, tests and comments, which in turn can be proven by adequate and qualified reviews. This assumes that scientists learn to write and release code and software as they learn to write and publish papers. Having this in mind, software could be valued and assessed as a contribution to science. But this requires the relevant skills that can be passed to colleagues and followers. Therefore, the GFZ German Research Centre for Geosciences performed three workshops in 2015 to address the passing of software writing skills to young scientists, the next generation of researchers in the Earth, planetary and space sciences. Experiences in running these workshops and the lessons learned will be summarized in this presentation. The workshops have received support and funding from Software Carpentry, a volunteer organization whose goal is to make scientists more productive, and their work more reliable, by teaching them basic computing skills, and from FOSTER (Facilitate Open Science Training for European Research), a two-year, EU-funded (FP7) project, whose goal is to produce a European-wide training programme that will help to incorporate Open Access approaches into existing research methodologies and to integrate Open Science principles and practice in the current research workflow by targeting young researchers and other stakeholders.
Note: Tormenta: An open source Python-powered control software for camera based optical microscopy.
Barabas, Federico M; Masullo, Luciano A; Stefani, Fernando D
2016-12-01
Until recently, PC control and synchronization of scientific instruments was only possible through closed-source expensive frameworks like National Instruments' LabVIEW. Nowadays, efficient cost-free alternatives are available in the context of a continuously growing community of open-source software developers. Here, we report on Tormenta, a modular open-source software for the control of camera-based optical microscopes. Tormenta is built on Python, works on multiple operating systems, and includes some key features for fluorescence nanoscopy based on single molecule localization.
Web accessibility and open source software.
Obrenović, Zeljko
2009-07-01
A Web browser provides a uniform user interface to different types of information. Making this interface universally accessible and more interactive is a long-term goal still far from being achieved. Universally accessible browsers require novel interaction modalities and additional functionalities, for which existing browsers tend to provide only partial solutions. Although functionality for Web accessibility can be found as open source and free software components, their reuse and integration are complex because they were developed in diverse implementation environments, following standards and conventions incompatible with the Web. To address these problems, we have started several activities that aim at exploiting the potential of open-source software for Web accessibility. The first of these activities is the development of Adaptable Multi-Interface COmmunicator (AMICO):WEB, an infrastructure that facilitates efficient reuse and integration of open source software components into the Web environment. The main contribution of AMICO:WEB is in enabling the syntactic and semantic interoperability between Web extension mechanisms and a variety of integration mechanisms used by open source and free software components. Its design is based on our experiences in solving practical problems where we have used open source components to improve accessibility of rich media Web applications. The second of our activities involves improving education, where we have used our platform to teach students how to build advanced accessibility solutions from diverse open-source software. We are also partially involved in the recently started Eclipse project called Accessibility Tools Framework (ACTF), the aim of which is the development of an extensible infrastructure upon which developers can build a variety of utilities that help to evaluate and enhance the accessibility of applications and content for people with disabilities. In this article we briefly report on these activities.
Exploring the Role of Value Networks for Software Innovation
NASA Astrophysics Data System (ADS)
Morgan, Lorraine; Conboy, Kieran
This paper describes research in progress that aims to explore the applicability and implications of open innovation practices in two firms: one that employs agile development methods and another that utilizes open source software. The open innovation paradigm has a lot in common with open source and agile development methodologies. A particular strength of agile approaches is that they move away from 'introverted' development, involving only the development personnel, and intimately involve the customer in all areas of software creation, supposedly leading to the development of a more innovative and hence more valuable information system. Open source software (OSS) development also shares two key elements of the open innovation model, namely the collaborative development of the technology and shared rights to the use of the technology. However, one shortfall with agile development in particular is the narrow focus on a single customer representative. In response to this, we argue that current thinking regarding innovation needs to be extended to include multiple stakeholders both across and outside the organization. Additionally, for firms utilizing open source, it has been found that their position in a network of potential complementors determines the amount of superior value they create for their customers. Thus, this paper aims to get a better understanding of the applicability and implications of open innovation practices in firms that employ open source and agile development methodologies. In particular, a conceptual framework is derived for further testing.
Digital beacon receiver for ionospheric TEC measurement developed with GNU Radio
NASA Astrophysics Data System (ADS)
Yamamoto, M.
2008-11-01
A simple digital receiver named GNU Radio Beacon Receiver (GRBR) was developed for the satellite-ground beacon experiment to measure the ionospheric total electron content (TEC). The open-source software toolkit for software defined radio, GNU Radio, is utilized to realize the basic function of the receiver and perform fast signal processing. The software is written in Python for a Linux PC. The open-source hardware called Universal Software Radio Peripheral (USRP), which best matches GNU Radio, is used as a front-end to acquire the satellite beacon signals at 150 and 400 MHz. The first experiment was successful, as results from GRBR showed very good agreement with those from the co-located analog beacon receiver. Detailed design information and software codes are open at the URL http://www.rish.kyoto-u.ac.jp/digitalbeacon/.
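As a rough illustration of why two beacon frequencies allow TEC to be estimated, the sketch below uses the standard first-order ionospheric approximation, in which the path-length change is 40.3 * TEC / f^2 (metres, with TEC in electrons per square metre and f in Hz). This is not the GRBR processing chain, which the abstract does not describe; the function names are hypothetical, and the differential path is assumed to be already available in metres.

```python
# First-order ionospheric constant (m^3 s^-2); higher-order terms are ignored.
K_IONO = 40.3
TECU = 1e16  # one TEC unit, in electrons per square metre


def iono_path_m(tec_el_m2, freq_hz):
    """First-order ionospheric path-length change in metres (hypothetical helper)."""
    return K_IONO * tec_el_m2 / freq_hz ** 2


def tec_from_differential_path(diff_path_m, f_low_hz, f_high_hz):
    """Invert the path difference between two frequencies for slant TEC.

    diff_path_m is assumed to be path(f_low) - path(f_high), in metres.
    """
    return diff_path_m / (K_IONO * (1.0 / f_low_hz ** 2 - 1.0 / f_high_hz ** 2))


if __name__ == "__main__":
    tec = 10 * TECU
    diff = iono_path_m(tec, 150e6) - iono_path_m(tec, 400e6)
    print("differential path [m]:", diff)
    print("recovered TEC [TECU]:", tec_from_differential_path(diff, 150e6, 400e6) / TECU)
```

Because the effect scales with 1/f^2, the 150 MHz signal is affected roughly seven times more strongly than the 400 MHz signal, which is what makes the difference between the two measurable.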
Open Source Software in Teaching Physics: A Case Study on Vector Algebra and Visual Representations
ERIC Educational Resources Information Center
Cataloglu, Erdat
2006-01-01
This study aims to report the effort on teaching vector algebra using free open source software (FOSS). Recent studies showed that students have difficulties in learning basic physics concepts. Constructivist learning theories suggest the use of visual and hands-on activities in learning. We will report on the software used for this purpose. The…
ERIC Educational Resources Information Center
Williams van Rooij, Shahron
2010-01-01
This paper contrasts the arguments offered in the literature advocating the adoption of open source software (OSS)--software delivered with its source code--for teaching and learning applications, with the reality of limited enterprise-wide deployment of those applications in U.S. higher education. Drawing on the fields of organizational…
ERIC Educational Resources Information Center
Samuels, Ruth Gallegos; Griffy, Henry
2012-01-01
This article discusses best practices for evaluating open source software for use in library projects, based on the authors' experience evaluating electronic publishing solutions. First, it presents a brief review of the literature, emphasizing the need to evaluate open source solutions carefully in order to minimize Total Cost of Ownership. Next,…
ERIC Educational Resources Information Center
Vlas, Radu Eduard
2012-01-01
Open source projects do have requirements; they are, however, mostly informal, text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming, and for large projects,…
Developing Open Source Software To Advance High End Computing. Report to the President.
ERIC Educational Resources Information Center
National Coordination Office for Information Technology Research and Development, Arlington, VA.
This is part of a series of reports to the President and Congress developed by the President's Information Technology Advisory Committee (PITAC) on key contemporary issues in information technology. This report defines open source software, explains PITAC's interest in this model, describes the process used to investigate issues in open source…
ERIC Educational Resources Information Center
Wen, Wen
2012-01-01
While open source software (OSS) emphasizes open access to the source code and avoids the use of formal appropriability mechanisms, there has been little understanding of how the existence and exercise of formal intellectual property rights (IPR) such as patents influence the direction of OSS innovation. This dissertation seeks to bridge this gap…
The GenABEL Project for statistical genomics.
Karssen, Lennart C; van Duijn, Cornelia M; Aulchenko, Yurii S
2016-01-01
Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the "core team", facilitating agile statistical omics methodology development and fast dissemination.
GIS-Based Noise Simulation Open Source Software: N-GNOIS
NASA Astrophysics Data System (ADS)
Vijay, Ritesh; Sharma, A.; Kumar, M.; Shende, V.; Chakrabarti, T.; Gupta, Rajesh
2015-12-01
Geographical information system (GIS)-based noise simulation software (N-GNOIS) has been developed to simulate the noise scenario due to point and mobile sources, considering the impact of geographical features and meteorological parameters. These have been addressed in the software through attenuation modules for atmosphere, vegetation and barriers. N-GNOIS is user-friendly, platform-independent and Open Geospatial Consortium (OGC) compliant software. It has been developed using open source technology (QGIS) and an open source language (Python). N-GNOIS has unique features such as the cumulative impact of point and mobile sources, building structure and honking due to traffic. Honking is the most common phenomenon in developing countries and is frequently observed on any type of road. N-GNOIS also helps in designing physical barriers and vegetation cover to check the propagation of noise and acts as a decision-making tool for planning and management of the noise component in environmental impact assessment (EIA) studies.
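The cumulative impact of several sources mentioned above can be illustrated with two textbook acoustics relations: sound levels in decibels add on an energy basis, and the level of a free-field point source falls by 20 log10(r / r_ref) with distance. The sketch below is only an illustration of those relations and is not taken from N-GNOIS; the function names are hypothetical, and the atmosphere, vegetation and barrier modules are reduced to a single assumed `extra_attenuation_db` term.

```python
import math


def combine_levels_db(levels_db):
    """Energy-sum several sound levels given in dB (hypothetical helper)."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


def point_source_level_db(level_ref_db, r_ref_m, r_m, extra_attenuation_db=0.0):
    """Level at distance r_m from a point source under spherical spreading.

    extra_attenuation_db is an assumed lump term standing in for atmospheric,
    vegetation and barrier losses, which a real model computes separately.
    """
    return level_ref_db - 20.0 * math.log10(r_m / r_ref_m) - extra_attenuation_db


if __name__ == "__main__":
    # Two sources, each 80 dB at 1 m, heard from 10 m and 20 m away.
    at_receiver = [
        point_source_level_db(80.0, 1.0, 10.0),
        point_source_level_db(80.0, 1.0, 20.0, extra_attenuation_db=3.0),
    ]
    print("individual levels [dB]:", at_receiver)
    print("cumulative level [dB]:", combine_levels_db(at_receiver))
```

Two equal sources combine to about 3 dB above either one alone, which is why cumulative levels cannot be obtained by adding decibel values directly.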
OpenSatKit Enables Quick Startup for CubeSat Missions
NASA Technical Reports Server (NTRS)
McComas, David; Melton, Ryan
2017-01-01
The software required to develop, integrate, and operate a spacecraft is substantial regardless of whether it's a large or small satellite. Even getting started can be a monumental task. To solve this problem, NASA's Core Flight System (cFS), NASA's 42 spacecraft dynamics simulator, and Ball Aerospace's COSMOS ground system have been integrated together into a kit called OpenSatKit that provides a complete and open source software solution for starting a new satellite mission. Users can have a working system with flight software, dynamics simulation, and a ground command and control system up and running within hours. Every satellite mission requires three primary categories of software to function. The first is Flight Software (FSW), which provides the onboard control of the satellite and its payload(s). NASA's cFS provides a great platform for developing this software. Second, while developing a satellite on earth, it is necessary to simulate the satellite's orbit, attitude, and actuators, to ensure that the systems that control these aspects will work correctly in the real environment. NASA's 42 simulator provides these functionalities. Finally, the ground has to be able to communicate with the satellite, monitor its performance and health, and display its data. Additionally, test scripts have to be written to verify the system on the ground. Ball Aerospace's COSMOS command and control system provides this functionality. Once the OpenSatKit is up and running, the next step is to customize the platform and get it running on the end target. Starting from a fully working system makes porting the cFS from Linux to a user's platform much easier. An example Raspberry Pi target is included in the kit so users can gain experience working with a low cost hardware target. All users can benefit from OpenSatKit, but the greatest impact and benefits will be to SmallSat missions with constrained budgets and small software teams. 
This paper describes OpenSatKit's system design, the steps necessary to run the system to target the Raspberry Pi, and future plans. OpenSatKit is a free, fully functional spacecraft software system that we hope will greatly benefit the SmallSat community.
Supervised normalization of microarrays
Mecham, Brigham H.; Nelson, Peter S.; Storey, John D.
2010-01-01
Motivation: A major challenge in utilizing microarray technologies to measure nucleic acid abundances is ‘normalization’, the goal of which is to separate biologically meaningful signal from other confounding sources of signal, often due to unavoidable technical factors. It is intuitively clear that true biological signal and confounding factors need to be simultaneously considered when performing normalization. However, the most popular normalization approaches do not utilize what is known about the study, both in terms of the biological variables of interest and the known technical factors in the study, such as batch or array processing date. Results: We show here that failing to include all study-specific biological and technical variables when performing normalization leads to biased downstream analyses. We propose a general normalization framework that fits a study-specific model employing every known variable that is relevant to the expression study. The proposed method is generally applicable to the full range of existing probe designs, as well as to both single-channel and dual-channel arrays. We show through real and simulated examples that the method has favorable operating characteristics in comparison to some of the most highly used normalization methods. Availability: An R package called snm implementing the methodology will be made available from Bioconductor (http://bioconductor.org). Contact: jstorey@princeton.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20363728
Systemic evaluation of cellular reprogramming processes exploiting a novel R-tool: eegc.
Zhou, Xiaoyuan; Meng, Guofeng; Nardini, Christine; Mei, Hongkang
2017-08-15
Cells derived by cellular engineering, i.e. differentiation of induced pluripotent stem cells and direct lineage reprogramming, carry a tremendous potential for medical applications and in particular for regenerative therapies. These approaches consist in the definition of lineage-specific experimental protocols that, by manipulation of a limited number of biological cues (niche mimicking factors, (in)activation of transcription factors, to name a few), enforce the final expression of cell-specific (marker) molecules. To date, given the intricate complexity of biological pathways, these approaches still present imperfect reprogramming fidelity, with uncertain consequences on the functional properties of the resulting cells. We propose a novel tool eegc to evaluate cellular engineering processes, in a systemic rather than marker-based fashion, by integrating transcriptome profiling and functional analysis. Our method clusters genes into categories representing different states of (trans)differentiation and further performs functional and gene regulatory network analyses for each of the categories of the engineered cells, thus offering practical indications on the potential lack of the reprogramming protocol. eegc R package is released under the GNU General Public License within the Bioconductor project, freely available at https://bioconductor.org/packages/eegc/. Contact: christine.nardini.rsrc@gmail.com or hongkang.k.mei@gsk.com. Supplementary data are available at Bioinformatics online.
Systemic evaluation of cellular reprogramming processes exploiting a novel R-tool: eegc
Zhou, Xiaoyuan; Meng, Guofeng; Nardini, Christine; Mei, Hongkang
2017-01-01
Abstract Motivation Cells derived by cellular engineering, i.e. differentiation of induced pluripotent stem cells and direct lineage reprogramming, carry a tremendous potential for medical applications and in particular for regenerative therapies. These approaches consist in the definition of lineage-specific experimental protocols that, by manipulation of a limited number of biological cues—niche mimicking factors, (in)activation of transcription factors, to name a few—enforce the final expression of cell-specific (marker) molecules. To date, given the intricate complexity of biological pathways, these approaches still present imperfect reprogramming fidelity, with uncertain consequences on the functional properties of the resulting cells. Results We propose a novel tool eegc to evaluate cellular engineering processes, in a systemic rather than marker-based fashion, by integrating transcriptome profiling and functional analysis. Our method clusters genes into categories representing different states of (trans)differentiation and further performs functional and gene regulatory network analyses for each of the categories of the engineered cells, thus offering practical indications on the potential lack of the reprogramming protocol. Availability and Implementation eegc R package is released under the GNU General Public License within the Bioconductor project, freely available at https://bioconductor.org/packages/eegc/. Contact christine.nardini.rsrc@gmail.com or hongkang.k.mei@gsk.com Supplementary information Supplementary data are available at Bioinformatics online. PMID:28398503
Open source EMR software: profiling, insights and hands-on analysis.
Kiah, M L M; Haiqi, Ahmed; Zaidan, B B; Zaidan, A A
2014-11-01
The use of open source software in health informatics is increasingly advocated by authors in the literature. Although there is no clear evidence of the superiority of the current open source applications in the healthcare field, the number of available open source applications online is growing and they are gaining greater prominence. This repertoire of open source options is of a great value for any future-planner interested in adopting an electronic medical/health record system, whether selecting an existent application or building a new one. The following questions arise. How do the available open source options compare to each other with respect to functionality, usability and security? Can an implementer of an open source application find sufficient support both as a user and as a developer, and to what extent? Does the available literature provide adequate answers to such questions? This review attempts to shed some light on these aspects. The objective of this study is to provide more comprehensive guidance from an implementer perspective toward the available alternatives of open source healthcare software, particularly in the field of electronic medical/health records. The design of this study is twofold. In the first part, we profile the published literature on a sample of existent and active open source software in the healthcare area. The purpose of this part is to provide a summary of the available guides and studies relative to the sampled systems, and to identify any gaps in the published literature with respect to our research questions. In the second part, we investigate those alternative systems relative to a set of metrics, by actually installing the software and reporting a hands-on experience of the installation process, usability, as well as other factors. The literature covers many aspects of open source software implementation and utilization in healthcare practice. 
Roughly, those aspects could be distilled into a basic taxonomy, making the literature landscape more perceivable. Nevertheless, the surveyed articles fall short of fulfilling the targeted objective of providing clear reference to potential implementers. The hands-on study contributed a more detailed comparative guide relative to our set of assessment measures. Overall, no system seems to satisfy an industry-standard measure, particularly in security and interoperability. The systems, as software applications, feel similar from a usability perspective and share a common set of functionality, though they vary considerably in community support and activity. More detailed analysis of popular open source software can benefit the potential implementers of electronic health/medical records systems. The number of examined systems and the measures by which to compare them vary across studies, but rewarding insights are nonetheless starting to emerge. Our work is one step toward that goal. Our overall conclusion is that open source options in the medical field are still far behind the highly acknowledged open source products in other domains, e.g. operating systems market share.
2013-01-01
Background: High-throughput RNA sequencing (RNA-seq) offers unprecedented power to capture the real dynamics of gene expression. Experimental designs with extensive biological replication present a unique opportunity to exploit this feature and distinguish expression profiles with higher resolution. RNA-seq data analysis methods so far have been mostly applied to data sets with few replicates and their default settings try to provide the best performance under this constraint. These methods are based on two well-known count data distributions: the Poisson and the negative binomial. The way to properly calibrate them with large RNA-seq data sets is not trivial for the non-expert bioinformatics user. Results: Here we show that expression profiles produced by extensively-replicated RNA-seq experiments lead to a rich diversity of count data distributions beyond the Poisson and the negative binomial, such as Poisson-Inverse Gaussian or Pólya-Aeppli, which can be captured by a more general family of count data distributions called the Poisson-Tweedie. The flexibility of the Poisson-Tweedie family enables a direct fitting of emerging features of large expression profiles, such as heavy tails or zero-inflation, without the need to alter a single configuration parameter. We provide a software package for R called tweeDEseq implementing a new test for differential expression based on the Poisson-Tweedie family. Using simulations on synthetic and real RNA-seq data we show that tweeDEseq yields P-values that are equally or more accurate than competing methods under different configuration parameters. By surveying the tiny fraction of sex-specific gene expression changes in human lymphoblastoid cell lines, we also show that tweeDEseq accurately detects differentially expressed genes in a real large RNA-seq data set with improved performance and reproducibility over the previously compared methodologies. 
Finally, we compared the results with those obtained from microarrays in order to check for reproducibility. Conclusions: RNA-seq data with many replicates lead to a handful of count data distributions which can be accurately estimated with the statistical model illustrated in this paper. This method provides a better fit to the underlying biological variability; this may be critical when comparing groups of RNA-seq samples with markedly different count data distributions. The tweeDEseq package forms part of the Bioconductor project and is available for download at http://www.bioconductor.org. PMID:23965047
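The contrast between Poisson and more dispersed count distributions described above can be made concrete with a simple diagnostic: for Poisson counts the variance equals the mean, so a variance-to-mean ratio well above 1 across replicates suggests overdispersion. The sketch below is only that diagnostic, written with the Python standard library; it is not the tweeDEseq test, the function name is hypothetical, and the replicate counts are made-up illustration values.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean for one gene's replicate counts.

    A value near 1 is what a Poisson model would predict; values well above 1
    indicate overdispersion. Needs at least two replicates and a non-zero mean.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when all counts are zero")
    return variance(counts) / m


if __name__ == "__main__":
    # Made-up replicate counts for two hypothetical genes.
    stable_gene = [9, 11, 10, 12, 8, 10]
    bursty_gene = [0, 0, 1, 0, 55, 4]
    print("stable gene:", dispersion_index(stable_gene))
    print("bursty gene:", dispersion_index(bursty_gene))
```

With many replicates this ratio can be estimated per gene with reasonable precision, which is the setting the abstract argues makes richer count distributions both visible and estimable.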
ERIC Educational Resources Information Center
Mukala, Patrick; Cerone, Antonio; Turini, Franco
2017-01-01
Free/Libre Open Source Software (FLOSS) environments are increasingly dubbed as learning environments where practical software engineering skills can be acquired. Numerous studies have extensively investigated how knowledge is acquired in these environments through a collaborative learning model that defines a learning process. Such a learning…
ERIC Educational Resources Information Center
Long, Ju
2009-01-01
Open Source Software (OSS) is a major force in today's Information Technology (IT) landscape. Companies are increasingly using OSS in mission-critical applications. The transparency of OSS technology itself, with openly available source code, makes it ideal for students to participate in OSS project development. OSS can provide unique…
A Dozen Years after Open Source's 1998 Birth, It's Time for "OpenTechComm"
ERIC Educational Resources Information Center
Still, Brian
2010-01-01
2008 marked the 10-year anniversary of the Open Source movement, which has had a substantial impact not only on software production and adoption, but also on the sharing and distribution of information. Technical communication as a discipline has taken some advantage of the movement or its derivative software, but this article argues not as much…
ERIC Educational Resources Information Center
van Rooij, Shahron Williams
2009-01-01
Higher Education institutions in the United States are considering Open Source software applications such as the Moodle and Sakai course management systems and the Kuali financial system to build integrated learning environments that serve both academic and administrative needs. Open Source is presumed to be more flexible and less costly than…
Shenoy, Shailesh M
2016-07-01
A challenge in any imaging laboratory, especially one that uses modern techniques, is to achieve a sustainable and productive balance between using open source and commercial software to perform quantitative image acquisition, analysis and visualization. In addition to considering the expense of software licensing, one must consider factors such as the quality and usefulness of the software's support, training and documentation. Also, one must consider the reproducibility with which multiple people generate results using the same software to perform the same analysis, how one may distribute their methods to the community using the software and the potential for achieving automation to improve productivity.
OpenComet: An automated tool for comet assay image analysis
Gyori, Benjamin M.; Venkatachalam, Gireedhar; Thiagarajan, P.S.; Hsu, David; Clement, Marie-Veronique
2014-01-01
Reactive species such as free radicals are constantly generated in vivo and DNA is the most important target of oxidative stress. Oxidative DNA damage is used as a predictive biomarker to monitor the risk of development of many diseases. The comet assay is widely used for measuring oxidative DNA damage at a single cell level. The analysis of comet assay output images, however, poses considerable challenges. Commercial software is costly and restrictive, while free software generally requires laborious manual tagging of cells. This paper presents OpenComet, an open-source software tool providing automated analysis of comet assay images. It uses a novel and robust method for finding comets based on geometric shape attributes and segmenting the comet heads through image intensity profile analysis. Due to automation, OpenComet is more accurate, less prone to human bias, and faster than manual analysis. A live analysis functionality also allows users to analyze images captured directly from a microscope. We have validated OpenComet on both alkaline and neutral comet assay images as well as sample images from existing software packages. Our results show that OpenComet achieves high accuracy with significantly reduced analysis time. PMID:24624335
NASA Astrophysics Data System (ADS)
Ames, D.; Kadlec, J.; Horsburgh, J. S.; Maidment, D. R.
2009-12-01
The Consortium of Universities for the Advancement of Hydrologic Sciences (CUAHSI) Hydrologic Information System (HIS) project includes extensive development of data storage and delivery tools and standards including WaterML (a language for sharing hydrologic data sets via web services) and HIS Server (a software tool set for delivering WaterML from a server). These and other CUAHSI HIS tools have been under development and deployment for several years and together present a relatively complete software “stack” to support the consistent storage and delivery of hydrologic and other environmental observation data. This presentation describes the development of a new HIS software tool called “HydroDesktop” and the development of an online open source software development community to update and maintain the software. HydroDesktop is a local (i.e. not server-based) client side software tool that ultimately will run on multiple operating systems and will provide a highly usable level of access to HIS services. The software provides many key capabilities including data query, map-based visualization, data download, local data maintenance, editing, graphing, data export to selected model-specific data formats, linkage with integrated modeling systems such as OpenMI, and ultimately upload to HIS servers from the local desktop software. As the software is presently in the early stages of development, this presentation will focus on design approach and paradigm and is viewed as an opportunity to encourage participation in the open development community. Indeed, recognizing the value of community based code development as a means of ensuring end-user adoption, this project has adopted an “iterative” or “spiral” software development approach which will be described in this presentation.
Experiences using OpenMP based on Compiler Directed Software DSM on a PC Cluster
NASA Technical Reports Server (NTRS)
Hess, Matthias; Jost, Gabriele; Mueller, Matthias; Ruehle, Roland
2003-01-01
In this work we report on our experiences running OpenMP programs on a commodity cluster of PCs running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.
What an open source clinical trial community can learn from hackers
Dunn, Adam G.; Day, Richard O.; Mandl, Kenneth D.; Coiera, Enrico
2014-01-01
Summary Open sharing of clinical trial data has been proposed as a way to address the gap between the production of clinical evidence and the decision-making of physicians. Since a similar gap has already been addressed in the software industry by the open source software movement, we examine how the social and technical principles of the movement can be used to guide the growth of an open source clinical trial community. PMID:22553248
Kajihata, Shuichi; Furusawa, Chikara; Matsuda, Fumio; Shimizu, Hiroshi
2014-01-01
The in vivo measurement of metabolic flux by (13)C-based metabolic flux analysis ((13)C-MFA) provides valuable information regarding cell physiology. Bioinformatics tools have been developed to estimate metabolic flux distributions from the results of tracer isotopic labeling experiments using a (13)C-labeled carbon source. Metabolic flux is determined by nonlinear fitting of a metabolic model to the isotopic labeling enrichment of intracellular metabolites measured by mass spectrometry. Whereas (13)C-MFA is conventionally performed under isotopically constant conditions, isotopically nonstationary (13)C metabolic flux analysis (INST-(13)C-MFA) has recently been developed for flux analysis of cells with photosynthetic activity and cells at a quasi-steady metabolic state (e.g., primary cells or microorganisms under stationary phase). Here, the development of a novel open source software for INST-(13)C-MFA on the Windows platform is reported. OpenMebius (Open source software for Metabolic flux analysis) provides the function of autogenerating metabolic models for simulating isotopic labeling enrichment from a user-defined configuration worksheet. Analysis using simulated data demonstrated the applicability of OpenMebius for INST-(13)C-MFA. Confidence intervals determined by INST-(13)C-MFA were less than those determined by conventional methods, indicating the potential of INST-(13)C-MFA for precise metabolic flux analysis. OpenMebius is the open source software for the general application of INST-(13)C-MFA.
2006-11-01
software components used in the ad hoc nodes for the C4ISR OTM experiment were OLSRD, an open-source proactive MANET routing software, and OpenVPN, an...developed by Mike Baker (openwrt.org). OpenVPN is a trademark of OpenVPN Solutions LLC. Secure communications in the MANET are achieved with...encryption provided by Wired Equivalent Privacy (WEP) and OpenVPN. The WEP protocol, which is part of the IEEE 802.11 wireless networking standard
[GNU Pattern: open source pattern hunter for biological sequences based on SPLASH algorithm].
Xu, Ying; Li, Yi-xue; Kong, Xiang-yin
2005-06-01
To construct a high performance open source software engine based on IBM SPLASH algorithm for later research on pattern discovery. Gpat, which is based on SPLASH algorithm, was developed by using open source software. GNU Pattern (Gpat) software was developed, which efficiently implemented the core part of SPLASH algorithm. Full source code of Gpat was also available for other researchers to modify the program under the GNU license. Gpat is a successful implementation of SPLASH algorithm and can be used as a basic framework for later research on pattern recognition in biological sequences.
Open Technology Approaches to Geospatial Interface Design
NASA Astrophysics Data System (ADS)
Crevensten, B.; Simmons, D.; Alaska Satellite Facility
2011-12-01
What problems do you not want your software developers to be solving? Choosing open technologies across the entire stack of software development (from low-level shared libraries to high-level user interaction implementations) is a way to help ensure that customized software yields innovative and valuable tools for Earth Scientists. This demonstration will review developments in web application technologies and the recurring patterns of interaction design regarding exploration and discovery of geospatial data through the Vertex: ASF's Dataportal interface, a project utilizing current open web application standards and technologies including HTML5, jQueryUI, Backbone.js and the Jasmine unit testing framework.
The GenABEL Project for statistical genomics
Karssen, Lennart C.; van Duijn, Cornelia M.; Aulchenko, Yurii S.
2016-01-01
Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the “core team”, facilitating agile statistical omics methodology development and fast dissemination. PMID:27347381
The role of open-source software in innovation and standardization in radiology.
Erickson, Bradley J; Langer, Steve; Nagy, Paul
2005-11-01
The use of open-source software (OSS), in which developers release the source code to applications they have developed, is popular in the software industry. This is done to allow others to modify and improve software (which may or may not be shared back to the community) and to allow others to learn from the software. Radiology was an early participant in this model, supporting OSS that implemented the ACR-National Electrical Manufacturers Association (now Digital Imaging and Communications in Medicine) standard for medical image communications. In radiology and in other fields, OSS has promoted innovation and the adoption of standards. Popular OSS is of high quality because access to source code allows many people to identify and resolve errors. Open-source software is analogous to the peer-review scientific process: one must be able to see and reproduce results to understand and promote what is shared. The authors emphasize that support for OSS need not threaten vendors; most vendors embrace and benefit from standards. Open-source development does not replace vendors but more clearly defines their roles, typically focusing on areas in which proprietary differentiators benefit customers and on professional services such as implementation planning and service. Continued support for OSS is essential for the success of our field.
Lee, Young Han
2012-01-01
The objectives are (1) to introduce an easy open-source macro program as connection software and (2) to illustrate the practical usages in radiologic reading environment by simulating the radiologic reading process. The simulation is a set of radiologic reading process to do a practical task in the radiologic reading room. The principal processes are: (1) to view radiologic images on the Picture Archiving and Communicating System (PACS), (2) to connect the HIS/EMR (Hospital Information System/Electronic Medical Record) system, (3) to make an automatic radiologic reporting system, and (4) to record and recall information of interesting cases. This simulation environment was designed by using open-source macro program as connection software. The simulation performed well on the Windows-based PACS workstation. Radiologists practiced the steps of the simulation comfortably by utilizing the macro-powered radiologic environment. This macro program could automate several manual cumbersome steps in the radiologic reading process. This program successfully acts as connection software for the PACS software, EMR/HIS, spreadsheet, and other various input devices in the radiologic reading environment. A user-friendly efficient radiologic reading environment could be established by utilizing open-source macro program as connection software. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Wang, Anliang; Yan, Xiaolong; Wei, Zhijun
2018-04-27
This note presents the design of a scalable software package named ImagePy for analysing biological images. Our contribution is concentrated on facilitating extensibility and interoperability of the software through decoupling the data model from the user interface. Especially with assistance from the Python ecosystem, this software framework makes modern computer algorithms easier to be applied in bioimage analysis. ImagePy is free and open source software, with documentation and code available at https://github.com/Image-Py/imagepy under the BSD license. It has been tested on the Windows, Mac and Linux operating systems. Contact: wzjdlut@dlut.edu.cn or yxdragon@imagepy.org.
USE OF COMPUTER-AIDED PROCESS ENGINEERING TOOL IN POLLUTION PREVENTION
Computer-Aided Process Engineering has become established in industry as a design tool, with the establishment of the CAPE-OPEN software specifications for process simulation environments. CAPE-OPEN provides a set of "middleware" standards that enable software developers to acces...
Large Eddy Simulations using oodlesDST
2016-01-01
Research Agency DST-Group-TR-3205 ABSTRACT The oodlesDST code is based on OpenFOAM software and performs Large Eddy Simulations of......maritime platforms using a variety of simulation techniques. He is currently using OpenFOAM software to perform both Reynolds Averaged Navier-Stokes
ERIC Educational Resources Information Center
Olsen, Florence
2003-01-01
Colleges and universities are beginning to consider collaborating on open-source-code projects as a way to meet critical software and computing needs. Points out the attractive features of noncommercial open-source software and describes some examples in use now, especially for the creation of Web infrastructure. (SLD)
Open Source Software to Control Bioflo Bioreactors
Burdge, David A.; Libourel, Igor G. L.
2014-01-01
Bioreactors are designed to support highly controlled environments for growth of tissues, cell cultures or microbial cultures. A variety of bioreactors are commercially available, often including sophisticated software to enhance the functionality of the bioreactor. However, experiments that the bioreactor hardware can support, but that were not envisioned during the software design cannot be performed without developing custom software. In addition, support for third party or custom designed auxiliary hardware is often sparse or absent. This work presents flexible open source freeware for the control of bioreactors of the Bioflo product family. The functionality of the software includes setpoint control, data logging, and protocol execution. Auxiliary hardware can be easily integrated and controlled through an integrated plugin interface without altering existing software. Simple experimental protocols can be entered as a CSV scripting file, and a Python-based protocol execution model is included for more demanding conditional experimental control. The software was designed to be a more flexible and free open source alternative to the commercially available solution. The source code and various auxiliary hardware plugins are publicly available for download from https://github.com/LibourelLab/BiofloSoftware. In addition to the source code, the software was compiled and packaged as a self-installing file for 32- and 64-bit Windows operating systems. The compiled software will be able to control a Bioflo system, and will not require the installation of LabVIEW. PMID:24667828
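The idea of entering a simple protocol as a CSV file can be illustrated with a small sketch. This is not the BiofloSoftware file format or API: the column names (`time_min`, `parameter`, `setpoint`), the parameter names and both functions are hypothetical, chosen only to show how timed setpoint changes might be read and resolved.

```python
import csv
import io

# Hypothetical protocol: each row changes one setpoint at a given time.
PROTOCOL_CSV = """time_min,parameter,setpoint
0,temperature_C,37.0
0,agitation_rpm,200
60,agitation_rpm,400
120,temperature_C,30.0
"""


def load_protocol(text):
    """Parse protocol rows into (time, parameter, value) tuples sorted by time."""
    rows = csv.DictReader(io.StringIO(text))
    steps = [
        (float(r["time_min"]), r["parameter"], float(r["setpoint"]))
        for r in rows
    ]
    return sorted(steps, key=lambda step: step[0])


def setpoints_at(steps, t_min):
    """Return the setpoints in force at time t_min (latest change wins)."""
    active = {}
    for time_min, parameter, value in steps:
        if time_min <= t_min:
            active[parameter] = value
    return active


steps = load_protocol(PROTOCOL_CSV)
print(setpoints_at(steps, 90))
```

A flat table like this cannot express conditions such as "raise agitation when dissolved oxygen drops", which is presumably why the abstract mentions a separate Python-based execution model for conditional control.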
Using Open Source Software in Visual Simulation Development
2005-09-01
increased the use of the technology in training activities. Using open source/free software tools in the process can expand these possibilities...resulting in even greater cost reduction and allowing the flexibility needed in a training environment. This thesis presents a configuration and architecture...to be used when developing training visual simulations using both personal computers and open source tools. Aspects of the requirements needed in a
Anatomy of BioJS, an open source community for the life sciences.
Yachdav, Guy; Goldberg, Tatyana; Wilzbach, Sebastian; Dao, David; Shih, Iris; Choudhary, Saket; Crouch, Steve; Franz, Max; García, Alexander; García, Leyla J; Grüning, Björn A; Inupakutika, Devasena; Sillitoe, Ian; Thanki, Anil S; Vieira, Bruno; Villaveces, José M; Schneider, Maria V; Lewis, Suzanna; Pettifer, Steve; Rost, Burkhard; Corpas, Manuel
2015-07-08
BioJS is an open source software project that develops visualization tools for different types of biological data. Here we report on the factors that influenced the growth of the BioJS user and developer community, and outline our strategy for building on this growth. The lessons we have learned on BioJS may also be relevant to other open source software projects.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parker, Andrew; Haves, Philip; Jegi, Subhash
This paper describes a software system for automatically generating a reference (baseline) building energy model from the proposed (as-designed) building energy model. This system is built using the OpenStudio Software Development Kit (SDK) and is designed to operate on building energy models in the OpenStudio file format.
Experiences Using OpenMP Based on Compiler Directed Software DSM on a PC Cluster
NASA Technical Reports Server (NTRS)
Hess, Matthias; Jost, Gabriele; Mueller, Matthias; Ruehle, Roland; Biegel, Bryan (Technical Monitor)
2002-01-01
In this work we report on our experiences running OpenMP (Open Multi-Processing) programs on a commodity cluster of PCs (personal computers) running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS (NASA Advanced Supercomputing) Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.
Embracing Open Software Development in Solar Physics
NASA Astrophysics Data System (ADS)
Hughitt, V. K.; Ireland, J.; Christe, S.; Mueller, D.
2012-12-01
We discuss two ongoing software projects in solar physics that have adopted best practices of the open source software community. The first, the Helioviewer Project, is a powerful data visualization tool which includes online and Java interfaces inspired by Google Maps (tm). This effort allows users to find solar features and events of interest, and download the corresponding data. Having found data of interest, the user now has to analyze it. The dominant solar data analysis platform is an open-source library called SolarSoft (SSW). Although SSW itself is open-source, the programming language used is IDL, a proprietary language with licensing costs that are prohibitive for many institutions and individuals. SSW is composed of a collection of related scripts written by missions and individuals for solar data processing and analysis, without any consistent data structures or common interfaces. Further, at the time when SSW was initially developed, many of the best software development processes of today (mirrored and distributed version control, unit testing, continuous integration, etc.) were not standard, and have not since been adopted. The challenges inherent in developing SolarSoft led to a second software project known as SunPy. SunPy is an open-source Python-based library which seeks to create a unified solar data analysis environment including a number of core datatypes such as Maps, Lightcurves, and Spectra which have consistent interfaces and behaviors. By taking advantage of the large and sophisticated body of scientific software already available in Python (e.g. SciPy, NumPy, Matplotlib), and by adopting many of the best practices refined in open-source software development, SunPy has been able to develop at a very rapid pace while still ensuring a high level of reliability. The Helioviewer Project and SunPy represent two pioneering technologies in solar physics - simple yet flexible data visualization and a powerful, new data analysis environment. 
We discuss the development of both these efforts and how they are beginning to influence the solar physics community.
Tracking Clouds with low cost GNSS chips aided by the Arduino platform
NASA Astrophysics Data System (ADS)
Hameed, Saji; Realini, Eugenio; Ishida, Shinya
2016-04-01
The Global Navigation Satellite System (GNSS) is a constellation of satellites that is used to provide geo-positioning services. Besides this application, the GNSS system is important for a wide range of scientific and civilian applications. For example, GNSS systems are routinely used in civilian applications such as surveying and scientific applications such as the study of crustal deformation. Another important scientific application of the GNSS system is in meteorological research. Here it is mainly used to determine the total water vapour content of the troposphere, hereafter Precipitable Water Vapor (PWV). However, both GNSS receivers and software are prohibitively expensive for a variety of reasons. To overcome this somewhat artificial barrier we are exploring the use of low-cost GNSS receivers along with open source GNSS software for scientific research, in particular for GNSS meteorology research. To achieve this aim, we have developed a custom Arduino compatible data logging board that is able to operate together with a specific low-cost single frequency GNSS receiver chip from NVS Technologies AG. We have also developed an open-source software bundle that includes a new Arduino core for the Atmel324p chip, which is the main processor used in our custom logger. We have also developed software code that enables data collection, logging and parsing of the GNSS data stream. Additionally we have comprehensively evaluated the low power characteristics of the GNSS receiver and logger boards. Currently we are exploring the use of several software packages that are open source or free to use for research to map GNSS delays to PWV. These include the open source goGPS (http://www.gogps-project.org/) and gLAB (http://gage.upc.edu/gLAB) and the openly available GAMIT software from Massachusetts Institute of Technology (MIT). We note that all the firmware and software developed as part of this project is available under an open source license.
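The step of mapping GNSS delays to PWV is often written as PWV = Pi * ZWD, where ZWD is the zenith wet delay and Pi is a dimensionless factor that depends on the weighted mean temperature of the atmosphere. The sketch below is an assumption about how such a conversion might look, not the procedure used by goGPS, gLAB or GAMIT: the function names are hypothetical, the refractivity constants are commonly cited literature values, and the linear relation between surface temperature and mean temperature is an empirical approximation that varies by region.

```python
# Commonly cited constants (assumed here, not taken from the abstract).
RHO_WATER = 1000.0   # density of liquid water, kg/m^3
R_VAPOUR = 461.5     # specific gas constant of water vapour, J/(kg K)
K2_PRIME = 0.221     # refractivity constant, K/Pa (22.1 K/hPa)
K3 = 3739.0          # refractivity constant, K^2/Pa (3.739e5 K^2/hPa)


def mean_temperature(surface_temp_k):
    """Approximate weighted mean temperature from surface temperature (K)."""
    return 70.2 + 0.72 * surface_temp_k


def conversion_factor(surface_temp_k):
    """Dimensionless factor Pi relating zenith wet delay to PWV."""
    tm = mean_temperature(surface_temp_k)
    return 1.0e6 / (RHO_WATER * R_VAPOUR * (K3 / tm + K2_PRIME))


def pwv_from_zwd(zwd_m, surface_temp_k):
    """Convert zenith wet delay (m) to precipitable water vapour (m)."""
    return conversion_factor(surface_temp_k) * zwd_m


# Hypothetical input: 0.15 m of wet delay at a surface temperature of 288.15 K.
pwv_m = pwv_from_zwd(0.15, 288.15)
print(f"Pi  = {conversion_factor(288.15):.4f}")
print(f"PWV = {pwv_m * 1000:.1f} mm")
```

The factor is roughly 0.15 for typical surface temperatures, so a wet delay of about 15 cm corresponds to a little over 2 cm of precipitable water. Obtaining ZWD in the first place requires subtracting a hydrostatic delay, usually modelled from surface pressure, from the total zenith delay estimated by the GNSS processing software; that step is outside this sketch.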
DOT National Transportation Integrated Search
2016-11-17
The ETFOMM (Enhanced Transportation Flow Open Source Microscopic Model) Cloud Service (ECS) is a software product sponsored by the U.S. Department of Transportation in conjunction with the Microscopic Traffic Simulation Models and SoftwareAn Op...
Scientific Software - the role of best practices and recommendations
NASA Astrophysics Data System (ADS)
Fritzsch, Bernadette; Bernstein, Erik; Castell, Wolfgang zu; Diesmann, Markus; Haas, Holger; Hammitzsch, Martin; Konrad, Uwe; Lähnemann, David; McHardy, Alice; Pampel, Heinz; Scheliga, Kaja; Schreiber, Andreas; Steglich, Dirk
2017-04-01
In Geosciences - like in most other communities - scientific work strongly depends on software. For big data analysis, existing (closed or open source) program packages are often mixed with newly developed codes. Different versions of software components and varying configurations can influence the result of data analysis. This often makes reproducibility of results and reuse of codes very difficult. Policies for publication and documentation of used and newly developed software, along with best practices, can help tackle this problem. Within the Helmholtz Association a Task Group "Access to and Re-use of scientific software" was implemented by the Open Science Working Group in 2016. The aim of the Task Group is to foster the discussion about scientific software in the Open Science context and to formulate recommendations for the production and publication of scientific software, ensuring open access to it. As a first step, a workshop gathered interested scientists from institutions across Germany. The workshop brought together various existing initiatives from different scientific communities to analyse current problems, share established best practices and come up with possible solutions. The subjects in the working groups covered a broad range of themes, including technical infrastructures, standards and quality assurance, citation of software and reproducibility. Initial recommendations are presented and discussed in the talk. They are the foundation for further discussions in the Helmholtz Association and the Priority Initiative "Digital Information" of the Alliance of Science Organisations in Germany. The talk aims to inform about the activities and to link with other initiatives on the national or international level.
Zhou, Ji; Applegate, Christopher; Alonso, Albor Dobon; Reynolds, Daniel; Orford, Simon; Mackiewicz, Michal; Griffiths, Simon; Penfield, Steven; Pullen, Nick
2017-01-01
Plants demonstrate dynamic growth phenotypes that are determined by genetic and environmental factors. Phenotypic analysis of growth features over time is a key approach to understand how plants interact with environmental change as well as respond to different treatments. Although the importance of measuring dynamic growth traits is widely recognised, available open software tools are limited in terms of batch image processing, multiple traits analyses, software usability and cross-referencing results between experiments, making automated phenotypic analysis problematic. Here, we present Leaf-GP (Growth Phenotypes), an easy-to-use and open software application that can be executed on different computing platforms. To facilitate diverse scientific communities, we provide three software versions, including a graphic user interface (GUI) for personal computer (PC) users, a command-line interface for high-performance computer (HPC) users, and a well-commented interactive Jupyter Notebook (also known as the iPython Notebook) for computational biologists and computer scientists. The software is capable of extracting multiple growth traits automatically from large image datasets. We have utilised it in Arabidopsis thaliana and wheat (Triticum aestivum) growth studies at the Norwich Research Park (NRP, UK). By quantifying a number of growth phenotypes over time, we have identified diverse plant growth patterns between different genotypes under several experimental conditions. As Leaf-GP has been evaluated with noisy image series acquired by different imaging devices (e.g. smartphones and digital cameras) and still produced reliable biological outputs, we therefore believe that our automated analysis workflow and customised computer vision based feature extraction software implementation can facilitate a broader plant research community for their growth and development studies. 
Furthermore, because we implemented Leaf-GP based on open Python-based computer vision, image analysis and machine learning libraries, we believe that our software not only can contribute to biological research, but also demonstrates how to utilise existing open numeric and scientific libraries (e.g. Scikit-image, OpenCV, SciPy and Scikit-learn) to build sound plant phenomics analytic solutions, in a efficient and effective way. Leaf-GP is a sophisticated software application that provides three approaches to quantify growth phenotypes from large image series. We demonstrate its usefulness and high accuracy based on two biological applications: (1) the quantification of growth traits for Arabidopsis genotypes under two temperature conditions; and (2) measuring wheat growth in the glasshouse over time. The software is easy-to-use and cross-platform, which can be executed on Mac OS, Windows and HPC, with open Python-based scientific libraries preinstalled. Our work presents the advancement of how to integrate computer vision, image analysis, machine learning and software engineering in plant phenomics software implementation. To serve the plant research community, our modulated source code, detailed comments, executables (.exe for Windows; .app for Mac), and experimental results are freely available at https://github.com/Crop-Phenomics-Group/Leaf-GP/releases.
Addressing Challenges in the Acquisition of Secure Software Systems With Open Architectures
2012-04-30
as a “broker” to market specific research topics identified by our sponsors to NPS graduate students. This three-pronged approach provides for a...breaks, and the day-ending socials. Many of our researchers use these occasions to establish new teaming arrangements for future research work. In the...software (CSS) and open source software (OSS). Federal government acquisition policy, as well as many leading enterprise IT centers, now encourage the use
NASA Technical Reports Server (NTRS)
Clancey, William J.; Lowry, Michael R.; Nado, Robert Allen; Sierhuis, Maarten
2011-01-01
We analyzed a series of ten systematically developed surface exploration systems that integrated a variety of hardware and software components. Design, development, and testing data suggest that incremental buildup of an exploration system for long-duration capabilities is facilitated by an open architecture with appropriate-level APIs, specifically designed to facilitate integration of new components. This improves software productivity by reducing changes required for reconfiguring an existing system.
An Open Avionics and Software Architecture to Support Future NASA Exploration Missions
NASA Technical Reports Server (NTRS)
Schlesinger, Adam
2017-01-01
The presentation describes an avionics and software architecture that has been developed through NASA's Advanced Exploration Systems (AES) division. The architecture is open-source, highly reliable with fault tolerance, and utilizes standard capabilities and interfaces, which are scalable and customizable to support future exploration missions. Specific focus areas of discussion will include command and data handling, software, human interfaces, communication and wireless systems, and systems engineering and integration.
Lessons learned in transitioning to an open systems environment
NASA Technical Reports Server (NTRS)
Boland, Dillard E.; Green, David S.; Steger, Warren L.
1994-01-01
Software development organizations, both commercial and governmental, are undergoing rapid change spurred by developments in the computing industry. To stay competitive, these organizations must adopt new technologies, skills, and practices quickly. Yet even for an organization with a well-developed set of software engineering models and processes, transitioning to a new technology can be expensive and risky. Current industry trends are leading away from traditional mainframe environments and toward the workstation-based, open systems world. This paper presents the experiences of software engineers on three recent projects that pioneered open systems development for NASA's Flight Dynamics Division of the Goddard Space Flight Center (GSFC).
Embracing Open Source for NASA's Earth Science Data Systems
NASA Technical Reports Server (NTRS)
Baynes, Katie; Pilone, Dan; Boller, Ryan; Meyer, David; Murphy, Kevin
2017-01-01
The overarching purpose of NASA's Earth Science program is to develop a scientific understanding of Earth as a system. Scientific knowledge is most robust and actionable when resulting from transparent, traceable, and reproducible methods. Reproducibility includes open access to the data as well as the software used to arrive at results. Additionally, software that is custom-developed for NASA should be open to the greatest degree possible, to enable re-use across Federal agencies, reduce overall costs to the government, remove barriers to innovation, and promote consistency through the use of uniform standards. Finally, Open Source Software (OSS) practices facilitate collaboration between agencies and the private sector. To best meet these ends, NASA's Earth Science Division promotes the full and open sharing of not only all data, metadata, products, information, documentation, models, images, and research results but also the source code used to generate, manipulate and analyze them. This talk focuses on the challenges to open sourcing NASA-developed software within ESD and the growing pains associated with establishing policies running the gamut of tracking issues, properly documenting build processes, engaging the open source community, maintaining internal compliance, and accepting contributions from external sources. This talk also covers the adoption of existing open source technologies and standards to enhance our custom solutions and our contributions back to the community. Finally, we will be introducing the most recent OSS contributions from the NASA Earth Science program and promoting these projects for wider community review and adoption.
Software Comparison for Renewable Energy Deployment in a Distribution Network
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gao, David Wenzhong; Muljadi, Eduard; Tian, Tian
The main objective of this report is to evaluate different software options for performing robust distributed generation (DG) power system modeling. The features and capabilities of four simulation tools, OpenDSS, GridLAB-D, CYMDIST, and PowerWorld Simulator, are compared to analyze their effectiveness in analyzing distribution networks with DG. OpenDSS and GridLAB-D, two open source packages, have the capability to simulate networks with fluctuating data values. These packages allow the running of a simulation at each time instant by iterating only the main script file. CYMDIST, a commercial package, allows for time-series simulation to study variations on network controls. PowerWorld Simulator, another commercial tool, has a batch mode simulation function through the 'Time Step Simulation' tool, which obtains solutions for a list of specified time points. PowerWorld Simulator is intended for analysis of transmission-level systems, while the other three are designed for distribution systems. CYMDIST and PowerWorld Simulator feature easy-to-use graphical user interfaces (GUIs). OpenDSS and GridLAB-D, on the other hand, are based on command-line programs, which increase the time necessary to become familiar with the software packages.
Noninvasive Fetal ECG: the PhysioNet/Computing in Cardiology Challenge 2013.
Silva, Ikaro; Behar, Joachim; Sameni, Reza; Zhu, Tingting; Oster, Julien; Clifford, Gari D; Moody, George B
2013-03-01
The PhysioNet/CinC 2013 Challenge aimed to stimulate rapid development and improvement of software for estimating fetal heart rate (FHR), fetal interbeat intervals (FRR), and fetal QT intervals (FQT), from multichannel recordings made using electrodes placed on the mother's abdomen. For the challenge, five data collections from a variety of sources were used to compile a large standardized database, which was divided into training, open test, and hidden test subsets. Gold-standard fetal QRS and QT interval annotations were developed using a novel crowd-sourcing framework. The challenge organizers used the hidden test subset to evaluate 91 open-source software entries submitted by 53 international teams of participants in three challenge events, estimating FHR, FRR, and FQT using the hidden test subset, which was not available for study by participants. Two additional events required only user-submitted QRS annotations to evaluate FHR and FRR estimation accuracy using the open test subset available to participants. The challenge yielded a total of 91 open-source software entries. The best of these achieved average estimation errors of 187 bpm² for FHR, 20.9 ms for FRR, and 152.7 ms for FQT. The open data sets, scoring software, and open-source entries are available at PhysioNet for researchers interested in working on these problems.
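To make the estimated quantities concrete, the following minimal Python sketch derives fetal interbeat (RR) intervals and instantaneous heart rate from a list of fetal QRS times, then computes a mean squared error in bpm². The function names, the sample annotation times and the reference values are hypothetical, and the error measure is only one plausible way to obtain a bpm² figure; this is not the official challenge scoring software.

```python
def rr_intervals_ms(qrs_times_ms):
    """Return successive interbeat (RR) intervals in milliseconds."""
    return [b - a for a, b in zip(qrs_times_ms, qrs_times_ms[1:])]


def heart_rate_bpm(rr_ms):
    """Convert RR intervals (ms) to instantaneous heart rate (beats per minute)."""
    return [60000.0 / rr for rr in rr_ms]


def mean_squared_error(estimate, reference):
    """Mean squared difference between two equally long series."""
    if len(estimate) != len(reference):
        raise ValueError("series must have the same length")
    return sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(estimate)


# Hypothetical fetal QRS annotation times, in milliseconds.
qrs_times = [0, 400, 800, 1250]
rr = rr_intervals_ms(qrs_times)
fhr = heart_rate_bpm(rr)

# Hypothetical reference heart-rate series, in bpm.
reference_fhr = [150.0, 150.0, 130.0]
fhr_error = mean_squared_error(fhr, reference_fhr)
```

Because the heart-rate error is a squared difference, its unit is bpm², whereas interval errors such as those for FRR and FQT are naturally expressed in milliseconds.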
World Reaction to Virtual Space
NASA Technical Reports Server (NTRS)
1999-01-01
DRaW Computing developed virtual reality software for the International Space Station. Open Worlds, as the software has been named, can be made to support Java scripting and virtual reality hardware devices. Open Worlds permits the use of VRML script nodes to add virtual reality capabilities to the user's applications.
Communal Resources in Open Source Software Development
ERIC Educational Resources Information Center
Spaeth, Sebastian; Haefliger, Stefan; von Krogh, Georg; Renzl, Birgit
2008-01-01
Introduction: Virtual communities play an important role in innovation. The paper focuses on the particular form of collective action in virtual communities underlying Open Source software development projects. Method: Building on resource mobilization theory and private-collective innovation, we propose a theory of collective action in…
75 FR 10439 - Cognitive Radio Technologies and Software Defined Radios
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-08
... Technologies and Software Defined Radios AGENCY: Federal Communications Commission. ACTION: Final rule. SUMMARY... concerning the use of open source software to implement security features in software defined radios (SDRs... ongoing technical developments in cognitive and software defined radio (SDR) technologies. 2. On April 20...
Hadlich, Marcelo Souza; Oliveira, Gláucia Maria Moraes; Feijóo, Raúl A; Azevedo, Clerio F; Tura, Bernardo Rangel; Ziemer, Paulo Gustavo Portela; Blanco, Pablo Javier; Pina, Gustavo; Meira, Márcio; Souza e Silva, Nelson Albuquerque de
2012-10-01
The standardization of images used in Medicine in 1993 was performed using the DICOM (Digital Imaging and Communications in Medicine) standard. Several tests use this standard and it is increasingly necessary to design software applications capable of handling this type of image; however, these software applications are not usually free and open-source, and this fact hinders their adjustment to most diverse interests. To develop and validate a free and open-source software application capable of handling DICOM coronary computed tomography angiography images. We developed and tested the ImageLab software in the evaluation of 100 tests randomly selected from a database. We carried out 600 tests divided between two observers using ImageLab and another software sold with Philips Brilliance computed tomography appliances in the evaluation of coronary lesions and plaques around the left main coronary artery (LMCA) and the anterior descending artery (ADA). To evaluate intraobserver, interobserver and intersoftware agreements, we used simple and kappa statistics agreements. The agreements observed between software applications were generally classified as substantial or almost perfect in most comparisons. The ImageLab software agreed with the Philips software in the evaluation of coronary computed tomography angiography tests, especially in patients without lesions, with lesions < 50% in the LMCA and < 70% in the ADA. The agreement for lesions > 70% in the ADA was lower, but this is also observed when the anatomical reference standard is used.
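The two agreement measures named in the abstract can be sketched in a few lines of Python. This is a generic illustration of simple (percent) agreement and Cohen's kappa for two raters; the function names and the lesion grades below are hypothetical and are not taken from the study data or the ImageLab code.

```python
from collections import Counter


def simple_agreement(ratings_a, ratings_b):
    """Proportion of cases on which the two raters give the same label."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ratings_a)
    p_observed = simple_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(counts_a) | set(counts_b)
    # Chance agreement from the product of each rater's marginal frequencies.
    p_expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical lesion grades assigned to six tests by two software packages.
software_1 = ["none", "<50%", "<50%", ">70%", "none", "none"]
software_2 = ["none", "<50%", ">70%", ">70%", "none", "<50%"]

agreement = simple_agreement(software_1, software_2)
kappa = cohen_kappa(software_1, software_2)
```

On the commonly used Landis and Koch scale, kappa values of 0.61 to 0.80 are labelled substantial and 0.81 to 1.00 almost perfect, which matches the wording used in the abstract.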
Bonnal, Raoul J P; Aerts, Jan; Githinji, George; Goto, Naohisa; MacLean, Dan; Miller, Chase A; Mishima, Hiroyuki; Pagani, Massimiliano; Ramirez-Gonzalez, Ricardo; Smant, Geert; Strozzi, Francesco; Syme, Rob; Vos, Rutger; Wennblom, Trevor J; Woodcroft, Ben J; Katayama, Toshiaki; Prins, Pjotr
2012-04-01
Biogem provides a software development environment for the Ruby programming language, which encourages community-based software development for bioinformatics while lowering the barrier to entry and encouraging best practices. Biogem, with its targeted modular and decentralized approach, software generator, tools and tight web integration, is an improved general model for scaling up collaborative open source software development in bioinformatics. Biogem and modules are free and are OSS. Biogem runs on all systems that support recent versions of Ruby, including Linux, Mac OS X and Windows. Further information at http://www.biogems.info. A tutorial is available at http://www.biogems.info/howto.html. Contact: bonnal@ingm.org.
Open Architecture SDR for Space
NASA Technical Reports Server (NTRS)
Smith, Carl; Long, Chris; Liebetreu, John; Reinhart, Richard C.
2005-01-01
This paper describes an open-architecture SDR (software defined radio) infrastructure that is suitable for space-based operations (Space-SDR). SDR technologies will endow space and planetary exploration systems with dramatically increased capability, reduced power consumption, and significantly less mass than conventional systems, at costs reduced by vigorous competition, hardware commonality, dense integration, reduced obsolescence, interoperability, and software re-use. Significant progress has been recorded on developments like the Joint Tactical Radio System (JTRS) Software Communication Architecture (SCA), which is oriented toward reconfigurable radios for defense forces operating in multiple theaters of engagement. The JTRS-SCA presents a consistent software interface for waveform development, and facilitates interoperability, waveform portability, software re-use, and technology evolution.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lehe, Remi
Many simulation codes produce data in the form of a set of field values or of a set of particle positions. (One such example is that of particle-in-cell codes, which produce data on the electromagnetic fields that they simulate.) However, each code uses its own format and layout for the output data. This makes it difficult to compare the results of different simulation codes, or to have a common visualization tool for these results. A standardized layout for fields and particles has recently been developed: the openPMD format (www.openpmd.org). This format is open-source, and specifies a standard way in which field data and particle data should be written. The openPMD format is already implemented in the particle-in-cell code Warp (developed at LBL) and in PIConGPU (developed at HZDR, Germany). In this context, the proposed software (openPMD-viewer) is a Python package that allows users to access and visualize any data which has been formatted according to the openPMD standard. This package contains two main components: a Python API, which allows users to read and extract the data from an openPMD file, so as to be able to work with it within the Python environment (e.g. plot the data and reprocess it with particular Python functions); and a graphical interface, which works with the ipython notebook, and allows users to quickly visualize the data and browse through a set of openPMD files. The proposed software will typically be used when analyzing the results of numerical simulations. It will be useful to quickly extract scientific meaning from a set of numerical data.
Open Source software and social networks: disruptive alternatives for medical imaging.
Ratib, Osman; Rosset, Antoine; Heuberger, Joris
2011-05-01
In recent decades several major changes in computer and communication technology have pushed the limits of imaging informatics and PACS beyond the traditional system architecture providing new perspectives and innovative approach to a traditionally conservative medical community. Disruptive technologies such as the world-wide-web, wireless networking, Open Source software and recent emergence of cyber communities and social networks have imposed an accelerated pace and major quantum leaps in the progress of computer and technology infrastructure applicable to medical imaging applications. This paper reviews the impact and potential benefits of two major trends in consumer market software development and how they will influence the future of medical imaging informatics. Open Source software is emerging as an attractive and cost effective alternative to traditional commercial software developments and collaborative social networks provide a new model of communication that is better suited to the needs of the medical community. Evidence shows that successful Open Source software tools have penetrated the medical market and have proven to be more robust and cost effective than their commercial counterparts. Developed by developers that are themselves part of the user community, these tools are usually better adapted to the user's need and are more robust than traditional software programs being developed and tested by a large number of contributing users. This context allows a much faster and more appropriate development and evolution of the software platforms. Similarly, communication technology has opened up to the general public in a way that has changed the social behavior and habits adding a new dimension to the way people communicate and interact with each other. The new paradigms have also slowly penetrated the professional market and ultimately the medical community. 
Secure social networks allowing groups of people to easily communicate and exchange information is a new model that is particularly suitable for some specific groups of healthcare professional and for physicians. It has also changed the expectations of how patients wish to communicate with their physicians. Emerging disruptive technologies and innovative paradigm such as Open Source software are leading the way to a new generation of information systems that slowly will change the way physicians and healthcare providers as well as patients will interact and communicate in the future. The impact of these new technologies is particularly effective in image communication, PACS and teleradiology. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Two-step web-mining approach to study geology/geophysics-related open-source software projects
NASA Astrophysics Data System (ADS)
Behrends, Knut; Conze, Ronald
2013-04-01
Geology/geophysics is a highly interdisciplinary science, overlapping with, for instance, physics, biology and chemistry. In today's software-intensive work environments, geoscientists often encounter new open-source software from scientific fields that are only remotely related to their own field of expertise. We show how web-mining techniques can help to carry out systematic discovery and evaluation of such software. In a first step, we downloaded ~500 abstracts (each consisting of ~1 kb UTF-8 text) from agu-fm12.abstractcentral.com. This web site hosts the abstracts of all publications presented at AGU Fall Meeting 2012, the world's largest annual geology/geophysics conference. All abstracts belonged to the category "Earth and Space Science Informatics", an interdisciplinary label cross-cutting many disciplines such as "deep biosphere", "atmospheric research", and "mineral physics". Each publication was represented by a highly structured record with ~20 short data attributes, the largest authorship-record being the unstructured "abstract" field. We processed texts of the abstracts with the statistics software "R" to calculate a corpus and a term-document matrix. Using R package "tm", we applied text-mining techniques to filter data and develop hypotheses about software-development activities happening in various geology/geophysics fields. Analyzing the term-document matrix with basic techniques (e.g., word frequencies, co-occurrences, weighting) as well as more complex methods (clustering, classification), several key pieces of information were extracted. For example, text-mining can be used to identify scientists who are also developers of open-source scientific software, and the names of their programming projects and codes can also be identified.
In a second step, based on the intermediate results found by processing the conference-abstracts, any new hypotheses can be tested in another webmining subproject: by merging the dataset with open data from github.com and stackoverflow.com. These popular, developer-centric websites have powerful application-programmer interfaces, and follow an open-data policy. In this regard, these sites offer a web-accessible reservoir of information that can be tapped to study questions such as: which open source software projects are eminent in the various geoscience fields? What are the most popular programming languages? How are they trending? Are there any interesting temporal patterns in committer activities? How large are programming teams and how do they change over time? What free software packages exist in the vast realms of related fields? Does the software from these fields have capabilities that might still be useful to me as a researcher, or can help me perform my work better? Are there any open-source projects that might be commercially interesting? This evaluation strategy reveals programming projects that tend to be new. As many important legacy codes are not hosted on open-source code-repositories, the presented search method might overlook some older projects.
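The term-document matrix at the centre of the first step can be illustrated with a short Python sketch. The authors worked in R with the "tm" package; the code below is only an analogous, standard-library illustration, and the tokenizer, function names and sample abstracts are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z][a-z0-9+#-]*", text.lower())


def term_document_matrix(documents):
    """Return (terms, matrix), where matrix[i][j] counts term i in document j."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    terms = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in terms]
    return terms, matrix


# Hypothetical stand-ins for downloaded conference abstracts.
abstracts = [
    "We released an open-source Python package for seismic data.",
    "The R package reads seismic and borehole data.",
    "Open-source Python tools for mineral physics.",
]

terms, matrix = term_document_matrix(abstracts)

# Word frequencies across the corpus are the row sums of the matrix.
frequencies = {term: sum(row) for term, row in zip(terms, matrix)}
```

Rows of such a matrix can then feed the basic techniques mentioned in the abstract, for example ranking terms by frequency or comparing rows to find terms that co-occur in the same documents.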
NASA Astrophysics Data System (ADS)
Lemmens, R.; Maathuis, B.; Mannaerts, C.; Foerster, T.; Schaeffer, B.; Wytzisk, A.
2009-12-01
This paper concerns easily accessible, integrated web-based analysis of satellite images with plug-in-based open source software. The paper is targeted to both users and developers of geospatial software. Guided by a use case scenario, we describe the ILWIS software and its toolbox to access satellite images through the GEONETCast broadcasting system. The last two decades have shown a major shift from stand-alone software systems to networked ones, often client/server applications using distributed geo-(web-)services. This allows organisations to combine without much effort their own data with remotely available data and processing functionality. Key to this integrated spatial data analysis is low-cost access to data from within user-friendly and flexible software. Web-based open source software solutions are more often a powerful option for developing countries. The Integrated Land and Water Information System (ILWIS) is a PC-based GIS & Remote Sensing software, comprising a complete package of image processing, spatial analysis and digital mapping and was developed as commercial software from the early nineties onwards. Recent project efforts have migrated ILWIS into a modular, plug-in-based open source software, and provide web-service support for OGC-based web mapping and processing. The core objective of the ILWIS Open source project is to provide a maintainable framework for researchers and software developers to implement training components, scientific toolboxes and (web-) services. The latest plug-ins have been developed for multi-criteria decision making, water resources analysis and spatial statistics analysis. The development of this framework has been under way since 2007 in the context of 52°North, which is an open initiative that advances the development of cutting edge open source geospatial software, using the GPL license.
GEONETCast, as part of the emerging Global Earth Observation System of Systems (GEOSS), puts essential environmental data at the fingertips of users around the globe. This user-friendly and low-cost information dissemination provides global information as a basis for decision-making in a number of critical areas, including public health, energy, agriculture, weather, water, climate, natural disasters and ecosystems. GEONETCast makes available satellite images via Digital Video Broadcast (DVB) technology. An OGC WMS interface and plug-ins which convert GEONETCast data streams allow an ILWIS user to integrate various distributed data sources with data locally stored on his machine. Our paper describes a use case in which ILWIS is used with GEONETCast satellite imagery for decision making processes in Ghana. We also explain how the ILWIS software can be extended with additional functionality by means of building plug-ins and unfold our plans to implement other OGC standards, such as WCS and WPS in the same context. Especially, the latter one can be seen as a major step forward in terms of moving well-proven desktop based processing functionality to the web. This enables the embedding of ILWIS functionality in Spatial Data Infrastructures or even the execution in scalable and on-demand cloud computing environments.
PyPedal, an open source software package for pedigree analysis
USDA-ARS's Scientific Manuscript database
The open source software package PyPedal (http://pypedal.sourceforge.net/) was first released in 2002, and provided users with a set of simple tools for manipulating pedigrees. Its flexibility has been demonstrated by its use in a number of settings for large and small populations. After substantia...
ERIC Educational Resources Information Center
Baser, Mustafa
2006-01-01
This paper reports upon an active learning approach that promotes conceptual change when studying direct current electricity circuits, using free open source software, "Qucs". The study involved a total of 102 prospective mathematics teacher students. Prior to instruction, students' understanding of direct current electricity was…
NASA Astrophysics Data System (ADS)
Tavakkol, Sasan; Lynett, Patrick
2017-08-01
In this paper, we introduce an interactive coastal wave simulation and visualization software, called Celeris. Celeris is an open source software which needs minimum preparation to run on a Windows machine. The software solves the extended Boussinesq equations using a hybrid finite volume-finite difference method and supports moving shoreline boundaries. The simulation and visualization are performed on the GPU using Direct3D libraries, which enables the software to run faster than real-time. Celeris provides a first-of-its-kind interactive modeling platform for coastal wave applications and it supports simultaneous visualization with both photorealistic and colormapped rendering capabilities. We validate our software through comparison with three standard benchmarks for non-breaking and breaking waves.
Whole earth modeling: developing and disseminating scientific software for computational geophysics.
NASA Astrophysics Data System (ADS)
Kellogg, L. H.
2016-12-01
Historically, a great deal of specialized scientific software for modeling and data analysis has been developed by individual researchers or small groups of scientists working on their own specific research problems. As the magnitude of available data and computer power has increased, so has the complexity of scientific problems addressed by computational methods, creating both a need to sustain existing scientific software, and expand its development to take advantage of new algorithms, new software approaches, and new computational hardware. To that end, communities like the Computational Infrastructure for Geodynamics (CIG) have been established to support the use of best practices in scientific computing for solid earth geophysics research and teaching. Working as a scientific community enables computational geophysicists to take advantage of technological developments, improve the accuracy and performance of software, build on prior software development, and collaborate more readily. The CIG community, and others, have adopted an open-source development model, in which code is developed and disseminated by the community in an open fashion, using version control and software repositories like Git. One emerging issue is how to adequately identify and credit the intellectual contributions involved in creating open source scientific software. The traditional method of disseminating scientific ideas, peer reviewed publication, was not designed for reviewing or crediting scientific software, although emerging publication strategies such as software journals are attempting to address the need. We are piloting an integrated approach in which authors are identified and credited as scientific software is developed and run. Successful software citation requires integration with the scholarly publication and indexing mechanisms as well, to assign credit, ensure discoverability, and provide provenance for software.
2012-10-01
use of R packages implemented in Bioconductor. Each dataset was normalized from raw data using the Frozen RMA (fRMA) algorithm. We applied the same...because development of the specific algorithms and fine tuning of the analytic strategy to accomplish this task was not immediately straightforward. We...express firefly luciferase using a retrovirus that encodes a fusion of luciferase and neomycin phosphotransferase (LucNeo), will be implanted and followed
The 2015 Bioinformatics Open Source Conference (BOSC 2015).
Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica
2016-02-01
The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.
Matlab-Excel Interface for OpenDSS
DOE Office of Scientific and Technical Information (OSTI.GOV)
The software allows users of the OpenDSS grid modeling software to access their load flow models using a GUI interface developed in MATLAB. The circuit definitions are entered into a Microsoft Excel spreadsheet which makes circuit creation and editing a much simpler process than the basic text-based editors used in the native OpenDSS interface. Plot tools have been developed which can be accessed through a MATLAB GUI once the desired parameters have been simulated.
Christopher W. Helm
2006-01-01
GLIMS is a NASA funded project that utilizes Open-Source Software to achieve its goal of creating a globally complete inventory of glaciers. The participation of many international institutions and the development of on-line mapping applications to provide access to glacial data have both been enhanced by Open-Source GIS capabilities and play a crucial role in the...
Probabilistic models of genetic variation in structured populations applied to global human studies.
Hao, Wei; Song, Minsun; Storey, John D
2016-03-01
Modern population genetics studies typically involve genome-wide genotyping of individuals from a diverse network of ancestries. An important problem is how to formulate and estimate probabilistic models of observed genotypes that account for complex population structure. The most prominent work on this problem has focused on estimating a model of admixture proportions of ancestral populations for each individual. Here, we instead focus on modeling variation of the genotypes without requiring a higher-level admixture interpretation. We formulate two general probabilistic models, and we propose computationally efficient algorithms to estimate them. First, we show how principal component analysis can be utilized to estimate a general model that includes the well-known Pritchard-Stephens-Donnelly admixture model as a special case. Noting some drawbacks of this approach, we introduce a new 'logistic factor analysis' framework that seeks to directly model the logit transformation of probabilities underlying observed genotypes in terms of latent variables that capture population structure. We demonstrate these advances on data from the Human Genome Diversity Panel and 1000 Genomes Project, where we are able to identify SNPs that are highly differentiated with respect to structure while making minimal modeling assumptions. A Bioconductor R package called lfa is available at http://www.bioconductor.org/packages/release/bioc/html/lfa.html. Contact: jstorey@princeton.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
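The logistic factor idea in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the estimation algorithm of the lfa package: the function names, the factor values and the loadings are hypothetical, and drawing each genotype as Binomial(2, pi) is an assumption made here for illustration.

```python
import math
import random

def inv_logit(z):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def allele_frequency(loadings, factors):
    """Allele frequency for one SNP in one individual.

    Assumed model: logit(pi) = sum_k loadings[k] * factors[k].
    """
    z = sum(a * h for a, h in zip(loadings, factors))
    return inv_logit(z)

def simulate_genotype(pi, rng):
    """Draw a genotype in {0, 1, 2} as two independent allele draws (assumption)."""
    return sum(1 for _ in range(2) if rng.random() < pi)

rng = random.Random(0)
# Hypothetical latent factors for three individuals; the first entry is an intercept.
individuals = [[1.0, -1.5], [1.0, 0.0], [1.0, 1.5]]
# Hypothetical loadings for one SNP: intercept and one structure factor.
snp_loadings = [-0.2, 1.0]

frequencies = [allele_frequency(snp_loadings, h) for h in individuals]
genotypes = [simulate_genotype(pi, rng) for pi in frequencies]
```

Because the structure enters through the logit, the individual-specific allele frequencies stay strictly between 0 and 1 whatever values the latent factors take.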
van den Broek, Evert; van Lieshout, Stef; Rausch, Christian; Ylstra, Bauke; van de Wiel, Mark A; Meijer, Gerrit A; Fijneman, Remond J A; Abeln, Sanne
2016-01-01
Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and are widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is a lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. 'GeneBreak' was developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks using a genome-wide approach, which can be applied to DNA copy number data obtained by array-Comparative Genomic Hybridization (CGH) or by (low-pass) whole genome sequencing (WGS). First, 'GeneBreak' collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm that was applied to obtain CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics are incorporated, with correction for covariates that influence the probability of being a breakpoint gene. In addition, multiple testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, 'GeneBreak', is implemented in R (www.cran.r-project.org) and is available from Bioconductor (www.bioconductor.org/packages/release/bioc/html/GeneBreak.html).
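The first two steps described above (collecting breakpoints from segmented profiles, then mapping them to genes) can be sketched as follows. This is a simplified illustration, not GeneBreak's R implementation: the data layout, function names, gene names and coordinates are all hypothetical, and the cohort statistics and multiple testing correction are left out.

```python
def find_breakpoints(segments):
    """Return (chrom, position) pairs where adjacent segments on the same
    chromosome carry different copy-number values."""
    breakpoints = []
    for left, right in zip(segments, segments[1:]):
        same_chrom = left["chrom"] == right["chrom"]
        if same_chrom and left["value"] != right["value"]:
            breakpoints.append((right["chrom"], right["start"]))
    return breakpoints

def genes_hit(breakpoints, genes):
    """Names of genes whose interval contains at least one breakpoint."""
    names = set()
    for chrom, pos in breakpoints:
        for gene in genes:
            if gene["chrom"] == chrom and gene["start"] <= pos <= gene["end"]:
                names.add(gene["name"])
    return names

def recurrence(samples, genes):
    """Count, per gene, the number of samples with a breakpoint in that gene."""
    counts = {gene["name"]: 0 for gene in genes}
    for segments in samples.values():
        for name in genes_hit(find_breakpoints(segments), genes):
            counts[name] += 1
    return counts

# Hypothetical gene annotation and segmented profiles for three samples.
genes = [
    {"name": "GENE_A", "chrom": "chr1", "start": 100, "end": 200},
    {"name": "GENE_B", "chrom": "chr1", "start": 500, "end": 600},
]
samples = {
    "s1": [{"chrom": "chr1", "start": 1, "end": 149, "value": 0},
           {"chrom": "chr1", "start": 150, "end": 1000, "value": 1}],
    "s2": [{"chrom": "chr1", "start": 1, "end": 549, "value": 0},
           {"chrom": "chr1", "start": 550, "end": 1000, "value": -1}],
    "s3": [{"chrom": "chr1", "start": 1, "end": 159, "value": 0},
           {"chrom": "chr1", "start": 160, "end": 1000, "value": 1},
           {"chrom": "chr2", "start": 1, "end": 500, "value": 0}],
}
counts = recurrence(samples, genes)
```

A change of value across a chromosome boundary (as in sample s3) is deliberately not counted as a breakpoint.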
RCP: a novel probe design bias correction method for Illumina Methylation BeadChip.
Niu, Liang; Xu, Zongli; Taylor, Jack A
2016-09-01
The Illumina HumanMethylation450 BeadChip has been extensively utilized in epigenome-wide association studies. This array and its successor, the MethylationEPIC array, use two types of probes, Infinium I (type I) and Infinium II (type II), in order to increase genome coverage, but differences in probe chemistries result in different type I and II distributions of methylation values. Ignoring the difference in distributions between the two probe types may bias downstream analysis. Here, we developed a novel method, called Regression on Correlated Probes (RCP), which uses the existing correlation between pairs of nearby type I and II probes to adjust the beta values of all type II probes. We evaluate the effect of this adjustment on reducing probe design type bias, reducing technical variation in duplicate samples, improving accuracy of measurements against known standards, and retention of biological signal. We find that RCP is statistically significantly better than unadjusted data or adjustment with alternative methods including SWAN and BMIQ. We incorporated the method into the R package ENmix, which is freely available from the Bioconductor website (https://www.bioconductor.org/packages/release/bioc/html/ENmix.html). Contact: niulg@ucmail.uc.edu. Supplementary data are available at Bioinformatics online. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.
Model-based variance-stabilizing transformation for Illumina microarray data.
Lin, Simon M; Du, Pan; Huber, Wolfgang; Kibbe, Warren A
2008-02-01
Variance stabilization is a step in the preprocessing of microarray data that can greatly benefit the performance of subsequent statistical modeling and inference. Due to the often limited number of technical replicates for Affymetrix and cDNA arrays, achieving variance stabilization can be difficult. Although the Illumina microarray platform provides a larger number of technical replicates on each array (usually over 30 randomly distributed beads per probe), these replicates have not been leveraged in the current log2 data transformation process. We devised a variance-stabilizing transformation (VST) method that takes advantage of the technical replicates available on an Illumina microarray. We have compared VST with log2 and Variance-stabilizing normalization (VSN) by using the Kruglyak bead-level data (2006) and Barnes titration data (2005). The results of the Kruglyak data suggest that VST stabilizes variances of bead-replicates within an array. The results of the Barnes data show that VST can improve the detection of differentially expressed genes and reduce false-positive identifications. We conclude that although both VST and VSN are built upon the same model of measurement noise, VST stabilizes the variance better and more efficiently for the Illumina platform by leveraging the availability of a larger number of within-array replicates. The algorithms and Supplementary Data are included in the lumi package of Bioconductor, available at: www.bioconductor.org.
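The reasoning behind any variance-stabilizing transformation can be sketched with the delta method. This is the generic textbook argument, not the specific noise model or the parameter estimates used in the lumi package; the symbols v, h and c are introduced here purely for illustration.

```latex
% Assume the standard deviation of an intensity Y depends on its mean \mu:
\operatorname{sd}(Y) = v(\mu)

% First-order (delta-method) approximation for a smooth transformation h:
\operatorname{Var}\bigl(h(Y)\bigr) \approx h'(\mu)^{2}\, v(\mu)^{2}

% Requiring the right-hand side to be constant (equal to 1) gives
h(y) = \int^{y} \frac{du}{v(u)}

% Special case: purely multiplicative noise, v(u) = c\,u, yields
h(y) = \frac{1}{c}\,\log y + \text{const}
```

Under this argument the logarithm stabilizes the variance only when the standard deviation is proportional to the mean. If the standard deviation does not vanish at low intensities, a different h is needed, and estimating v from within-array replicates is one way to obtain it.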
TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis
Ji, Zhicheng; Ji, Hongkai
2016-01-01
When analyzing single-cell RNA-seq data, constructing a pseudo-temporal path to order cells based on the gradual transition of their transcriptomes is a useful way to study gene expression dynamics in a heterogeneous cell population. Currently, a limited number of computational tools are available for this task, and quantitative methods for comparing different tools are lacking. Tools for Single Cell Analysis (TSCAN) is a software tool developed to better support in silico pseudo-Time reconstruction in Single-Cell RNA-seq ANalysis. TSCAN uses a cluster-based minimum spanning tree (MST) approach to order cells. Cells are first grouped into clusters and an MST is then constructed to connect cluster centers. Pseudo-time is obtained by projecting each cell onto the tree, and the ordered sequence of cells can be used to study dynamic changes of gene expression along the pseudo-time. Clustering cells before MST construction reduces the complexity of the tree space. This often leads to improved cell ordering. It also allows users to conveniently adjust the ordering based on prior knowledge. TSCAN has a graphical user interface (GUI) to support data visualization and user interaction. Furthermore, quantitative measures are developed to objectively evaluate and compare different pseudo-time reconstruction methods. TSCAN is available at https://github.com/zji90/TSCAN and as a Bioconductor package. PMID:27179027
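The cluster-based ordering described above can be sketched in a few steps: connect cluster centers with a minimum spanning tree, walk the tree, and project each cell onto it. The sketch below is not TSCAN's code. It assumes the cluster centers are already given, that the tree has no branches, and that the walk starts from a chosen end cluster; all coordinates and function names are hypothetical.

```python
import math

def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def minimum_spanning_tree(centers):
    """Prim's algorithm over cluster centers; returns a list of (i, j) edges."""
    in_tree = {0}
    edges = []
    while len(in_tree) < len(centers):
        best = None
        for i in in_tree:
            for j in range(len(centers)):
                if j in in_tree:
                    continue
                d = dist(centers[i], centers[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        edges.append((best[1], best[2]))
        in_tree.add(best[2])
    return edges

def path_from(edges, start):
    """Walk an unbranched tree from an end cluster; returns the cluster order."""
    neighbours = {}
    for i, j in edges:
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)
    order, previous, current = [start], None, start
    while True:
        nxt = [n for n in neighbours.get(current, []) if n != previous]
        if not nxt:
            return order
        previous, current = current, nxt[0]
        order.append(current)

def pseudotime(cell, centers, order):
    """Distance along the path to the cell's closest projection onto it."""
    best = None
    travelled = 0.0
    for a, b in zip(order, order[1:]):
        p, q = centers[a], centers[b]
        length = dist(p, q)
        # Position of the orthogonal projection along the segment, clamped to [0, 1].
        t = sum((c - x) * (y - x) for c, x, y in zip(cell, p, q)) / length ** 2
        t = max(0.0, min(1.0, t))
        foot = [x + t * (y - x) for x, y in zip(p, q)]
        gap = dist(cell, foot)
        if best is None or gap < best[0]:
            best = (gap, travelled + t * length)
        travelled += length
    return best[1]

# Hypothetical cluster centers and cells in a two-dimensional reduced space.
centers = [(0.0, 0.0), (4.0, 1.0), (2.0, 0.0)]
cells = [(0.5, 0.2), (3.0, 0.6), (1.5, -0.1)]

edges = minimum_spanning_tree(centers)
order = path_from(edges, start=0)
times = [pseudotime(cell, centers, order) for cell in cells]
ranking = sorted(range(len(cells)), key=lambda i: times[i])
```

Building the tree over a handful of cluster centers rather than over every cell is what keeps the tree space small, as the abstract notes.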
TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis.
Ji, Zhicheng; Ji, Hongkai
2016-07-27
When analyzing single-cell RNA-seq data, constructing a pseudo-temporal path to order cells based on the gradual transition of their transcriptomes is a useful way to study gene expression dynamics in a heterogeneous cell population. Currently, a limited number of computational tools are available for this task, and quantitative methods for comparing different tools are lacking. Tools for Single Cell Analysis (TSCAN) is a software tool developed to better support in silico pseudo-Time reconstruction in Single-Cell RNA-seq ANalysis. TSCAN uses a cluster-based minimum spanning tree (MST) approach to order cells. Cells are first grouped into clusters and an MST is then constructed to connect cluster centers. Pseudo-time is obtained by projecting each cell onto the tree, and the ordered sequence of cells can be used to study dynamic changes of gene expression along the pseudo-time. Clustering cells before MST construction reduces the complexity of the tree space. This often leads to improved cell ordering. It also allows users to conveniently adjust the ordering based on prior knowledge. TSCAN has a graphical user interface (GUI) to support data visualization and user interaction. Furthermore, quantitative measures are developed to objectively evaluate and compare different pseudo-time reconstruction methods. TSCAN is available at https://github.com/zji90/TSCAN and as a Bioconductor package. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Building a Snow Data Management System using Open Source Software (and IDL)
NASA Astrophysics Data System (ADS)
Goodale, C. E.; Mattmann, C. A.; Ramirez, P.; Hart, A. F.; Painter, T.; Zimdars, P. A.; Bryant, A.; Brodzik, M.; Skiles, M.; Seidel, F. C.; Rittger, K. E.
2012-12-01
At NASA's Jet Propulsion Laboratory, free and open source software is used every day to support a wide range of projects, from planetary to climate to research and development. In this abstract I will discuss the key role that open source software has played in building a robust science data processing pipeline for snow hydrology research, and how the system is also able to leverage programs written in IDL, making JPL's Snow Data System a hybrid of open source and proprietary software.
Main Points:
- The design of the Snow Data System (illustrate how the collection of sub-systems is combined to create a complete data processing pipeline)
- The challenges of moving from a single algorithm on a laptop to running hundreds of parallel algorithms on a cluster of servers (lessons learned)
- Code changes
- Software license related challenges
- Storage requirements
- System evolution (from data archiving, to data processing, to data on a map, to near-real-time products and maps)
- Road map for the next 6 months (including how easily we re-used the snowDS code base to support the Airborne Snow Observatory Mission)
Software in Use and their Software Licenses:
- IDL: used for pre- and post-processing of data. Licensed under a proprietary software license held by Exelis.
- Apache OODT: used for data management and workflow processing. Licensed under the Apache License Version 2.
- GDAL: geospatial data processing library currently used for data re-projection. Licensed under the X/MIT license.
- GeoServer: WMS server. Licensed under the General Public License Version 2.0.
- Leaflet.js: JavaScript web mapping library. Licensed under the Berkeley Software Distribution License.
- Python: glue code and miscellaneous data processing support. Licensed under the Python Software Foundation License.
- Perl: script wrapper for running the SCAG algorithm. Licensed under the General Public License Version 3.
- PHP: front-end web application programming. Licensed under the PHP License Version 3.01.
OpenCFU, a new free and open-source software to count cell colonies and other circular objects.
Geissmann, Quentin
2013-01-01
Counting circular objects such as cell colonies is an important source of information for biologists. Although this task is often time-consuming and subjective, it is still predominantly performed manually. The aim of the present work is to provide a new tool to enumerate circular objects from digital pictures and video streams. Here, I demonstrate that the created program, OpenCFU, is very robust, accurate and fast. In addition, it provides control over the processing parameters and is implemented in an intuitive and modern interface. OpenCFU is a cross-platform and open-source software freely available at http://opencfu.sourceforge.net.
NASA Technical Reports Server (NTRS)
Stensrud, Kjell C.; Hamm, Dustin
2007-01-01
NASA's Johnson Space Center (JSC) / Flight Design and Dynamics Division (DM) has prototyped the use of Open Source middleware technology for building its next generation spacecraft mission support system. This is part of a larger initiative to use open standards and open source software as building blocks for future mission and safety critical systems. JSC is hoping to leverage standardized enterprise architectures, such as Java EE, so that its internal software development efforts can be focused on the core aspects of their problem domain. This presentation will outline the design and implementation of the Trajectory system and the lessons learned during the exercise.
Open Source Next Generation Visualization Software for Interplanetary Missions
NASA Technical Reports Server (NTRS)
Trimble, Jay; Rinker, George
2016-01-01
Mission control is evolving quickly, driven by the requirements of new missions, and enabled by modern computing capabilities. Distributed operations, access to data anywhere, data visualization for spacecraft analysis that spans multiple data sources, flexible reconfiguration to support multiple missions, and operator use cases, are driving the need for new capabilities. NASA's Advanced Multi-Mission Operations System (AMMOS), Ames Research Center (ARC) and the Jet Propulsion Laboratory (JPL) are collaborating to build a new generation of mission operations software for visualization, to enable mission control anywhere, on the desktop, tablet and phone. The software is built on an open source platform that is open for contributions (http://nasa.github.io/openmct).
Fiji: an open-source platform for biological-image analysis.
Schindelin, Johannes; Arganda-Carreras, Ignacio; Frise, Erwin; Kaynig, Verena; Longair, Mark; Pietzsch, Tobias; Preibisch, Stephan; Rueden, Curtis; Saalfeld, Stephan; Schmid, Benjamin; Tinevez, Jean-Yves; White, Daniel James; Hartenstein, Volker; Eliceiri, Kevin; Tomancak, Pavel; Cardona, Albert
2012-06-28
Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.
Herbst, Christian T; Oh, Jinook; Vydrová, Jitka; Švec, Jan G
2015-07-01
In this short report we introduce DigitalVHI, a free open-source software application for obtaining Voice Handicap Index (VHI) and other questionnaire data, which can be put on a computer in clinics and used in clinical practice. The software can simplify performing clinical studies since it makes the VHI scores directly available for analysis in a digital form. It can be downloaded from http://www.christian-herbst.org/DigitalVHI/.
Open Radio Communications Architecture Core Framework V1.1.0 Volume 1 Software Users Manual
2005-02-01
on a PC utilizing the KDE desktop that comes with Red Hat Linux. The default desktop for most Red Hat Linux installations is the GNOME desktop. The...SCA) v2.2. The software was designed for a desktop computer running the Linux operating system (OS). It was developed in C++, uses ACE/TAO for CORBA...middleware, Xerces for the XML parser, and Red Hat Linux for the Operating System. The software is referred to as, Open Radio Communication
An object oriented implementation of the Yeadon human inertia model
Dembia, Christopher; Moore, Jason K.; Hubbard, Mont
2015-01-01
We present an open source software implementation of a popular mathematical method developed by M.R. Yeadon for calculating the body and segment inertia parameters of a human body. The software is written in a high level open source language and provides three interfaces for manipulating the data and the model: a Python API, a command-line user interface, and a graphical user interface. Thus the software can fit into various data processing pipelines and requires only simple geometrical measures as input. PMID:25717365
An object oriented implementation of the Yeadon human inertia model.
Dembia, Christopher; Moore, Jason K; Hubbard, Mont
2014-01-01
We present an open source software implementation of a popular mathematical method developed by M.R. Yeadon for calculating the body and segment inertia parameters of a human body. The software is written in a high level open source language and provides three interfaces for manipulating the data and the model: a Python API, a command-line user interface, and a graphical user interface. Thus the software can fit into various data processing pipelines and requires only simple geometrical measures as input.
The ImageJ ecosystem: an open platform for biomedical image analysis
Schindelin, Johannes; Rueden, Curtis T.; Hiner, Mark C.; Eliceiri, Kevin W.
2015-01-01
Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available – from commercial to academic, special-purpose to Swiss army knife, small to large–but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts life science, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. PMID:26153368
The ImageJ ecosystem: An open platform for biomedical image analysis.
Schindelin, Johannes; Rueden, Curtis T; Hiner, Mark C; Eliceiri, Kevin W
2015-01-01
Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available-from commercial to academic, special-purpose to Swiss army knife, small to large-but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on the life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts the life sciences, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. © 2015 Wiley Periodicals, Inc.
Gpufit: An open-source toolkit for GPU-accelerated curve fitting.
Przybylski, Adrian; Thiel, Björn; Keller-Findeisen, Jan; Stock, Bernd; Bates, Mark
2017-11-16
We present a general purpose, open-source software library for estimation of non-linear parameters by the Levenberg-Marquardt algorithm. The software, Gpufit, runs on a Graphics Processing Unit (GPU) and executes computations in parallel, resulting in a significant gain in performance. We measured a speed increase of up to 42 times when comparing Gpufit with an identical CPU-based algorithm, with no loss of precision or accuracy. Gpufit is designed such that it is easily incorporated into existing applications or adapted for new ones. Multiple software interfaces, including to C, Python, and Matlab, ensure that Gpufit is accessible from most programming environments. The full source code is published as an open source software repository, making its function transparent to the user and facilitating future improvements and extensions. As a demonstration, we used Gpufit to accelerate an existing scientific image analysis package, yielding significantly improved processing times for super-resolution fluorescence microscopy datasets.
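To make the Levenberg-Marquardt algorithm mentioned above concrete, here is a minimal single-fit sketch in plain Python. It does not use or imitate Gpufit's interfaces and runs on the CPU; the exponential model, the starting values and the damping schedule are assumptions chosen for illustration. A GPU library gains its speed by running many such independent fits in parallel.

```python
import math

def model(x, a, b):
    """Hypothetical two-parameter model: exponential decay."""
    return a * math.exp(-b * x)

def fit_lm(xs, ys, a, b, lam=1e-3, iterations=200):
    """Levenberg-Marquardt fit of a * exp(-b * x) to the points (xs, ys)."""
    def sse(a, b):
        return sum((y - model(x, a, b)) ** 2 for x, y in zip(xs, ys))

    cost = sse(a, b)
    for _ in range(iterations):
        # Accumulate J^T J and J^T r at the current parameters.
        jtj = [[0.0, 0.0], [0.0, 0.0]]
        jtr = [0.0, 0.0]
        for x, y in zip(xs, ys):
            e = math.exp(-b * x)
            grad = (e, -a * x * e)  # d model / d a, d model / d b
            r = y - a * e
            for i in range(2):
                jtr[i] += grad[i] * r
                for j in range(2):
                    jtj[i][j] += grad[i] * grad[j]
        # Damped normal equations: (J^T J + lam * diag(J^T J)) step = J^T r.
        m00 = jtj[0][0] * (1.0 + lam)
        m11 = jtj[1][1] * (1.0 + lam)
        m01 = jtj[0][1]
        det = m00 * m11 - m01 * m01
        if det == 0.0:
            break
        da = (m11 * jtr[0] - m01 * jtr[1]) / det
        db = (m00 * jtr[1] - m01 * jtr[0]) / det
        new_cost = sse(a + da, b + db)
        if new_cost < cost:
            # Accept the step and move towards Gauss-Newton behaviour.
            a, b, cost = a + da, b + db, new_cost
            lam /= 10.0
        else:
            # Reject the step and move towards gradient-descent behaviour.
            lam *= 10.0
        if cost < 1e-20:
            break
    return a, b

# Noise-free synthetic data generated from known parameters.
xs = [0.25 * i for i in range(17)]
ys = [model(x, 2.0, 1.3) for x in xs]
a_hat, b_hat = fit_lm(xs, ys, a=1.0, b=0.5)
```

The damping factor is the only state carried between iterations, which is one reason each fit is cheap to run independently of the others.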
Digital Preservation in Open-Source Digital Library Software
ERIC Educational Resources Information Center
Madalli, Devika P.; Barve, Sunita; Amin, Saiful
2012-01-01
Digital archives and digital library projects are being initiated all over the world for materials of different formats and domains. To organize, store, and retrieve digital content, many libraries as well as archiving centers are using either proprietary or open-source software. While it is accepted that print media can survive for centuries with…
Evaluating Open Source Portals
ERIC Educational Resources Information Center
Goh, Dion; Luyt, Brendan; Chua, Alton; Yee, See-Yong; Poh, Kia-Ngoh; Ng, How-Yeu
2008-01-01
Portals have become indispensable for organizations of all types trying to establish themselves on the Web. Unfortunately, there have only been a few evaluative studies of portal software and even fewer of open source portal software. This study aims to add to the available literature in this important area by proposing and testing a checklist for…
Open Source Projects in Software Engineering Education: A Mapping Study
ERIC Educational Resources Information Center
Nascimento, Debora M. C.; Almeida Bittencourt, Roberto; Chavez, Christina
2015-01-01
Context: It is common practice in academia to have students work with "toy" projects in software engineering (SE) courses. One way to make such courses more realistic and reduce the gap between academic courses and industry needs is getting students involved in open source projects (OSP) with faculty supervision. Objective: This study…
Prior, Fred W; Erickson, Bradley J; Tarbox, Lawrence
2007-11-01
The Cancer Bioinformatics Grid (caBIG) program was created by the National Cancer Institute to facilitate sharing of IT infrastructure, data, and applications among the National Cancer Institute-sponsored cancer research centers. The program was launched in February 2004 and now links more than 50 cancer centers. In April 2005, the In Vivo Imaging Workspace was added to promote the use of imaging in cancer clinical trials. At the inaugural meeting, four special interest groups (SIGs) were established. The Software SIG was charged with identifying projects that focus on open-source software for image visualization and analysis. To date, two projects have been defined by the Software SIG. The eXtensible Imaging Platform project has produced a rapid application development environment that researchers may use to create targeted workflows customized for specific research projects. The Algorithm Validation Tools project will provide a set of tools and data structures that will be used to capture measurement information and associated needed to allow a gold standard to be defined for the given database against which change analysis algorithms can be tested. Through these and future efforts, the caBIG In Vivo Imaging Workspace Software SIG endeavors to advance imaging informatics and provide new open-source software tools to advance cancer research.
Dimensional Error in Rapid Prototyping with Open Source Software and Low-cost 3D-printer
Andrade-Delgado, Laura; Telich-Tarriba, Jose E.; Fuente-del-Campo, Antonio; Altamirano-Arcos, Carlos A.
2018-01-01
Summary: Rapid prototyping models (RPMs) have been extensively used in craniofacial and maxillofacial surgery, especially in areas such as orthognathic surgery, posttraumatic or oncological reconstructions, and implantology. Economic limitations are greater in developing countries such as Mexico, where resources dedicated to health care are limited, which restricts the use of RPMs to a few selected centers. This article aims to determine the dimensional error of a low-cost fused deposition modeling (FDM) 3D printer (Tronxy P802MA, Shenzhen, Tronxy Technology Co) used with open source software. An ordinary dry human mandible was scanned with a computed tomography device. The data were processed with open software to build a rapid prototype with a fused deposition machine. Linear measurements were performed to find the mean absolute and relative difference. The mean absolute and relative difference was 0.65 mm and 1.96%, respectively (P = 0.96). Low-cost FDM machines and open source software are excellent options for manufacturing RPMs, with the benefit of low cost and a relative error similar to that of more expensive technologies. PMID:29464171
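The two summary figures reported above, mean absolute and mean relative difference, can be computed from paired linear measurements as in the sketch below. The measurement values are invented for illustration and are not the study's data; taking the relative difference with respect to the reference specimen is also an assumption.

```python
def dimensional_error(reference_mm, printed_mm):
    """Mean absolute difference (mm) and mean relative difference (%)
    between paired measurements of the reference and the printed model."""
    abs_diffs = [abs(p - r) for r, p in zip(reference_mm, printed_mm)]
    rel_diffs = [100.0 * d / r for d, r in zip(abs_diffs, reference_mm)]
    n = len(abs_diffs)
    return sum(abs_diffs) / n, sum(rel_diffs) / n

# Hypothetical paired linear measurements in millimetres.
reference = [100.0, 50.0, 25.0]
printed = [101.0, 49.5, 25.25]

mean_abs, mean_rel = dimensional_error(reference, printed)
```

Reporting both numbers matters because the same absolute error is a larger relative error on a short distance than on a long one.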
Dimensional Error in Rapid Prototyping with Open Source Software and Low-cost 3D-printer.
Rendón-Medina, Marco A; Andrade-Delgado, Laura; Telich-Tarriba, Jose E; Fuente-Del-Campo, Antonio; Altamirano-Arcos, Carlos A
2018-01-01
Rapid prototyping models (RPMs) have been extensively used in craniofacial and maxillofacial surgery, especially in areas such as orthognathic surgery, posttraumatic or oncological reconstructions, and implantology. Economic limitations are greater in developing countries such as Mexico, where resources dedicated to health care are limited, which restricts the use of RPMs to a few selected centers. This article aims to determine the dimensional error of a low-cost fused deposition modeling (FDM) 3D printer (Tronxy P802MA, Shenzhen, Tronxy Technology Co) used with open source software. An ordinary dry human mandible was scanned with a computed tomography device. The data were processed with open software to build a rapid prototype with a fused deposition machine. Linear measurements were performed to find the mean absolute and relative difference. The mean absolute and relative difference was 0.65 mm and 1.96%, respectively (P = 0.96). Low-cost FDM machines and open source software are excellent options for manufacturing RPMs, with the benefit of low cost and a relative error similar to that of more expensive technologies.
Open-Source web-based geographical information system for health exposure assessment
2012-01-01
This paper presents the design and development of an open source web-based Geographical Information System allowing users to visualise, customise and interact with spatial data within their web browser. The developed application shows that by using solely Open Source software it was possible to develop a customisable web based GIS application that provides functions necessary to convey health and environmental data to experts and non-experts alike without the requirement of proprietary software. PMID:22233606
Technology collaboration by means of an open source government
NASA Astrophysics Data System (ADS)
Berardi, Steven M.
2009-05-01
The idea of open source software originally began in the early 1980s, but it never gained widespread support until recently, largely due to the explosive growth of the Internet. Only the Internet has made this kind of concept possible, bringing together millions of software developers from around the world to pool their knowledge. The tremendous success of open source software has prompted many corporations to adopt the culture of open source and thus share information they previously held secret. The government, and specifically the Department of Defense (DoD), could also benefit from adopting an open source culture. In acquiring satellite systems, the DoD often builds walls between program offices, but installing doors between programs can promote collaboration and information sharing. This paper addresses the challenges and consequences of adopting an open source culture to facilitate technology collaboration for DoD space acquisitions. DISCLAIMER: The views presented here are the views of the author, and do not represent the views of the United States Government, United States Air Force, or the Missile Defense Agency.
NASA Astrophysics Data System (ADS)
Ames, D. P.
2013-12-01
As has been seen in other informatics fields, well-documented and appropriately licensed open source software tools have the potential to significantly increase both opportunities and motivation for inter-institutional science and technology collaboration. The CUAHSI HIS (and related HydroShare) projects have aimed to foster such activities in hydrology resulting in the development of many useful community software components including the HydroDesktop software application. HydroDesktop is an open source, GIS-based, scriptable software application for discovering data on the CUAHSI Hydrologic Information System and related resources. It includes a well-defined plugin architecture and interface to allow 3rd party developers to create extensions and add new functionality without requiring recompiling of the full source code. HydroDesktop is built in the C# programming language and uses the open source DotSpatial GIS engine for spatial data management. Capabilities include data search, discovery, download, visualization, and export. An extension that integrates the R programming language with HydroDesktop provides scripting and data automation capabilities and an OpenMI plugin provides the ability to link models. Current revision and updates to HydroDesktop include migration of core business logic to cross platform, scriptable Python code modules that can be executed in any operating system or linked into other software front-end applications.
Ganry, L; Hersant, B; Quilichini, J; Leyder, P; Meningaud, J P
2017-06-01
Tridimensional (3D) surgical modelling is a necessary step to create 3D-printed surgical tools, and expensive professional software is generally needed. Open-source software is functional, reliable and regularly updated, can be downloaded for free, and can be used to produce 3D models. Few surgical teams have used free solutions for mastering 3D surgical modelling for reconstructive surgery with osseous free flaps. We describe an open-source 3D surgical modelling protocol to perform a fast and nearly free mandibular reconstruction with a microvascular fibula free flap and its surgical guides, with no need for engineering support. Four specialised open-source programs were used in succession to perform our 3D modelling: OsiriX®, Meshlab®, Netfabb® and Blender®. Digital Imaging and Communications in Medicine (DICOM) data on the patient's skull and fibula, obtained with a computerised tomography (CT) scan, were needed. The 3D models of the reconstructed mandible and its surgical guides were created. This new strategy may improve surgical management in Oral and Craniomaxillofacial surgery. Further clinical studies are needed to demonstrate the feasibility, reproducibility, transfer of know-how and benefits of this technique. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
The TJO-OAdM robotic observatory: OpenROCS and dome control
NASA Astrophysics Data System (ADS)
Colomé, Josep; Francisco, Xavier; Ribas, Ignasi; Casteels, Kevin; Martín, Jonatan
2010-07-01
The Telescope Joan Oró at the Montsec Astronomical Observatory (TJO - OAdM) is a small-class observatory working in completely unattended control. There are key problems to solve when a robotic control is envisaged, both on hardware and software issues. We present the OpenROCS (ROCS stands for Robotic Observatory Control System), an open source platform developed for the robotic control of the TJO - OAdM and similar astronomical observatories. It is a complex software architecture, composed of several applications for hardware control, event handling, environment monitoring, target scheduling, image reduction pipeline, etc. The code is developed in Java, C++, Python and Perl. The software infrastructure used is based on the Internet Communications Engine (Ice), an object-oriented middleware that provides object-oriented remote procedure call, grid computing, and publish/subscribe functionality. We also describe the subsystem in charge of the dome control: several hardware and software elements developed to specially protect the system at this identified single point of failure. It integrates a redundant control and a rain detector signal for alarm triggering and it responds autonomously in case communication with any of the control elements is lost (watchdog functionality). The self-developed control software suite (OpenROCS) and dome control system have proven to be highly reliable.
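The watchdog behaviour described for the dome control (responding autonomously when communication with a control element is lost) can be illustrated with a minimal Python sketch. This is not OpenROCS code: the class name `Watchdog`, the timeout value and the `close_dome` action are hypothetical, and the clock is injectable only so the logic can be exercised without waiting.

```python
import time


class Watchdog:
    """Trips a fail-safe action when no heartbeat arrives within `timeout` seconds.

    Hypothetical illustration; names and behaviour are assumptions, not OpenROCS API.
    """

    def __init__(self, timeout, on_timeout, clock=time.monotonic):
        self.timeout = timeout          # seconds of silence tolerated
        self.on_timeout = on_timeout    # fail-safe callback, e.g. close the dome
        self.clock = clock              # injectable time source
        self.last_beat = clock()
        self.tripped = False

    def heartbeat(self):
        """Called by the monitored controller whenever it is alive."""
        self.last_beat = self.clock()

    def poll(self):
        """Check the deadline; fire the fail-safe once and stay latched."""
        if not self.tripped and self.clock() - self.last_beat > self.timeout:
            self.tripped = True
            self.on_timeout()
        return self.tripped


if __name__ == "__main__":
    wd = Watchdog(timeout=0.2, on_timeout=lambda: print("closing dome"))
    time.sleep(0.3)   # simulate lost communication
    wd.poll()
```

The latch is a deliberate choice in this sketch: once the fail-safe has fired, a late heartbeat does not silently undo it, so an operator would have to reset the system explicitly.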
OpenSim: open-source software to create and analyze dynamic simulations of movement.
Delp, Scott L; Anderson, Frank C; Arnold, Allison S; Loan, Peter; Habib, Ayman; John, Chand T; Guendelman, Eran; Thelen, Darryl G
2007-11-01
Dynamic simulations of movement allow one to study neuromuscular coordination, analyze athletic performance, and estimate internal loading of the musculoskeletal system. Simulations can also be used to identify the sources of pathological movement and establish a scientific basis for treatment planning. We have developed a freely available, open-source software system (OpenSim) that lets users develop models of musculoskeletal structures and create dynamic simulations of a wide variety of movements. We are using this system to simulate the dynamics of individuals with pathological gait and to explore the biomechanical effects of treatments. OpenSim provides a platform on which the biomechanics community can build a library of simulations that can be exchanged, tested, analyzed, and improved through a multi-institutional collaboration. Developing software that enables a concerted effort from many investigators poses technical and sociological challenges. Meeting those challenges will accelerate the discovery of principles that govern movement control and improve treatments for individuals with movement pathologies.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-12
... Packard Company, Business Critical Systems, Mission Critical Business Software Division, OpenVMS Operating... Business Software Division, OpenVMS Operating System Development Group, Including an Employee Operating Out... Company, Business Critical Systems, Mission Critical Business Software Division, OpenVMS Operating System...
What makes computational open source software libraries successful?
NASA Astrophysics Data System (ADS)
Bangerth, Wolfgang; Heister, Timo
2013-01-01
Software is the backbone of scientific computing. Yet, while we regularly publish detailed accounts about the results of scientific software, and while there is a general sense of which numerical methods work well, our community is largely unaware of best practices in writing the large-scale, open source scientific software upon which our discipline rests. This is particularly apparent in the commonly held view that writing successful software packages is largely the result of simply ‘being a good programmer’ when in fact there are many other factors involved, for example the social skill of community building. In this paper, we consider what we have found to be the necessary ingredients for successful scientific software projects and, in particular, for software libraries upon which the vast majority of scientific codes are built today. In particular, we discuss the roles of code, documentation, communities, project management and licenses. We also briefly comment on the impact on academic careers of engaging in software projects.
Enabling cost-effective multimodal trip planners through open transit data.
DOT National Transportation Integrated Search
2011-05-01
This study examined whether multimodal trip planners can be developed using opensource software and open data sources. OpenStreetMap (OSM), maintained by the nonprofit OpenStreetMap Foundation, is an open, freely available international reposit...
OpenFLUID: an open-source software environment for modelling fluxes in landscapes
NASA Astrophysics Data System (ADS)
Fabre, Jean-Christophe; Rabotin, Michaël; Crevoisier, David; Libres, Aline; Dagès, Cécile; Moussa, Roger; Lagacherie, Philippe; Raclot, Damien; Voltz, Marc
2013-04-01
Integrative landscape functioning has become a common concept in environmental management. Landscapes are complex systems where many processes interact in time and space. In agro-ecosystems, these processes are mainly physical processes, including hydrological processes, biological processes and human activities. Modelling such systems requires an interdisciplinary approach, coupling models that come from different disciplines and are developed by different teams. To support collaborative work involving many models coupled in time and space for integrative simulations, an open software modelling platform is a relevant answer. OpenFLUID is an open source software platform for modelling landscape functioning, mainly focused on spatial fluxes. It provides an advanced object-oriented architecture that makes it possible to i) couple models developed de novo or from existing source code, which are dynamically plugged into the platform, ii) represent landscapes as hierarchical graphs, taking into account multi-scale spatial heterogeneities and the connectivity of landscape objects, and iii) run and explore simulations in many ways: using the OpenFLUID user interfaces (command line interface, graphical user interface), or using external applications such as GNU R through the provided ROpenFLUID package. OpenFLUID is developed in C++ and relies on open source libraries only (Boost, libXML2, GLib/GTK, OGR/GDAL, …). For modelers and developers, OpenFLUID provides a dedicated environment for model development, which is based on an open source toolchain, including the Eclipse editor, the GCC compiler and the CMake build system. OpenFLUID is distributed under the GPLv3 open source license, with a special exception that allows plugging in existing models licensed under any license. It is clearly in the spirit of sharing knowledge and favouring collaboration in a community of modelers.
OpenFLUID has been involved in many research applications, such as modelling of hydrological network transfer, diagnosis and prediction of water quality taking into account human activities, study of the effect of spatial organization on hydrological fluxes, modelling of surface-subsurface water exchanges, … At the LISAH research unit, OpenFLUID is the supporting development platform of the MHYDAS model, which is a distributed model for agrosystems (Moussa et al., 2002, Hydrological Processes, 16, 393-412). OpenFLUID website: http://www.openfluid-project.org
Your Personal Analysis Toolkit - An Open Source Solution
NASA Astrophysics Data System (ADS)
Mitchell, T.
2009-12-01
Open source software is commonly known for its web browsers, word processors and programming languages. However, there is a vast array of open source software focused on geographic information management and geospatial application building in general. As geo-professionals, having easy access to tools for our jobs is crucial. Open source software provides the opportunity to add a tool to your tool belt and carry it with you for your entire career - with no license fees, a supportive community and the opportunity to test, adopt and upgrade at your own pace. OSGeo is a US registered non-profit representing more than a dozen mature geospatial data management applications and programming resources. Tools cover areas such as desktop GIS, web-based mapping frameworks, metadata cataloging, spatial database analysis, image processing and more. Learn about some of these tools as they apply to AGU members, as well as how you can join OSGeo and its members in getting the job done with powerful open source tools. If you haven't heard of OSSIM, MapServer, OpenLayers, PostGIS, GRASS GIS or the many other projects under our umbrella - then you need to hear this talk. Invest in yourself - use open source!
Falcon: a highly flexible open-source software for closed-loop neuroscience
NASA Astrophysics Data System (ADS)
Ciliberti, Davide; Kloosterman, Fabian
2017-08-01
Objective. Closed-loop experiments provide unique insights into brain dynamics and function. To facilitate a wide range of closed-loop experiments, we created an open-source software platform that enables high-performance real-time processing of streaming experimental data. Approach. We wrote Falcon, a C++ multi-threaded software in which the user can load and execute an arbitrary processing graph. Each node of a Falcon graph is mapped to a single thread and nodes communicate with each other through thread-safe buffers. The framework allows for easy implementation of new processing nodes and data types. Falcon was tested both on a 32-core and a 4-core workstation. Streaming data was read from either a commercial acquisition system (Neuralynx) or the open-source Open Ephys hardware, while closed-loop TTL pulses were generated with a USB module for digital output. We characterized the round-trip latency of our Falcon-based closed-loop system, as well as the specific latency contribution of the software architecture, by testing processing graphs with up to 32 parallel pipelines and eight serial stages. We finally deployed Falcon in a task of real-time detection of population bursts recorded live from the hippocampus of a freely moving rat. Main results. On Neuralynx hardware, round-trip latency was well below 1 ms and stable for at least 1 h, while on Open Ephys hardware latencies were below 15 ms. The latency contribution of the software was below 0.5 ms. Round-trip and software latencies were similar on both 32- and 4-core workstations. Falcon was used successfully to detect population bursts online with ~40 ms average latency. Significance. Falcon is a novel open-source software for closed-loop neuroscience. It has sub-millisecond intrinsic latency and gives the experimenter direct control of CPU resources. 
We envisage Falcon to be a useful tool to the neuroscientific community for implementing a wide variety of closed-loop experiments, including those requiring use of complex data structures and real-time execution of computationally intensive algorithms, such as population neural decoding/encoding from large cell assemblies.
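The architecture described in the abstract (each node of the processing graph mapped to its own thread, with nodes communicating through thread-safe buffers) can be sketched as follows. Falcon itself is written in C++; this Python version is only an illustration of the pattern, restricted to a linear chain of stages rather than an arbitrary graph, and the names `run_node`, `run_pipeline` and `STOP` are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel telling a node to shut down and pass the signal on


def run_node(func, inbox, outbox):
    """Body of one node thread: read from inbox, process, write to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)
            break
        outbox.put(func(item))


def run_pipeline(stages, samples):
    """Run `samples` through a serial chain of stages, one thread per stage."""
    # queue.Queue is thread-safe, so it plays the role of the inter-node buffer.
    buffers = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_node, args=(func, buffers[i], buffers[i + 1]))
        for i, func in enumerate(stages)
    ]
    for thread in threads:
        thread.start()
    for sample in samples:
        buffers[0].put(sample)
    buffers[0].put(STOP)

    results = []
    while True:
        item = buffers[-1].get()
        if item is STOP:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([lambda x: x * 2, lambda x: x + 1], [1, 2, 3]))
```

Because each stage has a single thread and the buffers are first-in, first-out, sample order is preserved along the chain. The sketch says nothing about latency; Python threads would not reach the sub-millisecond figures reported for the C++ implementation.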
A Flexible Method for Producing F.E.M. Analysis of Bone Using Open-Source Software
NASA Technical Reports Server (NTRS)
Boppana, Abhishektha; Sefcik, Ryan; Meyers, Jerry G.; Lewandowski, Beth E.
2016-01-01
This project, performed in support of the NASA GRC Space Academy summer program, sought to develop an open-source workflow methodology that segmented medical image data, created a 3D model from the segmented data, and prepared the model for finite-element analysis. In an initial step, a technological survey evaluated the performance of various existing open-source software packages that claim to perform these tasks. However, the survey concluded that no single package exhibited the wide array of functionality required for the potential NASA application in the area of bone, muscle and biofluidic studies. As a result, a series of Python scripts was developed to provide the bridging mechanism that addresses the shortcomings of the available open-source tools. The VTK library provided the quickest and most effective means of segmenting regions of interest from the medical images; it allowed for the export of a 3D model by using the marching cubes algorithm to build a surface mesh. Developing the model domain from this extracted information required the surface mesh to be processed in the open-source software packages Blender and Gmsh. The Preview program of the FEBio suite proved to be sufficient for volume filling the model with an unstructured mesh and preparing boundary specifications for finite element analysis. To fully allow FEM modeling, an in-house Python script allowed assignment of material properties on an element-by-element basis by performing a weighted interpolation of voxel intensity of the parent medical image, correlated with published relationships between image intensity and material properties such as ash density. A graphical user interface combined the Python scripts and other software into a user-friendly interface. The work using Python scripts provides a potential alternative to expensive commercial software and inadequate, limited open-source freeware programs for the creation of 3D computational models.
More work will be needed to validate this approach in creating finite-element models.
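The element-by-element material assignment described above (a weighted interpolation of voxel intensity, then a mapping from intensity to a material property) might look roughly like the sketch below. The abstract does not give the weighting scheme or the calibration, so everything here is an assumption: inverse-distance weights are used as one plausible choice, the intensity-to-density relation is taken to be linear, and the function names and coefficients are hypothetical.

```python
import math


def weighted_intensity(point, voxels, eps=1e-9):
    """Inverse-distance-weighted mean intensity at `point`.

    voxels: iterable of ((x, y, z), intensity) pairs near the element.
    The inverse-distance weighting is an assumption, not the authors' scheme.
    """
    num = 0.0
    den = 0.0
    for centre, intensity in voxels:
        d = math.dist(point, centre)
        if d < eps:
            # The point coincides with a voxel centre: use that voxel directly.
            return float(intensity)
        w = 1.0 / d
        num += w * intensity
        den += w
    if den == 0.0:
        raise ValueError("no voxels supplied")
    return num / den


def intensity_to_density(intensity, slope, intercept):
    """Hypothetical linear calibration from image intensity to density."""
    return slope * intensity + intercept


if __name__ == "__main__":
    voxels = [((0.0, 0.0, 0.0), 100.0), ((2.0, 0.0, 0.0), 300.0)]
    centroid = (1.0, 0.0, 0.0)          # element centroid, illustrative only
    hu = weighted_intensity(centroid, voxels)
    print(hu, intensity_to_density(hu, slope=0.001, intercept=0.1))
```

In practice the slope and intercept would come from a published calibration for the scanner and tissue in question; the values used here are placeholders.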
OpenStereo: Open Source, Cross-Platform Software for Structural Geology Analysis
NASA Astrophysics Data System (ADS)
Grohmann, C. H.; Campanha, G. A.
2010-12-01
Free and open source software (FOSS) is increasingly seen as synonymous with innovation and progress. Freedom to run, copy, distribute, study, change and improve the software (through access to the source code) ensures a high level of positive feedback between users and developers, which results in stable, secure and constantly updated systems. Several software packages for structural geology analysis are available to the user, either under commercial licenses or as downloads at no cost from the Internet. Some provide basic tools for stereographic projections such as plotting poles, great circles, density contouring, eigenvector analysis, data rotation, etc., while others perform more specific tasks, such as paleostress or geotechnical/rock stability analysis. This variety also means a wide range of input data formats, Graphical User Interface (GUI) designs and graphic export formats. The majority of packages are built for MS-Windows and, even though there are packages for the UNIX-based MacOS, there are no native packages for *nix (UNIX, Linux, BSD, etc.) Operating Systems (OS), forcing users to run these programs with emulators or virtual machines. Those limitations led us to develop OpenStereo, an open source, cross-platform software for stereographic projections and structural geology. The software is written in Python, a high-level, cross-platform programming language, and the GUI is designed with wxPython, which provides a consistent look regardless of the OS. Numeric operations (like matrix and linear algebra) are performed with the Numpy module and all graphic capabilities are provided by the Matplotlib library, including on-screen plotting and graphic exporting to common desktop formats (emf, eps, ps, pdf, png, svg). Data input is done with simple ASCII text files, with values of dip direction and dip/plunge separated by spaces, tabs or commas.
The user can open multiple files at the same time (or the same file more than once), and overlay different elements of each dataset (poles, great circles, etc.). The GUI shows the opened files in a tree structure, similar to the “layers” of many illustration programs, where the vertical order of the files in the tree reflects the drawing order of the selected elements. At this stage, the software performs plotting operations of poles to planes, lineations, great circles, density contours and rose diagrams. A set of statistics is calculated for each file and its eigenvalues and eigenvectors are used to suggest whether the data are clustered about a mean value or distributed along a girdle. Modified Flinn, triangular and histogram plots are also available. The next step of development will focus on tools such as merging and rotation of datasets, the possibility to save 'projects' and paleostress analysis. In its current state, OpenStereo requires Python, wxPython, Numpy and Matplotlib installed in the system. We recommend installing PythonXY or the Enthought Python Distribution on MS-Windows and MacOS machines, since all dependencies are provided. Most Linux distributions provide an easy way to install all dependencies through software repositories. OpenStereo is released under the GNU General Public License. Programmers willing to contribute are encouraged to contact the authors directly. FAPESP Grant #09/17675-5
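The input format described above (dip direction and dip separated by spaces, tabs or commas) and the pole-to-plane plot can be illustrated with a short standard-library sketch. It is not taken from the OpenStereo source; the function names are hypothetical, and angles are assumed to be in degrees with dip direction measured clockwise from north.

```python
import re


def parse_orientations(text):
    """Parse lines of 'dip_direction dip' separated by spaces, tabs or commas."""
    data = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [f for f in re.split(r"[,\s]+", line) if f]
        data.append((float(fields[0]), float(fields[1])))
    return data


def pole_to_plane(dip_direction, dip):
    """Trend and plunge of the pole (normal) to a plane.

    The pole points opposite to the dip direction and plunges at 90 - dip.
    """
    trend = (dip_direction + 180.0) % 360.0
    plunge = 90.0 - dip
    return trend, plunge


if __name__ == "__main__":
    sample = "120 30\n45, 60\n300\t10\n"
    for dip_dir, dip in parse_orientations(sample):
        print(dip_dir, dip, pole_to_plane(dip_dir, dip))
```

A steeply dipping plane therefore has a shallowly plunging pole, which is why poles to near-vertical planes plot close to the edge of the stereonet.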
The 2015 Bioinformatics Open Source Conference (BOSC 2015)
Harris, Nomi L.; Cock, Peter J. A.; Lapp, Hilmar
2016-01-01
The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included “Data Science;” “Standards and Interoperability;” “Open Science and Reproducibility;” “Translational Bioinformatics;” “Visualization;” and “Bioinformatics Open Source Project Updates”. In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled “Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community,” that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653
Open Standards, Open Source, and Open Innovation: Harnessing the Benefits of Openness
ERIC Educational Resources Information Center
Committee for Economic Development, 2006
2006-01-01
Digitization of information and the Internet have profoundly expanded the capacity for openness. This report details the benefits of openness in three areas--open standards, open-source software, and open innovation--and examines the major issues in the debate over whether openness should be encouraged or not. The report explains each of these…
NASA Astrophysics Data System (ADS)
Barlow, P. M.; Filali-Meknassi, Y.; Sanford, W. E.; Winston, R. B.; Kuniansky, E.; Dawson, C.
2015-12-01
UNESCO's HOPE Initiative (the Hydro Free and (or) Open-source Platform of Experts) was launched in June 2013 as part of UNESCO's International Hydrological Programme. The Initiative arose in response to a recognized need to make free and (or) open-source water-resources software more widely accessible to Africa's water sector. A kit of software is being developed to provide African water authorities, teachers, university lecturers, and researchers with a set of programs that can be enhanced and (or) applied to the development of efficient and sustainable management strategies for Africa's water resources. The Initiative brings together experts from the many fields of water resources to identify software that might be included in the kit, to oversee an objective process for selecting software for the kit, and to engage in training and other modes of capacity building to enhance dissemination of the software. To date, teams of experts from the fields of wastewater treatment, groundwater hydrology, surface-water hydrology, and data management have been formed to identify relevant software from their respective fields. An initial version of the HOPE Software Kit was released in late August 2014 and consists of the STOAT model for wastewater treatment developed by the Water Research Center (United Kingdom) and the MODFLOW-2005 model for groundwater-flow simulation developed by the U.S. Geological Survey. The Kit is available on the UNESCO HOPE website (http://www.hope-initiative.net/). Training in the theory and use of MODFLOW-2005 is planned in southern Africa in conjunction with UNESCO's study of the Kalahari-Karoo/Stampriet Transboundary Aquifer, which extends over an area that includes parts of Botswana, Namibia, and South Africa, and in support of the European Commission's Horizon 2020 FREEWAT project (FREE and open source software tools for WATer resource management; see the UNESCO HOPE website).
Wenig, Philip; Odermatt, Juergen
2010-07-30
Today, data evaluation has become a bottleneck in chromatographic science. Analytical instruments equipped with automated samplers yield large amounts of measurement data, which needs to be verified and analyzed. Since nearly every GC/MS instrument vendor offers its own data format and software tools, the consequences are problems with data exchange and a lack of comparability between the analytical results. To challenge this situation a number of either commercial or non-profit software applications have been developed. These applications provide functionalities to import and analyze several data formats but have shortcomings in terms of the transparency of the implemented analytical algorithms and/or are restricted to a specific computer platform. This work describes a native approach to handle chromatographic data files. The approach can be extended in its functionality such as facilities to detect baselines, to detect, integrate and identify peaks and to compare mass spectra, as well as the ability to internationalize the application. Additionally, filters can be applied on the chromatographic data to enhance its quality, for example to remove background and noise. Extended operations like do, undo and redo are supported. OpenChrom is a software application to edit and analyze mass spectrometric chromatographic data. It is extensible in many different ways, depending on the demands of the users or the analytical procedures and algorithms. It offers a customizable graphical user interface. The software is independent of the operating system, due to the fact that the Rich Client Platform is written in Java. OpenChrom is released under the Eclipse Public License 1.0 (EPL). There are no license constraints regarding extensions. They can be published using open source as well as proprietary licenses. OpenChrom is available free of charge at http://www.openchrom.net.
The UARS and open data concept and analysis study. [upper atmosphere
NASA Technical Reports Server (NTRS)
Mittal, M.; Nebb, J.; Woodward, H.
1983-01-01
Alternative concepts for a common design for the UARS and OPEN Central Data Handling Facility (CDHF) are offered. Costs for alternative implementations of the UARS designs are presented, showing that the system design does not restrict the implementation to a single manufacturer. Processing demands on the alternative UARS CDHF implementations are then discussed. With this information at hand, together with estimates for OPEN processing demands, it is shown that any shortfall in system capability for OPEN support can be remedied by either component upgrades or array processing attachments rather than a system redesign. In addition to a common system design, it is shown that there is significant potential for common software design, especially in the areas of data management software and non-user-unique production software. Archiving of the CDHF data is discussed. Following that, cost examples for several modes of communications between the CDHF and Remote User Facilities are presented. Technology application is discussed.
A Quantitative Analysis of Open Source Software's Acceptability as Production-Quality Code
ERIC Educational Resources Information Center
Fischer, Michael
2011-01-01
The difficulty in writing defect-free software has long been acknowledged by both academia and industry. A constant battle occurs as developers seek to craft software that works within aggressive business schedules and deadlines. Many tools and techniques are used in an attempt to manage these software projects. Software metrics are a tool that has…
Data-Driven Software Framework for Web-Based ISS Telescience
NASA Technical Reports Server (NTRS)
Tso, Kam S.
2005-01-01
Software that enables authorized users to monitor and control scientific payloads aboard the International Space Station (ISS) from diverse terrestrial locations equipped with Internet connections is undergoing development. This software reflects a data-driven approach to distributed operations. A Web-based software framework leverages prior developments in Java and Extensible Markup Language (XML) to create portable code and portable data, to which one can gain access via Web-browser software on almost any common computer. Open-source software is used extensively to minimize cost; the framework also accommodates enterprise-class server software to satisfy needs for high performance and security. To accommodate the diversity of ISS experiments and users, the framework emphasizes openness and extensibility. Users can take advantage of available viewer software to create their own client programs according to their particular preferences, and can upload these programs for custom processing of data, generation of views, and planning of experiments. The same software system, possibly augmented with a subset of data and additional software tools, could be used for public outreach by enabling public users to replay telescience experiments, conduct their experiments with simulated payloads, and create their own client programs and other custom software.
OpenCFU, a New Free and Open-Source Software to Count Cell Colonies and Other Circular Objects
Geissmann, Quentin
2013-01-01
Counting circular objects such as cell colonies is an important source of information for biologists. Although this task is often time-consuming and subjective, it is still predominantly performed manually. The aim of the present work is to provide a new tool to enumerate circular objects from digital pictures and video streams. Here, I demonstrate that the created program, OpenCFU, is very robust, accurate and fast. In addition, it provides control over the processing parameters and is implemented in an intuitive and modern interface. OpenCFU is a cross-platform and open-source software freely available at http://opencfu.sourceforge.net. PMID:23457446
ERIC Educational Resources Information Center
Vanfretti, L.; Milano, F.
2012-01-01
This paper describes how the use of free and open-source software (FOSS) can facilitate the application of constructive alignment theory in power systems engineering education by enabling the deep learning approach in power system analysis courses. With this aim, this paper describes the authors' approach in using the Power System Analysis Toolbox…
ERIC Educational Resources Information Center
Wales, Tim; Robertson, Penny
2008-01-01
Purpose: The aim of this paper is to share the experiences and challenges faced by the Open University Library (OUL) in using screen capture software to develop online literature search tutorials. Design/methodology/approach: A summary of information literacy support at the OUL is provided as background information to explain the decision to…
Learning and Best Practices for Learning in Open-Source Software Communities
ERIC Educational Resources Information Center
Singh, Vandana; Holt, Lila
2013-01-01
This research is about participants who use open-source software (OSS) discussion forums for learning. Learning in online communities of education as well as non-education-related online communities has been studied under the lens of social learning theory and situated learning for a long time. In this research, we draw parallels among these two…
Correction of Spatial Bias in Oligonucleotide Array Data
Lemieux, Sébastien
2013-01-01
Background. Oligonucleotide microarrays allow for high-throughput gene expression profiling assays. The technology relies on the fundamental assumption that observed hybridization signal intensities (HSIs) for each intended target, on average, correlate with their target's true concentration in the sample. However, systematic, nonbiological variation from several sources undermines this hypothesis. Background hybridization signal has been previously identified as one such important source, one manifestation of which appears in the form of spatial autocorrelation. Results. We propose an algorithm, pyn, for the elimination of spatial autocorrelation in HSIs, exploiting the duality of desirable mutual information shared by probes in a common probe set and undesirable mutual information shared by spatially proximate probes. We show that this correction procedure reduces spatial autocorrelation in HSIs; increases HSI reproducibility across replicate arrays; increases differentially expressed gene detection power; and performs better than previously published methods. Conclusions. The proposed algorithm increases both precision and accuracy, while requiring virtually no changes to users' current analysis pipelines: the correction consists merely of a transformation of raw HSIs (e.g., CEL files for Affymetrix arrays). A free, open-source implementation is provided as an R package, compatible with standard Bioconductor tools. The approach may also be tailored to other platform types and other sources of bias. PMID:23573083
RTSPM: real-time Linux control software for scanning probe microscopy.
Chandrasekhar, V; Mehta, M M
2013-01-01
Real time computer control is an essential feature of scanning probe microscopes, which have become important tools for the characterization and investigation of nanometer scale samples. Most commercial (and some open-source) scanning probe data acquisition software uses digital signal processors to handle the real time data processing and control, which adds to the expense and complexity of the control software. We describe here scan control software that uses a single computer and a data acquisition card to acquire scan data. The computer runs an open-source real time Linux kernel, which permits fast acquisition and control while maintaining a responsive graphical user interface. Images from a simulated tuning-fork based microscope as well as a standard topographical sample are also presented, showing some of the capabilities of the software.
The OpenCalphad thermodynamic software interface.
Sundman, Bo; Kattner, Ursula R; Sigli, Christophe; Stratmann, Matthias; Le Tellier, Romain; Palumbo, Mauro; Fries, Suzana G
2016-12-01
Thermodynamic data are needed for all kinds of simulations of materials processes. Thermodynamics determines the set of stable phases and also provides chemical potentials, compositions and driving forces for nucleation of new phases and phase transformations. Software to simulate materials properties needs accurate and consistent thermodynamic data to predict metastable states that occur during phase transformations. Due to long calculation times thermodynamic data are frequently pre-calculated into "lookup tables" to speed up calculations. This creates additional uncertainties as data must be interpolated or extrapolated and conditions may differ from those assumed for creating the lookup table. Speed and accuracy requires that thermodynamic software is fully parallelized and the Open-Calphad (OC) software is the first thermodynamic software supporting this feature. This paper gives a brief introduction to computational thermodynamics and introduces the basic features of the OC software and presents four different application examples to demonstrate its versatility.
The OpenCalphad thermodynamic software interface
Sundman, Bo; Kattner, Ursula R; Sigli, Christophe; Stratmann, Matthias; Le Tellier, Romain; Palumbo, Mauro; Fries, Suzana G
2017-01-01
Thermodynamic data are needed for all kinds of simulations of materials processes. Thermodynamics determines the set of stable phases and also provides chemical potentials, compositions and driving forces for nucleation of new phases and phase transformations. Software to simulate materials properties needs accurate and consistent thermodynamic data to predict metastable states that occur during phase transformations. Due to long calculation times thermodynamic data are frequently pre-calculated into “lookup tables” to speed up calculations. This creates additional uncertainties as data must be interpolated or extrapolated and conditions may differ from those assumed for creating the lookup table. Speed and accuracy requires that thermodynamic software is fully parallelized and the Open-Calphad (OC) software is the first thermodynamic software supporting this feature. This paper gives a brief introduction to computational thermodynamics and introduces the basic features of the OC software and presents four different application examples to demonstrate its versatility. PMID:28260838
Analyzing huge pathology images with open source software.
Deroulers, Christophe; Ameisen, David; Badoual, Mathilde; Gerin, Chloé; Granier, Alexandre; Lartaud, Marc
2013-06-06
Digital pathology images are increasingly used both for diagnosis and research, because slide scanners are nowadays broadly available and because the quantitative study of these images yields new insights in systems biology. However, such virtual slides build up a technical challenge since the images occupy often several gigabytes and cannot be fully opened in a computer's memory. Moreover, there is no standard format. Therefore, most common open source tools such as ImageJ fail at treating them, and the others require expensive hardware while still being prohibitively slow. We have developed several cross-platform open source software tools to overcome these limitations. The NDPITools provide a way to transform microscopy images initially in the loosely supported NDPI format into one or several standard TIFF files, and to create mosaics (division of huge images into small ones, with or without overlap) in various TIFF and JPEG formats. They can be driven through ImageJ plugins. The LargeTIFFTools achieve similar functionality for huge TIFF images which do not fit into RAM. We test the performance of these tools on several digital slides and compare them, when applicable, to standard software. A statistical study of the cells in a tissue sample from an oligodendroglioma was performed on an average laptop computer to demonstrate the efficiency of the tools. Our open source software enables dealing with huge images with standard software on average computers. They are cross-platform, independent of proprietary libraries and very modular, allowing them to be used in other open source projects. They have excellent performance in terms of execution speed and RAM requirements. They open promising perspectives both to the clinician who wants to study a single slide and to the research team or data centre who do image analysis of many slides on a computer cluster. 
The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/5955513929846272.
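The mosaic idea described in this abstract (dividing a huge image into small ones, with or without overlap) can be sketched as a pure geometry calculation. This is a minimal illustration, not the NDPITools implementation; the function name `tile_boxes` and its parameters are hypothetical.

```python
def tile_boxes(width, height, tile_w, tile_h, overlap=0):
    """Return (left, top, right, bottom) pixel boxes that cover the image.

    Neighbouring tiles share `overlap` pixels; edge tiles are clipped
    to the image bounds rather than padded.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    if not 0 <= overlap < min(tile_w, tile_h):
        raise ValueError("overlap must be smaller than the tile size")
    step_x = tile_w - overlap
    step_y = tile_h - overlap
    boxes = []
    top = 0
    while True:
        bottom = min(top + tile_h, height)
        left = 0
        while True:
            right = min(left + tile_w, width)
            boxes.append((left, top, right, bottom))
            if right == width:
                break
            left += step_x
        if bottom == height:
            break
        top += step_y
    return boxes


# A 10 x 10 image cut into 5 x 5 tiles, first without and then with overlap.
plain = tile_boxes(10, 10, 5, 5)
overlapped = tile_boxes(10, 10, 5, 5, overlap=1)
print(len(plain), len(overlapped))
```

Because only one box needs to be read at a time, a tool built on this kind of scheme never has to hold the whole slide in RAM.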
Analyzing huge pathology images with open source software
2013-01-01
Background Digital pathology images are increasingly used both for diagnosis and research, because slide scanners are nowadays broadly available and because the quantitative study of these images yields new insights in systems biology. However, such virtual slides build up a technical challenge since the images occupy often several gigabytes and cannot be fully opened in a computer’s memory. Moreover, there is no standard format. Therefore, most common open source tools such as ImageJ fail at treating them, and the others require expensive hardware while still being prohibitively slow. Results We have developed several cross-platform open source software tools to overcome these limitations. The NDPITools provide a way to transform microscopy images initially in the loosely supported NDPI format into one or several standard TIFF files, and to create mosaics (division of huge images into small ones, with or without overlap) in various TIFF and JPEG formats. They can be driven through ImageJ plugins. The LargeTIFFTools achieve similar functionality for huge TIFF images which do not fit into RAM. We test the performance of these tools on several digital slides and compare them, when applicable, to standard software. A statistical study of the cells in a tissue sample from an oligodendroglioma was performed on an average laptop computer to demonstrate the efficiency of the tools. Conclusions Our open source software enables dealing with huge images with standard software on average computers. They are cross-platform, independent of proprietary libraries and very modular, allowing them to be used in other open source projects. They have excellent performance in terms of execution speed and RAM requirements. They open promising perspectives both to the clinician who wants to study a single slide and to the research team or data centre who do image analysis of many slides on a computer cluster. 
Virtual slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/5955513929846272 PMID:23829479
OpenMx: An Open Source Extended Structural Equation Modeling Framework
ERIC Educational Resources Information Center
Boker, Steven; Neale, Michael; Maes, Hermine; Wilde, Michael; Spiegel, Michael; Brick, Timothy; Spies, Jeffrey; Estabrook, Ryne; Kenny, Sarah; Bates, Timothy; Mehta, Paras; Fox, John
2011-01-01
OpenMx is free, full-featured, open source, structural equation modeling (SEM) software. OpenMx runs within the "R" statistical programming environment on Windows, Mac OS-X, and Linux computers. The rationale for developing OpenMx is discussed along with the philosophy behind the user interface. The OpenMx data structures are…
Challenges of the Open Source Component Marketplace in the Industry
NASA Astrophysics Data System (ADS)
Ayala, Claudia; Hauge, Øyvind; Conradi, Reidar; Franch, Xavier; Li, Jingyue; Velle, Ketil Sandanger
The reuse of Open Source Software components available on the Internet is playing a major role in the development of Component Based Software Systems. Nevertheless, the special nature of the OSS marketplace has taken the “classical” concept of software reuse based on centralized repositories to a completely different arena based on massive reuse over the Internet. In this paper we provide an overview of the current state of the OSS marketplace, and report preliminary findings about how companies interact with this marketplace to reuse OSS components. These data were gathered from interviews in software companies in Spain and Norway. Based on these results we identify some challenges aimed at improving the industrial reuse of OSS components.
Listening to the student voice to improve educational software.
van Wyk, Mari; van Ryneveld, Linda
2017-01-01
Academics often develop software for teaching and learning purposes with the best of intentions, only to be disappointed by the low acceptance rate of the software by their students once it is implemented. In this study, the focus is on software that was designed to enable veterinary students to record their clinical skills. A pilot of the software clearly showed that the program had not been received as well as had been anticipated, and therefore the researchers used a group interview and a questionnaire with closed-ended and open-ended questions to obtain the students' feedback. The open-ended questions were analysed with conceptual content analysis, and themes were identified. Students made valuable suggestions about what they regarded as important considerations when a new software program is introduced. The most important lesson learnt was that students cannot always predict their needs accurately if they are asked for input prior to the development of software. For that reason student input should be obtained on a continuous and regular basis throughout the design and development phases.
Klambauer, Günter; Schwarzbauer, Karin; Mayr, Andreas; Clevert, Djork-Arné; Mitterecker, Andreas; Bodenhofer, Ulrich; Hochreiter, Sepp
2012-01-01
Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose ‘Copy Number estimation by a Mixture Of PoissonS’ (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark datasets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1–FDR) and recall for both gains and losses in all benchmark data sets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor. PMID:22302147
Klambauer, Günter; Schwarzbauer, Karin; Mayr, Andreas; Clevert, Djork-Arné; Mitterecker, Andreas; Bodenhofer, Ulrich; Hochreiter, Sepp
2012-05-01
Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose 'Copy Number estimation by a Mixture Of PoissonS' (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark datasets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1-FDR) and recall for both gains and losses in all benchmark data sets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor.
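The idea of assigning integer copy numbers to samples through Poisson mixture components can be sketched for a single genomic position. This is a simplified illustration under stated assumptions, not the cn.MOPS algorithm: the expected depth `lam` for copy number 2 and the mixture weights `priors` are fixed by hand here rather than estimated from data, the rate for copy number i is assumed to scale as i/2 times `lam`, and the names `poisson_logpmf`, `copy_number_posterior` and `eps` are hypothetical.

```python
import math


def poisson_logpmf(count, rate):
    """Log probability of observing `count` reads under a Poisson(rate)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def copy_number_posterior(count, lam, priors, eps=0.05):
    """Posterior over copy numbers 0..len(priors)-1 for one sample.

    Assumption: copy number i has expected depth lam * i / 2, with a
    small factor eps for copy number 0 so the rate stays positive.
    """
    log_weights = []
    for copy_number, prior in enumerate(priors):
        scale = copy_number / 2 if copy_number > 0 else eps
        log_weights.append(math.log(prior) + poisson_logpmf(count, lam * scale))
    peak = max(log_weights)                      # subtract for numerical stability
    weights = [math.exp(v - peak) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Read counts of five samples at one position; depth 100 assumed for copy number 2.
counts = [98, 105, 52, 101, 149]
priors = [0.05, 0.10, 0.70, 0.10, 0.05]
posteriors = [copy_number_posterior(c, 100.0, priors) for c in counts]
calls = [max(range(len(p)), key=p.__getitem__) for p in posteriors]
print(calls)
```

Comparing samples at the same position, as here, is what makes the call insensitive to depth differences between positions along the chromosome: only the ratio of a sample's count to the position's typical depth matters.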
Software Assurance: Five Essential Considerations for Acquisition Officials
2007-05-01
May 2007 www.stsc.hill.af.mil • address security concerns in the software development life cycle (SDLC)? • Are there formal software quality... What threat modeling process, if any, is used when designing the software? What analysis, design, and construction tools are used by your software design... the-shelf (COTS), government off-the-shelf (GOTS), open-source, embedded, and legacy software. Attackers exploit unintentional vulnerabilities or
NASA Astrophysics Data System (ADS)
Das, I.; Oberai, K.; Sarathi Roy, P.
2012-07-01
Landslides manifest as different mass movement processes and are considered among the most complex natural hazards occurring on the earth's surface. Making a landslide database available online via the World Wide Web (WWW) helps landslide information reach all stakeholders. The aim of this research is to present a comprehensive database for generating landslide hazard scenarios with the help of available historic records of landslides and geo-environmental factors, and to make them available over the Web using geospatial Free & Open Source Software (FOSS). FOSS reduces the cost of the project drastically, as proprietary software is very costly. Landslide data generated for the period 1982 to 2009 were compiled along the national highway road corridor in the Indian Himalayas. All the geo-environmental datasets, along with the landslide susceptibility map, were served through a WebGIS client interface. The open source University of Minnesota (UMN) MapServer was used as the GIS server software for developing the web-enabled landslide geospatial database. A PHP/MapScript server-side application serves as the front end, and PostgreSQL with the PostGIS extension serves as the back end for the web-enabled landslide spatio-temporal databases. This dynamic virtual visualization process through a web platform brings an understanding of landslides and the resulting damage closer to the affected people and user community. The landslide susceptibility dataset is also made available as an Open Geospatial Consortium (OGC) Web Feature Service (WFS), which can be accessed through any OGC-compliant open source or proprietary GIS software.
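Access to a dataset published as an OGC Web Feature Service can be sketched by building a standard key-value GetFeature request. The endpoint URL and the layer name below are placeholders invented for illustration; the abstract does not give the actual service address or layer names. The request parameters follow the WFS 1.1.0 key-value encoding.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, max_features=10):
    """Build a WFS 1.1.0 GetFeature request URL (key-value encoding)."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "maxFeatures": max_features,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = wfs_getfeature_url("https://example.org/cgi-bin/mapserv", "landslide_susceptibility")
print(url)
```

Any OGC-compliant client issues essentially this request behind the scenes, which is why the same service can be consumed from either open source or proprietary GIS software.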
A generic open-source software framework supporting scenario simulations in bioterrorist crises.
Falenski, Alexander; Filter, Matthias; Thöns, Christian; Weiser, Armin A; Wigger, Jan-Frederik; Davis, Matthew; Douglas, Judith V; Edlund, Stefan; Hu, Kun; Kaufman, James H; Appel, Bernd; Käsbohrer, Annemarie
2013-09-01
Since the 2001 anthrax attack in the United States, awareness of threats originating from bioterrorism has grown. This led internationally to increased research efforts to improve knowledge of and approaches to protecting human and animal populations against the threat from such attacks. A collaborative effort in this context is the extension of the open-source Spatiotemporal Epidemiological Modeler (STEM) simulation and modeling software for agro- or bioterrorist crisis scenarios. STEM, originally designed to enable community-driven public health disease models and simulations, was extended with new features that enable integration of proprietary data as well as visualization of agent spread along supply and production chains. STEM now provides a fully developed open-source software infrastructure supporting critical modeling tasks such as ad hoc model generation, parameter estimation, simulation of scenario evolution, estimation of effects of mitigation or management measures, and documentation. This open-source software resource can be used free of charge. Additionally, STEM provides critical features like built-in worldwide data on administrative boundaries, transportation networks, or environmental conditions (eg, rainfall, temperature, elevation, vegetation). Users can easily combine their own confidential data with built-in public data to create customized models of desired resolution. STEM also supports collaborative and joint efforts in crisis situations by extended import and export functionalities. In this article we demonstrate specifically those new software features implemented to accomplish STEM application in agro- or bioterrorist crisis scenarios.
TRACC: an open source software for processing sap flux data from thermal dissipation probes
Eric J. Ward; Jean-Christophe Domec; John King; Ge Sun; Steve McNulty; Asko Noormets
2017-01-01
Key message TRACC is an open-source software for standardizing the cleaning, conversion, and calibration of sap flux density data from thermal dissipation probes, which addresses issues of nighttime transpiration and water storage. Abstract Thermal dissipation probes (TDPs) have become a widely used method of monitoring plant water use in recent years. The use of TDPs...
Software Training Classes Now Open | Poster
By Nancy Parrish, Staff Writer. Data Management Services, Inc. (DMS), has announced the opening of its spring session of software training classes, available to all employees at NCI at Frederick. Classes begin on March 31 and run through June 30.
ERIC Educational Resources Information Center
Britton, Todd Alan
2014-01-01
Purpose: The purpose of this study was to examine the key considerations of community, scalability, supportability, security, and functionality for selecting open-source software in California universities as perceived by technology leaders. Methods: After a review of the cogent literature, the key conceptual framework categories were identified…
ERIC Educational Resources Information Center
Schmidt, Matthew; Galyen, Krista; Laffey, James; Babiuch, Ryan; Schmidt, Carla
2014-01-01
Design-based research (DBR) and open source software are both acknowledged as potentially productive ways for advancing learning technologies. These approaches have practical benefits for the design and development process and for building and leveraging community to augment and sustain design and development. This report presents a case study of…
Open source tools and toolkits for bioinformatics: significance, and where are we?
Stajich, Jason E; Lapp, Hilmar
2006-09-01
This review summarizes important work in open-source bioinformatics software that has occurred over the past couple of years. The survey is intended to illustrate how programs and toolkits whose source code has been developed or released under an Open Source license have changed informatics-heavy areas of life science research. Rather than creating a comprehensive list of all tools developed over the last 2-3 years, we use a few selected projects encompassing toolkit libraries, analysis tools, data analysis environments and interoperability standards to show how freely available and modifiable open-source software can serve as the foundation for building important applications, analysis workflows and resources.
Open source tools for ATR development and performance evaluation
NASA Astrophysics Data System (ADS)
Baumann, James M.; Dilsavor, Ronald L.; Stubbles, James; Mossing, John C.
2002-07-01
Early in almost every engineering project, a decision must be made about tools: should I buy off-the-shelf tools or develop my own? Either choice can involve significant cost and risk. Off-the-shelf tools may be readily available, but they can be expensive to purchase and license, and may not be flexible enough to satisfy all project requirements. On the other hand, developing new tools permits great flexibility, but it can be time- (and budget-) consuming, and the end product still may not work as intended. Open source software has the advantages of both approaches without many of the pitfalls. This paper examines the concept of open source software, including its history, unique culture, and informal yet closely followed conventions. These characteristics influence the quality and quantity of software available, and ultimately its suitability for serious ATR development work. We give an example where Python, an open source scripting language, and OpenEV, a viewing and analysis tool for geospatial data, have been incorporated into ATR performance evaluation projects. While this case highlights the successful use of open source tools, we also offer important insight into risks associated with this approach.
FluxPyt: a Python-based free and open-source software for 13C-metabolic flux analyses.
Desai, Trunil S; Srivastava, Shireesh
2018-01-01
13C-Metabolic flux analysis (MFA) is a powerful approach to estimate intracellular reaction rates which could be used in strain analysis and design. Processing and analysis of labeling data for calculation of fluxes and associated statistics is an essential part of MFA. However, various software currently available for data analysis employ proprietary platforms and thus limit accessibility. We developed FluxPyt, a Python-based truly open-source software package for conducting stationary 13C-MFA data analysis. The software is based on the efficient elementary metabolite unit framework. The standard deviations in the calculated fluxes are estimated using the Monte-Carlo analysis. FluxPyt also automatically creates flux maps based on a template for visualization of the MFA results. The flux distributions calculated by FluxPyt for two separate models: a small tricarboxylic acid cycle model and a larger Corynebacterium glutamicum model, were found to be in good agreement with those calculated by a previously published software. FluxPyt was tested in Microsoft™ Windows 7 and 10, as well as in Linux Mint 18.2. The availability of a free and open 13C-MFA software that works in various operating systems will enable more researchers to perform 13C-MFA and to further modify and develop the package.
FluxPyt: a Python-based free and open-source software for 13C-metabolic flux analyses
Desai, Trunil S.
2018-01-01
13C-Metabolic flux analysis (MFA) is a powerful approach to estimate intracellular reaction rates which could be used in strain analysis and design. Processing and analysis of labeling data for calculation of fluxes and associated statistics is an essential part of MFA. However, various software currently available for data analysis employ proprietary platforms and thus limit accessibility. We developed FluxPyt, a Python-based truly open-source software package for conducting stationary 13C-MFA data analysis. The software is based on the efficient elementary metabolite unit framework. The standard deviations in the calculated fluxes are estimated using the Monte-Carlo analysis. FluxPyt also automatically creates flux maps based on a template for visualization of the MFA results. The flux distributions calculated by FluxPyt for two separate models: a small tricarboxylic acid cycle model and a larger Corynebacterium glutamicum model, were found to be in good agreement with those calculated by a previously published software. FluxPyt was tested in Microsoft™ Windows 7 and 10, as well as in Linux Mint 18.2. The availability of a free and open 13C-MFA software that works in various operating systems will enable more researchers to perform 13C-MFA and to further modify and develop the package. PMID:29736347
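The Monte-Carlo estimation of standard deviations mentioned in this abstract can be sketched on a toy problem: perturb the measurements with noise, refit, and take the spread of the refitted values. This is an illustration under stated assumptions and not FluxPyt code. The model is a single hypothetical flux `v` with measurements `m_i = c_i * v`, fitted by ordinary least squares instead of an elementary-metabolite-unit simulation, and the names `fit_flux` and `monte_carlo_sd` are invented here.

```python
import math
import random
import statistics


def fit_flux(coeffs, measurements):
    """Least-squares estimate of v in the toy model m_i = c_i * v."""
    return sum(c * m for c, m in zip(coeffs, measurements)) / sum(c * c for c in coeffs)


def monte_carlo_sd(coeffs, measurements, sigma, n_draws=5000, seed=0):
    """Standard deviation of the fitted flux under Gaussian measurement noise."""
    rng = random.Random(seed)                    # fixed seed for repeatability
    fits = []
    for _ in range(n_draws):
        noisy = [m + rng.gauss(0.0, sigma) for m in measurements]
        fits.append(fit_flux(coeffs, noisy))
    return statistics.stdev(fits)


coeffs = [1.0, 2.0, 3.0]
measurements = [2.0, 4.0, 6.0]                   # consistent with v = 2
sigma = 0.1                                      # assumed measurement error

mc_sd = monte_carlo_sd(coeffs, measurements, sigma)
# For this linear toy model the exact answer is known, which allows a sanity check.
analytic_sd = sigma / math.sqrt(sum(c * c for c in coeffs))
print(f"Monte Carlo: {mc_sd:.4f}, analytic: {analytic_sd:.4f}")
```

The appeal of the Monte-Carlo route is that it needs only the ability to refit, so it carries over unchanged to nonlinear models where no closed-form standard deviation exists.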
Villoria, Eduardo M; Lenzi, Antônio R; Soares, Rodrigo V; Souki, Bernardo Q; Sigurdsson, Asgeir; Marques, Alexandre P; Fidel, Sandra R
2017-01-01
To describe the use of open-source software for the post-processing of CBCT imaging for the assessment of periapical lesion development after endodontic treatment. CBCT scans were retrieved from the endodontic records of two patients. Three-dimensional virtual models, voxel counting, volumetric measurement (mm³) and mean intensity of the periapical lesion were obtained with ITK-SNAP v. 3.0 software. Three-dimensional models of the lesions were aligned and overlapped with the MeshLab software, which performed an automatic recording of the anatomical structures, based on the best fit. Qualitative and quantitative analyses of the changes in lesion size after treatment were performed with the 3DMeshMetric software. ITK-SNAP v. 3.0 showed a smaller voxel count and volume for the lesion segmented in yellow, indicating a reduction in lesion volume after treatment. A higher mean intensity of the image segmented in yellow was also observed, which suggested new bone formation. Colour mapping and the "point value" tool allowed visualization of the reduction of periapical lesions in several regions. Researchers and clinicians monitoring endodontic periapical lesions have the opportunity to use open-source software.
Tools for open geospatial science
NASA Astrophysics Data System (ADS)
Petras, V.; Petrasova, A.; Mitasova, H.
2017-12-01
Open science uses open source to deal with reproducibility challenges in data and computational sciences. However, just using open source software or making the code public does not make the research reproducible. Moreover, the scientists face the challenge of learning new unfamiliar tools and workflows. In this contribution, we will look at a graduate-level course syllabus covering several software tools which make validation and reuse by a wider professional community possible. For the novices in the open science arena, we will look at how scripting languages such as Python and Bash help us reproduce research (starting with our own work). Jupyter Notebook will be introduced as a code editor, data exploration tool, and a lab notebook. We will see how Git helps us not to get lost in revisions and how Docker is used to wrap all the parts together using a single text file so that figures for a scientific paper or a technical report can be generated with a single command. We will look at examples of software and publications in the geospatial domain which use these tools and principles. Scientific contributions to GRASS GIS, a powerful open source desktop GIS and geoprocessing backend, will serve as an example of why and how to publish new algorithms and tools as part of a bigger open source project.
Free Software and Free Textbooks
ERIC Educational Resources Information Center
Takhteyev, Yuri
2012-01-01
Some of the world's best and most sophisticated software is distributed today under "free" or "open source" licenses, which allow the recipients of such software to use, modify, and share it without paying royalties or asking for permissions. If this works for software, could it also work for educational resources, such as books? The economics of…
The Role of Free/Libre and Open Source Software in Learning Health Systems.
Paton, C; Karopka, T
2017-08-01
Objective: To give an overview of the role of Free/Libre and Open Source Software (FLOSS) in the context of secondary use of patient data to enable Learning Health Systems (LHSs). Methods: We conducted an environmental scan of the academic and grey literature utilising the MedFLOSS database of open source systems in healthcare to inform a discussion of the role of open source in developing LHSs that reuse patient data for research and quality improvement. Results: A wide range of FLOSS is identified that contributes to the information technology (IT) infrastructure of LHSs including operating systems, databases, frameworks, interoperability software, and mobile and web apps. The recent literature around the development and use of key clinical data management tools is also reviewed. Conclusions: FLOSS already plays a critical role in modern health IT infrastructure for the collection, storage, and analysis of patient data. The nature of FLOSS systems to be collaborative, modular, and modifiable may make open source approaches appropriate for building the digital infrastructure for a LHS. Georg Thieme Verlag KG Stuttgart.
2012-01-01
Background The robust identification of isotope patterns originating from peptides being analyzed through mass spectrometry (MS) is often significantly hampered by noise artifacts and the interference of overlapping patterns arising e.g. from post-translational modifications. As the classification of the recorded data points into either ‘noise’ or ‘signal’ lies at the very root of essentially every proteomic application, the quality of the automated processing of mass spectra can significantly influence the way the data might be interpreted within a given biological context. Results We propose non-negative least squares/non-negative least absolute deviation regression to fit a raw spectrum by templates imitating isotope patterns. In a carefully designed validation scheme, we show that the method exhibits excellent performance in pattern picking. It is demonstrated that the method is able to disentangle complicated overlaps of patterns. Conclusions We find that regularization is not necessary to prevent overfitting and that thresholding is an effective and user-friendly way to perform feature selection. The proposed method avoids problems inherent in regularization-based approaches, comes with a set of well-interpretable parameters whose default configuration is shown to generalize well without the need for fine-tuning, and is applicable to spectra of different platforms. The R package IPPD implements the method and is available from the Bioconductor platform (http://bioconductor.fhcrc.org/help/bioc-views/devel/bioc/html/IPPD.html). PMID:23137144
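The template-fitting idea in the abstract above can be illustrated with a minimal sketch. It is not the IPPD implementation: the three templates and the spectrum are made-up toy values, the solver is a plain projected-gradient stand-in for a proper non-negative least squares routine, and the function names and the threshold value are hypothetical.

```python
def nnls_projected_gradient(templates, spectrum, steps=2000):
    """Minimise ||sum_j beta_j * templates[j] - spectrum||^2 subject to beta_j >= 0."""
    k, n = len(templates), len(spectrum)
    # Gram matrix and right-hand side of the normal equations.
    gram = [[sum(templates[i][t] * templates[j][t] for t in range(n))
             for j in range(k)] for i in range(k)]
    rhs = [sum(templates[i][t] * spectrum[t] for t in range(n)) for i in range(k)]
    # 1 / trace(gram) never exceeds 1 / (largest eigenvalue), so the step is safe.
    step = 1.0 / sum(gram[i][i] for i in range(k))
    beta = [0.0] * k
    for _ in range(steps):
        grad = [sum(gram[i][j] * beta[j] for j in range(k)) - rhs[i]
                for i in range(k)]
        # Gradient step followed by projection onto the non-negative orthant.
        beta = [max(0.0, beta[i] - step * grad[i]) for i in range(k)]
    return beta


def pick_patterns(beta, threshold):
    """Keep only templates whose fitted coefficient exceeds the threshold."""
    return [j for j, b in enumerate(beta) if b > threshold]


# Toy isotope-like templates (assumed shapes); templates 0 and 1 overlap at index 2.
TEMPLATES = [
    [1.0, 0.5, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.2],
]
# Spectrum built as 2 * template 0 + 3 * template 1, with template 2 absent.
SPECTRUM = [2.0, 1.0, 3.4, 1.5, 0.6, 0.0, 0.0, 0.0]

beta = nnls_projected_gradient(TEMPLATES, SPECTRUM)
selected = pick_patterns(beta, threshold=0.5)
```

In this toy case the two overlapping patterns are separated by the fit, and the thresholding step drops the absent third template without any regularization term.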
RCAS: an RNA centric annotation system for transcriptome-wide regions of interest.
Uyar, Bora; Yusuf, Dilmurat; Wurmus, Ricardo; Rajewsky, Nikolaus; Ohler, Uwe; Akalin, Altuna
2017-06-02
In the field of RNA, the technologies for studying the transcriptome have created a tremendous potential for deciphering the puzzles of RNA biology. Along with the excitement, the unprecedented volume of RNA-related omics data is creating great challenges in bioinformatics analyses. Here, we present the RNA Centric Annotation System (RCAS), an R package, which is designed to ease the process of creating gene-centric annotations and analysis for the genomic regions of interest obtained from various RNA-based omics technologies. The design of RCAS is modular, which enables flexible usage and convenient integration with other bioinformatics workflows. RCAS is an R/Bioconductor package but we also created graphical user interfaces including a Galaxy wrapper and a stand-alone web service. The application of RCAS on published datasets shows that RCAS is not only able to reproduce published findings but also helps generate novel knowledge and hypotheses. The meta-gene profiles, gene-centric annotation, motif analysis and gene-set analysis provided by RCAS provide contextual knowledge which is necessary for understanding the functional aspects of different biological events that involve RNAs. In addition, the array of different interfaces and deployment options adds convenience of use for different levels of users. RCAS is available at http://bioconductor.org/packages/release/bioc/html/RCAS.html and http://rcas.mdc-berlin.de. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Hardcastle, Thomas J
2016-01-15
High-throughput data are now commonplace in biological research. Rapidly changing technologies and applications mean that novel methods for detecting differential behaviour that account for a 'large P, small n' setting are required at an increasing rate. The development of such methods is, in general, being done on an ad hoc basis, which requires further development cycles and leads to a lack of standardization between analyses. We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach is based on our baySeq algorithm for identification of differential expression in RNA-seq data based on a negative binomial distribution, and in paired data based on a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs. The methods are implemented in the R baySeq (v2) package, available on Bioconductor http://www.bioconductor.org/packages/release/bioc/html/baySeq.html. Contact: tjh48@cam.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
CellProfiler and KNIME: open source tools for high content screening.
Stöter, Martin; Niederlein, Antje; Barsacchi, Rico; Meyenhofer, Felix; Brandl, Holger; Bickle, Marc
2013-01-01
High content screening (HCS) has established itself in the world of the pharmaceutical industry as an essential tool for drug discovery and drug development. HCS is currently starting to enter the academic world and might become a widely used technology. Given the diversity of problems tackled in academic research, HCS could experience some profound changes in the future, mainly with more imaging modalities and smart microscopes being developed. One of the limitations in the establishment of HCS in academia is flexibility and cost. Flexibility is important to be able to adapt the HCS setup to accommodate the multiple different assays typical of academia. Many cost factors cannot be avoided, but the costs of the software packages necessary to analyze large datasets can be reduced by using Open Source software. We present and discuss the Open Source software CellProfiler for image analysis and KNIME for data analysis and data mining that provide software solutions which increase flexibility and keep costs low.
An Open-Source Standard T-Wave Alternans Detector for Benchmarking.
Khaustov, A; Nemati, S; Clifford, Gd
2008-09-14
We describe an open source algorithm suite for T-Wave Alternans (TWA) detection and quantification. The software consists of Matlab implementations of the widely used Spectral Method and Modified Moving Average (MMA) method, with libraries to read both WFDB and ASCII data under Windows and Linux. The software suite can run in both batch mode and with a provided graphical user interface to aid waveform exploration. Our software suite was calibrated using an open source TWA model, described in a partner paper [1] by Clifford and Sameni. For the PhysioNet/CinC Challenge 2008 we obtained a score of 0.881 for the Spectral Method and 0.400 for the MMA method. However, our objective was not to provide the best TWA detector, but rather a basis for detailed discussion of algorithms.
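The core idea behind a spectral approach to alternans detection can be sketched in a few lines. This is an illustrative Python sketch, not the authors' Matlab code: the synthetic beat series, the noise band of 0.40 to 0.46 cycles/beat, and all function names are assumptions chosen for the example. Alternans shows up as power at 0.5 cycles/beat, which is compared with the power in a nearby reference band.

```python
import cmath
import math
import random


def power_spectrum(series):
    """Periodogram of a beat-to-beat series for 0 .. 0.5 cycles/beat.

    Index k corresponds to k / len(series) cycles/beat; the series length
    should be even so that the last index is exactly 0.5 cycles/beat.
    """
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centred))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def alternans_k_score(series, noise_band=(0.40, 0.46)):
    """(power at 0.5 cycles/beat - mean noise power) / noise standard deviation."""
    n = len(series)
    spectrum = power_spectrum(series)
    noise = [p for k, p in enumerate(spectrum)
             if noise_band[0] <= k / n <= noise_band[1]]
    noise_mean = sum(noise) / len(noise)
    noise_sd = math.sqrt(sum((p - noise_mean) ** 2 for p in noise) / len(noise))
    return (spectrum[n // 2] - noise_mean) / noise_sd


def synthetic_beats(n_beats=128, alternans=1.0, noise_sd=0.1, seed=0):
    """Toy T-wave amplitude series: an every-other-beat pattern plus Gaussian noise."""
    rng = random.Random(seed)
    return [alternans * (1 if t % 2 == 0 else -1) + rng.gauss(0.0, noise_sd)
            for t in range(n_beats)]


beats = synthetic_beats()
spectrum = power_spectrum(beats)
score = alternans_k_score(beats)
```

With a strong every-other-beat pattern, the spectrum peaks at the last index (0.5 cycles/beat) and the score is far above the noise level. A real detector would also need beat segmentation, alignment, and artifact rejection, which this sketch leaves out.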
Ganry, L; Hersant, B; Bosc, R; Leyder, P; Quilichini, J; Meningaud, J P
2018-02-27
Benefits of 3D printing techniques, biomodeling and surgical guides are well known in surgery, especially when the same surgeon who performed the surgery participated in the virtual surgical planning. Our objective was to evaluate the transfer of know-how of a neutral 3D surgical modeling free open-source software protocol to surgeons with different surgical specialities. A one-day training session was organised in 3D surgical modeling applied to one mandibular reconstruction case with fibula free flap and creation of its surgical guides. Surgeon satisfaction was analysed before and after the training. Of 22 surgeons, 59% assessed the training as excellent or very good and 68% considered changing their daily surgical routine and would try to apply our open-source software protocol in their department after a single training day. The mean capacity in using the software improved from 4.13 out of 10 before to 6.59 out of 10 after training for OsiriX® software, from 1.14 before to 5.05 after training for Meshlab®, from 0.45 before to 4.91 after training for Netfabb® and from 1.05 before to 4.41 after training for Blender®. According to surgeons, using the software Blender® became harder as the day went on. Despite improvement in the capacity in using software for all participants, more than a single training day is needed for the transfer of know-how on 3D modeling with open-source software. Although the know-how transfer, overall satisfaction, actual learning outcomes and relevance of this training were appropriate, a longer training including different topics will be needed to improve training quality. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Open-Source Software in Computational Research: A Case Study
Syamlal, Madhava; O'Brien, Thomas J.; Benyahia, Sofiane; ...
2008-01-01
A case study of open-source (OS) development of the computational research software MFIX, used for multiphase computational fluid dynamics simulations, is presented here. The verification and validation steps required for constructing modern computational software and the advantages of OS development in those steps are discussed. The infrastructure used for enabling the OS development of MFIX is described. The impact of OS development on computational research and education in gas-solids flow, as well as the dissemination of information to other areas such as geophysical and volcanology research, is demonstrated. This study shows that the advantages of OS development were realized in the case of MFIX: verification by many users, which enhances software quality; the use of software as a means for accumulating and exchanging information; the facilitation of peer review of the results of computational research.
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, Ronald C.
Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
MyMolDB: a micromolecular database solution with open source and free components.
Xia, Bing; Tai, Zheng-Fu; Gu, Yu-Cheng; Li, Bang-Jing; Ding, Li-Sheng; Zhou, Yan
2011-10-01
Managing chemical structures is one of the important daily tasks in small laboratories. Few solutions are available on the internet, and most of them are closed source applications. The open-source applications typically have limited capability and basic cheminformatics functionalities. In this article, we describe an open-source solution to manage chemicals in research groups based on open source and free components. It has a user-friendly interface with the functions of chemical handling and intensive searching. MyMolDB is a micromolecular database solution that supports exact, substructure, similarity, and combined searching. This solution is mainly implemented using the scripting language Python with a web-based interface for compound management and searching. Almost all the searches are in essence done with pure SQL on the database by using the high performance of the database engine. Thus, impressive searching speed has been achieved on large data sets, because no external Central Processing Unit (CPU) consuming languages were involved in the key procedure of the searching. MyMolDB is open-source software and can be modified and/or redistributed under GNU General Public License version 3 published by the Free Software Foundation (Free Software Foundation Inc. The GNU General Public License, Version 3, 2007. Available at: http://www.gnu.org/licenses/gpl.html). The software itself can be found at http://code.google.com/p/mymoldb/. Copyright © 2011 Wiley Periodicals, Inc.
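As a rough illustration of how a similarity search can be pushed down into pure SQL, the sketch below stores each molecule's fingerprint as one row per set bit and lets the database engine compute the Tanimoto coefficient. The schema, the table and function names, and the toy fingerprints are all hypothetical; the abstract does not describe MyMolDB's actual tables, and SQLite is used here only because it ships with Python.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one row per set fingerprint bit."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fingerprint_bits ("
        " mol_id TEXT, bit INTEGER, PRIMARY KEY (mol_id, bit))"
    )
    toy = {"A": [1, 2, 3, 4], "B": [1, 2, 5], "C": [7, 8]}  # made-up fingerprints
    rows = [(mol, bit) for mol, bits in toy.items() for bit in bits]
    conn.executemany("INSERT INTO fingerprint_bits VALUES (?, ?)", rows)
    return conn


def similarity_search(conn, query_bits, threshold):
    """Return (mol_id, tanimoto) pairs with tanimoto >= threshold, best first."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS query_bits (bit INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM query_bits")
    conn.executemany("INSERT INTO query_bits VALUES (?)",
                     [(b,) for b in set(query_bits)])
    # COUNT(q.bit) is the number of shared bits; COUNT(*) is the molecule's bit count.
    # Tanimoto = shared / (molecule bits + query bits - shared).
    sql = """
        SELECT mol_id, tanimoto FROM (
            SELECT f.mol_id AS mol_id,
                   COUNT(q.bit) * 1.0
                     / (COUNT(*) + (SELECT COUNT(*) FROM query_bits) - COUNT(q.bit))
                     AS tanimoto
            FROM fingerprint_bits AS f
            LEFT JOIN query_bits AS q ON q.bit = f.bit
            GROUP BY f.mol_id
        )
        WHERE tanimoto >= ?
        ORDER BY tanimoto DESC, mol_id
    """
    return conn.execute(sql, (threshold,)).fetchall()


conn = build_demo_db()
hits = similarity_search(conn, query_bits=[1, 2, 3], threshold=0.5)
```

Here the Python layer only passes the query bits in and reads the ranked hits back; the counting, division, filtering, and sorting all run inside the database engine, which is the design point the abstract emphasizes.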
Nurturing reliable and robust open-source scientific software
NASA Astrophysics Data System (ADS)
Uieda, L.; Wessel, P.
2017-12-01
Scientific results are increasingly the product of software. The reproducibility and validity of published results cannot be ensured without access to the source code of the software used to produce them. Therefore, the code itself is a fundamental part of the methodology and must be published along with the results. With such a reliance on software, it is troubling that most scientists do not receive formal training in software development. Tools such as version control, continuous integration, and automated testing are routinely used in industry to ensure the correctness and robustness of software. However, many scientists do not even know of their existence (although efforts like Software Carpentry are having an impact on this issue; software-carpentry.org). Publishing the source code is only the first step in creating an open-source project. For a project to grow it must provide documentation, participation guidelines, and a welcoming environment for new contributors. Expanding the project community is often more challenging than the technical aspects of software development. Maintainers must invest time to enforce the rules of the project and to onboard new members, which can be difficult to justify in the context of the "publish or perish" mentality. This problem will continue as long as software contributions are not recognized as valid scholarship by hiring and tenure committees. Furthermore, there are still unsolved problems in providing attribution for software contributions. Many journals and metrics of academic productivity do not recognize citations to sources other than traditional publications. Thus, some authors choose to publish an article about the software and use it as a citation marker. One issue with this approach is that updating the reference to include new contributors involves writing and publishing a new article.
A better approach would be to cite a permanent archive of individual versions of the source code in services such as Zenodo (zenodo.org). However, citations to these sources are not always recognized when computing citation metrics. In summary, the widespread development of reliable and robust open-source software relies on the creation of formal training programs in software development best practices and the recognition of software as a valid form of scholarship.
Clinical software development for the Web: lessons learned from the BOADICEA project
2012-01-01
Background In the past 20 years, society has witnessed the following landmark scientific advances: (i) the sequencing of the human genome, (ii) the distribution of software by the open source movement, and (iii) the invention of the World Wide Web. Together, these advances have provided a new impetus for clinical software development: developers now translate the products of human genomic research into clinical software tools; they use open-source programs to build them; and they use the Web to deliver them. Whilst this open-source component-based approach has undoubtedly made clinical software development easier, clinical software projects are still hampered by problems that traditionally accompany the software process. This study describes the development of the BOADICEA Web Application, a computer program used by clinical geneticists to assess risks to patients with a family history of breast and ovarian cancer. The key challenge of the BOADICEA Web Application project was to deliver a program that was safe, secure and easy for healthcare professionals to use. We focus on the software process, problems faced, and lessons learned. Our key objectives are: (i) to highlight key clinical software development issues; (ii) to demonstrate how software engineering tools and techniques can facilitate clinical software development for the benefit of individuals who lack software engineering expertise; and (iii) to provide a clinical software development case report that can be used as a basis for discussion at the start of future projects. Results We developed the BOADICEA Web Application using an evolutionary software process. Our approach to Web implementation was conservative and we used conventional software engineering tools and techniques. The principal software development activities were: requirements, design, implementation, testing, documentation and maintenance. The BOADICEA Web Application has now been widely adopted by clinical geneticists and researchers. 
BOADICEA Web Application version 1 was released for general use in November 2007. By May 2010, we had > 1200 registered users based in the UK, USA, Canada, South America, Europe, Africa, Middle East, SE Asia, Australia and New Zealand. Conclusions We found that an evolutionary software process was effective when we developed the BOADICEA Web Application. The key clinical software development issues identified during the BOADICEA Web Application project were: software reliability, Web security, clinical data protection and user feedback. PMID:22490389
Clinical software development for the Web: lessons learned from the BOADICEA project.
Cunningham, Alex P; Antoniou, Antonis C; Easton, Douglas F
2012-04-10
In the past 20 years, society has witnessed the following landmark scientific advances: (i) the sequencing of the human genome, (ii) the distribution of software by the open source movement, and (iii) the invention of the World Wide Web. Together, these advances have provided a new impetus for clinical software development: developers now translate the products of human genomic research into clinical software tools; they use open-source programs to build them; and they use the Web to deliver them. Whilst this open-source component-based approach has undoubtedly made clinical software development easier, clinical software projects are still hampered by problems that traditionally accompany the software process. This study describes the development of the BOADICEA Web Application, a computer program used by clinical geneticists to assess risks to patients with a family history of breast and ovarian cancer. The key challenge of the BOADICEA Web Application project was to deliver a program that was safe, secure and easy for healthcare professionals to use. We focus on the software process, problems faced, and lessons learned. Our key objectives are: (i) to highlight key clinical software development issues; (ii) to demonstrate how software engineering tools and techniques can facilitate clinical software development for the benefit of individuals who lack software engineering expertise; and (iii) to provide a clinical software development case report that can be used as a basis for discussion at the start of future projects. We developed the BOADICEA Web Application using an evolutionary software process. Our approach to Web implementation was conservative and we used conventional software engineering tools and techniques. The principal software development activities were: requirements, design, implementation, testing, documentation and maintenance. The BOADICEA Web Application has now been widely adopted by clinical geneticists and researchers. 
BOADICEA Web Application version 1 was released for general use in November 2007. By May 2010, we had > 1200 registered users based in the UK, USA, Canada, South America, Europe, Africa, Middle East, SE Asia, Australia and New Zealand. We found that an evolutionary software process was effective when we developed the BOADICEA Web Application. The key clinical software development issues identified during the BOADICEA Web Application project were: software reliability, Web security, clinical data protection and user feedback.
OASIS: a data and software distribution service for Open Science Grid
NASA Astrophysics Data System (ADS)
Bockelman, B.; Caballero Bejar, J.; De Stefano, J.; Hover, J.; Quick, R.; Teige, S.
2014-06-01
The Open Science Grid encourages the concept of software portability: a user's scientific application should be able to run at as many sites as possible. It is necessary to provide a mechanism for OSG Virtual Organizations to install software at sites. Since its initial release, the OSG Compute Element has provided an application software installation directory to Virtual Organizations, where they can create their own sub-directory, install software into that sub-directory, and have the directory shared on the worker nodes at that site. The current model has shortcomings with regard to permissions, policies, versioning, and the lack of a unified, collective procedure or toolset for deploying software across all sites. Therefore, a new mechanism for data and software distribution is desirable. The architecture for the OSG Application Software Installation Service (OASIS) is a server-client model: the software and data are installed only once in a single place, and are automatically distributed to all client sites simultaneously. Central file distribution offers other advantages, including server-side authentication and authorization, activity records, quota management, data validation and inspection, and well-defined versioning and deletion policies. The architecture, as well as a complete analysis of the current implementation, will be described in this paper.
ERIC Educational Resources Information Center
Voyles, Bennett
2007-01-01
People know about the Sakai Project (open source course management system); they may even know about Kuali (open source financials). So, what is the next wave in open source software? This article discusses business intelligence (BI) systems. Though open source BI may still be only a rumor in most campus IT departments, some brave early adopters…
EPRI and Schneider Electric Demonstrate Distributed Resource Communications
Electric Power Research Institute (EPRI) is designing, building, and testing a flexible, open-source Schneider Electric ADMS, open software platforms, an open-platform home energy management system
Open source pipeline for ESPaDOnS reduction and analysis
NASA Astrophysics Data System (ADS)
Martioli, Eder; Teeple, Doug; Manset, Nadine; Devost, Daniel; Withington, Kanoa; Venne, Andre; Tannock, Megan
2012-09-01
OPERA is a Canada-France-Hawaii Telescope (CFHT) open source collaborative software project currently under development for an ESPaDOnS echelle spectro-polarimetric image reduction pipeline. OPERA is designed to be fully automated, performing calibrations and reduction, producing one-dimensional intensity and polarimetric spectra. The calibrations are performed on two-dimensional images. Spectra are extracted using an optimal extraction algorithm. While primarily designed for CFHT ESPaDOnS data, the pipeline is being written to be extensible to other echelle spectrographs. A primary design goal is to make use of fast, modern object-oriented technologies. Processing is controlled by a harness, which manages a set of processing modules that make use of a collection of native OPERA software libraries and standard external software libraries. The harness and modules are completely parametrized by site configuration and instrument parameters. The software is open-ended, permitting users of OPERA to extend the pipeline capabilities. All these features have been designed to provide a portable infrastructure that facilitates collaborative development, code re-usability and extensibility. OPERA is free software with support for both GNU/Linux and MacOSX platforms. The pipeline is hosted on SourceForge under the name "opera-pipeline".
ERIC Educational Resources Information Center
van Reijswoud, Victor; Mulo, Emmanuel
2006-01-01
Over recent years the issue of free and open source software (FOSS) for development in less developed countries (LDCs) has received increasing attention. In the beginning the benefits of FOSS for less developed countries were stressed only by small groups of idealists like Richard Stallman. Now, however, it is moving into the hands of large…
ERIC Educational Resources Information Center
Tay, Lee Yong; Lim, Cher Ping; Lye, Sze Yee; Ng, Kay Joo; Lim, Siew Khiaw
2011-01-01
This paper analyses how an elementary-level future school in Singapore implements and uses various open-source online platforms, which are easily available online and could be implemented with minimal software cost, for the purpose of teaching and learning. Online platforms have the potential to facilitate students' engagement for independent and…
ERIC Educational Resources Information Center
Benoit-Barne, Chantal
2007-01-01
This essay investigates the rhetorical practices of socio-technical deliberation about free and open source (F/OS) software, providing support for the idea that a public sphere is a socio-technical ensemble that is discursive and fluid, yet tangible and organized because it is enacted by both humans and non-humans. In keeping with the empirical…
Crux: Rapid Open Source Protein Tandem Mass Spectrometry Analysis
2015-01-01
Efficiently and accurately analyzing big protein tandem mass spectrometry data sets requires robust software that incorporates state-of-the-art computational, machine learning, and statistical methods. The Crux mass spectrometry analysis software toolkit (http://cruxtoolkit.sourceforge.net) is an open source project that aims to provide users with a cross-platform suite of analysis tools for interpreting protein mass spectrometry data. PMID:25182276
Development of a web application for water resources based on open source software
NASA Astrophysics Data System (ADS)
Delipetrev, Blagoj; Jonoski, Andreja; Solomatine, Dimitri P.
2014-01-01
This article presents research and development of a prototype web application for water resources using the latest advancements in Information and Communication Technologies (ICT), open source software and web GIS. The web application has three web services for: (1) managing, presenting and storing of geospatial data, (2) support of water resources modeling and (3) water resources optimization. The web application is developed using several programming languages (PHP, Ajax, JavaScript, Java), libraries (OpenLayers, jQuery) and open source software components (GeoServer, PostgreSQL, PostGIS). The presented web application has several main advantages: it is available all the time, it is accessible from everywhere, it creates a real-time multi-user collaboration platform, the programming language code and components are interoperable and designed to work in a distributed computer environment, it is flexible for adding additional components and services, and it is scalable depending on the workload. The application was successfully tested on a case study with concurrent multi-user access.
IGT-Open: An open-source, computerized version of the Iowa Gambling Task.
Dancy, Christopher L; Ritter, Frank E
2017-06-01
The Iowa Gambling Task (IGT) is commonly used to understand the processes involved in decision-making. Though the task was originally run without a computer, using a computerized version of the task has become typical. These computerized versions of the IGT are useful, because they can make the task more standardized across studies and allow for the task to be used in environments where a physical version of the task may be difficult or impossible to use (e.g., while collecting brain imaging data). Though these computerized versions of the IGT have been useful for experimentation, having multiple software implementations of the task could present reliability issues. We present an open-source software version of the Iowa Gambling Task (called IGT-Open) that allows for millisecond visual presentation accuracy and is freely available to be used and modified. This software has been used to collect data from human subjects and also has been used to run model-based simulations with computational process models developed to run in the ACT-R architecture.
NASA Technical Reports Server (NTRS)
Yin, J.; Oyaki, A.; Hwang, C.; Hung, C.
2000-01-01
The purpose of this research and study paper is to provide a summary description and results of rapid development accomplishments at NASA/JPL in the area of advanced distributed computing technology using a Commercial-Off-The-Shelf (COTS)-based object oriented component approach to open inter-operable software development and software reuse.
Open source electronic health record and patient data management system for intensive care.
Massaut, Jacques; Reper, Pascal
2008-01-01
In Intensive Care Units, the amount of data to be processed for patient care, the turnover of patients, and the need for reliability and for review processes call for the use of Patient Data Management Systems (PDMS) and electronic health records (EHR). To respond to the needs of an Intensive Care Unit and to avoid being locked into proprietary software, we developed a PDMS and EHR based on open source software and components. The software was designed as a client-server architecture running on the Linux operating system and powered by the PostgreSQL database system. The client software was developed in C using the GTK interface library. The application offers users the following functions: capture of medical notes, observations and treatments, nursing charts with administration of medications, scoring systems for classification, and possibilities to encode medical activities for billing processes. Since its deployment in February 2004, the PDMS has been used in the care of more than three thousand patients with the expected software reliability, and it has facilitated data management and review processes. Communications with other medical software were not developed from the start, and are realized by the use of the Mirth HL7 communication engine. Further upgrades of the system will include multi-platform support, use of a typed language with static analysis, and a configurable interface. The developed system based on open source software components was able to respond to the medical needs of the local ICU environment. The use of OSS for development allowed us to customize the software to the preexisting organization and contributed to the acceptability of the whole system.
NASA Astrophysics Data System (ADS)
Fulker, D. W.; Gallagher, J. H. R.
2015-12-01
OPeNDAP's Hyrax data server is an open-source framework fostering interoperability via easily-deployed Web services. Compatible with solutions listed in the (PA001) session description—federation, rigid standards and brokering/mediation—the framework can support tight or loose coupling, even with dependence on community-contributed software. Hyrax is a Web-services framework with a middleware-like design and a handler-style architecture that together reduce the interoperability challenge (for N datatypes and M user contexts) to an O(N+M) problem, similar to brokering. Combined with an open-source ethos, this reduction makes Hyrax a community tool for gaining interoperability. E.g., in its response to the Big Earth Data Initiative (BEDI), NASA references OPeNDAP-based interoperability. Assuming its suitability, the question becomes: how sustainable is OPeNDAP, a small not-for-profit that produces open-source software, i.e., has no software-sales? In other words, if geoscience interoperability depends on OPeNDAP and similar organizations, are those entities in turn sustainable? Jim Collins (in Good to Great) highlights three questions that successful companies can answer (paraphrased here): What is your passion? Where is your world-class excellence? What drives your economic engine? We attempt to shed light on OPeNDAP sustainability by examining these. Passion: OPeNDAP has a focused passion for improving the effectiveness of scientific data sharing and use, as deeply-cooperative community endeavors. Excellence: OPeNDAP has few peers in remote, scientific data access. Skills include computer science with experience in data science, (operational, secure) Web services, and software design (for servers and clients, where the latter vary from Web pages to standalone apps and end-user programs). Economic Engine: OPeNDAP is an engineering services organization more than a product company, despite software being key to OPeNDAP's reputation. 
In essence, provision of engineering expertise, via contracts and grants, is the economic engine. Hence sustainability, as needed to address global grand challenges in geoscience, depends on agencies' and others' abilities and willingness to offer grants and let contracts for continually upgrading open-source software from OPeNDAP and others.
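The O(N+M) reduction described above can be illustrated with a minimal sketch. It assumes a hypothetical shared in-memory model (a list of records): each source datatype gets one reader into that model and each user context gets one writer out of it, so N+M components serve all N×M pairings. Every name here is illustrative and is not Hyrax's actual API.

```python
import json

# Hypothetical readers: one per source datatype (N of them), each mapping
# raw input into the shared model, a list of dicts.
def read_csv_text(text):
    header, *rows = [line.split(",") for line in text.strip().splitlines()]
    return [dict(zip(header, row)) for row in rows]

def read_json_text(text):
    return json.loads(text)

# Hypothetical writers: one per user context (M of them), each consuming
# only the shared model and knowing nothing about the source datatype.
def write_json(records):
    return json.dumps(records, sort_keys=True)

def write_delimited(records, sep):
    if not records:
        return ""
    columns = sorted(records[0])
    lines = [sep.join(columns)]
    lines += [sep.join(str(r[c]) for c in columns) for r in records]
    return "\n".join(lines)

READERS = {"csv": read_csv_text, "json": read_json_text}
WRITERS = {
    "json": write_json,
    "ascii": lambda records: write_delimited(records, " "),
    "csv": lambda records: write_delimited(records, ","),
}

def serve(source_type, payload, response_type):
    """Route any source datatype to any response type via the shared model."""
    return WRITERS[response_type](READERS[source_type](payload))

def component_counts():
    """Return (components written, source/response pairings served)."""
    return len(READERS) + len(WRITERS), len(READERS) * len(WRITERS)
```

With two readers and three writers, five components cover six pairings; the gap widens as either N or M grows, which is the point of the middleware-like design.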
Crawling The Web for Libre: Selecting, Integrating, Extending and Releasing Open Source Software
NASA Astrophysics Data System (ADS)
Truslove, I.; Duerr, R. E.; Wilcox, H.; Savoie, M.; Lopez, L.; Brandt, M.
2012-12-01
Libre is a project developed by the National Snow and Ice Data Center (NSIDC). Libre is devoted to liberating science data from its traditional constraints of publication, location, and findability. Libre embraces and builds on the notion of making knowledge freely available, and both Creative Commons licensed content and Open Source Software are crucial building blocks for, as well as required deliverable outcomes of the project. One important aspect of the Libre project is to discover cryospheric data published on the internet without prior knowledge of the location or even existence of that data. Inspired by well-known search engines and their underlying web crawling technologies, Libre has explored tools and technologies required to build a search engine tailored to allow users to easily discover geospatial data related to the polar regions. After careful consideration, the Libre team decided to base its web crawling work on the Apache Nutch project (http://nutch.apache.org). Nutch is "an open source web-search software project" written in Java, with good documentation, a significant user base, and an active development community. Nutch was installed and configured to search for the types of data of interest, and the team created plugins to customize the default Nutch behavior to better find and categorize these data feeds. This presentation recounts the Libre team's experiences selecting, using, and extending Nutch, and working with the Nutch user and developer community. We will outline the technical and organizational challenges faced in order to release the project's software as Open Source, and detail the steps actually taken. We distill these experiences into a set of heuristics and recommendations for using, contributing to, and releasing Open Source Software.
ERIC Educational Resources Information Center
Paskevicius, Michael; Veletsianos, George; Kimmons, Royce
2018-01-01
Inspired by open educational resources, open pedagogy, and open source software, the openness movement in education has different meanings for different people. In this study, we use Twitter data to examine the discourses surrounding openness as well as the people who participate in discourse around openness. By targeting hashtags related to open…
Advantages and Disadvantages in Image Processing with Free Software in Radiology.
Mujika, Katrin Muradas; Méndez, Juan Antonio Juanes; de Miguel, Andrés Framiñan
2018-01-15
Currently, there are sophisticated applications that make it possible to visualize medical images and even to manipulate them. These software applications are of great interest, both from a teaching and a radiological perspective. In addition, some of these applications are known as Free Open Source Software because they are free and the source code is freely available, and therefore they can be easily obtained even on personal computers. Two examples of free open source software are Osirix Lite® and 3D Slicer®. However, this last group of free applications has limitations in its use. For the radiological field, manipulating and post-processing images is increasingly important. Consequently, sophisticated computing tools that combine software and hardware to process medical images are needed. In radiology, graphic workstations allow their users to process, review, analyse, communicate and exchange multidimensional digital images acquired with different image-capturing radiological devices. These radiological devices are basically CT (Computerised Tomography), MRI (Magnetic Resonance Imaging), PET (Positron Emission Tomography), etc. Nevertheless, the programs included in these workstations have a high cost which always depends on the software provider and is always subject to its norms and requirements. With this study, we aim to present the advantages and disadvantages of these radiological image visualization systems in the advanced management of radiological studies. We will compare the features of the VITREA2® and AW VolumeShare 5® radiology workstations with free open source software applications like OsiriX® and 3D Slicer®, with examples from specific studies.
ERIC Educational Resources Information Center
Krishnamurthy, M.
2008-01-01
Purpose: The purpose of this paper is to describe the open access and open source movement in the digital library world. Design/methodology/approach: A review of key developments in the open access and open source movement is provided. Findings: Open source software and open access to research findings are of great use to scholars in developing…
Implementation, reliability, and feasibility test of an Open-Source PACS.
Valeri, Gianluca; Zuccaccia, Matteo; Badaloni, Andrea; Ciriaci, Damiano; La Riccia, Luigi; Mazzoni, Giovanni; Maggi, Stefania; Giovagnoni, Andrea
2015-12-01
To implement a hardware and software system able to perform the major functions of an Open-Source PACS, and to analyze it in a simulated real-world environment. A small home network was implemented, and the Open-Source operating system Ubuntu 11.10 was installed in a laptop containing the Dcm4chee suite with the software devices needed. The Open-Source PACS implemented is compatible with Linux OS, Microsoft OS, and Mac OS X; furthermore, it was used with operating systems that guarantee the operation in portable devices (smartphone, tablet) Android and iOS. An OSS PACS is useful for making tutorials and workshops on post-processing techniques for educational and training purposes.
Reflecting Indigenous Culture in Educational Software Design.
ERIC Educational Resources Information Center
Fleer, Marilyn
1989-01-01
Discusses research on Australian Aboriginal cognition which relates to the development of appropriate educational software. Describes "Tinja," a software program using familiar content and experiences, Aboriginal characters and cultural values, extensive graphics and animation, peer and group work, and open-ended design to help young…
Software Cost-Estimation Model
NASA Technical Reports Server (NTRS)
Tausworthe, R. C.
1985-01-01
Software Cost Estimation Model SOFTCOST provides automated resource and schedule model for software development. Combines several cost models found in open literature into one comprehensive set of algorithms. Compensates for nearly fifty implementation factors relative to size of task, inherited baseline, organizational and system environment and difficulty of task.
The SCEC/UseIT Intern Program: Creating Open-Source Visualization Software Using Diverse Resources
NASA Astrophysics Data System (ADS)
Francoeur, H.; Callaghan, S.; Perry, S.; Jordan, T.
2004-12-01
The Southern California Earthquake Center undergraduate IT intern program (SCEC UseIT) conducts IT research to benefit collaborative earth science research. Through this program, interns have developed real-time, interactive, 3D visualization software using open-source tools. Dubbed LA3D, a distribution of this software is now in use by the seismic community. LA3D enables the user to interactively view Southern California datasets and models of importance to earthquake scientists, such as faults, earthquakes, fault blocks, digital elevation models, and seismic hazard maps. LA3D is now being extended to support visualizations anywhere on the planet. The new software, called SCEC-VIDEO (Virtual Interactive Display of Earth Objects), makes use of a modular, plugin-based software architecture which supports easy development and integration of new data sets. Currently SCEC-VIDEO is in beta testing, with a full open-source release slated for the future. Both LA3D and SCEC-VIDEO were developed using a wide variety of software technologies. These, which included relational databases, web services, software management technologies, and 3-D graphics in Java, were necessary to integrate the heterogeneous array of data sources which comprise our software. Currently the interns are working to integrate new technologies and larger data sets to increase software functionality and value. In addition, both LA3D and SCEC-VIDEO allow the user to script and create movies. Thus program interns with computer science backgrounds have been writing software while interns with other interests, such as cinema, geology, and education, have been making movies that have proved of great use in scientific talks, media interviews, and education. Thus, SCEC UseIT incorporates a wide variety of scientific and human resources to create products of value to the scientific and outreach communities. 
The program plans to continue with its interdisciplinary approach, increasing the relevance of the software and expanding its use in the scientific community.
BioSig: The Free and Open Source Software Library for Biomedical Signal Processing
Vidaurre, Carmen; Sander, Tilmann H.; Schlögl, Alois
2011-01-01
BioSig is an open source software library for biomedical signal processing. The aim of the BioSig project is to foster research in biomedical signal processing by providing free and open source software tools for many different application areas. Some of the areas where BioSig can be employed are neuroinformatics, brain-computer interfaces, neurophysiology, psychology, cardiovascular systems, and sleep research. Moreover, the analysis of biosignals such as the electroencephalogram (EEG), electrocorticogram (ECoG), electrocardiogram (ECG), electrooculogram (EOG), electromyogram (EMG), or respiration signals is a very relevant element of the BioSig project. Specifically, BioSig provides solutions for data acquisition, artifact processing, quality control, feature extraction, classification, modeling, and data visualization, to name a few. In this paper, we highlight several methods to help students and researchers to work more efficiently with biomedical signals. PMID:21437227
Psynteract: A flexible, cross-platform, open framework for interactive experiments.
Henninger, Felix; Kieslich, Pascal J; Hilbig, Benjamin E
2017-10-01
We introduce a novel platform for interactive studies, that is, any form of study in which participants' experiences depend not only on their own responses, but also on those of other participants who complete the same study in parallel, for example a prisoner's dilemma or an ultimatum game. The software thus especially serves the rapidly growing field of strategic interaction research within psychology and behavioral economics. In contrast to all available software packages, our platform does not handle stimulus display and response collection itself. Instead, we provide a mechanism to extend existing experimental software to incorporate interactive functionality. This approach allows us to draw upon the capabilities already available, such as accuracy of temporal measurement, integration with auxiliary hardware such as eye-trackers or (neuro-)physiological apparatus, and recent advances in experimental software, for example capturing response dynamics through mouse-tracking. Through integration with OpenSesame, an open-source graphical experiment builder, studies can be assembled via a drag-and-drop interface requiring little or no further programming skills. In addition, by using the same communication mechanism across software packages, we also enable interoperability between systems. Our source code, which provides support for all major operating systems and several popular experimental packages, can be freely used and distributed under an open source license. The communication protocols underlying its functionality are also well documented and easily adapted to further platforms. Code and documentation are available at https://github.com/psynteract/ .
RNA-Seq workflow: gene-level exploratory analysis and differential expression
Love, Michael I.; Anders, Simon; Kim, Vladislav; Huber, Wolfgang
2015-01-01
Here we walk through an end-to-end gene-level RNA-Seq differential expression workflow using Bioconductor packages. We will start from the FASTQ files, show how these were aligned to the reference genome, and prepare a count matrix which tallies the number of RNA-seq reads/fragments within each gene for each sample. We will perform exploratory data analysis (EDA) for quality assessment and to explore the relationship between samples, perform differential gene expression analysis, and visually explore the results. PMID:26674615
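The count matrix mentioned above can be pictured with a small sketch. It is not the workflow's own code (which relies on Bioconductor packages in R); it assumes reads have already been aligned and reduced to single positions, and that genes are non-overlapping intervals on one chromosome. The gene names, coordinates, and function names are hypothetical.

```python
from collections import Counter

# Hypothetical annotation: gene name -> (start, end), inclusive coordinates,
# on a single chromosome with non-overlapping genes.
GENES = {"geneA": (100, 199), "geneB": (300, 399)}

def assign_gene(position, genes=GENES):
    """Return the gene containing the position, or None if it is intergenic."""
    for name, (start, end) in genes.items():
        if start <= position <= end:
            return name
    return None

def count_matrix(aligned_reads, genes=GENES):
    """Tally reads per gene per sample.

    aligned_reads maps a sample name to a list of alignment positions.
    The result maps each gene to a {sample: count} row.
    """
    samples = sorted(aligned_reads)
    matrix = {gene: {sample: 0 for sample in samples} for gene in genes}
    for sample in samples:
        tallies = Counter(assign_gene(p, genes) for p in aligned_reads[sample])
        for gene in genes:
            matrix[gene][sample] = tallies.get(gene, 0)
    return matrix
```

Reads that fall outside every gene are simply not counted, so row sums can be smaller than the number of reads in a sample.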
Listening to the student voice to improve educational software
van Wyk, Mari; van Ryneveld, Linda
2017-01-01
Academics often develop software for teaching and learning purposes with the best of intentions, only to be disappointed by the low acceptance rate of the software by their students once it is implemented. In this study, the focus is on software that was designed to enable veterinary students to record their clinical skills. A pilot of the software clearly showed that the program had not been received as well as had been anticipated, and therefore the researchers used a group interview and a questionnaire with closed-ended and open-ended questions to obtain the students’ feedback. The open-ended questions were analysed with conceptual content analysis, and themes were identified. Students made valuable suggestions about what they regarded as important considerations when a new software program is introduced. The most important lesson learnt was that students cannot always predict their needs accurately if they are asked for input prior to the development of software. For that reason student input should be obtained on a continuous and regular basis throughout the design and development phases. PMID:28678678
Ensemble: an Architecture for Mission-Operations Software
NASA Technical Reports Server (NTRS)
Norris, Jeffrey; Powell, Mark; Fox, Jason; Rabe, Kenneth; Shu, IHsiang; McCurdy, Michael; Vera, Alonso
2008-01-01
Ensemble is the name of an open architecture for, and a methodology for the development of, spacecraft mission operations software. Ensemble is also potentially applicable to the development of non-spacecraft mission-operations-type software. Ensemble capitalizes on the strengths of the open-source Eclipse software and its architecture to address several issues that have arisen repeatedly in the development of mission-operations software: Heretofore, mission-operations application programs have been developed in disparate programming environments and integrated during the final stages of development of missions. The programs have been poorly integrated, and it has been costly to develop, test, and deploy them. Users of each program have been forced to interact with several different graphical user interfaces (GUIs). Also, the strategy typically used in integrating the programs has yielded serial chains of operational software tools of such a nature that during use of a given tool, it has not been possible to gain access to the capabilities afforded by other tools. In contrast, the Ensemble approach offers a low-risk path towards tighter integration of mission-operations software tools.
Sun, Ryan; Bouchard, Matthew B.; Hillman, Elizabeth M. C.
2010-01-01
Camera-based in-vivo optical imaging can provide detailed images of living tissue that reveal structure, function, and disease. High-speed, high resolution imaging can reveal dynamic events such as changes in blood flow and responses to stimulation. Despite these benefits, commercially available scientific cameras rarely include software that is suitable for in-vivo imaging applications, making this highly versatile form of optical imaging challenging and time-consuming to implement. To address this issue, we have developed a novel, open-source software package to control high-speed, multispectral optical imaging systems. The software integrates a number of modular functions through a custom graphical user interface (GUI) and provides extensive control over a wide range of inexpensive IEEE 1394 Firewire cameras. Multispectral illumination can be incorporated through the use of off-the-shelf light emitting diodes which the software synchronizes to image acquisition via a programmed microcontroller, allowing arbitrary high-speed illumination sequences. The complete software suite is available for free download. Here we describe the software’s framework and provide details to guide users with development of this and similar software. PMID:21258475
Nolden, Marco; Zelzer, Sascha; Seitel, Alexander; Wald, Diana; Müller, Michael; Franz, Alfred M; Maleike, Daniel; Fangerau, Markus; Baumhauer, Matthias; Maier-Hein, Lena; Maier-Hein, Klaus H; Meinzer, Hans-Peter; Wolf, Ivo
2013-07-01
The Medical Imaging Interaction Toolkit (MITK) has been available as open-source software for almost 10 years now. In this period the requirements of software systems in the medical image processing domain have become increasingly complex. The aim of this paper is to show how MITK evolved into a software system that is able to cover all steps of a clinical workflow including data retrieval, image analysis, diagnosis, treatment planning, intervention support, and treatment control. MITK provides modularization and extensibility on different levels. In addition to the original toolkit, a module system, micro services for small, system-wide features, a service-oriented architecture based on the Open Services Gateway initiative (OSGi) standard, and an extensible and configurable application framework allow MITK to be used, extended and deployed as needed. A refined software process was implemented to deliver high-quality software, ease the fulfillment of regulatory requirements, and enable teamwork in mixed-competence teams. MITK has been applied by a worldwide community and integrated into a variety of solutions, either at the toolkit level or as an application framework with custom extensions. The MITK Workbench has been released as a highly extensible and customizable end-user application. Optional support for tool tracking, image-guided therapy, diffusion imaging as well as various external packages (e.g. CTK, DCMTK, OpenCV, SOFA, Python) is available. MITK has also been used in several FDA/CE-certified applications, which demonstrates the high-quality software and rigorous development process. MITK provides a versatile platform with a high degree of modularization and interoperability and is well suited to meet the challenging tasks of today's and tomorrow's clinically motivated research.
NASA Astrophysics Data System (ADS)
Udell, C.; Selker, J. S.
2017-12-01
The increasing availability and functionality of Open-Source software and hardware along with 3D printing, low-cost electronics, and proliferation of open-access resources for learning rapid prototyping are contributing to fundamental transformations and new technologies in environmental sensing. These tools invite reevaluation of time-tested methodologies and devices toward more efficient, reusable, and inexpensive alternatives. Building upon Open-Source design facilitates community engagement and invites a Do-It-Together (DIT) collaborative framework for research where solutions to complex problems may be crowd-sourced. However, barriers persist that prevent researchers from taking advantage of the capabilities afforded by open-source software, hardware, and rapid prototyping. Some of these include: requisite technical skillsets, knowledge of equipment capabilities, identifying inexpensive sources for materials, money, space, and time. A university MAKER space staffed by engineering students to assist researchers is one proposed solution to overcome many of these obstacles. This presentation investigates the unique capabilities the USDA-funded Openly Published Environmental Sensing (OPEnS) Lab affords researchers, within Oregon State and internationally, and the unique functions these types of initiatives support at the intersection of MAKER spaces, Open-Source academic research, and open-access dissemination.
OpenSQUID: A Flexible Open-Source Software Framework for the Control of SQUID Electronics
Jaeckel, Felix T.; Lafler, Randy J.; Boyd, S. T. P.
2013-02-06
Commercially available computer-controlled SQUID electronics are usually delivered with software providing a basic user interface for adjustment of SQUID tuning parameters, such as bias current, flux offset, and feedback loop settings. However, in a research context it would often be useful to be able to modify this code and/or to have full control over all these parameters from researcher-written software. In the case of the STAR Cryoelectronics PCI/PFL family of SQUID control electronics, the supplied software contains modules for automatic tuning and noise characterization, but does not provide an interface for user code. On the other hand, the Magnicon SQUIDViewer software package includes a public application programming interface (API), but lacks auto-tuning and noise characterization features. To overcome these and other limitations, we are developing an "open-source" framework for controlling SQUID electronics which should provide maximal interoperability with user software, a unified user interface for electronics from different manufacturers, and a flexible platform for the rapid development of customized SQUID auto-tuning and other advanced features. We have completed a first implementation for the STAR Cryoelectronics hardware and have made the source code for this ongoing project available to the research community on SourceForge (http://opensquid.sourceforge.net) under the GNU public license.
ERIC Educational Resources Information Center
Morgan, Becka S.
2012-01-01
Open Source Software (OSS) communities are homogenous and their lack of diversity is of concern to many within this field. This problem is becoming more pronounced as it is the practice of many technology companies to use OSS participation as a factor in the hiring process, disadvantaging those who are not a part of this community. We should…
ERIC Educational Resources Information Center
Howison, James
2009-01-01
This dissertation presents evidence that the production of Free and Open Source Software (FLOSS) is far more alone than together; it is far more often individual work done "in company" than it is teamwork. When tasks appear too large for an individual they are more likely to be deferred until they are easier rather than be undertaken through…
Visualization and Analytics Software Tools for Peregrine System
R is a language and environment for statistical computing and graphics. Go to the R web site for … Learn about the available visualization … for OpenGL-based applications. For more information, please go to the FastX page. ParaView: An open…
Computer Forensics Education - the Open Source Approach
NASA Astrophysics Data System (ADS)
Huebner, Ewa; Bem, Derek; Cheung, Hon
In this chapter we discuss the application of the open source software tools in computer forensics education at tertiary level. We argue that open source tools are more suitable than commercial tools, as they provide the opportunity for students to gain in-depth understanding and appreciation of the computer forensic process as opposed to familiarity with one software product, however complex and multi-functional. With the access to all source programs the students become more than just the consumers of the tools as future forensic investigators. They can also examine the code, understand the relationship between the binary images and relevant data structures, and in the process gain necessary background to become the future creators of new and improved forensic software tools. As a case study we present an advanced subject, Computer Forensics Workshop, which we designed for the Bachelor's degree in computer science at the University of Western Sydney. We based all laboratory work and the main take-home project in this subject on open source software tools. We found that without exception more than one suitable tool can be found to cover each topic in the curriculum adequately. We argue that this approach prepares students better for forensic field work, as they gain confidence to use a variety of tools, not just a single product they are familiar with.
The Case for Open Source Software: The Interactional Discourse Lab
ERIC Educational Resources Information Center
Choi, Seongsook
2016-01-01
Computational techniques and software applications for the quantitative content analysis of texts are now well established, and many qualitative data software applications enable the manipulation of input variables and the visualization of complex relations between them via interactive and informative graphical interfaces. Although advances in…
Sanyal, Parikshit; Ganguli, Prosenjit; Barui, Sanghita; Deb, Prabal
2018-01-01
The Pap-stained cervical smear is a screening tool for cervical cancer. Commercial systems are used for automated screening of liquid-based cervical smears. However, there is no image analysis software in use for conventional cervical smears. The aim of this study was to develop software for the analysis of conventional smears and to test its diagnostic accuracy. The software was developed using the Python programming language and open source libraries. It was standardized with images from the Bethesda Interobserver Reproducibility Project. One hundred and thirty images from smears which were reported Negative for Intraepithelial Lesion or Malignancy (NILM), and 45 images where some abnormality had been reported, were collected from the archives of the hospital. The software was then tested on the images. The software was able to segregate images based on overall nuclear:cytoplasmic ratio, coefficient of variation (CV) in nuclear size, nuclear membrane irregularity, and clustering. 68.88% of abnormal images were flagged by the software, as well as 19.23% of NILM images. The major difficulties faced were segmentation of overlapping cell clusters and separation of neutrophils. The software shows potential as a screening tool for conventional cervical smears; however, further refinement in technique is required.
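The two reported percentages can be read as a sensitivity and a false-positive rate. The sketch below works through that arithmetic; the flagged counts (31 and 25) are not stated in the abstract and are inferred here by back-calculation from the stated totals of 45 abnormal and 130 NILM images. It also shows the coefficient of variation as standard deviation over mean, using the population standard deviation as an assumption and made-up values, since the study's exact definition is not given.

```python
from statistics import mean, pstdev

def rate(flagged, total):
    """Percentage of images flagged out of a total."""
    return 100.0 * flagged / total

def coefficient_of_variation(values):
    """CV as population standard deviation divided by the mean (assumed definition)."""
    return pstdev(values) / mean(values)

ABNORMAL_TOTAL = 45     # stated in the abstract
NILM_TOTAL = 130        # stated in the abstract
ABNORMAL_FLAGGED = 31   # inferred: 31/45 is about 68.9%
NILM_FLAGGED = 25       # inferred: 25/130 is about 19.23%

sensitivity = rate(ABNORMAL_FLAGGED, ABNORMAL_TOTAL)
false_positive_rate = rate(NILM_FLAGGED, NILM_TOTAL)
specificity = 100.0 - false_positive_rate

# Made-up nuclear areas, only to exercise the CV function.
example_cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
```

Under these inferred counts the specificity would be roughly 80.8%, that is, about four in five NILM images pass without being flagged.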
Jelicic Kadic, Antonia; Vucic, Katarina; Dosenovic, Svjetlana; Sapunar, Damir; Puljak, Livia
2016-06-01
To compare speed and accuracy of graphical data extraction using manual estimation and open source software. Data points from eligible graphs/figures published in randomized controlled trials (RCTs) from 2009 to 2014 were extracted by two authors independently, both by manual estimation and with Plot Digitizer, an open source software tool. Corresponding authors of each RCT were contacted up to four times via e-mail to obtain exact numbers that were used to create graphs. Accuracy of each method was compared against the source data from which the original graphs were produced. Software data extraction was significantly faster, reducing extraction time by 47%. Percent agreement between the two raters was 51% for manual and 53.5% for software data extraction. Percent agreement between the raters and original data was 66% vs. 75% for the first rater and 69% vs. 73% for the second rater, for manual and software extraction, respectively. Data extraction from figures should be conducted using software, whereas manual estimation should be avoided. Using software for data extraction of data presented only in figures is faster and enables higher interrater reliability. Copyright © 2016 Elsevier Inc. All rights reserved.
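Percent agreement, the statistic reported above, can be sketched as follows. The abstract does not say how two extracted values were judged to agree, so the tolerance parameter here is an assumption, as are the function name and sample values.

```python
def percent_agreement(values_a, values_b, tolerance=0.0):
    """Percentage of paired values that differ by no more than the tolerance.

    The tolerance is a hypothetical parameter; with the default of 0.0,
    only exactly equal values count as agreeing.
    """
    if len(values_a) != len(values_b):
        raise ValueError("both raters must extract the same number of points")
    if not values_a:
        raise ValueError("at least one data point is required")
    matches = sum(
        1 for a, b in zip(values_a, values_b) if abs(a - b) <= tolerance
    )
    return 100.0 * matches / len(values_a)
```

The same function covers both comparisons in the study design: rater against rater, and rater against the source data supplied by the trial authors.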
PLUME-FEATHER, Referencing and Finding Software for Research and Education
NASA Astrophysics Data System (ADS)
Bénassy, O.; Caron, C.; Ferret-Canape, C.; Cheylus, A.; Courcelle, E.; Dantec, C.; Dayre, P.; Dostes, T.; Durand, A.; Facq, A.; Gambini, G.; Geahchan, E.; Helft, C.; Hoffmann, D.; Ingarao, M.; Joly, P.; Kieffer, J.; Larré, J.-M.; Libes, M.; Morris, F.; Parmentier, H.; Pérochon, L.; Porte, O.; Romier, G.; Rousse, D.; Tournoy, R.; Valeins, H.
2014-06-01
PLUME-FEATHER is a non-profit project created to Promote economicaL, Useful and Maintained softwarE For the Higher Education And THE Research communities. The site references software, mainly Free/Libre Open Source Software (FLOSS) from French universities and national research organisations (CNRS, INRA...), laboratories or departments, as well as other FLOSS software used and evaluated by users within these institutions. Each software product is represented by a reference card, which describes origin, aim, installation, cost (if applicable) and user experience from the point of view of an academic user for academic users. Presently over 1000 programs are referenced on PLUME by more than 900 contributors. Although the server is maintained by a French institution, it is open to international contributions in the academic domain. All contained and validated contents are visible to the anonymous public, whereas registered users (presently more than 2000) can contribute, starting with comments on single software reference cards up to help with the organisation and presentation of the referenced software products. The project was presented to the HEP community in 2012 for the first time [1]. This is an update of the status and a call for (further) contributions.
Realizing the Living Paper using the ProvONE Model for Reproducible Research
NASA Astrophysics Data System (ADS)
Jones, M. B.; Jones, C. S.; Ludäscher, B.; Missier, P.; Walker, L.; Slaughter, P.; Schildhauer, M.; Cuevas-Vicenttín, V.
2015-12-01
Science has advanced through traditional publications that codify research results as a permanent part of the scientific record. But because publications are static and atomic, researchers can only cite and reference a whole work when building on prior work of colleagues. The open source software model has demonstrated a new approach in which strong version control in an open environment can nurture an open ecosystem of software. Developers now commonly fork and extend software giving proper credit, with less repetition, and with confidence in the relationship to original software. Through initiatives like 'Beyond the PDF', an analogous model has been imagined for open science, in which software, data, analyses, and derived products become first class objects within a publishing ecosystem that has evolved to be finer-grained and is realized through a web of linked open data. We have prototyped a Living Paper concept by developing the ProvONE provenance model for scientific workflows, with prototype deployments in DataONE. ProvONE promotes transparency and openness by describing the authenticity, origin, structure, and processing history of research artifacts and by detailing the steps in computational workflows that produce derived products. To realize the Living Paper, we decompose scientific papers into their constituent products and publish these as compound objects in the DataONE federation of archival repositories. Each individual finding and sub-product of a research project (such as a derived data table, a workflow or script, a figure, an image, or a finding) can be independently stored, versioned, and cited. ProvONE provenance traces link these fine-grained products within and across versions of a paper, and across related papers that extend an original analysis. This allows for open scientific publishing in which researchers extend and modify findings, creating a dynamic, evolving web of results that collectively represent the scientific enterprise. 
The Living Paper provides detailed metadata for properly interpreting and verifying individual research findings, for tracing the origin of ideas, for launching new lines of inquiry, and for implementing transitive credit for research and engineering.
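The provenance links described in this abstract can be pictured as a directed graph of versioned products. The following is a minimal Python sketch of that idea, not the ProvONE model or any DataONE API: the product identifiers, the `derived_from` mapping, and the `lineage` helper are all hypothetical names chosen for illustration. It shows how following derivation links backward from one product recovers everything it depends on, which is the kind of trace that transitive credit would rely on.

```python
from collections import deque

# Hypothetical derivation links: product -> products it was derived from.
derived_from = {
    "figure-2:v2": ["table-1:v2", "plot-script:v1"],
    "table-1:v2": ["raw-data:v1", "clean-script:v2"],
    "table-1:v1": ["raw-data:v1", "clean-script:v1"],
    "clean-script:v2": ["clean-script:v1"],
}

def lineage(product, links):
    """Return every upstream product reachable from `product`, breadth-first."""
    seen = []
    queue = deque(links.get(product, []))
    while queue:
        item = queue.popleft()
        if item in seen:
            continue  # already visited through another derivation path
        seen.append(item)
        queue.extend(links.get(item, []))
    return seen

print(lineage("figure-2:v2", derived_from))
```

In this toy graph the second version of the figure traces back to the raw data and to both versions of the cleaning script, while the superseded `table-1:v1` is not part of its lineage.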
An open source platform for multi-scale spatially distributed simulations of microbial ecosystems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Segre, Daniel
2014-08-14
The goal of this project was to develop a tool for facilitating simulation, validation and discovery of multiscale dynamical processes in microbial ecosystems. This led to the development of an open-source software platform for Computation Of Microbial Ecosystems in Time and Space (COMETS). COMETS performs spatially distributed time-dependent flux balance based simulations of microbial metabolism. Our plan involved building the software platform itself, calibrating and testing it through comparison with experimental data, and integrating simulations and experiments to address important open questions on the evolution and dynamics of cross-feeding interactions between microbial species.
NASA Astrophysics Data System (ADS)
Joyce, M.; Ramirez, P.; Boustani, M.; Mattmann, C. A.; Khudikyan, S.; McGibbney, L. J.; Whitehall, K. D.
2014-12-01
Apache Open Climate Workbench (OCW; https://climate.apache.org/) is a Top-Level Project at the Apache Software Foundation that aims to provide a suite of tools for performing climate science evaluations using model outputs from a multitude of different sources (ESGF, CORDEX, U.S. NCA, NARCCAP) with remote sensing data from NASA, NOAA, and other agencies. Apache OCW is the second NASA project to become a Top-Level Project at the Apache Software Foundation. It grew out of the Jet Propulsion Laboratory's (JPL) Regional Climate Model Evaluation System (RCMES) project, a collaboration between JPL and the University of California, Los Angeles' Joint Institute for Regional Earth System Science and Engineering (JIFRESSE). Apache OCW provides scientists and developers with tools for data manipulation, metrics for dataset comparisons, and a visualization suite. In addition to a powerful low-level API, Apache OCW also supports a web application for quick, browser-controlled evaluations, a command line application for local evaluations, and a virtual machine for isolated experimentation with minimal setup. This talk will look at the difficulties and successes of moving a closed community research project out into the wild world of open source. We'll explore the growing pains Apache OCW went through to become a Top-Level Project at the Apache Software Foundation as well as the benefits gained by opening up development to the broader climate and computer science communities.
A Generic Software Architecture For Prognostics
NASA Technical Reports Server (NTRS)
Teubert, Christopher; Daigle, Matthew J.; Sankararaman, Shankar; Goebel, Kai; Watkins, Jason
2017-01-01
Prognostics is a systems engineering discipline focused on predicting end-of-life of components and systems. As a relatively new and emerging technology, there are few fielded implementations of prognostics, due in part to practitioners perceiving a large hurdle in developing the models, algorithms, architecture, and integration pieces. As a result, no open software frameworks for applying prognostics currently exist. This paper introduces the Generic Software Architecture for Prognostics (GSAP), an open-source, cross-platform, object-oriented software framework and support library for creating prognostics applications. GSAP was designed to make prognostics more accessible and enable faster adoption and implementation by industry, by reducing the effort and investment required to develop, test, and deploy prognostics. This paper describes the requirements, design, and testing of GSAP. Additionally, a detailed case study involving battery prognostics demonstrates its use.
Campagnola, Luke; Kratz, Megan B; Manis, Paul B
2014-01-01
The complexity of modern neurophysiology experiments requires specialized software to coordinate multiple acquisition devices and analyze the collected data. We have developed ACQ4, an open-source software platform for performing data acquisition and analysis in experimental neurophysiology. This software integrates the tasks of acquiring, managing, and analyzing experimental data. ACQ4 has been used primarily for standard patch-clamp electrophysiology, laser scanning photostimulation, multiphoton microscopy, intrinsic imaging, and calcium imaging. The system is highly modular, which facilitates the addition of new devices and functionality. The modules included with ACQ4 provide for rapid construction of acquisition protocols, live video display, and customizable analysis tools. Position-aware data collection allows automated construction of image mosaics and registration of images with 3-dimensional anatomical atlases. ACQ4 uses free and open-source tools including Python, NumPy/SciPy for numerical computation, PyQt for the user interface, and PyQtGraph for scientific graphics. Supported hardware includes cameras, patch clamp amplifiers, scanning mirrors, lasers, shutters, Pockels cells, motorized stages, and more. ACQ4 is available for download at http://www.acq4.org.
NeuroPG: open source software for optical pattern generation and data acquisition
Avants, Benjamin W.; Murphy, Daniel B.; Dapello, Joel A.; Robinson, Jacob T.
2015-01-01
Patterned illumination using a digital micromirror device (DMD) is a powerful tool for optogenetics. Compared to a scanning laser, DMDs are inexpensive and can easily create complex illumination patterns. Combining these complex spatiotemporal illumination patterns with optogenetics allows DMD-equipped microscopes to probe neural circuits by selectively manipulating the activity of many individual cells or many subcellular regions at the same time. To use DMDs to study neural activity, scientists must develop specialized software to coordinate optical stimulation patterns with the acquisition of electrophysiological and fluorescence data. To meet this growing need we have developed open source optical pattern generation software for neuroscience, NeuroPG, that combines DMD control, sample visualization, and data acquisition in one application. Built on a MATLAB platform, NeuroPG can also process, analyze, and visualize data. The software is designed specifically for the Mightex Polygon400; however, as an open source package, NeuroPG can be modified to incorporate any data acquisition, imaging, or illumination equipment that is compatible with MATLAB’s Data Acquisition and Image Acquisition toolboxes. PMID:25784873
GIMS—Software for asset market experiments
Palan, Stefan
2015-01-01
In this article we lay out requirements for an experimental market software for financial and economic research. We then discuss existing solutions. Finally, we introduce GIMS, an open source market software which is characterized by extensibility and ease of use, while offering nearly all of the required functionality. PMID:26525085
Achieving Better Buying Power through Acquisition of Open Architecture Software Systems: Volume 1
2016-01-06
supporting “Bring Your Own Devices” (BYOD)? 22 New business models for OA software components ● Franchising ● Enterprise licensing ● Metered usage...paths IP and cybersecurity requirements will need continuous attention! 35 New business models for OA software components ● Franchising ● Enterprise
Networks at Their Limits: Software, Similarity, and Continuity in Vietnam
ERIC Educational Resources Information Center
Nguyen, Lilly Uyen
2013-01-01
This dissertation explores the social worlds of pirated software discs and free/open source software in Vietnam to describe the practices of copying, evangelizing, and translation. This dissertation also reveals the cultural logics of similarity and continuity that sustain these social worlds. Taken together, this dissertation argues that the…
pyLIMA : an open source microlensing software
NASA Astrophysics Data System (ADS)
Bachelet, Etienne
2017-01-01
Planetary microlensing is a unique tool to detect cold planets around low-mass stars which is approaching a watershed in discoveries as near-future missions incorporate dedicated surveys. NASA and ESA have decided to complement WFIRST-AFTA and Euclid with microlensing programs to enrich our statistics about this planetary population. Of the many challenges inherent in these missions, the data analysis is of primary importance, yet is often perceived as a time-consuming, complex and daunting barrier to participation in the field. We present the first open source modeling software to conduct a microlensing analysis. This software is written in Python and uses existing packages as much as possible.
15 CFR 734.7 - Published information and software.
Code of Federal Regulations, 2011 CFR
2011-01-01
...) Ready availability at libraries open to the public or at university libraries (See supplement No. 1 to this part, Question A(6)); (3) Patents and open (published) patent applications available at any patent office; and (4) Release at an open conference, meeting, seminar, trade show, or other open gathering. (i...
15 CFR 734.7 - Published information and software.
Code of Federal Regulations, 2010 CFR
2010-01-01
...) Ready availability at libraries open to the public or at university libraries (See Supplement No. 1 to this part, Question A(6)); (3) Patents and open (published) patent applications available at any patent office; and (4) Release at an open conference, meeting, seminar, trade show, or other open gathering. (i...
Computational Fluids Domain Reduction to a Simplified Fluid Network
2012-04-19
readily available read/write software library. Code components from the open source projects OpenFoam and Paraview were explored for their adaptability...to the project. Both Paraview and OpenFoam read polyhedral mesh. OpenFoam does not read results data. Paraview actually allows for user “filters
Open source Modeling and optimization tools for Planning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peles, S.
The existing tools and software used for planning and analysis in California are either expensive, difficult to use, or not generally accessible to a large number of participants. These limitations restrict the availability of participants for larger scale energy and grid studies in the state. The proposed initiative would build upon federal and state investments in open source software, and create and improve open source tools for use in the state planning and analysis activities. Computational analysis and simulation frameworks in development at national labs and universities can be brought forward to complement existing tools. An open source platform would provide a path for novel techniques and strategies to be brought into the larger community and reviewed by a broad set of stakeholders.
Nema, Shubham; Hasan, Whidul; Bhargava, Anamika; Bhargava, Yogesh
2016-09-15
Behavioural neuroscience relies on software-driven methods for behavioural assessment, but the field lacks cost-effective, robust, open source software for behavioural analysis. Here we propose a novel method, which we call ZebraTrack. It includes a cost-effective imaging setup for distraction-free behavioural acquisition, automated tracking using open-source ImageJ software, and a workflow for extraction of behavioural endpoints. Our ImageJ algorithm is capable of providing control to users at key steps while maintaining automation in tracking without the need for the installation of external plugins. We have validated this method by testing novelty-induced anxiety behaviour in adult zebrafish. Our results, in agreement with established findings, showed that during state-anxiety, zebrafish showed reduced distance travelled, increased thigmotaxis and freezing events. Furthermore, we proposed a method to represent both the spatial and temporal distribution of choice-based behaviour, which is currently not possible to represent using simple videograms. The ZebraTrack method is simple and economical, yet robust enough to give results comparable with those obtained from costly proprietary software like Ethovision XT. We have developed and validated a novel cost-effective method for behavioural analysis of adult zebrafish using open-source ImageJ software. Copyright © 2016 Elsevier B.V. All rights reserved.
2011-08-01
dominates the global mobile application market and mobile computing software ecosystems. But overall, OA systems are not necessarily excluded from...License 3.0 (OSL) Corel Transactional License (CTL) The licenses were chosen to represent a variety of kinds of licenses, and include one...proprietary (CTL), three academic (Apache, BSD, MIT), and six reciprocal licenses (CPL, EPL, GPL, LGPL, MPL, OSL) that take varying approaches in
Data Mining Meets HCI: Making Sense of Large Graphs
2012-07-01
graph algorithms, won the Open Source Software World Challenge, Silver Award. We have released Pegasus as free, open-source software, downloaded by...METIS [77], spectral clustering [108], and the parameter-free “Cross-associations” (CA) [26]. Belief Propagation can also be used for clustering, as...number of tools have been developed to support “landscape” views of information. These include WebBook and Web-Forager [23], which use a book metaphor
The Azimuth Project: an Open-Access Educational Resource
NASA Astrophysics Data System (ADS)
Baez, J. C.
2012-12-01
The Azimuth Project is an online collaboration of scientists, engineers and programmers who are volunteering their time to do something about a wide range of environmental problems. The project has several aspects: 1) a wiki designed to make reliable, sourced information easy to find and accessible to technically literate nonexperts, 2) a blog featuring expository articles and news items, 3) a project to write programs that explain basic concepts of climate physics and illustrate principles of good open-source software design, and 4) a project to develop mathematical tools for studying complex networked systems. We discuss the progress so far and some preliminary lessons. For example, enlisting the help of experts outside academia highlights the problems with pay-walled journals and the benefits of open access, as well as differences between how software development is done commercially, in the free software community, and in academe.
CymeR: cytometry analysis using KNIME, docker and R
Muchmore, B.; Alarcón-Riquelme, M.E.
2017-01-01
Summary: Here we present open-source software for the analysis of high-dimensional cytometry data using state of the art algorithms. Importantly, use of the software requires no programming ability, and output files can either be interrogated directly in CymeR or they can be used downstream with any other cytometric data analysis platform. Also, because we use Docker to integrate the multitude of components that form the basis of CymeR, we have additionally developed a proof-of-concept of how future open-source bioinformatic programs with graphical user interfaces could be developed. Availability and Implementation: CymeR is open-source software that ties several components into a single program that is perhaps best thought of as a self-contained data analysis operating system. Please see https://github.com/bmuchmore/CymeR/wiki for detailed installation instructions. Contact: brian.muchmore@genyo.es or marta.alarcon@genyo.es PMID:27998935
CymeR: cytometry analysis using KNIME, docker and R.
Muchmore, B; Alarcón-Riquelme, M E
2017-03-01
Here we present open-source software for the analysis of high-dimensional cytometry data using state of the art algorithms. Importantly, use of the software requires no programming ability, and output files can either be interrogated directly in CymeR or they can be used downstream with any other cytometric data analysis platform. Also, because we use Docker to integrate the multitude of components that form the basis of CymeR, we have additionally developed a proof-of-concept of how future open-source bioinformatic programs with graphical user interfaces could be developed. CymeR is open-source software that ties several components into a single program that is perhaps best thought of as a self-contained data analysis operating system. Please see https://github.com/bmuchmore/CymeR/wiki for detailed installation instructions. Contact: brian.muchmore@genyo.es or marta.alarcon@genyo.es. © The Author 2016. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Zacharek, M.; Delis, P.; Kedzierski, M.; Fryskowska, A.
2017-05-01
These studies were conducted using a non-metric digital camera and dense image matching algorithms as non-contact methods of creating monument documentation. To process the imagery, a few open-source software packages and algorithms for generating a dense point cloud from images were used. In the research, the OSM Bundler, VisualSFM software, and web application ARC3D were used. Images obtained for each of the investigated objects were processed using those applications, and then dense point clouds and textured 3D models were created. As a result of post-processing, the obtained models were filtered and scaled. The research showed that even using open-source software it is possible to obtain accurate 3D models of structures (with an accuracy of a few centimeters), but for the purpose of documentation and conservation of cultural and historical heritage, such accuracy can be insufficient.
PD5: a general purpose library for primer design software.
Riley, Michael C; Aubrey, Wayne; Young, Michael; Clare, Amanda
2013-01-01
Complex PCR applications for large genome-scale projects require fast, reliable and often highly sophisticated primer design software applications. Presently, such applications use pipelining methods to utilise many third party applications and this involves file parsing, interfacing and data conversion, which is slow and prone to error. A fully integrated suite of software tools for primer design would considerably improve the development time, the processing speed, and the reliability of bespoke primer design software applications. The PD5 software library is an open-source collection of classes and utilities, providing a complete collection of software building blocks for primer design and analysis. It is written in object-oriented C++ with an emphasis on classes suitable for efficient and rapid development of bespoke primer design programs. The modular design of the software library simplifies the development of specific applications and also integration with existing third party software where necessary. We demonstrate several applications created using this software library that have already proved to be effective, but we view the project as a dynamic environment for building primer design software and it is open for future development by the bioinformatics community. Therefore, the PD5 software library is published under the terms of the GNU General Public License, which guarantee access to source-code and allow redistribution and modification. The PD5 software library is downloadable from Google Code and the accompanying Wiki includes instructions and examples: http://code.google.com/p/primer-design.
Proceedings of the Second Software Architecture Technology User Network (SATURN) Workshop
2006-08-01
Proceedings of the Second Software Architecture Technology User Network (SATURN) Workshop Robert L. Nord August 2006 TECHNICAL REPORT CMU...SEI-2006-TR-010 ESC-TR-2006-010 Software Architecture Technology Initiative Unlimited distribution subject to the copyright. This report was...Participants 3 3 Presentations 5 3.1 SATURN Opening Presentation: Future Directions of the Software Architecture Technology Initiative 5 3.2 Keynote
fastBMA: scalable network inference and transitive reduction.
Hung, Ling-Hong; Shi, Kaiyuan; Wu, Migao; Young, William Chad; Raftery, Adrian E; Yeung, Ka Yee
2017-10-01
Inferring genetic networks from genome-wide expression data is extremely demanding computationally. We have developed fastBMA, a distributed, parallel, and scalable implementation of Bayesian model averaging (BMA) for this purpose. fastBMA also includes a computationally efficient module for eliminating redundant indirect edges in the network by mapping the transitive reduction to an easily solved shortest-path problem. We evaluated the performance of fastBMA on synthetic data and experimental genome-wide time series yeast and human datasets. When using a single CPU core, fastBMA is up to 100 times faster than the next fastest method, LASSO, with increased accuracy. It is a memory-efficient, parallel, and distributed application that scales to human genome-wide expression data. A 10 000-gene regulation network can be obtained in a matter of hours using a 32-core cloud cluster (2 nodes of 16 cores). fastBMA is a significant improvement over its predecessor ScanBMA. It is more accurate and orders of magnitude faster than other fast network inference methods such as the one based on LASSO. The improved scalability allows it to calculate networks from genome scale data in a reasonable time frame. The transitive reduction method can improve accuracy in denser networks. fastBMA is available as code (M.I.T. license) from GitHub (https://github.com/lhhunghimself/fastBMA), as part of the updated networkBMA Bioconductor package (https://www.bioconductor.org/packages/release/bioc/html/networkBMA.html) and as ready-to-deploy Docker images (https://hub.docker.com/r/biodepot/fastbma/). © The Authors 2017. Published by Oxford University Press.
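The idea of mapping transitive reduction onto a shortest-path search can be sketched as follows. This is an illustrative Python version under stated assumptions, not fastBMA's implementation: each edge is assumed to carry a posterior probability p in (0, 1], its cost is taken as -log(p), and a direct edge is dropped when some indirect path between the same two nodes is at least as cheap. The function names and the example network are hypothetical.

```python
import heapq
import math

def cheapest_indirect(graph, src, dst):
    """Dijkstra from src to dst that may not use the direct edge src -> dst."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            return cost
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, p in graph.get(node, {}).items():
            if node == src and nxt == dst:
                continue  # skip the direct edge being tested
            new_cost = cost - math.log(p)  # -log(p) >= 0 because p <= 1
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return math.inf

def transitive_reduction(graph):
    """Keep an edge only if no indirect path matches or beats its cost."""
    reduced = {}
    for src, targets in graph.items():
        for dst, p in targets.items():
            if cheapest_indirect(graph, src, dst) > -math.log(p):
                reduced.setdefault(src, {})[dst] = p
    return reduced

# Hypothetical network: A -> C is weak and is explained by A -> B -> C.
network = {"A": {"B": 0.9, "C": 0.5}, "B": {"C": 0.9}}
print(transitive_reduction(network))
```

Here the weak A -> C edge is removed because the two strong edges through B give a cheaper route, whereas a strong direct edge would survive a weak detour.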
Nguyen, Hoang T; Merriman, Tony R; Black, Michael A
2014-01-01
Recent advances in high-throughput sequencing technologies have made it possible to accurately assign copy number (CN) at CN variable loci. However, current analytic methods often perform poorly in regions in which complex CN variation is observed. Here we report the development of a read depth-based approach, CNVrd2, for investigation of CN variation using high-throughput sequencing data. This methodology was developed using data from the 1000 Genomes Project from the CCL3L1 locus, and tested using data from the DEFB103A locus. In both cases, samples were selected for which paralog ratio test data were also available for comparison. The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population. Then a linear regression model is applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model to cluster segmentation scores into groups for individual CN counts. The performance of CNVrd2 was compared to that of two other read depth-based methods (CNVnator, cn.mops) at the CCL3L1 and DEFB103A loci. The highest concordance with the paralog ratio test method was observed for CNVrd2 (77.8/90.4% for CNVrd2, 36.7/4.8% for cn.mops and 7.2/1% for CNVnator at CCL3L1 and DEFB103A). CNVrd2 is available as an R package as part of the Bioconductor project: http://www.bioconductor.org/packages/release/bioc/html/CNVrd2.html.
Pounds, Stan; Cheng, Cheng; Cao, Xueyuan; Crews, Kristine R; Plunkett, William; Gandhi, Varsha; Rubnitz, Jeffrey; Ribeiro, Raul C; Downing, James R; Lamba, Jatinder
2009-08-15
In some applications, prior biological knowledge can be used to define a specific pattern of association of multiple endpoint variables with a genomic variable that is biologically most interesting. However, to our knowledge, there is no statistical procedure designed to detect specific patterns of association with multiple endpoint variables. Projection onto the most interesting statistical evidence (PROMISE) is proposed as a general procedure to identify genomic variables that exhibit a specific biologically interesting pattern of association with multiple endpoint variables. Biological knowledge of the endpoint variables is used to define a vector that represents the biologically most interesting values for statistics that characterize the associations of the endpoint variables with a genomic variable. A test statistic is defined as the dot-product of the vector of the observed association statistics and the vector of the most interesting values of the association statistics. By definition, this test statistic is proportional to the length of the projection of the observed vector of correlations onto the vector of most interesting associations. Statistical significance is determined via permutation. In simulation studies and an example application, PROMISE shows greater statistical power to identify genes with the interesting pattern of associations than classical multivariate procedures, individual endpoint analyses or listing genes that have the pattern of interest and are significant in more than one individual endpoint analysis. Documented R routines are freely available from www.stjuderesearch.org/depts/biostats and will soon be available as a Bioconductor package from www.bioconductor.org.
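The test statistic described in this abstract, a dot product between observed association statistics and a vector of "most interesting" values with significance from permutation, can be sketched in a few lines. This is a simplified Python illustration and not the authors' R routines: it assumes Pearson correlation as the association statistic, shuffles the genomic variable across subjects to build the null distribution, and uses hypothetical function names and toy data.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def promise_statistic(gene, endpoints, interesting):
    """Dot product of observed correlations with the 'most interesting' pattern."""
    observed = [pearson(gene, e) for e in endpoints]
    return sum(o * w for o, w in zip(observed, interesting))

def permutation_p_value(gene, endpoints, interesting, n_perm=999, seed=0):
    """One-sided p-value: how often a shuffled gene scores at least as high."""
    rng = random.Random(seed)
    observed = promise_statistic(gene, endpoints, interesting)
    shuffled = list(gene)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if promise_statistic(shuffled, endpoints, interesting) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: expression rises with endpoint 1 and falls with endpoint 2,
# which is the pattern declared interesting by the vector [1, -1].
gene = [1, 2, 3, 4, 5, 6, 7, 8]
endpoints = [[2, 4, 6, 8, 10, 12, 14, 16], [8, 7, 6, 5, 4, 3, 2, 1]]
stat, p = permutation_p_value(gene, endpoints, [1, -1])
print(stat, p)
```

With perfectly aligned toy data the statistic is at its maximum of about 2 and almost no permutation reaches it; with real data the interesting vector would encode prior biological knowledge about the endpoints.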
NASA Astrophysics Data System (ADS)
Hwang, L.; Kellogg, L. H.
2017-12-01
Curation of software promotes discoverability and accessibility and works hand in hand with scholarly citation to ascribe value to, and provide recognition for software development. To meet this challenge, the Computational Infrastructure for Geodynamics (CIG) maintains a community repository built on custom and open tools to promote discovery, access, identification, credit, and provenance of research software for the geodynamics community. CIG (geodynamics.org) originated from recognition of the tremendous effort required to develop sound software and the need to reduce duplication of effort and to sustain community codes. CIG curates software across 6 domains and has developed and follows software best practices that include establishing test cases, documentation, and a citable publication for each software package. CIG software landing web pages provide access to current and past releases; many are also accessible through the CIG community repository on GitHub. CIG has now developed abc - attribution builder for citation to enable software users to give credit to software developers. abc uses Zenodo as an archive and as the mechanism to obtain a unique identifier (DOI) for scientific software. To assemble the metadata, we searched the software's documentation and research publications and then requested the primary developers to verify. In this process, we have learned that each development community approaches software attribution differently. The metadata gathered is based on guidelines established by groups such as FORCE11 and OntoSoft. The rollout of abc is gradual as developers are forward-looking, rarely willing to go back and archive prior releases in Zenodo. Going forward all actively developed packages will utilize the Zenodo and GitHub integration to automate the archival process when a new release is issued. How to handle legacy software, multi-authored libraries, and assigning roles to software remain open issues.
NASA Astrophysics Data System (ADS)
Engel, P.; Schweimler, B.
2016-04-01
The deformation monitoring of structures and buildings is an important task field of modern engineering surveying, ensuring the standing and reliability of supervised objects over a long period. Several commercial hardware and software solutions for the realization of such monitoring measurements are available on the market. In addition to them, a research team at the Neubrandenburg University of Applied Sciences (NUAS) is actively developing a software package for monitoring purposes in geodesy and geotechnics, which is distributed under an open source licence and free of charge. The task of managing an open source project is well-known in computer science, but it is fairly new in a geodetic context. This paper contributes to that issue by detailing applications, frameworks, and interfaces for the design and implementation of open hardware and software solutions for sensor control, sensor networks, and data management in automatic deformation monitoring. It will be discussed how the development effort of networked applications can be reduced by using free programming tools, cloud computing technologies, and rapid prototyping methods.
Open systems storage platforms
NASA Technical Reports Server (NTRS)
Collins, Kirby
1992-01-01
The building blocks for an open storage system include a system platform, a selection of storage devices and interfaces, system software, and storage applications. CONVEX storage systems are based on the DS Series Data Server systems. These systems are a variant of the C3200 supercomputer with expanded I/O capabilities. These systems support a variety of medium and high speed interfaces to networks and peripherals. System software is provided in the form of ConvexOS, a POSIX compliant derivative of 4.3BSD UNIX. Storage applications include products such as UNITREE and EMASS. With the DS Series of storage systems, Convex has developed a set of products which provide open system solutions for storage management applications. The systems are highly modular, assembled from off the shelf components with industry standard interfaces. The C Series system architecture provides a stable base, with the performance and reliability of a general purpose platform. This combination of a proven system architecture with a variety of choices in peripherals and application software allows wide flexibility in configurations, and delivers the benefits of open systems to the mass storage world.
Liu, Lei; Peng, Wei-Ren; Casellas, Ramon; Tsuritani, Takehiro; Morita, Itsuro; Martínez, Ricardo; Muñoz, Raül; Yoo, S J B
2014-01-13
Optical Orthogonal Frequency Division Multiplexing (O-OFDM), which transmits high speed optical signals using multiple spectrally overlapped lower-speed subcarriers, is a promising candidate for supporting future elastic optical networks. In contrast to previous works which focus on Coherent Optical OFDM (CO-OFDM), in this paper, we consider the direct-detection optical OFDM (DDO-OFDM) as the transport technique, which leads to simpler hardware and software realizations, potentially offering a low-cost solution for elastic optical networks, especially in metro networks, and short or medium distance core networks. Based on this network scenario, we design and deploy a software-defined networking (SDN) control plane enabled by extending OpenFlow, detailing the network architecture, the routing and spectrum assignment algorithm, OpenFlow protocol extensions and the experimental validation. To the best of our knowledge, it is the first time that an OpenFlow-based control plane is reported and its performance is quantitatively measured in an elastic optical network with DDO-OFDM transmission.
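The abstract names a routing and spectrum assignment algorithm without giving its details. As a point of reference only, the sketch below shows first-fit spectrum assignment, a common baseline in elastic optical networks; it is not claimed to be the algorithm used in the paper. It assumes the route is already chosen, that each link has the same number of frequency slots, and that a demand needs contiguous slots that are free on every link of the path. All names and the example network are hypothetical.

```python
def first_fit(path_links, link_slots, demand):
    """Return the lowest start index where `demand` contiguous slots are free
    on every link of the path and mark them used; return None if blocked."""
    n_slots = len(next(iter(link_slots.values())))
    for start in range(n_slots - demand + 1):
        block = range(start, start + demand)
        if all(not link_slots[link][s] for link in path_links for s in block):
            for link in path_links:
                for s in block:
                    link_slots[link][s] = True  # reserve the slots
            return start
    return None

# Hypothetical line network A-B-C with 8 frequency slots per link.
slots = {("A", "B"): [False] * 8, ("B", "C"): [False] * 8}
slots[("B", "C")][1] = True  # one slot already occupied on B-C

print(first_fit([("A", "B"), ("B", "C")], slots, 3))
print(first_fit([("A", "B")], slots, 2))
```

In an OpenFlow-style control plane, a controller would run a computation of this kind and then push the chosen slot range to the nodes along the path; how that is encoded in protocol messages is outside this sketch.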
Mathur, Gagan; Haugen, Thomas H; Davis, Scott L; Krasowski, Matthew D
2014-01-01
Interfacing of clinical laboratory instruments with the laboratory information system (LIS) via "middleware" software is increasingly common. Our clinical laboratory implemented capillary electrophoresis using a Sebia® Capillarys-2™ (Norcross, GA, USA) instrument for serum and urine protein electrophoresis. Using Data Innovations Instrument Manager, an interface was established with the LIS (Cerner) that allowed for bi-directional transmission of numeric data. However, the text of the interpretive pathology report was not properly transferred. To reduce manual effort and possibility for error in text data transfer, we developed scripts in AutoHotkey, a free, open-source macro-creation and automation software utility. Scripts were written to create macros that automated mouse and key strokes. The scripts retrieve the specimen accession number, capture user input text, and insert the text interpretation in the correct patient record in the desired format. The scripts accurately and precisely transfer narrative interpretation into the LIS. Combined with bar-code reading by the electrophoresis instrument, the scripts transfer data efficiently to the correct patient record. In addition, the AutoHotkey script automated repetitive key strokes required for manual entry into the LIS, making protein electrophoresis sign-out easier to learn and faster to use by the pathology residents. Scripts allow for either preliminary verification by residents or final sign-out by the attending pathologist. Using the open-source AutoHotkey software, we successfully improved the transfer of text data between capillary electrophoresis software and the LIS. The use of open-source software tools should not be overlooked as tools to improve interfacing of laboratory instruments.
Van Berkel, Gary J.; Kertesz, Vilmos
2016-11-15
An “Open Access”-like mass spectrometric platform to fully utilize the simplicity of the manual open port sampling interface for rapid characterization of unprocessed samples by liquid introduction atmospheric pressure ionization mass spectrometry has been lacking. The in-house developed integrated software with a simple, small and relatively low-cost mass spectrometry system introduced here fills this void. Software was developed to operate the mass spectrometer, to collect and process mass spectrometric data files, to build a database and to classify samples using such a database. These tasks were accomplished via the vendor-provided software libraries. Sample classification based on spectral comparison utilized the spectral contrast angle method. As a result, using the developed software platform, near real-time sample classification is exemplified using a series of commercially available blue ink rollerball pens and vegetable oils. In the case of the inks, full scan positive and negative ion ESI mass spectra were both used for database generation and sample classification. For the vegetable oils, full scan positive ion mode APCI mass spectra were recorded. The overall accuracy of the employed spectral contrast angle statistical model was 95.3% and 98% in case of the inks and oils, respectively, using leave-one-out cross-validation. In conclusion, this work illustrates that an open port sampling interface/mass spectrometer combination, with appropriate instrument control and data processing software, is a viable direct liquid extraction sampling and analysis system suitable for the non-expert user and near real-time sample classification via database matching.
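The spectral contrast angle named in the abstract above is commonly defined as the angle between two spectra treated as intensity vectors on a shared m/z grid, with a smaller angle meaning more similar spectra. The following is a minimal sketch under that assumption, not the authors' software: the function names, the nearest-angle classification rule, and the toy database labels are hypothetical and chosen only for illustration.

```python
import math


def spectral_contrast_angle(a, b):
    """Angle in degrees between two intensity vectors binned on the same m/z grid."""
    if len(a) != len(b):
        raise ValueError("spectra must be binned on the same m/z grid")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a spectrum with zero total intensity has no defined angle")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_theta))


def classify(sample, database):
    """Return the label of the reference spectrum with the smallest angle to the sample.

    `database` is a hypothetical mapping of label -> reference intensity vector.
    """
    return min(database, key=lambda label: spectral_contrast_angle(sample, database[label]))


if __name__ == "__main__":
    # Toy reference spectra; the labels and intensities are made up.
    reference = {"ink_A": [10, 0, 5], "ink_B": [0, 8, 1]}
    print(classify([9, 1, 4], reference))
```

Identical spectra give an angle near 0 degrees and spectra with no shared peaks give 90 degrees, so a database match can be read as the reference with the smallest angle.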
OsiriX: an open-source software for navigating in multidimensional DICOM images.
Rosset, Antoine; Spadola, Luca; Ratib, Osman
2004-09-01
A multidimensional image navigation and display software was designed for display and interpretation of large sets of multidimensional and multimodality images such as combined PET-CT studies. The software is developed in Objective-C on a Macintosh platform under the MacOS X operating system using the GNUstep development environment. It also benefits from the extremely fast and optimized 3D graphic capabilities of the OpenGL graphic standard widely used for computer games optimized for taking advantage of any hardware graphic accelerator boards available. In the design of the software special attention was given to adapt the user interface to the specific and complex tasks of navigating through large sets of image data. An interactive jog-wheel device widely used in the video and movie industry was implemented to allow users to navigate in the different dimensions of an image set much faster than with a traditional mouse or on-screen cursors and sliders. The program can easily be adapted for very specific tasks that require a limited number of functions, by adding and removing tools from the program's toolbar and avoiding an overwhelming number of unnecessary tools and functions. The processing and image rendering tools of the software are based on the open-source libraries ITK and VTK. This ensures that all new developments in image processing that could emerge from other academic institutions using these libraries can be directly ported to the OsiriX program. OsiriX is provided free of charge under the GNU open-source licensing agreement at http://homepage.mac.com/rossetantoine/osirix.
GC-Content Normalization for RNA-Seq Data
2011-01-01
Background: Transcriptome sequencing (RNA-Seq) has become the assay of choice for high-throughput studies of gene expression. However, as is the case with microarrays, major technology-related artifacts and biases affect the resulting expression measures. Normalization is therefore essential to ensure accurate inference of expression levels and subsequent analyses thereof. Results: We focus on biases related to GC-content and demonstrate the existence of strong sample-specific GC-content effects on RNA-Seq read counts, which can substantially bias differential expression analysis. We propose three simple within-lane gene-level GC-content normalization approaches and assess their performance on two different RNA-Seq datasets, involving different species and experimental designs. Our methods are compared to state-of-the-art normalization procedures in terms of bias and mean squared error for expression fold-change estimation and in terms of Type I error and p-value distributions for tests of differential expression. The exploratory data analysis and normalization methods proposed in this article are implemented in the open-source Bioconductor R package EDASeq. Conclusions: Our within-lane normalization procedures, followed by between-lane normalization, reduce GC-content bias and lead to more accurate estimates of expression fold-changes and tests of differential expression. Such results are crucial for the biological interpretation of RNA-Seq experiments, where downstream analyses can be sensitive to the supplied lists of genes. PMID:22177264
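The abstract above does not spell out the three within-lane approaches, so the following sketch only illustrates the general idea of gene-level GC-content normalization within a single lane: genes are grouped into equal-sized GC-content bins, and counts in each bin are rescaled so that every bin's median matches the lane-wide median. This is an assumed, simplified scheme and not the EDASeq implementation; the function name, the bin count, and the toy data are hypothetical.

```python
import math
from statistics import median


def gc_bin_median_scale(counts, gc, n_bins=4):
    """Rescale one lane's gene counts so each GC-content bin has the lane-wide median.

    counts: read counts per gene for a single lane.
    gc:     GC-content per gene (same order as counts).
    """
    if len(counts) != len(gc):
        raise ValueError("counts and gc must have one entry per gene")
    if not counts:
        return []
    overall = median(counts)
    # Order genes by GC-content and cut the ordering into equal-sized bins.
    order = sorted(range(len(counts)), key=lambda i: gc[i])
    size = math.ceil(len(order) / n_bins)
    normalized = [0.0] * len(counts)
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        bin_median = median(counts[i] for i in idx)
        # Leave a bin unchanged if its median is zero (no meaningful scale factor).
        factor = overall / bin_median if bin_median > 0 else 1.0
        for i in idx:
            normalized[i] = counts[i] * factor
    return normalized


if __name__ == "__main__":
    # Toy lane in which counts rise with GC-content (made-up numbers).
    counts = [10, 12, 20, 24, 40, 48, 80, 96]
    gc = [0.30, 0.32, 0.40, 0.42, 0.50, 0.52, 0.60, 0.62]
    print(gc_bin_median_scale(counts, gc))
```

In the toy lane the raw counts grow with GC-content; after scaling, each GC bin has the same median, which removes the lane-specific GC trend while keeping within-bin differences between genes.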