OMPC: an Open-Source MATLAB®-to-Python Compiler
Jurica, Peter; van Leeuwen, Cees
2008-01-01
Free access to scientific information facilitates scientific progress. Open-access scientific journals are a first step in this direction; a further step is to make auxiliary and supplementary materials that accompany scientific publications, such as methodological procedures and data-analysis tools, open and accessible to the scientific community. To this purpose it is instrumental to establish a software base, which will grow toward a comprehensive free and open-source language of technical and scientific computing. Endeavors in this direction are met with an important obstacle. MATLAB®, the predominant computation tool in many fields of research, is a closed-source commercial product. To facilitate the transition to an open computation platform, we propose Open-source MATLAB®-to-Python Compiler (OMPC), a platform that uses syntax adaptation and emulation to allow transparent import of existing MATLAB® functions into Python programs. The imported MATLAB® modules will run independently of MATLAB®, relying on Python's numerical and scientific libraries. Python offers a stable and mature open source platform that, in many respects, surpasses commonly used, expensive commercial closed source packages. The proposed software will therefore facilitate the transparent transition towards a free and general open-source lingua franca for scientific computation, while enabling access to the existing methods and algorithms of technical computing already available in MATLAB®. OMPC is available at http://ompc.juricap.com. PMID:19225577
OMPC: an Open-Source MATLAB-to-Python Compiler.
Jurica, Peter; van Leeuwen, Cees
2009-01-01
Free access to scientific information facilitates scientific progress. Open-access scientific journals are a first step in this direction; a further step is to make auxiliary and supplementary materials that accompany scientific publications, such as methodological procedures and data-analysis tools, open and accessible to the scientific community. To this purpose it is instrumental to establish a software base, which will grow toward a comprehensive free and open-source language of technical and scientific computing. Endeavors in this direction are met with an important obstacle. MATLAB®, the predominant computation tool in many fields of research, is a closed-source commercial product. To facilitate the transition to an open computation platform, we propose Open-source MATLAB®-to-Python Compiler (OMPC), a platform that uses syntax adaptation and emulation to allow transparent import of existing MATLAB® functions into Python programs. The imported MATLAB® modules will run independently of MATLAB®, relying on Python's numerical and scientific libraries. Python offers a stable and mature open source platform that, in many respects, surpasses commonly used, expensive commercial closed source packages. The proposed software will therefore facilitate the transparent transition towards a free and general open-source lingua franca for scientific computation, while enabling access to the existing methods and algorithms of technical computing already available in MATLAB®. OMPC is available at http://ompc.juricap.com.
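The abstracts do not show OMPC's generated code; as a minimal illustration of the semantic gap such a compiler must bridge, the sketch below pairs a MATLAB fragment with a hand-written NumPy equivalent. This is our own mapping for illustration, not OMPC's actual output:

```python
import numpy as np

# MATLAB source (1-based, inclusive ranges, elementwise .^ operator):
#   x = linspace(0, 1, 11);
#   y = x(2:end-1) .^ 2;

x = np.linspace(0.0, 1.0, 11)
# MATLAB's x(2:end-1) selects elements 2..10 (1-based, inclusive);
# in NumPy's 0-based, half-open indexing that span is x[1:-1].
y = x[1:-1] ** 2  # ** is elementwise on arrays, like MATLAB's .^

print(y)
```

A compiler like OMPC must perform this kind of index and operator translation mechanically, which is why emulation of MATLAB semantics, rather than line-by-line rewriting, is the harder part of the job.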
Open-Source 3-D Platform for Low-Cost Scientific Instrument Ecosystem.
Zhang, C; Wijnen, B; Pearce, J M
2016-08-01
The combination of open-source software and hardware provides technically feasible methods to create low-cost, highly customized scientific research equipment. Open-source 3-D printers have proven useful for fabricating scientific tools. Here the capabilities of an open-source 3-D printer are expanded to become a highly flexible scientific platform. An automated low-cost 3-D motion control platform is presented that has the capacity to perform scientific applications, including (1) 3-D printing of scientific hardware; (2) laboratory auto-stirring, measuring, and probing; (3) automated fluid handling; and (4) shaking and mixing. The open-source 3-D platform not only facilitates routine research while radically reducing the cost, but also inspires the creation of a diverse array of custom instruments that can be shared and replicated digitally throughout the world to drive down the cost of research and education further. © 2016 Society for Laboratory Automation and Screening.
Preparing a scientific manuscript in Linux: Today's possibilities and limitations.
Tchantchaleishvili, Vakhtang; Schmitto, Jan D
2011-10-22
An increasing number of scientists are enthusiastic about using free, open source software for their research purposes. The authors' specific goal was to examine whether a Linux-based operating system with open source software packages would make it possible to prepare a submission-ready scientific manuscript without the need to use proprietary software. Preparation and editing of scientific manuscripts is possible using Linux and open source software. This letter to the editor describes the key steps for preparing a publication-ready scientific manuscript in a Linux-based operating system and discusses the necessary software components. This manuscript was created using Linux and open source programs for Linux.
Preparing a scientific manuscript in Linux: Today's possibilities and limitations
2011-01-01
Background An increasing number of scientists are enthusiastic about using free, open source software for their research purposes. The authors' specific goal was to examine whether a Linux-based operating system with open source software packages would make it possible to prepare a submission-ready scientific manuscript without the need to use proprietary software. Findings Preparation and editing of scientific manuscripts is possible using Linux and open source software. This letter to the editor describes the key steps for preparing a publication-ready scientific manuscript in a Linux-based operating system and discusses the necessary software components. This manuscript was created using Linux and open source programs for Linux. PMID:22018246
Adopting Open Source Software to Address Software Risks during the Scientific Data Life Cycle
NASA Astrophysics Data System (ADS)
Vinay, S.; Downs, R. R.
2012-12-01
Software enables the creation, management, storage, distribution, discovery, and use of scientific data throughout the data lifecycle. However, the capabilities offered by software also present risks for the stewardship of scientific data, since future access to digital data is dependent on the use of software. From operating systems to applications for analyzing data, the dependence of data on software presents challenges for the stewardship of scientific data. Adopting open source software provides opportunities to address some of the proprietary risks of data dependence on software. For example, in some cases, open source software can be deployed to avoid licensing restrictions for using, modifying, and transferring proprietary software. The availability of the source code of open source software also enables the inclusion of modifications, which may be contributed by various community members who are addressing similar issues. Likewise, an active community that is maintaining open source software can be a valuable source of help, providing an opportunity to collaborate to address common issues facing adopters. As part of the effort to meet the challenges of software dependence for scientific data stewardship, risks from software dependence have been identified that arise at various stages of the data lifecycle. Identifying these risks should enable the development of plans for mitigating software dependencies, where applicable, using open source software, and should improve understanding of software dependency risks for scientific data and how they can be reduced during the data lifecycle.
Open source software integrated into data services of Japanese planetary explorations
NASA Astrophysics Data System (ADS)
Yamamoto, Y.; Ishihara, Y.; Otake, H.; Imai, K.; Masuda, K.
2015-12-01
Scientific data obtained by Japanese scientific satellites and lunar and planetary explorations are archived in DARTS (Data ARchives and Transmission System). DARTS provides the data through simple methods such as HTTP directory listing for long-term preservation, while also aiming to provide rich web applications for ease of access, built with modern web technologies on top of open source software. This presentation showcases the availability of open source software through our services. KADIAS is a web-based application to search, analyze, and obtain scientific data measured by SELENE (Kaguya), a Japanese lunar orbiter. KADIAS uses OpenLayers to display maps distributed from a Web Map Service (WMS); as the WMS server, the open source software MapServer is adopted. KAGUYA 3D GIS (KAGUYA 3D Moon NAVI) provides a virtual globe for SELENE's data; the main purpose of this application is public outreach, and it was developed with the NASA World Wind Java SDK. C3 (Cross-Cutting Comparisons) is a tool to compare data from various observations and simulations; it uses Highcharts to draw graphs in web browsers. FLOW is a tool to simulate the field of view of an instrument onboard a spacecraft. This tool itself is open source software developed by JAXA/ISAS and released under the BSD 3-Clause License. The SPICE Toolkit is essential to compile FLOW; the SPICE Toolkit is also open source software, developed by NASA/JPL, and its website distributes data for many spacecraft. Nowadays, open source software is an indispensable tool for integrating DARTS services.
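KADIAS's map layers travel over the OGC Web Map Service protocol mentioned above. As a minimal sketch of what a WMS GetMap request looks like from Python, with the endpoint and layer name as hypothetical placeholders rather than actual DARTS URLs (the parameter set follows the WMS 1.1.1 specification):

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint -- a stand-in for a MapServer WMS instance like the
# one described in the abstract, not a real DARTS URL.
WMS_URL = "https://example.org/cgi-bin/mapserv"

params = {
    "SERVICE": "WMS",
    "VERSION": "1.1.1",
    "REQUEST": "GetMap",
    "LAYERS": "moon_dem",        # hypothetical layer name
    "SRS": "EPSG:4326",
    "BBOX": "-180,-90,180,90",   # min lon, min lat, max lon, max lat
    "WIDTH": "1024",
    "HEIGHT": "512",
    "FORMAT": "image/png",
}

# Fetch the rendered map tile and save it to disk.
with urllib.request.urlopen(WMS_URL + "?" + urllib.parse.urlencode(params)) as resp:
    with open("moon_dem.png", "wb") as f:
        f.write(resp.read())
```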
The open-source movement: an introduction for forestry professionals
Patrick Proctor; Paul C. Van Deusen; Linda S. Heath; Jeffrey H. Gove
2005-01-01
In recent years, the open-source movement has yielded a generous and powerful suite of software and utilities that rivals those developed by many commercial software companies. Open-source programs are available for many scientific needs: operating systems, databases, statistical analysis, Geographic Information System applications, and object-oriented programming....
Open Source Software Development and Lotka's Law: Bibliometric Patterns in Programming.
ERIC Educational Resources Information Center
Newby, Gregory B.; Greenberg, Jane; Jones, Paul
2003-01-01
Applies Lotka's Law to metadata on open source software development. Authoring patterns in software development productivity are found to be comparable to prior studies of Lotka's Law for scientific and scholarly publishing, and offer promise in predicting aggregate behavior of open source developers. (Author/LRW)
NASA Astrophysics Data System (ADS)
Yetman, G.; Downs, R. R.
2011-12-01
Software deployment is needed to process and distribute scientific data throughout the data lifecycle. Developing software in-house can take software development teams away from other software development projects and can require efforts to maintain the software over time. Adopting and reusing software and system modules that have been previously developed by others can reduce in-house software development and maintenance costs and can contribute to the quality of the system being developed. A variety of models are available for reusing and deploying software and systems that have been developed by others. These deployment models include open source software, vendor-supported open source software, commercial software, and combinations of these approaches. Deployment in Earth science data processing and distribution has demonstrated the advantages and drawbacks of each model. Deploying open source software offers advantages for developing and maintaining scientific data processing systems and applications. By joining an open source community that is developing a particular system module or application, a scientific data processing team can contribute to aspects of the software development without having to commit to developing the software alone. Communities of interested developers can share the work while focusing on activities that utilize in-house expertise and address internal requirements. Maintenance is also shared by members of the community. Deploying vendor-supported open source software offers similar advantages to open source software. However, by procuring the services of a vendor, the in-house team can rely on the vendor to provide, install, and maintain the software over time. Vendor-supported open source software may be ideal for teams that recognize the value of an open source software component or application and would like to contribute to the effort, but do not have the time or expertise to contribute extensively. Vendor-supported software may also have the additional benefits of guaranteed up-time, bug fixes, and vendor-added enhancements. Deploying commercial software can be advantageous for obtaining system or software components offered by a vendor that meet in-house requirements. The vendor can be contracted to provide installation, support and maintenance services as needed. Combining these options offers a menu of choices, enabling selection of system components or software modules that meet the evolving requirements encountered throughout the scientific data lifecycle.
2010-01-01
Background The ability to write clearly and effectively is of central importance to the scientific enterprise. Encouraged by the success of simulation environments in other biomedical sciences, we developed WriteSim TCExam, an open-source, Web-based, textual simulation environment for teaching effective writing techniques to novice researchers. We shortlisted and modified an existing open source application, TCExam, to serve as a textual simulation environment. After testing usability internally in our team, we conducted formal field usability studies with novice researchers. These were followed by formal surveys with researchers fitting the roles of administrators and users (novice researchers). Results The development process was guided by feedback from usability tests within our research team. Online surveys and formal studies, involving members of the Research on Research group and selected novice researchers, show that the application is user-friendly. Additionally, it has been used to train 25 novice researchers in scientific writing to date and has generated encouraging results. Conclusion WriteSim TCExam is the first Web-based, open-source textual simulation environment designed to complement traditional scientific writing instruction. While initial reviews by students and educators have been positive, a formal study is needed to measure its benefits in comparison to standard instructional methods. PMID:20509946
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinlan, D.; Yi, Q.; Vuduc, R.
2005-02-17
ROSE is an object-oriented software infrastructure for source-to-source translation that provides an interface for programmers to write their own specialized translators for optimizing scientific applications. ROSE is a part of current research on telescoping languages, which provides optimizations of the use of libraries in scientific applications. ROSE defines approaches to extend the optimization techniques, common in well defined languages, to the optimization of scientific applications using well defined libraries. ROSE includes a rich set of tools for generating customized transformations to support optimization of application codes. We currently support full C and C++ (including template instantiation, etc.), with Fortran 90 support under development as part of a collaboration and contract with Rice to use their version of the open source Open64 F90 front-end. ROSE represents an attempt to define an open compiler infrastructure to handle the full complexity of full-scale DOE application codes using the languages common to scientific computing within DOE. We expect that such an infrastructure will also be useful for the development of numerous tools that may then realistically expect to work on DOE full-scale applications.
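ROSE itself is a C++ infrastructure, so no ROSE code is shown here. As a language-neutral analogy, the sketch below implements a toy source-to-source pass with Python's standard ast module (ast.unparse requires Python 3.9+), following the same parse, transform, unparse pattern that translators built on ROSE apply to C, C++, and Fortran at full scale:

```python
import ast

source = "y = x ** 2 + x ** 2\n"

class FoldDoubledTerm(ast.NodeTransformer):
    """Rewrite 'a + a' into '2 * a' -- a toy strength-reduction pass."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform children first (bottom-up)
        if isinstance(node.op, ast.Add) and ast.dump(node.left) == ast.dump(node.right):
            return ast.BinOp(left=ast.Constant(value=2), op=ast.Mult(), right=node.left)
        return node

tree = FoldDoubledTerm().visit(ast.parse(source))
ast.fix_missing_locations(tree)
print(ast.unparse(tree))  # -> y = 2 * x ** 2
```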
SCIFIO: an extensible framework to support scientific image formats.
Hiner, Mark C; Rueden, Curtis T; Eliceiri, Kevin W
2016-12-07
No gold standard exists in the world of scientific image acquisition; a proliferation of instruments each with its own proprietary data format has made out-of-the-box sharing of that data nearly impossible. In the field of light microscopy, the Bio-Formats library was designed to translate such proprietary data formats to a common, open-source schema, enabling sharing and reproduction of scientific results. While Bio-Formats has proved successful for microscopy images, the greater scientific community was lacking a domain-independent framework for format translation. SCIFIO (SCientific Image Format Input and Output) is presented as a freely available, open-source library unifying the mechanisms of reading and writing image data. The core of SCIFIO is its modular definition of formats, the design of which clearly outlines the components of image I/O to encourage extensibility, facilitated by the dynamic discovery of the SciJava plugin framework. SCIFIO is structured to support coexistence of multiple domain-specific open exchange formats, such as Bio-Formats' OME-TIFF, within a unified environment. SCIFIO is a freely available software library developed to standardize the process of reading and writing scientific image formats.
Embracing Open Source for NASA's Earth Science Data Systems
NASA Technical Reports Server (NTRS)
Baynes, Katie; Pilone, Dan; Boller, Ryan; Meyer, David; Murphy, Kevin
2017-01-01
The overarching purpose of NASA's Earth Science program is to develop a scientific understanding of Earth as a system. Scientific knowledge is most robust and actionable when resulting from transparent, traceable, and reproducible methods. Reproducibility includes open access to the data as well as the software used to arrive at results. Additionally, software that is custom-developed for NASA should be open to the greatest degree possible, to enable re-use across Federal agencies, reduce overall costs to the government, remove barriers to innovation, and promote consistency through the use of uniform standards. Finally, Open Source Software (OSS) practices facilitate collaboration between agencies and the private sector. To best meet these ends, NASA's Earth Science Division promotes the full and open sharing of not only all data, metadata, products, information, documentation, models, images, and research results but also the source code used to generate, manipulate and analyze them. This talk focuses on the challenges to open sourcing NASA developed software within ESD and the growing pains associated with establishing policies running the gamut of tracking issues, properly documenting build processes, engaging the open source community, maintaining internal compliance, and accepting contributions from external sources. This talk also covers the adoption of existing open source technologies and standards to enhance our custom solutions and our contributions back to the community. Finally, we will be introducing the most recent OSS contributions from NASA Earth Science program and promoting these projects for wider community review and adoption.
Openness, Web 2.0 Technology, and Open Science
ERIC Educational Resources Information Center
Peters, Michael A.
2010-01-01
Open science is a term that is being used in the literature to designate a form of science based on open source models or that utilizes principles of open access, open archiving and open publishing to promote scientific communication. Open science increasingly also refers to open governance and more democratized engagement and control of science…
Tracking Clouds with low cost GNSS chips aided by the Arduino platform
NASA Astrophysics Data System (ADS)
Hameed, Saji; Realini, Eugenio; Ishida, Shinya
2016-04-01
The Global Navigation Satellite System (GNSS) is a constellation of satellites that is used to provide geo-positioning services. Beyond this application, the GNSS system is important for a wide range of scientific and civilian applications. For example, GNSS systems are routinely used in civilian applications such as surveying and in scientific applications such as the study of crustal deformation. Another important scientific application of the GNSS system is in meteorological research, where it is mainly used to determine the total water vapour content of the troposphere, hereafter Precipitable Water Vapor (PWV). However, both GNSS receivers and software carry prohibitively high prices for a variety of reasons. To overcome this somewhat artificial barrier, we are exploring the use of low-cost GNSS receivers along with open source GNSS software for scientific research, in particular for GNSS meteorology research. To achieve this aim, we have developed a custom Arduino-compatible data logging board that is able to operate together with a specific low-cost single-frequency GNSS receiver chip from NVS Technologies AG. We have also developed an open-source software bundle that includes a new Arduino core for the ATmega324P chip, which is the main processor used in our custom logger, as well as software code that enables data collection, logging and parsing of the GNSS data stream. Additionally, we have comprehensively evaluated the low-power characteristics of the GNSS receiver and logger boards. Currently we are exploring the use of several open-source or free-for-research software packages to map GNSS delays to PWV. These include the open source goGPS (http://www.gogps-project.org/) and gLAB (http://gage.upc.edu/gLAB) and the openly available GAMIT software from the Massachusetts Institute of Technology (MIT). We note that all the firmware and software developed as part of this project is available under an open source license.
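The project's logging firmware is Arduino code, but the parsing step it mentions can be sketched in Python. The sketch below is our own illustration, not the project's code: it decodes the position fields of a standard NMEA GGA sentence, with no checksum verification and well-formed input assumed:

```python
def parse_gga(sentence: str):
    """Parse an NMEA GGA sentence into (lat, lon, fix_quality, n_sats).

    Minimal sketch: no checksum verification, well-formed sentence assumed.
    """
    fields = sentence.strip().split(",")
    assert fields[0].endswith("GGA"), "not a GGA sentence"
    # Latitude is ddmm.mmmm; longitude is dddmm.mmmm (degrees + minutes).
    lat = float(fields[2][:2]) + float(fields[2][2:]) / 60.0
    if fields[3] == "S":
        lat = -lat
    lon = float(fields[4][:3]) + float(fields[4][3:]) / 60.0
    if fields[5] == "W":
        lon = -lon
    return lat, lon, int(fields[6]), int(fields[7])

print(parse_gga("$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"))
# -> (48.1173, 11.516666..., 1, 8)
```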
Developing a low-cost open-source CTD for research and outreach
NASA Astrophysics Data System (ADS)
Thaler, A. D.; Sturdivant, K.
2013-12-01
Conductivity, temperature, and depth (CTD). With these three measurements, marine scientists can unlock ocean patterns hidden beneath the waves. The ocean is not uniform; it is filled with swirling eddies, temperature boundaries, layers of high and low salinity, changing densities, and many other physical characteristics. To reveal these patterns, oceanographers use a tool called the CTD. A CTD is found on almost every major research vessel. Rare is the scientific expedition, whether coastal work in shallow estuaries or a journey to the deepest ocean trenches, that doesn't begin with the humble CTD cast. The CTD is not cheap. Commercial CTDs start at more than $5,000 and can climb as high as $25,000 or more. We believe that the prohibitive cost of a CTD is an unacceptable barrier to open science. The price tag excludes individuals and groups who lack research grants or significant private funds from conducting oceanographic research. We want to make this tool, the workhorse of oceanographic research, available to anyone with an interest in the oceans. The OpenCTD is a low-cost, open-source CTD suitable for both educators and scientists. The platform is built using readily available parts and is powered by an Arduino-based microcontroller. Our goal is to create a device that is accurate enough to be used for scientific research and can be constructed for less than $200. Source code, circuit diagrams, and building plans will be freely available. The final instrument will be effective to 200 meters depth. Why 200 meters? For many coastal regions, 200 meters of water depth covers the majority of the ocean that is accessible by small boat. The OpenCTD is targeted at people working in this niche, where entire research projects can be conducted for less than the cost of a commercial CTD. However, the OpenCTD is scalable, and anyone with the inclination can adapt our plans to operate in deeper waters. Through a crowdfunding initiative and collaboration with numerous interested scientists, researchers, educators, and developers, we developed the framework for a low-cost, open-source CTD that is appropriate for both scientific research and public outreach. We envision a network of researchers and educators using the OpenCTD to contribute to local and regional scientific programs through open-source databases.
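As a worked example of the "D" in CTD, the sketch below converts a pressure-sensor reading to depth using the hydrostatic relation depth = P / (rho * g). The density and gravity constants are nominal assumptions for seawater, not OpenCTD calibration values:

```python
# Nominal constants -- assumptions for illustration, not OpenCTD calibration.
RHO_SEAWATER = 1025.0  # kg/m^3
G = 9.81               # m/s^2

def depth_from_pressure(p_dbar: float) -> float:
    """Depth in meters from gauge pressure in decibars (1 dbar = 10^4 Pa)."""
    return p_dbar * 1.0e4 / (RHO_SEAWATER * G)

# ~200 m, the OpenCTD's stated target depth, corresponds to roughly 201 dbar.
print(round(depth_from_pressure(201.2), 1))
```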
Search Analytics: Automated Learning, Analysis, and Search with Open Source
NASA Astrophysics Data System (ADS)
Hundman, K.; Mattmann, C. A.; Hyon, J.; Ramirez, P.
2016-12-01
The sheer volume of unstructured scientific data makes comprehensive human analysis impossible, resulting in missed opportunities to identify relationships, trends, gaps, and outliers. As the open source community continues to grow, tools like Apache Tika, Apache Solr, Stanford's DeepDive, and Data-Driven Documents (D3) can help address this challenge. With a focus on journal publications and conference abstracts, often in the form of PDF and Microsoft Office documents, we've initiated an exploratory NASA Advanced Concepts project aiming to use the aforementioned open source text analytics tools to build a data-driven justification for the HyspIRI Decadal Survey mission. We call this capability Search Analytics, and it fuses and augments these open source tools to enable the automatic discovery and extraction of salient information. In the case of HyspIRI, a hyperspectral infrared imager mission, key findings resulted from the extraction and visualization of relationships from thousands of unstructured scientific documents. The relationships include links between satellites (e.g. Landsat 8), domain-specific measurements (e.g. spectral coverage) and subjects (e.g. invasive species). Using the above open source tools, Search Analytics mined and characterized a corpus of information that would be infeasible for a human to process. More broadly, Search Analytics offers insights into various scientific and commercial applications enabled through missions and instrumentation with specific technical capabilities. For example, the following phrases were extracted in close proximity within a publication: "In this study, hyperspectral images…with high spatial resolution (1 m) were analyzed to detect cutleaf teasel in two areas. …Classification of cutleaf teasel reached a user's accuracy of 82 to 84%." Without reading a single paper we can use Search Analytics to automatically identify that a 1 m spatial resolution provides a cutleaf teasel detection user's accuracy of 82-84%, which could have tangible, direct downstream implications for crop protection. Automatically assimilating this information expedites and supplements human analysis, and, ultimately, Search Analytics and its foundation of open source tools will result in more efficient scientific investment and research.
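A minimal sketch of the extraction step described above, using Apache Tika's Python client (pip install tika, which drives a local Tika server and requires Java). The file name and the regular expression are illustrative assumptions, not the project's actual pipeline:

```python
import re

from tika import parser  # Apache Tika Python client

# Hypothetical input document standing in for a journal PDF.
parsed = parser.from_file("publication.pdf")
text = parsed.get("content") or ""

# Pull out "<number> m spatial resolution" style measurements from the text.
for match in re.finditer(r"(\d+(?:\.\d+)?)\s*m\s+spatial\s+resolution", text):
    print(match.group(1), "m")
```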
Tools for open geospatial science
NASA Astrophysics Data System (ADS)
Petras, V.; Petrasova, A.; Mitasova, H.
2017-12-01
Open science uses open source to deal with reproducibility challenges in data and computational sciences. However, just using open source software or making the code public does not make the research reproducible. Moreover, scientists face the challenge of learning new, unfamiliar tools and workflows. In this contribution, we will look at a graduate-level course syllabus covering several software tools which make validation and reuse by a wider professional community possible. For the novices in the open science arena, we will look at how scripting languages such as Python and Bash help us reproduce research (starting with our own work). Jupyter Notebook will be introduced as a code editor, data exploration tool, and lab notebook. We will see how Git helps us not to get lost in revisions and how Docker is used to wrap all the parts together using a single text file, so that figures for a scientific paper or a technical report can be generated with a single command. We will look at examples of software and publications in the geospatial domain which use these tools and principles. Scientific contributions to GRASS GIS, a powerful open source desktop GIS and geoprocessing backend, will serve as an example of why and how to publish new algorithms and tools as part of a bigger open source project.
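Since the contribution points to GRASS GIS as a scripting backend, here is a minimal sketch of driving it from Python. It assumes an existing GRASS session with an 'elevation' raster available (e.g., the standard North Carolina sample dataset); the map names are examples:

```python
# Must run inside a GRASS GIS session (e.g., started with `grass`), so that
# the grass.script package and a location/mapset are available.
import grass.script as gs

# Derive slope and aspect rasters from the elevation model.
gs.run_command(
    "r.slope.aspect",
    elevation="elevation",
    slope="slope",
    aspect="aspect",
    overwrite=True,
)

# r.univar -g prints key=value statistics; parse_command returns them as a dict.
stats = gs.parse_command("r.univar", map="slope", flags="g")
print("mean slope:", stats["mean"])
```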
Note: Tormenta: An open source Python-powered control software for camera based optical microscopy.
Barabas, Federico M; Masullo, Luciano A; Stefani, Fernando D
2016-12-01
Until recently, PC control and synchronization of scientific instruments was only possible through closed-source, expensive frameworks like National Instruments' LabVIEW. Nowadays, efficient cost-free alternatives are available in the context of a continuously growing community of open-source software developers. Here, we report on Tormenta, a modular open-source software for the control of camera-based optical microscopes. Tormenta is built on Python, works on multiple operating systems, and includes some key features for fluorescence nanoscopy based on single molecule localization.
Note: Tormenta: An open source Python-powered control software for camera based optical microscopy
NASA Astrophysics Data System (ADS)
Barabas, Federico M.; Masullo, Luciano A.; Stefani, Fernando D.
2016-12-01
Until recently, PC control and synchronization of scientific instruments was only possible through closed-source, expensive frameworks like National Instruments' LabVIEW. Nowadays, efficient cost-free alternatives are available in the context of a continuously growing community of open-source software developers. Here, we report on Tormenta, a modular open-source software for the control of camera-based optical microscopes. Tormenta is built on Python, works on multiple operating systems, and includes some key features for fluorescence nanoscopy based on single molecule localization.
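Neither record shows Tormenta's actual API, so the following is a purely hypothetical sketch of the acquire-threshold-detect loop that camera-control software for single-molecule localization implements; FakeCamera is a stand-in class simulating a sensor, not part of Tormenta:

```python
import numpy as np

class FakeCamera:
    """Stand-in for a camera driver: returns shot-noise-limited frames."""

    def acquire(self, shape=(256, 256)):
        return np.random.poisson(lam=10, size=shape).astype(np.uint16)

camera = FakeCamera()
frame = camera.acquire()

# Crude candidate detection: pixels far above the background level.
threshold = frame.mean() + 5 * frame.std()
peaks = np.argwhere(frame > threshold)
print(f"{len(peaks)} candidate molecule pixels above threshold")
```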
ObsPy: Establishing and maintaining an open-source community package
NASA Astrophysics Data System (ADS)
Krischer, L.; Megies, T.; Barsch, R.
2017-12-01
Python's ecosystem has evolved into one of the most powerful and productive research environments across disciplines. ObsPy (https://obspy.org) is a fully community-driven, open-source project dedicated to providing a bridge for seismology into that ecosystem. It does so by offering read and write support for essentially every commonly used data format in seismology, integrated access to the largest data centers, web services, and real-time data streams, a powerful signal processing toolbox tuned to the specific needs of seismologists, and utility functionality such as travel time calculations, geodetic functions, and data visualization. ObsPy has been in constant unfunded development for more than eight years and is developed and used by scientists around the world, with successful applications in all branches of seismology. By now around 70 people have directly contributed code to ObsPy, and we aim to make it a self-sustaining community project. This contribution focuses on several meta aspects of open-source software in science, in particular how we have experienced them. During the panel we would like to discuss obvious questions like long-term sustainability with very limited to no funding, insufficient computer science training in many sciences, and gaining hard scientific credit for software development, but also the following questions: How best to deal with the fact that a lot of scientific software is very specialized, and thus solves a complex problem but can only ever reach a limited pool of developers and users by virtue of being so specialized? The "many eyes on the code" approach to developing and improving open-source software therefore only applies in a limited fashion. An initial publication for a significant new scientific software package is fairly straightforward; but how does one on-board and motivate potential new contributors when they can no longer be lured by a potential co-authorship? And when is spending significant time and effort on reusable scientific open-source development a reasonable choice for young researchers? After all, writing purpose-tailored code for a single application resulting in a scientific publication takes significantly less effort than generalizing and engineering it well enough that it can be used by others.
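The feature list above maps onto a very short ObsPy session. The following uses real ObsPy calls (FDSN client, filtering, plotting); the network and station codes are just examples, and a network connection is required:

```python
from obspy import UTCDateTime
from obspy.clients.fdsn import Client

# Fetch one hour of vertical-component data from the IRIS data center,
# starting at the 2011 Tohoku earthquake origin time.
client = Client("IRIS")
t0 = UTCDateTime("2011-03-11T05:46:23")
st = client.get_waveforms("IU", "ANMO", "00", "BHZ", t0, t0 + 3600)

# Basic processing: detrend, then bandpass to long periods.
st.detrend("linear")
st.filter("bandpass", freqmin=0.01, freqmax=0.1)
st.plot()
```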
High performance geospatial and climate data visualization using GeoJS
NASA Astrophysics Data System (ADS)
Chaudhary, A.; Beezley, J. D.
2015-12-01
GeoJS (https://github.com/OpenGeoscience/geojs) is an open-source library developed to support interactive scientific and geospatial visualization of climate and earth science datasets in a web environment. GeoJS has a convenient application programming interface (API) that enables users to harness the fast performance of WebGL and Canvas 2D APIs with sophisticated Scalable Vector Graphics (SVG) features in a consistent and convenient manner. We started the project in response to the need for an open-source JavaScript library that can combine traditional geographic information systems (GIS) and scientific visualization on the web. Many libraries, some of which are open source, support mapping or other GIS capabilities, but lack the features required to visualize scientific and other geospatial datasets. For instance, such libraries are not capable of rendering climate plots from NetCDF files, and some libraries are limited in regards to geoinformatics (infovis in a geospatial environment). While libraries such as d3.js are extremely powerful for these kinds of plots, in order to integrate them into other GIS libraries, the construction of geoinformatics visualizations must be completed manually and separately, or the code must somehow be mixed in an unintuitive way. We developed GeoJS with the following motivations: • To create an open-source geovisualization and GIS library that combines scientific visualization with GIS and informatics • To develop an extensible library that can combine data from multiple sources and render them using multiple backends • To build a library that works well with existing scientific visualization tools such as VTK. We have successfully deployed GeoJS-based applications for multiple domains across various projects. The ClimatePipes project funded by the Department of Energy, for example, used GeoJS to visualize NetCDF datasets from climate data archives. Other projects built visualizations using GeoJS for interactively exploring data and analysis regarding 1) the human trafficking domain, 2) New York City taxi drop-offs and pick-ups, and 3) the Ebola outbreak. GeoJS supports advanced visualization features such as picking and selecting, as well as clustering. It also supports 2D contour plots, vector plots, heat maps, and geospatial graphs.
Open Data and Open Science for better Research in the Geo and Space Domain
NASA Astrophysics Data System (ADS)
Ritschel, B.; Seelus, C.; Neher, G.; Iyemori, T.; Koyama, Y.; Yatagai, A. I.; Murayama, Y.; King, T. A.; Hughes, S.; Fung, S. F.; Galkin, I. A.; Hapgood, M. A.; Belehaki, A.
2015-12-01
Main open data principles were worked out in the run-up to, and finally adopted in, the Open Data Charter at the G8 summit in Lough Erne, Northern Ireland, in June 2013. Important principles are also valid for science data, such as Open Data by Default, Quality and Quantity, Useable by All, Releasing Data for Improved Governance, and Releasing Data for Innovation. There is also an explicit relationship to such high-value areas as earth observation, education and geospatial data. The European Union implementation plan of the Open Data Charter identifies, among other things, objectives such as making data available in an open format, enabling semantic interoperability, ensuring quality, documentation and, where appropriate, reconciliation across different data sources, implementing software solutions allowing easy management, publication or visualization of datasets, and simplifying clearance of intellectual property rights. Open Science is not just a list of long-established principles but stands for a range of initiatives and projects around better handling of scientific data and openly shared scientific knowledge. It is also about transparency in methodology and collection of data, availability and reuse of scientific data, public accessibility to scientific communication, and the use of social media to facilitate scientific collaboration. Some projects concentrate on open sharing of free and open source software, and even hardware in the form of processing capabilities. In addition, questions about the mashup of data and publication and an open peer review process are addressed. Following the principles of open data and open science, the newest results of the collaboration efforts in mashing up the data servers related to the Japanese IUGONET, the European Union ESPAS and the GFZ ISDC semantic Web projects will be presented here. The semantic Web based approach for the mashup focuses on the design and implementation of a common but still distributed data catalog based on semantic interoperability, including transparent access to data in relational databases. References: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/207772/Open_Data_Charter.pdf and http://www.openscience.org/blog/wp-content/uploads/2013/06/OpenSciencePoster.pdf
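As a sketch of the semantic-web layer such a distributed catalog rests on, the snippet below issues a SPARQL query with the SPARQLWrapper package (pip install SPARQLWrapper). The endpoint URL is a hypothetical placeholder, not an actual IUGONET/ESPAS/ISDC service; the Dublin Core title predicate is a standard vocabulary term:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint standing in for a project SPARQL service.
sparql = SPARQLWrapper("https://example.org/sparql")
sparql.setQuery("""
    SELECT ?dataset ?title WHERE {
        ?dataset <http://purl.org/dc/terms/title> ?title .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

# Print dataset URIs and titles from the JSON result bindings.
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["dataset"]["value"], "--", row["title"]["value"])
```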
What makes computational open source software libraries successful?
NASA Astrophysics Data System (ADS)
Bangerth, Wolfgang; Heister, Timo
2013-01-01
Software is the backbone of scientific computing. Yet, while we regularly publish detailed accounts about the results of scientific software, and while there is a general sense of which numerical methods work well, our community is largely unaware of best practices in writing the large-scale, open source scientific software upon which our discipline rests. This is particularly apparent in the commonly held view that writing successful software packages is largely the result of simply ‘being a good programmer’ when in fact there are many other factors involved, for example the social skill of community building. In this paper, we consider what we have found to be the necessary ingredients for successful scientific software projects and, in particular, for software libraries upon which the vast majority of scientific codes are built today. In particular, we discuss the roles of code, documentation, communities, project management and licenses. We also briefly comment on the impact on academic careers of engaging in software projects.
Lakes, landscapes, and R: A framework for open science on freshwater cyanobacteria
In the last several years scientific reproducibility has been a hot topic, with several fields openly struggling to reproduce and replicate research results. One of the key tools to address reproducibility is the use of open source software. Increasingly in ecology, the R ...
Erdemir, Ahmet
2016-01-01
Virtual representations of the knee joint can provide clinicians, scientists, and engineers the tools to explore mechanical function of the knee and its tissue structures in health and disease. Modeling and simulation approaches such as finite element analysis also provide the possibility to understand the influence of surgical procedures and implants on joint stresses and tissue deformations. A large number of knee joint models are described in the biomechanics literature. However, freely accessible, customizable, and easy-to-use models are scarce. Availability of such models can accelerate clinical translation of simulations, where labor intensive reproduction of model development steps can be avoided. The interested parties can immediately utilize readily available models for scientific discovery and for clinical care. Motivated by this gap, this study aims to describe an open source and freely available finite element representation of the tibiofemoral joint, namely Open Knee, which includes detailed anatomical representation of the joint's major tissue structures, their nonlinear mechanical properties and interactions. Three use cases illustrate customization potential of the model, its predictive capacity, and its scientific and clinical utility: prediction of joint movements during passive flexion, examining the role of meniscectomy on contact mechanics and joint movements, and understanding anterior cruciate ligament mechanics. A summary of scientific and clinically directed studies conducted by other investigators is also provided. The utilization of this open source model by groups other than its developers emphasizes the premise of model sharing as an accelerator of simulation-based medicine. Finally, the imminent need to develop next generation knee models is noted. These are anticipated to incorporate individualized anatomy and tissue properties supported by specimen-specific joint mechanics data for evaluation, all acquired in vitro from varying age groups and pathological states. PMID:26444849
Can open-source drug R&D repower pharmaceutical innovation?
Munos, B
2010-05-01
Open-source R&D initiatives are multiplying across biomedical research. Some of them, such as public-private partnerships, have achieved notable success in bringing new drugs to market economically, whereas others reflect the pharmaceutical industry's efforts to retool its R&D model. Is open innovation the answer to the innovation crisis? This Commentary argues that although it may well be part of the solution, significant cultural, scientific, and regulatory barriers can prevent it from delivering on its promise.
Shipping Science Worldwide with Open Source Containers
NASA Astrophysics Data System (ADS)
Molineaux, J. P.; McLaughlin, B. D.; Pilone, D.; Plofchan, P. G.; Murphy, K. J.
2014-12-01
Scientific applications often present difficult web-hosting needs. Their compute- and data-intensive nature, as well as an increasing need for high availability and distribution, combine to create a challenging set of hosting requirements. In the past year, advancements in container-based virtualization and related tooling have offered new lightweight and flexible ways to accommodate diverse applications with all the isolation and portability benefits of traditional virtualization. This session will introduce and demonstrate an open-source, single-interface Platform-as-a-Service (PaaS) that empowers application developers to seamlessly leverage geographically distributed, public and private compute resources to achieve highly available, performant hosting for scientific applications.
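A minimal sketch of the container workflow underlying such a PaaS, using the Docker SDK for Python (pip install docker; a running Docker daemon is assumed). The image and command are generic stand-ins, not the platform described in the abstract:

```python
import docker

# Connect to the local Docker daemon via environment configuration.
client = docker.from_env()

# Run a throwaway container: a public base image executing a stand-in
# "science workload", with the container removed after it exits.
output = client.containers.run(
    "python:3.11-slim",
    ["python", "-c", "print(2 ** 10)"],
    remove=True,
)
print(output.decode())  # -> 1024
```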
Rapid development of medical imaging tools with open-source libraries.
Caban, Jesus J; Joshi, Alark; Nagy, Paul
2007-11-01
Rapid prototyping is an important element in researching new imaging analysis techniques and developing custom medical applications. In the last ten years, the open source community and the number of open source libraries and freely available frameworks for biomedical research have grown significantly. What they offer is now considered standard in medical image analysis, computer-aided diagnosis, and medical visualization. A cursory review of the peer-reviewed literature in imaging informatics (indeed, in almost any information technology-dependent scientific discipline) indicates the current reliance on open source libraries to accelerate development and validation of processes and techniques. In this survey paper, we review and compare a few of the most successful open source libraries and frameworks for medical application development. Our dual intentions are to provide evidence that these approaches already constitute a vital and essential part of medical image analysis, diagnosis, and visualization and to motivate the reader to use open source libraries and software for rapid prototyping of medical applications and tools.
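The survey compares several libraries without committing to one; as a concrete example of the rapid prototyping it describes, here is a minimal pipeline in SimpleITK (pip install SimpleITK), chosen by us for illustration. The file names are placeholders:

```python
import SimpleITK as sitk

# Read a volume (placeholder path), smooth it, and write the result.
image = sitk.ReadImage("input_ct.nii.gz")
smoothed = sitk.SmoothingRecursiveGaussian(image, 2.0)  # sigma in physical units
sitk.WriteImage(smoothed, "smoothed_ct.nii.gz")

# Basic metadata inspection, typical first step when prototyping.
print(image.GetSize(), image.GetSpacing())
```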
Upon the Shoulders of Giants: Open-Source Hardware and Software in Analytical Chemistry.
Dryden, Michael D M; Fobel, Ryan; Fobel, Christian; Wheeler, Aaron R
2017-04-18
Isaac Newton famously observed that "if I have seen further it is by standing on the shoulders of giants." We propose that this sentiment is a powerful motivation for the "open-source" movement in scientific research, in which creators provide everything needed to replicate a given project online, as well as providing explicit permission for users to use, improve, and share it with others. Here, we write to introduce analytical chemists who are new to the open-source movement to best practices and concepts in this area and to survey the state of open-source research in analytical chemistry. We conclude by considering two examples of open-source projects from our own research group, with the hope that a description of the process, motivations, and results will provide a convincing argument about the benefits that this movement brings to both creators and users.
Free and Open Source Software for Geospatial in the field of planetary science
NASA Astrophysics Data System (ADS)
Frigeri, A.
2012-12-01
Information technology applied to geospatial analyses has spread quickly in the last ten years. The availability of OpenData and data from collaborative mapping projects has increased interest in tools, procedures and methods to handle spatially-related information. Free Open Source Software projects devoted to geospatial data handling are gaining traction, as the use of interoperable formats and protocols allows the user to choose what pipeline of tools and libraries is needed to solve a particular task, adapting the software scene to a specific problem. In particular, the Free Open Source model of development mimics the scientific method very well, and researchers should be naturally encouraged to take part in the development process of these software projects, as this represents a very agile way to interact among several institutions. When it comes to planetary sciences, geospatial Free Open Source Software is gaining a key role in projects that commonly involve different subjects in an international scenario. Very popular software suites for processing scientific mission data (for example, ISIS) and for navigation/planning (SPICE) are being distributed along with their source code, and the interaction between user and developer is often very close, creating a continuum between these two figures. A very widely used library for handling geospatial data (GDAL) has started to support planetary data from the Planetary Data System, and recent contributions have enabled support for other popular data formats used in planetary science, such as VICAR. The use of Geographic Information Systems in planetary science is now widespread, and Free Open Source GIS, open GIS formats and network protocols allow existing tools and methods, developed to solve Earth-based problems, to be extended to the study of solar system bodies. A day in the working life of a researcher using Free Open Source Software for geospatial work will be presented, along with the benefits, and solutions to possible detriments, of the effort required to use, support and contribute to these projects.
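Since GDAL's planetary support is singled out above, here is a minimal sketch of reading a raster through its Python bindings; the file name is a placeholder for a PDS or VICAR product that the installed GDAL build supports:

```python
from osgeo import gdal

gdal.UseExceptions()  # raise Python exceptions instead of returning None

# Placeholder path standing in for a PDS/VICAR raster product.
ds = gdal.Open("example_pds_product.img")

band = ds.GetRasterBand(1)
data = band.ReadAsArray()  # NumPy array of pixel values

print("size:", ds.RasterXSize, "x", ds.RasterYSize)
print("mean value:", data.mean())
```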
Whole earth modeling: developing and disseminating scientific software for computational geophysics.
NASA Astrophysics Data System (ADS)
Kellogg, L. H.
2016-12-01
Historically, a great deal of specialized scientific software for modeling and data analysis has been developed by individual researchers or small groups of scientists working on their own specific research problems. As the magnitude of available data and computer power has increased, so has the complexity of scientific problems addressed by computational methods, creating both a need to sustain existing scientific software and a need to expand its development to take advantage of new algorithms, new software approaches, and new computational hardware. To that end, communities like the Computational Infrastructure for Geodynamics (CIG) have been established to support the use of best practices in scientific computing for solid earth geophysics research and teaching. Working as a scientific community enables computational geophysicists to take advantage of technological developments, improve the accuracy and performance of software, build on prior software development, and collaborate more readily. The CIG community, and others, have adopted an open-source development model, in which code is developed and disseminated by the community in an open fashion, using version control and software repositories like Git. One emerging issue is how to adequately identify and credit the intellectual contributions involved in creating open source scientific software. The traditional method of disseminating scientific ideas, peer-reviewed publication, was not designed for reviewing or crediting scientific software, although emerging publication strategies such as software journals are attempting to address the need. We are piloting an integrated approach in which authors are identified and credited as scientific software is developed and run. Successful software citation also requires integration with scholarly publication and indexing mechanisms, to assign credit, ensure discoverability, and provide provenance for software.
High School Open On-Line Courses (HOOC): A Case Study from Italy
ERIC Educational Resources Information Center
Canessa, Enrique; Pisani, Armando
2013-01-01
The first implementation of complete high school, open on-line courses (HOOC) aiming to support the training and basic scientific knowledge of young students from the Liceo Ginnasio Dante Alighieri in Gorizia, Italy, is discussed. Using the open source and automated recording system openEyA, HOOC give a student the opportunity to watch on-line, at…
The ESA scientific exploitation element results and outlook
NASA Astrophysics Data System (ADS)
Desnos, Yves-louis; Regner, Peter; Delwart, Steven; Benveniste, Jerome; Engdahl, Marcus; Donlon, Craig; Mathieu, Pierre-Philippe; Fernandez, Diego; Gascon, Ferran; Zehner, Claus; Davidson, Malcolm; Goryl, Philippe; Koetz, Benjamin; Pinnock, Simon
2017-04-01
The prime objective of the Scientific Exploitation of Operational Missions (SEOM) element of ESA's fourth Earth Observation Envelope Programme (EOEP4) is to federate, support and expand the international research community built up over the last 25 years exploiting ESA's EO missions. SEOM enables the science community to address new scientific research areas that are opened by the free and open access to data from operational EO missions. Based on community-wide recommendations, gathered through a series of international thematic workshops and scientific user consultation meetings, key research studies have been launched over the last years to further exploit data from the Sentinels (http://seom.esa.int/). During 2016, several science user consultation workshops were organized, new results from scientific studies were published, and open-source multi-mission scientific toolboxes were distributed (SNAP: 80,000 users from 190 countries). In addition, the first ESA Massive Open Online Course on climate from space was deployed (20,000 participants), and the second EO Open Science conference was organized at ESA in September 2016, bringing together young EO scientists and data scientists. The new EOEP5 exploitation element, approved in 2016 and starting in 2017, takes stock of all precursor activities in EO Open Science and Innovation; in particular, a workplan for ESA scientific exploitation activities has been presented to Member States, taking full benefit of the latest information and communication technology. The results and highlights from current scientific exploitation activities will be presented, and an outlook on the upcoming activities under the new EOEP5 exploitation element will be given.
Scheftic, Charlie M.; Brinson, L. Catherine
2018-01-01
The cost of specialized scientific equipment can be high, and with limited funding resources, researchers and students are often unable to access or purchase the ideal equipment for their projects. In the fields of materials science and mechanical engineering, fundamental equipment such as tensile testing devices can cost tens to hundreds of thousands of dollars. While a research lab often has access to a large-scale testing machine suitable for conventional samples, loading devices for meso- and micro-scale samples for in-situ testing with the myriad of microscopy tools are often hard to source and cost prohibitive. Open-source software has allowed for great strides in the reduction of costs associated with software development, and open-source hardware and additive manufacturing have the potential to similarly reduce the costs of scientific equipment and increase the accessibility of scientific research. To investigate the feasibility of open-source hardware, a micro-tensile tester was designed with a freely accessible computer-aided design package and manufactured with a desktop 3D-printer and off-the-shelf components. To our knowledge this is one of the first demonstrations of a tensile tester with additively manufactured components for scientific research. The capabilities of the tensile tester were demonstrated by investigating the mechanical properties of Graphene Oxide (GO) paper and thin films. A 3D printed tensile tester was successfully used in conjunction with an atomic force microscope to provide one of the first quantitative measurements of GO thin film buckling under compression. The tensile tester was also used in conjunction with an atomic force microscope to observe the change in surface topology of a GO paper in response to increasing tensile strain. No significant change in surface topology was observed, in contrast to prior hypotheses from the literature. Based on this result obtained with the new open source tensile stage we propose an alternative hypothesis we term ‘superlamellae consolidation' to explain the initial deformation of GO paper. The additively manufactured tensile tester represents cost savings of >99% compared to commercial solutions in its class and offers simple customization. However, continued development is needed for the tensile tester presented here to approach the technical specifications achievable with commercial solutions. PMID:29813103
Nandy, Krishanu; Collinson, David W; Scheftic, Charlie M; Brinson, L Catherine
2018-01-01
The cost of specialized scientific equipment can be high, and with limited funding resources, researchers and students are often unable to access or purchase the ideal equipment for their projects. In the fields of materials science and mechanical engineering, fundamental equipment such as tensile testing devices can cost tens to hundreds of thousands of dollars. While a research lab often has access to a large-scale testing machine suitable for conventional samples, loading devices for meso- and micro-scale samples for in-situ testing with the myriad of microscopy tools are often hard to source and cost prohibitive. Open-source software has allowed for great strides in the reduction of costs associated with software development, and open-source hardware and additive manufacturing have the potential to similarly reduce the costs of scientific equipment and increase the accessibility of scientific research. To investigate the feasibility of open-source hardware, a micro-tensile tester was designed with a freely accessible computer-aided design package and manufactured with a desktop 3D-printer and off-the-shelf components. To our knowledge this is one of the first demonstrations of a tensile tester with additively manufactured components for scientific research. The capabilities of the tensile tester were demonstrated by investigating the mechanical properties of Graphene Oxide (GO) paper and thin films. A 3D printed tensile tester was successfully used in conjunction with an atomic force microscope to provide one of the first quantitative measurements of GO thin film buckling under compression. The tensile tester was also used in conjunction with an atomic force microscope to observe the change in surface topology of a GO paper in response to increasing tensile strain. No significant change in surface topology was observed, in contrast to prior hypotheses from the literature. Based on this result obtained with the new open source tensile stage we propose an alternative hypothesis we term 'superlamellae consolidation' to explain the initial deformation of GO paper. The additively manufactured tensile tester represents cost savings of >99% compared to commercial solutions in its class and offers simple customization. However, continued development is needed for the tensile tester presented here to approach the technical specifications achievable with commercial solutions.
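As a worked sketch of the quantities such a tensile tester yields, the snippet below computes engineering stress, strain, and a modulus from the initial linear region. All numbers are made up for illustration, not data from the paper:

```python
import numpy as np

# Illustrative readings: load-cell force and crosshead extension.
force_N = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
extension_mm = np.array([0.0, 0.02, 0.04, 0.06, 0.08])

area_mm2 = 2.0 * 0.05    # sample width x thickness (assumed), in mm^2
gauge_length_mm = 10.0   # assumed gauge length

stress_MPa = force_N / area_mm2      # N/mm^2 is numerically MPa
strain = extension_mm / gauge_length_mm

# Slope of the stress-strain curve in the linear region = Young's modulus.
modulus_MPa, _ = np.polyfit(strain, stress_MPa, 1)
print(f"E = {modulus_MPa / 1000:.1f} GPa")  # -> E = 2.5 GPa for these numbers
```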
Development of an Open Source, Air-Deployable Weather Station
NASA Astrophysics Data System (ADS)
Krejci, A.; Lopez Alcala, J. M.; Nelke, M.; Wagner, J.; Udell, C.; Higgins, C. W.; Selker, J. S.
2017-12-01
We created a packaged weather station intended to be deployed in the air on tethered systems. The device incorporates lightweight sensors and parts and runs for up to 24 hours on lithium polymer batteries, allowing the entire package to be supported by a thin fiber. As the fiber does not provide a stable platform, attitude data (pitch and roll) are determined with an embedded inertial measurement unit, in addition to the typical weather parameters (e.g. temperature, pressure, humidity, wind speed, and wind direction). All designs, including electronics, CAD drawings, and descriptions of assembly, are open source and can be found on the OPEnS lab website at http://www.open-sensing.org/lowcost-weather-station/. The Openly Published Environmental Sensing Lab (OPEnS: Open-Sensing.org) expands the possibilities of scientific observation of our Earth, transforming the technology, methods, and culture by combining open-source development and cutting-edge technology. New OPEnS labs are now being established in India, France, Switzerland, the Netherlands, and Ghana.
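For readers building a similar station: the pitch and roll estimate reduces to the standard accelerometer tilt equations. A minimal Python sketch is given below; the station's actual firmware is published on the OPEnS site, and the axis signs here are an assumption that depends on how the IMU is mounted.

    import math

    def pitch_roll_from_accel(ax, ay, az):
        # Standard tilt equations for a 3-axis accelerometer at rest.
        # Axis conventions are an assumption; match them to the IMU mounting.
        pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
        roll = math.atan2(ay, az)
        return pitch, roll

    # Sensor hanging level, gravity along z: pitch = roll = 0.
    print(pitch_roll_from_accel(0.0, 0.0, 9.81))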
SensorWeb Hub infrastructure for open access to scientific research data
NASA Astrophysics Data System (ADS)
de Filippis, Tiziana; Rocchi, Leandro; Rapisardi, Elena
2015-04-01
The sharing of research data is a new challenge for the scientific community, which may benefit from a large amount of information to address environmental and sustainability issues in agricultural and urban contexts. A prerequisite for this challenge is the development of an infrastructure that ensures access, management and preservation of data, with technical support for a coordinated and harmonious management of data that, in the framework of Open Data policies, encourages reuse and collaboration. The neogeography and citizens-as-sensors approaches highlight that new data sources need a new set of tools and practices to collect, validate, categorize, and access these "crowdsourced" data, which complement the data sets produced in the scientific field, thus "feeding" the overall data available for analysis and research. When the scientific community embraces the dimensions of collaboration and sharing, access and re-use, in order to adopt the open innovation approach, it should redesign and reshape its data management processes: the challenges of technological and cultural innovation, enabled by web 2.0 technologies, lead to a scenario where the sharing of structured and interoperable data constitutes the unavoidable building block of a new paradigm of scientific research. In this perspective the Institute of Biometeorology, CNR, whose aim is to contribute to the sharing and development of research data, has developed the "SensorWebHub" (SWH) infrastructure to support the scientific activities carried out in several research projects at national and international level. It is designed to manage both mobile and fixed open source meteorological and environmental sensors, in order to integrate the existing agro-meteorological and urban monitoring networks. The proposed architecture uses open source tools to ensure sustainability in the development and deployment of web applications with geographic features and custom analyses, as requested by the different research projects. The SWH components are organized in a typical client-server architecture and interact from the sensing process through to the representation of results to end-users. The web application enables users to view and analyse the data stored in the GeoDB. The interface is designed following Internet browser specifications, allowing the visualization of collected data in different formats (tabular, chart and geographic map). The services for the dissemination of geo-referenced information adopt the OGC specifications. SWH is a bottom-up collaborative initiative to share real-time research data and pave the way for an open innovation approach in scientific research. To date this framework has been used for several WebGIS applications and web apps for environmental monitoring at different temporal and spatial scales.
4onse: four times open & non-conventional technology for sensing the environment
NASA Astrophysics Data System (ADS)
Cannata, Massimiliano; Ratnayake, Rangageewa; Antonovic, Milan; Strigaro, Daniele; Cardoso, Mirko; Hoffmann, Marcus
2017-04-01
The availability of complete, high-quality and dense hydro-meteorological monitoring data is essential to address a number of practical issues including, but not limited to, flood-water and urban drainage management, climate change impact assessment, early warning and risk management, now-casting and weather prediction. Thanks to recent technological advances such as the Internet of Things, Big Data and ubiquitous Internet, non-conventional monitoring systems based on open technologies and low-cost sensors may represent a great opportunity, either as a complement to authoritative monitoring networks or as a vital source of information wherever existing monitoring networks are in decline or missing entirely. Nevertheless, the scientific literature on such open and non-conventional monitoring systems is still limited and often relates to prototype engineering and testing in rather limited case studies. For this reason the 4onse project aims at integrating existing open technologies in the fields of Free & Open Source Software, Open Hardware, Open Data, and Open Standards, and at evaluating this kind of system in a real case (about 30 stations) over a medium period of two years, in order to better understand its strengths, weaknesses and applicability in terms of data quality, system durability, management costs, performance and sustainability. The ultimate objective is to contribute to the adoption of non-conventional monitoring systems based on the four open technologies.
Moutsatsos, Ioannis K; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J; Jenkins, Jeremy L; Holway, Nicholas; Tallarico, John; Parker, Christian N
2017-03-01
High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an "off-the-shelf," open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community.
Moutsatsos, Ioannis K.; Hossain, Imtiaz; Agarinis, Claudia; Harbinski, Fred; Abraham, Yann; Dobler, Luc; Zhang, Xian; Wilson, Christopher J.; Jenkins, Jeremy L.; Holway, Nicholas; Tallarico, John; Parker, Christian N.
2016-01-01
High-throughput screening generates large volumes of heterogeneous data that require a diverse set of computational tools for management, processing, and analysis. Building integrated, scalable, and robust computational workflows for such applications is challenging but highly valuable. Scientific data integration and pipelining facilitate standardized data processing, collaboration, and reuse of best practices. We describe how Jenkins-CI, an “off-the-shelf,” open-source, continuous integration system, is used to build pipelines for processing images and associated data from high-content screening (HCS). Jenkins-CI provides numerous plugins for standard compute tasks, and its design allows the quick integration of external scientific applications. Using Jenkins-CI, we integrated CellProfiler, an open-source image-processing platform, with various HCS utilities and a high-performance Linux cluster. The platform is web-accessible, facilitates access and sharing of high-performance compute resources, and automates previously cumbersome data and image-processing tasks. Imaging pipelines developed using the desktop CellProfiler client can be managed and shared through a centralized Jenkins-CI repository. Pipelines and managed data are annotated to facilitate collaboration and reuse. Limitations with Jenkins-CI (primarily around the user interface) were addressed through the selection of helper plugins from the Jenkins-CI community. PMID:27899692
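To make the pipelined compute step concrete: a Jenkins-CI job in a setup like the one described typically wraps a headless CellProfiler invocation. A minimal Python sketch of such a step is shown below, using CellProfiler's documented headless flags; the pipeline name and data paths are hypothetical placeholders that a Jenkins job would inject as build parameters.

    import subprocess
    from pathlib import Path

    # Hypothetical locations for one plate of HCS images.
    pipeline = Path("pipelines/nuclei_count.cppipe")
    images = Path("/data/hcs/plate_0001")
    results = Path("/data/hcs/results/plate_0001")
    results.mkdir(parents=True, exist_ok=True)

    # CellProfiler headless mode: -c (no GUI), -r (run the pipeline),
    # -p (pipeline file), -i/-o (default input/output folders).
    subprocess.run(
        ["cellprofiler", "-c", "-r", "-p", str(pipeline),
         "-i", str(images), "-o", str(results)],
        check=True,
    )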
Open-source hardware is a low-cost alternative for scientific instrumentation and research
USDA-ARS?s Scientific Manuscript database
Scientific research requires the collection of data in order to study, monitor, analyze, describe, or understand a particular process or event. Data collection efforts are often a compromise: manual measurements can be time-consuming and labor-intensive, resulting in data being collected at a low f...
Sustaining Open Source Communities through Hackathons - An Example from the ASPECT Community
NASA Astrophysics Data System (ADS)
Heister, T.; Hwang, L.; Bangerth, W.; Kellogg, L. H.
2016-12-01
The ecosystem surrounding a successful scientific open source software package combines both social and technical aspects. Much thought has been given to the technology side of writing sustainable software for large infrastructure projects and software libraries, but less to building the human capacity to perpetuate scientific software used in computational modeling. One effective format for building capacity is regular multi-day hackathons. Scientific hackathons bring together a group of science domain users and scientific software contributors to make progress on a specific software package. Innovation comes through the chance to work with established and new collaborators. Especially in domain sciences with small communities, hackathons give geographically distributed scientists an opportunity to connect face-to-face. They foster lively discussions amongst scientists with different expertise, promote new collaborations, and increase transparency in both the technical and scientific aspects of code development. ASPECT is an open source, parallel, extensible finite element code to simulate thermal convection whose development began in 2011 under the Computational Infrastructure for Geodynamics. ASPECT hackathons over the past 3 years have grown the number of authors to >50, training new code maintainers in the process. Hackathons begin with leaders establishing project-specific conventions for development, demonstrating the workflow for code contributions, and reviewing relevant technical skills. Each hackathon expands the developer community: over 20 scientists add >6,000 lines of code during the >1 week event. Participants grow comfortable contributing to the repository and over half continue to contribute afterwards. A high return rate of participants ensures continuity and stability of the group as well as mentoring for novice members. We hope to build other software communities on this model, but anticipate each will bring its own unique challenges.
Open-Source 3D-Printable Optics Equipment
Zhang, Chenlong; Anzalone, Nicholas C.; Faria, Rodrigo P.; Pearce, Joshua M.
2013-01-01
Just as the power of the open-source design paradigm has driven down the cost of software to the point that it is accessible to most people, the rise of open-source hardware is poised to drive down the cost of doing experimental science so as to expand access to everyone. To assist in this aim, this paper introduces a library of open-source 3-D-printable optics components. This library operates as a flexible, low-cost, public-domain tool set for developing both research and teaching optics hardware. First, the use of parametric open-source designs in an open-source computer-aided design package is described to customize the optics hardware for any application. Second, details are provided on the use of open-source 3-D printers (additive layer manufacturing) to fabricate the primary mechanical components, which are then combined to construct complex optics-related devices. Third, the use of an open-source electronics prototyping platform is illustrated for the control of optical experimental apparatuses. This study demonstrates an open-source optical library, which significantly reduces the costs associated with much optical equipment, while also enabling relatively easily adapted, customizable designs. The cost reductions in general are over 97%, with some components representing only 1% of the current commercial investment for optical products of similar function. The results of this study make it clear that this method of scientific hardware development enables a much broader audience than previous proprietary methods to participate in optical experimentation, on both research and teaching platforms. PMID:23544104
Open-source 3D-printable optics equipment.
Zhang, Chenlong; Anzalone, Nicholas C; Faria, Rodrigo P; Pearce, Joshua M
2013-01-01
Just as the power of the open-source design paradigm has driven down the cost of software to the point that it is accessible to most people, the rise of open-source hardware is poised to drive down the cost of doing experimental science so as to expand access to everyone. To assist in this aim, this paper introduces a library of open-source 3-D-printable optics components. This library operates as a flexible, low-cost, public-domain tool set for developing both research and teaching optics hardware. First, the use of parametric open-source designs in an open-source computer-aided design package is described to customize the optics hardware for any application. Second, details are provided on the use of open-source 3-D printers (additive layer manufacturing) to fabricate the primary mechanical components, which are then combined to construct complex optics-related devices. Third, the use of an open-source electronics prototyping platform is illustrated for the control of optical experimental apparatuses. This study demonstrates an open-source optical library, which significantly reduces the costs associated with much optical equipment, while also enabling relatively easily adapted, customizable designs. The cost reductions in general are over 97%, with some components representing only 1% of the current commercial investment for optical products of similar function. The results of this study make it clear that this method of scientific hardware development enables a much broader audience than previous proprietary methods to participate in optical experimentation, on both research and teaching platforms.
Open Knee: Open Source Modeling and Simulation in Knee Biomechanics.
Erdemir, Ahmet
2016-02-01
Virtual representations of the knee joint can provide clinicians, scientists, and engineers the tools to explore mechanical functions of the knee and its tissue structures in health and disease. Modeling and simulation approaches such as finite element analysis also provide the possibility to understand the influence of surgical procedures and implants on joint stresses and tissue deformations. A large number of knee joint models are described in the biomechanics literature. However, freely accessible, customizable, and easy-to-use models are scarce. Availability of such models can accelerate clinical translation of simulations, where labor-intensive reproduction of model development steps can be avoided. Interested parties can immediately utilize readily available models for scientific discovery and clinical care. Motivated by this gap, this study aims to describe an open source and freely available finite element representation of the tibiofemoral joint, namely Open Knee, which includes the detailed anatomical representation of the joint's major tissue structures and their nonlinear mechanical properties and interactions. Three use cases illustrate the customization potential of the model, its predictive capacity, and its scientific and clinical utility: prediction of joint movements during passive flexion, examining the role of meniscectomy on contact mechanics and joint movements, and understanding anterior cruciate ligament mechanics. A summary of scientific and clinically directed studies conducted by other investigators is also provided. The utilization of this open source model by groups other than its developers emphasizes the premise of model sharing as an accelerator of simulation-based medicine. Finally, the imminent need to develop next-generation knee models is noted. These are anticipated to incorporate individualized anatomy and tissue properties supported by specimen-specific joint mechanics data for evaluation, all acquired in vitro from varying age groups and pathological states.
PyEEG: an open source Python module for EEG/MEG feature extraction.
Bao, Forrest Sheng; Liu, Xin; Zhang, Christina
2011-01-01
Computer-aided diagnosis of neural diseases from EEG signals (or other physiological signals that can be treated as time series, e.g., MEG) is an emerging field that has gained much attention in past years. Extracting features is a key component in the analysis of EEG signals. In our previous works, we have implemented many EEG feature extraction functions in the Python programming language. As Python is gaining more ground in scientific computing, an open source Python module for extracting EEG features has the potential to save much time for computational neuroscientists. In this paper, we introduce PyEEG, an open source Python module for EEG feature extraction.
PyEEG: An Open Source Python Module for EEG/MEG Feature Extraction
Bao, Forrest Sheng; Liu, Xin; Zhang, Christina
2011-01-01
Computer-aided diagnosis of neural diseases from EEG signals (or other physiological signals that can be treated as time series, e.g., MEG) is an emerging field that has gained much attention in past years. Extracting features is a key component in the analysis of EEG signals. In our previous works, we have implemented many EEG feature extraction functions in the Python programming language. As Python is gaining more ground in scientific computing, an open source Python module for extracting EEG features has the potential to save much time for computational neuroscientists. In this paper, we introduce PyEEG, an open source Python module for EEG feature extraction. PMID:21512582
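A minimal usage sketch of the module described above (function names follow PyEEG's published API; exact signatures should be checked against the installed release):

    import numpy as np
    import pyeeg

    # A synthetic single-channel "EEG" trace: 10 s sampled at 128 Hz.
    fs = 128
    t = np.arange(0, 10, 1.0 / fs)
    signal = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)

    # Relative power in the classical frequency bands.
    band_edges = [0.5, 4, 7, 12, 30]  # delta/theta/alpha/beta edges, in Hz
    power, power_ratio = pyeeg.bin_power(signal, band_edges, fs)

    print("band powers:", power)
    print("Petrosian fractal dimension:", pyeeg.pfd(signal))
    print("Hurst exponent:", pyeeg.hurst(signal))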
Development of Internet Resources in Solar-Terrestrial Physics for Science and Education
NASA Astrophysics Data System (ADS)
Zaistev, A.; Ishkov, V.; Kozlov, A.; Obridko, V.; Odintsov, V.
Future development of research in solar-terrestrial physics (STP) will be motivated by the need for fundamental knowledge and by practical demands in the form of space weather. The public community has realized that outer-space disturbances affect the operation of high-technology systems integrated into everyday life, so Internet resources on solar-terrestrial physics are needed as an open scientific and public domain. Recent achievements in STP have led to a burst of data sources, and many different types of information are now freely available on the Internet: solar images from the SOHO and GOES-12 satellites, WIND and ACE interplanetary data, satellite and ground-based magnetic field variations, real-time aurora images, ionospheric data and much more. In this paper we present some experience in establishing a Russian-language open scientific and public domain on the Internet, which can serve a better understanding of STP in the wide scientific community and among the general public, including different media sources. There are now more than one hundred sites which present STP data: the Space Research Institute (www.iki.rssi.ru), IZMIRAN (www.izmiran.rssi.ru), the Institute of Solar-Terrestrial Physics (www.iszf.irk.ru), the Institute of Nuclear Physics at Moscow University (http://alpha.npi.msu.su) and many more. Based on our own experience and that of our colleagues, we decided to create information resources in solar-terrestrial physics as an open scientific and public domain. The main directions of our activity are as follows: to produce catalogues of Internet resources with detailed descriptions of their content in Russian; to publish the list of Russian institutes working in STP; to present a biographical dictionary of Russian scientists in STP; to create an interactive forum for discussion of the latest scientific results; to form a team of authors willing to publish summarizing analytical papers on STP problems; to establish a regular newsletter with open circulation among professionals and people interested in STP; and to provide scientific coordination between Russian institutes according to the rules of the road adopted by the Solar-Terrestrial Scientific Council. We strongly advocate constructing such Internet resources in native languages, as they serve the national level due to their basic funding source. On the other hand, our experience might be useful for other nations with the same aims. One goal of our project is to establish a better public understanding of STP through more open and wide public access to the latest scientific results. The realization of this project is supported by the Russian Fund of Basic Research (grant N 02-07-90232) for the period 2002-2004 and includes results also supported by RFBR previously.
Nuñez, Isaac; Matute, Tamara; Herrera, Roberto; Keymer, Juan; Marzullo, Timothy; Rudge, Timothy; Federici, Fernán
2017-01-01
The advent of easy-to-use open source microcontrollers, off-the-shelf electronics and customizable manufacturing technologies has facilitated the development of inexpensive scientific devices and laboratory equipment. In this study, we describe an imaging system that integrates low-cost and open-source hardware, software and genetic resources. The multi-fluorescence imaging system consists of readily available 470 nm LEDs, a Raspberry Pi camera and a set of filters made with low-cost acrylics. This device allows imaging at scales ranging from single colonies to entire plates. We developed a set of genetic components (e.g. promoters, coding sequences, terminators) and vectors following the standard framework of Golden Gate, which allowed the fabrication of genetic constructs in a combinatorial, low-cost and robust manner. In order to provide simultaneous imaging of multiple wavelength signals, we screened a series of long Stokes shift fluorescent proteins that could be combined with cyan/green fluorescent proteins. We found CyOFP1, mBeRFP and sfGFP to be the most compatible set for 3-channel fluorescent imaging. We developed open source Python code to operate the hardware and run time-lapse experiments with automated control of illumination and camera, and a Python module to analyze data and extract meaningful biological information. To demonstrate the potential application of this integrated system, we tested its performance on a diverse range of imaging assays often used in disciplines such as microbial ecology, microbiology and synthetic biology. We also assessed its potential use in a high school environment to teach biology, hardware design, optics, and programming. Together, these results demonstrate the successful integration of open source hardware, software, genetic resources and customizable manufacturing to obtain a powerful, low-cost and robust system for education, scientific research and bioengineering. All the resources developed here are available under open source licenses.
Nuñez, Isaac; Matute, Tamara; Herrera, Roberto; Keymer, Juan; Marzullo, Timothy; Rudge, Timothy; Federici, Fernán
2017-01-01
The advent of easy-to-use open source microcontrollers, off-the-shelf electronics and customizable manufacturing technologies has facilitated the development of inexpensive scientific devices and laboratory equipment. In this study, we describe an imaging system that integrates low-cost and open-source hardware, software and genetic resources. The multi-fluorescence imaging system consists of readily available 470 nm LEDs, a Raspberry Pi camera and a set of filters made with low-cost acrylics. This device allows imaging at scales ranging from single colonies to entire plates. We developed a set of genetic components (e.g. promoters, coding sequences, terminators) and vectors following the standard framework of Golden Gate, which allowed the fabrication of genetic constructs in a combinatorial, low-cost and robust manner. In order to provide simultaneous imaging of multiple wavelength signals, we screened a series of long Stokes shift fluorescent proteins that could be combined with cyan/green fluorescent proteins. We found CyOFP1, mBeRFP and sfGFP to be the most compatible set for 3-channel fluorescent imaging. We developed open source Python code to operate the hardware and run time-lapse experiments with automated control of illumination and camera, and a Python module to analyze data and extract meaningful biological information. To demonstrate the potential application of this integrated system, we tested its performance on a diverse range of imaging assays often used in disciplines such as microbial ecology, microbiology and synthetic biology. We also assessed its potential use in a high school environment to teach biology, hardware design, optics, and programming. Together, these results demonstrate the successful integration of open source hardware, software, genetic resources and customizable manufacturing to obtain a powerful, low-cost and robust system for education, scientific research and bioengineering. All the resources developed here are available under open source licenses. PMID:29140977
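The project's own control code is released under open source licenses; as an illustration of the time-lapse loop such a system needs, here is a minimal sketch using the Raspberry Pi camera and GPIO libraries. The LED pin number and the intervals are assumptions of this sketch, not the authors' values.

    import time
    from picamera import PiCamera   # Raspberry Pi camera library
    import RPi.GPIO as GPIO         # switches the 470 nm excitation LEDs

    LED_PIN = 18  # hypothetical GPIO pin driving the LED array

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(LED_PIN, GPIO.OUT)
    camera = PiCamera()

    # Illuminate, settle, capture, switch off, wait: 48 frames at 30 min.
    for frame in range(48):
        GPIO.output(LED_PIN, GPIO.HIGH)
        time.sleep(2)  # let the illumination stabilize
        camera.capture("plate_{:04d}.jpg".format(frame))
        GPIO.output(LED_PIN, GPIO.LOW)
        time.sleep(30 * 60 - 2)

    GPIO.cleanup()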
The ImageJ ecosystem: an open platform for biomedical image analysis
Schindelin, Johannes; Rueden, Curtis T.; Hiner, Mark C.; Eliceiri, Kevin W.
2015-01-01
Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available, from commercial to academic, special-purpose to Swiss army knife, small to large, but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on the life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts life science, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. PMID:26153368
The ImageJ ecosystem: An open platform for biomedical image analysis.
Schindelin, Johannes; Rueden, Curtis T; Hiner, Mark C; Eliceiri, Kevin W
2015-01-01
Technology in microscopy advances rapidly, enabling increasingly affordable, faster, and more precise quantitative biomedical imaging, which necessitates correspondingly more-advanced image processing and analysis techniques. A wide range of software is available, from commercial to academic, special-purpose to Swiss army knife, small to large, but a key characteristic of software that is suitable for scientific inquiry is its accessibility. Open-source software is ideal for scientific endeavors because it can be freely inspected, modified, and redistributed; in particular, the open-software platform ImageJ has had a huge impact on the life sciences, and continues to do so. From its inception, ImageJ has grown significantly due largely to being freely available and its vibrant and helpful user community. Scientists as diverse as interested hobbyists, technical assistants, students, scientific staff, and advanced biology researchers use ImageJ on a daily basis, and exchange knowledge via its dedicated mailing list. Uses of ImageJ range from data visualization and teaching to advanced image processing and statistical analysis. The software's extensibility continues to attract biologists at all career stages as well as computer scientists who wish to effectively implement specific image-processing algorithms. In this review, we use the ImageJ project as a case study of how open-source software fosters its suites of software tools, making multitudes of image-analysis technology easily accessible to the scientific community. We specifically explore what makes ImageJ so popular, how it impacts the life sciences, how it inspires other projects, and how it is self-influenced by coevolving projects within the ImageJ ecosystem. © 2015 Wiley Periodicals, Inc.
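The ecosystem's extensibility now reaches Python as well, through the PyImageJ gateway. A hedged sketch of opening an image and pulling it into NumPy follows; the file name is a placeholder, the gateway needs a local Java installation, and API details vary between PyImageJ releases.

    import imagej  # PyImageJ, a Python gateway into the ImageJ ecosystem

    # Start an ImageJ gateway (recent PyImageJ releases default to
    # headless mode and resolve ImageJ's components via Maven).
    ij = imagej.init()

    # Open an image and convert it to a NumPy-backed array for analysis.
    img = ij.io().open("sample.tif")   # placeholder path
    arr = ij.py.from_java(img)
    print(arr.shape, float(arr.mean()))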
Open-Source Syringe Pump Library
Wijnen, Bas; Hunt, Emily J.; Anzalone, Gerald C.; Pearce, Joshua M.
2014-01-01
This article explores a new open-source method for developing and manufacturing high-quality scientific equipment suitable for use in virtually any laboratory. A syringe pump was designed using freely available open-source computer aided design (CAD) software and manufactured using an open-source RepRap 3-D printer and readily available parts. The design, bill of materials and assembly instructions are globally available to anyone wishing to use them. Details are provided covering the use of the CAD software and the RepRap 3-D printer. The use of an open-source Raspberry Pi computer as a wireless control device is also illustrated. Performance of the syringe pump was assessed and the methods used for assessment are detailed. The cost of the entire system, including the controller and web-based control interface, is on the order of 5% or less than one would expect to pay for a commercial syringe pump having similar performance. The design should suit the needs of any research activity requiring a syringe pump, including carefully controlled dosing of reagents or pharmaceuticals and delivery of viscous 3-D printer media, among other applications. PMID:25229451
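At its core the wireless controller reduces to pulsing a stepper driver; a minimal Python sketch of that dosing loop is given below. The GPIO pins and the steps-per-millilitre calibration constant are hypothetical and would be measured for a particular syringe bore and drive screw.

    import time
    import RPi.GPIO as GPIO

    STEP_PIN, DIR_PIN = 20, 21   # hypothetical wiring to the stepper driver
    STEPS_PER_ML = 1600          # calibration constant: syringe and screw dependent

    GPIO.setmode(GPIO.BCM)
    GPIO.setup([STEP_PIN, DIR_PIN], GPIO.OUT)

    def dispense(volume_ml, ml_per_min):
        # Pulse the driver to push a given volume at a given flow rate.
        steps = int(volume_ml * STEPS_PER_ML)
        period = 60.0 / (ml_per_min * STEPS_PER_ML)  # seconds per step
        GPIO.output(DIR_PIN, GPIO.HIGH)              # HIGH = dispense direction
        for _ in range(steps):
            GPIO.output(STEP_PIN, GPIO.HIGH)
            time.sleep(period / 2)
            GPIO.output(STEP_PIN, GPIO.LOW)
            time.sleep(period / 2)

    dispense(0.5, 1.0)  # 0.5 mL at 1 mL/min
    GPIO.cleanup()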
3D Scientific Visualization with Blender
NASA Astrophysics Data System (ADS)
Kent, Brian R.
2015-03-01
This is the first book written on using Blender (an open source visualization suite widely used in the entertainment and gaming industries) for scientific visualization. It is a practical and interesting introduction to Blender for understanding key parts of 3D rendering and animation that pertain to the sciences via step-by-step guided tutorials. 3D Scientific Visualization with Blender takes you through an understanding of 3D graphics and modelling for different visualization scenarios in the physical sciences.
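Blender exposes its entire scene graph through a Python API, which is what makes it scriptable for data-driven visualization of the kind the book teaches. A tiny illustrative sketch (run inside Blender's own Python interpreter; operator names follow Blender 2.8+, and the data points are placeholders):

    import bpy  # Blender's Python API; runs inside Blender, not a plain shell

    # Remove the default objects, then place one sphere per data point,
    # with the sphere radius encoding the data value.
    bpy.ops.object.select_all(action='SELECT')
    bpy.ops.object.delete()

    data = [(0, 0, 0, 1.0), (2, 0, 0, 0.5), (4, 0, 0, 1.5)]  # x, y, z, value
    for x, y, z, value in data:
        bpy.ops.mesh.primitive_uv_sphere_add(radius=0.3 * value,
                                             location=(x, y, z))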
Nyström, Pär; Falck-Ytter, Terje; Gredebäck, Gustaf
2016-06-01
This article describes a new open source scientific workflow system, the TimeStudio Project, dedicated to the behavioral and brain sciences. The program is written in MATLAB and features a graphical user interface for the dynamic pipelining of computer algorithms developed as TimeStudio plugins. TimeStudio includes both a set of general plugins (for reading data files, modifying data structures, visualizing data structures, etc.) and a set of plugins specifically developed for the analysis of event-related eyetracking data as a proof of concept. It is possible to create custom plugins to integrate new or existing MATLAB code anywhere in a workflow, making TimeStudio a flexible workbench for organizing and performing a wide range of analyses. The system also features an integrated sharing and archiving tool for TimeStudio workflows, which can be used to share workflows both during the data analysis phase and after scientific publication. TimeStudio thus facilitates the reproduction and replication of scientific studies, increases the transparency of analyses, and reduces individual researchers' analysis workload. The project website (http://timestudioproject.com) contains the latest releases of TimeStudio, together with documentation and user forums.
Design and Deployment of a General Purpose, Open Source LoRa to Wi-Fi Hub and Data Logger
NASA Astrophysics Data System (ADS)
DeBell, T. C.; Udell, C.; Kwon, M.; Selker, J. S.; Lopez Alcala, J. M.
2017-12-01
Methods and technologies facilitating internet connectivity and near-real-time status updates for in situ environmental sensor data are of increasing interest in Earth science. However, open-source, do-it-yourself technologies that enable plug-and-play functionality for web-connected sensors and devices remain largely inaccessible for typical researchers in our community. The Openly Published Environmental Sensing Lab at Oregon State University (OPEnS Lab) constructed an open-source 900 MHz Long Range Radio (LoRa) receiver hub with SD card data logger, Ethernet and Wi-Fi shield, and 3D-printed enclosure that dynamically uploads transmissions from multiple wirelessly connected environmental sensing devices. Data transmissions may be received from devices up to 20 km away. The hub time-stamps all transmissions, saves them to SD card, and uploads them to a Google Drive spreadsheet to be accessed in near-real-time by researchers and geovisualization applications (such as ArcGIS) for access, visualization, and analysis. This research expands the possibilities of scientific observation of our Earth, transforming the technology, methods, and culture by combining open-source development and cutting-edge technology. This poster details our methods and evaluates the use of 3D printing, the Arduino Integrated Development Environment (IDE), Adafruit's open-hardware Feather development boards, and the WIZnet W5500 Ethernet shield for designing this open-source, general-purpose LoRa to Wi-Fi data logger.
The SCEC/UseIT Intern Program: Creating Open-Source Visualization Software Using Diverse Resources
NASA Astrophysics Data System (ADS)
Francoeur, H.; Callaghan, S.; Perry, S.; Jordan, T.
2004-12-01
The Southern California Earthquake Center undergraduate IT intern program (SCEC UseIT) conducts IT research to benefit collaborative earth science research. Through this program, interns have developed real-time, interactive, 3D visualization software using open-source tools. Dubbed LA3D, a distribution of this software is now in use by the seismic community. LA3D enables the user to interactively view Southern California datasets and models of importance to earthquake scientists, such as faults, earthquakes, fault blocks, digital elevation models, and seismic hazard maps. LA3D is now being extended to support visualizations anywhere on the planet. The new software, called SCEC-VIDEO (Virtual Interactive Display of Earth Objects), makes use of a modular, plugin-based software architecture which supports easy development and integration of new data sets. Currently SCEC-VIDEO is in beta testing, with a full open-source release slated for the future. Both LA3D and SCEC-VIDEO were developed using a wide variety of software technologies. These, which included relational databases, web services, software management technologies, and 3-D graphics in Java, were necessary to integrate the heterogeneous array of data sources which comprise our software. Currently the interns are working to integrate new technologies and larger data sets to increase software functionality and value. In addition, both LA3D and SCEC-VIDEO allow the user to script and create movies. Thus program interns with computer science backgrounds have been writing software while interns with other interests, such as cinema, geology, and education, have been making movies that have proved of great use in scientific talks, media interviews, and education. Thus, SCEC UseIT incorporates a wide variety of scientific and human resources to create products of value to the scientific and outreach communities. The program plans to continue with its interdisciplinary approach, increasing the relevance of the software and expanding its use in the scientific community.
Memorandum "Open Metadata". Open Access to Documentation Forms and Item Catalogs in Healthcare.
Dugas, M; Jöckel, K-H; Friede, T; Gefeller, O; Kieser, M; Marschollek, M; Ammenwerth, E; Röhrig, R; Knaup-Gregori, P; Prokosch, H-U
2015-01-01
At present, most documentation forms and item catalogs in healthcare are not accessible to the public. This applies to assessment forms of routine patient care as well as case report forms (CRFs) of clinical and epidemiological studies. On behalf of the German chairs of Medical Informatics, Biometry and Epidemiology, six recommendations for developers and users of documentation forms in healthcare were developed. Open access to medical documentation forms could substantially improve information systems in healthcare and medical research networks. Therefore these forms should be made available to the scientific community, their use should not be unduly restricted, they should be published in a sustainable way using international standards, and sources of documentation forms should be referenced in scientific publications.
An Open and Holistic Approach for Geo and Space Sciences
NASA Astrophysics Data System (ADS)
Ritschel, Bernd; Seelus, Christoph; Neher, Günther; Toshihiko, Iyemori; Yatagai, Akiyo; Koyama, Yukinobu; Murayama, Yasuhiro; King, Todd; Hughes, Steve; Fung, Shing; Galkin, Ivan; Hapgood, Mike; Belehaki, Anna
2016-04-01
Geo and space sciences have thus far been very successful, even though an open, cross-domain and holistic approach often did not play an essential role. But this situation is changing rapidly. The research focus is shifting to more complex, non-linear, multi-domain phenomena, such as climate change or the space environment. This kind of phenomenon can only be understood step by step using the holistic idea. So, what is necessary for a successful cross-domain and holistic approach in the geo and space sciences? Research and science in general are becoming more and more dependent on a rich fundus of multi-domain data sources, related context information and the use of highly advanced technologies in data processing. Buzzword phrases such as Big Data and Deep Learning reflect this development. Big Data also addresses the real exponential growth of data and information produced by measurements or simulations. Deep Learning technology may help to detect new patterns and relationships in data describing highly sophisticated natural phenomena. Furthermore, we should not forget that science and the humanities are only two sides of the same medal in the continuing human process of knowledge discovery. The concept of Open Data, or in particular the open access to scientific data, addresses the free and open availability of (at least publicly funded and generated) data. The open availability of data covers the free use, reuse and redistribution of data, principles which were established with the formation of the World Data Centers already more than 50 years ago. We should not forget that the foundation for open data is the responsibility, from the individual scientist up to the big science institutions and organizations, for a sustainable management of data. Other challenges are discovering and collecting the appropriate data, preferably all of them or at least the majority of the right data. Therefore a network of individual or, even better, institutional catalog-based and at least domain-specific data servers is necessary. In the times of the WWW, or nowadays the Semantic Web, context-enriched and mashed-up open data catalogs pointing to the appropriate data sources will, step by step, help to overcome the users' burden of finding the right data. Furthermore, the Semantic Web provides an interoperable and universal format for data and metadata. The Resource Description Framework (RDF) inherently enables domain and cross-domain mashups of data, as realized, e.g., in the Linked Open Data project. Scientific work and the resulting papers in the geo and space domain are often based on data, physical models and previous publications, which in turn have depended on data, models and publications. So, in order to guarantee a high quality of scientific work, a complete verification process for the results is necessary. This is nothing new, but in times of Big Data it is a real challenge. So, what do we need for a complete verification of presented results? Above all, we need all the original data which have been used. But it is also necessary to get complete information about the context of the research objectives and the resulting constraints in the preparation of the raw data. Furthermore, we need knowledge about the methods and the corresponding processing software which have been used to generate the results. The Open Data approach, enriched by the Open Archive idea, provides the concept for sustainable and verifiable scientific work. Open Archive, of course, stands for the free availability of scientific papers.
But furthermore it focuses on mechanisms and methods, within the realm of scientific publications, for referencing and providing the underlying data, methods and software. Such reference mechanisms include the use of Digital Object Identifiers (DOI) or Uniform Resource Identifiers (URI) within the Semantic Web, in our case for geo and space science data, but also for methods and software code. Nowadays, more and more open and private publishers demand such references in preparation for the publishing process. In addition, references to well-documented earth and space science data are available via an increasing number of data publications. This approach serves both the institutional geo and space data centers, which increase their visibility and importance, and the scientists, who will find the right, already DOI-referenced data in the appropriate data journals. The Open Data and Open Archive approaches finally merge in the concept of Open Science. Open Science emphasizes the open sharing of knowledge of all kinds, based on transparent multi-disciplinary and cross-domain scientific work. But Open Science is not just an idea; it also stands for a variety of projects which follow the rules of Open Science, such as open methodology, open source, open data, open access, open peer review and open educational resources. Open Science also demands a new culture of scientific collaboration based on social media and the use of shared cloud technology for data storage and computing. But we should not forget that the WWW is not a one-way road. The more data, methods and software for science research become freely available on the Internet, the more chances for commercial or even destructive use of scientific data are opened. Already now, the giant search engine providers such as Google or Microsoft are collecting, storing and analyzing all data available on the net. The usage of Deep Learning for the detection of semantic coherence in data, e.g. for the creation of personalized on-time and on-location predictions using neural networks and artificial intelligence methods, should not be reserved for them, but should also be used within Open Science for the creation of new scientific knowledge. Open Science does not mean just dumping our scientific data, information and knowledge onto the Web. Far from it: we are still responsible for a sustainable handling of our data for the benefit of humankind. The usage of the principles of Open Science is demonstrated on the scientific and software engineering activities for the mashup of the Japanese IUGONET, the European Union ESPAS and the GFZ ISDC related data servers, covering different geo and space science domains.
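As a small illustration of the RDF / Linked Open Data machinery mentioned above, the following Python sketch (using the rdflib library) describes a dataset with a DOI-based URI; the DOI and the field values are placeholders, not records from any of the named data centers.

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import DCTERMS, RDF

    DCAT = Namespace("http://www.w3.org/ns/dcat#")

    g = Graph()
    dataset = URIRef("https://doi.org/10.5880/EXAMPLE.DOI")  # placeholder DOI
    g.add((dataset, RDF.type, DCAT.Dataset))
    g.add((dataset, DCTERMS.title,
           Literal("Ground magnetometer data, station XYZ")))  # placeholder
    g.add((dataset, DCTERMS.subject, Literal("solar-terrestrial physics")))

    # Serialize as Turtle, a usual exchange form for such catalog entries.
    print(g.serialize(format="turtle"))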
The Privacy and Security Implications of Open Data in Healthcare.
Kobayashi, Shinji; Kane, Thomas B; Paton, Chris
2018-04-22
The International Medical Informatics Association (IMIA) Open Source Working Group (OSWG) initiated a group discussion to discuss current privacy and security issues in the open data movement in the healthcare domain from the perspective of the OSWG membership. Working group members independently reviewed the recent academic and grey literature and sampled a number of current large-scale open data projects to inform the working group discussion. This paper presents an overview of open data repositories and a series of short case reports to highlight relevant issues present in the recent literature concerning the adoption of open approaches to sharing healthcare datasets. Important themes that emerged included data standardisation, the inter-connected nature of the open source and open data movements, and how publishing open data can impact on the ethics, security, and privacy of informatics projects. The open data and open source movements in healthcare share many common philosophies and approaches including developing international collaborations across multiple organisations and domains of expertise. Both movements aim to reduce the costs of advancing scientific research and improving healthcare provision for people around the world by adopting open intellectual property licence agreements and codes of practice. Implications of the increased adoption of open data in healthcare include the need to balance the security and privacy challenges of opening data sources with the potential benefits of open data for improving research and healthcare delivery. Georg Thieme Verlag KG Stuttgart.
iGlobe Interactive Visualization and Analysis of Spatial Data
NASA Technical Reports Server (NTRS)
Hogan, Patrick
2012-01-01
iGlobe is open-source software built on NASA World Wind virtual globe technology. iGlobe provides a growing set of tools for weather science, climate research, and agricultural analysis. Until now, these types of sophisticated tools have been developed in isolation by national agencies, academic institutions, and research organizations. By providing an open-source solution to analyze and visualize weather, climate, and agricultural data, the scientific and research communities can more readily advance the solutions needed to better understand the dynamics of our home planet, Earth.
OpenSim: open-source software to create and analyze dynamic simulations of movement.
Delp, Scott L; Anderson, Frank C; Arnold, Allison S; Loan, Peter; Habib, Ayman; John, Chand T; Guendelman, Eran; Thelen, Darryl G
2007-11-01
Dynamic simulations of movement allow one to study neuromuscular coordination, analyze athletic performance, and estimate internal loading of the musculoskeletal system. Simulations can also be used to identify the sources of pathological movement and establish a scientific basis for treatment planning. We have developed a freely available, open-source software system (OpenSim) that lets users develop models of musculoskeletal structures and create dynamic simulations of a wide variety of movements. We are using this system to simulate the dynamics of individuals with pathological gait and to explore the biomechanical effects of treatments. OpenSim provides a platform on which the biomechanics community can build a library of simulations that can be exchanged, tested, analyzed, and improved through a multi-institutional collaboration. Developing software that enables a concerted effort from many investigators poses technical and sociological challenges. Meeting those challenges will accelerate the discovery of principles that govern movement control and improve treatments for individuals with movement pathologies.
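Recent OpenSim releases also ship scripting bindings; a minimal Python sketch of loading a model and enumerating its muscles is shown below. The model file named here is one of OpenSim's distributed examples and its availability on a given installation is an assumption of this sketch.

    import opensim as osim  # Python bindings shipped with recent OpenSim releases

    # Load a musculoskeletal model and initialize its dynamical state.
    model = osim.Model("arm26.osim")  # example model; file name is an assumption
    state = model.initSystem()

    # Enumerate the muscles the model defines.
    muscles = model.getMuscles()
    for i in range(muscles.getSize()):
        print(muscles.get(i).getName())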
Bhardwaj, Anshu; Scaria, Vinod; Raghava, Gajendra Pal Singh; Lynn, Andrew Michael; Chandra, Nagasuma; Banerjee, Sulagna; Raghunandanan, Muthukurussi V; Pandey, Vikas; Taneja, Bhupesh; Yadav, Jyoti; Dash, Debasis; Bhattacharya, Jaijit; Misra, Amit; Kumar, Anil; Ramachandran, Srinivasan; Thomas, Zakir; Brahmachari, Samir K
2011-09-01
It is being realized that the traditional closed-door and market-driven approaches to drug discovery may not be the best-suited model for diseases of the developing world such as tuberculosis and malaria, because most patients suffering from these diseases have poor paying capacity. To ensure that new drugs are created for patients suffering from these diseases, it is necessary to formulate an alternative paradigm for the drug discovery process. The current model, constrained by limitations on collaboration and on confidential sharing of resources, hampers the opportunity to bring in expertise from diverse fields, and these limitations hinder the possibility of lowering the cost of drug discovery. The Open Source Drug Discovery project initiated by the Council of Scientific and Industrial Research, India has adopted an open source model to power wide participation across geographical borders. Open Source Drug Discovery emphasizes integrative science through collaboration, open sharing, multi-faceted approaches and accruing benefits from advances on different fronts of new drug discovery. Because the open source model is based on community participation, it has the potential to self-sustain continuous development by generating a storehouse of alternatives in the continued pursuit of new drug discovery. Since the inventions are community generated, the new chemical entities developed by Open Source Drug Discovery will be taken up for clinical trials in a non-exclusive manner by multiple participating companies, with majority funding from Open Source Drug Discovery. This will ensure the availability of drugs through a lower-cost, community-driven drug discovery process for diseases afflicting people with poor paying capacity. Hopefully, what LINUX and the World Wide Web have done for information technology, Open Source Drug Discovery will do for drug discovery. Copyright © 2011 Elsevier Ltd. All rights reserved.
Gpufit: An open-source toolkit for GPU-accelerated curve fitting.
Przybylski, Adrian; Thiel, Björn; Keller-Findeisen, Jan; Stock, Bernd; Bates, Mark
2017-11-16
We present a general purpose, open-source software library for estimation of non-linear parameters by the Levenberg-Marquardt algorithm. The software, Gpufit, runs on a Graphics Processing Unit (GPU) and executes computations in parallel, resulting in a significant gain in performance. We measured a speed increase of up to 42 times when comparing Gpufit with an identical CPU-based algorithm, with no loss of precision or accuracy. Gpufit is designed such that it is easily incorporated into existing applications or adapted for new ones. Multiple software interfaces, including to C, Python, and Matlab, ensure that Gpufit is accessible from most programming environments. The full source code is published as an open source software repository, making its function transparent to the user and facilitating future improvements and extensions. As a demonstration, we used Gpufit to accelerate an existing scientific image analysis package, yielding significantly improved processing times for super-resolution fluorescence microscopy datasets.
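A hedged sketch of the Python interface mentioned above, batch-fitting many 1-D Gaussians in a single GPU call (the model and estimator identifiers follow the Gpufit documentation; they should be checked against the installed pygpufit version):

    import numpy as np
    import pygpufit.gpufit as gf

    # A batch of noisy 1-D Gaussian peaks to be fitted in parallel.
    n_fits, n_points = 10000, 30
    x = np.arange(n_points, dtype=np.float32)
    true = np.array([5, 15, 3, 1], dtype=np.float32)  # amp, center, width, offset
    clean = true[0] * np.exp(-0.5 * ((x - true[1]) / true[2]) ** 2) + true[3]
    data = (clean + np.random.normal(0, 0.2, (n_fits, n_points))).astype(np.float32)

    initial = np.tile(np.array([4, 14, 4, 0], dtype=np.float32), (n_fits, 1))

    # Levenberg-Marquardt fit of all curves in one call on the GPU.
    params, states, chi2, n_iter, runtime = gf.fit(
        data, None, gf.ModelID.GAUSS_1D, initial)

    print("fraction converged:", np.mean(states == 0))
    print("mean fitted parameters:", params.mean(axis=0))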
National Synchrotron Light Source II
Hill, John; Dooryhee, Eric; Wilkins, Stuart; Miller, Lisa; Chu, Yong
2018-01-16
NSLS-II is a synchrotron light source helping researchers explore solutions to the grand energy challenges faced by the nation, and open up new regimes of scientific discovery that will pave the way to discoveries in physics, chemistry, and biology, advances that will ultimately enhance national security and help drive the development of abundant, safe, and clean energy technologies.
A Shared Infrastructure for Federated Search Across Distributed Scientific Metadata Catalogs
NASA Astrophysics Data System (ADS)
Reed, S. A.; Truslove, I.; Billingsley, B. W.; Grauch, A.; Harper, D.; Kovarik, J.; Lopez, L.; Liu, M.; Brandt, M.
2013-12-01
The vast amount of science metadata can be overwhelming and highly complex. Comprehensive analysis and sharing of metadata is difficult since institutions often publish to their own repositories. There are many disjoint standards used for publishing scientific data, making it difficult to discover and share information from different sources. Services that publish metadata catalogs often have different protocols, formats, and semantics. The research community is limited by the exclusivity of separate metadata catalogs, and thus it is desirable to have federated search interfaces capable of unified search queries across multiple sources. Aggregation of metadata catalogs also enables users to critique metadata more rigorously. With these motivations in mind, the National Snow and Ice Data Center (NSIDC) and the Advanced Cooperative Arctic Data and Information Service (ACADIS) implemented two search interfaces for the community. Both the NSIDC Search and the ACADIS Arctic Data Explorer (ADE) use a common infrastructure, which keeps maintenance costs low. The search clients are designed to make OpenSearch requests against Solr, an open source search platform. Solr applies indexes to specific fields of the metadata, which in this instance optimizes queries containing keywords, spatial bounds and temporal ranges. NSIDC metadata is reused by both search interfaces, but the ADE also brokers additional sources. Users can quickly find relevant metadata with minimal effort, which ultimately lowers research costs. This presentation will highlight the reuse of data and code between NSIDC and ACADIS, discuss challenges and milestones for each project, and identify the creation and use of Open Source libraries.
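In practice, a unified query against such a shared Solr index is a plain HTTP request. A minimal Python sketch follows; the core name and the field names are assumptions about the schema for illustration, not NSIDC's actual configuration.

    import requests

    SOLR_URL = "http://localhost:8983/solr/metadata/select"  # hypothetical core

    # Keyword search with temporal and spatial filter queries, mirroring
    # the kinds of indexed fields described above.
    params = {
        "q": "sea ice extent",
        "fq": [
            "temporal_start:[2000-01-01T00:00:00Z TO *]",  # assumed field name
            "spatial_area:[60,-180 TO 90,180]",            # assumed field name
        ],
        "rows": 20,
        "wt": "json",
    }
    response = requests.get(SOLR_URL, params=params, timeout=10)
    for doc in response.json()["response"]["docs"]:
        print(doc.get("title"))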
NASA Astrophysics Data System (ADS)
Benveniste, J.; Regner, P.; Desnos, Y. L.
2015-12-01
The Scientific Exploitation of Operational Missions (SEOM) programme element (http://seom.esa.int/) is part of ESA's Fourth Earth Observation Envelope Programme (2013-2017). Its prime objective is to federate, support and expand the international research community that the ERS, ENVISAT and Envelope programmes have built up over the last 25 years. It aims to further strengthen the leadership of the European Earth Observation research community by enabling it to extensively exploit future European operational EO missions. SEOM enables the science community to address the new scientific research opportunities opened by free and open access to data from operational EO missions. The programme is based on community-wide recommendations for action on key research issues, gathered through a series of international thematic workshops and scientific user consultation meetings, such as the Sentinel-3 for Science Workshop held in June 2015 in Venice, Italy (see http://seom.esa.int/S3forScience2015). The 2015 SEOM work plan includes the launch of new R&D studies for scientific exploitation of the Sentinels, the development of open-source multi-mission scientific toolboxes, the organization of advanced international training courses, summer schools and educational materials, as well as activities for promoting the scientific use of EO data, including the organization of workshops. This paper will report the recommendations of the international scientific community concerning Sentinel-3 scientific exploitation, as expressed in Venice, keeping in mind that Sentinel-3 is an operational mission intended to provide operational services (see http://www.copernicus.eu).
NASA Astrophysics Data System (ADS)
Konnik, Mikhail V.; Welsh, James
2012-09-01
Numerical simulators for adaptive optics systems have become an essential tool for the research and development of future advanced astronomical instruments. However, the growing software code of a numerical simulator makes it difficult to continue to support the code itself. Inadequate documentation of astronomical software for adaptive optics simulators may complicate development, since the documentation must contain up-to-date schemes and mathematical descriptions of what is implemented in the software code. Although most modern programming environments like MATLAB or Octave have built-in documentation abilities, they are often insufficient for describing a typical adaptive optics simulator code. This paper describes a general cross-platform framework for the documentation of scientific software using open-source tools such as LaTeX, Mercurial, Doxygen, and Perl. Using a Perl script that translates the MATLAB comments in M-files into C-like syntax, one can use Doxygen to generate and update the documentation for the scientific source code. The documentation generated by this framework contains the current code description with mathematical formulas, images, and bibliographical references. A detailed description of the framework components is presented, as well as guidelines for the framework deployment. Examples of the code documentation for the scripts and functions of a MATLAB-based adaptive optics simulator are provided.
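The comment-translation step can be sketched as a Doxygen input filter. The paper uses Perl; the minimal Python version below is an illustrative stand-in that rewrites MATLAB %-comments as C++-style /// comments so Doxygen can parse them. It hooks into Doxygen through the real FILTER_PATTERNS option; the script name is hypothetical.

    #!/usr/bin/env python
    # Illustrative Doxygen input filter for MATLAB M-files. Wire it up in the
    # Doxyfile with:  FILTER_PATTERNS = *.m="python m_filter.py"
    import sys

    with open(sys.argv[1]) as src:
        for line in src:
            stripped = line.lstrip()
            if stripped.startswith("%"):
                # Rewrite a MATLAB comment as a C++-style Doxygen comment.
                indent = line[: len(line) - len(stripped)]
                sys.stdout.write(indent + "///" + stripped.lstrip("%"))
            else:
                sys.stdout.write(line)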
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Hyunwoo; Timm, Steven
We present a summary of how X.509 authentication and authorization are used with OpenNebula in FermiCloud. We also describe a history of why the X.509 authentication was needed in FermiCloud, and review X.509 authorization options, both internal and external to OpenNebula. We show how these options can be and have been used to successfully run scientific workflows on federated clouds, which include OpenNebula on FermiCloud and Amazon Web Services as well as other community clouds. We also outline federation options being used by other commercial and open-source clouds and cloud research projects.
NASA Astrophysics Data System (ADS)
Angius, S.; Bisegni, C.; Ciuffetti, P.; Di Pirro, G.; Foggetta, L. G.; Galletti, F.; Gargana, R.; Gioscio, E.; Maselli, D.; Mazzitelli, G.; Michelotti, A.; Orrù, R.; Pistoni, M.; Spagnoli, F.; Spigone, D.; Stecchi, A.; Tonto, T.; Tota, M. A.; Catani, L.; Di Giulio, C.; Salina, G.; Buzzi, P.; Checcucci, B.; Lubrano, P.; Piccini, M.; Fattibene, E.; Michelotto, M.; Cavallaro, S. R.; Diana, B. F.; Enrico, F.; Pulvirenti, S.
2016-01-01
This paper presents the !CHAOS open-source project, which aims to develop a prototype of a national private cloud computing infrastructure devoted to accelerator control systems and large experiments in High Energy Physics (HEP). The !CHAOS project has been financed by MIUR (the Italian Ministry of Research and Education) and aims to develop a new concept of control system and data acquisition framework by providing, with a high level of abstraction, all the services needed for controlling and managing a large scientific, or non-scientific, infrastructure. A beta version of the !CHAOS infrastructure will be released at the end of December 2015 and will run on private cloud infrastructures based on OpenStack.
NASA Astrophysics Data System (ADS)
Gwamuri, J.; Pearce, Joshua M.
2017-08-01
The recent introduction of RepRap (self-replicating rapid prototyper) 3-D printers and the ensuing open-source technological improvements have resulted in affordable 3-D printing, enabling low-cost distributed manufacturing for individuals. This development and others, such as the rise of open-source appropriate technology (OSAT) and solar-powered 3-D printing, are moving 3-D printing from an industry-based technology to one that could be used in the developing world for sustainable development. In this paper, we explore some specific technological improvements and how distributed manufacturing with open-source 3-D printing can be used to provide open-source 3-D printable optics components for developing world communities through the ability to print less expensive and customized products. This paper presents an open-source, low-cost optical equipment library that enables easily adapted, customizable designs, with the potential to change the way optics is taught in resource-constrained communities. The study shows that this method of scientific hardware development has the potential to enable a much broader audience to participate in optical experimentation, for both research and teaching. Conclusions on the technical viability of 3-D printing to assist in development, and recommendations on how developing communities can fully exploit this technology to improve the learning of optics through hands-on methods, are outlined.
"XANSONS for COD": a new small BOINC project in crystallography
NASA Astrophysics Data System (ADS)
Neverov, Vladislav S.; Khrapov, Nikolay P.
2018-04-01
"XANSONS for COD" (http://xansons4cod.com) is a new BOINC project aimed at creating the open-access database of simulated x-ray and neutron powder diffraction patterns for nanocrystalline phase of materials from the collection of the Crystallography Open Database (COD). The project uses original open-source software XaNSoNS to simulate diffraction patterns on CPU and GPU. This paper describes the scientific problem this project solves, the project's internal structure, its operation principles and organization of the final database.
Hartin, Corinne A.; Patel, Pralit L.; Schwarber, Adria; ...
2015-04-01
Simple climate models play an integral role in the policy and scientific communities. They are used for climate mitigation scenarios within integrated assessment models, complex climate model emulation, and uncertainty analyses. Here we describe Hector v1.0, an open source, object-oriented, simple global climate carbon-cycle model. This model runs essentially instantaneously while still representing the most critical global-scale earth system processes. Hector has a three-part main carbon cycle: a one-pool atmosphere, land, and ocean. The model's terrestrial carbon cycle includes primary production and respiration fluxes, accommodating arbitrary geographic divisions into, e.g., ecological biomes or political units. Hector actively solves the inorganic carbon system in the surface ocean, directly calculating air–sea fluxes of carbon and ocean pH. Hector reproduces the global historical trends of atmospheric [CO2], radiative forcing, and surface temperatures. The model simulates all four Representative Concentration Pathways (RCPs) with equivalent rates of change of key variables over time compared to current observations, MAGICC (a well-known simple climate model), and models from the 5th Coupled Model Intercomparison Project. Hector's flexibility, open-source nature, and modular design will facilitate a broad range of research in various areas.
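As a minimal sketch of the kind of budget arithmetic such a simple climate model performs, the step below advances a one-pool atmosphere with perturbation-driven land and ocean uptake. The rate constants and pool sizes are invented for illustration and are not Hector's calibrated values (Hector itself is implemented in C++).

    # Illustrative one-pool carbon budget Euler step; parameters are invented,
    # not Hector's calibrated values.
    def step_carbon(c_atm, emissions, dt=1.0, k_land=0.01, k_ocean=0.015,
                    c_preind=588.0):
        """Advance atmospheric carbon (GtC) by dt years."""
        excess = c_atm - c_preind        # perturbation above preindustrial
        f_land = k_land * excess         # land uptake flux, GtC/yr
        f_ocean = k_ocean * excess       # air-sea carbon flux, GtC/yr
        return c_atm + dt * (emissions - f_land - f_ocean)

    c_atm = 850.0                        # roughly 400 ppm expressed in GtC
    for _ in range(100):                 # a century of constant emissions
        c_atm = step_carbon(c_atm, emissions=10.0)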
NASA Astrophysics Data System (ADS)
Minnett, R.; Koppers, A. A. P.; Jarboe, N.; Jonestrask, L.; Tauxe, L.; Constable, C.
2016-12-01
The Magnetics Information Consortium (https://earthref.org/MagIC/) develops and maintains a database and web application for supporting the paleo-, geo-, and rock magnetic scientific community. Historically, this objective has been met with an Oracle database and a Perl web application at the San Diego Supercomputer Center (SDSC). The Oracle Enterprise Cluster at SDSC, however, was decommissioned in July of 2016 and the cost for MagIC to continue using Oracle became prohibitive. This provided MagIC with a unique opportunity to reexamine the entire technology stack and data model. MagIC has developed an open-source web application using the Meteor (http://meteor.com) framework and a MongoDB database. The simplicity of the open-source full-stack framework that Meteor provides has improved MagIC's development pace and the increased flexibility of the data schema in MongoDB encouraged the reorganization of the MagIC Data Model. As a result of incorporating actively developed open-source projects into the technology stack, MagIC has benefited from their vibrant software development communities. This has translated into a more modern web application that has significantly improved the user experience for the paleo-, geo-, and rock magnetic scientific community.
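As an illustration of the schema flexibility that motivated the move, the sketch below stores two contributions with different measurement fields in one MongoDB collection via pymongo; the collection and field names are hypothetical and do not reproduce the actual MagIC Data Model.

    from pymongo import MongoClient

    # Hypothetical collection and field names, for illustration only.
    db = MongoClient()["magic"]
    db.contributions.insert_one({
        "reference": "10.1029/example-doi",   # hypothetical citation
        "specimens": [
            {"name": "sp01", "int_abs": 4.2e-5},                  # intensity
            {"name": "sp02", "dir_dec": 351.2, "dir_inc": 63.0},  # direction
        ],
    })
    # A later contribution can introduce new fields with no schema migration.
    db.contributions.insert_one({
        "reference": "10.1029/another-doi",
        "ages": [{"specimen": "sp03", "age": 12.4, "age_unit": "Ma"}],
    })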
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Leary, Patrick
The framework created through the Open-Source Integrated Design-Analysis Environment (IDAE) for Nuclear Energy Advanced Modeling & Simulation grant has simplified and democratized advanced modeling and simulation in the nuclear energy industry across a range of nuclear engineering applications. It leverages millions of investment dollars from the Department of Energy's Office of Nuclear Energy for modeling and simulation of light water reactors and the Office of Nuclear Energy's research and development. The IDAE framework enhanced Kitware's Computational Model Builder (CMB) while leveraging existing open-source toolkits, creating a graphical end-to-end umbrella that guides end-users and developers through the nuclear energy advanced modeling and simulation lifecycle. In addition, the work delivered strategic advancements in meshing and visualization for ensembles.
Scientific Software - the role of best practices and recommendations
NASA Astrophysics Data System (ADS)
Fritzsch, Bernadette; Bernstein, Erik; Castell, Wolfgang zu; Diesmann, Markus; Haas, Holger; Hammitzsch, Martin; Konrad, Uwe; Lähnemann, David; McHardy, Alice; Pampel, Heinz; Scheliga, Kaja; Schreiber, Andreas; Steglich, Dirk
2017-04-01
In Geosciences - like in most other communities - scientific work strongly depends on software. For big data analysis, existing (closed or open source) program packages are often mixed with newly developed codes. Different versions of software components and varying configurations can influence the result of data analysis. This often makes reproducibility of results and reuse of codes very difficult. Policies for publication and documentation of used and newly developed software, along with best practices, can help tackle this problem. Within the Helmholtz Association a Task Group "Access to and Re-use of scientific software" was implemented by the Open Science Working Group in 2016. The aim of the Task Group is to foster the discussion about scientific software in the Open Science context and to formulate recommendations for the production and publication of scientific software, ensuring open access to it. As a first step, a workshop gathered interested scientists from institutions across Germany. The workshop brought together various existing initiatives from different scientific communities to analyse current problems, share established best practices and come up with possible solutions. The subjects in the working groups covered a broad range of themes, including technical infrastructures, standards and quality assurance, citation of software and reproducibility. Initial recommendations are presented and discussed in the talk. They are the foundation for further discussions in the Helmholtz Association and the Priority Initiative "Digital Information" of the Alliance of Science Organisations in Germany. The talk aims to inform about the activities and to link with other initiatives on the national or international level.
psiTurk: An open-source framework for conducting replicable behavioral experiments online.
Gureckis, Todd M; Martin, Jay; McDonnell, John; Rich, Alexander S; Markant, Doug; Coenen, Anna; Halpern, David; Hamrick, Jessica B; Chan, Patricia
2016-09-01
Online data collection has begun to revolutionize the behavioral sciences. However, conducting carefully controlled behavioral experiments online introduces a number of new technical and scientific challenges. The project described in this paper, psiTurk, is an open-source platform which helps researchers develop experiment designs that can be conducted over the Internet. The tool primarily interfaces with Amazon's Mechanical Turk, a popular crowd-sourcing labor market. This paper describes the basic architecture of the system and introduces new users to the overall goals. psiTurk aims to reduce the technical hurdles for researchers developing online experiments while improving the transparency and collaborative nature of the behavioral sciences.
The GenABEL Project for statistical genomics.
Karssen, Lennart C; van Duijn, Cornelia M; Aulchenko, Yurii S
2016-01-01
Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the "core team", facilitating agile statistical omics methodology development and fast dissemination.
NASA Astrophysics Data System (ADS)
Fischer, T.; Naumov, D.; Sattler, S.; Kolditz, O.; Walther, M.
2015-11-01
We offer a versatile workflow to convert geological models built with the Paradigm™ GOCAD© (Geological Object Computer Aided Design) software into the open-source VTU (Visualization Toolkit unstructured grid) format for usage in numerical simulation models. Tackling relevant scientific questions or engineering tasks often involves multidisciplinary approaches. Conversion workflows are needed as a way of communication between the diverse tools of the various disciplines. Our approach offers an open-source, platform-independent, robust, and comprehensible method that is potentially useful for a multitude of environmental studies. With two application examples in the Thuringian Syncline, we show how a heterogeneous geological GOCAD model including multiple layers and faults can be used for numerical groundwater flow modeling, in our case employing the OpenGeoSys open-source numerical toolbox for groundwater flow simulations. The presented workflow offers the chance to incorporate increasingly detailed data, utilizing the growing availability of computational power to simulate numerical models.
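The target format of such a workflow can be produced with standard open-source tooling. The sketch below assumes the meshio library rather than the authors' own converter, and writes a single-tetrahedron unstructured grid with a per-cell material ID to a .vtu file.

    import numpy as np
    import meshio

    # One tetrahedron with a per-cell material/layer ID, written to VTU.
    points = np.array([[0.0, 0.0, 0.0],
                       [1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
    cells = [("tetra", np.array([[0, 1, 2, 3]]))]
    mesh = meshio.Mesh(points, cells,
                       cell_data={"MaterialIDs": [np.array([1])]})
    meshio.write("geology.vtu", mesh)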
Open-Source Wax RepRap 3-D Printer for Rapid Prototyping Paper-Based Microfluidics.
Pearce, J M; Anzalone, N C; Heldt, C L
2016-08-01
The open-source release of self-replicating rapid prototypers (RepRaps) has created a rich opportunity for low-cost distributed digital fabrication of complex 3-D objects such as scientific equipment. For example, 3-D printable reactionware devices offer the opportunity to combine open hardware microfluidic handling with lab-on-a-chip reactionware to radically reduce costs and increase the number and complexity of microfluidic applications. To further drive down the cost while improving the performance of lab-on-a-chip paper-based microfluidic prototyping, this study reports on the development of a RepRap upgrade capable of converting a Prusa Mendel RepRap into a wax 3-D printer for paper-based microfluidic applications. An open-source hardware approach is used to demonstrate a 3-D printable upgrade for the 3-D printer, which combines a heated syringe pump with the RepRap/Arduino 3-D control. The bill of materials, designs, basic assembly, and use instructions are provided, along with a completely free and open-source software tool chain. The open-source hardware device described here accelerates the potential of the nascent field of electrochemical detection combined with paper-based microfluidics by dropping the marginal cost of prototyping to nearly zero while accelerating the turnover between paper-based microfluidic designs. © 2016 Society for Laboratory Automation and Screening.
NASA Astrophysics Data System (ADS)
Darch, Peter T.; Sands, Ashley E.
2016-06-01
Sky surveys, such as the Sloan Digital Sky Survey (SDSS) and the Large Synoptic Survey Telescope (LSST), generate data on an unprecedented scale. While many scientific projects span a few years from conception to completion, sky surveys are typically on the scale of decades. This paper focuses on critical challenges arising from long timescales, and how sky surveys address these challenges. We present findings from a study of LSST, comprising interviews (n=58) and observation. Conceived in the 1990s, the LSST Corporation was formed in 2003, and construction began in 2014. LSST will commence data collection operations in 2022 for ten years. One challenge arising from this long timescale is uncertainty about future needs of the astronomers who will use these data many years hence. Sources of uncertainty include scientific questions to be posed, astronomical phenomena to be studied, and tools and practices these astronomers will have at their disposal. These uncertainties are magnified by the rapid technological and scientific developments anticipated between now and the start of LSST operations. LSST is implementing a range of strategies to address these challenges. Some strategies involve delaying resolution of uncertainty, placing this resolution in the hands of future data users. Other strategies aim to reduce uncertainty by shaping astronomers’ data analysis practices so that these practices will integrate well with LSST once operations begin. One approach that exemplifies both types of strategy is the decision to make LSST data management software open source, even now as it is being developed. This policy will enable future data users to adapt this software to evolving needs. In addition, LSST intends for astronomers to start using this software well in advance of 2022, thereby embedding LSST software and data analysis approaches in the practices of astronomers. These findings strengthen arguments for making the software supporting sky surveys available as open source. Such arguments usually focus on reuse potential of software, and enhancing replicability of analyses. In this case, however, open source software also promises to mitigate the critical challenge of anticipating the needs of future data users.
Reynolds, Barbara; Quinn Crouse, Sandra
2008-10-01
During a crisis, an open and empathetic style of communication that engenders the public's trust is the most effective when officials are attempting to galvanize the population to take a positive action or refrain from a harmful act. Although trust is imperative in a crisis, public suspicions of scientific experts and government are increasing for a variety of reasons, including access to more sources of conflicting information, a reduction in the use of scientific reasoning in decision making, and political infighting. Trust and credibility--which are demonstrated through empathy and caring, competence and expertise, honesty and openness, and dedication and commitment--are essential elements of persuasive communication.
Open access to biomedical engineering publications.
Flexman, Jennifer A
2008-01-01
Scientific research is disseminated within the community and to the public in part through journals. Most scientific journals, in turn, protect the manuscript through copyright and recover their costs by charging subscription fees to individuals and institutions. This revenue stream is used to support the management of the journal and, in some cases, professional activities of the sponsoring society such as the Institute of Electrical and Electronics Engineers (IEEE). For example, the IEEE Engineering in Medicine and Biology Society (EMBS) manages seven academic publications representing the various areas of biomedical engineering. New business models have been proposed to distribute journal articles free of charge, either immediately or after a delay, to enable a greater dissemination of knowledge to both the public and the scientific community. However, publication costs must be recovered and likely at a higher cost to the manuscript authors. While there is little doubt that the foundations of scientific publication will change, the specifics and implications of an open source framework must be discussed.
Enhancements to VTK enabling Scientific Visualization in Immersive Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Leary, Patrick; Jhaveri, Sankhesh; Chaudhary, Aashish
Modern scientific, engineering and medical computational simulations, as well as experimental and observational data sensing/measuring devices, produce enormous amounts of data. While statistical analysis provides insight into this data, scientific visualization is tactically important for scientific discovery, product design and data analysis. These benefits are impeded, however, when scientific visualization algorithms are implemented from scratch—a time-consuming and redundant process in immersive application development. This process can greatly benefit from leveraging the state-of-the-art open-source Visualization Toolkit (VTK) and its community. Over the past two (almost three) decades, integrating VTK with a virtual reality (VR) environment has only been attempted to varying degrees of success. In this paper, we demonstrate two new approaches to simplify this amalgamation of an immersive interface with visualization rendering from VTK. In addition, we cover several enhancements to VTK that provide near real-time updates and efficient interaction. Finally, we demonstrate the combination of VTK with both Vrui and OpenVR immersive environments in example applications.
NASA Astrophysics Data System (ADS)
Huba, J. D.; Joyce, G.
2001-05-01
In the past decade, the Open Source Model for software development has gained popularity and has had numerous major achievements: emacs, Linux, the Gimp, and Python, to name a few. The basic idea is to provide the source code of the model or application, a tutorial on its use, and a feedback mechanism with the community so that the model can be tested, improved, and archived. Given the success of the Open Source Model, we believe it may prove valuable in the development of scientific research codes. With this in mind, we are "Open Sourcing" the low to mid-latitude ionospheric model that has recently been developed at the Naval Research Laboratory: SAMI2 (Sami2 is Another Model of the Ionosphere). The model is comprehensive and uses modern numerical techniques. The structure and design of SAMI2 make it relatively easy to understand and modify: the numerical algorithms are simple and direct, and the code is reasonably well-written. Furthermore, SAMI2 is designed to run on personal computers; prohibitive computational resources are not necessary, thereby making the model accessible and usable by virtually all researchers. For these reasons, SAMI2 is an excellent candidate to explore and test the open source modeling paradigm in space physics research. We will discuss various topics associated with this project. Research supported by the Office of Naval Research.
A Scalable, Open Source Platform for Data Processing, Archiving and Dissemination
2016-01-01
Object Oriented Data Technology (OODT) big data toolkit developed by NASA and the Workflow INstance Generation and Selection (WINGS) scientific work...to several challenge big data problems and demonstrated the utility of OODT-WINGS in addressing them. Specific demonstrated analyses address i...
NASA Astrophysics Data System (ADS)
Chaudhary, A.
2017-12-01
Current simulation models and sensors are producing high-resolution, high-velocity data in the geosciences domain. Knowledge discovery from these complex and large datasets requires tools that are capable of handling very large data and providing interactive data analytics features to researchers. To this end, Kitware and its collaborators are producing the open-source tools GeoNotebook, GeoJS, Gaia, and Minerva for the geosciences, which use hardware-accelerated graphics and advances in parallel and distributed processing (Celery and Apache Spark) and can be loosely coupled to solve real-world use cases. GeoNotebook (https://github.com/OpenGeoscience/geonotebook) is co-developed by Kitware and NASA-Ames and is an extension to the Jupyter Notebook. It provides interactive visualization and Python-based analysis of geospatial data and, depending on the backend (KTile or GeoPySpark), can handle data sizes from hundreds of gigabytes to terabytes. GeoNotebook uses GeoJS (https://github.com/OpenGeoscience/geojs) to render very large geospatial data on the map using the WebGL and Canvas2D APIs. GeoJS is more than just a GIS library, as users can create scientific plots such as vector and contour plots and can embed InfoVis plots using D3.js. GeoJS aims for high-performance visualization and interactive data exploration of scientific and geospatial location-aware datasets, and supports features such as Point, Line, and Polygon, as well as advanced features such as Pixelmap, Contour, Heatmap, and Choropleth. Another of our open-source tools, Minerva (https://github.com/kitware/minerva), is a geospatial application that is built on top of the open-source web-based data management system Girder (https://github.com/girder/girder), which provides the ability to access data from HDFS or Amazon S3 buckets and provides capabilities to perform visualization and analyses on geosciences data in a web environment using GDAL and GeoPandas wrapped in a unified API provided by Gaia (https://github.com/OpenDataAnalytics/gaia). In this presentation, we will discuss core features of each of these tools and will present lessons learned on handling large data in the context of data management, analysis, and visualization.
Open source software and low cost sensors for teaching UAV science
NASA Astrophysics Data System (ADS)
Kefauver, S. C.; Sanchez-Bragado, R.; El-Haddad, G.; Araus, J. L.
2016-12-01
Drones, also known as UASs (unmanned aerial systems), UAVs (unmanned aerial vehicles), or RPAS (remotely piloted aircraft systems), are both useful advanced scientific platforms and recreational toys that are appealing to younger generations. As such, they can make for excellent education tools as well as low-cost scientific research project alternatives. However, the path from taking pretty pictures to remote sensing science can be daunting if one is presented with only expensive software and sensor options. There are a number of open-source tools and low-cost platform and sensor options available that can provide excellent scientific research results and, by often requiring more user involvement than commercial software and sensors, provide even greater educational benefits. Scale-invariant feature transform (SIFT) algorithm implementations are available in tools such as the Microsoft Image Composite Editor (ICE), which can create quality 2D image mosaics with some motion and terrain adjustments, and VisualSFM (Structure from Motion), which can provide full image mosaicking with movement and orthorectification capacities. RGB image quantification using alternate color space transforms, such as the BreedPix indices, can be calculated via plugins in the open-source software Fiji (http://fiji.sc/Fiji; http://github.com/george-haddad/CIMMYT). Recent analyses of aerial images from UAVs over different vegetation types and environments have shown that RGB metrics can outperform more costly commercial sensors. Specifically, Hue-based pixel counts, the Triangle Greenness Index (TGI), and the Normalized Green Red Difference Index (NGRDI) consistently outperformed NDVI in estimating abiotic and biotic stress impacts on crop health. Also, simple kits are available for NDVI camera conversions. Furthermore, multivariate analyses of the different RGB indices in the "R program for statistical computing", such as classification and regression trees, can allow for a more approachable interpretation of results in the classroom.
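The two RGB indices named above rest on simple band arithmetic, sketched below with numpy. The TGI coefficients use a commonly cited simplified form (TGI approximately G - 0.39*R - 0.61*B), which may not match the exact implementation in the Fiji plugins.

    import numpy as np

    def rgb_indices(img):
        """img: float array of shape (H, W, 3) holding R, G, B bands."""
        r, g, b = img[..., 0], img[..., 1], img[..., 2]
        ngrdi = (g - r) / (g + r + 1e-9)  # Normalized Green Red Difference Index
        tgi = g - 0.39 * r - 0.61 * b     # Triangle Greenness Index (simplified)
        return ngrdi, tgi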
The role of open-source software in innovation and standardization in radiology.
Erickson, Bradley J; Langer, Steve; Nagy, Paul
2005-11-01
The use of open-source software (OSS), in which developers release the source code to applications they have developed, is popular in the software industry. This is done to allow others to modify and improve software (which may or may not be shared back to the community) and to allow others to learn from the software. Radiology was an early participant in this model, supporting OSS that implemented the ACR-National Electrical Manufacturers Association (now Digital Imaging and Communications in Medicine) standard for medical image communications. In radiology and in other fields, OSS has promoted innovation and the adoption of standards. Popular OSS is of high quality because access to source code allows many people to identify and resolve errors. Open-source software is analogous to the peer-review scientific process: one must be able to see and reproduce results to understand and promote what is shared. The authors emphasize that support for OSS need not threaten vendors; most vendors embrace and benefit from standards. Open-source development does not replace vendors but more clearly defines their roles, typically focusing on areas in which proprietary differentiators benefit customers and on professional services such as implementation planning and service. Continued support for OSS is essential for the success of our field.
Science Gateways, Scientific Workflows and Open Community Software
NASA Astrophysics Data System (ADS)
Pierce, M. E.; Marru, S.
2014-12-01
Science gateways and scientific workflows occupy different ends of the spectrum of user-focused cyberinfrastructure. Gateways, sometimes called science portals, provide a way for enabling large numbers of users to take advantage of advanced computing resources (supercomputers, advanced storage systems, science clouds) by providing Web and desktop interfaces and supporting services. Scientific workflows, at the other end of the spectrum, support advanced usage of cyberinfrastructure that enables "power users" to undertake computational experiments that are not easily done through the usual mechanisms (managing simulations across multiple sites, for example). Despite these different target communities, gateways and workflows share many similarities and can potentially be accommodated by the same software system. For example, pipelines to process InSAR imagery sets or to datamine GPS time series data are workflows. The results and the ability to make downstream products may be made available through a gateway, and power users may want to provide their own custom pipelines. In this abstract, we discuss our efforts to build an open source software system, Apache Airavata, that can accommodate both gateway and workflow use cases. Our approach is general, and we have applied the software to problems in a number of scientific domains. In this talk, we discuss our applications to usage scenarios specific to earth science, focusing on earthquake physics examples drawn from the QuakSim.org and GeoGateway.org efforts. We also examine the role of the Apache Software Foundation's open community model as a way to build up common community codes that do not depend upon a single "owner" to sustain them. Pushing beyond open source software, we also see the need to provide gateways and workflow systems as cloud services. These services centralize operations, provide well-defined programming interfaces, scale elastically, and have global-scale fault tolerance. We discuss our work providing Apache Airavata as a hosted service to provide these features.
NASA Astrophysics Data System (ADS)
Strassmann, Kuno M.; Joos, Fortunat
2018-05-01
The Bern Simple Climate Model (BernSCM) is a free open-source re-implementation of a reduced-form carbon cycle-climate model which has been used widely in previous scientific work and IPCC assessments. BernSCM represents the carbon cycle and climate system with a small set of equations for the heat and carbon budget, the parametrization of major nonlinearities, and the substitution of complex component systems with impulse response functions (IRFs). The IRF approach allows cost-efficient yet accurate substitution of detailed parent models of climate system components with near-linear behavior. Illustrative simulations of scenarios from previous multimodel studies show that BernSCM is broadly representative of the range of the climate-carbon cycle response simulated by more complex and detailed models. Model code (in Fortran) was written from scratch with transparency and extensibility in mind, and is provided open source. BernSCM makes scientifically sound carbon cycle-climate modeling available for many applications. Supporting up to decadal time steps with high accuracy, it is suitable for studies with high computational load and for coupling with integrated assessment models (IAMs), for example. Further applications include climate risk assessment in a business, public, or educational context and the estimation of CO2 and climate benefits of emission mitigation options.
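To make the IRF idea concrete, the sketch below substitutes a component's full dynamics with a sum-of-exponentials impulse response convolved with an emissions perturbation. The weights and timescales are invented for illustration and are not BernSCM's calibrated parameters (the model itself is written in Fortran).

    import numpy as np

    def irf(t, weights=(0.2, 0.3, 0.5), timescales=(400.0, 40.0, 4.0)):
        """Sum-of-exponentials impulse response: fraction remaining after t years."""
        return sum(w * np.exp(-t / tau) for w, tau in zip(weights, timescales))

    dt = 1.0                                  # annual step
    emissions = np.full(100, 10.0)            # hypothetical constant GtC/yr
    t = np.arange(emissions.size) * dt

    # Airborne perturbation as the discrete convolution of emissions with the IRF.
    airborne = np.array([np.sum(emissions[:k + 1] * irf(t[k] - t[:k + 1]) * dt)
                         for k in range(emissions.size)])

Replacing a detailed ocean or land model with such a response function is what allows the model to run essentially instantaneously while tracking the parent model's near-linear behavior.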
NASA Technical Reports Server (NTRS)
Centrella, Joan M.
2011-01-01
The Laser Interferometer Space Antenna (LISA) is a space-borne observatory that will open the low-frequency (approximately 0.1-100 mHz) gravitational wave window on the universe. LISA will observe a rich variety of gravitational wave sources, including mergers of massive black holes, captures of stellar black holes by massive black holes in the centers of galaxies, and compact Galactic binaries. These sources are generally long-lived, providing unprecedented opportunities for multi-messenger astronomy in the transient sky. This talk will present an overview of these scientific arenas, highlighting how LISA will enable stunning discoveries in origins, understanding the cosmic order, and the frontiers of knowledge.
Sochat, Vanessa
2018-05-01
Here, we present the Scientific Filesystem (SCIF), an organizational format that supports exposure of executables and metadata for discoverability of scientific applications. The format includes a known filesystem structure, a definition for a set of environment variables describing it, and functions for generation of the variables and interaction with the libraries, metadata, and executables located within. SCIF makes it easy to expose metadata, multiple environments, installation steps, files, and entry points to render scientific applications consistent, modular, and discoverable. A SCIF can be installed on a traditional host or in a container technology such as Docker or Singularity. We start by reviewing the background and rationale for the SCIF, followed by an overview of the specification and the different levels of internal modules ("apps") that the organizational format affords. Finally, we demonstrate that SCIF is useful by implementing and discussing several use cases that improve user interaction and understanding of scientific applications. SCIF is released along with a client and integration in the Singularity 2.4 software to quickly install and interact with SCIF. When used inside of a reproducible container, a SCIF is a recipe for reproducibility and introspection of the functions and users that it serves. We use SCIF to evaluate container software, provide metrics, serve scientific workflows, and execute a primary function under different contexts. To encourage collaboration and sharing of applications, we developed tools along with an open source, version-controlled, tested, and programmatically accessible web infrastructure. SCIF and associated resources are available at https://sci-f.github.io. The ease of using SCIF, especially in the context of containers, offers promise for scientists' work to be self-documenting and programmatically parsable for maximum reproducibility. SCIF opens up an abstraction from underlying programming languages and packaging logic to work with scientific applications, opening new opportunities for scientific software development.
The GenABEL Project for statistical genomics
Karssen, Lennart C.; van Duijn, Cornelia M.; Aulchenko, Yurii S.
2016-01-01
Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the “core team”, facilitating agile statistical omics methodology development and fast dissemination. PMID:27347381
Open-Source Selective Laser Sintering (OpenSLS) of Nylon and Biocompatible Polycaprolactone
Paulsen, Samantha J.; Hwang, Daniel H.; Ta, Anderson H.; Yalacki, David R.; Schmidt, Tim; Miller, Jordan S.
2016-01-01
Selective Laser Sintering (SLS) is an additive manufacturing process that uses a laser to fuse powdered starting materials into solid 3D structures. Despite the potential for fabrication of complex, high-resolution structures with SLS using diverse starting materials (including biomaterials), prohibitive costs of commercial SLS systems have hindered the wide adoption of this technology in the scientific community. Here, we developed a low-cost, open-source SLS system (OpenSLS) and demonstrated its capacity to fabricate structures in nylon with sub-millimeter features and overhanging regions. Subsequently, we demonstrated fabrication of polycaprolactone (PCL) into macroporous structures such as a diamond lattice. Widespread interest in using PCL for bone tissue engineering suggests that PCL lattices are relevant model scaffold geometries for engineering bone. SLS of materials with large powder grain size (~500 μm) leads to part surfaces with high roughness, so we further introduced a simple vapor-smoothing technique to reduce the surface roughness of sintered PCL structures which further improves their elastic modulus and yield stress. Vapor-smoothed PCL can also be used for sacrificial templating of perfusable fluidic networks within orthogonal materials such as poly(dimethylsiloxane) silicone. Finally, we demonstrated that human mesenchymal stem cells were able to adhere, survive, and differentiate down an osteogenic lineage on sintered and smoothed PCL surfaces, suggesting that OpenSLS has the potential to produce PCL scaffolds useful for cell studies. OpenSLS provides the scientific community with an accessible platform for the study of laser sintering and the fabrication of complex geometries in diverse materials. PMID:26841023
Open-Source Selective Laser Sintering (OpenSLS) of Nylon and Biocompatible Polycaprolactone.
Kinstlinger, Ian S; Bastian, Andreas; Paulsen, Samantha J; Hwang, Daniel H; Ta, Anderson H; Yalacki, David R; Schmidt, Tim; Miller, Jordan S
2016-01-01
Selective Laser Sintering (SLS) is an additive manufacturing process that uses a laser to fuse powdered starting materials into solid 3D structures. Despite the potential for fabrication of complex, high-resolution structures with SLS using diverse starting materials (including biomaterials), prohibitive costs of commercial SLS systems have hindered the wide adoption of this technology in the scientific community. Here, we developed a low-cost, open-source SLS system (OpenSLS) and demonstrated its capacity to fabricate structures in nylon with sub-millimeter features and overhanging regions. Subsequently, we demonstrated fabrication of polycaprolactone (PCL) into macroporous structures such as a diamond lattice. Widespread interest in using PCL for bone tissue engineering suggests that PCL lattices are relevant model scaffold geometries for engineering bone. SLS of materials with large powder grain size (~500 μm) leads to part surfaces with high roughness, so we further introduced a simple vapor-smoothing technique to reduce the surface roughness of sintered PCL structures which further improves their elastic modulus and yield stress. Vapor-smoothed PCL can also be used for sacrificial templating of perfusable fluidic networks within orthogonal materials such as poly(dimethylsiloxane) silicone. Finally, we demonstrated that human mesenchymal stem cells were able to adhere, survive, and differentiate down an osteogenic lineage on sintered and smoothed PCL surfaces, suggesting that OpenSLS has the potential to produce PCL scaffolds useful for cell studies. OpenSLS provides the scientific community with an accessible platform for the study of laser sintering and the fabrication of complex geometries in diverse materials.
NASA Astrophysics Data System (ADS)
Vines, Aleksander; Hansen, Morten W.; Korosov, Anton
2017-04-01
Existing international and Norwegian infrastructure projects, e.g., NorDataNet, NMDC and NORMAP, provide open data access through the OPeNDAP protocol following the conventions for CF (Climate and Forecast) metadata, designed to promote the processing and sharing of files created with the NetCDF application programming interface (API). This approach is now also being implemented in the Norwegian Sentinel Data Hub (satellittdata.no) to provide satellite EO data to the user community. Simultaneously with providing simplified and unified data access, these projects also seek to use and establish common standards for use and discovery metadata. This then allows development of standardized tools for data search and (subset) streaming over the internet to perform actual scientific analysis. A combination of software tools, which we call a Scientific Platform as a Service (SPaaS), will take advantage of these opportunities to harmonize and streamline the search, retrieval, and analysis of integrated satellite and auxiliary observations of the oceans in a seamless system. The SPaaS is a cloud solution for integration of analysis tools with scientific datasets via an API. The core part of the SPaaS is a distributed metadata catalog that stores granular metadata describing the structure, location, and content of available satellite, model, and in situ datasets. The analysis tools include software for visualization (also online), interactive in-depth analysis, and server-based processing chains. The API conveys search requests between system nodes (i.e., interactive and server tools) and provides easy access to the metadata catalog, data repositories, and the tools. The SPaaS components are integrated in virtual machines, whose provisioning and deployment are automated using existing state-of-the-art open-source tools (e.g., Vagrant, Ansible, Docker). The open-source code for scientific tools and virtual machine configurations is under version control at https://github.com/nansencenter/, and is coupled to an online continuous integration system (e.g., Travis CI).
THE BERKELEY DATA ANALYSIS SYSTEM (BDAS): AN OPEN SOURCE PLATFORM FOR BIG DATA ANALYTICS
2017-09-01
Evan Sparks, Oliver Zahn, Michael J. Franklin, David A. Patterson, Saul Perlmutter. Scientific Computing Meets Big Data Technology: An Astronomy ... Processing Astronomy Imagery Using Big Data Technology. IEEE Transactions on Big Data, 2016.
Environmental and Landscape Remote Sensing Using Free and Open Source Image Processing Tools
As global climate change and human activities impact the environment, there is a growing need for scientific tools to monitor and measure environmental conditions that support human and ecological health. Remotely sensed imagery from satellite and airborne platforms provides a g...
NASA Astrophysics Data System (ADS)
Fulker, D. W.; Gallagher, J. H. R.
2015-12-01
OPeNDAP's Hyrax data server is an open-source framework fostering interoperability via easily-deployed Web services. Compatible with solutions listed in the (PA001) session description—federation, rigid standards and brokering/mediation—the framework can support tight or loose coupling, even with dependence on community-contributed software. Hyrax is a Web-services framework with a middleware-like design and a handler-style architecture that together reduce the interoperability challenge (for N datatypes and M user contexts) to an O(N+M) problem, similar to brokering. Combined with an open-source ethos, this reduction makes Hyrax a community tool for gaining interoperability. E.g., in its response to the Big Earth Data Initiative (BEDI), NASA references OPeNDAP-based interoperability. Assuming its suitability, the question becomes: how sustainable is OPeNDAP, a small not-for-profit that produces open-source software, i.e., has no software sales? In other words, if geoscience interoperability depends on OPeNDAP and similar organizations, are those entities in turn sustainable? Jim Collins (in Good to Great) highlights three questions that successful companies can answer (paraphrased here): What is your passion? Where is your world-class excellence? What drives your economic engine? We attempt to shed light on OPeNDAP sustainability by examining these. Passion: OPeNDAP has a focused passion for improving the effectiveness of scientific data sharing and use, as deeply-cooperative community endeavors. Excellence: OPeNDAP has few peers in remote, scientific data access. Skills include computer science with experience in data science, (operational, secure) Web services, and software design (for servers and clients, where the latter vary from Web pages to standalone apps and end-user programs). Economic Engine: OPeNDAP is an engineering services organization more than a product company, despite software being key to OPeNDAP's reputation. In essence, provision of engineering expertise, via contracts and grants, is the economic engine. Hence sustainability, as needed to address global grand challenges in geoscience, depends on agencies' and others' abilities and willingness to offer grants and let contracts for continually upgrading open-source software from OPeNDAP and others.
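As a sketch of the client-side interoperability Hyrax provides, the snippet below opens a dataset through a DAP URL with the netCDF4 Python library (which supports OPeNDAP when built with DAP support) and fetches only a subset; the endpoint and variable name are hypothetical.

    from netCDF4 import Dataset

    # Hypothetical Hyrax endpoint; no full-file download is required.
    url = "http://example.org/opendap/hyrax/data/sst_monthly.nc"
    ds = Dataset(url)                       # issues DAP requests under the hood
    sst = ds.variables["sst"][0, :10, :10]  # server-side subset of one time slice
    ds.close()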
NASA Astrophysics Data System (ADS)
Lee, Hee-Sun; Pallant, Amy; Pryputniewicz, Sarah
2015-08-01
Teaching scientific argumentation has emerged as an important goal for K-12 science education. In scientific argumentation, students are actively involved in coordinating evidence with theory based on their understanding of the scientific content and thinking critically about the strengths and weaknesses of the cited evidence in the context of the investigation. We developed a one-week-long online curriculum module called "Is there life in space?" where students conduct a series of four model-based tasks to learn how scientists detect extrasolar planets through the “wobble” and transit methods. The simulation model allows students to manipulate various parameters of an imaginary star and planet system such as planet size, orbit size, planet-orbiting-plane angle, and sensitivity of telescope equipment, and to adjust the display settings for graphs illustrating the relative velocity and light intensity of the star. Students can use model-based evidence to formulate an argument on whether particular signals in the graphs guarantee the presence of a planet. Students' argumentation is facilitated by the four-part prompts consisting of multiple-choice claim, open-ended explanation, Likert-scale uncertainty rating, and open-ended uncertainty rationale. We analyzed 1,013 scientific arguments formulated by 302 high school student groups taught by 7 teachers. We coded these arguments in terms of the accuracy of their claim, the sophistication of explanation connecting evidence to the established knowledge base, the uncertainty rating, and the scientific validity of uncertainty. We found that (1) only 18% of the students' uncertainty rationale involved critical reflection on limitations inherent in data and concepts, (2) 35% of students' uncertainty rationale reflected their assessment of personal ability and knowledge, rather than scientific sources of uncertainty related to the evidence, and (3) the nature of task such as the use of noisy data or the framing of critiquing scientists' discovery encouraged students' articulation of scientific uncertainty sources in different ways.
Frameworks Coordinate Scientific Data Management
NASA Technical Reports Server (NTRS)
2012-01-01
Jet Propulsion Laboratory computer scientists developed a unique software framework to help NASA manage its massive amounts of science data. Through a partnership with the Apache Software Foundation of Forest Hill, Maryland, the technology is now available as an open-source solution and is in use by cancer researchers and pediatric hospitals.
Fenwick, Matthew; Sesanker, Colbert; Schiller, Martin R.; Ellis, Heidi JC; Hinman, M. Lee; Vyas, Jay; Gryk, Michael R.
2012-01-01
Scientists are continually faced with the need to express complex mathematical notions in code. The renaissance of functional languages such as LISP and Haskell is often credited to their ability to implement complex data operations and mathematical constructs in an expressive and natural idiom. The slow adoption of functional computing in the scientific community does not, however, reflect the congeniality of these fields. Unfortunately, the learning curve for adoption of functional programming techniques is steeper than that for more traditional languages in the scientific community, such as Python and Java, and this is partially due to the relative sparseness of available learning resources. To fill this gap, we demonstrate and provide applied, scientifically substantial examples of functional programming. We present a multi-language source-code repository for software integration and algorithm development, which focuses generally on the fields of machine learning, data processing, and bioinformatics. We encourage scientists who are interested in learning the basics of functional programming to adopt, reuse, and learn from these examples. The source code is available at: https://github.com/CONNJUR/CONNJUR-Sandbox (see also http://www.connjur.org). PMID:25328913
Fenwick, Matthew; Sesanker, Colbert; Schiller, Martin R; Ellis, Heidi Jc; Hinman, M Lee; Vyas, Jay; Gryk, Michael R
2012-01-01
Scientists are continually faced with the need to express complex mathematical notions in code. The renaissance of functional languages such as LISP and Haskell is often credited to their ability to implement complex data operations and mathematical constructs in an expressive and natural idiom. The slow adoption of functional computing in the scientific community does not, however, reflect the congeniality of these fields. Unfortunately, the learning curve for adoption of functional programming techniques is steeper than that for more traditional languages in the scientific community, such as Python and Java, and this is partially due to the relative sparseness of available learning resources. To fill this gap, we demonstrate and provide applied, scientifically substantial examples of functional programming. We present a multi-language source-code repository for software integration and algorithm development, which focuses generally on the fields of machine learning, data processing, and bioinformatics. We encourage scientists who are interested in learning the basics of functional programming to adopt, reuse, and learn from these examples. The source code is available at: https://github.com/CONNJUR/CONNJUR-Sandbox (see also http://www.connjur.org).
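For readers who want a taste of the idiom before opening the repository, here is a small illustrative example in Python (not drawn from the CONNJUR-Sandbox itself): a data-cleaning pipeline assembled by composing pure functions.

    from functools import reduce

    def compose(*fns):
        """Compose single-argument functions, applied left to right."""
        return lambda x: reduce(lambda acc, fn: fn(acc), fns, x)

    normalize = compose(
        lambda xs: [x for x in xs if x is not None],              # drop missing
        sorted,                                                    # order sample
        lambda xs: [(x - xs[0]) / (xs[-1] - xs[0]) for x in xs],   # rescale to [0, 1]
    )

    print(normalize([3, None, 9, 1]))  # -> [0.0, 0.25, 1.0]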
Kragic, Rastislav; Kostic, Mirjana
2018-01-01
In this paper, we present the construction of a reliable and inexpensive pH stat device, by using open-source “OpenPhControl” software, inexpensive hardware (a peristaltic and a syringe pump, Arduino, a step motor…), readily available laboratory devices: a pH meter, a computer, a webcam, and some 3D printed parts. We provide a methodology for the design, development and test results of each part of the device, as well as of the entire system. In addition to dosing reagents by means of a low-cost peristaltic pump, we also present carefully controlled dosing of reagents by an open-source syringe pump. The upgrading of the basic open-source syringe pump is given in terms of pump control and application of a larger syringe. In addition to the basic functions of pH stat, i.e. pH value measurement and maintenance, an improvement allowing the device to be used for potentiometric titration has been made as well. We have demonstrated the device’s utility when applied to cellulose fiber oxidation with 2,2,6,6-tetramethylpiperidine-1-oxyl radical, i.e. for TEMPO-mediated oxidation. In support of this, we present the results obtained for the oxidation kinetics, the consumption of added reagent and experimental repeatability. Considering that the open-source scientific tools are available to everyone, and that researchers can construct and adjust the device according to their needs, as well as that the total cost of the open-source pH stat device, excluding the existing laboratory equipment (pH meter, computer and glassware), was less than 150 EUR, we believe that, at a small fraction of the cost of available commercial offers, our open-source pH stat can significantly improve experimental work where the use of pH stat is necessary. PMID:29509793
Milanovic, Jovana Z; Milanovic, Predrag; Kragic, Rastislav; Kostic, Mirjana
2018-01-01
In this paper, we present the construction of a reliable and inexpensive pH stat device, by using open-source "OpenPhControl" software, inexpensive hardware (a peristaltic and a syringe pump, Arduino, a step motor…), readily available laboratory devices: a pH meter, a computer, a webcam, and some 3D printed parts. We provide a methodology for the design, development and test results of each part of the device, as well as of the entire system. In addition to dosing reagents by means of a low-cost peristaltic pump, we also present carefully controlled dosing of reagents by an open-source syringe pump. The upgrading of the basic open-source syringe pump is given in terms of pump control and application of a larger syringe. In addition to the basic functions of pH stat, i.e. pH value measurement and maintenance, an improvement allowing the device to be used for potentiometric titration has been made as well. We have demonstrated the device's utility when applied to cellulose fiber oxidation with 2,2,6,6-tetramethylpiperidine-1-oxyl radical, i.e. for TEMPO-mediated oxidation. In support of this, we present the results obtained for the oxidation kinetics, the consumption of added reagent and experimental repeatability. Considering that the open-source scientific tools are available to everyone, and that researchers can construct and adjust the device according to their needs, as well as that the total cost of the open-source pH stat device, excluding the existing laboratory equipment (pH meter, computer and glassware), was less than 150 EUR, we believe that, at a small fraction of the cost of available commercial offers, our open-source pH stat can significantly improve experimental work where the use of pH stat is necessary.
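The basic measure-and-maintain function described above reduces to a simple control loop, sketched here. The read_ph() and dose_reagent_ml() callables are hypothetical stand-ins for the device's pH-meter readout and Arduino-driven pump, and the setpoint is illustrative, not the paper's protocol.

    import time

    SETPOINT = 10.5   # illustrative target pH for TEMPO-mediated oxidation
    DEADBAND = 0.05   # tolerated deviation before dosing
    DOSE_ML = 0.02    # small fixed dose to avoid overshoot

    def ph_stat(read_ph, dose_reagent_ml, duration_s=3600, interval_s=1.0):
        """Hold pH at SETPOINT by dosing base; return total reagent used (ml)."""
        consumed = 0.0
        t_end = time.time() + duration_s
        while time.time() < t_end:
            if read_ph() < SETPOINT - DEADBAND:
                dose_reagent_ml(DOSE_ML)   # add base to bring the pH back up
                consumed += DOSE_ML
            time.sleep(interval_s)
        return consumed

Logging the consumed volume over time is what yields the reagent-consumption and kinetics curves the authors report.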
A Framework for an Open Source Geospatial Certification Model
NASA Astrophysics Data System (ADS)
Khan, T. U. R.; Davis, P.; Behr, F.-J.
2016-06-01
The geospatial industry is forecast to grow enormously in the forthcoming years, with a growing need for a well-educated workforce; ongoing education and training therefore play an important role in professional life. In parallel, Open Source solutions, open data proliferation, and the use of open standards have gained increasing significance in the geospatial and IT arenas as well as in political discussion and legislation. Based on the Memorandum of Understanding between the International Cartographic Association, the OSGeo Foundation, and ISPRS, this development led to the implementation of the ICA-OSGeo-Lab initiative with its mission "Making geospatial education and opportunities accessible to all". Discussions in this initiative, together with the growth and maturity of geospatial Open Source software, initiated the idea to develop a framework for a worldwide applicable Open Source certification approach. Generic and geospatial certification programmes are already offered by numerous organisations (e.g., the GIS Certification Institute, GeoAcademy, and ASPRS) and software vendors (e.g., Esri, Oracle, and RedHat). They focus on different fields of expertise and have different levels and modes of examination, offered for a wide range of fees. The development of the certification framework presented here is based on the analysis of diverse bodies of knowledge, e.g., the NCGIA Core Curriculum, the URISA Body of Knowledge, the USGIF Essential Body of Knowledge, the "Geographic Information: Need to Know" (currently under development), and the Geospatial Technology Competency Model (GTCM). The latter provides a US-oriented list of the knowledge, skills, and abilities required of workers in the geospatial technology industry and substantially influenced the certification framework. In addition to the theoretical analysis of existing resources, the geospatial community was engaged in two ways: an online survey about the relevance of Open Source was performed and evaluated with 105 respondents worldwide, and 15 interviews (face-to-face or by telephone) with experts in different countries provided additional insights into Open Source usage and certification. The findings led to a certification framework of three main categories with eleven sub-categories in total: "Certified Open Source Geospatial Data Associate / Professional", "Certified Open Source Geospatial Analyst Remote Sensing & GIS", "Certified Open Source Geospatial Cartographer", "Certified Open Source Geospatial Expert", "Certified Open Source Geospatial Associate Developer / Professional Developer", and "Certified Open Source Geospatial Architect". Each certification is described by pre-conditions, scope and objectives, course content, recommended software packages, target group, expected benefits, and the method of examination. Examinations can be complemented by evidence of professional career paths and achievements, which requires peer evaluation, and recertification is required after a couple of years. The concept seeks accreditation by the OSGeo Foundation (and other bodies) and international support from a group of geospatial scientific institutions to achieve wide international acceptance for this Open Source geospatial certification model. A business case for Open Source certification and a corresponding SWOT model are examined to support the goals of the Geo-For-All initiative of the ICA-OSGeo pact.
Realizing the Living Paper using the ProvONE Model for Reproducible Research
NASA Astrophysics Data System (ADS)
Jones, M. B.; Jones, C. S.; Ludäscher, B.; Missier, P.; Walker, L.; Slaughter, P.; Schildhauer, M.; Cuevas-Vicenttín, V.
2015-12-01
Science has advanced through traditional publications that codify research results as a permanent part of the scientific record. But because publications are static and atomic, researchers can only cite and reference a whole work when building on the prior work of colleagues. The open source software model has demonstrated a new approach, in which strong version control in an open environment can nurture an open ecosystem of software. Developers now commonly fork and extend software, giving proper credit, with less repetition, and with confidence in the relationship to the original software. Through initiatives like 'Beyond the PDF', an analogous model has been imagined for open science, in which software, data, analyses, and derived products become first-class objects within a publishing ecosystem that has evolved to be finer-grained and is realized through a web of linked open data. We have prototyped a Living Paper concept by developing the ProvONE provenance model for scientific workflows, with prototype deployments in DataONE. ProvONE promotes transparency and openness by describing the authenticity, origin, structure, and processing history of research artifacts and by detailing the steps in computational workflows that produce derived products. To realize the Living Paper, we decompose scientific papers into their constituent products and publish these as compound objects in the DataONE federation of archival repositories. Each individual finding and sub-product of a research project (such as a derived data table, a workflow or script, a figure, an image, or a finding) can be independently stored, versioned, and cited. ProvONE provenance traces link these fine-grained products within and across versions of a paper, and across related papers that extend an original analysis. This allows for open scientific publishing in which researchers extend and modify findings, creating a dynamic, evolving web of results that collectively represent the scientific enterprise. The Living Paper provides detailed metadata for properly interpreting and verifying individual research findings, for tracing the origin of ideas, for launching new lines of inquiry, and for implementing transitive credit for research and engineering.
Open Data Infrastructures and the Future of Science
NASA Astrophysics Data System (ADS)
Boulton, G. S.
2016-12-01
Open publication of the evidence (the data) supporting a scientific claim has been the bedrock on which the scientific advances of the modern era of science have been built. It is also of immense importance in confronting three challenges unleashed by the digital revolution. The first is the threat the digital data storm poses to the principle of "scientific self-correction", in which false concepts are weeded out because of a demonstrable failure in logic or in the replication of observations or experiments. Large and complex data volumes are difficult to make openly available in ways that make rigorous scrutiny possible. Secondly, linking and integrating data from different sources about the same phenomena have created profound new opportunities for understanding the Earth. If data are neither accessible nor useable, such opportunities cannot be seized. Thirdly, open access publication, open data and ubiquitous modern communications enhance the prospects for an era of "Open Science" in which science emerges from behind its laboratory doors to engage in co-production of knowledge with other stakeholders in addressing major contemporary challenges to human society, in particular the need for long term thinking about planetary sustainability. If the benefits of an open data regime are to be realised, only a small part of the challenge lies in providing "hard" infrastructure. The major challenges lie in the "soft" infrastructure of relationships between the components of national science systems, of analytic and software tools, of national and international standards and the normative principles adopted by scientists themselves. The principles that underlie these relationships, the responsibilities of key actors and the rules of the game needed to maximise national performance and facilitate international collaboration are set out in an International Accord on Open Data.
Instrumentino: An Open-Source Software for Scientific Instruments.
Koenka, Israel Joel; Sáiz, Jorge; Hauser, Peter C
2015-01-01
Scientists often need to build dedicated computer-controlled experimental systems. For this purpose, it is becoming common to employ open-source microcontroller platforms, such as the Arduino. These boards and associated integrated software development environments provide affordable yet powerful solutions for the implementation of hardware control of transducers and acquisition of signals from detectors and sensors. It is, however, a challenge to write programs that allow interactive use of such arrangements from a personal computer. This task is particularly complex if some of the included hardware components are connected directly to the computer and not via the microcontroller. A graphical user interface framework, Instrumentino, was therefore developed to allow the creation of control programs for complex systems with minimal programming effort. By writing a single code file, a powerful custom user interface is generated, which enables the automatic running of elaborate operation sequences and observation of acquired experimental data in real time. The framework, which is written in Python, allows extension by users, and is made available as an open source project.
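Instrumentino's actual API is not reproduced here. The sketch below, with hypothetical class and function names, only illustrates the pattern the abstract describes: a single declarative code file lists the system's components, and a framework walks that declaration to generate the user interface:

```python
# Hypothetical, Instrumentino-like declarative system description.
# Real Instrumentino class names differ; this only illustrates the
# "one code file describes the whole instrument" pattern.
from dataclasses import dataclass

@dataclass
class AnalogInput:
    name: str
    pin: int
    units: str

@dataclass
class DigitalOutput:
    name: str
    pin: int

SYSTEM = {
    "variables": [AnalogInput("pressure", pin=0, units="mbar"),
                  AnalogInput("temperature", pin=1, units="degC")],
    "actuators": [DigitalOutput("valve", pin=7)],
}

def build_gui(system):
    """A real framework would generate live widgets per component;
    here we just enumerate them to show what the declaration drives."""
    for var in system["variables"]:
        print(f"[plot + readout] {var.name} ({var.units}) on pin {var.pin}")
    for act in system["actuators"]:
        print(f"[toggle button] {act.name} on pin {act.pin}")

build_gui(SYSTEM)
```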
Orchestrating high-throughput genomic analysis with Bioconductor
Huber, Wolfgang; Carey, Vincent J.; Gentleman, Robert; Anders, Simon; Carlson, Marc; Carvalho, Benilton S.; Bravo, Hector Corrada; Davis, Sean; Gatto, Laurent; Girke, Thomas; Gottardo, Raphael; Hahne, Florian; Hansen, Kasper D.; Irizarry, Rafael A.; Lawrence, Michael; Love, Michael I.; MacDonald, James; Obenchain, Valerie; Oleś, Andrzej K.; Pagès, Hervé; Reyes, Alejandro; Shannon, Paul; Smyth, Gordon K.; Tenenbaum, Dan; Waldron, Levi; Morgan, Martin
2015-01-01
Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors. PMID:25633503
SedInConnect: a stand-alone, free and open source tool for the assessment of sediment connectivity
NASA Astrophysics Data System (ADS)
Crema, Stefano; Cavalli, Marco
2018-02-01
There is a growing call within the scientific community for solid theoretical frameworks and usable indices/models to assess sediment connectivity. Connectivity plays a significant role in characterizing structural properties of the landscape and, when considered in combination with forcing processes (e.g., rainfall-runoff modelling), can represent a valuable analysis for improved landscape management. In this work, the authors present the development and application of SedInConnect: a free, open source and stand-alone application for the computation of the Index of Connectivity (IC), as expressed in Cavalli et al. (2013), with the addition of specific innovative features. The tool is intended for a wide variety of users, both from the scientific community and from the authorities involved in environmental planning. Thanks to its open source nature, the tool can be adapted and/or extended according to users' requirements. Furthermore, with its easy-to-use interface and stand-alone packaging, the tool can help management experts in the quantitative assessment of sediment connectivity in the context of hazard and risk assessment. An application to a sample dataset, together with an overview of up-to-date applications of the approach and of the tool, shows the development potential of such analyses. The modelled connectivity appears suitable not only for characterizing sediment dynamics at the catchment scale, but also for complementing prediction models and supporting geomorphological interpretation.
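For orientation, the Index of Connectivity that SedInConnect computes follows the Borselli et al. (2008) formulation adapted by Cavalli et al. (2013): IC = log10(D_up / D_dn), where D_up combines the mean weighting factor, mean slope and contributing area upslope of a cell, and D_dn accumulates distance-weighted impedance along the downslope flow path. The single-cell sketch below is a rough paraphrase of that definition, not the tool's raster implementation, and should be checked against the SedInConnect source before any real use:

```python
import numpy as np

def index_of_connectivity(W_up, S_up, area_m2, d_dn, W_dn, S_dn):
    """IC = log10(D_up / D_dn) for one cell (sketch).
    D_up = mean(W) * mean(S) * sqrt(A) over the upslope area;
    D_dn = sum(d_i / (W_i * S_i)) along the downslope flow path."""
    d_up = np.mean(W_up) * np.mean(S_up) * np.sqrt(area_m2)
    d_dn = np.sum(np.asarray(d_dn) / (np.asarray(W_dn) * np.asarray(S_dn)))
    return np.log10(d_up / d_dn)

# Toy numbers: a small upslope area draining along a short two-step path.
ic = index_of_connectivity(W_up=[0.6, 0.7], S_up=[0.25, 0.30],
                           area_m2=5000.0,
                           d_dn=[10.0, 10.0],
                           W_dn=[0.6, 0.5], S_dn=[0.20, 0.15])
print(f"IC = {ic:.2f}")   # negative values indicate low connectivity
```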
NASA Technical Reports Server (NTRS)
Jedlovec, Gary; Srikishen, Jayanthi; Edwards, Rita; Cross, David; Welch, Jon; Smith, Matt
2013-01-01
The use of collaborative scientific visualization systems for the analysis, visualization, and sharing of "big data" available from new high resolution remote sensing satellite sensors or four-dimensional numerical model simulations is propelling the wider adoption of ultra-resolution tiled display walls interconnected by high speed networks. These systems require a globally connected and well-integrated operating environment that provides persistent visualization and collaboration services. This abstract and subsequent presentation describe a new collaborative visualization system installed for NASA's Short-term Prediction Research and Transition (SPoRT) program at Marshall Space Flight Center and its use for Earth science applications. The system consists of a 3 x 4 array of 1920 x 1080 pixel thin-bezel video monitors mounted on a wall in a scientific collaboration lab. The monitors are physically and virtually integrated into a 14' x 7' video display. The display of scientific data on the video wall is controlled by a single Alienware Aurora PC with a 2nd Generation Intel Core 4.1 GHz processor, 32 GB memory, and an AMD FirePro W600 video card with 6 mini DisplayPort connections; six mini DisplayPort-to-dual-DVI cables connect the 12 individual video monitors. The open source Scalable Adaptive Graphics Environment (SAGE) windowing and media control framework, running on top of the Ubuntu 12 Linux operating system, allows several users to simultaneously control the display and storage of high resolution still and moving graphics in a variety of formats, on tiled display walls of any size, providing a common environment in which users can access, display and share a variety of data-intensive information: digital-cinema animations, high-resolution images, high-definition video-teleconferences, presentation slides, documents, spreadsheets or laptop screens. SAGE is cross-platform, community-driven, open-source visualization and collaboration middleware that utilizes shared national and international cyberinfrastructure for the advancement of scientific research and education.
A framework for integration of scientific applications into the OpenTopography workflow
NASA Astrophysics Data System (ADS)
Nandigam, V.; Crosby, C.; Baru, C.
2012-12-01
The NSF-funded OpenTopography facility provides online access to Earth science-oriented high-resolution LIDAR topography data, online processing tools, and derivative products. The underlying cyberinfrastructure employs a multi-tier service-oriented architecture comprising an infrastructure tier, a processing services tier, and an application tier. The infrastructure tier consists of storage and compute resources as well as supporting databases. The services tier consists of the set of processing routines, each deployed as a Web service. The applications tier provides client interfaces to the system (e.g., the portal). We propose a "pluggable" infrastructure design that will allow new scientific algorithms and processing routines, developed and maintained by the community, to be integrated into the OpenTopography system so that the wider Earth science community can benefit from their availability. All core components in OpenTopography have been developed, maintained and wrapped as Web services by OpenTopography developers using a customized open-source Opal toolkit, which provides mechanisms to manage and track job submissions with the help of a back-end database and allows monitoring of job and system status through charting tools. However, as the scientific community develops new processing and analysis approaches, this integration approach does not scale efficiently. Most new scientific applications will have their own active development teams performing regular updates, maintenance and other improvements. It would be optimal to have each application co-located where its developers can continue to actively work on it, while still making it accessible within the OpenTopography workflow. We will utilize a software framework for remote integration of these scientific applications into the OpenTopography system. This will be accomplished by virtually extending the OpenTopography service over the various infrastructures running these scientific applications and processing routines. It involves packaging and distributing a customized instance of the Opal toolkit that wraps the software application as an Opal-based web service and integrates it into the OpenTopography framework. We plan to make this as automated as possible. A structured specification of service inputs and outputs, along with metadata annotations encoded in XML, can be utilized to automate the generation of user interfaces (with appropriate tooltips and user help features) and of other internal software. The OpenTopography Opal toolkit will also include customizations that enable security authentication and authorization and the ability to write application usage and job statistics back to the OpenTopography databases. This usage information can then be reported to the original service providers and used for auditing and performance improvements. This pluggable framework will enable application developers to continue enhancing their applications while making the latest iteration available in a timely manner to the Earth sciences community. It will also help establish an overall framework that other scientific application providers can use going forward.
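The Opal toolkit itself is Java-based and is not reproduced here, but the service pattern the abstract describes - wrap a locally maintained processing routine as a web service with tracked job submissions - can be sketched generically. The endpoint names and job store below are invented, with Flask standing in for the real toolkit:

```python
# Generic sketch of wrapping a command-line routine as a tracked web
# service; illustrative only, not the Opal toolkit's interface.
import subprocess
import uuid
from flask import Flask, jsonify, request

app = Flask(__name__)
jobs = {}  # job id -> status/output; Opal keeps this in a back-end database

@app.route("/run", methods=["POST"])
def run_job():
    job_id = str(uuid.uuid4())
    args = request.json.get("args", [])
    # A production service would queue this and monitor it asynchronously.
    result = subprocess.run(["echo", *args], capture_output=True, text=True)
    jobs[job_id] = {"status": "done", "output": result.stdout.strip()}
    return jsonify({"job": job_id})

@app.route("/status/<job_id>")
def status(job_id):
    return jsonify(jobs.get(job_id, {"status": "unknown"}))

if __name__ == "__main__":
    app.run(port=8080)
```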
The ESA Scientific Exploitation of Operational Missions element, first results
NASA Astrophysics Data System (ADS)
Desnos, Yves-Louis; Regner, Peter; Delwart, Steven; Benveniste, Jerome; Engdahl, Marcus; Mathieu, Pierre-Philippe; Gascon, Ferran; Donlon, Craig; Davidson, Malcolm; Pinnock, Simon; Foumelis, Michael; Ramoino, Fabrizio
2016-04-01
SEOM is a programme element within the fourth period (2013-2017) of ESA's Earth Observation Envelope Programme (http://seom.esa.int/). The prime objective is to federate, support and expand the international research community that the ERS, ENVISAT and Envelope programmes have built up over the last 25 years. It aims to further strengthen the leadership of the European Earth Observation research community by enabling it to extensively exploit future European operational EO missions. SEOM will enable the science community to address new scientific research questions that are opened by free and open access to data from operational EO missions. Based on community-wide recommendations for actions on key research issues, gathered through a series of international thematic workshops and scientific user consultation meetings, a work plan is established and approved every year by ESA Member States. During 2015, SEOM science user consultation workshops were organized for Sentinel-1/3/5P (Fringe, the S3 Symposium, and Atmospheric Science, respectively); new R&D studies for scientific exploitation of the Sentinels were launched (S3 for Science SAR Altimetry and Ocean Color, S2 for Science); and open-source multi-mission scientific toolboxes were released (in particular the SNAP/S1-2-3 Toolbox). In addition, two advanced international training courses were organized in Europe to exploit the new S1-A and S2-A data for land and ocean remote sensing (over 120 participants from 25 countries), along with activities promoting the first scientific results (e.g. the Chile earthquake). The first EO Open Science 2.0 conference was also organised at ESA in October 2015, with 225 participants from 31 countries, bringing together young EO scientists and data scientists. During the conference, precursor activities in EO Open Science and Innovation were presented, and a roadmap preparing for future ESA scientific exploitation activities was developed. Within the conference, the first EO Hackathon took place, bringing together volunteer programmers with the developers of SNAP. An interactive "Jam" session discussed and scoped challenging scientific and societal issues (e.g. climate change, quality of life and air quality). The status and first results from these SEOM projects will be presented, and an outlook for upcoming SEOM studies and events in 2016 will be given.
Early career researchers want Open Science.
Farnham, Andrea; Kurz, Christoph; Öztürk, Mehmet Ali; Solbiati, Monica; Myllyntaus, Oona; Meekes, Jordy; Pham, Tra My; Paz, Clara; Langiewicz, Magda; Andrews, Sophie; Kanninen, Liisa; Agbemabiese, Chantal; Guler, Arzu Tugce; Durieux, Jeffrey; Jasim, Sarah; Viessmann, Olivia; Frattini, Stefano; Yembergenova, Danagul; Benito, Carla Marin; Porte, Marion; Grangeray-Vilmint, Anaïs; Curiel, Rafael Prieto; Rehncrona, Carin; Malas, Tareq; Esposito, Flavia; Hettne, Kristina
2017-11-15
Open Science is encouraged by the European Union and many other political and scientific institutions. However, scientific practice is proving slow to change. We propose, as early career researchers, that it is our task to change scientific research into open scientific research and commit to Open Science principles.
Utilizing social media for informal ocean conservation and education: The BioOceanography Project
NASA Astrophysics Data System (ADS)
Payette, J.
2016-02-01
Science communication through the use of social media is a rapidly evolving and growing pursuit in academic and scientific circles. Online tools and social media are being used not only in scientific communication but also in scientific publication, education, and outreach. Standards and usage of social media, as well as of other online tools for communication, networking, outreach, and publication, are continually evolving. Caution and a conservative attitude towards these novel "Science 2.0" tools are understandable because of their rapidly changing nature and the lack of professional standards for using them. However, there are key benefits and unique ways in which social media, online systems, and other open or open-source technologies, software, and "Science 2.0" tools can be utilized for academic purposes such as education and outreach. Diverse efforts for ocean conservation and education will continue to utilize social media for a variety of purposes. The BioOceanography project is an informal communication, education, outreach, and conservation initiative created for enhancing knowledge related to oceanography and marine science, with an unbiased yet conservation-minded approach and in an Open Source format. The BioOceanography project is ongoing and still evolving, but has already contributed to ocean education and conservation communication in key ways through a concerted web presence since 2013, including a curated Twitter account, @_Oceanography, and a blog-style BioOceanography website. Social media tools like those used in this project, if used properly, can be highly effective and valuable for encouraging students, networking with researchers, and educating the general public about oceanography.
Processing UAV and Lidar Point Clouds in GRASS GIS
NASA Astrophysics Data System (ADS)
Petras, V.; Petrasova, A.; Jeziorska, J.; Mitasova, H.
2016-06-01
Today's methods of acquiring Earth surface data, namely lidar and unmanned aerial vehicle (UAV) imagery, non-selectively collect or generate large amounts of points. Point clouds from different sources vary in their properties, such as number of returns, density, or quality. We present a set of tools with applications for different types of point clouds obtained by a lidar scanner, the structure from motion (SfM) technique, and a low-cost 3D scanner. To take advantage of the vertical structure of multiple-return lidar point clouds, we demonstrate tools to process them using 3D raster techniques, which allow, for example, the development of custom vegetation classification methods. Dense point clouds obtained from UAV imagery, often containing redundant points, can be decimated using various techniques before further processing; we implemented and compared several decimation techniques with regard to their performance and the resulting digital surface model (DSM). Finally, we describe the processing of a point cloud from a low-cost 3D scanner, namely the Microsoft Kinect, and its application for interaction with physical models. All the presented tools are open source and integrated in GRASS GIS, a multi-purpose open source GIS with remote sensing capabilities. The tools integrate with other open source projects, specifically the Point Data Abstraction Library (PDAL), the Point Cloud Library (PCL), and the OpenKinect libfreenect2 library, to benefit from the open source point cloud ecosystem. The implementation in GRASS GIS ensures long-term maintenance and reproducibility by the scientific community as well as by the original authors themselves.
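As an illustration of the decimation step mentioned above, one common thinning strategy is grid-based decimation: keep one point per voxel. The sketch below uses plain NumPy and is not one of the GRASS GIS modules, which implement several further strategies:

```python
import numpy as np

def voxel_decimate(points, cell=1.0):
    """Keep the first point falling in each cell x cell x cell voxel."""
    keys = np.floor(points / cell).astype(np.int64)
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first_idx)]

rng = np.random.default_rng(0)
cloud = rng.uniform(0, 100, size=(100_000, 3))  # stand-in for UAV/SfM points
thinned = voxel_decimate(cloud, cell=5.0)
print(f"{len(cloud)} -> {len(thinned)} points")
```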
Larsen, Peder Olesen; von Ins, Markus
2010-09-01
The growth rate of scientific publication has been studied from 1907 to 2007 using available data from a number of literature databases, including Science Citation Index (SCI) and Social Sciences Citation Index (SSCI). Traditional scientific publishing, that is publication in peer-reviewed journals, is still increasing although there are big differences between fields. There are no indications that the growth rate has decreased in the last 50 years. At the same time publication using new channels, for example conference proceedings, open archives and home pages, is growing fast. The growth rate for SCI up to 2007 is smaller than for comparable databases. This means that SCI was covering a decreasing part of the traditional scientific literature. There are also clear indications that the coverage by SCI is especially low in some of the scientific areas with the highest growth rate, including computer science and engineering sciences. The role of conference proceedings, open access archives and publications published on the net is increasing, especially in scientific fields with high growth rates, but this has only partially been reflected in the databases. The new publication channels challenge the use of the big databases in measurements of scientific productivity or output and of the growth rate of science. Because of the declining coverage and this challenge it is problematic that SCI has been used and is used as the dominant source for science indicators based on publication and citation numbers. The limited data available for social sciences show that the growth rate in SSCI was remarkably low and indicate that the coverage by SSCI was declining over time. National Science Indicators from Thomson Reuters is based solely on SCI, SSCI and Arts and Humanities Citation Index (AHCI). Therefore the declining coverage of the citation databases problematizes the use of this source.
NASA Astrophysics Data System (ADS)
Hellman, S. B.; Lisowski, S.; Baker, B.; Hagerty, M.; Lomax, A.; Leifer, J. M.; Thies, D. A.; Schnackenberg, A.; Barrows, J.
2015-12-01
Tsunami Information Technology Modernization (TIM) is a National Oceanic and Atmospheric Administration (NOAA) project to update and standardize the earthquake and tsunami monitoring systems currently employed at the U.S. Tsunami Warning Centers in Ewa Beach, Hawaii (PTWC) and Palmer, Alaska (NTWC). While this project was funded by NOAA to solve a specific problem, the requirements that the delivered system be both open source and easily maintainable have resulted in the creation of a variety of open source (OS) software packages. The open source software is now complete, and this is a presentation of the OS software that has been funded by NOAA for the benefit of the entire seismic community. The design architecture comprises three distinct components: (1) the user interface, (2) the real-time data acquisition and processing system, and (3) the scientific algorithm library. The system follows a modular design with loose coupling between components. We now identify the major project constituents. The user interface, CAVE, is written in Java and is compatible with the existing National Weather Service (NWS) open source graphical system AWIPS. The selected real-time seismic acquisition and processing system is the open source SeisComp3 (sc3). The seismic library (libseismic) contains numerous custom-written and wrapped open source seismic algorithms (e.g., ML/mb/Ms/Mwp, mantle magnitude (Mm), W-phase moment tensor, body-wave moment tensor, finite-fault inversion, array processing). The seismic library is organized in a way (function naming and usage) that will be familiar to users of MATLAB. The seismic library extends sc3 so that it can be called by the real-time system, but it can also be driven and tested outside of sc3, for example, by ObsPy or Earthworm. To unify the three principal components we have developed a flexible and lightweight communication layer called SeismoEdex.
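As a hedged illustration of the kind of routine libseismic wraps, a Richter-style local magnitude with the Hutton and Boore (1987) attenuation term takes only a few lines; production code adds instrument-response removal, station corrections and quality checks:

```python
import numpy as np

def local_magnitude(peak_amp_mm, dist_km):
    """ML = log10(A) - log10(A0(r)), with the Hutton & Boore (1987)
    -log10(A0) term; A is the peak Wood-Anderson amplitude in mm."""
    log_a0 = (1.11 * np.log10(dist_km / 100.0)
              + 0.00189 * (dist_km - 100.0) + 3.0)
    return np.log10(peak_amp_mm) + log_a0

# By construction, a 1 mm amplitude at 100 km gives ML = 3.0.
print(f"ML = {local_magnitude(1.0, 100.0):.1f}")
```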
ERIC Educational Resources Information Center
Al-Aziz, Jameel; Christou, Nicolas; Dinov, Ivo D.
2010-01-01
The amount, complexity and provenance of data have dramatically increased in the past five years. Visualization of observed and simulated data is a critical component of any social, environmental, biomedical or scientific quest. Dynamic, exploratory and interactive visualization of multivariate data, without preprocessing by dimensionality…
Institutional Expansion: The Case of Grid Computing
ERIC Educational Resources Information Center
Kertcher, Zack
2010-01-01
Evolutionary and revolutionary approaches have dominated the study of scientific, technological and institutional change. Yet, being focused on change within a single field, these approaches have been mute about a third, pervasive process. This process is found in a variety of cases that range from open source software to the Monte Carlo method to…
ERIC Educational Resources Information Center
Adadan, Emine
2013-01-01
This study explored two groups of Grade 11 (age 16-17) students' conceptual understandings about aspects of particle theory before, immediately after, and 3 months after instruction with multiple representations (IMR) and instruction with verbal representations (IVR). Data sources included open-ended questionnaires, interviews, and student…
Software Writing Skills for Your Research - Lessons Learned from Workshops in the Geosciences
NASA Astrophysics Data System (ADS)
Hammitzsch, Martin
2016-04-01
Findings presented in scientific papers are based on data and software. Once in a while they come along with data - but rarely with software. Yet the software used to obtain the findings plays a crucial role in the scientific work, even though it is rarely seen as publishable. Researchers may therefore be unable to reproduce the findings without the software, which conflicts with the principle of reproducibility in science. For both the writing of publishable software and the reproducibility issue, the quality of software is of utmost importance. For many programming scientists, the treatment of source code, e.g. code design, version control, documentation, and testing, is associated with additional work that is not covered in the primary research task. This includes the adoption of processes following the software development life cycle. However, the adoption of software engineering rules and best practices has to be recognized and accepted as part of the scientific performance. Most scientists have little incentive to improve code and do not publish code, because software engineering habits are rarely practised by researchers or students, and software engineering skills are not passed on to followers the way paper-writing skills are. Thus it is often felt that the software or code produced is not publishable. The quality of software and its source code has a decisive influence on the quality of research results and their traceability, so establishing best practices from software engineering to serve scientific needs is crucial for the success of scientific software. Even though scientists use existing software and code, e.g., from open source software repositories, only few contribute their code back into the repositories. Writing and opening code for Open Science means that subsequent users are able to run the code, e.g. through the provision of sufficient documentation, sample data sets, tests and comments, which in turn can be verified by adequate and qualified reviews. This assumes that scientists learn to write and release code and software as they learn to write and publish papers. With this in mind, software could be valued and assessed as a contribution to science, but this requires the relevant skills to be passed on to colleagues and followers. Therefore, the GFZ German Research Centre for Geosciences ran three workshops in 2015 to pass software writing skills on to young scientists, the next generation of researchers in the Earth, planetary and space sciences. Experiences in running these workshops and the lessons learned are summarized in this presentation. The workshops received support and funding from Software Carpentry, a volunteer organization whose goal is to make scientists more productive, and their work more reliable, by teaching them basic computing skills, and from FOSTER (Facilitate Open Science Training for European Research), a two-year, EU-funded (FP7) project whose goal is to produce a Europe-wide training programme that helps incorporate Open Access approaches into existing research methodologies and integrate Open Science principles and practice into current research workflows, targeting young researchers and other stakeholders.
Teaching Radiology Physics Interactively with Scientific Notebook Software.
Richardson, Michael L; Amini, Behrang
2018-06-01
The goal of this study is to demonstrate how the teaching of radiology physics can be enhanced with the use of interactive scientific notebook software. We used the scientific notebook software known as Project Jupyter, which is free, open-source, and available for the Macintosh, Windows, and Linux operating systems. We have created a scientific notebook that demonstrates multiple interactive teaching modules we have written for our residents using the Jupyter notebook system. Scientific notebook software allows educators to create teaching modules in a form that combines text, graphics, images, data, interactive calculations, and image analysis within a single document. These notebooks can be used to build interactive teaching modules, which can help explain complex topics in imaging physics to residents. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
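The authors' own modules are not reproduced here, but a minimal example of the kind of interactive teaching cell a Jupyter notebook supports (assuming numpy, matplotlib and ipywidgets are installed) is a Beer-Lambert attenuation demo, where a slider for the linear attenuation coefficient updates the curve live:

```python
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact, FloatSlider

def attenuation_demo(mu=0.2):
    """Beer-Lambert attenuation I = I0 * exp(-mu * x)."""
    x = np.linspace(0, 20, 200)              # absorber thickness, cm
    plt.plot(x, np.exp(-mu * x))
    plt.xlabel("thickness (cm)")
    plt.ylabel("relative intensity I/I0")
    plt.title(f"Beer-Lambert attenuation, mu = {mu:.2f} /cm")
    plt.show()

# In a notebook cell this renders a live slider above the plot.
interact(attenuation_demo,
         mu=FloatSlider(min=0.01, max=1.0, step=0.01, value=0.2))
```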
Incorporating Primary Scientific Literature in Middle and High School Education.
Fankhauser, Sarah C; Lijek, Rebeccah S
2016-03-01
Primary literature is the most reliable and direct source of scientific information, but most middle school and high school science is taught using secondary and tertiary sources. One reason for this is that primary science articles can be difficult to access and interpret for young students and for their teachers, who may lack exposure to this type of writing. The Journal of Emerging Investigators (JEI) was created to fill this gap and provide primary research articles that can be accessed and read by students and their teachers. JEI is a non-profit, online, open-access, peer-reviewed science journal dedicated to mentoring and publishing the scientific research of middle and high school students. JEI articles provide reliable scientific information that is written by students and therefore at a level that their peers can understand. For student-authors who publish in JEI, the review process and the interaction with scientists provide invaluable insight into the scientific process. Moreover, the resulting repository of free, student-written articles allows teachers to incorporate age-appropriate primary literature into the middle and high school science classroom. JEI articles can be used for teaching specific scientific content or for teaching the process of the scientific method itself. The critical thinking skills that students learn by engaging with the primary literature will be invaluable for the development of a scientifically-literate public.
Constructing Scientific Arguments Using Evidence from Dynamic Computational Climate Models
NASA Astrophysics Data System (ADS)
Pallant, Amy; Lee, Hee-Sun
2015-04-01
Modeling and argumentation are two important scientific practices students need to develop throughout school years. In this paper, we investigated how middle and high school students (N = 512) construct a scientific argument based on evidence from computational models with which they simulated climate change. We designed scientific argumentation tasks with three increasingly complex dynamic climate models. Each scientific argumentation task consisted of four parts: multiple-choice claim, open-ended explanation, five-point Likert scale uncertainty rating, and open-ended uncertainty rationale. We coded 1,294 scientific arguments in terms of a claim's consistency with current scientific consensus and whether explanations were model-based or knowledge-based, and categorized the sources of uncertainty (personal vs. scientific). We used chi-square and ANOVA tests to identify significant patterns. Results indicate that (1) a majority of students incorporated models as evidence to support their claims, (2) most students used model output results shown on graphs to confirm their claim rather than to explain simulated molecular processes, (3) students' dependence on model results and their uncertainty rating diminished as the dynamic climate models became more and more complex, (4) some students' misconceptions interfered with observing and interpreting model results or simulated processes, and (5) students' uncertainty sources more frequently reflected their assessment of personal knowledge or abilities related to the tasks than their critical examination of scientific evidence resulting from models. These findings have implications for teaching and research related to the integration of scientific argumentation and modeling practices to address complex Earth systems.
Open-Source Automated Mapping Four-Point Probe.
Chandra, Handy; Allen, Spencer W; Oberloier, Shane W; Bihari, Nupur; Gwamuri, Jephias; Pearce, Joshua M
2017-01-26
Scientists have begun using self-replicating rapid prototyper (RepRap) 3-D printers to manufacture open source digital designs of scientific equipment. This approach is refined here to develop a novel instrument capable of performing automated large-area four-point probe measurements. The designs for conversion of a RepRap 3-D printer to a 2-D open source four-point probe (OS4PP) measurement device are detailed for the mechanical and electrical systems. Free and open source software and firmware are developed to operate the tool. The OS4PP was validated against a wide range of discrete resistors and indium tin oxide (ITO) samples of different thicknesses, both pre- and post-annealing, and was then compared to two commercial proprietary systems. Results for resistors from 10 Ω to 1 MΩ show errors of less than 1% for the OS4PP. The 3-D mapping of sheet resistance of ITO samples successfully demonstrated the automated capability to measure non-uniformities in large-area samples; all measured values are within the same order of magnitude as those from the two proprietary measurement systems. In conclusion, the OS4PP system, which costs less than 70% of the cost of manual proprietary systems, is comparable electrically while offering automated 100-micron positional accuracy for measuring sheet resistance over larger areas.
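The reduction at the heart of such a measurement is standard: for a collinear four-point probe on a large, thin sample, sheet resistance is Rs = (pi / ln 2) * (V / I), a geometric factor of about 4.532. The sketch below applies it over a measurement grid with illustrative values; it is not the OS4PP source:

```python
import numpy as np

GEOM = np.pi / np.log(2)   # ~4.532 for equal probe spacing, thin sheet

def sheet_resistance(voltage_v, current_a):
    """Rs in ohms per square for a collinear four-point probe."""
    return GEOM * voltage_v / current_a

# Toy 3x3 map over an ITO sample at a 1 mA test current.
volts = np.array([[1.1e-2, 1.0e-2, 1.2e-2],
                  [1.0e-2, 0.9e-2, 1.1e-2],
                  [1.3e-2, 1.1e-2, 1.0e-2]])
rs_map = sheet_resistance(volts, 1e-3)
print(np.round(rs_map, 1))   # ohm/sq at each grid point
```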
Free and open-source automated 3-D microscope.
Wijnen, Bas; Petersen, Emily E; Hunt, Emily J; Pearce, Joshua M
2016-11-01
Open-source technology not only has facilitated the expansion of the greater research community, but by lowering costs it has encouraged innovation and customizable design. The field of automated microscopy has remained a challenge in accessibility due to the expense and inflexible, non-interchangeable stages. This paper presents a low-cost, open-source microscope 3-D stage. A RepRap 3-D printer was converted to an optical microscope equipped with a customized, 3-D printed holder for a USB microscope. Precision measurements determined an average error of 10 μm at the maximum speed and 27 μm at the minimum recorded speed. Accuracy tests yielded an error of 0.15%. The machine is a true 3-D stage and thus able to operate with USB microscopes or conventional desktop microscopes. It is larger than all commercial alternatives, and is thus capable of high-depth images over unprecedented areas and complex geometries. The repeatability is below that of commercial 2-D microscope stages, but testing shows that it is adequate for the majority of scientific applications. The open-source microscope stage costs less than 3-9% of the price of the closest proprietary commercial stages. This extreme affordability vastly improves accessibility for 3-D microscopy throughout the world. © 2016 The Authors Journal of Microscopy © 2016 Royal Microscopical Society.
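A RepRap-derived stage of this kind is typically driven with ordinary G-code over a serial link. The following hedged sketch raster-scans a sample and grabs a USB-microscope frame at each stop; the port name, grid spacing, camera index and firmware handshake are assumptions, and the authors' own control software is not reproduced:

```python
import time
import cv2      # pip install opencv-python
import serial   # pip install pyserial

printer = serial.Serial("/dev/ttyUSB0", 115200, timeout=2)
camera = cv2.VideoCapture(0)   # the USB microscope
time.sleep(2)                  # let the firmware boot

def move(x_mm, y_mm, feed=300):
    """Send an absolute G1 move and wait for the firmware's reply."""
    printer.write(f"G1 X{x_mm:.2f} Y{y_mm:.2f} F{feed}\n".encode())
    printer.readline()         # Marlin-style firmware answers "ok"

printer.write(b"G90\n")        # absolute positioning
printer.readline()
for ix in range(5):
    for iy in range(5):
        move(ix * 2.0, iy * 2.0)   # 2 mm grid steps
        time.sleep(0.5)            # settle before imaging
        ok, frame = camera.read()
        if ok:
            cv2.imwrite(f"tile_{ix}_{iy}.png", frame)
```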
Chaste: An Open Source C++ Library for Computational Physiology and Biology
Mirams, Gary R.; Arthurs, Christopher J.; Bernabeu, Miguel O.; Bordas, Rafel; Cooper, Jonathan; Corrias, Alberto; Davit, Yohan; Dunn, Sara-Jane; Fletcher, Alexander G.; Harvey, Daniel G.; Marsh, Megan E.; Osborne, James M.; Pathmanathan, Pras; Pitt-Francis, Joe; Southern, James; Zemzemi, Nejib; Gavaghan, David J.
2013-01-01
Chaste — Cancer, Heart And Soft Tissue Environment — is an open source C++ library for the computational simulation of mathematical models developed for physiology and biology. Code development has been driven by two initial applications: cardiac electrophysiology and cancer development. A large number of cardiac electrophysiology studies have been enabled and performed, including high-performance computational investigations of defibrillation on realistic human cardiac geometries. New models for the initiation and growth of tumours have been developed. In particular, cell-based simulations have provided novel insight into the role of stem cells in the colorectal crypt. Chaste is constantly evolving and is now being applied to a far wider range of problems. The code provides modules for handling common scientific computing components, such as meshes and solvers for ordinary and partial differential equations (ODEs/PDEs). Re-use of these components avoids the need for researchers to ‘re-invent the wheel’ with each new project, accelerating the rate of progress in new applications. Chaste is developed using industrially-derived techniques, in particular test-driven development, to ensure code quality, re-use and reliability. In this article we provide examples that illustrate the types of problems Chaste can be used to solve, which can be run on a desktop computer. We highlight some scientific studies that have used or are using Chaste, and the insights they have provided. The source code, both for specific releases and the development version, is available to download under an open source Berkeley Software Distribution (BSD) licence at http://www.cs.ox.ac.uk/chaste, together with details of a mailing list and links to documentation and tutorials. PMID:23516352
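Chaste itself is C++; to keep this document's examples in a single language, the sketch below shows in Python the kind of reusable ODE component the abstract refers to, applied to the FitzHugh-Nagumo excitable-cell model with scipy standing in for Chaste's ODE solver module:

```python
import numpy as np
from scipy.integrate import solve_ivp

def fitzhugh_nagumo(t, y, I=0.5, a=0.7, b=0.8, eps=0.08):
    """FitzHugh-Nagumo model: a minimal excitable-cell ODE system of
    the sort cardiac packages expose as pluggable 'cell models'."""
    v, w = y
    dv = v - v**3 / 3 - w + I
    dw = eps * (v + a - b * w)
    return [dv, dw]

sol = solve_ivp(fitzhugh_nagumo, (0.0, 200.0), y0=[-1.0, 1.0],
                max_step=0.5, dense_output=True)
t = np.linspace(0, 200, 5)
print(np.round(sol.sol(t)[0], 3))   # membrane-potential-like variable v
```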
Software Attribution for Geoscience Applications in the Computational Infrastructure for Geodynamics
NASA Astrophysics Data System (ADS)
Hwang, L.; Dumit, J.; Fish, A.; Soito, L.; Kellogg, L. H.; Smith, M.
2015-12-01
Scientific software is largely developed by individual scientists and represents a significant intellectual contribution to the field. As the scientific culture and funding agencies move towards an expectation that software be open-source, there is a corresponding need for mechanisms to cite software, both to provide credit and recognition to developers, and to aid in discoverability of software and scientific reproducibility. We assess the geodynamic modeling community's current citation practices by examining more than 300 predominantly self-reported publications from the past 5 years that utilize scientific software available through the Computational Infrastructure for Geodynamics (CIG). Preliminary results indicate that authors cite and attribute software by citing (in rank order) peer-reviewed scientific publications, a user's manual, and/or a paper describing the software code. Attributions may be found directly in the text, in acknowledgements, in figure captions, or in footnotes; what is considered citable varies widely. Citations predominantly lack software version numbers or persistent identifiers to locate the software package, though versioning may be implied through reference to a versioned user manual. Authors sometimes report code features used and whether they have modified the code. As an open-source community, CIG requests that researchers contribute their modifications to the repository. However, such modifications may not be contributed back to a repository code branch, decreasing the chances of discoverability and reproducibility. Survey results through CIG's Software Attribution for Geoscience Applications (SAGA) project suggest that a lack of knowledge, tools, and workflows to cite codes is a barrier to effectively implementing the emerging citation norms. On-demand attributions generated on software landing pages and a prototype extensible plug-in that automatically generates attributions in codes are the first steps towards reproducibility.
Riemann-Hilbert technique scattering analysis of metamaterial-based asymmetric 2D open resonators
NASA Astrophysics Data System (ADS)
Kamiński, Piotr M.; Ziolkowski, Richard W.; Arslanagić, Samel
2017-12-01
The scattering properties of metamaterial-based asymmetric two-dimensional open resonators excited by an electric line source are investigated analytically. The resonators are, in general, composed of two infinite and concentric cylindrical layers covered with an infinitely thin, perfect conducting shell that has an infinite axial aperture. The line source is oriented parallel to the cylinder axis. An exact analytical solution of this problem is derived. It is based on the dual-series approach and its transformation to the equivalent Riemann-Hilbert problem. Asymmetric metamaterial-based configurations are found to lead simultaneously to large enhancements of the radiated power and to highly steerable Huygens-like directivity patterns; properties not attainable with the corresponding structurally symmetric resonators. The presented open resonator designs are thus interesting candidates for many scientific and engineering applications where enhanced directional near- and far-field responses, tailored with beam shaping and steering capabilities, are highly desired.
Open Access to research data - final perspectives from the RECODE project
NASA Astrophysics Data System (ADS)
Bigagli, Lorenzo; Sondervan, Jeroen
2015-04-01
Many networks, initiatives, and communities are addressing the key barriers to Open Access to data in scientific research. These organizations are typically heterogeneous and fragmented by discipline, location, and sector (publishers, academics, data centers, etc.), as well as by other features. Besides, they often work in isolation, or with limited contact with one another. The Policy RECommendations for Open Access to Research Data in Europe (RECODE) project, which will conclude in the first half of 2015, has scoped and addressed the challenges related to Open Access, dissemination and preservation of scientific data, leveraging the existing networks, initiatives, and communities. The overall objective of RECODE was to identify a series of targeted and over-arching policy recommendations for Open Access to European research data based on existing good practice. RECODE has undertaken a review of the existing state of the art and examined five case studies in different scientific disciplines: particle physics and astrophysics, clinical research, medicine and technical physiology (bioengineering), humanities (archaeology), and environmental sciences (Earth Observation). In particular for the latter discipline, GEOSS has been an optimal test bed for investigating the importance of technical and multidisciplinary interoperability, and what the challenges are in sharing and providing Open Access to research data from a variety of sources, and in a variety of formats. RECODE has identified five main technological and infrastructural challenges: • Heterogeneity - relates to interoperability, usability, accessibility, discoverability; • Sustainability - relates to obsolescence, curation, updates/upgrades, persistence, preservation; • Volume - also related to Big Data, which is somehow implied by Open Data; in our context, it relates to discoverability, accessibility (indexing), bandwidth, storage, scalability, energy footprint; • Quality - relates to completeness, description (metadata), usability, data (peer) review; • Security - relates to the technical aspects of policy enforcement, such as the AAA protocol for authentication, authorization and auditing/accounting, privacy issues, etc. RECODE has also focused on the identification of stakeholder values relevant to Open Access to research data, as well as on policy, legal, and institutional aspects. All these issues are of immediate relevance for the whole scientific ecosystem, including researchers, as data producers/users, as well as publishers and libraries, as means for data dissemination and management.
ITK: enabling reproducible research and open science
McCormick, Matthew; Liu, Xiaoxiao; Jomier, Julien; Marion, Charles; Ibanez, Luis
2014-01-01
Reproducibility verification is essential to the practice of the scientific method. Researchers report their findings, which are strengthened as other independent groups in the scientific community share similar outcomes. In the many scientific fields where software has become a fundamental tool for capturing and analyzing data, this requirement of reproducibility implies that reliable and comprehensive software platforms and tools should be made available to the scientific community. The tools will empower them and the public to verify, through practice, the reproducibility of observations that are reported in the scientific literature. Medical image analysis is one of the fields in which the use of computational resources, both software and hardware, are an essential platform for performing experimental work. In this arena, the introduction of the Insight Toolkit (ITK) in 1999 has transformed the field and facilitates its progress by accelerating the rate at which algorithmic implementations are developed, tested, disseminated and improved. By building on the efficiency and quality of open source methodologies, ITK has provided the medical image community with an effective platform on which to build a daily workflow that incorporates the true scientific practices of reproducibility verification. This article describes the multiple tools, methodologies, and practices that the ITK community has adopted, refined, and followed during the past decade, in order to become one of the research communities with the most modern reproducibility verification infrastructure. For example, 207 contributors have created over 2400 unit tests that provide over 84% code line test coverage. The Insight Journal, an open publication journal associated with the toolkit, has seen over 360,000 publication downloads. The median normalized closeness centrality, a measure of knowledge flow, resulting from the distributed peer code review system was high, 0.46. PMID:24600387
Scientific Evidence and Potential Barriers in the Management of Brazilian Protected Areas.
Giehl, Eduardo L H; Moretti, Marcela; Walsh, Jessica C; Batalha, Marco A; Cook, Carly N
2017-01-01
Protected areas are a crucial tool for halting the loss of biodiversity. Yet, the management of protected areas is under-resourced, impacting the ability to achieve effective conservation actions. Effective management depends on the application of the best available knowledge, which can include both scientific evidence and the local knowledge of onsite managers. Despite the clear value of evidence-based conservation, there is still little known about how much scientific evidence is used to guide the management of protected areas. This knowledge gap is especially evident in developing countries, where resource limitations and language barriers may create additional challenges for the use of scientific evidence in management. To assess the extent to which scientific evidence is used to inform management decisions in a developing country, we surveyed Brazilian protected area managers about the information they use to support their management decisions. We targeted on-ground managers who are responsible for management decisions made at the local protected area level. We asked managers about the sources of evidence they use, how frequently they assess the different sources of evidence, and the scientific content of the different sources of evidence. We also considered a range of factors that might explain the use of scientific evidence to guide the management of protected areas, such as the language spoken by managers, the accessibility of evidence sources, and the characteristics of the managers and the protected areas they manage. The managers who responded to our questionnaire reported that they most frequently made decisions based on their personal experience, with scientific evidence being used relatively infrequently. While managers in our study tended to value scientific evidence less highly than other sources, most managers still considered science important for management decisions. Managers reported that the accessibility of scientific evidence is low relative to other types of evidence, with key barriers being the low levels of open access research and insufficient technical training to enable managers to interpret research findings. Based on our results, we suggest that managers in developing countries face all the same challenges as those in developed countries, along with additional language barriers that can prevent greater use of scientific evidence to support effective management of protected areas in Brazil.
Sochat, Vanessa
2018-01-01
Background: Here, we present the Scientific Filesystem (SCIF), an organizational format that supports exposure of executables and metadata for discoverability of scientific applications. The format includes a known filesystem structure, a definition for a set of environment variables describing it, and functions for generation of the variables and interaction with the libraries, metadata, and executables located within. SCIF makes it easy to expose metadata, multiple environments, installation steps, files, and entry points to render scientific applications consistent, modular, and discoverable. A SCIF can be installed on a traditional host or in a container technology such as Docker or Singularity. We start by reviewing the background and rationale for the SCIF, followed by an overview of the specification and the different levels of internal modules (“apps”) that the organizational format affords. Finally, we demonstrate that SCIF is useful by implementing and discussing several use cases that improve user interaction and understanding of scientific applications. SCIF is released along with a client and integration in the Singularity 2.4 software to quickly install and interact with SCIF. When used inside of a reproducible container, a SCIF is a recipe for reproducibility and introspection of the functions and users that it serves. Results: We use SCIF to evaluate container software, provide metrics, serve scientific workflows, and execute a primary function under different contexts. To encourage collaboration and sharing of applications, we developed tools along with an open source, version-controlled, tested, and programmatically accessible web infrastructure. SCIF and associated resources are available at https://sci-f.github.io. The ease of using SCIF, especially in the context of containers, offers promise for scientists’ work to be self-documenting and programmatically parseable for maximum reproducibility. SCIF opens up an abstraction from underlying programming languages and packaging logic to work with scientific applications, opening up new opportunities for scientific software development. PMID:29718213
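To make the organizational format concrete, here is a minimal sketch in Python of how a tool might enumerate the internal "apps" of a SCIF installation. It assumes the conventional /scif/apps root described in the SCIF specification; the helper names and the assumed per-app files are illustrative, not the actual SCIF client API.

```python
import os

SCIF_APPS = "/scif/apps"  # conventional SCIF root; an assumption, configurable in practice

def discover_apps(base=SCIF_APPS):
    """Return a mapping of app name -> assumed locations for each SCIF app found."""
    apps = {}
    if not os.path.isdir(base):
        return apps
    for name in sorted(os.listdir(base)):
        app_dir = os.path.join(base, name)
        if not os.path.isdir(app_dir):
            continue
        # Each app conventionally exposes a bin/ directory plus metadata files;
        # the exact contents below are assumptions for illustration.
        apps[name] = {
            "root": app_dir,
            "bin": os.path.join(app_dir, "bin"),
            "runscript": os.path.join(app_dir, "scif", "runscript"),
        }
    return apps

if __name__ == "__main__":
    for name, meta in discover_apps().items():
        print(f"{name}: {meta['root']}")
```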
Journal of Open Source Software (JOSS): design and first-year review
NASA Astrophysics Data System (ADS)
Smith, Arfon M.
2018-01-01
JOSS is a free and open-access journal that publishes articles describing research software across all disciplines. It has the dual goals of improving the quality of the software submitted and providing a mechanism for research software developers to receive credit. While designed to work within the current merit system of science, JOSS addresses the dearth of rewards for key contributions to science made in the form of software. JOSS publishes articles that encapsulate scholarship contained in the software itself, and its rigorous peer review targets the software components: functionality, documentation, tests, continuous integration, and the license. A JOSS article contains an abstract describing the purpose and functionality of the software, references, and a link to the software archive. JOSS published more than 100 articles in its first year, many from the scientific Python ecosystem (including a number of articles related to astronomy and astrophysics). JOSS is a sponsored project of the nonprofit organization NumFOCUS and is an affiliate of the Open Source Initiative. In this presentation, I'll describe the motivation, design, and progress of the Journal of Open Source Software (JOSS) and how it compares to other avenues for publishing research software in astronomy.
Pre-service elementary teachers' understanding of scientific inquiry and its role in school science
NASA Astrophysics Data System (ADS)
Macaroglu, Esra
The purpose of this research was to explore pre-service elementary teachers' developing understanding of scientific inquiry within the context of their elementary science teaching and learning. More specifically, the study examined 24 pre-service elementary teachers' emerging understanding of (1) the nature of science and scientific inquiry; (2) the "place" of scientific inquiry in school science; and (3) the roles and responsibilities of teachers and students within an inquiry-based learning environment. Data sources consisted primarily of student-generated artifacts collected throughout the semester, including pre/post-philosophy statements and text-based materials collected from electronic dialogue journals. Individual data sources were open-coded to identify concepts and categories expressed by students. Cross-comparisons were conducted and patterns were identified. Assertions were formed with these patterns. Findings are hopeful in that they suggest pre-service teachers can develop a more contemporary view of scientific inquiry when immersed in a context that promotes this perspective. Not surprisingly, however, the prospective teachers encountered a number of barriers when attempting to translate their emerging ideas into practice. More research is needed to determine which teacher preparation experiences are most powerful in supporting pre-service teachers as they construct a framework for science teaching and learning that includes scientific inquiry as a central component.
gadfly: A pandas-based Framework for Analyzing GADGET Simulation Data
NASA Astrophysics Data System (ADS)
Hummel, Jacob A.
2016-11-01
We present the first public release (v0.1) of the open-source GADGET Dataframe Library: gadfly. The aim of this package is to leverage the capabilities of the broader Python scientific computing ecosystem by providing tools for analyzing simulation data from the astrophysical simulation codes GADGET and GIZMO using pandas, a thoroughly documented, open-source library providing high-performance, easy-to-use data structures that is quickly becoming the standard for data analysis in Python. Gadfly is a framework for analyzing particle-based simulation data stored in the HDF5 format using pandas DataFrames. The package enables efficient memory management, includes utilities for unit handling, coordinate transformations, and parallel batch processing, and provides highly optimized routines for visualizing smoothed-particle hydrodynamics data sets.
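To illustrate the pandas-based approach in the abstract (a rough sketch, not gadfly's actual API), the snippet below reads gas-particle data from a GADGET-style HDF5 snapshot into a DataFrame. The group and dataset names ("PartType0", "Coordinates", "Density") follow the common GADGET HDF5 convention but should be treated as assumptions about any specific file.

```python
import h5py
import pandas as pd

def load_gas_particles(snapshot_path):
    """Load gas-particle (PartType0) data from a GADGET-style HDF5 snapshot
    into a pandas DataFrame, one row per particle."""
    with h5py.File(snapshot_path, "r") as f:
        part = f["PartType0"]               # gas particles, by GADGET convention
        coords = part["Coordinates"][...]   # shape (N, 3)
        density = part["Density"][...]      # shape (N,)
    return pd.DataFrame({
        "x": coords[:, 0],
        "y": coords[:, 1],
        "z": coords[:, 2],
        "density": density,
    })

# Usage (hypothetical file name):
# df = load_gas_particles("snapshot_000.hdf5")
# print(df.describe())
```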
Nurturing reliable and robust open-source scientific software
NASA Astrophysics Data System (ADS)
Uieda, L.; Wessel, P.
2017-12-01
Scientific results are increasingly the product of software. The reproducibility and validity of published results cannot be ensured without access to the source code of the software used to produce them. Therefore, the code itself is a fundamental part of the methodology and must be published along with the results. With such a reliance on software, it is troubling that most scientists do not receive formal training in software development. Tools such as version control, continuous integration, and automated testing are routinely used in industry to ensure the correctness and robustness of software. However, many scientists do not even know of their existence (although efforts like Software Carpentry are having an impact on this issue; software-carpentry.org). Publishing the source code is only the first step in creating an open-source project. For a project to grow it must provide documentation, participation guidelines, and a welcoming environment for new contributors. Expanding the project community is often more challenging than the technical aspects of software development. Maintainers must invest time to enforce the rules of the project and to onboard new members, which can be difficult to justify in the context of the "publish or perish" mentality. This problem will continue as long as software contributions are not recognized as valid scholarship by hiring and tenure committees. Furthermore, there are still unsolved problems in providing attribution for software contributions. Many journals and metrics of academic productivity do not recognize citations to sources other than traditional publications. Thus, some authors choose to publish an article about the software and use it as a citation marker. One issue with this approach is that updating the reference to include new contributors involves writing and publishing a new article. A better approach would be to cite a permanent archive of individual versions of the source code in services such as Zenodo (zenodo.org). However, citations to these sources are not always recognized when computing citation metrics. In summary, the widespread development of reliable and robust open-source software relies on the creation of formal training programs in software development best practices and the recognition of software as a valid form of scholarship.
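As a concrete, generic illustration of the automated-testing practice the abstract advocates (not tied to any particular project), a small scientific function and its tests might look like this; running such a suite on every commit through a continuous-integration service is what catches regressions early:

```python
import math

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    if not values:
        raise ValueError("rms() requires at least one value")
    return math.sqrt(sum(v * v for v in values) / len(values))

def test_rms_known_value():
    # rms of (3, 4) is sqrt((9 + 16) / 2) = sqrt(12.5)
    assert abs(rms([3.0, 4.0]) - math.sqrt(12.5)) < 1e-12

def test_rms_rejects_empty_input():
    try:
        rms([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")

if __name__ == "__main__":
    test_rms_known_value()
    test_rms_rejects_empty_input()
    print("all tests passed")
```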
NASA Astrophysics Data System (ADS)
Silva, F.; Maechling, P. J.; Goulet, C.; Somerville, P.; Jordan, T. H.
2013-12-01
The Southern California Earthquake Center (SCEC) Broadband Platform is a collaborative software development project involving SCEC researchers, graduate students, and the SCEC Community Modeling Environment. The SCEC Broadband Platform is open-source scientific software that can generate broadband (0-100Hz) ground motions for earthquakes, integrating complex scientific modules that implement rupture generation, low and high-frequency seismogram synthesis, non-linear site effects calculation, and visualization into a software system that supports easy on-demand computation of seismograms. The Broadband Platform operates in two primary modes: validation simulations and scenario simulations. In validation mode, the Broadband Platform runs earthquake rupture and wave propagation modeling software to calculate seismograms of a historical earthquake for which observed strong ground motion data is available. Also in validation mode, the Broadband Platform calculates a number of goodness of fit measurements that quantify how well the model-based broadband seismograms match the observed seismograms for a certain event. Based on these results, the Platform can be used to tune and validate different numerical modeling techniques. During the past year, we have modified the software to enable the addition of a large number of historical events, and we are now adding validation simulation inputs and observational data for 23 historical events covering the Eastern and Western United States, Japan, Taiwan, Turkey, and Italy. In scenario mode, the Broadband Platform can run simulations for hypothetical (scenario) earthquakes. In this mode, users input an earthquake description, a list of station names and locations, and a 1D velocity model for their region of interest, and the Broadband Platform software then calculates ground motions for the specified stations. By establishing an interface between scientific modules with a common set of input and output files, the Broadband Platform facilitates the addition of new scientific methods, which are written by earth scientists in a number of languages such as C, C++, Fortran, and Python. The Broadband Platform's modular design also supports the reuse of existing software modules as building blocks to create new scientific methods. Additionally, the Platform implements a wrapper around each scientific module, converting input and output files to and from the specific formats required (or produced) by individual scientific codes. Working in close collaboration with scientists and research engineers, the SCEC software development group continues to add new capabilities to the Broadband Platform and to release new versions as open-source scientific software distributions that can be compiled and run on many Linux computer systems. Our latest release includes the addition of 3 new simulation methods and several new data products, such as map and distance-based goodness of fit plots. Finally, as the number and complexity of scenarios simulated using the Broadband Platform increase, we have added batching utilities to substantially improve support for running large-scale simulations on computing clusters.
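The per-module wrapper described above, which converts between a common set of input and output files and each code's native formats so that modules can be chained, is a general pattern; a minimal Python sketch follows (illustrative names only, not the Broadband Platform's actual interfaces).

```python
class ModuleWrapper:
    """Generic wrapper: convert shared-format inputs to a module's native
    format, run the module, and convert its outputs back."""

    def __init__(self, module, to_native, from_native):
        self.module = module            # callable implementing the science
        self.to_native = to_native      # shared format -> module format
        self.from_native = from_native  # module format -> shared format

    def run(self, shared_inputs):
        native_in = self.to_native(shared_inputs)
        native_out = self.module(native_in)
        return self.from_native(native_out)

def run_pipeline(wrappers, inputs):
    """Chain wrapped modules; each consumes the previous one's outputs."""
    data = inputs
    for wrapper in wrappers:
        data = wrapper.run(data)
    return data
```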
The what, why, and how of born-open data.
Rouder, Jeffrey N
2016-09-01
Although many researchers agree that scientific data should be open to scrutiny to ferret out poor analyses and outright fraud, most raw data sets are not available on demand. There are many reasons researchers do not open their data, and one is technical. It is often time-consuming to prepare and archive data. In response, my laboratory has automated the process such that our data are archived the night they are created without any human approval or action. All data are versioned, logged, time-stamped, and uploaded, including aborted runs and data from pilot subjects. The archive is GitHub, github.com, the world's largest collection of open-source materials. Data archived in this manner are called born open. In this paper, I discuss the benefits of born-open data and provide a brief technical overview of the process. I also address some of the common concerns about opening data before publication.
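A minimal sketch of such unattended nightly archiving might look like the following; it assumes the data directory is already a git clone of a GitHub repository with push credentials configured, and is an illustration of the idea rather than the author's actual scripts.

```python
import datetime
import subprocess

DATA_DIR = "/lab/data"  # assumed: a git clone with push credentials configured

def archive_nightly(repo_dir=DATA_DIR):
    """Stage, commit, and push all new data files with a timestamped message."""
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    subprocess.run(["git", "add", "--all"], cwd=repo_dir, check=True)
    # Commit only if something actually changed; 'git diff --cached --quiet'
    # exits nonzero when staged changes exist.
    changed = subprocess.run(
        ["git", "diff", "--cached", "--quiet"], cwd=repo_dir
    ).returncode != 0
    if changed:
        subprocess.run(
            ["git", "commit", "-m", f"Automated nightly data archive {stamp}"],
            cwd=repo_dir, check=True,
        )
        subprocess.run(["git", "push"], cwd=repo_dir, check=True)

if __name__ == "__main__":
    archive_nightly()  # typically invoked by a scheduled (e.g., cron) job
```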
ERIC Educational Resources Information Center
Svechkarev, Denis; Grygorovych, Oleksiy V.
2016-01-01
With more than 20 years of history, the Tournament of Young Chemists is an innovative, cross-disciplinary competition that brings the everyday life of scientists into the classroom and onto the contest stage. Original, open-type problems, unrestricted access to scientific data sources, and personal interaction with researchers from different…
GABBs: Cyberinfrastructure for Self-Service Geospatial Data Exploration, Computation, and Sharing
NASA Astrophysics Data System (ADS)
Song, C. X.; Zhao, L.; Biehl, L. L.; Merwade, V.; Villoria, N.
2016-12-01
Geospatial data are present everywhere today with the proliferation of location-aware computing devices. This is especially true in the scientific community, where large amounts of data are driving research and education activities in many domains. Collaboration over geospatial data, for example in modeling, data analysis, and visualization, must still overcome barriers of specialized software and expertise, among other challenges. In addressing these needs, the Geospatial data Analysis Building Blocks (GABBs) project aims at building geospatial modeling, data analysis, and visualization capabilities in an open source web platform, HUBzero. Funded by NSF's Data Infrastructure Building Blocks initiative, GABBs is creating a geospatial data architecture that integrates spatial data management, mapping, and visualization, and interfaces in the HUBzero platform for scientific collaborations. The geo-rendering-enabled Rappture toolkit, a generic Python mapping library, geospatial data exploration and publication tools, and an integrated online geospatial data management solution are among the software building blocks from the project. The GABBs software will be available as Amazon AWS Marketplace VM images and as open source code. Hosting services are also available to the user community. The outcome of the project will enable researchers and educators to self-manage their scientific data, rapidly create GIS-enabled tools, share geospatial data and tools on the web, and build dynamic workflows connecting data and tools, all without requiring significant software development skills, GIS expertise, or IT administrative privileges. This presentation will describe the GABBs architecture, toolkits, and libraries, and showcase the scientific use cases that utilize GABBs capabilities, as well as the challenges and solutions for GABBs to interoperate with other cyberinfrastructure platforms.
Online data analysis using Web GDL
NASA Astrophysics Data System (ADS)
Jaffey, A.; Cheung, M.; Kobashi, A.
2008-12-01
The ever-improving capability of modern astronomical instruments to capture data at high spatial resolution and cadence is opening up unprecedented opportunities for scientific discovery. When data sets become so large that they cannot be easily transferred over the internet, the researcher must find alternative ways to perform data analysis. One strategy is to bring the data analysis code to where the data resides. We present Web GDL, an implementation of GDL (GNU Data Language, an open source incremental compiler compatible with IDL) that allows users to perform interactive data analysis within a web browser.
2017-12-08
This Solar Dynamics Observatory (SDO) image of the Sun taken on January 20, 2012 in extreme ultraviolet light captures a heart-shaped dark coronal hole. Coronal holes are areas of the Sun's surface that are the source of open magnetic field lines that head way out into space. They are also the source regions of the fast solar wind, which is characterized by a relatively steady speed of approximately 800 km/s (about 1.8 million mph).
Two-step web-mining approach to study geology/geophysics-related open-source software projects
NASA Astrophysics Data System (ADS)
Behrends, Knut; Conze, Ronald
2013-04-01
Geology/geophysics is a highly interdisciplinary science, overlapping with, for instance, physics, biology, and chemistry. In today's software-intensive work environments, geoscientists often encounter new open-source software from scientific fields that are only remotely related to their own field of expertise. We show how web-mining techniques can help to carry out systematic discovery and evaluation of such software. In a first step, we downloaded ~500 abstracts (each consisting of ~1 kb of UTF-8 text) from agu-fm12.abstractcentral.com. This web site hosts the abstracts of all publications presented at AGU Fall Meeting 2012, the world's largest annual geology/geophysics conference. All abstracts belonged to the category "Earth and Space Science Informatics", an interdisciplinary label cross-cutting many disciplines such as "deep biosphere", "atmospheric research", and "mineral physics". Each publication was represented by a highly structured record with ~20 short data attributes, the largest of which is the unstructured "abstract" field. We processed the texts of the abstracts with the statistics software "R" to calculate a corpus and a term-document matrix. Using the R package "tm", we applied text-mining techniques to filter data and develop hypotheses about software-development activities happening in various geology/geophysics fields. By analyzing the term-document matrix with basic techniques (e.g., word frequencies, co-occurrences, weighting) as well as more complex methods (clustering, classification), we extracted several key pieces of information. For example, text-mining can be used to identify scientists who are also developers of open-source scientific software, and the names of their programming projects and codes can also be identified. In a second step, hypotheses generated from the conference abstracts can be tested in another web-mining subproject: by merging the dataset with open data from github.com and stackoverflow.com. These popular, developer-centric websites have powerful application-programming interfaces and follow an open-data policy. In this regard, these sites offer a web-accessible reservoir of information that can be tapped to study questions such as: which open source software projects are eminent in the various geoscience fields? What are the most popular programming languages? How are they trending? Are there any interesting temporal patterns in committer activities? How large are programming teams and how do they change over time? What free software packages exist in the vast realms of related fields? Does the software from these fields have capabilities that might still be useful to me as a researcher, or can it help me perform my work better? Are there any open-source projects that might be commercially interesting? This evaluation strategy tends to reveal newer programming projects. As many important legacy codes are not hosted on open-source code repositories, the presented search method might overlook some older projects.
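The corpus and term-document matrix above are built with R's "tm" package; the equivalent first step can be sketched in Python with scikit-learn, as below, with toy strings standing in for the ~500 downloaded abstracts.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-ins for the ~500 AGU abstracts mined in the study.
abstracts = [
    "We release an open-source Python toolkit for borehole data.",
    "A new open-source seismic processing code is presented.",
    "Atmospheric aerosol measurements from a field campaign.",
]

# Build the document-term matrix (documents x terms), dropping common
# English stop words, analogous to tm's TermDocumentMatrix in R.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(abstracts)

# Simple corpus-wide term frequencies, the kind of basic statistic the
# abstract mentions (word frequencies, co-occurrences, weighting).
freqs = dtm.sum(axis=0).A1
top_terms = sorted(
    zip(vectorizer.get_feature_names_out(), freqs),
    key=lambda tc: -tc[1],
)[:5]
for term, count in top_terms:
    print(term, count)
```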
Learning from open source software projects to improve scientific review
Ghosh, Satrajit S.; Klein, Arno; Avants, Brian; Millman, K. Jarrod
2012-01-01
Peer-reviewed publications are the primary mechanism for sharing scientific results. The current peer-review process is, however, fraught with many problems that undermine the pace, validity, and credibility of science. We highlight five salient problems: (1) reviewers are expected to have comprehensive expertise; (2) reviewers do not have sufficient access to methods and materials to evaluate a study; (3) reviewers are neither identified nor acknowledged; (4) there is no measure of the quality of a review; and (5) reviews take a lot of time, and once submitted cannot evolve. We propose that these problems can be resolved by making the following changes to the review process. Distributing reviews to many reviewers would allow each reviewer to focus on portions of the article that reflect the reviewer's specialty or area of interest and place less of a burden on any one reviewer. Providing reviewers materials and methods to perform comprehensive evaluation would facilitate transparency, greater scrutiny, and replication of results. Acknowledging reviewers makes it possible to quantitatively assess reviewer contributions, which could be used to establish the impact of the reviewer in the scientific community. Quantifying review quality could help establish the importance of individual reviews and reviewers as well as the submitted article. Finally, we recommend expediting post-publication reviews and allowing for the dialog to continue and flourish in a dynamic and interactive manner. We argue that these solutions can be implemented by adapting existing features from open-source software management and social networking technologies. We propose a model of an open, interactive review system that quantifies the significance of articles, the quality of reviews, and the reputation of reviewers. PMID:22529798
NASA Astrophysics Data System (ADS)
Löwe, Peter; Barmuta, Jan; Klump, Jens; Neumann, Janna; Plank, Margret
2014-05-01
The communication of advances in research to the general public, for both education and decision making, is an important aspect of scientific work. An even more crucial task is to gain recognition within the scientific community, which is judged by impact factors and citation counts. Recently, the latter concepts have been extended from textual publications to include data and software publications. This paper presents a case study for science communication and data citation. For this, tectonic models, Free and Open Source Software (FOSS), best practices for data citation, and a multimedia online portal for scientific content are combined. This approach creates mutual benefits for the stakeholders: target audiences receive information on the latest research results, while the use of Digital Object Identifiers (DOI) increases the recognition and citation of the underlying scientific data. This creates favourable conditions for every researcher, as DOI names ensure citeability and long-term availability of scientific research. In the developed application, the FOSS tool for tectonic modelling GPlates is used to visualise and manipulate plate-tectonic reconstructions and associated data through geological time. These capabilities are augmented by the Science on a Halfsphere (SoaH) project with a robust and intuitive visualisation hardware environment. The tectonic models used for science communication are provided by the AGH University of Science and Technology. They focus on the Silurian to Early Carboniferous evolution of Central Europe (Bohemian Massif) and were interpreted for the area of the Geopark Bergstraße Odenwald based on the GPlates/SoaH hardware and software stack. As scientific story-telling is volatile by nature, recordings are a natural means of preservation for further use, reference, and analysis. For this, the upcoming portal for audiovisual media of the German National Library of Science and Technology (TIB) is expected to become a critical service infrastructure. It allows complex search queries, including metadata such as DOIs and media fragment identifiers (MFI), thereby linking data citation and science communication.
NASA Astrophysics Data System (ADS)
Kwon, N.; Gentle, J.; Pierce, S. A.
2015-12-01
Software code developed for research is often used for a relatively short period of time before it is abandoned, lost, or becomes outdated. This unintentional abandonment of code is a genuine problem in the 21st-century scientific process, hindering widespread reusability and increasing the effort needed to develop research software. Potentially important assets, these legacy codes may be resurrected and documented digitally for long-term reuse, often with modest effort. Furthermore, the revived code may be made openly accessible in a public repository for researchers to reuse or improve. For this study, the research team has begun to revive the codebase for the Groundwater Decision Support System (GWDSS), originally developed for participatory decision making to aid urban planning and groundwater management, though it may serve multiple use cases beyond those originally envisioned. GWDSS was designed as a Java-based wrapper with loosely federated commercial and open source components. If successfully revitalized, GWDSS will be useful for practical applications as a teaching tool and case study for groundwater management, as well as for informing theoretical research. Using the knowledge-sharing approaches documented by the NSF-funded OntoSoft project, digital documentation of GWDSS is underway, from conception to development, deployment, characterization, integration, composition, and dissemination through open source communities and geosciences modeling frameworks. Information assets, documentation, and examples are shared using open platforms for data sharing and assigned digital object identifiers. Two instances of GWDSS version 3.0 are being created: 1) a virtual machine instance for the original case study to serve as a live demonstration of the decision support tool, assuring the original version is usable, and 2) an open version of the codebase, executable installation files, and a developer guide available via an open repository, assuring the source for the application is accessible with version control and potential for new branch developments. Finally, metadata about the software has been completed within the OntoSoft portal to provide descriptive curation, make GWDSS searchable, and complete documentation of the scientific software lifecycle.
IQM: An Extensible and Portable Open Source Application for Image and Signal Analysis in Java
Kainz, Philipp; Mayrhofer-Reinhartshuber, Michael; Ahammer, Helmut
2015-01-01
Image and signal analysis applications are substantial in scientific research. Both open source and commercial packages provide a wide range of functions for image and signal analysis, which are sometimes supported very well by the communities in the corresponding fields. Commercial software packages have the major drawback of being expensive and having undisclosed source code, which hampers extending the functionality if there is no plugin interface or similar option available. However, both variants cannot cover all possible use cases and sometimes custom developments are unavoidable, requiring open source applications. In this paper we describe IQM, a completely free, portable and open source (GNU GPLv3) image and signal analysis application written in pure Java. IQM does not depend on any natively installed libraries and is therefore runnable out-of-the-box. Currently, a continuously growing repertoire of 50 image and 16 signal analysis algorithms is provided. The modular functional architecture based on the three-tier model is described along the most important functionality. Extensibility is achieved using operator plugins, and the development of more complex workflows is provided by a Groovy script interface to the JVM. We demonstrate IQM’s image and signal processing capabilities in a proof-of-principle analysis and provide example implementations to illustrate the plugin framework and the scripting interface. IQM integrates with the popular ImageJ image processing software and is aiming at complementing functionality rather than competing with existing open source software. Machine learning can be integrated into more complex algorithms via the WEKA software package as well, enabling the development of transparent and robust methods for image and signal analysis. PMID:25612319
Campagnola, Luke; Kratz, Megan B; Manis, Paul B
2014-01-01
The complexity of modern neurophysiology experiments requires specialized software to coordinate multiple acquisition devices and analyze the collected data. We have developed ACQ4, an open-source software platform for performing data acquisition and analysis in experimental neurophysiology. This software integrates the tasks of acquiring, managing, and analyzing experimental data. ACQ4 has been used primarily for standard patch-clamp electrophysiology, laser scanning photostimulation, multiphoton microscopy, intrinsic imaging, and calcium imaging. The system is highly modular, which facilitates the addition of new devices and functionality. The modules included with ACQ4 provide for rapid construction of acquisition protocols, live video display, and customizable analysis tools. Position-aware data collection allows automated construction of image mosaics and registration of images with 3-dimensional anatomical atlases. ACQ4 uses free and open-source tools including Python, NumPy/SciPy for numerical computation, PyQt for the user interface, and PyQtGraph for scientific graphics. Supported hardware includes cameras, patch clamp amplifiers, scanning mirrors, lasers, shutters, Pockels cells, motorized stages, and more. ACQ4 is available for download at http://www.acq4.org.
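To illustrate the kind of scientific-graphics stack ACQ4 is built on, the following is a generic pyqtgraph sketch of a continuously updating signal display (assuming a recent pyqtgraph release; this is not ACQ4 code):

```python
import numpy as np
import pyqtgraph as pg

app = pg.mkQApp()
win = pg.plot(title="Simulated acquisition")  # plot window with one curve
curve = win.plot(pen="y")
buffer = np.zeros(500)

def update():
    """Shift one new simulated sample into the buffer and redraw."""
    global buffer
    buffer = np.roll(buffer, -1)
    buffer[-1] = np.random.normal()
    curve.setData(buffer)

timer = pg.QtCore.QTimer()
timer.timeout.connect(update)
timer.start(16)  # redraw roughly every 16 ms

if __name__ == "__main__":
    pg.exec()  # start the Qt event loop (pyqtgraph >= 0.12)
```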
pvsR: An Open Source Interface to Big Data on the American Political Sphere.
Matter, Ulrich; Stutzer, Alois
2015-01-01
Digital data from the political sphere is abundant, omnipresent, and more and more directly accessible through the Internet. Project Vote Smart (PVS) is a prominent example of this big public data and covers various aspects of U.S. politics in astonishing detail. Despite the vast potential of PVS' data for political science, economics, and sociology, it is hardly used in empirical research. The systematic compilation of semi-structured data can be complicated and time consuming as the data format is not designed for conventional scientific research. This paper presents a new tool that makes the data easily accessible to a broad scientific community. We provide the software called pvsR as an add-on to the R programming environment for statistical computing. This open source interface (OSI) serves as a direct link between a statistical analysis and the large PVS database. The free and open code is expected to substantially reduce the cost of research with PVS' new big public data in a vast variety of possible applications. We discuss its advantages vis-à-vis traditional methods of data generation as well as already existing interfaces. The validity of the library is documented based on an illustration involving female representation in local politics. In addition, pvsR facilitates the replication of research with PVS data at low costs, including the pre-processing of data. Similar OSIs are recommended for other big public databases.
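pvsR itself is an R library; the general shape of such an open source interface (OSI), a thin client that turns web-database queries into analysis-ready objects, can be sketched in Python as below. The base URL, method name, and parameters are hypothetical placeholders, not the real PVS API.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.example-votesmart.org"  # hypothetical base URL

def query(method, api_key, **params):
    """Call a (hypothetical) REST method and return the parsed JSON payload."""
    params.update({"key": api_key, "o": "JSON"})
    url = f"{API_BASE}/{method}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Usage (hypothetical method and parameters):
# officials = query("Officials.getByZip", api_key="MY_KEY", zip5="02138")
```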
Open-Source Development of the Petascale Reactive Flow and Transport Code PFLOTRAN
NASA Astrophysics Data System (ADS)
Hammond, G. E.; Andre, B.; Bisht, G.; Johnson, T.; Karra, S.; Lichtner, P. C.; Mills, R. T.
2013-12-01
Open-source software development has become increasingly popular in recent years. Open-source encourages collaborative and transparent software development and promotes unlimited free redistribution of source code to the public. Open-source development is good for science as it reveals implementation details that are critical to scientific reproducibility, but generally excluded from journal publications. In addition, research funds that would have been spent on licensing fees can be redirected to code development that benefits more scientists. In 2006, the developers of PFLOTRAN open-sourced their code under the U.S. Department of Energy SciDAC-II program. Since that time, the code has gained popularity among code developers and users from around the world seeking to employ PFLOTRAN to simulate thermal, hydraulic, mechanical and biogeochemical processes in the Earth's surface/subsurface environment. PFLOTRAN is a massively-parallel subsurface reactive multiphase flow and transport simulator designed from the ground up to run efficiently on computing platforms ranging from the laptop to leadership-class supercomputers, all from a single code base. The code employs domain decomposition for parallelism and is founded upon the well-established and open-source parallel PETSc and HDF5 frameworks. PFLOTRAN leverages modern Fortran (i.e. Fortran 2003-2008) in its extensible object-oriented design. The use of this progressive, yet domain-friendly programming language has greatly facilitated collaboration in the code's software development. Over the past year, PFLOTRAN's top-level data structures were refactored as Fortran classes (i.e. extendible derived types) to improve the flexibility of the code, ease the addition of new process models, and enable coupling to external simulators. For instance, PFLOTRAN has been coupled to the parallel electrical resistivity tomography code E4D to enable hydrogeophysical inversion while the same code base can be used as a third-party library to provide hydrologic flow, energy transport, and biogeochemical capability to the community land model, CLM, part of the open-source community earth system model (CESM) for climate. In this presentation, the advantages and disadvantages of open source software development in support of geoscience research at government laboratories, universities, and the private sector are discussed. Since the code is open-source (i.e. it's transparent and readily available to competitors), the PFLOTRAN team's development strategy within a competitive research environment is presented. Finally, the developers discuss their approach to object-oriented programming and the leveraging of modern Fortran in support of collaborative geoscience research as the Fortran standard evolves among compiler vendors.
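PFLOTRAN's refactoring is written in modern Fortran, but the underlying design idea, a base process-model type that new physics modules extend, translates readily; below is a loose Python analogue (the class and method names are invented for illustration and do not correspond to PFLOTRAN's actual types):

```python
class ProcessModel:
    """Base type for a coupled process model (flow, transport, energy, ...)."""

    name = "base"

    def setup(self, grid):
        """Allocate state and couple to the discretized domain."""
        raise NotImplementedError

    def solve_timestep(self, state, dt):
        """Advance this process by one timestep and return the new state."""
        raise NotImplementedError


class RichardsFlow(ProcessModel):
    """Example extension: variably saturated (Richards) flow."""

    name = "richards"

    def setup(self, grid):
        self.grid = grid

    def solve_timestep(self, state, dt):
        # Placeholder: a real implementation performs a nonlinear solve here.
        return state


def advance(models, state, dt, nsteps):
    """Operator-split time loop over all registered process models."""
    for _ in range(nsteps):
        for model in models:
            state = model.solve_timestep(state, dt)
    return state
```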
Lessons Learned through the Development and Publication of AstroImageJ
NASA Astrophysics Data System (ADS)
Collins, Karen
2018-01-01
As lead author of the scientific image processing software package AstroImageJ (AIJ), I will discuss the reasoning behind why we decided to release AIJ to the public, and the lessons we learned related to the development, publication, distribution, and support of AIJ. I will also summarize the AIJ code language selection, code documentation and testing approaches, code distribution, update, and support facilities used, and the code citation and licensing decisions. Since AIJ was initially developed as part of my graduate research and was my first scientific open source software publication, many of my experiences and difficulties encountered may parallel those of others new to scientific software publication. Finally, I will discuss the benefits and disadvantages of releasing scientific software that I now recognize after having AIJ in the public domain for more than five years.
Stories from OpenAQ, a Global and Grassroots Open Air Quality Community
NASA Astrophysics Data System (ADS)
Hasenkopf, C. A.; Flasher, J. C.; Veerman, O.; Scalamogna, A.; Silva, D.; Salmon, M.; Buuralda, D.; DeWitt, L. H.
2016-12-01
Air pollution, responsible for more deaths each year than HIV/AIDS and malaria combined, is a global public health crisis. Yet many scientific questions, including those directly relevant for policy, remain unanswered when it comes to the impact of air pollution on health in highly polluted environments. Often, specific solutions to improving air quality are local and sustained through public engagement, policy, and monitoring. Both the overarching science of air quality and health and local solutions rely on access to reliable, timely air quality data. Over the past year, the OpenAQ community has opened up existing disparate air quality data in 24 countries through an open source platform (openaq.org) so that communities around the world can use it to advance science, public engagement, and policy. We will share stories of communities, from Delhi to Ulaanbaatar and from scientists to journalists, using open air quality data from our platform to advance their fight against air inequality. We will share recent research we have conducted on best practices for engaging different communities and building tools that enable the public to fully unleash the power of open air quality data to fight air inequality. The subsequent open-source tools (github.com/openaq) we have developed from this research and our entire data-sharing platform may be of interest to other open data communities.
Open-Source Automated Mapping Four-Point Probe
Chandra, Handy; Allen, Spencer W.; Oberloier, Shane W.; Bihari, Nupur; Gwamuri, Jephias; Pearce, Joshua M.
2017-01-01
Scientists have begun using self-replicating rapid prototyper (RepRap) 3-D printers to manufacture open source digital designs of scientific equipment. This approach is refined here to develop a novel instrument capable of performing automated large-area four-point probe measurements. The designs for conversion of a RepRap 3-D printer to a 2-D open source four-point probe (OS4PP) measurement device are detailed for the mechanical and electrical systems. Free and open source software and firmware are developed to operate the tool. The OS4PP was validated against a wide range of discrete resistors and indium tin oxide (ITO) samples of different thicknesses, both pre- and post-annealing. The OS4PP was then compared to two commercial proprietary systems. Results for resistors from 10 Ω to 1 MΩ show errors of less than 1% for the OS4PP. The 3-D mapping of sheet resistance of ITO samples successfully demonstrated the automated capability to measure non-uniformities in large-area samples. The results indicate that all measured values are within the same order of magnitude when compared to two proprietary measurement systems. In conclusion, the OS4PP system, which costs less than 70% as much as manual proprietary systems, is electrically comparable while offering automated 100-micron positional accuracy for measuring sheet resistance over larger areas. PMID:28772471
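As background for the measurement being automated here, the sheet resistance of a thin, laterally extended sample measured with a collinear four-point probe follows the standard relation below (the textbook ideal-geometry formula; the paper's own analysis may apply additional correction factors for finite sample size):

```latex
R_s = \frac{\pi}{\ln 2}\,\frac{V}{I} \approx 4.532\,\frac{V}{I}
```

where V is the voltage measured across the inner probe pair and I is the current driven through the outer pair.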
Open cyberGIS software for geospatial research and education in the big data era
NASA Astrophysics Data System (ADS)
Wang, Shaowen; Liu, Yan; Padmanabhan, Anand
CyberGIS represents an interdisciplinary field combining advanced cyberinfrastructure, geographic information science and systems (GIS), spatial analysis and modeling, and a number of geospatial domains to improve research productivity and enable scientific breakthroughs. It has emerged as a new generation of GIS that enables unprecedented advances in data-driven knowledge discovery, visualization and visual analytics, and collaborative problem solving and decision-making. This paper describes three open software strategies (open access, open source, and open integration) to serve various research and education purposes of diverse geospatial communities. These strategies have been implemented in a leading-edge cyberGIS software environment through three corresponding software modalities: CyberGIS Gateway, Toolkit, and Middleware, and have achieved broad and significant impacts.
Nektar++: An open-source spectral/hp element framework
NASA Astrophysics Data System (ADS)
Cantwell, C. D.; Moxey, D.; Comerford, A.; Bolis, A.; Rocco, G.; Mengaldo, G.; De Grazia, D.; Yakovlev, S.; Lombard, J.-E.; Ekelschot, D.; Jordi, B.; Xu, H.; Mohamied, Y.; Eskilsson, C.; Nelson, B.; Vos, P.; Biotto, C.; Kirby, R. M.; Sherwin, S. J.
2015-07-01
Nektar++ is an open-source software framework designed to support the development of high-performance scalable solvers for partial differential equations using the spectral/hp element method. High-order methods are gaining prominence in several engineering and biomedical applications due to their improved accuracy over low-order techniques at reduced computational cost for a given number of degrees of freedom. However, their proliferation is often limited by their complexity, which makes these methods challenging to implement and use. Nektar++ is an initiative to overcome this limitation by encapsulating the mathematical complexities of the underlying method within an efficient C++ framework, making the techniques more accessible to the broader scientific and industrial communities. The software supports a variety of discretisation techniques and implementation strategies, supporting methods research as well as application-focused computation, and the multi-layered structure of the framework allows the user to embrace as much or as little of the complexity as they need. The libraries capture the mathematical constructs of spectral/hp element methods, while the associated collection of pre-written PDE solvers provides out-of-the-box application-level functionality and a template for users who wish to develop solutions for addressing questions in their own scientific domains.
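For readers unfamiliar with the method, the spectral/hp element discretisation approximates the solution on each element as a polynomial expansion; in its generic one-dimensional form (standard textbook notation, not taken from the paper):

```latex
u^{\delta}(x) = \sum_{n=0}^{P} \hat{u}_n \, \phi_n(x)
```

Here the \phi_n are basis functions of maximum order P on each element and the \hat{u}_n are expansion coefficients; accuracy is improved either by refining the mesh (h) or by raising the polynomial order (p), the combination to which the spectral/hp name refers.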
OASYS (OrAnge SYnchrotron Suite): an open-source graphical environment for x-ray virtual experiments
NASA Astrophysics Data System (ADS)
Rebuffi, Luca; Sanchez del Rio, Manuel
2017-08-01
The evolution of hardware platforms, the modernization of software tools, the access of a large number of young people to the codes, and the popularization of open source software for scientific applications drove us to design OASYS (OrAnge SYnchrotron Suite), a completely new graphical environment for modelling X-ray experiments. The implemented software architecture yields not only an intuitive and very easy-to-use graphical interface, but also high flexibility and rapidity for interactive simulations, making configuration changes to quickly compare multiple beamline configurations. Its purpose is to integrate in a synergetic way the most powerful calculation engines available. OASYS integrates different simulation strategies via the implementation of adequate simulation tools for X-ray optics (e.g., ray tracing and wave optics packages). It provides a language to make them communicate by sending and receiving encapsulated data. Python has been chosen as the main programming language because of its universality and popularity in scientific computing. The software Orange, developed at the University of Ljubljana (SLO), is the high-level workflow engine that provides the interaction with the user and the communication mechanisms.
Virtual Labs (Science Gateways) as platforms for Free and Open Source Science
NASA Astrophysics Data System (ADS)
Lescinsky, David; Car, Nicholas; Fraser, Ryan; Friedrich, Carsten; Kemp, Carina; Squire, Geoffrey
2016-04-01
The Free and Open Source Software (FOSS) movement promotes community engagement in software development, as well as provides access to a range of sophisticated technologies that would be prohibitively expensive if obtained commercially. However, as geoinformatics and eResearch tools and services become more dispersed, it becomes more complicated to identify and interface between the many required components. Virtual Laboratories (VLs, also known as Science Gateways) simplify the management and coordination of these components by providing a platform linking many, if not all, of the steps in particular scientific processes. These enable scientists to focus on their science, rather than on the underlying supporting technologies. We describe a modular, open-source VL infrastructure that can be reconfigured to create VLs for a wide range of disciplines. Development of this infrastructure has been led by CSIRO in collaboration with Geoscience Australia and the National Computational Infrastructure (NCI), with support from the National eResearch Collaboration Tools and Resources (NeCTAR) and the Australian National Data Service (ANDS). Initially, the infrastructure was developed to support the Virtual Geophysical Laboratory (VGL), and it has subsequently been repurposed to create the Virtual Hazards Impact and Risk Laboratory (VHIRL) and the reconfigured Australian National Virtual Geophysics Laboratory (ANVGL). During each step of development, new capabilities and services have been added and/or enhanced. We plan on continuing to follow this model using a shared, community code base. The VL platform facilitates transparent and reproducible science by providing access to both the data and the methodologies used during scientific investigations. This is further enhanced by the ability to set up and run investigations using computational resources accessed through the VL. Data is accessed using registries pointing to catalogues within public data repositories (notably including the NCI National Environmental Research Data Interoperability Platform), or by uploading data directly from user-supplied addresses or files. Similarly, scientific software is accessed through registries pointing to software repositories (e.g., GitHub). Runs are configured by using or modifying default templates designed by subject matter experts. After the appropriate computational resources are identified by the user, Virtual Machines (VMs) are spun up and jobs are submitted to service providers (currently the NeCTAR public cloud or Amazon Web Services). Following completion of the jobs, the results can be reviewed and downloaded if desired. By providing a unified platform for science, the VL infrastructure enables sophisticated provenance capture and management. The source of input data (including both collection and queries), user information, software information (version and configuration details), and output information are all captured and managed as a VL resource that can be linked to output data sets. This provenance resource provides a mechanism for publication and citation for Free and Open Source Science.
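The provenance capture described above, tying inputs, user, software version, and outputs together into one citable resource, can be pictured as a simple record; the sketch below is a generic illustration, not the Virtual Laboratory's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProvenanceRecord:
    """One computational run: what went in, what ran, what came out."""
    user: str
    software: str               # e.g. a repository URL
    software_version: str       # e.g. a commit hash or release tag
    input_queries: list = field(default_factory=list)    # data-registry queries
    output_datasets: list = field(default_factory=list)  # result identifiers
    started: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

# Usage (hypothetical repository and version):
# record = ProvenanceRecord(
#     user="jane",
#     software="https://github.com/example/anvgl-solver",
#     software_version="v1.2.0",
# )
```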
Earthscape, a Multi-Purpose Interactive 3d Globe Viewer for Hybrid Data Visualization and Analysis
NASA Astrophysics Data System (ADS)
Sarthou, A.; Mas, S.; Jacquin, M.; Moreno, N.; Salamon, A.
2015-08-01
The hybrid visualization and interaction tool EarthScape is presented here. The software is able to simultaneously display LiDAR point clouds, draped videos with a moving footprint, volumetric scientific data (using volume rendering, isosurfaces, and slice planes), raster data such as still satellite images, vector data, and 3D models such as buildings or vehicles. The application runs on touch screen devices such as tablets. The software is based on open source libraries, such as OpenSceneGraph, osgEarth, and OpenCV, and shader programming is used to implement volume rendering of scientific data. The next goal of EarthScape is to perform data analysis using ENVI Services Engine, a cloud data analysis solution. EarthScape is also designed to be a client of Jagwire, which provides multisource geo-referenced video streams. When all these components are included, EarthScape will be a multi-purpose platform that provides data analysis, hybrid visualization, and complex interactions at the same time. The software is available on demand for free at france@exelisvis.com.
Open-source algorithm for detecting sea ice surface features in high-resolution optical imagery
NASA Astrophysics Data System (ADS)
Wright, Nicholas C.; Polashenski, Chris M.
2018-04-01
Snow, ice, and melt ponds cover the surface of the Arctic Ocean in fractions that change throughout the seasons. These surfaces control albedo and exert tremendous influence over the energy balance in the Arctic. Increasingly available meter- to decimeter-scale resolution optical imagery captures the evolution of the ice and ocean surface state visually, but methods for quantifying coverage of key surface types from raw imagery are not yet well established. Here we present an open-source system designed to provide a standardized, automated, and reproducible technique for processing optical imagery of sea ice. The method classifies surface coverage into three main categories: snow and bare ice, melt ponds and submerged ice, and open water. The method is demonstrated on imagery from four sensor platforms and on imagery spanning from spring thaw to fall freeze-up. Tests show the classification accuracy of this method typically exceeds 96%. To facilitate scientific use, we evaluate the minimum observation area required for reporting a representative sample of surface coverage. We provide an open-source distribution of this algorithm and associated training datasets and suggest the community consider this a step towards standardizing optical sea ice imagery processing. We hope to encourage future collaborative efforts to improve the code base and to analyze large datasets of optical sea ice imagery.
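As a generic illustration of this kind of three-class pixel classification (not the authors' published algorithm), the following sketch trains a classifier on hand-labeled RGB samples and reports fractional surface coverage. All arrays here are synthetic stand-ins.

```python
# Generic sketch of three-class surface classification of sea ice imagery.
# This illustrates the idea only; training data below is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical hand-labeled training pixels: RGB values and class labels
# 0 = snow/bare ice, 1 = melt pond/submerged ice, 2 = open water.
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 255, size=(3000, 3))
y_train = rng.integers(0, 3, size=3000)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Classify a (hypothetical) image: reshape H x W x 3 pixels to N x 3.
image = rng.uniform(0, 255, size=(512, 512, 3))
labels = clf.predict(image.reshape(-1, 3)).reshape(512, 512)

# Fractional coverage of each surface type.
fractions = np.bincount(labels.ravel(), minlength=3) / labels.size
print(dict(zip(["snow/ice", "pond", "water"], fractions)))
```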
Enabling cross-disciplinary research by linking data to Open Access publications
NASA Astrophysics Data System (ADS)
Rettberg, N.
2012-04-01
OpenAIREplus focuses on the linking of research data to associated publications. The interlinking of research objects has implications for optimising the research process, allowing the sharing, enrichment and reuse of data, and ultimately serving to make open data an essential part of first-class research. The growing call for more concrete data management and sharing plans, apparent at funder and national level, is complemented by the increasing support for a scientific infrastructure that supports seamless access to a range of research materials. This paper will describe the recently launched OpenAIREplus and will detail how it plans to achieve its goals of developing an Open Access participatory infrastructure for scientific information. OpenAIREplus extends the current collaborative OpenAIRE project, which provides European researchers with a service network for the deposit of peer-reviewed FP7 grant-funded Open Access publications. This new project will focus on opening up the infrastructure to data sources from subject-specific communities to provide metadata about research data and publications, facilitating the linking between these objects. The ability to link from within a publication out to a citable database or other research data material is fairly innovative, and this project will enable users to search, browse, view, and create relationships between different information objects. In this regard, OpenAIREplus will build on prototypes of so-called "Enhanced Publications", originally conceived in the DRIVER-II project. OpenAIREplus recognizes the importance of representing the context of publications and datasets, thus linking to resources about the authors, their affiliation, location, project data and funding. The project will explore how links between text-based publications and research data are managed in different scientific fields. This complements a previous study in OpenAIRE on current disciplinary practices and future needs for infrastructural Open Access services, taking into account the variety within research approaches. Adopting Linked Data mechanisms on top of citation and content mining, it will address the interchange of data between generic infrastructures such as OpenAIREplus and subject-specific service providers. The paper will also touch on the other challenges envisaged in the project with regard to the culture of sharing data, as well as IPR, licensing and organisational issues.
The SCEC Broadband Platform: Open-Source Software for Strong Ground Motion Simulation and Validation
NASA Astrophysics Data System (ADS)
Goulet, C.; Silva, F.; Maechling, P. J.; Callaghan, S.; Jordan, T. H.
2015-12-01
The Southern California Earthquake Center (SCEC) Broadband Platform (BBP) is a carefully integrated collection of open-source scientific software programs that can simulate broadband (0-100 Hz) ground motions for earthquakes at regional scales. The BBP scientific software modules implement kinematic rupture generation, low- and high-frequency seismogram synthesis using wave propagation through 1D layered velocity structures, seismogram ground motion amplitude calculations, and goodness-of-fit measurements. These modules are integrated into a software system that provides user-defined, repeatable calculation of ground motion seismograms using multiple alternative ground motion simulation methods, together with software utilities that can generate plots, charts, and maps. The BBP has been developed over the last five years in a collaborative scientific, engineering, and software development project involving geoscientists, earthquake engineers, graduate students, and SCEC scientific software developers. The BBP can run earthquake rupture and wave propagation modeling software to simulate ground motions for well-observed historical earthquakes and to quantify how well the simulated broadband seismograms match the observed seismograms. The BBP can also run simulations for hypothetical earthquakes. In this case, users input an earthquake location and magnitude description, a list of station locations, and a 1D velocity model for the region of interest, and the BBP software then calculates ground motions for the specified stations. The SCEC BBP software released in 2015 can be compiled and run on recent Linux systems with GNU compilers. It includes 5 simulation methods, 7 simulation regions covering California, Japan, and Eastern North America, the ability to compare simulation results against GMPEs, updated ground motion simulation methods, and a simplified command line user interface.
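The goodness-of-fit step can be illustrated with a simple generic measure (not the BBP's exact implementation): the mean log ratio of simulated to observed amplitude spectra over a frequency band of interest. The signals below are synthetic placeholders.

```python
# Generic sketch of a seismogram goodness-of-fit measure (not the BBP's
# published implementation): log amplitude-spectrum residuals between an
# observed and a simulated record. Both signals here are synthetic.
import numpy as np

dt = 0.01                                    # sample interval, s
t = np.arange(0, 40, dt)
observed = np.sin(2 * np.pi * 1.0 * t) * np.exp(-0.1 * t)
simulated = 0.8 * np.sin(2 * np.pi * 1.1 * t) * np.exp(-0.1 * t)

freq = np.fft.rfftfreq(t.size, dt)
obs_spec = np.abs(np.fft.rfft(observed))
sim_spec = np.abs(np.fft.rfft(simulated))

band = (freq > 0.1) & (freq < 10.0)          # band over which fit is scored
bias = np.mean(np.log(sim_spec[band] / obs_spec[band]))
print(f"mean log spectral bias: {bias:.3f}  (0 means perfect average fit)")
```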
The Computational Infrastructure for Geodynamics as a Community of Practice
NASA Astrophysics Data System (ADS)
Hwang, L.; Kellogg, L. H.
2016-12-01
Computational Infrastructure for Geodynamics (CIG), geodynamics.org, originated in 2005 out of community recognition that the efforts of individuals or small groups of researchers to develop scientifically sound software are impossible to sustain, duplicate effort, and make it difficult for scientists to adopt state-of-the-art computational methods that promote new discovery. As a community of practice, participants in CIG share an interest in computational modeling in geodynamics and work together on open source software to build the capacity to support complex, extensible, scalable, interoperable, reliable, and reusable software, in an effort to increase the return on investment in scientific software development and increase the quality of the resulting software. The group interacts regularly to learn from each other and improve their practices, formally through webinar series, workshops, and tutorials, and informally through listservs and hackathons. Over the past decade, we have learned that successful scientific software development requires at a minimum: collaboration between domain-expert researchers, software developers and computational scientists; clearly identified and committed lead developer(s); well-defined scientific and computational goals that are regularly evaluated and updated; well-defined benchmarks and testing throughout development; attention throughout development to usability and extensibility; understanding and evaluation of the complexity of dependent libraries; and managed user expectations through education, training, and support. CIG's code donation standards provide the basis for recently formalized best practices in software development (geodynamics.org/cig/dev/best-practices/). Best practices include use of version control; widely used, open source software libraries; extensive test suites; portable configuration and build systems; extensive documentation internal and external to the code; and structured, human-readable input formats.
2011-01-01
Background: Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure-activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds, and the size of proprietary, as well as public, data sets is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. Consistent with the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source, state-of-the-art, high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. Results: This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models, providing the full workflow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated workflow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient, data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge, as flexible applications can be created not only at a scripting level but also in a graphical programming environment. Conclusions: AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements. PMID:21798025
Stålring, Jonna C; Carlsson, Lars A; Almeida, Pedro; Boyer, Scott
2011-07-28
Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure-activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds, and the size of proprietary, as well as public, data sets is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. Consistent with the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source, state-of-the-art, high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models, providing the full workflow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated workflow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient, data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge, as flexible applications can be created not only at a scripting level but also in a graphical programming environment. AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements.
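The automated model and hyper-parameter selection described above can be sketched generically with scikit-learn (AZOrange itself builds on the Orange framework; this is an illustration of the pattern, not its API). The descriptor matrix and activity labels below are synthetic.

```python
# Generic sketch of automated hyper-parameter selection for a QSAR data
# set, in the spirit of (but not using) AZOrange. X and y are synthetic
# placeholders for descriptors and activity classes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 50))               # 500 compounds x 50 descriptors
y = rng.integers(0, 2, size=500)             # active / inactive

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [100, 300], "max_features": ["sqrt", 0.3]},
    cv=5,
)
search.fit(X_tr, y_tr)
print(search.best_params_, search.score(X_te, y_te))
```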
Inexpensive, Low-Power, Open-Source Data Logging Hardware Development
NASA Astrophysics Data System (ADS)
Sandell, C. T.; Schulz, B.; Wickert, A. D.
2017-12-01
Over the past six years, we have developed a suite of open-source, low-cost, and lightweight data loggers for scientific research. These loggers employ the popular and easy-to-use Arduino programming environment, but consist of custom hardware optimized for field research. They may be connected to a broad and expanding range of off-the-shelf sensors, with software support built directly into the "ALog" library. Three main models exist: The ALog (for Autonomous or Arduino Logger) is the extreme low-power model for years-long deployments with only primary AA or D batteries. The ALog shield is a stripped-down ALog that nests with a standard Arduino board for prototyping or education. The TLog (for Telemetering Logger) contains an embedded radio with 500 m range and a GPS for communications and precision timekeeping. This enables meshed networks of loggers that can send their data back to an internet-connected "home base" logger for near-real-time field data retrieval. All boards feature a high-precision clock, a full-size SD card slot for high-volume data storage, large screw terminals to connect sensors, interrupts, SPI and I2C communication capability, and 3.3 V/5 V power outputs. The ALog and TLog have fourteen 16-bit analog inputs with a precision voltage reference for precise analog measurements. Their components are rated -40 to +85 degrees C, and they have been tested in harsh field conditions. These low-cost and open-source data loggers have enabled our research group to collect field data across North and South America on a limited budget, support student projects, and build toward better future scientific data systems.
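Retrieving logged records from such a board over USB can be sketched with pySerial. The port name, baud rate, and comma-separated record format below are assumptions for illustration, not the ALog specification.

```python
# Hypothetical sketch: reading lines of logged sensor data from an
# ALog-style board over USB serial with pySerial. Port, baud rate, and
# record format are assumptions, not the ALog specification.
import serial

with serial.Serial("/dev/ttyUSB0", 38400, timeout=2) as port:
    for _ in range(10):
        line = port.readline().decode("ascii", errors="replace").strip()
        if not line:
            continue                         # timeout with no data
        fields = line.split(",")             # e.g. timestamp, sensor values
        print(fields)
```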
NASA Astrophysics Data System (ADS)
Silva, F.; Maechling, P. J.; Goulet, C. A.; Somerville, P.; Jordan, T. H.
2014-12-01
The Southern California Earthquake Center (SCEC) Broadband Platform is a collaborative software development project involving geoscientists, earthquake engineers, graduate students, and the SCEC Community Modeling Environment. The SCEC Broadband Platform (BBP) is open-source scientific software that can generate broadband (0-100 Hz) ground motions for earthquakes, integrating complex scientific modules that implement rupture generation, low- and high-frequency seismogram synthesis, non-linear site effects calculation, and visualization into a software system that supports easy on-demand computation of seismograms. The Broadband Platform operates in two primary modes: validation simulations and scenario simulations. In validation mode, the Platform runs earthquake rupture and wave propagation modeling software to calculate seismograms for a well-observed historical earthquake. Then, the BBP calculates a number of goodness-of-fit measurements that quantify how well the model-based broadband seismograms match the observed seismograms for a certain event. Based on these results, the Platform can be used to tune and validate different numerical modeling techniques. In scenario mode, the Broadband Platform can run simulations for hypothetical (scenario) earthquakes. In this mode, users input an earthquake description, a list of station names and locations, and a 1D velocity model for their region of interest, and the Broadband Platform software then calculates ground motions for the specified stations. Working in close collaboration with scientists and research engineers, the SCEC software development group continues to add new capabilities to the Broadband Platform and to release new versions as open-source scientific software distributions that can be compiled and run on many Linux computer systems. Our latest release includes 5 simulation methods, 7 simulation regions covering California, Japan, and Eastern North America, the ability to compare simulation results against GMPEs, and several new data products, such as map- and distance-based goodness-of-fit plots. As the number and complexity of scenarios simulated using the Broadband Platform increases, we have added batching utilities to substantially improve support for running large-scale simulations on computing clusters.
Computational Infrastructure for Geodynamics (CIG)
NASA Astrophysics Data System (ADS)
Gurnis, M.; Kellogg, L. H.; Bloxham, J.; Hager, B. H.; Spiegelman, M.; Willett, S.; Wysession, M. E.; Aivazis, M.
2004-12-01
Solid earth geophysicists have a long tradition of writing scientific software to address a wide range of problems. In particular, computer simulations came into wide use in geophysics during the decade after the plate tectonic revolution. Solution schemes and numerical algorithms that developed in other areas of science, most notably engineering, fluid mechanics, and physics, were adapted with considerable success to geophysics. This software has largely been the product of individual efforts, and although this approach has proven successful, it is now starting to show its limitations as we try to share codes and algorithms or to recombine codes in novel ways to produce new science. With funding from the NSF, the US community has embarked on a Computational Infrastructure for Geodynamics (CIG) that will develop, support, and disseminate community-accessible software for the greater geodynamics community, from model developers to end-users. The software is being developed for problems involving mantle and core dynamics, crustal and earthquake dynamics, magma migration, seismology, and other related topics. With a high level of community participation, CIG is leveraging state-of-the-art scientific computing into a suite of open-source tools and codes. The infrastructure that we are now starting to develop will consist of: (a) a coordinated effort to develop reusable, well-documented and open-source geodynamics software; (b) the basic building blocks - an infrastructure layer - of software by which state-of-the-art modeling codes can be quickly assembled; (c) extension of existing software frameworks to interlink multiple codes and data through a superstructure layer; (d) strategic partnerships with the larger world of computational science and geoinformatics; and (e) specialized training and workshops for both the geodynamics and broader Earth science communities. The CIG initiative has already started to leverage and develop long-term strategic partnerships with open source development efforts within the larger thrusts of scientific computing and geoinformatics. These strategic partnerships are essential as the frontier has moved into multi-scale and multi-physics problems in which many investigators now want to use simulation software for data interpretation, data assimilation, and hypothesis testing.
pvsR: An Open Source Interface to Big Data on the American Political Sphere
2015-01-01
Digital data from the political sphere is abundant, omnipresent, and more and more directly accessible through the Internet. Project Vote Smart (PVS) is a prominent example of this big public data and covers various aspects of U.S. politics in astonishing detail. Despite the vast potential of PVS’ data for political science, economics, and sociology, it is hardly used in empirical research. The systematic compilation of semi-structured data can be complicated and time consuming as the data format is not designed for conventional scientific research. This paper presents a new tool that makes the data easily accessible to a broad scientific community. We provide the software called pvsR as an add-on to the R programming environment for statistical computing. This open source interface (OSI) serves as a direct link between a statistical analysis and the large PVS database. The free and open code is expected to substantially reduce the cost of research with PVS’ new big public data in a vast variety of possible applications. We discuss its advantages vis-à-vis traditional methods of data generation as well as already existing interfaces. The validity of the library is documented based on an illustration involving female representation in local politics. In addition, pvsR facilitates the replication of research with PVS data at low costs, including the pre-processing of data. Similar OSIs are recommended for other big public databases. PMID:26132154
NoSQL data model for semi-automatic integration of ethnomedicinal plant data from multiple sources.
Ningthoujam, Sanjoy Singh; Choudhury, Manabendra Dutta; Potsangbam, Kumar Singh; Chetia, Pankaj; Nahar, Lutfun; Sarker, Satyajit D; Basar, Norazah; Das Talukdar, Anupam
2014-01-01
Sharing traditional knowledge with the scientific community could refine scientific approaches to phytochemical investigation and conservation of ethnomedicinal plants. As such, integration of traditional knowledge with scientific data using a single platform for sharing is greatly needed. However, ethnomedicinal data are available in heterogeneous formats, which depend on cultural aspects, survey methodology and the focus of the study. Phytochemical and bioassay data are also available from many open sources in various standard and customised formats. The objective of this work was to design a flexible data model that could integrate both primary and curated ethnomedicinal plant data from multiple sources. The current model is based on MongoDB, one of the Not only Structured Query Language (NoSQL) databases. Although MongoDB is schema-less, the model was adapted so that it could incorporate both standard and customised ethnomedicinal plant data formats from different sources. The model presented can integrate both primary and secondary data related to ethnomedicinal plants. Accommodation of disparate data was accomplished by a feature of this database that supports a different set of fields for each document. It also allows storage of similar data having different properties. The model presented is scalable to a highly complex level with continuing maturation of the database, and is applicable for storing, retrieving and sharing ethnomedicinal plant data. It can also serve as a flexible alternative to a relational and normalised database. Copyright © 2014 John Wiley & Sons, Ltd.
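The per-document flexibility the paper relies on can be sketched with MongoDB's Python driver: documents with different field sets coexist in one collection and are reachable by a single query. The field names and values below are illustrative only, not the paper's schema.

```python
# Minimal sketch of the paper's idea in MongoDB via pymongo: a field survey
# record and a curated phytochemical record share a collection despite
# having different fields. All field names here are illustrative.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["ethnomed"]

db.plants.insert_one({                       # primary field-survey record
    "species": "Centella asiatica",
    "ailments": ["wound healing"],
    "informant": {"district": "example-district", "method": "interview"},
})
db.plants.insert_one({                       # curated phytochemical record
    "species": "Centella asiatica",
    "compounds": [{"name": "asiaticoside", "assay": "in vitro"}],
    "source_db": "open repository",
})

# One query spans both document shapes; missing fields are simply absent.
for doc in db.plants.find({"species": "Centella asiatica"}):
    print(sorted(doc.keys()))
```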
Mode-locked thin-disk lasers and their potential application for high-power terahertz generation
NASA Astrophysics Data System (ADS)
Saraceno, Clara J.
2018-04-01
The progress achieved in the last few decades in the performance of ultrafast laser systems with high average power has been tremendous, and continues to provide momentum to new exciting applications, both in scientific research and technology. Among the various technological advances that have shaped this progress, mode-locked thin-disk oscillators have attracted significant attention as a unique technology capable of providing ultrashort pulses with high energy (tens to hundreds of microjoules) and at very high repetition rates (in the megahertz regime) from a single table-top oscillator. This technology opens the door to compact high repetition rate ultrafast sources spanning the entire electromagnetic spectrum from the XUV to the terahertz regime, opening various new application fields. In this article, we focus on their unexplored potential as compact driving sources for high average power terahertz generation.
Using Kepler for Tool Integration in Microarray Analysis Workflows.
Gan, Zhuohui; Stowe, Jennifer C; Altintas, Ilkay; McCulloch, Andrew D; Zambon, Alexander C
Increasing numbers of genomic technologies are leading to massive amounts of genomic data, all of which require complex analysis. More and more bioinformatics analysis tools are being developed by scientists to simplify these analyses. However, different pipelines have been developed using different software environments, which makes integration of these diverse bioinformatics tools difficult. Kepler provides an open source environment to integrate these disparate packages. Using Kepler, we integrated several external tools, including Bioconductor packages, AltAnalyze (a Python-based open source tool), and an R-based comparison tool, to build an automated workflow to meta-analyze both online and local microarray data. The automated workflow connects the integrated tools seamlessly, delivers data flow between the tools smoothly, and hence improves the efficiency and accuracy of complex data analyses. Our workflow exemplifies the usage of Kepler as a scientific workflow platform for bioinformatics pipelines.
Jonnalagadda, Siddhartha; Gonzalez, Graciela
2010-11-13
BioSimplify is an open source tool written in Java that introduces and facilitates the use of a novel model for sentence simplification tuned for automatic discourse analysis and information extraction (as opposed to sentence simplification for improving human readability). The model is based on a "shot-gun" approach that produces many different (simpler) versions of the original sentence by combining variants of its constituent elements. This tool is optimized for processing biomedical scientific literature such as the abstracts indexed in PubMed. We tested our tool's impact on the task of protein-protein interaction (PPI) extraction: it improved the F-score of the PPI tool by around 7%, with an improvement in recall of around 20%. The BioSimplify tool and test corpus can be downloaded from https://biosimplify.sourceforge.net.
NASA Astrophysics Data System (ADS)
Šerpenskienė, Silvija; Skridlaitė, Gražina
2014-05-01
The Open Access Centre (OAC) was established in Vilnius, Lithuania in 2013 as a subdivision of the Nature Research Centre (NRC), operating on the principle of open access for both internal and external users. The OAC consists of 15 units, i.e. 15 NRC laboratories or their branches. Forty-four sets of research equipment were purchased. The OAC cooperates with Lithuanian science and studies institutions, the business sector and other governmental and public institutions. Investigations can be carried out in the Geosciences, Biotaxonomy, Ecology and Molecular Research, and Ecotoxicology fields. Environmental radioactivity, radioecology, nuclear geophysics, the microscopic and chemical composition of natural compounds (minerals, rocks etc.), paleomagnetic, magnetic and environmental investigations, ground and water contamination by oil products and other organic environment-polluting compounds, as well as the identification of fossils, rocks and minerals, can be studied in the Georesearch field. Ecosystems and the identification of plants, animals and microorganisms are the main subjects of the Biotaxonomy, Ecology and Molecular Research field. The Ecotoxicological Research field deals with the toxic and genotoxic effects of toxic substances and other sources of pollution on macro- and microorganisms and cell cultures. Open access is guaranteed by: (1) providing scientific research and experimental development services; (2) implementing joint business and science projects; (3) using facilities for the training of specialists of the highest qualifications; (4) providing properly qualified and technically trained users with opportunities to carry out their scientific research and/or experiments in the OAC laboratories by themselves. Services provided in the Open Access Centre can be received by both internal and external users: persons undertaking innovative economic activities, students of other educational institutions, interns, external teams of researchers engaged in scientific research activities, teachers etc. Applications for a grant of open access are received online in accordance with the established procedure via the NRC website (www.gamtostyrimai.lt). State-of-the-art equipment enables researchers to carry out up-to-date scientific research and educational projects, scientific experiments, and graduation and laboratory works. Scientists, researchers and students get the opportunity to deepen their knowledge, conduct new research in the field of natural sciences, and obtain new data to be used for further studies as well as for the development of products of higher added value. Favourable conditions are created for pursuing and developing higher-level scientific research, for the implementation of joint and interdisciplinary projects, and for enhancing cooperation between business and public institutions as well as between those of studies and science. The implementation of the above-mentioned tasks leads to the enhanced competitiveness of Lithuanian scientists and researchers and to the dissemination of high-quality scientific knowledge to society. Tens of students from different universities and researchers from other institutions are using the OAC facilities. "Pan-European coordination action on CO2 Geological Storage (CGS Europe)"; "GEO-SEAS"; "EMODNET"; "Securing the Conservation of biodiversity across Administrative Levels and spatial, temporal, and Ecological Scales (SCALES)"; "Decline Of Fraxinus excelsior in northern Europe" and other projects are being carried out at the OAC so far.
This is a contribution to the Open Access Centre activities
Open Access and Civic Scientific Information Literacy
ERIC Educational Resources Information Center
Zuccala, Alesia
2010-01-01
Introduction: We examine how residents and citizens of The Netherlands perceive open access to acquire preliminary insight into the role it might play in cultivating civic scientific literacy. Open access refers to scientific or scholarly research literature available on the Web to scholars and the general public in free online journals and…
Open-source three-dimensional printing of biodegradable polymer scaffolds for tissue engineering.
Trachtenberg, Jordan E; Mountziaris, Paschalia M; Miller, Jordan S; Wettergreen, Matthew; Kasper, Fred K; Mikos, Antonios G
2014-12-01
The fabrication of scaffolds for tissue engineering requires elements of customization depending on the application and is often limited due to the flexibility of the processing technique. This investigation seeks to address this obstacle by utilizing an open-source three-dimensional printing (3DP) system that allows vast customizability and facilitates reproduction of experiments. The effects of processing parameters on printed poly(ε-caprolactone) scaffolds with uniform and gradient pore architectures have been characterized with respect to fiber and pore morphology and mechanical properties. The results demonstrate the ability to tailor the fiber diameter, pore size, and porosity through modification of pressure, printing speed, and programmed fiber spacing. A model was also used to predict the compressive mechanical properties of uniform and gradient scaffolds, and it was found that modulus and yield strength declined with increasing porosity. The use of open-source 3DP technologies for printing tissue-engineering scaffolds provides a flexible system that can be readily modified at a low cost and is supported by community documentation. In this manner, the 3DP system is more accessible to the scientific community, which further facilitates the translation of these technologies toward successful tissue-engineering strategies.
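The porosity-stiffness trend the study reports (modulus and yield strength falling as porosity rises) can be illustrated with a standard scaffold scaling law. The Gibson-Ashby power law and the bulk modulus value below are assumptions for illustration, not the authors' fitted model.

```python
# Illustrative porosity-stiffness relation of the kind the study reports.
# The Gibson-Ashby power law E = E_solid * (1 - porosity)**2 is a standard
# cellular-solid model, not the authors' fitted model; E_solid for PCL is
# an assumed round number.
import numpy as np

E_solid = 350.0                              # MPa, assumed PCL bulk modulus
porosity = np.linspace(0.3, 0.8, 6)
E_scaffold = E_solid * (1.0 - porosity) ** 2

for p, e in zip(porosity, E_scaffold):
    print(f"porosity {p:.2f} -> predicted modulus {e:6.1f} MPa")
```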
Low-cost embedded systems for democratizing ocean sensor technology in the coastal zone
NASA Astrophysics Data System (ADS)
Glazer, B. T.; Lio, H. I.
2017-12-01
Environmental sciences suffer from undersampling. Enabling sustained and unattended data collection in the coastal zone typically involves expensive instrumentation and infrastructure deployed as cabled observatories or moorings with little flexibility in deployment location following initial installation. High costs of commercially-available or custom instruments have limited the number of sensor sites that can be targeted by academic researchers, and have also limited engagement with the public. We have developed a novel, low-cost, open-source sensor and software platform to enable wireless data transfer of biogeochemical sensors in the coastal zone. The platform is centered upon widely available, low-cost, single board computers and microcontrollers. We have used a blend of on-hand research-grade sensors and low-cost open-source electronics that can be assembled by tech-savvy non-engineers. Robust, open-source code that remains customizable for specific miniNode configurations can match a specific site's measurement needs, depending on the scientific research priorities. We have demonstrated prototype capabilities and versatility through lab testing and field deployments of multiple sensor nodes with multiple sensor inputs, all of which are streaming near-real-time data from Kaneohe Bay over wireless RF links to a shore-based base station.
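A node in such a network reduces to a simple loop: read a sensor, timestamp the value, and forward it to the base station. The sketch below is hypothetical; the endpoint URL, node name, and read_sensor() stub are assumptions, not the project's actual code.

```python
# Hypothetical sketch of a sensor-node loop on a single-board computer:
# read a value, timestamp it, and post JSON to a base station. The URL
# and the read_sensor() stub are illustrative assumptions.
import time
import random
import requests

BASE_STATION = "http://192.168.1.10:8000/ingest"   # hypothetical address

def read_sensor():
    # Stand-in for a real driver (e.g., an ADC or I2C oxygen sensor).
    return {"oxygen_umol_l": 210 + random.random()}

while True:
    record = {"node": "kbay-01", "t": time.time(), **read_sensor()}
    try:
        requests.post(BASE_STATION, json=record, timeout=5)
    except requests.RequestException:
        pass                                 # buffering/retry logic omitted
    time.sleep(60)                           # one sample per minute
```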
ACQ4: an open-source software platform for data acquisition and analysis in neurophysiology research
Campagnola, Luke; Kratz, Megan B.; Manis, Paul B.
2014-01-01
The complexity of modern neurophysiology experiments requires specialized software to coordinate multiple acquisition devices and analyze the collected data. We have developed ACQ4, an open-source software platform for performing data acquisition and analysis in experimental neurophysiology. This software integrates the tasks of acquiring, managing, and analyzing experimental data. ACQ4 has been used primarily for standard patch-clamp electrophysiology, laser scanning photostimulation, multiphoton microscopy, intrinsic imaging, and calcium imaging. The system is highly modular, which facilitates the addition of new devices and functionality. The modules included with ACQ4 provide for rapid construction of acquisition protocols, live video display, and customizable analysis tools. Position-aware data collection allows automated construction of image mosaics and registration of images with 3-dimensional anatomical atlases. ACQ4 uses free and open-source tools including Python, NumPy/SciPy for numerical computation, PyQt for the user interface, and PyQtGraph for scientific graphics. Supported hardware includes cameras, patch clamp amplifiers, scanning mirrors, lasers, shutters, Pockels cells, motorized stages, and more. ACQ4 is available for download at http://www.acq4.org. PMID:24523692
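The kind of live scientific display ACQ4 builds on can be sketched with PyQtGraph and NumPy directly. This is not ACQ4's own API, the data is synthetic, and pg.exec() assumes PyQtGraph 0.12 or later.

```python
# Sketch of a live-updating display using PyQtGraph (one of the libraries
# ACQ4 is built on); this is not ACQ4's API, and the trace is synthetic.
import numpy as np
import pyqtgraph as pg

app = pg.mkQApp("demo")
plot = pg.plot(title="Simulated membrane potential")
curve = plot.plot(pen="y")
data = np.zeros(1000)

def update():
    # Shift in one new synthetic sample, as a stand-in for acquisition.
    global data
    data = np.roll(data, -1)
    data[-1] = -65 + np.random.randn()
    curve.setData(data)

timer = pg.QtCore.QTimer()
timer.timeout.connect(update)
timer.start(20)                              # ~50 Hz refresh
pg.exec()                                    # pyqtgraph >= 0.12
```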
NASA Astrophysics Data System (ADS)
Pilone, D.; Cechini, M. F.; Mitchell, A.
2011-12-01
Earth Science applications typically deal with large amounts of data and high throughput rates, if not also high transaction rates. While Open Source is frequently used for smaller scientific applications, large scale, highly available systems frequently fall back to "enterprise" class solutions like Oracle RAC or commercial grade JEE Application Servers. NASA's Earth Observing System Data and Information System (EOSDIS) provides end-to-end capabilities for managing NASA's Earth science data from multiple sources - satellites, aircraft, field measurements, and various other programs. A core capability of EOSDIS, the Earth Observing System (EOS) Clearinghouse (ECHO), is a highly available search and order clearinghouse of over 100 million pieces of science data that has evolved from its early R&D days to a fully operational system. Over the course of this maturity ECHO has largely transitioned from commercial frameworks, databases, and operating systems to Open Source solutions...and in some cases, back. In this talk we discuss the progression of our technological solutions and our lessons learned in the areas of: (1) high-performance, large-scale searching solutions; (2) geospatial search capabilities and dealing with multiple coordinate systems; (3) search and storage of variable-format source (science) data; (4) highly available deployment solutions; and (5) scalable (elastic) solutions to visual searching and image handling. Throughout the evolution of the ECHO system we have had to evaluate solutions with respect to performance, cost, developer productivity, reliability, and maintainability in the context of supporting global science users. Open Source solutions have played a significant role in our architecture and development, but several critical commercial components remain (or have been reinserted) to meet our operational demands.
Maintaining Quality and Confidence in Open-Source, Evolving Software: Lessons Learned with PFLOTRAN
NASA Astrophysics Data System (ADS)
Frederick, J. M.; Hammond, G. E.
2017-12-01
Software evolution in an open-source framework poses a major challenge to a geoscientific simulator, but when properly managed, the pay-off can be enormous for both the developers and the community at large. Developers must juggle implementing new scientific process models, adopting increasingly efficient numerical methods and programming paradigms, and changing funding sources (or a total lack of funding), while also ensuring that legacy code remains functional and reported bugs are fixed in a timely manner. With robust software engineering and a plan for long-term maintenance, a simulator can evolve over time, incorporating and leveraging many advances in the computational and domain sciences. In this positive light, what practices in software engineering and code maintenance can be employed within open-source development to maximize the positive aspects of software evolution and community contributions while minimizing their negative side effects? This presentation discusses steps taken in the development of PFLOTRAN (www.pflotran.org), an open source, massively parallel subsurface simulator for multiphase, multicomponent, and multiscale reactive flow and transport processes in porous media. As PFLOTRAN's user base and development team continue to grow, it has become increasingly important to implement strategies which ensure sustainable software development while maintaining software quality and community confidence. In this presentation, we will share our experiences and "lessons learned" within the context of our open-source development framework and community engagement efforts. Topics discussed will include how we've leveraged both standard software engineering principles, such as coding standards, version control, and automated testing, as well as the unique advantages of object-oriented design in process model coupling, to ensure software quality and confidence. We will also be prepared to discuss the major challenges faced by most open-source software teams, such as on-boarding new developers or one-time contributions, dealing with competitors or lookie-loos, and other downsides of complete transparency, as well as our approach to community engagement, including a user group email list, hosting short courses and workshops for new users, and maintaining a website. SAND2017-8174A
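The automated-testing practice mentioned above is often implemented as gold-file regression tests. A generic sketch of that pattern follows (not PFLOTRAN's actual test harness; the file paths are hypothetical).

```python
# Generic sketch of gold-file regression testing (not PFLOTRAN's actual
# harness): compare a simulator's output against a stored "gold" result
# within a tolerance. File paths are hypothetical.
import numpy as np

def run_simulation():
    # Stand-in for invoking the real simulator and loading its output.
    return np.loadtxt("output/concentrations.txt")

def test_against_gold():
    gold = np.loadtxt("gold/concentrations.txt")
    new = run_simulation()
    # Tolerances guard against platform/compiler round-off differences.
    np.testing.assert_allclose(new, gold, rtol=1e-6, atol=1e-12)
```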
Hart, Reece K; Rico, Rudolph; Hare, Emily; Garcia, John; Westbrook, Jody; Fusaro, Vincent A
2015-01-15
Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available and comprehensive programming libraries are available. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Hart, Reece K.; Rico, Rudolph; Hare, Emily; Garcia, John; Westbrook, Jody; Fusaro, Vincent A.
2015-01-01
Summary: Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available and comprehensive programming libraries are available. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. Availability and implementation: The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). Contact: reecehart@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25273102
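A minimal usage sketch of the hgvs package described above, following its documented quick start: the variant string is an arbitrary, syntactically valid example, and parsing is purely grammar-based, so no network access is needed.

```python
# Minimal sketch of parsing an HGVS variant string with the hgvs package;
# the variant below is an arbitrary, syntactically valid example.
import hgvs.parser

hp = hgvs.parser.Parser()
var = hp.parse_hgvs_variant("NM_001197320.1:c.281C>T")
print(var.ac)        # accession: NM_001197320.1
print(var.type)      # coordinate type: c (coding DNA)
print(var.posedit)   # position and edit: 281C>T
```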
Running an open experiment: transparency and reproducibility in soil and ecosystem science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bond-Lamberty, Benjamin; Smith, Ashly P.; Bailey, Vanessa L.
Researchers in soil and ecosystem science, and almost every other field, are being pushed--by funders, journals, governments, and their peers--to increase transparency and reproducibility of their work. A key part of this effort is a move towards open data as a way to fight post-publication data loss, improve data and code quality, enable powerful meta- and cross-disciplinary analyses, and increase trust in, and the efficiency of, publicly-funded research. Many scientists however lack experience in, and may be unsure of the benefits of, making their data and fully-reproducible analyses publicly available. Here we describe a recent "open experiment", in which we documented every aspect of a soil incubation online, making all raw data, scripts, diagnostics, final analyses, and manuscripts available in real time. We found that using tools such as version control, issue tracking, and open-source statistical software improved data integrity, accelerated our team's communication and productivity, and ensured transparency. There are many avenues to improve scientific reproducibility and data availability, of which this is only one example, and it is not an approach suited for every experiment or situation. Nonetheless, we encourage the communities in our respective fields to consider its advantages, and to lead rather than follow with respect to scientific reproducibility, transparency, and data availability.
Running an open experiment: transparency and reproducibility in soil and ecosystem science
NASA Astrophysics Data System (ADS)
Bond-Lamberty, Ben; Peyton Smith, A.; Bailey, Vanessa
2016-08-01
Researchers in soil and ecosystem science, and almost every other field, are being pushed—by funders, journals, governments, and their peers—to increase transparency and reproducibility of their work. A key part of this effort is a move towards open data as a way to fight post-publication data loss, improve data and code quality, enable powerful meta- and cross-disciplinary analyses, and increase trust in, and the efficiency of, publicly-funded research. Many scientists however lack experience in, and may be unsure of the benefits of, making their data and fully-reproducible analyses publicly available. Here we describe a recent ‘open experiment’, in which we documented every aspect of a soil incubation online, making all raw data, scripts, diagnostics, final analyses, and manuscripts available in real time. We found that using tools such as version control, issue tracking, and open-source statistical software improved data integrity, accelerated our team’s communication and productivity, and ensured transparency. There are many avenues to improve scientific reproducibility and data availability, of which this is only one example, and it is not an approach suited for every experiment or situation. Nonetheless, we encourage the communities in our respective fields to consider its advantages, and to lead rather than follow with respect to scientific reproducibility, transparency, and data availability.
Embracing Open Software Development in Solar Physics
NASA Astrophysics Data System (ADS)
Hughitt, V. K.; Ireland, J.; Christe, S.; Mueller, D.
2012-12-01
We discuss two ongoing software projects in solar physics that have adopted best practices of the open source software community. The first, the Helioviewer Project, is a powerful data visualization tool which includes online and Java interfaces inspired by Google Maps™. This effort allows users to find solar features and events of interest, and download the corresponding data. Having found data of interest, the user then has to analyze it. The dominant solar data analysis platform is an open-source library called SolarSoft (SSW). Although SSW itself is open-source, the programming language used is IDL, a proprietary language with licensing costs that are prohibitive for many institutions and individuals. SSW is composed of a collection of related scripts written by missions and individuals for solar data processing and analysis, without any consistent data structures or common interfaces. Further, at the time when SSW was initially developed, many of the best software development processes of today (mirrored and distributed version control, unit testing, continuous integration, etc.) were not standard, and have not since been adopted. The challenges inherent in developing SolarSoft led to a second software project known as SunPy. SunPy is an open-source Python-based library which seeks to create a unified solar data analysis environment, including a number of core datatypes such as Maps, Lightcurves, and Spectra which have consistent interfaces and behaviors. By taking advantage of the large and sophisticated body of scientific software already available in Python (e.g. SciPy, NumPy, Matplotlib), and by adopting many of the best practices refined in open-source software development, SunPy has been able to develop at a very rapid pace while still ensuring a high level of reliability. The Helioviewer Project and SunPy represent two pioneering technologies in solar physics - simple yet flexible data visualization and a powerful, new data analysis environment. We discuss the development of both these efforts and how they are beginning to influence the solar physics community.
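SunPy's unified Map datatype can be shown in a few lines. Note that the modern SunPy API may differ from the 2012-era interface this abstract describes, and loading the sample image downloads data on first use.

```python
# Minimal sketch of SunPy's unified Map datatype; current SunPy releases
# may differ from the 2012-era interface described above. The sample
# image is downloaded on first use.
import sunpy.map
import sunpy.data.sample

aia = sunpy.map.Map(sunpy.data.sample.AIA_171_IMAGE)
print(aia.date, aia.instrument, aia.dimensions)
aia.peek()                                   # quick-look plot
```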
Nmrglue: an open source Python package for the analysis of multidimensional NMR data.
Helmus, Jonathan J; Jaroniec, Christopher P
2013-04-01
Nmrglue, an open source Python package for working with multidimensional NMR data, is described. When used in combination with other Python scientific libraries, nmrglue provides a highly flexible and robust environment for spectral processing, analysis and visualization and includes a number of common utilities such as linear prediction, peak picking and lineshape fitting. The package also enables existing NMR software programs to be readily tied together, currently facilitating the reading, writing and conversion of data stored in Bruker, Agilent/Varian, NMRPipe, Sparky, SIMPSON, and Rowland NMR Toolkit file formats. In addition to standard applications, the versatility offered by nmrglue makes the package particularly suitable for tasks that include manipulating raw spectrometer data files, automated quantitative analysis of multidimensional NMR spectra with irregular lineshapes such as those frequently encountered in the context of biomacromolecular solid-state NMR, and rapid implementation and development of unconventional data processing methods such as covariance NMR and other non-Fourier approaches. Detailed documentation, install files and source code for nmrglue are freely available at http://nmrglue.com. The source code can be redistributed and modified under the New BSD license.
Nmrglue: An Open Source Python Package for the Analysis of Multidimensional NMR Data
Helmus, Jonathan J.; Jaroniec, Christopher P.
2013-01-01
Nmrglue, an open source Python package for working with multidimensional NMR data, is described. When used in combination with other Python scientific libraries, nmrglue provides a highly flexible and robust environment for spectral processing, analysis and visualization and includes a number of common utilities such as linear prediction, peak picking and lineshape fitting. The package also enables existing NMR software programs to be readily tied together, currently facilitating the reading, writing and conversion of data stored in Bruker, Agilent/Varian, NMRPipe, Sparky, SIMPSON, and Rowland NMR Toolkit file formats. In addition to standard applications, the versatility offered by nmrglue makes the package particularly suitable for tasks that include manipulating raw spectrometer data files, automated quantitative analysis of multidimensional NMR spectra with irregular lineshapes such as those frequently encountered in the context of biomacromolecular solid-state NMR, and rapid implementation and development of unconventional data processing methods such as covariance NMR and other non-Fourier approaches. Detailed documentation, install files and source code for nmrglue are freely available at http://nmrglue.com. The source code can be redistributed and modified under the New BSD license. PMID:23456039
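The format reading and conversion nmrglue provides can be sketched following its documented workflow; the file paths below are placeholders.

```python
# Minimal sketch of nmrglue's format reading and conversion, following
# the package's documented workflow; file paths are placeholders.
import nmrglue as ng

# Read NMRPipe data: dic holds parameters, data the spectral array.
dic, data = ng.pipe.read("test.fid")
print(data.shape, data.dtype)

# Convert Bruker data to NMRPipe format via the universal converter.
bdic, bdata = ng.bruker.read("bruker_dir")
udic = ng.bruker.guess_udic(bdic, bdata)     # universal dictionary
C = ng.convert.converter()
C.from_bruker(bdic, bdata, udic)
pdic, pdata = C.to_pipe()
ng.pipe.write("converted.fid", pdic, pdata, overwrite=True)
```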
What can the programming language Rust do for astrophysics?
NASA Astrophysics Data System (ADS)
Blanco-Cuaresma, Sergi; Bolmont, Emeline
2017-06-01
The astrophysics community uses different tools for computational tasks such as complex systems simulations, radiative transfer calculations or big data. Programming languages like Fortran, C or C++ are commonly present in these tools and, generally, the language choice was made based on the need for performance. However, this comes at a cost: safety. For instance, a common source of error is the access to invalid memory regions, which produces random execution behaviors and affects the scientific interpretation of the results. In 2015, Mozilla Research released the first stable version of a new programming language named Rust. Many features make this new language attractive for the scientific community, it is open source and it guarantees memory safety while offering zero-cost abstraction. We explore the advantages and drawbacks of Rust for astrophysics by re-implementing the fundamental parts of Mercury-T, a Fortran code that simulates the dynamical and tidal evolution of multi-planet systems.
NASA Astrophysics Data System (ADS)
Löwe, Peter; Marín Arraiza, Paloma; Plank, Margret
2016-04-01
The influence of Free and Open Source Software (FOSS) projects on Earth and Space Science Informatics (ESSI) continues to grow, particularly in the emerging context of Data Science and Open Science. The scientific significance and heritage of FOSS projects is covered only to a limited extent by traditional scientific journal articles: audiovisual conference recordings contain significant information for analysis, reference and citation. In the context of data-driven research, this audiovisual content needs to be accessible through effective search capabilities, enabling the content to be searched in depth and retrieved. This ensures that the content producers receive credit for their efforts within their respective communities. For Geoinformatics and ESSI, one distinguished driver is the OSGeo Foundation (OSGeo), founded in 2006 to support and promote the interdisciplinary collaborative development of open geospatial technologies and data. The organisational structure is based on software projects that have successfully passed the OSGeo incubation process, proving their compliance with FOSS licence models. This quality assurance is crucial for transparent and unhindered application in (Open) Science. The main communication channels within and between the OSGeo-hosted community projects for face-to-face meetings are conferences on national, regional and global scales. Video recordings have complemented the scientific proceedings since 2006. During the last decade, the growing body of OSGeo videos has been negatively affected by content loss, obsolescence of video technology and dependence on commercial video portals. Even worse, the distributed storage and lack of metadata do not guarantee concise and efficient access to the content. This limits the retrospective analysis of video content from past conferences, but it also indicates a need for reliable, standardized, comparable audiovisual repositories for the future, as the number of OSGeo projects continues to grow - and so does the number of topics to be addressed at conferences. Up to now, commercial Web 2.0 platforms like Youtube and Vimeo have been used. However, these platforms lack capabilities for long-term archiving and scientific citation, such as persistent identifiers that permit the citation of specific intervals of the overall content. To address these issues, the scientific library community has started to implement improved multimedia archiving and retrieval services for scientific audiovisual content which fulfil these requirements. Using the reference case of the OSGeo conference video recordings, this paper gives an overview of the new and growing collection activities by the German National Library of Science and Technology (TIB) for audiovisual content in Geoinformatics/ESSI in the TIB|AV Portal. Following a successful start in 2014 and positive response from the OSGeo community, the TIB acquisition strategy for OSGeo video material was extended to include German, European, North American and global conference content. The collection grows steadily through new conference content and through harvesting of past conference videos from commercial Web 2.0 platforms like Youtube and Vimeo. This positions the TIB|AV Portal as a reliable and concise long-term resource for innovation mining, education and scholarly research within the ESSI context, both within academia and industry.
Development of a Web-Based Visualization Platform for Climate Research Using Google Earth
NASA Technical Reports Server (NTRS)
Sun, Xiaojuan; Shen, Suhung; Leptoukh, Gregory G.; Wang, Panxing; Di, Liping; Lu, Mingyue
2011-01-01
Recently, it has become easier to access climate data from satellites, ground measurements, and models from various data centers. However, searching, accessing, and processing heterogeneous data from different sources are very time-consuming tasks. There is a lack of a comprehensive visual platform to acquire distributed and heterogeneous scientific data and to render processed images from a single access point for climate studies. This paper documents the design and implementation of a Web-based visual, interoperable, and scalable platform that is able to access climatological fields from models, satellites, and ground stations from a number of data sources using Google Earth (GE) as a common graphical interface. The development is based on the TCP/IP protocol and various open data-sharing technologies, such as OPeNDAP, GDS, Web Processing Service (WPS), and Web Mapping Service (WMS). The visualization capability of integrating various measurements into GE dramatically extends the awareness and visibility of scientific results. Using the geographic information embedded in GE, the designed system improves our understanding of the relationships of different elements in a four-dimensional domain. The system enables easy and convenient synergistic research on a virtual platform for professionals and the general public, greatly advancing global data sharing and scientific research collaboration.
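The kind of open data access the platform builds on can be sketched by opening a remote dataset over OPeNDAP with the netCDF4 library (assuming it is built with DAP support). The URL and variable name below are hypothetical placeholders; real endpoints are published by the data centers' OPeNDAP/GDS servers.

```python
# Sketch of remote data access over OPeNDAP with netCDF4. The URL and
# variable name are hypothetical; netCDF4 must be built with DAP support.
from netCDF4 import Dataset

url = "http://example.gov/opendap/airs/monthly_temperature.nc"  # placeholder
ds = Dataset(url)                            # lazy remote access, no download
temp = ds.variables["temperature"]           # assumed variable name

# Subset by index before reading so only the needed slab crosses the network.
slab = temp[0, :, :]
print(slab.shape, getattr(temp, "units", "unknown units"))
```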
An open source workflow for 3D printouts of scientific data volumes
NASA Astrophysics Data System (ADS)
Loewe, P.; Klump, J. F.; Wickert, J.; Ludwig, M.; Frigeri, A.
2013-12-01
As the amount of scientific data continues to grow, researchers need new tools to help them visualize complex data. Immersive data visualisations are helpful, yet fail to provide the tactile feedback and sensory feedback on spatial orientation provided by tangible objects. This gap in sensory feedback from virtual objects has led to the development of tangible representations of geospatial information to solve real world problems. Examples are animated globes [1], interactive environments like tangible GIS [2], and on-demand 3D prints. The production of a tangible representation of a scientific data set is one step in a line of scientific thinking, leading from the physical world into scientific reasoning and back: the process starts with a physical observation, or with a data stream generated by an environmental sensor. This data stream is turned into a geo-referenced data set. This data set is turned into a volume representation, which is converted into command sequences for the printing device, leading to the creation of a 3D printout. As a last but crucial step, this new object has to be documented, linked to the associated metadata, and curated in long-term repositories to preserve its scientific meaning and context. The workflow to produce tangible 3D data prints from science data at the German Research Centre for Geosciences (GFZ) was implemented as software based on the Free and Open Source geoinformatics tools GRASS GIS and ParaView. The workflow was successfully validated in various application scenarios at GFZ, using a RapMan printer to create 3D specimens of elevation models, geological underground models, ice-penetrating radar soundings for planetology, and space-time stacks for tsunami model quality assessment. While these first pilot applications have demonstrated the feasibility of the overall approach [3], current research focuses on the provision of the workflow as Software as a Service (SaaS), thematic generalisation of information content, and long-term curation. [1] http://www.arcscience.com/systemDetails/omniTechnology.html [2] http://video.esri.com/watch/53/landscape-design-with-tangible-gis [3] Löwe et al. (2013), Geophysical Research Abstracts, Vol. 15, EGU2013-1544-1.
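The final "volume to printer commands" step of such a workflow can be miniaturized into a didactic sketch: turning a small elevation grid into a printable STL surface with numpy-stl. This is an illustration under stated assumptions, not the GFZ production workflow (which uses GRASS GIS and ParaView).

```python
# Didactic mini-version of a data-to-printout step: convert a tiny synthetic
# DEM into an STL surface mesh with numpy-stl. Not the GFZ workflow.
import numpy as np
from stl import mesh

z = np.array([[0.0, 0.2, 0.1],
              [0.3, 0.8, 0.4],               # tiny synthetic elevation grid
              [0.1, 0.5, 0.2]])
ny, nx = z.shape
faces = []
for j in range(ny - 1):
    for i in range(nx - 1):
        # Two triangles per grid cell; vertices are (x, y, z) triples.
        a, b = (i, j, z[j, i]), (i + 1, j, z[j, i + 1])
        c, d = (i, j + 1, z[j + 1, i]), (i + 1, j + 1, z[j + 1, i + 1])
        faces.append([a, b, c])
        faces.append([b, d, c])

surface = mesh.Mesh(np.zeros(len(faces), dtype=mesh.Mesh.dtype))
surface.vectors[:] = np.array(faces, dtype=np.float64)
surface.save("dem_printout.stl")             # ready for a slicer program
```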
The New Landscape of Ethics and Integrity in Scholarly Publishing
NASA Astrophysics Data System (ADS)
Hanson, B.
2016-12-01
Scholarly peer-reviewed publications serve five major functions: They (i) have served as the primary, useful archive of scientific progress for hundreds of years; (ii) have been one principal way that scientists, and more recently departments and institutions, are evaluated; (iii) trigger and are the source of much communication about science to the public; (iv) have been primary revenue sources for scientific societies and companies; and (v) more recently play a critical and codified role in legal and regulatory decisions and advice to governments. Recent dynamics in science as well as in society, including the growth of online communication and new revenue sources, are greatly influencing and altering the first four core functions in particular. These changes in turn are posing important new challenges to the ethics and integrity of scholarly publishing, and thus of science, in ways that are not widely or fully appreciated. For example, the expansion of electronic publishing has raised a number of new challenges for publishers with respect to their responsibility for curating scientific knowledge and even preserving the basic integrity of a manuscript. Many challenges are related to new or expanded financial conflicts of interest: the use of metrics such as the Journal Impact Factor, the expansion of alternative business models such as open access and advertising, and the fact that publishers are increasingly involved in framing communication around the papers they publish. Solutions pose new responsibilities for scientists, publishers, and scientific societies, especially around transparency in their operations.
Wyrm: A Brain-Computer Interface Toolbox in Python.
Venthur, Bastian; Dähne, Sven; Höhne, Johannes; Heller, Hendrik; Blankertz, Benjamin
2015-10-01
In recent years, Python has gained more and more traction in the scientific community. Projects like NumPy, SciPy, and Matplotlib have created a strong foundation for scientific computing in Python, and machine-learning packages like scikit-learn and data-analysis packages like Pandas are building on top of it. In this paper we present Wyrm (https://github.com/bbci/wyrm), an open-source BCI toolbox in Python. Wyrm is applicable to a broad range of neuroscientific problems. It can be used as a toolbox for analysis and visualization of neurophysiological data and in real-time settings, like an online BCI application. In order to prevent software defects, Wyrm makes extensive use of unit testing. We explain the key aspects of Wyrm's software architecture and the design decisions for its data structure, and demonstrate and validate the use of our toolbox by presenting our approach to the classification tasks of two different data sets from the BCI Competition III. Furthermore, we give a brief analysis of the data sets using our toolbox, and demonstrate how we implemented an online experiment using Wyrm. With Wyrm we add the final piece to our ongoing effort to provide a complete, free and open source BCI system in Python.
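Wyrm's own data structures and function names are not reproduced here; as a generic sketch of the kind of analysis such a toolbox supports (synthetic data and a scikit-learn classifier; every name below is ours, not Wyrm's API), a band-power/LDA pipeline looks like this:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for epoched EEG: 200 trials x 32 channels x 500 samples
epochs = rng.standard_normal((200, 32, 500))
labels = rng.integers(0, 2, size=200)
# Inject a class-dependent amplitude difference so there is something to learn
epochs[labels == 1] *= 1.2

# Log band-power features: one variance estimate per channel and trial
features = np.log(epochs.var(axis=2))

clf = LinearDiscriminantAnalysis()
print(cross_val_score(clf, features, labels, cv=5).mean())
```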
AWOB: A Collaborative Workbench for Astronomers
NASA Astrophysics Data System (ADS)
Kim, J. W.; Lemson, G.; Bulatovic, N.; Makarenko, V.; Vogler, A.; Voges, W.; Yao, Y.; Kiefl, R.; Koychev, S.
2015-09-01
We present the Astronomers Workbench (AWOB), a web-based collaboration and publication platform for scientific projects of any size, developed in collaboration between the Max Planck Institutes of Astrophysics (MPA) and Extraterrestrial Physics (MPE) and the Max Planck Digital Library (MPDL). AWOB facilitates collaboration between geographically distributed astronomers working on a common project throughout its whole scientific life cycle. AWOB does so by making it very easy for scientists to set up and manage a collaborative workspace for individual projects, where data can be uploaded and shared. It supports inviting project collaborators; provides wikis, automated mailing lists, calendars and event notification; and has a built-in chat facility. It allows the definition and tracking of tasks within projects and supports easy creation of e-publications for the dissemination of data, images and other resources that cannot be added to submitted papers. AWOB extends the project concept to larger-scale consortia, within which it is possible to manage working groups and sub-projects. The existing AWOB instance has so far been limited to Max Planck members and their collaborators, but will be opened to the whole astronomical community. AWOB is an open-source project and its source code is available upon request. We intend to extend AWOB's functionality to other disciplines as well, and would greatly appreciate contributions from the community.
Pöschl, Ulrich
2012-01-01
The traditional forms of scientific publishing and peer review do not live up to all the demands of efficient communication and quality assurance in today's highly diverse and rapidly evolving world of science. They need to be advanced and complemented by interactive and transparent forms of review, publication, and discussion that are open to the scientific community and to the public. The advantages of open access, public peer review, and interactive discussion can be efficiently and flexibly combined with the strengths of traditional scientific peer review. Since 2001, the benefits and viability of this approach have been clearly demonstrated by the highly successful interactive open access journal Atmospheric Chemistry and Physics (ACP, www.atmos-chem-phys.net) and a growing number of sister journals launched and operated by the European Geosciences Union (EGU, www.egu.eu) and the open access publisher Copernicus (www.copernicus.org). The interactive open access journals practice an integrative multi-stage process of publication and peer review combined with interactive public discussion, which effectively resolves the dilemma between rapid scientific exchange and thorough quality assurance. Key features and achievements of this approach are: top quality and impact, efficient self-regulation and low rejection rates, high attractiveness and rapid growth, low costs, and financial sustainability. In fact, ACP and the EGU interactive open access sister journals are by most if not all standards more successful than comparable scientific journals with traditional or alternative forms of peer review (editorial statistics, publication statistics, citation statistics, economic costs, and sustainability). The high efficiency and predictive validity of multi-stage open peer review have been confirmed in a series of dedicated studies by evaluation experts from the social sciences, and the same or similar concepts have recently also been adopted in other disciplines, including the life sciences and economics. Multi-stage open peer review can be flexibly adjusted to the needs and peculiarities of different scientific communities. Due to its flexibility and compatibility with traditional structures of scientific publishing and peer review, the multi-stage open peer review concept enables efficient evolution in scientific communication and quality assurance. It has the potential to swiftly replace hidden peer review as the standard of scientific quality assurance, and it provides a basis for open evaluation in science. PMID:22783183
Scientific Utopia: An agenda for improving scientific communication (Invited)
NASA Astrophysics Data System (ADS)
Nosek, B.
2013-12-01
The scientist's primary incentive is publication. In the present culture, open practices do not increase chances of publication, and they often require additional work. Practicing the abstract scientific values of openness and reproducibility thus requires behaviors in addition to those relevant for the primary, concrete rewards. When in conflict, concrete rewards are likely to dominate over abstract ones. As a consequence, the reward structure for scientists does not encourage openness and reproducibility. This can be changed by nudging incentives to align scientific practices with scientific values. Science will benefit by creating and connecting technologies that nudge incentives while supporting and improving the scientific workflow. For example, it should be as easy to search the research literature for my topic as it is to search the Internet to find hilarious videos of cats falling off of furniture. I will introduce the Center for Open Science (http://centerforopenscience.org/) and efforts to improve openness and reproducibility such as http://openscienceframework.org/. There will be no cats.
A Tale of Two Observing Systems: Interoperability in the World of Microsoft Windows
NASA Astrophysics Data System (ADS)
Babin, B. L.; Hu, L.
2008-12-01
Louisiana Universities Marine Consortium's (LUMCON) and Dauphin Island Sea Lab's (DISL) Environmental Monitoring Systems provide a unified coastal ocean observing system. These two systems are mirrored to maintain autonomy while offering an integrated data-sharing environment. Both systems collect data via Campbell Scientific data loggers, store the data in Microsoft SQL servers, and disseminate the data in real time on the World Wide Web via Microsoft Internet Information Servers and Active Server Pages (ASP). The utilization of Microsoft Windows technologies has presented many challenges to these observing systems as open-source tools for interoperability grow. The current open-source tools often require the installation of additional software. In order to make data available in common standard formats, "home grown" software has been developed. One example of this is the development of software to generate XML files for transmission to the National Data Buoy Center (NDBC). OOSTethys partners develop, test and implement easy-to-use, open-source, OGC-compliant software, and have created a working prototype of networked, semantically interoperable, real-time data systems. Partnering with OOSTethys, we are developing a cookbook to implement OGC web services. The implementation will be written in ASP, will run in a Microsoft operating system environment, and will serve data via Sensor Observation Services (SOS). This cookbook will give observing systems running Microsoft Windows the tools to easily participate in the Open Geospatial Consortium (OGC) Oceans Interoperability Experiment (OCEANS IE).
Modern software approaches applied to a Hydrological model: the GEOtop Open-Source Software Project
NASA Astrophysics Data System (ADS)
Cozzini, Stefano; Endrizzi, Stefano; Cordano, Emanuele; Bertoldi, Giacomo; Dall'Amico, Matteo
2017-04-01
The GEOtop hydrological scientific package is an integrated hydrological model that simulates the heat and water budgets at and below the soil surface. It describes the three-dimensional water flow in the soil and the energy exchange with the atmosphere, considering the radiative and turbulent fluxes. Furthermore, it reproduces the highly non-linear interactions between the water and energy balance during soil freezing and thawing, and simulates the temporal evolution of snow cover, soil temperature and moisture. The core components of the package were presented in the 2.0 version (Endrizzi et al., 2014), which was released as a free and open-source software project. However, despite the high scientific quality of the project, a modern software engineering approach was still missing. This weakness hindered its scientific potential and its use, both as a standalone package and, more importantly, in an integrated way with other hydrological software tools. In this contribution we present our recent software re-engineering efforts to create a robust and stable scientific software package open to the hydrological community, easily usable by researchers and experts, and interoperable with other packages. The activity takes as a starting point the 2.0 version, scientifically tested and published. This version, together with several test cases based on recently published or available GEOtop applications (Cordano and Rigon, 2013, WRR; Kollet et al., 2016, WRR), provides the baseline code and a number of referenced results as benchmarks. Comparison and scientific validation can then be performed for each software re-engineering activity performed on the package. To keep track of every single change, the package is published on its own GitHub repository, geotopmodel.github.io/geotop/, under the GPL v3.0 license. A continuous integration mechanism by means of Travis-CI has been enabled on the GitHub repository on the master and main development branches. The usage of the CMake configuration tool and the suite of tests (easily manageable by means of ctest tools) greatly reduces the burden of installation and allows us to enhance portability across different compilers and operating-system platforms. The package is also complemented by several software tools which provide web-based visualization of results based on R packages, in particular the "shiny" (Chang et al., 2016), "geotopbricks" and "geotopOptim2" (Cordano et al., 2016) packages, which allow rapid and efficient scientific validation of new examples and tests. The software re-engineering activities are still under development. However, our first results are promising enough to eventually reach a robust and stable software project that manages in a flexible way a complex state-of-the-art hydrological model like GEOtop and integrates it into wider workflows.
Kan, Guangyuan; He, Xiaoyan; Ding, Liuqian; Li, Jiren; Liang, Ke; Hong, Yang
2017-10-01
The shuffled complex evolution optimization method developed at the University of Arizona (SCE-UA) has been successfully applied for many years in various kinds of scientific and engineering optimization applications, such as hydrological model parameter calibration. The algorithm possesses good global optimality, convergence stability and robustness. However, benchmark and real-world applications reveal the poor computational efficiency of the SCE-UA. This research aims at the parallelization and acceleration of the SCE-UA method based on powerful heterogeneous computing technology. The parallel SCE-UA is implemented on an Intel Xeon multi-core CPU (by using OpenMP and OpenCL) and an NVIDIA Tesla many-core GPU (by using OpenCL, CUDA, and OpenACC). The serial and parallel SCE-UA were tested on the Griewank benchmark function. Comparison results indicate that the parallel SCE-UA significantly improves computational efficiency compared to the original serial version. The OpenCL implementation obtains the best overall acceleration results, but at the cost of the most complex source code. The parallel SCE-UA has bright prospects for application in real-world problems.
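The Griewank function used as the benchmark is f(x) = 1 + (1/4000) Σ x_i² − Π cos(x_i/√i). Below is a minimal sketch of the data-parallel step that dominates SCE-UA's cost, evaluating many candidate points at once, here with CPU multiprocessing rather than the paper's OpenMP/OpenCL/CUDA/OpenACC implementations (population sizes are illustrative):

```python
import numpy as np
from multiprocessing import Pool

def griewank(x):
    """Griewank benchmark; global minimum f(0) = 0."""
    i = np.arange(1, x.size + 1)
    return 1.0 + np.sum(x**2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i)))

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # One shuffle's worth of candidate points (sizes are illustrative)
    population = [rng.uniform(-600, 600, size=30) for _ in range(10000)]
    with Pool() as pool:                     # CPU-level data parallelism
        costs = pool.map(griewank, population)
    print(min(costs))
```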
Perspectives on Open Science and scientific data sharing: an interdisciplinary workshop.
Destro Bisol, Giovanni; Anagnostou, Paolo; Capocasa, Marco; Bencivelli, Silvia; Cerroni, Andrea; Contreras, Jorge; Enke, Neela; Fantini, Bernardino; Greco, Pietro; Heeney, Catherine; Luzi, Daniela; Manghi, Paolo; Mascalzoni, Deborah; Molloy, Jennifer; Parenti, Fabio; Wicherts, Jelte; Boulton, Geoffrey
2014-01-01
Looking at Open Science and Open Data from a broad perspective: this is the idea behind "Scientific data sharing: an interdisciplinary workshop", an initiative designed to foster dialogue between scholars from different scientific domains which was organized by the Istituto Italiano di Antropologia in Anagni, Italy, 2-4 September 2013. We here report summaries of the presentations and discussions at the meeting. They deal with four sets of issues: (i) setting a common framework, a general discussion of open data principles, values and opportunities; (ii) insights into scientific practices, a view of the way in which the open data movement is developing in a variety of scientific domains (biology, psychology, epidemiology and archaeology); (iii) a case study of human genomics, which was a trail-blazer in data sharing, and which encapsulates the tension that can occur between large-scale data sharing and one of the boundaries of openness, the protection of individual data; (iv) open science and the public, based on a round-table discussion about the public communication of science and the societal implications of open science. There were three proposals for the planning of further interdisciplinary initiatives on open science. Firstly, there is a need to integrate top-down initiatives by governments, institutions and journals with bottom-up approaches from the scientific community. Secondly, more should be done to popularize the societal benefits of open science, not only in providing the evidence needed by citizens to draw their own conclusions on scientific issues that are of concern to them, but also in explaining the direct benefits of data sharing in areas such as the control of infectious disease. Finally, introducing arguments from the social sciences and humanities into the educational dissemination of open data may help students become more profoundly engaged with Open Science and look at science from a broader perspective.
An open annotation ontology for science on web 3.0
2011-01-01
Background There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Methods Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements on the ontology. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. Results This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables “stand-off” or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO’s Google Code page: http://code.google.com/p/annotation-ontology/ . Conclusions The Annotation Ontology meets critical requirements for an open, freely shareable model in OWL, of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors. PMID:21624159
An open annotation ontology for science on web 3.0.
Ciccarese, Paolo; Ocana, Marco; Garcia Castro, Leyla Jael; Das, Sudeshna; Clark, Tim
2011-05-17
There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements on the ontology. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables "stand-off" or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO's Google Code page: http://code.google.com/p/annotation-ontology/ . The Annotation Ontology meets critical requirements for an open, freely shareable model in OWL, of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors.
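As a sketch of what stand-off annotation metadata might look like in practice, the snippet below builds a tiny RDF graph with rdflib; the AO namespace URI is taken from the abstract, but the class and property names are illustrative placeholders rather than verified AO vocabulary:

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

# Namespace URI taken from the abstract; the term names below are
# illustrative placeholders, not verified AO vocabulary.
AO = Namespace("http://purl.org/ao/")

g = Graph()
g.bind("ao", AO)

ann = URIRef("http://example.org/annotations/1")   # hypothetical annotation URI
doc = URIRef("http://example.org/papers/123")      # hypothetical target document

g.add((ann, RDF.type, AO.Annotation))
g.add((ann, AO.annotatesResource, doc))            # stand-off: metadata lives apart
g.add((ann, AO.body, Literal("Gene X is over-expressed in condition Y")))

print(g.serialize(format="turtle"))
```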
Managing a tier-2 computer centre with a private cloud infrastructure
NASA Astrophysics Data System (ADS)
Bagnasco, Stefano; Berzano, Dario; Brunetti, Riccardo; Lusso, Stefano; Vallero, Sara
2014-06-01
In a typical scientific computing centre, several applications coexist and share a single physical infrastructure. An underlying Private Cloud infrastructure eases the management and maintenance of such heterogeneous applications (such as multipurpose or application-specific batch farms, Grid sites, interactive data analysis facilities and others), allowing dynamic allocation of resources to any application. Furthermore, the maintenance of large deployments of complex and rapidly evolving middleware and application software is eased by the use of virtual images and contextualization techniques. Such infrastructures are being deployed in some large centres (see e.g. the CERN Agile Infrastructure project), but with several open-source tools reaching maturity this is becoming viable also for smaller sites. In this contribution we describe the Private Cloud infrastructure at the INFN-Torino Computer Centre, which hosts a full-fledged WLCG Tier-2 centre, an Interactive Analysis Facility for the ALICE experiment at the CERN LHC and several smaller scientific computing applications. The private cloud building blocks include the OpenNebula software stack, the GlusterFS filesystem and the OpenWRT Linux distribution (used for network virtualization); a future integration into a federated higher-level infrastructure is made possible by exposing commonly used APIs like EC2 and OCCI.
2012-01-01
Background The OpenTox Framework, developed by the partners in the OpenTox project (http://www.opentox.org), aims at providing unified access to toxicity data, predictive models and validation procedures. Interoperability of resources is achieved using a common information model, based on the OpenTox ontologies, describing predictive algorithms, models and toxicity data. As toxicological data may come from different, heterogeneous sources, a deployed ontology, unifying the terminology and the resources, is critical for the rational and reliable organization of the data, and its automatic processing. Results The following related ontologies have been developed for OpenTox: a) Toxicological ontology - listing the toxicological endpoints; b) Organs system and Effects ontology - addressing organs, targets/examinations and effects observed in in vivo studies; c) ToxML ontology - representing a semi-automatic conversion of the ToxML schema; d) OpenTox ontology - representation of OpenTox framework components: chemical compounds, datasets, types of algorithms, models and validation web services; e) ToxLink-ToxCast assays ontology and f) OpenToxipedia community knowledge resource on toxicology terminology. OpenTox components are made available through standardized REST web services, where every compound, data set, and predictive method has a unique resolvable address (URI), used to retrieve its Resource Description Framework (RDF) representation, or to initiate the associated calculations and generate new RDF-based resources. The services support the integration of toxicity and chemical data from various sources, the generation and validation of computer models for toxic effects, seamless integration of new algorithms and scientifically sound validation routines, and provide a flexible framework which allows building an arbitrary number of applications tailored to solving different problems by end users (e.g. toxicologists). Availability The OpenTox toxicological ontology projects may be accessed via the OpenTox ontology development page http://www.opentox.org/dev/ontology; the OpenTox ontology is available as OWL at http://opentox.org/api/1.1/opentox.owl; the ToxML-OWL conversion utility is an open source resource available at http://ambit.svn.sourceforge.net/viewvc/ambit/branches/toxml-utils/ PMID:22541598
Tcheremenskaia, Olga; Benigni, Romualdo; Nikolova, Ivelina; Jeliazkova, Nina; Escher, Sylvia E; Batke, Monika; Baier, Thomas; Poroikov, Vladimir; Lagunin, Alexey; Rautenberg, Micha; Hardy, Barry
2012-04-24
The OpenTox Framework, developed by the partners in the OpenTox project (http://www.opentox.org), aims at providing unified access to toxicity data, predictive models and validation procedures. Interoperability of resources is achieved using a common information model, based on the OpenTox ontologies, describing predictive algorithms, models and toxicity data. As toxicological data may come from different, heterogeneous sources, a deployed ontology, unifying the terminology and the resources, is critical for the rational and reliable organization of the data, and its automatic processing. The following related ontologies have been developed for OpenTox: a) Toxicological ontology - listing the toxicological endpoints; b) Organs system and Effects ontology - addressing organs, targets/examinations and effects observed in in vivo studies; c) ToxML ontology - representing a semi-automatic conversion of the ToxML schema; d) OpenTox ontology - representation of OpenTox framework components: chemical compounds, datasets, types of algorithms, models and validation web services; e) ToxLink-ToxCast assays ontology and f) OpenToxipedia community knowledge resource on toxicology terminology. OpenTox components are made available through standardized REST web services, where every compound, data set, and predictive method has a unique resolvable address (URI), used to retrieve its Resource Description Framework (RDF) representation, or to initiate the associated calculations and generate new RDF-based resources. The services support the integration of toxicity and chemical data from various sources, the generation and validation of computer models for toxic effects, seamless integration of new algorithms and scientifically sound validation routines, and provide a flexible framework which allows building an arbitrary number of applications tailored to solving different problems by end users (e.g. toxicologists). The OpenTox toxicological ontology projects may be accessed via the OpenTox ontology development page http://www.opentox.org/dev/ontology; the OpenTox ontology is available as OWL at http://opentox.org/api/1.1/opentox.owl; the ToxML-OWL conversion utility is an open source resource available at http://ambit.svn.sourceforge.net/viewvc/ambit/branches/toxml-utils/
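The REST pattern described, one resolvable URI per compound, dataset or model, whose RDF representation is fetched by content negotiation, can be sketched in a few lines; the URI below is hypothetical, not a real OpenTox deployment:

```python
import requests

# Hypothetical compound URI on an OpenTox-style service; real deployments
# publish their own resolvable URIs for compounds, datasets and models.
compound_uri = "http://example.org/opentox/compound/42"

# Content negotiation: ask for the RDF representation of the resource
resp = requests.get(compound_uri, headers={"Accept": "application/rdf+xml"})
resp.raise_for_status()
print(resp.text[:500])
```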
Pythran: enabling static optimization of scientific Python programs
NASA Astrophysics Data System (ADS)
Guelton, Serge; Brunet, Pierrick; Amini, Mehdi; Merlini, Adrien; Corbillon, Xavier; Raynaud, Alan
2015-01-01
Pythran is an open source static compiler that turns modules written in a subset of Python language into native ones. Assuming that scientific modules do not rely much on the dynamic features of the language, it trades them for powerful, possibly inter-procedural, optimizations. These optimizations include detection of pure functions, temporary allocation removal, constant folding, Numpy ufunc fusion and parallelization, explicit thread-level parallelism through OpenMP annotations, false variable polymorphism pruning, and automatic vector instruction generation such as AVX or SSE. In addition to these compilation steps, Pythran provides a C++ runtime library that leverages the C++ STL to provide generic containers, and the Numeric Template Toolbox for Numpy support. It takes advantage of modern C++11 features such as variadic templates, type inference, move semantics and perfect forwarding, as well as classical idioms such as expression templates. Unlike the Cython approach, Pythran input code remains compatible with the Python interpreter. Output code is generally as efficient as the annotated Cython equivalent, if not more, but without the backward compatibility loss.
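A minimal example of the Pythran workflow: a module in the supported Python subset carries an export comment declaring the signature to specialize, stays runnable under the plain interpreter, and compiles to a native module with `pythran kernels.py` (the module and function names below are ours):

```python
# kernels.py -- compiles with `pythran kernels.py`, still valid plain Python
#pythran export standardize(float64[][])
import numpy as np

def standardize(x):
    # Numpy expressions like this are candidates for ufunc fusion,
    # parallelization and vector-instruction generation by Pythran.
    return (x - np.mean(x, axis=0)) / (np.std(x, axis=0) + 1e-12)
```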
Technical note: The Linked Paleo Data framework - a common tongue for paleoclimatology
NASA Astrophysics Data System (ADS)
McKay, Nicholas P.; Emile-Geay, Julien
2016-04-01
Paleoclimatology is a highly collaborative scientific endeavor, increasingly reliant on online databases for data sharing. Yet there is currently no universal way to describe, store and share paleoclimate data: in other words, no standard. Data standards are often regarded by scientists as mere technicalities, though they underlie much scientific and technological innovation, as well as facilitating collaborations between research groups. In this article, we propose a preliminary data standard for paleoclimate data, general enough to accommodate all the archive and measurement types encountered in a large international collaboration (PAGES 2k). We also introduce a vehicle for such structured data (Linked Paleo Data, or LiPD), leveraging recent advances in knowledge representation (Linked Open Data). The LiPD framework enables quick querying and extraction, and we expect that it will facilitate the writing of open-source community codes to access, analyze, model and visualize paleoclimate observations. We welcome community feedback on this standard, and encourage paleoclimatologists to experiment with the format for their own purposes.
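To make the idea concrete, here is a toy, LiPD-flavoured record; the field names sketch the kind of self-describing structure the paper proposes and are not the normative LiPD schema:

```python
import json

# A toy, LiPD-flavoured record; the field names are our illustration of a
# self-describing paleoclimate dataset, not the normative LiPD schema.
dataset = {
    "dataSetName": "ExampleLake.Doe.2016",
    "archiveType": "lake sediment",
    "geo": {"type": "Point", "coordinates": [-110.5, 44.3]},  # GeoJSON-style
    "paleoData": [{
        "variableName": "d18O",
        "units": "permil",
        "values": [1.2, 1.4, 1.1],
    }],
}
print(json.dumps(dataset, indent=2))
```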
Science, society, and the coastal groundwater squeeze
NASA Astrophysics Data System (ADS)
Michael, Holly A.; Post, Vincent E. A.; Wilson, Alicia M.; Werner, Adrian D.
2017-04-01
Coastal zones encompass the complex interface between land and sea. Understanding how water and solutes move within and across this interface is essential for managing resources for society. The increasingly dense human occupation of coastal zones disrupts natural groundwater flow patterns and degrades freshwater resources by both overuse and pollution. This pressure results in a "coastal groundwater squeeze," where the thin veneers of potable freshwater are threatened by contaminant sources at the land surface and saline groundwater at depth. Scientific advances in the field of coastal hydrogeology have enabled responsible management of water resources and protection of important ecosystems. To address the problems of the future, we must continue to make scientific advances, and groundwater hydrology needs to be firmly embedded in integrated coastal zone management. This will require interdisciplinary scientific collaboration, open communication between scientists and the public, and strong partnerships with policymakers.
Multidimensional Environmental Data Resource Brokering on Computational Grids and Scientific Clouds
NASA Astrophysics Data System (ADS)
Montella, Raffaele; Giunta, Giulio; Laccetti, Giuliano
Grid computing has evolved widely over the past years, and its capabilities have found their way even into business products and are no longer relegated to scientific applications. Today, grid computing technology is not restricted to a set of specific open-source or industrial grid products; rather, it comprises a set of capabilities that can reside within virtually any kind of software to create shared and highly collaborative production environments. These environments are focused on computational (workload) capabilities and the integration of information (data) into those computational capabilities. An active field of grid computing applications is the full virtualization of scientific instruments in order to increase their availability and decrease operational and maintenance costs. Computational and information grids allow real-world objects to be managed in a service-oriented way using widespread industrial standards.
KLASS: Kennedy Launch Academy Simulation System
NASA Technical Reports Server (NTRS)
Garner, Lesley C.
2007-01-01
Software provides access to many sophisticated scientific instruments (a Scanning Electron Microscope (SEM), a Light Microscope, a Scanning Probe Microscope (covering Scanning Tunneling, Atomic Force, and Magnetic Force microscopy), and an Energy Dispersive Spectrometer for the SEM). Flash animation videos explain how each of the instruments works, how they are used at NASA, and how samples are prepared. Measuring and labeling tools are provided with each instrument. Users gain hands-on experience of controlling the virtual instruments to conduct investigations, much like the real scientists at NASA do. Very open architecture; open source on SourceForge; extensive use of XML. Target audience is high school and entry-level college students. "Many beginning students never get closer to an electron microscope than the photos in their textbooks. But anyone can get a sense of what the instrument can do by downloading this simulator from NASA's Kennedy Space Center." Science Magazine, April 8th, 2005
Key Lessons in Building "Data Commons": The Open Science Data Cloud Ecosystem
NASA Astrophysics Data System (ADS)
Patterson, M.; Grossman, R.; Heath, A.; Murphy, M.; Wells, W.
2015-12-01
Cloud computing technology has created a shift around data and data analysis by allowing researchers to push computation to data as opposed to having to pull data to an individual researcher's computer. Subsequently, cloud-based resources can provide unique opportunities to capture computing environments used both to access raw data in its original form and also to create analysis products which may be the source of data for tables and figures presented in research publications. Since 2008, the Open Cloud Consortium (OCC) has operated the Open Science Data Cloud (OSDC), which provides scientific researchers with computational resources for storing, sharing, and analyzing large (terabyte and petabyte-scale) scientific datasets. OSDC has provided compute and storage services to over 750 researchers in a wide variety of data intensive disciplines. Recently, internal users have logged about 2 million core hours each month. The OSDC also serves the research community by colocating these resources with access to nearly a petabyte of public scientific datasets in a variety of fields also accessible for download externally by the public. In our experience operating these resources, researchers are well served by "data commons," meaning cyberinfrastructure that colocates data archives, computing, and storage infrastructure and supports essential tools and services for working with scientific data. In addition to the OSDC public data commons, the OCC operates a data commons in collaboration with NASA and is developing a data commons for NOAA datasets. As cloud-based infrastructures for distributing and computing over data become more pervasive, we ask, "What does it mean to publish data in a data commons?" Here we present the OSDC perspective and discuss several services that are key in architecting data commons, including digital identifier services.
Chemotion ELN: an Open Source electronic lab notebook for chemists in academia.
Tremouilhac, Pierre; Nguyen, An; Huang, Yu-Chieh; Kotov, Serhii; Lütjohann, Dominic Sebastian; Hübsch, Florian; Jung, Nicole; Bräse, Stefan
2017-09-25
The development of an electronic lab notebook (ELN) for researchers working in the field of chemical sciences is presented. The web-based application is available as Open Source software that offers modern solutions for chemical researchers. The Chemotion ELN is equipped with the basic functionalities necessary for the acquisition and processing of chemical data, in particular work with molecular structures and calculations based on molecular properties. The ELN supports planning, description, storage, and management for the routine work of organic chemists. It also provides tools for communicating and sharing the recorded research data among colleagues. Meeting the requirements of a state-of-the-art research infrastructure, the ELN allows the search for molecules and reactions not only within the user's data but also in conventional external sources as provided by SciFinder and PubChem. The presented development accommodates the growing dependency of scientific activity on the availability of digital information by providing Open Source instruments to record and reuse research data. The current version of the ELN has been in use for over half a year in our chemistry research group; it serves as a common infrastructure for chemistry research and enables chemistry researchers to build their own databases of digital information as a prerequisite for the detailed, systematic investigation and evaluation of chemical reactions and mechanisms.
NASA Astrophysics Data System (ADS)
Selker, J. S.; Roques, C.; Higgins, C. W.; Good, S. P.; Hut, R.; Selker, A.
2015-12-01
The confluence of 3-D printing; low-cost solid-state sensors; low-cost, low-power digital controllers (e.g., Arduinos); and open-source publishing (e.g., GitHub) is poised to transform environmental sensing. The Open-Source Published Environmental Sensing (OPENS) laboratory has launched and is available for all to use. OPENS combines cutting-edge technologies and makes them available to the global environmental sensing community. OPENS includes a Maker lab space (Corvallis, Oregon, USA; please feel free to request a free reservation for space and equipment use) where people may collaborate in person, or virtually via an online forum, on the publication and discussion of environmental sensing technology. The physical lab houses a test-bed for sensors, as well as a complete classical machine shop, 3-D printers, electronics development benches, and workstations for code development. OPENS will provide a web-based formal publishing framework wherein students and scientists worldwide can publish peer-reviewed (DOI-assigned) novel and evolutionary advancements in environmental sensor systems. This curated and peer-reviewed digital collection will include complete sets of "printable" parts and operating computer code for sensing systems. The physical lab will include all of the machines required to produce these sensing systems. These tools can be accessed in person or virtually, creating a truly global venue for advancement in monitoring Earth's environment and agricultural systems. In this talk we will present, as an example of the design and publication process, the design and data of the OPENS-Permeameter. The publication includes 3-D printing code, Arduino (or other control/logging platform) operational code, sample data sets, and a full discussion of the design set in the scientific context of previous related devices. Editors for the peer-review process are currently sought - contact John.Selker@Oregonstate.edu or Clement.Roques@Oregonstate.edu.
Open source and healthcare in Europe - time to put leading edge ideas into practice.
Murray, Peter J; Wright, Graham; Karopka, Thomas; Betts, Helen; Orel, Andrej
2009-01-01
Free/Libre and Open Source Software (FLOSS) is a process of software development, a method of licensing and a philosophy. Although FLOSS plays a significant role in several market areas, the impact in the health care arena is still limited. FLOSS is promoted as one of the most effective means for overcoming fragmentation in the health care sector and providing a basis for more efficient, timely and cost effective health care provision. The 2008 European Federation for Medical Informatics (EFMI) Special Topic Conference (STC) explored a range of current and future issues related to FLOSS in healthcare (FLOSS-HC). In particular, there was a focus on health records, ubiquitous computing, knowledge sharing, and current and future applications. Discussions resulted in a list of main barriers and challenges for use of FLOSS-HC. Based on the outputs of this event, the 2004 Open Steps events and subsequent workshops at OSEHC2009 and Med-e-Tel 2009, a four-step strategy has been proposed for FLOSS-HC: 1) a FLOSS-HC inventory; 2) a FLOSS-HC collaboration platform, use case database and knowledge base; 3) a worldwide FLOSS-HC network; and 4) FLOSS-HC dissemination activities. The workshop will further refine this strategy and elaborate avenues for FLOSS-HC from scientific, business and end-user perspectives. To gain acceptance by different stakeholders in the health care industry, different activities have to be conducted in collaboration. The workshop will focus on the scientific challenges in developing methodologies and criteria to support FLOSS-HC in becoming a viable alternative to commercial and proprietary software development and deployment.
HDDTOOLS: an R package serving Hydrological Data Discovery Tools
NASA Astrophysics Data System (ADS)
Vitolo, C.; Buytaert, W.
2014-12-01
Many governmental bodies and institutions are currently committed to publishing open data as the result of a trend of increasing transparency, through which a wide variety of information produced at public expense is now becoming open and freely available to improve public involvement in the process of decision and policy making. Discovery, access and retrieval of information are, however, not always simple tasks. Especially when programmatic access to data resources is not provided, downloading the metadata catalogue, selecting the information needed, requesting datasets, and then decompressing, converting, manually filtering and parsing them can become rather tedious. The R package "hddtools" is an open-source project designed to make all the above operations more efficient by means of reusable functions. The package facilitates access to various online data sources such as the Global Runoff Data Centre, NASA's TRMM mission and the Data60UK database, amongst others. This package complements R's growing functionality in environmental web technologies to bridge the gap between data providers and data consumers, and it is designed to be the starting building block of scientific workflows for linking data and models in a seamless fashion.
OpenDrift v1.0: a generic framework for trajectory modelling
NASA Astrophysics Data System (ADS)
Dagestad, Knut-Frode; Röhrs, Johannes; Breivik, Øyvind; Ådlandsvik, Bjørn
2018-04-01
OpenDrift is an open-source Python-based framework for Lagrangian particle modelling under development at the Norwegian Meteorological Institute with contributions from the wider scientific community. The framework is highly generic and modular, and is designed to be used for any type of drift calculations in the ocean or atmosphere. A specific module within the OpenDrift framework corresponds to a Lagrangian particle model in the traditional sense. A number of modules have already been developed, including an oil drift module, a stochastic search-and-rescue module, a pelagic egg module, and a basic module for atmospheric drift. The framework allows for the ingestion of an unspecified number of forcing fields (scalar and vectorial) from various sources, including Eulerian ocean, atmosphere and wave models, but also measurements or a priori values for the same variables. A basic backtracking mechanism is inherent, using sign reversal of the total displacement vector and negative time stepping. OpenDrift is fast and simple to set up and use on Linux, Mac and Windows environments, and can be used with minimal or no Python experience. It is designed for flexibility, and researchers may easily adapt or write modules for their specific purpose. OpenDrift is also designed for performance, and simulations with millions of particles may be performed on a laptop. Further, OpenDrift is designed for robustness and is in daily operational use for emergency preparedness modelling (oil drift, search and rescue, and drifting ships) at the Norwegian Meteorological Institute.
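A minimal usage sketch, following the documented pattern of seeding particles and running a module (the values are illustrative, and exact module paths and argument names may differ between OpenDrift versions):

```python
from datetime import datetime, timedelta
from opendrift.models.oceandrift import OceanDrift

o = OceanDrift()
# Real runs would first attach forcing readers (ocean/atmosphere/wave models);
# without them, configured fallback values are used for the environment.
o.seed_elements(lon=4.5, lat=60.0, number=1000, radius=1000,
                time=datetime(2018, 4, 1))
o.run(duration=timedelta(hours=48))   # negative time steps enable backtracking
o.plot()
```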
The taxonomic name resolution service: an online tool for automated standardization of plant names
2013-01-01
Background The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science. Results The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets. Conclusions We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/. PMID:23324024
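The abstract notes that the TNRS is also exposed as a RESTful web service; a hedged sketch of calling such a service follows (the endpoint path and parameter names here are hypothetical; consult the TNRS API documentation for the real interface):

```python
import requests

# Hypothetical endpoint and parameter names for illustration only; see the
# TNRS documentation at http://tnrs.iplantcollaborative.org/ for the real API.
resp = requests.get(
    "http://tnrs.iplantcollaborative.org/api/resolve",   # hypothetical path
    params={"names": "Quercus alba,Pinus ponderossa"},   # note the misspelling
)
resp.raise_for_status()
for match in resp.json():   # each entry would pair a submitted and accepted name
    print(match)
```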
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moussa, Jonathan E.
2013-05-13
This piece of software is a new feature implemented inside an existing open-source library. Specifically, it is a new implementation of a density functional (HSE, short for Heyd-Scuseria-Ernzerhof) for a repository of density functionals, the libxc library. It fixes some numerical problems with existing implementations, as outlined in a scientific paper recently submitted for publication. Density functionals are components of electronic structure simulations, which model properties of electrons inside molecules and crystals.
Parkhurst, Justin
2016-07-20
The field of cognitive psychology has increasingly provided scientific insights to explore how humans are subject to unconscious sources of evidentiary bias, leading to errors that can affect judgement and decision-making. Increasingly these insights are being applied outside the realm of individual decision-making to the collective arena of policy-making as well. A recent editorial in this journal has particularly lauded the work of the World Bank for undertaking an open and critical reflection on sources of unconscious bias in its own expert staff that could undermine achievement of its key goals. The World Bank case indeed serves as a remarkable case of a global policy-making agency making its own critical reflections transparent for all to see. Yet the recognition that humans are prone to cognitive errors has been known for centuries, and the scientific exploration of such biases provided by cognitive psychology is now well-established. What still remains to be developed, however, is a widespread body of work that can inform efforts to institutionalise strategies to mitigate the multiple sources and forms of evidentiary bias arising within administrative and policy-making environments. Addressing this gap will require a programme of conceptual and empirical work that supports robust development and evaluation of institutional bias mitigation strategies. The cognitive sciences provides a scientific basis on which to proceed, but a critical priority will now be the application of that science to improve policy-making within those agencies taking responsibility for social welfare and development programmes. © 2017 The Author(s); Published by Kerman University of Medical Sciences. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Using R for large spatiotemporal data sets
NASA Astrophysics Data System (ADS)
Pebesma, Edzer
2017-04-01
Writing and sharing scientific software is a means to communicate scientific ideas for finding scientific consensus, no more and no less than writing and sharing scientific papers is. Important factors for successful communication are adopting an open-source environment, and using a language that is understood by many. For many scientists, R's combination of rich data abstraction and highly exposed data structures makes it an attractive communication tool. This paper discusses the development of spatial and spatiotemporal data handling and analysis with R since 2000, and will point to some of R's strengths and weaknesses in a historical perspective. We will also discuss a new, S3-based package for feature data ("Simple Features for R"), and point to a way forward into the data science realm, where pipeline-based workflows are assumed. Finally, we will discuss how, in a similar vein, massive satellite or climate model data sets, potentially held in a cloud environment, can be handled and analyzed with R.
Open Babel: An open chemical toolbox
2011-01-01
Background A frequent problem in computational modeling is the interconversion of chemical structures between different formats. While standard interchange formats exist (for example, Chemical Markup Language) and de facto standards have arisen (for example, SMILES format), the need to interconvert formats is a continuing problem due to the multitude of different application areas for chemistry data, differences in the data stored by different formats (0D versus 3D, for example), and competition between software along with a lack of vendor-neutral formats. Results We discuss, for the first time, Open Babel, an open-source chemical toolbox that speaks the many languages of chemical data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chemical and molecular data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection, to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses both in terms of software products and scientific research, including applications far beyond simple format interconversion. Conclusions Open Babel presents a solution to the proliferation of multiple chemical file formats. In addition, it provides a variety of useful utilities from conformer searching and 2D depiction, to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chemical data in areas such as organic chemistry, drug design, materials science, and computational chemistry. It is freely available under an open-source license from http://openbabel.org. PMID:21982300
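As a small illustration of using Open Babel as a programming library rather than a converter, the snippet below reads a SMILES string and emits another format through the pybel layer (imported as shown in the Open Babel 3.x bindings; the 2.x series described in the paper used `import pybel`):

```python
# Open Babel's Python layer; in Open Babel 3.x it is imported from the
# 'openbabel' package, while 2.x used a top-level 'import pybel'.
from openbabel import pybel

mol = pybel.readstring("smi", "c1ccccc1O")   # phenol, from SMILES
print(mol.molwt)                              # cheminformatics properties...
print(mol.write("inchi"))                     # ...and format interconversion
```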
NASA Astrophysics Data System (ADS)
Taylor, Faith E.; Malamud, Bruce D.; Millington, James D. A.
2016-04-01
Access to reliable spatial and quantitative datasets (e.g., infrastructure maps, historical observations, environmental variables) at regional and site-specific scales can be a limiting factor for understanding hazards and risks in developing country settings. Here we present a 'living database' of >75 freely available data sources relevant to hazard and risk in Africa (and more globally). Data sources include national scientific foundations, non-governmental bodies, crowd-sourced efforts, academic projects, special interest groups and others. The database is available at http://tinyurl.com/africa-datasets and is continually being updated, particularly in the context of broader natural hazards research we are doing in Malawi and Kenya. For each data source, we review the spatiotemporal resolution and extent and make our own assessments of the reliability and usability of datasets. Although such freely available datasets are sometimes presented as a panacea for improving our understanding of hazards and risk in developing countries, there are both pitfalls and opportunities unique to using this type of freely available data. These include factors such as resolution, homogeneity, uncertainty, access to metadata and training for usage. Based on our experience, use in the field and the grey and peer-reviewed literature, we present a suggested set of guidelines for using these free and open-source data in developing country contexts.
Layout-aware text extraction from full-text PDF of scientific articles.
Ramakrishnan, Cartic; Patnia, Abhishek; Hovy, Eduard; Burns, Gully Apc
2012-05-28
The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the 'Layout-Aware PDF Text Extraction' (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) Detecting contiguous text blocks using spatial layout processing to locate and identify blocks of contiguous text, (2) Classifying text blocks into rhetorical categories using a rule-based method and (3) Stitching classified text blocks together in the correct order, resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with precision = 0.96, recall = 0.89 and F1 = 0.91. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central. We then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF. Finally, we discuss preliminary error analysis for our system and identify further areas of improvement. LA-PDFText is an open-source tool for accurately extracting text from full-text scientific articles. The release of the system is available at http://code.google.com/p/lapdftext/.
Layout-aware text extraction from full-text PDF of scientific articles
2012-01-01
Background The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the ‘Layout-Aware PDF Text Extraction’ (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. Results Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) Detecting contiguous text blocks using spatial layout processing to locate and identify blocks of contiguous text, (2) Classifying text blocks into rhetorical categories using a rule-based method and (3) Stitching classified text blocks together in the correct order resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with Precision = 0.96, Recall = 0.89 and F1 = 0.91. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central. We then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF. Finally, we discuss preliminary error analysis for our system and identify further areas of improvement. Conclusions LA-PDFText is an open-source tool for accurately extracting text from full-text scientific articles. The release of the system is available at http://code.google.com/p/lapdftext/. PMID:22640904
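The rule-driven classification stage lends itself to a compact illustration. The sketch below is not LA-PDFText itself (which is a Java system driven by rule files); it is a minimal, hypothetical Python analogue of assigning a rhetorical category to a text block by matching hand-written rules against its first line.

```python
# A toy rule-based block classifier in the spirit of LA-PDFText stage 2.
# The categories and patterns are invented placeholders, not the real rules.
import re

RULES = [  # (category, pattern matched against a block's first line)
    ("abstract",     re.compile(r"^\s*Abstract\b", re.I)),
    ("introduction", re.compile(r"^\s*(\d+\.?\s*)?Introduction\b", re.I)),
    ("methods",      re.compile(r"^\s*(\d+\.?\s*)?(Methods|Materials)\b", re.I)),
    ("references",   re.compile(r"^\s*References\b", re.I)),
]

def classify_block(block: str) -> str:
    """Return the rhetorical category of a contiguous text block."""
    first_line = block.strip().splitlines()[0] if block.strip() else ""
    for category, pattern in RULES:
        if pattern.match(first_line):
            return category
    return "body"  # default rhetorical category

print(classify_block("Methods\nWe extracted text blocks..."))  # -> methods
```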
Leveraging Open Standards and Technologies to Enhance Community Access to Earth Science Lidar Data
NASA Astrophysics Data System (ADS)
Crosby, C. J.; Nandigam, V.; Krishnan, S.; Cowart, C.; Baru, C.; Arrowsmith, R.
2011-12-01
Lidar (Light Detection and Ranging) data, collected from space, airborne and terrestrial platforms, have emerged as an invaluable tool for a variety of Earth science applications ranging from ice sheet monitoring to modeling of earth surface processes. However, lidar data present a unique suite of challenges from the perspective of building cyberinfrastructure systems that enable the scientific community to access these valuable research datasets. Lidar data are typically characterized by millions to billions of individual measurements of x,y,z position plus attributes; these "raw" data are also often accompanied by derived raster products and are frequently terabytes in size. As a relatively new and rapidly evolving data collection technology, relevant open data standards and software projects are immature compared to those for other remote sensing platforms. The NSF-funded OpenTopography Facility project has developed an online lidar data access and processing system that co-locates data with on-demand processing tools to enable users to access both raw point cloud data as well as custom derived products and visualizations. OpenTopography is built on a Service Oriented Architecture (SOA) in which applications and data resources are deployed as standards compliant (XML and SOAP) Web services with the open source Opal Toolkit. To develop the underlying applications for data access, filtering and conversion, and various processing tasks, OpenTopography has heavily leveraged existing open source software efforts for both lidar and raster data. Operating on the de facto LAS binary point cloud format (maintained by ASPRS), the open source libLAS and LASlib libraries provide OpenTopography data ingestion, query and translation capabilities. Similarly, raster data manipulation is performed through a suite of services built on the Geospatial Data Abstraction Library (GDAL). OpenTopography has also developed its own algorithm for high-performance gridding of lidar point cloud data, Points2Grid, and has released the code as an open source project. An emerging conversation that the lidar community and OpenTopography are actively engaged in is the need for open, community supported standards and metadata for both full waveform and terrestrial (waveform and discrete return) lidar data. Further, given the immature nature of many lidar data archives and limited online access to public domain data, there is an opportunity to develop interoperable data catalogs based on an open standard such as the OGC CSW specification to facilitate discovery and access to Earth science oriented lidar data.
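As a rough illustration of the point-cloud-to-raster gridding that Points2Grid performs, the sketch below bins LAS points into a mean-elevation grid using the open-source laspy library and NumPy. The file name and cell size are placeholder assumptions, and this is not the Points2Grid implementation itself.

```python
# Minimal sketch: grid a lidar point cloud into a mean-elevation raster.
import laspy
import numpy as np

las = laspy.read("points.las")                    # hypothetical input file
x, y, z = np.asarray(las.x), np.asarray(las.y), np.asarray(las.z)

cell = 1.0                                        # grid resolution, data units
ix = ((x - x.min()) / cell).astype(int)           # column index of each point
iy = ((y - y.min()) / cell).astype(int)           # row index of each point

# Accumulate sum and count per cell, then take the mean elevation
zsum = np.zeros((iy.max() + 1, ix.max() + 1))
count = np.zeros_like(zsum)
np.add.at(zsum, (iy, ix), z)
np.add.at(count, (iy, ix), 1)
dem = np.where(count > 0, zsum / np.maximum(count, 1), np.nan)
```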
EarthCollab, building geoscience-centric implementations of the VIVO semantic software suite
NASA Astrophysics Data System (ADS)
Rowan, L. R.; Gross, M. B.; Mayernik, M. S.; Daniels, M. D.; Krafft, D. B.; Kahn, H. J.; Allison, J.; Snyder, C. B.; Johns, E. M.; Stott, D.
2017-12-01
EarthCollab, an EarthCube Building Block project, is extending an existing open-source semantic web application, VIVO, to enable the exchange of information about scientific researchers and resources across institutions. EarthCollab is a collaboration between UNAVCO, a geodetic facility and consortium that supports diverse research projects informed by geodesy; The Bering Sea Project, an interdisciplinary field program whose data archive is hosted by NCAR's Earth Observing Laboratory; and Cornell University. VIVO has been implemented by more than 100 universities and research institutions to highlight research and institutional achievements. This presentation will discuss benefits and drawbacks of working with and extending open source software. Some extensions include plotting georeferenced objects on a map, a mobile-friendly theme, integration of faceting via Elasticsearch, extending the VIVO ontology to capture geoscience-centric objects and relationships, and the ability to cross-link between VIVO instances. Most implementations of VIVO gather information about a single organization. The EarthCollab project created VIVO extensions to enable cross-linking of VIVO instances to reduce the amount of duplicate information about the same people and scientific resources and to enable dynamic linking of related information across VIVO installations. As the list of customizations grows, so does the effort required to maintain compatibility between the EarthCollab forks and the main VIVO code. For example, dozens of libraries and dependencies were updated prior to the VIVO v1.10 release, which introduced conflicts in the EarthCollab cross-linking code. The cross-linking code, however, has been developed to enable sharing of data across different versions of VIVO by using a JSON output schema standardized across versions. We will outline lessons learned in working with VIVO and its open source dependencies, which include Jena, Solr, Freemarker, and jQuery, and discuss future work by EarthCollab, which includes refining the cross-linking VIVO capabilities by continued integration of persistent and unique identifiers to enable automated lookup and matching across institutional VIVOs.
OpenHelix: bioinformatics education outside of a different box.
Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C
2010-11-01
The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review.
OpenHelix: bioinformatics education outside of a different box
Mangan, Mary E.; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C.
2010-01-01
The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review. PMID:20798181
The SCEC Broadband Platform: Open-Source Software for Strong Ground Motion Simulation and Validation
NASA Astrophysics Data System (ADS)
Silva, F.; Goulet, C. A.; Maechling, P. J.; Callaghan, S.; Jordan, T. H.
2016-12-01
The Southern California Earthquake Center (SCEC) Broadband Platform (BBP) is a carefully integrated collection of open-source scientific software programs that can simulate broadband (0-100 Hz) ground motions for earthquakes at regional scales. The BBP can run earthquake rupture and wave propagation modeling software to simulate ground motions for well-observed historical earthquakes and to quantify how well the simulated broadband seismograms match the observed seismograms. The BBP can also run simulations for hypothetical earthquakes. In this case, users input an earthquake location and magnitude description, a list of station locations, and a 1D velocity model for the region of interest, and the BBP software then calculates ground motions for the specified stations. The BBP scientific software modules implement kinematic rupture generation, low- and high-frequency seismogram synthesis using wave propagation through 1D layered velocity structures, several ground motion intensity measure calculations, and various ground motion goodness-of-fit tools. These modules are integrated into a software system that provides user-defined, repeatable calculation of ground-motion seismograms using multiple alternative ground motion simulation methods, and software utilities to generate tables, plots, and maps. The BBP has been developed over the last five years in a collaborative project involving geoscientists, earthquake engineers, graduate students, and SCEC scientific software developers. The SCEC BBP software released in 2016 can be compiled and run on recent Linux and Mac OS X systems with GNU compilers. It includes five simulation methods, seven simulation regions covering California, Japan, and Eastern North America, and the ability to compare simulation results against empirical ground motion models (aka GMPEs). The latest version includes updated ground motion simulation methods, a suite of new validation metrics and a simplified command line user interface.
Fostering successful scientific software communities
NASA Astrophysics Data System (ADS)
Bangerth, W.; Heister, T.; Hwang, L.; Kellogg, L. H.
2016-12-01
Developing sustainable open source software packages for the sciences appears at first to be primarily a technical challenge: How can one create stable and robust algorithms, appropriate software designs, sufficient documentation, quality assurance strategies such as continuous integration and test suites, or backward compatibility approaches that yield high-quality software usable not only by the authors, but also by the broader community of scientists? However, our experience from almost two decades of leading the development of the deal.II software library (http://www.dealii.org, a widely-used finite element package) and the ASPECT code (http://aspect.dealii.org, used to simulate convection in the Earth's mantle) has taught us that technical aspects are not the most difficult ones in scientific open source software. Rather, it is the social challenge of building and maintaining a community of users and developers interested in answering questions on user forums, contributing code, and jointly finding solutions to common technical and non-technical challenges. These problems are posed in an environment where project leaders typically have no resources to reward the majority of contributors, where very few people are specifically paid for the work they do on the project, and with frequent turnover of contributors as project members rotate into and out of jobs. In particular, much software work is done by graduate students who may become fluent enough in a software package only a year or two before they leave academia. We will discuss strategies we have found do and do not work in maintaining and growing communities around the scientific software projects we lead. Specifically, we will discuss the management style necessary to keep contributors engaged, ways to give credit where credit is due, and structuring documentation to decrease reliance on forums and thereby allow user communities to grow without straining those who answer questions.
ObsPy: A Python Toolbox for Seismology
NASA Astrophysics Data System (ADS)
Krischer, Lion; Megies, Tobias; Sales de Andrade, Elliott; Barsch, Robert; MacCarthy, Jonathan
2017-04-01
In recent years the Python ecosystem evolved into one of the most powerful and productive scientific environments across disciplines. ObsPy (https://www.obspy.org) is a fully community-driven, open-source project dedicated to providing a bridge for seismology into that ecosystem. It does so by offering: (1) read and write support for essentially every commonly used data format in seismology, with a unified interface and automatic format detection, covering waveform data (MiniSEED, SAC, SEG-Y, Reftek, …) as well as station (SEED, StationXML, …) and event meta information (QuakeML, ZMAP, …); (2) integrated access to the largest data centers, web services, and real-time data streams (FDSNWS, ArcLink, SeedLink, ...); (3) a powerful signal processing toolbox tuned to the specific needs of seismologists; and (4) utility functionality like travel time calculations with the TauP method, geodetic functions, and data visualizations. ObsPy has been in constant development for more than seven years and is developed and used by scientists around the world with successful applications in all branches of seismology. Additionally it nowadays serves as the foundation for a large number of more specialized packages. This presentation will give a short overview of the capabilities of ObsPy and point out several representative or new use cases. Additionally we will discuss the road ahead as well as the long-term sustainability of open-source scientific software.
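A brief example of the workflow described above, using ObsPy's documented API: fetch an hour of waveform data from an FDSN data center, apply the signal-processing toolbox, and plot. The network, station, and time window are illustrative choices, not taken from the abstract.

```python
# Fetch, process, and plot seismic waveforms with ObsPy.
from obspy import UTCDateTime
from obspy.clients.fdsn import Client

client = Client("IRIS")                       # FDSN web service client
t0 = UTCDateTime("2017-01-01T00:00:00")
st = client.get_waveforms(network="IU", station="ANMO",
                          location="00", channel="BHZ",
                          starttime=t0, endtime=t0 + 3600)

st.detrend("linear")                              # remove a linear trend
st.filter("bandpass", freqmin=0.1, freqmax=1.0)   # signal-processing toolbox
print(st)                                         # Stream with full metadata
st.plot()                                         # quick-look visualization
```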
New developments in the McStas neutron instrument simulation package
NASA Astrophysics Data System (ADS)
Willendrup, P. K.; Knudsen, E. B.; Klinkby, E.; Nielsen, T.; Farhi, E.; Filges, U.; Lefmann, K.
2014-07-01
The McStas neutron ray-tracing software package is a versatile tool for building accurate simulators of neutron scattering instruments at reactors, short- and long-pulsed spallation sources such as the European Spallation Source. McStas is extensively used for design and optimization of instruments, virtual experiments, data analysis and user training. McStas was founded as a scientific, open-source collaborative code in 1997. This contribution presents the project at its current state and gives an overview of the main new developments in McStas 2.0 (December 2012) and McStas 2.1 (expected fall 2013), including many new components, component parameter uniformisation, partial loss of backward compatibility, updated source brilliance descriptions, developments toward new tools and user interfaces, web interfaces and a new method for estimating beam losses and background from neutron optics.
NASA Astrophysics Data System (ADS)
Löwe, Peter; Plank, Margret; Ziedorn, Frauke
2015-04-01
In data-driven research, access to, citation of, and preservation of the full triad consisting of journal article, research data and research software have started to become good scientific practice. To foster the adoption of this practice, the significance of software tools that enable scientists to harness auxiliary audiovisual content in their research work has to be acknowledged. The advent of ubiquitous computer-based audiovisual recording and corresponding Web 2.0 hosting platforms like YouTube, SlideShare and GitHub has created new ecosystems for contextual information related to scientific software and data, which continues to grow both in size and variety of content. The current Web 2.0 platforms lack capabilities for long term archiving and scientific citation, such as persistent identifiers that allow referencing of specific intervals of the overall content. The audiovisual content currently shared by scientists ranges from commented howto-demonstrations on software handling, installation and data-processing, to aggregated visual analytics of the evolution of software projects over time. Such content is a crucial addition to the scientific message, as it ensures that software-based data-processing workflows can be assessed, understood and reused in the future. In the context of data-driven research, such content needs to be accessible by effective search capabilities, enabling the content to be retrieved and ensuring that the content producers receive credit for their efforts within the scientific community. Improved multimedia archiving and retrieval services for scientific audiovisual content which meet these requirements are currently implemented by the scientific library community. This paper exemplifies the existing challenges, requirements, benefits and the potential of the preservation, accessibility and citability of such audiovisual content for the Open Source communities based on the new audiovisual web service TIB|AV-Portal of the German National Library of Science and Technology. The web-based portal allows for extended search capabilities based on enhanced metadata derived by automated video analysis. By combining state-of-the-art multimedia retrieval techniques such as speech-, text-, and image recognition with semantic analysis, content-based access to videos at the segment level is provided. Further, by using the open standard Media Fragment Identifier (MFID), a citable Digital Object Identifier is displayed for each video segment. In addition to the continuously growing footprint of contemporary content, the importance of vintage audiovisual information needs to be considered: This paper showcases the successful application of the TIB|AV-Portal in the preservation and provision of a newly discovered version of a GRASS GIS promotional video produced by the US Army Construction Engineering Research Laboratory (US-CERL) in 1987. The video provides insight into the constraints of the very early days of the GRASS GIS project, which is the oldest active Free and Open Source Software (FOSS) GIS project, having been active for over thirty years. GRASS itself has turned into a collaborative scientific platform and a repository of scientific peer-reviewed code and an algorithm/knowledge hub for future generations of scientists [1]. This is a reference case for future preservation activities regarding semantic-enhanced Web 2.0 content from geospatial software projects within Academia and beyond.
References: [1] Chemin, Y., Petras V., Petrasova, A., Landa, M., Gebbert, S., Zambelli, P., Neteler, M., Löwe, P.: GRASS GIS: a peer-reviewed scientific platform and future research Repository, Geophysical Research Abstracts, Vol. 17, EGU2015-8314-1, 2015 (submitted)
The Visualization Toolkit (VTK): Rewriting the rendering code for modern graphics cards
NASA Astrophysics Data System (ADS)
Hanwell, Marcus D.; Martin, Kenneth M.; Chaudhary, Aashish; Avila, Lisa S.
2015-09-01
The Visualization Toolkit (VTK) is an open source, permissively licensed, cross-platform toolkit for scientific data processing, visualization, and data analysis. It is over two decades old, originally developed for a very different graphics card architecture. Modern graphics cards feature fully programmable, highly parallelized architectures with large core counts. VTK's rendering code was rewritten to take advantage of modern graphics cards, maintaining most of the toolkit's programming interfaces. This offers the opportunity to compare the performance of old and new rendering code on the same systems/cards. Significant improvements in rendering speeds and memory footprints mean that scientific data can be visualized in greater detail than ever before. The widespread use of VTK means that these improvements will reap significant benefits.
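Because the rewrite preserved most of VTK's programming interfaces, the canonical pipeline still looks the same to users. The sketch below shows that interface through VTK's Python bindings; it is a generic example, not code from the rendering rewrite itself.

```python
# The classic VTK source -> mapper -> actor -> renderer pipeline.
import vtk

sphere = vtk.vtkSphereSource()          # a simple polygonal data source
sphere.SetThetaResolution(32)
sphere.SetPhiResolution(32)

mapper = vtk.vtkPolyDataMapper()        # maps polydata to graphics primitives
mapper.SetInputConnection(sphere.GetOutputPort())

actor = vtk.vtkActor()
actor.SetMapper(mapper)

renderer = vtk.vtkRenderer()
renderer.AddActor(actor)

window = vtk.vtkRenderWindow()
window.AddRenderer(renderer)

interactor = vtk.vtkRenderWindowInteractor()
interactor.SetRenderWindow(window)
window.Render()
interactor.Start()                      # interactive window, same user-facing API
```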
Distributed visualization of gridded geophysical data: the Carbon Data Explorer, version 0.2.3
NASA Astrophysics Data System (ADS)
Endsley, K. A.; Billmire, M. G.
2016-01-01
Due to the proliferation of geophysical models, particularly climate models, the increasing resolution of their spatiotemporal estimates of Earth system processes, and the desire to easily share results with collaborators, there is a genuine need for tools to manage, aggregate, visualize, and share data sets. We present a new, web-based software tool - the Carbon Data Explorer - that provides these capabilities for gridded geophysical data sets. While originally developed for visualizing carbon flux, this tool can accommodate any time-varying, spatially explicit scientific data set, particularly NASA Earth system science level III products. In addition, the tool's open-source licensing and web presence facilitate distributed scientific visualization, comparison with other data sets and uncertainty estimates, and data publishing and distribution.
Singularity: Scientific containers for mobility of compute.
Kurtzer, Gregory M; Sochat, Vanessa; Bauer, Michael W
2017-01-01
Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science.
Singularity: Scientific containers for mobility of compute
Kurtzer, Gregory M.; Bauer, Michael W.
2017-01-01
Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science. PMID:28494014
Bonsai: an event-based framework for processing and controlling data streams
Lopes, Gonçalo; Bonacchi, Niccolò; Frazão, João; Neto, Joana P.; Atallah, Bassam V.; Soares, Sofia; Moreira, Luís; Matias, Sara; Itskov, Pavel M.; Correia, Patrícia A.; Medina, Roberto E.; Calcaterra, Lorenza; Dreosti, Elena; Paton, Joseph J.; Kampff, Adam R.
2015-01-01
The design of modern scientific experiments requires the control and monitoring of many different data streams. However, the serial execution of programming instructions in a computer makes it a challenge to develop software that can deal with the asynchronous, parallel nature of scientific data. Here we present Bonsai, a modular, high-performance, open-source visual programming framework for the acquisition and online processing of data streams. We describe Bonsai's core principles and architecture and demonstrate how it allows for the rapid and flexible prototyping of integrated experimental designs in neuroscience. We specifically highlight some applications that require the combination of many different hardware and software components, including video tracking of behavior, electrophysiology and closed-loop control of stimulation. PMID:25904861
Open scientific communication urged
NASA Astrophysics Data System (ADS)
Richman, Barbara T.
In a report released last week the National Academy of Sciences' Panel on Scientific Communication and National Security concluded that the ‘limited and uncertain benefits’ of controls on the dissemination of scientific and technological research are ‘outweighed by the importance of scientific progress, which open communication accelerates, to the overall welfare of the nation.’ The 18-member panel, chaired by Dale R. Corson, president emeritus of Cornell University, was created last spring (Eos, April 20, 1982, p. 241) to examine the delicate balance between open dissemination of scientific and technical information and the U.S. government's desire to protect scientific and technological achievements from being translated into military advantages for our political adversaries.The panel dealt almost exclusively with the relationship between the United States and the Soviet Union but noted that there are ‘clear problems in scientific communication and national security involving Third World countries.’ Further study of this matter is necessary.
NASA Astrophysics Data System (ADS)
Glazer, B. T.
2016-02-01
Here, we describe the development of novel, low-cost, open-source instrumentation to enable wireless data transfer of biogeochemical sensors in the coastal zone. The platform is centered upon the BeagleBone Black single-board computer. Process-inquiry in environmental sciences suffers from undersampling; enabling sustained and unattended data collection typically involves expensive instrumentation and infrastructure deployed as cabled observatories with little flexibility in deployment location following initial installation. The high cost of commercially available or custom electronic packages has not only limited the number of sensor node sites that can be targeted by reasonably well-funded academic researchers, but has also entirely prohibited widespread engagement with K-12, public non-profit, and 'citizen scientist' STEM audiences. The new platform under development represents a balanced blend of research-grade sensors and low-cost open-source electronics that are easily assembled. Custom, robust, open-source code that remains customizable for specific node configurations can match a specific deployment's measurement needs, depending on the scientific research priorities. We have demonstrated prototype capabilities and versatility through lab testing and field deployments of multiple sensor nodes with multiple sensor inputs, all of which stream near-real-time data over wireless RF links to a shore-based base station. On shore, first-pass QA/QC data processing takes place and near-real-time plots are made available on the World Wide Web. Specifically, we have worked closely with an environmental and cultural management and restoration non-profit organization, and middle and high school science classes, engaging their interest in STEM application to local watershed processes. Ultimately, continued successful development of this pilot project can lead to a coastal oceanographic analogue of the popular Weather Underground personal weather station model.
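The node architecture described here can be sketched in a few lines of Python. Everything below is a hedged illustration of the pattern, not the project's code: the serial device path, base-station endpoint, and payload fields are all hypothetical.

```python
# Minimal sensor-node loop: read a serial-attached probe, forward readings
# to a base station for QA/QC and web plotting. All names are placeholders.
import json
import time

import requests
import serial  # pyserial

BASE_STATION = "http://192.168.1.10:8080/ingest"  # assumed base-station API
port = serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=2)

while True:
    raw = port.readline().decode("ascii", errors="ignore").strip()
    try:
        value = float(raw)          # e.g. "23.4" from a temperature probe
    except ValueError:
        continue                    # skip malformed lines
    reading = {"sensor": "temperature", "value": value, "time": time.time()}
    # First-pass QA/QC and near-real-time plots happen on the base station
    requests.post(BASE_STATION, json=reading, timeout=5)
    time.sleep(10)                  # sample every 10 seconds
```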
Exploiting Open Environmental Data using Linked Data and Cloud Computing: the MELODIES project
NASA Astrophysics Data System (ADS)
Blower, Jon; Gonçalves, Pedro; Caumont, Hervé; Koubarakis, Manolis; Perkins, Bethan
2015-04-01
The European Open Data Strategy establishes important new principles that ensure that European public sector data will be released at no cost (or marginal cost), in machine-readable, commonly-understood formats, and with liberal licences enabling wide reuse. These data encompass both scientific data about the environment (from Earth Observation and other fields) and other public sector information, including diverse topics such as demographics, health and crime. Many open geospatial datasets (e.g. land use) are already available through the INSPIRE directive and made available through infrastructures such as the Global Earth Observation System of Systems (GEOSS). The intention of the Open Data Strategy is to stimulate the growth of research and value-adding services that build upon these data streams; however, the potential value inherent in open data, and the benefits that can be gained by combining previously disparate sources of information, are only just starting to become understood. The MELODIES project (Maximising the Exploitation of Linked Open Data In Enterprise and Science) is developing eight innovative and sustainable services, based upon Open Data, for users in research, government, industry and the general public in a broad range of societal and environmental benefit areas. MELODIES (http://melodiesproject.eu) is a European FP7 project that is coordinated by the University of Reading and has sixteen partners (including nine SMEs) from eight European countries. It started in November 2013 and will run for three years. The project is therefore in its early stages, and we will value the opportunity that this workshop affords to present our plans and interact with the wider Linked Geospatial Data community. The project is developing eight new services [1] covering a range of domains including agriculture, urban ecosystems, land use management, marine information, desertification, crisis management and hydrology. These services will combine Earth Observation data with other open data sources to produce new information for the benefit of scientists, industry, government decision-makers, public service providers and citizens. The long-term sustainability of the services will be assessed critically throughout the project from a number of angles (technical, political and economic), in order to ensure that the full benefits of the MELODIES project are realised in the long term. The priority of the project, therefore, is to demonstrate that releasing data openly leads to concrete commercial and scientific benefits, and can stimulate the production of new applications and viable services. [1] http://www.melodiesproject.eu/services.html
Cloud-based Jupyter Notebooks for Water Data Analysis
NASA Astrophysics Data System (ADS)
Castronova, A. M.; Brazil, L.; Seul, M.
2017-12-01
The development and adoption of technologies by the water science community to improve our ability to openly collaborate and share workflows will have a transformative impact on how we address the challenges associated with collaborative and reproducible scientific research. Jupyter notebooks offer one solution by providing an open-source platform for creating metadata-rich toolchains for modeling and data analysis applications. Adoption of this technology within the water sciences, coupled with publicly available datasets from agencies such as USGS, NASA, and EPA enables researchers to easily prototype and execute data intensive toolchains. Moreover, implementing this software stack in a cloud-based environment extends its native functionality to provide researchers a mechanism to build and execute toolchains that are too large or computationally demanding for typical desktop computers. Additionally, this cloud-based solution enables scientists to disseminate data processing routines alongside journal publications in an effort to support reproducibility. For example, these data collection and analysis toolchains can be shared, archived, and published using the HydroShare platform or downloaded and executed locally to reproduce scientific analysis. This work presents the design and implementation of a cloud-based Jupyter environment and its application for collecting, aggregating, and munging various datasets in a transparent, sharable, and self-documented manner. The goals of this work are to establish a free and open source platform for domain scientists to (1) conduct data intensive and computationally intensive collaborative research, (2) utilize high performance libraries, models, and routines within a pre-configured cloud environment, and (3) enable dissemination of research products. This presentation will discuss recent efforts towards achieving these goals, and describe the architectural design of the notebook server in an effort to support collaborative and reproducible science.
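As a sketch of the kind of data-collection toolchain such a notebook might contain, the cell below pulls daily streamflow from the public USGS NWIS web service and plots it with pandas. The site number, parameter code, and date range are illustrative assumptions, not taken from the abstract.

```python
# Notebook-style cell: fetch USGS daily streamflow and plot it.
import pandas as pd
import requests

url = "https://waterservices.usgs.gov/nwis/dv"
params = {
    "format": "json",
    "sites": "01646500",          # Potomac River near Washington, DC
    "parameterCd": "00060",       # discharge, cubic feet per second
    "startDT": "2017-01-01",
    "endDT": "2017-06-30",
}
payload = requests.get(url, params=params, timeout=30).json()

# Unpack the NWIS JSON time series into a DataFrame
series = payload["value"]["timeSeries"][0]["values"][0]["value"]
df = pd.DataFrame(series)
df["dateTime"] = pd.to_datetime(df["dateTime"])
df["value"] = df["value"].astype(float)
df.plot(x="dateTime", y="value", title="Daily mean discharge (cfs)")
```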
Establishing Long Term Data Management Research Priorities via a Data Decadal Survey
NASA Astrophysics Data System (ADS)
Wilson, A.; Uhlir, P.; Meyer, C. B.; Robinson, E.
2013-12-01
We live in a time of unprecedented collection of and access to scientific data. Improvements in sensor technologies and modeling capabilities are constantly producing new data sources. Data sets are being used for unexpected purposes far from their point of origin, as research spans projects, discipline domains, and temporal and geographic boundaries. The nature of science is evolving, with more open science, open publications, and changes to the nature of peer review and data "publication". Data-intensive, or computational science, has been identified as a new research paradigm. There is recognition that the creation of a data set can be a contribution to science deserving of recognition comparable to other scientific publications. Federally funded projects are generally expected to make their data open and accessible to everyone. In this dynamic environment, scientific progress is ever more dependent on good data management practices and policies. Yet current data management and stewardship practices are insufficient. Data sets created at great, and often public, expense are at risk of being lost for technological or organizational reasons. Insufficient documentation and understanding of data can mean that the data are used incorrectly or not at all. Scientific results are being scrutinized and questioned, and occasionally retracted due to problems in data management. The volume of data is greatly increasing while funding for data management is meager and generally must be found within existing budgets. Many federal government agencies, including NASA, USGS, NOAA and NSF are already making efforts to address data management issues. Executive memos and directives give substantial impetus to those efforts, such as the May 9 Executive Order directing agencies to implement Open Data Policy requirements and regularly report their progress. However, these distributed efforts risk duplicating effort, lack a unifying, long-term strategic vision, and too often work in competition with other priorities of the research enterprise. This presentation will introduce the Data Decadal Survey, an initial concept created in collaboration between the Federation of Earth Science Information Partners (ESIP) and of the National Research Council's Board on Research Data and Information (BRDI). Consistent with Executive open data policies, the Survey will provide a coordinating platform to address overarching issues and identify research needs and funding priorities in scientific data management and stewardship for the long term. The Survey would address at the broadest level gaps in data management knowledge and practices that hold back scientific progress, and recommend a strategy to address them. The goal is to provide a long term strategic vision that will
The Advanced Gamma-ray Imaging System (AGIS)
NASA Astrophysics Data System (ADS)
Buckley, James
2008-04-01
We describe a concept for a ~km^2 ground-based gamma-ray experiment (AGIS) comprised of an array of ~100 imaging atmospheric Cherenkov telescopes achieving a sensitivity an order of magnitude better than the current generation of space or ground-based instruments in the energy range of 40 GeV to ~100 TeV. We present the scientific drivers for AGIS including the prospects for contributions to understanding extragalactic sources such as nearby galaxies, active galaxies, galaxy clusters and GRB; galactic sources such as X-ray binaries, supernova remnants, and pulsar wind nebulae; as well as probes of fundamental physics including indirectly detecting dark matter and probing TeV-scale physics. With the current generation of atmospheric Cherenkov telescope arrays, TeV astronomy has become well established, with the number of TeV gamma-ray sources now nearing 100, including many unidentified and serendipitous sources. Improvements in the instantaneous field of view, angular resolution, effective area and energy resolution of AGIS are certain to provide great scientific returns in high energy astrophysics as well as opening up new discovery space. Here we present an overview of the ongoing design studies for AGIS including the optimization of array parameters as well as an overview of the technical drivers for the observatory.
NASA Astrophysics Data System (ADS)
Rosenthal, S.; Kurtz, L.
2017-12-01
Open data plays an important role in facilitating scientific progress and developing sound public policies. However, sometimes important scientific initiatives can be subverted by those seeking to politicize science and attack scientists. There is a critical distinction between data openness and forcing scientists to live in a fishbowl by demanding disclosure of traditionally confidential materials, such as peer review correspondence or preliminary drafts. The former is intended to promote scientific advances (which may well involve criticisms and disagreements), while the latter is an attempt to chill and disrupt research. Agenda-driven groups have sought to obtain scientists' private emails, preliminary drafts, and peer critiques - which may include technical jargon and "what if" debates that can easily be taken out of context - in an effort to intimidate researchers and disrupt their work. Fostering true transparency requires an understanding of what should be shared to advance scientific research, including disclosing information that might expose potential biases (such as funding information), while also protecting the candid debate, open collaboration, and academic freedom that is integral to the scientific process.
Anatomy of the Crowd4Discovery crowdfunding campaign.
Perlstein, Ethan O
2013-01-01
Crowdfunding allows the public to fund creative projects, including curiosity-driven scientific research. Last Fall, I was part of a team that raised $25,460 from an international coalition of "micropatrons" for an open, pharmacological research project called Crowd4Discovery. The goal of Crowd4Discovery is to determine the precise location of amphetamines inside mouse brain cells, and we are sharing the results of this project on the Internet as they trickle in. In this commentary, I will describe the genesis of Crowd4Discovery, our motivations for crowdfunding, an analysis of our fundraising data, and the nuts and bolts of running a crowdfunding campaign. Science crowdfunding is in its infancy but has already been successfully used by an array of scientists in academia and in the private sector as both a supplement and a substitute to grants. With traditional government sources of funding for basic scientific research contracting, an alternative model that couples fundraising and outreach - and in the process encourages more openness and accountability - may be increasingly attractive to researchers seeking to diversify their funding streams.
TCGA Third Annual Scientific Symposium - TCGA
Monday, May 12 - Tuesday, May 13, 2014 National Institutes of Health, Bethesda, Md. This open scientific meeting will consist of collaborative workshops, poster sessions, and plenary sessions. Registration is now open.
Scalable Regression Tree Learning on Hadoop using OpenPlanet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yin, Wei; Simmhan, Yogesh; Prasanna, Viktor
As scientific and engineering domains attempt to effectively analyze the deluge of data arriving from sensors and instruments, machine learning is becoming a key data mining tool to build prediction models. Regression tree is a popular learning model that combines decision trees and linear regression to forecast numerical target variables based on a set of input features. MapReduce is well suited for addressing such data-intensive learning applications, and a proprietary regression tree algorithm, PLANET, using MapReduce has been proposed earlier. In this paper, we describe an open source implementation of this algorithm, OpenPlanet, on the Hadoop framework using a hybrid approach. Further, we evaluate the performance of OpenPlanet using real-world datasets from the Smart Power Grid domain to perform energy use forecasting, and propose tuning strategies of Hadoop parameters to improve the performance of the default configuration by 75% for a training dataset of 17 million tuples on a 64-core Hadoop cluster on FutureGrid.
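OpenPlanet itself is a Hadoop/MapReduce implementation; as a minimal single-machine illustration of the model family it distributes, the sketch below fits a regression tree to synthetic energy-use data with scikit-learn. The feature names and data are invented for the example.

```python
# A regression tree forecasting a numeric target from input features
# (single-machine stand-in, not the OpenPlanet/PLANET algorithm).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# Synthetic "smart grid" features: hour of day and outdoor temperature (C)
X = np.column_stack([rng.integers(0, 24, 1000), rng.normal(20, 8, 1000)])
# Energy use rises with hour of day and with cooling demand above 22 C
y = 5 + 0.3 * X[:, 0] + 0.8 * np.maximum(X[:, 1] - 22, 0) \
    + rng.normal(0, 1, 1000)

tree = DecisionTreeRegressor(max_depth=5).fit(X, y)
print(tree.predict([[18.0, 30.0]]))  # forecast at 6 pm, 30 degrees C
```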
The fast azimuthal integration Python library: pyFAI.
Ashiotis, Giannis; Deschildre, Aurore; Nawaz, Zubair; Wright, Jonathan P; Karkoulis, Dimitrios; Picca, Frédéric Emmanuel; Kieffer, Jérôme
2015-04-01
pyFAI is an open-source software package designed to perform azimuthal integration and, correspondingly, two-dimensional regrouping on area-detector frames for small- and wide-angle X-ray scattering experiments. It is written in Python (with binary submodules for improved performance), a language widely accepted and used by the scientific community today, which enables users to easily incorporate the pyFAI library into their processing pipeline. This article focuses on recent work, especially the ease of calibration, its accuracy and the execution speed for integration.
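A short example of the integration workflow the abstract describes: load a calibrated azimuthal integrator from a PONI calibration file and regroup a 2D detector frame into a 1D scattering pattern. The file names are placeholder assumptions.

```python
# Azimuthal integration of a detector frame with pyFAI.
import fabio   # image I/O companion library to pyFAI
import pyFAI

ai = pyFAI.load("detector_calibration.poni")   # AzimuthalIntegrator
frame = fabio.open("scattering_frame.edf").data

# Regroup the 2D frame into 1000 radial bins of q (inverse nanometres)
q, intensity = ai.integrate1d(frame, 1000, unit="q_nm^-1")
```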
NASA Technical Reports Server (NTRS)
Mattmann, Chris A.; Kelly, Sean; Crichton, Daniel J.; Hughes, J. Steven; Hardman, Sean; Ramirez, Paul; Joyner, Ron
2006-01-01
In this paper, we present a preliminary study of several different electronic data movement technologies. We detail our approach to classifying the technologies included in our study and present the preliminary results of some initial performance benchmarking. Our studies suggest that highly parallel TCP/IP streaming technologies, such as GridFTP and bbFTP, outperform commercial and open-source UDP-bursting technologies in several of the key data movement dimensions that we studied.
Current trends for customized biomedical software tools.
Khan, Haseeb Ahmad
2017-01-01
In the past, biomedical scientists were solely dependent on expensive commercial software packages for various applications. However, the advent of user-friendly programming languages and open source platforms has revolutionized the development of simple and efficient customized software tools for solving specific biomedical problems. Many of these tools are designed and developed by biomedical scientists independently or with the support of computer experts, and are often made freely available for the benefit of the scientific community. The current trends for customized biomedical software tools are highlighted in this short review.
Status of GDL - GNU Data Language
NASA Astrophysics Data System (ADS)
Coulais, A.; Schellens, M.; Gales, J.; Arabas, S.; Boquien, M.; Chanial, P.; Messmer, P.; Fillmore, D.; Poplawski, O.; Maret, S.; Marchal, G.; Galmiche, N.; Mermet, T.
2010-12-01
Gnu Data Language (GDL) is an open-source interpreted language aimed at numerical data analysis and visualisation. It is a free implementation of the Interactive Data Language (IDL) widely used in Astronomy. GDL has a full syntax compatibility with IDL, and includes a large set of library routines targeting advanced matrix manipulation, plotting, time-series and image analysis, mapping, and data input/output including numerous scientific data formats. We will present the current status of the project, the key accomplishments, and the weaknesses - areas where contributions are welcome!
Public Lab: Community-Based Approaches to Urban and Environmental Health and Justice.
Rey-Mazón, Pablo; Keysar, Hagit; Dosemagen, Shannon; D'Ignazio, Catherine; Blair, Don
2018-06-01
This paper explores three cases of Do-It-Yourself, open-source technologies developed within the diverse array of topics and themes in the communities around the Public Laboratory for Open Technology and Science (Public Lab). These cases focus on aerial mapping, water quality monitoring and civic science practices. The techniques discussed have in common the use of accessible, community-built technologies for acquiring data. They are also concerned with embedding collaborative and open source principles into the objects, tools, social formations and data sharing practices that emerge from these inquiries. The focus is on developing processes of collaborative design and experimentation through material engagement with technology and issues of concern. Problem-solving, here, is a tactic, while the strategy is an ongoing engagement with the problem of participation in its technological, social and political dimensions especially considering the increasing centralization and specialization of scientific and technological expertise. The authors also discuss and reflect on the Public Lab's approach to civic science in light of ideas and practices of citizen/civic veillance, or "sousveillance", by emphasizing people before data, and by investigating the new ways of seeing and doing that this shift in perspective might provide.
PYCHEM: a multivariate analysis package for python.
Jarvis, Roger M; Broadhurst, David; Johnson, Helen; O'Boyle, Noel M; Goodacre, Royston
2006-10-15
We have implemented a multivariate statistical analysis toolbox, with an optional standalone graphical user interface (GUI), using the Python scripting language. This is a free and open source project that addresses the need for a multivariate analysis toolbox in Python. Although the functionality provided does not cover the full range of multivariate tools that are available, it has a broad complement of methods that are widely used in the biological sciences. In contrast to tools like MATLAB, PyChem 2.0.0 is easily accessible and free, allows for rapid extension using a range of Python modules and is part of the growing amount of complementary and interoperable scientific software in Python based upon SciPy. One of the attractions of PyChem is that it is an open source project and so there is an opportunity, through collaboration, to increase the scope of the software and to continually evolve a user-friendly platform that has applicability across a wide range of analytical and post-genomic disciplines. http://sourceforge.net/projects/pychem
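PyChem's own interface is not reproduced here; as a hedged stand-in, the snippet below performs the same kind of multivariate analysis (principal component analysis) on synthetic data using scikit-learn, another package in the interoperable SciPy-based ecosystem the abstract points to.

```python
# PCA on synthetic spectra-like data (a stand-in for PyChem's toolbox).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 200))        # 60 samples x 200 variables
X[:30] += 0.5                         # two crude sample groups

pca = PCA(n_components=2)
scores = pca.fit_transform(X)         # sample scores on PC1/PC2
print(pca.explained_variance_ratio_)  # variance captured per component
```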
Fisher Matrix Preloaded — FISHER4CAST
NASA Astrophysics Data System (ADS)
Bassett, Bruce A.; Fantaye, Yabebal; Hlozek, Renée; Kotze, Jacques
The Fisher Matrix is the backbone of modern cosmological forecasting. We describe the Fisher4Cast software: a general-purpose, easy-to-use Fisher Matrix framework. It is open source, rigorously designed and tested, and includes a Graphical User Interface (GUI) with automated LaTeX file creation capability and point-and-click Fisher ellipse generation. Fisher4Cast was designed for ease of extension and, although written in Matlab, is easily portable to open-source alternatives such as Octave and Scilab. Here we use Fisher4Cast to present new 3D and 4D visualizations of the forecasting landscape and to investigate the effects of growth and curvature on future cosmological surveys. Early releases have been available since mid-2008. The current release of the code is Version 2.2, which is described here. For ease of reference a Quick Start guide and the code used to produce the figures in this paper are included, in the hope that it will be useful to the cosmology and wider scientific communities.
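Fisher4Cast is written in Matlab, so the following is an independent NumPy sketch of the core calculation such frameworks automate: build a Fisher matrix for a toy model (a straight line observed with Gaussian noise) and read off the forecast parameter errors. The sampling points and noise level are invented.

```python
# Fisher-matrix forecast for the model y = a + b*x with noise sigma.
import numpy as np

x = np.linspace(0.0, 1.0, 50)      # assumed survey sampling points
sigma = 0.1                        # assumed per-point noise

# Derivatives of the model mean with respect to the parameters (a, b)
dmu = np.stack([np.ones_like(x), x])          # shape (2, npoints)

# F_ij = sum_k dmu_i(x_k) dmu_j(x_k) / sigma^2
F = dmu @ dmu.T / sigma**2

cov = np.linalg.inv(F)                        # forecast parameter covariance
print("sigma(a), sigma(b) =", np.sqrt(np.diag(cov)))
# The 2x2 'cov' defines the error ellipse that tools like Fisher4Cast
# draw for each parameter pair.
```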
LSSGalPy: Interactive Visualization of the Large-scale Environment Around Galaxies
NASA Astrophysics Data System (ADS)
Argudo-Fernández, M.; Duarte Puertas, S.; Ruiz, J. E.; Sabater, J.; Verley, S.; Bergond, G.
2017-05-01
New tools are needed to handle the growth of data in astrophysics delivered by recent and upcoming surveys. We aim to build open-source, light, flexible, and interactive software designed to visualize extensive three-dimensional (3D) tabular data. Entirely written in the Python language, we have developed interactive tools to browse and visualize the positions of galaxies in the universe and their positions with respect to its large-scale structures (LSS). Motivated by a previous study, we created two codes using Mollweide projection and wedge diagram visualizations, where survey galaxies can be overplotted on the LSS of the universe. These are interactive representations where the visualizations can be controlled by widgets. We have released these open-source codes that have been designed to be easily re-used and customized by the scientific community to fulfill their needs. The codes are adaptable to other kinds of 3D tabular data and are robust enough to handle several million objects.
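A minimal matplotlib sketch of the Mollweide-projection view described above, using random placeholder coordinates instead of survey galaxies; this is generic matplotlib, not the LSSGalPy code itself.

```python
# All-sky scatter of points in a Mollweide projection.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
ra = rng.uniform(-np.pi, np.pi, 500)      # RA mapped to [-pi, pi] radians
dec = np.arcsin(rng.uniform(-1, 1, 500))  # uniform on the sphere

fig = plt.figure()
ax = fig.add_subplot(111, projection="mollweide")
ax.scatter(ra, dec, s=2)                  # "galaxies" over the whole sky
ax.grid(True)
plt.show()
```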
El-Houjeiri, Hassan M; Brandt, Adam R; Duffy, James E
2013-06-04
Existing transportation fuel cycle emissions models are either general and calculate nonspecific values of greenhouse gas (GHG) emissions from crude oil production, or are not available for public review and auditing. We have developed the Oil Production Greenhouse Gas Emissions Estimator (OPGEE) to provide open-source, transparent, rigorous GHG assessments for use in scientific assessment, regulatory processes, and analysis of GHG mitigation options by producers. OPGEE uses petroleum engineering fundamentals to model emissions from oil and gas production operations. We introduce OPGEE and explain the methods and assumptions used in its construction. We run OPGEE on a small set of fictional oil fields and explore model sensitivity to selected input parameters. Results show that upstream emissions from petroleum production operations can vary from 3 gCO2/MJ to over 30 gCO2/MJ using realistic ranges of input parameters. Significant drivers of emissions variation are steam injection rates, water handling requirements, and rates of flaring of associated gas.
Open-source Framework for Storing and Manipulation of Plasma Chemical Reaction Data
NASA Astrophysics Data System (ADS)
Jenkins, T. G.; Averkin, S. N.; Cary, J. R.; Kruger, S. E.
2017-10-01
We present a new open-source framework for storage and manipulation of plasma chemical reaction data that has emerged from our in-house project MUNCHKIN. This framework consists of Python scripts and C++ programs. It stores data in an SQL database for fast retrieval and manipulation. For example, it is possible to fit cross-section data to the most widely used analytical expressions, calculate reaction rates for Maxwellian distribution functions of colliding particles, and fit them to different analytical expressions. Another important feature of this framework is the ability to calculate transport properties based on the cross-section data and supplied distribution functions. In addition, this framework allows the export of chemical reaction descriptions in LaTeX format for ease of inclusion in scientific papers. With the help of this framework it is possible to generate corresponding VSim (Particle-In-Cell simulation code) and USim (unstructured multi-fluid code) input blocks with appropriate cross-sections.
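One computation named above, a reaction rate for a Maxwellian distribution of colliding particles, can be sketched directly: given a cross section sigma(E), the rate coefficient is k(T) = sqrt(8/(pi m)) (kB T)^(-3/2) Int sigma(E) E exp(-E/kB T) dE. The snippet below evaluates this with SciPy for a made-up cross section with a 10 eV threshold; it is an independent illustration, not the MUNCHKIN code.

```python
# Maxwellian-averaged rate coefficient from an (invented) cross section.
import numpy as np
from scipy.integrate import quad

KB = 1.380649e-23      # Boltzmann constant, J/K
ME = 9.1093837015e-31  # electron mass, kg
EV = 1.602176634e-19   # 1 eV in joules

def sigma(E):
    """Placeholder cross section in m^2 with a 10 eV threshold."""
    E_eV = E / EV
    return 1e-20 * np.log(E_eV / 10.0) / E_eV if E_eV > 10.0 else 0.0

def rate_coefficient(T):
    """Maxwellian-averaged k(T) in m^3/s for electrons at temperature T (K)."""
    kT = KB * T
    integrand = lambda E: sigma(E) * E * np.exp(-E / kT)
    integral, _ = quad(integrand, 10.0 * EV, 10.0 * EV + 100.0 * kT)
    return np.sqrt(8.0 / (np.pi * ME)) * kT ** -1.5 * integral

print(rate_coefficient(3.0e4))  # ~2.6 eV electrons; prints k in m^3/s
```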
The OSG Open Facility: an on-ramp for opportunistic scientific computing
NASA Astrophysics Data System (ADS)
Jayatilaka, B.; Levshina, T.; Sehgal, C.; Gardner, R.; Rynge, M.; Würthwein, F.
2017-10-01
The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed high-throughput computing (DHTC) to researchers from a wide variety of disciplines via the OSG Open Facility. The Open Facility consists of OSG resources that are available opportunistically to users other than resource owners and their collaborators. In the past two years, the Open Facility has doubled its annual throughput to over 200 million wall hours. More than half of these resources are used by over 100 individual researchers from over 60 institutions in fields such as biology, medicine, math, economics, and many others. Over 10% of these individual users utilized in excess of 1 million computational hours each in the past year. The largest source of these cycles is temporary unused capacity at institutions affiliated with US LHC computational sites. An increasing fraction, however, comes from university HPC clusters and large national infrastructure supercomputers offering unused capacity. Such expansions have allowed the OSG to provide ample computational resources to both individual researchers and small groups as well as sizable international science collaborations such as LIGO, AMS, IceCube, and sPHENIX. Opening up access to the Fermilab FabrIc for Frontier Experiments (FIFE) project has also allowed experiments such as mu2e and NOvA to make substantial use of Open Facility resources, the former with over 40 million wall hours in a year. We present how this expansion was accomplished as well as future plans for keeping the OSG Open Facility at the forefront of enabling scientific research by way of DHTC.
The OSG Open Facility: An On-Ramp for Opportunistic Scientific Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jayatilaka, B.; Levshina, T.; Sehgal, C.
The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed high-throughput computing (DHTC) to researchers from a wide variety of disciplines via the OSG Open Facility. The Open Facility consists of OSG resources that are available opportunistically to users other than resource owners and their collaborators. In the past two years, the Open Facility has doubled its annual throughput to over 200 million wall hours. More than half of these resources are used by over 100 individual researchers from over 60 institutions in fields such as biology, medicine, math, economics, and many others. Over 10% of these individual users utilized in excess of 1 million computational hours each in the past year. The largest source of these cycles is temporary unused capacity at institutions affiliated with US LHC computational sites. An increasing fraction, however, comes from university HPC clusters and large national infrastructure supercomputers offering unused capacity. Such expansions have allowed the OSG to provide ample computational resources to both individual researchers and small groups as well as sizable international science collaborations such as LIGO, AMS, IceCube, and sPHENIX. Opening up access to the Fermilab FabrIc for Frontier Experiments (FIFE) project has also allowed experiments such as mu2e and NOvA to make substantial use of Open Facility resources, the former with over 40 million wall hours in a year. We present how this expansion was accomplished as well as future plans for keeping the OSG Open Facility at the forefront of enabling scientific research by way of DHTC.
Ravi, Keerthi Sravan; Potdar, Sneha; Poojar, Pavan; Reddy, Ashok Kumar; Kroboth, Stefan; Nielsen, Jon-Fredrik; Zaitsev, Maxim; Venkatesan, Ramesh; Geethanath, Sairam
2018-03-11
To provide a single open-source platform for comprehensive MR algorithm development, inclusive of simulations, pulse sequence design and deployment, reconstruction, and image analysis, we integrated the "Pulseq" platform for vendor-independent pulse programming with the Graphical Programming Interface (GPI), a scientific development environment based on Python. Our integrated platform, Pulseq-GPI, permits sequences to be defined visually and exported to the Pulseq file format for execution on an MR scanner. For comparison, Pulseq files using either MATLAB only ("MATLAB-Pulseq") or Python only ("Python-Pulseq") were generated. We demonstrated three fundamental sequences on a 1.5 T scanner. Execution times of the three variants of implementation were compared on two operating systems. In vitro phantom images indicate equivalence with the vendor-supplied implementations and MATLAB-Pulseq. The examples demonstrated in this work illustrate the unifying capability of Pulseq-GPI. The execution times of all three implementations were fast (a few seconds). The software is capable of user-interface based development and/or command line programming. The tool demonstrated here, Pulseq-GPI, integrates the open-source simulation, reconstruction and analysis capabilities of GPI Lab with the pulse sequence design and deployment features of Pulseq. Current and future work includes providing an ISMRMRD interface and incorporating Specific Absorption Rate and Peripheral Nerve Stimulation computations. Copyright © 2018 Elsevier Inc. All rights reserved.
Sticks and Carrots: Encouraging Open Science at its Source.
Leonelli, Sabina; Spichtinger, Daniel; Prainsack, Barbara
2015-06-30
The Open Science (OS) movement has been seen as an important facilitator for public participation in science. This has been underpinned by the assumption that widespread and free access to research outputs leads to (i) better and more efficient science, (ii) economic growth, in particular for small and medium-sized enterprises wishing to capitalise on research findings and (iii) increased transparency of knowledge production and its outcomes. The latter in particular could function as a catalyst for public participation and engagement. Whether OS is likely to help realise these benefits, however, will depend on the emergence of systemic incentives for scientists to utilise OS in a meaningful manner. While some areas, such as the environmental sciences, have a long tradition of an open ethos, citizen inclusion and global collaboration, such activities need to be more systematically supported and promoted by funders and learned societies in order to improve scientific research and public participation.
Open Access to Data - Central Role for Geoinformatics
NASA Astrophysics Data System (ADS)
Snyder, W. S.
2006-12-01
Open access to scientific information has become a contentious issue. In the United States there are calls to make all published literature available for free within 6 months of publication, the notion being that this will promote better science and policy decisions based on science. Here, I argue that this is the incorrect approach to the issue of open access to scientific information. A fundamental problem raised by the call for open access to government-supported research results is the viability of our not-for-profit professional scientific societies. These societies provide the base level framework for the exchange of scientific ideas, and hence the very core of how we do science and how scientific knowledge is advanced. Why should a scientist subscribe to a journal if they can read the article for free in six months? A large portion of a society's operational costs come from these subscriptions and the sale of specialized books, all of which contain the results of federally-funded research. Without revenue from journal subscriptions and book sales, not only will these publications disappear, but many societies may as well. Without a broad venue in which to publish and interact, our science suffers - many subdisciplines, those that don't "sell well," may fade or even die. Very popular publications such as "Nature", "Science", "Tectonics", and "Geology" will continue to thrive, but what about the more specialized journals such as "Journal of Paleontology"? They are costly to publish yet fill a very critical niche for our science. Many will still pay for reading the Nature/Science/Tectonics/Geology article, but where do we publish the mainstream science paper? We have to guard against becoming a "Hollywood Science" - where only the glitzy gets published because those are the articles that sell. We must have peer-reviewed, independent publications and viable professional societies, or our science will severely suffer. We can better approach the need for open access to scientific information by concentrating on the data versus the written word. The written word can and should be copyrighted. This protects the viability of the journals of many societies - large and small - and therefore the viability of the societies and the science they support. Furthermore, the written word is largely interpretation - some of which flows directly from the funded research, but much of which reflects the accumulated knowledge of the scientists; knowledge that has been derived from sources that cannot be tracked. Interpretations and conclusions differ among scientists - that is what drives the progress of science, and that is the written word. The notion that all journal articles must be free to all effectively says that ideas and interpretations cannot be protected by copyright. It should also be noted that all articles are available in libraries as they are published, so it is hard to argue that they are not "freely available." What hinders science and public policy decision making is the lack of complete access to all relevant data and metadata. To require that all relevant data and metadata be publicly available six months after publication is a viable solution. The government dollars clearly pay for the data, but the source of support for scientific interpretation and discussion is impossible to determine. Once the data are public, they are free for all to reinterpret and to use as the basis for additional studies that reflect the unsolved issues of the previous study.
Many of the government agencies that supply research funds already have data policies in place. What may be required is the funding base to allow them to implement these policies in consultation with the academic community that they serve.
The Experiment Factory: Standardizing Behavioral Experiments.
Sochat, Vanessa V; Eisenberg, Ian W; Enkavi, A Zeynep; Li, Jamie; Bissett, Patrick G; Poldrack, Russell A
2016-01-01
The administration of behavioral and experimental paradigms for psychology research is hindered by lack of a coordinated effort to develop and deploy standardized paradigms. While several frameworks (Mason and Suri, 2011; McDonnell et al., 2012; de Leeuw, 2015; Lange et al., 2015) have provided infrastructure and methods for individual research groups to develop paradigms, missing is a coordinated effort to develop paradigms linked with a system to easily deploy them. This disorganization leads to redundancy in development, divergent implementations of conceptually identical tasks, disorganized and error-prone code lacking documentation, and difficulty in replication. The ongoing reproducibility crisis in psychology and neuroscience research (Baker, 2015; Open Science Collaboration, 2015) highlights the urgency of this challenge: reproducible research in behavioral psychology is conditional on deployment of equivalent experiments. A large, accessible repository of experiments for researchers to develop collaboratively is most efficiently accomplished through an open source framework. Here we present the Experiment Factory, an open source framework for the development and deployment of web-based experiments. The modular infrastructure includes experiments, virtual machines for local or cloud deployment, and an application to drive these components and provide developers with functions and tools for further extension. We release this infrastructure with a deployment (http://www.expfactory.org) that researchers are currently using to run a set of over 80 standardized web-based experiments on Amazon Mechanical Turk. By providing open source tools for both deployment and development, this novel infrastructure holds promise to bring reproducibility to the administration of experiments, and accelerate scientific progress by providing a shared community resource of psychological paradigms.
The Experiment Factory: Standardizing Behavioral Experiments
Sochat, Vanessa V.; Eisenberg, Ian W.; Enkavi, A. Zeynep; Li, Jamie; Bissett, Patrick G.; Poldrack, Russell A.
2016-01-01
The administration of behavioral and experimental paradigms for psychology research is hindered by lack of a coordinated effort to develop and deploy standardized paradigms. While several frameworks (Mason and Suri, 2011; McDonnell et al., 2012; de Leeuw, 2015; Lange et al., 2015) have provided infrastructure and methods for individual research groups to develop paradigms, missing is a coordinated effort to develop paradigms linked with a system to easily deploy them. This disorganization leads to redundancy in development, divergent implementations of conceptually identical tasks, disorganized and error-prone code lacking documentation, and difficulty in replication. The ongoing reproducibility crisis in psychology and neuroscience research (Baker, 2015; Open Science Collaboration, 2015) highlights the urgency of this challenge: reproducible research in behavioral psychology is conditional on deployment of equivalent experiments. A large, accessible repository of experiments for researchers to develop collaboratively is most efficiently accomplished through an open source framework. Here we present the Experiment Factory, an open source framework for the development and deployment of web-based experiments. The modular infrastructure includes experiments, virtual machines for local or cloud deployment, and an application to drive these components and provide developers with functions and tools for further extension. We release this infrastructure with a deployment (http://www.expfactory.org) that researchers are currently using to run a set of over 80 standardized web-based experiments on Amazon Mechanical Turk. By providing open source tools for both deployment and development, this novel infrastructure holds promise to bring reproducibility to the administration of experiments, and accelerate scientific progress by providing a shared community resource of psychological paradigms. PMID:27199843
PharmTeX: a LaTeX-Based Open-Source Platform for Automated Reporting Workflow.
Rasmussen, Christian Hove; Smith, Mike K; Ito, Kaori; Sundararajan, Vijayakumar; Magnusson, Mats O; Niclas Jonsson, E; Fostvedt, Luke; Burger, Paula; McFadyen, Lynn; Tensfeldt, Thomas G; Nicholas, Timothy
2018-03-16
Every year, the pharmaceutical industry generates a large number of scientific reports related to drug research, development, and regulatory submissions. Many of these reports are created using text processing tools such as Microsoft Word. Given the large number of figures, tables, references, and other elements, this is often a tedious task involving hours of copying and pasting and substantial efforts in quality control (QC). In the present article, we present the LaTeX-based open-source reporting platform, PharmTeX, a community-based effort to make reporting simple, reproducible, and user-friendly. The PharmTeX creators put a substantial effort into simplifying the sometimes complex elements of LaTeX into user-friendly functions that rely on advanced LaTeX and Perl code running in the background. Using this setup makes LaTeX much more accessible for users with no prior LaTeX experience. A software collection was compiled for users not wanting to manually install the required software components. The PharmTeX templates allow for the inclusion of tables directly from mathematical software output, as well as figures in several formats. Code listings can be included directly from source. No previous experience and only a few hours of training are required to start writing reports using PharmTeX. PharmTeX significantly reduces the time required for creating a scientific report fully compliant with regulatory and industry expectations. QC is made much simpler, since there is a direct link between analysis output and report input. PharmTeX makes available to report authors the strengths of LaTeX document processing without the need for extensive training.
Web catalog of oceanographic data using GeoNetwork
NASA Astrophysics Data System (ADS)
Marinova, Veselka; Stefanov, Asen
2017-04-01
Most of the data collected, analyzed and used by the Bulgarian Oceanographic Data Center (BgODC) from scientific cruises, Argo floats, ferry boxes and real-time operating systems are spatially oriented and need to be displayed on a map. The challenge is to make spatial information more accessible to users, decision makers and scientists. In order to meet this challenge, BgODC concentrates its efforts on improving dynamic and standardized access to its geospatial data as well as that of various related organizations and institutions. BgODC is currently implementing a project to create a geospatial portal for distributing metadata and for searching, exchanging and harvesting spatial data. There are many open source software solutions able to create such a spatial data infrastructure (SDI). In the end, GeoNetwork Open Source was chosen, as it is already widespread. This software is a free, effective and "cheap" solution for implementing an SDI at the organization level. It is platform independent and runs under many operating systems. Filling the catalog goes through these practical steps: • managing and storing data reliably within an MS SQL spatial database; • registering maps and data of various formats and sources in GeoServer (the most popular open-source geospatial server, bundled with GeoNetwork); • adding metadata and publishing geospatial data from the GeoNetwork desktop. GeoServer and GeoNetwork are based on Java, so they require a servlet engine such as Tomcat. The experience gained from the use of GeoNetwork Open Source confirms that the catalog meets the requirements for data management and is flexible enough to customize. Building the catalog facilitates sustainable data exchange between end users. The catalog is a big step towards implementation of the INSPIRE directive due to the availability of many features necessary for producing "INSPIRE compliant" metadata records. The catalog now contains all available GIS data provided by BgODC for Internet access. Searching data within the catalog is based upon geographic extent, theme type and free-text search.
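A GeoNetwork catalog of this kind is typically searchable through the OGC Catalogue Service for the Web (CSW). As a hedged sketch of such a query (the endpoint URL and search term below are illustrative assumptions, not details from the abstract), the Python OWSLib client could be used as follows:

    from owslib.csw import CatalogueServiceWeb
    from owslib.fes import PropertyIsLike

    # Hypothetical CSW endpoint of a GeoNetwork instance.
    csw = CatalogueServiceWeb("http://example.org/geonetwork/srv/eng/csw")

    # Full-text search across all metadata fields.
    query = PropertyIsLike("csw:AnyText", "%temperature%")
    csw.getrecords2(constraints=[query], maxrecords=10)

    for identifier, record in csw.records.items():
        print(identifier, "-", record.title)

Because the catalog speaks a standard protocol, the same client code works against any CSW-compliant server, which is precisely what makes such portals interoperable.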
Analyzing microtomography data with Python and the scikit-image library.
Gouillart, Emmanuelle; Nunez-Iglesias, Juan; van der Walt, Stéfan
2017-01-01
The exploration and processing of images is a vital aspect of the scientific workflows of many X-ray imaging modalities. Users require tools that combine interactivity, versatility, and performance. scikit-image is an open-source image processing toolkit for the Python language that supports a large variety of file formats and is compatible with 2D and 3D images. The toolkit exposes a simple programming interface, with thematic modules grouping functions according to their purpose, such as image restoration, segmentation, and measurements. scikit-image users benefit from a rich scientific Python ecosystem that contains many powerful libraries for tasks such as visualization or machine learning. scikit-image combines a gentle learning curve, versatile image processing capabilities, and the scalable performance required for the high-throughput analysis of X-ray imaging data.
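As a minimal sketch of the kind of workflow the toolkit supports (the synthetic volume below stands in for a real reconstructed scan), a 3-D image can be denoised, segmented, and measured in a few lines:

    import numpy as np
    from skimage import filters, measure, restoration

    # Synthetic 3-D volume standing in for reconstructed tomography data.
    rng = np.random.default_rng(0)
    volume = rng.random((64, 64, 64))

    # Restoration: total-variation denoising works on n-dimensional images.
    denoised = restoration.denoise_tv_chambolle(volume, weight=0.1)

    # Segmentation: global Otsu threshold, then label connected components.
    labels = measure.label(denoised > filters.threshold_otsu(denoised))

    # Measurement: per-region properties such as size in voxels.
    regions = measure.regionprops(labels)
    print(labels.max(), "regions; largest:", max(r.area for r in regions), "voxels")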
40 CFR 725.160 - Submission of health and environmental effects data.
Code of Federal Regulations, 2011 CFR
2011-07-01
... environmental release of the new microorganism. (3)(i) If the data do not appear in the open scientific... laboratory that developed the data. (ii) If the data appear in the open scientific literature, the submitter... of data reported under paragraph (b) of this section must include: (i) If the data appear in the open...
40 CFR 725.160 - Submission of health and environmental effects data.
Code of Federal Regulations, 2010 CFR
2010-07-01
... environmental release of the new microorganism. (3)(i) If the data do not appear in the open scientific... laboratory that developed the data. (ii) If the data appear in the open scientific literature, the submitter... of data reported under paragraph (b) of this section must include: (i) If the data appear in the open...
Visualization and Interaction in Research, Teaching, and Scientific Communication
NASA Astrophysics Data System (ADS)
Ammon, C. J.
2017-12-01
Modern computing provides many tools for exploring observations, numerical calculations, and theoretical relationships. The number of options is, in fact, almost overwhelming. But the choices provide those with modest programming skills opportunities to create unique views of scientific information and to develop deeper insights into their data, their computations, and the underlying theoretical data-model relationships. I present simple examples of using animation and human-computer interaction to explore scientific data and scientific-analysis approaches. I illustrate how even a little programming ability can free scientists from the constraints of existing tools and can facilitate the development of a deeper appreciation of data and models. I present examples from a suite of programming languages ranging from C to JavaScript, including the Wolfram Language. JavaScript is valuable for sharing tools and insight (hopefully) with others because it is integrated into one of the most powerful communication tools in human history, the web browser. Although too much of that power is often spent on distracting advertisements, the underlying computation and graphics engines are efficient, flexible, and almost universally available in desktop and mobile computing platforms. Many are working to fulfill the browser's potential to become the most effective tool for interactive study. Open-source frameworks for visualizing everything from algorithms to data are available, but they advance rapidly. One strategy for dealing with swiftly changing tools is to adopt common, open data formats that are easily adapted (often by framework or tool developers). I illustrate the use of animation and interaction in research and teaching with examples from earthquake seismology.
[Open access: an opportunity for biomedical research].
Duchange, Nathalie; Autard, Delphine; Pinhas, Nicole
2008-01-01
Open access within the scientific community depends on the scientific context and the practices of the field. In the biomedical domain, the communication of research results is characterised by the importance of the peer reviewing process, the existence of a hierarchy among journals and the transfer of copyright to the editor. Biomedical publishing has become a lucrative market and the growth of electronic journals has not helped lower the costs. Indeed, it is difficult for today's public institutions to gain access to all the scientific literature. Open access is thus imperative, as demonstrated through the positions taken by a growing number of research funding bodies, the development of open access journals and efforts made in promoting open archives. This article describes the setting up of an Inserm portal for publication in the context of the French national protocol for open-access self-archiving and in an international context.
Zdravkovski, Zoran
2014-01-01
The development and availability of personal computers and software, as well as printing techniques, in the last twenty years have made a profound change in the publication of scientific journals. Additionally, the Internet in the last decade has revolutionized the publication process to the point of changing the basic paradigm of printed journals. The Macedonian Journal of Chemistry and Chemical Engineering in its 40-year history has adopted and adapted to all these transformations. In order to keep up with the inevitable changes, as editor-in-chief I felt my responsibility was to introduce electronic editorial management of the journal. The choice was between commercial and open source platforms, and because of the limited funding of the journal we chose the latter. We decided on Open Journal Systems, which provided online submission and management of all content, had flexible configuration - requirements, sections, review process, etc. - had options for comprehensive indexing, offered various reading tools, had email notification and commenting ability for readers, had an option for thesis abstracts, and was installed locally. However, since there is limited support, it requires moderate computer knowledge/skills and some effort to set up. Overall, it is an excellent editorial platform and a convenient solution for journals with a low budget or journals that do not want to spend their resources on commercial platforms or simply support the idea of open source software.
Swan: A tool for porting CUDA programs to OpenCL
NASA Astrophysics Data System (ADS)
Harvey, M. J.; De Fabritiis, G.
2011-04-01
The use of modern, high-performance graphical processing units (GPUs) for acceleration of scientific computation has been widely reported. The majority of this work has used the CUDA programming model supported exclusively by GPUs manufactured by NVIDIA. An industry standardisation effort has recently produced the OpenCL specification for GPU programming. This offers the benefits of hardware-independence and reduced dependence on proprietary tool-chains. Here we describe a source-to-source translation tool, "Swan", for facilitating the conversion of an existing CUDA code to use the OpenCL model, as a means to aid programmers experienced with CUDA in evaluating OpenCL and alternative hardware. While the performance of equivalent OpenCL and CUDA code on fixed hardware should be comparable, we find that a real-world CUDA application ported to OpenCL exhibits an overall 50% increase in runtime, a reduction in performance attributable to the immaturity of contemporary compilers. The ported application is shown to have platform independence, running on both NVIDIA and AMD GPUs without modification. We conclude that OpenCL is a viable platform for developing portable GPU applications but that the more mature CUDA tools continue to provide best performance. Program summary. Program title: Swan; Catalogue identifier: AEIH_v1_0; Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIH_v1_0.html; Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland; Licensing provisions: GNU Public License version 2; No. of lines in distributed program, including test data, etc.: 17 736; No. of bytes in distributed program, including test data, etc.: 131 177; Distribution format: tar.gz; Programming language: C; Computer: PC; Operating system: Linux; RAM: 256 Mbytes; Classification: 6.5; External routines: NVIDIA CUDA, OpenCL. Nature of problem: Graphical Processing Units (GPUs) from NVIDIA are preferentially programmed with the proprietary CUDA programming toolkit. An alternative programming model promoted as an industry standard, OpenCL, provides similar capabilities to CUDA and is also supported on non-NVIDIA hardware (including multicore x86 CPUs, AMD GPUs and IBM Cell processors). The adaptation of a program from CUDA to OpenCL is relatively straightforward but laborious. The Swan tool facilitates this conversion. Solution method: Swan performs a translation of CUDA kernel source code into an OpenCL equivalent. It also generates the C source code for entry point functions, simplifying kernel invocation from the host program. A concise host-side API abstracts the CUDA and OpenCL APIs. A program adapted to use Swan has no dependency on the CUDA compiler for the host-side program. The converted program may be built for either CUDA or OpenCL, with the selection made at compile time. Restrictions: No support for CUDA C++ features. Running time: Nominal.
NASA Astrophysics Data System (ADS)
Carminati, Federico; Perret-Gallix, Denis; Riemann, Tord
2014-06-01
Round table discussions are in the tradition of ACAT. This year's plenary round table discussion was devoted to questions related to the use of scientific software in High Energy Physics and beyond. The 90 minutes of discussion were lively, and quite a lot of diverse opinions were spelled out. Although the discussion was, in part, controversial, the participants agreed unanimously on several basic issues in software sharing: • The importance of having various licensing models in academic research; • The basic value of proper recognition and attribution of intellectual property, including scientific software; • Users' respect for the conditions of use, including licence statements, as formulated by the author. The need for a similar discussion on the issues of data sharing was emphasized, and it was recommended that this subject be covered at the round table discussion of the next ACAT. In this contribution, we summarise selected topics that were covered in the introductory talks and in the following discussion.
Inspiring Collaboration: The Legacy of Theo Colborn's Transdisciplinary Research on Fracking.
Wylie, Sara; Schultz, Kim; Thomas, Deborah; Kassotis, Chris; Nagel, Susan
2016-09-13
This article describes Dr Theo Colborn's legacy of inspiring complementary and synergistic environmental health research and advocacy. Colborn, a founder of endocrine disruption research, also stimulated study of hydraulic fracturing (fracking). In 2014, the United States led the world in oil and gas production, with fifteen million Americans living within one mile of an oil or gas well. Colborn pioneered efforts to understand and control the impacts of this sea change in energy production. In 2005, her research organization The Endocrine Disruption Exchange (TEDX) developed a database of chemicals used in natural gas extraction and their health effects. This database stimulated novel scientific and social scientific research and informed advocacy by (1) connecting communities' diverse health impacts to chemicals used in natural gas development, (2) inspiring social science research on open-source software and hardware for citizen science, and (3) posing new scientific questions about the endocrine-disrupting properties of fracking chemicals. © The Author(s) 2016.
Executable research compendia in geoscience research infrastructures
NASA Astrophysics Data System (ADS)
Nüst, Daniel
2017-04-01
From generation through analysis and collaboration to communication, scientific research requires the right tools. Scientists create their own software using third-party libraries and platforms. Cloud computing, Open Science, public data infrastructures, and Open Source offer scientists unprecedented opportunities, nowadays often in a field "Computational X" (e.g. computational seismology) or X-informatics (e.g. geoinformatics) [0]. This increases complexity and generates more innovation, e.g. Environmental Research Infrastructures (environmental RIs [1]). Researchers in Computational X write their software relying on both source code (e.g. from https://github.com) and binary libraries (e.g. from package managers such as APT, https://wiki.debian.org/Apt, or CRAN, https://cran.r-project.org/). They download data from domain-specific (cf. https://re3data.org) or generic (e.g. https://zenodo.org) data repositories, and deploy computations remotely (e.g. European Open Science Cloud). The results themselves are archived, given persistent identifiers, connected to other works (e.g. using https://orcid.org/), and listed in metadata catalogues. A single researcher, intentionally or not, interacts with all sub-systems of RIs: data acquisition, data access, data processing, data curation, and community support [2]. To preserve computational research, [3] proposes the Executable Research Compendium (ERC), a container format that closes the gap of dependency preservation by encapsulating the runtime environment. ERCs and RIs can be integrated for different uses: (i) Coherence: ERC services validate completeness, integrity and results (ii) Metadata: ERCs connect the different parts of a piece of research and facilitate discovery (iii) Exchange and Preservation: ERCs as usable building blocks are the shared and archived entities (iv) Self-consistency: ERCs remove dependence on ephemeral sources (v) Execution: ERC services create and execute a packaged analysis but integrate with existing platforms for display and control. These integrations are vital for capturing workflows in RIs and connect key stakeholders (scientists, publishers, librarians). They are demonstrated using developments by the DFG-funded project Opening Reproducible Research (http://o2r.info). Semi-automatic creation of ERCs based on research workflows is a core goal of the project. References [0] Tony Hey, Stewart Tansley, Kristin Tolle (eds), 2009. The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Research. [1] P. Martin et al., Open Information Linking for Environmental Research Infrastructures, 2015 IEEE 11th International Conference on e-Science, Munich, 2015, pp. 513-520. doi: 10.1109/eScience.2015.66 [2] Y. Chen et al., Analysis of Common Requirements for Environmental Science Research Infrastructures, The International Symposium on Grids and Clouds (ISGC) 2013, Taipei, 2013, http://pos.sissa.it/archive/conferences/179/032/ISGC [3] Opening Reproducible Research, Geophysical Research Abstracts Vol. 18, EGU2016-7396, 2016, http://meetingorganizer.copernicus.org/EGU2016/EGU2016-7396.pdf
NASA Astrophysics Data System (ADS)
Lucido, J. M.
2013-12-01
Scientists in the fields of hydrology, geophysics, and climatology are increasingly using the vast quantity of publicly-available data to address broadly-scoped scientific questions. For example, researchers studying contamination of nearshore waters could use a combination of radar-indicated precipitation, modeled water currents, and various sources of in-situ monitoring data to predict water quality near a beach. In discovering, gathering, visualizing and analyzing potentially useful data sets, data portals have become invaluable tools. The most effective data portals often aggregate distributed data sets seamlessly and allow multiple avenues for accessing the underlying data, facilitated by the use of open standards. Additionally, adequate metadata are necessary for attribution, documentation of provenance and relating data sets to one another. Metadata also enable thematic, geospatial and temporal indexing of data sets and entities. Furthermore, effective portals make use of common vocabularies for scientific methods, units of measure, geologic features, and chemical and biological constituents, as these allow investigators to correctly interpret and utilize data from external sources. One application that employs these principles is the National Ground Water Monitoring Network (NGWMN) Data Portal (http://cida.usgs.gov/ngwmn), which makes groundwater data from distributed data providers available through a single, publicly accessible web application by mediating and aggregating native data exposed via web services on-the-fly into Open Geospatial Consortium (OGC) compliant service output. That output may be accessed either through the map-based user interface or through the aforementioned OGC web services. Furthermore, the Geo Data Portal (http://cida.usgs.gov/climate/gdp/), a system that provides users with data access, subsetting and geospatial processing of large and complex climate and land-use data, exemplifies the application of International Organization for Standardization (ISO) metadata records to enhance data discovery for both human and machine interpretation. Lastly, the Water Quality Portal (http://www.waterqualitydata.us/) achieves interoperable dissemination of water quality data by referencing a vocabulary service for mapping constituents and methods between the USGS and USEPA. The NGWMN Data Portal, Geo Data Portal and Water Quality Portal are three examples of best practices for implementing data portals that provide distributed scientific data in an integrated, standards-based approach.
Huh, Sun
2013-01-01
ScienceCentral, a free or open access, full-text archive of scientific journal literature at the Korean Federation of Science and Technology Societies, was under test in September 2013. Since it is a Journal Article Tag Suite-based full-text database, Extensible Markup Language files in all languages can be presented, using Unicode Transformation Format 8-bit encoding. It is comparable to PubMed Central; however, there are two distinct differences. First, its scope comprises all science fields; second, it accepts journals in all languages. Launching ScienceCentral is the first step in enabling free-access or open-access academic scientific journals of all languages, including scientific journals from Croatia, to leap to the world.
Spheres of Knowledge that Require Open-mindedness and Open Data
NASA Astrophysics Data System (ADS)
Branch, B. D.
2009-05-01
'The progress of social knowledge is impeded if not paralyzed at present by two fundamental factors, one impinging from knowledge without, and the other operating within the world of science itself' (Mannheim, Wirth, and Shils, p. xi). Hence, a Sphere of Knowledge (SK), defined here as a pseudo-ontology, may require a societal open-mindedness as defined by Dewey (1912). With professional open-mindedness and open data use, such social constructs may bridge and build relations towards efficient and effective societal problem solving. Open data use is defined here as the gathering of information from many sources to provide input for later decision-making in the public interest. Here, spatial thinking may be at the heart of the data collection, analysis, and reporting that sustains an informatics experience among all parts of society. At risk may be human survival and sustainability if policy and politics have hindered scientific or evidence-based knowledge transfer. Executive Order 12906, Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure, issued by the federal government in 1994, may be an example of an informatics gap of knowledge, where a federal mandate is not being connected to the geosciences tools and community leaders that could benefit an open society. Critical in an SK is how an open society makes effective decisions when the issues it faces are new, with unpredictable outcomes. Policy and politics should not impede scientific or evidence-based knowledge transfer but should be a root of democratic tools. 'Policy development and implementation should reflect such complexities' (Gardner et al., 2003, p. 2). Conceptually, an SK may be too broad for any one discipline to address effectively as a next-generation concern. 'The demand for deep integration of scientific data within and between disciplines is also growing, as larger and broader science questions are becoming more common. Concurrent with the growing demand for next generation information technology for science is a growth in semantic technologies' (McGuinness, Fox, and Brodaric, 2008, p. 1). Thus, if human survivability and sustainability exist in this manner as a societal issue, then effective interdisciplinary collaboration among the social and hard sciences, in which each values the other, is needed to advance evidence-based and science-based habits in the citizenry. The effective decision making of society may depend on the skills of science, its data sharing, and the collaboration of multiple disciplines to reach feasible solutions in the public interest.
A service-oriented distributed semantic mediator: integrating multiscale biomedical information.
Mora, Oscar; Engelbrecht, Gerhard; Bisbal, Jesus
2012-11-01
Biomedical research continuously generates large amounts of heterogeneous and multimodal data spread over multiple data sources. These data, if appropriately shared and exploited, could dramatically improve the research practice itself, and ultimately the quality of health care delivered. This paper presents DISMED (DIstributed Semantic MEDiator), an open source semantic mediator that provides a unified view of a federated environment of multiscale biomedical data sources. DISMED is a Web-based software application to query and retrieve information distributed over a set of registered data sources, using semantic technologies. It also offers a user-friendly interface specifically designed to simplify the usage of these technologies by non-expert users. Although the architecture of the software mediator is generic and domain independent, in the context of this paper, DISMED has been evaluated for managing biomedical environments and facilitating research with respect to the handling of scientific data distributed in multiple heterogeneous data sources. As part of this contribution, a quantitative evaluation framework has been developed. It consists of a benchmarking scenario and the definition of five realistic use-cases. This framework, created entirely with public datasets, has been used to compare the performance of DISMED against other available mediators. It is also available to the scientific community in order to evaluate progress in the domain of semantic mediation, in a systematic and comparable manner. The results show an average improvement in execution time by DISMED of 55% compared to the second best alternative in four out of the five use-cases of the experimental evaluation.
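DISMED's own query interface is not specified in the abstract, but the general pattern of querying a registered data source with semantic technologies can be sketched with a standard SPARQL client (the endpoint URL, predicate, and query below are illustrative assumptions, not DISMED's actual API):

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Hypothetical SPARQL endpoint of one registered biomedical data source.
    endpoint = SPARQLWrapper("http://example.org/sparql")
    endpoint.setQuery("""
        SELECT ?patient ?measurement WHERE {
            ?patient <http://example.org/vocab#hasMeasurement> ?measurement .
        } LIMIT 10
    """)
    endpoint.setReturnFormat(JSON)

    results = endpoint.query().convert()
    for row in results["results"]["bindings"]:
        print(row["patient"]["value"], row["measurement"]["value"])

A mediator such as DISMED adds value on top of this pattern by dispatching one query across many such sources and unifying the results.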
Dynamic Extension of a Virtualized Cluster by using Cloud Resources
NASA Astrophysics Data System (ADS)
Oberst, Oliver; Hauth, Thomas; Kernert, David; Riedel, Stephan; Quast, Günter
2012-12-01
The specific requirements concerning the software environment within the HEP community constrain the choice of resource providers for the outsourcing of computing infrastructure. The use of virtualization in HPC clusters and in the context of cloud resources is therefore a subject of recent developments in scientific computing. The dynamic virtualization of worker nodes in common batch systems provided by ViBatch serves each user with a dynamically virtualized subset of worker nodes on a local cluster. Now it can be transparently extended by the use of common open source cloud interfaces like OpenNebula or Eucalyptus, launching a subset of the virtual worker nodes within the cloud. This paper demonstrates how a dynamically virtualized computing cluster is combined with cloud resources by attaching remotely started virtual worker nodes to the local batch system.
Enabling a new Paradigm to Address Big Data and Open Science Challenges
NASA Astrophysics Data System (ADS)
Ramamurthy, Mohan; Fisher, Ward
2017-04-01
Data are not only the lifeblood of the geosciences but they have become the currency of the modern world in science and society. Rapid advances in computing, communications, and observational technologies - along with concomitant advances in high-resolution modeling, ensemble and coupled-systems predictions of the Earth system - are revolutionizing nearly every aspect of our field. Modern data volumes from high-resolution ensemble prediction/projection/simulation systems and next-generation remote-sensing systems like hyper-spectral satellite sensors and phased-array radars are staggering. For example, CMIP efforts alone will generate many petabytes of climate projection data for use in assessments of climate change. And NOAA's National Climatic Data Center projects that it will archive over 350 petabytes by 2030. For researchers and educators, this deluge and the increasing complexity of data bring challenges along with the opportunities for discovery and scientific breakthroughs. The potential for big data to transform the geosciences is enormous, but realizing the next frontier depends on effectively managing, analyzing, and exploiting these heterogeneous data sources, extracting knowledge and useful information from them in ways that were previously impossible, to enable discoveries and gain new insights. At the same time, there is a growing focus on the area of "Reproducibility or Replicability in Science" that has implications for Open Science. The advent of cloud computing has opened new avenues not only for addressing big data and Open Science challenges but also for accelerating scientific discoveries. However, successfully leveraging the enormous potential of cloud technologies will require data providers and the scientific communities to develop new paradigms to enable next-generation workflows and transform the conduct of science. Making data readily available is a necessary but not a sufficient condition. Data providers also need to give scientists an ecosystem that includes data, tools, workflows and other services needed to perform analytics, integration, interpretation, and synthesis - all in the same environment or platform. Instead of moving data to processing systems near users, as is the tradition, the cloud permits one to bring processing, computing, analysis and visualization to the data - so-called data-proximate workbench capabilities, also known as server-side processing. In this talk, I will present the ongoing work at Unidata to facilitate a new paradigm for doing science by offering a suite of tools, resources, and platforms that leverage cloud technologies to address both big data and Open Science/reproducibility challenges. That work includes the development and deployment of new protocols for data access and server-side operations, Docker container images of key applications, JupyterHub Python notebook tools, and cloud-based analysis and visualization capability via the CloudIDV tool to enable reproducible workflows and effective use of the accessed data.
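One concrete form of such data-proximate access is a remote-data client that requests only the needed subset from a server. As a hedged sketch (the catalog URL below is a placeholder, and Siphon is a Unidata Python library that the abstract itself does not name):

    from siphon.catalog import TDSCatalog

    # Hypothetical THREDDS Data Server catalog; only metadata is fetched here.
    catalog = TDSCatalog("http://example.org/thredds/catalog/catalog.xml")
    dataset = list(catalog.datasets.values())[0]

    # Remote access returns a netCDF4-like handle; data cross the network
    # only when a variable subset is actually read.
    remote = dataset.remote_access()
    print(list(remote.variables))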
Four simple recommendations to encourage best practices in research software
Jiménez, Rafael C.; Kuzak, Mateusz; Alhamdoosh, Monther; Barker, Michelle; Batut, Bérénice; Borg, Mikael; Capella-Gutierrez, Salvador; Chue Hong, Neil; Cook, Martin; Corpas, Manuel; Flannery, Madison; Garcia, Leyla; Gelpí, Josep Ll.; Gladman, Simon; Goble, Carole; González Ferreiro, Montserrat; Gonzalez-Beltran, Alejandra; Griffin, Philippa C.; Grüning, Björn; Hagberg, Jonas; Holub, Petr; Hooft, Rob; Ison, Jon; Katz, Daniel S.; Leskošek, Brane; López Gómez, Federico; Oliveira, Luis J.; Mellor, David; Mosbergen, Rowland; Mulder, Nicola; Perez-Riverol, Yasset; Pergl, Robert; Pichler, Horst; Pope, Bernard; Sanz, Ferran; Schneider, Maria V.; Stodden, Victoria; Suchecki, Radosław; Svobodová Vařeková, Radka; Talvik, Harry-Anton; Todorov, Ilian; Treloar, Andrew; Tyagi, Sonika; van Gompel, Maarten; Vaughan, Daniel; Via, Allegra; Wang, Xiaochuan; Watson-Haigh, Nathan S.; Crouch, Steve
2017-01-01
Scientific research relies on computer software, yet software is not always developed following practices that ensure its quality and sustainability. This manuscript does not aim to propose new software development best practices, but rather to provide simple recommendations that encourage the adoption of existing best practices. Software development best practices promote better quality software, and better quality software improves the reproducibility and reusability of research. These recommendations are designed around Open Source values, and provide practical suggestions that contribute to making research software and its source code more discoverable, reusable and transparent. This manuscript is aimed at developers, but also at organisations, projects, journals and funders that can increase the quality and sustainability of research software by encouraging the adoption of these recommendations. PMID:28751965
Four simple recommendations to encourage best practices in research software.
Jiménez, Rafael C; Kuzak, Mateusz; Alhamdoosh, Monther; Barker, Michelle; Batut, Bérénice; Borg, Mikael; Capella-Gutierrez, Salvador; Chue Hong, Neil; Cook, Martin; Corpas, Manuel; Flannery, Madison; Garcia, Leyla; Gelpí, Josep Ll; Gladman, Simon; Goble, Carole; González Ferreiro, Montserrat; Gonzalez-Beltran, Alejandra; Griffin, Philippa C; Grüning, Björn; Hagberg, Jonas; Holub, Petr; Hooft, Rob; Ison, Jon; Katz, Daniel S; Leskošek, Brane; López Gómez, Federico; Oliveira, Luis J; Mellor, David; Mosbergen, Rowland; Mulder, Nicola; Perez-Riverol, Yasset; Pergl, Robert; Pichler, Horst; Pope, Bernard; Sanz, Ferran; Schneider, Maria V; Stodden, Victoria; Suchecki, Radosław; Svobodová Vařeková, Radka; Talvik, Harry-Anton; Todorov, Ilian; Treloar, Andrew; Tyagi, Sonika; van Gompel, Maarten; Vaughan, Daniel; Via, Allegra; Wang, Xiaochuan; Watson-Haigh, Nathan S; Crouch, Steve
2017-01-01
Scientific research relies on computer software, yet software is not always developed following practices that ensure its quality and sustainability. This manuscript does not aim to propose new software development best practices, but rather to provide simple recommendations that encourage the adoption of existing best practices. Software development best practices promote better quality software, and better quality software improves the reproducibility and reusability of research. These recommendations are designed around Open Source values, and provide practical suggestions that contribute to making research software and its source code more discoverable, reusable and transparent. This manuscript is aimed at developers, but also at organisations, projects, journals and funders that can increase the quality and sustainability of research software by encouraging the adoption of these recommendations.
GillesPy: A Python Package for Stochastic Model Building and Simulation.
Abel, John H; Drawert, Brian; Hellander, Andreas; Petzold, Linda R
2016-09-01
GillesPy is an open-source Python package for model construction and simulation of stochastic biochemical systems. GillesPy consists of a Python framework for model building and an interface to the StochKit2 suite of efficient simulation algorithms based on the Gillespie stochastic simulation algorithms (SSA). To enable intuitive model construction and seamless integration into the scientific Python stack, we present an easy to understand, action-oriented programming interface. Here, we describe the components of this package and provide a detailed example relevant to the computational biology community.
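As a hedged sketch of the action-oriented interface just described (assuming the Model, Species, Parameter, and Reaction classes shown in the GillesPy documentation; exact signatures may differ between releases), a simple dimerization model might be built and simulated like this:

    import gillespy

    class Dimerization(gillespy.Model):
        def __init__(self):
            gillespy.Model.__init__(self, name="Dimerization")
            # One rate parameter and two chemical species.
            k_c = gillespy.Parameter(name="k_c", expression=0.005)
            self.add_parameter(k_c)
            monomer = gillespy.Species(name="monomer", initial_value=30)
            dimer = gillespy.Species(name="dimer", initial_value=0)
            self.add_species([monomer, dimer])
            # 2 monomer -> 1 dimer, mass-action kinetics.
            self.add_reaction(gillespy.Reaction(
                name="dimerization",
                reactants={monomer: 2}, products={dimer: 1}, rate=k_c))

    # Run five SSA trajectories through the StochKit2 backend.
    trajectories = Dimerization().run(number_of_trajectories=5)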
GillesPy: A Python Package for Stochastic Model Building and Simulation
Abel, John H.; Drawert, Brian; Hellander, Andreas; Petzold, Linda R.
2017-01-01
GillesPy is an open-source Python package for model construction and simulation of stochastic biochemical systems. GillesPy consists of a Python framework for model building and an interface to the StochKit2 suite of efficient simulation algorithms based on the Gillespie stochastic simulation algorithms (SSA). To enable intuitive model construction and seamless integration into the scientific Python stack, we present an easy to understand, action-oriented programming interface. Here, we describe the components of this package and provide a detailed example relevant to the computational biology community. PMID:28630888
SESAME seeks cash to open its doors
NASA Astrophysics Data System (ADS)
Banks, Michael
2008-12-01
Researchers from the Middle East met in Jordan last month to celebrate progress towards the region's first synchrotron-radiation source, which is due to be ready for users by 2011. Known as SESAME, the facility is being built near the Al-Baqa' Applied University in Allan, about 30 km from the Jordanian capital Amman, and is designed to promote peaceful scientific co-operation between nations in the troubled region. However, the project still has to find 15m - a fifth of the total costs - in order to finish construction or SESAME risks being delayed beyond 2011.
Harvey Mudd 2014-2015 Computer Science Conduit Clinic Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aspesi, G; Bai, J; Deese, R
2015-05-12
Conduit, a new open-source library developed at Lawrence Livermore National Laboratory, provides a C++ application programming interface (API) to describe and access scientific data. Conduit's primary use is for in-memory data exchange in high performance computing (HPC) applications. Our team tested and improved Conduit to make it more appealing to potential adopters in the HPC community. We extended Conduit's capabilities by prototyping four libraries: one for parallel communication using MPI, one for I/O functionality, one for aggregating performance data, and one for data visualization.
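Conduit also ships Python bindings alongside the C++ API. As a minimal sketch (assuming the bindings mirror the hierarchical Node interface documented for the C++ library), describing data in memory looks roughly like this:

    import numpy as np
    import conduit

    # A Node is a hierarchical, in-memory description of scientific data.
    n = conduit.Node()
    n["mesh/coords/x"] = np.array([0.0, 1.0, 2.0])
    n["mesh/topology/type"] = "uniform"
    n["state/cycle"] = 100

    # Nodes print as a human-readable tree, useful when debugging exchange.
    print(n)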
JHelioviewer: Open-Source Software for Discovery and Image Access in the Petabyte Age (Invited)
NASA Astrophysics Data System (ADS)
Mueller, D.; Dimitoglou, G.; Langenberg, M.; Pagel, S.; Dau, A.; Nuhn, M.; Garcia Ortiz, J. P.; Dietert, H.; Schmidt, L.; Hughitt, V. K.; Ireland, J.; Fleck, B.
2010-12-01
The unprecedented torrent of data returned by the Solar Dynamics Observatory is both a blessing and a barrier: a blessing for making available data with significantly higher spatial and temporal resolution, but a barrier for scientists to access, browse and analyze them. With such staggering data volumes, the data are bound to be accessible only from a few repositories, and users will have to deal with data sets that are effectively immobile and practically difficult to download. From a scientist's perspective this poses three challenges: accessing, browsing and finding interesting data while avoiding the proverbial search for a needle in a haystack. To address these challenges, we have developed JHelioviewer, an open-source visualization software package that lets users browse large data volumes both as still images and movies. We did so by deploying an efficient image encoding, storage, and dissemination solution using the JPEG 2000 standard. This solution enables users to access remote images at different resolution levels as a single data stream. Users can view, manipulate, pan, zoom, and overlay JPEG 2000 compressed data quickly, without severe network bandwidth penalties. Besides viewing data, the browser provides third-party metadata and event catalog integration to quickly locate data of interest, as well as an interface to the Virtual Solar Observatory to download science-quality data. As part of the Helioviewer Project, JHelioviewer offers intuitive ways to browse large amounts of heterogeneous data remotely and provides an extensible and customizable open-source platform for the scientific community.
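The enabling trick, decoding a JPEG 2000 image at a reduced resolution level rather than transferring full-resolution data, can be sketched in Python with the glymur library (an illustrative assumption; JHelioviewer itself is a Java application, and the filename below is a placeholder):

    import glymur

    # Open a local JPEG 2000 file.
    jp2 = glymur.Jp2k("aia.jp2")

    # Full resolution on demand...
    full = jp2[:]

    # ...or a coarser resolution level: power-of-two strided slicing maps to
    # lower JPEG 2000 resolution levels, so far less data has to be decoded.
    quarter = jp2[::4, ::4]
    print(full.shape, quarter.shape)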
ThinkHazard!: an open-source, global tool for understanding hazard information
NASA Astrophysics Data System (ADS)
Fraser, Stuart; Jongman, Brenden; Simpson, Alanna; Nunez, Ariel; Deparday, Vivien; Saito, Keiko; Murnane, Richard; Balog, Simone
2016-04-01
Rapid and simple access to added-value natural hazard and disaster risk information is a key issue for various stakeholders of the development and disaster risk management (DRM) domains. Accessing available data often requires specialist knowledge of heterogeneous data, which are often highly technical and can be difficult for non-specialists in DRM to find and exploit. Thus, availability, accessibility and processing of these information sources are crucial issues, and an important reason why many development projects suffer significant impacts from natural hazards. The World Bank's Global Facility for Disaster Reduction and Recovery (GFDRR) is currently developing a new open-source tool to address this knowledge gap: ThinkHazard! The main aim of the ThinkHazard! project is to develop an analytical tool dedicated to facilitating improvements in knowledge and understanding of natural hazards among non-specialists in DRM. It also aims at providing users with relevant guidance and information on handling the threats posed by the natural hazards present in a chosen location. Furthermore, all aspects of this tool will be open and transparent, in order to give users enough information to understand its operational principles. In this presentation, we will explain the technical approach behind the tool, which translates state-of-the-art probabilistic natural hazard data into understandable hazard classifications and practical recommendations. We will also demonstrate the functionality of the tool, and discuss limitations from a scientific as well as an operational perspective.
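The translation from probabilistic hazard data to an understandable classification can be illustrated with a toy function (the thresholds and categories below are invented for illustration and are not ThinkHazard!'s actual rules):

    def classify_hazard(return_period_years, intensity, threshold):
        # Toy scheme: an event exceeding the intensity threshold within a
        # short return period implies a higher hazard class.
        if intensity < threshold:
            return "very low"
        if return_period_years <= 100:
            return "high"
        if return_period_years <= 500:
            return "medium"
        return "low"

    # E.g. a 0.7 m flood depth expected at least once per 100 years:
    print(classify_hazard(100, 0.7, threshold=0.5))  # -> "high"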
The image-guided surgery toolkit IGSTK: an open source C++ software toolkit.
Enquobahrie, Andinet; Cheng, Patrick; Gary, Kevin; Ibanez, Luis; Gobbi, David; Lindseth, Frank; Yaniv, Ziv; Aylward, Stephen; Jomier, Julien; Cleary, Kevin
2007-11-01
This paper presents an overview of the image-guided surgery toolkit (IGSTK). IGSTK is an open source C++ software library that provides the basic components needed to develop image-guided surgery applications. It is intended for fast prototyping and development of image-guided surgery applications. The toolkit was developed through a collaboration between academic and industry partners. Because IGSTK was designed for safety-critical applications, the development team has adopted lightweight software processes that emphasize safety and robustness while, at the same time, supporting geographically separated developers. A software process philosophically similar to agile methods was adopted, emphasizing iterative, incremental, and test-driven development principles. The guiding principle in the architecture design of IGSTK is patient safety. The IGSTK team implemented a component-based architecture and used state machine software design methodologies to improve the reliability and safety of the components. Every IGSTK component has a well-defined set of features that are governed by state machines. The state machine ensures that the component is always in a valid state and that all state transitions are valid and meaningful. Realizing that the continued success and viability of an open source toolkit depends on a strong user community, the IGSTK team is following several key strategies to build an active user community. These include maintaining a users' and developers' mailing list, providing documentation (application programming interface reference document and book), presenting demonstration applications, and delivering tutorial sessions at relevant scientific conferences.
SmartR: an open-source platform for interactive visual analytics for translational research data
Herzinger, Sascha; Gu, Wei; Satagopam, Venkata; Eifes, Serge; Rege, Kavita; Barbosa-Silva, Adriano; Schneider, Reinhard
2017-01-01
Summary: In translational research, efficient knowledge exchange between the different fields of expertise is crucial. An open platform that is capable of storing a multitude of data types such as clinical, pre-clinical or OMICS data combined with strong visual analytical capabilities will significantly accelerate the scientific progress by making data more accessible and hypothesis generation easier. The open data warehouse tranSMART is capable of storing a variety of data types and has a growing user community including both academic institutions and pharmaceutical companies. tranSMART, however, currently lacks interactive and dynamic visual analytics and does not permit any post-processing interaction or exploration. For this reason, we developed SmartR, a plugin for tranSMART, that equips the platform not only with several dynamic visual analytical workflows, but also provides its own framework for the addition of new custom workflows. Modern web technologies such as D3.js or AngularJS were used to build a set of standard visualizations that were heavily improved with dynamic elements. Availability and Implementation: The source code is licensed under the Apache 2.0 License and is freely available on GitHub: https://github.com/transmart/SmartR. Contact: reinhard.schneider@uni.lu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28334291
Advancing global marine biogeography research with open-source GIS software and cloud-computing
Fujioka, Ei; Vanden Berghe, Edward; Donnelly, Ben; Castillo, Julio; Cleary, Jesse; Holmes, Chris; McKnight, Sean; Halpin, patrick
2012-01-01
Across many scientific domains, the ability to aggregate disparate datasets enables more meaningful global analyses. Within marine biology, the Census of Marine Life served as the catalyst for such a global data aggregation effort. Under the Census framework, the Ocean Biogeographic Information System (OBIS) was established to coordinate an unprecedented aggregation of global marine biogeography data. The OBIS data system now contains 31.3 million observations, freely accessible through a geospatial portal. The challenges of storing, querying, disseminating, and mapping a global data collection of this complexity and magnitude are significant. In the face of declining performance and expanding feature requests, a redevelopment of the OBIS data system was undertaken. Following an Open Source philosophy, the OBIS technology stack was rebuilt using PostgreSQL, PostGIS, GeoServer and OpenLayers. This approach has markedly improved the performance and online user experience while maintaining a standards-compliant and interoperable framework. Due to the distributed nature of the project and increasing needs for storage, scalability and deployment flexibility, the entire hardware and software stack was built on a Cloud Computing environment. The flexibility of the platform, combined with the power of the application stack, enabled rapid re-development of the OBIS infrastructure, and ensured complete standards-compliance.
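The kind of spatial query such a PostgreSQL/PostGIS stack serves can be sketched from Python; ST_DWithin, ST_MakePoint, and ST_SetSRID are standard PostGIS functions, but the connection string, table, and column names below are hypothetical, not the actual OBIS schema.

    # Sketch: find observations within 100 km of a point in a PostGIS-backed
    # biogeography database (table/column names are hypothetical).
    import psycopg2

    conn = psycopg2.connect("dbname=obis user=reader")
    cur = conn.cursor()
    cur.execute(
        """
        SELECT species_name, obs_date
        FROM observations
        WHERE ST_DWithin(geom::geography,
                         ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography,
                         100000)  -- radius in meters
        """,
        (-70.0, 40.0),  # longitude, latitude
    )
    for row in cur.fetchall():
        print(row)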
Visualization and Quality Control Web Tools for CERES Products
NASA Astrophysics Data System (ADS)
Mitrescu, C.; Doelling, D. R.
2017-12-01
The NASA CERES project continues to provide the scientific community with a wide variety of satellite-derived data products, such as observed TOA broadband shortwave and longwave fluxes, computed TOA and surface fluxes, as well as cloud, aerosol, and other atmospheric parameters. They encompass a wide range of temporal and spatial resolutions, suited to specific applications. CERES data are used mostly by climate modeling communities but also by a wide variety of educational institutions. To better serve our users, a web-based Ordering and Visualization Tool (OVT) was developed using open source software such as Eclipse, Java, JavaScript, OpenLayers, Flot, Google Maps, Python, and others. Due to increased demand by our own scientists, we also implemented a series of specialized functions to be used in the process of CERES Data Quality Control (QC), such as 1- and 2-D histograms, anomalies and differences, temporal and spatial averaging, and side-by-side parameter comparison, which made the process of QC far easier and faster, and, more importantly, far more portable. With the integration of ground-site observed surface fluxes we further help the CERES project QC the CERES computed surface fluxes. An overview of the CERES OVT basic functions using open source software, as well as future steps in expanding its capabilities, will be presented at the meeting.
SmartR: an open-source platform for interactive visual analytics for translational research data.
Herzinger, Sascha; Gu, Wei; Satagopam, Venkata; Eifes, Serge; Rege, Kavita; Barbosa-Silva, Adriano; Schneider, Reinhard
2017-07-15
In translational research, efficient knowledge exchange between the different fields of expertise is crucial. An open platform that is capable of storing a multitude of data types, such as clinical, pre-clinical or OMICS data, combined with strong visual analytical capabilities will significantly accelerate scientific progress by making data more accessible and hypothesis generation easier. The open data warehouse tranSMART is capable of storing a variety of data types and has a growing user community, including both academic institutions and pharmaceutical companies. tranSMART, however, currently lacks interactive and dynamic visual analytics and does not permit any post-processing interaction or exploration. For this reason, we developed SmartR, a plugin for tranSMART that not only equips the platform with several dynamic visual analytical workflows but also provides its own framework for the addition of new custom workflows. Modern web technologies such as D3.js and AngularJS were used to build a set of standard visualizations that were substantially enhanced with dynamic elements. The source code is licensed under the Apache 2.0 License and is freely available on GitHub: https://github.com/transmart/SmartR. Contact: reinhard.schneider@uni.lu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
Pérez-Rodríguez, Gael; Glez-Peña, Daniel; Azevedo, Nuno F; Pereira, Maria Olívia; Fdez-Riverola, Florentino; Lourenço, Anália
2015-03-01
Biofilms are receiving increasing attention from the biomedical community. Biofilm-like growth within the human body is considered one of the key microbial strategies for augmenting resistance and persistence during infectious processes. The Biofilms Experiment Workbench is a novel software workbench for the operation and analysis of biofilms experimental data. The goal is to promote the interchange and comparison of data among laboratories, providing systematic, harmonised and large-scale data computation. The workbench was developed with AIBench, an open-source Java desktop application framework for scientific software development in the domain of translational biomedicine. The implementation favours free and open-source third-party software, such as the R statistical package, and connects to the Web services of the BiofOmics database to enable public experiment deposition. First, we summarise the novel, free, open, XML-based interchange format for encoding biofilms experimental data. Then, we describe the execution of common scenarios of operation with the new workbench, such as the creation of new experiments, the importation of data from Excel spreadsheets, the computation of analytical results, the on-demand and highly customised construction of Web-publishable reports, and the comparison of results between laboratories. A considerable and varied amount of biofilms data is being generated, and there is a critical need to develop bioinformatics tools that expedite the interchange and comparison of microbiological and clinical results among laboratories. We propose a simple, open-source software infrastructure which is effective, extensible and easy to understand. The workbench is freely available for non-commercial use at http://sing.ei.uvigo.es/bew under the LGPL license. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Science in the Wild: Adventure Citizen Science in the Arctic and Himalaya
NASA Astrophysics Data System (ADS)
Horodyskyj, U. N.; Rufat-Latre, J.; Reimuller, J. D.; Rowe, P.; Pothier, B.; Thapa, A.
2016-12-01
Science in the Wild provides educational hands-on adventure science expeditions for the everyday person, blending athletics and academics in remote regions of the planet. Participants receive training on field data collection techniques in order to be able to help scientists in the field while on expedition with them. At SITW, we also involve our participants in analyzing and interpreting the data, thus teaching them about data quality and sources of error and uncertainty. SITW teaches citizens the art of science storytelling, aims to make science more open and transparent, and utilizes open source software and hardware in projects. Open science serves both the research community and the greater public. For the former, it makes science reproducible, transparent and more impactful by mobilizing multidisciplinary and international collaborative research efforts. For the latter, it minimizes mistrust in the sciences by allowing the public a `behind-the-scenes' look into how scientific research is conducted, raw and unfiltered. We present results from a citizen-science expedition to Baffin Island (Canadian Arctic), which successfully skied and sampled snow for dust and black carbon concentration from the Penny Ice Cap, down the 25-mile length of Coronation Glacier, and back to the small Arctic town of Qikitarjuaq. From a May/June 2016 citizen-science expedition to Nepal (Himalaya), we present results comparing 2014/16 depth and lake floor compositional data from supraglacial lakes on Ngozumpa glacier while using open-source surface and underwater robotics. The Sherpa-Scientist Initiative, a program aimed at empowering locals in data collection and interpretation, successfully trained half a dozen Sherpas during this expedition and demonstrates the value of local engagement. In future expeditions to the region, efforts will be made to scale up the number of trainees and expand our spatial reach in the Himalaya.
Avionics and Power Management for Low-Cost High-Altitude Balloon Science Platforms
NASA Technical Reports Server (NTRS)
Chin, Jeffrey; Roberts, Anthony; McNatt, Jeremiah
2016-01-01
High-altitude balloons (HABs) have become popular as educational and scientific platforms for planetary research. This document outlines key components for missions where low cost and rapid development are desired. As an alternative to ground-based vacuum and thermal testing, these systems can be flight tested at comparable cost. Communication, solar, space, and atmospheric sensing experiments often require environments where ground-level testing can be challenging or, in certain cases, impossible. When performing HAB research, the ability to monitor the status of the platform and gather data is key for both the scientific and recoverability aspects of the mission. A few turnkey platform solutions are outlined that leverage rapidly evolving open-source engineering ecosystems. Rather than building custom components from scratch, these recommendations aim to maximize the simplicity and minimize the cost of HAB platforms to make launches more accessible to everyone.
Ward, Tony J; Delaloye, Naomi; Adams, Earle Raymond; Ware, Desirae; Vanek, Diana; Knuth, Randy; Hester, Carolyn Laurie; Marra, Nancy Noel; Holian, Andrij
2016-01-01
Air Toxics Under the Big Sky is an environmental science outreach/education program that incorporates the Next Generation Science Standards (NGSS) 8 Practices with the goal of promoting knowledge and understanding of authentic scientific research in high school classrooms through air quality research. A quasi-experimental design was used in order to understand: 1) how the program affects student understanding of scientific inquiry and research and 2) how the open inquiry learning opportunities provided by the program increase student interest in science as a career path. Treatment students received instruction related to air pollution (airborne particulate matter), associated health concerns, and training on how to operate air quality testing equipment. They then participated in a yearlong scientific research project in which they developed and tested hypotheses through research of their own design regarding the sources and concentrations of air pollution in their homes and communities. Results from an external evaluation revealed that treatment students developed a deeper understanding of scientific research than did comparison students, as measured by their ability to generate good hypotheses and research designs, and equally expressed an increased interest in pursuing a career in science. These results emphasize the value of and need for authentic science learning opportunities in the modern science classroom.
NASA Astrophysics Data System (ADS)
Ward, Tony J.; Delaloye, Naomi; Adams, Earle Raymond; Ware, Desirae; Vanek, Diana; Knuth, Randy; Hester, Carolyn Laurie; Marra, Nancy Noel; Holian, Andrij
2016-04-01
Air Toxics Under the Big Sky is an environmental science outreach/education program that incorporates the Next Generation Science Standards (NGSS) 8 Practices with the goal of promoting knowledge and understanding of authentic scientific research in high school classrooms through air quality research. This research explored: (1) how the program affects student understanding of scientific inquiry and research and (2) how the open-inquiry learning opportunities provided by the program increase student interest in science as a career path. Treatment students received instruction related to air pollution (airborne particulate matter), associated health concerns, and training on how to operate air quality testing equipment. They then participated in a yearlong scientific research project in which they developed and tested hypotheses through research of their own design regarding the sources and concentrations of air pollution in their homes and communities. Results from an external evaluation revealed that treatment students developed a deeper understanding of scientific research than did comparison students, as measured by their ability to generate good hypotheses and research designs, and equally expressed an increased interest in pursuing a career in science. These results emphasize the value of and need for authentic science learning opportunities in the modern science classroom.
Delaloye, Naomi; Adams, Earle Raymond; Ware, Desirae; Vanek, Diana; Knuth, Randy; Hester, Carolyn Laurie; Marra, Nancy Noel; Holian, Andrij
2016-01-01
Air Toxics Under the Big Sky is an environmental science outreach/education program that incorporates the Next Generation Science Standards (NGSS) 8 Practices with the goal of promoting knowledge and understanding of authentic scientific research in high school classrooms through air quality research. A quasi-experimental design was used in order to understand: 1) how the program affects student understanding of scientific inquiry and research and 2) how the open inquiry learning opportunities provided by the program increase student interest in science as a career path. Treatment students received instruction related to air pollution (airborne particulate matter), associated health concerns, and training on how to operate air quality testing equipment. They then participated in a yearlong scientific research project in which they developed and tested hypotheses through research of their own design regarding the sources and concentrations of air pollution in their homes and communities. Results from an external evaluation revealed that treatment students developed a deeper understanding of scientific research than did comparison students, as measured by their ability to generate good hypotheses and research designs, and equally expressed an increased interest in pursuing a career in science. These results emphasize the value of and need for authentic science learning opportunities in the modern science classroom. PMID:28286375
Huh, Sun
2013-01-01
ScienceCentral, a free or open access, full-text archive of scientific journal literature at the Korean Federation of Science and Technology Societies, was under test in September 2013. Since it is a Journal Article Tag Suite-based full-text database, Extensible Markup Language files in all languages can be presented, using Unicode Transformation Format 8-bit (UTF-8) encoding. It is comparable to PubMed Central; however, there are two distinct differences. First, its scope comprises all science fields; second, it accepts journals in all languages. Launching ScienceCentral is the first step for free access or open access academic scientific journals in all languages, including scientific journals from Croatia, to reach the world. PMID:24266292
ObsPy: A Python Toolbox for Seismology - Recent Developments and Applications
NASA Astrophysics Data System (ADS)
Megies, T.; Krischer, L.; Barsch, R.; Sales de Andrade, E.; Beyreuther, M.
2014-12-01
ObsPy (http://www.obspy.org) is a community-driven, open-source project dedicated to building a bridge for seismology into the scientific Python ecosystem. It offers (a) read and write support for essentially all commonly used waveform, station, and event metadata file formats with a unified interface; (b) a comprehensive signal processing toolbox tuned to the needs of seismologists; (c) integrated access to all large data centers, web services and databases; and (d) convenient wrappers to legacy codes like libtau and evalresp. Python, currently the most popular language for teaching introductory computer science courses at top-ranked U.S. departments, is a full-blown programming language with the flexibility of an interactive scripting language. Its extensive standard library and large variety of freely available high quality scientific modules cover most needs in developing scientific processing workflows. Together with packages like NumPy, SciPy, Matplotlib, IPython, Pandas, lxml, and PyQt, ObsPy enables the construction of complete workflows in Python. These vary from reading locally stored data or requesting data from one or more different data centers, through signal analysis and data processing, to visualizations in GUI and web applications, output of modified/derived data, and the creation of publication-quality figures. ObsPy enjoys a large world-wide rate of adoption in the community. Applications successfully using it include time-dependent and rotational seismology, big data processing, event relocations, and synthetic studies about attenuation kernels and full-waveform inversions, to name a few examples. All functionality is extensively documented, and the ObsPy tutorial and gallery give a good impression of the wide range of possible use cases. We will present the basic features of ObsPy, new developments and applications, and a roadmap for the near future, and discuss the sustainability of our open-source development model.
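As a flavor of the unified interface, the sketch below requests an hour of waveform data from a data center, applies two processing steps, and plots the result; the client and Stream methods follow the documented ObsPy API (module paths as in recent releases), and the network/station codes are merely an example request.

    # Fetch, process, and plot one hour of broadband data with ObsPy.
    from obspy import UTCDateTime
    from obspy.clients.fdsn import Client

    client = Client("IRIS")
    t0 = UTCDateTime("2014-01-01T00:00:00")
    st = client.get_waveforms("IU", "ANMO", "00", "BHZ", t0, t0 + 3600)
    st.detrend("demean")                              # remove the mean
    st.filter("bandpass", freqmin=0.05, freqmax=1.0)  # signal processing toolbox
    st.plot()                                         # quick-look figure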
Data processing with Pymicra, the Python tool for Micrometeorological Analyses
NASA Astrophysics Data System (ADS)
Chor, T. L.; Dias, N. L.
2017-12-01
With the ever-increasing capability of instrumentation to collect high-frequency turbulence data, micrometeorological experiments are now generating significant amounts of data. Clearly, data processing -- and not data collection anymore -- has become the limiting factor for those very large data sets. The ability to extract useful scientific information from those experiments, therefore, hinges on tools that (i) are able to process those data effectively and accurately, (ii) are flexible enough to be adapted to the specific requirements of each investigation, and (iii) are robust enough to make data analysis easily reproducible across different large data sets. We have developed a framework for micrometeorological data analysis called Pymicra, which delivers such capabilities while keeping the investigator close to the data. It is fully written in an open-source, very high-level language, Python, which has been gaining widespread acceptance as a scientific tool. It follows the philosophy of "not reinventing the wheel" and, as a result, relies on existing well-established open-source Python packages such as Numpy and Pandas. Thus, minimum effort is needed to program statistics, array processing, Fourier analysis, etc. Among the things that Pymicra does are reading and organizing data from virtually any format, applying common quality control procedures, extracting fluctuations in a number of ways, correcting for sensor drift, automatic calculation of fluid properties (such as air and dry air density), handling of units, calculation of cross-spectra, calculation of turbulent fluxes and scales, and all other features already provided by Pandas (interpolation, statistical tests, handling of missing data, etc.). Pymicra is freely available on Github, and its heavy use of high-level programming makes adding and modifying code considerably easier for any scientific programmer, making it straightforward for other scientists to contribute new functionality and point out room for improvement. Because of that, Pymicra is a candidate to become a community-developed code and to centralize part of the data processing aimed at micrometeorology.
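Pymicra's own function names are not reproduced here; instead, the sketch below shows in plain Pandas the style of calculation the package automates - extract fluctuations from 30-minute block means, then average their product into a flux - with a hypothetical input file and column names.

    # Generic eddy-covariance-style step in plain Pandas (NOT Pymicra's API;
    # the file, the columns w (vertical wind) and T (temperature), and the
    # 30-minute averaging window are all hypothetical).
    import pandas as pd

    data = pd.read_csv("sonic_20hz.csv", parse_dates=["time"], index_col="time")
    block = data.index.floor("30min")                 # label each 30-min block
    means = data[["w", "T"]].groupby(block).transform("mean")
    fluct = data[["w", "T"]] - means                  # fluctuations w', T'
    wT = (fluct["w"] * fluct["T"]).groupby(block).mean()  # kinematic heat flux proxy
    print(wT.head())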
ObsPy: A Python toolbox for seismology - Sustainability, New Features, and Applications
NASA Astrophysics Data System (ADS)
Krischer, L.; Megies, T.; Sales de Andrade, E.; Barsch, R.; MacCarthy, J.
2016-12-01
ObsPy (https://www.obspy.org) is a community-driven, open-source project dedicated to offering a bridge for seismology into the scientific Python ecosystem. Amongst other things, it provides read and write support for essentially every commonly used data format in seismology with a unified interface, covering waveform data as well as station and event metadata; a signal processing toolbox tuned to the specific needs of seismologists; integrated access to the largest data centers, web services, and databases; and wrappers around third-party codes like libmseed and evalresp. Using ObsPy enables users to take advantage of the vast scientific ecosystem that has developed around Python. In contrast to many other programming languages and tools, Python is simple enough to enable an exploratory and interactive coding style desired by many scientists. At the same time it is a full-fledged programming language usable by software engineers to build complex and large programs. This combination makes it very suitable for use in seismology, where research code often must be translated to stable and production-ready environments, especially in the age of big data. ObsPy has seen constant development for more than six years and enjoys a large rate of adoption in the seismological community, with thousands of users. Successful applications include time-dependent and rotational seismology, big data processing, event relocations, and synthetic studies about attenuation kernels and full-waveform inversions, to name a few examples. Additionally, it has sparked the development of several more specialized packages, slowly building a modern seismological ecosystem around it. We will present a short overview of the capabilities of ObsPy and point out several representative use cases and more specialized software built around ObsPy. Additionally we will discuss new and upcoming features, as well as the sustainability of open-source scientific software.
OpenAQ: A Platform to Aggregate and Freely Share Global Air Quality Data
NASA Astrophysics Data System (ADS)
Hasenkopf, C. A.; Flasher, J. C.; Veerman, O.; DeWitt, H. L.
2015-12-01
Thousands of ground-based air quality monitors around the world publicly publish real-time air quality data; however, researchers and the public do not have access to this information in the ways most useful to them. Often, air quality data are posted on obscure websites showing only current values, are programmatically inaccessible, and/or are in inconsistent data formats across sites. Yet, historical and programmatic access to such a global dataset would be transformative to several scientific fields, from epidemiology to low-cost sensor technologies to estimates of ground-level aerosol by satellite retrievals. To increase accessibility and standardize this disparate dataset, we have built OpenAQ, an innovative, open platform created by a group of scientists and open data programmers. The source code for the platform is viewable at github.com/openaq. Currently, we are aggregating, storing, and making publicly available real-time air quality data (PM2.5, PM10, SO2, NO2, and O3) via an Application Program Interface (API). We will present the OpenAQ platform, which currently has the following specific capabilities: (1) a continuous ingest mechanism for some of the most polluted cities, generalizable to more sources; (2) an API providing data querying, including the ability to filter by location, measurement type, value, and date, as well as custom sort options; and (3) a generalized, chart-based visualization tool to explore data accessible via the API. At this stage, we are seeking wider participation and input from multiple research communities in expanding our data retrieval sites, standardizing our protocols, receiving feedback on quality issues, and creating tools that can be built on top of this open platform.
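As an illustration of the data-querying capability, the sketch below hits the measurements endpoint with a few filters; the v1 endpoint path, parameter names, and response fields reflect the OpenAQ API documentation of that period and should be treated as assumptions.

    # Query recent PM2.5 measurements for one city from the OpenAQ API.
    import requests

    resp = requests.get(
        "https://api.openaq.org/v1/measurements",
        params={"city": "Delhi", "parameter": "pm25", "limit": 5},
    )
    resp.raise_for_status()
    for m in resp.json()["results"]:
        print(m["date"]["utc"], m["value"], m["unit"])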
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aquila, Andrew Lee
The development of multilayer optics for extreme ultraviolet (EUV) radiation has led to advancements in many areas of science and technology, including materials studies, EUV lithography, water window microscopy, plasma imaging, and orbiting solar physics imaging. Recent developments in femtosecond and attosecond EUV pulse generation from sources such as high harmonic generation lasers, combined with the elemental and chemical specificity provided by EUV radiation, are opening new opportunities to study fundamental dynamic processes in materials. Critical to these efforts is the design and fabrication of multilayer optics to transport, focus, shape and image these ultra-fast pulses. This thesis describes the design, fabrication, characterization, and application of multilayer optics for EUV femtosecond and attosecond scientific studies. Multilayer mirrors for bandwidth control, pulse shaping and compression, tri-material multilayers, and multilayers for polarization control are described. Characterization of multilayer optics was performed, including measurement of material optical constants, reflectivity of multilayer mirrors, and metrology of the reflected phases of the multilayer, which is critical to maintaining pulse size and shape. Two applications of these multilayer mirrors are detailed in the thesis. In the first application, broad bandwidth multilayers were used to characterize and measure sub-100 attosecond pulses from a high harmonic generation source; this work was performed in collaboration with the Max Planck Institute for Quantum Optics and Ludwig-Maximilians University in Garching, Germany, with Professors Krausz and Kleineberg. In the second application, multilayer mirrors with polarization control are used to study femtosecond spin dynamics in an ongoing collaboration with the T-REX group of Professor Parmigiani at Elettra in Trieste, Italy. As new ultrafast x-ray sources become available, for example free electron lasers, the multilayer designs described in this thesis can be extended to higher photon energies, and such designs can be used with those sources to enable new scientific studies, such as molecular bonding, phonon, and spin dynamics.
Cross-language Babel structs—making scientific interfaces more efficient
NASA Astrophysics Data System (ADS)
Prantl, Adrian; Ebner, Dietmar; Epperly, Thomas G. W.
2013-01-01
Babel is an open-source language interoperability framework tailored to the needs of high-performance scientific computing. As an integral element of the Common Component Architecture, it is employed in a wide range of scientific applications where it is used to connect components written in different programming languages. In this paper we describe how we extended Babel to support interoperable tuple data types (structs). Structs are a common idiom in (mono-lingual) scientific application programming interfaces (APIs); they are an efficient way to pass tuples of nonuniform data between functions, and are supported natively by most programming languages. Using our extended version of Babel, developers of scientific codes can now pass structs as arguments between functions implemented in any of the supported languages. In C, C++, Fortran 2003/2008 and Chapel, structs can be passed without the overhead of data marshaling or copying, providing language interoperability at minimal cost. Other supported languages are Fortran 77, Fortran 90/95, Java and Python. We will show how we designed a struct implementation that is interoperable with all of the supported languages and present benchmark data to compare the performance of all language bindings, highlighting the differences between languages that offer native struct support and an object-oriented interface with getter/setter methods. A case study shows how structs can help simplify the interfaces of scientific codes significantly.
QSAR DataBank - an approach for the digital organization and archiving of QSAR model information
2014-01-01
Background: Research efforts in the field of descriptive and predictive Quantitative Structure-Activity Relationships or Quantitative Structure-Property Relationships produce around one thousand scientific publications annually. All the materials and results are mainly communicated using printed media. Printed media in their present form have obvious limitations when it comes to effectively representing mathematical models (including complex and non-linear ones) and large bodies of associated numerical chemical data. They do not support secondary information extraction or reuse efforts, while in silico studies pose additional requirements for accessibility, transparency and reproducibility of the research. This gap can and should be bridged by introducing domain-specific digital data exchange standards and tools. The current publication presents a formal specification of the quantitative structure-activity relationship data organization and archival format called the QSAR DataBank (QsarDB for shorter, or QDB for shortest). Results: The article describes the QsarDB data schema, which formalizes QSAR concepts (objects and relationships between them), and the QsarDB data format, which formalizes their presentation for computer systems. The utility and benefits of QsarDB have been thoroughly tested by solving everyday QSAR and predictive modeling problems, with examples in the field of predictive toxicology, and can be applied to a wide variety of other endpoints. The work is accompanied by an open source reference implementation and tools. Conclusions: The proposed open data, open source, and open standards design is open to public and proprietary extensions on many levels. Selected use cases exemplify the benefits of the proposed QsarDB data format. General ideas for future development are discussed. PMID:24910716
PlasmaPy: initial development of a Python package for plasma physics
NASA Astrophysics Data System (ADS)
Murphy, Nicholas; Leonard, Andrew J.; Stańczak, Dominik; Haggerty, Colby C.; Parashar, Tulasi N.; Huang, Yu-Min; PlasmaPy Community
2017-10-01
We report on initial development of PlasmaPy: an open source community-driven Python package for plasma physics. PlasmaPy seeks to provide core functionality that is needed for the formation of a fully open source Python ecosystem for plasma physics. PlasmaPy prioritizes code readability, consistency, and maintainability while using best practices for scientific computing such as version control, continuous integration testing, embedding documentation in code, and code review. We discuss our current and planned capabilities, including features presently under development. The development roadmap includes features such as fluid and particle simulation capabilities, a Grad-Shafranov solver, a dispersion relation solver, atomic data retrieval methods, and tools to analyze simulations and experiments. We describe several ways to contribute to PlasmaPy. PlasmaPy has a code of conduct and is being developed under a BSD license, with a version 0.1 release planned for 2018. The success of PlasmaPy depends on active community involvement, so anyone interested in contributing to this project should contact the authors. This work was partially supported by the U.S. Department of Energy.
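As a taste of the core functionality such a package provides, the sketch below computes an Alfvén speed with physical units attached via Astropy; note that the formulary import path reflects PlasmaPy releases later than the v0.1 planned above, so treat the module location as an assumption.

    # Compute an Alfvén speed with units (import path per later PlasmaPy releases).
    import astropy.units as u
    from plasmapy.formulary import Alfven_speed

    v_a = Alfven_speed(B=1e-3 * u.T, density=1e16 * u.m**-3, ion="p+")
    print(v_a.to(u.km / u.s))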
Mocking the weak lensing universe: The LensTools Python computing package
NASA Astrophysics Data System (ADS)
Petri, A.
2016-10-01
We present a newly developed software package which implements a wide range of routines frequently used in Weak Gravitational Lensing (WL). With the continuously increasing size of the WL scientific community, we feel that easy-to-use Application Program Interfaces (APIs) for common calculations are a necessity to ensure efficiency and coordination across different working groups. Coupled with existing open source codes, such as CAMB (Lewis et al., 2000) and Gadget2 (Springel, 2005), LensTools brings together a cosmic shear simulation pipeline which, complemented with a variety of WL feature measurement tools and parameter sampling routines, provides easy access to the numerics for theoretical studies of WL as well as for experiment forecasts. Being implemented in PYTHON (Rossum, 1995), LensTools takes full advantage of a range of state-of-the-art techniques developed by the large and growing open-source software community (Jones et al., 2001; McKinney, 2010; Astropy Collaboration, 2013; Pedregosa et al., 2011; Foreman-Mackey et al., 2013). We made the LensTools code available on the Python Package Index and published its documentation at http://lenstools.readthedocs.io.
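A typical measurement with the package might look like the sketch below; ConvergenceMap.load and powerSpectrum follow the LensTools documentation, while the input file name and multipole binning are placeholders.

    # Load a convergence map and measure its angular power spectrum.
    import numpy as np
    from lenstools import ConvergenceMap

    conv = ConvergenceMap.load("convergence_map.fits")  # placeholder file
    l_edges = np.linspace(200.0, 10000.0, 50)           # multipole bin edges
    l, Pl = conv.powerSpectrum(l_edges)
    print(l[:3], Pl[:3])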
Sun, Ryan; Bouchard, Matthew B.; Hillman, Elizabeth M. C.
2010-01-01
Camera-based in-vivo optical imaging can provide detailed images of living tissue that reveal structure, function, and disease. High-speed, high resolution imaging can reveal dynamic events such as changes in blood flow and responses to stimulation. Despite these benefits, commercially available scientific cameras rarely include software that is suitable for in-vivo imaging applications, making this highly versatile form of optical imaging challenging and time-consuming to implement. To address this issue, we have developed a novel, open-source software package to control high-speed, multispectral optical imaging systems. The software integrates a number of modular functions through a custom graphical user interface (GUI) and provides extensive control over a wide range of inexpensive IEEE 1394 Firewire cameras. Multispectral illumination can be incorporated through the use of off-the-shelf light emitting diodes which the software synchronizes to image acquisition via a programmed microcontroller, allowing arbitrary high-speed illumination sequences. The complete software suite is available for free download. Here we describe the software’s framework and provide details to guide users with development of this and similar software. PMID:21258475
CellProfiler Tracer: exploring and validating high-throughput, time-lapse microscopy image data.
Bray, Mark-Anthony; Carpenter, Anne E
2015-11-04
Time-lapse analysis of cellular images is an important and growing need in biology. Algorithms for cell tracking are widely available; what researchers have been missing is a single open-source software package to visualize standard tracking output (from software like CellProfiler) in a way that allows convenient assessment of track quality, especially for researchers tuning tracking parameters for high-content time-lapse experiments. This makes quality assessment and algorithm adjustment a substantial challenge, particularly when dealing with hundreds of time-lapse movies collected in a high-throughput manner. We present CellProfiler Tracer, a free and open-source tool that complements the object tracking functionality of the CellProfiler biological image analysis package. Tracer allows multi-parametric morphological data to be visualized on object tracks, providing visualizations that have already been validated within the scientific community for time-lapse experiments, and combining them with simple graph-based measures for highlighting possible tracking artifacts. CellProfiler Tracer is a useful, free tool for inspection and quality control of object tracking data, available from http://www.cellprofiler.org/tracer/.
2MASS Catalog Server Kit Version 2.1
NASA Astrophysics Data System (ADS)
Yamauchi, C.
2013-10-01
The 2MASS Catalog Server Kit is open source software for easily constructing a high-performance search server for important astronomical catalogs. This software utilizes the open source RDBMS PostgreSQL; therefore, any user can set up the database on a local computer by following the step-by-step installation guide. The kit provides highly optimized stored functions for positional searches, similar to SDSS SkyServer. Together with these, the powerful SQL environment of PostgreSQL will meet various users' demands. We released 2MASS Catalog Server Kit version 2.1 in 2012 May, which supports the latest WISE All-Sky catalog (563,921,584 rows) and 9 major all-sky catalogs. Local databases are often indispensable for observatories with unstable or narrow-band networks or heavy use, such as retrieving large numbers of records within a short period of time. This software is well suited to such purposes, and the broader catalog support and other improvements in version 2.1 cover a wider range of applications, including advanced calibration systems, scientific studies using complicated SQL queries, etc. Official page: http://www.ir.isas.jaxa.jp/~cyamauch/2masskit/
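A client-side call to such a stored function might look like the sketch below; the function name and argument convention are hypothetical stand-ins modeled on SDSS SkyServer-style cone searches, not the kit's documented names.

    # Sketch: cone search via a PostgreSQL stored function (hypothetical name).
    import psycopg2

    conn = psycopg2.connect("dbname=twomass user=reader")
    cur = conn.cursor()
    # Hypothetical function: objects within 2 arcmin of (RA, Dec) in degrees.
    cur.execute("SELECT * FROM fGetNearbyObjEq(%s, %s, %s)",
                (266.417, -29.008, 2.0))
    print(cur.fetchall())
    conn.close()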
Role of Open Source Tools and Resources in Virtual Screening for Drug Discovery.
Karthikeyan, Muthukumarasamy; Vyas, Renu
2015-01-01
Advancement in chemoinformatics research, in parallel with the availability of high-performance computing platforms, has made handling of large-scale, multi-dimensional scientific data for high-throughput drug discovery easier. In this study we have explored publicly available molecular databases with the help of open-source-based, integrated, in-house molecular informatics tools for virtual screening. The virtual screening literature of the past decade has been extensively investigated and thoroughly analyzed to reveal interesting patterns with respect to the drug, target, scaffold and disease space. The review also focuses on the integrated chemoinformatics tools that are capable of harvesting chemical data from textual literature information and transforming it into truly computable chemical structures, identification of unique fragments and scaffolds from a class of compounds, automatic generation of focused virtual libraries, computation of molecular descriptors for structure-activity relationship studies, application of conventional filters used in lead discovery along with in-house developed exhaustive PTC (Pharmacophores, Toxicophores and Chemophores) filters, and machine learning tools for the design of potential disease-specific inhibitors. A case study on kinase inhibitors is provided as an example.
CFHT data processing and calibration ESPaDOnS pipeline: Upena and OPERA (optical spectropolarimetry)
NASA Astrophysics Data System (ADS)
Martioli, Eder; Teeple, D.; Manset, Nadine
2011-03-01
CFHT is responsible for processing raw ESPaDOnS images, removing instrument-related artifacts, and delivering science-ready data to the PIs. Here we describe the Upena pipeline, the software used to reduce the echelle spectro-polarimetric data obtained with the ESPaDOnS instrument. Upena is an automated pipeline that performs calibration and reduction of raw images. Upena can perform both real-time reduction on an image-by-image basis and a complete reduction after the observing night. Upena produces polarization and intensity spectra in FITS format. The pipeline is designed to perform parallel computing for improved speed, which assures that the final products are delivered to the PIs before noon HST after each night of observations. We also present the OPERA project, an open-source pipeline to reduce ESPaDOnS data that will be developed as a collaborative effort between CFHT and the scientific community. OPERA will match the core capabilities of Upena and, in addition, will be open-source, flexible and extensible.
Ten tips for authors of scientific articles.
Hong, Sung-Tae
2014-08-01
Writing a good quality scientific article takes experience and skill. I propose 'Ten Tips' that may help to improve the quality of manuscripts for scholarly journals. It is advisable to draft the first version of the manuscript and revise it repeatedly for consistency and accuracy of the writing. During drafting and revising, the following tips can be considered: 1) focus on design to have proper content, conclusion, and points compliant with the scope of the target journal, an appropriate list of authors and contributors, and relevant references from widely visible sources; 2) format the manuscript in accordance with the instructions to authors of the target journal; 3) ensure consistency and logical flow of ideas and scientific facts; 4) provide scientific confidence; 5) make your story interesting for your readers; 6) write short, simple and attractive sentences; 7) bear in mind that properly composed and reflective titles increase the chances of attracting more readers; 8) do not forget that well-structured and readable abstracts improve the citability of your publications; 9) when revising, adhere to the rule of 'First and Last' - open your text with a topic paragraph and close it with a resolution paragraph; 10) use connecting words, linking sentences within a paragraph by repeating relevant keywords.
Ten Tips for Authors of Scientific Articles
2014-01-01
Writing a good quality scientific article takes experience and skill. I propose 'Ten Tips' that may help to improve the quality of manuscripts for scholarly journals. It is advisable to draft the first version of the manuscript and revise it repeatedly for consistency and accuracy of the writing. During drafting and revising, the following tips can be considered: 1) focus on design to have proper content, conclusion, and points compliant with the scope of the target journal, an appropriate list of authors and contributors, and relevant references from widely visible sources; 2) format the manuscript in accordance with the instructions to authors of the target journal; 3) ensure consistency and logical flow of ideas and scientific facts; 4) provide scientific confidence; 5) make your story interesting for your readers; 6) write short, simple and attractive sentences; 7) bear in mind that properly composed and reflective titles increase the chances of attracting more readers; 8) do not forget that well-structured and readable abstracts improve the citability of your publications; 9) when revising, adhere to the rule of 'First and Last' - open your text with a topic paragraph and close it with a resolution paragraph; 10) use connecting words, linking sentences within a paragraph by repeating relevant keywords. PMID:25120310
Beyond open access: open discourse, the next great equalizer.
Dayton, Andrew I
2006-08-30
The internet is expanding the realm of scientific publishing to include free and open public debate of published papers. Journals are beginning to support web posting of comments on their published articles and independent organizations are providing centralized web sites for posting comments about any published article. The trend promises to give one and all access to read and contribute to cutting edge scientific criticism and debate.
Free/Libre open source software in health care: a review.
Karopka, Thomas; Schmuhl, Holger; Demski, Hans
2014-01-01
To assess the current state of the art and the contribution of Free/Libre Open Source Software in health care (FLOSS-HC). The review is based on a narrative review of the scientific literature as well as sources in the context of FLOSS-HC available through the Internet. All relevant available sources have been integrated into the MedFLOSS database and are freely available to the community. The literature review reveals that publications about FLOSS-HC are scarce. The largest part of the information about FLOSS-HC is available on dedicated websites and not in the academic literature. There are currently FLOSS alternatives available for nearly every specialty in health care. Maturity and quality vary considerably, and there is little information available on the percentage of systems that are actually used in health care delivery. The global impact of FLOSS-HC is still very limited, and no figures on the penetration and usage of FLOSS-HC are available. However, there has been considerable growth in the last 5 to 10 years. While only a few systems were available a decade ago, many systems have since become available (e.g., more than 300 in the MedFLOSS database). While FLOSS concepts play an important role in most IT-related sectors (e.g., telecommunications, embedded devices), the healthcare industry is lagging behind this trend.
Skyline: an open source document editor for creating and analyzing targeted proteomics experiments.
MacLean, Brendan; Tomazela, Daniela M; Shulman, Nicholas; Chambers, Matthew; Finney, Gregory L; Frewen, Barbara; Kern, Randall; Tabb, David L; Liebler, Daniel C; MacCoss, Michael J
2010-04-01
Skyline is a Windows client application for targeted proteomics method creation and quantitative data analysis. It is open source and freely available for academic and commercial use. The Skyline user interface simplifies the development of mass spectrometer methods and the analysis of data from targeted proteomics experiments performed using selected reaction monitoring (SRM). Skyline supports using and creating MS/MS spectral libraries from a wide variety of sources to choose SRM filters and verify results based on previously observed ion trap data. Skyline exports transition lists to and imports the native output files from Agilent, Applied Biosystems, Thermo Fisher Scientific and Waters triple quadrupole instruments, seamlessly connecting mass spectrometer output back to the experimental design document. The fast and compact Skyline file format is easily shared, even for experiments requiring many sample injections. A rich array of graphs displays results and provides powerful tools for inspecting data integrity as data are acquired, helping instrument operators to identify problems early. The Skyline dynamic report designer exports tabular data from the Skyline document model for in-depth analysis with common statistical tools. Single-click, self-updating web installation is available at http://proteome.gs.washington.edu/software/skyline. This web site also provides access to instructional videos, a support board, an issues list and a link to the source code project.
Free/Libre Open Source Software in Health Care: A Review
Schmuhl, Holger; Demski, Hans
2014-01-01
Objectives: To assess the current state of the art and the contribution of Free/Libre Open Source Software in health care (FLOSS-HC). Methods: The review is based on a narrative review of the scientific literature as well as sources in the context of FLOSS-HC available through the Internet. All relevant available sources have been integrated into the MedFLOSS database and are freely available to the community. Results: The literature review reveals that publications about FLOSS-HC are scarce. The largest part of the information about FLOSS-HC is available on dedicated websites and not in the academic literature. There are currently FLOSS alternatives available for nearly every specialty in health care. Maturity and quality vary considerably, and there is little information available on the percentage of systems that are actually used in health care delivery. Conclusions: The global impact of FLOSS-HC is still very limited, and no figures on the penetration and usage of FLOSS-HC are available. However, there has been considerable growth in the last 5 to 10 years. While only a few systems were available a decade ago, many systems have since become available (e.g., more than 300 in the MedFLOSS database). While FLOSS concepts play an important role in most IT-related sectors (e.g., telecommunications, embedded devices), the healthcare industry is lagging behind this trend. PMID:24627814
NASA Astrophysics Data System (ADS)
Buxner, Sanlyn; Impey, Chris David; Formanek, Martin; Wenger, Matthew
2018-01-01
Introductory astronomy courses are exciting opportunities to engage non-major students in scientific issues, new discoveries, and scientific thinking. Many undergraduate students take these courses to complete their general education requirements. Many free-choice learners also take these courses, but for their own interest. We report on a study comparing the basic science knowledge, interest in science, and information literacy of undergraduate students and free choice learners enrolled in introductory astronomy courses run by the University of Arizona. Undergraduate students take both in-person and online courses for college credit. Free choice learners enroll in massive open online courses (MOOCs), through commercial platforms, that can earn them a certificate (although most do not take advantage of that opportunity). In general, we find that undergraduate students outperform the general public on basic science knowledge and that learners in our astronomy MOOCs outperform the undergraduate students in the study. Learners in the MOOC have higher interest in science in general. Overall, learners in both groups report getting information about science from online sources. Additionally, learners’ judgement of the reliability of different sources of information is weakly related to their basic science knowledge and more strongly related to how they describe what it means to study something scientifically. We discuss the implications of our findings for both undergraduate students and free-choice learners as well as instructors of these types of courses.
Youpi: YOUr processing PIpeline
NASA Astrophysics Data System (ADS)
Monnerville, Mathias; Sémah, Gregory
2012-03-01
Youpi is a portable, easy to use web application providing high level functionalities to perform data reduction on scientific FITS images. Built on top of various open source reduction tools released to the community by TERAPIX (http://terapix.iap.fr), Youpi can help organize data, manage processing jobs on a computer cluster in real time (using Condor) and facilitate teamwork by allowing fine-grain sharing of results and data. Youpi is modular and comes with plugins which perform, from within a browser, various processing tasks such as evaluating the quality of incoming images (using the QualityFITS software package), computing astrometric and photometric solutions (using SCAMP), resampling and co-adding FITS images (using SWarp) and extracting sources and building source catalogues from astronomical images (using SExtractor). Youpi is useful for small to medium-sized data reduction projects; it is free and is published under the GNU General Public License.
Engaging the public in the nascent era of gravitational-wave astronomy
NASA Astrophysics Data System (ADS)
Hendry, Martin A.
2015-08-01
Within the next few years a global network of ground-based laser interferometers will become fully operational. These ultra-sensitive instruments are confidently expected to directly detect gravitational waves from astrophysical sources before the end of the decade. In anticipation of opening this entirely new window on the Universe, the LIGO (Laser Interferometer Gravitational Wave Observatory) Scientific Collaboration has recently developed a substantive program of education and public outreach activities that includes exhibitions, documentary films, social media and interactive games - as well as more traditional modes of science communication such as schools and public lectures.As the gravitational wave 'detection era' unfolds over the next decade, it will present exciting challenges for future public engagement by the LIGO Scientific Collaboration and by other gravitational-wave astronomy collaborations around the world. Perhaps the most interesting opportunities will be in the area of citizen science, building upon the infrastructure already being developed through e.g. the LIGO Open Science Center (see arXiv:1410.4839) and the remarkable success of the Einstein@Home project (www.einsteinathome.org).In this presentation I will give an overview of the LSC education and public outreach program, highlighting its goals, major successes and future strategy - particularly in relation to the release of future LIGO and other gravitational wave datasets to the scientific community and to the public, and the opportunities this will present for directly engaging citizen scientists in this exciting new field of observational astronomy.
OpenBiodiv-O: ontology of the OpenBiodiv knowledge management system.
Senderov, Viktor; Simov, Kiril; Franz, Nico; Stoev, Pavel; Catapano, Terry; Agosti, Donat; Sautter, Guido; Morris, Robert A; Penev, Lyubomir
2018-01-18
The biodiversity domain, and in particular biological taxonomy, is moving in the direction of semantization of its research outputs. The present work introduces OpenBiodiv-O, the ontology that serves as the basis of the OpenBiodiv Knowledge Management System. Our intent is to provide an ontology that fills the gaps between ontologies for biodiversity resources, such as DarwinCore-based ontologies, and semantic publishing ontologies, such as the SPAR Ontologies. We bridge this gap by providing an ontology focusing on biological taxonomy. OpenBiodiv-O introduces classes, properties, and axioms in the domains of scholarly biodiversity publishing and biological taxonomy and aligns them with several important domain ontologies (FaBiO, DoCO, DwC, Darwin-SW, NOMEN, ENVO). By doing so, it bridges the ontological gap across scholarly biodiversity publishing and biological taxonomy and allows for the creation of a Linked Open Dataset (LOD) of biodiversity information (a biodiversity knowledge graph) and enables the creation of the OpenBiodiv Knowledge Management System. A key feature of the ontology is that it is an ontology of the scientific process of biological taxonomy and not of any particular state of knowledge. This feature allows it to express a multiplicity of scientific opinions. The resulting OpenBiodiv knowledge system may gain a high level of trust in the scientific community as it does not force a scientific opinion on its users (e.g. practicing taxonomists, library researchers, etc.), but rather provides the tools for experts to encode different views as science progresses. OpenBiodiv-O provides a conceptual model of the structure of a biodiversity publication and the development of related taxonomic concepts. It also serves as the basis for the OpenBiodiv Knowledge Management System.
SIGNUM: A Matlab, TIN-based landscape evolution model
NASA Astrophysics Data System (ADS)
Refice, A.; Giachetta, E.; Capolongo, D.
2012-08-01
Several numerical landscape evolution models (LEMs) have been developed to date, and many are available as open source codes. Most are written in efficient programming languages such as Fortran or C, but often require additional code effort to plug in to more user-friendly data analysis and/or visualization tools to ease interpretation and scientific insight. In this paper, we present an effort to port a common core of accepted physical principles governing landscape evolution directly into a high-level language and data analysis environment such as Matlab. SIGNUM (an acronym for Simple Integrated Geomorphological Numerical Model) is an independent and self-contained Matlab, TIN-based landscape evolution model, built to simulate topography development at various space and time scales. SIGNUM is presently capable of simulating hillslope processes such as linear and nonlinear diffusion, fluvial incision into bedrock, spatially varying surface uplift (which can be used to simulate changes in base level, thrusting and faulting), as well as effects of climate changes. Although based on accepted and well-known processes and algorithms in its present version, it is built with a modular structure that allows the simulated physical processes to be easily modified and upgraded to suit virtually any user's needs. The code is conceived as an open-source project, and is thus an ideal tool for both research and didactic purposes, thanks to the high-level nature of the Matlab environment and its popularity among the scientific community. In this paper the simulation code is presented together with some simple examples of surface evolution, and guidelines for the development of new modules and algorithms are proposed.
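For orientation, the simplest process SIGNUM simulates - linear hillslope diffusion - obeys dz/dt = kappa * laplacian(z). The sketch below steps that equation explicitly on a regular grid purely to illustrate the physics; SIGNUM itself is Matlab- and TIN-based, so none of this is its code, and all names and parameter values are arbitrary.

    # Explicit time-stepping of linear hillslope diffusion on a regular grid
    # (generic illustration, not SIGNUM code; periodic boundaries via np.roll).
    import numpy as np

    nx, ny, dx = 100, 100, 10.0          # grid size and spacing [m]
    kappa, dt, nsteps = 0.01, 50.0, 500  # diffusivity [m^2/yr], time step [yr]
    z = np.random.rand(ny, nx)           # initial rough topography [m]

    for _ in range(nsteps):
        lap = (np.roll(z, 1, 0) + np.roll(z, -1, 0) +
               np.roll(z, 1, 1) + np.roll(z, -1, 1) - 4.0 * z) / dx**2
        z += kappa * dt * lap            # explicit Euler update

    print("relief after smoothing:", z.max() - z.min())

The explicit step is stable here because dt stays well below dx**2/(4*kappa); a TIN-based model like SIGNUM solves the same balance on an irregular mesh.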
NASA Astrophysics Data System (ADS)
Wilson, Cian R.; Spiegelman, Marc; van Keken, Peter E.
2017-02-01
We introduce and describe a new software infrastructure TerraFERMA, the Transparent Finite Element Rapid Model Assembler, for the rapid and reproducible description and solution of coupled multiphysics problems. The design of TerraFERMA is driven by two computational needs in Earth sciences. The first is the need for increased flexibility in both problem description and solution strategies for coupled problems where small changes in model assumptions can lead to dramatic changes in physical behavior. The second is the need for software and models that are more transparent so that results can be verified, reproduced, and modified in a manner such that the best ideas in computation and Earth science can be more easily shared and reused. TerraFERMA leverages three advanced open-source libraries for scientific computation that provide high-level problem description (FEniCS), composable solvers for coupled multiphysics problems (PETSc), and an options handling system (SPuD) that allows the hierarchical management of all model options. TerraFERMA integrates these libraries into an interface that organizes the scientific and computational choices required in a model into a single options file from which a custom compiled application is generated and run. Because all models share the same infrastructure, models become more reusable and reproducible, while still permitting the individual researcher considerable latitude in model construction. TerraFERMA solves partial differential equations using the finite element method. It is particularly well suited for nonlinear problems with complex coupling between components. TerraFERMA is open-source and available at http://terraferma.github.io, which includes links to documentation and example input files.
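TerraFERMA models are declared in an options file rather than coded by hand, but the FEniCS layer it builds on already expresses problems at a very high level. For orientation, a minimal legacy-FEniCS (dolfin) Poisson solve is sketched below; this is plain FEniCS, not TerraFERMA input.

    # Minimal Poisson solve in legacy FEniCS (dolfin), the problem-description
    # library TerraFERMA builds on.
    from dolfin import (Constant, DirichletBC, Function, FunctionSpace,
                        TestFunction, TrialFunction, UnitSquareMesh,
                        dot, dx, grad, solve)

    mesh = UnitSquareMesh(32, 32)
    V = FunctionSpace(mesh, "Lagrange", 1)
    u, v = TrialFunction(V), TestFunction(V)
    a = dot(grad(u), grad(v)) * dx        # weak form of -laplace(u) = f
    L = Constant(1.0) * v * dx            # source term f = 1
    bc = DirichletBC(V, Constant(0.0), "on_boundary")
    u_h = Function(V)
    solve(a == L, u_h, bc)
    print(u_h.vector().max())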
Using Data Maturity Metrics to Help Insure Scientific Integrity
NASA Astrophysics Data System (ADS)
Bates, J. J.
2016-12-01
In the past few years, the full and open sharing of data and other artifacts of research in the geophysical sciences has become mandatory. These mandates include commitments from scientific societies, calls from international organizations, and, in the United States, Executive Orders and legislation for open data. Unfortunately, these calls for open data have had, at best, mixed results. Audits by publishers indicate that most authors who do not comply are unaware of the policies or simply ignore them. A recent high-profile publication on global warming resulted in repeated demands for 'all data' from Congress and a prolonged back and forth over what 'all data' meant. The emerging field of Data Management Maturity is making substantial progress on quantifying the elements needed to fully document geophysical data sets and make them open and accessible. The proposed metrics have been culled from best practices for observational data sets, and these metrics have been applied to a number of geophysical disciplines. A case study applying this metric to climate change indicators will be presented. It is recommended that U.S. agencies and scientific societies formally adopt data maturity metrics to promote a consistent approach to open data and scientific integrity.
PHARAO space atomic clock: new developments on the laser source
NASA Astrophysics Data System (ADS)
Saccoccio, Muriel; Loesel, Jacques; Coatantiec, Claude; Simon, Eric; Laurent, Philippe; Lemonde, Pierre; Maksimovic, I.; Abgrall, M.
2017-11-01
The purpose of the PHARAO project is to open the way for a new generation of atomic clocks in space, where laser-cooling techniques and microgravity allow high frequency stability and accuracy. The French space agency, CNES, is funding and managing the clock's construction. The French SYRTE and LKB laboratories are scientific and technical advisers for the clock requirements and for the follow-up of subsystem development in industrial companies. EADS SODERN is developing two main subsystems of the PHARAO clock: the Laser Source and the Cesium Tube, in which atoms are cooled, launched, selected and detected by laser beams. The Laser Source includes an optical bench and electronic devices to generate the required laser beams. This paper describes PHARAO and the role laser beams play in its principle of operation. We then present the Laser Source design, the technologies involved, and the status of development. Lastly, we focus on a key piece of equipment for reaching the expected performance: the Extended Cavity Laser Diode.
NASA STI Program Coordinating Council Twelfth Meeting: Standards
NASA Technical Reports Server (NTRS)
1994-01-01
The theme of this NASA Scientific and Technical Information Program Coordinating Council Meeting was standards and their formation and application. Topics covered included scientific and technical information architecture, the Open Systems Interconnection Transmission Control Protocol/Internet Protocol, Machine-Readable Cataloging (MARC) open system environment procurement, and the Government Information Locator Service.
GNU Data Language (GDL) - a free and open-source implementation of IDL
NASA Astrophysics Data System (ADS)
Arabas, Sylwester; Schellens, Marc; Coulais, Alain; Gales, Joel; Messmer, Peter
2010-05-01
GNU Data Language (GDL) is developed with the aim of providing an open-source drop-in replacement for ITTVIS's Interactive Data Language (IDL). It is free software developed by an international team of volunteers led by Marc Schellens, the project's founder (a list of contributors is available on the project's website). The development is hosted on SourceForge, where GDL continuously ranks in the 99th percentile of most active projects. GDL with its library routines is designed as a tool for numerical data analysis and visualisation. Like its proprietary counterparts (IDL and PV-WAVE), GDL is used particularly in geosciences and astronomy. GDL is dynamically typed, vectorized, and has object-oriented programming capabilities. The library routines handle numerical calculations, data visualisation, signal/image processing, interaction with the host OS, and data input/output. GDL supports several data formats, such as netCDF, HDF4, HDF5, GRIB, PNG, TIFF and DICOM. Graphical output is handled by X11, PostScript, SVG or z-buffer terminals, the last of which allows output to be saved in a variety of raster graphics formats. GDL is an incremental compiler with integrated debugging facilities. It is written in C++ using the ANTLR language-recognition framework. Most of the library routines are implemented as interfaces to open-source packages such as the GNU Scientific Library, PLplot, FFTW and ImageMagick. GDL features a Python bridge (Python code can be called from GDL; GDL can be compiled as a Python module). Extensions to GDL can be written in C++, GDL, and Python. A number of open software libraries written in IDL, such as the NASA Astronomy Library, MPFIT, CMSVLIB and TeXtoIDL, are fully or partially functional under GDL. Packaged versions of GDL are available for several Linux distributions and Mac OS X. The source code compiles on some other UNIX systems, including BSD and OpenSolaris. The presentation will cover the current status of the project, the key accomplishments, and the weaknesses - areas where contributions and users' feedback are welcome. While still in the beta stage of development, GDL has proved to be a useful tool for classroom work on data analysis; its use in teaching meteorological data processing at the University of Warsaw will serve as an example.
Conversion of HSPF Legacy Model to a Platform-Independent, Open-Source Language
NASA Astrophysics Data System (ADS)
Heaphy, R. T.; Burke, M. P.; Love, J. T.
2015-12-01
Since its initial development over 30 years ago, the Hydrologic Simulation Program - FORTRAN (HSPF) model has been used worldwide to support water-quality planning and management. In the United States, HSPF receives widespread endorsement as a regulatory tool at all levels of government and is a core component of the EPA's Better Assessment Science Integrating Point and Nonpoint Sources (BASINS) system, which was developed to support nationwide Total Maximum Daily Load (TMDL) analysis. However, the model's legacy code and data management systems are limited in their ability to integrate with modern software and hardware and to leverage parallel computing, which has left voids in optimization, pre-processing, and post-processing tools. Advances in technology and in our scientific understanding of environmental processes over the last 30 years mandate that HSPF be upgraded so it can evolve and continue to be a premier tool for water-resource planners. This work aims to mitigate the challenges currently facing HSPF through two primary tasks: (1) converting the code to a modern, widely accepted, open-source, high-performance-computing (HPC) language; and (2) converting the model input and output files to a modern, widely accepted, open-source data model, library, and binary file format. Python was chosen as the new language for the code conversion. It is an interpreted, object-oriented language with dynamic semantics that has become one of the most popular open-source languages. While Python code execution can be slow compared to compiled, statically typed programming languages such as C and FORTRAN, integrating Numba (a just-in-time specializing compiler) has allowed this challenge to be overcome. For the legacy data management conversion, HDF5 was chosen to store the model input and output. The code conversion for HSPF's hydrologic and hydraulic modules has been completed. The converted code has been tested against HSPF's suite of "test" runs and has shown good agreement and similar execution times when using the Numba compiler. Continued verification of the converted code against more complex legacy applications, and improvement of execution times by incorporating an intelligent network change-detection tool, are currently underway; preliminary results will be presented.
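The performance claim rests on Numba's just-in-time compilation of explicit time-stepping loops plus HDF5 storage. A minimal sketch of that combination - a toy single-bucket water balance, not HSPF's actual hydrology, with a hypothetical output file name - might be:

    import numpy as np
    import h5py
    from numba import njit

    @njit
    def bucket_runoff(precip, pet, capacity):
        # Toy water balance: the bucket fills with precipitation, loses
        # evapotranspiration, and spills runoff once storage exceeds capacity.
        runoff = np.empty(precip.size)
        storage = 0.0
        for t in range(precip.size):
            storage += precip[t] - min(pet[t], storage)
            runoff[t] = max(storage - capacity, 0.0)
            storage -= runoff[t]
        return runoff

    rain = np.random.default_rng(0).gamma(0.5, 2.0, size=8760)   # hourly, one year
    flows = bucket_runoff(rain, np.full(8760, 0.1), 50.0)        # first call JIT-compiles

    with h5py.File("hspf_results.h5", "w") as out:               # hypothetical file name
        out.create_dataset("runoff", data=flows)

After the one-time compilation cost on the first call, the loop runs at machine-code speed, which is how the converted HSPF modules can approach the original FORTRAN's execution times.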
A strategy to establish Food Safety Model Repositories.
Plaza-Rodríguez, C; Thoens, C; Falenski, A; Weiser, A A; Appel, B; Kaesbohrer, A; Filter, M
2015-07-02
Transferring the knowledge of predictive microbiology into real world food manufacturing applications is still a major challenge for the whole food safety modelling community. To facilitate this process, a strategy for creating open, community driven and web-based predictive microbial model repositories is proposed. These collaborative model resources could significantly improve the transfer of knowledge from research into commercial and governmental applications and also increase efficiency, transparency and usability of predictive models. To demonstrate the feasibility, predictive models of Salmonella in beef previously published in the scientific literature were re-implemented using an open source software tool called PMM-Lab. The models were made publicly available in a Food Safety Model Repository within the OpenML for Predictive Modelling in Food community project. Three different approaches were used to create new models in the model repositories: (1) all information relevant for model re-implementation is available in a scientific publication, (2) model parameters can be imported from tabular parameter collections and (3) models have to be generated from experimental data or primary model parameters. All three approaches were demonstrated in the paper. The sample Food Safety Model Repository is available via: http://sourceforge.net/projects/microbialmodelingexchange/files/models and the PMM-Lab software can be downloaded from http://sourceforge.net/projects/pmmlab/. This work also illustrates that a standardized information exchange format for predictive microbial models, as the key component of this strategy, could be established by adoption of resources from the Systems Biology domain. Copyright © 2015. Published by Elsevier B.V.
The assessment and evaluation of low-frequency noise near the region of infrasound.
Ziaran, Stanislav
2014-01-01
The main aim of this paper is to present recent knowledge about the assessment and evaluation of low-frequency sounds (noise) and infrasound close to the threshold of hearing, and to identify their potential effects on human health and annoyance. Low-frequency noise generated by air flowing over a moving car with an open window was chosen as a typical scenario that can be subjectively assessed by people traveling by automobile. The principle of noise generation within the interior of the car and its effects on the comfort of the driver and passengers are analyzed at different velocities. An open window of a car at high velocity behaves as a source of particularly strong tonal low-frequency noise, which is generally perceived as annoying. The interior noise generated by an open window of a passenger car was measured under different conditions: driving on a highway and driving on a typical roadway. First, an octave-band analysis was used to assess the noise level and its impact on the driver's comfort. Second, a fast Fourier transform (FFT) analysis and a one-third-octave-band analysis were used for the detection of tonal low-frequency noise. A comparison between cars from two different makers was also made. Finally, the paper suggests some possibilities for scientifically assessing and evaluating low-frequency sounds in general, and some recommendations are introduced for scientific discussion, since sounds with strong low-frequency content (though not only those) engender greater annoyance than is predicted by an A-weighted sound pressure level.
Code of Federal Regulations, 2010 CFR
2010-07-01
... Existing Open Molding Sources, New Open Molding Sources Emitting Less Than 100 TPY of HAP, and New and... CATEGORIES National Emissions Standards for Hazardous Air Pollutants: Reinforced Plastic Composites... Existing Open Molding Sources, New Open Molding Sources Emitting Less Than 100 TPY of HAP, and New and...
PiCO QL: A software library for runtime interactive queries on program data
NASA Astrophysics Data System (ADS)
Fragkoulis, Marios; Spinellis, Diomidis; Louridas, Panos
PiCO QL is open-source C/C++ software whose scientific scope is real-time interactive analysis of in-memory data through SQL queries. It exposes a relational view of a system's or application's data structures, which can be queried through SQL. While the application or system is executing, users can input queries through a web-based interface or issue web-service requests. Queries execute on the live data structures through the respective relational views. PiCO QL is a good candidate for ad hoc data analysis in applications and for diagnostics in systems settings. Applications of PiCO QL include the Linux kernel, the Valgrind instrumentation framework, a GIS application, a virtual real-time observatory of stellar objects, and a source-code analyser.
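PiCO QL itself is C/C++ and maps live data structures into virtual tables without copying; the relational-view idea it implements can be illustrated in Python with the standard sqlite3 module (here the data are copied into an in-memory table, which is the main simplification):

    import sqlite3

    # Snapshot of in-memory application data to expose as a relational view.
    processes = [(1, "init", 0.1), (42, "httpd", 12.7), (99, "postgres", 3.4)]

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE proc (pid INTEGER, name TEXT, cpu REAL)")
    con.executemany("INSERT INTO proc VALUES (?, ?, ?)", processes)

    query = "SELECT name, cpu FROM proc WHERE cpu > 1.0 ORDER BY cpu DESC"
    for name, cpu in con.execute(query):
        print(name, cpu)

The payoff of the SQL interface is that ad hoc diagnostic questions ("which structures exceed a threshold right now?") become one-line queries instead of bespoke traversal code.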
From Big Data to Knowledge in the Social Sciences.
Hesse, Bradford W; Moser, Richard P; Riley, William T
2015-05-01
One of the challenges associated with high-volume, diverse datasets is whether synthesis of open data streams can translate into actionable knowledge. Recognizing that challenge and other issues related to these types of data, the National Institutes of Health developed the Big Data to Knowledge or BD2K initiative. The concept of translating "big data to knowledge" is important to the social and behavioral sciences in several respects. First, a general shift to data-intensive science will exert an influence on all scientific disciplines, but particularly on the behavioral and social sciences given the wealth of behavior and related constructs captured by big data sources. Second, science is itself a social enterprise; by applying principles from the social sciences to the conduct of research, it should be possible to ameliorate some of the systemic problems that plague the scientific enterprise in the age of big data. We explore the feasibility of recalibrating the basic mechanisms of the scientific enterprise so that they are more transparent and cumulative; more integrative and cohesive; and more rapid, relevant, and responsive.
From Big Data to Knowledge in the Social Sciences
Hesse, Bradford W.; Moser, Richard P.; Riley, William T.
2015-01-01
One of the challenges associated with high-volume, diverse datasets is whether synthesis of open data streams can translate into actionable knowledge. Recognizing that challenge and other issues related to these types of data, the National Institutes of Health developed the Big Data to Knowledge or BD2K initiative. The concept of translating “big data to knowledge” is important to the social and behavioral sciences in several respects. First, a general shift to data-intensive science will exert an influence on all scientific disciplines, but particularly on the behavioral and social sciences given the wealth of behavior and related constructs captured by big data sources. Second, science is itself a social enterprise; by applying principles from the social sciences to the conduct of research, it should be possible to ameliorate some of the systemic problems that plague the scientific enterprise in the age of big data. We explore the feasibility of recalibrating the basic mechanisms of the scientific enterprise so that they are more transparent and cumulative; more integrative and cohesive; and more rapid, relevant, and responsive. PMID:26294799
Implementation and use of a highly available and innovative IaaS solution: the Cloud Area Padovana
NASA Astrophysics Data System (ADS)
Aiftimiei, C.; Andreetto, P.; Bertocco, S.; Biasotto, M.; Dal Pra, S.; Costa, F.; Crescente, A.; Dorigo, A.; Fantinel, S.; Fanzago, F.; Frizziero, E.; Gulmini, M.; Michelotto, M.; Sgaravatto, M.; Traldi, S.; Venaruzzo, M.; Verlato, M.; Zangrando, L.
2015-12-01
While in the business world the cloud paradigm is typically implemented by purchasing resources and services from third-party providers (e.g. Amazon), in the scientific environment there is usually a need for on-premises IaaS infrastructures that allow efficient usage of hardware distributed among (and owned by) different scientific administrative domains. In addition, the requirement of open-source adoption has led many organizations to choose products such as OpenStack. We describe a use case of the Italian National Institute for Nuclear Physics (INFN) which resulted in the implementation of a unique cloud service, called 'Cloud Area Padovana', which encompasses resources spread over two different sites: the INFN Legnaro National Laboratories and the INFN Padova division. We describe how this IaaS has been implemented, which technologies have been adopted, and how services have been configured in high-availability (HA) mode. We also discuss how identity and authorization management were implemented, adopting a widely accepted standard architecture based on SAML2 and OpenID: by leveraging the versatility of those standards, integration with authentication federations such as IDEM was implemented. We also discuss some other innovative developments, such as a pluggable scheduler, implemented as an extension of the native OpenStack scheduler, which allows the allocation of resources according to a fair-share model and provides a persistent queuing mechanism for handling user requests that cannot be served immediately. The tools, technologies, and procedures used to install, configure, monitor, and operate this cloud service are also discussed. Finally, we present some examples that show how this IaaS infrastructure is being used.
GPCRdb: an information system for G protein-coupled receptors
Isberg, Vignir; Mordalski, Stefan; Munk, Christian; Rataj, Krzysztof; Harpsøe, Kasper; Hauser, Alexander S.; Vroling, Bas; Bojarski, Andrzej J.; Vriend, Gert; Gloriam, David E.
2016-01-01
Recent developments in G protein-coupled receptor (GPCR) structural biology and pharmacology have greatly enhanced our knowledge of receptor structure-function relations and have helped improve the scientific foundation for drug design studies. The GPCR database, GPCRdb, serves a dual role in disseminating and enabling new scientific developments by providing reference data, analysis tools and interactive diagrams. This paper highlights new features in the fifth major GPCRdb release: (i) GPCR crystal structure browsing, superposition and display of ligand interactions; (ii) direct deposition by users of point mutations and their effects on ligand binding; (iii) refined looks for the snake and helix box residue diagrams; and (iv) phylogenetic trees with receptor classification colour schemes. Under the hood, the entire GPCRdb front- and back-ends have been re-coded within one infrastructure, ensuring a smooth browsing experience and development. GPCRdb is available at http://www.gpcrdb.org/ and its open-source code at https://bitbucket.org/gpcr/protwis. PMID:26582914
Development of a testlet generator in re-engineering the Indonesian physics national-exams
NASA Astrophysics Data System (ADS)
Mindyarto, Budi Naini; Mardapi, Djemari; Bastari
2017-08-01
The Indonesian physics national exams are end-of-course summative assessments that could be utilized to support assessment for learning in physics education. This paper discusses the development and evaluation of a testlet generator based on a re-engineering of the Indonesian physics national exams. The exam problems were dissected and decomposed into testlets that reveal deeper understanding of the underlying physical concepts by inserting a qualitative question and an accompanying scientific-reasoning question. A template-based generator was built to help teachers generate testlet variants that conform better to the development of students' scientific attitudes than the original simple multiple-choice format. The testlet generator was built using open-source software technologies and was evaluated with a focus on black-box testing, exploring the generator's execution, inputs and outputs. The results showed that the developed testlet generator correctly performed its functions of validating inputs, generating testlet variants, and accommodating polytomous item characteristics.
The QuakeSim Project: Web Services for Managing Geophysical Data and Applications
NASA Astrophysics Data System (ADS)
Pierce, Marlon E.; Fox, Geoffrey C.; Aktas, Mehmet S.; Aydin, Galip; Gadgil, Harshawardhan; Qi, Zhigang; Sayar, Ahmet
2008-04-01
We describe our distributed systems research efforts to build the “cyberinfrastructure” components that constitute a geophysical Grid, or more accurately, a Grid of Grids. Service-oriented computing principles are used to build a distributed infrastructure of Web accessible components for accessing data and scientific applications. Our data services fall into two major categories: Archival, database-backed services based around Geographical Information System (GIS) standards from the Open Geospatial Consortium, and streaming services that can be used to filter and route real-time data sources such as Global Positioning System data streams. Execution support services include application execution management services and services for transferring remote files. These data and execution service families are bound together through metadata information and workflow services for service orchestration. Users may access the system through the QuakeSim scientific Web portal, which is built using a portlet component approach.
NASA Astrophysics Data System (ADS)
Seyler, F.; Bonnet, M.-P.; Calmant, S.; Cauhopé, M.; Cazenave, A.; Cochonneau, G.; Divol, J.; Do-Minh, K.; Frappart, F.; Gennero, M.-C.; Guyenne-Blin, K.; Huynh, F.; Leon, J. G.; Mangeas, M.; Mercier, F.; Rocquelain, GH.; Tocqueville, L.; Zanifé, O.-Z.
2006-07-01
CASH (« Contribution of spatial altimetry to hydrology ») aims to define global, standardized, fast and long-term access to a set of hydrological data concerning the greatest river basins in the world. The key questions to be answered are: under what conditions can river water stages be monitored from altimetric radar data, and how can altimetric data be combined with other spatial sources and/or in-situ data to deliver useful parameters to the hydrology community, both scientific and end users? The CASH project ends in mid-May 2006, and many tasks remain before altimetric heights of continental water bodies become part of the day-to-day practice of scientific and end-user hydrologists. The project has nevertheless delineated the way this use could be improved in the near future, and has opened very interesting perspectives for ungauged or poorly gauged large basins in the world.
Scientific Journal Publishing: Yearly Volume and Open Access Availability
ERIC Educational Resources Information Center
Bjork, Bo-Christer; Roos, Annikki; Lauri, Mari
2009-01-01
Introduction: We estimate the total yearly volume of peer-reviewed scientific journal articles published world-wide as well as the share of these articles available openly on the Web either directly or as copies in e-print repositories. Method: We rely on data from two commercial databases (ISI and Ulrich's Periodicals Directory) supplemented by…
Bring NASA Scientific Data into GIS
NASA Astrophysics Data System (ADS)
Xu, H.
2016-12-01
NASA's Earth Observation System (EOS) and many other missions produce data of huge volume in near real time, driving research into and understanding of climate change. A Geographic Information System (GIS) is a technology used for the management, visualization and analysis of spatial data. Since its inception in the 1960s, GIS has been applied to many fields at the city, state, national, and world scales. People continue to use it today to analyze and visualize trends, patterns, and relationships in massive scientific datasets. There is great interest in both the scientific and GIS communities in improving technologies that can bring scientific data into a GIS environment, where scientific research and analysis can be shared through the GIS platform with the public. Most NASA scientific data are delivered in the Hierarchical Data Format (HDF), a format that is both flexible and powerful. However, this flexibility results in challenges when developing GIS software support: data stored in HDF formats lack a unified standard and convention across products. This presentation introduces an information model that enables ArcGIS software to ingest NASA scientific data and create a multidimensional raster - univariate and multivariate hypercubes - for scientific visualization and analysis. We will present the framework by which ArcGIS leverages the open-source GDAL (Geospatial Data Abstraction Library) for raster data access, discuss how we overcame the limitations of the GDAL drivers in handling scientific products stored in HDF4 and HDF5 formats, and explain how we improved the modeling of multidimensionality with GDAL. In addition, we will talk about the direction of ArcGIS in handling NASA products and demonstrate how the multidimensional information model can help scientists work with data products such as MODIS, MOPITT and SMAP, as well as many other data products, in a GIS environment.
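A hedged sketch of the GDAL access pattern described above, using GDAL's Python bindings to enumerate the subdatasets of an HDF4 granule (the file name is hypothetical and subdataset layouts vary by product):

    from osgeo import gdal

    ds = gdal.Open("MOD11A1.A2016001.h10v05.006.hdf")   # hypothetical MODIS granule
    for name, description in ds.GetSubDatasets():       # HDF containers expose subdatasets
        print(name, "->", description)

    first = gdal.Open(ds.GetSubDatasets()[0][0])        # open one subdataset by name
    array = first.ReadAsArray()                         # NumPy array, ready for GIS analysis

The challenge the abstract alludes to is that each product names and organizes these subdatasets differently, so an information model on top of GDAL has to map them onto a consistent multidimensional raster.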
nSTAT: Open-Source Neural Spike Train Analysis Toolbox for Matlab
Cajigas, I.; Malik, W.Q.; Brown, E.N.
2012-01-01
Over the last decade there has been a tremendous advance in the analytical tools available to neuroscientists to understand and model neural function. In particular, the point-process generalized linear model (PP-GLM) framework has been applied successfully to problems ranging from neuro-endocrine physiology to neural decoding. However, the lack of freely distributed software implementations of published PP-GLM algorithms, together with the problem-specific modifications required for their use, limits wide application of these techniques. In an effort to make existing PP-GLM methods more accessible to the neuroscience community, we have developed nSTAT - an open-source neural spike train analysis toolbox for Matlab®. By adopting an object-oriented programming (OOP) approach, nSTAT allows users to easily manipulate data by performing operations on objects that have an intuitive connection to the experiment (spike trains, covariates, etc.), rather than by dealing with data in vector/matrix form. The algorithms implemented within nSTAT address a number of common problems, including computation of peri-stimulus time histograms, quantification of the temporal response properties of neurons, and characterization of neural plasticity within and across trials. nSTAT provides a starting point for exploratory data analysis, allows simple and systematic building and testing of point-process models, and supports decoding of stimulus variables based on point-process models of neural function. By providing an open-source toolbox, we hope to establish a platform that can be easily used, modified, and extended by the scientific community to address limitations of current techniques and to extend available techniques to more complex problems. PMID:22981419
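nSTAT itself is a Matlab toolbox; the PP-GLM idea at its core can be sketched in Python with statsmodels by fitting a Poisson GLM to binned spike counts (the data are synthetic and the covariate and rates are invented for illustration):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    stim = rng.normal(size=5000)                        # hypothetical stimulus covariate
    counts = rng.poisson(np.exp(-3.0 + 0.8 * stim))     # synthetic spike counts per bin

    X = sm.add_constant(stim)                           # design matrix: intercept + stimulus
    fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
    print(fit.params)                                   # recovers roughly (-3.0, 0.8)

The log-link Poisson regression is the discrete-time workhorse of the PP-GLM framework: the fitted coefficients describe how covariates modulate the neuron's conditional intensity.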
Gimli: open source and high-performance biomedical name recognition
2013-01-01
Background: Automatic recognition of biomedical names is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. In recent years, various solutions have been implemented to tackle this problem. However, limitations regarding system characteristics, customization and usability still hinder their wider application outside text-mining research. Results: We present Gimli, an open-source, state-of-the-art tool for automatic recognition of biomedical names. Gimli includes an extended set of implemented and user-selectable features, such as orthographic, morphological, linguistic, conjunction-based and dictionary-based features. A simple and fast method to combine different trained models is also provided. Gimli achieves an F-measure of 87.17% on the GENETAG corpus and 72.23% on the JNLPBA corpus, significantly outperforming existing open-source solutions. Conclusions: Gimli is an off-the-shelf, ready-to-use tool for named-entity recognition, providing trained and optimized models for recognition of biomedical entities from scientific text. It can be used as a command-line tool, offering full functionality, including training of new models and customization of the feature set and model parameters through a configuration file. Advanced users can integrate Gimli into their text-mining workflows through the provided library, and extend or adapt its functionalities. Based on the underlying system characteristics and functionality, both for final users and developers, and on the reported performance results, we believe that Gimli is a state-of-the-art solution for biomedical NER, contributing to faster and better research in the field. Gimli is freely available at http://bioinformatics.ua.pt/gimli. PMID:23413997
NASA Astrophysics Data System (ADS)
Slaughter, A. E.; Permann, C.; Peterson, J. W.; Gaston, D.; Andrs, D.; Miller, J.
2014-12-01
The Idaho National Laboratory (INL)-developed Multiphysics Object Oriented Simulation Environment (MOOSE; www.mooseframework.org), is an open-source, parallel computational framework for enabling the solution of complex, fully implicit multiphysics systems. MOOSE provides a set of computational tools that scientists and engineers can use to create sophisticated multiphysics simulations. Applications built using MOOSE have computed solutions for chemical reaction and transport equations, computational fluid dynamics, solid mechanics, heat conduction, mesoscale materials modeling, geomechanics, and others. To facilitate the coupling of diverse and highly-coupled physical systems, MOOSE employs the Jacobian-free Newton-Krylov (JFNK) method when solving the coupled nonlinear systems of equations arising in multiphysics applications. The MOOSE framework is written in C++, and leverages other high-quality, open-source scientific software packages such as LibMesh, Hypre, and PETSc. MOOSE uses a "hybrid parallel" model which combines both shared memory (thread-based) and distributed memory (MPI-based) parallelism to ensure efficient resource utilization on a wide range of computational hardware. MOOSE-based applications are inherently modular, which allows for simulation expansion (via coupling of additional physics modules) and the creation of multi-scale simulations. Any application developed with MOOSE supports running (in parallel) any other MOOSE-based application. Each application can be developed independently, yet easily communicate with other applications (e.g., conductivity in a slope-scale model could be a constant input, or a complete phase-field micro-structure simulation) without additional code being written. This method of development has proven effective at INL and expedites the development of sophisticated, sustainable, and collaborative simulation tools.
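MOOSE's JFNK solution strategy (via PETSc) can be illustrated in miniature with SciPy's Jacobian-free Newton-Krylov solver on a toy coupled nonlinear residual; this is a sketch of the numerical method only, not of MOOSE's C++ API:

    import numpy as np
    from scipy.optimize import newton_krylov

    def residual(u):
        # Toy stand-in for a coupled multiphysics residual: two 50-point
        # fields linked through nonlinear reaction terms.
        a, b = u[:50], u[50:]
        ra = a - 0.1 * np.roll(a, 1) - 0.05 * b**2 - 1.0
        rb = b - 0.1 * np.roll(b, -1) + 0.05 * a * b - 0.5
        return np.concatenate([ra, rb])

    # Jacobian-free Newton-Krylov: only residual evaluations are needed;
    # Jacobian-vector products are approximated by finite differences.
    solution = newton_krylov(residual, np.zeros(100), f_tol=1e-10)
    print(solution[:3], solution[50:53])

Because the Jacobian is never formed explicitly, adding a new coupled physics module only requires contributing its residual terms, which is what makes the fully implicit coupling in MOOSE practical.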
Willighagen, Egon L; Mayfield, John W; Alvarsson, Jonathan; Berg, Arvid; Carlsson, Lars; Jeliazkova, Nina; Kuhn, Stefan; Pluskal, Tomáš; Rojas-Chertó, Miquel; Spjuth, Ola; Torrance, Gilleain; Evelo, Chris T; Guha, Rajarshi; Steinbeck, Christoph
2017-06-06
The Chemistry Development Kit (CDK) is a widely used open-source cheminformatics toolkit, providing data structures to represent chemical concepts along with methods to manipulate such structures and perform computations on them. The library implements a wide variety of cheminformatics algorithms, ranging from chemical-structure canonicalization to molecular-descriptor calculations and pharmacophore perception. It is used in drug discovery, metabolomics, and toxicology. Over the last 10 years, however, the code base has grown significantly, resulting in many complex interdependencies among components and poor performance of many algorithms. We report improvements to CDK v2.0 since the v1.2 release series, specifically addressing the increased functional complexity and poor performance. We first summarize the addition of new functionality, such as atom typing and molecular-formula handling, and improvements to existing functionality that have led to significantly better performance for substructure searching, molecular fingerprints, and rendering of molecules. Second, we outline how the CDK has evolved with respect to quality control and the approaches we have adopted to ensure stability, including a code-review mechanism. This paper highlights our continued efforts to provide a community-driven, open-source cheminformatics library, and shows that such collaborative projects can thrive over extended periods of time, resulting in a high-quality and performant library. By taking advantage of community support and contributions, we show that an open-source cheminformatics project can act as a peer-reviewed publishing platform for scientific computing software. Graphical abstract: CDK 2.0 provides new features and improved performance.
PyMICE: A Python library for analysis of IntelliCage data.
Dzik, Jakub M; Puścian, Alicja; Mijakowska, Zofia; Radwanska, Kasia; Łęski, Szymon
2018-04-01
IntelliCage is an automated system for recording the behavior of a group of mice housed together. It produces rich, detailed behavioral data calling for new methods and software for their analysis. Here we present PyMICE, a free and open-source library for analysis of IntelliCage data in the Python programming language. We describe the design and demonstrate the use of the library through a series of examples. PyMICE provides easy and intuitive access to IntelliCage data, and thus facilitates the use of numerous other Python scientific libraries to form a complete data-analysis workflow.
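A minimal usage sketch, assuming the Loader/getVisits interface shown in the PyMICE documentation and a hypothetical data-archive name:

    import pymice as pm

    ml = pm.Loader("2012-08-28 13.44.51.zip")   # hypothetical IntelliCage session archive
    visits = ml.getVisits(order="Start")        # corner visits in chronological order
    for visit in visits[:5]:
        print(visit.Start, visit.Corner)        # timestamps and visited corners

From here the visit records can be handed to pandas, NumPy or other scientific libraries, which is the workflow integration the abstract emphasizes.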
Can Routine Commercial Cord Blood Banking Be Scientifically and Ethically Justified?
Fisk, Nicholas M; Roberts, Irene A. G; Markwald, Roger; Mironov, Vladimir
2005-01-01
Background to the debate: Umbilical cord blood—the blood that remains in the placenta after birth—can be collected and stored frozen for years. A well-accepted use of cord blood is as an alternative to bone marrow as a source of hematopoietic stem cells for allogeneic transplantation to siblings or to unrelated recipients; women can donate cord blood for unrelated recipients to public banks. However, private banks are now open that offer expectant parents the option to pay a fee for the chance to store cord blood for possible future use by that same child (autologous transplantation). PMID:15737000
Can routine commercial cord blood banking be scientifically and ethically justified?
Fisk, Nicholas M; Roberts, Irene A G; Markwald, Roger; Mironov, Vladimir
2005-02-01
Umbilical cord blood--the blood that remains in the placenta after birth--can be collected and stored frozen for years. A well-accepted use of cord blood is as an alternative to bone marrow as a source of hematopoietic stem cells for allogeneic transplantation to siblings or to unrelated recipients; women can donate cord blood for unrelated recipients to public banks. However, private banks are now open that offer expectant parents the option to pay a fee for the chance to store cord blood for possible future use by that same child (autologous transplantation).
Growing Cutting-edge X-ray Optics
Conley, Ray
2018-03-02
Ever imagined that an Xbox controller could help open a window into a world spanning just one billionth of a meter? Brookhaven Lab's Ray Conley grows cutting-edge optics called multilayer Laue lenses (MLL) one atomic layer at a time to focus high-energy x-rays to within a single nanometer. To achieve this focusing feat, Ray uses a massive, custom-built atomic deposition device, an array of computers, and a trusty Xbox controller. These lenses will be deployed at the Lab's National Synchrotron Light Source II, due to begin shining super-bright light on pressing scientific puzzles in 2015.
A Case for Crowd Sourcing in Stem Cell Research
Mummery, Christine L.; Rabelink, Ton J.
2014-01-01
Summary: Thousands of patients and placebo-treated controls have been included in many clinical trials of stem cell therapy over the last decade or so, but often the study groups have been small. Their scientific value may therefore be limited and their ethical justification questionable. Would “crowd sourcing” for data sharing be a means of increasing the collective value of clinical trials? Here, we make a case for open access to all data emerging from stem cell studies (trials but also observational studies), independent of whether they are investigator-initiated or commercially driven. PMID:25232185
Contribution to the Understanding of Particle Motion Perception in Marine Invertebrates.
André, Michel; Kaifu, Kenzo; Solé, Marta; van der Schaar, Mike; Akamatsu, Tomonari; Balastegui, Andreu; Sánchez, Antonio M; Castell, Joan V
2016-01-01
Marine invertebrates potentially represent a group of species whose ecology may be influenced by artificial noise. Exposure to anthropogenic sound sources could have a direct consequence on the functionality and sensitivity of their sensory organs, the statocysts, which are responsible for their equilibrium and movements in the water column. The availability of novel laser Doppler vibrometer techniques has recently opened the possibility of measuring whole body (distance, velocity, and acceleration) vibration as a direct stimulus eliciting statocyst response, offering the scientific community a new level of understanding of the marine invertebrate hearing mechanism.
Redlarski, Grzegorz; Lewczuk, Bogdan; Żak, Arkadiusz; Koncicki, Andrzej; Krawczuk, Marek; Piechocki, Janusz; Jakubiuk, Kazimierz; Tojza, Piotr; Jaworski, Jacek; Ambroziak, Dominik; Skarbek, Łukasz; Gradolewski, Dawid
2015-01-01
Current technologies have become a source of omnipresent electromagnetic pollution from generated electromagnetic fields and the resulting electromagnetic radiation. In many cases this pollution is much stronger than any natural source of electromagnetic fields or radiation. The harm caused by this pollution remains an open question, since there is no clear and definitive evidence of its negative influence on humans, despite the fact that extremely low frequency electromagnetic fields have been classified as potentially carcinogenic. For these reasons, recent decades have seen significant growth in scientific research aimed at understanding the influence of electromagnetic radiation on living organisms. For this type of research, however, the appropriate selection of relevant model organisms is of great importance. It should be noted that the great majority of scientific papers published in this field concern tests performed on mammals, practically neglecting lower organisms. In that context, the objective of this paper is to systematise our knowledge in this area, in which the influence of electromagnetic radiation on lower organisms has been investigated, including the bacteria E. coli and B. subtilis, the nematode Caenorhabditis elegans, the land snail Helix pomatia, the common fruit fly Drosophila melanogaster, and the clawed frog Xenopus laevis.
Żak, Arkadiusz; Koncicki, Andrzej; Piechocki, Janusz; Jakubiuk, Kazimierz; Tojza, Piotr; Jaworski, Jacek; Ambroziak, Dominik; Skarbek, Łukasz
2015-01-01
Current technologies have become a source of omnipresent electromagnetic pollution from generated electromagnetic fields and the resulting electromagnetic radiation. In many cases this pollution is much stronger than any natural source of electromagnetic fields or radiation. The harm caused by this pollution remains an open question, since there is no clear and definitive evidence of its negative influence on humans, despite the fact that extremely low frequency electromagnetic fields have been classified as potentially carcinogenic. For these reasons, recent decades have seen significant growth in scientific research aimed at understanding the influence of electromagnetic radiation on living organisms. For this type of research, however, the appropriate selection of relevant model organisms is of great importance. It should be noted that the great majority of scientific papers published in this field concern tests performed on mammals, practically neglecting lower organisms. In that context, the objective of this paper is to systematise our knowledge in this area, in which the influence of electromagnetic radiation on lower organisms has been investigated, including the bacteria E. coli and B. subtilis, the nematode Caenorhabditis elegans, the land snail Helix pomatia, the common fruit fly Drosophila melanogaster, and the clawed frog Xenopus laevis. PMID:25811025
Practices in NASA's EOSDIS to Promote Open Data and Research Integrity
NASA Astrophysics Data System (ADS)
Behnke, J.; Ramapriyan, H.
2017-12-01
The purpose of this paper is to highlight the key practices adopted by NASA in its Earth Observing System Data and Information System (EOSDIS) to promote and facilitate open data and research integrity. EOSDIS is the system that manages most of NASA's Earth science data from various sources - satellites, aircraft, field campaigns and some research projects. Since its inception in 1990 as a part of the Earth Observing System (EOS) Program, EOSDIS has been following NASA's free and open data and information policy, whereby data are shared with all users on a non-discriminatory basis and are provided at no cost. To ensure that the data are discoverable and accessible to the user community, NASA follows an evolutionary development approach, whereby the latest technologies that can be practically adopted are infused into EOSDIS. This results in continuous improvements in system capabilities, such that technologies users are accustomed to in other environments are brought to bear in their access to NASA's Earth observation data. Mechanisms have long existed for ensuring that the data products offered by EOSDIS are vetted by the community before they are released. Information about data products, such as Algorithm Theoretical Basis Documents and quality assessments, is openly available with the products. The EOSDIS Distributed Active Archive Centers (DAACs) work with the science teams responsible for product generation to assist with proper use of metadata. The DAACs have knowledgeable staff to answer users' questions and have access to scientific experts as needed. Citation of data products in scientific papers is facilitated by the assignment of Digital Object Identifiers (DOIs) - at present, over 50% of data products in EOSDIS have been assigned DOIs. NASA gathers and publishes citation metrics for the datasets offered by the DAACs. Through its Software and Services Citations Working Group, NASA is currently investigating broadening DOI assignments to promote greater provenance traceability. NASA has developed Preservation Content Specifications for Earth science data to ensure that provenance and context are captured and preserved for the future, and is applying them to data and information from its missions. All of these actions make information available that supports integrity in scientific research.
77 FR 62523 - Scientific Earthquake Studies Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2012-10-15
... DEPARTMENT OF THE INTERIOR Geological Survey [GX13GG009950000] Scientific Earthquake Studies... Law 106-503, the Scientific Earthquake Studies Advisory Committee (SESAC) will hold its next meeting... the Scientific Earthquake Studies Advisory Committee are open to the public. DATES: October 29, 2012...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maingi, Rajesh; Zinkle, Steven J.; Foster, Mark S.
2015-05-01
The realization of controlled thermonuclear fusion as an energy source would transform society, providing a nearly limitless energy source with renewable fuel. Under the auspices of the U.S. Department of Energy, the Fusion Energy Sciences (FES) program management recently launched a series of technical workshops to “seek community engagement and input for future program planning activities” in the targeted areas of (1) Integrated Simulation for Magnetic Fusion Energy Sciences, (2) Control of Transients, (3) Plasma Science Frontiers, and (4) Plasma-Materials Interactions aka Plasma-Materials Interface (PMI). Over the past decade, a number of strategic planning activities [1-6] have highlighted PMI and plasma-facing components as a major knowledge gap, which should be a priority for fusion research towards ITER and future demonstration fusion energy systems. There is a strong international consensus that new PMI solutions are required in order for fusion to advance beyond ITER. The goal of the 2015 PMI community workshop was to review recent innovations and improvements in understanding the challenging PMI issues, identify high-priority scientific challenges in PMI, and to discuss potential options to address those challenges. The community response to the PMI research assessment was enthusiastic, with over 80 participants involved in the open workshop held at Princeton Plasma Physics Laboratory on May 4-7, 2015. The workshop provided a useful forum for the scientific community to review progress in scientific understanding achieved during the past decade, and to openly discuss high-priority unresolved research questions. One of the key outcomes of the workshop was a focused set of community-initiated Priority Research Directions (PRDs) for PMI. Five PRDs were identified, labeled A-E, which represent community consensus on the most urgent near-term PMI scientific issues. For each PRD, an assessment was made of the scientific challenges, as well as a set of actions to address those challenges. No prioritization was attempted amongst these five PRDs. We note that ITER, an international collaborative project to substantially extend fusion science and technology, is implicitly a driver and beneficiary of the research described in these PRDs; specific ITER issues are discussed in the background and PRD chapters. For succinctness, we describe these PRDs directly below; a brief introduction to magnetic fusion and the workshop process/timeline is given in Chapter I, and panelists are listed in the Appendix.
The research data alliance photon and neutron science interest group
Boehnlein, Amber; Matthews, Brian; Proffen, Thomas; ...
2015-04-01
Scientific research data provides unique challenges that are distinct from classic “Big Data” sources. One common element in research data is that the experiment, observations, or simulation were designed, and data were specifically acquired, to shed light on an open scientific question. The data and methods are usually “owned” by the researcher(s) and the data itself might not be viewed to have long-term scientific significance after the results have been published. Often, the data volume was relatively low, with data sometimes easier to reproduce than to catalog and store. Some data and meta-data were not collected in a digital form, or were stored on antiquated or obsolete media. Generally speaking, policies, tools, and management of digital research data have reflected an ad hoc approach that varies domain by domain and research group by research group. This model, which treats research data as disposable, is proving to be a serious limitation as the volume and complexity of research data explodes. Changes are required at every level of scientific research: within the individual groups, and across scientific domains and interdisciplinary collaborations. Enabling researchers to learn about available tools, processes, and procedures should encourage a spirit of cooperation and collaboration, allowing researchers to come together for the common good. In conclusion, these community-oriented efforts provide the potential for targeted projects with high impact.
The research data alliance photon and neutron science interest group
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boehnlein, Amber; Matthews, Brian; Proffen, Thomas
Scientific research data provides unique challenges that are distinct from classic “Big Data” sources. One common element in research data is that the experiment, observations, or simulation were designed, and data were specifically acquired, to shed light on an open scientific question. The data and methods are usually “owned” by the researcher(s) and the data itself might not be viewed to have long-term scientific significance after the results have been published. Often, the data volume was relatively low, with data sometimes easier to reproduce than to catalog and store. Some data and meta-data were not collected in a digital form, or were stored on antiquated or obsolete media. Generally speaking, policies, tools, and management of digital research data have reflected an ad hoc approach that varies domain by domain and research group by research group. This model, which treats research data as disposable, is proving to be a serious limitation as the volume and complexity of research data explodes. Changes are required at every level of scientific research: within the individual groups, and across scientific domains and interdisciplinary collaborations. Enabling researchers to learn about available tools, processes, and procedures should encourage a spirit of cooperation and collaboration, allowing researchers to come together for the common good. In conclusion, these community-oriented efforts provide the potential for targeted projects with high impact.
The Emergence of Open-Source Software in China
ERIC Educational Resources Information Center
Pan, Guohua; Bonk, Curtis J.
2007-01-01
The open-source software movement is gaining increasing momentum in China. Of the limited numbers of open-source software in China, "Red Flag Linux" stands out most strikingly, commanding 30 percent share of Chinese software market. Unlike the spontaneity of open-source movement in North America, open-source software development in…
A Study of Clinically Related Open Source Software Projects
Hogarth, Michael A.; Turner, Stuart
2005-01-01
Open source software development has recently gained significant interest due to several successful mainstream open source projects. This methodology has been proposed as being similarly viable and beneficial in the clinical application domain as well. However, the clinical software development venue differs significantly from the mainstream software venue. Existing clinical open source projects have not been well characterized nor formally studied so the ‘fit’ of open source in this domain is largely unknown. In order to better understand the open source movement in the clinical application domain, we undertook a study of existing open source clinical projects. In this study we sought to characterize and classify existing clinical open source projects and to determine metrics for their viability. This study revealed several findings which we believe could guide the healthcare community in its quest for successful open source clinical software projects. PMID:16779056
Salamone, Francesco; Belussi, Lorenzo; Danza, Ludovico; Ghellere, Matteo; Meroni, Italo
2015-01-01
The Indoor Environmental Quality (IEQ) refers to the quality of the environment in relation to the health and well-being of the occupants. It is a holistic concept, which considers several categories, each related to a specific environmental parameter. This article describes a low-cost and open-source hardware architecture able to detect the indoor variables necessary for the IEQ calculation as an alternative to the traditional hardware used for this purpose. The system consists of some sensors and an Arduino board. One of the key strengths of Arduino is the possibility it affords of loading the script into the board’s memory and letting it run without interfacing with computers, thus granting complete independence, portability and accuracy. Recent works have demonstrated that the cost of scientific equipment can be reduced by applying open-source principles to their design using a combination of the Arduino platform and a 3D printer. The evolution of the 3D printer has provided a new means of open design capable of accelerating self-directed development. The proposed nano Environmental Monitoring System (nEMoS) instrument is shown to have good reliability and it provides the foundation for a more critical approach to the use of professional sensors as well as for conceiving new scenarios and potential applications. PMID:26053749
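How the host computer consumes such a sensor stream depends on the sketch loaded on the board; assuming a hypothetical Arduino program that prints comma-separated temperature, humidity and illuminance readings once per second, a Python reader using pyserial could be:

    import serial  # the pyserial package

    # Assumed protocol: the board prints "temperature,humidity,lux\n" each second.
    port = serial.Serial("/dev/ttyACM0", 9600, timeout=2)   # hypothetical port name
    for _ in range(10):
        line = port.readline().decode("ascii", errors="ignore").strip()
        if line:
            temperature, humidity, lux = (float(v) for v in line.split(","))
            print(temperature, humidity, lux)
    port.close()

As the abstract notes, the board can equally run standalone with the script in its memory; the serial link is only needed when logging readings to a computer for the IEQ calculation.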
DCO-VIVO: A Collaborative Data Platform for the Deep Carbon Science Communities
NASA Astrophysics Data System (ADS)
Wang, H.; Chen, Y.; West, P.; Erickson, J. S.; Ma, X.; Fox, P. A.
2014-12-01
Deep Carbon Observatory (DCO) is a decade-long scientific endeavor to understand carbon in the complex deep Earth system. Thousands of DCO scientists from institutions across the globe are organized into communities representing four domains of exploration: Extreme Physics and Chemistry, Reservoirs and Fluxes, Deep Energy, and Deep Life. Cross-community and cross-disciplinary collaboration is one of the most distinctive features of DCO's flexible research framework. VIVO is an open-source Semantic Web platform that facilitates cross-institutional researcher and research discovery. It includes a number of standard ontologies that interconnect people, organizations, publications, activities, locations, and other entities of research interest to enable browsing, searching, visualizing, and generating Linked Open (research) Data. The DCO-VIVO solution expedites research collaboration between DCO scientists and communities. Based on DCO's specific requirements, the DCO Data Science team developed a series of extensions to the VIVO platform, including extensions to the VIVO information model, extended queries over the semantic information within VIVO, integration with other open-source collaborative environments and data management systems, single sign-on, the assignment of unique Handles to DCO objects, and publication- and dataset-ingest extensions using existing publication systems. We present here the iterative development of these capabilities, which are now in daily use by the DCO community of scientists for research reporting, information sharing, and resource discovery in support of research activities and program management.
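Because VIVO publishes its content as Linked Open Data, project information can be retrieved with a SPARQL query; a sketch using the SPARQLWrapper package follows (the endpoint URL and the exact vocabulary are assumptions for illustration, not documented DCO-VIVO details):

    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("https://info.deepcarbon.net/sparql")  # assumed endpoint
    sparql.setQuery("""
        SELECT ?person ?name WHERE {
          ?person a <http://xmlns.com/foaf/0.1/Person> ;
                  <http://www.w3.org/2000/01/rdf-schema#label> ?name .
        } LIMIT 10
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    for binding in results["results"]["bindings"]:
        print(binding["name"]["value"])

Queries like this one are what make the researcher and resource metadata in VIVO machine-discoverable rather than only browsable.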
Salamone, Francesco; Belussi, Lorenzo; Danza, Ludovico; Ghellere, Matteo; Meroni, Italo
2015-06-04
The Indoor Environmental Quality (IEQ) refers to the quality of the environment in relation to the health and well-being of the occupants. It is a holistic concept, which considers several categories, each related to a specific environmental parameter. This article describes a low-cost and open-source hardware architecture able to detect the indoor variables necessary for the IEQ calculation as an alternative to the traditional hardware used for this purpose. The system consists of some sensors and an Arduino board. One of the key strengths of Arduino is the possibility it affords of loading the script into the board's memory and letting it run without interfacing with computers, thus granting complete independence, portability and accuracy. Recent works have demonstrated that the cost of scientific equipment can be reduced by applying open-source principles to their design using a combination of the Arduino platform and a 3D printer. The evolution of the 3D printer has provided a new means of open design capable of accelerating self-directed development. The proposed nano Environmental Monitoring System (nEMoS) instrument is shown to have good reliability and it provides the foundation for a more critical approach to the use of professional sensors as well as for conceiving new scenarios and potential applications.
NASA Astrophysics Data System (ADS)
Löwe, Peter; Klump, Jens; Robertson, Jesse
2015-04-01
Text mining is commonly employed as a tool in data science to investigate and chart emergent information from corpora of research abstracts, such as the Geophysical Research Abstracts (GRA) published by Copernicus. In this context, current standards such as persistent identifiers (DOI, ORCID) allow us to trace, cite and map links between journal publications, the underlying research data and scientific software. This network can be expressed as a directed graph, which enables us to chart networks of cooperation and innovation, thematic foci, and the locations of research communities in time and space. However, this approach to data science, focusing on the research process in a self-referential manner rather than on the topical work, is still at a developing stage. Scientific work presented at the EGU General Assembly is often the first step towards new approaches and innovative ideas in the geospatial community. It represents a rich, deep and heterogeneous source of geoscientific thought. This corpus is a significant data source for data science that has not been analysed on this scale previously. In this work, the corpus of the Geophysical Research Abstracts is used for the first time as a database for topical text-mining analyses. For this, we used a sturdy and customizable software framework based on the work of Schmitt et al. [1]. For the analysis we used the High Performance Computing infrastructure of the German Research Centre for Geosciences GFZ in Potsdam, Germany. Here, we report first results from an analysis of the continuing spread of Free and Open Source Software (FOSS) tools within the EGU communities, mapping the general increase of FOSS-themed GRA articles in the last decade and the developing spatial patterns of the parties and FOSS topics involved. References: [1] Schmitt, L. M., Christianson, K. T., Gupta, R.: Linguistic Computing with UNIX Tools, in Kao, A., Poteet, S. R. (Eds.): Natural Language Processing and Text Mining, Springer, 2007. doi:10.1007/978-1-84628-754-1_12.
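The yearly trend analysis described here boils down to counting pattern matches per year; a minimal Python sketch, assuming a hypothetical corpus file with one tab-separated year and abstract per line, could be:

    import re
    from collections import Counter

    FOSS = re.compile(r"open[- ]source|free software|\bFOSS\b", re.IGNORECASE)
    per_year = Counter()

    with open("gra_abstracts.tsv", encoding="utf-8") as corpus:  # hypothetical layout
        for line in corpus:
            year, _, text = line.partition("\t")
            if FOSS.search(text):
                per_year[year] += 1

    for year in sorted(per_year):
        print(year, per_year[year])

A production pipeline on an HPC cluster would add tokenization, deduplication and parallel processing, but the core measurement is this simple frequency count over time.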
75 FR 2159 - Scientific Earthquake Studies Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-14
... DEPARTMENT OF THE INTERIOR U.S. Geological Survey Scientific Earthquake Studies Advisory Committee... Scientific Earthquake Studies Advisory Committee (SESAC) will hold its next meeting at the U.S. Geological.... Meetings of the Scientific Earthquake Studies Advisory Committee are open to the public. DATES: January 26...
76 FR 61113 - Scientific Earthquake Studies Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-03
... DEPARTMENT OF THE INTERIOR U.S. Geological Survey Scientific Earthquake Studies Advisory Committee...-503, the Scientific Earthquake Studies Advisory Committee (SESAC) will hold its next meeting at the.... Meetings of the Scientific Earthquake Studies Advisory Committee are open to the public. DATES: November 2...
78 FR 19004 - Scientific Earthquake Studies Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2013-03-28
... DEPARTMENT OF THE INTERIOR U.S. Geological Survey [GX13GG009950000] Scientific Earthquake Studies... Law 106-503, the Scientific Earthquake Studies Advisory Committee (SESAC) will hold its next meeting... international activities. Meetings of the Scientific Earthquake Studies Advisory Committee are open to the...
77 FR 12323 - Scientific Earthquake Studies Advisory Committee
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-29
... DEPARTMENT OF THE INTERIOR U.S. Geological Survey Scientific Earthquake Studies Advisory Committee...-503, the Scientific Earthquake Studies Advisory Committee (SESAC) will hold its next meeting at the.... Meetings of the Scientific Earthquake Studies Advisory Committee are open to the public. DATES: March 29...
NASA Astrophysics Data System (ADS)
Fraser, Ryan; Gross, Lutz; Wyborn, Lesley; Evans, Ben; Klump, Jens
2015-04-01
Recent investments in HPC, cloud and petascale data stores have dramatically increased the scale and resolution at which earth science challenges can now be tackled. These new infrastructures are highly parallelised, and to fully utilise them and access the large volumes of earth science data now available, a new approach to software stack engineering needs to be developed. The size, complexity and cost of the new infrastructures mean any software deployed has to be reliable, trusted and reusable. Increasingly, software is available via open source repositories, but these usually only enable code to be discovered and downloaded. It is hard for a scientist to judge the suitability and quality of individual codes: rarely is there information on how and where codes can be run, what the critical dependencies are, and in particular, on the version requirements and licensing of the underlying software stack. A trusted software framework is proposed to enable reliable software to be discovered, accessed and then deployed on multiple hardware environments. More specifically, this framework will enable those who generate the software, and those who fund its development, to gain credit for the effort, IP, time and dollars spent, and will facilitate quantification of the impact of individual codes. For scientific users, the framework delivers reviewed and benchmarked scientific software with mechanisms to reproduce results. The trusted framework will have five separate but connected components: Register, Review, Reference, Run, and Repeat. 1) The Register component will facilitate discovery of relevant software from multiple open source code repositories. The registration process should include information about licensing and the hardware environments the code can run on, define appropriate validation (testing) procedures, and list the critical dependencies (a sketch of such a record follows below). 2) The Review component targets verification of the software, typically against a set of benchmark cases. This will be achieved by linking the code in the software framework to peer review forums such as Mozilla Science or appropriate journals (e.g. the Geoscientific Model Development journal) to help users know which codes to trust. 3) Referencing will be accomplished by linking the software framework to groups such as Figshare or ImpactStory that help disseminate and measure the impact of scientific research, including program code. 4) The Run component will draw on information supplied in the registration process, benchmark cases described in the review, and other relevant information to instantiate the scientific code on the selected environment. 5) The Repeat component will tap into existing provenance workflow engines that automatically capture information relating to a particular run of the software, including identification of all input and output artefacts and all elements and transactions within that workflow. The proposed trusted software framework will enable users to rapidly discover and access reliable code, reduce the time to deploy it, and greatly facilitate sharing, reuse and reinstallation of code. Properly designed, it could scale out to massively parallel systems and be accessed nationally and internationally for multiple use cases, including supercomputer centres, cloud facilities, and local computers.
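As an illustration of the metadata the Register component would need to capture, the sketch below defines a hypothetical registration record in Python; all field names and values are assumptions, since the abstract does not specify a schema.

    # Hypothetical registration record for the proposed Register component.
    from dataclasses import dataclass, field

    @dataclass
    class SoftwareRegistration:
        name: str
        repository_url: str
        license: str
        hardware_environments: list      # e.g. ["x86_64-linux", "HPC", "cloud"]
        validation_procedure: str        # how the Review step should test it
        dependencies: dict = field(default_factory=dict)  # name -> version

    record = SoftwareRegistration(
        name="example-solver",
        repository_url="https://example.org/example-solver",  # placeholder
        license="Apache-2.0",
        hardware_environments=["x86_64-linux", "HPC"],
        validation_procedure="run benchmark suite; compare to published results",
        dependencies={"python": ">=3.8", "numpy": ">=1.20"},
    )
    print(record)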
Open high-level data formats and software for gamma-ray astronomy
NASA Astrophysics Data System (ADS)
Deil, Christoph; Boisson, Catherine; Kosack, Karl; Perkins, Jeremy; King, Johannes; Eger, Peter; Mayer, Michael; Wood, Matthew; Zabalza, Victor; Knödlseder, Jürgen; Hassan, Tarek; Mohrmann, Lars; Ziegler, Alexander; Khelifi, Bruno; Dorner, Daniela; Maier, Gernot; Pedaletti, Giovanna; Rosado, Jaime; Contreras, José Luis; Lefaucheur, Julien; Brügge, Kai; Servillat, Mathieu; Terrier, Régis; Walter, Roland; Lombardi, Saverio
2017-01-01
In gamma-ray astronomy, a variety of data formats and proprietary software have been traditionally used, often developed for one specific mission or experiment. Especially for ground-based imaging atmospheric Cherenkov telescopes (IACTs), data and software are mostly private to the collaborations operating the telescopes. However, there is a general movement in science towards the use of open data and software. In addition, the next-generation IACT instrument, the Cherenkov Telescope Array (CTA), will be operated as an open observatory. We have created a Github organisation at https://github.com/open-gamma-ray-astro where we are developing high-level data format specifications. A public mailing list was set up at https://lists.nasa.gov/mailman/listinfo/open-gamma-ray-astro and a first face-to-face meeting on the IACT high-level data model and formats took place in April 2016 in Meudon (France). This open multi-mission effort will help to accelerate the development of open data formats and open-source software for gamma-ray astronomy, leading to synergies in the development of analysis codes and eventually better scientific results (reproducible, multi-mission). This write-up presents this effort for the first time, explaining the motivation and context, the available resources and process we use, as well as the status and planned next steps for the data format specifications. We hope that it will stimulate feedback and future contributions from the gamma-ray astronomy community.
ERIC Educational Resources Information Center
Krishnamurthy, M.
2008-01-01
Purpose: The purpose of this paper is to describe the open access and open source movement in the digital library world. Design/methodology/approach: A review of key developments in the open access and open source movement is provided. Findings: Open source software and open access to research findings are of great use to scholars in developing…
Interactive, process-oriented climate modeling with CLIMLAB
NASA Astrophysics Data System (ADS)
Rose, B. E. J.
2016-12-01
Global climate is a complex emergent property of the rich interactions between simpler components of the climate system. We build scientific understanding of this system by breaking it down into component process models (e.g. radiation, large-scale dynamics, boundary layer turbulence), understanding each component, and putting them back together. Hands-on experience and freedom to tinker with climate models (whether simple or complex) is invaluable for building physical understanding. CLIMLAB is an open-ended software engine for interactive, process-oriented climate modeling. With CLIMLAB you can interactively mix and match model components, or combine simpler process models into a more comprehensive model. It was created primarily to support classroom activities, using hands-on modeling to teach fundamentals of climate science at both undergraduate and graduate levels. CLIMLAB is written in Python and ties in with the rich ecosystem of open-source scientific Python tools for numerics and graphics. The Jupyter Notebook format provides an elegant medium for distributing interactive example code. I will give an overview of the current capabilities of CLIMLAB, the curriculum we have developed thus far, and plans for the future. Using CLIMLAB requires some basic Python coding skills. We consider this an educational asset, as we are targeting upper-level undergraduates, and Python is an increasingly important language in STEM fields.
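As a sketch of the interactive, compositional use described above, the snippet below builds and runs a simple energy balance model. Class and method names follow CLIMLAB's documented style (climlab.EBM, integrate_years) but should be checked against the installed version.

    # Build a 1-D energy balance model and step it forward in time.
    import climlab

    model = climlab.EBM(num_lat=90)   # simple latitude-resolved EBM
    model.integrate_years(5)          # run the model for five model-years
    print(float(model.Ts.mean()))     # global-mean surface temperature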
The Mid and Far IR Universe and Facilities To See It
NASA Technical Reports Server (NTRS)
Mather, John C.; Fisher, Richard R. (Technical Monitor)
2001-01-01
Although half the luminosity of the universe appears in the band from 20-450 μm, almost nothing is known about the sources of this radiation. Moreover, many molecules, atoms, and ions of astrophysical interest have some of their strongest lines in this wavelength range. Now the Infrared Astronomy Satellite (IRAS) and ISO have flown, the Space Infrared Telescope Facility (SIRTF) is nearly ready, SOFIA is under construction, and NGST, ALMA, Herschel, Planck, and a dozen 8 m and a 25 m ground-based visible/near-IR telescopes could all be operational by the end of the decade. Nevertheless, despite these many advances, the mid- and far-IR region will still not have telescopes comparable to those in neighboring bands in energy sensitivity or angular resolution. What scientific questions will still be open, and what instrumentation will be required? We anticipate that cold filled-aperture telescopes operating out to 100 μm and cold imaging interferometers operating out to about 450 μm in space could be very powerful, using direct detection rather than heterodyne systems. I will give a brief overview of the scientific questions that may still be open, the main factors governing the choice of equipment, and the technological developments that would be required to actually build and use these new facilities.
Mathur, Shawn; Schmidt, Christian; Das, Chhaya; Tucker, Philip W
2006-01-01
Uncensored exchange of scientific results hastens progress. Open Access does not stop at the removal of price and permission barriers; still, censorship and reading disabilities, to name a few, hamper access to information. Here, we invite the scientific community and the public to discuss new methods to distribute, store and manage literature in order to achieve unfettered access to literature. PMID:16956402
Mathur, Shawn; Schmidt, Christian; Das, Chhaya; Tucker, Philip W
2006-09-06
Uncensored exchange of scientific results hastens progress. Open Access does not stop at the removal of price and permission barriers; still, censorship and reading disabilities, to name a few, hamper access to information. Here, we invite the scientific community and the public to discuss new methods to distribute, store and manage literature in order to achieve unfettered access to literature.
Data management strategies for multinational large-scale systems biology projects.
Wruck, Wasco; Peuker, Martin; Regenbrecht, Christian R A
2014-01-01
Good accessibility of publicly funded research data is essential to secure an open scientific system and eventually becomes mandatory [Wellcome Trust will Penalise Scientists Who Don't Embrace Open Access. The Guardian 2012]. By the use of high-throughput methods in many research areas from physics to systems biology, large data collections are increasingly important as raw material for research. Here, we present strategies worked out by international and national institutions targeting open access to publicly funded research data via incentives or obligations to share data. Funding organizations such as the British Wellcome Trust therefore have developed data sharing policies and request commitment to data management and sharing in grant applications. Increased citation rates are a profound argument for sharing publication data. Pre-publication sharing might be rewarded by a data citation credit system via digital object identifiers (DOIs) which have initially been in use for data objects. Besides policies and incentives, good practice in data management is indispensable. However, appropriate systems for data management of large-scale projects for example in systems biology are hard to find. Here, we give an overview of a selection of open-source data management systems proved to be employed successfully in large-scale projects.
Data management strategies for multinational large-scale systems biology projects
Peuker, Martin; Regenbrecht, Christian R.A.
2014-01-01
Good accessibility of publicly funded research data is essential to secure an open scientific system and eventually becomes mandatory [Wellcome Trust will Penalise Scientists Who Don’t Embrace Open Access. The Guardian 2012]. By the use of high-throughput methods in many research areas from physics to systems biology, large data collections are increasingly important as raw material for research. Here, we present strategies worked out by international and national institutions targeting open access to publicly funded research data via incentives or obligations to share data. Funding organizations such as the British Wellcome Trust therefore have developed data sharing policies and request commitment to data management and sharing in grant applications. Increased citation rates are a profound argument for sharing publication data. Pre-publication sharing might be rewarded by a data citation credit system via digital object identifiers (DOIs) which have initially been in use for data objects. Besides policies and incentives, good practice in data management is indispensable. However, appropriate systems for data management of large-scale projects for example in systems biology are hard to find. Here, we give an overview of a selection of open-source data management systems proved to be employed successfully in large-scale projects. PMID:23047157
New Open-Source Version of FLORIS Released | News | NREL
January 26, 2018 — National Renewable Energy Laboratory (NREL) researchers recently released an updated open-source version of FLORIS that has been simplified and documented, reflecting the living, open-source nature of the newly updated utility.
Skyline: an open source document editor for creating and analyzing targeted proteomics experiments
MacLean, Brendan; Tomazela, Daniela M.; Shulman, Nicholas; Chambers, Matthew; Finney, Gregory L.; Frewen, Barbara; Kern, Randall; Tabb, David L.; Liebler, Daniel C.; MacCoss, Michael J.
2010-01-01
Summary: Skyline is a Windows client application for targeted proteomics method creation and quantitative data analysis. It is open source and freely available for academic and commercial use. The Skyline user interface simplifies the development of mass spectrometer methods and the analysis of data from targeted proteomics experiments performed using selected reaction monitoring (SRM). Skyline supports using and creating MS/MS spectral libraries from a wide variety of sources to choose SRM filters and verify results based on previously observed ion trap data. Skyline exports transition lists to and imports the native output files from Agilent, Applied Biosystems, Thermo Fisher Scientific and Waters triple quadrupole instruments, seamlessly connecting mass spectrometer output back to the experimental design document. The fast and compact Skyline file format is easily shared, even for experiments requiring many sample injections. A rich array of graphs displays results and provides powerful tools for inspecting data integrity as data are acquired, helping instrument operators to identify problems early. The Skyline dynamic report designer exports tabular data from the Skyline document model for in-depth analysis with common statistical tools. Availability: Single-click, self-updating web installation is available at http://proteome.gs.washington.edu/software/skyline. This web site also provides access to instructional videos, a support board, an issues list and a link to the source code project. Contact: brendanx@u.washington.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20147306
Re/Production of science process skills and a scientific ethos in an early childhood classroom
NASA Astrophysics Data System (ADS)
Kirch, Susan A.
2007-10-01
Many educators and researchers are convinced that age limits what students can learn and achieve in science. Elementary school curricula focus on isolated process skills under the faulty assumption that young students are not capable of combining the process skills and content knowledge necessary for reasoning scientifically. In the present study, I demonstrate that many process skills are produced in conversations between second grade students and between these students and their teachers, including: questioning, hypothesis formation, experimental design, identifying relevant evidence, critical analysis of hypotheses and predictions, hypothesis reconstruction, and variable identification. Through conversation analysis I show that most classroom community members adopted the role of skeptic at some time, but there was a strong tendency to defer to authoritative sources when resolving debates. This latter observation led to further investigation of when and how authoritative sources were consulted and used, and when and how a skeptical stance was taken. I show that, as students used science process skills and interacted with each other and teacher-mediators, community practices, values, and mores were shaped and an ethos of science began to emerge. It is my contention that this ethos often emerges unconsciously as part of the community's dynamic set of rules and schema. Teachers who are attuned to the tension between open-mindedness and skepticism, and how they and their students cope with this dialectic, however, can actively shape the scientific ethos of their classroom community.
Coats, Andrew J Stewart; Nijjer, Sukhjinder S; Francis, Darrel P
2014-10-20
Ticagrelor, a potent antiplatelet agent, has been shown to be beneficial in patients with acute coronary syndromes in a randomised controlled trial published in a highly ranked peer-reviewed journal. Accordingly, it has entered guidelines and has been approved for clinical use by authorities. However, there remains a controversy regarding aspects of the PLATO trial which are not immediately apparent from the peer-reviewed publications. A number of publications have sought to highlight potential discrepancies, using data available in publicly published documents from the US Food and Drug Administration (FDA), leading to disagreement regarding the value of open science and data sharing. We reflect upon potential sources of bias present in even rigorously performed randomised controlled trials, on whether peer review can establish the presence of bias, and on the need to constantly challenge and question even accepted data. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Thermo-hydro-mechanical-chemical processes in fractured-porous media: Benchmarks and examples
NASA Astrophysics Data System (ADS)
Kolditz, O.; Shao, H.; Görke, U.; Kalbacher, T.; Bauer, S.; McDermott, C. I.; Wang, W.
2012-12-01
The book comprises an assembly of benchmarks and examples for porous media mechanics collected over the last twenty years. Analysis of thermo-hydro-mechanical-chemical (THMC) processes is essential to many applications in environmental engineering, such as geological waste deposition, geothermal energy utilisation, carbon capture and storage, water resources management, hydrology, and even climate change. In order to assess the feasibility as well as the safety of geotechnical applications, process-based modelling is the only tool that can put numbers on, i.e. quantify, future scenarios. This places a huge responsibility on the reliability of computational tools. Benchmarking is an appropriate methodology to verify the quality of modelling tools based on best practices. Moreover, benchmarking and code comparison foster community efforts. The benchmark book is part of the OpenGeoSys initiative - an open source project to share knowledge and experience in environmental analysis and scientific computation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bonnal, P.; Féral, B.; Kershaw, K.
Particle accelerator projects share many characteristics with industrial projects. However, experience has shown that best practice of industrial project management is not always well suited to particle accelerator projects. Major differences include the number and complexity of technologies involved, the importance of collaborative work, development phases that can last more than a decade, and the importance of telerobotics and remote handling to address future preventive and corrective maintenance requirements due to induced radioactivity, to cite just a few. The openSE framework is a systems engineering and project management framework specifically designed for studies and development projects of scientific facilities' systems and equipment. Best practices in project management, in systems and requirements engineering, in telerobotics and remote handling and in radiation safety management were used as sources of inspiration, together with analysis of current practices surveyed at CERN, GSI and ESS.
MOSFiT: Modular Open Source Fitter for Transients
NASA Astrophysics Data System (ADS)
Guillochon, James; Nicholl, Matt; Villar, V. Ashley; Mockler, Brenna; Narayan, Gautham; Mandel, Kaisey S.; Berger, Edo; Williams, Peter K. G.
2018-05-01
Much of the progress made in time-domain astronomy is accomplished by relating observational multiwavelength time-series data to models derived from our understanding of physical laws. This goal is typically accomplished by dividing the task in two: collecting data (observing), and constructing models to represent that data (theorizing). Owing to the natural tendency for specialization, a disconnect can develop between the best available theories and the best available data, potentially delaying advances in our understanding of new classes of transients. We introduce MOSFiT: the Modular Open Source Fitter for Transients, a Python-based package that downloads transient data sets from open online catalogs (e.g., the Open Supernova Catalog), generates Monte Carlo ensembles of semi-analytical light-curve fits to those data sets and their associated Bayesian parameter posteriors, and optionally delivers the fitting results back to those same catalogs to make them available to the rest of the community. MOSFiT is designed to help bridge the gap between observations and theory in time-domain astronomy; in addition to making the application of existing models and the creation of new models as simple as possible, MOSFiT yields statistically robust predictions for transient characteristics, with a standard output format that includes all the setup information necessary to reproduce a given result. As large-scale surveys, such as that conducted with the Large Synoptic Survey Telescope (LSST), discover entirely new classes of transients, tools such as MOSFiT will be critical for enabling rapid comparison of models against data in statistically consistent, reproducible, and scientifically beneficial ways.
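The following toy sketch illustrates the underlying task, fitting an analytic transient light curve to photometry, without using MOSFiT's own API; the Bazin-style functional form, the mock data, and the use of scipy.optimize are illustrative assumptions.

    # Fit a simple analytic supernova light curve to mock photometry.
    import numpy as np
    from scipy.optimize import curve_fit

    def bazin(t, A, t0, t_rise, t_fall):
        # Bazin-style rise/decay light-curve shape
        return A * np.exp(-(t - t0) / t_fall) / (1.0 + np.exp(-(t - t0) / t_rise))

    t_obs = np.linspace(-20.0, 80.0, 40)
    flux = bazin(t_obs, 1.0, 0.0, 5.0, 30.0)
    flux += np.random.default_rng(0).normal(0.0, 0.02, t_obs.size)  # mock noise

    popt, _ = curve_fit(bazin, t_obs, flux, p0=[1.0, 0.0, 5.0, 30.0])
    print("fitted A, t0, t_rise, t_fall:", popt)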
NASA Astrophysics Data System (ADS)
Mattmann, Chris
2014-04-01
In this era of exascale instruments for astronomy we must naturally develop next generation capabilities for the unprecedented data volume and velocity that will arrive due to the veracity of these ground-based sensors and observatories. Integrating scientific algorithms stewarded by scientific groups unobtrusively and rapidly; intelligently selecting data movement technologies; making use of cloud computing for storage and processing; and automatically extracting text, metadata and science from any type of file are all needed capabilities in this exciting time. Our group at NASA JPL has promoted the use of open source data management technologies available from the Apache Software Foundation (ASF) in pursuit of constructing next generation data management and processing systems for astronomical instruments including the Expanded Very Large Array (EVLA) in Socorro, NM and the Atacama Large Millimetre/Submillimetre Array (ALMA), as well as for the KAT-7 project led by SKA South Africa as a precursor to the full MeerKAT telescope. In addition, we are currently funded by the National Science Foundation in the US to work with MIT Haystack Observatory and the University of Cambridge in the UK to construct a Radio Array of Portable Interferometric Devices (RAPID) that will undoubtedly draw from the rich technology advances underway. NASA JPL is investing in a strategic initiative for Big Data that is pulling in these capabilities and technologies for astronomical instruments and also for Earth science remote sensing. In this talk I will describe the collaborative efforts underway and point to solutions in open source from the Apache Software Foundation that can be deployed and used today and that are already bringing benefits to our teams and projects. I will describe how others can take advantage of our experience and point towards future application and contribution of these tools.
ObsPy: A Python Toolbox for Seismology
NASA Astrophysics Data System (ADS)
Wassermann, J. M.; Krischer, L.; Megies, T.; Barsch, R.; Beyreuther, M.
2013-12-01
Python combines the power of a full-blown programming language with the flexibility and accessibility of an interactive scripting language. Its extensive standard library and large variety of freely available high quality scientific modules cover most needs in developing scientific processing workflows. ObsPy is a community-driven, open-source project extending Python's capabilities to fit the specific needs that arise when working with seismological data. It a) comes with a continuously growing signal processing toolbox that covers most tasks common in seismological analysis, b) provides read and write support for many common waveform, station and event metadata formats and c) enables access to various data centers, webservices and databases to retrieve waveform data and station/event metadata. In combination with mature and free Python packages like NumPy, SciPy, Matplotlib, IPython, Pandas, lxml, and PyQt, ObsPy makes it possible to develop complete workflows in Python, ranging from reading locally stored data or requesting data from one or more different data centers via signal analysis and data processing to visualization in GUI and web applications, output of modified/derived data and the creation of publication-quality figures. All functionality is extensively documented and the ObsPy Tutorial and Gallery give a good impression of the wide range of possible use cases. ObsPy is tested and running on Linux, OS X and Windows and comes with installation routines for these systems. ObsPy is developed in a test-driven approach and is available under the LGPLv3 open source licence. Users are welcome to request help, report bugs, propose enhancements or contribute code via either the user mailing list or the project page on GitHub.
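A minimal ObsPy session illustrating the workflow described above; read() with no argument loads ObsPy's bundled example seismogram, so the snippet runs without external data.

    # Load the example Stream, bandpass filter it, and plot.
    from obspy import read

    st = read()                                    # bundled example data
    st.filter("bandpass", freqmin=1.0, freqmax=10.0)
    print(st)                                      # summary of the traces
    st.plot()                                      # quick-look figure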
Judicious use of custom development in an open source component architecture
NASA Astrophysics Data System (ADS)
Bristol, S.; Latysh, N.; Long, D.; Tekell, S.; Allen, J.
2014-12-01
Modern software engineering is not as much programming from scratch as innovative assembly of existing components. Seamlessly integrating disparate components into scalable, performant architecture requires sound engineering craftsmanship and can often result in increased cost efficiency and accelerated capabilities if software teams focus their creativity on the edges of the problem space. ScienceBase is part of the U.S. Geological Survey scientific cyberinfrastructure, providing data and information management, distribution services, and analysis capabilities in a way that strives to follow this pattern. ScienceBase leverages open source NoSQL and relational databases, search indexing technology, spatial service engines, numerous libraries, and one proprietary but necessary software component in its architecture. The primary engineering focus is cohesive component interaction, including construction of a seamless Application Programming Interface (API) across all elements. The API allows researchers and software developers alike to leverage the infrastructure in unique, creative ways. Scaling the ScienceBase architecture and core API with increasing data volume (more databases) and complexity (integrated science problems) is a primary challenge addressed by judicious use of custom development in the component architecture. Other data management and informatics activities in the earth sciences have independently resolved to a similar design of reusing and building upon established technology and are working through similar issues for managing and developing information (e.g., U.S. Geoscience Information Network; NASA's Earth Observing System Clearing House; GSToRE at the University of New Mexico). Recent discussions facilitated through the Earth Science Information Partners are exploring potential avenues to exploit the implicit relationships between similar projects for explicit gains in our ability to more rapidly advance global scientific cyberinfrastructure.
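As a sketch of the kind of API access described, the snippet below requests a catalog item as JSON; the endpoint pattern and the item identifier are assumptions for illustration and should be checked against the current ScienceBase documentation.

    # Fetch a hypothetical ScienceBase catalog item as JSON.
    import requests

    item_id = "EXAMPLE_ITEM_ID"  # placeholder identifier
    url = f"https://www.sciencebase.gov/catalog/item/{item_id}"
    resp = requests.get(url, params={"format": "json"}, timeout=30)
    resp.raise_for_status()
    print(resp.json().get("title"))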
NASA Astrophysics Data System (ADS)
Collins, L.
2014-12-01
Students in the Environmental Studies major at the University of Southern California fulfill their curriculum requirements by taking a broad range of courses in the social and natural sciences. Climate change is often taught in 1-2 lectures in these courses, with limited examination of this complex topic. Several upper division elective courses focus on the science, policy, and social impacts of climate change. In an upper division course focused on the scientific tools used to determine paleoclimate and predict future climate, I have developed a project where students download, manipulate, and analyze data from the National Climatic Data Center. Students are required to download 100 or more years of daily temperature records and use the statistical program R to analyze those data, calculating daily, monthly, and yearly temperature averages along with changes in the number of extreme hot or cold days (≥90˚F and ≤30˚F, respectively). In parallel, they examine population growth, city expansion, and changes in transportation, looking for correlations between the social data and trends observed in the temperature data. Students examine trends over time to determine correlations to the urban heat island effect. This project exposes students to "real" data, giving them the tools necessary to critically analyze scientific studies without being experts in the field. Utilizing the existing, public, online databases provides almost unlimited, free data. Open source statistical programs provide a cost-free platform for examining the data, although some in-class time is required to help students navigate initial data importation and analysis. Results presented will highlight data compiled over three years of course projects.
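The course described uses R; for readers of this compilation, the same computation is sketched below in Python/pandas. The file name and column names (DATE, TMAX in degrees Fahrenheit) are assumptions about the downloaded daily-summaries CSV.

    # Yearly average highs and counts of extreme days from daily records.
    import pandas as pd

    df = pd.read_csv("station_daily.csv", parse_dates=["DATE"]).set_index("DATE")

    yearly_mean = df["TMAX"].resample("YS").mean()       # yearly mean of daily highs
    hot_days = (df["TMAX"] >= 90).resample("YS").sum()   # days at or above 90 F
    cold_days = (df["TMAX"] <= 30).resample("YS").sum()  # days at or below 30 F

    print(pd.DataFrame({"mean_tmax": yearly_mean,
                        "days_ge_90F": hot_days,
                        "days_le_30F": cold_days}))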
The future of academic publishing: what is open access?
Collins, Jannette
2005-04-01
For more than 200 years, publishers have been charging users (i.e., subscribers) for access to scientific information to make a profit. Authors have been required to grant copyright ownership to the publisher. This system was not questioned until the Internet popularized electronic publishing. The Internet allows for rapid dissemination of information to millions of readers. Some people have seen this as an opportunity to revolutionize the system of scientific publishing and to make it one that provides free, open access to all scientific information to all persons everywhere in the world. Such systems have been launched and have instigated a wave of dialogue among proponents and opponents alike. At the center of the controversy is the issue of who will pay for the costs of publishing, because an open-access system is not free, and this threatens the backbone of the traditional publishing industry. Currently, open-access publishers charge authors a fee to have their articles published. Because of this and the uncertainty of the sustainability of the open-access system, some authors are hesitant to participate in the new system. This article reviews the events that led to the creation of open-access publishing, the arguments for and against it, and the implications of open access for the future of academic publishing.
The successes and challenges of open-source biopharmaceutical innovation.
Allarakhia, Minna
2014-05-01
Increasingly, open-source-based alliances seek to provide broad access to data, research-based tools, preclinical samples and downstream compounds. The challenge is how to create value from open-source biopharmaceutical innovation. This value creation may occur via transparency and usage of data across the biopharmaceutical value chain as stakeholders move dynamically between open source and open innovation. In this article, several examples are used to trace the evolution of biopharmaceutical open-source initiatives. The article specifically discusses the technological challenges associated with the integration and standardization of big data; the human capacity development challenges associated with skill development around big data usage; and the data-material access challenge associated with data and material access and usage rights, particularly as the boundary between open source and open innovation becomes more fluid. It is the author's opinion that the assessment of when and how value creation will occur, through open-source biopharmaceutical innovation, is paramount. The key is to determine the metrics of value creation and the necessary technological, educational and legal frameworks to support the downstream outcomes of now big data-based open-source initiatives. The continued focus on the early-stage value creation is not advisable. Instead, it would be more advisable to adopt an approach where stakeholders transform open-source initiatives into open-source discovery, crowdsourcing and open product development partnerships on the same platform.
Gerhard, Stephan; Daducci, Alessandro; Lemkaddem, Alia; Meuli, Reto; Thiran, Jean-Philippe; Hagmann, Patric
2011-01-01
Advanced neuroinformatics tools are required for methods of connectome mapping, analysis, and visualization. The inherent multi-modality of connectome datasets poses new challenges for data organization, integration, and sharing. We have designed and implemented the Connectome Viewer Toolkit - a set of free and extensible open source neuroimaging tools written in Python. The key components of the toolkit are as follows: (1) The Connectome File Format is an XML-based container format to standardize multi-modal data integration and structured metadata annotation. (2) The Connectome File Format Library enables management and sharing of connectome files. (3) The Connectome Viewer is an integrated research and development environment for visualization and analysis of multi-modal connectome data. The Connectome Viewer's plugin architecture supports extensions with network analysis packages and an interactive scripting shell, to enable easy development and community contributions. Integration with tools from the scientific Python community allows the leveraging of numerous existing libraries for powerful connectome data mining, exploration, and comparison. We demonstrate the applicability of the Connectome Viewer Toolkit using Diffusion MRI datasets processed by the Connectome Mapper. The Connectome Viewer Toolkit is available from http://www.cmtk.org/
icoshift: A versatile tool for the rapid alignment of 1D NMR spectra
NASA Astrophysics Data System (ADS)
Savorani, F.; Tomasi, G.; Engelsen, S. B.
2010-02-01
The increasing scientific and industrial interest in metabonomics benefits from the high level of qualitative and quantitative information provided by nuclear magnetic resonance (NMR) spectroscopy. However, several chemical and physical factors can affect the absolute and the relative position of an NMR signal, and it is not always possible or desirable to eliminate these effects a priori. To remove misalignment of NMR signals a posteriori, several algorithms have been proposed in the literature. The icoshift program presented here is an open source and highly efficient program designed for solving signal alignment problems in metabonomic NMR data analysis. The icoshift algorithm is based on correlation shifting of spectral intervals and employs an FFT engine that aligns all spectra simultaneously. The algorithm is demonstrated to be faster than similar methods found in the literature, making full-resolution alignment of large datasets feasible and thus avoiding down-sampling steps such as binning. The algorithm uses missing values as a filling alternative in order to avoid spectral artifacts at the segment boundaries. The algorithm is made open source, and the Matlab code including documentation can be downloaded from www.models.life.ku.dk.
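The core idea, correlation shifting computed with an FFT, can be sketched in a few lines of Python (icoshift itself is Matlab code; this is an illustration of the principle, not the program):

    # Find and undo the circular shift that best aligns a spectrum with a target.
    import numpy as np

    def best_shift(spectrum, target, max_shift):
        # Circular cross-correlation via the FFT; cc[k] peaks at the lag k
        # by which `spectrum` is shifted relative to `target`.
        cc = np.fft.ifft(np.fft.fft(spectrum) * np.conj(np.fft.fft(target))).real
        lags = np.arange(len(cc))
        lags[lags > len(cc) // 2] -= len(cc)   # map large indices to negative lags
        keep = np.abs(lags) <= max_shift
        return lags[keep][np.argmax(cc[keep])]

    target = np.exp(-0.5 * ((np.arange(256) - 100) / 4.0) ** 2)  # synthetic peak
    spectrum = np.roll(target, 7)                                # misaligned copy

    s = best_shift(spectrum, target, max_shift=20)
    aligned = np.roll(spectrum, -s)   # undo the estimated shift
    print("estimated shift:", s)      # expect 7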
Big Software for SmallSats: Adapting CFS to CubeSat Missions
NASA Technical Reports Server (NTRS)
Cudmore, Alan P.; Crum, Gary; Sheikh, Salman; Marshall, James
2015-01-01
Expanding capabilities and mission objectives for SmallSats and CubeSats is driving the need for reliable, reusable, and robust flight software. While missions are becoming more complicated and the scientific goals more ambitious, the level of acceptable risk has decreased. Design challenges are further compounded by budget and schedule constraints that have not kept pace. NASA's Core Flight Software System (cFS) is an open source solution which enables teams to build flagship satellite level flight software within a CubeSat schedule and budget. NASA originally developed cFS to reduce mission and schedule risk for flagship satellite missions by increasing code reuse and reliability. The Lunar Reconnaissance Orbiter, which launched in 2009, was the first of a growing list of Class B rated missions to use cFS. Large parts of cFS are now open source, which has spurred adoption outside of NASA. This paper reports on the experiences of two teams using cFS for current CubeSat missions. The performance overheads of cFS are quantified, and the reusability of code between missions is discussed. The analysis shows that cFS is well suited to use on CubeSats and demonstrates the portability and modularity of cFS code.
Introducing GHOST: The Geospace/Heliosphere Observation & Simulation Tool-kit
NASA Astrophysics Data System (ADS)
Murphy, J. J.; Elkington, S. R.; Schmitt, P.; Wiltberger, M. J.; Baker, D. N.
2013-12-01
Simulation models of the heliospheric and geospace environments can provide key insights into the geoeffective potential of solar disturbances such as Coronal Mass Ejections and High Speed Solar Wind Streams. Advanced post-processing of the results of these simulations greatly enhances the utility of these models for scientists and other researchers. Currently, no supported centralized tool exists for performing these processing tasks. With GHOST, we introduce a toolkit for the ParaView visualization environment that provides a centralized suite of tools suited to space physics post-processing. Building on the work of the Center for Integrated Space Weather Modeling (CISM) Knowledge Transfer group, GHOST is an open-source tool suite for ParaView. The toolkit plugin currently provides tools for reading LFM and Enlil data sets, and provides automated tools for data comparison with NASA's CDAWeb database. As work progresses, many additional tools will be added and, through open-source collaboration, we hope to add readers for additional model types, as well as any additional tools deemed necessary by the scientific community. The ultimate goal of this work is to provide a complete Sun-to-Earth model analysis toolset.
PyXRF: Python-based X-ray fluorescence analysis package
NASA Astrophysics Data System (ADS)
Li, Li; Yan, Hanfei; Xu, Wei; Yu, Dantong; Heroux, Annie; Lee, Wah-Keat; Campbell, Stuart I.; Chu, Yong S.
2017-09-01
We developed a Python-based X-ray fluorescence analysis package (PyXRF) at the National Synchrotron Light Source II (NSLS-II) for the X-ray fluorescence-microscopy beamlines, including the Hard X-ray Nanoprobe (HXN) and Submicron Resolution X-ray Spectroscopy (SRX) beamlines. This package contains a high-level fitting engine, a comprehensive command-line/GUI design, rigorous physics calculations, and a visualization interface. PyXRF offers a method of automatically finding elements, so that users do not need to spend extra time selecting elements manually. Moreover, PyXRF provides a convenient and interactive way of adjusting fitting parameters with physical constraints. This helps us perform quantitative analysis and find an appropriate initial guess for fitting. Furthermore, we also provide an advanced mode for expert users to construct their own fitting strategies with full control of each fitting parameter. PyXRF runs single-pixel fitting at a fast speed, which opens up the possibility of viewing fitting results in real time during experiments. A convenient I/O interface was designed to obtain data directly from NSLS-II's experimental database. PyXRF is under open-source development and designed to be an integral part of NSLS-II's scientific computation library.
Atwood, Robert C.; Bodey, Andrew J.; Price, Stephen W. T.; Basham, Mark; Drakopoulos, Michael
2015-01-01
Tomographic datasets collected at synchrotrons are becoming very large and complex, and, therefore, need to be managed efficiently. Raw images may have high pixel counts, and each pixel can be multidimensional and associated with additional data such as those derived from spectroscopy. In time-resolved studies, hundreds of tomographic datasets can be collected in sequence, yielding terabytes of data. Users of tomographic beamlines are drawn from various scientific disciplines, and many are keen to use tomographic reconstruction software that does not require a deep understanding of reconstruction principles. We have developed Savu, a reconstruction pipeline that enables users to rapidly reconstruct data to consistently create high-quality results. Savu is designed to work in an ‘orthogonal’ fashion, meaning that data can be converted between projection and sinogram space throughout the processing workflow as required. The Savu pipeline is modular and allows processing strategies to be optimized for users' purposes. In addition to the reconstruction algorithms themselves, it can include modules for identification of experimental problems, artefact correction, general image processing and data quality assessment. Savu is open source, open licensed and ‘facility-independent’: it can run on standard cluster infrastructure at any institution. PMID:25939626
Open source libraries and frameworks for biological data visualisation: a guide for developers.
Wang, Rui; Perez-Riverol, Yasset; Hermjakob, Henning; Vizcaíno, Juan Antonio
2015-04-01
Recent advances in high-throughput experimental techniques have led to an exponential increase in both the size and the complexity of the data sets commonly studied in biology. Data visualisation is increasingly used as the key to unlock this data, going from hypothesis generation to model evaluation and tool implementation. It is becoming more and more the heart of bioinformatics workflows, enabling scientists to reason and communicate more effectively. In parallel, there has been a corresponding trend towards the development of related software, which has triggered the maturation of different visualisation libraries and frameworks. For bioinformaticians, scientific programmers and software developers, the main challenge is to pick out the most fitting one(s) to create clear, meaningful and integrated data visualisation for their particular use cases. In this review, we introduce a collection of open source or free to use libraries and frameworks for creating data visualisation, covering the generation of a wide variety of charts and graphs. We will focus on software written in Java, JavaScript or Python. We truly believe this software offers the potential to turn tedious data into exciting visual stories. © 2014 The Authors. PROTEOMICS published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Gerhard, Stephan; Daducci, Alessandro; Lemkaddem, Alia; Meuli, Reto; Thiran, Jean-Philippe; Hagmann, Patric
2011-01-01
Advanced neuroinformatics tools are required for methods of connectome mapping, analysis, and visualization. The inherent multi-modality of connectome datasets poses new challenges for data organization, integration, and sharing. We have designed and implemented the Connectome Viewer Toolkit – a set of free and extensible open source neuroimaging tools written in Python. The key components of the toolkit are as follows: (1) The Connectome File Format is an XML-based container format to standardize multi-modal data integration and structured metadata annotation. (2) The Connectome File Format Library enables management and sharing of connectome files. (3) The Connectome Viewer is an integrated research and development environment for visualization and analysis of multi-modal connectome data. The Connectome Viewer's plugin architecture supports extensions with network analysis packages and an interactive scripting shell, to enable easy development and community contributions. Integration with tools from the scientific Python community allows the leveraging of numerous existing libraries for powerful connectome data mining, exploration, and comparison. We demonstrate the applicability of the Connectome Viewer Toolkit using Diffusion MRI datasets processed by the Connectome Mapper. The Connectome Viewer Toolkit is available from http://www.cmtk.org/ PMID:21713110
Open source integrated modeling environment Delta Shell
NASA Astrophysics Data System (ADS)
Donchyts, G.; Baart, F.; Jagers, B.; van Putten, H.
2012-04-01
In the last decade, integrated modelling has become a very popular topic in environmental modelling, since it helps solve problems that are difficult to address with a single model. However, managing the complexity of integrated models and minimizing the time required for their setup remains a challenging task. The integrated modelling environment Delta Shell simplifies this task. The software components of Delta Shell are easy to reuse separately from each other or as part of the integrated environment, which can run in a command-line or a graphical user interface mode. Most components of Delta Shell are developed in the C# programming language and include libraries used to define, save and visualize various scientific data structures as well as coupled model configurations. Here we present two examples showing how Delta Shell simplifies the process of setting up integrated models from the end user and developer perspectives. The first example shows coupling of a rainfall-runoff model, a river flow model and a run-time control model. The second example shows how a coastal morphological database integrates with the coastal morphological model (XBeach) and a custom nourishment designer. Delta Shell is available as open-source software released under the LGPL license and accessible via http://oss.deltares.nl.
Architectures for Rainfall Property Estimation From Polarimetric Radar
NASA Astrophysics Data System (ADS)
Collis, S. M.; Giangrande, S. E.; Helmus, J.; Troemel, S.
2014-12-01
Radars that transmit and receive signals in polarizations aligned horizontally and vertically to the horizon collect a number of measurements. The relations both among these measurements and between the measurements and desired microphysical quantities (such as rainfall rate) are complicated by a number of scattering mechanisms. The result is an intractable number of often incompatible techniques for extracting geophysical insight. This presentation will discuss methods developed by the Atmospheric Radiation Measurement (ARM) Climate Research Facility to streamline the creation of application chains for retrieving rainfall properties for the purposes of fine-scale model evaluation. By using a Common Data Model (CDM) approach and working in the popular open source Python scientific environment, analysis techniques such as Linear Programming (LP) can be brought to bear on the task of retrieving insight from radar signals. This presentation will outline how we have used these techniques to disentangle polarimetric phase signals, estimate a three-dimensional precipitation field and then objectively compare it to cloud-resolving-model-derived rainfall fields from the NASA/DoE Mid-Latitude Continental Convective Clouds Experiment (MC3E). All techniques shown will be available, open source, in the Python-ARM Radar Toolkit (Py-ART).
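A minimal Py-ART sketch of the read-and-inspect workflow; pyart.io.read and pyart.graph.RadarDisplay are the package's documented entry points, while the file name and field name here are illustrative assumptions.

    # Read a radar volume and plot one field from the lowest sweep.
    import matplotlib.pyplot as plt
    import pyart

    radar = pyart.io.read("example_radar_volume.nc")  # placeholder file
    display = pyart.graph.RadarDisplay(radar)
    display.plot("differential_phase", sweep=0)       # field name may vary
    plt.show()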
Open source libraries and frameworks for biological data visualisation: A guide for developers
Wang, Rui; Perez-Riverol, Yasset; Hermjakob, Henning; Vizcaíno, Juan Antonio
2015-01-01
Recent advances in high-throughput experimental techniques have led to an exponential increase in both the size and the complexity of the data sets commonly studied in biology. Data visualisation is increasingly used as the key to unlock this data, going from hypothesis generation to model evaluation and tool implementation. It is becoming more and more the heart of bioinformatics workflows, enabling scientists to reason and communicate more effectively. In parallel, there has been a corresponding trend towards the development of related software, which has triggered the maturation of different visualisation libraries and frameworks. For bioinformaticians, scientific programmers and software developers, the main challenge is to pick out the most fitting one(s) to create clear, meaningful and integrated data visualisation for their particular use cases. In this review, we introduce a collection of open source or free to use libraries and frameworks for creating data visualisation, covering the generation of a wide variety of charts and graphs. We will focus on software written in Java, JavaScript or Python. We truly believe this software offers the potential to turn tedious data into exciting visual stories. PMID:25475079
NASA Astrophysics Data System (ADS)
Lemmens, R.; Maathuis, B.; Mannaerts, C.; Foerster, T.; Schaeffer, B.; Wytzisk, A.
2009-12-01
This paper presents easily accessible, integrated web-based analysis of satellite images with plug-in based open source software. The paper is targeted at both users and developers of geospatial software. Guided by a use case scenario, we describe the ILWIS software and its toolbox for accessing satellite images through the GEONETCast broadcasting system. The last two decades have shown a major shift from stand-alone software systems to networked ones, often client/server applications using distributed geo-(web-)services. This allows organisations to combine their own data with remotely available data and processing functionality without much effort. Key to this integrated spatial data analysis is low-cost access to data from within a user-friendly and flexible software environment. Web-based open source software solutions are often a powerful option for developing countries. The Integrated Land and Water Information System (ILWIS) is a PC-based GIS and remote sensing software package, comprising a complete suite of image processing, spatial analysis and digital mapping tools, and was developed as commercial software from the early nineties onwards. Recent project efforts have migrated ILWIS into a modular, plug-in-based open source software, and provide web-service support for OGC-based web mapping and processing. The core objective of the ILWIS Open source project is to provide a maintainable framework for researchers and software developers to implement training components, scientific toolboxes and (web-)services. The latest plug-ins have been developed for multi-criteria decision making, water resources analysis and spatial statistics analysis. The development of this framework has been carried out since 2007 in the context of 52°North, an open initiative that advances the development of cutting-edge open source geospatial software under the GPL license. GEONETCast, as part of the emerging Global Earth Observation System of Systems (GEOSS), puts essential environmental data at the fingertips of users around the globe. This user-friendly and low-cost information dissemination provides global information as a basis for decision-making in a number of critical areas, including public health, energy, agriculture, weather, water, climate, natural disasters and ecosystems. GEONETCast makes satellite images available via Digital Video Broadcast (DVB) technology. An OGC WMS interface and plug-ins that convert GEONETCast data streams allow an ILWIS user to integrate various distributed data sources with data stored locally on his or her machine. Our paper describes a use case in which ILWIS is used with GEONETCast satellite imagery for decision-making processes in Ghana. We also explain how the ILWIS software can be extended with additional functionality by means of plug-ins, and outline our plans to implement other OGC standards, such as WCS and WPS, in the same context. Especially the latter can be seen as a major step forward in terms of moving well-proven desktop-based processing functionality to the web. This enables the embedding of ILWIS functionality in Spatial Data Infrastructures or even its execution in scalable, on-demand cloud computing environments.
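Since the integration described above rests on standard OGC interfaces, a WMS GetMap request can be composed with nothing more than URL parameters; the server address and layer name below are placeholders.

    # Compose a standard OGC WMS 1.1.1 GetMap request URL.
    from urllib.parse import urlencode

    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": "geonetcast:example_layer",  # placeholder layer name
        "SRS": "EPSG:4326",
        "BBOX": "-4.0,4.0,2.0,12.0",           # lon/lat box roughly over Ghana
        "WIDTH": "600",
        "HEIGHT": "800",
        "FORMAT": "image/png",
    }
    print("https://example.org/wms?" + urlencode(params))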
Open Archives Initiative (OAI) Services | OSTI, U.S. Department of Energy Office of Scientific and Technical Information
SciELO, Scientific Electronic Library Online, a Database of Open Access Journals
ERIC Educational Resources Information Center
Meneghini, Rogerio
2013-01-01
This essay discusses SciELO, a scientific journal database operating in 14 countries. It covers over 1000 journals providing open access to full text and table sets of scientometrics data. In Brazil it is responsible for a collection of nearly 300 journals, selected along 15 years as the best Brazilian periodicals in natural and social sciences.…
Inquiry-based science education: towards a pedagogical framework for primary school teachers
NASA Astrophysics Data System (ADS)
van Uum, Martina S. J.; Verhoeff, Roald P.; Peeters, Marieke
2016-02-01
Inquiry-based science education (IBSE) has been promoted as an inspiring way of learning science by engaging pupils in designing and conducting their own scientific investigations. For primary school teachers, the open nature of IBSE poses challenges as they often lack experience in supporting their pupils during the different phases of an open IBSE project, such as formulating a research question and designing and conducting an investigation. The current study aims to meet these challenges by presenting a pedagogical framework in which four domains of scientific knowledge are addressed in seven phases of inquiry. The framework is based on video analyses of pedagogical interventions by primary school teachers participating in open IBSE projects. Our results show that teachers can guide their pupils successfully through the process of open inquiry by explicitly addressing the conceptual, epistemic, social and/or procedural domain of scientific knowledge in the subsequent phases of inquiry. The paper concludes by suggesting further research to validate our framework and to develop a pedagogy for primary school teachers to guide their pupils through the different phases of open inquiry.
Sandia National Laboratories: Livermore Valley Open Campus (LVOC)
The Livermore Valley Open Campus (LVOC) expands opportunities for open engagement with the broader scientific community and access to DOE-funded capabilities, building on the open-collaboration model that Sandia's Combustion Research Facility pioneered over 30 years ago.
ERIC Educational Resources Information Center
Voyles, Bennett
2007-01-01
People know about the Sakai Project (open source course management system); they may even know about Kuali (open source financials). So, what is the next wave in open source software? This article discusses business intelligence (BI) systems. Though open source BI may still be only a rumor in most campus IT departments, some brave early adopters…
Financial anatomy of biomedical research.
Moses, Hamilton; Dorsey, E Ray; Matheson, David H M; Thier, Samuel O
2005-09-21
Public and private financial support of biomedical research has increased over the past decade. Few comprehensive analyses of the sources and uses of funds are available. This results in inadequate information on which to base investment decisions, because not all sources allow equal latitude to explore hypotheses of scientific or clinical importance, and it creates a barrier to judging the value of research to society. To quantify funding trends from 1994 to 2004 of basic, translational, and clinical biomedical research by principal sponsors based in the United States, publicly available data were compiled for the federal, state, and local governments; foundations; charities; universities; and industry. Proprietary (by subscription but openly available) databases were used to supplement public sources. Measures were total actual research spending, growth rates, and type of research, with inflation adjustment. Biomedical research funding increased from $37.1 billion in 1994 to $94.3 billion in 2003 and doubled when adjusted for inflation. Principal research sponsors in 2003 were industry (57%) and the National Institutes of Health (28%). Relative proportions from all public and private sources did not change. Industry sponsorship of clinical trials increased from $4.0 billion to $14.2 billion (in real terms), while federal proportions devoted to basic and applied research were unchanged. The United States spent an estimated 5.6% of its total health expenditures on biomedical research, more than any other country, but less than 0.1% on health services research. From an economic perspective, biotechnology and medical device companies were most productive, as measured by new diagnostic and therapeutic devices per dollar of research and development cost. Productivity declined for new pharmaceuticals. Enhancing research productivity and evaluation of benefit are pressing challenges, requiring (1) more effective translation of basic scientific knowledge to clinical application; (2) critical appraisal of rapidly moving scientific areas to guide investment where clinical need is greatest, not only where commercial opportunity is currently perceived; and (3) more specific information about sources and uses of research funds than is generally available, to allow informed investment decisions. Responsibility falls on industry, government, and foundations to bring these changes about with a longer-term view of research value.
Biomedical ontologies: toward scientific debate.
Maojo, V; Crespo, J; García-Remesal, M; de la Iglesia, D; Perez-Rey, D; Kulikowski, C
2011-01-01
Biomedical ontologies have been very successful in structuring knowledge for many different applications, receiving widespread praise for their utility and potential. Yet, the role of computational ontologies in scientific research, as opposed to knowledge management applications, has not been extensively discussed. We aim to stimulate further discussion on the advantages and challenges presented by biomedical ontologies from a scientific perspective. We review various aspects of biomedical ontologies going beyond their practical successes, and focus on some key scientific questions in two ways. First, we analyze and discuss current approaches to improve biomedical ontologies that are based largely on classical, Aristotelian ontological models of reality. Second, we raise various open questions about biomedical ontologies that require further research, analyzing in more detail those related to visual reasoning and spatial ontologies. We outline significant scientific issues that biomedical ontologies should consider, beyond current efforts of building practical consensus between them. For spatial ontologies, we suggest an approach for building "morphospatial" taxonomies, as an example that could stimulate research on fundamental open issues for biomedical ontologies. Analysis of a large number of problems with biomedical ontologies suggests that the field is very much open to alternative interpretations of current work, and in need of scientific debate and discussion that can lead to new ideas and research directions.
The Commercial Open Source Business Model
NASA Astrophysics Data System (ADS)
Riehle, Dirk
Commercial open source software projects are open source software projects that are owned by a single firm that derives a direct and significant revenue stream from the software. Commercial open source at first glance represents an economic paradox: How can a firm earn money if it is making its product available for free as open source? This paper presents the core properties of commercial open source business models and discusses how they work. Using a commercial open source approach, firms can get to market faster with a superior product at lower cost than is possible for traditional competitors. The paper shows how these benefits accrue from an engaged and self-supporting user community. Lacking any prior comprehensive reference, this paper is based on an analysis of public statements by practitioners of commercial open source. It forges the various anecdotes into a coherent description of revenue generation strategies and relevant business functions.
Pika: A snow science simulation tool built using the open-source framework MOOSE
NASA Astrophysics Data System (ADS)
Slaughter, A.; Johnson, M.
2017-12-01
The Department of Energy (DOE) is currently investing millions of dollars annually into various modeling and simulation tools for all aspects of nuclear energy. An important part of this effort includes developing applications based on the open-source Multiphysics Object Oriented Simulation Environment (MOOSE; mooseframework.org) from Idaho National Laboratory (INL). Thanks to the efforts of the DOE and outside collaborators, MOOSE currently contains a large set of physics modules, including phase-field, level set, heat conduction, tensor mechanics, Navier-Stokes, fracture and crack propagation (via the extended finite-element method), flow in porous media, and others. The heat conduction, tensor mechanics, and phase-field modules, in particular, are well-suited for snow science problems. Pika--an open-source MOOSE-based application--is capable of simulating both 3D, coupled nonlinear continuum heat transfer and large-deformation mechanics applications (such as settlement) and phase-field based micro-structure applications. Additionally, these types of problems may be coupled tightly in a single solve or across length and time scales using a loosely coupled Picard iteration approach. In addition to the wide range of physics capabilities, MOOSE-based applications also inherit an extensible testing framework, graphical user interface, and documentation system; tools that allow MOOSE and other applications to adhere to nuclear software quality standards. The snow science community can learn from the nuclear industry and harness the existing effort to build simulation tools that are open, modular, and share a common framework. In particular, MOOSE-based multiphysics solvers are inherently parallel, dimension agnostic, adaptive in time and space, fully coupled, and capable of interacting with other applications. The snow science community should build on existing tools to enable collaboration between researchers and practitioners throughout the world, and advance the state-of-the-art in line with other scientific research efforts.
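As a rough illustration of the loosely coupled Picard approach mentioned in this abstract, the sketch below iterates two toy single-variable solvers to a fixed point. The solver functions are illustrative stand-ins, not the MOOSE API:

```python
# Minimal sketch of loosely coupled Picard iteration between two
# single-variable "solvers" (toy stand-ins for, e.g., heat transfer
# and mechanics). Each solve uses the other field's latest value
# until the combined update falls below a tolerance.

def solve_heat(mech):
    # toy "heat" solve: temperature depends on mechanical deformation
    return 1.0 + 0.1 * mech

def solve_mechanics(temp):
    # toy "mechanics" solve: deformation depends on temperature
    return 0.5 * temp

def picard(tol=1e-10, max_its=50):
    temp, mech = 0.0, 0.0
    for it in range(max_its):
        temp_new = solve_heat(mech)
        mech_new = solve_mechanics(temp_new)
        err = abs(temp_new - temp) + abs(mech_new - mech)
        temp, mech = temp_new, mech_new
        if err < tol:
            return temp, mech, it + 1
    raise RuntimeError("Picard iteration did not converge")

t, m, its = picard()
print(f"converged in {its} iterations: T={t:.6f}, u={m:.6f}")
```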
Nature of air pollution, emission sources, and management in the Indian cities
NASA Astrophysics Data System (ADS)
Guttikunda, Sarath K.; Goel, Rahul; Pant, Pallavi
2014-10-01
The global burden of disease study estimated 695,000 premature deaths in 2010 due to continued exposure to outdoor particulate matter and ozone pollution for India. By 2030, the expected growth in many sectors (industries, residential, transportation, power generation, and construction) will result in an increase in pollution-related health impacts for most cities. The available information on urban air pollution, its sources, and the potential of various interventions to control pollution should help us propose a cleaner path to 2030. In this paper, we present an overview of the emission sources and control options for better air quality in Indian cities, with a particular focus on interventions like urban public transportation facilities; travel demand management; emission regulations for power plants; clean technology for brick kilns; management of road dust; and waste management to control open waste burning. Also included is a broader discussion on key institutional measures, like public awareness and scientific studies, necessary for building an effective air quality management plan in Indian cities.
Building a Snow Data System on the Apache OODT Open Technology Stack
NASA Astrophysics Data System (ADS)
Goodale, C. E.; Painter, T. H.; Mattmann, C. A.; Hart, A. F.; Ramirez, P.; Zimdars, P.; Bryant, A. C.; Snow Data System Team
2011-12-01
Snow cover and its melt dominate regional climate and hydrology in many of the world's mountainous regions. One-sixth of Earth's population depends on snow- or glacier-melt for water resources. Operationally, seasonal forecasts of snowmelt-generated streamflow are leveraged through empirical relations based on past snowmelt periods. These historical data show that climate is changing, but the changes reduce the reliability of the empirical relations. Therefore optimal future management of snowmelt derived water resources will require explicit physical models driven by remotely sensed snow property data. Toward this goal, the Snow Optics Laboratory at the Jet Propulsion Laboratory has initiated a near real-time processing pipeline to generate and publish post-processed snow data products within a few hours of satellite acquisition. To solve this challenge, a Scientific Data Management and Processing System was required and the JPL Team leveraged an open-source project called Object Oriented Data Technology (OODT). OODT was developed within NASA's Jet Propulsion Laboratory over the last 10 years. OODT has supported various scientific data management and processing projects, providing solutions in the Earth, Planetary, and Medical science fields. It became apparent that the project needed to be opened to a larger audience to foster and promote growth and adoption. OODT was open-sourced at the Apache Software Foundation in November 2010 and has a growing community of users and committers that are constantly improving the software. Leveraging OODT, the JPL Snow Data System (SnowDS) Team was able to install and configure a core Data Management System (DMS) that would download MODIS raw data files and archive the products in a local repository for post-processing. The team has since built an online data portal, and an algorithm-processing pipeline using the Apache OODT software as the foundation. We will present the working SnowDS system with its core remote sensing components: the MODIS Snow Covered Area and Grain size model (MODSCAG) and the MODIS Dust Radiative Forcing in Snow (MOD-DRFS). These products will be delivered in near real time to water managers and the broader cryosphere and climate community beginning in Winter 2012. We will then present the challenges and opportunities we see in the future as the SnowDS matures and contributions are made back to the OODT project.
Reengineering Workflow for Curation of DICOM Datasets.
Bennett, William; Smith, Kirk; Jarosz, Quasar; Nolan, Tracy; Bosch, Walter
2018-06-15
Reusable, publicly available data is a pillar of open science and rapid advancement of cancer imaging research. Sharing data from completed research studies not only saves research dollars required to collect data, but also helps ensure that studies are both replicable and reproducible. The Cancer Imaging Archive (TCIA) is a global shared repository for imaging data related to cancer. Ensuring the consistency, scientific utility, and anonymity of data stored in TCIA is of utmost importance. As the rate of submission to TCIA has been increasing, both in volume and complexity of DICOM objects stored, the process of curation of collections has become a bottleneck in acquisition of data. In order to increase the rate of curation of image sets, improve the quality of the curation, and better track the provenance of changes made to submitted DICOM image sets, a custom set of tools was developed, using novel methods for the analysis of DICOM data sets. These tools are written in the programming language Perl, use the open-source database PostgreSQL, make use of the Perl DICOM routines in the open-source package Posda, and incorporate DICOM diagnostic tools from other open-source packages, such as dicom3tools. These tools are referred to as the "Posda Tools." The Posda Tools are open source and available via git at https://github.com/UAMS-DBMI/PosdaTools . In this paper, we briefly describe the Posda Tools and discuss the novel methods employed by these tools to facilitate rapid analysis of DICOM data, including the following: (1) use of a database schema which is more permissive than, and differently normalized from, traditional DICOM databases; (2) automatic integrity checks performed on a bulk basis; (3) bulk application of revisions to DICOM datasets, either through a web-based interface or via command-line executable Perl scripts; (4) tracking of all such edits in a revision tracker, with the ability to roll them back; (5) a UI for inspecting the results of such edits, to verify that they are what was intended; (6) identification of DICOM Studies, Series, and SOP instances using "nicknames" which are persistent and have well-defined scope, to make the expression of reported DICOM errors easier to manage; and (7) rapid identification of potential duplicate DICOM datasets by pixel data, which can be used, e.g., to identify submission subjects that may relate to the same individual, without identifying the individual.
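Item (7), duplicate detection by pixel data, can be sketched in Python with pydicom (an assumption for illustration; the Posda Tools themselves are written in Perl):

```python
# Sketch of duplicate detection by pixel data: hash each file's raw
# PixelData bytes and group identical digests. Files without pixel
# data are skipped. Assumes pydicom is installed.
import hashlib
from collections import defaultdict
from pathlib import Path

import pydicom

def pixel_digest_groups(dicom_dir):
    groups = defaultdict(list)
    for path in Path(dicom_dir).rglob("*.dcm"):
        ds = pydicom.dcmread(path)
        if "PixelData" not in ds:
            continue
        digest = hashlib.sha256(ds.PixelData).hexdigest()
        groups[digest].append(path)
    # keep only digests shared by more than one file
    return {d: ps for d, ps in groups.items() if len(ps) > 1}

for digest, paths in pixel_digest_groups("incoming/").items():
    print(digest[:12], "->", [p.name for p in paths])
```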
Kriegeskorte, Nikolaus
2012-01-01
The two major functions of a scientific publishing system are to provide access to and evaluation of scientific papers. While open access (OA) is becoming a reality, open evaluation (OE), the other side of the coin, has received less attention. Evaluation steers the attention of the scientific community and thus the very course of science. It also influences the use of scientific findings in public policy. The current system of scientific publishing provides only journal prestige as an indication of the quality of new papers and relies on a non-transparent and noisy pre-publication peer-review process, which delays publication by many months on average. Here I propose an OE system, in which papers are evaluated post-publication in an ongoing fashion by means of open peer review and rating. Through signed ratings and reviews, scientists steer the attention of their field and build their reputation. Reviewers are motivated to be objective, because low-quality or self-serving signed evaluations will negatively impact their reputation. A core feature of this proposal is a division of powers between the accumulation of evaluative evidence and the analysis of this evidence by paper evaluation functions (PEFs). PEFs can be freely defined by individuals or groups (e.g., scientific societies) and provide a plurality of perspectives on the scientific literature. Simple PEFs will use averages of ratings, weighting reviewers (e.g., by H-index), and rating scales (e.g., by relevance to a decision process) in different ways. Complex PEFs will use advanced statistical techniques to infer the quality of a paper. Papers with initially promising ratings will be more deeply evaluated. The continual refinement of PEFs in response to attempts by individuals to influence evaluations in their own favor will make the system ungameable. OA and OE together have the power to revolutionize scientific publishing and usher in a new culture of transparency, constructive criticism, and collaboration. PMID:23087639
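The "simple PEF" described here, an average of ratings weighted by reviewer H-index, can be sketched in a few lines of Python (the weighting scheme and rating scale are illustrative, not a prescription from the paper):

```python
# Minimal sketch of a "paper evaluation function" (PEF): a weighted
# mean of signed ratings, weighting each reviewer by H-index. The
# paper leaves PEFs freely definable by individuals or groups.

def simple_pef(ratings):
    """ratings: list of (rating, reviewer_h_index) pairs."""
    total_weight = sum(h for _, h in ratings)
    if total_weight == 0:
        return None
    return sum(r * h for r, h in ratings) / total_weight

# Three signed ratings on a 0-10 scale from reviewers with
# H-indices of 5, 20, and 40; senior reviewers weigh more.
print(simple_pef([(7, 5), (9, 20), (4, 40)]))
```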
Open-source mobile digital platform for clinical trial data collection in low-resource settings.
van Dam, Joris; Omondi Onyango, Kevin; Midamba, Brian; Groosman, Nele; Hooper, Norman; Spector, Jonathan; Pillai, Goonaseelan Colin; Ogutu, Bernhards
2017-02-01
Governments, universities and pan-African research networks are building durable infrastructure and capabilities for biomedical research in Africa. This offers the opportunity to adopt from the outset innovative approaches and technologies that would be challenging to retrofit into fully established research infrastructures such as those regularly found in high-income countries. In this context we piloted the use of a novel mobile digital health platform, designed specifically for low-resource environments, to support high-quality data collection in a clinical research study. Our primary aim was to assess the feasibility of using a mobile digital platform for clinical trial data collection in a low-resource setting. Secondarily, we sought to explore the potential benefits of such an approach. The investigative site was a research institute in Nairobi, Kenya. We integrated an open-source platform for mobile data collection commonly used in the developing world with an open-source, standard platform for electronic data capture in clinical trials. The integration was developed using common data standards (Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model), maximising the potential to extend the approach to other platforms. The system was deployed in a pharmacokinetic study involving healthy human volunteers. The electronic data collection platform successfully supported conduct of the study. Multidisciplinary users reported high levels of satisfaction with the mobile application and highlighted substantial advantages when compared with traditional paper record systems. The new system also demonstrated a potential for expediting data quality review. This pilot study demonstrated the feasibility of using a mobile digital platform for clinical research data collection in low-resource settings. Sustainable scientific capabilities and infrastructure are essential to attract and support clinical research studies. Since many research structures in Africa are being developed anew, stakeholders should consider implementing innovative technologies and approaches.
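The integration described above relied on the CDISC Operational Data Model (ODM), an XML standard for exchanging clinical study data. The following Python sketch builds an ODM-style payload carrying one captured field value; the element names follow the ODM pattern, but the toy document is illustrative and not schema-valid:

```python
# Sketch of an ODM-style XML payload (ClinicalData/SubjectData/.../
# ItemData). OIDs and values are invented for illustration.
import xml.etree.ElementTree as ET

odm = ET.Element("ODM", FileType="Transactional", FileOID="example-001")
clinical = ET.SubElement(odm, "ClinicalData", StudyOID="PK-PILOT",
                         MetaDataVersionOID="v1")
subject = ET.SubElement(clinical, "SubjectData", SubjectKey="SUBJ-001")
event = ET.SubElement(subject, "StudyEventData", StudyEventOID="VISIT1")
form = ET.SubElement(event, "FormData", FormOID="VITALS")
group = ET.SubElement(form, "ItemGroupData", ItemGroupOID="VS")
ET.SubElement(group, "ItemData", ItemOID="VS.WEIGHT", Value="72.5")

print(ET.tostring(odm, encoding="unicode"))
```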
PLOCAN glider portal: a gateway for useful data management and visualization system
NASA Astrophysics Data System (ADS)
Morales, Tania; Lorenzo, Alvaro; Viera, Josue; Barrera, Carlos; José Rueda, María
2014-05-01
Nowadays monitoring ocean behavior and its characteristics involves a wide range of sources able to gather and provide a vast amount of data in spatio-temporal scales. Multiplatform infrastructures, like PLOCAN, hold a variety of autonomous Lagrangian and Eulerian devices addressed to collect information then transferred to land in near-real time. Managing all this data collection in an efficient way is a major issue. Advances in ocean observation technologies, where underwater autonomous gliders play a key role, has brought as a consequence an improvement of spatio-temporal resolution which offers a deeper understanding of the ocean but requires a bigger effort in the data management process. There are general requirements in terms of data management in that kind of environments, such as processing raw data at different levels to obtain valuable information, storing data coherently and providing accurate products to final users according to their specific needs. Managing large amount of data can be certainly tedious and complex without having right tools and operational procedures; hence automating these tasks through software applications saves time and reduces errors. Moreover, data distribution is highly relevant since scientist tent to assimilate different sources for comparison and validation. The use of web applications has boosted the necessary scientific dissemination. Within this argument, PLOCAN has implemented a set of independent but compatible applications to process, store and disseminate information gathered through different oceanographic platforms. These applications have been implemented using open standards, such as HTML and CSS, and open source software, like python as programming language and Django as framework web. More specifically, a glider application has been developed within the framework of FP7-GROOM project. Regarding data management, this project focuses on collecting and making available consistent and quality controlled datasets as well as fostering open access to glider data.
JHelioviewer: Open-Source Software for Discovery and Image Access in the Petabyte Age
NASA Astrophysics Data System (ADS)
Mueller, D.; Dimitoglou, G.; Garcia Ortiz, J.; Langenberg, M.; Nuhn, M.; Dau, A.; Pagel, S.; Schmidt, L.; Hughitt, V. K.; Ireland, J.; Fleck, B.
2011-12-01
The unprecedented torrent of data returned by the Solar Dynamics Observatory is both a blessing and a barrier: a blessing for making available data with significantly higher spatial and temporal resolution, but a barrier for scientists to access, browse and analyze them. With such staggering data volume, the data are accessible only from a few repositories, and users have to deal with data sets that are effectively immobile and practically difficult to download. From a scientist's perspective this poses three challenges: accessing, browsing and finding interesting data while avoiding the proverbial search for a needle in a haystack. To address these challenges, we have developed JHelioviewer, an open-source visualization software that lets users browse large data volumes both as still images and movies. We did so by deploying an efficient image encoding, storage, and dissemination solution using the JPEG 2000 standard. This solution enables users to access remote images at different resolution levels as a single data stream. Users can view, manipulate, pan, zoom, and overlay JPEG 2000 compressed data quickly, without severe network bandwidth penalties. Besides viewing data, the browser provides third-party metadata and event catalog integration to quickly locate data of interest, as well as an interface to the Virtual Solar Observatory to download science-quality data. As part of the ESA/NASA Helioviewer Project, JHelioviewer offers intuitive ways to browse large amounts of heterogeneous data remotely and provides an extensible and customizable open-source platform for the scientific community. In addition, the easy-to-use graphical user interface enables the general public and educators to access, enjoy and reuse data from space missions without barriers.
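The property JHelioviewer exploits, namely that a single JPEG 2000 code-stream can be decoded at multiple resolution levels, can be demonstrated with the Python library glymur (an assumed stand-in for illustration; JHelioviewer itself is Java and streams via JPIP, and the file name below is a placeholder):

```python
# Sketch: one JPEG 2000 file decoded at several resolution levels.
# Slicing with a power-of-two step tells glymur to decode at a lower
# resolution level, without reading the full-resolution data.
import glymur

jp2 = glymur.Jp2k("aia_171.jp2")   # placeholder file name

full = jp2[:]            # full-resolution decode
half = jp2[::2, ::2]     # next lower resolution level
quarter = jp2[::4, ::4]  # lower still

print(full.shape, half.shape, quarter.shape)
```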
NASA Astrophysics Data System (ADS)
Udell, C.; Selker, J. S.
2017-12-01
The increasing availability and functionality of Open-Source software and hardware along with 3D printing, low-cost electronics, and proliferation of open-access resources for learning rapid prototyping are contributing to fundamental transformations and new technologies in environmental sensing. These tools invite reevaluation of time-tested methodologies and devices toward more efficient, reusable, and inexpensive alternatives. Building upon Open-Source design facilitates community engagement and invites a Do-It-Together (DIT) collaborative framework for research where solutions to complex problems may be crowd-sourced. However, barriers persist that prevent researchers from taking advantage of the capabilities afforded by open-source software, hardware, and rapid prototyping. Some of these include: requisite technical skillsets, knowledge of equipment capabilities, identifying inexpensive sources for materials, money, space, and time. A university MAKER space staffed by engineering students to assist researchers is one proposed solution to overcome many of these obstacles. This presentation investigates the unique capabilities the USDA-funded Openly Published Environmental Sensing (OPEnS) Lab affords researchers, within Oregon State and internationally, and the unique functions these types of initiatives support at the intersection of MAKER spaces, Open-Source academic research, and open-access dissemination.
Data Cube Visualization with Blender
NASA Astrophysics Data System (ADS)
Kent, Brian R.; Gárate, Matías
2017-06-01
With the increasing data acquisition rates from observational and computational astrophysics, new tools are needed to study and visualize data. We present a methodology for rendering 3D data cubes using the open-source 3D software Blender. By importing processed observations and numerical simulations through the Voxel Data format, we are able to use the Blender interface and Python API to create high-resolution animated visualizations. We review the methods for data import, animation, and camera movement, and present examples of this methodology. The 3D rendering of data cubes gives scientists the ability to create appealing displays that can be used for both scientific presentations as well as public outreach.
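For readers wanting to try this pipeline, the sketch below writes a NumPy data cube to Blender's legacy .bvox voxel format. The four-int32 header layout follows community documentation of the format and should be verified against the Blender version in use:

```python
# Sketch: write a NumPy cube to Blender's legacy .bvox voxel format
# (header of four int32 values: nx, ny, nz, frames; then float32
# densities in [0, 1]). Layout per community docs; verify locally.
import numpy as np

def write_bvox(cube, path):
    cube = np.asarray(cube, dtype=np.float32)
    nz, ny, nx = cube.shape
    header = np.array([nx, ny, nz, 1], dtype=np.int32)  # 1 = single frame
    with open(path, "wb") as f:
        header.tofile(f)
        cube.ravel().tofile(f)

# toy data cube: a soft Gaussian blob, normalized to [0, 1]
z, y, x = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
blob = np.exp(-(x**2 + y**2 + z**2) / 0.1)
write_bvox(blob / blob.max(), "cube.bvox")
```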
Babcock, Hazen P
2018-01-29
This work explores the use of industrial grade CMOS cameras for single molecule localization microscopy (SMLM). We show that industrial grade CMOS cameras approach the performance of scientific grade CMOS cameras at a fraction of the cost. This makes it more economically feasible to construct high-performance imaging systems with multiple cameras that are capable of a diversity of applications. In particular we demonstrate the use of industrial CMOS cameras for biplane, multiplane and spectrally resolved SMLM. We also provide open-source software for simultaneous control of multiple CMOS cameras and for the reduction of the movies that are acquired to super-resolution images.
Open-source software: not quite endsville.
Stahl, Matthew T
2005-02-01
Open-source software will never achieve ubiquity. There are environments in which it simply does not flourish. By its nature, open-source development requires free exchange of ideas, community involvement, and the efforts of talented and dedicated individuals. However, pressures can come from several sources that prevent this from happening. In addition, openness and complex licensing issues invite misuse and abuse. Care must be taken to avoid the pitfalls of open-source software.
Developing an Open Source Option for NASA Software
NASA Technical Reports Server (NTRS)
Moran, Patrick J.; Parks, John W. (Technical Monitor)
2003-01-01
We present arguments in favor of developing an Open Source option for NASA software; in particular, we discuss how Open Source is compatible with NASA's mission. We compare and contrast several of the leading Open Source licenses, and propose one, the Mozilla license, for use by NASA. We also address some of the related issues for NASA with respect to Open Source. In particular, we discuss some of the elements in the External Release of NASA Software document (NPG 2210.1A) that will likely have to be changed in order to make Open Source a reality within the agency.
NASA Astrophysics Data System (ADS)
Tajudin, Nor'ain Mohd.; Saad, Noor Shah; Rahman, Nurulhuda Abd; Yahaya, Asmayati; Alimon, Hasimah; Dollah, Mohd. Uzi; Abd Karim, Mohd. Mustaman
2012-05-01
The objectives of this quantitative survey research were (a) to establish the level of scientific reasoning (SR) skills among science, mathematics and engineering (SME) undergraduates in Malaysian Institutes of Higher Learning (IHL); (b) to identify the types of instructional methods used in teaching SME at universities; and (c) to map the instructional methods employed onto the level of SR skills among the undergraduates. Six universities, selected by zone using a stratified random sampling technique, were involved in this study. For each university, the participating faculties were those offering degree programmes in science, mathematics and engineering. A total of 975 students participated in this study. Two instruments were used, namely the Lawson Scientific Reasoning Skills Test and the Lecturers' Teaching Style Survey. Descriptive and inferential statistics, such as means, t-tests and Pearson correlations, were used to analyze the data. Findings showed that most students were at the concrete level of scientific reasoning skills, with an overall mean of 3.23. According to students' perceptions, the expert and delegator styles were the dominant lecturers' teaching styles. In addition, there was no correlation between lecturers' teaching style and the level of scientific reasoning skills; thus, this study could not map the dominant lecturers' teaching styles onto the level of scientific reasoning skills of SME undergraduates in Malaysian Public Institutes of Higher Learning. Nevertheless, the results give some indication that the expert and delegator teaching styles did not contribute to the development of students' scientific reasoning skills. This study can serve as a baseline for SME undergraduates' level of scientific reasoning skills in Malaysian Public Institutes of Higher Learning, and it opens avenues for other researchers to investigate scientific reasoning further, so that instructional models can be developed to enhance students' scientific reasoning skills.
A robust scientific workflow for assessing fire danger levels using open-source software
NASA Astrophysics Data System (ADS)
Vitolo, Claudia; Di Giuseppe, Francesca; Smith, Paul
2017-04-01
Modelling forest fires is theoretically and computationally challenging because it involves the use of a wide variety of information, in large volumes and affected by high uncertainties. In-situ observations of wildfire, for instance, are highly sparse and need to be complemented by remotely sensed data measuring biomass burning to achieve homogeneous coverage at global scale. Fire models use weather reanalysis products to measure energy release and rate of spread but can only assess the potential predictability of fire danger as the actual ignition is due to human behaviour and, therefore, very unpredictable. Lastly, fire forecasting systems rely on weather forecasts to extend the advance warning but are currently calibrated using fire danger thresholds that are defined at global scale and do not take into account the spatial variability of fuel availability. As a consequence, uncertainties sharply increase cascading from the observational to the modelling stage and they might be further inflated by non-reproducible analyses. Although uncertainties in observations will only decrease with technological advances over the next decades, the other uncertainties (i.e. generated during modelling and post-processing) can already be addressed by developing transparent and reproducible analysis workflows, even more if implemented within open-source initiatives. This is because reproducible workflows aim to streamline the processing task as they present ready-made solutions to handle and manipulate complex and heterogeneous datasets. Also, opening the code to the scrutiny of other experts increases the chances to implement more robust solutions and avoids duplication of efforts. In this work we present our contribution to the forest fire modelling community: an open-source tool called "caliver" for the calibration and verification of forest fire model results. This tool is developed in the R programming language and publicly available under an open license. We will present the caliver R package, illustrate the main functionalities and show the results of our preliminary experiments calculating fire danger thresholds for various regions on Earth. We will compare these with the existing global thresholds and, lastly, demonstrate how these newly-calculated regional thresholds can lead to improved calibration of fire forecast models in an operational setting.
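The threshold calibration that caliver performs can be illustrated with a percentile-based sketch in Python (caliver itself is an R package, and the data here are synthetic):

```python
# Sketch of percentile-based fire danger thresholds: danger classes
# are bounded by chosen percentiles of a region's historical fire
# weather index (FWI) distribution. Synthetic data for illustration.
import numpy as np

rng = np.random.default_rng(42)
fwi_history = rng.gamma(shape=2.0, scale=8.0, size=10_000)  # synthetic FWI

# example class boundaries: moderate / high / very high / extreme
percentiles = [50, 75, 90, 98]
thresholds = np.percentile(fwi_history, percentiles)

for p, t in zip(percentiles, thresholds):
    print(f"{p}th percentile threshold: FWI = {t:.1f}")
```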
Undergraduate honors students' images of science: Nature of scientific work and scientific knowledge
NASA Astrophysics Data System (ADS)
Wallace, Michael L.
This exploratory study assessed the influence of an implicit, inquiry-oriented nature of science (NOS) instructional approach undertaken in an interdisciplinary college science course on undergraduate honor students' (UHS) understanding of the aspects of NOS for scientific work and scientific knowledge. In this study, the nature of scientific work concentrated upon the delineation of science from pseudoscience and the value scientists place on reproducibility. The nature of scientific knowledge concentrated upon how UHS view scientific theories and how they believe scientists utilize scientific theories in their research. The 39 UHS who participated in the study were non-science majors enrolled in a Honors College sponsored interdisciplinary science course where the instructors took an implicit NOS instructional approach. An open-ended assessment instrument, the UFO Scenario, was designed for the course and used to assess UHS' images of science at the beginning and end of the semester. The mixed-design study employed both qualitative and quantitative techniques to analyze the open-ended responses. The qualitative techniques of open and axial coding were utilized to find recurring themes within UHS' responses. McNemar's chi-square test for two dependent samples was used to identify whether any statistically significant changes occurred within responses from the beginning to the end of the semester. At the start of the study, the majority of UHS held mixed NOS views, but were able to accurately define what a scientific theory is and explicate how scientists utilize theories within scientific research. Postinstruction assessment indicated that UHS did not make significant gains in their understanding of the nature of scientific work or scientific knowledge and their overall images of science remained static. The results of the present study found implicit NOS instruction even with an extensive inquiry-oriented component was an ineffective approach for modifying UHS' images of science towards a more informed view of NOS.
Yarkoni, Tal
2012-01-01
Traditional pre-publication peer review of scientific output is a slow, inefficient, and unreliable process. Efforts to replace or supplement traditional evaluation models with open evaluation platforms that leverage advances in information technology are slowly gaining traction, but remain in the early stages of design and implementation. Here I discuss a number of considerations relevant to the development of such platforms. I focus particular attention on three core elements that next-generation evaluation platforms should strive to emphasize, including (1) open and transparent access to accumulated evaluation data, (2) personalized and highly customizable performance metrics, and (3) appropriate short-term incentivization of the userbase. Because all of these elements have already been successfully implemented on a large scale in hundreds of existing social web applications, I argue that development of new scientific evaluation platforms should proceed largely by adapting existing techniques rather than engineering entirely new evaluation mechanisms. Successful implementation of open evaluation platforms has the potential to substantially advance both the pace and the quality of scientific publication and evaluation, and the scientific community has a vested interest in shifting toward such models as soon as possible. PMID:23060783
The last bite was deadly--about responsibility in scientific publishing.
Pavlovic, Dragan; Usichenko, Taras I; Lehmann, Christian
2014-01-01
Some open access journals are believed to have devalued the highly respected image of the scientific journal, and it has been claimed that this was verified experimentally. We believe the project in question failed, and we show why. The study itself was badly conducted, and the report that Science published was itself a perfect example of "bad science". If the article published in Science were taken as one of the "test" articles, with Science as a victim journal (albeit a perfect control), the study would show the opposite of what the author concluded in his paper: 100% of the controls (conventional non-open access journals, here represented by Science) accepted the "bait" paper for publication, while in the experimental group only about 60% of open access journals did. With respect to non-open access versus open access, the probability of accepting pseudoscience would thus favor the non-open access journal. Since this interpretation rests on facts that were not part of the project itself, the only warranted conclusion is that nothing can be concluded from the study. We conclude that the method Bohannon used was heavily flawed and, in addition, immoral; that the report published by Science was inconclusive; and that the act of publishing such a report cannot be morally justified either. Various methods to improve the quality of published papers exist, but scientific fraud with "good intentions" as a method of promoting scientific publishing should be avoided.
Johns, Douglas O.; Walker, Katherine; Benromdhane, Souad; Hubbell, Bryan; Ross, Mary; Devlin, Robert B.; Costa, Daniel L.; Greenbaum, Daniel S.
2012-01-01
Objectives: The U.S. Environmental Protection Agency is working toward gaining a better understanding of the human health impacts of exposure to complex air pollutant mixtures and the key features that drive the toxicity of these mixtures, which can then be used for future scientific and risk assessments. Data sources: A public workshop was held in Chapel Hill, North Carolina, 22–24 February 2011, to discuss scientific issues and data gaps related to adopting multipollutant science and risk assessment approaches, with a particular focus on the criteria air pollutants. Expert panelists in the fields of epidemiology, toxicology, and atmospheric and exposure sciences led open discussions to encourage workshop participants to think broadly about available and emerging scientific evidence related to multipollutant approaches to evaluating the health effects of air pollution. Synthesis: Although there is clearly a need for novel research and analytical approaches to better characterize the health effects of multipollutant exposures, much progress can be made by using existing scientific information and statistical methods to evaluate the effects of single pollutants in a multipollutant context. This work will have a direct impact on the development of a multipollutant science assessment and a conceptual framework for conducting multipollutant risk assessments. Conclusions: Transitioning to a multipollutant paradigm can be aided through the adoption of a framework for multipollutant science and risk assessment that encompasses well-studied and ubiquitous air pollutants. Successfully advancing methods for conducting these assessments will require collaborative and parallel efforts between the scientific and environmental regulatory and policy communities. PMID:22645280
Berman, Jules J; Edgerton, Mary E; Friedman, Bruce A
2003-01-01
Background Tissue Microarrays (TMAs) allow researchers to examine hundreds of small tissue samples on a single glass slide. The information held in a single TMA slide may easily involve Gigabytes of data. To benefit from TMA technology, the scientific community needs an open source TMA data exchange specification that will convey all of the data in a TMA experiment in a format that is understandable to both humans and computers. A data exchange specification for TMAs allows researchers to submit their data to journals and to public data repositories and to share or merge data from different laboratories. In May 2001, the Association of Pathology Informatics (API) hosted the first in a series of four workshops, co-sponsored by the National Cancer Institute, to develop an open, community-supported TMA data exchange specification. Methods A draft tissue microarray data exchange specification was developed through workshop meetings. The first workshop confirmed community support for the effort and urged the creation of an open XML-based specification. This was to evolve in steps, with approval for each step coming from the stakeholders in the user community during open workshops. By the fourth workshop, held October 2002, a set of Common Data Elements (CDEs) was established, as well as a basic strategy for organizing TMA data in self-describing XML documents. Results The TMA data exchange specification is a well-formed XML document with four required sections: 1) Header, containing the specification Dublin Core identifiers, 2) Block, describing the paraffin-embedded array of tissues, 3) Slide, describing the glass slides produced from the Block, and 4) Core, containing all data related to the individual tissue samples contained in the array. Eighty CDEs, conforming to the ISO-11179 specification for data elements, constitute the XML tags used in the TMA data exchange specification. A set of six simple semantic rules describes the complete data exchange specification. Anyone using the data exchange specification can validate their TMA files using a software implementation written in Perl and distributed as a supplemental file with this publication. Conclusion The TMA data exchange specification is now available in a draft form with community-approved Common Data Elements and a community-approved general file format and data structure. The specification can be freely used by the scientific community. Efforts sponsored by the Association for Pathology Informatics to refine the draft TMA data exchange specification are expected to continue for at least two more years. The interested public is invited to participate in these open efforts. Information on future workshops will be posted at the API web site. PMID:12769826
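A minimal sketch of the four required sections, built with Python's standard library, may help readers picture the document structure (element names, attributes, and nesting here are illustrative, not the 80 CDEs of the actual specification):

```python
# Illustrative skeleton of a TMA data exchange document with the four
# required sections: Header, Block, Slide, Core. Not schema-valid.
import xml.etree.ElementTree as ET

tma = ET.Element("tma_document")
header = ET.SubElement(tma, "header")
ET.SubElement(header, "title").text = "Example TMA experiment"  # Dublin Core-style metadata
block = ET.SubElement(tma, "block", rows="10", columns="12")
slide = ET.SubElement(block, "slide", stain="H&E")
core = ET.SubElement(slide, "core", row="1", column="1")
ET.SubElement(core, "diagnosis").text = "adenocarcinoma"

print(ET.tostring(tma, encoding="unicode"))
```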
Open-Source Data and the Study of Homicide.
Parkin, William S; Gruenewald, Jeff
2015-07-20
To date, no discussion has taken place in the social sciences as to the appropriateness of using open-source data to augment, or replace, official data sources in homicide research. The purpose of this article is to examine whether open-source data have the potential to be used as a valid and reliable data source in testing theory and studying homicide. Official and open-source homicide data were collected as a case study in a single jurisdiction over a 1-year period. The data sets were compared to determine whether open sources could recreate the population of homicides and variable responses collected in official data. Open-source data were able to replicate the population of homicides identified in the official data. For every variable measured, the open sources captured as much, or more, of the information presented in the official data. In addition, variables not available in official data, but potentially useful for testing theory, were identified in open sources. The results of the case study show that open-source data are potentially as effective as official data in identifying individual- and situational-level characteristics, provide access to variables not found in official homicide data, and offer geographic data that can be used to link macro-level characteristics to homicide events. © The Author(s) 2015.
The Open Spectral Database: an open platform for sharing and searching spectral data.
Chalk, Stuart J
2016-01-01
A number of websites make available spectral data for download (typically as JCAMP-DX text files) and one (ChemSpider) that also allows users to contribute spectral files. As a result, searching and retrieving such spectral data can be time consuming, and difficult to reuse if the data is compressed in the JCAMP-DX file. What is needed is a single resource that allows submission of JCAMP-DX files, export of the raw data in multiple formats, searching based on multiple chemical identifiers, and is open in terms of license and access. To address these issues a new online resource called the Open Spectral Database (OSDB) http://osdb.info/ has been developed and is now available. Built using open source tools, using open code (hosted on GitHub), providing open data, and open to community input about design and functionality, the OSDB is available for anyone to submit spectral data, making it searchable and available to the scientific community. This paper details the concept and coding, internal architecture, export formats, Representational State Transfer (REST) Application Programming Interface and options for submission of data. The OSDB website went live in November 2015. Concurrently, the GitHub repository was made available at https://github.com/stuchalk/OSDB/, and is open for collaborators to join the project, submit issues, and contribute code. The combination of a scripting environment (PHPStorm), a PHP Framework (CakePHP), a relational database (MySQL) and a code repository (GitHub) provides all the capabilities to easily develop REST based websites for ingestion, curation and exposure of open chemical data to the community at all levels. It is hoped this software stack (or equivalent ones in other scripting languages) will be leveraged to make more chemical data available for both humans and computers.
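Programmatic access via the REST API might look like the following Python sketch; the route and parameter names are hypothetical, so consult the OSDB documentation at http://osdb.info/ for the real interface:

```python
# Sketch of querying a REST endpoint for spectra by chemical
# identifier. Route and parameters are hypothetical placeholders.
import requests

BASE = "http://osdb.info/api"              # assumed base URL
inchikey = "UHOVQNZJYSORNB-UHFFFAOYSA-N"   # benzene

resp = requests.get(f"{BASE}/spectra",
                    params={"inchikey": inchikey}, timeout=30)
resp.raise_for_status()
for spectrum in resp.json():
    print(spectrum.get("id"), spectrum.get("technique"))
```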
NASA Astrophysics Data System (ADS)
Beggrow, Elizabeth P.; Ha, Minsu; Nehm, Ross H.; Pearl, Dennis; Boone, William J.
2014-02-01
The landscape of science education is being transformed by the new Framework for Science Education (National Research Council, A framework for K-12 science education: practices, crosscutting concepts, and core ideas. The National Academies Press, Washington, DC, 2012), which emphasizes the centrality of scientific practices—such as explanation, argumentation, and communication—in science teaching, learning, and assessment. A major challenge facing the field of science education is developing assessment tools that are capable of validly and efficiently evaluating these practices. Our study examined the efficacy of a free, open-source machine-learning tool for evaluating the quality of students' written explanations of the causes of evolutionary change relative to three other approaches: (1) human-scored written explanations, (2) a multiple-choice test, and (3) clinical oral interviews. A large sample of undergraduates (n = 104) exposed to varying amounts of evolution content completed all three assessments: a clinical oral interview, a written open-response assessment, and a multiple-choice test. Rasch analysis was used to compute linear person measures and linear item measures on a single logit scale. We found that the multiple-choice test displayed poor person and item fit (mean square outfit >1.3), while both oral interview measures and computer-generated written response measures exhibited acceptable fit (average mean square outfit for interview: person 0.97, item 0.97; computer: person 1.03, item 1.06). Multiple-choice test measures were more weakly associated with interview measures (r = 0.35) than the computer-scored explanation measures (r = 0.63). Overall, Rasch analysis indicated that computer-scored written explanation measures (1) have the strongest correspondence to oral interview measures; (2) are capable of capturing students' normative scientific and naive ideas as accurately as human-scored explanations, and (3) more validly detect understanding than the multiple-choice assessment. These findings demonstrate the great potential of machine-learning tools for assessing key scientific practices highlighted in the new Framework for Science Education.
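As a rough illustration of machine scoring of short written explanations, the following Python sketch trains a bag-of-words classifier on toy data; the study used a dedicated open-source scoring tool, not this exact pipeline:

```python
# Sketch: score written evolution explanations as normative (1) vs
# naive (0) with a TF-IDF + logistic regression pipeline. The four
# training sentences and labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

explanations = [
    "variation exists and advantageous traits are selected",
    "the giraffes stretched their necks so offspring had long necks",
    "differential survival changes trait frequency over generations",
    "animals change because they need to adapt",
]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression())
model.fit(explanations, labels)
print(model.predict(["traits that aid survival become more common"]))
```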
NASA Astrophysics Data System (ADS)
Schwietzke, S.; Sherwood, O.; Michel, S. E.; Bruhwiler, L.; Dlugokencky, E. J.; Tans, P. P.
2017-12-01
Methane isotopic data have increasingly been used in recent studies to help constrain global atmospheric methane sources and sinks. The added scientific contributions to this field include (i) careful comparisons and merging of atmospheric isotope measurement datasets to increase spatial coverage, (ii) in-depth analyses of observed isotopic spatial gradients and seasonal patterns, and (iii) improved datasets of isotopic source signatures. Different interpretations have been made regarding the utility of the isotopic data on the diagnosis of methane sources and sinks. Some studies have found isotopic evidence of a largely microbial source causing the renewed growth in global atmospheric methane since 2007, and underestimated global fossil fuel methane emissions compared to most previous studies. However, other studies have challenged these conclusions by pointing out substantial spatial variability in isotopic source signatures as well as open questions in atmospheric sinks and biomass burning trends. This presentation will review and contrast the main arguments and evidence for the different conclusions. The analysis will distinguish among the different research objectives including (i) global methane budget source attribution in steady-state, (ii) source attribution of recent global methane trends, and (iii) identifying specific methane sources in individual plumes during field campaigns. Additional comparisons of model experiments with atmospheric measurements and updates on isotopic source signature data will complement the analysis.
Getman, Rebekah; Saraf, Avinash; Zhang, Lily H.; Kalenderian, Elsbeth
2015-01-01
Objectives. In an antifluoridation case study, we explored digital pandemics and the social spread of scientifically inaccurate health information across the Web, and we considered the potential health effects. Methods. Using the social networking site Facebook and the open source applications Netvizz and Gephi, we analyzed the connectedness of antifluoride networks as a measure of social influence, the social diffusion of information based on conversations about a sample scientific publication as a measure of spread, and the engagement and sentiment about the publication as a measure of attitudes and behaviors. Results. Our study sample was significantly more connected than was the social networking site overall (P < .001). Social diffusion was evident; users were forced to navigate multiple pages or never reached the sample publication being discussed 60% and 12% of the time, respectively. Users had a 1 in 2 chance of encountering negative and nonempirical content about fluoride unrelated to the sample publication. Conclusions. Network sociology may be as influential as the information content and scientific validity of a particular health topic discussed using social media. Public health must employ social strategies for improved communication management. PMID:25602893
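The connectedness comparison can be illustrated with the networkx library; Netvizz exports Facebook page networks in formats that Gephi (and networkx) can read, but the graphs built below are synthetic:

```python
# Sketch: compare graph density of a tightly connected network
# against a sparse baseline. Both graphs here are synthetic
# scale-free networks, not actual Netvizz exports.
import networkx as nx

antifluoride = nx.barabasi_albert_graph(n=200, m=8, seed=1)  # denser cluster
baseline = nx.barabasi_albert_graph(n=200, m=2, seed=1)      # sparse control

print("sample network density:  ", nx.density(antifluoride))
print("baseline network density:", nx.density(baseline))
```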
Harnessing the power of emerging petascale platforms
NASA Astrophysics Data System (ADS)
Mellor-Crummey, John
2007-07-01
As part of the US Department of Energy's Scientific Discovery through Advanced Computing (SciDAC-2) program, science teams are tackling problems that require computational simulation and modeling at the petascale. A grand challenge for computer science is to develop software technology that makes it easier to harness the power of these systems to aid scientific discovery. As part of its activities, the SciDAC-2 Center for Scalable Application Development Software (CScADS) is building open source software tools to support efficient scientific computing on the emerging leadership-class platforms. In this paper, we describe two tools for performance analysis and tuning that are being developed as part of CScADS: a tool for analyzing scalability and performance, and a tool for optimizing loop nests for better node performance. We motivate these tools by showing how they apply to S3D, a turbulent combustion code under development at Sandia National Laboratory. For S3D, our node performance analysis tool helped uncover several performance bottlenecks. Using our loop nest optimization tool, we transformed S3D's most costly loop nest to reduce execution time by a factor of 2.94 for a processor working on a 50^3 domain.
OOI CyberInfrastructure - Next Generation Oceanographic Research
NASA Astrophysics Data System (ADS)
Farcas, C.; Fox, P.; Arrott, M.; Farcas, E.; Klacansky, I.; Krueger, I.; Meisinger, M.; Orcutt, J.
2008-12-01
Software has become a key enabling technology for scientific discovery, observation, modeling, and exploitation of natural phenomena. New value emerges from the integration of individual subsystems into networked federations of capabilities exposed to the scientific community. Such data-intensive interoperability networks are crucial for future scientific collaborative research, as they open up new ways of fusing data from different sources and across various domains, and analysis on wide geographic areas. The recently established NSF OOI program, through its CyberInfrastructure component addresses this challenge by providing broad access from sensor networks for data acquisition up to computational grids for massive computations and binding infrastructure facilitating policy management and governance of the emerging system-of-scientific-systems. We provide insight into the integration core of this effort, namely, a hierarchic service-oriented architecture for a robust, performant, and maintainable implementation. We first discuss the relationship between data management and CI crosscutting concerns such as identity management, policy and governance, which define the organizational contexts for data access and usage. Next, we detail critical services including data ingestion, transformation, preservation, inventory, and presentation. To address interoperability issues between data represented in various formats we employ a semantic framework derived from the Earth System Grid technology, a canonical representation for scientific data based on DAP/OPeNDAP, and related data publishers such as ERDDAP. Finally, we briefly present the underlying transport based on a messaging infrastructure over the AMQP protocol, and the preservation based on a distributed file system through SDSC iRODS.
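The DAP/OPeNDAP model mentioned above lets clients open a remote dataset by URL and subset it server-side, so only the requested slab crosses the network. A minimal Python sketch (the URL is a placeholder, and netCDF4 must be built with OPeNDAP support):

```python
# Sketch of DAP/OPeNDAP access: open a remote dataset by URL and
# request only a small slab; the subsetting happens server-side.
from netCDF4 import Dataset

ds = Dataset("http://example.org/opendap/sst_monthly")  # placeholder URL
sst = ds.variables["sst"][0, 100:110, 200:210]          # server-side subset
print(sst.shape)
```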
Computer simulation of the CSPAD, ePix10k, and RayonixMX170HS X-ray detectors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tina, Adrienne
2015-08-21
The invention of free-electron lasers (FELs) has opened a door to an entirely new level of scientific research. The Linac Coherent Light Source (LCLS) at SLAC National Accelerator Laboratory is an X-ray FEL that houses several instruments, each with its own unique X-ray applications. This light source is revolutionary in that while its properties allow for a whole new range of scientific opportunities, it also poses numerous challenges. For example, the intensity of a focused X-ray beam is enough to damage a sample in one mere pulse; however, the pulse speed and extreme brightness of the source together are enough to obtain sufficient information about that sample, so that no further measurements are necessary. An important device in the radiation detection process, particularly for X-ray imaging, is the detector. The power of the LCLS X-rays has instigated a need for better performing detectors. The research conducted for this project consisted of the study of X-ray detectors to imitate their behaviors in a computer program. The analysis of the Rayonix MX170-HS, CSPAD, and ePix10k in particular helped to understand their properties. This program simulated the interaction of X-ray photons with these detectors to discern the patterns of their responses. A scientist's selection process of a detector for a specific experiment is simplified by the characterization of the detectors in the program.
The Chandra X-Ray Observatory: Progress Report and Highlights
NASA Technical Reports Server (NTRS)
Weisskopf, Martin C.
2012-01-01
Over the past 13 years, the Chandra X-ray Observatory's ability to provide high resolution X-ray images and spectra has established it as one of the most versatile and powerful tools for astrophysical research in the 21st century. Chandra explores the hot, high-energy regions of the universe, observing X-ray sources with fluxes spanning more than 10 orders of magnitude, from the X-ray brightest, Sco X-1, to the faintest sources in the Chandra Deep Field South survey. Thanks to its continuing operational life, the Chandra mission now also provides a long observing baseline which, in and of itself, is opening new research opportunities. Observations in the past few years alone have deepened our understanding of the co-evolution of supermassive black holes and galaxies, the details of black hole accretion, the nature of dark energy and dark matter, the details of supernovae and their progenitors, the interiors of neutron stars, the evolution of massive stars, the high-energy environment of protoplanetary nebulae, and the interaction of an exo-planet with its star. Here we update the technical status, highlight some of the scientific results, and very briefly discuss future prospects. We fully expect that the Observatory will continue to provide outstanding scientific results for many years to come.
Bayesian modeling to assess populated areas impacted by radiation from Fukushima
NASA Astrophysics Data System (ADS)
Hultquist, C.; Cervone, G.
2017-12-01
Citizen-led movements producing spatio-temporal big data are increasingly important sources of information about populations that are impacted by natural disasters. Citizen science can be used to fill gaps in disaster monitoring data, in addition to inferring human exposure and vulnerability to extreme environmental impacts. As a response to the 2011 release of radiation from Fukushima, Japan, the Safecast project began collecting open radiation data, which has grown into a global dataset of over 70 million measurements to date. This dataset is spatially distributed primarily where humans are located and reveals abnormal patterns of population movement as a result of the disaster. Previous work has demonstrated that Safecast measurements correlate highly with government radiation observations. However, there is still a scientific need to understand the geostatistical variability of Safecast data and to assess how reliable the data are over space and time. The Bayesian hierarchical approach can be used to model the spatial distribution of datasets and to flexibly integrate new flows of data without losing previous information. This enables an understanding of the uncertainty in the spatio-temporal data, informing decision makers about areas of high radiation where populations are located. Citizen science data can thus be scientifically evaluated and used as a critical source of information about populations that are impacted by a disaster.
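The paper names only "the Bayesian hierarchical approach" without giving the model. As a minimal stand-in for one ingredient of such a model, the sketch below performs a conjugate normal update of a single grid cell's dose rate as new Safecast-like measurements stream in, so earlier information is retained rather than discarded; priors, noise levels, and data are invented for the example.

    import numpy as np

    # Prior belief about the dose rate in one grid cell (hypothetical, uSv/h).
    mu_prior, var_prior = 0.2, 0.05 ** 2
    var_obs = 0.03 ** 2  # assumed noise of a single citizen sensor reading

    def update(mu, var, y):
        """Conjugate normal-normal update with one new observation y."""
        precision = 1.0 / var + 1.0 / var_obs
        var_post = 1.0 / precision
        mu_post = var_post * (mu / var + y / var_obs)
        return mu_post, var_post

    rng = np.random.default_rng(0)
    measurements = rng.normal(0.25, 0.03, size=50)  # simulated incoming data

    mu, var = mu_prior, var_prior
    for y in measurements:
        mu, var = update(mu, var, y)  # new data folded in, old info retained

    print(f"posterior mean {mu:.3f} uSv/h, posterior sd {np.sqrt(var):.4f}")

A full hierarchical model would additionally share strength across neighboring cells and time periods, which is what makes the approach suited to sparse, unevenly sampled citizen data.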
ERIC Educational Resources Information Center
Aydeniz, Mehmet; Baksa, Kristen; Skinner, Jane
2011-01-01
The purpose of this study was to understand the impact of an apprenticeship program on high school students' understanding of the nature of scientific inquiry. Data related to seventeen students' understanding of science and scientific inquiry were collected through open-ended questionnaires. Findings suggest that although engagement in authentic…
Contaminants of emerging concern in the open sea waters of the Western Mediterranean.
Brumovský, Miroslav; Bečanová, Jitka; Kohoutek, Jiří; Borghini, Mireno; Nizzetto, Luca
2017-10-01
Pollution by chemical substances is of concern for the maintenance of healthy and sustainable aquatic environments. While the occurrence and fate of numerous emerging contaminants, especially pharmaceuticals, is well documented in freshwater, their occurrence and behavior in coastal and marine waters is much less studied and understood. This study investigates the occurrence of 58 chemicals in the open surface water of the Western Mediterranean Sea for the first time. A total of 70 samples were collected across 10 sampling areas. Three pesticides, 11 pharmaceuticals and personal care products, and 2 artificial sweeteners were detected at sub-ng to ng/L levels. Among them, the herbicide terbuthylazine, the pharmaceuticals caffeine, carbamazepine, naproxen and paracetamol, the antibiotic sulfamethoxazole, the antibacterial triclocarban, and the two artificial sweeteners acesulfame and saccharin were detected in all samples. The compound detected at the highest concentration was saccharin (up to 5.23 ng/L). Generally small spatial differences among individual sampling areas point to diffuse sources, likely dominated by WWTP effluents and runoff from agricultural areas or even, at least for pharmaceuticals and artificial food additives, offshore sources such as ferries and cruise ships. The ubiquitous presence in the open sea of chemicals that are bio-active or toxic at low doses to photosynthetic organisms and/or bacteria (i.e., terbuthylazine, sulfamethoxazole or triclocarban) deserves scientific attention, especially concerning possible subtle impacts from chronic exposure of pelagic microorganisms. Copyright © 2017 Elsevier Ltd. All rights reserved.
Visualization and Quality Control Web Tools for CERES Products
NASA Astrophysics Data System (ADS)
Mitrescu, C.; Doelling, D. R.; Rutan, D. A.
2016-12-01
The CERES project continues to provide the scientific community a wide variety of satellite-derived data products, such as observed TOA broadband shortwave and longwave fluxes, computed TOA and surface fluxes, as well as cloud, aerosol, and other atmospheric parameters. They encompass a wide range of temporal and spatial resolutions, suited to specific applications. Now in its 16th year, the CERES project provides products mostly used by climate modeling communities that focus on global mean energetics, meridional heat transport, and climate trend studies. In order to serve all our users, we developed a web-based Ordering and Visualization Tool (OVT). Using open source software such as Eclipse, Java, JavaScript, OpenLayers, Flot, Google Maps, Python, and others, the OVT team developed a series of specialized functions to be used in the process of CERES Data Quality Control (QC). These include 1- and 2-D histograms, anomaly computation, deseasonalization, temporal and spatial averaging, side-by-side parameter comparison, and other functions that made the QC process far easier and faster, but more importantly far more portable. We are now in the process of integrating ground-site observed surface fluxes to further help the CERES project QC the CERES computed surface fluxes. These features will give users the opportunity to perform their own comparisons of the CERES computed surface fluxes and observed ground-site fluxes. An overview of the basic functions of the CERES OVT built on open source software, as well as future steps in expanding its capabilities, will be presented at the meeting.
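Of the QC functions listed, deseasonalization is the easiest to make concrete. A common minimal recipe (an assumption here, not the OVT's actual code) subtracts each calendar month's long-term mean from a monthly flux series using pandas; the flux values below are synthetic.

    import numpy as np
    import pandas as pd

    # Synthetic monthly TOA flux series with an annual cycle (values invented).
    idx = pd.date_range("2000-01-01", periods=192, freq="MS")
    noise = np.random.default_rng(0).normal(0, 1, len(idx))
    flux = pd.Series(240 + 10 * np.sin(2 * np.pi * idx.month / 12) + noise,
                     index=idx)

    # Climatology: the long-term mean for each calendar month.
    climatology = flux.groupby(flux.index.month).transform("mean")

    # Deseasonalized anomaly: the series minus its monthly climatology.
    anomaly = flux - climatology
    print(anomaly.head())

Removing the repeating seasonal cycle in this way is what makes trends and anomalies visible in side-by-side QC comparisons.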
International Symposium on Grids and Clouds (ISGC) 2017
NASA Astrophysics Data System (ADS)
2017-03-01
The International Symposium on Grids and Clouds (ISGC) 2017 will be held at Academia Sinica in Taipei, Taiwan from 5-10 March 2017, with co-located events and workshops. The main theme of ISGC 2017 is "Global Challenges: From Open Data to Open Science". The unprecedented progress in ICT has transformed the way education is conducted and research is carried out. The emerging global e-Infrastructure, championed by global science communities such as High Energy Physics, Astronomy, and Biomedicine, must permeate into other sciences. Many areas, such as climate change, disaster mitigation, and human sustainability and well-being, represent global challenges where collaboration over e-Infrastructure will presumably help resolve the common problems of the people who are impacted. Access to global e-Infrastructure also helps the less globally organized, long-tail sciences, with their own collaboration challenges. Open data are not only a political phenomenon serving government transparency; they also create an opportunity to eliminate access barriers to all scientific data, specifically data from global sciences and regional data that concern natural phenomena and people. In this regard, the purpose of open data is to improve the sciences, accelerating specifically those that may benefit people. Nevertheless, eliminating barriers to open data is itself a daunting task, and the barriers facing individuals, institutions, and big collaborations are manifold. Open science is a step beyond open data, where the tools and understanding of scientific data must be made available to whoever is interested in participating in such scientific research. The promotion of open science may change the academic tradition practiced over the past few hundred years. This change of dynamics may contribute to the resolution of common challenges of human sustainability, where the current pace of scientific progress is not sufficiently fast. ISGC 2017 created a face-to-face venue where individual communities and national representatives can present and share their contributions to the global puzzle and thus contribute to the solution of global challenges.
ERIC Educational Resources Information Center
Kapor, Mitchell
2005-01-01
Open source software projects involve the production of goods, but in software projects, the "goods" consist of information. The open source model is an alternative to the conventional centralized, command-and-control way in which things are usually made. In contrast, open source projects are genuinely decentralized and transparent. Transparent…
Quantum game theory and open access publishing
NASA Astrophysics Data System (ADS)
Hanauske, Matthias; Bernius, Steffen; Dugall, Berndt
2007-08-01
The digital revolution of the information age, and in particular the sweeping changes in scientific communication brought about by computing and novel communication technology, have made high-grade scientific information globally available for free. The arXiv, for example, is the leading scientific communication platform, mainly for mathematics and physics, to which everyone in the world has free access. While in some scientific disciplines the open access model has been successfully realized, other disciplines (e.g. humanities and social sciences) dwell on the traditional path, even though many scientists belonging to these communities approve of the open access principle. In this paper we try to explain these different publication patterns using a game theoretical approach. Based on the assumption that the main goal of scientists is the maximization of their reputation, we model different possible game settings, namely a zero-sum game, the prisoners’ dilemma case, and a version of the stag hunt game, which show the dilemma of scientists belonging to “non-open access communities”. From an individual perspective, they have no incentive to deviate from the Nash equilibrium of traditional publishing. By extending the model using the quantum game theory approach, it can be shown that if the strength of entanglement exceeds a certain value, the scientists will overcome the dilemma and stop publishing only traditionally in all three settings.
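To make the role of the entanglement strength concrete, here is a compact sketch of the Eisert-Wilkens-Lewenstein (EWL) quantum prisoners' dilemma, the standard construction on which such quantum-game extensions are built; the payoff values and the gamma sweep are illustrative choices, not the paper's exact model.

    import numpy as np

    D = np.array([[0, 1], [-1, 0]])  # "defect" flip operator; note D @ D = -I

    def U(theta, phi):
        """One-parameter-family strategy unitary of the EWL scheme."""
        return np.array(
            [[np.exp(1j * phi) * np.cos(theta / 2), np.sin(theta / 2)],
             [-np.sin(theta / 2), np.exp(-1j * phi) * np.cos(theta / 2)]])

    def payoffs(Ua, Ub, gamma, r=(3, 0, 5, 1)):
        """Expected payoffs (A, B); r = player A's rewards for CC, CD, DC, DD."""
        DD = np.kron(D, D)
        # (D kron D)^2 = identity, so the entangling gate expm reduces to:
        J = np.cos(gamma / 2) * np.eye(4) + 1j * np.sin(gamma / 2) * DD
        psi0 = np.zeros(4, dtype=complex)
        psi0[0] = 1.0                                   # start in |CC>
        psi = J.conj().T @ np.kron(Ua, Ub) @ J @ psi0   # final state
        p = np.abs(psi) ** 2                            # probs of CC, CD, DC, DD
        a = np.array(r)                                 # A's payoffs
        b = np.array([r[0], r[2], r[1], r[3]])          # B's payoffs (CD/DC swap)
        return float(p @ a), float(p @ b)

    C, Dstrat, Q = U(0, 0), U(np.pi, 0), U(0, np.pi / 2)
    for gamma in (0.0, np.pi / 2):   # no entanglement vs. maximal entanglement
        dd = payoffs(Dstrat, Dstrat, gamma)
        qq = payoffs(Q, Q, gamma)
        print(f"gamma={gamma:.2f}  D vs D: ({dd[0]:.2f}, {dd[1]:.2f})"
              f"  Q vs Q: ({qq[0]:.2f}, {qq[1]:.2f})")

At gamma = 0 the game reduces to the classical one; sweeping gamma upward shows how sufficient entanglement makes the cooperative quantum strategy profile stable, which is the mechanism the paper invokes for escaping the traditional-publishing equilibrium.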
Ardal, Christine; Alstadsæter, Annette; Røttingen, John-Arne
2011-09-28
Innovation through an open source model has proven to be successful for software development. This success has led many to speculate whether open source can be applied to other industries with similar success. We attempt to provide an understanding of open source software development characteristics for researchers, business leaders and government officials who may be interested in utilizing open source innovation in other contexts, with an emphasis on drug discovery. A systematic review was performed by searching relevant, multidisciplinary databases to extract empirical research regarding the common characteristics and barriers of initiating and maintaining an open source software development project. Characteristics of open source software development pertinent to open source drug discovery were extracted. The characteristics were then grouped into the areas of participant attraction, management of volunteers, control mechanisms, legal framework and physical constraints. Lastly, their applicability to drug discovery was examined. We believe that the open source model is viable for drug discovery, although it is unlikely that it will exactly follow the form used in software development. Hybrids will likely develop that suit the unique characteristics of drug discovery. We suggest potential motivations for organizations to join an open source drug discovery project. We also examine specific differences between software and medicines, specifically how the need for laboratories and physical goods will impact the model, as well as the effect of patents.
Open Source Paradigm: A Synopsis of The Cathedral and the Bazaar for Health and Social Care.
Benson, Tim
2016-07-04
Open source software (OSS) is becoming more fashionable in health and social care, although the ideas are not new. However, progress has been slower than many had expected. The purpose here is to summarise the Free/Libre Open Source Software (FLOSS) paradigm in terms of what it is, how it impacts users and software engineers, and how it can work as a business model in the health and social care sectors. Much of this paper is a synopsis of Eric Raymond's seminal book The Cathedral and the Bazaar, which was the first comprehensive description of the open source ecosystem, set out in three long essays. Direct quotes from the book are used liberally, without reference to specific passages. The first part contrasts open and closed source approaches to software development and support. The second part describes the culture and practices of the open source movement. The third part considers business models. A key benefit of open source is that users can access and collaborate on improving the software if they wish. Closed source code may be regarded as a strategic business risk that may be unacceptable if there is an open source alternative. The sharing culture of the open source movement fits well with that of health and social care.
Open access publishing, article downloads, and citations: randomised controlled trial
Lewenstein, Bruce V; Simon, Daniel H; Booth, James G; Connolly, Mathew J L
2008-01-01
Objective: To measure the effect of free access to the scientific literature on article downloads and citations. Design: Randomised controlled trial. Setting: 11 journals published by the American Physiological Society. Participants: 1619 research articles and reviews. Main outcome measures: Article readership (measured as downloads of full text, PDFs, and abstracts) and number of unique visitors (internet protocol addresses). Citations to articles were gathered from the Institute for Scientific Information after one year. Interventions: Random assignment on online publication of articles published in 11 scientific journals to open access (treatment) or subscription access (control). Results: Articles assigned to open access were associated with 89% more full text downloads (95% confidence interval 76% to 103%), 42% more PDF downloads (32% to 52%), and 23% more unique visitors (16% to 30%), but 24% fewer abstract downloads (−29% to −19%) than subscription access articles in the first six months after publication. Open access articles were no more likely to be cited than subscription access articles in the first year after publication. Fifty nine per cent of open access articles (146 of 247) were cited nine to 12 months after publication compared with 63% (859 of 1372) of subscription access articles. Logistic and negative binomial regression analysis of article citation counts confirmed no citation advantage for open access articles. Conclusions: Open access publishing may reach more readers than subscription access publishing. No evidence was found of a citation advantage for open access articles in the first year after publication. The citation advantage from open access reported widely in the literature may be an artefact of other causes. PMID:18669565
AGU Council adopts position statement on scientific expression
NASA Astrophysics Data System (ADS)
Landau, Elizabeth; Uhlenbrock, Kristan
2011-09-01
On 17 August the AGU Council voted to adopt an American Meteorological Society (AMS) statement on free and open communication of scientific findings as an official position of AGU. The statement appears below. Recent attacks on scientists who present facts that are controversial or politically charged, such as in cases involving climate science, have sparked action by AGU and other scientific societies, including the American Association for the Advancement of Science. Open communication and collaboration are essential to the scientific process and must not be deterred by politics, media, or faith. In a recent letter to the New York Times, AGU president Michael McPhaden stated that “misguided attempts to suppress scientific research, particularly through political pressure, will not make climate change or the role human activity plays in it magically disappear. It will, however, make the objective knowledge needed to inform good policy decisions disappear.”
NASA Astrophysics Data System (ADS)
Salvi, S.; Poland, M. P.; Sigmundsson, F.; Puglisi, G.; Borgstrom, S.; Ergintav, S.; Vogfjord, K. S.; Fournier, N.; Hamling, I. J.; Mothes, P. A.; Savvaidis, A.; Wicks, C. W., Jr.
2017-12-01
In 2010, the Geohazard Supersites and Natural Laboratories initiative (GSNL) established, in the framework of GEO, the concept of a global partnership among the geophysical scientific community, space agencies, and in-situ data providers, with the aim of promoting scientific advancements in the understanding of seismic and volcanic phenomena. This goal is achieved through open sharing of large volumes of remote sensing and in-situ data from specific volcanic or seismic areas of particularly high risk or scientific interest (the Supersites) as proposed by the scientific community. Data provision to the Supersites is coordinated by local research and monitoring institutions, which deploy and manage geophysical monitoring networks and have an institutional mandate for the provision of scientific data and services to the national government and other regional users. Starting in 2015, following the changes in GEO and the call for action given by the Sendai Framework 2015-2030, the GSNL initiative has promoted the rapid uptake of newly developed scientific information for maximum societal benefit in Disaster Risk Management (DRM). While the procedures by which the scientific products are provided to the local decision makers depend on the different national operational frameworks and are largely independent of the Supersite existence, the quality of the scientific information, and thus its actual benefit for DRM, is considerably enhanced at each Supersite. This growth in scientific understanding of specific volcanic and seismic areas is not only due to wider accessibility of data, but also to the increased collaboration and sharing of resources and capacities that occurs inside the Supersite scientific community. For maximum effectiveness, the GSNL initiative supports an Open Science approach, where different collaboration and communication approaches and technological solutions are developed, tested, and shared, thereby helping to sustain the scientific investigation process. We will present at the meeting the latest developments and results of the GSNL Supersite network.
Weather forecasting with open source software
NASA Astrophysics Data System (ADS)
Rautenhaus, Marc; Dörnbrack, Andreas
2013-04-01
To forecast the weather situation during aircraft-based atmospheric field campaigns, we employ a tool chain of existing and self-developed open source software tools and open standards. Of particular value are the Python programming language with its extension libraries NumPy, SciPy, PyQt4, Matplotlib and the basemap toolkit, the NetCDF standard with the Climate and Forecast (CF) Metadata conventions, and the Open Geospatial Consortium Web Map Service standard. These open source libraries and open standards helped to implement the "Mission Support System", a Web Map Service based tool to support weather forecasting and flight planning during field campaigns. The tool has been implemented in Python and has also been released as open source (Rautenhaus et al., Geosci. Model Dev., 5, 55-71, 2012). In this presentation we discuss the usage of free and open source software for weather forecasting in the context of research flight planning, and highlight how the field campaign work benefits from using open source tools and open standards.
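As a small illustration of the open-standards part of this tool chain, the sketch below reads a CF-convention NetCDF forecast field with the netCDF4 library and renders one horizontal slice with Matplotlib; the file name, variable names, and dimension ordering are hypothetical, not those of the Mission Support System.

    import matplotlib.pyplot as plt
    from netCDF4 import Dataset

    # Open a CF-compliant forecast file (hypothetical name and variables).
    ds = Dataset("ecmwf_forecast.nc")
    lon = ds.variables["longitude"][:]
    lat = ds.variables["latitude"][:]
    # Assume (time, level, lat, lon) ordering: first time step, lowest level.
    temp = ds.variables["air_temperature"][0, 0]

    # Quick-look map of the field, as a forecaster might preview a WMS layer.
    plt.pcolormesh(lon, lat, temp, shading="auto")
    plt.colorbar(label=ds.variables["air_temperature"].units)
    plt.title("air_temperature, t=0")
    plt.savefig("quicklook.png", dpi=120)
    ds.close()

The CF metadata conventions are what make such generic code possible: coordinates and units are discoverable from the file itself rather than hard-coded per model.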
Open Source Software Development
2011-01-01
NASA Astrophysics Data System (ADS)
Cannata, Massimiliano; Ratnayake, Rangajeewa; Antonovic, Milan; Strigaro, Daniele
2017-04-01
Environmental monitoring systems in low-income countries are often in decline, outdated, or missing, with the consequence that information vital for coping with and mitigating natural hazards is scarcely available and accessible. Non-conventional monitoring systems based on open technologies may constitute a viable solution: low-cost, sustainable systems that can be fully developed, deployed, and maintained at the local level, without lock-in dependencies on copyrights or patents and without high replacement costs. The 4onse research project, funded under the Research for Development program of the Swiss National Science Foundation and the Swiss Office for Development and Cooperation, proposes a complete monitoring system that integrates Free & Open Source Software, Open Hardware, Open Data, and Open Standards. After its engineering, it will be tested in the Deduru Oya catchment (Sri Lanka) to evaluate the system and to develop a water management information system that optimizes the regulation of artificial basin levels and mitigates flash floods. One of the objectives is to better understand, scientifically, the system's strengths, critical issues, and applicability in terms of data quality, system durability, management costs, performance, and sustainability. Results, challenges, and experiences from the first six months of the project will be presented, with particular focus on synergy-building activities and advances in the data collection and dissemination system.
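The abstract does not detail the data-collection chain. As a purely hypothetical sketch of how a low-cost station might push one observation to the dissemination system over HTTP, the snippet below posts a JSON reading with the requests library; the endpoint, station ID, and fields are all invented.

    import datetime
    import random

    import requests

    # Simulated reading from a low-cost sensor (stand-in for real hardware I/O).
    reading = {
        "station": "deduru-oya-01",  # hypothetical station identifier
        "time": datetime.datetime.utcnow().isoformat() + "Z",
        "water_level_m": round(random.uniform(1.0, 2.5), 3),
    }

    # Push to a hypothetical ingestion endpoint of the monitoring server.
    resp = requests.post("http://monitor.example.org/api/observations",
                         json=reading, timeout=10)
    resp.raise_for_status()

In an open-standards deployment the endpoint would typically implement an OGC Sensor Observation Service, so any compliant client could retrieve the same data.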
Adapting federated cyberinfrastructure for shared data collection facilities in structural biology
Stokes-Rees, Ian; Levesque, Ian; Murphy, Frank V.; Yang, Wei; Deacon, Ashley; Sliz, Piotr
2012-01-01
Early stage experimental data in structural biology is generally unmaintained and inaccessible to the public. It is increasingly believed that this data, which forms the basis for each macromolecular structure discovered by this field, must be archived and, in due course, published. Furthermore, the widespread use of shared scientific facilities such as synchrotron beamlines complicates the issue of data storage, access and movement, as does the increase of remote users. This work describes a prototype system that adapts existing federated cyberinfrastructure technology and techniques to significantly improve the operational environment for users and administrators of synchrotron data collection facilities used in structural biology. This is achieved through software from the Virtual Data Toolkit and Globus, bringing together federated users and facilities from the Stanford Synchrotron Radiation Lightsource, the Advanced Photon Source, the Open Science Grid, the SBGrid Consortium and Harvard Medical School. The performance and experience with the prototype provide a model for data management at shared scientific facilities. PMID:22514186
Adapting federated cyberinfrastructure for shared data collection facilities in structural biology.
Stokes-Rees, Ian; Levesque, Ian; Murphy, Frank V; Yang, Wei; Deacon, Ashley; Sliz, Piotr
2012-05-01
Early stage experimental data in structural biology is generally unmaintained and inaccessible to the public. It is increasingly believed that this data, which forms the basis for each macromolecular structure discovered by this field, must be archived and, in due course, published. Furthermore, the widespread use of shared scientific facilities such as synchrotron beamlines complicates the issue of data storage, access and movement, as does the increase of remote users. This work describes a prototype system that adapts existing federated cyberinfrastructure technology and techniques to significantly improve the operational environment for users and administrators of synchrotron data collection facilities used in structural biology. This is achieved through software from the Virtual Data Toolkit and Globus, bringing together federated users and facilities from the Stanford Synchrotron Radiation Lightsource, the Advanced Photon Source, the Open Science Grid, the SBGrid Consortium and Harvard Medical School. The performance and experience with the prototype provide a model for data management at shared scientific facilities.
Parallel and Scalable Clustering and Classification for Big Data in Geosciences
NASA Astrophysics Data System (ADS)
Riedel, M.
2015-12-01
Machine learning, data mining, and statistical computing are common techniques for performing analysis in the earth sciences. This contribution will focus on two concrete and widely used data analytics methods suitable for analysing 'big data' in the context of geoscience use cases: clustering and classification. From the broad class of available clustering methods we focus on the density-based spatial clustering of applications with noise (DBSCAN) algorithm, which enables the identification of outliers or interesting anomalies. A new open source parallel and scalable DBSCAN implementation will be discussed in the light of a scientific use case that detects water mixing events in the Koljoefjords. The second technique we cover is classification, with a focus on the support vector machine (SVM) algorithm, one of the best out-of-the-box classification algorithms. A parallel and scalable SVM implementation will be discussed in the light of a scientific use case in the field of remote sensing with 52 different classes of land cover types.
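The cited implementation is a custom parallel DBSCAN, but the anomaly-flagging idea it relies on can be shown at small scale with scikit-learn's single-node DBSCAN; the eps and min_samples values and the data below are arbitrary illustration choices.

    import numpy as np
    from sklearn.cluster import DBSCAN

    rng = np.random.default_rng(42)

    # Two dense blobs of "normal" measurements plus sparse scatter (invented).
    blobs = np.vstack([rng.normal(0.0, 0.05, (200, 2)),
                       rng.normal(1.0, 0.05, (200, 2))])
    scatter = rng.uniform(-0.5, 1.5, (20, 2))
    X = np.vstack([blobs, scatter])

    # Points that fall in no dense region get the label -1: outliers/anomalies.
    labels = DBSCAN(eps=0.1, min_samples=10).fit_predict(X)
    outliers = X[labels == -1]
    print(f"{len(outliers)} points flagged as anomalies out of {len(X)} "
          f"({labels.max() + 1} clusters found)")

In the fjord use case, the "anomalies" correspond to measurement profiles that depart from the dense background clusters, i.e. candidate water mixing events.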
Agnotology: learning from mistakes
NASA Astrophysics Data System (ADS)
Benestad, R. E.; Hygen, H. O.; van Dorland, R.; Cook, J.; Nuccitelli, D.
2013-05-01
Replication is an important part of science, and by repeating past analyses, we show that a number of papers in the scientific literature contain severe methodological flaws which can easily be identified through simple tests and demonstrations. In many cases, the shortcomings are related to a lack of robustness, leading to results that are not universally valid but rather an artifact of a particular experimental set-up. Some examples presented here have ignored data that do not fit the conclusions, and in several other cases, inappropriate statistical methods have been adopted or conclusions have been based on misconceived physics. These papers may serve as educational case studies of why certain analytical approaches are sometimes unsuitable for providing reliable answers. They also highlight the merit of replication. A lack of common replication has repercussions for the quality of the scientific literature, and may be a reason why some controversial questions remain unanswered even when ignorance could be reduced. Agnotology is the study of such ignorance. Free and open-source software is provided for demonstration purposes.
Consensus nomenclature rules for radiopharmaceutical chemistry — Setting the record straight
Coenen, Heinz H.; Gee, Antony D.; Adam, Michael; ...
2017-10-21
Over recent years, within the community of radiopharmaceutical sciences, there has been an increased incidence of incorrect usage of established scientific terms and conventions, and even the emergence of ‘self-invented’ terms. Here, in order to address these concerns, an international Working Group on ‘Nomenclature in Radiopharmaceutical Chemistry and related areas’ was established in 2015 to achieve clarification of terms and to generate consensus on the utilisation of a standardised nomenclature pertinent to the field. Upon open consultation, the following consensus guidelines were agreed, which aim to: provide a reference source for nomenclature good practice in the radiopharmaceutical sciences; clarify the use of terms and rules concerning exclusively radiopharmaceutical terminology, i.e. nuclear and radiochemical terms, symbols and expressions; address gaps and inconsistencies in existing radiochemistry nomenclature rules; and provide source literature for further harmonisation beyond our immediate peer group (publishers, editors, IUPAC, pharmacopoeias, etc.).
The Imaging X-Ray Polarimetry Explorer (IXPE)
NASA Technical Reports Server (NTRS)
Weisskopf, Martin C.; Ramsey, Brian; O’Dell, Stephen; Tennant, Allyn; Elsner, Ronald; Soffita, Paolo; Bellazzini, Ronaldo; Costa, Enrico; Kolodziejczak, Jeffery; Kaspi, Victoria;
2016-01-01
The Imaging X-ray Polarimetry Explorer (IXPE) is an exciting international collaboration for a scientific mission that dramatically brings together the unique talents of the partners to expand observation space by simultaneously adding polarization measurements to the array of source properties currently measured (energy, time, and location). IXPE uniquely brings to the table polarimetric imaging. IXPE will thus open new dimensions for understanding how X-ray emission is produced in astrophysical objects, especially systems under extreme physical conditions, such as neutron stars and black holes. Polarization singularly probes physical anisotropies (ordered magnetic fields, aspheric matter distributions, or general relativistic coupling to black-hole spin) that are not otherwise measurable. Hence, IXPE complements all other investigations in high-energy astrophysics by adding important and relatively unexplored information to the parameter space for studying cosmic X-ray sources and processes, as well as for using extreme astrophysical environments as laboratories for fundamental physics.
ePMV embeds molecular modeling into professional animation software environments.
Johnson, Graham T; Autin, Ludovic; Goodsell, David S; Sanner, Michel F; Olson, Arthur J
2011-03-09
Increasingly complex research has made it more difficult to prepare data for publication, education, and outreach. Many scientists must also wade through black-box code to interface computational algorithms from diverse sources to supplement their bench work. To reduce these barriers we have developed an open-source plug-in, embedded Python Molecular Viewer (ePMV), that runs molecular modeling software directly inside of professional 3D animation applications (hosts) to provide simultaneous access to the capabilities of these newly connected systems. Uniting host and scientific algorithms into a single interface allows users from varied backgrounds to assemble professional quality visuals and to perform computational experiments with relative ease. By enabling easy exchange of algorithms, ePMV can facilitate interdisciplinary research, smooth communication between broadly diverse specialties, and provide a common platform to frame and visualize the increasingly detailed intersection(s) of cellular and molecular biology. Copyright © 2011 Elsevier Ltd. All rights reserved.
Microsoft Biology Initiative: .NET Bioinformatics Platform and Tools
Diaz Acosta, B.
2011-01-01
The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. This initiative comprises two primary components, the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT). MBF is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework, initially aimed at the area of genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, documentation and training materials are freely downloadable from http://research.microsoft.com/bio. MBT is a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries.
ePMV Embeds Molecular Modeling into Professional Animation Software Environments
Johnson, Graham T.; Autin, Ludovic; Goodsell, David S.; Sanner, Michel F.; Olson, Arthur J.
2011-01-01
SUMMARY Increasingly complex research has made it more difficult to prepare data for publication, education, and outreach. Many scientists must also wade through black-box code to interface computational algorithms from diverse sources to supplement their bench work. To reduce these barriers, we have developed an open-source plug-in, embedded Python Molecular Viewer (ePMV), that runs molecular modeling software directly inside of professional 3D animation applications (hosts) to provide simultaneous access to the capabilities of these newly connected systems. Uniting host and scientific algorithms into a single interface allows users from varied backgrounds to assemble professional quality visuals and to perform computational experiments with relative ease. By enabling easy exchange of algorithms, ePMV can facilitate interdisciplinary research, smooth communication between broadly diverse specialties and provide a common platform to frame and visualize the increasingly detailed intersection(s) of cellular and molecular biology. PMID:21397181
Consensus nomenclature rules for radiopharmaceutical chemistry — Setting the record straight
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coenen, Heinz H.; Gee, Antony D.; Adam, Michael
Over recent years, within the community of radiopharmaceutical sciences, there has been an increased incidence of incorrect usage of established scientific terms and conventions, and even the emergence of ‘self-invented’ terms. Here, in order to address these concerns, an international Working Group on ‘Nomenclature in Radiopharmaceutical Chemistry and related areas’ was established in 2015 to achieve clarification of terms and to generate consensus on the utilisation of a standardised nomenclature pertinent to the field. Upon open consultation, the following consensus guidelines were agreed, which aim to: provide a reference source for nomenclature good practice in the radiopharmaceutical sciences; clarify the use of terms and rules concerning exclusively radiopharmaceutical terminology, i.e. nuclear and radiochemical terms, symbols and expressions; address gaps and inconsistencies in existing radiochemistry nomenclature rules; and provide source literature for further harmonisation beyond our immediate peer group (publishers, editors, IUPAC, pharmacopoeias, etc.).
The Kepler Mission on Two Reaction Wheels is K2
NASA Astrophysics Data System (ADS)
Haas, Michael R.; Barclay, T.; Batalha, N. M.; Bryson, S.; Caldwell, D. A.; Campbell, J.; Coughlin, J.; Howell, S. B.; Jenkins, J. M.; Klaus, T. C.; Mullally, F.; Sanderfer, D. T.; Sobeck, C. K.; Still, M. D.; Troeltzsch, J.; Twicken, J. D.
2014-01-01
Although data collection for the original Kepler mission is complete, a repurposed Kepler has the potential to discover many hundreds of new, small exoplanets around low-mass stars located in or near the ecliptic plane. This repurposing of the Kepler spacecraft, dubbed “K2,” seeks to maximize photometric performance using its two operational reaction wheels by observing in the ecliptic plane where solar torques can be carefully balanced to minimize boresight roll. The K2 mission shows great promise and, once approved, will observe many different fields during a sequence of two- to three-month campaigns over the next few years. Like the original Kepler mission, K2 has many challenges, but is anticipated to be well worth the climb scientifically. K2 can observe many thousands of new sources during each campaign and hundreds of thousands of new sources over its lifetime. In addition to its continued search for exoplanets, the K2 mission will provide access to a wide variety of scientifically interesting targets that include young and variable stars, open clusters of differing ages, star-forming regions, supernovae, white dwarfs, microlensing events, solar system objects, AGNs, normal galaxies, and the Galactic Center. Performance testing began in September, 2013, and has continued throughout the fall and early winter. The results of the first ecliptic-plane tests are described and used to predict photometric performance. A trade study reveals the likely number of targets, cadence durations, initial fields of view, and planned observing strategies. K2 is an exciting new mission that addresses a wide variety of scientific questions with expanded opportunities for community participation.
GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data
NASA Astrophysics Data System (ADS)
Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.
2016-12-01
Geospatial data are growing exponentially because of the proliferation of cost-effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. The data- and compute-intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies for storing, computing on, and analyzing data. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark, a geospatial distributed computing platform for processing large-scale vector, raster, and stream data. GISpark is constructed on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build a multi-user cloud computing infrastructure hosting GISpark. Virtual storage systems such as HDFS, Ceph, and MongoDB are combined and adopted for spatiotemporal data storage management. A Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also be integrated with scientific computing environments (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark, in conjunction with the scientific computing environment, exploratory spatial data analysis tools, and temporal data management and analysis systems, make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides a spatiotemporal computational model and advanced geospatial visualization tools that deal with other domains having spatial properties. We tested the performance of the platform with a taxi trajectory analysis. Results suggest that GISpark achieves excellent runtime performance in spatiotemporal big data applications.
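GISpark itself is not public in this abstract, but the Spark-based pattern it describes, such as aggregating taxi GPS points into grid cells, can be sketched with stock PySpark; the column names, sample points, and cell size below are assumptions for illustration.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("grid-aggregation").getOrCreate()

    # Hypothetical taxi trajectory points: (taxi_id, lon, lat).
    points = spark.createDataFrame(
        [("t1", 116.40, 39.91), ("t2", 116.41, 39.90), ("t1", 116.40, 39.90)],
        ["taxi_id", "lon", "lat"],
    )

    cell = 0.01  # grid cell size in degrees (illustrative)

    # Bin each point into a grid cell, then count points per cell in parallel.
    density = (points
               .withColumn("cx", F.floor(F.col("lon") / cell))
               .withColumn("cy", F.floor(F.col("lat") / cell))
               .groupBy("cx", "cy").count()
               .orderBy(F.desc("count")))
    density.show()
    spark.stop()

Because the grouping key is derived per row, the same job scales from this toy DataFrame to billions of trajectory points across a cluster.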
GIS Technologies For The New Planetary Science Archive (PSA)
NASA Astrophysics Data System (ADS)
Docasal, R.; Barbarisi, I.; Rios, C.; Macfarlane, A. J.; Gonzalez, J.; Arviset, C.; De Marchi, G.; Martinez, S.; Grotheer, E.; Lim, T.; Besse, S.; Heather, D.; Fraga, D.; Barthelemy, M.
2015-12-01
Geographical information systems (GIS) are becoming increasingly used in planetary science. GIS are computerised systems for the storage, retrieval, manipulation, analysis, and display of geographically referenced data. Some data stored in the Planetary Science Archive (PSA), for instance a set of Mars Express/Venus Express data, have spatial metadata associated with them. To help users handle and visualise spatial data in GIS applications, the new PSA should support interoperability with interfaces implementing the standards approved by the Open Geospatial Consortium (OGC). These standards are followed in order to develop open interfaces and encodings that allow data to be exchanged with GIS client applications, well-known examples of which are Google Earth and NASA World Wind, as well as open source tools such as OpenLayers. The technology to store searchable geometrical data already exists within PostgreSQL databases in the form of the PostGIS extension. GeoServer is an existing open source map server; an instance deployed for the new PSA uses the OGC standards to allow, among other things, the sharing, processing, and editing of spatial data through the Web Feature Service (WFS) standard, as well as the serving of georeferenced map images through the Web Map Service (WMS) standard. The final goal of the new PSA, being developed by the European Space Astronomy Centre (ESAC) Science Data Centre (ESDC), is to create an archive which enables science exploitation of ESA's planetary mission datasets. This can be facilitated through the GIS framework, offering interfaces (both web GUIs and scriptable APIs) that can be used more easily and scientifically by the community, and that will also enable the community to build added-value services on top of the PSA.
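The WMS interface mentioned above can be exercised from Python with OWSLib. The sketch below fetches one georeferenced map image; the endpoint URL and layer name are placeholders, not the PSA's published ones.

    from owslib.wms import WebMapService

    # Connect to a WMS endpoint (placeholder URL; the real PSA endpoint differs).
    wms = WebMapService("https://maps.example.org/wms", version="1.1.1")

    # Request a global PNG of a hypothetical Mars elevation layer.
    img = wms.getmap(layers=["mars_mola_elevation"],
                     srs="EPSG:4326",
                     bbox=(-180, -90, 180, 90),
                     size=(1024, 512),
                     format="image/png")

    with open("mars_map.png", "wb") as f:
        f.write(img.read())

Because the request is plain OGC WMS, the same layer is equally consumable by GeoServer previews, OpenLayers web maps, or desktop GIS clients.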
Open-source hardware for medical devices
2016-01-01
Open-source hardware is hardware whose design is made publicly available so anyone can study, modify, distribute, make, and sell the design or hardware based on that design. Some open-source hardware projects can potentially be used as active medical devices. The open-source approach offers a unique combination of advantages, including reduced costs and faster innovation. This article compares 10 open-source healthcare projects in terms of how easy it is to obtain the required components and build the device. PMID:27158528
Open-source hardware for medical devices.
Niezen, Gerrit; Eslambolchilar, Parisa; Thimbleby, Harold
2016-04-01
Open-source hardware is hardware whose design is made publicly available so anyone can study, modify, distribute, make, and sell the design or hardware based on that design. Some open-source hardware projects can potentially be used as active medical devices. The open-source approach offers a unique combination of advantages, including reduced costs and faster innovation. This article compares 10 open-source healthcare projects in terms of how easy it is to obtain the required components and build the device.
The case for open-source software in drug discovery.
DeLano, Warren L
2005-02-01
Widespread adoption of open-source software for network infrastructure, web servers, code development, and operating systems leads one to ask how far it can go. Will "open source" spread broadly, or will it be restricted to niches frequented by hopeful hobbyists and midnight hackers? Here we identify reasons for the success of open-source software and predict how consumers in drug discovery will benefit from new open-source products that address their needs with increased flexibility and in ways complementary to proprietary options.
NASA Astrophysics Data System (ADS)
Jaffe, D. A.
2014-12-01
Crowdfunding involves going directly to the public for financial support of your research project. It's new and cool and may help you carry out important scientific research. But before starting a crowdfunding project, I suggest you ask yourself these questions: (1) Can I carry out a useful scientific investigation on a relatively low budget? (2) Do I like to work on "hot" topics where traditional scientific support may not be available? (3) Can I clearly identify the scientific questions to be addressed? (4) Is there a constituency for this project who are likely to provide financial support? (5) Am I willing to publish my work in a peer-reviewed, open-access journal and then see it criticized by non-scientists? (6) Do I like to "tweet," write and publish lab notes, and otherwise communicate with my backers? (7) Can I conduct my work with complete transparency, knowing that all written and electronic information I generate may be subject to a "Freedom of Information" request? And finally: (8) Am I unwilling to sit back as important policy-relevant research remains undone? If you answered "yes" to all of these questions, then congratulations! Crowdfunding may be right for you! In 2013, I chose to become involved in research on the air quality impacts of diesel rail traffic. This was in response to several proposals to substantially increase rail shipments of coal through the Pacific Northwest en route to Asia. At the time, I pursued funding from traditional sources to carry out a scientific investigation of the relationship between diesel rail traffic and air quality. But no funding could be found, likely due to the highly polarized nature of the debate. So I turned to crowdfunding to support this research, using Microryza.com (now Experiment.com), and quickly raised about $24,000. We carried out the first phase of this study in the summer of 2013 at two sites in the Pacific Northwest. The goals were to investigate the diesel particulate matter emissions from all trains, and possible coal dust from open coal trains. Our findings from this phase were published in 2014, and we are now working on a second phase. In this presentation, I will present the results of our work in both 2013 and 2014 and discuss the application of crowdfunding to scientific research.
An automated and integrated framework for dust storm detection based on OGC Web Processing Services
NASA Astrophysics Data System (ADS)
Xiao, F.; Shea, G. Y. K.; Wong, M. S.; Campbell, J.
2014-11-01
Dust storms are known to have adverse effects on public health. Atmospheric dust loading is also one of the major uncertainties in global climatic modelling, as it is known to have a significant impact on the radiation budget and atmospheric stability. The complexity of building scientific dust storm models is compounded by advances in scientific computation, ongoing computing platform development, and the growth of heterogeneous Earth Observation (EO) networks. It is a challenging task to develop an integrated and automated scheme for dust storm detection that combines geo-processing frameworks, scientific models, and EO data to enable dust storm detection and tracking in a dynamic and timely manner. This study develops an automated and integrated framework for dust storm detection and tracking based on the Web Processing Service (WPS) standard initiated by the Open Geospatial Consortium (OGC). The presented WPS framework consists of EO data retrieval components, a dust storm detecting and tracking component, and a service chain orchestration engine. The EO data processing component is implemented based on the OPeNDAP standard. The dust storm detecting and tracking component combines three earth science models: the SBDART model (for computing the aerosol optical depth (AOT) of dust particles), the WRF model (for simulating meteorological parameters), and the HYSPLIT model (for simulating dust storm transport processes). The service chain orchestration engine is implemented based on the Business Process Execution Language for Web Services (BPEL4WS) using open-source software. The output results, including the horizontal and vertical AOT distribution of dust particles as well as their transport paths, were represented using KML/XML and displayed in Google Earth. A serious dust storm, which occurred over East Asia from 26 to 28 April 2012, is used to test the applicability of the proposed WPS framework. Our aim here is to solve a specific instance of a complex EO data and scientific model integration problem by using a framework and scientific workflow approach together. The experimental results show that this newly automated and integrated framework can give advance, near-real-time warning of dust storms to both environmental authorities and the public. The methods presented in this paper might also be generalized to other types of Earth system models, leading to improved ease of use and flexibility.
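On the client side, an OGC WPS process like the ones orchestrated here can be invoked from Python via OWSLib. In the sketch below, the server URL, process identifier, input key, and output file name are hypothetical stand-ins for the framework's actual services.

    from owslib.wps import WebProcessingService, monitorExecution

    # Connect to a WPS server (placeholder URL).
    wps = WebProcessingService("http://wps.example.org/wps")

    # Launch a hypothetical dust-storm detection process with one literal input.
    execution = wps.execute("DustStormDetection",
                            inputs=[("observation_date", "2012-04-26")])

    # Poll the asynchronous job until it finishes, then save the first output.
    monitorExecution(execution)
    if execution.isSucceded():
        execution.getOutput("detection_result.kml")

Chaining several such process calls (data retrieval, model runs, KML rendering) is exactly what the BPEL4WS orchestration engine automates on the server side.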
SFO-Project: The New Generation of Sharable, Editable and Open-Access CFD Tutorials
NASA Astrophysics Data System (ADS)
Javaherchi, Teymour; Javaherchi, Ardeshir; Aliseda, Alberto
2016-11-01
One of the most common approaches to developing a Computational Fluid Dynamics (CFD) simulation for a new case study of interest is to search for the most similar, previously developed and validated CFD simulation among other works. A simple search results in a pool of written/visual tutorials. However, users must spend a significant amount of time and effort to find the most correct, compatible, and valid tutorial in this pool and to further modify it toward their simulation of interest. SFO is an open-source project with the core idea of saving the above-mentioned time and effort. This is done by documenting and sharing scientific and methodological approaches to developing CFD simulations for a wide spectrum of fundamental and industrial case studies in three different CFD solvers: STAR-CCM+, FLUENT, and OpenFOAM (SFO). All of the steps and required files of these tutorials are accessible and editable under the common roof of GitHub (a web-based Git repository hosting service). In this presentation we will present the current library of 20+ developed CFD tutorials, discuss the idea and benefit of using them and their educational value, and explain how the next generation of open-access, living CFD tutorial resources can be built hand-in-hand within our community.
Choosing Open Source ERP Systems: What Reasons Are There For Doing So?
NASA Astrophysics Data System (ADS)
Johansson, Björn; Sudzina, Frantisek
Enterprise resource planning (ERP) systems attract high attention, and so does open source software. The question is then whether, and if so when, open source ERP systems will take off. The paper describes the status of open source ERP systems. Based on a literature review of ERP system selection criteria drawn from Web of Science articles, it discusses reported reasons for choosing open source or proprietary ERP systems. Last but not least, the article presents some conclusions that could serve as input for future research. The paper aims at building a foundation for the basic question: what are the reasons for an organization to adopt open source ERP systems?
Synthesizes peer-reviewed scientific literature on the biological, chemical, and hydrologic connectivity of waters and the effects that small streams, wetlands, and open waters have on larger downstream waters such as rivers, lakes, estuaries, and oceans.
Developing open-source codes for electromagnetic geophysics using industry support
NASA Astrophysics Data System (ADS)
Key, K.
2017-12-01
Funding for open-source software development in academia often takes the form of grants and fellowships awarded by government bodies and foundations, where there is no conflict of interest between the funding entity and the free dissemination of the open-source software products. Conversely, funding for open-source projects in the geophysics industry presents challenges to conventional business models, in which proprietary licensing offers value that is not present in open-source software. Such proprietary constraints make it easier to convince companies to fund academic software development under exclusive software distribution agreements. A major challenge in obtaining commercial funding for open-source projects is to offer a value proposition that overcomes the criticism that such funding is a give-away to the competition. This work draws upon a decade of experience developing open-source electromagnetic geophysics software for the oil, gas, and minerals exploration industry, and examines various approaches that have been effective for sustaining industry sponsorship.
Behind Linus's Law: Investigating Peer Review Processes in Open Source
ERIC Educational Resources Information Center
Wang, Jing
2013-01-01
Open source software has revolutionized the way people develop software, organize collaborative work, and innovate. The numerous open source software systems that have been created and adopted over the past decade are influential and vital in all aspects of work and daily life. The understanding of open source software development can enhance its…
ERIC Educational Resources Information Center
Kisworo, Marsudi Wahyu
2016-01-01
Information and Communication Technology (ICT)-supported learning using free and open source platform draws little attention as open source initiatives were focused in secondary or tertiary educations. This study investigates possibilities of ICT-supported learning using open source platform for primary educations. The data of this study is taken…
An Analysis of Open Source Security Software Products Downloads
ERIC Educational Resources Information Center
Barta, Brian J.
2014-01-01
Despite the continued demand for open source security software, a gap in the identification of success factors related to the success of open source security software persists. There are no studies that accurately assess the extent of this persistent gap, particularly with respect to the strength of the relationships of open source software…
Riera, M; Aibar, E
2013-05-01
Some studies suggest that open access articles are more often cited than non-open access articles. However, the relationship between open access and citation counts in a discipline such as intensive care medicine has not been studied to date. The present article analyzes the effect of open access publishing of scientific articles in intensive care medicine journals in terms of citation counts. We evaluated a total of 161 articles (76% being non-open access articles) published in Intensive Care Medicine in the year 2008. Citation data were compared between the two groups up until April 30, 2011. Potentially confounding variables for citation counts were adjusted for in a linear multiple regression model. The median number (interquartile range) of citations of non-open access articles was 8 (4-12) versus 9 (6-18) in the case of open access articles (p=0.084). In the highest citation range (>8), the citation count was 13 (10-16) and 18 (13-21) (p=0.008), respectively. The mean follow-up was 37.5 ± 3 months in both groups. In the 30-35 months after publication, the average number (mean ± standard deviation) of citations per article per month of non-open access articles was 0.28 ± 0.6 versus 0.38 ± 0.7 in the case of open access articles (p=0.043). Independent factors for citation advantage were the Hirsch index of the first signing author (β=0.207; p=0.015) and open access status (β=3.618; p=0.006). Open access publishing and the Hirsch index of the first signing author increase the impact of scientific articles. The open access advantage is greater for the more highly cited articles, and appears in the 30-35 months after publication. Copyright © 2012 Elsevier España, S.L. and SEMICYUC. All rights reserved.
Research on OpenStack of open source cloud computing in colleges and universities’ computer room
NASA Astrophysics Data System (ADS)
Wang, Lei; Zhang, Dandan
2017-06-01
In recent years, cloud computing technology has developed rapidly, especially open source cloud computing. Open source cloud computing has attracted a large user base through the advantages of openness and low cost, and has now reached large-scale promotion and application. In this paper, we first briefly introduce the main functions and architecture of the open source cloud computing tool OpenStack, and then discuss in depth the core problems of computer rooms in colleges and universities. Building on this analysis, we describe the specific application and deployment of OpenStack in a university computer room. The experimental results show that OpenStack can deploy a computer-room cloud efficiently and conveniently, with stable performance and good functional value.
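As a taste of what operating such a lab cloud looks like from the API side, this sketch boots one virtual machine with the openstacksdk client; the cloud profile, image, flavor, and network names are hypothetical, chosen only for the example.

    import openstack

    # Credentials are read from clouds.yaml or environment variables.
    conn = openstack.connect(cloud="lab-cloud")  # hypothetical cloud profile

    # Look up resources by name (names are illustrative).
    image = conn.compute.find_image("ubuntu-22.04")
    flavor = conn.compute.find_flavor("m1.small")
    network = conn.network.find_network("students-net")

    # Boot one VM for a student workstation and wait until it is ACTIVE.
    server = conn.compute.create_server(
        name="lab-vm-01",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    server = conn.compute.wait_for_server(server)
    print(server.status)

Scripting this call in a loop is one plausible way a lab administrator could provision an identical desktop for every seat in the room and tear them down after class.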
iBiology: communicating the process of science
Goodwin, Sarah S.
2014-01-01
The Internet hosts an abundance of science video resources aimed at communicating scientific knowledge, including webinars, massive open online courses, and TED talks. Although these videos are efficient at disseminating information for diverse types of users, they often do not demonstrate the process of doing science, the excitement of scientific discovery, or how new scientific knowledge is developed. iBiology (www.ibiology.org), a project that creates open-access science videos about biology research and science-related topics, seeks to fill this need by producing videos by science leaders that make their ideas, stories, and experiences available to anyone with an Internet connection. PMID:25080124